
Introducing the Stagehand API

A Modern Platform for Intelligent Automation
Imagine you want to automatically test your app's checkout flow every hour, gather competitive pricing from a dozen websites, or guide new users through your product. Historically, the technology to do this has been brittle, slow, and a frustrating "black box" when things go wrong, making it difficult to trust for important business tasks.
At Browserbase, we’re hell-bent on shining a light on this black box. We power browser solutions for AI leaders like Vercel, Perplexity, and Clay, and our recent Series B round at a $300 million valuation has allowed us to perfect this next-generation platform for web automation. Our open source framework, Stagehand, is the most downloaded framework for automating browsers with AI.
We’re attacking this problem with a complete, modern platform where three powerful components work together seamlessly:
- Browserbase ("the where"): Think of this as the industrial-scale cloud foundation. We host and manage massive fleets of secure, ready-to-use browsers, so your team doesn't have to worry about servers, scaling, or maintenance.
- Stagehand ("the what"): Our open-source toolkit that defines what you want to do. Using simple, intelligent commands like act (to perform an action) and extract (to get data), your team can easily define any workflow.
- [NEW] Stagehand API ("the how"): The intelligent engine that controls how it all gets done. This API is the core of our platform, taking your Stagehand prompts, translating them into precise browser commands, and optimizing their execution on the Browserbase infrastructure for maximum speed and reliability.
It’s this deep integration of the Stagehand framework (the what), the intelligent Stagehand API (the how), and the Browserbase infrastructure (the where) that turns three separate pieces into one unified solution.
Business Impact
This new platform does more than just automate tasks; it changes how you can rely on and learn from your automation efforts. The business impact is clear and immediate.
You get actionable insights, not opaque logs. Instead of confusing CDP traces, the Stagehand API creates a clear, human-readable log of every action. You can see exactly what instruction was sent to the LLM, what was returned, and how many cache hits you got, making it incredibly easy to diagnose issues. Your team can now pinpoint and fix problems in minutes, not hours, leading to less downtime and a better user experience.
Your automations learn and heal. The Stagehand API is designed for efficiency and resilience. For workflows that you run repeatedly, we can securely cache actions to make them faster and cheaper over time. If an automation hits a small error, like an element not loading instantly, the API can often self-heal by intelligently retrying the action without any human intervention, which makes your automations far more robust.
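To make the caching and retry idea concrete, here is a minimal sketch. It is not the Stagehand API's actual implementation: ActionCache, runWithSelfHeal, resolveAction, and the default of three attempts are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical sketch: cache resolved actions and retry transient failures.
// None of these names come from the Stagehand API; they are illustrative only.

type ResolvedAction = { selector: string; method: "click" | "fill" };

class ActionCache {
  private store = new Map<string, ResolvedAction>();
  hits = 0;

  get(instruction: string): ResolvedAction | undefined {
    const cached = this.store.get(instruction);
    if (cached !== undefined) this.hits += 1;
    return cached;
  }

  set(instruction: string, action: ResolvedAction): void {
    this.store.set(instruction, action);
  }
}

function runWithSelfHeal(
  instruction: string,
  cache: ActionCache,
  resolveAction: (instruction: string) => ResolvedAction, // stands in for an LLM call
  execute: (action: ResolvedAction) => void, // throws on a transient failure
  maxAttempts = 3,
): number {
  // Reuse a cached resolution when one exists; otherwise resolve and store it.
  let action = cache.get(instruction);
  if (action === undefined) {
    action = resolveAction(instruction);
    cache.set(instruction, action);
  }
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      execute(action);
      return attempt; // how many attempts the action needed
    } catch (err) {
      // Give up only after the last allowed attempt.
      if (attempt === maxAttempts) throw err;
    }
  }
  throw new Error("unreachable");
}
```

In this sketch, a repeated instruction skips the expensive resolution step entirely, and a failure such as an element that is not ready yet is retried before it ever surfaces to the caller.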
You get built-in intelligence with zero effort. The world of AI is complex, but your team doesn't need to be AI experts to get expert results. The Stagehand API acts as a navigator, automatically routing your request to the best AI model for the job—whether for understanding natural language or analyzing a webpage's layout.
Better cross-platform support. Because the Stagehand API is a fully hosted solution, it makes things like multi-language support and better MCPs easier to deliver, along with whatever else you may use AI browsers for!
Engineering Deep Dive
The Old Way: A "Black Box" Problem
Observability is one of the clearest wins we’ve enabled with the Stagehand API. Previously, in a typical setup, an engineer writes a Stagehand script with a high-level command like page.act("click the quickstart button"). That command is translated into a series of very low-level, technical instructions using the Chrome DevTools Protocol that are then sent one-by-one over a websocket to the browser using Playwright.

The critical issue here is a communication breakdown. The browser only receives the low-level CDP commands and has no knowledge of the original, high-level goal. The "why" behind the action is completely lost during the translation process.
As a result, when you try to review or debug a failed automation, the session replay is just a stream of these opaque, technical traces. It tells you what happened at a micro-level, but it’s impossible for a person to look at and quickly understand the business step that failed.
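To illustrate why such a trace is hard to read, the sketch below expands one click into a plausible sequence of CDP messages. The method names are real Chrome DevTools Protocol methods, but the exact sequence, the node IDs, and the coordinates are assumptions made for this example, and clickAsCdpMessages is a hypothetical helper rather than anything Stagehand or Playwright exposes.

```typescript
// Illustrative only: the kind of CDP message sequence a single click can expand into.
// The sequence, node IDs, and coordinates are placeholders, not a captured trace.

interface CdpMessage {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

function clickAsCdpMessages(selector: string, x: number, y: number): CdpMessage[] {
  const steps: Array<[string, Record<string, unknown>]> = [
    ["DOM.getDocument", {}],
    ["DOM.querySelector", { nodeId: 1, selector }], // nodeId 1: placeholder root node
    ["DOM.getBoxModel", { nodeId: 2 }], // nodeId 2: placeholder for the matched element
    ["Input.dispatchMouseEvent", { type: "mousePressed", x, y, button: "left", clickCount: 1 }],
    ["Input.dispatchMouseEvent", { type: "mouseReleased", x, y, button: "left", clickCount: 1 }],
  ];
  // Each message gets a sequential id, as CDP requests do.
  return steps.map(([method, params], index) => ({ id: index + 1, method, params }));
}

const cdpTrace = clickAsCdpMessages("#quickstart", 120, 48);

// The browser-side trace never contains the original instruction.
const mentionsIntent = cdpTrace.some((message) =>
  JSON.stringify(message).includes("click the quickstart button"),
);
```

Five messages reach the browser here, and none of them carries the phrase "click the quickstart button", which is the gap a session replay cannot fill on its own.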

This problem is a direct result of a flawed architecture where all the complex work happens on the user's side. This design is inefficient and slow, because every low-level instruction has to travel across the internet, and it is also the source of the debugging black box.
The New Way: Stagehand API on Browserbase
Instead of sending low-level instructions, client-side Stagehand can now send a single, high-level Stagehand command — like "click the quickstart button" — directly to our Stagehand API.
The Stagehand API receives this request, translates your goal into the necessary low-level CDP actions, and directs a Browserbase browser to execute them.
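The flow above can be sketched as a server-side handler that keeps the original instruction attached to the low-level steps it produces. This post does not show the real Stagehand API wire format, so ActRequest, TraceEntry, handleAct, and formatTraceEntry are hypothetical shapes, and the translate callback merely stands in for the LLM-driven translation step.

```typescript
// Hypothetical request and trace shapes; not the actual Stagehand API format.

interface ActRequest {
  sessionId: string;
  instruction: string; // the high-level goal sent by the client
}

interface TraceEntry {
  sessionId: string;
  instruction: string; // kept alongside the low-level steps
  lowLevelSteps: string[]; // e.g. CDP method names executed for this instruction
}

function handleAct(
  request: ActRequest,
  translate: (instruction: string) => string[], // stands in for the translation step
): TraceEntry {
  const lowLevelSteps = translate(request.instruction);
  return {
    sessionId: request.sessionId,
    instruction: request.instruction,
    lowLevelSteps,
  };
}

// Render a trace entry as a single human-readable log line.
function formatTraceEntry(entry: TraceEntry): string {
  return `[${entry.sessionId}] "${entry.instruction}" -> ${entry.lowLevelSteps.length} low-level steps`;
}
```

Because the instruction and its low-level steps live in the same record, a log line can name the business step that ran instead of only listing protocol calls.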

The difference is that all the complex translation work now happens on our managed infrastructure, which lets us capture and show you meaningful LLM traces.

Because the API sees the original high-level instruction, we now know what the user is specifically asking for, and we can handle the request intelligently, for example by caching repeated actions or routing it to a suitable model.
Conclusion: A New, Clearer, Headache-Free Path for Browser Automation
By combining our intelligent Stagehand framework with a powerful, purpose-built API, we provide a complete solution that transforms automation from a fragile script into a core, strategic, and reliable part of your business operations. This new approach delivers the simplicity, clarity, and resilience needed to automate with confidence.
Get started now: npx create-browser-app