
Browser and Code-Execution Agents on Cloudflare: How Agents Act with Browser Run and Sandboxes

Tony Adams 9 min read

An agent that can only talk is half an agent. The other half is doing — and on Cloudflare an agent does things two ways: it drives a real browser to navigate, scrape, and complete web tasks, and it runs code in an isolated sandbox to compute, transform files, or operate a full Linux machine. Those two capabilities are Browser Run and Sandboxes, and the reason to run them on Cloudflare is colocation: both sit on the same global network as the agent itself, the models it calls, the workflows that retry its steps, and the tools it exposes. No separate browser farm, no Kubernetes cluster, no service to operate between the agent and the world.

This is the action-layer deep dive behind our guide to building AI agents on Cloudflare. It is a companion to three other spokes: the agent-as-Durable-Object spoke, because the agent that orchestrates these actions is a Durable Object; durable execution with Workflows, because multi-step browse-and-execute pipelines should be wrapped in workflows for retries; and hosting MCP servers, because browsers and sandboxes are frequently exposed to agents as MCP tools.

Half one: giving the agent a browser

Browser Run is Cloudflare’s headless-browser service — it was Browser Rendering until an April 2026 rename, so you’ll see both names in the docs; the product and the APIs are the same. An agent launches a real Chrome on demand, drives it, and lets it go. You control it three ways: the Workers bindings for Puppeteer and Playwright (Cloudflare maintains forks), a REST API for one-shot “Quick Actions” like screenshot, PDF, markdown, and structured-data extraction, and — new in 2026 — a direct Chrome DevTools Protocol endpoint that lets any Puppeteer, Playwright, or CDP client connect with a one-line config change, from any language, with no Worker in between.

The 2026 additions are what turn it from a rendering service into an action layer for agents. Live View lets you watch a session in real time — the page, the DOM, the console, the network. Human in the Loop lets a person take control of a live session to click, type, enter credentials, or submit a form, then hand control back to the agent — the pattern you want for any high-stakes web action. Session Recordings capture DOM changes and input events as structured JSON you can replay. And the ceilings went up: 120 concurrent browsers per account on the paid plan, up four-fold, after the service was rebuilt on Cloudflare’s own Containers for speed and scale.

A minimal browser task is a few lines:

import puppeteer from "@cloudflare/puppeteer";

// Wrangler config needs a browser binding, { "browser": { "binding": "MYBROWSER" } },
// and the nodejs_compat compatibility flag.
interface Env {
  MYBROWSER: Fetcher;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const browser = await puppeteer.launch(env.MYBROWSER);
    try {
      const page = await browser.newPage();
      await page.goto("https://example.com");
      const shot = await page.screenshot();
      return new Response(shot, { headers: { "Content-Type": "image/png" } });
    } finally {
      // Always release the browser, even if navigation or the screenshot throws.
      await browser.close();
    }
  }
};

For agents that browse repeatedly, the cost-and-latency move is to reuse sessions rather than launch a fresh browser each time: list open sessions, reconnect to a free one, and call disconnect() instead of close() to keep it warm. On limits and price: the free plan gives a few minutes a day and three concurrent browsers; the paid plan includes ten browser-hours a month and ten concurrent browsers, then bills about nine cents per additional browser-hour and two dollars per additional concurrent browser, up to the 120-browser ceiling noted above. Quick Actions are charged on browser-hours only.
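The session-reuse step described above comes down to choosing a session that no Worker is currently attached to. Here is a minimal sketch of that choice, assuming the session list has the shape the Cloudflare Puppeteer fork is understood to return from puppeteer.sessions(): a sessionId, plus a connectionId while something is connected. SessionInfo and pickFreeSession are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape for illustration, modelled on the session list the
// Cloudflare Puppeteer fork is assumed to return.
interface SessionInfo {
  sessionId: string;
  connectionId?: string; // present while some Worker is attached to the session
}

// Return the id of a session nobody is connected to, or undefined if all are busy.
function pickFreeSession(sessions: SessionInfo[]): string | undefined {
  const free = sessions.filter((s) => !s.connectionId);
  return free.length > 0 ? free[0].sessionId : undefined;
}

// Intended use inside a Worker (not executed here; assumes the MYBROWSER binding):
//   const id = pickFreeSession(await puppeteer.sessions(env.MYBROWSER));
//   const browser = id
//     ? await puppeteer.connect(env.MYBROWSER, id) // reuse a warm browser
//     : await puppeteer.launch(env.MYBROWSER);     // none free: start a new one
//   // ... do the work ...
//   browser.disconnect(); // leave the session warm for the next task
```

Taking the first free session keeps the sketch short. Under concurrent load, two Workers could pick the same session at once, so choosing at random among the free ones and falling back to launch() on a failed connect is a reasonable hardening step.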

One thing to state plainly because it shapes what Browser Run is for: it deliberately does not bypass bot detection. Its crawler self-identifies as a signed bot, respects robots.txt, and does not solve CAPTCHAs. That’s the right call ethically, and it means Browser Run is a tool for legitimate automation and your own web flows, not for evading anti-bot systems. There’s also an Agents-SDK path (createBrowserTools) that hands an LLM a search-and-execute pair for driving the browser, but it’s still beta and runs model-written JavaScript in a sandbox with outbound network blocked — useful, but treat its status accordingly.

Half two: giving the agent a computer

A Sandbox is a persistent, isolated Linux environment, addressed by name, that starts on demand and sleeps when idle. You get it with getSandbox(env.Sandbox, "session-id") and then run commands or code in it — including a persistent code interpreter whose state survives across calls like a notebook, returning rich output such as charts and tables. That’s what lets an agent do real work: clone a repo, run a build, execute model-generated Python for data analysis, manipulate files, or stand up a dev server behind a preview URL.

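import { getSandbox, type Sandbox } from "@cloudflare/sandbox";
export { Sandbox } from "@cloudflare/sandbox"; // the Worker must export the Sandbox class

interface Env {
  Sandbox: DurableObjectNamespace<Sandbox>;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    // The same name always addresses the same sandbox, so state persists across requests.
    const sandbox = getSandbox(env.Sandbox, "agent-session-47");
    // A code context keeps interpreter state between runCode calls, like a notebook.
    const ctx = await sandbox.createCodeContext({ language: "python" });
    const result = await sandbox.runCode("import pandas as pd; print(2 + 2)", { context: ctx });
    return Response.json({ stdout: result.logs?.stdout });
  }
};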

The feature that matters most for security is the egress proxy. Sandboxed code routinely needs to call an authenticated API — but you do not want a credential sitting inside an environment running model-written code. Cloudflare solves this by injecting credentials outside the sandbox, at the network layer, so the sandbox never sees the real token:

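import { Sandbox } from "@cloudflare/sandbox";

class AgentBox extends Sandbox {
  // Handlers keyed by destination host. They run outside the sandbox, on the
  // outbound request, so the code inside only ever sees the unauthenticated call.
  static outboundByHost = {
    "github.com": (request, env, ctx) => {
      const headers = new Headers(request.headers);
      headers.set("Authorization", `Bearer ${env.GITHUB_TOKEN}`); // injected, never exposed
      return fetch(request, { headers });
    }
  };
}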

You can pair that with deny-by-default host allowlists, so the sandbox can only reach the destinations you permit. Sandboxes also support point-in-time backups to object storage — a from-scratch clone-and-install that takes thirty seconds restores in about two. On price, Sandboxes inherit Containers’ active-CPU model: you pay roughly two-thousandths of a cent per vCPU-second of active compute and nothing while the sandbox sits idle waiting on the model, which is exactly the right shape for agent work.
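The deny-by-default allowlist mentioned above is worth sketching, because a naive suffix match is the usual mistake. This is a generic illustration of the matching rule only, not the Sandbox SDK's configuration syntax. isHostAllowed and the ALLOWED_HOSTS values are hypothetical.

```typescript
// Deny by default: a host is reachable only if it, or a parent domain, is listed.
function isHostAllowed(hostname: string, allowlist: string[]): boolean {
  const host = hostname.toLowerCase();
  return allowlist.some((entry) => {
    const allowed = entry.toLowerCase();
    // Exact match, or a true subdomain ("api.github.com" under "github.com").
    // The leading dot stops "evilgithub.com" from matching "github.com".
    return host === allowed || host.endsWith("." + allowed);
  });
}

// Example values only; list the destinations your own agent actually needs.
const ALLOWED_HOSTS = ["github.com", "pypi.org"];
```

An empty allowlist permits nothing, which is the property that makes the policy deny-by-default rather than allow-by-default.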

Sandboxes, Containers, or Dynamic Workers?

“Run some code” has two answers on Cloudflare, and picking the wrong one is the most common mistake. A full Linux Sandbox (built on Containers) is the right tool when the agent needs a real operating system. But if it only needs to run a snippet of JavaScript — most “Code Mode” tool calls, where the model writes code against a typed API instead of chaining tool calls — then Dynamic Workers are far better: V8 isolates that boot in single-digit milliseconds, use a few megabytes of memory, and scale without the concurrency caps of containers. The decision rule:

Choosing the right code-execution tier for an agent action

Reach for…                When the agent needs to…
Dynamic Workers           Run ephemeral, untrusted JavaScript fast and at scale (Code Mode tool execution)
Sandboxes / Containers    Use a real Linux computer: git, bash, Python, dev servers, a filesystem, long-running processes
Browser Run               Drive an actual browser: navigate, fill forms, scrape rendered pages, capture screenshots or PDFs
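The decision rule in the table can be written as a small routing function. This is a sketch under stated assumptions: ActionNeeds and chooseTier are hypothetical names, and the two boolean flags are a simplification of what a real agent would inspect.

```typescript
type Tier = "Dynamic Workers" | "Sandboxes / Containers" | "Browser Run";

// Hypothetical description of what a single agent action requires.
interface ActionNeeds {
  needsBrowser: boolean; // navigate, fill forms, render pages, take screenshots
  needsLinux: boolean;   // git, bash, Python, a filesystem, long-running processes
}

// Pick the lightest tier that can do the job.
function chooseTier(needs: ActionNeeds): Tier {
  if (needs.needsBrowser) return "Browser Run";
  if (needs.needsLinux) return "Sandboxes / Containers";
  return "Dynamic Workers"; // a plain JavaScript snippet needs nothing heavier
}
```

The function returns one tier per action. A task that needs both a browser and a Linux machine is better split into two actions, each routed separately.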

The token economics of Dynamic Workers are the headline: Cloudflare’s own MCP server for its API exposes more than 2,500 endpoints through two tools in under a thousand tokens, versus the roughly 1.17 million tokens a flat tool list would cost, which is more than the entire context window of the largest models. Worth flagging honestly, though: Cloudflare itself notes that isolate sandboxing is a harder security problem than hardware VMs, and mitigates it with rapid V8 patching, a second-layer sandbox, and hardware isolation. Dynamic Workers are in open beta; treat them as a strong bet on direction rather than a settled foundation.
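To put the token figures above in proportion, the arithmetic is short. The constants are the rounded numbers quoted in this article, used as bounds, so the results are rough magnitudes rather than measurements.

```typescript
const FLAT_TOOL_LIST_TOKENS = 1_170_000; // "roughly 1.17 million tokens"
const CODE_MODE_TOKENS = 1_000;          // upper bound: "under a thousand tokens"
const ENDPOINTS = 2_500;                 // lower bound: "more than 2,500 endpoints"

// Average cost of describing one endpoint in a flat tool list (at most this much).
const tokensPerEndpoint = FLAT_TOOL_LIST_TOKENS / ENDPOINTS;

// How many times smaller the two-tool interface is (at least this much).
const reductionFactor = FLAT_TOOL_LIST_TOKENS / CODE_MODE_TOKENS;
```

With these inputs the flat list costs at most about 468 tokens per endpoint, and the two-tool interface is at least about 1,170 times smaller.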

Where this sits in the agent stack

The action layer doesn’t work alone. The agent that decides when to browse or run code is a Durable Object, and its state model is the subject of the agent-as-Durable-Object spoke — we won’t re-derive it here. Because browser and code tasks are flaky by nature — pages change, networks hiccup — you wrap multi-step browse-and-execute pipelines in durable workflows so a failed step retries from its last checkpoint instead of restarting the whole run. And when you want other agents to use a browser or sandbox, you expose it as a tool through an MCP server — there are ready-made MCP servers for both Playwright and Chrome DevTools. This spoke owns the action primitives themselves; those three own the orchestration, retry, and tool-exposure around them.
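The retry-from-checkpoint behaviour described above can be illustrated without any platform code. The sketch below is not the Workflows API. It is a hypothetical, synchronous stand-in (Step and runWithCheckpoints are invented names) that shows only the core idea: a completed step is recorded, and a retry skips it.

```typescript
type Step<T> = { name: string; run: () => T };

// Run steps in order, recording each result under its name. On a retry, steps
// already present in `checkpoint` are skipped, so work resumes at the failed step.
function runWithCheckpoints<T>(
  steps: Step<T>[],
  checkpoint: Record<string, T>,
  maxAttempts: number
): Record<string, T> {
  for (let attempt = 1; ; attempt++) {
    try {
      for (const step of steps) {
        const done = Object.prototype.hasOwnProperty.call(checkpoint, step.name);
        if (done) continue; // finished on an earlier attempt
        checkpoint[step.name] = step.run();
      }
      return checkpoint;
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // out of retries: surface the failure
    }
  }
}
```

Real browser and sandbox steps are asynchronous, and a durable workflow also persists the checkpoint across process restarts, which an in-memory record does not. The point of the sketch is the ordering guarantee: an expensive first step, such as a page load, is not repeated because a later step failed.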

The highest-risk thing your agent does

Browser automation and code execution are, by a wide margin, the riskiest things an agent does — and the honest position is that Cloudflare’s defenses reduce that risk without eliminating it. This deserves to be a design constraint, not a footnote.

The threat is real and growing. Google’s threat-intelligence team measured a roughly 32% relative rise in malicious indirect-prompt-injection content on the web over a recent three-month window (Google, April 2026).

Palo Alto Networks documented more than twenty distinct web-based injection techniques observed in the wild, including attempts to destroy data and bypass content moderation. Brave’s researchers showed a real agentic browser being steered by hidden instructions on a web page into reading a one-time passcode from the user’s email and exfiltrating it — no memory corruption, no code-execution bug, just text on a page the agent trusted. Anthropic’s own guidance is blunt that “no browser agent is immune to prompt injection.”

Cloudflare gives you three real mitigations: isolation (each browser and sandbox is contained), the egress proxy (credentials never reach model-controlled code), and human-in-the-loop takeover for high-stakes actions. Use all three, and add the obvious discipline: least-privilege bindings, egress allowlists, and a human approval step before anything irreversible. Treat every web page the agent visits and every line of code the model writes as potentially hostile. That posture is what separates a production action layer from a demo that works until the day someone feeds it a poisoned page.
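The discipline above, an allowlist plus a human approval step before anything irreversible, can be expressed as a single gate that every proposed action passes through. This is a minimal sketch. ProposedAction, Decision, and gate are hypothetical names, and deciding whether an action is irreversible is assumed to happen upstream.

```typescript
type Decision = "allow" | "needs-human-approval" | "deny";

// Hypothetical description of an action the agent wants to take.
interface ProposedAction {
  host: string;          // destination the action would touch
  irreversible: boolean; // e.g. submitting a payment or deleting data
}

function gate(action: ProposedAction, allowedHosts: string[]): Decision {
  if (!allowedHosts.includes(action.host)) return "deny";  // deny by default
  if (action.irreversible) return "needs-human-approval";  // a person decides
  return "allow";
}
```

The order of the checks matters: an irreversible action against an unlisted host is denied outright and never reaches a human reviewer, which keeps approval requests limited to actions that are at least in scope.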

Why Cloudflare for the action layer

The pillar names the “ability to act” as one of the things a production agent needs, and the case for Cloudflare here is structural. The browser and the sandbox run on the same network as the agent, the models, the durable workflows, and the MCP tools — so a tool call doesn’t pay a cross-cloud network hop and an operational seam every time it does real work. There’s no browser farm to keep patched and no container orchestrator to run; you get on-demand Chrome from a warm global pool and named Linux sandboxes that start on a request. Billing is per-use and scales to zero — browser-hours for the browser, active CPU for the sandbox — which matches how agents actually work, in bursts with long idle waits on the model. The interfaces are standard (Chrome DevTools Protocol, Puppeteer, Playwright, MCP), so existing automation code largely carries over. And the egress proxy and human-in-the-loop takeover are built in rather than bolted on. The agent reasons; Browser Run and Sandboxes are how it touches the world, without a second platform between the two.

Other honest trade-offs

  • Browser automation is brittle. Sites change, and anti-bot defenses get in the way. Cloudflare deliberately doesn’t bypass them, so plan for flakiness and wrap browse-and-execute pipelines in durable workflows for retries.
  • Cold starts exist. Containers and Sandboxes have historically taken a couple of seconds to start cold; Dynamic Workers remove that for JavaScript-only work but can’t run a full OS.
  • There are concurrency ceilings. 120 concurrent browsers and per-second launch limits on the paid plan, plus account-level container caps — raisable on request, but real.
  • Maturity is uneven. Browser Run and Sandboxes are GA; site crawling, WebMCP, the Agents-SDK browser tools, and Dynamic Workers are beta or experimental, and at least one product’s docs still carry beta labels despite a GA announcement. Build on the GA core and adopt the rest deliberately.
  • Cost grows with heavy use. Browser-hours and CPU-heavy sandbox workloads add up; model your own usage rather than assuming the included allowances cover it.
  • Vendor concentration. Colocation is the upside and the lock-in at once. Standard CDP, Puppeteer, and Playwright support softens it, but the Sandbox SDK and the Browser Run bindings are Cloudflare-specific.
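For the cost trade-off in the list above, a rough model is easy to write down. The sketch uses only the approximate figures quoted in this article and charges sandboxes for active CPU alone. It ignores any memory, disk, or included-usage terms, so treat it as a starting point and check current pricing. estimateMonthlyUsd is a hypothetical helper.

```typescript
// Approximate figures quoted in this article; verify against current pricing.
const INCLUDED_BROWSER_HOURS = 10;
const INCLUDED_CONCURRENT_BROWSERS = 10;
const USD_PER_EXTRA_BROWSER_HOUR = 0.09;
const USD_PER_EXTRA_CONCURRENT_BROWSER = 2.0;
const USD_PER_VCPU_SECOND = 0.00002; // "two-thousandths of a cent"

function estimateMonthlyUsd(
  browserHours: number,
  concurrentBrowsers: number,
  activeVcpuSeconds: number
): number {
  // Only usage beyond the included allowance is billed.
  const browserTime =
    Math.max(0, browserHours - INCLUDED_BROWSER_HOURS) * USD_PER_EXTRA_BROWSER_HOUR;
  const concurrency =
    Math.max(0, concurrentBrowsers - INCLUDED_CONCURRENT_BROWSERS) *
    USD_PER_EXTRA_CONCURRENT_BROWSER;
  // Sandboxes: active CPU only, nothing while idle.
  const sandboxCpu = activeVcpuSeconds * USD_PER_VCPU_SECOND;
  return browserTime + concurrency + sandboxCpu;
}
```

As a worked case, 110 browser-hours, 15 concurrent browsers, and one million active vCPU-seconds come to about $39 under these assumptions: $9 for browser time, $10 for concurrency, and $20 for sandbox CPU.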

Frequently asked

What's the difference between Browser Run and Sandboxes?
Browser Run gives the agent a real web browser — for navigating sites, filling forms, scraping rendered pages, and capturing screenshots or PDFs. Sandboxes give it a real Linux computer — for running code, building software, analyzing data, and manipulating files. Many agents use both.
Can I use my existing Puppeteer or Playwright code?
Largely, yes. Cloudflare maintains forks of both, and there is also a direct Chrome DevTools Protocol endpoint, so most standard browser-automation code connects with a one-line configuration change.
Is it safe to run model-generated code?
Safer than running it unprotected, not risk-free. Sandboxes isolate execution, the egress proxy keeps real credentials out of the code’s reach, and host allowlists restrict where it can connect. You still treat model output as untrusted and gate anything irreversible behind a human approval step.
Does Cloudflare bypass CAPTCHAs and bot detection for scraping?
No — and that is intentional. Browser Run identifies itself as a signed bot and respects robots.txt rather than evading anti-bot systems. It is built for legitimate automation and your own web flows, not for circumventing site protections.

Working with Truvisory

If you’d rather have a browser-driving, code-running agent built and shipped — with the egress proxy, isolation, and human-in-the-loop controls wired in from the start — this is the work we do: senior-engineer-led, fixed-scope agent systems on Cloudflare, with the riskiest layer handled with care. See how we deliver agent systems, or read the pillar guide to the full stack.