The Cloudflare Agent Is a Durable Object: How Per-Agent State Works
On Cloudflare, an AI agent isn’t a function you wire to a database, a queue, and a session store. The agent is a Durable Object — a single, addressable, stateful micro-server with its own embedded SQLite database, its own WebSocket connections, and its own scheduler. The Agents SDK’s Agent class is built on a Durable Object, which means three things are true the moment you write class MyAgent extends Agent: each instance has persistent state that survives restarts, deploys, and failures; each instance is a single-writer actor, so its state can’t be corrupted by concurrent requests; and each instance costs nothing while it’s idle. Persistence, isolation, and scale-to-zero — that’s why “the agent is a Durable Object” is the most important sentence in the Cloudflare agent model, and it’s what this article unpacks.
This is the deep dive behind the durable-state layer of our guide to building production AI agents on Cloudflare.
What it means that an agent is a Durable Object
Under the hood, the Agent class extends the SDK’s server base class, which in turn extends a Durable Object, so every agent you write is a Durable Object at runtime. An AIChatAgent subclass adds streaming chat on top. The practical upshot is that the agent owns its compute, storage, and scheduling in one entity, instead of being glue code between three managed services.
Identity is the part people get wrong first. An agent instance is addressed by name, and the same name always resolves to the same instance with the same memory. You route to it one of two ways: routeAgentRequest(request, env) maps an incoming URL to the right instance and fires its onRequest or onConnect handler, or getAgentByName(env.MyAgent, "user-123") returns a typed stub you can call by RPC from elsewhere in your Worker. The catch worth a callout: a named instance uses a stable, name-derived ID, while a fresh unique ID creates a brand-new object every time. Reach for the unique-ID constructor by accident and your agent gets amnesia on every request — new instance, empty state.
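To make the naming pitfall concrete, here is a minimal in-memory sketch. FakeNamespace is a hypothetical stand-in for a Durable Object namespace binding, not the real runtime; only the method names idFromName and newUniqueId mirror the actual binding. It shows that a name-derived ID lands on the same memory every time, while a unique ID never does.

```typescript
// Hypothetical in-memory model of a Durable Object namespace.
// Only the method names idFromName/newUniqueId mirror the real binding.
type Memory = { visits: number };

class FakeNamespace {
  private instances = new Map<string, Memory>();
  private nextUnique = 0;

  // Same name in, same ID out: the ID is a pure function of the name.
  idFromName(name: string): string {
    return `name:${name}`;
  }

  // Every call mints an ID that has never been used before.
  newUniqueId(): string {
    this.nextUnique += 1;
    return `unique:${this.nextUnique}`;
  }

  // Return the instance behind an ID, creating empty memory on first use.
  get(id: string): Memory {
    let memory = this.instances.get(id);
    if (memory === undefined) {
      memory = { visits: 0 };
      this.instances.set(id, memory);
    }
    return memory;
  }
}

// One simulated request: look up the instance and record a visit.
function visit(ns: FakeNamespace, id: string): number {
  const memory = ns.get(id);
  memory.visits += 1;
  return memory.visits;
}

const ns = new FakeNamespace();

// Named addressing: each request sees the visits recorded before it.
const named = [1, 2, 3].map(() => visit(ns, ns.idFromName("user-123")));

// Unique IDs: every request starts from empty state.
const unique = [1, 2, 3].map(() => visit(ns, ns.newUniqueId()));
```

In a real Worker the equivalent of the named path is getAgentByName(env.MyAgent, "user-123"); the unique path is the one to avoid unless a throwaway instance is genuinely what you want.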
A minimal agent and its config look like this:
import { Agent, routeAgentRequest, callable } from "agents";

type State = { count: number };

export class CounterAgent extends Agent<Env, State> {
  // Default state for a brand-new instance.
  initialState: State = { count: 0 };

  @callable()
  increment() {
    this.setState({ count: this.state.count + 1 });
    return this.state.count;
  }
}

export default {
  // routeAgentRequest is async, so await it before applying the 404 fallback;
  // without the await, ?? would see a Promise and never fall through.
  async fetch(req: Request, env: Env) {
    return (await routeAgentRequest(req, env)) ?? new Response("Not found", { status: 404 });
  },
};

// wrangler.jsonc
{
  "compatibility_flags": ["nodejs_compat"],
  "durable_objects": { "bindings": [{ "name": "CounterAgent", "class_name": "CounterAgent" }] },
  "migrations": [{ "tag": "v1", "new_sqlite_classes": ["CounterAgent"] }]
}

Two configuration details trip up newcomers. First, an agent needs both a Durable Object binding and a SQLite migration (new_sqlite_classes). Second, the SDK uses standard decorators, so turning on TypeScript’s legacy experimentalDecorators silently breaks @callable at runtime.
How per-agent state actually works
The agent has two distinct ways to hold state, and choosing the right one is most of the skill.
The first is this.setState(). You pass it a JSON-serializable object; the SDK persists it to the agent’s SQLite storage and broadcasts it to every connected WebSocket client, then fires the onStateChanged hook. Set initialState for the default a brand-new agent starts with. There’s a validation hook, validateStateChange(), that runs before the write and aborts it if you throw — that’s where input validation belongs, not in onStateChanged, which is a notification only. The docs describe this state as persistent, synchronized, type-safe, and immediately consistent — you read your own writes.
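As an illustration of the kind of check that belongs in the validation hook, here is a small sketch of a validator for a counter-shaped state. The CounterState type and both function names are assumptions made for this example; the only behavior taken from the text above is that throwing inside validateStateChange() aborts the write.

```typescript
type CounterState = { count: number };

// Throws when the proposed state is unacceptable; returns normally otherwise.
// Intended to be called from validateStateChange(), where a throw aborts the write.
function assertValidCounterState(next: unknown): asserts next is CounterState {
  if (typeof next !== "object" || next === null) {
    throw new Error("state must be an object");
  }
  const count = (next as { count?: unknown }).count;
  if (typeof count !== "number" || !Number.isInteger(count) || count < 0) {
    throw new Error("count must be a non-negative integer");
  }
}

// Convenience wrapper for callers that prefer a boolean to an exception.
function isValidCounterState(next: unknown): boolean {
  try {
    assertValidCounterState(next);
    return true;
  } catch {
    return false;
  }
}
```

Keeping the rule in a plain function means it can be unit-tested without spinning up an agent, and the hook itself stays a one-line call.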
The second is this.sql, the agent’s embedded SQLite database, queried with tagged template literals. Because it runs in the same thread as the agent, access is effectively zero-latency — there’s no round-trip across a region or a continent to reach your own data. You can pass a result type for inference, but there’s no runtime validation, so validate untrusted shapes yourself.
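Because the result type passed to this.sql is only a compile-time hint, one option is a small runtime guard over whatever comes back. The sketch below assumes a hypothetical messages table with id, role, body, and created_at columns; the MessageRow type and both helper names are invented for illustration.

```typescript
// Hypothetical row shape for an assumed "messages" table.
type MessageRow = { id: string; role: string; body: string; created_at: number };

// Runtime check that an unknown value really has the MessageRow shape.
function isMessageRow(row: unknown): row is MessageRow {
  if (typeof row !== "object" || row === null) return false;
  const r = row as Record<string, unknown>;
  return (
    typeof r["id"] === "string" &&
    typeof r["role"] === "string" &&
    typeof r["body"] === "string" &&
    typeof r["created_at"] === "number"
  );
}

// Split query results into rows that passed the check and a count of those that did not.
function checkRows(rows: unknown[]): { valid: MessageRow[]; rejected: number } {
  const valid = rows.filter(isMessageRow);
  return { valid, rejected: rows.length - valid.length };
}

// Inside an agent, the input might come from a query such as:
//   const rows = this.sql<MessageRow>`SELECT id, role, body, created_at FROM messages`;
//   const { valid, rejected } = checkRows(rows);
const checked = checkRows([
  { id: "a", role: "user", body: "hi", created_at: 1 },
  { id: 7, role: "user", body: "wrong id type", created_at: 2 },
  null,
]);
```

Whether to drop, log, or throw on rejected rows is a design choice; counting them at least makes schema drift visible.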
When to use which:
| Use this.setState() for | Use this.sql for |
|---|---|
| UI state and live counters | Historical records and logs |
| Active session / config data | Large collections and relationships |
| Anything you want pushed to clients | Anything you need to query or filter |
The rule that prevents the most pain: keep state small. Every setState call is broadcast to all connected clients, so storing a growing message array in state means re-sending the whole array on every change. Keep state light — a messageCount, a lastMessageId — and put the bulk in SQL.
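A rough sketch of that split: the full history grows in one place while the broadcast state stays a fixed-size summary. The types and helper names here are assumptions for illustration, an array stands in for the SQL table, and JSON length is only a crude proxy for what a broadcast would cost.

```typescript
type StoredMessage = { id: string; body: string };
type ChatSummary = { messageCount: number; lastMessageId: string | null };

const emptySummary: ChatSummary = { messageCount: 0, lastMessageId: null };

// Fold one new message into the summary without touching the full history.
function withMessage(summary: ChatSummary, message: StoredMessage): ChatSummary {
  return { messageCount: summary.messageCount + 1, lastMessageId: message.id };
}

// Crude proxy for broadcast cost: the length of the JSON payload.
function payloadSize(value: unknown): number {
  return JSON.stringify(value).length;
}

// Simulate a conversation: the log grows, the summary does not.
const log: StoredMessage[] = [];
let summary = emptySummary;
for (let i = 1; i <= 50; i++) {
  const message: StoredMessage = { id: `m${i}`, body: "hello ".repeat(20) };
  log.push(message); // in an agent, an INSERT through this.sql
  summary = withMessage(summary, message); // in an agent, the value passed to this.setState()
}

const summaryBytes = payloadSize(summary);
const logBytes = payloadSize(log);
```

After fifty messages the summary is still a few dozen characters of JSON, while the log payload is thousands and keeps growing with every message.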
On the client, useAgent() (React) or AgentClient (vanilla JS) connects over a WebSocket and exposes the agent’s state as a reactive value that updates on changes from either side; chat agents get useAgentChat, which persists and resumes message streams automatically.
What makes all of this correct is the substrate. A Durable Object is single-threaded, and the runtime’s input and output gates mean a request can’t interleave with another mid-update and clients never see writes that haven’t been persisted — the classic read-modify-write race simply can’t happen on an agent’s own state. We won’t re-derive Durable Object internals here; our Durable Objects deep dive covers the gates and the actor model in full.
What happens when an agent goes idle
An agent’s lifecycle is: wake on an event, run onStart, handle work, go idle, hibernate, and — if the machine is reclaimed or you redeploy — get evicted from memory entirely. After roughly 70 to 140 seconds with no requests, messages, or alarms, the instance is evicted from memory. None of that loses your data, because the things that matter are durable: state, SQL rows, scheduled tasks, and per-connection state all survive eviction. What does not survive is anything in memory — class fields, setTimeout timers, open fetch calls, local closures. The mental model to internalize: persist anything you can’t afford to lose; treat in-memory values as a cache.
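The “in-memory values are a cache” rule can be sketched as a write-through cache over durable storage. DurableStore and MapStore below are hypothetical stand-ins for the agent’s state or a SQL row, and eviction is simulated by constructing a fresh object over the same store, which is roughly what a re-run constructor looks like from the code’s point of view.

```typescript
// Hypothetical stand-in for durable storage (agent state or a SQL row).
interface DurableStore {
  read(key: string): string | undefined;
  write(key: string, value: string): void;
}

class MapStore implements DurableStore {
  private data = new Map<string, string>();
  read(key: string): string | undefined {
    return this.data.get(key);
  }
  write(key: string, value: string): void {
    this.data.set(key, value);
  }
}

class ThemeCache {
  private store: DurableStore;
  // In-memory only: gone after eviction, so never the source of truth.
  private cached: string | undefined = undefined;
  loads = 0; // how many times we had to go back to durable storage

  constructor(store: DurableStore) {
    this.store = store;
  }

  get(): string | undefined {
    if (this.cached === undefined) {
      this.cached = this.store.read("theme");
      this.loads += 1;
    }
    return this.cached;
  }

  set(value: string): void {
    this.store.write("theme", value); // persist first
    this.cached = value; // then update the cache
  }
}

const store = new MapStore();
const beforeEviction = new ThemeCache(store);
beforeEviction.set("dark");

// Simulated eviction: a new object, empty fields, same durable store.
const afterEviction = new ThemeCache(store);
const recovered = afterEviction.get();
afterEviction.get(); // second read is served from memory
```

The point of the ordering in set() is that a crash between the two lines loses only the cache entry, never the persisted value.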
This is also where the cost story lives. Using the WebSocket Hibernation API, an agent can keep thousands of long-lived client connections open while being evicted from memory, and duration charges don’t accrue while it’s hibernated. That’s what makes one-instance-per-user economical at scale — the opposite of paying for an always-on server per session.
How an agent wakes itself up
Agents don’t only react to incoming requests; they can schedule their own future work with this.schedule(), which accepts a delay in seconds, a specific Date, or a cron string and calls back a named method with a payload. There’s a scheduleEvery() for sub-minute recurring work, plus helpers to list and cancel schedules. Under the hood this rides on Durable Object alarms — and because a Durable Object allows only one alarm at a time, the SDK multiplexes all your schedules through a single alarm backed by an internal SQLite table. A scheduled task wakes a hibernated agent and can do anything a request could: read state, run SQL, call tools, send email. That’s how you build follow-ups, reminders, polling loops, and recurring digests without a separate cron service.
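The single-alarm multiplexing idea can be pictured with a toy model. This is not the SDK’s implementation: ScheduleTable and its method names are invented for illustration, an array stands in for the internal SQLite table, and times are plain numbers rather than real timestamps. The one alarm is always set to the earliest pending task, and when it fires, everything due is drained in time order.

```typescript
type ScheduledTask = { id: number; runAt: number; callback: string };

class ScheduleTable {
  private tasks: ScheduledTask[] = [];
  private nextId = 1;

  // Register a task and return its ID so it can be cancelled later.
  add(runAt: number, callback: string): number {
    const id = this.nextId++;
    this.tasks.push({ id, runAt, callback });
    return id;
  }

  // Remove a task; returns false if the ID was not pending.
  cancel(id: number): boolean {
    const before = this.tasks.length;
    this.tasks = this.tasks.filter((task) => task.id !== id);
    return this.tasks.length < before;
  }

  // When the single alarm should fire next, or null if nothing is pending.
  nextAlarm(): number | null {
    if (this.tasks.length === 0) return null;
    return Math.min(...this.tasks.map((task) => task.runAt));
  }

  // Called when the alarm fires: remove and return every due task, earliest first.
  takeDue(now: number): ScheduledTask[] {
    const due = this.tasks.filter((task) => task.runAt <= now);
    this.tasks = this.tasks.filter((task) => task.runAt > now);
    return due.sort((a, b) => a.runAt - b.runAt);
  }
}

const table = new ScheduleTable();
const digestId = table.add(300, "dailyDigest");
table.add(100, "sendReminder");
table.add(200, "pollInbox");

const firstAlarm = table.nextAlarm(); // earliest of the three tasks
const fired = table.takeDue(200).map((task) => task.callback);
const secondAlarm = table.nextAlarm(); // only dailyDigest remains
table.cancel(digestId);
const finalAlarm = table.nextAlarm(); // nothing left to wake for
```

After each drain or cancel the alarm is recomputed from whatever remains, which is why any number of schedules can share one alarm slot.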
// in onStart(): wake every morning at 9:00 UTC
await this.schedule("0 9 * * *", "dailyDigest", {});
What about agents that run for minutes or hours?
Short tasks fit comfortably inside a single activation. Longer ones used to require workarounds, but during Cloudflare’s Agents Week in April 2026 long-running sessions shipped natively into the SDK — enough, in Cloudflare’s own internal use, for an agent to clone a large repository, run a full test suite, iterate on failures, and open a merge request in one continuous session. Two newer mechanisms support that, and both are worth flagging as not-yet-stable: keepAlive(), added in March 2026, is an experimental heartbeat that holds an agent active for minutes-long work, and Project Think — announced in April 2026 — adds durable execution primitives like fibers (runFiber, stash, and an onFiberRecovered hook for resuming after eviction). Cloudflare is explicit that Project Think is in preview and its APIs may change. Build on it as a bet on direction, not a foundation.
For multi-step work that should fail and retry one step at a time rather than run as one long session, the right tool is Workflows, not a long-lived agent — that’s the subject of durable execution for agents with Workflows.
Why per-agent durable state is the foundation of a production agent
A production agent has to remember — conversation history, task progress, per-user context, prior tool results — across requests that might be seconds or weeks apart, and across the deploys, restarts, and evictions that happen to every real system. Cloudflare makes that memory the default rather than an add-on, and the consequences compound:
State survives restarts, deploys, and hibernation, with nothing to externalize. It’s colocated with compute, so reads and writes don’t pay a network round-trip the way a stateless function hitting a regional database does. And one-instance-per-entity gives you isolation that buys three things at once: correctness (a single-writer actor needs no distributed locks), multi-tenancy (one tenant’s agent can’t read or corrupt another’s), and blast-radius containment (a failure stays inside one instance). The stateless-function-plus-external-store pattern — a function, a Redis or DynamoDB session store, a separate scheduler, and your own concurrency control — has to assemble and operate all of that. The Durable Object collapses it into one addressable entity. We keep the full compute-and-cost comparison in the Workers-versus-Lambda cost analysis rather than repeat it here.
The honest trade-offs
This model is the right default for per-entity agents, and it is genuinely the wrong default for a few things. Worth knowing before you commit:
- Single-writer is a ceiling as well as a guarantee. One instance is single-threaded and tops out around a thousand requests per second. A single “global” agent is an anti-pattern; high write-fan-out or broadcast-heavy workloads have to shard across many instances. Perfect for one-agent-per-conversation; wrong for one shared high-throughput counter.
- Cold start and placement. A hibernated or evicted agent re-runs its constructor on the next event, and a Durable Object lives in one location — so a globally distributed audience sees latency to wherever the instance was first created. In-memory caches vanish on eviction.
- SQLite has limits. Each agent gets 10 GB of SQLite, with a 2 MB ceiling per row or value. An agent accumulating unbounded history needs pruning, or offload to D1, R2, or a managed memory service.
- Vendor concentration. The agent-as-Durable-Object model has no drop-in equivalent on AWS or GCP today; porting means reintroducing the external-store pattern you avoided. That’s a real lock-in cost to weigh against the operational simplicity it buys.
- The SDK is pre-1.0. It’s a fast-moving 0.x release that isn’t accepting external pull requests while the surface stabilizes, and several primitives are experimental or preview. Pin your versions and expect monthly change.
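For the sharding point in the first trade-off, one common approach is to hash a key to one of a fixed number of instance names. The sketch below is an assumption-laden illustration: the counter-shard- prefix and both function names are invented, and the hash is an FNV-1a-style mix over UTF-16 code units chosen only because it is short and deterministic.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; deterministic across runs.
function hashKey(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a key to one of shardCount instance names.
function shardNameFor(key: string, shardCount: number): string {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  return `counter-shard-${hashKey(key) % shardCount}`;
}

// The same key always routes to the same shard.
const shardA = shardNameFor("user-123", 8);
const shardB = shardNameFor("user-123", 8);
```

The resulting name would then be passed to getAgentByName so each shard is its own single-writer instance. The cost is on the read side: a global total means querying every shard and summing, and changing the shard count later remaps keys.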
When to graduate beyond built-in state
Built-in state and sql cover most agents. When you need long-term memory that’s retrieved by relevance rather than queried by key, or retrieval over a corpus of your own documents, that’s the job of Cloudflare’s managed memory and search primitives — covered in agent memory and grounding. Reach for them when an agent’s history outgrows what you want to hand-roll in SQL; until then, the Durable Object’s own state is the simpler, faster default.
Working with Truvisory
If you’d rather have a stateful, production-grade agent built and shipped than assemble this yourself, this is the work we do: senior-engineer-led, fixed-scope agent systems on Cloudflare, with the durable-state foundation done right from day one. See how we deliver agent systems, or read the pillar guide to the full stack.