
How to Build and Host a Remote MCP Server on Cloudflare (McpAgent, OAuth, Portals, Code Mode)

Tony Adams 10 min read

A toy MCP server runs on your laptop and talks to one client over stdio. A production MCP server is the opposite: a remote, authenticated, stateful service that any client — Claude, Cursor, ChatGPT, your own agent — can reach over the internet, sign into, and call safely. Getting from the toy to the production version comes down to four things: where the server runs, how it holds state per user, how it authenticates the caller, and how you keep its tool surface from blowing up your token budget. On Cloudflare, the answer to all four is one object. Your MCP server is an McpAgent — a Durable Object served from a Worker over Streamable HTTP — that you secure with OAuth, optionally put behind a governance portal, and slim with Code Mode. This is how you build and host one.

This is the tool-layer deep dive behind our guide to building AI agents on Cloudflare, and the companion to the agent-as-Durable-Object spoke — because an MCP server on Cloudflare is a Durable Object — and to durable execution with Workflows, since workflows are a common caller of MCP tools. For the architecture-level case for designing around MCP in the first place, see why we build MCP-first.

What MCP is, and why production means a remote server

The Model Context Protocol, introduced by Anthropic in late 2024, is the open standard that lets an AI agent (the client) call external tools and data (the server) through one universal interface. There are two kinds of server. A local server runs on the user’s machine and speaks stdio — fine for a single developer. A remote server is reachable over the internet, requires the caller to sign in and grant permission, and works from web and mobile clients. Production is remote: that’s how you serve a tool to more than one person, govern who can use it, and update it without redeploying anything on the client side.

Clients connect to a remote server directly when they support it — Claude, Cursor, ChatGPT’s developer mode, the MCP Inspector, and Cloudflare’s AI Playground all do. Clients that still only speak local stdio reach a remote server through a small adapter, mcp-remote.

The McpAgent: your server is a Durable Object

On Cloudflare you build a remote server by extending the McpAgent class. Each client session is backed by its own Durable Object with an embedded SQLite database, which is what makes sessions stateful, and hibernation is on by default so an idle session consumes no compute. The state mechanics — setState, the embedded SQL, hibernation timing — are the subject of the agent-as-Durable-Object spoke and aren’t re-explained here; what matters for an MCP server is that you get durable per-session state for free. One thing to know going in: a session maps to one instance, and when a client reconnects, a new session starts and its in-memory state resets.

You register tools on a standard McpServer from the official MCP SDK, with Zod schemas for inputs:

import { McpAgent } from "agents/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

export class MyMCP extends McpAgent {
  server = new McpServer({ name: "Demo", version: "1.0.0" });

  async init() {
    this.server.tool(
      "add",
      "Add two numbers",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => ({ content: [{ type: "text", text: String(a + b) }] })
    );
  }
}

export default MyMCP.serve("/mcp");

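That last line is the whole server export. The wrangler config needs a Durable Object binding and a SQLite migration. Unless you pass a different binding name in the options to serve(), it looks the Durable Object up under the binding name MCP_OBJECT, so the binding is named to match:

{
  "durable_objects": { "bindings": [{ "name": "MCP_OBJECT", "class_name": "MyMCP" }] },
  "migrations": [{ "tag": "v1", "new_sqlite_classes": ["MyMCP"] }]
}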
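To make the tool contract concrete, here is a minimal sketch of what travels over the wire when a client calls the add tool. The request follows MCP's JSON-RPC tools/call method. The handler is the same logic as the arrow function registered above, pulled out as a plain function so the sketch runs without the SDK; the names addHandler and respond are illustrative and not part of any package.

```typescript
// Shape of a tool result, matching the object the "add" handler returns.
type ToolResult = { content: { type: "text"; text: string }[] };

// Same logic as the registered handler, extracted so it runs without the SDK.
// `addHandler` is an illustrative name, not an SDK export.
async function addHandler({ a, b }: { a: number; b: number }): Promise<ToolResult> {
  return { content: [{ type: "text", text: String(a + b) }] };
}

// What a client sends to invoke the tool: a JSON-RPC 2.0 "tools/call" request.
const callRequest = {
  jsonrpc: "2.0" as const,
  id: 1,
  method: "tools/call" as const,
  params: { name: "add", arguments: { a: 2, b: 3 } },
};

// What comes back: the handler's result wrapped in a JSON-RPC response
// that echoes the request id. `respond` stands in for the server's dispatch.
async function respond(req: typeof callRequest) {
  const result = await addHandler(req.params.arguments);
  return { jsonrpc: "2.0" as const, id: req.id, result };
}
```

In the real server the SDK does the dispatch and validates the arguments against the Zod schema before the handler runs; the sketch only shows the message shapes on either side of it.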

Which transport do you use?

This is the question with the most churn, so it’s worth being precise. The MCP spec defines stdio for local servers and Streamable HTTP for remote ones. Streamable HTTP was introduced in the March 2025 spec revision (2025-03-26) and carried forward as the standard remote transport in the June revision (2025-06-18). The older HTTP-plus-SSE transport is deprecated: you’ll still see it, but it’s not what you build on now.

On Cloudflare that maps to a small set of handlers. McpAgent.serve("/mcp") serves Streamable HTTP automatically and is the default for a stateful server. For a stateless server, createMcpHandler(server) wraps a server you create fresh per request. McpAgent.serveSSE() exists purely for backward compatibility with legacy SSE-only clients. And for internal agent-to-server calls that never leave Cloudflare, there’s an RPC transport that skips HTTP and auth entirely. Re-verify the exact handler names before you ship — Cloudflare reorganized the stateful path recently, and this is the part of the API most likely to have moved.
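For a sense of what Streamable HTTP looks like from the client side, the sketch below builds the first message of a session: a JSON-RPC initialize request POSTed to the single MCP endpoint. The HttpRequestInit interface and the buildInitialize helper are illustrative names introduced here, and the client name and version are placeholders. The Accept header lists both JSON and event-stream because the transport lets the server answer with either.

```typescript
// Minimal description of an HTTP request, so the sketch needs no runtime types.
// This interface is illustrative, not part of any SDK.
interface HttpRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the first message of a Streamable HTTP session: a JSON-RPC
// "initialize" request sent to the server's single MCP endpoint.
function buildInitialize(clientName: string, protocolVersion: string): HttpRequestInit {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The client advertises both response styles; the server picks one.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion,
        capabilities: {},
        clientInfo: { name: clientName, version: "0.0.1" }, // placeholder values
      },
    }),
  };
}

const initRequest = buildInitialize("example-client", "2025-03-26");
```

The resulting object can be passed straight to fetch along with your server's /mcp URL. In practice an MCP client library does this for you; the point is only that the remote transport is ordinary HTTP carrying JSON-RPC.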

Locking it down: OAuth and identity

A public tool is fine for something like a calculator. The moment a tool touches user data, it needs auth, and MCP standardizes on a subset of OAuth 2.1. Cloudflare’s workers-oauth-provider turns your Worker into a full OAuth 2.1 provider with PKCE, so you don’t hand-roll the protocol. It supports four patterns, and you pick by how you want users to sign in: Cloudflare Access as the identity provider; a third-party provider like GitHub or Google where your Worker issues its own bound token; an auth-as-a-service provider like Stytch, Auth0, or WorkOS; or your Worker handling the entire flow itself.

The provider wraps your server:

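import { OAuthProvider } from "@cloudflare/workers-oauth-provider";

// MyMCP is the McpAgent class defined earlier; AuthHandler is your own
// sign-in handler, defined elsewhere in the Worker.
export default new OAuthProvider({
  // Routes that require a valid access token before the request reaches the server
  apiHandlers: { "/mcp": MyMCP.serve("/mcp") },
  // Paths for the OAuth authorization, token, and client registration endpoints
  authorizeEndpoint: "/authorize",
  tokenEndpoint: "/token",
  clientRegistrationEndpoint: "/register",
  // Handles everything else, including sign-in: GitHub, Google, Access, or your own
  defaultHandler: AuthHandler,
});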
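The PKCE half of that flow is small enough to sketch. This is not the library's internal code, only an illustration of the S256 method from RFC 7636 using Node's built-in crypto module: the client invents a random verifier, sends its SHA-256 hash as the challenge, and later proves possession by presenting the verifier itself. The function names are made up for this example.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Client side: create a high-entropy code verifier.
// 32 random bytes encode to 43 base64url characters.
function createVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// Client side: derive the S256 challenge sent with the authorization request.
function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Server side: at the token endpoint, recompute the challenge from the
// presented verifier and compare it with the one stored during authorization.
// A production check would use a constant-time comparison.
function verifyPkce(verifier: string, storedChallenge: string): boolean {
  return challengeFor(verifier) === storedChallenge;
}
```

Because only the hash travels with the authorization request, an attacker who intercepts the authorization code cannot redeem it without the original verifier.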

Two details matter for production. First, the authenticated user reaches your tools through this.props — so a handler can read this.props.claims.email and act as that user. Second, you can gate tools two ways: check permissions inside a handler and return a denial the model can explain, or conditionally register the tool so the model never sees it at all. On token storage, the provider keeps access tokens, refresh tokens, and client secrets only as SHA-256 hashes and supports dynamic client registration, which aligns it with the MCP authorization spec’s RFC requirements.
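The two gating styles can be sketched side by side. Everything here is a stand-in so the example runs without the SDK: ToolRegistry plays the role of the MCP server, the permissions array on Props is a hypothetical field (the real shape of this.props depends on what your auth handler puts there), and the tool and permission names are invented.

```typescript
// Hypothetical shape of the authenticated props; yours depends on your auth handler.
interface Props {
  claims: { email: string };
  permissions: string[];
}

type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };
type Handler = (props: Props) => ToolResult;

// Stand-in for the MCP server's tool registry, not the SDK's API.
class ToolRegistry {
  private tools = new Map<string, Handler>();
  register(name: string, handler: Handler): void {
    this.tools.set(name, handler);
  }
  list(): string[] {
    return Array.from(this.tools.keys());
  }
  call(name: string, props: Props): ToolResult {
    const handler = this.tools.get(name);
    if (!handler) throw new Error(`Unknown tool: ${name}`);
    return handler(props);
  }
}

function buildTools(props: Props): ToolRegistry {
  const registry = new ToolRegistry();

  // Style 1: always register, check inside the handler, and return a denial
  // the model can explain to the user.
  registry.register("read_report", (p) =>
    p.permissions.includes("reports:read")
      ? { content: [{ type: "text", text: `Report for ${p.claims.email}` }] }
      : { content: [{ type: "text", text: "Denied: missing reports:read" }], isError: true }
  );

  // Style 2: register only for permitted users, so others never see the tool.
  if (props.permissions.includes("reports:delete")) {
    registry.register("delete_report", () => ({
      content: [{ type: "text", text: "Deleted" }],
    }));
  }
  return registry;
}
```

The first style suits tools a user might reasonably ask about and be told why they can't use; the second keeps sensitive tools out of the model's context entirely, which also saves tokens.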

If your tools front internal, Access-protected apps, Managed OAuth for Cloudflare Access makes those apps agent-ready without insecure shared service accounts, using standards-based discovery so a non-browser client gets pointed at the right OAuth endpoints automatically. It’s on by default for new MCP portals. Worth flagging: the documentation still labels it beta, so confirm its status before you design around it.

Governing many servers at once: MCP Server Portals

One server is easy to reason about. An organization with a dozen — GitHub, Jira, Sentry, internal services — is where governance gets hard, and that’s the job of MCP Server Portals. A portal aggregates multiple MCP servers, Cloudflare-hosted or third-party, behind one endpoint and one Cloudflare Access flow, with curated per-tool exposure, automatic namespacing, centralized logging, and DLP guardrails. Access enforces which users and agents can see which servers and tools, using your corporate identity provider plus policies like MFA, device posture, or geography.

Two honest caveats belong here. Portal log export through Logpush is an Enterprise-plan feature, not something every plan gets. And unless an underlying server enforces its own authorization, a blocked user can still reach it through its direct URL — the portal governs the front door, not every window. Cloudflare flags this and says enforcement mechanisms are planned. Separately, Cloudflare offers Shadow MCP detection: a Gateway ruleset that flags unsanctioned MCP traffic by inspecting request bodies for MCP method calls. Read that for what it is — detection, not prevention. It tells you when someone is using an unapproved server; it doesn’t inherently stop them.
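To see why body inspection amounts to detection rather than prevention, consider a rough sketch of the idea. This is not Cloudflare's ruleset, only an illustration of the general technique: parse the request body and check whether it is a JSON-RPC 2.0 message whose method is one MCP defines. The method list is a small subset and the function names are invented for the example.

```typescript
// A subset of the JSON-RPC method names MCP defines. Illustrative, not exhaustive.
const MCP_METHODS = new Set([
  "initialize",
  "tools/list",
  "tools/call",
  "resources/list",
  "resources/read",
  "prompts/list",
  "prompts/get",
]);

// True when a parsed value looks like a single MCP JSON-RPC request.
function isMcpMessage(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const msg = value as { jsonrpc?: unknown; method?: unknown };
  return msg.jsonrpc === "2.0" && typeof msg.method === "string" && MCP_METHODS.has(msg.method);
}

// Inspect a raw HTTP body and flag it if any message in it looks like MCP.
// Accepts either a single message or a JSON array of messages.
function looksLikeMcpTraffic(body: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return false; // not JSON, so not an MCP call
  }
  const messages = Array.isArray(parsed) ? parsed : [parsed];
  return messages.some(isMcpMessage);
}
```

A check like this can only flag a request it has been allowed to read, which is why it surfaces unsanctioned use but needs a separate policy to block it.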

Portals are in open beta and have been since they were announced in August 2025; I haven’t found a general-availability announcement, so treat them as beta and plan accordingly.

Code Mode: an entire API in about a thousand tokens

The problem Code Mode solves is real and gets worse as you add tools. Exposing every tool’s schema to the model burns context: Cloudflare’s own API server covers more than 2,500 endpoints, and handing the model all those schemas costs north of 1.17 million tokens. Code Mode flips the approach — instead of listing tools, the server exposes two (search and execute), the model writes TypeScript against a typed SDK, and that code runs in a sandboxed isolate. The same 2,500-endpoint API drops to roughly a thousand tokens, about a 99.9% reduction.

~1,000 tokens for Cloudflare's entire 2,500-endpoint API under Code Mode, versus ~1.17M for a flat tool list, a ~99.9% reduction (source: Cloudflare Code Mode blog)

On Cloudflare’s internal portal the effect held at scale: a 52-tool surface that cost around 9,400 tokens fell to about 600 with Code Mode, and it stays flat as more servers connect. At the portal level, Code Mode is on by default, collapsing all the upstream tools into a search-and-execute pair. The standalone @cloudflare/codemode SDK is open-source but explicitly experimental — breaking changes are possible, so use it cautiously in production. Its sandbox is locked down by default: outbound network access is blocked, and tool calls are dispatched over Workers RPC rather than the open internet. One current gap worth knowing: Code Mode doesn’t yet support tool-approval flows, so any tool that needs a human approval step should go through standard tool calling instead.
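The percentages follow directly from the token counts quoted above, and the arithmetic is worth checking once. The helper name below is illustrative, and the inputs are the article's rounded figures rather than exact measurements.

```typescript
// Percentage of tokens saved when a tool surface shrinks from `before` to `after`.
function reductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

// Cloudflare's full API: roughly 1.17M tokens as a flat tool list, about 1,000 under Code Mode.
const apiReduction = reductionPercent(1_170_000, 1_000);

// The 52-tool internal portal: roughly 9,400 tokens down to about 600.
const portalReduction = reductionPercent(9_400, 600);

console.log(apiReduction.toFixed(1), portalReduction.toFixed(1));
```

The first figure rounds to 99.9% and the second to 93.6%. The portal saving is smaller in relative terms because the starting surface was far smaller; what carries over is that the Code Mode cost stays roughly fixed while the flat-list cost grows with every tool.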

Deploying it, and what it costs

The fast path is a template: npm create cloudflare@latest with the remote-mcp-authless starter for a public server, or remote-mcp-github-oauth for an authenticated one, then wrangler deploy. You test with the MCP Inspector or Cloudflare’s AI Playground — you can’t just open the /mcp URL in a browser — and you connect Claude Desktop through mcp-remote.
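For the Claude Desktop step, the connection is an entry in the client's JSON config that launches mcp-remote as a local stdio process pointed at your remote URL. The sketch below builds that entry as a TypeScript object so it can be checked and serialized. The server name and URL are placeholders, and the helper name is invented for the example.

```typescript
// Build the "mcpServers" entry that tells a stdio-only client to launch
// mcp-remote, which bridges local stdio to the remote server's URL.
function mcpRemoteConfig(name: string, url: string) {
  return {
    mcpServers: {
      [name]: {
        command: "npx",
        args: ["mcp-remote", url],
      },
    },
  };
}

// Placeholder name and URL; substitute your deployed Worker's /mcp endpoint.
const desktopConfig = mcpRemoteConfig("demo", "https://demo.example.workers.dev/mcp");

// Serialize for pasting into the client's config file.
const desktopConfigJson = JSON.stringify(desktopConfig, null, 2);
```

If your server sits behind OAuth, mcp-remote opens a browser window for the sign-in on first connect, so the config itself never needs to hold a token.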

On cost, an MCP server bills on the underlying Workers and Durable Objects model, so the detailed math belongs to the platform’s pricing rather than this article. One pitfall is worth calling out, though: because each session is a Durable Object with its own storage, a server with very high session counts can run up storage costs faster than you’d expect. One developer reported on the project’s GitHub that they spent roughly $676 in a month against an expected $60, and traced the overrun to per-session storage. Treat it as a single report, but audit your Durable Object storage if you’re running a high-traffic server, and note that storage billing for SQLite-backed Durable Objects began in January 2026.

As of this writing, the maturity picture is mixed: the McpAgent and OAuth core is stable (with an API that still shifts), Streamable HTTP is the current standard and SSE is deprecated, Portals are open beta, Managed OAuth is beta, and the Code Mode SDK is experimental. Stamp any version of this for the month you publish it.

Why Cloudflare is a strong production MCP host

The pillar names tool-layer and tool-security as two of the seven things a production agent needs. A remote MCP server is exactly where those requirements land, and Cloudflare answers each with a first-party piece:

What a production MCP server needs, and the Cloudflare answer to each

A production MCP server needs… | On Cloudflare
Persistent per-session state | McpAgent = Durable Object + SQLite, idle-free hibernation
OAuth done right | workers-oauth-provider (OAuth 2.1 + PKCE), Access OAuth, Managed OAuth
Global low-latency hosting | Workers run in 335+ cities, within ~50ms of most of the world’s population
Governance and observability | Portals (curation, DLP, logging), Gateway Shadow-MCP detection
Token-cost control at scale | Code Mode keeps the tool surface roughly constant regardless of API size
The ability to act | Tools run colocated with Workers AI, Workflows, Browser Run, and Sandboxes

That last row is the one people underrate. A tool that has to call out to a separate cloud to do real work pays a network hop and an operational seam every time; on Cloudflare the tool, the model, the durable workflow, and the browser or sandbox it drives all run in the same place.

The honest trade-offs

  • MCP is young and the spec moves. The transport changed in 2025 and the auth spec is still evolving. Expect API drift; pin versions and date your assumptions.
  • Several headline features are beta or experimental. Portals (open beta), Managed OAuth for Access (beta), the Code Mode SDK (experimental), and the Dynamic Worker runtime underneath Code Mode (beta) are not all GA. Build on the stable core and treat the rest as roadmap.
  • Security is reduced, not eliminated. Code Mode’s sandbox and scoped OAuth lower the risk of prompt injection, confused-deputy attacks, and tool misuse, but don’t remove them. When you proxy a third-party OAuth provider you still implement your own consent and CSRF protection, and Shadow MCP detection flags problems rather than preventing them.
  • Vendor concentration. McpAgent, the OAuth provider, Portals, and Code Mode are Cloudflare-specific; moving off-platform means re-architecting the tool layer. Weigh that against how much the platform does for you.

Frequently asked

What's the difference between a local and a remote MCP server?
A local server runs on one machine and talks over stdio, which is good for a single developer. A remote server is reachable over the internet, requires sign-in and permission, and works from web and mobile clients. Production is remote; on Cloudflare that is an McpAgent served over Streamable HTTP.

Do I need OAuth, or can my server be public?
A read-only or computational tool can be public; the authless template exists for exactly that. The moment a tool reads or writes user data, add OAuth so each call runs as an authenticated user with scoped permissions. The workers-oauth-provider library handles the protocol for you.

Is Code Mode safe? Doesn't letting the model run code add risk?
The model writes code that runs in a sandboxed isolate with outbound network access blocked by default and tool calls dispatched over internal RPC, which contains it considerably. It reduces risk rather than eliminating it, and it does not yet support human-approval flows, so route approval-required actions through standard tool calling.

Is all of this production-ready?
The core (McpAgent, Streamable HTTP, the OAuth provider) is solid and is what Cloudflare and partner servers run on. Portals, Managed OAuth, and the Code Mode SDK are at beta or experimental maturity. Build on the stable pieces and adopt the newer ones deliberately.

Working with Truvisory

If you’d rather have a secured, production MCP server built and shipped than wire up OAuth, transports, and governance yourself, this is the work we do: senior-engineer-led, fixed-scope agent and tool systems on Cloudflare, with the tool layer secured from the first commit. See how we deliver agent systems, or read the pillar guide to the full stack.