Truvisory
Cloudflare · Federal

Region-pinning, audit logging, and the FedRAMP-aware edge stack.

Tony Adams 13 min read

There is a specific class of federal AI engagement that does not fit cleanly into either of the two postures most vendors take. It is not full-ATO production work on a FedRAMP High environment, which requires a multi-year compliance investment and a contract value to match. And it is not pure commercial-stack work, which a federal contracting officer cannot accept for anything touching non-public agency data or production workloads. The class I’m describing — pilots, SBIR Phase I and II builds, prototypes, modernization R&D, sandboxed evaluations of new AI capabilities — sits in the middle. The agency wants real working software. The agency wants the data handled defensibly. The agency does not want, and cannot procure, a 36-month ATO process to evaluate a six-month build.

What that class needs is a FedRAMP-aware commercial edge stack. Not authorized. Aware. Architected against the controls a future ATO would inherit, instrumented for audit, region-constrained where the data sensitivity requires it, and structured so that the eventual lift to a FedRAMP-authorized environment is an exercise in re-hosting rather than re-architecting. This is the engagement shape that the OMB acquisition memos (M-25-21 / M-25-22) implicitly favor, the shape that small-business set-aside lanes are sized for, and the shape that the Cloudflare developer platform supports natively if you architect to it deliberately.

This essay is the reference architecture I run for that engagement class. Five components, with the specific reasons each one is there. None of it is novel. All of it is precise about what the commercial Cloudflare stack does and does not do, where the FedRAMP boundary actually sits, and what an agency contracting officer should expect to see in a SOW.

What “FedRAMP-aware” actually means

Before the architecture, a definition, because the term gets abused.

FedRAMP (the Federal Risk and Authorization Management Program) is the government-wide program that standardizes security assessment, authorization, and continuous monitoring for cloud products and services used by federal agencies. A product is FedRAMP-authorized at one of three impact levels (Low, Moderate, High) only if it has completed the formal authorization process and is listed on the FedRAMP Marketplace. Cloudflare has FedRAMP-authorized offerings under the Cloudflare for Government brand. The commercial Cloudflare developer platform, the one most engineers and most product builds use, is not FedRAMP-authorized. These are different SKUs, different control environments, and different contractual surfaces.

“FedRAMP-aware” is the term I use, and that I’d argue the industry should standardize on, for a commercial deployment that is architected against the controls a Moderate or High authorization would require, even though the deployment itself is not authorized. The agency cannot use a FedRAMP-aware deployment for production handling of FISMA-controlled data. The agency can use it for pilots, evaluations, prototypes, R&D, and any workload where the data sensitivity does not require an authorized environment. The point of building the pilot FedRAMP-aware is that when the pilot succeeds and the agency wants to move it to production, the lift is to a FedRAMP-authorized environment running the same architecture — not a re-architecture from scratch.

That is the engagement class this reference architecture serves. The rest of the essay is what the architecture looks like.

Component 1: AI Gateway as the policy boundary

Every model call in the architecture goes through Cloudflare AI Gateway. Not most. Every.

AI Gateway is a policy-enforcement point that sits between your application code and the inference layer (Workers AI, OpenAI, Anthropic, or any provider you route through it). It gives you four things that matter for federal-aware deployments. First, it logs every inference request — model, prompt, response, latency, cost, status — to a durable log you control. Second, it lets you enforce rate limits, retries, model fallback, and caching at the policy layer rather than baking those policies into application code. Third, it lets you route to different providers based on policy (e.g., never send agency data to a provider that doesn’t sign a BAA-equivalent agreement). Fourth, it gives you a single chokepoint to revoke or rotate inference access if something goes sideways.

The architectural point is not that AI Gateway does anything magical. The point is that every model call goes through one named, configurable, observable hop. That is the property the federal audit posture requires, and most agentic systems do not have it by default because most agent frameworks let the agent call the model provider directly. Routing everything through AI Gateway is one config change. The audit posture it enables is what the agency contracting officer is going to ask about.
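A minimal sketch of the "one named hop" property, assuming the gateway endpoint shape `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}` (check it against current Cloudflare documentation before relying on it). The `GatewayPolicy` type, the `gatewayBaseUrl` helper, and the allow-list contents are hypothetical names introduced for illustration. The idea is that application code never constructs a provider URL itself, so a provider that policy has not approved cannot be reached by accident.

```typescript
// Sketch only: GatewayPolicy and gatewayBaseUrl are illustrative names, and
// the account and gateway identifiers are placeholders.

type Provider = "workers-ai" | "openai" | "anthropic";

interface GatewayPolicy {
  accountId: string;
  gatewayId: string;
  // Providers cleared to receive agency data (hypothetical policy input).
  approvedProviders: Provider[];
}

// Returns the gateway base URL for a provider, or throws when the provider
// is not on the approved list. Callers append the provider-specific path.
function gatewayBaseUrl(policy: GatewayPolicy, provider: Provider): string {
  if (policy.approvedProviders.indexOf(provider) === -1) {
    throw new Error('provider "' + provider + '" is not approved for this gateway');
  }
  return (
    "https://gateway.ai.cloudflare.com/v1/" +
    encodeURIComponent(policy.accountId) + "/" +
    encodeURIComponent(policy.gatewayId) + "/" +
    provider
  );
}
```

Keeping the allow-list in one policy object means that revoking a provider is a data change reviewed in one place, not a search through application code.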

The thing to be precise about: AI Gateway as deployed on the commercial Cloudflare stack is not a FedRAMP-authorized service. What it gives you is the architectural shape that a FedRAMP-authorized inference deployment would have — single policy chokepoint, comprehensive logging, provider-routing controls — implemented on the commercial stack for the engagement class that doesn’t require authorization.

Component 2: R2 as immutable audit storage

Every AI Gateway log entry, every Durable Object state change you care about auditing, every document the system processes, every human-in-the-loop decision — written to R2, with object-lock semantics enabled, in a bucket that the application’s runtime identity can write but cannot delete.

R2 is Cloudflare’s S3-compatible object storage with zero egress fees. The features that matter for the audit-storage role are: object lock for write-once-read-many semantics (preventing the application from rewriting history if it gets compromised or misconfigured), lifecycle policies for retention scheduling, bucket-level access policies separable from object-level policies, and the ability to replicate buckets across regions for durability.

The architectural pattern is straightforward but worth being explicit about. The application Worker has a binding to write to the audit bucket. The bucket policy denies deletes from the application’s identity. A separate retention-management role (held by a human operator, not the application) is the only identity that can mutate object lock periods. The audit trail is therefore append-only by construction — the application cannot tamper with its own history even if it is fully compromised. This is the property a federal audit posture requires, and it is achievable with bucket policies and IAM discipline rather than with custom infrastructure.

R2 itself is not FedRAMP-authorized on the commercial stack. The architectural pattern, however, ports directly to S3 on AWS GovCloud or to a FedRAMP-authorized object store at the lift moment, because the only thing the application code knows about R2 is the S3-compatible API. Re-hosting the audit storage is a configuration change, not a code change. This is the property that makes “FedRAMP-aware” a real promise rather than a marketing line.
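The re-hosting claim can be sketched as configuration data. This assumes the audit writer talks to storage through an S3-compatible client that takes an endpoint, a region, and a bucket name. The `HostingTarget` labels and the `auditStoreConfig` helper are hypothetical; the endpoint formats shown are the commonly documented ones for R2's S3 API and for S3 in the `us-gov-west-1` GovCloud region, and should be verified for the actual target. Retention and lock semantics can differ between stores and would need their own check during the lift.

```typescript
// Sketch only: target labels and helper name are illustrative.

type HostingTarget = "r2-commercial" | "s3-govcloud";

interface ObjectStoreConfig {
  endpoint: string;
  region: string;
  bucket: string;
}

// Selects the S3-compatible connection settings for the audit bucket.
// Application code depends only on ObjectStoreConfig, never on the target.
function auditStoreConfig(
  target: HostingTarget,
  accountId: string,
  bucket: string
): ObjectStoreConfig {
  switch (target) {
    case "r2-commercial":
      return {
        endpoint: "https://" + accountId + ".r2.cloudflarestorage.com",
        region: "auto",
        bucket: bucket,
      };
    case "s3-govcloud":
      // accountId is unused here; AWS credentials identify the account.
      return {
        endpoint: "https://s3.us-gov-west-1.amazonaws.com",
        region: "us-gov-west-1",
        bucket: bucket,
      };
  }
}
```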

Component 3: Region-pinned Workers with explicit data flow

This is the component where the language matters most, because “region-pinning on Cloudflare” means something specific that is different from how the term is used elsewhere.

Cloudflare Workers run at the edge by default: your code is deployed across Cloudflare’s network of 330+ cities, and each request is routed to the nearest location. For most commercial workloads this is the entire point. For federal-aware workloads, you sometimes need to constrain where a specific Worker executes, typically to meet data residency or jurisdictional requirements. Cloudflare provides several mechanisms for this: Worker placement modes, region restrictions for Workers deployed through Workers for Platforms, and the ability to constrain Durable Object placement via jurisdiction hints in the DO namespace configuration.

The architectural pattern I run for federal-aware deployments is this. Public-facing Workers (request ingress, static asset serving, rate limiting) run at the global edge — there is no benefit to pinning them and significant cost to it. Compute Workers that handle non-public agency data run in a constrained set of regions, typically US-only, configured explicitly in the Worker’s deployment manifest. Durable Objects holding agency data are placed in a US-only jurisdiction via the namespace configuration. The data flow is documented as part of the architecture deliverable — every component, what region it runs in, what data crosses what boundary, what jurisdiction governs each store.

The documentation is the deliverable as much as the running system, because the agency’s authorization-equivalent review is going to ask for a data-flow diagram, and the diagram has to match the running configuration. The mistake I see new federal-pilot vendors make is treating the configuration and the documentation as separate artifacts. They should be generated from the same source — typically the Wrangler config and a small documentation tool that reads it.
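One way to keep the diagram and the configuration from drifting is to derive both from a single component list. The sketch below assumes a hypothetical `ComponentSpec` shape; it is not the Wrangler schema, and a real tool would populate it by reading the Wrangler config. It produces the data-flow table and flags any component that handles agency data but is not pinned.

```typescript
// Sketch only: ComponentSpec is an illustrative shape, not a Wrangler type.

interface ComponentSpec {
  name: string;
  kind: "worker" | "durable-object" | "bucket";
  handlesAgencyData: boolean;
  // "global" means unpinned edge execution; "us" means US-only placement.
  placement: "global" | "us";
}

// Names of components that touch agency data but are not region-pinned.
function unpinnedAgencyComponents(specs: ComponentSpec[]): string[] {
  return specs
    .filter(function (s) { return s.handlesAgencyData && s.placement === "global"; })
    .map(function (s) { return s.name; });
}

// Renders the data-flow table from the same specs the deployment uses.
function dataFlowTable(specs: ComponentSpec[]): string {
  const header = "component | kind | agency data | placement";
  const rows = specs.map(function (s) {
    return [s.name, s.kind, s.handlesAgencyData ? "yes" : "no", s.placement].join(" | ");
  });
  return [header].concat(rows).join("\n");
}
```

Running `unpinnedAgencyComponents` in CI and failing the build on a non-empty result would turn the documented region constraint into an enforced one.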

The honest tradeoff: pinning Workers to specific regions reduces the global edge benefit. Latency goes up for users outside the pinned region. For most federal pilots this is acceptable because the user population is also US-based. For agencies with international user populations (State, USAID, DoD components with overseas operations) this needs explicit attention in the architecture.

Component 4: Role-based access via Cloudflare Access (Zero Trust)

Every operator-facing surface — admin UIs, dashboards, configuration endpoints, log viewers, the R2 retention-management role from Component 2 — sits behind Cloudflare Access with SSO integration to the agency’s identity provider, MFA enforcement, and role-based policies.

Cloudflare Access is part of Cloudflare One (the Zero Trust platform) and provides identity-aware proxy access to internal applications. The federal-aware deployment pattern uses Access for three layers of control. First, the agency’s existing SSO (Okta, Azure AD, agency-specific IdP) is the identity source — no separate user accounts in the application. Second, MFA is enforced at the Access layer rather than per-application, so the policy is consistent across every operator surface. Third, role-based access policies map agency users to specific application permissions — read-only auditor, retention manager, application administrator, end user — at the Access policy level, with the application reading the asserted role from a signed JWT.
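The third layer can be sketched as a pure mapping from verified claims to an application role. This assumes the Access JWT has already been signature-checked against the team's public keys (not shown here) and that the identity provider surfaces group membership in a `groups` claim, which depends on the IdP configuration. The `Role` names, the precedence order, and `resolveRole` are hypothetical.

```typescript
// Sketch only: call this after the Access JWT signature, audience, and
// expiry have been verified. Claim shape and role names are illustrative.

type Role =
  | "application-administrator"
  | "retention-manager"
  | "read-only-auditor"
  | "end-user";

interface VerifiedAccessClaims {
  email: string;
  groups: string[]; // assumed to be populated by the agency IdP
}

// Most privileged first; used to pick one role when several groups match.
const ROLE_PRECEDENCE: Role[] = [
  "application-administrator",
  "retention-manager",
  "read-only-auditor",
  "end-user",
];

// Returns the highest-precedence mapped role, or null (deny) when no group
// maps to a role. Unknown groups never grant access.
function resolveRole(
  claims: VerifiedAccessClaims,
  groupToRole: { [group: string]: Role }
): Role | null {
  let best: Role | null = null;
  for (const group of claims.groups) {
    if (!Object.prototype.hasOwnProperty.call(groupToRole, group)) continue;
    const role = groupToRole[group];
    if (best === null || ROLE_PRECEDENCE.indexOf(role) < ROLE_PRECEDENCE.indexOf(best)) {
      best = role;
    }
  }
  return best;
}
```

Default deny is the design choice that matters: a user whose groups match nothing gets `null`, not a fallback role.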

The architectural point is the same as Component 1: a single named, configurable, observable hop. Every operator action authenticates through Access, and every authentication event is logged. The agency’s security team has one place to look when they want to know who did what.

The Cloudflare One components offered through Cloudflare for Government carry FedRAMP Moderate authorization. Commercial Cloudflare One does not. The architectural pattern is portable, the SKU at the authorization moment is different, and the SOW language should be precise about which one is in scope for the pilot.

Component 5: Continuous audit emission as a first-class output

The fifth component is not infrastructure — it is a discipline that has to be designed in from day one or it is impossible to retrofit.

Every action the system takes that an auditor would want to know about emits a structured audit event at the moment of action. Model call: prompt hash, model, parameters, response hash, latency, cost, request identity. Document processing: document hash, operation, operator identity, before-state, after-state. Human-in-the-loop decision: decision identifier, operator identity, decision, reasoning text, timestamp. Permission change: subject, object, before-permission, after-permission, operator identity.
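The four event shapes listed above can be written down as a discriminated union, which is one plausible starting point for the schema deliverable. Field names, the units on latency and cost, and the `describeEvent` helper are assumptions for illustration; the field list itself follows the paragraph above.

```typescript
// Sketch only: field names and units (milliseconds, USD) are assumptions.

interface ModelCallEvent {
  type: "model_call";
  promptHash: string;
  model: string;
  parameters: { [name: string]: string | number | boolean };
  responseHash: string;
  latencyMs: number;
  costUsd: number;
  requestIdentity: string;
}

interface DocumentProcessingEvent {
  type: "document_processing";
  documentHash: string;
  operation: string;
  operatorIdentity: string;
  beforeState: string;
  afterState: string;
}

interface HumanDecisionEvent {
  type: "human_decision";
  decisionId: string;
  operatorIdentity: string;
  decision: string;
  reasoning: string;
  timestamp: string; // ISO 8601
}

interface PermissionChangeEvent {
  type: "permission_change";
  subject: string;
  object: string;
  beforePermission: string;
  afterPermission: string;
  operatorIdentity: string;
}

type AuditEvent =
  | ModelCallEvent
  | DocumentProcessingEvent
  | HumanDecisionEvent
  | PermissionChangeEvent;

// One-line summary per event. The switch is exhaustive, so adding a new
// event type without handling it here becomes a compile error.
function describeEvent(e: AuditEvent): string {
  switch (e.type) {
    case "model_call":
      return "model_call model=" + e.model + " by " + e.requestIdentity;
    case "document_processing":
      return "document_processing " + e.operation + " by " + e.operatorIdentity;
    case "human_decision":
      return "human_decision " + e.decisionId + " by " + e.operatorIdentity;
    case "permission_change":
      return "permission_change " + e.subject + " on " + e.object + " by " + e.operatorIdentity;
  }
}
```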

The events are structured JSON, written to R2 (Component 2) via the append-only path, with content-addressed identifiers so that any event can be referenced from anywhere else in the system without ambiguity. The audit log is queryable — not just stored — via a separate Worker that runs in the same region constraints as the application and exposes a read-only query interface behind Cloudflare Access.
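Content-addressed identifiers only work if the same event always serializes to the same bytes. A minimal sketch of that step, assuming events are plain JSON values: `canonicalJson` sorts object keys so key order cannot change the output, and `auditObjectKey` builds a date-partitioned object key from a digest computed elsewhere (for example, SHA-256 over the canonical string). Both function names and the `audit/YYYY/MM/DD/` key layout are hypothetical.

```typescript
// Sketch only: hashing is left to the caller; key layout is illustrative.

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serializes a JSON value with object keys in sorted order, so two
// structurally equal events always produce identical strings.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    const items: string[] = [];
    for (const item of value) {
      items.push(canonicalJson(item));
    }
    return "[" + items.join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const parts: string[] = [];
  for (const key of keys) {
    parts.push(JSON.stringify(key) + ":" + canonicalJson(value[key]));
  }
  return "{" + parts.join(",") + "}";
}

// Builds the object key for an event from its hex digest and an ISO 8601
// timestamp, e.g. "2025-01-15T10:00:00Z" -> "audit/2025/01/15/<digest>.json".
function auditObjectKey(digestHex: string, isoTimestamp: string): string {
  const datePath = isoTimestamp.slice(0, 10).split("-").join("/");
  return "audit/" + datePath + "/" + digestHex + ".json";
}
```

Because the key is derived from the content digest, a second write of the same event targets the same key, which pairs naturally with the append-only bucket from Component 2.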

The reason this is a discipline rather than infrastructure is that it requires the application to emit the events. The platform can give you durable storage and immutability and access control. The application has to decide what is worth auditing and emit the events at the right moment with the right fields. Federal-pilot SOWs should specify the audit-event schema as a deliverable, because the agency will ask for it, and retrofitting it later is dramatically harder than designing it in.

This is the component that most distinguishes a “we built an AI thing on Cloudflare” pilot from a federal-aware deployment. The first one has the architecture. The second one has the architecture and the trace evidence that the architecture is operating as designed. The agency cannot accept the first one for anything beyond the most casual evaluation. The agency can accept the second one for a real pilot with real data, because the audit trail is the substitute for the formal authorization that the engagement isn’t large enough to support.

What the lift to FedRAMP-authorized actually looks like

When a pilot succeeds and the agency wants to move it to a FedRAMP-authorized environment, what changes?

The application code does not change. The audit schema does not change. The data flow does not change. The role model does not change.

What changes is the SKU layer underneath. AI Gateway moves from commercial Cloudflare to a FedRAMP-authorized inference path — either Cloudflare for Government’s offerings (where authorized for the workload’s impact level) or a different authorized provider routed via the same gateway pattern. R2 moves to a FedRAMP-authorized object store (S3 on GovCloud, or a Cloudflare for Government offering where available) with the same access semantics. Workers move to a FedRAMP-authorized execution environment with the same region constraints. Cloudflare Access moves to Cloudflare for Government’s authorized SSO/ZTNA stack.

The lift is real work — typically a quarter of focused effort for a mid-sized pilot — but it is migration work, not architecture work. The team that built the pilot can do the lift. The agency does not need to procure a separate vendor to “do it properly” for production. The IP transfer the agency negotiated under M-25-22 is meaningful because the architecture is portable by construction.

The reason this matters is that the alternative — building the pilot directly on a FedRAMP-authorized stack from day one — is what most agencies have been asked to procure historically, and it does not work at pilot scale. The cost of the authorized environment is too high to justify against an unproven feature, the development velocity is too slow to validate the feature in time, and the procurement timeline to even start work is measured in quarters. The FedRAMP-aware commercial pilot is the practical answer to the question “how do we evaluate an AI capability without committing to a multi-year ATO program before we know if the capability is worth it?”

What goes in the SOW

The contracting officer reading the SOW for a FedRAMP-aware pilot is going to look for specific language, and the language should be precise.

Scope: the workload, the data classification, and an explicit statement that the engagement is a pilot/prototype on a FedRAMP-aware commercial stack, not a production deployment on a FedRAMP-authorized environment.

Data handling: what data the system processes, what data leaves the agency, what data is stored and where, the retention schedule, the destruction protocol at engagement end.

Audit posture: the audit event schema, the immutability guarantee, the query interface, the access control model for the audit trail.

Region constraints: which components run in which regions, with the Wrangler/deployment manifest excerpts referenced in the SOW.

Access controls: the SSO integration, the MFA enforcement, the role model, the policy language.

IP transfer: what the agency receives at end of engagement, namely the code, the deployment manifests, the audit schema, and the operator runbook.

Lift path: an explicit statement of what the lift to a FedRAMP-authorized environment would entail if the agency chooses to proceed to production.

That is what a memo-compliant federal-aware pilot SOW looks like. None of it is optional. All of it is what the OMB acquisition memos asked for, applied to the specific engagement class where commercial-stack agility is the only way to actually evaluate the capability before committing to an authorization investment.

The reference architecture, one paragraph

AI Gateway as the single policy boundary for every model call, with comprehensive logging. R2 as immutable audit storage, append-only by construction, lifecycle-managed. Workers and Durable Objects region-pinned to US jurisdictions for any component handling non-public agency data, documented in deployment manifests. Cloudflare Access as the identity boundary for every operator surface, integrated with the agency SSO. Continuous structured audit emission as a first-class application output, designed in from day one. Architecture portable to FedRAMP-authorized SKUs at the lift moment without code changes.

Same primitives the commercial side runs on. Hardened, region-constrained, audit-emitting, identity-integrated, and contractually portable. The federal-aware engagement class has a reference architecture. This is mine.