
AI Audit Logging on Cloudflare: Building a Tamper-Evident, Compliance-Grade Record of Every AI Request

Tony Adams 12 min read

Observability is for operating your AI — debugging, monitoring, optimizing. An audit trail is for proving to someone else what your AI did: every prompt, every response, which model and provider handled it, when, and which actor triggered it — captured and exported to immutable storage where even an administrator can’t alter or delete it during the retention period. The two overlap in the logs they draw on but serve different masters: one answers your engineers, the other answers a regulator, an auditor, or a court. On Cloudflare you assemble the audit trail from AI Gateway logs, Logpush, and R2 bucket locks, on FedRAMP-Moderate-authorized primitives. Two honest caveats are load-bearing, so they go up front: AI Gateway itself is not FedRAMP-authorized, and an audit log is a necessary control that supports — but does not by itself achieve — compliance with any regime.

This is the audit and governance spoke of our AI observability, cost, and evaluation cluster, and it’s the technical deep-dive behind our FedRAMP-aware federal positioning. It’s also where the credential-governance question that the routing spoke deferred finally lands.

Operating versus proving: why audit is its own discipline

The observability spoke covers the operational use of AI Gateway’s logs — reading them to debug, monitor, and tune. This spoke uses the same raw logs for a different purpose, and the difference changes the requirements. Operational logging can be lossy, short-lived, and mutable; you only need enough recent data to answer “what’s happening right now.” An audit trail has to be the opposite: complete (no silent gaps), durable (retained for years), and tamper-evident (provably unaltered, including by your own administrators). When the question shifts from “why is this slow” to “prove to this auditor that your AI never did X,” lossy and mutable stop being acceptable. Everything below follows from that shift, so I’ll route the operational mechanics back to the observability spoke and focus here on what it takes to make a record stand up as evidence.

What’s in an auditable AI record

The foundation is what AI Gateway already captures on every request: the user prompt, the model’s response, the provider and model that served it, a timestamp, the request status, token usage, cost, and a correlation ID returned on every request, including failed ones. That correlation ID is the thread that ties an external, immutable copy back to the gateway entry and to your application traces — the spine of a defensible chain of custody.

Two pieces deserve specific attention for audit. The first is actor attribution — answering “who did this.” AI Gateway lets you tag each request with custom metadata, up to five string, number, or boolean values, which become searchable fields on the record. Stamp each request with an actor ID, a team, a tenant, or a case number and your trail can answer not just what the AI did but who asked it to. The honest caveat: AI Gateway stores only the identifiers you pass — it doesn’t authenticate the end user itself, so the actor identity has to originate in your application and be trustworthy before it reaches the gateway. The metadata mechanics belong to the observability spoke; the chain-of-custody use is the point here.
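A minimal sketch of that stamping step, in Python, assuming the metadata travels as a JSON object in the cf-aig-metadata request header. The function name and the example field names (actor_id, team, case) are illustrative and not part of any Cloudflare API; the entry-count and value-type checks mirror the limits described above.

```python
import json

MAX_METADATA_ENTRIES = 5  # limit described in the text above


def build_metadata_header(metadata: dict) -> dict:
    """Return a headers dict carrying actor attribution for one request.

    Assumes the gateway reads a JSON object from the cf-aig-metadata header.
    """
    if len(metadata) > MAX_METADATA_ENTRIES:
        raise ValueError(
            f"at most {MAX_METADATA_ENTRIES} metadata entries, got {len(metadata)}"
        )
    for name, value in metadata.items():
        # Only flat string, number, or boolean values; no nesting, no nulls.
        if not isinstance(value, (str, int, float, bool)):
            raise TypeError(f"metadata value for {name!r} must be str, number, or bool")
    return {"cf-aig-metadata": json.dumps(metadata)}


# Hypothetical identifiers that your application has already authenticated.
headers = build_metadata_header({"actor_id": "u-1042", "team": "claims", "case": 8831})
```

Validating on the application side means a malformed or oversized tag fails loudly in your own code, where you can see it, rather than depending on how the gateway treats it.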

The second is the payload-versus-PII tension, and it’s a genuine design conflict rather than a setting you can default your way through. A complete audit often wants the full prompt and response, because the text itself is the evidence of what was asked and answered. But that same text may contain personal or health information you’re obligated to minimize. AI Gateway gives you a control that suppresses the raw request and response bodies while still logging the metadata — tokens, model, provider, status, cost, duration — so you can keep an operational record without retaining sensitive content. What it doesn’t give you at the gateway is field-level redaction: the payload control is effectively all-or-nothing per request. So you decide deliberately, per data class, whether the evidentiary value of the payload outweighs the obligation to minimize it — and for regulated data you often log references or hashes rather than raw content.
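One way to make the per-data-class decision explicit is to encode it in the code path that builds each request. The sketch below is an assumption-heavy illustration: the data-class names, the prompt_hmac_sha256 field, and the helper names are all hypothetical. It also swaps a bare hash for a keyed one (HMAC-SHA256), because short or predictable personal data can be guessed from an unkeyed digest; that choice is mine, not something the gateway requires.

```python
import hashlib
import hmac
import json

# Hypothetical data classes whose raw bodies should not be retained.
SUPPRESSED_CLASSES = {"phi", "pii"}


def payload_reference(text: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest that stands in for the raw text."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def logging_headers(data_class: str, prompt: str, key: bytes) -> dict:
    """Build per-request logging headers for the given data class."""
    if data_class not in SUPPRESSED_CLASSES:
        return {}  # no override: the gateway's configured default applies
    reference = {"prompt_hmac_sha256": payload_reference(prompt, key)}
    return {
        # Suppress request and response bodies, keep the metadata record.
        "cf-aig-collect-log-payload": "false",
        # Log a reference to the content instead of the content itself.
        "cf-aig-metadata": json.dumps(reference),
    }
```

In a real request the reference would share the cf-aig-metadata object with the actor fields, so it counts against the same five-entry limit. The HMAC key has to live in your own secret storage, since anyone holding it can test guesses against the logged digests.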

The gotcha that defines the architecture

Here’s the single most important fact for anyone building an audit trail on this stack, stated without softening: the AI Gateway log store is not your system of record. Logs cap at ten million per gateway on the paid plan, and when that ceiling is reached, new logs silently stop being saved. No error in your application, no failed request — the calls keep working, and the recording quietly stops. For an operational tool that’s a manageable annoyance; for a compliance record it’s a catastrophe, because a silent gap in an audit trail is itself an audit failure. The alternative gateway behavior is worse for this purpose: enabling automatic log cleanup keeps the store under its limit by deleting the oldest logs, which is exactly the history an audit trail exists to preserve.

10M logs: the per-gateway cap at which new logs silently stop saving, or at which auto-cleanup starts deleting the oldest. Both are wrong for a record of truth, so export to immutable storage and monitor the export. (Source: Cloudflare AI Gateway limits docs)
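The arithmetic behind that ceiling is worth running against your own traffic. A rough sketch, assuming you know the current stored-log count and a steady daily request rate; the function name is illustrative and the ten-million figure is the paid-plan cap cited above.

```python
import math

LOG_CAP = 10_000_000  # per-gateway cap on the paid plan, as cited above


def days_until_cap(stored_logs: int, requests_per_day: int, cap: int = LOG_CAP) -> float:
    """Estimate days of headroom before the gateway store reaches its cap."""
    if requests_per_day <= 0:
        return math.inf  # no traffic, so the store never fills
    return max(cap - stored_logs, 0) / requests_per_day


# A gateway holding 4M logs at 200k requests a day has 30 days of headroom.
print(days_until_cap(4_000_000, 200_000))
```

The number is a planning aid, not a control. The control is the export, because headroom only tells you when the silent stop would begin.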

The conclusion is structural, not a tuning tweak: treat the gateway store as ephemeral and continuously export every log to durable, immutable storage before either failure mode can bite. And there’s a second, less obvious requirement that falls out of the same logic — the export itself has to be monitored. Logpush delivers logs once as they become available and cannot backfill historical data; if the export job is disabled or fails, the logs generated during that window are permanently lost. So monitoring the health of the export job isn’t operational hygiene here — it’s an audit-integrity control in its own right. An unmonitored exporter is a silent gap waiting to happen.
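What monitoring the export can look like, reduced to its logic: compare the timestamps of delivered batches against a maximum tolerated silence. This is a sketch under assumptions. It presumes you can list the upload times of objects in the archive bucket, and the fifteen-minute threshold and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def export_gaps(delivery_times: list, max_silence: timedelta) -> list:
    """Return (start, end) pairs where consecutive deliveries are too far apart."""
    ordered = sorted(delivery_times)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_silence
    ]


def is_stale(last_delivery: datetime, now: datetime, max_silence: timedelta) -> bool:
    """True when nothing has arrived in the archive for longer than max_silence."""
    return now - last_delivery > max_silence


# Example: alert if the newest archived batch is more than 15 minutes old.
last = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, 12, 40, tzinfo=timezone.utc)
if is_stale(last, now, timedelta(minutes=15)):
    print("export may be down: investigate before the gap grows")
```

Silence is only suspicious when requests were actually flowing, so a production check would compare delivery gaps against request volume for the same window. Any gap it finds is worth recording as an incident, since Logpush cannot backfill it.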

The immutable-archive pattern

With that established, the pattern itself is clean: AI Gateway logs flow through Logpush into an R2 bucket protected by a bucket lock. Logpush is the export mechanism — it requires the Workers Paid plan and encrypts each log with a hybrid scheme, an AES key per log wrapped with RSA using a public key you supply, so only you can decrypt the archive with your private key. Pointed at R2, it writes your logs continuously into object storage you control, organized by date.

The immutability comes from R2 bucket locks, Cloudflare’s write-once-read-many primitive. A lock prevents the deletion or overwriting of objects for a duration you set, until a specific date, or indefinitely; you can scope up to a thousand rules per bucket by prefix, and where rules overlap the strictest retention wins. Two properties make this suitable as an audit store. First, locked objects can’t be deleted or overwritten during the retention window — that’s the tamper-evidence. Second, bucket-lock rules take precedence over lifecycle rules, so a routine “delete after 30 days” cleanup won’t fire against an object a lock says to keep for six years. One precision point worth stating rather than glossing: Cloudflare’s documentation describes a single enforced-lock model — duration, until-date, or indefinite — and does not, as of this writing, expose the separate “governance” versus “compliance” retention modes that some object-storage services distinguish. The locks provide genuine WORM-style immutability for the retention period; if your regime requires a specific named mode with documented bypass semantics, verify the current behavior before you certify against it. Underneath, R2 encrypts all objects at rest with AES-256 and in transit over TLS.
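To make the overlap rule concrete, here is a toy model of how prefix-scoped lock rules could resolve for a single object. It illustrates the semantics described above (duration, until-date, or indefinite, with the strictest rule winning) and is not Cloudflare's implementation; the class, field, and function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

FOREVER = datetime.max.replace(tzinfo=timezone.utc)  # stands in for "indefinite"


@dataclass(frozen=True)
class LockRule:
    prefix: str                           # rule applies to keys starting with this
    max_age: Optional[timedelta] = None   # retain for a duration after upload
    until: Optional[datetime] = None      # retain until a fixed date
    indefinite: bool = False              # retain with no end date


def locked_until(key: str, uploaded: datetime, rules: list) -> Optional[datetime]:
    """Return when the object becomes deletable, or None if no rule matches."""
    expiries = []
    for rule in rules:
        if not key.startswith(rule.prefix):
            continue
        if rule.indefinite:
            return FOREVER
        if rule.max_age is not None:
            expiries.append(uploaded + rule.max_age)
        if rule.until is not None:
            expiries.append(rule.until)
    # Strictest retention wins: the latest expiry among matching rules.
    return max(expiries) if expiries else None


def can_delete(key: str, uploaded: datetime, now: datetime, rules: list) -> bool:
    """True only when no matching lock still covers the object."""
    expiry = locked_until(key, uploaded, rules)
    return expiry is None or now >= expiry
```

In this model a one-year rule on a broad prefix and a six-year rule on a narrower one leave an object under both locked for six years, which is the behavior you want when a general policy and a regime-specific policy cover the same archive.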

Retention, access, and chain of custody

An immutable store is necessary but not the whole governance story; retention and access close the loop.

Retention is driven by your regime, and the bucket lock is how you enforce a minimum. Set the lock’s retention period to your obligation — and because the lock overrides lifecycle deletes, you get a floor that survives both accidental cleanup and deliberate tampering. (The specific periods belong in the compliance section below.)

Access should be least-privilege and separated. Issue read-only, prefix-scoped R2 tokens to auditors, and keep write, delete, and lock-configuration permissions on a separate admin credential held by a different role. This is real separation of duties, and the bucket lock reinforces it: during retention, even the admin token can’t delete the locked objects, so the people whose actions are being audited genuinely cannot alter the record of them.

There is, however, a meta-audit gap you should know about and design around rather than discover later. R2’s own audit logs capture configuration actions — creating a bucket, changing a lifecycle policy, altering visibility — but they explicitly exclude data-access operations, so reads and writes of individual objects aren’t recorded out of the box. In plain terms: “who read the archived audit log” is not captured by default. If your regime requires logging access to the audit trail itself, you need to add that layer — fronting the bucket with an access-controlled Worker or Cloudflare Access, or enabling request logging — rather than assuming it’s there.

What this satisfies — and what it doesn’t

This is the part that matters most for a compliance buyer, so here it is plainly. An immutable AI audit trail provides the audit-trail, record-keeping, and monitoring control that many regimes require. It does not, by itself, make you compliant with any of them — compliance is a holistic program of policy, process, and controls, and the audit log is one necessary piece.

With that framing, the trail directly supports several regimes. For SOC 2, it provides evidence for the monitoring and logical-access criteria, where what auditors look for is that your logs match your stated policy and that any modification or deletion would be detectable — which append-only WORM storage delivers. For HIPAA, audit controls are a required safeguard, and the standard interpretation pairs them with a six-year documentation retention period — a duration a bucket lock can enforce. For the EU AI Act, high-risk systems must automatically record events over their lifetime for traceability, must retain those logs for a period appropriate to the system’s purpose and at least six months, and face significant penalties for non-compliance — an immutable, attributable trail speaks directly to that record-keeping obligation. (One timeline caveat: the high-risk obligations are set to apply in August 2026, but a proposed delay could push parts to late 2027, so confirm the current date before relying on it.) And for financial recordkeeping, WORM storage is the long-standing, classic answer.
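When more than one regime applies, the lock period is the longest of the applicable minimums. A small sketch of that calculation, using only the periods this article cites; the regime keys and function name are illustrative, the SOC 2 figure is the commonly used twelve months rather than a mandated one, and month and year lengths are rounded up so the floor never falls short. Financial recordkeeping is left out because its period depends on the specific rule that applies to you.

```python
from datetime import timedelta

# Minimum retention per regime, rounded up to whole days (leap years included).
RETENTION_FLOORS = {
    "hipaa": timedelta(days=6 * 366),   # six years of documentation
    "eu_ai_act": timedelta(days=184),   # at least six months
    "soc2": timedelta(days=366),        # stated policy, commonly twelve months
}


def retention_floor(regimes: list) -> timedelta:
    """Return the longest minimum retention across the applicable regimes."""
    if not regimes:
        raise ValueError("name at least one regime")
    unknown = [r for r in regimes if r not in RETENTION_FLOORS]
    if unknown:
        raise KeyError(f"no retention floor defined for: {unknown}")
    return max(RETENTION_FLOORS[r] for r in regimes)


# A system subject to both HIPAA and the EU AI Act locks for the HIPAA period.
print(retention_floor(["eu_ai_act", "hipaa"]).days)
```

Treat the result as a floor, not a recommendation. Your own risk analysis or counsel may set a longer period, and a bucket lock can enforce whichever number you land on.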

Then the FedRAMP reality, which has to be exact. Cloudflare for Government holds FedRAMP Moderate authorization, and the authorized Developer Platform services include the primitives this pattern is built on: R2, Workers, Durable Objects, Workers KV, and Logs. But AI Gateway, Workers AI, and Vectorize are not on that authorized list. Cloudflare has stated an intent to bring its AI services into FedRAMP scope, but intent is a roadmap, not an authorization, and FedRAMP High remains in process rather than authorized. So the honest position is precise: you can build the audit archive (the storage, export, and immutability) on FedRAMP-Moderate-authorized primitives, but the AI inference and proxy layer itself is not FedRAMP-authorized today, which means a fully in-boundary federal deployment of the AI is not yet available. Truvisory positions itself on this as FedRAMP-aware, and never as CMMC-certified. Anyone telling a federal buyer that AI Gateway is FedRAMP-authorized is wrong, and that error is expensive.

Where credential governance lands

The routing spoke deferred the governance side of bring-your-own-keys to here, so this is where it belongs. AI Gateway lets you store provider API keys centrally — backed by Cloudflare’s Secrets Store, encrypted at rest and in transit — and reference them rather than passing raw keys on every request, with rotation handled in the dashboard without code changes. From a governance standpoint that’s a real improvement: credentials stop living in client code, rotation becomes a controlled administrative action, and key custody is centralized rather than scattered. The honest consideration is the flip side of that centralization — the keys live with Cloudflare, a third party, so your key-custody, rotation, and separation-of-duties policy has to account for that arrangement explicitly rather than treat the keys as wholly your own. The operational and routing mechanics of stored keys are the routing spoke’s subject; the governance posture is the part that matters for an auditor.

The honest trade-offs

  • The gateway is not your system of record. The ten-million-log silent stop, the destructive auto-cleanup alternative, and Logpush’s inability to backfill all point one way: export to immutable storage and monitor the export. This is the first thing to get right and the easiest to get wrong.
  • AI Gateway is not FedRAMP-authorized. Only the underlying primitives are. State it plainly; don’t imply the AI layer is in-boundary.
  • An audit log is necessary, not sufficient. It supplies a control, not a compliance program.
  • Payload logging fights data minimization. The full prompt and response are the best evidence and the likeliest place for PII; the gateway’s payload control is all-or-nothing per request, with no field-level redaction.
  • Immutability cuts both ways. A bucket lock means you can’t delete during retention — which is the point for tamper-evidence and a hazard if you locked something you shouldn’t have, since a right-to-erasure request collides with a WORM retention you can’t lift. Decide what you log, preferring references or hashes over raw personal data, before you lock it.
  • Object reads aren’t logged by default. R2’s audit logs cover configuration, not data access, so “who read the archive” needs a layer you add.
  • Keys stored centrally live with a third party. A governance consideration to weigh, not a disqualifier.
  • Concentration and cost. Logs, export, and immutable store all on one vendor is a concentration worth naming, and multi-year immutable retention accrues real storage cost; for the highest-assurance regimes a second, independent copy may be warranted.

Concrete patterns

Attribute each request by stamping cf-aig-metadata with an actor ID, team, and case identifier, so the record can answer who asked. Where the data class requires it, suppress sensitive bodies with cf-aig-collect-log-payload: false while keeping the metadata record. Configure a Logpush job to export the AI Gateway dataset continuously to an R2 bucket, organized by date and encrypted with your own public key — then monitor that job, because a silent failure is a silent gap. Lock the archive prefix with an R2 bucket lock set to your retention obligation, whether a fixed multi-year duration or indefinite. And scope auditor access to a read-only, prefix-limited token while keeping write and lock-configuration rights on a separate admin credential. Build it so the record is complete, exported before it can be lost, immutable once written, and readable only by least privilege — and so you could hand an auditor the chain from a single request ID to its sealed, dated copy.

Frequently asked

Does a Cloudflare AI audit log make me HIPAA, SOC 2, or FedRAMP compliant?
No — it's a necessary audit-trail control, not a compliance program. It supports HIPAA audit controls, SOC 2 monitoring and access-control evidence, and EU AI Act traceability, but compliance is the whole program of policy, process, and controls.
Is AI Gateway FedRAMP-authorized?
No. Cloudflare for Government holds FedRAMP Moderate for primitives like R2, Workers, Workers KV, Durable Objects, and Logs — but AI Gateway, Workers AI, and Vectorize are not authorized, though Cloudflare has stated intent to add them. FedRAMP High is in process.
What happens when AI Gateway hits its log limit?
New logs silently stop being saved, or — if automatic cleanup is on — the oldest logs are deleted. Both are wrong for a record of truth, which is why you export to immutable R2 and monitor the export job; Logpush can't backfill a gap.
Can an administrator alter or delete the audit archive?
With an R2 bucket lock in force, objects can't be deleted or overwritten until retention expires, and the lock even overrides lifecycle deletes — so during retention, no one, including an admin, can tamper with the record. (Verify the current lock-mode semantics before certifying against a regime that requires a specific named mode.)
How long should I keep AI audit logs?
It depends on the regime — six years of documentation for HIPAA, at least six months under the EU AI Act, your stated policy for SOC 2 (commonly twelve months), and often longer for financial recordkeeping. Set the retention from your own risk analysis and enforce the floor with a bucket-lock retention period.

Working with Truvisory

If you operate in a regulated environment and need to prove what your AI did, see how we build compliance-grade AI systems on Cloudflare — with attributable records, monitored export, and immutable retention wired in from the first deploy, and an honest account of exactly what that does and doesn’t satisfy.

Truvisory is a Denver-based AI and automation consultancy run by a senior operator — a combat veteran and former PE-backed operating executive — who ships working software, not strategy decks. Cloudflare-native by default, for both AI delivery and the back-office automation where the ROI lives. Federal buyers: we’re SDVOSB set-aside eligible — see the federal AI modernization pillar.