Truvisory
§ Cloudflare-Native

Cloudflare-native AI, by an engineer who lives on the platform.

We design, build, and ship production AI systems on Cloudflare's edge (Workers, Workers AI, Agents SDK, Durable Objects, Vectorize, AI Gateway, R2, D1, Queues, Containers), served from 330+ cities at roughly 50ms to about 95% of internet users.

~50ms p50 to 95% of users
330+ cities
~20% of the web
5% avg GPU util · industry
§ 01 / Why Cloudflare for AI in 2026

Three reasons the math works.

REASON / 01

Pay-per-inference, not pay-per-idle-GPU.

Per Cast AI's 2026 State of Kubernetes Optimization Report, average GPU utilization is just 5% across 23,000 production clusters. "At 5% utilization, the math doesn't work." Workers AI bills for actual inference — not reserved capacity.

5% industry GPU util
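
To make the utilization math concrete, here is a minimal sketch of what idle capacity does to the cost of useful work. The hourly price is a hypothetical input, not a quote from any provider, and the function names are ours for illustration.

```typescript
// Hypothetical inputs for illustration; substitute real pricing and measured utilization.
interface ReservedGpu {
  hourlyPriceUsd: number; // price of one reserved GPU-hour (assumed value)
  utilization: number; // fraction of each billed hour doing useful work, 0 < u <= 1
}

/** Cost of one hour of useful GPU work when idle time is still billed. */
function effectiveCostPerBusyHour(gpu: ReservedGpu): number {
  if (!(gpu.utilization > 0 && gpu.utilization <= 1)) {
    throw new RangeError("utilization must be in (0, 1]");
  }
  return gpu.hourlyPriceUsd / gpu.utilization;
}

/** How many times the sticker price you pay per busy hour at a given utilization. */
function idleOverheadMultiplier(utilization: number): number {
  return effectiveCostPerBusyHour({ hourlyPriceUsd: 1, utilization });
}
```

At the report's 5% utilization, a GPU with an assumed $2.00 hourly price costs about $40 per busy hour, a 20x multiplier over the sticker price. Per-inference billing removes the denominator.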
REASON / 02

Edge by default.

Data centers in 330+ cities deliver ~50ms latency to about 95% of internet users, on the same infrastructure that proxies ~20% of the web. Region-pin for sovereignty when a federal mission requires it.

330+ cities · ~50ms
REASON / 03

State and compute in one primitive.

Durable Objects give each agent its own SQL database, persistent memory, and hibernation — no separate data tier, no cold start, no glue code. Multi-agent systems become a primitive, not an architecture project.

primitive · agent + state
§ 02 / Reference architectures

Four patterns we ship, end to end.

Pattern A · RAG

Retrieval-Augmented Generation over private data

Hybrid semantic+keyword retrieval, AI-Gateway-fronted models, R2-backed source corpus, audit trail end to end.

// Use case: agency knowledge bases, contract repositories, customer support corpora
Worker: Edge entry
Vectorize: Hybrid search
AI Gateway: Guardrails
Workers AI: Inference
Backed by: R2 corpus · D1 metadata · Audit log
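
One common way to merge a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is a generic illustration of that technique, not a description of how Vectorize or any Truvisory deployment combines results; the function name and the two input lists are assumptions.

```typescript
// Each input list holds document ids, best match first.
type RankedIds = string[];

interface FusedHit {
  id: string;
  score: number;
}

/**
 * Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
 * where rank starts at 1. k = 60 is the conventional default constant.
 */
function reciprocalRankFusion(lists: RankedIds[], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    // Highest fused score first; break ties by id so output is deterministic.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}
```

A document that appears near the top of both lists outranks one that tops only a single list, which is the behavior hybrid retrieval is after.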
Pattern B · AGENT

Stateful agent with tool use

Each agent is a Durable Object with its own SQL state and lifecycle. MCP tool surface for clean external integrations. Hibernates when idle. The full agents & MCP playbook →

// Use case: customer agents, internal assistants, long-running automations
Agents SDK: Lifecycle
Durable Object: SQL · Memory
MCP: Tool surface
Calls out to: Workers AI · OpenAI / Anthropic / Gemini · External tools
Pattern C · RLM

Recursive multi-agent orchestrator

Kimi k2.6 (1T MoE) as the orchestrator; small Gemma scout workers run decomposed sub-tasks. Derived from MIT CSAIL's Recursive Language Models work (arXiv:2512.24601). This is the HotCopy architecture.

// Use case: code transformation, deep document understanding, complex multi-step research
Kimi k2.6: Orchestrator · 1T MoE
Spawns: Scout · Gemma, Scout · Gemma, Scout · Gemma, +N
Substrate: Workers AI · Durable Objects · Queues
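
The orchestrator-and-scouts shape can be sketched as a depth-limited recursive decomposition. This is a generic illustration under stated assumptions, not the HotCopy implementation and not the algorithm from the cited paper: `decompose`, `solveLeaf`, and `combine` are hypothetical callbacks standing in for model calls.

```typescript
// All three callbacks are hypothetical stand-ins for model calls.
interface RecursiveDeps {
  // Orchestrator: split a task into sub-tasks, or return [] if it needs no split.
  decompose: (task: string) => Promise<string[]>;
  // Scout: answer one leaf task directly.
  solveLeaf: (task: string) => Promise<string>;
  // Orchestrator: merge sub-answers into one answer for the parent task.
  combine: (task: string, parts: string[]) => Promise<string>;
}

async function solve(task: string, deps: RecursiveDeps, maxDepth = 3): Promise<string> {
  // The depth cap guarantees termination even if the orchestrator keeps splitting.
  if (maxDepth <= 0) return deps.solveLeaf(task);

  const subTasks = await deps.decompose(task);
  if (subTasks.length === 0) return deps.solveLeaf(task);

  // Sub-tasks are treated as independent, so they fan out concurrently.
  const parts = await Promise.all(
    subTasks.map((subTask) => solve(subTask, deps, maxDepth - 1)),
  );
  return deps.combine(task, parts);
}
```

Injecting the three callbacks keeps the recursion testable without a model in the loop, and lets the orchestrator and scout roles map to different models.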
Pattern D · FED

Federal-friendly edge deployment

Same Cloudflare primitives, hardened: AI Gateway guardrails, region-pinning for data sovereignty, immutable audit logging, role-based access, FedRAMP-aware deployment patterns.

// Use case: agency RAG, intra-agency assistants, OMB AI Action Plan rollouts
Zero Trust: Auth
Worker: Region-pinned
AI Gateway: Guardrails · log
Workers AI: Inference
Audit + sovereignty: Immutable log → R2; Region-pin · US; Role-based access; CMMC posture
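
The page does not say how the audit log is made immutable. One widely used way to make a log tamper-evident is a hash chain, sketched below as an assumption rather than a description of the deployed mechanism. The entry fields and function names are hypothetical; the hashing uses Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape for illustration.
interface AuditEntry {
  seq: number;
  ts: string;
  actor: string;
  action: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first one
  hash: string; // SHA-256 over this entry's other fields
}

const GENESIS = "0".repeat(64);

function hashEntry(e: Omit<AuditEntry, "hash">): string {
  return createHash("sha256")
    .update(JSON.stringify([e.seq, e.ts, e.actor, e.action, e.prevHash]))
    .digest("hex");
}

/** Returns a new log with one entry appended; the input log is not mutated. */
function appendEntry(
  log: readonly AuditEntry[],
  actor: string,
  action: string,
  ts: string,
): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const body = { seq: log.length, ts, actor, action, prevHash };
  return [...log, { ...body, hash: hashEntry(body) }];
}

/** True only if every entry links to its predecessor and matches its own hash. */
function verifyChain(log: readonly AuditEntry[]): boolean {
  let prev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    if (e.seq !== i || e.prevHash !== prev || hashEntry(e) !== e.hash) return false;
    prev = e.hash;
  }
  return true;
}
```

Editing any past entry changes its hash and breaks every link after it, so tampering shows up on verification. A hash chain only detects changes; preventing them still depends on write-once storage policy.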
§ 03 / Data control

Sovereignty isn't a checkbox. Here's the mechanism.

“Region-pinning” is three concrete controls, not a slogan:

Regional Services

Pins where High and Moderate impact data is processed, without losing edge performance.

Metadata Boundary

Keeps all government data inside the defined FedRAMP region.

FIPS-validated encryption

Every hop between edge and core, always.

We compose these into the deployment so CUI stays where your authorization says it stays.

§ 04 / Post-quantum

Quantum-safe by default — no config, no penalty.

Every site and API we serve through Cloudflare is protected against “harvest now, decrypt later” with TLS 1.3 + ML-KEM — automatically, no configuration changes. Post-quantum protection extends to Zero Trust access and the Secure Web Gateway, so encrypted traffic stays inspectable as you migrate. Built ahead of NIST's 2030–2035 deprecation deadlines.

§ 05 / AI security architecture

Guardrails, not vibes.

The AI Gateway is the control plane between your application and any model provider:

Threat detection

Prompt injection, model poisoning, and excessive-use abuse, caught at the proxy.

Bi-directional data control

What the user submits and what the model returns, both inspected and redacted.

MCP server portal

Accessible servers behind one URL, with OAuth authorization and least-privilege access per agent.

Credentials at the edge

API keys and secrets never touch the client; rotation stays simple.

RAG helps the model know more. MCP helps it do more. The Gateway makes both safe.
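
To illustrate bi-directional data control, here is a minimal sketch of redacting both the prompt going in and the completion coming out. It is not AI Gateway's implementation or API: the two patterns are deliberately simple examples, and `modelCall` is a hypothetical synchronous stand-in for a real (asynchronous) provider call.

```typescript
interface RedactionRule {
  label: string;
  pattern: RegExp; // must carry the global flag so every match is replaced
}

// Illustrative patterns only; production use needs vetted detectors.
const EXAMPLE_RULES: RedactionRule[] = [
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "EMAIL", pattern: /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/g },
];

/** Replaces every match of every rule with a bracketed label. */
function redact(text: string, rules: RedactionRule[] = EXAMPLE_RULES): string {
  return rules.reduce((out, rule) => out.replace(rule.pattern, `[${rule.label}]`), text);
}

/** Wraps a model call so neither direction carries unredacted matches. */
function guardedCall(prompt: string, modelCall: (safePrompt: string) => string): string {
  const safePrompt = redact(prompt); // inbound: before the provider sees it
  const rawCompletion = modelCall(safePrompt);
  return redact(rawCompletion); // outbound: before the user sees it
}
```

Running the same rules in both directions means a value the model produces on its own is caught just like one the user typed.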

§ 06 / FedRAMP architecture

Every service. Every location. No enclave.

Some vendors carve out a special FedRAMP enclave and make you wait for capabilities to land in it. Cloudflare runs the same software in every data center, including the FedRAMP processing locations — so the authorized Cloudflare for Government environment carries nearly the entire platform, not a stripped-down subset. For federal engagements we provision there; commercial engagements run on the standard network. Either way: no rip-and-replace, no capability lag.

§ 07 / Capability matrix

Every Cloudflare primitive we work in, with its real use.

Workers
// edge HTTP entry · routing · middleware
Workers AI
// pay-per-inference LLM & embedding
Agents SDK
// agent lifecycle & orchestration
Durable Objects
// per-agent SQL state · hibernation
Vectorize
// vector search · hybrid retrieval
AI Gateway
// guardrails · cost · cache · obs
AI Search
// semantic search-as-a-service
R2
// object storage · zero egress
D1
// edge SQLite · serverless DB
KV
// global config · feature flags
Queues
// durable async work
Containers
// long-running side workloads
Hyperdrive
// accelerated legacy DB access
Browser Rendering
// headless browser · scraping
Workflows
// durable multi-step pipelines
Sandbox
// isolated code execution
§ 08 / Live proof

Run a Worker. From the closest edge.

This page hits a Truvisory®-deployed Cloudflare Worker on first paint. The latency you see is your latency to the closest of 330+ Cloudflare data centers — typically the same region you live in.

  • Edge ping with location, RTT, and colo
  • Live JSON from a Workers AI inference call
  • Source on GitHub — verbatim, deployable in 90 seconds
truvisory-edge.workers.dev ● live
// 1. Ping the closest Cloudflare edge
curl https://edge.truvisory.com/whoami
{
  "colo": "DEN",
  "city": "Denver, CO, US",
  "rtt_ms": 38,
  "region": "WNAM",
  "ts": "2026-05-05T14:22:11Z"
}
// 2. Inference on Workers AI · Llama 3.3
POST /infer {"q":"summarize CMMC L2 in one line"}
"CMMC L2 = 110 NIST 800-171 controls, third-party assessed for CUI."
// model: @cf/meta/llama-3.3-70b · 41ms p50 · cached: false
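
For readers who want the shape of the edge ping, here is a minimal sketch of a Worker that reports where it ran. It is not the source of the endpoint above. It relies on the `cf` object that Cloudflare attaches to incoming requests at runtime, and `CfProps`, `whoamiPayload`, and `worker` are names we introduce for illustration. Round-trip time is left out because it has to be measured by the client.

```typescript
// Subset of the `cf` properties Cloudflare attaches to incoming requests.
interface CfProps {
  colo?: string; // IATA code of the data center that handled the request
  city?: string;
  country?: string;
}

/** Builds the response body; falls back to "unknown" when run outside Workers. */
function whoamiPayload(cf: CfProps | undefined, now: Date) {
  return {
    colo: cf?.colo ?? "unknown",
    city: cf?.city ?? "unknown",
    country: cf?.country ?? "unknown",
    ts: now.toISOString(),
  };
}

const worker = {
  fetch(request: Request): Response {
    // `cf` is not part of the standard Request type, hence the widening cast.
    const cf = (request as Request & { cf?: CfProps }).cf;
    return new Response(JSON.stringify(whoamiPayload(cf, new Date())), {
      headers: { "content-type": "application/json" },
    });
  },
};

// In a Worker entry file this object would be the module's default export.
```

Keeping the payload builder separate from the handler makes it easy to test without a Workers runtime.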
§ 09 / Partner posture

Honest about today. Transparent about the path.

Today
Cloudflare-Native Engineer

What we are right now.

Daily-driver builder on Workers, Workers AI, Durable Objects, Vectorize, AI Gateway, R2, D1, Agents SDK, MCP. Production deployments on file (HotCopy, PresEngage). The engineer-on-the-platform claim is true and defensible.

Pursuing
Cloudflare ASDP · Application Services

What we're earning, not claiming.

Cloudflare's ASDP designation requires rigorous technical validation of security, performance, and reliability. We're in the process — and we won't surface a partner badge on the site that hasn't been earned. When it lands, you'll see it.