Truvisory
§ Cloudflare-Native

Cloudflare-native AI, by an engineer who lives on the platform.

We design, build, and ship production AI systems on Cloudflare's edge (Workers, Workers AI, Agents SDK, Durable Objects, Vectorize, AI Gateway, R2, D1, Queues, Containers) at roughly 50ms latency from 330+ cities.

~50ms p50 to 95% of users
330+ cities
~20% of the web
5% avg GPU util · industry
§ 01 / Why Cloudflare for AI in 2026

Three reasons the math works.

REASON / 01

Pay-per-inference, not pay-per-idle-GPU.

Per Cast AI's 2026 State of Kubernetes Optimization Report, average GPU utilization is just 5% across 23,000 production clusters. "At 5% utilization, the math doesn't work." Workers AI bills for actual inference — not reserved capacity.

5% industry GPU util
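
The utilization argument reduces to one division: the cost of an hour of useful work on reserved capacity is the hourly price divided by utilization. A minimal sketch, assuming a hypothetical reserved-GPU price of $2.00 per hour (an illustration value, not a figure from the Cast AI report or from Cloudflare pricing):

```typescript
// Effective cost of one *utilized* GPU-hour on reserved capacity.
// hourlyRateUsd is a hypothetical illustration value, not a quoted price.
function effectiveCostPerUsefulHour(hourlyRateUsd: number, utilization: number): number {
  if (utilization <= 0 || utilization > 1) {
    throw new RangeError("utilization must be in (0, 1]");
  }
  return hourlyRateUsd / utilization;
}

// How many times the list price you pay per hour of useful work.
function idleOverheadMultiplier(utilization: number): number {
  return effectiveCostPerUsefulHour(1, utilization);
}

// At 5% utilization, a $2.00/hour GPU costs about $40 per useful hour (a 20x multiplier).
const atFivePercent = effectiveCostPerUsefulHour(2.0, 0.05);
const multiplier = idleOverheadMultiplier(0.05);
```

Under per-inference billing the multiplier is effectively 1, because idle time is not billed; whether that wins overall still depends on per-request pricing and volume.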
REASON / 02

Edge by default.

Data centers in 330+ cities deliver ~50ms latency to about 95% of internet users, on the same infrastructure that proxies ~20% of the web. Region-pin for sovereignty when a federal mission requires it.

330+ cities · ~50ms
REASON / 03

State and compute in one primitive.

Durable Objects give each agent its own SQL database, persistent memory, and hibernation — no separate data tier, no cold start, no glue code. Multi-agent systems become a primitive, not an architecture project.

primitive · agent + state
§ 02 / Reference architectures

Four patterns we ship, end to end.

Pattern · A · RAG

Retrieval-Augmented Generation over private data

Hybrid semantic+keyword retrieval, AI-Gateway-fronted models, R2-backed source corpus, audit trail end to end.

// Use case: agency knowledge bases, contract repositories, customer support corpora
Worker: Edge entry
Vectorize: Hybrid search
AI Gateway: Guardrails
Workers AI: Inference
Backed by: R2 corpus, D1 metadata, Audit log
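
The page does not say how the semantic and keyword result sets are merged. One common option is reciprocal rank fusion (RRF), sketched here as an assumption rather than as the method this pattern actually uses. The document IDs and the constant `k = 60` are illustrative.

```typescript
// Reciprocal rank fusion (RRF): merge ranked ID lists from different retrievers.
// k = 60 is a commonly used default; treat it as a tunable assumption.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

const semanticHits = ["doc-7", "doc-2", "doc-9"]; // hypothetical vector-search results
const keywordHits = ["doc-2", "doc-4", "doc-7"];  // hypothetical keyword-search results
const fused = reciprocalRankFusion([semanticHits, keywordHits]);
```

RRF uses only rank positions, so it avoids having to normalize vector-similarity scores against keyword scores, which are on different scales.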
Pattern · B · AGENT

Stateful agent with tool use

Each agent is a Durable Object with its own SQL state and lifecycle. MCP tool surface for clean external integrations. Hibernates when idle.

// Use case: customer agents, internal assistants, long-running automations
Agents SDK: Lifecycle
Durable Object: SQL · Memory
MCP: Tool surface
Calls out to: Workers AI, OpenAI / Anthropic / Gemini, External tools
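
A rough sketch of what "its own SQL state" can look like in code. The `AgentMemory` class, the `messages` table, and the `SqlLike` interface are hypothetical names introduced for illustration. In a real Worker the `sql` handle would be `ctx.storage.sql` inside a class extending `DurableObject`; the local interface exists only so the sketch type-checks on its own.

```typescript
// Minimal structural type for the slice of Durable Object SQL storage used here.
// Assumption: mirrors the shape of `ctx.storage.sql.exec(...).toArray()`.
interface SqlLike {
  exec(query: string, ...bindings: unknown[]): { toArray(): Record<string, unknown>[] };
}

// Hypothetical per-agent memory helper: one instance per agent object.
class AgentMemory {
  private readonly sql: SqlLike;

  constructor(sql: SqlLike) {
    this.sql = sql;
    // Idempotent schema setup, run each time the object wakes up.
    this.sql.exec(
      "CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT NOT NULL, content TEXT NOT NULL)"
    );
  }

  // Append one conversation turn to this agent's private table.
  remember(role: "user" | "assistant" | "tool", content: string): void {
    this.sql.exec("INSERT INTO messages (role, content) VALUES (?, ?)", role, content);
  }

  // Return the latest `limit` turns in chronological order.
  recent(limit: number): { role: string; content: string }[] {
    const rows = this.sql
      .exec("SELECT role, content FROM messages ORDER BY id DESC LIMIT ?", limit)
      .toArray();
    return rows.reverse().map((r) => ({ role: String(r.role), content: String(r.content) }));
  }
}
```

Because the table lives in the object's own storage, the conversation history is still there when the agent wakes from hibernation, without a separate database connection.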
Pattern · C · RLM

Recursive multi-agent orchestrator

Kimi k2.6 (1T MoE) as the orchestrator; small Gemma scout workers run decomposed sub-tasks. Derived from MIT CSAIL's Recursive Language Models work (arXiv:2512.24601). This is the HotCopy architecture.

// Use case: code transformation, deep document understanding, complex multi-step research
Kimi k2.6: Orchestrator · 1T MoE
Spawns: Scout · Gemma, Scout · Gemma, Scout · Gemma, +N
Substrate: Workers AI, Durable Objects, Queues
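
The orchestrator-and-scouts shape can be sketched as a generic recursive decompose-and-delegate loop. This is an illustration under stated assumptions, not the algorithm from the cited paper and not HotCopy's code: the `DECOMPOSE:` and `SYNTHESIZE:` prompt prefixes, the one-sub-task-per-line plan format, and the `ModelCall` abstraction are all hypothetical.

```typescript
// A model call is abstracted as an async function from prompt to text.
// Both callers are hypothetical stand-ins for the orchestrator and scout models.
type ModelCall = (prompt: string) => Promise<string>;

interface Orchestration {
  orchestrator: ModelCall; // plans and synthesizes
  scout: ModelCall;        // answers one small sub-task
  maxDepth: number;        // recursion guard
}

async function solve(task: string, cfg: Orchestration, depth = 0): Promise<string> {
  // At the depth limit, hand the task straight to a scout.
  if (depth >= cfg.maxDepth) return cfg.scout(task);

  // Ask the orchestrator to split the task; one sub-task per line (assumed format).
  const plan = await cfg.orchestrator(`DECOMPOSE: ${task}`);
  const subTasks = plan.split("\n").map((s) => s.trim()).filter((s) => s.length > 0);

  // A plan with zero or one line means the task is already small enough.
  if (subTasks.length <= 1) return cfg.scout(task);

  // Run sub-tasks in parallel, recursing so each may decompose further.
  const partials = await Promise.all(subTasks.map((t) => solve(t, cfg, depth + 1)));

  // Let the orchestrator merge the partial answers.
  return cfg.orchestrator(`SYNTHESIZE: ${task}\n${partials.join("\n")}`);
}
```

The `maxDepth` guard matters in practice: without it, a planner that keeps splitting tasks would fan out without bound.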
Pattern · D · FED

Federal-friendly edge deployment

Same Cloudflare primitives, hardened: AI Gateway guardrails, region-pinning for data sovereignty, immutable audit logging, role-based access, FedRAMP-aware deployment patterns.

// Use case: agency RAG, intra-agency assistants, OMB AI Action Plan rollouts
Zero Trust: Auth
Worker: Region-pinned
AI Gateway: Guardrails · log
Workers AI: Inference
Audit + sovereignty: Immutable log → R2, Region-pin · US, Role-based access, CMMC posture
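
One way to make an audit trail tamper-evident is to hash-chain its entries, so that editing any past record breaks every hash after it. The sketch below is an assumption about how such a log could be structured, not a description of this pattern's actual logging. The `AuditEntry` shape is hypothetical, and writing entries to R2 is out of scope here. It uses `node:crypto`, which on Workers requires the Node.js compatibility flag.

```typescript
import { createHash } from "node:crypto";

interface AuditEntry {
  ts: string;       // ISO-8601 timestamp
  actor: string;    // who made the request
  action: string;   // what they did
  prevHash: string; // hash of the previous entry ("" for the first)
  hash: string;     // SHA-256 over this entry's fields plus prevHash
}

function hashEntry(ts: string, actor: string, action: string, prevHash: string): string {
  return createHash("sha256").update(JSON.stringify([ts, actor, action, prevHash])).digest("hex");
}

// Returns a new log with one entry appended; the input log is not mutated.
function appendEntry(log: readonly AuditEntry[], ts: string, actor: string, action: string): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : "";
  return [...log, { ts, actor, action, prevHash, hash: hashEntry(ts, actor, action, prevHash) }];
}

// True only if every entry links to its predecessor and its own hash recomputes.
function verifyChain(log: readonly AuditEntry[]): boolean {
  return log.every((e, i) => {
    const expectedPrev = i === 0 ? "" : log[i - 1].hash;
    return e.prevHash === expectedPrev && e.hash === hashEntry(e.ts, e.actor, e.action, e.prevHash);
  });
}
```

Chaining detects tampering but does not prevent it; storage-level write protection is a separate control.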
§ 03 / Capability matrix

Every Cloudflare primitive we work in, with its real use.

Workers
// edge HTTP entry · routing · middleware
Workers AI
// pay-per-inference LLM & embedding
Agents SDK
// agent lifecycle & orchestration
Durable Objects
// per-agent SQL state · hibernation
Vectorize
// vector search · hybrid retrieval
AI Gateway
// guardrails · cost · cache · obs
AI Search
// semantic search-as-a-service
R2
// object storage · zero egress
D1
// edge SQLite · serverless DB
KV
// global config · feature flags
Queues
// durable async work
Containers
// long-running side workloads
Hyperdrive
// accelerated legacy DB access
Browser Rendering
// headless browser · scraping
Workflows
// durable multi-step pipelines
Sandbox
// isolated code execution
§ 04 / Live proof

Run a Worker. From the closest edge.

This page hits a Truvisory®-deployed Cloudflare Worker on first paint. The latency you see is your latency to the closest of 330+ Cloudflare data centers, typically one in the same region you live in.

  • Edge ping with location, RTT, and colo
  • Live JSON from a Workers AI inference call
  • Source on GitHub — verbatim, deployable in 90 seconds
truvisory-edge.workers.dev ● live
// 1. Ping the closest Cloudflare edge
curl https://edge.truvisory.com/whoami
{
  "colo": "DEN",
  "city": "Denver, CO, US",
  "rtt_ms": 38,
  "region": "WNAM",
  "ts": "2026-05-05T14:22:11Z"
}
// 2. Inference on Workers AI · Llama 3.3
POST /infer {"q":"summarize CMMC L2 in one line"}
"CMMC L2 = 110 NIST 800-171 controls, third-party assessed for CUI."
// model: @cf/meta/llama-3.3-70b · 41ms p50 · cached: false
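
For readers curious how a response like the `/whoami` one above could be assembled, here is a hedged sketch. It is not the published source. It assumes the values come from the metadata Cloudflare attaches to each incoming request (`request.cf`), and it declares a local `CfLike` subset so the function stands alone. The `whoami` function name and the mapping of `clientTcpRtt` to `rtt_ms` are assumptions. The sample's `region` value is omitted because the page does not say how it is derived.

```typescript
// Subset of the per-request metadata exposed as `request.cf` in Workers.
// Declared locally for illustration; every field may be absent at runtime.
interface CfLike {
  colo?: string;         // IATA-style code of the serving data center
  city?: string;
  regionCode?: string;   // e.g. a state or province code
  country?: string;
  clientTcpRtt?: number; // smoothed TCP round-trip time in ms, when available
}

// Hypothetical response builder; pure so it can be unit-tested without a Worker runtime.
function whoami(cf: CfLike, now: Date) {
  const place = [cf.city, cf.regionCode, cf.country]
    .filter((p): p is string => Boolean(p))
    .join(", ");
  return {
    colo: cf.colo ?? "unknown",
    city: place || "unknown",
    rtt_ms: cf.clientTcpRtt ?? null,
    ts: now.toISOString().replace(/\.\d{3}Z$/, "Z"), // drop milliseconds
  };
}
```

In a Worker, the fetch handler would pass `request.cf` and `new Date()` to this function and return the result as JSON.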
§ 05 / Partner posture

Honest about today. Transparent about the path.

Today
Cloudflare-Native Engineer

What we are right now.

Daily-driver builder on Workers, Workers AI, Durable Objects, Vectorize, AI Gateway, R2, D1, Agents SDK, MCP. Production deployments on file (HotCopy, PresEngage). The engineer-on-the-platform claim is true and defensible.

Pursuing
▣ Cloudflare ASDP · Application Services

What we're earning, not claiming.

Cloudflare's ASDP designation requires rigorous technical validation of security, performance, and reliability. We're in the process — and we won't surface a partner badge on the site that hasn't been earned. When it lands, you'll see it.