Is Your AI Project in 'Pilot Purgatory'? 7 Warning Signs (and the Fix for Each)
If your AI pilot has been “almost done” for more than 90 days, has no single dollar figure attached to it, and has no named owner outside the AI team, it’s already in pilot purgatory. The good news: the diagnosis is fast, and most of the fixes are operator moves you can make this week — not technical rebuilds.
This is a self-assessment. Walk the seven signs below against your own pilot and mark each one honestly. Two or more, and you’re stuck. Four or more, and the right move is to kill it and re-launch (more on that at the end). If you want the full picture of why most pilots fail and how mid-market companies ship in 90 days, that’s covered in the pillar article. This piece is narrower: here’s how to tell if yours is one of them, and what to do about each symptom.
How common is this, really?
Worth knowing before you score yourself: this is common and expensive, not a sign you’re behind.
According to IDC-tracked figures, for every 33 AI proofs of concept companies launch, only about 4 reach production. Gartner has pegged the average prototype-to-production timeline at roughly eight months, and projected that at least 30% of generative-AI projects would be abandoned after the POC stage by the end of 2025. So if your pilot is stuck, you’re in the majority. The point of this checklist is to get out of it.
Sign 1: The pilot has been “almost done” for months
What you see: Status updates have used “we’re almost there,” “90% done,” or “just one more iteration” for three-plus consecutive months. There’s no written production date, and the finish line moves every time you ask.
Why it happens: Nobody wrote down what “done” means before work began. Without a go-live threshold and a hard date, pilots don’t end — they drift. The tell, as one CIO put it, is when you keep seeing the same slides and the same hurdles meeting after meeting while “almost there” never becomes “shipped.”
The fix: Write a one-page success contract this week with three lines — the single number that has to move, the threshold that triggers go-live, and a named production date no more than 90 days out. Circulate it to everyone with veto power. Anything not in the contract is not in the pilot.
Sign 2: Nobody can state the dollar value
What you see: Ask “what does this deliver in dollars or hours saved?” and the answer is a model metric (accuracy, F1, a deflection rate in the abstract) or a vibe (“the demo was impressive”). Nobody can finish the sentence: if this ships, we save $X / cut cycle time by Y days / avoid Z hires.
Why it happens: Pilots scoped by engineers default to engineering metrics. The redesign that actually drives business impact is a workflow-and-KPI decision, not a model decision, and McKinsey’s 2025 data shows only about a fifth of organizations using generative AI have fundamentally redesigned even one workflow. “Unclear business value” is one of the named reasons Gartner projected that at least 30% of GenAI projects would be abandoned after POC.
The fix: Replace the model metric with one P&L-linked number this week. Pick from cost per ticket, hours saved per week (priced at loaded labor cost), cycle-time reduction, days sales outstanding, deflection rate × ticket cost, or error-rate reduction × rework cost. Get Finance to sign the baseline. If you can’t tie the pilot to one of those, stop building and re-scope.
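As a rough illustration of how two of those numbers turn into annual dollars, here is a minimal Python sketch. The function names and every input figure are hypothetical placeholders, not benchmarks; substitute the baseline Finance signs.

```python
def hours_saved_value(hours_per_week: float,
                      loaded_hourly_cost: float,
                      weeks_per_year: int = 52) -> float:
    """Annual dollar value of hours saved, priced at loaded labor cost."""
    return hours_per_week * loaded_hourly_cost * weeks_per_year


def deflection_value(tickets_per_month: float,
                     deflection_rate: float,
                     cost_per_ticket: float) -> float:
    """Annual dollar value of deflected tickets (deflection rate x ticket cost)."""
    return tickets_per_month * deflection_rate * cost_per_ticket * 12


if __name__ == "__main__":
    # Hypothetical inputs: 40 hours/week saved at a $65 loaded hourly cost.
    print(hours_saved_value(40, 65.0))        # 135200.0
    # Hypothetical inputs: 2,000 tickets/month, 25% deflected, $8 per ticket.
    print(deflection_value(2000, 0.25, 8.0))  # 48000.0
```

Either result is the kind of single figure that can finish the sentence “if this ships, we save $X.”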
Sign 3: It runs on hand-picked or synthetic demo data
What you see: The pilot has only ever been tested on a curated dataset, a “clean” subset, or synthetic data. It has never seen last month’s actual production data — the missing fields, the inconsistent date formats, the manual overrides, the typos.
Why it happens: Demo conditions aren’t production conditions, and teams build for the demo because that’s what gets approved. Gartner has warned that organizations will abandon a large share of AI projects through 2026 specifically for lack of AI-ready data. The post-mortem keeps sounding the same: the model got built; the data never got fixed.
The fix: Before another sprint, point the pilot at last month’s real, untouched production data — the messy export, not the curated sample — and measure. If performance collapses, you don’t have a model problem; you have a data-readiness problem, and the next 30 days belong to data normalization, not model tuning.
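One way to put a first number on data readiness before re-running the pilot is to profile the raw export itself. The sketch below is an assumption-laden example, not a prescribed method: the field names, the expected date format, and the sample rows are all hypothetical, and it only checks two of the problems named above (missing fields and inconsistent date formats).

```python
from datetime import datetime

EXPECTED_DATE_FORMAT = "%Y-%m-%d"  # assumption: the format the pilot was built for


def _is_blank(value) -> bool:
    """True when a field is absent or contains only whitespace."""
    return value is None or str(value).strip() == ""


def readiness_report(rows, required_fields, date_field):
    """Share of rows with a missing required field or an off-format date."""
    total = len(rows)
    missing = 0
    bad_dates = 0
    for row in rows:
        if any(_is_blank(row.get(f)) for f in required_fields):
            missing += 1
        raw = row.get(date_field)
        if not _is_blank(raw):
            try:
                datetime.strptime(str(raw).strip(), EXPECTED_DATE_FORMAT)
            except ValueError:
                bad_dates += 1
    return {
        "rows": total,
        "missing_required_rate": missing / total if total else 0.0,
        "bad_date_rate": bad_dates / total if total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical rows standing in for last month's untouched export.
    sample = [
        {"id": "1", "amount": "10", "created": "2024-05-01"},
        {"id": "2", "amount": "", "created": "05/02/2024"},
        {"id": "3", "amount": "7", "created": None},
        {"id": "4", "amount": "3", "created": "2024-05-04"},
    ]
    print(readiness_report(sample, ["id", "amount"], "created"))
```

High rates here suggest the next 30 days belong to normalization; they do not replace measuring the pilot's actual output on the same export.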
Sign 4: No named operational owner outside the AI team
What you see: Ask “who runs this after launch?” and the answer is “the AI team,” “the innovation group,” “IT,” or “we’ll figure that out.” No line-of-business leader has the KPI on their scorecard.
Why it happens: Pilots born inside an innovation or data-science function rarely have a business owner with P&L accountability for the workflow being changed. The clearest pattern in the 2025 research is that the companies that succeed push adoption to line managers and embed in real workflows, rather than running everything through a central AI lab.
The fix: Name an operational owner this week — a department head or ops manager whose P&L the KPI sits on — and put their name on the success contract. If nobody will accept ownership, that’s your answer: there’s no business demand here, and you should kill it before spending another dollar.
Sign 5: It’s a standalone tool users have to remember to open
What you see: The pilot lives at its own URL or in its own app. Users have to leave the tool they actually work in — the CRM, the ticket queue, the inbox, the ERP — to go use the AI thing. Usage spiked at launch and has declined ever since.
Why it happens: Side-tools lose to the path of least resistance. The standout performers in the 2025 data aren’t building general-purpose tools; they’re embedding AI inside existing workflows and scaling from a narrow, high-value foothold. And there’s a competitor you didn’t account for: surveys found that workers at the large majority of companies already use personal AI tools daily, often while their employer’s official pilot goes unused — your team is routing around your tool to ChatGPT because the consumer tool is closer to where they work.
The fix: Embed the pilot inside one tool where the work already happens this sprint — a Slack/Teams command, a CRM side-panel, an inbox plugin, an inline button in the queue — not a separate dashboard. If you can’t embed it in 30 days, it isn’t a workflow; it’s a destination, and it will keep losing.
Sign 6: Scope keeps expanding — “can it also do X?”
What you see: Every steering meeting adds a requirement. The pilot that was supposed to draft one type of email now needs to draft five, route them, escalate them, and update the CRM. Nothing ships because the definition keeps moving.
Why it happens: AI pilots are exciting; stakeholders see possibilities and ask for more. Without a written scope and a stop-rule, ambition expands to fill every meeting, and the pilot gets reshaped before it’s ever proven.
The fix: Cut scope to one workflow, one user group, one number, one quarter. Apply a “next pilot” rule: every new request goes on the backlog for the next pilot, not this one. Narrow-and-ship is the discipline; breadth is what you earn after the first version is in production.
Sign 7: It’s a stuck internal build, when buying would already be shipping
What you see: Your team has been building a custom solution for six-plus months. A commercial or open-source equivalent exists and could be configured in weeks. The team keeps citing “our data is unique” or “we need full control” — but the project still isn’t in production.
Why it happens: Mid-market companies default to “build” because the AI team wants the project and the vendor cycle feels slow. The data cuts hard the other way: in the 2025 MIT NANDA research, purchased tools and vendor partnerships reached deployment roughly twice as often as internal builds — about 67% versus 33%. The report’s lead author summarized what they saw everywhere: companies trying to build their own tool, while the purchased solutions delivered the more reliable results.
The fix: Run a two-week buy-vs-build bake-off. List the top three commercial or hosted solutions that cover 80% of your use case. If any can be in production in 30–60 days for less than the next six months of your internal build burn, switch. Reserve “build” for the genuinely differentiated 20% — and only after the bought solution is live. The full build vs. buy vs. partner framework is the deeper version of this call.
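The switch rule above reduces to a two-condition filter, sketched here in Python. The class, field names, and all figures are hypothetical; in particular, treating a vendor's cost as its total over the same six-month window as the internal burn is an assumption, since the rule can be read more than one way.

```python
from dataclasses import dataclass


@dataclass
class VendorOption:
    name: str
    days_to_production: int
    six_month_cost: float  # assumption: setup plus six months of fees


def viable_switches(options, monthly_build_burn, max_days=60):
    """Vendors that ship within max_days and cost less than six months of build burn."""
    six_month_burn = 6 * monthly_build_burn
    return [
        o for o in options
        if o.days_to_production <= max_days and o.six_month_cost < six_month_burn
    ]


if __name__ == "__main__":
    # Hypothetical shortlist and a hypothetical $40k/month internal burn.
    shortlist = [
        VendorOption("Vendor A", 45, 90_000),
        VendorOption("Vendor B", 90, 50_000),   # too slow
        VendorOption("Vendor C", 30, 300_000),  # costs more than the build
    ]
    for option in viable_switches(shortlist, monthly_build_burn=40_000):
        print(option.name)  # Vendor A
```

An empty result is itself useful: it is the evidence that continuing the internal build is a defensible choice.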
Score yourself
- 0–1 signs: You’re slow, not stuck. Keep going and tighten the success contract.
- 2–3 signs: Pilot purgatory. The fixes above, applied within 30 days, can pull it out.
- 4 or more signs: Kill it. Re-launch one tightly scoped pilot with one number, one owner, and a hard 90-day production date.
That last line is the one operators resist, so be direct with yourself about it: a stalled pilot isn’t failing slowly, it’s failing expensively. The real cost of a zombie pilot isn’t its burn rate — it’s the executive attention, the governance bandwidth, and the team’s appetite for the next AI initiative, all of which the zombie consumes while delivering nothing. Killing a pilot is not the failure. Failing to kill one is.
Two honest exceptions before you act:
- Regulated or safety-critical work (clinical, financial controls, anything touching PHI or PII at scale) legitimately runs slower; a 90-day clock can stretch to 120–180. But the success contract, the named owner, and the real-data test still apply on the original schedule. “We’re in compliance review” is only a valid reason if the team can name exactly what they’re waiting on and when it resolves.
- A true R&D bet on a genuinely hard problem should be measured on learning-per-dollar, not P&L-per-quarter — but then fund it and name it as R&D, not as an operational pilot. Most “stuck” pilots are operational projects mislabeled as research.
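For teams that want to run the self-assessment across several pilots, the scoring bands in this section can be written as a small function. This is only a sketch of the rubric as stated here; the function name and return strings are hypothetical, and it does not encode the two exceptions above.

```python
def score_pilot(signs):
    """Map seven True/False answers (one per warning sign) to a verdict."""
    if len(signs) != 7:
        raise ValueError("expected one answer for each of the seven signs")
    count = sum(bool(s) for s in signs)
    if count <= 1:
        return "slow, not stuck"
    if count <= 3:
        return "pilot purgatory"
    return "kill and re-launch"


if __name__ == "__main__":
    # Hypothetical pilot showing signs 1, 2, and 4.
    answers = [True, True, False, True, False, False, False]
    print(score_pilot(answers))  # pilot purgatory
```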
Frequently asked
How long is too long for an AI pilot?
What's the single fastest way to tell if my pilot is stuck?
Should I fix a stuck pilot or kill it?
Is "shadow AI" — my team using ChatGPT on the side — a bad sign?
Working with Truvisory
If you’ve walked the seven signs and you’re sitting at three or more, the move is the same one that gets pilots out of purgatory in the first place: one workflow, one number, one named owner, shipped in 90 days. That’s the whole model Truvisory runs on — a senior operator who picks the stack, writes the code, and ships working software on a fixed scope and timeline, Cloudflare-native by default.
The founder is a U.S. Army combat veteran, a 25-year multi-exit operator, and a University of Denver Executive MBA graduate.
Start with a scoping call, or read the full why-pilots-fail breakdown first.