How We Scope AI Projects at AXI (Before Writing a Line of Code)
Our internal process for scoping AI automation projects. Discovery frameworks, red flags, and how we decide what to build first.
The fastest way to waste money on AI is to skip scoping. We've seen it dozens of times: a team gets excited about a use case, jumps straight to building, and ends up with a sophisticated system that solves the wrong problem. Or worse, a system that solves the right problem but nobody adopts because it doesn't fit into existing workflows.
At AXI, we spend 20-30% of every project timeline on discovery and scoping before any code gets written. That ratio sounds high. It's the reason our production success rate is above 90%.
The Discovery Framework
Every project starts with the same three questions. Simple to state, surprisingly hard to answer well.
Question 1: What Does the Human Do Today?
Not what the process document says. Not what the manager thinks happens. What does the actual human doing the work do, step by step, including all the workarounds, judgment calls, and "I just know" moments?
We map this through a combination of interviews and screen recordings. We ask the person performing the task to narrate their workflow in real time. We're listening for two things:
- Explicit steps: The documented, repeatable actions.
- Implicit judgment: The decisions they make without thinking. "I check the order value and if it's over $500, I loop in the account manager" is explicit. "If the customer's tone sounds frustrated, I prioritize them" is implicit.
The implicit judgment is where most AI projects succeed or fail. If you don't capture it, your agent will handle the easy 70% and break on the hard 30%.
Question 2: What Breaks When It Goes Wrong?
Every workflow has failure modes. We need to know what they are, how often they happen, and what the blast radius looks like.
A data sync that fails silently for three days is very different from a customer-facing email that goes out with the wrong name. Both are "automation failures," but one is a minor annoyance and the other is a brand-damaging incident.
We categorize failures into three tiers:
- Tier 1: Customer-facing impact. Revenue or reputation at stake. These workflows need human-in-the-loop checkpoints, even after automation.
- Tier 2: Internal operational impact. Delays or inefficiencies, but recoverable. These can be fully automated with monitoring and alerting.
- Tier 3: Low-stakes coordination. The worst outcome is a minor delay. Automate fully, check monthly.
This tier system determines how much oversight we build into the agent. It's not about whether AI can handle the task. It's about what happens when AI gets it wrong.
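The tier-to-oversight mapping above can be sketched in code. This is an illustrative Python sketch, not AXI's actual implementation; the enum and policy strings just restate the three tiers described in the list.

```python
from enum import Enum

class FailureTier(Enum):
    CUSTOMER_FACING = 1  # revenue or reputation at stake
    INTERNAL_OPS = 2     # recoverable delays and inefficiencies
    LOW_STAKES = 3       # worst case is a minor delay

# Oversight policy per tier, as described above. Names are illustrative.
OVERSIGHT = {
    FailureTier.CUSTOMER_FACING: "human reviews every agent decision before it executes",
    FailureTier.INTERNAL_OPS: "fully automated, with monitoring and alerting",
    FailureTier.LOW_STAKES: "fully automated, reviewed monthly",
}

def oversight_for(tier: FailureTier) -> str:
    """Return the oversight policy for a workflow's failure tier."""
    return OVERSIGHT[tier]
```

The point of encoding it this way is that the oversight decision is made once, at scoping time, by classifying the failure mode rather than by debating each workflow from scratch.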
Question 3: Where Does the Data Live?
AI agents are only as good as the data they can access. Before we commit to a project, we audit every system the workflow touches:
- Access: Can we connect to it? Does it have an API? Is the API rate-limited in ways that would bottleneck the agent?
- Quality: Is the data clean, structured, and consistent? Or is it a mess of free-text fields and inconsistent naming conventions?
- Freshness: How often does the data update? Real-time, hourly, daily? Does the workflow need real-time data to function correctly?
- Permissions: Who owns the data? Are there compliance constraints (HIPAA, SOC 2, GDPR) that limit how we can process it?
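The audit above amounts to a per-system checklist with hard blockers. Here is a minimal Python sketch of what one audit record might look like; the field names and thresholds are hypothetical, not a real AXI schema.

```python
from dataclasses import dataclass

@dataclass
class SystemAudit:
    """One system the workflow touches. Fields mirror the four audit questions."""
    name: str
    has_api: bool            # Access: can we connect programmatically?
    rate_limit_ok: bool      # Access: would rate limits bottleneck the agent?
    data_quality: int        # Quality: 1 (free-text mess) to 5 (clean, structured)
    freshness: str           # Freshness: "real-time", "hourly", or "daily"
    compliance_cleared: bool # Permissions: HIPAA / SOC 2 / GDPR constraints resolved?

    def blockers(self) -> list[str]:
        issues = []
        if not self.has_api:
            issues.append("no programmatic access")
        if not self.rate_limit_ok:
            issues.append("API rate limits would bottleneck the agent")
        if self.data_quality <= 2:
            issues.append("data too messy to consume reliably")
        if not self.compliance_cleared:
            issues.append("unresolved compliance constraints")
        return issues
```

A project with any non-empty `blockers()` list is a candidate for the "kill it at this stage" conversation described below, before a build begins.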
We've killed projects at this stage. Not because the use case wasn't compelling, but because the data infrastructure wasn't ready. That's a painful conversation, but it's better than discovering the same thing three weeks into a build.
The Scoring Matrix
After discovery, we score every candidate workflow on four dimensions:
Volume: How many times per week does this workflow execute? Higher volume means higher ROI from automation.
Complexity: How many decision points, systems, and edge cases are involved? Higher complexity means higher build cost but also higher value, since these are the workflows humans struggle with most.
Data readiness: How clean and accessible is the data the agent needs? A score of 1 means "data doesn't exist yet." A score of 5 means "clean API with structured data, ready to consume."
Impact: What's the business value of automating this? We quantify this in hours saved, revenue protected, or error rate reduced.
The prioritization score is (Volume × Impact) / (Complexity × (6 − Data Readiness)). It's not a perfect formula, but it consistently surfaces the right first project.
The right first project is almost never the most ambitious one. It's the one with the highest score: high volume, high impact, moderate complexity, and data that's ready to go. Win there first. Build credibility. Then tackle the complex stuff.
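The scoring formula is simple enough to sketch directly. In this Python sketch, all four dimensions are assumed to be scored 1-5 (the article states that scale only for data readiness), and the workflow names and scores are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class WorkflowCandidate:
    name: str
    volume: int          # 1-5: how often the workflow executes (assumed scale)
    complexity: int      # 1-5: decision points, systems, edge cases (assumed scale)
    data_readiness: int  # 1-5: 1 = data doesn't exist, 5 = clean API
    impact: int          # 1-5: business value of automating (assumed scale)

    def priority_score(self) -> float:
        # (Volume x Impact) / (Complexity x (6 - Data Readiness))
        return (self.volume * self.impact) / (self.complexity * (6 - self.data_readiness))

# Hypothetical candidates, ranked highest score first.
candidates = [
    WorkflowCandidate("invoice triage", volume=5, complexity=2, data_readiness=4, impact=4),
    WorkflowCandidate("contract review", volume=2, complexity=5, data_readiness=2, impact=5),
]
for c in sorted(candidates, key=WorkflowCandidate.priority_score, reverse=True):
    print(f"{c.name}: {c.priority_score():.2f}")
```

Note how the `6 - data_readiness` term inverts the readiness score: a workflow sitting on clean, API-accessible data (readiness 5) divides by 1, while one whose data doesn't exist yet (readiness 1) divides by 5, dragging even high-impact candidates down the list.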
Red Flags We Watch For
Across 50+ projects, we've built a list of patterns that predict trouble. If we see two or more of these during discovery, we flag the project as high-risk:
- "We need AI to figure out what to do." If the humans doing the work can't clearly articulate the decision logic, an AI agent won't magically discover it. AI automates judgment. It doesn't invent it.
- "The data is in spreadsheets that different people maintain." Fragmented, manually maintained data sources are a reliability nightmare. Fix the data layer first.
- "This needs to work perfectly from day one." No agent launches at 100% accuracy. If there's no room for a supervised learning period, the project isn't ready.
- "We've tried three other tools and they all failed." Sometimes this means the problem is genuinely hard and needs a custom approach. More often, it means the underlying process is broken and no amount of automation will fix a bad workflow.
- "Can you just make it smart?" Vague requirements produce vague outcomes. If we can't define "smart" in measurable terms, we push back until we can.
What the Scoping Document Looks Like
Every project gets a scoping document before we write code. Here's what it contains:
- Workflow map: Visual diagram of every step, decision point, and system handoff
- Success metrics: Specific, measurable targets (e.g., "reduce processing time from 45 minutes to under 5 minutes")
- Failure modes and mitigations: What can go wrong and how we handle it
- Data architecture: Where data comes from, how it flows, and what transformations are needed
- Human-in-the-loop checkpoints: Where a human reviews agent decisions before they execute
- Phase plan: What ships in week 1, week 4, and week 8
- Cost model: Build cost, operating cost, and projected ROI timeline
This document is the contract between us and the client. If we can't fill it out completely, we're not ready to build.
Why This Matters
Scoping isn't the exciting part of AI work. Nobody tweets about their discovery framework. But it's the difference between a project that delivers 10x ROI and a project that gets quietly shelved after three months.
Teams that work with our automation practice get this process as a standard part of every engagement. Some clients come to us expecting to start building on day one. By the end of discovery, they're glad we didn't. Because the project we end up building is almost never the one they originally imagined. It's better.
If you've got a workflow you think AI could improve, the first step isn't building. It's understanding. Start a conversation with us and we'll help you figure out what's worth automating and what isn't.