The Problem This Solves
AI coding agents can write code. That is no longer the bottleneck. The bottleneck is that they write code that looks correct but behaves incorrectly at scale, and no amount of prompting fixes the structural problems.
Three failure modes emerge when you move beyond toy projects:
Shallow assertions. Research shows that LLM-generated tests systematically capture actual program behavior rather than expected behavior. Mutation testing scores for AI-generated tests average around 40%, compared to 80%+ for well-written human suites. At a 40% score, roughly 60% of deliberately injected faults go undetected, so a passing suite gives you false confidence. [R1]
Context window degradation. After 3-4 auto-compactions, the agent reimplements things it already built, forgets patterns it established, and makes inconsistent decisions. The longer the session, the worse the output quality.
No memory between sessions. Each new session starts from zero. The agent that designed your notification system yesterday has no knowledge of those decisions today.
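To make the first failure mode concrete, here is a minimal sketch of a shallow assertion next to one that encodes expected behavior, scored the way a mutation testing tool would score them. Everything in it is hypothetical and chosen only for illustration: the `apply_discount` function, the two hand-written mutants, and the `mutation_score` helper are assumptions, not part of this guide's tooling or of any real mutation testing library.

```python
from typing import Callable, List

DiscountFn = Callable[[float, float], float]


def apply_discount(price: float, percent: float) -> float:
    """Intended behavior: reduce price by the given percentage."""
    return price * (1 - percent / 100)


# Hand-written mutants (hypothetical): small faults injected into the logic.
def mutant_flipped_sign(price: float, percent: float) -> float:
    return price * (1 + percent / 100)


def mutant_flat_subtraction(price: float, percent: float) -> float:
    return price - percent


MUTANTS: List[DiscountFn] = [mutant_flipped_sign, mutant_flat_subtraction]


def shallow_test(fn: DiscountFn) -> bool:
    """Shallow: only checks that some positive float comes back."""
    result = fn(200.0, 25.0)
    return isinstance(result, float) and result > 0


def specific_test(fn: DiscountFn) -> bool:
    """Specific: checks the value the requirement demands (25% off 200 is 150)."""
    return abs(fn(200.0, 25.0) - 150.0) < 1e-9


def mutation_score(test: Callable[[DiscountFn], bool], mutants: List[DiscountFn]) -> float:
    """Fraction of mutants killed. A mutant is killed when the test fails on it."""
    killed = sum(1 for mutant in mutants if not test(mutant))
    return killed / len(mutants)


if __name__ == "__main__":
    # Both tests pass on the real implementation, so both look "green".
    print(shallow_test(apply_discount), specific_test(apply_discount))
    # Only the specific test notices when the logic is broken.
    print(mutation_score(shallow_test, MUTANTS))
    print(mutation_score(specific_test, MUTANTS))
```

Both tests pass against the correct implementation, which is why shallow assertions are hard to spot in review. The difference only shows up when the code under test is wrong: the shallow test keeps passing on both mutants, while the specific test fails on each of them.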
The solution is not better prompting. It is a system that makes the agent's job small enough, well-defined enough, and constrained enough that supervision becomes verification rather than direction.
The Full Journey
Building a product with AI coding agents is not "describe a feature, get code." It is a multi-stage process where each stage produces artifacts that constrain the next. The system works because every stage absorbs a specific type of complexity, so no single stage is overwhelmed.
Guide Contents
The Journey (Foundation)
Before writing a line of code, establish the foundation that makes AI agents predictable.
Building
The planning and execution pipeline that turns foundation documents into shipped code.
Shipping and Sustaining
Getting code to production and keeping it healthy.
Reference
Research backing the methodology and ready-to-use templates.
About the Running Example
Throughout this guide, concepts are illustrated using Linear (the project management tool) as a running example. Linear is used because it is a well-known, complex SaaS product with multiple modules (Issues, Projects, Cycles, Roadmaps, Inbox, Settings), real-time collaboration, keyboard-first UX, and a distinctive brand identity.
When you see a block like this, it shows what the concept would look like for Linear. Adapt the specifics to your own product.
How to Read This Guide
If you are starting from scratch: Read Parts 1-4 (Foundation) sequentially. These establish the artifacts that everything else depends on.
If you have a product and want to add AI agents to your workflow: Start with Part 5 (PRD Methodology) and Part 7 (AI-First Execution). These are the core differentiators.
If you are already using AI agents but hitting quality issues: Jump to the section "The Shallow Assertion Problem" in Part 7 and to Part 11 (Preventing Drift).
If you want the templates to adapt immediately: Go straight to Part 13 (Templates and Command Reference).
Each part has inline research references (marked with [R1], [R2], etc.) that link to Part 12 (Research Notes) for the full citation and findings.