From PRD to Execution Instructions
Phase plans bridge product intent and implementation details. Specs are line-by-line instructions for AI agents. Together they form the orchestration layer that turns a PRD into shipped code.
The Six-Layer Hierarchy
Every feature is decomposed into a strict hierarchy. Each layer absorbs a different type of complexity.
Milestones are 4-12 week increments. M0 might be "Auth + Workspace," M1 might be "Issue Data Model + Views."
Units are the atomic planning level. Each unit gets one PRD. "Issue List View" or "Notification Engine."
Phases are the bridge. A unit with tables, API, and UI might split into Phase 1 (tables + API) and Phase 2 (UI). Each phase must be independently testable.
Specs are what the AI agent reads. Block A: every test. Block B: every implementation task. Zero decisions required.
Tasks are the lowest actionable level. "Create file packages/db/src/schema/issues.ts with the following table definition."
Phase Plans: The Bridge Document
The phase plan translates a product-level PRD into an implementation-level plan. This is where tables, columns, API signatures, indexes, RLS policies, and file paths first appear.
16 Sections
The phase plan has 16 sections (full template in Part 13). The most critical ones:
Codebase Snapshot (Section 2): What already exists that is relevant. Existing tables, routers, components. This is what spec verification uses to detect drift: if the codebase changes between phase plan creation and spec execution, the snapshot reveals it.
Tables and Schemas (Section 6): Every column, type, default, constraint, index, and RLS policy. Detailed enough that the executing agent writes the schema file without making any design decisions.
API Contracts (Section 7): Every procedure with input/output schema, auth, metering, validation rules, error cases, and an optimistic strategy per mutation (Via UI, Via Cache, or None). The optimistic strategy tells the spec generator exactly which TanStack Query pattern to implement.
Background Job Test Approach (Section 14): For every background job, the specific testing framework and pattern. Inngest functions use @inngest/test with InngestTestEngine. Trigger.dev tasks extract logic to pure functions tested with Vitest. This prevents the executing agent from guessing how to test async jobs.
Excerpted Guidelines (Section 12): Rules copied verbatim from foundation docs. Only rules relevant to packages this phase touches. The spec generator reads nothing outside the phase plan.
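To illustrate the "extract logic to pure functions" approach that Section 14 prescribes for Trigger.dev tasks, here is a minimal sketch for an auto-archive job. The `StaleCandidate` shape, the function name, and the day-based threshold are assumptions made for illustration, not part of the phase plan template. The idea is that the job wrapper only fetches rows and calls this function, so a Vitest test can call it directly with fixed inputs.

```typescript
// Hypothetical issue shape: only the fields this job's decision needs.
interface StaleCandidate {
  id: string;
  status: "open" | "in_progress" | "done" | "archived";
  updatedAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Pure decision logic: no database access, no system clock, no job framework.
// Returns the IDs of non-archived issues last updated at least `maxAgeDays` ago.
function selectStaleIssueIds(
  issues: StaleCandidate[],
  now: Date,
  maxAgeDays: number,
): string[] {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  return issues
    .filter((issue) => issue.status !== "archived")
    .filter((issue) => issue.updatedAt.getTime() <= cutoff)
    .map((issue) => issue.id);
}
```

Passing `now` as a parameter instead of reading the clock inside the function keeps the test deterministic.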
Section 7: API Contracts
| Field | issue.list | issue.create |
|---|---|---|
| Type | Query | Mutation |
| Auth | Authenticated, workspace member | Authenticated, workspace member |
| Input | teamId, status?, assigneeId?, priority?, cursor?, limit? | title (required), teamId, description?, status?, priority?, assigneeId?, labelIds? |
| Output | { items: Issue[], nextCursor: string \| null } | Issue |
| Validation | teamId must belong to current workspace. Limit 1-100. | Title 1-500 chars. teamId must belong to workspace. assigneeId must be workspace member. |
| Optimistic | N/A (query) | Via UI (append to list using isPending + variables) |
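The Validation row above can be read as a small set of checks. The following is a hedged sketch in plain TypeScript; `WorkspaceContext`, `validateIssueCreate`, and `isValidListLimit` are hypothetical names, and a real router would more likely express these rules through its schema library. The requirement that the limit be an integer is an assumption; the contract only states the 1-100 range.

```typescript
// Subset of the issue.create input that has validation rules in the contract.
interface IssueCreateInput {
  title: string;
  teamId: string;
  assigneeId?: string;
}

// Hypothetical context the procedure would load for the current workspace.
interface WorkspaceContext {
  teamIds: Set<string>;
  memberIds: Set<string>;
}

// Returns a list of human-readable violations; an empty list means valid.
function validateIssueCreate(
  input: IssueCreateInput,
  workspace: WorkspaceContext,
): string[] {
  const errors: string[] = [];
  if (input.title.length < 1 || input.title.length > 500) {
    errors.push("title must be 1-500 characters");
  }
  if (!workspace.teamIds.has(input.teamId)) {
    errors.push("teamId must belong to the current workspace");
  }
  if (input.assigneeId !== undefined && !workspace.memberIds.has(input.assigneeId)) {
    errors.push("assigneeId must be a workspace member");
  }
  return errors;
}

// issue.list: limit must fall in the 1-100 range (integer is an assumption).
function isValidListLimit(limit: number): boolean {
  return Number.isInteger(limit) && limit >= 1 && limit <= 100;
}
```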
Open Question Resolution
PRDs can have open questions. Phase plans cannot. The generate-phase command resolves every open question through one of four paths:
- Resolve from codebase analysis. "Which pagination pattern should we use?" Check existing routers: cursor-based is the established pattern.
- Ask the human. "Should archived issues appear in search results?" This is a product decision.
- Defer to spec generation. "What is the optimal index strategy for the filter query?" Requires deeper code investigation.
- Defer to implementation with criteria. "If the filter query takes >50ms with 10K rows, add a composite index. Otherwise, rely on the existing single-column indexes." The agent measures and decides during execution.
Every decision deferred to implementation carries measurable criteria: "Use approach A if [condition]. Otherwise use approach B." This leaves no open-ended judgment calls during execution.
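A deferred decision with measurable criteria reduces to a function of the measurement. The sketch below encodes the index example above. The type and function names are hypothetical, and rejecting measurements taken on fewer than 10K rows is an added assumption about how the criterion would be enforced.

```typescript
type IndexDecision = "add_composite_index" | "keep_single_column_indexes";

interface QueryMeasurement {
  rowCount: number;   // rows in the table when the query was timed
  durationMs: number; // measured filter query time
}

const BENCHMARK_ROWS = 10000; // "with 10K rows"
const THRESHOLD_MS = 50;      // ">50ms"

// Applies the criterion mechanically: the agent measures, the rule decides.
function decideIndexStrategy(measurement: QueryMeasurement): IndexDecision {
  if (measurement.rowCount < BENCHMARK_ROWS) {
    // Assumption: a timing on a smaller table cannot satisfy the criterion.
    throw new Error("measurement must be taken with at least 10K rows");
  }
  return measurement.durationMs > THRESHOLD_MS
    ? "add_composite_index"
    : "keep_single_column_indexes";
}
```

Because the threshold is strict, a measurement of exactly 50ms keeps the existing single-column indexes.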
Spec Generation
Specs are what the AI agent reads line by line. A spec has two blocks:
Block A: Test Spec
Every test file, test case, setup, input, and assertion. Organized by category:
- A1: Unit / API tests. Router procedures, business logic, validation, error cases.
- A2: Visual / Component tests. Component renders correctly in each state, brand tokens applied.
- A3: Interaction / Accessibility tests. Click, keyboard, focus, ARIA verification.
- A4: Functional E2E tests. Full user journeys from PRD mapped to Playwright scenarios.
- A5: Performance tests. Query time, render time, scroll performance with thresholds.
Each test has a unique ID (A1-T1, A1-T2) for traceability. Block B tasks reference these IDs to indicate which tests verify which implementation.
Block B: Implementation Spec
Sequential tasks. Each task specifies: what to do, which file, what to build, which existing file to follow as a pattern, and which Block A tests should pass after this task.
## Task 3: Create issue tRPC router
| Field | Value |
|------------------|-------|
| Action | Create new file |
| File | packages/api/src/routers/issue.ts |
| What to Build | issue.list (cursor pagination, workspace-scoped), issue.create (validation, RLS, event emit), issue.update (partial update, ownership check) |
| Pattern Reference| Follow structure of packages/api/src/routers/team.ts |
| Tests to Run | A1-T1, A1-T2, A1-T3, A1-T4, A1-T5, A1-T6 |
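The link between "Tests to Run" and Block A test IDs can be checked mechanically. The following is a minimal sketch of such a traceability check, assuming a hypothetical in-memory representation of a spec (`SpecTask`, `TraceabilityReport`). The document does not say whether a check like this exists in the tooling.

```typescript
// Hypothetical representation of one Block B task.
interface SpecTask {
  id: number;
  title: string;
  testsToRun: string[]; // Block A test IDs, e.g. "A1-T1"
}

interface TraceabilityReport {
  unknownTestIds: string[];   // referenced by a task but not defined in Block A
  uncoveredTestIds: string[]; // defined in Block A but referenced by no task
}

function checkTraceability(
  blockATestIds: string[],
  tasks: SpecTask[],
): TraceabilityReport {
  const defined = new Set<string>(blockATestIds);
  const referenced = new Set<string>();
  for (const task of tasks) {
    for (const testId of task.testsToRun) {
      referenced.add(testId);
    }
  }
  return {
    unknownTestIds: Array.from(referenced).filter((id) => !defined.has(id)),
    uncoveredTestIds: blockATestIds.filter((id) => !referenced.has(id)),
  };
}
```

An empty report means every referenced ID exists in Block A and every Block A test is claimed by at least one task.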
Three Splitting Heuristics
A phase may produce 1-3 specs. The splitting decision uses three heuristics:
| Heuristic | Threshold | Reasoning |
|---|---|---|
| Test count | 3-25 per spec | Below 3: overhead exceeds value. Above 25: agent loses track. |
| Session duration | Under 2 hours | Memory leaks and context degradation worsen after 2h. [R4] |
| Context budget | Under 50K tokens | Tests ~200 tokens each. Tasks ~300. Pattern files ~500-2000. If total exceeds 50% of usable context, the spec is too large. |
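The three thresholds can be sketched as a size check. The per-item token figures come from the table above; the `SpecSize` shape and function names are hypothetical, and the session estimate is assumed to be supplied by whoever plans the spec. Treating "under" as a strict bound is also an assumption.

```typescript
interface SpecSize {
  testCount: number;
  taskCount: number;
  patternFileTokens: number[]; // estimated tokens for each pattern file
  estimatedSessionHours: number;
}

const TOKENS_PER_TEST = 200;
const TOKENS_PER_TASK = 300;
const MAX_CONTEXT_TOKENS = 50000;
const MIN_TESTS = 3;
const MAX_TESTS = 25;
const MAX_SESSION_HOURS = 2;

// Rough context estimate: tests + tasks + pattern files.
function estimateContextTokens(spec: SpecSize): number {
  const patternTokens = spec.patternFileTokens.reduce((sum, t) => sum + t, 0);
  return (
    spec.testCount * TOKENS_PER_TEST +
    spec.taskCount * TOKENS_PER_TASK +
    patternTokens
  );
}

// Returns one message per violated heuristic; empty means within all limits.
function sizeViolations(spec: SpecSize): string[] {
  const violations: string[] = [];
  if (spec.testCount < MIN_TESTS) {
    violations.push("below 3 tests: overhead exceeds value");
  }
  if (spec.testCount > MAX_TESTS) {
    violations.push("above 25 tests: agent loses track");
  }
  if (spec.estimatedSessionHours >= MAX_SESSION_HOURS) {
    violations.push("session not under 2 hours");
  }
  if (estimateContextTokens(spec) >= MAX_CONTEXT_TOKENS) {
    violations.push("context budget not under 50K tokens");
  }
  return violations;
}
```

This check reports only hard violations. A spec that sits just under a limit may still warrant a split by judgment.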
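Phase M1-U02-P1 has: 3 new tables, 6 API procedures, 2 background jobs, 22 test cases. Context budget: ~45K tokens. Both the test count (22 against a maximum of 25) and the context budget (~45K against 50K tokens) are under threshold but close.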
Split decision: Split into 2 specs.
Spec S1 (Backend Core): Tables, schema, 4 CRUD procedures (list, create, update, archive). 14 tests. ~28K tokens.
Spec S2 (Backend Extended): Relations, bulk actions, 2 background jobs (auto-archive stale issues, sync with GitHub). 8 tests. ~17K tokens.
S2 depends on S1 (it references the tables S1 creates).
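Because specs can depend on each other, any queue that runs them has to respect that order. One way to derive it is a depth-first topological sort, sketched below. The `SpecNode` shape and the function name are hypothetical; the document does not describe how the execution queue orders specs.

```typescript
interface SpecNode {
  id: string;          // e.g. "S1"
  dependsOn: string[]; // IDs of specs that must run first
}

// Returns spec IDs in an order where every spec follows its dependencies.
// Throws on unknown IDs and on dependency cycles.
function executionOrder(specs: SpecNode[]): string[] {
  const byId = new Map<string, SpecNode>();
  for (const spec of specs) {
    byId.set(spec.id, spec);
  }

  const state = new Map<string, "visiting" | "done">();
  const order: string[] = [];

  function visit(id: string): void {
    const current = state.get(id);
    if (current === "done") return;
    if (current === "visiting") {
      throw new Error(`dependency cycle involving ${id}`);
    }
    const spec = byId.get(id);
    if (spec === undefined) {
      throw new Error(`unknown spec ${id}`);
    }
    state.set(id, "visiting");
    for (const dependency of spec.dependsOn) {
      visit(dependency);
    }
    state.set(id, "done");
    order.push(id);
  }

  for (const spec of specs) {
    visit(spec.id);
  }
  return order;
}
```

For the example above, S1 is placed before S2 regardless of the order in which the two specs are listed.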
Spec Verification
Before a spec enters the execution queue, it passes through a verification command that runs 15 checks (8 drift checks + 7 fresh-eyes checks).
Drift Checks
The codebase changes between phase plan creation and spec execution. Other specs merge. New files appear. Drift checks compare the spec against the current codebase state, for example:
- Do the files referenced in "Pattern Reference" still exist?
- Do the tables in the codebase snapshot still match reality?
- Are there new files in the target directories that the spec does not account for?
- Has the router registration file changed since the phase plan was created?
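Two of these checks, missing pattern files and unaccounted new files, can be sketched as set comparisons over file paths. The shapes and names below are hypothetical, and the sketch assumes the caller has already listed the current repository files. It is not the verification command itself.

```typescript
// Hypothetical view of what a spec expects about the file system.
interface SpecFileExpectations {
  patternReferences: string[]; // files named in "Pattern Reference" fields
  targetDirectories: string[]; // directories the spec writes into
  snapshotFiles: string[];     // files recorded in the codebase snapshot
}

interface DriftReport {
  missingPatternFiles: string[]; // referenced patterns that no longer exist
  unaccountedFiles: string[];    // new files in target dirs, absent from the snapshot
}

function detectFileDrift(
  spec: SpecFileExpectations,
  currentFiles: string[],
): DriftReport {
  const current = new Set<string>(currentFiles);
  const known = new Set<string>(spec.snapshotFiles);

  // Normalize directories to end with "/" so "routers" does not match "routers-old".
  const prefixes = spec.targetDirectories.map((dir) =>
    dir.endsWith("/") ? dir : dir + "/",
  );

  return {
    missingPatternFiles: spec.patternReferences.filter((p) => !current.has(p)),
    unaccountedFiles: currentFiles.filter(
      (file) =>
        !known.has(file) && prefixes.some((prefix) => file.startsWith(prefix)),
    ),
  };
}
```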
Fresh-Eyes Checks
A second pass that looks at the spec as if reading it for the first time. Example checks:
- Can the spec be executed without reading any external document?
- Do test assertions check specific expected values (not just "defined" or "truthy")?
- Does every mutation have an optimistic strategy that matches the phase plan?
- Are deferred decisions clearly measurable?
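The assertion-specificity check lends itself to a simple text scan. The sketch below flags lines that use the `toBeDefined` or `toBeTruthy` matchers, which assert existence without an expected value. The matcher list and the line-based scan are assumptions chosen for illustration; a real check might parse the test file instead.

```typescript
// Matchers that pass for almost any value. This list is an assumption.
const WEAK_MATCHERS = ["toBeDefined", "toBeTruthy"];

interface WeakAssertion {
  line: number;    // 1-based line number in the test source
  matcher: string; // the weak matcher that was found
  text: string;    // the trimmed source line
}

// Scans test source line by line and reports uses of weak matchers.
function findWeakAssertions(testSource: string): WeakAssertion[] {
  const findings: WeakAssertion[] = [];
  const lines = testSource.split("\n");
  lines.forEach((text, index) => {
    for (const matcher of WEAK_MATCHERS) {
      if (text.includes("." + matcher + "(")) {
        findings.push({ line: index + 1, matcher, text: text.trim() });
      }
    }
  });
  return findings;
}
```

A line such as `expect(result.items.length).toBe(2)` passes the scan because it compares against a specific value.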
If drift is detected, the spec is auto-updated where possible (file paths, pattern references). Structural issues are flagged for human review.
After verification, the spec enters the execution queue. Part 7 covers what happens when the AI agent starts writing code.