Part 13 of 13

Templates and Command Reference

Full generic templates for PRDs, phase plans, and specs. Complete command definitions for all 8 commands. Queue and metrics schemas. Adapt the structure to your stack; keep the constraints.

~15 minute read (reference material)

Each template below shows the generic structure first, then a concrete excerpt using the Linear running example. The structure is stack-agnostic. The examples use a specific stack (Next.js, tRPC, Drizzle, Neon) to make concepts concrete.

PRD Template

Header

# [Unit ID]: [Title]
Category: Frontend | Backend | Full Stack
Milestone: M[N]
Status: Draft | Verified
Dependencies: [Unit IDs]
Test Pair: [Unit ID]

Common Sections (All PRDs)

Purpose - Why this unit exists. Consequence of not having it.
Scope - In scope / out of scope. Out-of-scope items reference which unit handles them.
Dependencies - Unit ID, title, status, why needed.
Test Pair - Which unit verifies this one.
Related Units - 4 relationship types: depends on, test pair, composes into, shares pattern with.
Success Signals - Observable behaviors (not metrics).
Metrics to Track - Quantifiable with targets and measurement method.
Cross-Unit Integration Points - Sends to / receives from, with triggers and data shapes.
Tier and Metering Constraints - Feature availability per tier. User experience at each limit.
Configuration Surface - What is configurable, defaults, where it lives.
Assumptions - Documented, not implicit.
Open Questions - Each with a resolution path. None may remain open in a phase plan.
Cross-Cutting Features - Audit, search, notifications: follows defaults or overrides.
Changelog - Date, change, reason.

Frontend Additions (F1-F13)

Backend Additions (B1-B9)

Linear Example: PRD Header for Issue List View

# M1-U03: Issue List View
Category: Frontend
Milestone: M1 (Issue Data Model + Views)
Status: Verified
Dependencies: M1-U02 (Issue CRUD API), M0-U08 (App Shell)
Test Pair: M1-U02 (verified by this frontend unit)

Phase Plan Template

16 Sections

Overview - What this phase accomplishes in one paragraph.
Codebase Snapshot - Existing tables, routers, components, constraint files. Used by verify-spec for drift detection.
Pre-requisites - Codebase requirements (verified), human tasks (blocking/non-blocking), new dependencies (approval needed).
Decisions - Every non-obvious choice: decision, context, rationale.
Deferred Decisions - Question, why deferred, resolved by whom, measurable criteria.
Tables and Schemas - Every column, type, default, constraint, index, RLS policy.
API Contracts - Per procedure: name, type, auth, metering, input, output, validation, errors, optimistic strategy (Via UI / Via Cache / None).
Background Jobs - Trigger, framework, expected duration, retry policy.
Visual Reference - Mock file path and relevant states.
Component References - Used (already built) and needed (build in this phase).
Integration Points - Connects to and consumed by.
Excerpted Guidelines - Rules copied verbatim from foundation docs. Only rules relevant to touched packages.
Risks - Implementation risks with impact, likelihood, mitigation.
Test Strategy - Categories, criteria per category, background job test approach (framework-specific), performance thresholds.
Phase Completion Criteria - Tests, code quality, visual match, integration, documentation.
Changelog

Key Detail: Optimistic Strategy per Mutation

Pattern	When	Implementation
Via UI	Additive (create, toggle)	Render optimistic state using mutation's `isPending` + `variables` in JSX. Do not touch cache.
Via Cache	Reorder/move/delete	Manipulate query cache in `onMutate`. Save previous. Restore in `onError`. Invalidate in `onSettled`.
None	Irreversible (delete, submit)	Wait for server. Show loading on trigger element.
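The Via Cache row can be sketched as follows. This is a minimal illustration, not a real client library: the `cache` object, the `invalidated` list, the `runMutation` helper, and the `issue.list` key are hypothetical stand-ins, and the server call is modeled as a synchronous function so the sketch stays self-contained.

```typescript
// Minimal model of an issue as it sits in a cached list query.
interface Issue {
  id: string;
  status: string;
}

interface MoveInput {
  id: string;
  status: string;
}

// Hypothetical stand-in for the client-side query cache (not a real library API).
const cache = {
  issues: [
    { id: "ISS-1", status: "todo" },
    { id: "ISS-2", status: "todo" },
  ] as Issue[],
};

// Records each invalidation so the behavior can be inspected.
const invalidated: string[] = [];

interface MutationHooks<V, C> {
  onMutate(variables: V): C;
  onError(context: C): void;
  onSettled(): void;
}

const moveIssueHooks: MutationHooks<MoveInput, Issue[]> = {
  onMutate(input) {
    const previous = cache.issues; // save previous
    cache.issues = previous.map((issue) =>
      issue.id === input.id ? { id: issue.id, status: input.status } : issue
    ); // apply the optimistic change to the cache
    return previous;
  },
  onError(previous) {
    cache.issues = previous; // restore the saved snapshot
  },
  onSettled() {
    invalidated.push("issue.list"); // invalidate so the list is refetched
  },
};

// Synchronous stand-in for the mutation lifecycle; a real client would await the server.
function runMutation<V, C>(
  variables: V,
  send: (variables: V) => void,
  hooks: MutationHooks<V, C>
): boolean {
  const context = hooks.onMutate(variables);
  try {
    send(variables);
    return true;
  } catch {
    hooks.onError(context);
    return false;
  } finally {
    hooks.onSettled();
  }
}
```

The snapshot is returned from `onMutate` and handed back to `onError`, so the rollback never depends on state that the optimistic update has already overwritten.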

Key Detail: Background Job Test Approach

For event-driven job frameworks (e.g., Inngest): use the framework's test engine to test full function execution, individual steps, and mocked dependencies.

For long-running task frameworks (e.g., Trigger.dev): extract business logic into pure functions. Test the pure functions with a standard test runner. Mock the task wrapper.
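As a rough sketch of the pure-function approach: the stale-issue job, `IssueRow`, `TaskDeps`, and both function names below are hypothetical and chosen only for illustration. No real framework API is used; the wrapper receives its I/O as plain functions so the business logic can be tested without the task runtime.

```typescript
interface IssueRow {
  id: string;
  lastActivityAt: number; // epoch milliseconds
}

// Pure business logic: no framework imports, no I/O, deterministic for given inputs.
function selectStaleIssueIds(issues: IssueRow[], now: number, maxAgeMs: number): string[] {
  return issues
    .filter((issue) => now - issue.lastActivityAt > maxAgeMs)
    .map((issue) => issue.id);
}

// Hypothetical I/O boundary that a task wrapper would supply.
interface TaskDeps {
  loadIssues(): IssueRow[];
  notify(ids: string[]): void;
  now(): number;
}

// Thin wrapper body: gathers inputs, calls the pure function, performs side effects.
function runStaleIssueTask(deps: TaskDeps, maxAgeMs: number): number {
  const stale = selectStaleIssueIds(deps.loadIssues(), deps.now(), maxAgeMs);
  if (stale.length > 0) {
    deps.notify(stale);
  }
  return stale.length;
}
```

With this split, most tests target `selectStaleIssueIds` directly, and the wrapper needs only a small test with fake dependencies.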

Spec Template

Header

# Spec: [Unit ID]-P[N]-S[N] [Descriptive Name]
Phase: [Unit ID]-P[N]
Category: Frontend | Backend | Full Stack
Status: Draft | Verified | Tests Written | Implemented
Estimated Duration: [N] hours
Test Count: [N] tests in Block A
Context Budget: [N]K tokens (tests: [N]K, tasks: [N]K, refs: [N]K)

Block A: Test Spec

Categories: A1 (Unit/API), A2 (Visual/Component), A3 (Interaction/Accessibility), A4 (E2E), A5 (Performance).

Each test has a unique ID, name, setup, input, and assertions (deep only). toBeDefined(), toBeTruthy(), and toBeFalsy() are not allowed as standalone assertions. Every async test calls expect.assertions(N). Every DB test verifies state with an independent query.
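The ban on shallow matchers can be pictured with a small scanner. This is only a sketch: `findShallowAssertions` is a hypothetical helper, not the ESLint check the commands use, and it flags every use of a banned matcher rather than deciding whether the assertion is truly standalone.

```typescript
// Matchers the spec template bans as standalone assertions.
const SHALLOW_MATCHERS = ["toBeDefined", "toBeTruthy", "toBeFalsy"];

interface ShallowHit {
  line: number; // 1-based line number in the scanned source
  matcher: string;
}

// Scans test source text line by line and reports each banned matcher call.
function findShallowAssertions(source: string): ShallowHit[] {
  const hits: ShallowHit[] = [];
  source.split("\n").forEach((text, index) => {
    for (const matcher of SHALLOW_MATCHERS) {
      if (text.indexOf("." + matcher + "(") !== -1) {
        hits.push({ line: index + 1, matcher });
      }
    }
  });
  return hits;
}
```

A production rule would work on the syntax tree instead of raw text, so that comments and string literals do not produce false positives.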

Block B: Implementation Spec

B1: Pre-execution checks - Dependencies verified, pattern files exist, no merge conflicts.
B2: Tasks - Sequential. Per task: action, file, what to build, pattern reference, mapped test IDs.
B3: Deferred decision tasks - Measurable criteria, both paths specified.
B4: Post-execution (ex-impl, Tier 1-2) - Full test suite, typecheck, lint, security, self-review.
B5: Verification (ex-verify, Tier 3-4) - Mutation testing (StrykerJS, >=70%), exploratory browser testing, code review, migration generation, push to main.
B6: Commit - Message format, squash, local only (ex-verify pushes).

Schema Tasks Use Push, Not Generate

During execution: drizzle-kit push (or equivalent) applies schema directly to the isolated database branch. No migration files generated. Canonical migration files are generated by ex-verify as the final step before pushing to main. This avoids migration file conflicts between parallel specs.

Linear Example: Spec Block A Test (issue.create)

## A1-T3: issue.create validates title length

| Field  | Value |
|--------|-------|
| Setup  | Create workspace and team via factories. |
| Input  | Call issue.create with title of 501 characters. |
| Assert | Throws TRPCError with code BAD_REQUEST. |
|        | Error message contains "title". |
|        | No row inserted into issues table (verify: SELECT COUNT(*) FROM issues WHERE team_id = [team.id] returns 0). |

Command Reference

Planning Commands

Command	Model	Input	Output	Key Checks
verify-prd	Deep reasoning	PRD + all existing PRDs + foundation docs	Verification report	18 checks: template conformance, terminology, scope overlap, dependency validation, integration symmetry, states coverage, mock verification, etc.
generate-phase	Deep reasoning	Verified PRD + codebase	1-3 phase plan files	Resolves all open questions. Every section filled. Tables with RLS. API contracts with optimistic strategy. Job test approaches specified.
generate-spec	Fast	Phase plan + codebase	1-3 spec files	Splitting by test count (<25), duration (<2h), context (<50K tokens). Deep assertion enforcement. Context budget in header.
verify-spec	Fast	Spec + codebase + phase plan	Updated spec or pass	8 drift checks (file existence, schema match, new files) + 7 fresh-eyes checks (self-contained, deep assertions, optimistic strategy match).
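The generate-spec splitting limits from the table above (fewer than 25 tests, under 2 hours, under 50K tokens of context) can be expressed as a simple check. The `SpecEstimate` shape and the `exceededLimits` helper are hypothetical names used only for this sketch.

```typescript
// Estimated size of a candidate spec (field names are assumptions).
interface SpecEstimate {
  testCount: number;
  estimatedHours: number;
  contextTokens: number;
}

// Limits quoted in the generate-spec row; a spec must stay strictly below each one.
const SPEC_LIMITS: SpecEstimate = {
  testCount: 25,
  estimatedHours: 2,
  contextTokens: 50000,
};

// Returns the names of every limit the estimate reaches or exceeds.
// An empty result means the spec does not need to be split.
function exceededLimits(spec: SpecEstimate): string[] {
  const exceeded: string[] = [];
  if (spec.testCount >= SPEC_LIMITS.testCount) exceeded.push("testCount");
  if (spec.estimatedHours >= SPEC_LIMITS.estimatedHours) exceeded.push("estimatedHours");
  if (spec.contextTokens >= SPEC_LIMITS.contextTokens) exceeded.push("contextTokens");
  return exceeded;
}
```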

Execution Commands

Command	Model	Input	Output	Key Actions
ex-test	Fast	Spec Block A	Test files (all failing)	Rebase. DB branch create/drop. Generate tests. ESLint shallow check. Code review. Commit locally.
ex-impl	Fast	Spec Block B + test files	Implementation (Tier 1-2 verified)	Rebase. DB branch. Execute tasks sequentially. Fix-and-retest loop. Tier 1 (automated) + Tier 2 (self-review). Squash commit locally.
ex-verify	Deep reasoning	Spec + phase + PRD + code	Verified code on main	DB branch. StrykerJS >=70%. Exploratory testing (5-15 scenarios via extended reasoning). Code review. Migration generate. Rebase + push.

Feedback Command

Command	Model	Input	Output	Key Actions
apply-learnings	Deep reasoning	Exec log (3 sections) + metrics	Updated documents	Per-spec: categorize findings into 6 types, propose diffs, human approves. Per-milestone: trend analysis across all specs.

Self-Scheduling

All execution commands support running without arguments. They read .machine (machine identity) and agents/queue.json (spec status and dependencies), filter by eligibility, apply priority (unblocking > critical path > conflict avoidance > FIFO), and recommend the next spec. A human confirms the recommendation before the command proceeds.
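One way to picture the selection step is below. It rests on several assumptions that the guide does not confirm: a spec is eligible when all of its dependencies are complete, the numeric `priority` field already encodes the unblocking and critical-path ranking, a lower number runs first, and ties fall back to queue order (FIFO). The `recommendNext` helper is hypothetical.

```typescript
// Subset of a queue entry needed for selection.
interface QueueEntry {
  spec_id: string;
  dependencies: string[];
  priority: number; // assumption: lower number means run sooner
}

// Recommends the next spec, or null when nothing is eligible.
function recommendNext(ready: QueueEntry[], completedIds: string[]): QueueEntry | null {
  // Eligibility (assumed): every dependency is already complete.
  const eligible = ready.filter((entry) =>
    entry.dependencies.every((dep) => completedIds.indexOf(dep) !== -1)
  );
  // Strict comparison keeps the earliest entry on ties, which gives FIFO order.
  let best: QueueEntry | null = null;
  for (const entry of eligible) {
    if (best === null || entry.priority < best.priority) {
      best = entry;
    }
  }
  return best;
}
```

The function only recommends; the confirmation step stays with the human operator.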

Queue Schema

agents/queue.json

{
  "version": 1,
  "updated_at": "2026-03-26T14:30:00Z",
  "updated_by": "machine-a1",
  "ready": [
    {
      "spec_id": "M1-U02-P1-S1",
      "unit": "M1-U02",
      "title": "Issue CRUD Backend",
      "category": "backend",
      "packages_touched": ["db", "api", "types"],
      "files_modified": ["packages/api/src/root.ts"],
      "dependencies": ["M0-U01-P1-S1"],
      "estimated_hours": 1.5,
      "priority": 1
    }
  ],
  "in_progress": [],
  "completed": []
}

Status progression: ready > testing > tests-written > implementing > implemented > verifying > complete
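The progression can be modeled as an ordered list with a guard against skipping steps. This sketch assumes the progression is strictly linear with no rollback transitions, which the guide does not state explicitly; `nextStatus` and `canTransition` are hypothetical helpers.

```typescript
// Statuses in the order given by the status progression.
const STATUSES = [
  "ready",
  "testing",
  "tests-written",
  "implementing",
  "implemented",
  "verifying",
  "complete",
] as const;

type SpecStatus = (typeof STATUSES)[number];

// Returns the status that follows `current`, or null when the spec is complete.
function nextStatus(current: SpecStatus): SpecStatus | null {
  const next = STATUSES[STATUSES.indexOf(current) + 1];
  return next === undefined ? null : next;
}

// A transition is allowed only to the immediate successor (assumed: no skips, no rollbacks).
function canTransition(from: SpecStatus, to: SpecStatus): boolean {
  return nextStatus(from) === to;
}
```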

Conflict avoidance: files_modified tracks existing files that will be changed (not new files created). Two specs that both create new files in the same directory can run in parallel. Two specs that both modify the same existing file must run sequentially.
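The rule reduces to an intersection test on `files_modified`. A minimal sketch, with hypothetical helper names:

```typescript
// Returns the existing files that both specs intend to modify.
function sharedModifiedFiles(a: string[], b: string[]): string[] {
  return a.filter((file) => b.indexOf(file) !== -1);
}

// Two specs conflict only when their files_modified lists overlap.
function mustRunSequentially(a: string[], b: string[]): boolean {
  return sharedModifiedFiles(a, b).length > 0;
}
```

Because newly created files never appear in `files_modified`, two specs that only add files to the same directory produce no overlap and can run in parallel.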

Execution Log Structure

agents/exec-log-[SpecID].md

Three sections, each appended by its respective command:

Section 1 (ex-test): Session timing, model, machine, memory snapshots, DB branch lifecycle, test files created (count per file), self-check results (spec match, auto-generated tests), ESLint results, code review, test run results (new failing, existing passing), commit hash, key learnings.

Section 2 (ex-impl): Session timing, model, machine, memory, DB branch lifecycle, task execution table (per-task: duration, tests run, fix cycles, notes), deferred decisions resolved (question, answer, measurement), Tier 1-2 results (pass/fail per check), code review, commit hash, key learnings.

Section 3 (ex-verify): Session timing, model, machine, memory, DB branch lifecycle, mutation testing (initial/final score, assertions deepened, time), exploratory testing (scenarios planned/executed, issues by severity, fixed/deferred), code review findings, Tier 3-4 results, migration files, rebase details (conflicts, resolution), push confirmation, key learnings, deferred items for apply-learnings.

Metrics Schema

agents/metrics.json

Two layers: per-spec entries and per-milestone aggregates.

Per spec: spec_id, unit, milestone, machine, category, per-command metrics (duration, model, tests created/passed, fix cycles, mutation scores, issues found), totals (total duration, deviations, human interventions, rework flag, push timestamp, commit hash).

Per milestone: specs completed, total duration, average duration, total deviations, rework rate, average mutation score.
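The per-milestone aggregates can be derived from the per-spec entries roughly as follows. The field names are hypothetical (the guide lists the metrics but not their exact keys), and the sketch assumes rework rate means the fraction of specs with the rework flag set.

```typescript
// Per-spec totals needed for aggregation (key names are assumptions).
interface SpecMetrics {
  spec_id: string;
  total_duration_hours: number;
  deviations: number;
  rework: boolean;
  final_mutation_score: number; // percentage, e.g. 75
}

interface MilestoneAggregate {
  specs_completed: number;
  total_duration_hours: number;
  average_duration_hours: number;
  total_deviations: number;
  rework_rate: number; // assumed: fraction of specs flagged for rework, 0 to 1
  average_mutation_score: number;
}

function aggregateMilestone(specs: SpecMetrics[]): MilestoneAggregate {
  const count = specs.length;
  const sum = (pick: (spec: SpecMetrics) => number): number =>
    specs.reduce((total, spec) => total + pick(spec), 0);
  const totalDuration = sum((spec) => spec.total_duration_hours);
  const reworked = specs.filter((spec) => spec.rework).length;
  return {
    specs_completed: count,
    total_duration_hours: totalDuration,
    // Averages fall back to 0 for an empty milestone to avoid dividing by zero.
    average_duration_hours: count === 0 ? 0 : totalDuration / count,
    total_deviations: sum((spec) => spec.deviations),
    rework_rate: count === 0 ? 0 : reworked / count,
    average_mutation_score: count === 0 ? 0 : sum((spec) => spec.final_mutation_score) / count,
  };
}
```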

Trend notes: Added at end of each milestone. Human-readable observations about process health, adjustments made, and rationale.

Adapting These Templates

The templates above are designed to be stack-agnostic in structure but stack-specific in examples. When adapting to your project:

Keep the section structure. The sections exist because each one prevents a specific class of error. Removing sections creates gaps that AI agents will fill with guesswork.
Replace tool-specific references. Swap "tRPC" for your API framework, "Drizzle" for your ORM, "Neon" for your database. The patterns (RLS testing, migration strategy, optimistic UI) apply regardless of tool.
Adjust thresholds to your context. The 70% mutation score, 50K-token context budget, and 2-hour session duration are calibrated for the described toolchain. Your thresholds may differ.
Start with one unit end-to-end. Do not try to set up the entire system before building anything. Pick one backend unit, create its PRD, generate its phase plan, generate its spec, and execute it. Refine the templates based on what you learn.

The system works because it constrains AI agents, not because it uses specific tools. Replace every tool mentioned in this guide, and the methodology still works. The constraints (tests before code, self-contained specs, fresh sessions, mutation testing, feedback loops) are the value. The tools are the implementation.
