Part 13 of 13

Templates and Command Reference

Full generic templates for PRDs, phase plans, and specs. Complete command definitions for all 8 commands. Queue and metrics schemas. Adapt the structure to your stack; keep the constraints.

~15 minute read (reference material)

Each template below shows the generic structure first, then a concrete excerpt using the Linear running example. The structure is stack-agnostic. The examples use a specific stack (Next.js, tRPC, Drizzle, Neon) to make concepts concrete.

PRD Template

Header

# [Unit ID]: [Title]
Category: Frontend | Backend | Full Stack
Milestone: M[N]
Status: Draft | Verified
Dependencies: [Unit IDs]
Test Pair: [Unit ID]

Common Sections (All PRDs)

  1. Purpose - Why this unit exists. Consequence of not having it.
  2. Scope - In scope / out of scope. Out-of-scope items reference which unit handles them.
  3. Dependencies - Unit ID, title, status, why needed.
  4. Test Pair - Which unit verifies this one.
  5. Related Units - 4 relationship types: depends on, test pair, composes into, shares pattern with.
  6. Success Signals - Observable behaviors (not metrics).
  7. Metrics to Track - Quantifiable with targets and measurement method.
  8. Cross-Unit Integration Points - Sends to / receives from, with triggers and data shapes.
  9. Tier and Metering Constraints - Feature availability per tier. User experience at each limit.
  10. Configuration Surface - What is configurable, defaults, where it lives.
  11. Assumptions - Documented, not implicit.
  12. Open Questions - Each with a resolution path. None may remain open by the time a phase plan is written.
  13. Cross-Cutting Features - Audit, search, notifications: follows defaults or overrides.
  14. Changelog - Date, change, reason.

Frontend Additions (F1-F13)

  • F1: User Journeys (4+ scenarios)
  • F2: Page Structure
  • F3: Interaction Patterns (trigger/behavior/feedback/result)
  • F4: States (8 sub-variants: empty first-time, empty filtered, loading initial, loading refresh, loaded standard, loaded dense, error full, error partial)
  • F5: Degradation (3 levels)
  • F6: Responsive
  • F7: URL Structure
  • F8: Keyboard
  • F9: Accessibility
  • F10: Performance
  • F11: Brand
  • F12: First-Time UX
  • F13: UI Mock (interactive HTML with state switcher)

Backend Additions (B1-B9)

  • B1: Capability (business-level)
  • B2: Business Rules (absolute constraints)
  • B3: Data Flow Narratives (with failure paths)
  • B4: Performance Targets
  • B5: Reliability and Recovery
  • B6: Security
  • B7: Scalability (launch/growth/scale)
  • B8: Event Catalog
  • B9: Output Mocks

Linear Example: PRD Header for Issue List View
# M1-U03: Issue List View
Category: Frontend
Milestone: M1 (Issue Data Model + Views)
Status: Verified
Dependencies: M1-U02 (Issue CRUD API), M0-U08 (App Shell)
Test Pair: M1-U02 (verified by this frontend unit)

Phase Plan Template

16 Sections

  1. Overview - What this phase accomplishes in one paragraph.
  2. Codebase Snapshot - Existing tables, routers, components, constraint files. Used by verify-spec for drift detection.
  3. Pre-requisites - Codebase requirements (verified), human tasks (blocking/non-blocking), new dependencies (approval needed).
  4. Decisions - Every non-obvious choice: decision, context, rationale.
  5. Deferred Decisions - Question, why deferred, resolved by whom, measurable criteria.
  6. Tables and Schemas - Every column, type, default, constraint, index, RLS policy.
  7. API Contracts - Per procedure: name, type, auth, metering, input, output, validation, errors, optimistic strategy (Via UI / Via Cache / None).
  8. Background Jobs - Trigger, framework, expected duration, retry policy.
  9. Visual Reference - Mock file path and relevant states.
  10. Component References - Used (already built) and needed (build in this phase).
  11. Integration Points - Connects to and consumed by.
  12. Excerpted Guidelines - Rules copied verbatim from foundation docs. Only rules relevant to touched packages.
  13. Risks - Implementation risks with impact, likelihood, mitigation.
  14. Test Strategy - Categories, criteria per category, background job test approach (framework-specific), performance thresholds.
  15. Phase Completion Criteria - Tests, code quality, visual match, integration, documentation.
  16. Changelog

Key Detail: Optimistic Strategy per Mutation

| Pattern   | When                          | Implementation |
|-----------|-------------------------------|----------------|
| Via UI    | Additive (create, toggle)     | Render optimistic state using mutation's isPending + variables in JSX. Do not touch cache. |
| Via Cache | Reorder/move/delete           | Manipulate query cache in onMutate. Save previous. Restore in onError. Invalidate in onSettled. |
| None      | Irreversible (delete, submit) | Wait for server. Show loading on trigger element. |
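
The Via Cache row can be sketched without any particular data-fetching library. The example below is a minimal illustration, assuming a hypothetical in-memory FakeQueryCache and a hypothetical "move issue to top" mutation; the handler names mirror the onMutate / onError / onSettled steps named in the table, but this is not the API of a real query client.

```typescript
type Issue = { id: string; title: string };

// Hypothetical in-memory stand-in for a query cache; not any real library's API.
class FakeQueryCache {
  private data: Record<string, Issue[]> = {};
  invalidatedKeys: string[] = [];

  get(key: string): Issue[] {
    return this.data[key] ?? [];
  }
  set(key: string, value: Issue[]): void {
    this.data[key] = value;
  }
  invalidate(key: string): void {
    this.invalidatedKeys.push(key);
  }
}

// Lifecycle handlers for a hypothetical "move issue to top" mutation.
function moveToTopHandlers(cache: FakeQueryCache, key: string) {
  return {
    // Save the previous list, then write the optimistic order into the cache.
    onMutate(issueId: string): { previous: Issue[] } {
      const previous = cache.get(key);
      const moved = previous.filter((issue) => issue.id === issueId);
      const rest = previous.filter((issue) => issue.id !== issueId);
      cache.set(key, [...moved, ...rest]);
      return { previous };
    },
    // The server rejected the change: restore the saved list.
    onError(context: { previous: Issue[] }): void {
      cache.set(key, context.previous);
    },
    // Success or failure: mark the key stale so it is refetched.
    onSettled(): void {
      cache.invalidate(key);
    },
  };
}
```

A real implementation would wire these three handlers into the mutation hook of whichever client the project uses; the ordering (save, apply, restore on error, invalidate when settled) is the part that carries over.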

Key Detail: Background Job Test Approach

For event-driven job frameworks (e.g., Inngest): use the framework's test engine to test full function execution, individual steps, and mocked dependencies.

For long-running task frameworks (e.g., Trigger.dev): extract business logic into pure functions. Test pure functions with standard test runner. Mock the task wrapper.
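
As a rough illustration of the "extract business logic into pure functions" approach, the sketch below separates a pure selection function from a thin wrapper with injected dependencies. The archiving rule, the 30-day window, and every name here (selectIssuesToArchive, runArchiveTask, ArchiveDeps) are hypothetical; no task framework API is shown.

```typescript
// Hypothetical rule for illustration: done issues untouched for N days get archived.
type IssueRow = { id: string; status: "open" | "done"; updatedAt: Date };

const DAY_MS = 24 * 60 * 60 * 1000;

// Pure business logic: no I/O and no framework imports, so any test runner can call it.
function selectIssuesToArchive(rows: IssueRow[], now: Date, maxAgeDays: number): string[] {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  return rows
    .filter((row) => row.status === "done" && row.updatedAt.getTime() < cutoff)
    .map((row) => row.id);
}

// Dependencies are injected so a test can replace them with mocks.
type ArchiveDeps = {
  loadIssues: () => Promise<IssueRow[]>;
  archive: (ids: string[]) => Promise<void>;
};

// Thin wrapper body: the part a task framework would invoke.
async function runArchiveTask(deps: ArchiveDeps, now: Date): Promise<number> {
  const ids = selectIssuesToArchive(await deps.loadIssues(), now, 30);
  if (ids.length > 0) {
    await deps.archive(ids);
  }
  return ids.length;
}
```

Most assertions then target selectIssuesToArchive directly, and the wrapper needs only a small test with mocked loadIssues and archive.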

Spec Template

Header

# Spec: [Unit ID]-P[N]-S[N] [Descriptive Name]
Phase: [Unit ID]-P[N]
Category: Frontend | Backend | Full Stack
Status: Draft | Verified | Tests Written | Implemented
Estimated Duration: [N] hours
Test Count: [N] tests in Block A
Context Budget: [N]K tokens (tests: [N]K, tasks: [N]K, refs: [N]K)

Block A: Test Spec

Categories: A1 (Unit/API), A2 (Visual/Component), A3 (Interaction/Accessibility), A4 (E2E), A5 (Performance).

Per test: unique ID, name, setup, input, assertions (deep only). No toBeDefined(), toBeTruthy(), toBeFalsy() as standalone assertions. Every async test: expect.assertions(N). Every DB test: verify state with independent query.
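
One way to picture the shallow-assertion rule is a small scanner over test source. This is a simplified, text-based sketch with a hypothetical findShallowAssertions helper, not the ESLint check the commands run; it flags every use of the three matchers, which is stricter than "standalone" use.

```typescript
// Matchers the spec template rejects as standalone assertions.
const SHALLOW_MATCHERS = ["toBeDefined", "toBeTruthy", "toBeFalsy"];

type ShallowFinding = { line: number; matcher: string };

// Hypothetical helper: reports the 1-based line of every shallow matcher call.
function findShallowAssertions(source: string): ShallowFinding[] {
  const findings: ShallowFinding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const matcher of SHALLOW_MATCHERS) {
      if (text.indexOf("." + matcher + "()") !== -1) {
        findings.push({ line: index + 1, matcher });
      }
    }
  });
  return findings;
}
```

A deep assertion such as expect(result.title).toBe("Bug") passes through untouched because it checks a specific value rather than mere existence.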

Block B: Implementation Spec

  • B1: Pre-execution checks - Dependencies verified, pattern files exist, no merge conflicts.
  • B2: Tasks - Sequential. Per task: action, file, what to build, pattern reference, mapped test IDs.
  • B3: Deferred decision tasks - Measurable criteria, both paths specified.
  • B4: Post-execution (ex-impl, Tier 1-2) - Full test suite, typecheck, lint, security, self-review.
  • B5: Verification (ex-verify, Tier 3-4) - Mutation testing (StrykerJS, >=70%), exploratory browser testing, code review, migration generation, push to main.
  • B6: Commit - Message format, squash, local only (ex-verify pushes).

Schema Tasks Use Push, Not Generate

During execution: drizzle-kit push (or equivalent) applies schema directly to the isolated database branch. No migration files generated. Canonical migration files are generated by ex-verify as the final step before pushing to main. This avoids migration file conflicts between parallel specs.

Linear Example: Spec Block A Test (issue.create)
## A1-T3: issue.create validates title length

| Field     | Value |
|-----------|-------|
| Setup    | Create workspace and team via factories. |
| Input    | Call issue.create with title of 501 characters. |
| Assert   | Throws TRPCError with code BAD_REQUEST. |
|           | Error message contains "title". |
|           | No row inserted into issues table |
|           | (verify: SELECT COUNT(*) FROM issues |
|           |  WHERE team_id = [team.id] returns 0). |

Command Reference

Planning Commands

| Command        | Model          | Input                                    | Output               | Key Checks |
|----------------|----------------|------------------------------------------|----------------------|------------|
| verify-prd     | Deep reasoning | PRD + all existing PRDs + foundation docs | Verification report  | 18 checks: template conformance, terminology, scope overlap, dependency validation, integration symmetry, states coverage, mock verification, etc. |
| generate-phase | Deep reasoning | Verified PRD + codebase                  | 1-3 phase plan files | Resolves all open questions. Every section filled. Tables with RLS. API contracts with optimistic strategy. Job test approaches specified. |
| generate-spec  | Fast           | Phase plan + codebase                    | 1-3 spec files       | Splitting by test count (<25), duration (<2h), context (<50K tokens). Deep assertion enforcement. Context budget in header. |
| verify-spec    | Fast           | Spec + codebase + phase plan             | Updated spec or pass | 8 drift checks (file existence, schema match, new files) + 7 fresh-eyes checks (self-contained, deep assertions, optimistic strategy match). |
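
The splitting thresholds that generate-spec applies can be expressed as a small check. This is a sketch under assumptions: the SpecEstimate shape and the splitReasons name are hypothetical, and only the three limits (fewer than 25 tests, under 2 hours, under 50K tokens of context) come from the table.

```typescript
// Hypothetical shape for a spec's estimated size.
type SpecEstimate = { testCount: number; estimatedHours: number; contextTokens: number };

// Limits from the generate-spec row; a spec must stay strictly below each one.
const SPEC_LIMITS = { testCount: 25, estimatedHours: 2, contextTokens: 50000 };

// Returns the limits a spec breaks; an empty list means no split is needed.
function splitReasons(spec: SpecEstimate): string[] {
  const reasons: string[] = [];
  if (spec.testCount >= SPEC_LIMITS.testCount) reasons.push("test count");
  if (spec.estimatedHours >= SPEC_LIMITS.estimatedHours) reasons.push("duration");
  if (spec.contextTokens >= SPEC_LIMITS.contextTokens) reasons.push("context");
  return reasons;
}
```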

Execution Commands

| Command   | Model          | Input                      | Output                             | Key Actions |
|-----------|----------------|----------------------------|------------------------------------|-------------|
| ex-test   | Fast           | Spec Block A               | Test files (all failing)           | Rebase. DB branch create/drop. Generate tests. ESLint shallow check. Code review. Commit locally. |
| ex-impl   | Fast           | Spec Block B + test files  | Implementation (Tier 1-2 verified) | Rebase. DB branch. Execute tasks sequentially. Fix-and-retest loop. Tier 1 (automated) + Tier 2 (self-review). Squash commit locally. |
| ex-verify | Deep reasoning | Spec + phase + PRD + code  | Verified code on main              | DB branch. StrykerJS >=70%. Exploratory testing (5-15 scenarios via extended reasoning). Code review. Migration generate. Rebase + push. |

Feedback Command

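| Command         | Model          | Input                            | Output            | Key Actions |
|-----------------|----------------|----------------------------------|-------------------|-------------|
| apply-learnings | Deep reasoning | Exec log (3 sections) + metrics  | Updated documents | Per-spec: categorize findings into 6 types, propose diffs, human approves. Per-milestone: trend analysis across all specs. |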

Self-Scheduling

All execution commands support running without arguments. They read .machine (machine identity) and agents/queue.json (spec status and dependencies), filter by eligibility, apply priority (unblocking > critical path > conflict avoidance > FIFO), and recommend the next spec. Human confirms before proceeding.
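
The selection step could look roughly like the sketch below. Several parts are assumptions rather than the commands' actual logic: eligibility is taken to mean "all dependencies completed and no shared files_modified entry with an in-progress spec", the priority field is used as a stand-in for critical path (lower first), and conflict avoidance is approximated by preferring specs that share fewer files with other ready specs. The field names follow agents/queue.json; recommendNext and sharesFile are hypothetical.

```typescript
type QueuedSpec = {
  spec_id: string;
  dependencies: string[];
  files_modified: string[];
  priority: number;
};

type Queue = { ready: QueuedSpec[]; in_progress: QueuedSpec[]; completed: QueuedSpec[] };

// True when both specs modify at least one common existing file.
function sharesFile(a: QueuedSpec, b: QueuedSpec): boolean {
  return a.files_modified.some((file) => b.files_modified.indexOf(file) !== -1);
}

function recommendNext(queue: Queue): QueuedSpec | undefined {
  const doneIds = queue.completed.map((spec) => spec.spec_id);

  // Assumed eligibility: dependencies done, no file conflict with running work.
  const eligible = queue.ready.filter(
    (spec) =>
      spec.dependencies.every((dep) => doneIds.indexOf(dep) !== -1) &&
      !queue.in_progress.some((running) => sharesFile(spec, running)),
  );

  // Unblocking: how many ready specs are waiting on this one.
  const unblocks = (spec: QueuedSpec): number =>
    queue.ready.filter((other) => other.dependencies.indexOf(spec.spec_id) !== -1).length;

  // Conflict avoidance (assumed): fewer shared files with other ready specs is better.
  const overlaps = (spec: QueuedSpec): number =>
    queue.ready.filter((other) => other !== spec && sharesFile(spec, other)).length;

  const ranked = eligible
    .map((spec, index) => ({ spec, index }))
    .sort(
      (a, b) =>
        unblocks(b.spec) - unblocks(a.spec) || // unblocking
        a.spec.priority - b.spec.priority || // critical path (assumed proxy)
        overlaps(a.spec) - overlaps(b.spec) || // conflict avoidance
        a.index - b.index, // FIFO
    );

  return ranked[0]?.spec;
}
```

The function only recommends; the human confirmation step stays outside it.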

Queue Schema

agents/queue.json

{
  "version": 1,
  "updated_at": "2026-03-26T14:30:00Z",
  "updated_by": "machine-a1",
  "ready": [
    {
      "spec_id": "M1-U02-P1-S1",
      "unit": "M1-U02",
      "title": "Issue CRUD Backend",
      "category": "backend",
      "packages_touched": ["db", "api", "types"],
      "files_modified": ["packages/api/src/root.ts"],
      "dependencies": ["M0-U01-P1-S1"],
      "estimated_hours": 1.5,
      "priority": 1
    }
  ],
  "in_progress": [],
  "completed": []
}

Status progression: ready > testing > tests-written > implementing > implemented > verifying > complete
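
The progression can be encoded as an ordered list so that a spec only ever advances one step at a time. A minimal sketch, assuming hypothetical names (STATUSES, nextStatus, canAdvance); the status values themselves are the ones listed above.

```typescript
// Status values in the order a spec moves through them.
const STATUSES = [
  "ready",
  "testing",
  "tests-written",
  "implementing",
  "implemented",
  "verifying",
  "complete",
] as const;

type SpecStatus = (typeof STATUSES)[number];

// Returns the status that follows `current`, or undefined once complete.
function nextStatus(current: SpecStatus): SpecStatus | undefined {
  return STATUSES[STATUSES.indexOf(current) + 1];
}

// A transition is valid only if it moves exactly one step forward.
function canAdvance(from: SpecStatus, to: SpecStatus): boolean {
  return nextStatus(from) === to;
}
```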

Conflict avoidance: files_modified tracks existing files that will be changed (not new files created). Two specs both creating new files in the same directory is fine. Two specs both modifying the same existing file must run sequentially.

Execution Log Structure

agents/exec-log-[SpecID].md

Three sections, each appended by its respective command:

Section 1 (ex-test): Session timing, model, machine, memory snapshots, DB branch lifecycle, test files created (count per file), self-check results (spec match, auto-generated tests), ESLint results, code review, test run results (new failing, existing passing), commit hash, key learnings.

Section 2 (ex-impl): Session timing, model, machine, memory, DB branch lifecycle, task execution table (per-task: duration, tests run, fix cycles, notes), deferred decisions resolved (question, answer, measurement), Tier 1-2 results (pass/fail per check), code review, commit hash, key learnings.

Section 3 (ex-verify): Session timing, model, machine, memory, DB branch lifecycle, mutation testing (initial/final score, assertions deepened, time), exploratory testing (scenarios planned/executed, issues by severity, fixed/deferred), code review findings, Tier 3-4 results, migration files, rebase details (conflicts, resolution), push confirmation, key learnings, deferred items for apply-learnings.

Metrics Schema

agents/metrics.json

Two layers: per-spec entries and per-milestone aggregates.

Per spec: spec_id, unit, milestone, machine, category, per-command metrics (duration, model, tests created/passed, fix cycles, mutation scores, issues found), totals (total duration, deviations, human interventions, rework flag, push timestamp, commit hash).

Per milestone: specs completed, total duration, average duration, total deviations, rework rate, average mutation score.

Trend notes: Added at end of each milestone. Human-readable observations about process health, adjustments made, and rationale.
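
The per-milestone layer can be derived from the per-spec entries. The sketch below is illustrative only: the field names (totalDurationMinutes, rework, mutationScore, and so on) are hypothetical, since the schema above is described in prose, and rework rate is assumed to mean the fraction of specs with the rework flag set.

```typescript
// Hypothetical per-spec entry; only the fields needed for aggregation.
type SpecMetrics = {
  spec_id: string;
  milestone: string;
  totalDurationMinutes: number;
  deviations: number;
  rework: boolean;
  mutationScore: number; // percent, e.g. 75
};

type MilestoneAggregate = {
  milestone: string;
  specsCompleted: number;
  totalDurationMinutes: number;
  averageDurationMinutes: number;
  totalDeviations: number;
  reworkRate: number; // assumed: fraction of specs flagged as rework
  averageMutationScore: number;
};

function aggregateMilestone(milestone: string, entries: SpecMetrics[]): MilestoneAggregate {
  const specs = entries.filter((entry) => entry.milestone === milestone);
  const count = specs.length;
  const sum = (pick: (spec: SpecMetrics) => number): number =>
    specs.reduce((total, spec) => total + pick(spec), 0);
  const totalDuration = sum((spec) => spec.totalDurationMinutes);

  return {
    milestone,
    specsCompleted: count,
    totalDurationMinutes: totalDuration,
    // Averages fall back to 0 for a milestone with no completed specs.
    averageDurationMinutes: count === 0 ? 0 : totalDuration / count,
    totalDeviations: sum((spec) => spec.deviations),
    reworkRate: count === 0 ? 0 : sum((spec) => (spec.rework ? 1 : 0)) / count,
    averageMutationScore: count === 0 ? 0 : sum((spec) => spec.mutationScore) / count,
  };
}
```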

Adapting These Templates

The templates above are stack-agnostic in structure and stack-specific in examples. When adapting them to your project, swap the tools and examples for your own and keep the constraints.

The system works because it constrains AI agents, not because it uses specific tools. Replace every tool mentioned in this guide, and the methodology still works. The constraints (tests before code, self-contained specs, fresh sessions, mutation testing, feedback loops) are the value. The tools are the implementation.