Shipping Code
Three environments with promotion gates at each stage. A 7-stage CI pipeline. Auto-rollback on production failures. AI-powered diagnosis and auto-remediation.
Three Environments
Code flows through three environments, each with its own database, auth instance, and third-party service configuration.
| Environment | Branch | Database | Stripe | AI Calls | Alerts |
|---|---|---|---|---|---|
| Development | main | Dev project | Test mode | Real (dev keys) | Console only |
| Staging | staging | Staging project | Test mode | Real (dev keys) | |
| Production | prod | Production (scaled) | Live mode | Real (prod keys) | Email + PagerDuty |
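The table above can be captured as a typed configuration object. The following is a minimal sketch: the type names, field names, and the `envForBranch` helper are hypothetical and chosen for illustration, and the staging alert channel is left as `null` because the table does not specify one.

```typescript
// Hypothetical config shape mirroring the environment table.
type EnvName = "development" | "staging" | "production";

interface EnvConfig {
  branch: string;
  database: string;
  stripeMode: "test" | "live";
  aiKeys: "dev" | "prod";
  // null means "not specified in the table", not "no alerts".
  alerts: string[] | null;
}

const ENVIRONMENTS: Record<EnvName, EnvConfig> = {
  development: {
    branch: "main",
    database: "Dev project",
    stripeMode: "test",
    aiKeys: "dev",
    alerts: ["console"],
  },
  staging: {
    branch: "staging",
    database: "Staging project",
    stripeMode: "test",
    aiKeys: "dev",
    alerts: null,
  },
  production: {
    branch: "prod",
    database: "Production (scaled)",
    stripeMode: "live",
    aiKeys: "prod",
    alerts: ["email", "pagerduty"],
  },
};

// Resolve which environment a git branch deploys to, if any.
function envForBranch(branch: string): EnvName | undefined {
  return (Object.keys(ENVIRONMENTS) as EnvName[]).find(
    (name) => ENVIRONMENTS[name].branch === branch,
  );
}
```

Keeping the mapping in one typed object means a missing or misspelled field is caught by the typecheck stage rather than at deploy time.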
The 7-Stage CI Pipeline
Every push to main triggers a 7-stage pipeline. All stages are hard fails: nothing moves to staging unless every stage passes.
| Stage | What Runs | Target Time |
|---|---|---|
| 1. Install | pnpm install, dependency audit | <30s |
| 2. Lint + Format | ESLint, Biome, import restrictions | <20s |
| 3. Typecheck | TypeScript strict mode across all packages | <40s |
| 4. Unit tests | Vitest: pure functions, validation, business logic | <60s |
| 5. Integration tests | API routes with real DB (Neon), RLS verification, metering | <90s |
| 6. E2E tests (Tier 1) | Critical user flows per app, Playwright headless | <120s |
| 7. Security + Performance | npm audit, bundle size check, Lighthouse scores | <60s |
Total pipeline target: under 7 minutes. If it exceeds 10 minutes, optimize before adding features.
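The per-stage targets add up to exactly the 7-minute total. A short sketch, assuming the budgets are tracked in code (the `STAGES` array and `classifyDuration` helper are illustrative names, not part of any CI tool), makes the arithmetic and the 10-minute rule explicit:

```typescript
interface Stage {
  name: string;
  budgetSeconds: number;
}

// Target times copied from the stage table.
const STAGES: Stage[] = [
  { name: "Install", budgetSeconds: 30 },
  { name: "Lint + Format", budgetSeconds: 20 },
  { name: "Typecheck", budgetSeconds: 40 },
  { name: "Unit tests", budgetSeconds: 60 },
  { name: "Integration tests", budgetSeconds: 90 },
  { name: "E2E tests (Tier 1)", budgetSeconds: 120 },
  { name: "Security + Performance", budgetSeconds: 60 },
];

// 30 + 20 + 40 + 60 + 90 + 120 + 60 = 420 seconds = 7 minutes.
const totalBudgetSeconds = STAGES.reduce((sum, s) => sum + s.budgetSeconds, 0);

const OPTIMIZE_THRESHOLD_SECONDS = 10 * 60;

type PipelineHealth = "within-target" | "over-target" | "optimize-first";

// Classify a measured pipeline duration against the two thresholds.
function classifyDuration(actualSeconds: number): PipelineHealth {
  if (actualSeconds > OPTIMIZE_THRESHOLD_SECONDS) return "optimize-first";
  if (actualSeconds >= totalBudgetSeconds) return "over-target";
  return "within-target";
}
```

A run between 7 and 10 minutes is over target but does not yet block feature work; only a run above 10 minutes does.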
Promotion Gates
Main to Staging
Promotion is a manual trigger (not automatic on every push). The orchestrator creates a PR from main to staging, and the full test suite re-runs against the staging environment:
- All 7 CI stages re-run against staging database and services.
- E2E Tier 2 tests added: extended flows, real email delivery verification, metering exhaustion scenarios.
- Performance benchmarks compared against staging baselines.
- Real AI extraction with timing validation (must complete within targets).
If any test fails, staging promotion is blocked. The failure enters the diagnosis flow.
Staging to Production
After staging passes completely:
- PR created from staging to prod.
- Deployment triggered to production environment.
- 7-point smoke test runs against production (health endpoints, auth flow, DB read, file access, AI health, email health, payment health).
- SLO validation: response times are measured for the first 5 minutes after deployment and compared against targets.
- Error monitoring: the Sentry error rate is compared against the pre-deployment baseline. If the error rate rises more than 50% above that baseline or a new error type appears, a rollback is triggered.
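The error-monitoring rule can be expressed as a small pure function. This is a sketch under stated assumptions: the `ErrorSnapshot` shape and the `evaluateDeployment` name are hypothetical, and in practice the numbers would come from the monitoring service rather than hand-built objects.

```typescript
// Hypothetical summary of error data over a time window.
interface ErrorSnapshot {
  errorsPerMinute: number;
  errorTypes: string[];
}

interface RollbackDecision {
  rollback: boolean;
  reasons: string[];
}

// Rollback when the rate rises more than 50% above the baseline.
const MAX_RATE_INCREASE = 0.5;

function evaluateDeployment(
  baseline: ErrorSnapshot,
  current: ErrorSnapshot,
): RollbackDecision {
  const reasons: string[] = [];

  // With a baseline of 0, any error at all exceeds the threshold.
  const threshold = baseline.errorsPerMinute * (1 + MAX_RATE_INCREASE);
  if (current.errorsPerMinute > threshold) {
    reasons.push(
      `error rate ${current.errorsPerMinute}/min exceeds ${threshold}/min`,
    );
  }

  // Any error type not seen before the deployment counts as new.
  const known = new Set(baseline.errorTypes);
  const unseen = current.errorTypes.filter((t) => !known.has(t));
  if (unseen.length > 0) {
    reasons.push(`new error types: ${unseen.join(", ")}`);
  }

  return { rollback: reasons.length > 0, reasons };
}
```

For example, with a baseline of 10 errors per minute, a post-deployment rate of 15 does not trigger a rollback (exactly 50%), while 16 does.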
Auto-Rollback
Production failures trigger instant rollback with zero human intervention:
- Smoke test failure or error rate spike detected.
- The deployment platform immediately reverts all apps to the previous production deployment. The revert is zero-downtime because the old deployment stays warm.
- Failure notification sent: email with deployment SHA, failure reason, error logs, monitoring links.
- Issue created in the issue tracker with: full diagnosis, affected apps, Sentry link, recommended fix.
- Diagnosis flow begins automatically.
AI-Powered Diagnosis and Auto-Remediation
When any test fails (in dev, staging, or production), an AI agent on a separate machine diagnoses the failure, proposes a fix, and in some cases applies it automatically.
Three-Tier Auto-Fix
| Tier | Fix Type | Action | Examples |
|---|---|---|---|
| 1 | Simple, safe | Auto-commit to main | Lint errors, missing imports, type mismatches, flaky test retry |
| 2 | Moderate, needs review | Create PR for human approval | API contract changes, schema fixes, dependency updates |
| 3 | Complex or risky | Create issue with diagnosis | Data integrity issues, security vulnerabilities, multi-service failures |
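The tier table amounts to a lookup from failure category to action. In the sketch below, the category strings are hypothetical labels derived from the Examples column, and defaulting unknown categories to Tier 3 is an assumption (the most cautious choice), not something the table states.

```typescript
type Tier = 1 | 2 | 3;
type Action =
  | "auto-commit-to-main"
  | "open-pr-for-review"
  | "open-issue-with-diagnosis";

// Action column of the tier table.
const ACTION_BY_TIER: Record<Tier, Action> = {
  1: "auto-commit-to-main",
  2: "open-pr-for-review",
  3: "open-issue-with-diagnosis",
};

// Hypothetical category labels based on the Examples column.
const TIER_BY_CATEGORY = new Map<string, Tier>([
  ["lint-error", 1],
  ["missing-import", 1],
  ["type-mismatch", 1],
  ["flaky-test", 1],
  ["api-contract-change", 2],
  ["schema-fix", 2],
  ["dependency-update", 2],
  ["data-integrity", 3],
  ["security-vulnerability", 3],
  ["multi-service-failure", 3],
]);

// Assumption: anything unrecognized is treated as Tier 3.
function tierFor(category: string): Tier {
  return TIER_BY_CATEGORY.get(category) ?? 3;
}

function actionFor(category: string): Action {
  return ACTION_BY_TIER[tierFor(category)];
}
```

Failing closed on unknown categories keeps the auto-commit path limited to fixes that have been explicitly classified as safe.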
The Diagnosis Flow
- Failure detected (CI stage, staging test, production smoke test).
- AI agent checks out the codebase on the external machine.
- Reads the failure logs, error traces, and test output.
- Identifies the root cause and categorizes it (Tier 1, 2, or 3).
- For Tier 1: generates fix, runs tests locally, commits directly to main.
- For Tier 2: generates fix, creates PR, requests human review.
- For Tier 3: creates issue with diagnosis, recommended approach, and estimated complexity.
- After fix is merged: re-triggers the promotion flow (max 3 diagnosis cycles before hard escalation).
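The cycle limit in the last step can be sketched as a bounded loop. This is one interpretation of the flow, with the diagnosis and fix steps abstracted away: `promotionPasses` is a hypothetical callback standing in for "apply the fix and re-run the promotion flow".

```typescript
const MAX_DIAGNOSIS_CYCLES = 3;

type Outcome =
  | { status: "promoted"; cycles: number }
  | { status: "escalated"; cycles: number };

// Run diagnose -> fix -> re-promote until promotion passes
// or the cycle limit is reached.
function runDiagnosisLoop(
  promotionPasses: (cycle: number) => boolean,
): Outcome {
  for (let cycle = 1; cycle <= MAX_DIAGNOSIS_CYCLES; cycle++) {
    // A real agent would read logs, classify the tier, and apply
    // or propose a fix here before re-triggering promotion.
    if (promotionPasses(cycle)) {
      return { status: "promoted", cycles: cycle };
    }
  }
  // Three failed cycles: stop retrying and hand off to a human.
  return { status: "escalated", cycles: MAX_DIAGNOSIS_CYCLES };
}
```

The hard cap keeps an agent from looping indefinitely on a failure it cannot fix.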
Linear ships updates continuously. Their deployment pipeline uses canary deployments (internal dogfooding before general availability). When a new build is ready, Linear employees use it for their own project management first. If no issues surface after an internal testing window, it rolls out to all users.
For a solo founder, the staging environment serves the same purpose: it is your personal testing ground before customers see the code. The key difference is that your testing is automated, not manual.