Tool-agnostic AI Development Lifecycle (AIDLC) template for Claude Code, OpenAI Codex CLI, and Cursor IDE. One canonical methodology, three native adapters, harness patterns for long-running agentic work.
Inspired by AWS Labs AIDLC (3-phase lifecycle, decision gates), Anthropic harness research (initializer + coding agent, feature list), Anthropic evals guide (tasks/graders/transcripts), Anthropic harness design (generator/evaluator, sprint contracts), OpenAI Codex loop (instruction layering, sandbox/context handling), LangChain harness engineering (self-verification, traces), and Martin Fowler / Learn Harness Engineering (guides, sensors, lifecycle).
Agentic coding templates are usually tool-specific: pick .claude/ and you can't share with Codex; pick .cursor/ and Claude Code starts from scratch. The methodology is the same — review code, write tests, deploy carefully — but the wiring isn't portable.
This template separates methodology (workflow, rules, roles) from wiring (how each tool loads it). Methodology lives in aidlc/. Each tool has a thin adapter directory that points at it via repo-rooted paths.
It also bakes in patterns from long-running-agent research: a structured session lifecycle, a JSON feature backlog, sprint contracts between engineer and reviewer, and a dedicated eval phase for AI/agent behavior.
Designed around a few simple commitments: single source of truth (everything lives once in aidlc/), thin adapters (tool dirs are pointers, not copies), scoped work (one feature/slice at a time), and sensors over confidence (tests, hooks, evals, review). The harness is intentionally five-part:
- Instructions route agents through focused files instead of one giant prompt.
- State persists progress, backlog, and git history across resets.
- Scope keeps work to one independently committable slice.
- Verification uses tests, hooks, E2E, evals, transcripts, and review as sensors.
- Lifecycle forces initialize → work → verify → handoff → commit.
                    ┌──────────────────────────────┐
                    │  aidlc/ (canonical, shared)  │
                    │  agents · phases · examples  │
                    └──────────┬───────────────────┘
                               │ repo-rooted refs
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
    .claude/             .cursor/             .codex/
    rules · agents       rules · agents       config · hooks
    skills · settings    skills · hooks
    (Claude Code)        (Cursor IDE)         (Codex CLI)
    memory/progress.md        memory/feature-list.json
    (handoff)                 (backlog)
Adapters never use ../../ chains — every reference is repo-rooted (e.g. aidlc/agents/engineer.md). Renaming or moving an adapter file never breaks references.
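The repo-rooted rule above can be checked mechanically. The following is a minimal sketch, not part of the template: `find_parent_refs` and `scan_adapters` are hypothetical helper names, and the regular expression is an assumption about what counts as a `../` reference.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches a "../" path segment at the start of a line or after whitespace,
# a quote, a backtick, an opening parenthesis, or a slash.
PARENT_REF = re.compile(r"(?:^|[\s\"'(`/])\.\./")


def find_parent_refs(text: str) -> List[str]:
    """Return the lines of `text` that contain a ../ path segment."""
    return [line for line in text.splitlines() if PARENT_REF.search(line)]


def scan_adapters(
    repo_root: Path,
    adapter_dirs=(".claude", ".cursor", ".codex"),
) -> Dict[str, List[str]]:
    """Map each offending adapter file (repo-relative) to its offending lines."""
    offenders: Dict[str, List[str]] = {}
    for name in adapter_dirs:
        base = repo_root / name
        if not base.is_dir():
            continue  # adapter directory was deleted; nothing to check
        for path in sorted(base.rglob("*")):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = find_parent_refs(text)
            if hits:
                offenders[str(path.relative_to(repo_root))] = hits
    return offenders
```

Calling `scan_adapters(Path("."))` from the repository root returns an empty dict when every adapter reference is repo-rooted.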
    Inception (WHAT/WHY)  →  Construction (HOW)  →  Operations (RUN)
            gate                    gate                  gate
Each phase is one canonical file in aidlc/{inception,construction,operations}/. Roles and rules below.
| Command | Phase | When to use |
|---|---|---|
| `/spec` | Inception | Starting a new feature or initiative: define problem, use cases, RICE, acceptance criteria |
| `/design` | Inception | The feature has UI: component specs, mobile, interaction patterns |
| `/plan` | Construction | Spec is approved: architecture, task breakdown, sprint contract with reviewer |
| `/build` | Construction | Sprint contract is agreed: incremental TDD on one feature/slice |
| `/test` | Construction | Build is in progress: coverage strategy, enforce 100% on new/modified |
| `/eval` | Construction | The change touches AI/agent behavior (tools, prompts, multi-turn flows) |
| `/review` | Construction | Code is ready for merge: pre-merge two-pass code review |
| `/security` | Construction | The change touches auth, data, file upload, external APIs, or crypto |
| `/e2e` | Construction | Before release: end-to-end journey verification, sign-off |
| `/ship` | Construction | Review + E2E + (security/evals if applicable) all green: land the branch |
| `/operate` | Operations | Just deployed; or an alert/incident fired; or 24h post-deploy check |
| `/investigate` | Operations | A bug, test failure, or unexpected behavior: root-cause first, no symptom patches |
| `/daily-report` | Operations | Manager's daily executive summary (typically morning) |
| `/retro` | Operations | End of sprint / every 2 weeks; or after a major model/tool upgrade |
| Tool | Mechanism |
|---|---|
| Claude Code | Slash command — /spec, /build, /eval, etc. (.claude/skills/X.md points at aidlc/<phase>/X.md) |
| Cursor IDE | Skill — /spec, /build, etc. (.cursor/skills/X.md points at the same file) |
| Codex CLI | Plain prose — "Follow aidlc/construction/build.md". No slash-command system. |
You can always open the canonical file directly and follow it — slash commands are convenience, not requirement.
| Situation | Command sequence |
|---|---|
| New feature | /spec → /design (if UI) → /plan → /build → /test → /review → /e2e → /ship → /operate |
| New AI/LLM feature | …same, plus /eval between /test and /review |
| Auth/data/API change | …same, plus /security before /e2e |
| Bug report | /investigate → /build (fix + regression test) → /test → /review → /ship |
| Production incident | /operate (acknowledge → mitigate) → /investigate (root cause) → fix loop → /operate (postmortem) |
| Daily standup | /daily-report |
| End of sprint / 2 weeks | /retro |
| After a major model upgrade | /retro (harness review step — strip stale scaffolding) |
Each phase has a gate before the next. Use aidlc/common/decision-gates.md (structured A/B/C/D + [Answer]:) when explicit human approval is needed.
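To make the gate pattern concrete, here is a minimal sketch of reading a gate's answer. It assumes a gate is plain text with options labeled A to D and a single `[Answer]:` line; the exact layout in `aidlc/common/decision-gates.md` may differ, and `read_gate_answer` is a hypothetical name.

```python
import re
from typing import Optional

# One "[Answer]:" line; everything after the colon on that line is the answer.
ANSWER_LINE = re.compile(r"^\[Answer\]:[ \t]*(.*)$", re.MULTILINE)


def read_gate_answer(gate_text: str) -> Optional[str]:
    """Return 'A'..'D' if the gate is answered, None if it is still open."""
    match = ANSWER_LINE.search(gate_text)
    if match is None:
        raise ValueError("no [Answer]: line found; not a decision gate")
    answer = match.group(1).strip().upper()
    if answer == "":
        return None  # gate is waiting for human approval
    if answer not in {"A", "B", "C", "D"}:
        raise ValueError(f"answer must be A, B, C, or D, got {answer!r}")
    return answer
```

An agent could refuse to enter the next phase while this returns `None`, which keeps the approval step explicit instead of inferred from conversation.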
Three roles, definitions in aidlc/agents/:
- engineer — implementation, architecture, DB, CI/CD; one feature/slice at a time from the backlog.
- reviewer — code review, security (STRIDE), runtime QA, agent evals, sprint-contract approval, E2E sign-off.
- manager — orchestrate, daily reports, harness-review cadence after model/tool upgrades.
Always-on, single source of truth in aidlc/rules/*.md: code-style, testing, security, api-conventions, ux-guidelines, reproducibility, tech-stack. .claude/rules/*.md and .cursor/rules/*.mdc are thin pointers in each tool's native frontmatter format. Decision gates use the structured-question pattern in aidlc/common/decision-gates.md.
Every session uses the same get-bearings → work → handoff loop so context survives resets.
| Start | Work | End |
|---|---|---|
| Read `memory/progress.md` | Sprint contract w/ reviewer | Commit |
| Read `memory/feature-list.json` | One feature/slice | Update `memory/progress.md` |
| `git log --oneline -20` | TDD red→green→refactor | Leave merge-ready |
| Run `./init.sh` smoke | Runtime QA via reviewer | Reviewer flips `passes` |
Canonical: aidlc/common/session-lifecycle.md. The SessionStart hook in each tool injects this reminder.
Artifacts:
- `memory/progress.md`: Current Focus / Last / Next / Decisions / Open Questions / Known Issues
- `memory/feature-list.json`: incremental backlog (`{id, description, steps, passes}`); only the reviewer flips `passes: true`
- `init.sh.example`: copy to `init.sh` for env bootstrap + smoke
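As an illustration of how the backlog drives "one feature at a time", the sketch below picks the next unfinished item. It assumes `memory/feature-list.json` is a JSON array of objects with the four fields named above; the sample ids and descriptions are invented, and `next_feature` is a hypothetical helper.

```python
import json
from typing import Optional


def next_feature(backlog_json: str) -> Optional[dict]:
    """Return the first backlog item whose `passes` is not true, or None."""
    items = json.loads(backlog_json)
    for item in items:
        if not item.get("passes", False):
            return item
    return None  # every feature has been passed by the reviewer


# Hypothetical backlog contents, for illustration only.
SAMPLE_BACKLOG = json.dumps([
    {"id": "F1", "description": "login form", "steps": ["open page", "submit"], "passes": True},
    {"id": "F2", "description": "logout link", "steps": ["click logout"], "passes": False},
])
```

With the sample data, `next_feature(SAMPLE_BACKLOG)` returns the `F2` item, and an empty backlog (`"[]"`) returns `None`.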
Tests cover code paths. Evals cover agent behavior — different graders, different lifecycle.
| Aspect | Tests (`/test`) | Evals (`/eval`) |
|---|---|---|
| Subject | Code paths | Agent transcripts + outcomes |
| Graders | Deterministic asserts | Code + LLM-judge + human spot-checks |
| Suites | Unit / integration / E2E | Capability vs regression |
| When to add | Every change | When AI features ship or change |
Start with 20–50 real failures. Read transcripts on every failed run. Calibrate LLM-as-judge against humans. See aidlc/construction/eval.md and aidlc/examples/eval-suite.md.
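Calibrating an LLM judge against humans can start as a plain agreement rate over transcripts that both have graded. This is a minimal sketch assuming boolean pass/fail verdicts; `judge_agreement` and `disagreements` are hypothetical names, and real suites may need per-criterion scores instead of a single boolean.

```python
from typing import List


def judge_agreement(judge: List[bool], human: List[bool]) -> float:
    """Fraction of transcripts where the LLM judge matches the human label."""
    if len(judge) != len(human):
        raise ValueError("judge and human verdicts must cover the same transcripts")
    if not judge:
        raise ValueError("need at least one labeled transcript")
    return sum(j == h for j, h in zip(judge, human)) / len(judge)


def disagreements(judge: List[bool], human: List[bool]) -> List[int]:
    """Indexes of transcripts to re-read because the two graders differ."""
    return [i for i, (j, h) in enumerate(zip(judge, human)) if j != h]
```

The disagreement indexes are the useful output: they point at the transcripts worth reading before trusting the judge on a regression suite.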
| Tool | Entry | Native features |
|---|---|---|
| Claude Code | `CLAUDE.md` (imports `AGENTS.md`) | `.claude/{rules,agents,skills}/`, `.claude/settings.json` |
| Codex CLI | `AGENTS.md` (hierarchical) | `.codex/config.toml`, `.codex/hooks.json` |
| Cursor IDE | `.cursor/rules/*.mdc` | `.cursor/{rules,agents,skills,hooks}/`, `.cursor/hooks.json` |
All three read AGENTS.md — natively (Codex), via @-import (Claude), or by reference (Cursor). Adapters point at canonical content in aidlc/ via repo-rooted paths.
Keep root instructions compact and stable. Put detailed method content under aidlc/, then let each tool load only its native adapter plus the canonical files it needs.
    aidlc-template/
    ├── AGENTS.md               Universal entry, read by Codex natively
    ├── CLAUDE.md               Claude entry, @-imports AGENTS.md
    ├── README.md               This file
    ├── init.sh.example         Copy to init.sh: install + dev + smoke
    ├── aidlc/                  Canonical methodology (single source of truth)
    │   ├── core-workflow.md    One-page master orchestrator
    │   ├── agents/             engineer, manager, reviewer
    │   ├── inception/          spec, design (WHAT/WHY)
    │   ├── construction/       plan, build, test, eval, review,
    │   │                       security, e2e, ship (HOW)
    │   ├── operations/         operate, retro, investigate, daily-report (RUN)
    │   ├── rules/              7 canonical rule bodies (single source of truth)
    │   ├── common/             decision-gates.md, session-lifecycle.md
    │   └── examples/           feature-spec, feature-list, eval-suite,
    │                           adr, threat-model, e2e-test-plan, postmortem
    ├── memory/                 Tool-agnostic handoff state
    │   ├── progress.md         Decisions, last/next session, known issues
    │   └── feature-list.json   Incremental feature backlog
    ├── .claude/                Claude Code adapters (frontmatter + pointers, no duplicated content)
    │   ├── rules/              7 .md pointers (paths: frontmatter) → aidlc/rules/
    │   ├── agents/             engineer · manager · reviewer → aidlc/agents/
    │   ├── skills/             14 slash commands → aidlc/{inception,construction,operations}/
    │   └── settings.json       Permissions + hooks
    ├── .cursor/                Cursor adapters (frontmatter + pointers, no duplicated content)
    │   ├── rules/              7 .mdc pointers (globs: / alwaysApply:) → aidlc/rules/
    │   ├── agents/             engineer · manager · reviewer → aidlc/agents/
    │   ├── skills/             14 skills → aidlc/{inception,construction,operations}/
    │   ├── hooks/              Hook scripts (e.g. aidlc-session-start.sh)
    │   └── hooks.json          Lifecycle hooks (sessionStart, beforeShellExecution)
    ├── .codex/                 Codex CLI adapters
    │   ├── config.toml         MCP servers, feature flags
    │   └── hooks.json          PreToolUse safety + SessionStart bearings
    ├── docs/adr/               ADRs (tool-neutral)
    └── scripts/audit.sh        Footprint audit
git clone https://github.com/ianchan0817/aidlc-template.git my-project
cd my-project && rm -rf .git && git init

- Pick your tool(s). If you only use one of Claude Code / Codex / Cursor, delete the other adapter dirs (`.claude/`, `.codex/`, `.cursor/`). The methodology in `aidlc/` and the entry files (`AGENTS.md`, `memory/`) work standalone.
- Edit your stack: `aidlc/rules/tech-stack.md` (one place; all kept tools point at it).
- Bootstrap: `cp init.sh.example init.sh` and fill in the install / dev / smoke commands. Mirror them in `AGENTS.md` under `## How to Run`.
- Seed memory: set `Current Focus` in `memory/progress.md`. Leave `memory/feature-list.json` empty (your `/spec` will append items).
- Audit: `bash scripts/audit.sh`. Warns if root >1500 or canonical >8000 words.
- Open in your tool:

  | Tool | Command |
  |---|---|
  | Claude Code | `claude` (auto-loads `CLAUDE.md`) |
  | Codex CLI | `codex` (auto-loads `AGENTS.md`) |
  | Cursor | open the directory (auto-loads `.cursor/rules/*.mdc`) |

- First feature: `/spec` your first feature, then `/plan` → `/build` → `/test` → `/review` → `/ship`.
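The audit step above amounts to a word-count budget. The sketch below shows one way such a check could work; it is not the contents of `scripts/audit.sh`. Which files count as "root" versus "canonical" is an assumption here, and `audit_footprint` is a hypothetical name.

```python
from typing import Dict, List


def count_words(text: str) -> int:
    """Whitespace-delimited word count."""
    return len(text.split())


def audit_footprint(
    root_files: Dict[str, str],
    canonical_files: Dict[str, str],
    root_limit: int = 1500,
    canonical_limit: int = 8000,
) -> List[str]:
    """Return one warning per budget that is exceeded (empty list if none)."""
    warnings: List[str] = []
    root_total = sum(count_words(text) for text in root_files.values())
    canonical_total = sum(count_words(text) for text in canonical_files.values())
    if root_total > root_limit:
        warnings.append(f"root instructions: {root_total} words (limit {root_limit})")
    if canonical_total > canonical_limit:
        warnings.append(f"canonical content: {canonical_total} words (limit {canonical_limit})")
    return warnings
```

Both arguments map a file path to its text, for example `{"AGENTS.md": "..."}` for root files and the contents of `aidlc/` for canonical files.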
Deterministic enforcement — actions that must happen, not requests.
| Tool | File | Events |
|---|---|---|
| Claude Code | `.claude/settings.json` | SessionStart, PreToolUse, PostToolUse |
| Codex CLI | `.codex/hooks.json` | Same names; requires `[features] codex_hooks = true` |
| Cursor | `.cursor/hooks.json` | sessionStart, beforeShellExecution, afterFileEdit, … |
Two hooks ship out of the box per tool:
- Session-start bearings: injects the get-bearings reminder pointing at `aidlc/common/session-lifecycle.md`.
- Dangerous-command guard: rejects `rm -rf /`, `chmod -R 777 /`, and `git push --force` to main.
Extend as needed (format-on-save, pre-completion checklists, loop detection).
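The dangerous-command guard can be pictured as a pure function over the command string. This is a token-level sketch of the three rules listed above, not the shipped hook: `is_dangerous` is a hypothetical name, how each tool hands the command to a hook is tool-specific and not shown, and a check this simple would miss variants such as `sudo rm -rf /` or chained commands.

```python
def is_dangerous(command: str) -> bool:
    """Return True for the three command shapes the guard is meant to reject."""
    tokens = command.split()

    # rm -rf aimed at the filesystem root
    if tokens[:2] == ["rm", "-rf"] and "/" in tokens[2:]:
        return True

    # world-writable permissions applied recursively from the root
    if tokens[:3] == ["chmod", "-R", "777"] and "/" in tokens[3:]:
        return True

    # force push that targets main
    if tokens[:2] == ["git", "push"]:
        rest = tokens[2:]
        forced = any(t in ("--force", "-f") for t in rest)
        to_main = any(t == "main" or t.endswith(":main") for t in rest)
        return forced and to_main

    return False
```

Keeping the decision in a pure function makes it easy to unit-test the guard separately from whichever hook file calls it.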
| Layer | Location | Purpose |
|---|---|---|
| Project | `./CLAUDE.md`, `./AGENTS.md`, `./.claude/`, `./.cursor/`, `./.codex/`, `./aidlc/`, `./memory/` | Team conventions and methodology (committed) |
| Personal | `~/.claude/CLAUDE.md`, `~/.codex/AGENTS.md`, `~/.cursor/rules/` | Individual preferences across all projects (not committed) |
Project layer dictates what the codebase requires. Personal layer dictates how you prefer to work. Tools merge both; project rules win on conflicts.
Constitution, not prompts. Treat each layer as durable infrastructure, not a one-off prompt. Bloating a layer with conversational corrections is an anti-pattern; extract them into rules, skills, or hooks. (Framing from Brij Kishore Pandey.)
Concrete fill-in templates — reference shapes for what each phase produces. They're documentation, not auto-loaded.
Always useful (most projects):
- `feature-spec.md`: `/spec` output (problem, use cases, RICE, acceptance criteria)
- `feature-list.md`: shape for `memory/feature-list.json`
- `e2e-test-plan.md`: `/e2e` journey table + sign-off checklist
Conditional (only when the situation arises):
- `adr.md`: `/plan` when an architectural decision is involved
- `threat-model.md`: `/security` (STRIDE), only when auth/data/API changes
- `eval-suite.md`: `/eval` task YAML, only for AI/agent features
- `postmortem.md`: `/operate` after a Critical/High incident
You don't need to touch any of these to start — phases reference them when relevant.
- Codex: `.codex/config.toml` under `[mcp_servers.NAME]`
- Claude Code: `claude mcp add` or edit `.claude/settings.json`
- Cursor: see Cursor's MCP docs
No MCP servers configured by default — add per project.
| Source | Concept folded in |
|---|---|
| AWS AIDLC | Three-phase lifecycle, structured [Answer]: decision gates, two-part code planning |
| Anthropic — Effective harnesses for long-running agents | init.sh + progress file + JSON feature list, get-bearings, one feature at a time |
| Anthropic — Harness design for long-running app development | Generator/evaluator (folded into reviewer), sprint contracts, runtime QA, harness-review cadence |
| Anthropic — Demystifying evals for AI agents | Tasks/trials/graders, capability vs regression, transcript review, calibrated LLM-as-judge |
| OpenAI — Unrolling the Codex agent loop | Layered project docs, sandbox/approval context, compact and stable adapter loading |
| LangChain — Harness engineering | Build-verify loop, context onboarding, traces as feedback, loop detection as future hook extension |
| Martin Fowler — Harness engineering for coding agent users | Feedforward guides, feedback sensors, harness templates, quality-left framing |
| Learn Harness Engineering | Five-subsystem harness shape: instructions, state, verification, scope, lifecycle |
| Metaflow | Human-centric framing; reproducibility-as-default |
| Kedro | Modular phase-based structure |
| ZenML | Stage gates with explicit pass/fail criteria |
| Made-With-ML | End-to-end iteration loop (operate → retro) |
| awesome-production-ML | Operations phase emphasis |
| agent-skills | Anti-rationalization framing (kept lightweight) |
MIT