fix(security): harden context, dry-run, manifest, and glob handling#488
fix(security): harden context, dry-run, manifest, and glob handling#488toshtag wants to merge 61 commits into
Conversation
Address the Codex security review (scan bd84281). Each fix adds attacker- scenario regression tests; behavior changes are documented in CHANGELOG + cli-contract. - Context pack: loadConstitution reads via resolveWithinProject (no symlink escape into the agent-facing pack). [CWE-59] - task complete --dry-run: propagate dryRun into verify so project-controlled verification.commands (shell:true) are previewed, not executed. [CWE-78] - Adapter manifest I/O: read/writeManifest resolve through resolveWithinProject, fail-closed on a .code-pact/adapters symlink escape. [CWE-59] - Atomic writes: crypto-random temp names + exclusive create (wx/O_EXCL) so a pre-planted temp symlink is refused, never followed. [CWE-59/377] - adapter install: managed-clean × stale re-renders (update) instead of trusting a project-shipped manifest hash to keep stale/forged content; managed-modified still untouched. [CWE-345] - adapter upgrade: orphan auto-prune gated on descriptor.ownedPathGlobs; an unowned orphan is surfaced (warn) and kept, with a CLI message naming the file, the reason, and the manual-removal step. A forged manifest can no longer turn upgrade --write into an arbitrary in-project delete. [CWE-73] - Glob matching: linear two-pointer matchGlob replaces the backtracking globToRegex on the walk/audit/doctor hot paths; pattern-length cap added. [CWE-1333] Security trade-off (#6): upgrade no longer auto-prunes orphaned generated files unless strong ownership can be proven. This intentionally favors preserving user files over automatic cleanup; safe auto-prune is deferred to a follow-up design using reserved generated-file namespaces / stronger ownership markers.
…divergent install files Round 2 — address the two pre-merge blockers from the follow-up review. Blocker 1 — `.code-pact/adapters` symlink escape no longer surfaces as an internal error / exit 3. `resolveManifestPath` re-throws the path-containment refusal with code `ADAPTER_MANIFEST_INVALID`; `adapter install` and `adapter upgrade` (--check / --write) map it to a structured envelope (exit 2) in both --json and human modes. Doctor already degrades it to an issue. Blocker 2 — `adapter install` no longer SILENTLY skips a managed file that matches neither the manifest hash nor the generator output (managed-modified × stale — the shape a hostile repo ships). decideAction returns `refuse` for that cell; install keeps the file (could be a real edit) but reports it via `result.refused[]` / `files[].action:"refuse"`, prints the file + the `--accept-modified` regenerate step, and exits 1. Also (additive): unowned-orphan `warn` plan entries now carry a machine-readable `reason: "unowned_orphan_not_pruned"` for JSON consumers. Docs (cli-contract + CHANGELOG) updated for all three. Verification: typecheck clean, unit 3438 passed, integration 664 passed, check:docs OK.
Round 2 — pre-merge blockers addressed (commit 5292283)Both conditional-approval blockers from the follow-up review are fixed; follow-ups filed. Blocker 1 — manifest symlink escape now a clean CLI error (not
|
| Command | Result |
|---|---|
pnpm typecheck |
clean |
pnpm test:unit |
3438 passed |
pnpm test:integration |
664 passed (+6) |
pnpm check:docs |
all OK |
Docs updated: cli-contract.md (top-level ADAPTER_MANIFEST_INVALID row, install refuse/exit-1 matrix row, warn reason) + CHANGELOG.md.
… root `plan prompt` read design/brief.md and design/constitution.md via join(cwd, ...) + readFileOrNull, so a repo that symlinks either file out of the project leaked the target's contents into the agent-facing prompt (and the --clipboard payload) — the same out-of-project-read → agent-facing leak class as the context pack (CWE-59). Route both reads through a shared readProjectTextOrNull (resolveWithinProject: .. / absolute / symlink-escape → null) and reuse it from the context pack's loadConstitution so every agent-facing grounding read shares one guard.
…losed resolveWithinProject now tags a symlink/unsafe-path escape with a stable PATH_OUTSIDE_PROJECT errno code so command layers can map it to a structured envelope instead of an internal error. That additive code silently broke the `decision prune` / `decision retire` target classification, which keyed the escape branch off `code === undefined` (the old code-less throw) and so demoted an escaping target from target_invalid to target_unreadable. Detect the escape via the explicit code (keeping the code-less ZodError path for structural rejections), and register PATH_OUTSIDE_PROJECT in the error-code surface contract + cli-contract.md.
…clean error readManifest let a YAML parse error or a Zod schema violation throw uncoded, so adapter install / upgrade surfaced a project-controlled (adversarial) manifest as an internal error / exit 3. Wrap the parse+validate step and tag it ADAPTER_MANIFEST_INVALID; ENOENT still degrades to null and the tolerantDuplicatePaths repair path is unchanged.
…lder mkdir, CLI mapping) - install reads the manifest AND resolves the placeholder dirs (.context / hook_dir, via resolveWithinProject) BEFORE persisting the --model pin, so a doomed install (manifest escape/invalid, or a symlinked placeholder dir) leaves no partial side effect — it never rewrites the profile's model_version. - install/upgrade route the placeholder mkdir through resolveWithinProject so a symlinked .context / .claude cannot create a directory outside the project. - the CLI maps ADAPTER_MANIFEST_INVALID (now also malformed/schema-invalid) and PATH_OUTSIDE_PROJECT (→ CONFIG_ERROR) to structured envelopes (exit 2). Adds integration coverage for all of the above: malformed/schema-invalid manifest on install + upgrade --check/--write, no model pin on a failed --model install, and .context/.claude placeholder symlink escape.
resolveAgentProfilePath returned join(cwd, ".code-pact", rel) — lexical only. A symlinked .code-pact/agent-profiles (or a symlinked profile file) let a profile READ, and crucially the `--model` pin's atomicWriteText, escape the project root (CWE-59). manifest I/O was contained but the profile path was not. Route the resolved path through resolveWithinProject (the single resolver every read + the pin flow shares) and map a symlink escape to CONFIG_ERROR — consistent with this resolver's existing throws, so every caller's CONFIG_ERROR handling applies unchanged with no new code to map at each call site.
adapter install moved the placeholder mkdir before the pin, but the generated-file write loop (and upgrade's pin → mkdir → write/prune order) still ran the path-safety checks AFTER persisting the --model pin. So an install/upgrade --model that ultimately fails closed on a symlink escape (.context/.claude, a forged manifest path, or a final-component symlink like CLAUDE.md) could strand a pinned model_version. Add a shared assertAdapterWritePathsContained preflight (resolveWithinProject over placeholder dirs + every generated file + manifest-tracked orphan candidates) and run it BEFORE the deferred pin in both install and upgrade --write. A doomed run now leaves no pin, no write, no unlink. Escapes surface as PATH_OUTSIDE_PROJECT → CONFIG_ERROR (exit 2). Adds integration coverage: upgrade --write --model .context no-pin, .code-pact/agent-profiles symlink escape (install + upgrade), and CLAUDE.md final-symlink no-pin / out-of-project target unwritten (install + upgrade).
realpath() reports a DANGLING symlink (target absent) as a bare ENOENT, indistinguishable from a not-yet-created path. The walk-up containment check therefore mistook .context -> /outside/does-not-exist for a safe missing path and returned ACCEPTED, so the Round-4 preflight could pass and then pin the profile / mkdir outside the project. The same gap let a dangling .code-pact/adapters read as null (no manifest), risking a partial generated-files-but-no-manifest state at the later writeManifest. Rewrite resolveWithinProject to canonicalize the path one component at a time from the real project root via lstat/readlink, following a symlink to where it POINTS regardless of target existence; the final canonical location must stay in the project. Contract: non-existent in-project path and in-project (incl. dangling) symlinks are allowed; any symlink (existing or dangling) pointing outside maps to PATH_OUTSIDE_PROJECT; a symlink cycle (> 40 hops) maps to PATH_OUTSIDE_PROJECT instead of a raw error. Tests: resolveWithinProject dangling-ancestor / dangling-final / in-project-dangling / symlink-cycle / ordinary-deep-nonexistent unit cases; adapter integration cases (dangling .context -> no model pin; dangling .code-pact/adapters -> ADAPTER_MANIFEST_INVALID, no pin, no partial state).
…blestar matchGlob (the runtime walk / write-audit matcher) let adjacent ** segments each match zero, but globToRegex compiled **+** into a form that forced an intermediate segment. So design/**/**/roadmap.yaml matched design/roadmap.yaml at runtime yet evaded findProtectedPathOverlaps (which used globToRegex) — a declared write could touch a protected path while dodging the advisory warning. Switch findProtectedPathOverlaps to matchGlob (the same matcher as the runtime walk, parity by construction) AND collapse adjacent ** runs in globToRegex so the two agree. Correct the doc comments that claimed exact parity. Tests: matchGlob/globToRegex parity over adjacent-doublestar patterns, and a findProtectedPathOverlaps case proving design/**/**/roadmap.yaml is flagged for the protected design/roadmap.yaml.
…e-safe preflight) The Round-5 walk ALLOWED an in-project symlink whose target was absent (dangling-but-contained). For a READ that is fine, but as a WRITE preflight it is wrong: .context -> <cwd>/missing passed the preflight, install then persisted the --model pin, and the subsequent mkdir(.context/claude) failed with ENOENT — leaving the pin as a partial side effect (and an uncoded error / exit 3). A dangling .code-pact/adapters -> <cwd>/.missing similarly read as 'no manifest' and risked a generated-files-but-no-manifest partial state. Switch resolveWithinProject to a simpler, more OS-portable shape: walk each component; on a symlink call realpath(candidate) — success continues from the canonical target (chains / macOS case-insensitive / Windows handled by the OS), ENOENT means dangling and is refused, ELOOP is refused. A PLAIN (non-symlink) missing component still ends the walk and is allowed, so creating new files/dirs works. Net contract: only plain not-yet-created paths and existing in-project paths/symlinks are allowed; ALL dangling symlinks (in- or out-of-project) and cycles map to PATH_OUTSIDE_PROJECT. Tests: flip the in-project-dangling unit case to expect refusal; add adapter integration cases for internal-dangling .context (install + upgrade -> CONFIG_ERROR, no pin) and internal-dangling .code-pact/adapters (install -> ADAPTER_MANIFEST_INVALID, no pin, no partial state).
createExclusiveTemp wrote via writeFile(flag:"wx"); if the file was created but the write then failed (EFBIG/ENOSPC/EIO), the catch only handled EEXIST and rethrew everything else — leaving a partial .tmp-<uuid> behind, because writeThenRename's unlink cleanup only runs AFTER createExclusiveTemp returns the path. The Round-1 #5 symlink-clobber fix had regressed the failure-path cleanup. Claim ownership with open(tmp,"wx") (O_CREAT|O_EXCL — still refuses + never follows a symlink), then write via the handle; on ANY post-open failure close the handle and unlink the partial temp before rethrowing. An EEXIST from open is NOT ours, so it is retried (fresh token) and never unlinked. Adds a test seam (__setAtomicWriteFailAfterOpenForTests) and tests proving no temp leak + untouched destination on a mid-write failure (both write and replace paths).
…nes)
validateGlobSyntax accepts '?' as a literal and matchGlob treats it as one, but globToRegex did NOT escape it: globToRegex("a?") became a quantifier (disagreeing with matchGlob) and globToRegex("?") threw a SyntaxError. Also '**' compiled to '.*', which (unlike matchGlob's '**') does not match a newline-containing segment.
Add '?' to the regex-escape set and expand '**' with [\s\S]* instead of '.*'. globToRegex is no longer used on a prod hot path (findProtectedPathOverlaps moved to matchGlob), but it is exported + the 'parity' claim must actually hold. Adds '?' and newline parity cases plus a deterministic generative parity test over the full validated subset (literals incl. regex metachars, '?', '*', '**').
… not exit 3 A project-controlled path that exists as the WRONG type (a directory where a file is read) makes readFile throw EISDIR — an UNCODED errno that surfaced as an internal error / exit 3, the same contract-break class as malformed manifests. Closed across the readers that take attacker-controlled paths: - readManifest: non-ENOENT read failures (EISDIR when the manifest path is a directory, ENOTDIR, EACCES, a symlink that breaks on read) now map to ADAPTER_MANIFEST_INVALID; ENOENT still degrades to null. - adapter doctor readFileMaybe: a diagnostic read degrades ALL errors to null (a directory at a managed path reads as missing/drift, never a crash). - classifyDecisionAdrs (adr.ts) + detectAdrAcceptedBodyThin (plan lint): route through the project-contained readLiveDecisionFile seam + degrade on error, so a directory named *.md (EISDIR) or a design/decisions symlinked outside no longer crashes plan lint and is contained. Tests: classifyDecisionAdrs skips a *.md directory (EISDIR) and a symlink-escaping ADR; adapter doctor reports (does not throw on) a managed path that is a directory.
…pe before the pin)
The write preflight checked CONTAINMENT but not path TYPE, so a forged profile/manifest pointing a write at an existing entry of the wrong type passed it, then failed the real op AFTER the --model pin — stranding a pinned model_version and surfacing an uncoded errno (exit 3). E.g. context_dir occupied by a regular file (mkdir EEXIST), or CLAUDE.md occupied by a directory (write EISDIR).
assertAdapterWritePathsContained now takes typed specs ({path, kind:'directory'|'file'}): after the containment check it stats the entry and rejects a directory-spec that is a file / a file-spec that is a directory / a non-directory intermediate component as CONFIG_ERROR — all BEFORE the pin (containment+type failures occur before any mutation; the comment no longer overclaims crash-atomicity). install/upgrade pass typed specs for placeholder dirs (directory) + generated files + orphan candidates (file). Also harden loadModelProfiles: a directory named *.yaml is skipped (readFile moved inside the try) rather than crashing the command.
Tests: typed-preflight unit cases (file-as-dir, dir-as-file, intermediate-file, symlink escape still PATH_OUTSIDE_PROJECT); integration cases for context_dir-as-file (install + upgrade --model: CONFIG_ERROR, no pin), CLAUDE.md-as-dir (install: no pin, no internal error), and manifest-path-as-directory (install human/json + upgrade check/write: ADAPTER_MANIFEST_INVALID).
The mandatory control-plane loaders read via lexical join(cwd, path), so a roadmap phase ref with .. or a symlinked design/ or design/phases/* pulled an out-of-project file into the agent-facing context pack AND into generated Claude skills (verification.commands) — the same out-of-project-read class as the original constitution-symlink finding, on the must-read path. Route both through resolveWithinProject; a path-safety refusal maps to CONFIG_ERROR (fail-closed + structured — these are control-plane inputs, never degraded to null). A missing/invalid roadmap or phase still throws ENOENT/ZodError as before. Tests: loadRoadmap refuses a symlinked design/roadmap.yaml; loadPhase refuses a symlinked phase ref and a .. ref — all CONFIG_ERROR.
…y project files Blocker 1 (HIGH): AgentProfile.instruction_filename and manifest files[].path are BOTH attacker-controlled. A forged manifest hash == a victim file's real hash made it managed-clean; since it differs from generated content it was stale → auto-update (overwrite) on a plain 'adapter install'. So a profile pointing instruction_filename at e.g. package.json + a matching forged manifest destroyed that file. A content OVERWRITE (update / replace_unmanaged) is now gated on a NEW trusted-static overwriteOwnedPathGlobs namespace (CLAUDE.md + .claude/skills/*.md for claude — separate from, and broader than, the narrow delete gate); a path outside it is refused, never written on manifest trust. This was the blind spot in the original managed-clean x stale -> update self-heal. Blocker 3: install/upgrade loadAgentProfile parsed YAML + Zod OUTSIDE the read try, so a malformed/schema-invalid project profile threw uncoded (exit 3). Now: ENOENT->AGENT_NOT_FOUND, other read error / parse / schema -> CONFIG_ERROR. Blocker 4: the typed write preflight allowed any non-directory as a 'file', so a FIFO/socket/device passed and a later readFile BLOCKED after the --model pin (hang + stranded pin). The file kind now requires a regular file (st.isFile()), refused before the pin. Also routes claude's readVerificationCommands through the contained loadRoadmap. Tests: forged-manifest+profile overwrite is refused (exit 1, --force too); malformed/schema-invalid profile -> CONFIG_ERROR on install + upgrade check/write; FIFO file spec -> CONFIG_ERROR.
…k engine) Answers enforce-this-without-bloating-the-3.5min-CI: a sub-second static check for the path-CONTAINMENT bug class the adversarial review kept finding — a project path read/written via lexical join() instead of resolveWithinProject(), which follows .. / symlinks out of the project. It is the engine behind a LOCAL PostToolUse(Edit|Write) hook (.claude/ is gitignored by repo policy, like the existing guard-push / guard-managed-skills hooks) that surfaces the smell at EDIT time, before any review round-trip or CI run. scripts/check-fs-containment.mjs: with file args it checks just those (the hook mode); with none, pnpm check:fs-containment sweeps src/commands,core,cli as a migration report (currently ~27 known lexical reads — the backlog to contain .code-pact/project.yaml / model-profiles / a few ADR reads; some are the open follow-ups). Deliberately NOT wired into the gate yet: a blocking check needs that baseline migrated or marked with an fs-safe reason first. It catches only the MECHANICAL containment class — not design bugs like trusting a manifest hash as path ownership, which still need review.
… pack Round-8 made loadPhase/loadRoadmap throw CONFIG_ERROR on a path-safety escape, but cmdPack's catch only handled PHASE_NOT_FOUND / AMBIGUOUS_PHASE_ID / TASK_NOT_FOUND, so an out-of-project-symlinked phase made 'pack --json' fall through to the top-level internal error / exit 3 instead of a structured CONFIG_ERROR / exit 2 (task context already mapped it). Also the loaders only coded the resolveWithinProject refusal — a phase path that is a directory (EISDIR), an intermediate file (ENOTDIR), unreadable (EACCES), or malformed YAML/schema stayed uncoded and likewise exit-3'd from pack. cmdPack now maps CONFIG_ERROR to an exit-2 envelope. loadPhase/loadRoadmap now map NON-ENOENT read failures AND YAML/schema-parse failures to CONFIG_ERROR while keeping ENOENT RAW (resolve-task's archived-fallback keys on code === ENOENT, and no caller inspects the ZodError) — so attacker-controlled phase/roadmap input is always structured, never exit 3, with the legitimate missing-phase path unchanged. Test: pack with a phase file symlinked outside → exit 2 / CONFIG_ERROR, no internal error, and the foreign phase's marker never reaches the pack.
…y, not lexical path Blocker 1 (HIGH): the overwrite/prune ownership gate matched the LEXICAL path against owned globs, but resolveWithinProject ALLOWS in-project symlinks and returns the lexical path — so .claude/skills -> ../src makes the owned-looking .claude/skills/context.md resolve to src/context.md, and a forged manifest then overwrote (or pruned) the real src file via the symlink. PoC-confirmed. New pathTraversesSymlink(cwd, relPath) check: a destructive AUTO action (update / replace_unmanaged / prune) on a path that traverses ANY symlink component is refused, so lexical path == real destination (CWE-59/61). Blocker 2 (HIGH): overwriteOwnedPathGlobs included '.claude/skills/*.md' — but that dir is SHARED with hand-authored user skills, so (no symlink, no --force needed) a verification command whose deriveSkillName collides with a user skill (e.g. 'deploy' → deploy.md) + a forged manifest hash made it managed-clean × stale → update → overwrote the user's skill with generator content embedding the attacker's command (agent-instruction poisoning). Removed overwriteOwnedPathGlobs entirely; the overwrite gate now uses the EXACT static ownedPathGlobs (CLAUDE.md + the 3 built-in skills). Dynamic command-skills are no longer auto-overwritten when stale — they are refused; a reserved generated-skill namespace that restores safe dynamic re-render is the planned follow-up. Also contains loadModelProfiles' read via resolveWithinProject (a symlinked model-profiles dir/file no longer reads out of project), surfaced by the new fs-containment hook. Tests: symlinked owned-skills-dir overwrite is refused (victim untouched); hand-authored deploy.md is not overwritten by a colliding command + forged manifest; --regen-skills no longer overwrites a divergent dynamic skill.
…dmap + map CONFIG_ERROR Round 8 added a contained loadRoadmap, but resolveTaskInRoadmap (shared by task status/context/prepare/complete/record-done/finalize/start/block/resume), phase-archive's loadRef, phase-reconcile's resolvePhase, and plan-adopt's nextPhaseSeed STILL did their own readFile(join(cwd,'design/roadmap.yaml')) + Roadmap.parse — bypassing symlink containment AND the EISDIR/ENOTDIR/EACCES/malformed → CONFIG_ERROR mapping. So a symlinked design/roadmap.yaml made those commands read an out-of-project roadmap as the control plane. (My fs-containment tripwire missed resolve-task because its read was MULTILINE — see the tripwire fix.) All four now use loadRoadmap(cwd). And every consumer's CLI maps the resulting CONFIG_ERROR to exit 2: task complete / record-done / finalize / status (switch cases), task start/block/resume (emitTaskCommonError), task add, verify, phase archive, phase reconcile — plus a top-level safety net in main()'s rejection handler so ANY unmapped CONFIG_ERROR is a clean exit-2 envelope, never an internal error / exit 3. Test: design/roadmap.yaml symlinked outside → task complete --dry-run / task status / phase archive / phase reconcile --write all exit 2 with error.code CONFIG_ERROR, no internal error, and the foreign roadmap's marker never leaks.
…t, machine-readable reason Must-fix 1: loadModelProfiles contained the per-file READ but still readdir'd a LEXICAL .code-pact/model-profiles, so a symlinked-outside dir was still enumerated (out-of-project listing / large-dir DoS). Now resolveWithinProject the directory BEFORE readdir (optional source → unsafe/missing dir is []). Must-fix 2: every refusal printed 're-run with --accept-modified', but a SECURITY refusal (a generated path outside the trusted owned set, or one that reaches its real target through a symlink) is NOT resolvable that way — re-running refuses again. Refusals now carry a machine-readable reason (managed_modified | unowned_generated_path | symlink_traversal) in files[]/plan[] (JSON), and the human guidance branches per reason: --accept-modified ONLY for managed_modified; the security reasons get inspect/remove or fix-the-symlink guidance. --regen-skills help + cli-contract updated to state it does NOT overwrite a divergent dynamic skill (reserved-namespace follow-up). Tests: the symlinked-owned-dir refusal carries reason symlink_traversal; the deploy.md collision carries reason unowned_generated_path.
…in(...)) The single-line regex missed a MULTILINE readFile(\n join(...)) — exactly the resolve-task read that bypassed loadRoadmap. The scan now runs over full text with \s* spanning newlines, reporting the fs-call's line number. A path stashed in a variable first (const d = join(...); readFile(d)) is still not caught — that needs dataflow (the AST-lint / projectFs chokepoint follow-up), so this stays an edit-time advisory, not a complete guarantee.
The last roadmap/phase consumers bypassing the contained seam were in plan/state.ts: loadPlanState (strict — behind task runbook, phase runbook, status, plan analyze) and collectPlanArtifacts + scanPhasesDirBestEffort (lenient — behind decision prune/retire and plan lint) read via roadmapPath(cwd)/join(cwd, ref.path) and a lexical design/phases readdir. A symlinked design/roadmap.yaml or design/phases therefore let those commands read an out-of-project control plane — and the top-level CONFIG_ERROR safety net does NOT help, because a VALID external YAML throws nothing and flows through as normal data. Blocker 2 (the dangerous one): collectPlanArtifacts feeds decision prune/retire's referencing-task gate, so a roadmap symlinked to an external EMPTY roadmap hid the current project's referencing not-done task → prune/retire could be wrongly authorized to DELETE a still-referenced decision. Fix: every read in the family now goes through resolveWithinProject. STRICT (loadPlanState) maps a containment escape to CONFIG_ERROR (propagates → consumer/safety-net → exit 2); LENIENT (collectPlanArtifacts / scanPhasesDirBestEffort) turns a containment escape into a graph-file FileIssue so planArtifactsUnreadable() fail-closes (prune/retire refuse). loadPlanStatePhase's ParseError-on-malformed contract is unchanged — only the PATH is contained. The model-profiles-style directory is also resolved before readdir. Tests: symlinked roadmap → task runbook (loadPlanState) exits 2 CONFIG_ERROR with no leak; a roadmap symlinked to an external empty roadmap can no longer hide a referencing not-done task → decision prune --write fails closed and the decision is byte-identical.
The adapter install/upgrade help described these security-relevant flags as the OPPOSITE of what they do: --force was 'Overwrite existing managed files' (it is unmanaged-adoption only and NEVER overwrites a modified managed file), and --accept-modified was 'Preserve manually-edited managed files' (it ALLOWS overwriting them with generator output — the destructive flag). Backwards help on destructive flags makes a user misjudge the blast radius. Now: --force = adopt/replace UNMANAGED only, does not overwrite a modified managed file; --accept-modified = ALLOW overwriting a locally-modified managed file.
A `decision_refs` value was any non-empty string (`z.string().min(1)`).
A checked-in phase YAML could name an arbitrary in-project file:
decision_refs:
- .env
The gate (adr.ts) read it, `classifyAdr` accepted it (lenient no-status
rule), the requires_decision gate was released, AND `loadDeclaredDecisions`
rendered its body into the agent-facing context pack — an arbitrary local
file read + gate bypass + secret-into-artifact leak, no symlink required.
Add a single source-of-truth validator `DecisionRefPath`
(design/decisions/**/*.md, nested allowed, README.md/PRUNED.md excluded,
absolute/`..`/backslash rejected) and apply it across the whole surface as
multi-layer defense (never schema-only):
- Task + phase-import schemas: parse-time HARD FAIL (front line) — a bad
ref is rejected before any read, at YAML parse / `task add` / `phase
import`. The repo has no decision_refs values today, so nothing breaks.
- decision gate read seam (diskReader): re-validate namespace → out-of-
namespace is `unsafe`, never read, never classified accepted.
- pack loader (loadDeclaredDecisions): shares the same seam → never renders
an out-of-namespace file into the pack.
- plan lint (detectTaskDecisionRefUnsafePath): uses the same
`decisionRefPathReason` → precise exit-affecting diagnostic.
- context-fit advisory: namespace-gate + owned read seam (no arbitrary
read just to measure bytes).
`acceptance_refs` keeps the loose shape ON PURPOSE — it routinely points at
docs / phase YAML, not just ADRs. The archive fallback was already safe
(`normalizeDecisionRef` returns null for non-decision paths).
Regression tests: `.env`, README.md, PRUNED.md, `docs/*.md`, `../secret.md`,
absolute, backslash → rejected at schema/lint, gate not released, secret
never in pack/result, record-done/status fail-closed at load. Nested ADR +
flat ADR still accepted.
Forged-manifest content/SHA oracle: a manifest is project-supplied and its
`files[].path` was only a RelativePosixPath. A hostile repo could list
files:
- path: .env
role: instruction
sha256: 0000...
and `adapter conformance` / `adapter doctor` (and global doctor/validate,
which calls inspectAgent) would READ `.env`, emit `actual_sha256`, and run
contract-heading/substring inspection — a content oracle on arbitrary local
files (low-entropy/dictionary attack; instruction-role heading oracle).
Add `classifyManifestFileForRead`: gate EVERY manifest-entry read behind the
SAME trusted authority the writer uses (writePathGlobs ?? ownedPathGlobs —
the exact static set the adapter may create/overwrite) plus the owned-path
symlink guard (resolveOwnedProjectPath rejects every symlink component). A
path the adapter could not have generated is refused — never read, never
hashed, no actual_sha256, no heading inspection.
- conformance: instruction read + per-file checksum loop both gated;
unowned/unsafe → `adapter_file_path_unowned` (required → non-compliant).
- doctor: per-file loop gated before read → `ADAPTER_FILE_PATH_UNSAFE`
(reused; doc text broadened to cover the unowned case).
Also wire `pnpm check:fs-containment` into CI (full profile) — a structural
backstop only; the SEMANTIC invariants (decision_refs namespace, manifest
ownership) are pinned by the security regression tests, which it does NOT
replace.
Regression tests: forged `.env` entry (and instruction-role `.env`) →
refused, secret/sha never in output, fail-closed, for both conformance and
doctor.
The prior round widened DecisionRefPath to nested `design/decisions/**/*.md`, but the entire decision lifecycle downstream is flat-only: normalizeDecisionRef / normalizePrunedDecisionPath, the decision-state-record schema, retire, prune, the archive gate fallback, and the quality scan all reject a nested path. A nested ADR was therefore valid at the schema/gate but silently un-retireable, un-prunable, and un-archivable — a schema-vs-lifecycle inconsistency. Scope here is the security fix, not a new feature: revert DecisionRefPath to flat (`design/decisions/<file>.md`, no subdirectories), matching every downstream contract. The `.env` / traversal / absolute / backslash / README / PRUNED rejections are unchanged — the arbitrary-file-read defense holds. Nested ADR support remains a deliberate future extension across the WHOLE lifecycle. The decision read seam (adr.ts diskReader) re-validates with the now-flat isDecisionRefPath, so a nested path is `unsafe` (never read) in lockstep with the rest. Tests updated: nested is now REJECTED at schema and refused by the read seam.
…amespace
The forged-manifest read gate used `writePathGlobs ?? ownedPathGlobs` as read
authority. For claude that includes `.claude/skills/*.md` — a namespace SHARED
with hand-authored user skills AND attacker-influenceable dynamic skill names
(derived from project verification commands). So a forged manifest naming a
victim's `.claude/skills/private.md` matched the wildcard → was read + hashed +
(role: instruction) heading-inspected: a content/SHA oracle on a real local
file. WRITE authority ("may create here") is not READ authority ("may read +
hash + inspect this existing file").
Read authority is now the NARROW, wildcard-free `ownedPathGlobs` (built-in
paths only):
- conformance: instruction read + per-file checksum gate on ownedPathGlobs +
symlink reject. A dynamic skill in the shared namespace cannot prove
read-ownership → `file_checksum_skipped_unverifiable` (advisory, never read);
conformance does not regenerate, so it cannot verify those bytes (doctor can).
An out-of-namespace path (.env) → `adapter_file_path_unowned` (required).
- doctor: three tiers — (1) exact current generated set or built-in static set
→ full verify incl. heading inspection; (2) in the broad write namespace but
NOT the current set (stale/orphan skill OR a victim's private file, path-
indistinguishable) → NOT read, advisory ADAPTER_FILE_UNVERIFIABLE (no content
oracle; replaces the prior benign read-based stale warning — both warnings,
validate stays green); (3) outside everything (.env) → hard ADAPTER_FILE_
PATH_UNSAFE. The heading/substring inspection (the real oracle) runs ONLY on
tier 1.
Regression tests: `.claude/skills/private.md` as role skill AND instruction,
for both conformance and doctor — secret/sha never in output; plus a hard-
refused `.env`. New codes documented + registered in error-code-surface.
… CI note Blocker 2 — roadmap/phase symlink-alias parity. loadRoadmap / loadPhase / collectPlanArtifacts read the control plane through resolveWithinProject, whose contract ALLOWS an existing in-project symlink. The strict loadPlanState already uses resolveOwnedProjectPath (rejects every symlink component). So a checked-in `design/phases/P1.yaml -> ../../.local/private-phase.yaml` was accepted by the lenient/strict-discovery loaders and its objective / verification commands / task prose flowed into the context pack, generated skills, and the prune/retire safety judgement. Unify all three onto resolveOwnedProjectPath: strict → CONFIG_ERROR, lenient → fail-closed FileIssue. A genuinely-missing (non-symlink) phase still throws RAW ENOENT (the archived-fallback signal). Regression tests cover loadPhase / loadRoadmap (CONFIG_ERROR) and collectPlanArtifacts (FileIssue, no aliased content leak). Must-fix 1 — `task add --decision-ref .env` produced an uncoded Phase.parse ZodError that escaped the CLI catch → exit 3 (internal fault) for what is user input. Validate --decision-ref at the CLI boundary with decisionRefPathReason → CONFIG_ERROR / exit 2, before runTaskAdd, so the phase YAML is never touched. Integration test pins exit 2 + byte-identical phase YAML. Must-fix 2 — check-fs-containment.mjs still said "WITHOUT bloating CI" / "NOT wired into the gate"; it IS now in the CI full profile. Comment corrected, and it now states plainly that exit 0 is a STRUCTURAL signal only (lexical join), not a proof of the semantic invariants (those live in the regression tests).
Addresses the Codex security review of
bd84281(/tmp/codex-security-scans/code-pact/bd84281_20260618T142512Z/report.md): 7 findings — 2 high, 5 medium. Every fix ships an attacker-scenario regression test; user-visible behavior changes are documented inCHANGELOG.md+docs/cli-contract.md.Codex findings → resolution
design/constitution.mdsymlink outside the project (CWE-59)loadConstitutionreads viareadWithinProject/resolveWithinProject— same project-contained path as rules/decisions; unsafe →nullsrc/core/pack/loaders.tstask complete --dry-runexecutes verification shell commands (CWE-78)dryRunintorunVerify;commandscheck is previewed not executed (shell:truenever runs); read-onlydecisiongate still runssrc/commands/task-complete.ts.code-pact/adapterssymlink escape (CWE-59)resolveManifestPath→read/writeManifestresolve viaresolveWithinProject; fail-closed on escape (throws, not treated as missing)src/core/adapters/manifest.tsrandomUUID+ exclusive createflag:"wx"(O_CREAT|O_EXCL, refuses + never follows a symlink); EEXIST retry; test seamsrc/io/atomic-text.tsdecideAction: installmanaged-clean × stale→update(re-render) instead ofskip; install command handlesupdate.managed-modifiedstill untouchedsrc/core/adapters/file-state.ts,src/commands/adapter-install.tsdescriptor.ownedPathGlobs; unowned orphan →warn(kept on disk, kept tracked, explained in CLI). See trade-off belowsrc/commands/adapter-upgrade.ts,src/cli/commands/adapter.ts**globs cause regex ReDoS (CWE-1333)matchGlobreplaces backtrackingglobToRegexon walk/audit/doctor hot paths;MAX_GLOB_LENGTHcap invalidateGlobSyntaxsrc/core/glob.ts,src/core/audit/write-audit.ts,src/commands/doctor.tsRegression tests added
pack-core.test.ts— adesign/constitution.mdsymlinked to an outside file does NOT leak its marker into acontext_size: largepack (includedConstitution:false); a real in-project constitution is still included.task-complete.test.ts+cli.test.ts—--dry-runwith a marker-creating / failing command does NOT execute it (no marker, exit 0dry_run); non-dry-run contrast still executes + fails.adapter-manifest.test.ts—.code-pact/adapterssymlinked outside:writeManifestrejects + writes nothing outside;readManifestrejects (no foreign read).atomic-text.test.ts— a pre-planted temp-path symlink is refused; the outside target is untouched and the destination is not created.adapter-upgrade.test.ts(+adapter-file-state.test.ts) — a forged manifest hash matching maliciousCLAUDE.mdis self-healed to generator output; a genuinely user-modified file is NOT overwritten.adapter-upgrade.test.ts+adapter-cli.test.ts— a forged manifest entry forsrc/important.tsis NOT deleted; unowned orphans arewarn+ stable across runs; the CLI names the file, the reason, and the manual-removal step.glob.test.ts—matchGlobparity vsglobToRegexacross patterns/paths; a 12×**pattern over a 200-segment path resolves< 1s(no catastrophic backtracking);MAX_GLOB_LENGTHrejection.User-visible behavior changes (3)
task complete --dry-run— a dry run whose only failing check is a command no longer exits 1; it returns a cleandry_runpreview (exit 0). The decision gate still runs. Non-dry-run completion is unchanged.adapter install— amanaged-cleanfile whose content is stale vs the generator is re-rendered (update) instead of skipped. Genuinely user-modified (managed-modified) files are still left untouched.adapter upgrade --write— orphans outside the adapter's owned path set are no longer auto-deleted; they are surfaced (action:"warn") and kept on disk with an explanatory CLI message.Design decision on #6 (option A — adopted)
ownedPathGlobsto.claude/skills/*.md) — rejected. The manifest is project-controlled; a namespace glob over a shared directory would let a forged manifest delete user-authored skills — the same vulnerability class in a different directory, which the review explicitly warned against.Verification
The initial-review gate passed (typecheck / unit / integration /
check:docsall green). Currentcounts after all hardening rounds are in Verification (latest) at the
bottom — that is the single source of truth; the per-round numbers above are historical.
Independent adversarial review (internal
code-reviewerpass) confirmedmatchGlob↔globToRegexparity, atomic-text exclusive-create correctness, and no other missed manifest-trust delete paths; its one diagnostic note (adapter listclassifying a symlink-escape throw asmanifestInvalid) was addressed by documenting the intentional tolerant degradation (the lister must not throw and surfacing-as-unusable beats masking-as-absent).Follow-up: #487 — safe orphan auto-cleanup via strong ownership.
Additional hardening (post-initial-review adversarial rounds)
Several further adversarial review passes on this branch extended the same fix classes — out-of-project read/write, partial side effects before a fail-closed check, and glob-matcher consistency. Each fix ships regression tests and the full gate stays green.
ADAPTER_MANIFEST_INVALID(exit 2) inadapter install/upgrade, not an uncoded internal error (exit 3)src/core/adapters/manifest.ts,src/cli/commands/adapter.tsplan promptreadsdesign/brief.md/design/constitution.mdthrough a sharedreadProjectTextOrNull(symlink-escape → null), closing the same agent-facing leak class as finding #1 for another commandsrc/core/project-read.ts,src/commands/plan-prompt.ts,src/core/pack/loaders.tsresolveAgentProfilePathresolves throughresolveWithinProject(escape →CONFIG_ERROR), so a symlinked.code-pact/agent-profilescannot redirect a profile read — or the--modelpin's write — outside the projectsrc/core/agent-profile-path.tsadapter install/upgraderead the manifest AND preflight every write path (placeholder dirs + generated files + orphan candidates viaresolveWithinProject) BEFORE persisting the--modelpin, so a doomed run leaves no pin/write/unlinksrc/commands/adapter-install.ts,src/commands/adapter-upgrade.ts,src/core/adapters/file-state.tsresolveWithinProjecttags an escapePATH_OUTSIDE_PROJECT;decision prune/retirekeep classifying it astarget_invalid; code registered in the error-code surface contractsrc/core/path-safety.ts,src/core/decisions/prune.ts,src/core/decisions/retire.tsresolveWithinProjectis now a write-safe containment preflight: it walks per-component and refuses ALL dangling symlinks (in- or out-of-project) and cycles viarealpath, instead of mistaking a dangling link for a safe not-yet-created path (which let a latermkdirstrand a partial side effect)src/core/path-safety.tsfindProtectedPathOverlapsuses the runtimematchGlob, andglobToRegexcollapses adjacent**so the two agree — a repeated-**write (e.g.design/**/**/roadmap.yaml) can no longer match a protected file at runtime while evading the protected-path advisory. Also:globToRegexescapes?(a literal in this subset) and expands**as[\s\S]*so it truly matchesmatchGlob(enforced by a generative parity test)src/core/glob.tscreateExclusiveTempclaims the temp viaopen(..,"wx")then writes through the handle; a mid-write failure (EFBIG/ENOSPC/EIO) closes + unlinks the partial temp instead of leaking a.tmp-<uuid>(the Round-1 symlink-clobber fix had regressed failure-path cleanup)src/io/atomic-text.tsdirectoryspec that is a regular file (mkdir EEXIST), afilespec that is a directory (write EISDIR), or a non-directory intermediate (ENOTDIR) →CONFIG_ERRORBEFORE the--modelpin — so a forged profile/manifest of the wrong type can no longer strand a pin or crash with exit 3src/core/adapters/file-state.ts, install/upgradereadManifest(a directory manifest path → EISDIR) →ADAPTER_MANIFEST_INVALID;adapter doctor's diagnostic read degrades all errors to null;classifyDecisionAdrs/plan lint's ADR reads route through the containedreadLiveDecisionFileseam + degrade (a*.mddirectory or symlink-escaping ADR no longer crashesplan lint);loadModelProfilesskips a*.yamldirectorysrc/core/adapters/manifest.ts,src/commands/adapter-doctor.ts,src/core/decisions/adr.ts,src/core/plan/lint.ts, install/upgradeAgentProfile.instruction_filenameand manifestfiles[].pathare both attacker-controlled, so a forged manifest (hash == a victim file's real hash) + a profile pointing the instruction/skill path at e.g.package.jsonmade itmanaged-clean × stale→ auto-update(overwrite) on a plainadapter install. A content overwrite (update/replace_unmanaged) is now gated on a NEW trusted-staticoverwriteOwnedPathGlobsnamespace (separate from, and broader than, the narrow delete gate) — a generated path outside it is refused, never written on manifest trust. This was the blind spot in the originalmanaged-clean × stale → updateself-healsrc/core/adapters/types.ts,src/core/adapters/claude.ts,src/commands/adapter-install.ts,src/commands/adapter-upgrade.tsloadPhase/loadRoadmap(mandatory inputs rendered into the context pack AND generated Claude skills) route throughresolveWithinProject→CONFIG_ERRORon escape, so a..phase ref or a symlinkeddesign/no longer leaks an out-of-project phase's objective/tasks/verification.commandssrc/core/plan/load-phase.ts,src/core/plan/roadmap.ts,src/core/adapters/claude.tsadapter install/upgrademap a malformed/schema-invalid agent profile toCONFIG_ERROR(was uncoded exit 3); the typed write preflight requires a regular file (st.isFile()), so a FIFO/socket/device at a generated path is refused before the pin instead of blocking a laterreadFileforeversrc/commands/adapter-install.ts,src/commands/adapter-upgrade.ts,src/core/adapters/file-state.tsThese rounds were increasingly proactive — an internal adversarial
code-reviewersweep found the typed-preflight gap, the temp-leak, and several uncoded-errno readers before they shipped, and a new fast static fs-containment tripwire (scripts/check-fs-containment.mjs, wired as a local PostToolUse edit-time hook — NOT added to CI, which is already ~3.5 min) catches the lexical-joincontainment class at authoring time.pnpm check:fs-containmentreports the remaining ~27-line migration backlog (the open follow-up to contain.code-pact/project.yaml/model-profilesreads).Ownership vs containment (combination-attack round)
A further adversarial round found that the overwrite/prune authorization conflated containment (inside the project) with ownership (a path the generator actually owns):
resolveWithinProjectallows an in-project symlink and returns the lexical path — so.claude/skills -> ../srclet an owned-looking.claude/skills/context.mdresolve tosrc/context.mdand a forged manifest overwrite/prune the real file. A destructive AUTO action (update/replace_unmanaged/prune) on a path that traverses ANY symlink component is now refused (newpathTraversesSymlink) — lexical path must equal the real destinationsrc/core/path-safety.ts,src/commands/adapter-install.ts,src/commands/adapter-upgrade.tsoverwriteOwnedPathGlobshad widened the auto-overwrite set to.claude/skills/*.md— a dir SHARED with hand-authored user skills, so (no symlink, no--force) a verification command whose derived name collides with a user skill + a forged manifest hash overwrote that user skill with generator content carrying the attacker's command. Reverted: the overwrite gate now uses the EXACT staticownedPathGlobs(built-ins only); dynamic command-skills are refused when stale (a reserved generated-skill namespace is the follow-up)src/core/adapters/types.ts,src/core/adapters/claude.ts, install/upgradeloadPhase/loadRoadmapnow map non-ENOENT read + YAML/schema failures toCONFIG_ERROR(ENOENT stays raw for the archived-fallback), andcmdPackmapsCONFIG_ERRORto exit 2 — so a symlinked / malformed / directory phase ref no longer exit-3's frompacksrc/core/plan/load-phase.ts,src/core/plan/roadmap.ts,src/cli.tsAlso tracked, NOT in this PR: #489 —
adapter doctor/conformanceread manifest-declared entry paths via lexicaljoin(final-component / parent symlink not contained). These are advisory surfaces only (checksums / check results, never a write); tracked as security debt. Themodel-profilesreads in install/upgrade are now contained;pnpm check:fs-containmentreports the remaining backlog (.code-pact/project.yamland a few others) for the "unify all project-metadata reads through resolveWithinProject" follow-up.Verification (latest)
pnpm typecheckpnpm test:unitpnpm test:integrationpnpm check:docsfull (Node 22)/smoke (Node 24)Roadmap-read containment completeness (cross-command round)
A cross-command audit found that the contained
loadRoadmapwas bypassed by several direct reads, and that the resultingCONFIG_ERRORwas not mapped by every command:resolveTaskInRoadmap(shared by alltask *commands),phase archive,phase reconcile, andplan adoptdid their ownreadFile(join(cwd,'design/roadmap.yaml'))— bypassing containment. All now useloadRoadmap(cwd)src/core/plan/resolve-task.ts,src/commands/phase-archive.ts,src/commands/phase-reconcile.ts,src/commands/plan-adopt.tsCONFIG_ERRORto exit 2, plus a top-level safety net inmain()so any unmapped one is still a clean envelope, never exit 3src/cli.ts,src/cli/commands/task.ts,src/cli/commands/phase.tsloadModelProfilesnowresolveWithinProjects the directory beforereaddir(no out-of-project enumeration of a symlinked dir)reason(managed_modified/unowned_generated_path/symlink_traversal);--accept-modifiedis only suggested formanaged_modified(the security refusals are not overridable by it)src/cli/commands/adapter.ts, install/upgradecheck-fs-containmentnow also catches the multilinefsfn(⏎ join(…))form (the variable-indirection form remains the AST-lint / projectFs follow-up)scripts/check-fs-containment.mjsloadPlanState(strict — task/phase runbook, status, plan analyze) +collectPlanArtifacts/scanPhasesDirBestEffort(lenient — decision prune/retire, plan lint) read viaresolveWithinProject: strict →CONFIG_ERROR, lenient → a graph-file FileIssue soplanArtifactsUnreadable()fail-closes. This closes a real destructive mis-authorization: a roadmap symlinked to an external EMPTY roadmap could hide the current project's referencing not-done task and letdecision prune/retire --writedelete a still-referenced decisionsrc/core/plan/state.ts--force/--accept-modifiedhelp described the OPPOSITE of the implemented behavior (a footgun on destructive flags) — correctedsrc/cli/usage.tsScope boundary — deferred to a dedicated follow-up PR
The
design/writer commands (plan normalize --write,plan sync-paths --write,phase add/new,spec import --write,plan brief --force,plan constitution --force,createPhase) can still write through a symlinkeddesign/to outside the project. This is a real, distinct class — but containing every project WRITE (plus aprojectFschokepoint + a no-direct-node:fslint) is its own large change, and piling it onto this already-large PR would itself be a review/regression risk. It is intentionally out of scope for #488 and tracked as the next security follow-up;pnpm check:fs-containmentalready surfaces the in-repo candidates for it.