diff --git a/.github/agents/pharaoh.activity-diagram-draft.agent.md b/.github/agents/pharaoh.activity-diagram-draft.agent.md new file mode 100644 index 0000000..cb40cfe --- /dev/null +++ b/.github/agents/pharaoh.activity-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. +handoffs: [] +--- + +# @pharaoh.activity-diagram-draft + +Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. + +See [`skills/pharaoh-activity-diagram-draft/SKILL.md`](../../skills/pharaoh-activity-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.api-coverage-check.agent.md b/.github/agents/pharaoh.api-coverage-check.agent.md new file mode 100644 index 0000000..f76eb9c --- /dev/null +++ b/.github/agents/pharaoh.api-coverage-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in needs.json. Reverse direction of pharaoh-req-from-code — language-parametric via the shared regex table; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. +handoffs: [] +--- + +# @pharaoh.api-coverage-check + +Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in `needs.json`. Reverse direction of `pharaoh-req-from-code` — language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md`; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. 
+ +See [`skills/pharaoh-api-coverage-check/SKILL.md`](../../skills/pharaoh-api-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. diff --git a/.github/agents/pharaoh.block-diagram-draft.agent.md b/.github/agents/pharaoh.block-diagram-draft.agent.md new file mode 100644 index 0000000..0436607 --- /dev/null +++ b/.github/agents/pharaoh.block-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. +handoffs: [] +--- + +# @pharaoh.block-diagram-draft + +Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. + +See [`skills/pharaoh-block-diagram-draft/SKILL.md`](../../skills/pharaoh-block-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.bootstrap.agent.md b/.github/agents/pharaoh.bootstrap.agent.md new file mode 100644 index 0000000..bca421a --- /dev/null +++ b/.github/agents/pharaoh.bootstrap.agent.md @@ -0,0 +1,13 @@ +--- +description: Inject minimum sphinx-needs configuration into an existing Sphinx project so sphinx-build produces a valid needs.json. +handoffs: + - label: Detect and scaffold Pharaoh + agent: pharaoh.setup + prompt: Detect the freshly configured sphinx-needs project and scaffold pharaoh.toml +--- + +# @pharaoh.bootstrap + +Inject the minimum sphinx-needs configuration — extension entry, need types, optional extra links — into an existing Sphinx project that does not yet have sphinx-needs configured. Does not seed RST content, does not build, does not write `pharaoh.toml`. 
+ +See [`skills/pharaoh-bootstrap/SKILL.md`](../../skills/pharaoh-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.class-diagram-draft.agent.md b/.github/agents/pharaoh.class-diagram-draft.agent.md new file mode 100644 index 0000000..7eef840 --- /dev/null +++ b/.github/agents/pharaoh.class-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). +handoffs: [] +--- + +# @pharaoh.class-diagram-draft + +Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). + +See [`skills/pharaoh-class-diagram-draft/SKILL.md`](../../skills/pharaoh-class-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.component-diagram-draft.agent.md b/.github/agents/pharaoh.component-diagram-draft.agent.md new file mode 100644 index 0000000..b0bea11 --- /dev/null +++ b/.github/agents/pharaoh.component-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. +handoffs: [] +--- + +# @pharaoh.component-diagram-draft + +Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. + +See [`skills/pharaoh-component-diagram-draft/SKILL.md`](../../skills/pharaoh-component-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.decision-review.agent.md b/.github/agents/pharaoh.decision-review.agent.md new file mode 100644 index 0000000..e644a61 --- /dev/null +++ b/.github/agents/pharaoh.decision-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single recorded decision against context/alternatives/consequences structure and traceability. +handoffs: [] +--- + +# @pharaoh.decision-review + +Audit a single recorded decision against context/alternatives/consequences structure and traceability. + +See [`skills/pharaoh-decision-review/SKILL.md`](../../skills/pharaoh-decision-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.deployment-diagram-draft.agent.md b/.github/agents/pharaoh.deployment-diagram-draft.agent.md new file mode 100644 index 0000000..8e99724 --- /dev/null +++ b/.github/agents/pharaoh.deployment-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). +handoffs: [] +--- + +# @pharaoh.deployment-diagram-draft + +Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). + +See [`skills/pharaoh-deployment-diagram-draft/SKILL.md`](../../skills/pharaoh-deployment-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.diagram-lint.agent.md b/.github/agents/pharaoh.diagram-lint.agent.md new file mode 100644 index 0000000..4aa99bd --- /dev/null +++ b/.github/agents/pharaoh.diagram-lint.agent.md @@ -0,0 +1,13 @@ +--- +description: Walk a directory of RST files and check every `.. mermaid::` / `.. uml::` block against the real renderer parser (mmdc, plantuml). 
Catches silent parse failures that sphinx-build misses. +handoffs: + - label: Aggregate into quality gate + agent: pharaoh.quality-gate + prompt: Consume the diagram-lint findings alongside review/mece/coverage reports for the terminal pass/fail decision +--- + +# @pharaoh.diagram-lint + +Walk a directory of RST files, extract every Mermaid / PlantUML block, and parse each block with the real renderer CLI (`mmdc -i tmp.mmd -o /dev/null`, `plantuml -checkonly`). Emits structured findings. Read-only — does not modify RST. When a renderer CLI is unavailable, degrades gracefully with a warning and install command. + +See [`skills/pharaoh-diagram-lint/SKILL.md`](../../skills/pharaoh-diagram-lint/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.diagram-review.agent.md b/.github/agents/pharaoh.diagram-review.agent.md new file mode 100644 index 0000000..5b7d396 --- /dev/null +++ b/.github/agents/pharaoh.diagram-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. +handoffs: [] +--- + +# @pharaoh.diagram-review + +Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. + +See [`skills/pharaoh-diagram-review/SKILL.md`](../../skills/pharaoh-diagram-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.dispatch-signal-check.agent.md b/.github/agents/pharaoh.dispatch-signal-check.agent.md new file mode 100644 index 0000000..52d8c1e --- /dev/null +++ b/.github/agents/pharaoh.dispatch-signal-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. +handoffs: [] +--- + +# @pharaoh.dispatch-signal-check + +Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. 
+ +See [`skills/pharaoh-dispatch-signal-check/SKILL.md`](../../skills/pharaoh-dispatch-signal-check/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.execute-plan.agent.md b/.github/agents/pharaoh.execute-plan.agent.md new file mode 100644 index 0000000..2c1cf46 --- /dev/null +++ b/.github/agents/pharaoh.execute-plan.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when executing a plan. +handoffs: [] +--- + +# @pharaoh.execute-plan + +Use when executing a plan. + +See [`skills/pharaoh-execute-plan/SKILL.md`](../../skills/pharaoh-execute-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md new file mode 100644 index 0000000..e7ad53d --- /dev/null +++ b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). +handoffs: [] +--- + +# @pharaoh.fault-tree-diagram-draft + +Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). + +See [`skills/pharaoh-fault-tree-diagram-draft/SKILL.md`](../../skills/pharaoh-fault-tree-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-balance.agent.md b/.github/agents/pharaoh.feat-balance.agent.md new file mode 100644 index 0000000..3829768 --- /dev/null +++ b/.github/agents/pharaoh.feat-balance.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). +handoffs: [] +--- + +# @pharaoh.feat-balance + +Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). + +See [`skills/pharaoh-feat-balance/SKILL.md`](../../skills/pharaoh-feat-balance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-component-extract.agent.md b/.github/agents/pharaoh.feat-component-extract.agent.md new file mode 100644 index 0000000..c29c37e --- /dev/null +++ b/.github/agents/pharaoh.feat-component-extract.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. +handoffs: [] +--- + +# @pharaoh.feat-component-extract + +Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. + +See [`skills/pharaoh-feat-component-extract/SKILL.md`](../../skills/pharaoh-feat-component-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-draft-from-docs.agent.md b/.github/agents/pharaoh.feat-draft-from-docs.agent.md new file mode 100644 index 0000000..63216a1 --- /dev/null +++ b/.github/agents/pharaoh.feat-draft-from-docs.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. +handoffs: [] +--- + +# @pharaoh.feat-draft-from-docs + +Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. + +See [`skills/pharaoh-feat-draft-from-docs/SKILL.md`](../../skills/pharaoh-feat-draft-from-docs/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-file-map.agent.md b/.github/agents/pharaoh.feat-file-map.agent.md new file mode 100644 index 0000000..0680f3b --- /dev/null +++ b/.github/agents/pharaoh.feat-file-map.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. +handoffs: [] +--- + +# @pharaoh.feat-file-map + +Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. + +See [`skills/pharaoh-feat-file-map/SKILL.md`](../../skills/pharaoh-feat-file-map/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-flow-extract.agent.md b/.github/agents/pharaoh.feat-flow-extract.agent.md new file mode 100644 index 0000000..444be70 --- /dev/null +++ b/.github/agents/pharaoh.feat-flow-extract.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. +handoffs: [] +--- + +# @pharaoh.feat-flow-extract + +Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. + +See [`skills/pharaoh-feat-flow-extract/SKILL.md`](../../skills/pharaoh-feat-flow-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-review.agent.md b/.github/agents/pharaoh.feat-review.agent.md new file mode 100644 index 0000000..af64186 --- /dev/null +++ b/.github/agents/pharaoh.feat-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. +handoffs: [] +--- + +# @pharaoh.feat-review + +Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. + +See [`skills/pharaoh-feat-review/SKILL.md`](../../skills/pharaoh-feat-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.fmea-review.agent.md b/.github/agents/pharaoh.fmea-review.agent.md new file mode 100644 index 0000000..6768550 --- /dev/null +++ b/.github/agents/pharaoh.fmea-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. +handoffs: [] +--- + +# @pharaoh.fmea-review + +Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. 
+ +See [`skills/pharaoh-fmea-review/SKILL.md`](../../skills/pharaoh-fmea-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.gate-advisor.agent.md b/.github/agents/pharaoh.gate-advisor.agent.md new file mode 100644 index 0000000..4c9c63b --- /dev/null +++ b/.github/agents/pharaoh.gate-advisor.agent.md @@ -0,0 +1,10 @@ +--- +description: Read a project's `pharaoh.toml` and report which phased-enablement ladder step is the recommended next gate to switch on. Advisory, read-only — walks the fixed 5-step ladder in order (`require_verification` → `require_change_analysis` → `require_mece_on_release` → `codelinks.enabled` → `strictness = "enforcing"`) and names the first unmet step plus its blocker. +handoffs: [] +--- + +# @pharaoh.gate-advisor + +Read the project's `pharaoh.toml`, parse the five ladder flags, and emit a findings JSON naming the next recommended gate to enable, the blocker that must be cleared first, and the full fixed ladder. Read-only; never edits `pharaoh.toml`. The ladder rationale lives in [`skills/shared/gate-enablement.md`](../../skills/shared/gate-enablement.md) — this atom is the tool that walks it, not the authority that defines it. + +See [`skills/pharaoh-gate-advisor/SKILL.md`](../../skills/pharaoh-gate-advisor/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, ladder table, rationale map, tailoring extension point, and composition patterns. diff --git a/.github/agents/pharaoh.id-allocate.agent.md b/.github/agents/pharaoh.id-allocate.agent.md new file mode 100644 index 0000000..c2d23d7 --- /dev/null +++ b/.github/agents/pharaoh.id-allocate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. 
+handoffs: [] +--- + +# @pharaoh.id-allocate + +Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. + +See [`skills/pharaoh-id-allocate/SKILL.md`](../../skills/pharaoh-id-allocate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.id-convention-check.agent.md b/.github/agents/pharaoh.id-convention-check.agent.md new file mode 100644 index 0000000..d2af8bf --- /dev/null +++ b/.github/agents/pharaoh.id-convention-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in .pharaoh/project/id-conventions.yaml. Emits a list of violations. +handoffs: [] +--- + +# @pharaoh.id-convention-check + +Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Emits a list of violations. + +See [`skills/pharaoh-id-convention-check/SKILL.md`](../../skills/pharaoh-id-convention-check/SKILL.md) for the full atomic specification — inputs, outputs, detection rule, and composition patterns. diff --git a/.github/agents/pharaoh.link-completeness-check.agent.md b/.github/agents/pharaoh.link-completeness-check.agent.md new file mode 100644 index 0000000..e067e80 --- /dev/null +++ b/.github/agents/pharaoh.link-completeness-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in artefact-catalog.yaml — missing required links, unresolved target ids, per-type policy enforcement. 
+handoffs: [] +--- + +# @pharaoh.link-completeness-check + +Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in `artefact-catalog.yaml` — missing required links, unresolved target ids, per-type policy enforcement. + +See [`skills/pharaoh-link-completeness-check/SKILL.md`](../../skills/pharaoh-link-completeness-check/SKILL.md) for the full atomic specification — inputs, outputs, per-pass detection rules, and composition patterns. diff --git a/.github/agents/pharaoh.output-validate.agent.md b/.github/agents/pharaoh.output-validate.agent.md new file mode 100644 index 0000000..331ad1f --- /dev/null +++ b/.github/agents/pharaoh.output-validate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). +handoffs: [] +--- + +# @pharaoh.output-validate + +Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). + +See [`skills/pharaoh-output-validate/SKILL.md`](../../skills/pharaoh-output-validate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md new file mode 100644 index 0000000..4e02bf7 --- /dev/null +++ b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that a Papyrus workspace received at least N writes during a plan run. +handoffs: [] +--- + +# @pharaoh.papyrus-non-empty-check + +Verify that a Papyrus workspace received at least N writes during a plan run. 
+ +See [`skills/pharaoh-papyrus-non-empty-check/SKILL.md`](../../skills/pharaoh-papyrus-non-empty-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.prose-migrate.agent.md b/.github/agents/pharaoh.prose-migrate.agent.md new file mode 100644 index 0000000..db1ab43 --- /dev/null +++ b/.github/agents/pharaoh.prose-migrate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. +handoffs: [] +--- + +# @pharaoh.prose-migrate + +Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. + +See [`skills/pharaoh-prose-migrate/SKILL.md`](../../skills/pharaoh-prose-migrate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.quality-gate.agent.md b/.github/agents/pharaoh.quality-gate.agent.md new file mode 100644 index 0000000..95a52fc --- /dev/null +++ b/.github/agents/pharaoh.quality-gate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). +handoffs: [] +--- + +# @pharaoh.quality-gate + +Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). + +See [`skills/pharaoh-quality-gate/SKILL.md`](../../skills/pharaoh-quality-gate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.reproducibility-check.agent.md b/.github/agents/pharaoh.reproducibility-check.agent.md new file mode 100644 index 0000000..ee77bd2 --- /dev/null +++ b/.github/agents/pharaoh.reproducibility-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Diff two output directories produced by two runs of the same plan to confirm the build is reproducible. Consumes baseline dir, rerun dir, and optional mask rules for non-deterministic fields (timestamps, random ids); emits drifted-file list with per-file changed-field summaries. Does NOT run the plan — that is the caller's responsibility (`pharaoh-execute-plan`). +handoffs: [] +--- + +# @pharaoh.reproducibility-check + +Diff two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running twice is the caller's responsibility (`pharaoh-execute-plan`). + +See [`skills/pharaoh-reproducibility-check/SKILL.md`](../../skills/pharaoh-reproducibility-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. diff --git a/.github/agents/pharaoh.req-code-grounding-check.agent.md b/.github/agents/pharaoh.req-code-grounding-check.agent.md new file mode 100644 index 0000000..f188291 --- /dev/null +++ b/.github/agents/pharaoh.req-code-grounding-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify a drafted requirement's claims against the source file it cites via :source_doc: — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. 
+handoffs: [] +--- + +# @pharaoh.req-code-grounding-check + +Verify a drafted requirement's claims against the source file it cites via `:source_doc:` — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. + +See [`skills/pharaoh-req-code-grounding-check/SKILL.md`](../../skills/pharaoh-req-code-grounding-check/SKILL.md) for the full atomic specification — inputs, outputs, per-axis detection rules, and composition patterns. diff --git a/.github/agents/pharaoh.req-codelink-annotate.agent.md b/.github/agents/pharaoh.req-codelink-annotate.agent.md new file mode 100644 index 0000000..004eb8c --- /dev/null +++ b/.github/agents/pharaoh.req-codelink-annotate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. +handoffs: [] +--- + +# @pharaoh.req-codelink-annotate + +Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. + +See [`skills/pharaoh-req-codelink-annotate/SKILL.md`](../../skills/pharaoh-req-codelink-annotate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.reqs-from-module.agent.md b/.github/agents/pharaoh.reqs-from-module.agent.md deleted file mode 100644 index 3aa788d..0000000 --- a/.github/agents/pharaoh.reqs-from-module.agent.md +++ /dev/null @@ -1,10 +0,0 @@ ---- -description: Reverse-engineer comp_reqs for an entire module in parallel with shared Papyrus for terminology coordination. -handoffs: [] ---- - -# @pharaoh.reqs-from-module - -Reverse-engineer comp_reqs for an entire module in parallel with shared Papyrus for terminology coordination. 
- -See [`skills/pharaoh-reqs-from-module/SKILL.md`](../../skills/pharaoh-reqs-from-module/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.self-review-coverage-check.agent.md b/.github/agents/pharaoh.self-review-coverage-check.agent.md new file mode 100644 index 0000000..92beffa --- /dev/null +++ b/.github/agents/pharaoh.self-review-coverage-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify every drafted artefact in runs/ has a matching review JSON. +handoffs: [] +--- + +# @pharaoh.self-review-coverage-check + +Verify every drafted artefact in runs/ has a matching review JSON. + +See [`skills/pharaoh-self-review-coverage-check/SKILL.md`](../../skills/pharaoh-self-review-coverage-check/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.sequence-diagram-draft.agent.md b/.github/agents/pharaoh.sequence-diagram-draft.agent.md new file mode 100644 index 0000000..6fd0ae8 --- /dev/null +++ b/.github/agents/pharaoh.sequence-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. +handoffs: [] +--- + +# @pharaoh.sequence-diagram-draft + +Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. + +See [`skills/pharaoh-sequence-diagram-draft/SKILL.md`](../../skills/pharaoh-sequence-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.sphinx-extension-add.agent.md b/.github/agents/pharaoh.sphinx-extension-add.agent.md new file mode 100644 index 0000000..67ceb53 --- /dev/null +++ b/.github/agents/pharaoh.sphinx-extension-add.agent.md @@ -0,0 +1,10 @@ +--- +description: Idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. +handoffs: [] +--- + +# @pharaoh.sphinx-extension-add + +Add sphinx extensions (e.g. `sphinxcontrib.mermaid`, `sphinxcontrib.plantuml`, `myst_parser`) to a project's `conf.py` extensions list. Idempotent: noop when all requested extensions are already loaded. Optionally installs the corresponding pypi packages via the detected package manager (rye / uv / poetry / pdm / pip-venv). Typically inserted into a plan by `pharaoh.write-plan` as a prerequisite to diagram-emitting tasks when `conf.py` lacks the required renderer extension. + +See [`skills/pharaoh-sphinx-extension-add/SKILL.md`](../../skills/pharaoh-sphinx-extension-add/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.state-diagram-draft.agent.md b/.github/agents/pharaoh.state-diagram-draft.agent.md new file mode 100644 index 0000000..652dc24 --- /dev/null +++ b/.github/agents/pharaoh.state-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. +handoffs: [] +--- + +# @pharaoh.state-diagram-draft + +Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. + +See [`skills/pharaoh-state-diagram-draft/SKILL.md`](../../skills/pharaoh-state-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.status-lifecycle-check.agent.md b/.github/agents/pharaoh.status-lifecycle-check.agent.md new file mode 100644 index 0000000..935ba38 --- /dev/null +++ b/.github/agents/pharaoh.status-lifecycle-check.agent.md @@ -0,0 +1,13 @@ +--- +description: Release-gate check over a sphinx-needs corpus — counts needs still in the `draft` bucket (per workflows.yaml) and returns binary pass/fail when `enforce=true`. Advisory mode reports counts without failing. +handoffs: + - label: Aggregate into quality gate + agent: pharaoh.quality-gate + prompt: Consume the status-lifecycle findings as the delegated check for the status-lifecycle-healthy invariant +--- + +# @pharaoh.status-lifecycle-check + +Aggregate `status` across every need in `needs.json` against the `initial_state` declared in `workflows.yaml`. Binary release gate — under `enforce=true`, zero drafts pass, one draft fails. Under `enforce=false` (default), the findings are reported without failing so pre-release development is unblocked. Distinct from `pharaoh-lifecycle-check`, which evaluates per-need transition legality against `requires:` prerequisites. + +See [`skills/pharaoh-status-lifecycle-check/SKILL.md`](../../skills/pharaoh-status-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-bootstrap.agent.md b/.github/agents/pharaoh.tailor-bootstrap.agent.md new file mode 100644 index 0000000..fc6465e --- /dev/null +++ b/.github/agents/pharaoh.tailor-bootstrap.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows. 
+handoffs: [] +--- + +# @pharaoh.tailor-bootstrap + +Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — `workflows.yaml`. + +See [`skills/pharaoh-tailor-bootstrap/SKILL.md`](../../skills/pharaoh-tailor-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md new file mode 100644 index 0000000..b17d90c --- /dev/null +++ b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md @@ -0,0 +1,10 @@ +--- +description: Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. +handoffs: [] +--- + +# @pharaoh.tailor-code-grounding-filters + +Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. + +See [`skills/pharaoh-tailor-code-grounding-filters/SKILL.md`](../../skills/pharaoh-tailor-code-grounding-filters/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.toctree-emit.agent.md b/.github/agents/pharaoh.toctree-emit.agent.md new file mode 100644 index 0000000..3aa35f5 --- /dev/null +++ b/.github/agents/pharaoh.toctree-emit.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a `toctree` listing them. +handoffs: [] +--- + +# @pharaoh.toctree-emit + +Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a `toctree` listing them.
+ +See [`skills/pharaoh-toctree-emit/SKILL.md`](../../skills/pharaoh-toctree-emit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.use-case-diagram-draft.agent.md b/.github/agents/pharaoh.use-case-diagram-draft.agent.md new file mode 100644 index 0000000..59255eb --- /dev/null +++ b/.github/agents/pharaoh.use-case-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Draft a single use-case diagram for one feat — actors, use cases, system boundary. +handoffs: [pharaoh.diagram-review] +--- + +# @pharaoh.use-case-diagram-draft + +Draft a single use-case diagram for one feat — actors, use cases, system boundary. + +See [`skills/pharaoh-use-case-diagram-draft/SKILL.md`](../../skills/pharaoh-use-case-diagram-draft/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.write-plan.agent.md b/.github/agents/pharaoh.write-plan.agent.md new file mode 100644 index 0000000..934c18d --- /dev/null +++ b/.github/agents/pharaoh.write-plan.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when you have an intent (e.g. a feature to document) and need to compose an ordered, prerequisite-aware plan of atomic skill invocations. +handoffs: [] +--- + +# @pharaoh.write-plan + +Use when you have an intent (e.g. a feature to document) and need to compose an ordered, prerequisite-aware plan of atomic skill invocations. + +See [`skills/pharaoh-write-plan/SKILL.md`](../../skills/pharaoh-write-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/pharaoh.toml.example b/pharaoh.toml.example index d58b4b9..9bb8a21 100644 --- a/pharaoh.toml.example +++ b/pharaoh.toml.example @@ -14,11 +14,14 @@ pattern = "{TYPE}-{MODULE}-{NUMBER}" auto_increment = true [pharaoh.workflow] -# Require pharaoh:change before pharaoh:author +# Gate-enablement ladder — see skills/shared/gate-enablement.md for rationale. +# Bootstrap ships step 1 (require_verification) on by default; projects enable +# the other steps as the pre-work for each lands.
+# Require pharaoh:change before pharaoh:author (step 2 — needs pharaoh-change tailored) require_change_analysis = true -# Require pharaoh:verify before pharaoh:release +# Require pharaoh:verify before pharaoh:release (step 1 of the ladder — enabled by default) require_verification = true -# Require pharaoh:mece before pharaoh:release (optional) +# Require pharaoh:mece before pharaoh:release (step 3 — needs release-gate workflow) require_mece_on_release = false [pharaoh.traceability] diff --git a/skills/pharaoh-activity-diagram-draft/SKILL.md b/skills/pharaoh-activity-diagram-draft/SKILL.md new file mode 100644 index 0000000..df43b42 --- /dev/null +++ b/skills/pharaoh-activity-diagram-draft/SKILL.md @@ -0,0 +1,85 @@ +--- +name: pharaoh-activity-diagram-draft +description: Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. Typical ASPICE usage — SWE.3 Software Detailed Design. Renderer tailored via `pharaoh.toml`. Does NOT emit other diagram kinds. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-activity-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-activity-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.activity]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one activity diagram. 
Captures **control flow** within a single procedure: sequential actions, branching (decisions), forking (parallel activities), joining, and swimlanes (partitions) showing which actor/component performs which action. + +Typical ASPICE context: +- **SWE.3 Software Detailed Design**: algorithm breakdown per function. +- **SYS.3 System Architectural Design**: activity within a subsystem. + +Does NOT capture ordered inter-component message exchange (→ `pharaoh-sequence-diagram-draft`). Does NOT capture state lifecycle (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One procedure in → one diagram out. +- (b) Input: `{view_title: str, actions: list[ActionSpec], decisions: list[DecisionSpec], forks: list[ForkSpec], edges: list[EdgeSpec], initial: str, finals: list[str], swimlanes?: list[SwimlaneSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ActionSpec = {id: str, label: str, swimlane?: str}`, `DecisionSpec = {id: str, label: str, swimlane?: str}`, `ForkSpec = {id: str, kind: "fork"|"join", swimlane?: str}`, `EdgeSpec = {from: str, to: str, guard?: str, label?: str}`, `SwimlaneSpec = {id: str, label: str}`. Output: one RST directive block. +- (c) Reward: fixture — procedure `receive_can_frame` with actions [parse, validate, dispatch], one decision (valid?), two finals (accepted, rejected). Scorer: + 1. Output starts with renderer directive. + 2. Exactly one initial marker (`[*]` or equivalent) pointing to `initial`. + 3. Every action/decision/fork id appears as a node. + 4. Every edge renders with renderer-specific syntax; guards shown in `[...]`. + 5. Swimlanes (if any) group their members visually (Mermaid: no native swimlane, emit comment + `subgraph`; PlantUML: `|SwimlaneX|` partition). + 6. Every id in `finals` has an outgoing edge to `[*]`. + + Pass = all 6. +- (d) Reusable for any procedural spec: SW detailed design, system workflows, operator procedures. +- (e) One diagram per call. 
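The reward fixture in (c) can be sketched as an input payload. All values below are illustrative assumptions, not normative — in particular, the sketch assumes that ids listed in `finals` are legal edge endpoints alongside the declared actions/decisions/forks:

```json
{
  "view_title": "receive_can_frame",
  "actions": [
    {"id": "parse", "label": "parse CAN frame"},
    {"id": "validate", "label": "validate frame"},
    {"id": "dispatch", "label": "dispatch to handler"}
  ],
  "decisions": [{"id": "valid", "label": "valid?"}],
  "forks": [],
  "edges": [
    {"from": "parse", "to": "validate"},
    {"from": "validate", "to": "valid"},
    {"from": "valid", "to": "dispatch", "guard": "yes"},
    {"from": "valid", "to": "rejected", "guard": "no"},
    {"from": "dispatch", "to": "accepted"}
  ],
  "initial": "parse",
  "finals": ["accepted", "rejected"],
  "project_root": "/abs/project",
  "reporter_id": "activity-draft-example"
}
```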
+ +## Dangling edges + +FAIL on edge endpoint not in `actions ∪ decisions ∪ forks ∪ {initial}`. An activity diagram with a transition to an undeclared action is an incomplete procedure. + +## Output + +**PlantUML (preferred for swimlane support):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + |Driver| + start + :parse CAN frame; + if (valid?) then (yes) + |Dispatcher| + :dispatch to handler; + stop + else (no) + |Driver| + :log error; + end + endif + @enduml +``` + +**Mermaid (limited — no native swimlanes):** +```rst +.. mermaid:: + :caption: <view_title> + + flowchart TD + Start([Start]) --> Parse[parse CAN frame] + Parse --> Valid{valid?} + Valid -->|yes| Dispatch[dispatch to handler] + Valid -->|no| Log[log error] + Dispatch --> End([End]) + Log --> End +``` + +## Non-goals + +- No pin/action-parameter visualization — out of scope. +- No code-to-activity extraction — caller provides the structure; a future `pharaoh-activity-from-cfg` could infer from control-flow graphs. +- No interrupt/exception flows — model those as explicit edges if needed. diff --git a/skills/pharaoh-api-coverage-check/SKILL.md b/skills/pharaoh-api-coverage-check/SKILL.md new file mode 100644 index 0000000..3ae19e3 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/SKILL.md @@ -0,0 +1,175 @@ +--- +name: pharaoh-api-coverage-check +description: Use when verifying that a source file is covered by the need catalogue on two axes — (1) at least one CREQ declares the file as its `:source_doc:`, and (2) every project-defined exception class raised in the file is named by some CREQ's title or content. Exception classes not defined in the project source tree (stdlib, third-party deps) are reported as `external` and do not fail the axis. Classifies non-behavioral files (constants, type aliases, bare re-exports) as skipped. Language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md` (python / rust / typescript / go / c / cpp / java). Single mechanical structural check.
+--- + +# pharaoh-api-coverage-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant `api_coverage_clean`), from a pre-release CI job, or standalone when auditing whether the need catalogue has kept pace with the code. Reads one source file plus one `needs.json` and emits a binary verdict on two axes — file-level citation AND raise-site coverage. Non-behavioral files (constants, type aliases, bare re-exports) are skipped so they never fail the gate. + +This is the reverse coverage direction of `pharaoh-req-from-code`. The forward direction answers "for this file, which reqs should be drafted?". The reverse direction answers "does the catalogue acknowledge this file's existence AND every exception it raises?". Missing coverage here points at CREQs that were never authored, not at CREQs that are poorly written. + +Do NOT use to grade need prose quality — that is `pharaoh-req-review`. Do NOT use to verify that a CREQ's claims about the file are accurate — that is `pharaoh-req-code-grounding-check` (the forward fidelity check). Do NOT use to author or modify reqs (read-only). Python classification runs on an AST; other languages use regex approximations consistent with the shared public-symbol-patterns table. + +## Atomicity + +- (a) Indivisible: one source file + one `needs.json` in → one findings JSON out. No req drafting, no set-level analysis, no dispatch of other skills. +- (b) Input: `{source_file: str, needs_json_path: str, project_root: str | null, language: "auto" | "python" | "rust" | "typescript" | "go" | "c" | "cpp" | "java"}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-api-coverage-check/fixtures/` cover each verdict class and each supported language. 
Pass = each fixture's actual output matches `expected-output.json` modulo ordering of the `covered`, `uncovered`, and `external` arrays under `raise_site_coverage` (sorted ascending in the emitted output) and of `file_coverage.citing_creqs` (sorted ascending). +- (d) Reusable across projects — the language regex table is read from `skills/shared/public-symbol-patterns.md`, not inlined. No project-specific symbol names baked in. +- (e) Read-only. Does not modify the source file, `needs.json`, or any on-disk state. Running twice on identical inputs yields byte-identical output. + +## Input + +- `source_file`: absolute path to the source file under audit, OR a path relative to `project_root`. Extension is used for language inference when `language=auto`. +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {"<id>": {...}}}` shape or the versioned `{"versions": {"<version>": {"needs": {...}}}}` shape. Each need dict must carry at least `id`, `title`, and `content` (or a synonymous body field); the `source_doc` option is what the file-coverage axis reads. +- `project_root`: optional absolute path. Used for three things: (1) resolve `source_file` when relative, (2) resolve a need's `:source_doc:` value when relative, (3) scope the project-definition scan that distinguishes project-defined exception classes from external (stdlib / third-party) ones in raise-site coverage. When omitted, relative paths are resolved against the current working directory and the project-definition scan is skipped — every raised class is then treated as project-defined. +- `language`: one of `"auto"`, `"python"`, `"rust"`, `"typescript"`, `"go"`, `"c"`, `"cpp"`, `"java"`. Default `"auto"`. When `"auto"`, the skill resolves the language from the source-file extension via the globs column of `skills/shared/public-symbol-patterns.md`.
When an explicit language is given, extension is ignored — this is the dogfood escape hatch for literate source (e.g. a `.txt` with Python snippets). + +Edge cases: +- `source_file` missing or unreadable → `overall: "fail"`, blocker `"source_file unresolved: <path>"`. +- `needs_json_path` missing or unparseable → `overall: "fail"`, blocker `"needs.json unresolved: <path>"`. +- `language="auto"` and extension matches no row in the shared table → `overall: "fail"`, blocker `"unsupported language: <ext>"`, `language: "unknown"`. + +## Output + +```json +{ + "source_file": "/abs/path/src/module/client.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": ["CREQ_inventory_client", "CREQ_inventory_load"] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 1, + "covered": ["InventoryError"], + "uncovered": [], + "external": ["ValueError"], + "passed": true + }, + "overall": "pass", + "blockers": [] +} +``` + +Fields (in canonical order): +- `source_file`: echo of the input path (as supplied — absolute or `project_root`-relative). +- `language`: resolved language string (one of the seven supported names) or `"unknown"` on unsupported-language failure. +- `classification`: `"behavioral"` or `"non-behavioral"`. +- `file_coverage.passed`: `true` iff ≥1 CREQ in the catalogue has `:source_doc:` resolving to this file. `null` when `classification == "non-behavioral"`. +- `file_coverage.citing_creqs`: list of CREQ IDs whose `:source_doc:` resolves to this file, sorted ascending. Empty list when no CREQ cites the file; still emitted for `non-behavioral` classification (diagnostic). +- `raise_site_coverage.total`: count of distinct exception class names extracted from `raise` / `throw` sites in the file. +- `raise_site_coverage.project_defined`: count of those names that resolve to a class / struct / enum defined somewhere under `project_root` for the file's language (see `## Project-definition scan`).
`external` = `total - project_defined`. +- `raise_site_coverage.covered`: list of project-defined raised class names that appear (case-sensitive substring) in the title or content of some CREQ anywhere in the catalogue (not scoped to citing CREQs), sorted ascending. +- `raise_site_coverage.uncovered`: list of project-defined raised class names absent from every CREQ's title and content, sorted ascending. +- `raise_site_coverage.external`: list of raised class names that do not resolve to any project-local class definition — stdlib exceptions, third-party dep types. Diagnostic only; does not contribute to the pass/fail decision. Sorted ascending. +- `raise_site_coverage.passed`: `true` iff `uncovered` is empty. `null` when `classification == "non-behavioral"`. +- `overall`: + - `"pass"` — `classification == "behavioral"` AND `file_coverage.passed` AND `raise_site_coverage.passed`. + - `"fail"` — `classification == "behavioral"` AND either sub-axis is false. + - `"skipped"` — `classification == "non-behavioral"`. +- `blockers`: list of blocker strings (input errors — unreadable source, unreadable needs.json, unsupported language). Always present; empty list on pass / skipped / clean fail. + +On input errors the shape still carries every field. `classification` is `"non-behavioral"`, `file_coverage.citing_creqs` is `[]`, `raise_site_coverage` carries `total: 0, project_defined: 0, covered: [], uncovered: [], external: []` with `passed: null`, and `blockers` is populated with the error strings. + +## Path resolution + +- `source_file` resolution: absolute path used verbatim; relative path joined with `project_root` (or CWD if unset), then normalised via `os.path.normpath` (resolve `./` and `../`, collapse double slashes). Comparison is case-sensitive on POSIX. On Windows drive-letter paths the drive letter and path separators are normalised case-insensitively. 
+- `source_doc` resolution per need: the need's `source_doc` option value is resolved the same way (absolute verbatim, relative joined with `project_root`). A need cites the file iff its resolved `source_doc` equals the resolved `source_file`. + +## Process + +### Step 1: Resolve the language + +If `language == "auto"`, read the globs column of `skills/shared/public-symbol-patterns.md` and find the first row whose glob list contains the source file's extension. If no row matches, emit an error output with `language: "unknown"`, `classification: "non-behavioral"`, `overall: "fail"`, `blockers: ["unsupported language: <ext>"]`, and stop. If an explicit language is given, use it verbatim and skip extension resolution. + +### Step 2: Classify the file + +A file is `behavioral` iff ANY of the following holds: + +1. **Non-trivial function body**: ≥1 function / async function / method whose body contains more than 2 top-level statements. A body of `pass`, `...`, a single return, a single expression, or a single delegation call does not qualify. Docstrings are statements but do not count (strip them before measuring length). +2. **Exception surface**: ≥1 `raise X(...)` statement anywhere in the file. For languages whose exception syntax is `throw`, the equivalent `throw X(...)` / `throw new X(...)` counts. +3. **Method-rich class**: ≥1 class whose body contains ≥2 method definitions (public or private — the count is structural, not visibility-scoped). + +Otherwise the file is `non-behavioral`: constants, type aliases, bare re-exports, empty `__init__.py` forwarders. Emit `classification: "non-behavioral"`, `overall: "skipped"`, both sub-axes with empty content and `passed: null`, and stop. + +**Python**: parse with `ast`.
Count `FunctionDef`/`AsyncFunctionDef` body length after stripping the module-level / function-level docstring (the first statement if it is a bare `Expr(Constant(str))`); check for `Raise` nodes anywhere; count `FunctionDef`/`AsyncFunctionDef` children inside each `ClassDef`. + +**Other languages**: regex approximations colocated in the language table below. + +### Step 3: File coverage + +Load `needs.json`, flatten the needs map, and collect every CREQ whose `source_doc` option resolves (per `## Path resolution`) to the input `source_file`. `file_coverage.passed` is `true` iff the collected list is non-empty; `file_coverage.citing_creqs` is the sorted list of their IDs. + +### Step 4: Raise-site coverage + +Extract every exception class name `X` from every `raise X(...)` / `throw X(...)` / `throw new X(...)` site across the file (de-duplicate). + +For each `X`, classify as **project-defined** or **external** per `## Project-definition scan` below. External classes go into the `external` diagnostic list and are excluded from pass/fail evaluation. + +For each project-defined `X`, check whether `X` appears (case-sensitive substring) in the `title` OR `content` (or synonymous body field) of any CREQ anywhere in the catalogue — NOT scoped to citing CREQs. `raise_site_coverage.passed` is `true` iff every project-defined `X` is covered. + +### Step 5: Emit the findings JSON + +Populate every field per the `## Output` shape. Sort `file_coverage.citing_creqs`, `raise_site_coverage.covered`, `raise_site_coverage.uncovered`, and `raise_site_coverage.external` ascending. Set `overall` per the rules in the output-field description. + +## Project-definition scan + +When `project_root` is provided, the skill walks the tree and collects every class-like definition name matching the input file's language: + +- Python: `class X` top-level or nested. +- Rust: `struct X`, `enum X`. 
+- TypeScript: `class X`, `class X extends Y`, `export class X`, `export default class X`, plus `interface X` / `type X = ...` shapes used as error types. +- Go: `type X struct`, `type X interface`. +- Java: `class X`, `interface X`, nested declarations. +- C: `struct X` (C has no exception classes in practice — typically empty set). +- C++: `class X`, `struct X`. + +Only files whose extension matches the language's glob (per the table) are scanned. The regex uses the `public symbol regex` column from `skills/shared/public-symbol-patterns.md` (named capture `name`) filtered to class-like kinds. + +A raised class `X` is **project-defined** iff its name appears in the collected set. Otherwise `X` is **external** — stdlib (`ValueError`, `RuntimeError`, `java.lang.IllegalArgumentException`, `std::runtime_error`) or third-party dep types. + +When `project_root` is omitted, the scan is skipped and every raised class is treated as project-defined (the `external` list is empty). This keeps the skill usable in single-file contexts but the external-filter value is only available when `project_root` is supplied. + +## Language table + +| language | extension globs | classifier notes | raise-site regex | +|------------|------------------------------------|------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------| +| python | `*.py` | Parse with `ast`. Body length computed after stripping the leading docstring. Class methods counted via `FunctionDef`/`AsyncFunctionDef` children of `ClassDef`. | `raise\s+(?P<name>[A-Z][A-Za-z0-9_]+)\s*\(` | +| rust | `*.rs` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting (trailing expressions count as one statement). Methods in an `impl` block count as class methods.
| n/a — Rust uses `Result` returns | +| typescript | `*.ts`, `*.tsx` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. Class methods counted inside `class` / `export class` blocks. | `throw\s+(?:new\s+)?(?P<name>[A-Z][A-Za-z0-9_]+)\s*\(` | +| go | `*.go` | Function-body statement count via brace-delimited body plus `;`-or-newline-terminated statement splitting. Interface methods and struct methods counted toward the method-rich-class rule (Go has no `class` keyword — `type T struct` plus ≥2 methods declared as `func (r *T)` qualifies). | n/a — Go uses `error` return values | +| java | `*.java` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. Methods counted inside `class` / `interface` blocks. | `throw\s+(?:new\s+)?(?P<name>[A-Z][A-Za-z0-9_]+)\s*\(` | +| c | `*.c`, `*.h` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. C has no classes — method-rich-class rule vacuously unsatisfied. | n/a — C uses integer return codes | +| cpp | `*.cpp`, `*.hpp`, `*.cc`, `*.h` | Function-body statement count as in C. Methods counted inside `class` / `struct` blocks. | `throw\s+(?:new\s+)?(?P<name>[A-Z][A-Za-z0-9_]+)\s*\(` | + +The regex-based classifiers for non-Python languages share the accuracy ceiling of `shared/public-symbol-patterns.md` — known false positives in comments/strings, conservative over-reporting. + +## Detection rule + +One mechanical check, implemented as the five-step process above. No LLM judgement. + +## Failure modes + +- **Regex false positives inside comments and strings (non-Python).** A `// throw new FooError(...)` line in a block comment at column 0 is extracted as a raise site. Python avoids this because it uses the AST.
+- **Case-sensitive substring matching for raise-site coverage.** Deliberate: sphinx-needs ids and class names are case-sensitive in practice, and a CREQ that describes `refreshTokenError` does not cover `RefreshTokenError`. Projects whose naming convention differs between code and prose must normalise at authoring time. +- **Raise-site extraction is shallow.** `raise Exc(...)` / `throw new Exc(...)` literals are detected. A function that raises by calling a helper that raises is not counted here — the question is "does this source file raise this class?", not "does execution of this file ever produce this class?". +- **`source_doc` must be declared.** Needs without the option cannot be collected by step 3, so a file with no citing needs is marked uncovered even if its behavior is described in free-floating prose elsewhere. That is the design — a CREQ that does not declare which file it describes cannot honestly claim to cover that file. +- **Raise-site coverage is catalogue-wide.** A CREQ for `bar.py` that names `InventoryError` in its body does cover the raise of `InventoryError` in `foo.py`. This is intentional: exception types are shared surface that multiple CREQs may reference. Projects that want scoped coverage should narrow via `source_doc` filters downstream. + +## Tailoring extension point + +None. The language regex table is shared (`skills/shared/public-symbol-patterns.md`) — a project that needs a new language supports it by adding a row there plus a corresponding entry in this skill's language table, which benefits both `pharaoh-req-from-code` and this skill. + +## Composition + +Role: `atom-check`. + +Called from `pharaoh-quality-gate` under the invariant key `api_coverage_clean` (pass requirement: `overall ∈ {"pass", "skipped"}`). Also callable standalone from any CI job that already knows which source files and which `needs.json` to feed it. Never dispatches other skills. Never modifies the source file or the need catalogue. 
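For concreteness, the Python branch of the Step 2 classifier (`ast` parse, docstring-stripped body lengths, `Raise` detection, method counting) can be sketched as follows. This is a minimal illustration of the three structural rules, not the reference implementation — function and variable names are this sketch's own:

```python
import ast


def classify(source: str) -> str:
    """Sketch of the Step 2 behavioral / non-behavioral split (Python path)."""
    tree = ast.parse(source)
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    def body_without_docstring(node):
        body = list(node.body)
        # The docstring is the first statement when it is a bare string Expr.
        if body and isinstance(body[0], ast.Expr) \
                and isinstance(body[0].value, ast.Constant) \
                and isinstance(body[0].value.value, str):
            body = body[1:]
        return body

    funcs = [n for n in ast.walk(tree) if isinstance(n, func_types)]

    # Rule 1: any function body with more than 2 top-level statements.
    if any(len(body_without_docstring(f)) > 2 for f in funcs):
        return "behavioral"
    # Rule 2: any raise statement anywhere in the file.
    if any(isinstance(n, ast.Raise) for n in ast.walk(tree)):
        return "behavioral"
    # Rule 3: any class with >= 2 method definitions (visibility ignored).
    for cls in (n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)):
        if sum(isinstance(n, func_types) for n in cls.body) >= 2:
            return "behavioral"
    return "non-behavioral"
```

A constants-only module falls through all three rules and is classified `non-behavioral`, so the quality gate skips it.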
+ +Complements `pharaoh-req-code-grounding-check`: that skill runs the forward direction (does the CREQ's cited exception actually get raised in the file?), this skill runs the reverse direction (does every raised exception have a covering CREQ?). The two atoms share the language-regex table but no other code — they answer genuinely different questions and fail on different inputs. diff --git a/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/README.md new file mode 100644 index 0000000..ad04fd6 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/README.md @@ -0,0 +1,3 @@ +# c-fully-covered + +C classifier-path fixture. `save_catalog` has three top-level statements in its body → non-trivial-function-body rule → `behavioral`. Three needs cite the file via `:source_doc:`. C has no raise-site regex (errors travel via integer return codes), so the raise-site axis is vacuously satisfied with `total: 0`. Expected verdict: `overall: "pass"`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/expected-output.json new file mode 100644 index 0000000..c6f2ac8 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/expected-output.json @@ -0,0 +1,23 @@ +{ + "source_file": "input-source.c", + "language": "c", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_c_count_items", + "CREQ_c_load_catalog", + "CREQ_c_save_catalog" + ] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-needs.json new file mode 100644 index 0000000..8979418 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-needs.json @@ -0,0 +1,25 @@ +{ + "needs": { + "CREQ_c_load_catalog": { + "id": "CREQ_c_load_catalog", + "type": "comp_req", + "title": "load_catalog reads inventory", + "content": "The function load_catalog shall return 0 when the catalogue at path was loaded successfully.", + "source_doc": "input-source.c" + }, + "CREQ_c_save_catalog": { + "id": "CREQ_c_save_catalog", + "type": "comp_req", + "title": "save_catalog persists inventory", + "content": "The function save_catalog shall return 0 when the catalogue at path was written successfully.", + "source_doc": "input-source.c" + }, + "CREQ_c_count_items": { + "id": "CREQ_c_count_items", + "type": "comp_req", + "title": "count_items counts records", + "content": "The function count_items shall return the number of records available at the given path.", + "source_doc": "input-source.c" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-source.c 
b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-source.c new file mode 100644 index 0000000..5c30d8d --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/c-fully-covered/input-source.c @@ -0,0 +1,26 @@ +/* Inventory module — demonstrates the c public-symbol regex row. */ +#include <stddef.h> + +int load_catalog(const char *path) { + (void)path; + return 0; +} + +int save_catalog(const char *path, int flags) { + (void)path; + (void)flags; + return 0; +} + +size_t count_items(const char *path) { + (void)path; + return 0; +} + +static int helper_unused(int x) { + return x; +} + +int _private_leaking(int x) { + return x; +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/README.md new file mode 100644 index 0000000..3ab5ed6 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/README.md @@ -0,0 +1,3 @@ +# cpp-fully-covered + +C++ classifier-path fixture. Two free functions raise via `throw` → exception-surface rule → `behavioral`. Four needs cite the file via `:source_doc:`. Two throw sites expose `CatalogError` and `InvalidPathError`; both names appear in CREQ content. Expected verdict: `overall: "pass"`.
diff --git a/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/expected-output.json new file mode 100644 index 0000000..3b1db22 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/expected-output.json @@ -0,0 +1,27 @@ +{ + "source_file": "input-source.cpp", + "language": "cpp", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_cpp_catalog", + "CREQ_cpp_catalog_config", + "CREQ_cpp_load_catalog", + "CREQ_cpp_save_catalog" + ] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 2, + "covered": [ + "CatalogError", + "InvalidPathError" + ], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-needs.json new file mode 100644 index 0000000..75b8c40 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-needs.json @@ -0,0 +1,32 @@ +{ + "needs": { + "CREQ_cpp_catalog": { + "id": "CREQ_cpp_catalog", + "type": "comp_req", + "title": "Catalog class holds metadata", + "content": "The class Catalog shall expose a public name field of type std::string.", + "source_doc": "input-source.cpp" + }, + "CREQ_cpp_catalog_config": { + "id": "CREQ_cpp_catalog_config", + "type": "comp_req", + "title": "CatalogConfig carries tuning knobs", + "content": "The struct CatalogConfig shall declare max_items as an integer.", + "source_doc": "input-source.cpp" + }, + "CREQ_cpp_load_catalog": { + "id": "CREQ_cpp_load_catalog", + "type": "comp_req", + "title": "load_catalog reads catalogue", + "content": "The function load_catalog shall throw CatalogError when the path argument is empty.", + "source_doc": "input-source.cpp" + }, + "CREQ_cpp_save_catalog": { + "id": 
"CREQ_cpp_save_catalog", + "type": "comp_req", + "title": "save_catalog writes catalogue", + "content": "The function save_catalog shall throw InvalidPathError when the path argument is empty.", + "source_doc": "input-source.cpp" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-source.cpp b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-source.cpp new file mode 100644 index 0000000..3edc3fe --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/cpp-fully-covered/input-source.cpp @@ -0,0 +1,26 @@ +// C++ module demonstrating the public-symbol regex rows. +#include +#include + +class Catalog { +public: + std::string name; +}; + +struct CatalogConfig { + int max_items; +}; + +int load_catalog(const std::string& path) { + if (path.empty()) { + throw CatalogError("path required"); + } + return 0; +} + +int save_catalog(const std::string& path) { + if (path.empty()) { + throw InvalidPathError("path required"); + } + return 0; +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/README.md new file mode 100644 index 0000000..9dfb69e --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/README.md @@ -0,0 +1,3 @@ +# go-fully-covered + +Go classifier-path fixture. `SaveCatalog` has three top-level statements in its body → non-trivial-function-body rule → `behavioral`. Four needs cite the file via `:source_doc:`. Go has no raise-site regex (errors travel via `error` return values), so the raise-site axis is vacuously satisfied with `total: 0`. Expected verdict: `overall: "pass"`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/expected-output.json new file mode 100644 index 0000000..c1dd9ec --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/expected-output.json @@ -0,0 +1,24 @@ +{ + "source_file": "input-source.go", + "language": "go", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_go_catalog", + "CREQ_go_load_catalog", + "CREQ_go_reader", + "CREQ_go_save_catalog" + ] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-needs.json new file mode 100644 index 0000000..0e7f021 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-needs.json @@ -0,0 +1,32 @@ +{ + "needs": { + "CREQ_go_catalog": { + "id": "CREQ_go_catalog", + "type": "comp_req", + "title": "Catalog struct definition", + "content": "The type Catalog shall declare a Name field of type string.", + "source_doc": "input-source.go" + }, + "CREQ_go_reader": { + "id": "CREQ_go_reader", + "type": "comp_req", + "title": "Reader interface", + "content": "The interface Reader shall declare a single Read() method returning a string.", + "source_doc": "input-source.go" + }, + "CREQ_go_load_catalog": { + "id": "CREQ_go_load_catalog", + "type": "comp_req", + "title": "LoadCatalog factory", + "content": "The function LoadCatalog shall return a Catalog populated from the given path.", + "source_doc": "input-source.go" + }, + "CREQ_go_save_catalog": { + "id": "CREQ_go_save_catalog", + "type": "comp_req", + "title": "SaveCatalog persistence", + "content": "The function SaveCatalog shall return 
true when the write succeeded.", + "source_doc": "input-source.go" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-source.go b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-source.go new file mode 100644 index 0000000..be48d3d --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/go-fully-covered/input-source.go @@ -0,0 +1,28 @@ +// Package inventory demonstrates the go public-symbol regex row. +package inventory + +type Catalog struct { + Name string +} + +type Reader interface { + Read() string +} + +func LoadCatalog(path string) *Catalog { + return &Catalog{Name: path} +} + +func SaveCatalog(cat *Catalog, path string) bool { + _ = cat + _ = path + return true +} + +func internalHelper() int { + return 42 +} + +type privateState struct { + token string +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/README.md new file mode 100644 index 0000000..137bb2b --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/README.md @@ -0,0 +1,3 @@ +# java-fully-covered + +Java classifier-path fixture. `CatalogService` declares two methods (`loadCatalog`, `saveCatalog`) → method-rich-class rule → `behavioral`. Three needs cite the file via `:source_doc:`. Two `throw new (...)` sites expose `CatalogError` and `InvalidPathError`; both names appear in CREQ content. Expected verdict: `overall: "pass"`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/expected-output.json new file mode 100644 index 0000000..f2ce998 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/expected-output.json @@ -0,0 +1,26 @@ +{ + "source_file": "input-source.java", + "language": "java", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_java_catalog", + "CREQ_java_catalog_service", + "CREQ_java_reader" + ] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 2, + "covered": [ + "CatalogError", + "InvalidPathError" + ], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-needs.json new file mode 100644 index 0000000..4f53e4d --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-needs.json @@ -0,0 +1,25 @@ +{ + "needs": { + "CREQ_java_catalog": { + "id": "CREQ_java_catalog", + "type": "comp_req", + "title": "Catalog class definition", + "content": "The class Catalog shall declare a public name field of type String.", + "source_doc": "input-source.java" + }, + "CREQ_java_reader": { + "id": "CREQ_java_reader", + "type": "comp_req", + "title": "Reader interface", + "content": "The interface Reader shall declare a single read() method returning a String.", + "source_doc": "input-source.java" + }, + "CREQ_java_catalog_service": { + "id": "CREQ_java_catalog_service", + "type": "comp_req", + "title": "CatalogService orchestrates persistence", + "content": "The class CatalogService shall expose loadCatalog() and saveCatalog(). 
loadCatalog shall throw CatalogError when path is null; saveCatalog shall throw InvalidPathError when path is null.", + "source_doc": "input-source.java" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-source.java b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-source.java new file mode 100644 index 0000000..25015b1 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/java-fully-covered/input-source.java @@ -0,0 +1,29 @@ +package com.example.inventory; + +public class Catalog { + public String name; +} + +public interface Reader { + String read(); +} + +public class CatalogService { + public int loadCatalog(String path) { + if (path == null) { + throw new CatalogError("path required"); + } + return 0; + } + + public int saveCatalog(String path) { + if (path == null) { + throw new InvalidPathError("path required"); + } + return 0; + } +} + +class InternalHelper { + int compute(int x) { return x; } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/language-override/README.md b/skills/pharaoh-api-coverage-check/fixtures/language-override/README.md new file mode 100644 index 0000000..57e84ac --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/language-override/README.md @@ -0,0 +1,3 @@ +# language-override + +Python source held in a `.txt` file. With `language: "auto"` the extension-based resolver would fail (no row matches `.txt`), but the caller supplies `language: "python"` via `input-meta.yaml` to bypass extension resolution. The Python AST classifier then applies — the `raise GreetError(...)` statement satisfies the exception-surface rule → `behavioral`. Three needs cite the file and `GreetError` appears in CREQ content. Expected verdict: `overall: "pass"` with `language: "python"` in the output. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/language-override/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/language-override/expected-output.json new file mode 100644 index 0000000..cdb14df --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/language-override/expected-output.json @@ -0,0 +1,25 @@ +{ + "source_file": "input-source.txt", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_greet", + "CREQ_greet_error", + "CREQ_greeter" + ] + }, + "raise_site_coverage": { + "total": 1, + "project_defined": 1, + "covered": [ + "GreetError" + ], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/language-override/input-meta.yaml b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-meta.yaml new file mode 100644 index 0000000..d1ad0ae --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-meta.yaml @@ -0,0 +1 @@ +language: python diff --git a/skills/pharaoh-api-coverage-check/fixtures/language-override/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-needs.json new file mode 100644 index 0000000..7abca88 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-needs.json @@ -0,0 +1,25 @@ +{ + "needs": { + "CREQ_greet": { + "id": "CREQ_greet", + "type": "comp_req", + "title": "greet produces a greeting", + "content": "The function greet shall return a greeting for the given name and raise GreetError when the name is empty.", + "source_doc": "input-source.txt" + }, + "CREQ_greeter": { + "id": "CREQ_greeter", + "type": "comp_req", + "title": "Greeter class", + "content": "The class Greeter shall expose a say() method returning a fixed greeting string.", + "source_doc": "input-source.txt" + }, + "CREQ_greet_error": { + "id": "CREQ_greet_error", 
+ "type": "comp_req", + "title": "GreetError class", + "content": "The class GreetError shall be raised when greet is called with an empty name.", + "source_doc": "input-source.txt" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/language-override/input-source.txt b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-source.txt new file mode 100644 index 0000000..40f37ef --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/language-override/input-source.txt @@ -0,0 +1,16 @@ +"""Literate-style Python kept in a .txt file for dogfood purposes.""" + + +def greet(name): + if not name: + raise GreetError("name required") + return "hello " + name + + +class Greeter: + def say(self): + return "hi" + + +class GreetError(Exception): + pass diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/README.md b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/README.md new file mode 100644 index 0000000..cb11487 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/README.md @@ -0,0 +1,13 @@ +# python-external-exception + +Exercises the project-definition scan. `input-source.py` raises two +classes: `CatalogError` (defined in `errors.py` within the fixture's +`project_root` scope) and `ValueError` (Python stdlib — not defined +anywhere under `project_root`). + +The CREQ catalogue names `CatalogError` but not `ValueError`. The +fixture passes because `project_defined` is `1` (CatalogError), +`covered` is `[CatalogError]`, `uncovered` is `[]`, and `ValueError` +is surfaced in `external` as diagnostic-only. + +Invocation: `source_file=input-source.py, needs_json_path=input-needs.json, project_root=`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/errors.py b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/errors.py new file mode 100644 index 0000000..ea050a9 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/errors.py @@ -0,0 +1,5 @@ +"""Project-defined exception types.""" + + +class CatalogError(Exception): + """Raised when a catalog row fails schema validation.""" diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/expected-output.json new file mode 100644 index 0000000..ff19a6e --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/expected-output.json @@ -0,0 +1,25 @@ +{ + "source_file": "input-source.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_load_entry" + ] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 1, + "covered": [ + "CatalogError" + ], + "uncovered": [], + "external": [ + "ValueError" + ], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-needs.json new file mode 100644 index 0000000..8a6940e --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-needs.json @@ -0,0 +1,15 @@ +{ + "versions": { + "1.0": { + "needs": { + "CREQ_load_entry": { + "id": "CREQ_load_entry", + "type": "comp_req", + "title": "Validate catalog row on load", + "content": "The Catalog Loader shall raise CatalogError when a catalog row is missing the id field.", + "source_doc": "input-source.py" + } + } + } + } +} diff --git 
a/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-source.py b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-source.py new file mode 100644 index 0000000..a4b774e --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-external-exception/input-source.py @@ -0,0 +1,14 @@ +"""Client module that raises both a project-defined and a stdlib exception.""" + +from errors import CatalogError + + +def load_entry(row: dict) -> dict: + """Validate and load one catalog entry.""" + if not row: + raise ValueError("empty row") + if "id" not in row: + raise CatalogError("row missing id") + if len(row["id"]) < 1: + raise ValueError("empty id") + return {"id": row["id"], "data": row.get("data")} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/README.md b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/README.md new file mode 100644 index 0000000..690e0ed --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/README.md @@ -0,0 +1,3 @@ +# python-file-not-cited + +Behavioral file (class with three methods) with zero needs citing it via `:source_doc:`. Raise-site axis is vacuously satisfied (no raises in the file). File-coverage axis fails — no CREQ acknowledges the file's existence. Expected verdict: `overall: "fail"`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/expected-output.json new file mode 100644 index 0000000..07e6e8c --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/expected-output.json @@ -0,0 +1,19 @@ +{ + "source_file": "input-source.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": false, + "citing_creqs": [] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "fail", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-needs.json new file mode 100644 index 0000000..33bb358 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-needs.json @@ -0,0 +1,11 @@ +{ + "needs": { + "CREQ_unrelated_client": { + "id": "CREQ_unrelated_client", + "type": "comp_req", + "title": "Unrelated client", + "content": "The module shall expose a session handle against the upstream service.", + "source_doc": "other_module.py" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-source.py b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-source.py new file mode 100644 index 0000000..ca5795a --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-file-not-cited/input-source.py @@ -0,0 +1,21 @@ +"""Cache backend module — behavioral but absent from the catalogue.""" + + +class CacheBackend: + def get(self, key): + value = self._storage.get(key) + if value is None: + return None + return value + + def set(self, key, value): + self._storage[key] = value + return True + + def __init__(self): + self._storage = {} + + +def flush_cache(backend): + 
backend._storage.clear() + return True diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/README.md new file mode 100644 index 0000000..835302a --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/README.md @@ -0,0 +1,3 @@ +# python-fully-covered + +Canonical Python happy path. The source file raises `InventoryError` (behavioral via the exception-surface rule) and declares `class InventoryClient` with two methods (behavioral via the method-rich-class rule). Four needs cite the file via `:source_doc:` and the body of one need names `InventoryError`. Expected verdict: `overall: "pass"`. diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/expected-output.json new file mode 100644 index 0000000..5829fef --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/expected-output.json @@ -0,0 +1,26 @@ +{ + "source_file": "input-source.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_inventory_client", + "CREQ_inventory_error", + "CREQ_inventory_load", + "CREQ_inventory_save" + ] + }, + "raise_site_coverage": { + "total": 1, + "project_defined": 1, + "covered": [ + "InventoryError" + ], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-needs.json new file mode 100644 index 0000000..ca9136c --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-needs.json @@ -0,0 +1,32 @@ +{ + "needs": { + "CREQ_inventory_error": { + "id": "CREQ_inventory_error", + "type": "comp_req", + "title": 
"InventoryError hierarchy", + "content": "The module shall expose InventoryError as the base class for all inventory-related failures.", + "source_doc": "input-source.py" + }, + "CREQ_inventory_load": { + "id": "CREQ_inventory_load", + "type": "comp_req", + "title": "load_items reads inventory from disk", + "content": "The function load_items shall return a list of records parsed from the given path.", + "source_doc": "input-source.py" + }, + "CREQ_inventory_save": { + "id": "CREQ_inventory_save", + "type": "comp_req", + "title": "save_items persists inventory", + "content": "The function save_items shall write the given items to the given path and return True on success.", + "source_doc": "input-source.py" + }, + "CREQ_inventory_client": { + "id": "CREQ_inventory_client", + "type": "comp_req", + "title": "InventoryClient connects to the backend", + "content": "The class InventoryClient shall expose connect() for backend session establishment.", + "source_doc": "input-source.py" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-source.py b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-source.py new file mode 100644 index 0000000..9a0ff3d --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-fully-covered/input-source.py @@ -0,0 +1,26 @@ +"""Inventory client module.""" + + +class InventoryError(Exception): + """Base class for inventory errors.""" + + +def load_items(path): + if not path: + raise InventoryError("path required") + return [] + + +def save_items(path, items): + if items is None: + raise InventoryError("items required") + return True + + +class InventoryClient: + def connect(self): + return True + + def _reset(self): + # private helper — not part of the public surface + return None diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/README.md b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/README.md new file 
mode 100644 index 0000000..64ab1c7 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/README.md @@ -0,0 +1,3 @@ +# python-non-behavioral-reexport + +Re-export-only `__init__.py`: imports, module-level constants, and an `__all__` list. No function bodies, no raises, no classes. Classifier output is `non-behavioral`; both sub-axes are emitted with `passed: null` (not applicable). Expected verdict: `overall: "skipped"` — the file never fails the coverage gate because there is no behavior for the catalogue to cover. diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/expected-output.json new file mode 100644 index 0000000..8478ce9 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/expected-output.json @@ -0,0 +1,19 @@ +{ + "source_file": "input-source.py", + "language": "python", + "classification": "non-behavioral", + "file_coverage": { + "passed": null, + "citing_creqs": [] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": null + }, + "overall": "skipped", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-needs.json new file mode 100644 index 0000000..344b938 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-needs.json @@ -0,0 +1,11 @@ +{ + "needs": { + "CREQ_inventory_client": { + "id": "CREQ_inventory_client", + "type": "comp_req", + "title": "InventoryClient connects to the backend", + "content": "The class InventoryClient shall expose connect() for backend session establishment.", + "source_doc": "client.py" + } + } +} diff --git 
a/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-source.py b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-source.py new file mode 100644 index 0000000..615fff1 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-non-behavioral-reexport/input-source.py @@ -0,0 +1,21 @@ +"""Package __init__ that re-exports symbols from submodules. + +Holds no behavior of its own — every name below is an import or a +module-level constant. The file should classify as non-behavioral and +be skipped by the coverage gate. +""" +from .client import InventoryClient +from .errors import InventoryError +from .loader import load_items, save_items + +DEFAULT_TIMEOUT = 30 +SUPPORTED_FORMATS = ("csv", "json", "parquet") + +__all__ = [ + "InventoryClient", + "InventoryError", + "load_items", + "save_items", + "DEFAULT_TIMEOUT", + "SUPPORTED_FORMATS", +] diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/README.md b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/README.md new file mode 100644 index 0000000..6712f3d --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/README.md @@ -0,0 +1,3 @@ +# python-uncovered-raises + +Behavioral file (raises four distinct exception classes) cited by five needs. Only one raised class (`JamaAuthError`) appears in any CREQ's title or content; the other three (`JamaArtifactTypeError`, `JamaValueMapError`, `JamaSkippedValueError`) are absent from the catalogue. File coverage passes; raise-site coverage fails. Expected verdict: `overall: "fail"` with the three uncovered classes listed. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/expected-output.json new file mode 100644 index 0000000..fee02a7 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/expected-output.json @@ -0,0 +1,31 @@ +{ + "source_file": "input-source.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_jama_authenticate", + "CREQ_jama_client", + "CREQ_jama_fetch_artifact", + "CREQ_jama_fetch_value_map", + "CREQ_jama_skip_artifact" + ] + }, + "raise_site_coverage": { + "total": 4, + "project_defined": 4, + "covered": [ + "JamaAuthError" + ], + "uncovered": [ + "JamaArtifactTypeError", + "JamaSkippedValueError", + "JamaValueMapError" + ], + "external": [], + "passed": false + }, + "overall": "fail", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-needs.json new file mode 100644 index 0000000..c10d500 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-needs.json @@ -0,0 +1,39 @@ +{ + "needs": { + "CREQ_jama_authenticate": { + "id": "CREQ_jama_authenticate", + "type": "comp_req", + "title": "authenticate establishes a Jama session", + "content": "The function authenticate shall raise JamaAuthError when the user argument is empty.", + "source_doc": "input-source.py" + }, + "CREQ_jama_fetch_artifact": { + "id": "CREQ_jama_fetch_artifact", + "type": "comp_req", + "title": "fetch_artifact retrieves an artifact", + "content": "The function fetch_artifact shall return the record for the given id.", + "source_doc": "input-source.py" + }, + "CREQ_jama_fetch_value_map": { + "id": "CREQ_jama_fetch_value_map", + "type": "comp_req", + "title": "fetch_value_map retrieves a value 
map", + "content": "The function fetch_value_map shall return the value-map record for the given map id.", + "source_doc": "input-source.py" + }, + "CREQ_jama_skip_artifact": { + "id": "CREQ_jama_skip_artifact", + "type": "comp_req", + "title": "skip_artifact flags an artifact as skipped", + "content": "The function skip_artifact shall signal that the given artifact is intentionally not processed.", + "source_doc": "input-source.py" + }, + "CREQ_jama_client": { + "id": "CREQ_jama_client", + "type": "comp_req", + "title": "JamaClient API handle", + "content": "The class JamaClient shall provide a call() method for arbitrary endpoint access.", + "source_doc": "input-source.py" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-source.py b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-source.py new file mode 100644 index 0000000..803f8e4 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/python-uncovered-raises/input-source.py @@ -0,0 +1,38 @@ +"""Jama API client — demonstrates orphaned raise-site exceptions. + +Exception classes are defined elsewhere (not in this file). They are imported +here and raised — so raise-site extraction picks up the class names, but they +are NOT in this file's public-symbol surface. 
+""" +from jama.exceptions import ( + JamaArtifactTypeError, + JamaValueMapError, + JamaSkippedValueError, +) + + +def authenticate(user, token): + if not user: + raise JamaAuthError("user required") + return token + + +def fetch_artifact(artifact_id): + if artifact_id is None: + raise JamaArtifactTypeError("artifact id required") + return {} + + +def fetch_value_map(map_id): + if map_id is None: + raise JamaValueMapError("map id required") + return {} + + +def skip_artifact(reason): + raise JamaSkippedValueError(reason) + + +class JamaClient: + def call(self, endpoint): + return endpoint diff --git a/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/README.md new file mode 100644 index 0000000..eb4e623 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/README.md @@ -0,0 +1,3 @@ +# rust-fully-covered + +Rust classifier-path fixture. The `impl Catalog` block declares two methods, satisfying the method-rich-class rule → `behavioral`. Five needs cite the file via `:source_doc:`. Rust has no raise-site regex, so the raise-site axis is vacuously satisfied with `total: 0`. Expected verdict: `overall: "pass"`. 
diff --git a/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/expected-output.json new file mode 100644 index 0000000..c216229 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/expected-output.json @@ -0,0 +1,25 @@ +{ + "source_file": "input-source.rs", + "language": "rust", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_rust_catalog", + "CREQ_rust_catalog_error", + "CREQ_rust_load_catalog", + "CREQ_rust_readable", + "CREQ_rust_save_catalog" + ] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-needs.json new file mode 100644 index 0000000..9a00314 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-needs.json @@ -0,0 +1,39 @@ +{ + "needs": { + "CREQ_rust_catalog": { + "id": "CREQ_rust_catalog", + "type": "comp_req", + "title": "Catalog struct holds catalogue metadata", + "content": "The struct Catalog shall carry the catalogue name as a public field.", + "source_doc": "input-source.rs" + }, + "CREQ_rust_catalog_error": { + "id": "CREQ_rust_catalog_error", + "type": "comp_req", + "title": "CatalogError enumerates catalogue failures", + "content": "The enum CatalogError shall enumerate the NotFound and Unreadable failure variants.", + "source_doc": "input-source.rs" + }, + "CREQ_rust_readable": { + "id": "CREQ_rust_readable", + "type": "comp_req", + "title": "Readable trait exposes read()", + "content": "The trait Readable shall declare one method read() returning a String.", + "source_doc": "input-source.rs" + }, + "CREQ_rust_load_catalog": { + "id": 
"CREQ_rust_load_catalog", + "type": "comp_req", + "title": "load_catalog loads by path", + "content": "The function load_catalog shall return a Catalog populated from the given path.", + "source_doc": "input-source.rs" + }, + "CREQ_rust_save_catalog": { + "id": "CREQ_rust_save_catalog", + "type": "comp_req", + "title": "save_catalog persists to disk", + "content": "The function save_catalog shall return true when the write succeeded.", + "source_doc": "input-source.rs" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-source.rs b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-source.rs new file mode 100644 index 0000000..5b49e87 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/rust-fully-covered/input-source.rs @@ -0,0 +1,37 @@ +//! Simple rust module demonstrating the rust classifier path. + +pub struct Catalog { + pub name: String, +} + +pub enum CatalogError { + NotFound, + Unreadable, +} + +pub trait Readable { + fn read(&self) -> String; +} + +impl Catalog { + pub fn new(name: &str) -> Self { + Catalog { name: name.to_string() } + } + + pub fn rename(&mut self, name: &str) { + self.name = name.to_string(); + } +} + +pub fn load_catalog(path: &str) -> Catalog { + Catalog::new(path) +} + +pub fn save_catalog(cat: &Catalog, path: &str) -> bool { + let _ = (cat, path); + true +} + +fn internal_helper() -> i32 { + 42 +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/README.md b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/README.md new file mode 100644 index 0000000..d58e326 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/README.md @@ -0,0 +1,3 @@ +# typescript-fully-covered + +TypeScript classifier-path fixture. `SessionManager` declares two methods (`open`, `ready`) → method-rich-class rule → `behavioral`. Four needs cite the file via `:source_doc:`. 
Two `throw new (...)` sites expose `SessionError` and `InvalidUserError`; both names appear in CREQ content. Expected verdict: `overall: "pass"`. diff --git a/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/expected-output.json new file mode 100644 index 0000000..856d395 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/expected-output.json @@ -0,0 +1,27 @@ +{ + "source_file": "input-source.ts", + "language": "typescript", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": [ + "CREQ_ts_create_session", + "CREQ_ts_default_timeout", + "CREQ_ts_session", + "CREQ_ts_session_manager" + ] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 2, + "covered": [ + "InvalidUserError", + "SessionError" + ], + "uncovered": [], + "external": [], + "passed": true + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-needs.json new file mode 100644 index 0000000..5010aa4 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-needs.json @@ -0,0 +1,32 @@ +{ + "needs": { + "CREQ_ts_session": { + "id": "CREQ_ts_session", + "type": "comp_req", + "title": "Session interface shape", + "content": "The interface Session shall declare token and user fields as strings.", + "source_doc": "input-source.ts" + }, + "CREQ_ts_session_manager": { + "id": "CREQ_ts_session_manager", + "type": "comp_req", + "title": "SessionManager opens sessions", + "content": "The class SessionManager shall expose open() and throw SessionError when the manager is not ready.", + "source_doc": "input-source.ts" + }, + "CREQ_ts_create_session": { + "id": "CREQ_ts_create_session", + "type": "comp_req", + "title": 
"createSession factory", + "content": "The function createSession shall throw InvalidUserError when the user argument is empty.", + "source_doc": "input-source.ts" + }, + "CREQ_ts_default_timeout": { + "id": "CREQ_ts_default_timeout", + "type": "comp_req", + "title": "DEFAULT_TIMEOUT constant", + "content": "The constant DEFAULT_TIMEOUT shall default the session-open timeout to 30 seconds.", + "source_doc": "input-source.ts" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-source.ts b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-source.ts new file mode 100644 index 0000000..fb27d05 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/typescript-fully-covered/input-source.ts @@ -0,0 +1,35 @@ +// TypeScript module demonstrating the public-symbol regex row. + +export interface Session { + token: string; + user: string; +} + +export class SessionManager { + open(): Session { + if (!this.ready()) { + throw new SessionError("not ready"); + } + return { token: "t", user: "u" }; + } + + private ready(): boolean { + return true; + } +} + +export function createSession(user: string): Session { + if (!user) { + throw new InvalidUserError("user required"); + } + return { token: "t", user }; +} + +export const DEFAULT_TIMEOUT = 30_000; + +class SessionError extends Error {} +class InvalidUserError extends Error {} + +function internalHelper(): void { + // not exported +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/README.md b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/README.md new file mode 100644 index 0000000..257f7f0 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/README.md @@ -0,0 +1,3 @@ +# unsupported-extension + +The source is `.lua`, which does not appear in the globs column of `skills/shared/public-symbol-patterns.md`. With `language: "auto"` the resolver fails. 
The skill emits `overall: "fail"` with `language: "unknown"`, `classification: "non-behavioral"`, both sub-axes carrying `passed: null`, and `blockers: ["unsupported language: .lua"]`. The project either adds a row to the shared table or passes an explicit `language` override (see `language-override` fixture). diff --git a/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/expected-output.json b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/expected-output.json new file mode 100644 index 0000000..1a90fd4 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/expected-output.json @@ -0,0 +1,21 @@ +{ + "source_file": "input-source.lua", + "language": "unknown", + "classification": "non-behavioral", + "file_coverage": { + "passed": null, + "citing_creqs": [] + }, + "raise_site_coverage": { + "total": 0, + "project_defined": 0, + "covered": [], + "uncovered": [], + "external": [], + "passed": null + }, + "overall": "fail", + "blockers": [ + "unsupported language: .lua" + ] +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-needs.json b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-needs.json new file mode 100644 index 0000000..7af8baf --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-needs.json @@ -0,0 +1,11 @@ +{ + "needs": { + "CREQ_lua_load": { + "id": "CREQ_lua_load", + "type": "comp_req", + "title": "load_catalog reads inventory", + "content": "The function load_catalog shall return the catalogue at the given path.", + "source_doc": "input-source.lua" + } + } +} diff --git a/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-source.lua b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-source.lua new file mode 100644 index 0000000..bf681d5 --- /dev/null +++ b/skills/pharaoh-api-coverage-check/fixtures/unsupported-extension/input-source.lua @@ -0,0 +1,12 @@ +-- 
Lua module — not in the shared public-symbol-patterns.md table. +local M = {} + +function M.load_catalog(path) + return path +end + +function M.save_catalog(path) + return true +end + +return M diff --git a/skills/pharaoh-arch-draft/SKILL.md b/skills/pharaoh-arch-draft/SKILL.md index ccccc7c..cc7c4fe 100644 --- a/skills/pharaoh-arch-draft/SKILL.md +++ b/skills/pharaoh-arch-draft/SKILL.md @@ -291,3 +291,9 @@ to .pharaoh/project/artefact-catalog.yaml before promoting this element beyond d Consider running `pharaoh-arch-review arch__abs_pump_driver` to audit against ISO 26262-8 §6 axes. ``` + +## Last step + +After emitting the artefact, invoke `pharaoh-arch-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-arch-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/skills/pharaoh-block-diagram-draft/SKILL.md b/skills/pharaoh-block-diagram-draft/SKILL.md new file mode 100644 index 0000000..8131321 --- /dev/null +++ b/skills/pharaoh-block-diagram-draft/SKILL.md @@ -0,0 +1,99 @@ +--- +name: pharaoh-block-diagram-draft +description: Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. Typical ASPICE usage — SYS.2/SYS.3 for system-level architecture, and SWE.2 for software architecture on SysML-heavy projects. Renderer tailored via `pharaoh.toml`. 
Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-block-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-block-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.block]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one block diagram, either a **BDD** (structural — blocks, parts, value properties, composition hierarchy) or **IBD** (internal — parts, ports, item flows, constraint properties). Which variant is rendered depends on input: presence of `ports` and `flows` implies IBD; absence implies BDD. + +Typical ASPICE context: +- **SYS.2 System Requirements Analysis**: BDD for the system under analysis. +- **SYS.3 System Architectural Design**: BDD for subsystem decomposition; IBD for internal wiring. +- **SWE.2 Software Architectural Design**: same, applied at SW component level. + +Distinct from `pharaoh-component-diagram-draft` (UML component view — looser, allows external ghost nodes) because BDD/IBD are closed SysML models with strict composition semantics. + +## Atomicity + +- (a) One block scope in → one diagram out. Variant (BDD vs IBD) inferred from input presence. 
+- (b) Input: `{view_title: str, blocks: list[BlockSpec], parts: list[PartSpec], compositions: list[CompositionSpec], ports?: list[PortSpec], flows?: list[FlowSpec], associations?: list[AssocSpec], project_root: str, variant_override?: "bdd"|"ibd", renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `BlockSpec = {id: str, label: str, stereotype?: "block"|"subsystem"|"valueType", value_properties?: list[str], operations?: list[str]}`, `PartSpec = {id: str, label: str, type_id: str, multiplicity?: str}`, `CompositionSpec = {whole: str, part: str, label?: str}`, `PortSpec = {id: str, label: str, direction: "in"|"out"|"inout", owner_block_or_part: str}`, `FlowSpec = {from_port: str, to_port: str, item_type?: str, label?: str}`, `AssocSpec = {from: str, to: str, kind: "reference"|"depend", label?: str}`. Output: one RST directive block. +- (c) Reward: two fixtures. + + **BDD fixture** — blocks [Vehicle, ECU, Sensor], composition Vehicle◆━ECU, Vehicle◆━Sensor. Scorer: + 1. Output starts with renderer directive. + 2. Every block rendered with `<<block>>` stereotype. + 3. Compositions rendered with filled-diamond arrow. + 4. Value properties (if any) shown inside block compartments. + 5. `ports`/`flows` absent → no IBD syntax emitted. + + Pass = all 5. + + **IBD fixture** — one block Vehicle with parts ecu:ECU, sensor:Sensor, ports [Vehicle.can_out: out, ECU.can_in: in], one flow Vehicle.can_out → ECU.can_in item_type=CANFrame. Scorer: + 1. Output starts with renderer directive. + 2. The enclosing block rendered as the diagram frame. + 3. Parts rendered inside the frame with `:TypeName` notation. + 4. Ports rendered on the boundary (block port) or inside (part port), with direction indicated (triangle/arrow). + 5. Flows rendered with item type label. + 6. All ports / parts have valid `owner_block_or_part` references. + + Pass = all 6. + +- (d) Reusable for any SysML-based systems engineering workflow. +- (e) One diagram per call.
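The variant inference and the dangling-reference checks implied by this input shape can be sketched as (field names from the Input spec above; helper names are illustrative, not the skill's API):

```python
def infer_variant(spec):
    # IBD when ports/flows are present; BDD otherwise. An explicit
    # variant_override wins over inference.
    if spec.get("variant_override"):
        return spec["variant_override"]
    return "ibd" if spec.get("ports") or spec.get("flows") else "bdd"


def dangling_refs(spec):
    """Return error strings for every reference that resolves nowhere."""
    blocks = {b["id"] for b in spec.get("blocks", [])}
    parts = {p["id"] for p in spec.get("parts", [])}
    ports = {p["id"] for p in spec.get("ports", [])}
    owners = blocks | parts
    errors = []
    for p in spec.get("parts", []):
        if p["type_id"] not in blocks:
            errors.append(f"part {p['id']}: unknown type_id {p['type_id']}")
    for c in spec.get("compositions", []):
        for end in (c["whole"], c["part"]):
            if end not in owners:
                errors.append(f"composition: unknown end {end}")
    for p in spec.get("ports", []):
        if p["owner_block_or_part"] not in owners:
            errors.append(f"port {p['id']}: unknown owner")
    for f in spec.get("flows", []):
        for end in (f["from_port"], f["to_port"]):
            if end not in ports:
                errors.append(f"flow: unknown port {end}")
    return errors
```

Any non-empty return from `dangling_refs` maps to the FAIL conditions of the Dangling references section; an empty return lets rendering proceed.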
+ +## Dangling references + +FAIL on `part.type_id` not in `blocks`, `composition.whole`/`composition.part` not in `blocks ∪ parts`, `port.owner_block_or_part` not in `blocks ∪ parts`, `flow.from_port`/`flow.to_port` not in `ports`. + +## Output + +**PlantUML (SysML-style; BDD):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + class Vehicle <<block>> { + + mass : kg + } + class ECU <<block>> + class Sensor <<block>> + Vehicle *-- "1" ECU : ecu + Vehicle *-- "1..*" Sensor : sensor + @enduml +``` + +**PlantUML (IBD):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + rectangle "Vehicle" as veh { + component "ecu : ECU" as ecu + component "sensor : Sensor" as sns + } + portout "can_out" as p1 + portin "can_in" as p2 + veh - p1 + ecu - p2 + p1 --> p2 : <<itemFlow>> CANFrame + @enduml +``` + +**Mermaid** — no native SysML support; render as annotated flowchart with stereotypes in labels. Emit a `%% NOTE: Mermaid approximation of SysML block diagram` comment. + +## Non-goals + +- No parametric diagrams (constraint properties with equations) — separate future skill. +- No BDD / IBD round-trip to SysML XMI — out of scope; this skill emits diagrams only. +- No automatic BDD-from-code inference — caller provides structure. diff --git a/skills/pharaoh-bootstrap/SKILL.md b/skills/pharaoh-bootstrap/SKILL.md new file mode 100644 index 0000000..b2a49eb --- /dev/null +++ b/skills/pharaoh-bootstrap/SKILL.md @@ -0,0 +1,465 @@ +--- +name: pharaoh-bootstrap +description: "Use when a Sphinx project has no sphinx-needs configured and you need minimum viable scaffolding — adding the extension and declaring need types — so that sphinx-build produces a valid needs.json for downstream Pharaoh skills." +chains_to: [pharaoh-setup] +--- + +# pharaoh-bootstrap + +## When to use + +Invoke when a project has a working Sphinx setup (`conf.py` builds without sphinx-needs) but does not yet load `sphinx_needs` as an extension.
This skill injects the minimum configuration required for sphinx-needs to produce a valid `needs.json` on the next build. Downstream skills (`pharaoh-setup`, `pharaoh-tailor-detect`, `pharaoh-req-draft`, etc.) require that output. + +Do NOT invoke if `sphinx_needs` is already listed in extensions — use `pharaoh-setup` for that case. Do NOT invoke on a directory that is not yet a Sphinx project — `sphinx-quickstart` is a prerequisite, not part of this skill. Do NOT seed stub RST files, build the project, or write `pharaoh.toml` — those are separate concerns. + +## sphinx-needs version policy + +Pharaoh **recommends** `sphinx-needs >= 8.0.0` (8.x consolidated TOML loading, type-field schema validation, and extra-link declaration format). It does **not** require it. Many real projects pin older versions for lockfile stability or compliance reasons; the skill respects that choice. + +The version handling is a three-way branch: + +| Detected state | Default behavior | +|---|---| +| Not installed | Propose installing the latest available `sphinx-needs`. User confirms → install; rejects → abort (no config written, since no install = nothing to configure). | +| Installed, `>= recommended` | Proceed silently. | +| Installed, `< recommended` | Propose upgrading to recommended. User picks: (a) upgrade; (b) accept current version and proceed; (c) abort. | + +The skill never silently installs or upgrades; every mutation is gated by explicit confirmation (or by a caller passing `on_version_mismatch="install"` / `"accept"` for unattended flows). + +## Atomicity + +- (a) Indivisible — one `project_dir` + config spec in → `conf.py` and/or `ubproject.toml` edits out, plus at most one sphinx-needs version-alignment action (install / upgrade) gated by user confirmation. No directory creation beyond opening existing files; no RST seeding; no docs content; no Pharaoh-level config. 
The install step is a guarded side effect — it runs only when the caller's confirmation is received, never speculatively. +- (b) Input: `{project_dir: str, config_target: "auto"|"conf.py"|"ubproject.toml", types: list[TypeSpec], extra_links?: list[LinkSpec], extra_options?: list[str], id_required?: bool, id_length?: int, recommended_sphinx_needs_version?: str, on_version_mismatch?: "fail"|"prompt"|"install"|"accept"}` where `TypeSpec = {directive: str, title: str, prefix: str, color?: str, style?: str}` and `LinkSpec = {option: str, incoming: str, outgoing: str}`. `extra_options` defaults to `["source_doc"]` (Pharaoh convention — emitters like `pharaoh-feat-draft-from-docs` set `:source_doc:` on every emitted need; without the declaration, `sphinx-build -nW` fails with `Unknown option 'source_doc'`). `recommended_sphinx_needs_version` defaults to `"8.0.0"`; skill uses this to compute "latest satisfying" when installing, and as the threshold for "older than recommended" proposals. `on_version_mismatch` defaults to `"prompt"`. Output: JSON `{files_modified: list[str], config_target_used: "conf.py"|"ubproject.toml", sphinx_needs_version_before: str|null, sphinx_needs_version_after: str, version_action: "installed"|"upgraded"|"accepted_current"|"already_ok", install_command_used: str|null, warnings: list[str], next_step: str}`. On `"prompt"` path → single JSON `{status: "needs_confirmation", proposal: ...}` with no file writes and no install. +- (c) Reward: fixture covers four scenarios in separate test environments: + + **(i) fresh env (no sphinx-needs installed), `on_version_mismatch="install"`** — scorer checks: + 1. After run, `import sphinx_needs` succeeds with version `>= recommended`. + 2. `version_action == "installed"`, `sphinx_needs_version_before == null`. + 3. Config written and `sphinx-build -b needs` succeeds, producing empty `needs.json`. + + **(ii) env with `sphinx-needs >= recommended`, any `on_version_mismatch`** — scorer checks: + 1. 
`version_action == "already_ok"`, `install_command_used == null`. + 2. `sphinx_needs_version_before == sphinx_needs_version_after`. + 3. Config written and build succeeds. + + **(iii) env with old version (6.3.0), `on_version_mismatch="accept"`** — scorer checks: + 1. `version_action == "accepted_current"`, `install_command_used == null`. + 2. Config written and build succeeds with the OLD version (assuming the old version is still functional — this is the "user pinned deliberately" path). + 3. Output contains a warning naming the version gap. + + **(iv) env with old version (6.3.0), `on_version_mismatch="prompt"`** — scorer checks: + 1. Output is `{status: "needs_confirmation", proposal: ...}`. + 2. Proposal offers both `upgrade` and `accept` paths. + 3. No files modified. `import sphinx_needs` still reports the old version. + + Idempotence: re-run in the already-aligned state is a no-op (`version_action == "already_ok"`). + + Pass = all scenarios pass their checks. +- (d) Reusable: any first-time sphinx-needs adoption; migration from plain Sphinx; reverse-engineering pilots on projects that start without requirements. Independent of downstream Pharaoh workflow. +- (e) Composable: edits config + at most one guarded install. Does not call other skills, does not write `.pharaoh/`, does not build. + +## Input + +- `project_dir`: absolute path to the Sphinx project root. Must contain `conf.py`. +- `config_target`: where to declare sphinx-needs settings. + - `"auto"` (default): if `ubproject.toml` exists in `project_dir`, use it; otherwise use `conf.py`. + - `"ubproject.toml"`: force TOML. Create the file if missing. + - `"conf.py"`: force Python-level declarations (`needs_types`, `needs_extra_links`, etc.). +- `types`: list of need types to declare. **Only declare types that will have at least one need on day one.** Declaring speculative types (e.g. 
adding `test` because you plan to write tests "eventually") produces dead type registrations and forces downstream `pharaoh.toml` traceability chains to alarm on empty targets — observed during dogfooding where declaring `test` + `verifies` link made 100% of `comp_req` needs appear unverified on day one. Add new types when the first need of that type lands, not before. + + Each `TypeSpec` has: + - `directive` (required): snake_case directive name, e.g. `"req"`, `"spec"`, `"impl"`, `"test"`. + - `title` (required): human-readable title, e.g. `"Requirement"`. + - `prefix` (required): ID prefix used by sphinx-needs, e.g. `"REQ_"`. + - `color` (optional): hex color or name; defaults left to sphinx-needs. + - `style` (optional): node style; defaults left to sphinx-needs. + + At least one type is required; sphinx-needs builds with defaults but Pharaoh workflows expect explicit declarations. +- `extra_links` (optional): list of `LinkSpec` entries for typed relationships beyond the default `links` option. +- `extra_options` (optional): list of custom option names to declare. Default `["source_doc"]`. Pharaoh emitters (e.g. `pharaoh-feat-draft-from-docs`) always set `:source_doc:` on emitted needs to track provenance back to the authoring document; without this declaration, `sphinx-build -nW` fails with `Unknown option 'source_doc'`. Caller may pass additional option names — the skill unions them with the default. Passing `[]` explicitly suppresses the default (caller accepts the -nW warning as trade-off). Declaration SHAPE is version-dependent: on sphinx-needs ≥ 8.0.0 the skill emits `[needs.fields.NAME]` dict-of-dicts (config option `needs_fields`); on < 8 it emits the legacy `[[needs.extra_options]]` / `needs_extra_options`. The input name stays `extra_options` for API stability — callers pass a list of names and the skill picks the right shape. +- `id_required` (optional): if `true`, declare `needs_id_required = True`. Default: omit (sphinx-needs default is `False`). 
+- `id_length` (optional): integer; if provided, declare `needs_id_length`. Default: omit (sphinx-needs default). +- `recommended_sphinx_needs_version` (optional): the version Pharaoh recommends. Default `"8.0.0"`. Used as (a) the threshold for "older than recommended" proposals, and (b) the version installed when the skill runs in install mode (or the latest release that satisfies `>=recommended` — see Step 0c). Compared with `packaging.version.parse`. +- `on_version_mismatch` (optional): `"fail"` | `"prompt"` | `"install"` | `"accept"`. Default `"prompt"`. Applies when the detected version is absent OR `< recommended`: + - `"fail"`: abort with a remediation-focused error. + - `"prompt"`: emit a `needs_confirmation` proposal with BOTH an upgrade option and an accept-current option (or install/abort if nothing is installed). The caller picks one and re-invokes with `"install"` or `"accept"`. + - `"install"`: if nothing installed → install recommended; if older version installed → upgrade to recommended. Non-interactive. + - `"accept"`: proceed with whatever is installed. If nothing is installed → FAIL (there is no "current" to accept). + +## Output + +A single JSON object — no prose wrapper. Shape: + +```json +{ + "files_modified": ["docs/conf.py"], + "config_target_used": "conf.py", + "sphinx_needs_version_before": "6.3.0", + "sphinx_needs_version_after": "8.0.0", + "version_action": "upgraded", + "install_command_used": "uv pip install --upgrade sphinx-needs==8.0.0", + "sphinx_build_command": "sphinx-build -b needs docs docs/_build/needs", + "warnings": [], + "next_step": "Run `sphinx-build -b needs docs docs/_build/needs` (see sphinx_build_command) to generate needs.json, then run pharaoh-setup." +} +``` + +`sphinx_build_command` is a concrete, copy-pasteable invocation that assumes the caller's cwd is the project root. Resolution: +- Builder flag: `-b needs`. 
+- `<source-dir>`: the relative path from the detected project root to `project_dir` (the argument the skill was invoked with). If `project_dir` contains both `conf.py` and the .rst source tree (typical `sphinx-quickstart` flat layout), `<source-dir>` is the project_dir path. If `conf.py` lives in one directory and .rst sources live in a sibling (e.g. `conf.py` in `docs/` but RST files under `docs/source/`), the command uses `-c <conf-dir>`; the skill detects this by checking whether the `conf.py` directory contains any `*.rst` files. +- `<output-dir>`: `<source-dir>/_build/needs` by convention. +- If the skill cannot resolve the project root relative to `project_dir` (e.g. `project_dir` is absolute with no parent that looks like a project root), it falls back to absolute paths. + +`version_action` is one of: +- `"installed"` — was missing, installed recommended +- `"upgraded"` — was older than recommended, upgraded +- `"accepted_current"` — was older than recommended, user opted to keep it +- `"already_ok"` — detected version already `>= recommended`, no action taken + +When `on_version_mismatch == "prompt"` and a mismatch is detected, response is: + +```json +{ + "status": "needs_confirmation", + "proposal": { + "detected_version": "6.3.0", + "recommended_version": "8.0.0", + "detected_package_manager": "rye", + "options": [ + { + "action": "upgrade", + "description": "Install sphinx-needs 8.0.0 (recommended). Unlocks TOML loading, schema validation, new extra-link format.", + "install_command": "rye add sphinx-needs~=8.0.0", + "alt_commands": [ + "uv pip install --upgrade sphinx-needs==8.0.0", + "pip install --upgrade sphinx-needs==8.0.0" + ], + "pyproject_patch": { + "target_file": "pyproject.toml", + "section": "[project].dependencies", + "replace": {"sphinx-needs>=6.3.0": "sphinx-needs>=8.0.0"} + } + }, + { + "action": "accept", + "description": "Keep sphinx-needs 6.3.0. Bootstrap proceeds against the current version.
Some Pharaoh features that depend on 8.x (schema validation, latest TOML loader) may be degraded or unavailable.", + "caveats": [ + "Downstream Pharaoh skills may warn about missing features.", + "Upgrade can be deferred — re-run pharaoh-bootstrap later to revisit." + ] + }, + { + "action": "abort", + "description": "Cancel bootstrap without writing config or installing anything." + } + ], + "rationale": "Pharaoh recommends sphinx-needs >= 8.0.0 for the richest feature set, but respects pinned older versions where the project has stability or compliance constraints." + } +} +``` + +No files are modified and no installs happen when the response is `needs_confirmation`. The caller (human or outer LLM) picks an option and re-invokes with `on_version_mismatch` set accordingly (`"install"` for upgrade, `"accept"` for accept, or simply stop for abort). + +The "nothing installed" variant of the same proposal drops the `accept` option (since there is no current version to accept) and the `upgrade` action becomes `install`. + +## Process + +### Step 0: Determine sphinx-needs version action + +Before touching any config file, resolve what the skill should do about `sphinx-needs` — install, upgrade, accept, or proceed without action. + +**0a. Detect current version.** + +Run `python -c "import sphinx_needs; print(sphinx_needs.__version__)"` in the project's interpreter (virtualenv-preferred, active shell Python as fallback). + +- Import succeeds → record printed version. +- Import fails → record `null`. + +**0b. Classify and branch.** + +Three classes: + +1. **Installed and `>= recommended_sphinx_needs_version`** → set `version_action = "already_ok"`, `install_command_used = null`, `sphinx_needs_version_before = sphinx_needs_version_after = detected`. Skip to Step 1. + +2. **Not installed** → branch on `on_version_mismatch`: + - `"fail"` → FAIL with remediation message. 
+ - `"prompt"` → emit `needs_confirmation` proposal with options `["install", "abort"]` (no `accept` — nothing to accept). Return. + - `"install"` → go to Step 0c with action=install. + - `"accept"` → FAIL: `"on_version_mismatch='accept' requires an existing install, but sphinx-needs is not installed."` + +3. **Installed but `< recommended`** → branch on `on_version_mismatch`: + - `"fail"` → FAIL with remediation. + - `"prompt"` → emit `needs_confirmation` proposal with options `["upgrade", "accept", "abort"]`. Return. + - `"install"` → go to Step 0c with action=upgrade. + - `"accept"` → set `version_action = "accepted_current"`, emit a warning naming the version gap, set `sphinx_needs_version_before = sphinx_needs_version_after = detected`, `install_command_used = null`. Skip to Step 1. + +**0c. Detect package manager and run install/upgrade.** + +Only reached when `on_version_mismatch == "install"`. Detect package manager by scanning `project_dir` and up to 3 parent levels: + +| Indicator | Package manager | Install command | Upgrade command | +|---|---|---|---| +| `.python-version` + `pyproject.toml` with `[tool.rye]` or `rye.lock` | rye | `rye add sphinx-needs~=` | `rye add sphinx-needs~=` (rye resolves by constraint) | +| `uv.lock` or `pyproject.toml` with `[tool.uv]` | uv | `uv add sphinx-needs==` | `uv pip install --upgrade sphinx-needs==` | +| `poetry.lock` | poetry | `poetry add sphinx-needs@^` | `poetry add sphinx-needs@^` | +| `Pipfile.lock` | pipenv | `pipenv install sphinx-needs==` | `pipenv install sphinx-needs==` | +| `pdm.lock` | pdm | `pdm add sphinx-needs==` | `pdm update sphinx-needs` | +| otherwise, with active venv detectable via `VIRTUAL_ENV` or `project_dir/.venv` | pip (venv) | ` -m pip install sphinx-needs==` | ` -m pip install --upgrade sphinx-needs==` | +| otherwise | unknown | FAIL: "Cannot detect package manager. Install sphinx-needs manually and re-run with `on_version_mismatch='accept'` or `'install'` after install." 
| + +Closer indicator wins if multiple match. `<version>` substituted with `recommended_sphinx_needs_version`. + +Run the selected command. Capture exit code and stdout/stderr. + +**0d. Verify post-install.** + +Re-run the probe from 0a. Determine final state: + +- Import still fails → FAIL naming the attempted command and exit code. +- Version now `>= recommended` → set `version_action = "installed"` (if 0b class was "not installed") or `"upgraded"` (if class was "older"). Record `install_command_used` = the command. Proceed to Step 1. +- Version below recommended but installed (install appeared to succeed but resolver picked an older version, e.g. constrained by lockfile) → emit warning, set `version_action = "accepted_current"`, proceed. The caller's lockfile constraints win over Pharaoh's recommendation. + +### Step 1: Verify project_dir is a Sphinx project + +Read `<project_dir>/conf.py`. If it does not exist, FAIL: + +``` +FAIL: <project_dir>/conf.py not found. +This skill scaffolds sphinx-needs INTO an existing Sphinx project. +Run `sphinx-quickstart` first to create a Sphinx project, then re-invoke. +``` + +### Step 2: Verify sphinx-needs is not already configured + +Search `conf.py` and (if present) `ubproject.toml` for the string `sphinx_needs`. If found in either file, FAIL: + +``` +FAIL: sphinx_needs is already referenced in <file>. +This skill is for projects without sphinx-needs. Use pharaoh-setup instead. +``` + +Rationale: mutating an existing config belongs to a separate skill (future: `pharaoh-setup-reconfigure`). Atomicity demands that `pharaoh-bootstrap` only handles first-time injection. + +### Step 3: Resolve config_target + +If `config_target == "auto"`: +- If `<project_dir>/ubproject.toml` exists → use `"ubproject.toml"`. +- Else → use `"conf.py"`. + +Record the resolved target. Emit a warning if the caller passed `"ubproject.toml"` but the file does not exist (the skill will create it).
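The Step 0a–0b decision can be sketched as (a sketch only; the skill itself compares versions with `packaging.version.parse`, stubbed here with a minimal dotted-numeric parser so the example is dependency-free):

```python
def _v(s):
    # Minimal stand-in for packaging.version.parse, sufficient for
    # dotted numeric versions like "6.3.0" and "8.0.0".
    return tuple(int(part) for part in s.split("."))


def classify(detected, recommended="8.0.0", on_version_mismatch="prompt"):
    """Return the Step 0b outcome for a detected sphinx-needs version.

    detected is None when `import sphinx_needs` failed.
    """
    if detected is not None and _v(detected) >= _v(recommended):
        return {"version_action": "already_ok"}          # class 1
    if detected is None:                                 # class 2: not installed
        return {
            "fail": {"error": "sphinx-needs not installed"},
            "prompt": {"status": "needs_confirmation",
                       "options": ["install", "abort"]},
            "install": {"version_action": "installed"},
            "accept": {"error": "'accept' requires an existing install"},
        }[on_version_mismatch]
    return {                                             # class 3: older
        "fail": {"error": f"sphinx-needs {detected} < {recommended}"},
        "prompt": {"status": "needs_confirmation",
                   "options": ["upgrade", "accept", "abort"]},
        "install": {"version_action": "upgraded"},
        "accept": {"version_action": "accepted_current"},
    }[on_version_mismatch]
```

Note how the "not installed" prompt drops the `accept` option, mirroring the proposal variant described in the Output section.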
+ +### Step 4: Inject `sphinx_needs` into the `extensions` list + +This always happens in `conf.py`, regardless of `config_target` (sphinx loads extensions from `conf.py` only). + +Read `conf.py`. Locate the `extensions = [...]` assignment. Two cases: + +**4a. Extensions list exists.** Append `"sphinx_needs"` as the last entry, preserving existing indentation and trailing comma conventions. If the list is empty (`extensions = []`), replace with `extensions = ["sphinx_needs"]`. + +**4b. Extensions list missing.** Append a new line `extensions = ["sphinx_needs"]` after the last existing top-level assignment (heuristic: find the last line that looks like `NAME = ...` at column 0, insert after it). Add a blank line before for readability. + +Do NOT reorder, rename, or reflow existing content. Do NOT add comments. + +### Step 5: Declare need types + +**5a. If `config_target_used == "ubproject.toml"`:** + +If the file does not exist, create it with a `$schema` header pointing at the public ubproject schema: + +```toml +"$schema" = "https://ubcode.useblocks.com/ubproject.schema.json" +``` + +Append a `[needs]` section with the types array. Example: + +```toml +[[needs.types]] +directive = "req" +title = "Requirement" +prefix = "REQ_" + +[[needs.types]] +directive = "spec" +title = "Specification" +prefix = "SPEC_" +``` + +Include `color` and `style` entries only if the caller provided them. + +Emit typed links and custom fields. **The shape depends on the detected `sphinx_needs_version_after` from Step 0.** sphinx-needs 8.x deprecated the pre-8 array-of-tables shape in favour of dict-of-dicts keyed by option name; emitting the legacy shape on 8.x triggers deprecation warnings at every build (`Config option "needs_extra_options" is deprecated. Please use "needs_fields" instead.`), and emitting the new shape on < 8 fails to load. The skill picks the right shape for the detected version. 
+ +**sphinx-needs ≥ 8.0.0 — dict-of-dicts:** + +```toml +[needs.links.satisfies] +incoming = "is satisfied by" +outgoing = "satisfies" + +[needs.fields.source_doc] +description = "Relative path to the documentation file that authored this need (Pharaoh provenance)." +schema = "string" +default = "" +``` + +On 8.x, `description` + `schema` + `default` are all required on each field entry; omitting any of them triggers a backward-compatibility warning. For caller-supplied option names without explicit metadata, the skill synthesises: `description = " (Pharaoh-declared custom field)"`, `schema = "string"`, `default = ""`. + +**sphinx-needs < 8.0.0 — array-of-tables (legacy shape):** + +```toml +[[needs.extra_links]] +option = "satisfies" +incoming = "is satisfied by" +outgoing = "satisfies" + +[[needs.extra_options]] +name = "source_doc" +``` + +On pre-8, `[[needs.extra_links]]` MUST be an array of tables — dict form (`[needs.extra_links.satisfies]`) fails with `TypeError: string indices must be integers`. + +Version comparison uses `packaging.version.parse`. Resolved `extra_options` = default `["source_doc"]` unioned with caller-provided extras, deduplicated, sorted for determinism. + +If caller explicitly passed `extra_options = []`, emit no field/option section and record a warning: `"extra_options suppressed by caller; Pharaoh emitters that set :source_doc: will trigger -nW warnings"`. + +Include `id_required` and `id_length` only if the caller provided them: + +```toml +[needs] +id_required = true +id_length = 8 +``` + +Also add the `needs_from_toml` hook to `conf.py` so sphinx-needs reads the TOML: + +```python +needs_from_toml = "ubproject.toml" +``` + +Insert this line after the `extensions = [...]` assignment that was touched in Step 4. + +**5b. If `config_target_used == "conf.py"`:** + +Append `needs_types` plus version-dependent link/field declarations plus optional ID settings directly to `conf.py` after the `extensions` assignment. 
The config-option names match the TOML shape chosen in Step 5a. + +**sphinx-needs ≥ 8.0.0 — `needs_links` / `needs_fields` dicts:** + +```python +needs_types = [ + {"directive": "req", "title": "Requirement", "prefix": "REQ_"}, + {"directive": "spec", "title": "Specification", "prefix": "SPEC_"}, +] + +needs_links = { + "satisfies": { + "incoming": "is satisfied by", + "outgoing": "satisfies", + }, +} + +needs_fields = { + "source_doc": { + "description": "Relative path to the documentation file that authored this need.", + "schema": "string", + "default": "", + }, +} + +needs_id_required = True +needs_id_length = 8 +``` + +**sphinx-needs < 8.0.0 — legacy `needs_extra_links` / `needs_extra_options`:** + +```python +needs_types = [ + {"directive": "req", "title": "Requirement", "prefix": "REQ_"}, + {"directive": "spec", "title": "Specification", "prefix": "SPEC_"}, +] + +needs_extra_links = [ + {"option": "satisfies", "incoming": "is satisfied by", "outgoing": "satisfies"}, +] + +needs_extra_options = ["source_doc"] + +needs_id_required = True +needs_id_length = 8 +``` + +Omit link/field declarations the caller did not supply (no `extra_links` input AND default `extra_options` not suppressed → emit only the `source_doc` field/option). Omit `needs_id_*` entries the caller did not supply. Always emit the `source_doc` declaration unless caller explicitly passed `extra_options = []` (in which case omit and warn — see Step 5a). Do NOT add comments. + +### Step 6: Emit output + +Emit the JSON object per the Output shape. Populate: +- `files_modified`: every file the skill wrote to, relative to `project_dir`. +- `config_target_used`: resolved target from Step 3. +- `warnings`: accumulated warnings (e.g., created `ubproject.toml` that did not exist). +- `sphinx_build_command`: resolved per the rules in the Output section. Prefer paths relative to the detected project root (the nearest ancestor of `project_dir` that contains a `pyproject.toml`, `.git`, or similar marker). 
If no project root is detectable, use absolute paths. If `project_dir` does not contain any `*.rst` files but `/source/` does (separated layout), emit `sphinx-build -b needs -c /source /source/_build/needs`. +- `next_step`: interpolate `sphinx_build_command` into the sentence `"Run \`\` to generate needs.json, then run pharaoh-setup."` + +## Guardrails + +**G1 — No Sphinx project.** `conf.py` missing → FAIL per Step 1. + +**G2 — sphinx-needs already present.** Any reference to `sphinx_needs` found → FAIL per Step 2. Do not attempt merge; that is a different skill. + +**G3 — Empty types list.** If `types == []`, FAIL: + +``` +FAIL: At least one type must be declared. +Pharaoh workflows expect explicit type declarations. Provide at least one TypeSpec. +``` + +**G4 — Directive collision.** If two entries in `types` have the same `directive`, FAIL with the offending directive name. Deduplication is the caller's responsibility. + +**G5 — Partial write.** If Step 4 succeeds but Step 5 fails, revert Step 4's change so the project is not left in a half-configured state. Report the failure and the rollback. + +## Advisory chain + +After successfully emitting output, always advise with the CONCRETE command resolved for this project (the value of the `sphinx_build_command` output field), not a placeholder: + +``` +Run `` to generate needs.json. +Then invoke `pharaoh-setup` to detect the fresh configuration and author pharaoh.toml. +``` + +Rationale: prior dogfooding had `conf.py` in `docs/` and RST files in `docs/source/`; the generic ` ` template forced the caller to grep `pyproject.toml` to find the right `-c` flag before the first build succeeded. Surfacing the concrete invocation in the bootstrap report removes that lookup. 
+ +## Worked example + +**User input:** +```json +{ + "project_dir": "/work/my-project/docs", + "config_target": "auto", + "types": [ + {"directive": "feat", "title": "Feature", "prefix": "FEAT_"}, + {"directive": "comp_req", "title": "Component Requirement", "prefix": "CREQ_"} + ], + "extra_links": [ + {"option": "satisfies", "incoming": "is satisfied by", "outgoing": "satisfies"} + ] +} +``` + +**Step 1:** `/work/my-project/docs/conf.py` exists. OK. + +**Step 2:** Neither `conf.py` nor `ubproject.toml` mentions `sphinx_needs`. OK. + +**Step 3:** `ubproject.toml` exists in `/work/my-project/docs/` → resolve to `"ubproject.toml"`. + +**Step 4:** Append `"sphinx_needs"` to the existing `extensions = [...]` list in `conf.py`. + +**Step 5:** Add `[[needs.types]]` tables for `feat` and `comp_req` to `ubproject.toml`. Emit link and field declarations in the shape matching the detected sphinx-needs version — `[needs.links.satisfies]` and `[needs.fields.source_doc]` on ≥ 8.0.0, or the legacy `[[needs.extra_links]]` / `[[needs.extra_options]]` on < 8. Add `needs_from_toml = "ubproject.toml"` to `conf.py`. + +**Step 6 output:** + +```json +{ + "files_modified": ["conf.py", "ubproject.toml"], + "config_target_used": "ubproject.toml", + "sphinx_build_command": "sphinx-build -b needs docs docs/_build/needs", + "warnings": [], + "next_step": "Run `sphinx-build -b needs docs docs/_build/needs` to generate needs.json, then run pharaoh-setup." +} +``` diff --git a/skills/pharaoh-class-diagram-draft/SKILL.md b/skills/pharaoh-class-diagram-draft/SKILL.md new file mode 100644 index 0000000..58e4ee9 --- /dev/null +++ b/skills/pharaoh-class-diagram-draft/SKILL.md @@ -0,0 +1,106 @@ +--- +name: pharaoh-class-diagram-draft +description: Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). Renderer tailored via `pharaoh.toml`. 
Does NOT emit component, sequence, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-class-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-class-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.class]` for per-type overrides. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one class/type diagram. Captures **structural relationships** between types: inheritance hierarchies, composition, aggregation, plain association, with optional per-class fields and methods. + +Does NOT capture runtime behavior over time (→ `pharaoh-sequence-diagram-draft`). Does NOT capture high-level component topology (→ `pharaoh-component-diagram-draft`). Does NOT capture lifecycle FSM (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One class set in → one diagram out. No splitting across diagrams; if the set is too large to fit, caller invokes multiple times with different scopes. 
+- (b) Input: `{view_title: str, classes: list[ClassSpec], relationships: list[RelationSpec], project_root: str, show_fields?: bool, show_methods?: bool, visibility_filter?: list["public"|"protected"|"private"], renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ClassSpec = {id: str, label: str, stereotype?: "abstract"|"interface"|"enum"|"struct", fields?: list[FieldSpec], methods?: list[MethodSpec]}`, `FieldSpec = {name: str, type?: str, visibility?: "public"|"protected"|"private"}`, `MethodSpec = {name: str, params?: str, return_type?: str, visibility?: "public"|"protected"|"private"}`, `RelationSpec = {from: str, to: str, kind: "inherits"|"implements"|"composes"|"aggregates"|"associates"|"depends", label?: str, cardinality_from?: str, cardinality_to?: str}`. Output: one RST directive block. +- (c) Reward: fixture — abstract base `Shape` with method `area()`, concrete `Circle` and `Square` inheriting, plus `Canvas` composing 1..* Shapes. Scorer: + 1. Output starts with renderer-specific directive. + 2. All class IDs declared. + 3. Inheritance edges Circle→Shape, Square→Shape render in inheritance syntax (hollow triangle in both Mermaid/PlantUML). + 4. Composition edge Canvas→Shape renders in composition syntax (filled diamond). + 5. Cardinality `1..*` on composition edge is present. + 6. With `show_fields=false, show_methods=false`, no field/method lines appear. + 7. With `show_methods=true`, abstract `area()` on Shape is rendered with stereotype (italic/abstract marker). + + Pass = all 7. +- (d) Reusable: any OOP codebase, domain model extraction, data schema visualization. +- (e) One diagram per call. + +## Input highlights (others per shared doc) + +- `classes`: declared order = render order (usually doesn't matter for class diagrams but preserved for determinism). +- `relationships`: every `from`/`to` MUST reference a class ID in `classes`. 
Dangling relationship → FAIL (class diagrams don't tolerate ghost classes in the same way component diagrams tolerate out-of-scope links; either the class is in the diagram or it is not). +- `show_fields` / `show_methods` (optional): default `true`. Set to `false` for overview diagrams. +- `visibility_filter` (optional): include only members matching these visibilities. Default: all. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: + + classDiagram + class Shape { + <> + +area() double + } + class Circle { + -radius: double + +area() double + } + class Canvas { + +shapes: List~Shape~ + } + Shape <|-- Circle + Shape <|-- Square + Canvas "1" *-- "1..*" Shape +``` + +**PlantUML:** +```rst +.. uml:: + :caption: + + @startuml + abstract class Shape { + +area() : double + } + class Circle { + -radius : double + +area() : double + } + class Canvas + Shape <|-- Circle + Shape <|-- Square + Canvas "1" *-- "1..*" Shape + @enduml +``` + +## Relationship kind → renderer syntax + +| Kind | Mermaid | PlantUML | +|---|---|---| +| `inherits` | `A <|-- B` | `A <|-- B` | +| `implements` | `A <|.. B` | `A <|.. B` | +| `composes` | `A *-- B` | `A *-- B` | +| `aggregates` | `A o-- B` | `A o-- B` | +| `associates` | `A -- B` | `A -- B` | +| `depends` | `A ..> B` | `A ..> B` | + +Both renderers converge on UML-standard arrows; the syntax is virtually identical. + +## Non-goals + +- No generics/template detection — callers pass rendered forms (`List~Shape~`, `Option`) in field types as strings. +- No automatic abstract detection — caller sets `stereotype` explicitly. +- No private-member hiding by default — caller uses `visibility_filter=["public"]` if needed. +- No class-from-code extraction — separate future skill (`pharaoh-classes-from-source`) could infer; out of scope here. 
diff --git a/skills/pharaoh-component-diagram-draft/SKILL.md b/skills/pharaoh-component-diagram-draft/SKILL.md new file mode 100644 index 0000000..e3af072 --- /dev/null +++ b/skills/pharaoh-component-diagram-draft/SKILL.md @@ -0,0 +1,99 @@ +--- +name: pharaoh-component-diagram-draft +description: Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. Renderer tailored via `pharaoh.toml`. Does NOT emit sequence, class, or state diagrams — those are separate skills. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-component-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-component-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). This skill reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.component]` from the consumer project's `pharaoh.toml`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one component-relationship diagram (static containment + link-relation edges between needs). Analogue to UML component diagrams, C4 container/component views. + +Does NOT show behavior over time (→ `pharaoh-sequence-diagram-draft`). Does NOT show type hierarchies with fields/methods (→ `pharaoh-class-diagram-draft`). Does NOT show lifecycle/FSM (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One scope in → one diagram out. 
No multi-scope bundling. No mutation of needs. +- (b) Input: `{view_title: str, scope_ids: list[str], project_root: str, renderer_override?: "mermaid"|"plantuml", direction_override?: "TB"|"LR"|"BT"|"RL", ghost_nodes?: bool, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. Output: one RST directive block (`.. mermaid::` or `.. uml::`) with caption. No surrounding prose. +- (c) Reward: fixture with 3 in-scope needs (A, B, C) chained A→B→C via `:links:`, plus one out-of-scope need D that B links to. Scorer: + 1. Output starts with the renderer-specific directive matching tailoring. + 2. Every ID in `scope_ids` appears as a node in the diagram body. + 3. Edges A→B and B→C render in renderer syntax (`A --> B`). + 4. With default `ghost_nodes=true`: D appears as a ghost node (dashed outline / muted color / external stereotype), edge B→D is rendered. + 5. With `ghost_nodes=false`: D does NOT appear, edge B→D is dropped, and a warning is logged naming D as a dangling dependency. + 6. `renderer_override="mermaid"` on a PlantUML-tailored project produces Mermaid. + + Pass = all 6. +- (d) Reusable for any sphinx-needs project needing static architecture diagrams. +- (e) One phase, one skill. No cross-skill calls. + +## Input + +- `view_title`: human-readable title (→ diagram caption). +- `scope_ids`: list of sphinx-needs IDs to include. Skill reads each via `ubc` / file fallback to extract type, title, and outgoing link options. +- `project_root`: absolute path to consumer project root. Used for `pharaoh.toml` tailoring lookup. +- `renderer_override` (optional): per-call override. Resolution order in `shared/diagram-tailoring.md`. +- `direction_override` (optional): `TB` | `LR` | `BT` | `RL`. Falls back to `[pharaoh.diagrams.component].direction` → `"TB"`. 
+- `ghost_nodes` (optional): if `true` (default), edges whose target is outside `scope_ids` render as ghost nodes — dashed outline, muted color, visually distinct from in-scope nodes — so reviewers see the boundary between "our scope" and "external dependencies." If `false`, the dangling edge is dropped and a warning is logged. Default `true`. +- `on_missing_config` (optional): see shared doc. Default `"prompt"`. +- `papyrus_workspace` (optional): for consistent node labeling across diagrams (same canonical names as `pharaoh-req-from-code`). +- `reporter_id`: short agent identifier. + +## Output + +Single RST directive block. Renderer-dependent body: + +**Mermaid (default):** +```rst +.. mermaid:: + :caption: + + graph TB + FEAT_csv_export[CSV Export]:::feat + CREQ_csv_export_01[Write CSV header row]:::comp_req + CREQ_csv_export_02[Serialize rows]:::comp_req + CREQ_csv_export_01 --> FEAT_csv_export + CREQ_csv_export_02 --> FEAT_csv_export + classDef feat fill:#4ECDC4 + classDef comp_req fill:#BFD8D2 +``` + +**PlantUML:** +```rst +.. uml:: + :caption: + + @startuml + component "CSV Export" as FEAT_csv_export #4ECDC4 + component "Write CSV header row" as CREQ_csv_export_01 #BFD8D2 + component "Serialize rows" as CREQ_csv_export_02 #BFD8D2 + CREQ_csv_export_01 --> FEAT_csv_export + CREQ_csv_export_02 --> FEAT_csv_export + @enduml +``` + +`classDef`/color fills come from `[pharaoh.diagrams.type_styles]` if tailored; otherwise renderer defaults (no styling). + +## Process (sketch) + +1. Resolve renderer, direction, type_styles from `pharaoh.toml` (see shared doc for order). If any mandatory field missing AND `on_missing_config == "prompt"` → emit structured proposal. +2. Read each need in `scope_ids` via data-access layer (`ubc` CLI preferred). +3. Build internal graph: nodes = scope_ids, edges = outgoing links. +4. For each edge: if target ∈ scope_ids → render as in-scope edge. 
If target ∉ scope_ids → behavior depends on `ghost_nodes`: + - `ghost_nodes=true` (default): add the target as a ghost node (dashed outline, muted color, `<>` stereotype or renderer-equivalent). Render the edge normally. Log info-level note listing all ghost nodes. + - `ghost_nodes=false`: drop the edge. Log warning naming the dangling pair. +5. Emit renderer-specific syntax. Ghost nodes are grouped visually apart from in-scope nodes where the renderer supports it (Mermaid: separate `subgraph External`; PlantUML: `package "external" { ... }`). +6. Wrap in RST directive with caption. +7. Return. + +## Non-goals + +- Not sequences, not classes, not state machines — separate skills. +- Not auto-layout tuning — emit simple directional graphs. +- Not diagram-to-needs sync — edges are DERIVED from needs, never a source of truth. diff --git a/skills/pharaoh-coverage-gap/SKILL.md b/skills/pharaoh-coverage-gap/SKILL.md index bce1333..423b1e2 100644 --- a/skills/pharaoh-coverage-gap/SKILL.md +++ b/skills/pharaoh-coverage-gap/SKILL.md @@ -306,7 +306,7 @@ Use `pharaoh-process-audit` to run all 10 gap categories in one pass. ### Category 1: `unverified_req` -**Inputs:** `project_root = examples/score`, `category = unverified_req` +**Inputs:** `project_root = examples/my-project`, `category = unverified_req` **Step 2:** tailoring loaded; needs.json found with 185 `gd_req` needs and 63 `tc` needs. @@ -346,7 +346,7 @@ Use `pharaoh-process-audit` to run all 10 gap categories in one pass. ### Category 2: `schema_violation` -**Inputs:** `project_root = examples/score`, `category = schema_violation` +**Inputs:** `project_root = examples/my-project`, `category = schema_violation` **Step 3:** artefact-catalog.yaml loaded. For `gd_req`, required fields are `[id, status, satisfies]`. 
After scanning all needs: 1 `gd_req` missing `:satisfies:` field; 2 `arch` needs missing diff --git a/skills/pharaoh-decision-record/SKILL.md b/skills/pharaoh-decision-record/SKILL.md index c2d36e8..210fe02 100644 --- a/skills/pharaoh-decision-record/SKILL.md +++ b/skills/pharaoh-decision-record/SKILL.md @@ -103,6 +103,12 @@ No surrounding prose. Emit exactly one JSON object per invocation. A future cleanup may reimplement `pharaoh-finding-record` as a thin wrapper over `pharaoh-decision-record`; that refactor is out of scope for Phase 4c. +## Last step + +After emitting the artefact, invoke `pharaoh-decision-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-decision-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. + ## Composition Each caller invokes this skill once per canonical subject surfaced. The orchestrator or harness then reads the final Papyrus workspace via `papyrus recall` for the aggregated vocabulary. diff --git a/skills/pharaoh-decision-review/SKILL.md b/skills/pharaoh-decision-review/SKILL.md new file mode 100644 index 0000000..c1bc529 --- /dev/null +++ b/skills/pharaoh-decision-review/SKILL.md @@ -0,0 +1,47 @@ +--- +name: pharaoh-decision-review +description: Use when auditing a single recorded decision (DR / ADR / design note) against the generic decision review axes in `shared/checklists/decision.md`. 
Checks context/alternatives/consequences structure, traceability to affected artefacts, rationale completeness. Emits structured findings JSON. +chains_from: [pharaoh-decision-record] +chains_to: [] +--- + +# pharaoh-decision-review + +## When to use + +Invoke after `pharaoh-decision-record` wrote a decision memory. Part of the self-review invariant. + +## Atomicity + +- (a) One decision + one checklist in → one findings JSON out. +- (b) Input: `{target: , checklist_path: , tailoring_path: }`. Output: findings JSON. +- (c) Reward: fixtures `passing-decision.rst` + `failing-decision.rst` with expected findings. +- (d) Reusable. +- (e) Read-only. + +## Input + +- `target`: RST directive block for a `decision` directive, OR a Papyrus memory_id of type `decision`. +- `checklist_path`: `shared/checklists/decision.md`. + +## Output + +```json +{ + "need_id": "dr__example", + "type": "decision", + "axes": { + "context_section_present": {"passed": true}, + "alternatives_listed": {"passed": true, "reason": "3 alternatives + chosen=4"}, + "consequences_section_present":{"passed": true}, + "trace_to_affected_artefacts":{"passed": true, "reason": "links 2 reqs and 1 arch"}, + "canonical_name_unique": {"passed": true, "reason": "no dup in papyrus"}, + "rationale_quality": {"score": 3} + }, + "overall": "pass" +} +``` + +## Review axes + +See [`shared/checklists/decision.md`](../shared/checklists/decision.md). diff --git a/skills/pharaoh-deployment-diagram-draft/SKILL.md b/skills/pharaoh-deployment-diagram-draft/SKILL.md new file mode 100644 index 0000000..ad49f6a --- /dev/null +++ b/skills/pharaoh-deployment-diagram-draft/SKILL.md @@ -0,0 +1,82 @@ +--- +name: pharaoh-deployment-diagram-draft +description: Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). 
Typical ASPICE usage — SYS.3 System Architectural Design; essential for automotive HW/SW allocation per ISO 26262 Part 5 (HW) and Part 6 (SW). Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-deployment-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-deployment-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.deployment]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one deployment diagram. Captures **execution environment topology**: physical/virtual nodes, the software artefacts (components, containers, services) deployed on each, and the communication channels between nodes (CAN bus, Ethernet, IPC, HTTP, etc.). + +Typical ASPICE / ISO 26262 context: +- **SYS.3 System Architectural Design**: allocation of system elements to HW. +- **ISO 26262 Part 5 (Hardware level)**: mapping safety goals to HW elements. +- **ISO 26262 Part 6 (Software level)**: SW partitioning across ECUs with ASIL tagging. + +## Atomicity + +- (a) One deployment topology in → one diagram out. 
+- (b) Input: `{view_title: str, nodes: list[NodeSpec], artefacts: list[ArtefactSpec], deployments: list[DeploymentSpec], channels: list[ChannelSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `NodeSpec = {id: str, label: str, kind?: "device"|"ecu"|"server"|"cloud"|"container", stereotype?: str, asil?: "A"|"B"|"C"|"D"|"QM"}`, `ArtefactSpec = {id: str, label: str, kind?: "component"|"container"|"library"|"binary"|"config"}`, `DeploymentSpec = {node: str, artefact: str}`, `ChannelSpec = {from: str, to: str, label?: str, protocol?: str}`. Output: one RST directive block. +- (c) Reward: fixture — two ECUs (ECU_A ASIL B, ECU_B ASIL D), three artefacts, deployments mapping artefacts to ECUs, CAN bus channel between them. Scorer: + 1. Output starts with renderer directive. + 2. Every node rendered with cube/node shape. + 3. Every artefact rendered inside its deployed node. + 4. Every channel rendered with labeled arrow and protocol annotation. + 5. ASIL tag (when present) visible on node label or as stereotype. + 6. Deployment without a matching node/artefact → FAIL (dangling deployment). + + Pass = all 6. +- (d) Reusable across embedded / distributed / cloud projects; not automotive-specific (ASIL tag is optional). +- (e) One diagram per call. + +## Dangling relationships + +FAIL on `deployments.node` not in `nodes`, `deployments.artefact` not in `artefacts`, `channels.from`/`channels.to` not in `nodes`. + +## Output + +**PlantUML (richest deployment syntax):** +```rst +.. uml:: + :caption: + + @startuml + node "ECU_A\n<>" as ecuA { + artifact "sensor_driver" as art1 + artifact "can_stack" as art2 + } + node "ECU_B\n<>" as ecuB { + artifact "brake_controller" as art3 + } + ecuA ..> ecuB : CAN (500kbit/s) + @enduml +``` + +**Mermaid:** +```rst +.. 
mermaid:: + :caption: + + flowchart LR + subgraph ECU_A["ECU_A (ASIL B)"] + art1[sensor_driver] + art2[can_stack] + end + subgraph ECU_B["ECU_B (ASIL D)"] + art3[brake_controller] + end + ECU_A -.CAN 500kbit/s.-> ECU_B +``` + +## Non-goals + +- No electrical schematic — use dedicated EE tools for that. +- No real-time timing analysis on channels — a separate `pharaoh-timing-diagram-draft` could cover message schedules. +- No auto-derivation from HW description files (e.g. ARXML) — caller provides explicit node and deployment specs. diff --git a/skills/pharaoh-diagram-lint/SKILL.md b/skills/pharaoh-diagram-lint/SKILL.md new file mode 100644 index 0000000..26a5f60 --- /dev/null +++ b/skills/pharaoh-diagram-lint/SKILL.md @@ -0,0 +1,184 @@ +--- +name: pharaoh-diagram-lint +description: Use when running a terminal validation step over a directory of RST files to catch Mermaid / PlantUML parse failures that sphinx-build cannot detect. Extracts every `.. mermaid::` and `.. uml::` block and pipes it to the real renderer parser (mmdc / plantuml -checkonly). Returns structured findings. Does NOT modify the RST files. +--- + +# pharaoh-diagram-lint + +## When to use + +Invoke after a reverse-engineering or diagram-emission plan has written RST files containing Mermaid or PlantUML diagram blocks, as a terminal check before the plan's `pharaoh-quality-gate` consumes the results. `sphinx-build` does not validate diagram bodies at build time — it hands them to the browser renderer unchanged. A parse failure is therefore invisible in CI logs and surfaces only when a human opens the page. This skill is the parser in the validation loop. + +Do NOT invoke to modify diagrams (this skill is read-only). Do NOT invoke on a single RST file where you already hand-validate with `mmdc` — that workflow does not need an atomic skill. Do NOT use this skill to replace `pharaoh-quality-gate`; it is one of the checks the gate consumes. 
+ +## Why this skill exists + +Mermaid diagrams can pass `sphinx-build -nW --keep-going -b html` with zero warnings while rendering as `Syntax error in text` in the browser. Prose review of surrounding artefacts does not catch this because it has no Mermaid parser. Running every diagram through `@mermaid-js/mermaid-cli` (matching the version sphinxcontrib-mermaid pins) surfaces parse errors the sphinx build misses. + +Structural validation of RST (directive options, needs schema) is necessary but insufficient. Every artefact type with its own render pipeline needs its own parser in the validation loop. This skill is that parser for Mermaid and PlantUML. + +## Atomicity + +- (a) **Indivisible.** One directory in → one findings report out. No RST mutation. No diagram authoring. No scope outside Mermaid/PlantUML block parsing. +- (b) **Typed I/O.** + - Input: `{docs_dir: str, strictness: "fail_on_any" | "report_only", renderers?: list["mermaid" | "plantuml"], mermaid_cli?: str, plantuml_cli?: str, reporter_id: str, papyrus_workspace?: str}`. + - Output: `{findings: list[{file: str, line: int, renderer: "mermaid"|"plantuml", block_index: int, parser_exit_code: int, parser_stderr: str, severity: "error"|"warning"}], summary: {blocks_scanned: int, blocks_failed: int, renderers_covered: list[str]}, status: "pass" | "fail" | "degraded"}`. `degraded` = scanner ran but at least one renderer CLI was not installed; findings cover the renderers that WERE available. +- (c) **Execution-based reward.** Fixture `pharaoh-validation/fixtures/pharaoh-diagram-lint/`: + - `docs/good.rst` — two valid Mermaid blocks (one sequenceDiagram, one flowchart). + - `docs/bad_semicolon.rst` — one Mermaid sequence diagram with a `;` in a message label (prior dogfooding defect). + - `docs/bad_pipe.rst` — one Mermaid flowchart with an unescaped `|` in an edge label. + - `docs/bad_plantuml.rst` — one `.. uml::` block with an unterminated `@startuml`. 
+ - `docs/good.rst` must score zero findings; each `bad_*.rst` must produce at least one finding with `parser_exit_code != 0`. + - Scorer runs `pharaoh-diagram-lint` against `fixtures/pharaoh-diagram-lint/docs` with `strictness: report_only` and asserts: `summary.blocks_scanned == 5`, `summary.blocks_failed == 3`, one finding per bad file, `status == "fail"`. + - Idempotence: re-running on the same directory returns the same findings list (order stable by `file, line`). +- (d) **Reusable.** Any directory of RST. Not tied to Pharaoh pipelines — CI integrations, editor-in-the-loop lint, pre-commit hooks can use it. +- (e) **Composable.** `pharaoh-quality-gate` reads the `findings` and aggregates into its report under a `diagram_lint` section. The reverse-engineer-project template adds this skill as a dependency of `quality_gate`. + +## Input + +- `docs_dir` (required): absolute path to a directory. Scanner walks `**/*.rst` under it. +- `strictness` (required): `"fail_on_any"` returns `status: "fail"` if any finding has `severity: error`; `"report_only"` always returns the findings list with `status: "fail"` or `"pass"` based on findings but does not treat this as a skill failure. Plans wire this to `pharaoh.toml [pharaoh.quality_gate].strict`. +- `renderers` (optional): subset of `["mermaid", "plantuml"]`. Default: both. Useful for projects that emit one renderer only. +- `mermaid_cli` (optional): command name / path to the Mermaid CLI. Default `"mmdc"` (on `$PATH`). The skill uses whatever is resolved; no bundled tool. If unresolved and mermaid blocks are present, emit a `degraded` status + warning naming the installation command (`npm install -g @mermaid-js/mermaid-cli@11`). +- `plantuml_cli` (optional): command name / path to the PlantUML CLI. Default `"plantuml"`. Fallback installation command: `brew install plantuml` or `apt-get install plantuml`. +- `reporter_id` (required): short agent id, passed to `pharaoh-finding-record` calls. 
+- `papyrus_workspace` (optional): path to `.papyrus/` for recording findings as dedup-aware records. If absent, findings are only returned; not persisted. + +## Output + +Single JSON object. Example: + +```json +{ + "findings": [ + { + "file": "docs/source/spec/feature/jama.rst", + "line": 66, + "renderer": "mermaid", + "block_index": 2, + "parser_exit_code": 1, + "parser_stderr": "Error: Parse error on line 3: ... Expecting 'SOLID_ARROW'... got 'NEWLINE'", + "severity": "error" + } + ], + "summary": { + "blocks_scanned": 5, + "blocks_failed": 1, + "renderers_covered": ["mermaid", "plantuml"] + }, + "status": "fail" +} +``` + +`line` refers to the starting line of the `.. mermaid::` / `.. uml::` directive inside the RST file (where a human would look to fix it). `block_index` is the zero-indexed position of the block within the file (0 = first diagram in the file, 1 = second, etc.) to disambiguate when multiple blocks live in the same file. + +## Process + +### Step 1: Enumerate RST files under `docs_dir` + +Use the Glob tool to list `${docs_dir}/**/*.rst`. If empty, emit warning `"no RST files under docs_dir"` and return `{findings: [], summary: {blocks_scanned: 0, blocks_failed: 0, renderers_covered: []}, status: "pass"}`. + +### Step 2: Extract diagram blocks + +For each file, scan for directive openings. Recognise: + +- `.. mermaid::` — start of a Mermaid block. Body = subsequent lines indented by ≥ 3 spaces. +- `.. uml::` — start of a PlantUML block. +- `.. plantuml::` — alias for `.. uml::` (some projects use this spelling). + +A block ends at the first subsequent line that is either (a) non-blank and indented by < 3 spaces, or (b) end of file. Directive options (e.g. `:caption:`) between the opening line and the body are skipped (not part of the renderer input). + +Record for each block: `{file, start_line, renderer, block_index, body}` where `body` is the concatenation of body lines with the leading indent stripped. 
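The Step 2 extraction rules above can be sketched in Python. This is a minimal illustration, not the skill's implementation: the helper name `extract_blocks` is hypothetical, and option handling is simplified to options that precede the first body line.

```python
import re

DIRECTIVES = {
    ".. mermaid::": "mermaid",
    ".. uml::": "plantuml",
    ".. plantuml::": "plantuml",  # alias spelling
}

def extract_blocks(text, filename):
    """Collect diagram bodies from one RST file per the Step 2 rules (sketch)."""
    lines = text.splitlines()
    blocks, i = [], 0
    while i < len(lines):
        stripped = lines[i].strip()
        renderer = next(
            (r for d, r in DIRECTIVES.items() if stripped.startswith(d)), None
        )
        if renderer is None:
            i += 1
            continue
        start_line = i + 1  # 1-indexed: where a human would look to fix it
        i += 1
        body = []
        while i < len(lines):
            line = lines[i]
            indent = len(line) - len(line.lstrip())
            if line.strip() and indent < 3:
                break  # non-blank line indented < 3 spaces ends the block
            if not body and re.match(r"\s{3,}:[\w-]+:", line):
                i += 1  # directive option such as :caption: — not renderer input
                continue
            body.append(line[3:] if line.startswith("   ") else line.strip())
            i += 1
        blocks.append({
            "file": filename,
            "start_line": start_line,
            "renderer": renderer,
            "block_index": len(blocks),
            "body": "\n".join(body).strip("\n"),
        })
    return blocks
```
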
+ +### Step 3: Check CLI availability + +For each renderer whose blocks were found (AND requested by `renderers` input): + +- Run `<cli> --version` via Bash. Capture exit code. +- If non-zero, emit a degraded-status warning naming the renderer + install command. Skip parsing for this renderer (findings empty for it). + +### Step 4: Parse each block + +For each block whose renderer CLI is available: + +1. Write `body` to a temp file (`/tmp/pharaoh-diagram-lint-${pid}-${idx}.mmd` or `.puml`). +2. Invoke the parser: + - **Mermaid**: `<mermaid_cli> -i <tmp>.mmd -o <out>.svg`. mmdc 11.x requires an output path with a recognised extension (`.svg` / `.png` / `.pdf` / `.md` / `.markdown`); a sentinel like `/dev/null` is rejected with `Output file must end with ...`. Delete `<out>.svg` afterwards. + - **PlantUML**: `<plantuml_cli> -checkonly <tmp>.puml`. `-checkonly` parses without rendering. +3. Determine parse failure: + - **Mermaid** — mmdc 11.x returns exit code 0 even when the Mermaid parse fails inside puppeteer. Treat stderr as authoritative: if stderr contains any of `"Error:"`, `"Parse error"`, `"Expecting "`, or `"UnknownDiagramError"`, the block failed. Callers synthesise a non-zero `parser_exit_code` in the finding for consistency across renderers. + - **PlantUML** — `plantuml -checkonly` exits 200 on parse failure. Exit code alone is reliable. +4. On failure, emit a finding with the captured stderr (trimmed to the first 200 chars, after stripping mmdc's success noise such as `Generating single mermaid chart`). + +Each finding is: + +```json +{ + "file": "<path>", + "line": <int>, + "renderer": "<mermaid|plantuml>", + "block_index": <int>, + "parser_exit_code": <int>, + "parser_stderr": "<stderr excerpt>", + "severity": "error" +} +``` + +### Step 5: Aggregate and return + +Sort findings by `(file, line, block_index)` for stable output. + +Compute `summary`: +- `blocks_scanned`: total blocks extracted (across all renderers). +- `blocks_failed`: length of findings list. +- `renderers_covered`: renderers whose CLI was available and actually parsed at least one block.
+ +Compute `status`: +- `degraded` if any requested renderer was unavailable AND blocks of that renderer exist. +- `fail` if any finding has `severity: error` AND `strictness == "fail_on_any"`, OR if `strictness == "report_only"` AND findings list is non-empty. +- `pass` otherwise. + +### Step 6: Optional Papyrus persistence + +If `papyrus_workspace` is provided, for each finding invoke `pharaoh-finding-record` with: + +- `category`: `"diagram_parse_failure"` +- `subject_id`: `<file>:L<line>:B<block_index>` (deterministic id: re-running on the same broken diagram returns `"duplicate"`, so findings don't accumulate across runs) +- `body`: the stderr excerpt +- `reporter_id`: the input `reporter_id` +- `tags`: `["renderer:<renderer>", "origin:diagram-lint"]` + +Skip this step if `papyrus_workspace` is absent — in-memory return is sufficient for plans that do not use shared memory. + +## Failure modes + +| Condition | Response | +| ------------------------------------------------- | ------------------------------------------------------------ | +| `docs_dir` missing | FAIL: `"docs_dir does not exist"`. | +| No RST files under `docs_dir` | Return empty findings with warning; `status: pass`. | +| Mermaid CLI unresolved, mermaid blocks present | `status: degraded`; warning with install command; findings empty for mermaid. | +| PlantUML CLI unresolved, plantuml blocks present | `status: degraded`; warning with install command; findings empty for plantuml. | +| CLI reports parse failure (exit code OR stderr markers) | Emit finding. Continue with next block. mmdc uses stderr markers; plantuml uses exit code 200. | +| CLI hangs (> 30s) | Kill child process; emit finding with `parser_stderr: "timeout after 30s"`. | +| Temp-file write fails | Abort with FAIL naming the temp path. | + +## Non-goals + +- **No auto-fix.** The skill reports; it does not patch RST files. Fixing belongs to whatever emitted the broken diagram (usually a `pharaoh-*-diagram-draft` or `pharaoh-feat-*-extract` skill), or to a human.
+- **No render output.** `mmdc` can render PNG/SVG; we discard that. Rendering is sphinx-build's job at HTML build time. +- **No semantic linting.** This skill checks syntactic validity per the renderer parser. Style complaints ("this diagram has too many participants") belong in a future `pharaoh-diagram-review` skill. +- **No other renderers.** Graphviz (`.. graphviz::`), KaTeX (`:math:`), and others are out of scope. Extend the skill (new renderer entries in Step 2's recognition table) when their silent-failure mode becomes a concrete problem. + +## Advisory chain + +After emitting findings: + +- If `status == "fail"` and strictness is `"fail_on_any"`: downstream `pharaoh-quality-gate` should flip to red. Callers should not ship the documentation build. +- If `status == "degraded"`: the missing CLI is a local-environment gap. Install it and re-run before considering the lint report authoritative. +- If `status == "pass"`: does NOT guarantee every diagram is *good*. Semantic correctness (right messages in the right order, right participant set) is unverified — this skill catches syntactic defects only. + +## Composition + +- The `reverse-engineer-project.yaml.j2` template adds `pharaoh-diagram-lint` as a dependency of `pharaoh-quality-gate` (and the gate's input includes the findings). +- `pharaoh-quality-gate` SHOULD expose a `diagram_lint` section in its aggregated report summarising the findings count per renderer and the first 5 errors verbatim. +- A future `pharaoh-render-check` generalisation covering Graphviz, KaTeX, etc. can subsume this skill; until then this is the only parser-in-the-loop validator for Mermaid and PlantUML. 
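The Step 4 failure classification can be sketched as two pure helpers — one per renderer — with the subprocess invocation left to the caller. Function names are illustrative; the stderr markers and exit-code conventions are the ones documented in Step 4.

```python
MERMAID_STDERR_MARKERS = ("Error:", "Parse error", "Expecting ", "UnknownDiagramError")

def classify_mermaid(returncode, stderr):
    """mmdc 11.x may exit 0 on a parse failure; stderr markers are authoritative.
    Synthesise a non-zero exit code so findings are uniform across renderers."""
    failed = any(marker in stderr for marker in MERMAID_STDERR_MARKERS)
    exit_code = returncode if returncode != 0 else (1 if failed else 0)
    return failed, exit_code

def classify_plantuml(returncode, stderr):
    """plantuml -checkonly exits 200 on parse failure; the exit code is reliable."""
    return returncode != 0, returncode
```

A caller would feed these the `returncode` and `stderr` captured from `subprocess.run([...], capture_output=True, text=True, timeout=30)` on the temp file written in Step 4.
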
diff --git a/skills/pharaoh-diagram-review/SKILL.md b/skills/pharaoh-diagram-review/SKILL.md new file mode 100644 index 0000000..98c6c06 --- /dev/null +++ b/skills/pharaoh-diagram-review/SKILL.md @@ -0,0 +1,78 @@ +--- +name: pharaoh-diagram-review +description: Use when auditing a single diagram block (Mermaid or PlantUML) emitted by any diagram-emitting skill. Single review atom covering all diagram types — trace/caption/element-count/parser/required-elements checks plus LLM-judge axes for purpose clarity and granularity consistency. Per-type required-element checks dispatched based on `diagram_type` input. +chains_from: [pharaoh-feat-component-extract, pharaoh-feat-flow-extract, pharaoh-use-case-diagram-draft, pharaoh-sequence-diagram-draft, pharaoh-component-diagram-draft, pharaoh-class-diagram-draft, pharaoh-state-diagram-draft, pharaoh-activity-diagram-draft, pharaoh-block-diagram-draft, pharaoh-deployment-diagram-draft, pharaoh-fault-tree-diagram-draft] +chains_to: [] +--- + +# pharaoh-diagram-review + +## When to use + +Invoke after any diagram-emitting skill has produced a single diagram block. Part of the self-review invariant — every `*-diagram-draft` and `*-extract` skill chains into this review. + +One diagram per invocation. A plan emitting N diagrams invokes this skill N times. + +## Atomicity + +- (a) One diagram block + one checklist + one diagram_type in → one findings JSON out. No multi-diagram aggregation, no re-emission. +- (b) Input: `{diagram_block: <rst string>, diagram_type: <type>, parent_need_id: <need_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON with per-axis entries, mirroring `pharaoh-req-review` shape. +- (c) Reward: fixtures for each diagram_type — `passing-<type>.rst` + `failing-<type>.rst` with expected findings. Mechanized axes verified by grep / mmdc / plantuml; subjective axes spot-checked against golden JSON. +- (d) Reusable for every diagram-emitting skill regardless of renderer (mermaid / plantuml). +- (e) Read-only.
Does not re-emit or modify the diagram. + +## Input + +- `diagram_block`: the full RST directive (`.. mermaid::` or `.. uml::`) including options and body, as a single string. Must be the complete directive, not just the body. +- `diagram_type`: one of `use_case | sequence | component | class | state | activity | block | deployment | fault_tree | feat_component_extract | feat_flow_extract`. Determines which per-type required-elements check runs. +- `parent_need_id`: need_id of the artefact the diagram is attached to (feat, arch, comp_req). Used for `trace_to_parent` check. +- `checklist_path`: `shared/checklists/diagram.md`. Per-project additions loaded from `.pharaoh/project/checklists/diagram.md` if present. +- `tailoring_path`: `.pharaoh/project/` for renderer preference and element-count threshold. + +## Output + +```json +{ + "parent_need_id": "FEAT_jama_import", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "trace_to_parent": {"passed": true, "reason": "caption names FEAT_jama_import"}, + "caption_present": {"passed": true}, + "element_count_within_bounds": {"passed": true, "reason": "7 participants, limit 12"}, + "parser_clean": {"passed": true, "reason": "mmdc exit 0"}, + "required_elements_for_type": {"passed": true, "reason": "≥2 participants, ≥1 message"}, + "conditional_branches_marked": {"passed": true, "reason": "source has 2 branches; diagram uses 1 alt block"}, + "external_library_participant": {"passed": true, "reason": "requests imported and called; participant Requests present"}, + "returns_match_call_stack": {"passed": true, "reason": "4 returns, all terminate at prior caller or entrypoint"}, + "purpose_clarity": {"score": 3}, + "granularity_consistency": {"score": 3}, + "naming_clarity": {"score": 3} + }, + "overall": "pass" +} +``` + +Axes `conditional_branches_marked`, `external_library_participant`, and `returns_match_call_stack` apply only to the diagram types noted in 
[`shared/checklists/diagram.md`](../shared/checklists/diagram.md). When a diagram's `diagram_type` falls outside the applicable set (e.g. `class`, `state`, `deployment`), the corresponding axis entry is `{"passed": "n/a", "reason": "axis applies only to sequence diagrams"}` and does NOT contribute to `overall`. + +## Review axes + +See [`shared/checklists/diagram.md`](../shared/checklists/diagram.md) for the canonical axes. Per-type required-elements: + +| diagram_type | Required elements | +| ----------------- | -------------------------------------------------------------------- | +| use_case | ≥1 actor, 1 system boundary (`rectangle`/`package`), ≥1 use case | +| sequence | ≥2 participants, ≥1 message | +| component | ≥1 component node, ≥1 interface or arrow | +| class | ≥1 class node with at least one field OR one method | +| state | 1 initial pseudo-state, 1 final pseudo-state, ≥1 transition | +| activity | 1 start node, 1 end node, ≥1 action | +| block (BDD / IBD) | BDD: ≥1 block with `<<block>>` stereotype. IBD: ≥1 port, ≥1 connector | +| deployment | ≥1 node (physical), ≥1 artefact deployed | +| fault_tree | 1 top event, ≥1 gate (AND/OR), ≥1 basic event | +| feat_component_extract | ≥1 file node, ≥1 import arrow | +| feat_flow_extract | ≥1 participant, ≥1 call arrow | + +## Composition + +Invoked explicitly as a task in plans emitted by `pharaoh-write-plan`, directly after every diagram-emitting task. Coverage enforced by `pharaoh-self-review-coverage-check`.
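The per-type required-elements dispatch can be sketched as a pattern table keyed by `diagram_type`. The regexes below are illustrative assumptions for two types only — the canonical rules are the table above, not this sketch.

```python
import re

# Illustrative patterns; thresholds mirror the required-elements table.
REQUIRED_ELEMENTS = {
    "sequence": [
        (r"^\s*(participant|actor)\b", 2, "participants"),
        (r"(->>|-->>)", 1, "messages"),
    ],
    "activity": [
        (r"\bstart\b", 1, "start node"),
        (r"\b(stop|end)\b", 1, "end node"),
        (r":.+;", 1, "actions"),
    ],
}

def required_elements_for_type(diagram_type, body):
    """Return (passed, failures) for the required_elements_for_type axis."""
    failures = []
    for pattern, minimum, label in REQUIRED_ELEMENTS.get(diagram_type, []):
        hits = sum(bool(re.search(pattern, line)) for line in body.splitlines())
        if hits < minimum:
            failures.append(f"needs >= {minimum} {label}, found {hits}")
    return not failures, failures
```
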
diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-missing/README.md b/skills/pharaoh-diagram-review/fixtures/conditional-missing/README.md new file mode 100644 index 0000000..51892ba --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-missing/README.md @@ -0,0 +1,5 @@ +# conditional-missing + +Source function `submit_order` carries three conditional branches (`if` / `elif` / `else`) that produce observably different outputs — a `ValueError` on non-positive total, a `ValueError` on missing customer, and a successful acceptance path. The Mermaid sequence diagram presents these as an unconditional linear flow, omitting any `alt` / `opt` / `loop` block. + +Detection rule counts `ast.If` (plus `elif`/`else` siblings) in the named function: total >= 2 triggers the grep for `\b(alt|opt|loop|group)\b` against the diagram body. No token is found, so `conditional_branches_marked` fails with the branch count and the missing-marker evidence. The other two new axes are `n/a` (no external imports) or pass (returns go to prior callers). 
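The detection rule described above can be sketched with the `ast` module. Helper names are hypothetical; the branch-counting convention (each `if`/`elif` arm plus a trailing plain `else`) matches the "3 conditional branches" count the fixture expects.

```python
import ast
import re

BRANCH_MARKER = re.compile(r"^\s*(alt|opt|loop|group)\b", re.MULTILINE)

def count_branches(source, func_name):
    """Count if/elif arms plus trailing else arms inside one named function."""
    total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            for sub in ast.walk(node):
                if isinstance(sub, ast.If):
                    total += 1  # the if (or elif) arm itself
                    is_elif = len(sub.orelse) == 1 and isinstance(sub.orelse[0], ast.If)
                    if sub.orelse and not is_elif:
                        total += 1  # a plain else arm
    return total

def conditional_branches_marked(source, func_name, diagram_body):
    """Axis check: >= 2 branches in source requires a marker in the diagram."""
    branches = count_branches(source, func_name)
    if branches < 2:
        return "n/a", branches  # axis does not apply
    return bool(BRANCH_MARKER.search(diagram_body)), branches
```
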
diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-missing/expected-output.json b/skills/pharaoh-diagram-review/fixtures/conditional-missing/expected-output.json new file mode 100644 index 0000000..ce8377f --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-missing/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_order_submit", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "conditional_branches_marked": { + "passed": false, + "evidence": "source has 3 conditional branches (if/elif/else in submit_order); diagram body has no alt/opt/loop/group marker" + }, + "external_library_participant": {"passed": "n/a", "reason": "no non-stdlib imports called in submit_order"}, + "returns_match_call_stack": {"passed": true, "reason": "returns terminate at OrderService and CLI, both prior callers"} + }, + "overall": "fail", + "blockers": ["conditional_branches_marked"] +} diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-diagram.rst new file mode 100644 index 0000000..11e3c36 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-diagram.rst @@ -0,0 +1,13 @@ +.. 
mermaid:: + :caption: FEAT_order_submit — order submission flow + :source_doc: input-source.py + + sequenceDiagram + participant CLI + participant OrderService + participant DB + + CLI->>OrderService: submit_order(order) + OrderService->>DB: persist(order) + DB-->>OrderService: order_id + OrderService-->>CLI: OrderAccepted(order_id) diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-source.py b/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-source.py new file mode 100644 index 0000000..5d016e5 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-missing/input-source.py @@ -0,0 +1,11 @@ +"""Source cited by the diagram's :source_doc: option.""" + + +def submit_order(order, db): + if order.total <= 0: + raise ValueError("non-positive total") + elif order.customer_id is None: + raise ValueError("missing customer") + else: + order_id = db.persist(order) + return {"status": "accepted", "order_id": order_id} diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-present/README.md b/skills/pharaoh-diagram-review/fixtures/conditional-present/README.md new file mode 100644 index 0000000..0b0060c --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-present/README.md @@ -0,0 +1,5 @@ +# conditional-present + +Same three-branch source function as `conditional-missing/`, but the diagram is authored in PlantUML and wraps the branching behaviour in an `alt` block with two `else` clauses. The grep for `\b(alt|opt|loop|group)\b` at line start matches the `alt` token, so `conditional_branches_marked` passes. + +This fixture confirms the detection rule is renderer-agnostic — exact same grep catches the marker whether the diagram is Mermaid (`alt`/`opt`/`loop`) or PlantUML (`alt`/`opt`/`group`). 
diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-present/expected-output.json b/skills/pharaoh-diagram-review/fixtures/conditional-present/expected-output.json new file mode 100644 index 0000000..3827000 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-present/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_order_submit", + "diagram_type": "sequence", + "renderer": "plantuml", + "axes": { + "conditional_branches_marked": { + "passed": true, + "evidence": "source has 3 conditional branches (if/elif/else); diagram body contains alt block with 2 else clauses" + }, + "external_library_participant": {"passed": "n/a", "reason": "no non-stdlib imports called in submit_order"}, + "returns_match_call_stack": {"passed": true, "reason": "returns to OrderService and CLI are prior callers"} + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-present/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/conditional-present/input-diagram.rst new file mode 100644 index 0000000..65d9095 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-present/input-diagram.rst @@ -0,0 +1,20 @@ +.. 
uml:: + :caption: FEAT_order_submit — order submission flow + :source_doc: input-source.py + + @startuml + participant CLI + participant OrderService + participant DB + + CLI -> OrderService : submit_order(order) + alt order.total <= 0 + OrderService --> CLI : ValueError("non-positive total") + else order.customer_id is None + OrderService --> CLI : ValueError("missing customer") + else accepted path + OrderService -> DB : persist(order) + DB --> OrderService : order_id + OrderService --> CLI : OrderAccepted(order_id) + end + @enduml diff --git a/skills/pharaoh-diagram-review/fixtures/conditional-present/input-source.py b/skills/pharaoh-diagram-review/fixtures/conditional-present/input-source.py new file mode 100644 index 0000000..5d016e5 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/conditional-present/input-source.py @@ -0,0 +1,11 @@ +"""Source cited by the diagram's :source_doc: option.""" + + +def submit_order(order, db): + if order.total <= 0: + raise ValueError("non-positive total") + elif order.customer_id is None: + raise ValueError("missing customer") + else: + order_id = db.persist(order) + return {"status": "accepted", "order_id": order_id} diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-missing/README.md b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/README.md new file mode 100644 index 0000000..6fcb5e9 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/README.md @@ -0,0 +1,5 @@ +# external-lib-missing + +Source function `fetch_report` imports the third-party `requests` library and calls `requests.get(...)` inline. `requests` is not in `sys.stdlib_module_names`, so the detection rule classifies it as external. The Mermaid diagram only lists `CLI` and `WeatherService` as participants and hides the HTTP call inside `WeatherService` — the call is elided entirely, leaving only the `WeatherService -> CLI` return.
+ +The participant grep `^\s*participant\s+requests\b` finds no match, so `external_library_participant` fails with evidence naming the import site. `conditional_branches_marked` is `n/a` because the source function has no conditional branches; `returns_match_call_stack` passes because the only return goes to `CLI`, which is a prior caller. diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-missing/expected-output.json b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/expected-output.json new file mode 100644 index 0000000..0ea814e --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_weather_fetch", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "conditional_branches_marked": {"passed": "n/a", "reason": "source function has 0 conditional branches"}, + "external_library_participant": { + "passed": false, + "evidence": "requests imported and called at fetch_report:L7 (requests.get); absent from participant list" + }, + "returns_match_call_stack": {"passed": true, "reason": "single return to CLI; CLI is prior caller"} + }, + "overall": "fail", + "blockers": ["external_library_participant"] +} diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-diagram.rst new file mode 100644 index 0000000..cbfc1f0 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-diagram.rst @@ -0,0 +1,10 @@ +.. 
mermaid:: + :caption: FEAT_weather_fetch — fetch weather report + :source_doc: input-source.py + + sequenceDiagram + participant CLI + participant WeatherService + + CLI->>WeatherService: fetch_report(city) + WeatherService-->>CLI: report diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-source.py b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-source.py new file mode 100644 index 0000000..57d44f0 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-missing/input-source.py @@ -0,0 +1,9 @@ +"""Source cited by the diagram's :source_doc: option.""" + +import requests + + +def fetch_report(city): + response = requests.get(f"https://api.example.com/weather/{city}") + response.raise_for_status() + return response.json() diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-present/README.md b/skills/pharaoh-diagram-review/fixtures/external-lib-present/README.md new file mode 100644 index 0000000..a0fc124 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-present/README.md @@ -0,0 +1,5 @@ +# external-lib-present + +Same source file as `external-lib-missing/`, but the diagram declares `participant requests as Requests` — making the external HTTP dependency explicit in the interaction. The participant grep matches `^\s*participant\s+requests\b`, so `external_library_participant` passes. + +Note: the `as Requests` alias is the Mermaid / PlantUML display-name syntax; the participant identifier remains `requests`, which is what the grep keys on. Projects that prefer a display alias can set `tailoring.external_alias_map` to accept the alias instead, but the default detection uses the bare import name. 
diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-present/expected-output.json b/skills/pharaoh-diagram-review/fixtures/external-lib-present/expected-output.json new file mode 100644 index 0000000..88aac0b --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-present/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_weather_fetch", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "conditional_branches_marked": {"passed": "n/a", "reason": "source function has 0 conditional branches"}, + "external_library_participant": { + "passed": true, + "evidence": "requests imported and called; participant 'requests as Requests' present in diagram body" + }, + "returns_match_call_stack": {"passed": true, "reason": "returns to WeatherService and CLI are prior callers"} + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-diagram.rst new file mode 100644 index 0000000..540a9b1 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-diagram.rst @@ -0,0 +1,13 @@ +.. 
mermaid:: + :caption: FEAT_weather_fetch — fetch weather report + :source_doc: input-source.py + + sequenceDiagram + participant CLI + participant WeatherService + participant requests as Requests + + CLI->>WeatherService: fetch_report(city) + WeatherService->>requests: GET /weather/{city} + requests-->>WeatherService: response + WeatherService-->>CLI: report diff --git a/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-source.py b/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-source.py new file mode 100644 index 0000000..57d44f0 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/external-lib-present/input-source.py @@ -0,0 +1,9 @@ +"""Source cited by the diagram's :source_doc: option.""" + +import requests + + +def fetch_report(city): + response = requests.get(f"https://api.example.com/weather/{city}") + response.raise_for_status() + return response.json() diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/README.md b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/README.md new file mode 100644 index 0000000..5c40d13 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/README.md @@ -0,0 +1,9 @@ +# return-to-caller-correct + +Same source as `return-to-user-wrong/`, but the diagram legitimately declares `User` as the entrypoint actor AND `User` is the source of the first call (`User->>CLI`). The stack walk builds `[User, CLI, SettingsService]`; every return arrow pops to a prior caller on the stack: + +- `Store -->> SettingsService` pops to SettingsService (top of stack after Store call). +- `SettingsService -->> CLI` pops to CLI. +- `CLI -->> User` pops to User — the declared entrypoint, which issued the first call. + +All three returns terminate at a caller from the stack, so `returns_match_call_stack` passes. Contrast with `return-to-user-wrong/` where `User` was declared but never called, making the terminal return to `User` an invented target. 
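The stack walk behind `returns_match_call_stack` can be sketched as a single pass over the arrows. Regexes cover the Mermaid arrow forms used in these fixtures only; the function name and error wording are illustrative.

```python
import re

CALL = re.compile(r"^\s*(\w+)\s*->>?\s*(\w+)\s*:")     # A->>B: or A->B:
RETURN = re.compile(r"^\s*(\w+)\s*-->>?\s*(\w+)\s*:")  # A-->>B: or A-->B:

def returns_match_call_stack(diagram_body):
    """Walk arrows in order; every return must pop to the prior caller."""
    stack = []
    for line in diagram_body.splitlines():
        ret = RETURN.match(line)
        if ret:
            src, dst = ret.groups()
            if not stack or stack[-1] != dst:
                return False, (
                    f"return '{src}-->>{dst}' does not pop to a prior caller; "
                    f"stack was {stack}"
                )
            stack.pop()
            continue
        call = CALL.match(line)
        if call:
            stack.append(call.group(1))  # caller awaits this callee's return
    return True, "all returns terminate at a prior caller"
```

Because the first call pushes its caller, a `User` that issues the opening call becomes a legitimate final return target, while a declared-but-silent `User` does not — exactly the contrast between the two `return-to-*` fixtures.
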
diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/expected-output.json b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/expected-output.json new file mode 100644 index 0000000..0864968 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_settings_update", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "conditional_branches_marked": {"passed": "n/a", "reason": "source function has 0 conditional branches"}, + "external_library_participant": {"passed": "n/a", "reason": "no non-stdlib imports called in update_setting"}, + "returns_match_call_stack": { + "passed": true, + "evidence": "3 returns checked; each terminates at the participant that issued the preceding call (Store->SettingsService, SettingsService->CLI, CLI->User — User is the declared entrypoint and made the first call)" + } + }, + "overall": "pass", + "blockers": [] +} diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-diagram.rst new file mode 100644 index 0000000..836885e --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-diagram.rst @@ -0,0 +1,16 @@ +..
mermaid:: + :caption: FEAT_settings_update — update user setting from interactive shell + :source_doc: input-source.py + + sequenceDiagram + actor User + participant CLI + participant SettingsService + participant Store + + User->>CLI: type "set key value" + CLI->>SettingsService: update_setting(key, value) + SettingsService->>Store: write(key, value) + Store-->>SettingsService: ok + SettingsService-->>CLI: updated + CLI-->>User: OK diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-source.py b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-source.py new file mode 100644 index 0000000..f7bfa4d --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-caller-correct/input-source.py @@ -0,0 +1,6 @@ +"""Source cited by the diagram's :source_doc: option.""" + + +def update_setting(store, key, value): + store.write(key, value) + return "updated" diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/README.md b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/README.md new file mode 100644 index 0000000..6f64b4b --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/README.md @@ -0,0 +1,7 @@ +# return-to-user-wrong + +The call stack induced by the arrows is `[CLI, SettingsService]` (CLI called SettingsService, SettingsService called Store). The Store return pops down to SettingsService correctly. The final return `SettingsService -->> User` then terminates at `User` — a participant that was declared but never issued a call, and is NOT the declared entrypoint (first declared caller is `CLI`). + +This is the "invented User" failure mode: the diagram introduces a free-floating `User` participant and routes the terminal return there instead of back to `CLI`. The detection rule's stack walk records the mismatch and fails the axis with `return_from`, `return_to`, and the live stack in evidence. 
+ +Note: simply declaring `User` as a participant is not enough to make it a legitimate return target. The entrypoint exemption only applies when `User` is the FIRST caller in the diagram (see `return-to-caller-correct/`). diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/expected-output.json b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/expected-output.json new file mode 100644 index 0000000..6866973 --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/expected-output.json @@ -0,0 +1,15 @@ +{ + "parent_need_id": "FEAT_settings_update", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "conditional_branches_marked": {"passed": "n/a", "reason": "source function has 0 conditional branches"}, + "external_library_participant": {"passed": "n/a", "reason": "no non-stdlib imports called in update_setting"}, + "returns_match_call_stack": { + "passed": false, + "evidence": "return 'SettingsService-->>User' terminates at User; User never issued a call and is not the declared entrypoint (entrypoint = CLI). Stack at return: [CLI, SettingsService]" + } + }, + "overall": "fail", + "blockers": ["returns_match_call_stack"] +} diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-diagram.rst b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-diagram.rst new file mode 100644 index 0000000..0313d9d --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-diagram.rst @@ -0,0 +1,14 @@ +.. 
mermaid:: + :caption: FEAT_settings_update — update user setting from CLI + :source_doc: input-source.py + + sequenceDiagram + participant CLI + participant SettingsService + participant Store + participant User + + CLI->>SettingsService: update_setting(key, value) + SettingsService->>Store: write(key, value) + Store-->>SettingsService: ok + SettingsService-->>User: updated diff --git a/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-source.py b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-source.py new file mode 100644 index 0000000..f7bfa4d --- /dev/null +++ b/skills/pharaoh-diagram-review/fixtures/return-to-user-wrong/input-source.py @@ -0,0 +1,6 @@ +"""Source cited by the diagram's :source_doc: option.""" + + +def update_setting(store, key, value): + store.write(key, value) + return "updated" diff --git a/skills/pharaoh-dispatch-signal-check/SKILL.md b/skills/pharaoh-dispatch-signal-check/SKILL.md new file mode 100644 index 0000000..4262280 --- /dev/null +++ b/skills/pharaoh-dispatch-signal-check/SKILL.md @@ -0,0 +1,66 @@ +--- +name: pharaoh-dispatch-signal-check +description: Use when verifying that a plan's declared `execution_mode` matches observed subagent artefacts in `runs/`. Detects the "LLM-executor collapsed subagents into inline" failure class observed during dogfooding. One mechanical structural check. +--- + +# pharaoh-dispatch-signal-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan with `execution_mode: subagents` in any task. Compares declared mode against presence of per-task artefacts in `runs/`. Returns pass/fail + a list of tasks whose declared mode was not honoured. + +Do NOT use to enforce dispatch at runtime — that is `pharaoh-execute-plan`. This skill observes after the fact. + +## Atomicity + +- (a) Indivisible: one plan.yaml + one runs directory in → pass/fail + mismatch list out. No retry, no dispatch, no re-execution. 
+- (b) Input: `{plan_path: str, runs_path: str}`. Output: JSON `{passed: bool, mismatches: list[{task_id, declared, observed, evidence}]}`. +- (c) Reward: fixtures in `pharaoh-validation/fixtures/pharaoh-dispatch-signal-check/`: + 1. `match/`: plan declares `subagents` for two tasks, `runs/task_1/return.json` and `runs/task_2/return.json` both exist → matches `expected-match-pass.json` (`passed: true, mismatches: []`). + 2. `parallel-declared-inline-observed/`: plan declares `subagents`, runs only has `runs/aggregated.json` (no per-task files) → matches `expected-collapsed-fail.json` (`passed: false`, `mismatches` names the task, `observed: "inline"`). + 3. `inline-declared-parallel-observed/`: plan declares `inline`, runs has per-task files anyway → passed: true (over-dispatch is not a failure, only under-dispatch is). + 4. Idempotent. + + Pass = all 4. +- (d) Reusable by any plan-executing flow. +- (e) Read-only. + +## Input + +- `plan_path`: absolute path to `plan.yaml`. Accepts the full schema enum `execution_mode ∈ {inline, subagents, family-bundle, ask}` declared in `pharaoh-execute-plan/schema.md`. Default `ask` if omitted (per schema). This skill only enforces its detection rule on tasks with `execution_mode == subagents`; `inline`, `family-bundle`, and `ask` modes are skipped (no check). +- `runs_path`: absolute path to the plan's runs directory. Convention: `/.pharaoh/runs//`. + +## Output + +```json +{ + "passed": false, + "mismatches": [ + { + "task_id": "reqs_from_code", + "declared": "subagents", + "observed": "inline", + "evidence": "expected per-item return.json under runs/reqs_from_code/; found aggregated.json at runs/ root instead" + } + ] +} +``` + +## Detection rule + +For each task `T` in the plan with `execution_mode: subagents` and a non-empty `foreach`: + +- **Expected shape:** at least `len(foreach)` return artefacts under `//`. Canonical form: per-item subdirectories `task_1/`, `task_2/`, ..., each with a `return.json`.
+- **Collapse patterns that fail this check:** + 1. Only `//return.json` exists (single aggregated file under a task-named subdir, no per-item split). + 2. `/aggregated.json` exists at the runs root with no `/` subdirectory (flat collapse, as seen in an earlier dogfooding iteration). + 3. Any other pattern with fewer artefacts than `len(foreach)` items. +- The `evidence` field in a mismatch entry names the concrete pattern observed ("single return.json under task subdir", "aggregated.json at the runs root", "N artefacts found, expected M"). + +For each task `T` with `execution_mode ∈ {inline, family-bundle, ask}`: + +- No check. These modes have different (or user-resolved) dispatch semantics that this skill does not model; under-dispatch detection here would produce false positives. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `dispatch_signal_matches_plan: true`. Never called directly. diff --git a/skills/pharaoh-execute-plan/SKILL.md b/skills/pharaoh-execute-plan/SKILL.md new file mode 100644 index 0000000..eebed24 --- /dev/null +++ b/skills/pharaoh-execute-plan/SKILL.md @@ -0,0 +1,283 @@ +--- +name: pharaoh-execute-plan +description: Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. Generic — the plan is the orchestrator, this skill is the engine. +--- + +# pharaoh-execute-plan + +## Invariant: every `completed` task has output on disk + +A task marked `status: completed` MUST have its declared output present on disk — an artefact file at the path from Step 4.6 (emission tasks) or a `return.json` at `/runs/[/]/return.json` (check / review / gate tasks). Tasks that "completed inlined without output" do not exist. If a task cannot produce its output, mark it `failed` or `skipped` with a reason — never `completed`. 
Step 4.10 (`output_presence_audit`) enforces this before `report.yaml` is written; missing output rewrites the task status to `reporting_error` and the plan status to `failed` with reason `missing_task_output`. Skipping or collapsing per-foreach-instance `return.json` files into one summary defeats `pharaoh-self-review-coverage-check`, which reads the files directly. + +## When to use + +Invoke when you already have a plan.yaml and want to execute it. The plan carries everything the executor needs: task graph, skill references, input refs, validation rules, execution-mode defaults. This skill never authors plans (that is `pharaoh-write-plan`) and never reviews results (that is `pharaoh-quality-gate` or a human). + +Also: this skill replaces the prose-orchestration of old composition skills like `pharaoh-feats-from-project` and `pharaoh-reqs-from-module`. Those made the LLM execute a 12-step process by reading prose; this skill executes a DAG declared as data. If you find yourself reading a multi-step prose-orchestration skill, stop and look for a plan.yaml instead. + +## Atomicity + +- (a) **Indivisible.** One plan.yaml in, one report.yaml plus an artefacts directory out. No plan authoring. No review. No domain-specific behaviour. Adding a feature to the executor means extending the schema, not this skill. +- (b) **Typed I/O.** + - Input: `{plan_path: str, project_root: str, workspace_dir?: str, execution_mode_override?: "inline"|"subagents"}`. + - Output: `{status: "completed"|"aborted"|"partial"|"failed", report_path: str, artefacts_dir: str, failed_task_ids: list[str], reporting_errors: list}`. +- (c) **Execution-based reward.** Fixture in `pharaoh-validation/fixtures/execute-plan-smoke/` contains a 3-task plan using mock-emit skills (each returns a deterministic string given its input). After the executor runs: + 1. `report.yaml` exists and parses. + 2. All three tasks appear under `tasks:` with `status: completed`. + 3.
Artefact files exist under the workspace at the paths declared in the report. + 4. Ref resolution worked: task 2 and task 3 received task 1's output verbatim (captured by the mock). +- (d) **Reusable.** Any plan that conforms to `schema.md`. Forward-engineering, reverse-engineering, migration — all look alike to the executor. +- (e) **Composable.** Called by higher-level skills or directly by the user. The only skill it calls internally is `pharaoh-output-validate`; per-task skill dispatch is parameterised by `skill:` in the plan. + +## Input + +- `plan_path`: absolute path to plan.yaml. +- `project_root`: absolute path. Must match the plan's `project_root` (else fail with `project_root_mismatch`). +- `workspace_dir` (optional): absolute path. If omitted, resolved from plan's `workspace_dir` or default `/.pharaoh/runs/-/`. +- `execution_mode_override` (optional): overrides `defaults.execution_mode` in the plan. Individual tasks with their own `execution_mode:` still win over this override. + +## Output + +A mapping: + +```yaml +status: completed | aborted | partial | failed +report_path: +artefacts_dir: +failed_task_ids: [, ...] +``` + +`status`: +- `completed` — all tasks reached `completed` status. +- `partial` — some tasks `failed` or `skipped` under `on_fail: skip_dependents`; others ran to completion. +- `aborted` — an `on_fail: abort_plan` rule fired, or the plan itself was rejected at static validation. +- `failed` — Step 4.10's output-presence audit found a task marked `completed` with no output on disk; the report records reason `missing_task_output`. + +## Process + +### Step 1: Load and validate plan + +1. Read `plan_path`. Parse as YAML. On parse error → return `{status: "aborted", ...}` with the parse error recorded in `report.yaml`. +2. Validate against `schema.md`: + - Required top-level fields present. + - No unknown top-level fields. + - `version == 1`. + - Every task has required fields; no unknowns. + - Every `skill:` references a directory present under `/skills/` or `/skills/`. +3. Confirm `project_root` input matches plan's declared `project_root`. Mismatch → abort. +4. Resolve `workspace_dir`.
Create directory if missing. + +### Step 2: Static ref analysis + +Before any task runs, walk every task's `inputs`, `depends_on`, `foreach`: + +1. Parse each `${...}` ref. Syntax errors → abort plan; record which task and which field. +2. For each ref, resolve the producing task id. Unknown producer → abort. +3. For each ref using a helper, confirm the helper exists in the helper set declared in `schema.md`. Unknown helper → abort. +4. Build the dependency graph (explicit `depends_on` ∪ implicit deps from refs). +5. Detect cycles via DFS. Any cycle → abort; list the cycle in the report. +6. Validate `parallel_group` invariants: every group's members share the same `depends_on` set, no intra-group deps. +7. Warn (do not abort) on declared `outputs:` refs to fields not enumerated in the producer's `outputs:` map. Documentary only. + +Abort here means `status: "aborted"`, zero tasks executed, report written with the specific error. + +### Step 3: Topological order + +Produce a partial order: list-of-lists where each inner list is a "wave" — tasks with all upstream deps satisfied. Within a wave, tasks sharing a `parallel_group` are candidates for concurrent dispatch in subagents mode. + +Foreach-expanded tasks are expanded into concrete instances at this step: if `foreach: ${upstream}` produces N items, emit N logical tasks with ids `[0]`, …, `[N-1]`. Instance inputs are resolved per-iteration with `${item}` bound. + +### Step 3.5: Resolve execution mode (interactive for ambiguous foreach) + +Before Step 4 dispatches anything, walk every foreach-expanded task and determine its effective execution mode. Priority (first match wins): + +1. Executor was invoked with `execution_mode_override` → use the override. Skip prompting. +2. The task has an explicit `execution_mode:` field → use that value. +3. `plan.defaults.execution_mode` is a concrete mode (`inline`, `subagents`, `family-bundle`) → use the plan default. +4. Plan default is `ask` OR plan default is absent → GATE. 
Emit the prompt below, collect the user's answer, apply it to every expanded instance of this task. + +The gate fires at most once per foreach-originating task, not once per instance. Non-foreach tasks default to `inline` (no prompt); the gate exists specifically to prevent silent scope collapse on large fan-outs. + +**Prompt shape.** For each ambiguous foreach task (N instances), emit to the controller: + +``` +Task `` has foreach over `` and expanded to instances. +How should the executor dispatch them? + + [inline] Run instances sequentially in this conversation. + Cheapest. No cross-instance atomicity — the controlling + agent sees every instance's inputs and outputs. Good for + N ≤ 3 and deterministic skills. + + [subagents] Dispatch one subagent per instance. Full atomicity — each + subagent sees only its resolved inputs. Respects per- + instance caps (e.g. "5-7 comp_reqs per feat"). Expensive + at N > 20. + + [family-bundle] Group instances by a bundle key and dispatch one subagent + per bundle. Middle ground. Per-instance caps are NOT + enforced across the bundle — prior dogfooding confirmed + sibling instances leak into each other when one subagent + sees multiple foreach scopes at once. + +Choose one (inline | subagents | family-bundle): +``` + +If the user picks `family-bundle`, follow up with: + +``` +bundle_key (ref, e.g. `${item.feat_id}` or `${heuristics.(item.file)}`): +``` + +The user's answer is a valid ref per the schema's ref grammar. Validate it syntactically; on malformed ref, re-prompt once; on second failure, fall back to `subagents` mode and warn. 
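The Step 3.5 priority chain can be sketched as a small resolver. This is a minimal illustration, assuming dict-shaped task objects and plan defaults; the `prompt_user` callback is hypothetical and stands in for the interactive gate:

```python
def resolve_execution_mode(task, plan_defaults=None, override=None, prompt_user=None):
    """Resolve a foreach task's effective execution mode (Step 3.5 priority order)."""
    # 1. Executor-level override wins outright; no prompting.
    if override is not None:
        return override, "override"
    # 2. Explicit per-task execution_mode field.
    if task.get("execution_mode"):
        return task["execution_mode"], "task_level"
    # 3. Concrete plan default (inline / subagents / family-bundle).
    default = (plan_defaults or {}).get("execution_mode", "ask")
    if default in ("inline", "subagents", "family-bundle"):
        return default, "plan_default"
    # 4. Plan default is "ask" or absent -> gate. Prompt once per foreach task.
    if prompt_user is None:
        # Non-interactive caller: treat `ask` as an error per the skill text.
        raise RuntimeError("execution_mode_gate_cannot_prompt")
    return prompt_user(task["id"]), "user_prompt"
```

The returned `(mode, source)` pair maps directly onto the `resolved_mode` and `source` fields of the `execution_mode_decision` entry recorded in `report.yaml`.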
+ +**Recording.** Every gate decision lands in `report.yaml` under the task's entry: + +```yaml +tasks: + : + execution_mode_decision: + resolved_mode: inline | subagents | family-bundle + source: override | task_level | plan_default | user_prompt + bundle_key: # only when resolved_mode=family-bundle + prompted_at: # only when source=user_prompt +``` + +This makes the decision auditable — if the pilot review says "executor silently bundled", the report either proves or disproves it. + +**Non-interactive callers.** When the executor cannot accept a response (e.g. running under a CI harness), treat `ask` as an error: abort the plan with `status: aborted` and note `execution_mode_gate_cannot_prompt`. Callers that want unattended execution must set `defaults.execution_mode` to a concrete mode or pass `execution_mode_override`. + +### Step 4: Per-task execution loop + +For each wave in order: + +4.1. **Dispatch plan.** Per-task resolved execution mode (from Step 3.5) drives dispatch shape: + + - **inline**: the controlling agent executes each instance sequentially in-context. `parallel_group` is informational only. + - **subagents**: dispatch one subagent per task (or per foreach instance). Group members in the same `parallel_group` dispatch in one turn via parallel Task tool calls. + - **family-bundle**: evaluate the task's `bundle_key` for every foreach instance. Partition instances by resolved key. Dispatch one subagent per bundle; each subagent receives the family-bundle variant of `implementer-prompt.md` and runs the skill once per item in its bundle. Bundles sharing a `parallel_group` dispatch concurrently. + + Tasks without foreach with `family-bundle` configured are a schema error (caught at Step 2); at this point every family-bundle task is foreach-expanded. + +4.2. **Per task, resolve runtime refs.** Look up each input ref in the in-memory artefact store. 
If any ref is unresolvable (upstream failed/skipped), mark this task `blocked`, apply its `on_fail` policy, continue. + +4.3. **Render implementer prompt.** Use `implementer-prompt.md` as the template. Fill variables: + - `{skill_name}` — from task's `skill:` + - `{skill_body}` — full contents of `//SKILL.md`, minus frontmatter + - `{task_id}` — e.g. `map_files[3]` for foreach instance 3 + - `{task_inputs_yaml}` — the resolved input map as YAML + - `{expected_output_schema}` — task's `expected_output_schema` or "unspecified" + - `{project_root}`, `{workspace}` — absolute paths. + +4.4. **Dispatch.** + - `inline` mode: the controlling agent (the one running this skill) reads the rendered prompt and performs the atomic skill's process directly in-context. Record the output when done. + - `subagents` mode: invoke the Task tool with the rendered prompt as the subagent's whole brief. Capture the subagent's return message. + - `family-bundle` mode: render the family-bundle variant of `implementer-prompt.md` (one subagent covers all bundle items). Dispatch via Task tool. Capture the subagent's multi-output return (one artefact per bundle item, in the order the subagent was handed them). Validate each artefact independently per Step 4.5. + +4.5. **Validate output.** Run `pharaoh-output-validate` with: + - `output_text` = dispatched task's return value + - `target_schema` = `expected_output_schema` if set, else any `validation` rule targeting this task, else skip validation. + - `schema_context` = `{directive: ..., required_options: [...]}` when the schema is `rst_directive`; empty for other schemas. + - `strip_fences: true` + +4.6. **Handle validation result.** + - `valid: true` → persist the parsed/stripped artefact to `/artefacts/.` where `.ext` is `.rst` for directives, `.yaml` for yaml, `.txt` default. Mark task `completed`. Update in-memory artefact store with the task's output. 
+ - `valid: false` and retries remaining → increment retry counter, rebuild prompt with stricter preamble (see below), re-dispatch. + - `valid: false` and retries exhausted → apply the validation rule's `on_fail` policy. + +4.7. **Retry preamble.** When re-dispatching after a validation failure, prepend to the prompt: + +``` +STRICT OUTPUT REQUIRED. Your previous attempt failed validation with: + +Emit ONLY the artefact content expected by the target schema. No prose wrapper. No markdown fences. No typos in option keys. +``` + +4.8. **On_fail policies.** + - `retry` — already consumed; after exhaustion treat as `skip_dependents`. + - `skip_dependents` — mark this task `failed`. Mark every transitive dependent `skipped`. Continue with independent branches of the DAG. Final plan status becomes `partial`. + - `abort_plan` — mark this task `failed`. Stop dispatching. Emit report. Status `aborted`. + +4.9. **Parallel dispatch.** In subagents mode within a parallel_group, dispatch all tasks in the group in one message (multiple Task tool calls in one turn). Wait for all to return before moving to the next wave. Per-task validation and retry still happen; retry re-dispatches only the failing task, not the whole group. + +4.10. **`output_presence_audit` — run before Step 5.** For every task whose in-memory status is `completed`, verify that its declared output exists on disk AND is non-empty: + + - **Emission tasks** (skill emits RST directives, YAML, diagrams, etc.): the task's artefact file persisted in Step 4.6 (`/artefacts/.`, or `<...>/artefacts//.` for foreach). File must exist and `size > 0`. + - **Check / review / gate tasks** (skill emits JSON findings — `pharaoh-req-review`, `pharaoh-req-code-grounding-check`, `pharaoh-diagram-review`, `pharaoh-feat-review`, `pharaoh-diagram-lint`, `pharaoh-quality-gate`, `pharaoh-output-validate`, `pharaoh-self-review-coverage-check`, any future atom-check role): a `return.json` under `/runs/[/]/return.json`. 
File must exist, parse as JSON, and be a non-empty object. For foreach tasks, verify one `return.json` per foreach instance (count ≥ instance count from the expansion in Step 4.1). + - **Composition / plumbing tasks** (skill emits only in-memory data used by downstream refs — `pharaoh-id-allocate`, `pharaoh-feat-file-map`, `pharaoh-context-gather`): require a `return.json` at `/runs//return.json` capturing the in-memory output that downstream refs resolved. + + For each task that fails this audit: + 1. Rewrite its report status from `completed` to `reporting_error`. + 2. Append a `reporting_errors` entry naming the missing path. + 3. Mark the plan's overall status as `failed` with reason `missing_task_output` (overrides a previously-clean status; does NOT override `aborted`). + + The audit is mandatory. Skipping or collapsing it is the exact failure mode called out in the invariant at the top of this skill. + +### Step 5: Emit report + +After the loop terminates (completion, partial, or abort): + +1. Write `report.yaml` to `/report.yaml` per the schema in `schema.md#report-yaml`. +2. Include every task (completed, failed, skipped, blocked, `reporting_error`). +3. Include foreach instances under `foreach_instances:` for tasks that had foreach. +4. Include top-level `reporting_errors:` list if Step 4.10's audit caught anything; each entry is `{task_id, foreach_index?, expected_path, reason: "missing" | "empty" | "unparseable"}`. +5. Return `{status, report_path, artefacts_dir, failed_task_ids, reporting_errors}`. + +## Failure modes + +| Condition | Response | +| ------------------------------------------ | -------------------------------------------------------------- | +| Plan YAML invalid | status=aborted; report notes `plan_invalid: `. | +| Schema violation | status=aborted; report notes which rule failed. | +| project_root mismatch | status=aborted. | +| Unknown skill | status=aborted at static validation. | +| Cyclic dep | status=aborted; cycle printed. 
| +| Unresolvable ref at runtime | task=blocked; on_fail policy applies. | +| pharaoh-output-validate errors internally | Log, treat as validation failure (conservative). | +| Task dispatch returns empty | Treat as validation failure with error `empty_output`. | +| Subagent Task tool fails | Retry once; on second failure mark task failed. | +| `completed` task has no output on disk | Step 4.10 rewrites to `reporting_error`; plan status=`failed`. | + +## Worked example + +Plan (excerpt): + +```yaml +name: smoke +version: 1 +project_root: /tmp/fixture +tasks: + - id: feats + skill: pharaoh-feat-draft-from-docs + inputs: + docs_root: docs + outputs: + feats: list + expected_output_schema: rst_directive + - id: map + skill: pharaoh-feat-file-map + foreach: ${feats.feats} + inputs: + feat_id: ${item.id} + feat_title: ${item.title} + feat_body: ${item.body} + src_root: src + depends_on: [feats] + parallel_group: map_files +``` + +Execution trace (2 feats discovered): + +1. Wave 1: `feats` runs inline. Returns 2 directive blocks. Validated as rst_directive. Parsed `feats:` list cached to store. +2. Wave 2: `map` expands to `map[0]`, `map[1]`. Both share parallel_group `map_files`. In subagents mode: dispatched together in one turn. Each returns YAML; validated against yaml schema; persisted. +3. Report lists `feats: completed` and `map: completed` with `foreach_instances: [index:0 completed, index:1 completed]`. + +## Non-goals + +- Does not author plans. +- Does not choose `execution_mode` based on heuristics — that is the plan's business (via `defaults` or per-task override). +- Does not perform cross-plan dedup or impact analysis — those are separate skills. +- Does not log progress to stdout beyond the final return value; structured progress lives in report.yaml. + +## Relationship to deleted composition skills + +`pharaoh-feats-from-project` and `pharaoh-reqs-from-module` previously encoded orchestration in prose. 
They have been deleted in favour of this skill + `pharaoh-write-plan`. The domain heuristics those skills carried (split_strategy selection, preseed-before-reqs ordering, quality-gate wiring, id-allocate positioning) moved to `pharaoh-write-plan`'s plan-authoring logic. The executor itself is domain-free. diff --git a/skills/pharaoh-execute-plan/implementer-prompt.md b/skills/pharaoh-execute-plan/implementer-prompt.md new file mode 100644 index 0000000..989bc40 --- /dev/null +++ b/skills/pharaoh-execute-plan/implementer-prompt.md @@ -0,0 +1,149 @@ +# Implementer prompt template + +This file is the subagent-dispatch template used by `pharaoh-execute-plan`. The executor substitutes `{placeholders}` at dispatch time. Two variants exist: + +- **Single-task variant** (default): one subagent runs the skill once, for one foreach instance or one non-foreach task. Used for `execution_mode: subagents` or single-instance bundles in `family-bundle`. +- **Family-bundle variant**: one subagent runs the skill once per item in a bundle, in order. Used for `execution_mode: family-bundle` when a bundle contains >1 instance. + +Both variants share the variable list below; the family-bundle variant substitutes `{task_inputs_yaml}` with a list (one entry per bundle item) and uses a dedicated prompt body. + +Variables (all required unless marked optional): +- `{skill_name}` — the atomic skill being invoked, e.g. `pharaoh-feat-file-map`. +- `{skill_body}` — the full SKILL.md contents of that skill, minus the YAML frontmatter. +- `{task_id}` — this invocation's id, e.g. `map_files[3]` for foreach instance 3. For family-bundle, the id is `#bundle:` (e.g. `reqs_from_code#bundle:FEAT_csv_export`). +- `{task_inputs_yaml}` — single-task variant: the resolved input map as a YAML block. Family-bundle variant: a YAML list of input maps, one per item, in bundle order. +- `{expected_output_schema}` — a schema name recognised by `pharaoh-output-validate` (e.g. 
`rst_directive`, `codelinks_comment`, `yaml`), or the string `unspecified`. +- `{project_root}` — absolute path. +- `{workspace}` — absolute path to the run workspace. +- `{retry_preamble}` — optional. Present only on validation-failure retries. See `SKILL.md#retry-preamble`. +- `{bundle_key}` — optional. Only populated in the family-bundle variant. The resolved key value shared by every item in the bundle. + +``` +{retry_preamble} + +You are executing a single atomic task as part of a larger plan. Your job is to perform exactly the process specified in the skill below, with the exact inputs provided, and to emit exactly the output shape the skill declares. Nothing more, nothing less. + +## The skill you are invoking + +Skill name: {skill_name} + +Skill body (read this end-to-end before doing anything): + +{skill_body} + +## Your task in this plan + +Task id: {task_id} +Project root: {project_root} +Workspace (read/write here as the skill instructs): {workspace} + +## Your inputs (already resolved — no refs remain) + +```yaml +{task_inputs_yaml} +``` + +## Expected output schema + +{expected_output_schema} + +This is a hint for the executor's post-hoc validation. Your output must satisfy the schema the skill itself declares — the executor will run `pharaoh-output-validate` against your return value. + +## How to work + +1. Read the skill body end-to-end. +2. Confirm your inputs match the skill's documented `Input` section. If any required input is missing or malformed, STOP and return `BLOCKED` with `missing_input: `. +3. Apply the skill's documented process exactly. Do not skip steps. Do not add steps. +4. Keep the scope tight to this task. Do not invoke other skills except those the skill body explicitly tells you to. +5. Produce output in the exact shape the skill's `Output` section specifies. + +## Hard limits + +- No markdown fences around the output unless the skill explicitly requires them. +- No prose wrapper. No "Here's the output:" preamble. 
No trailing commentary. +- No invented fields. If the skill's output schema lists fields A, B, C, emit only those. +- No file writes outside `{workspace}` unless the skill body explicitly instructs otherwise. +- No calls to external systems unless the skill body explicitly instructs otherwise. + +## When to escalate + +Return one of these statuses instead of output: + +- `BLOCKED: ` — the task cannot be completed (missing input, contradiction in skill, tool unavailable). +- `NEEDS_CONTEXT: ` — you need information the task inputs did not provide. +- `DONE_WITH_CONCERNS: ` — you produced output but have substantive doubts about correctness. + +On any of those three, emit the status on line 1 and follow with free-form explanation. The executor will surface these to the controller for re-dispatch decisions. + +On successful completion, emit only the output the skill specifies. No status prefix. + +## Report format when DONE + +Emit only the artefact. The executor infers success from the absence of a status prefix and from validator pass. +``` + +## Family-bundle variant + +Used when the executor resolved `execution_mode: family-bundle` for a foreach task and the bundle contains more than one item. The subagent runs the skill once per bundle item, emitting one artefact per item in order. The executor validates each artefact independently. + +``` +{retry_preamble} + +You are executing a BUNDLE of atomic tasks. All tasks invoke the same skill and share the bundle key `{bundle_key}`. Run the skill once per item in the order given. Emit one artefact per item, separated by the bundle separator line `---ITEM---`. The executor will split on the separator and validate each artefact independently. 
+ +## The skill you are invoking (once per bundle item) + +Skill name: {skill_name} + +Skill body (read this end-to-end before doing anything): + +{skill_body} + +## Your bundle in this plan + +Bundle id: {task_id} +Bundle key: {bundle_key} +Project root: {project_root} +Workspace (read/write here as the skill instructs): {workspace} + +## Your bundle items (resolved inputs, one per item, ordered) + +```yaml +{task_inputs_yaml} +``` + +## Expected output schema (per item) + +{expected_output_schema} + +Each artefact in your response must satisfy this schema independently. The executor runs `pharaoh-output-validate` against each artefact separately after splitting on `---ITEM---`. + +## How to work + +1. Read the skill body end-to-end ONCE. +2. For each item in the bundle items list above, in order: + a. Confirm the item's inputs match the skill's documented `Input` section. If any required input is missing or malformed, emit `BLOCKED: missing_input: ` as THAT item's artefact and continue with the next item. + b. Apply the skill's documented process exactly for that item, treating it as if it were the only task you were given. Do not share observations between items. Do not aggregate results. Do not reuse IDs across items unless the skill explicitly tells you to. + c. Emit the item's artefact. + d. Emit the separator line `---ITEM---` on its own line. +3. After the last item, do NOT emit a trailing separator. + +## Atomicity notice + +Family-bundle is a cost compromise. Per-instance caps in the skill (e.g. "5-7 comp_reqs per feat") are NOT enforced across items because you see every item's context. You MUST still obey each per-instance cap as if each item ran in isolation. If you notice yourself blurring scope between items (e.g. writing a comp_req that references another item's feat), stop and restart that item's block. + +## Hard limits + +- No markdown fences around any artefact unless the skill explicitly requires them. +- No prose wrapper. 
No "Here's the outputs:" preamble. No trailing commentary. +- The only inter-item content is the literal separator `---ITEM---` on its own line. +- No invented fields per artefact. Each artefact independently respects the skill's `Output` section. + +## When to escalate + +Escalation statuses (`BLOCKED`, `NEEDS_CONTEXT`, `DONE_WITH_CONCERNS`) apply per item. Place the status as line 1 of that item's block (before the separator). Continue processing the remaining items rather than aborting the whole bundle. + +## Report format when DONE + +Emit the bundle as: artefact₁, separator, artefact₂, separator, …, artefactₙ. No prefix, no suffix, no commentary outside the artefact blocks. +``` diff --git a/skills/pharaoh-execute-plan/schema.md b/skills/pharaoh-execute-plan/schema.md new file mode 100644 index 0000000..ac1326f --- /dev/null +++ b/skills/pharaoh-execute-plan/schema.md @@ -0,0 +1,250 @@ +# plan.yaml schema + +This file specifies the plan.yaml contract between `pharaoh-write-plan` (producer) and `pharaoh-execute-plan` (consumer). Any other skill or human authoring a plan must conform to this schema. The executor rejects plans that violate it. + +Version: 1. + +## Top-level fields + +```yaml +name: # required. descriptive plan name, used in report and logs. +version: # required. currently the only supported value is 1. +project_root: # required. absolute path; all relative paths in inputs resolve against this. +workspace_dir: # optional. where the executor writes artefacts + report.yaml. default: /.pharaoh/runs/-/. +defaults: # optional. per-task defaults; each task may override. + execution_mode: # "inline" | "subagents" | "family-bundle" | "ask". default "ask". + retry_on_validation_fail: # default 1. applies per task; 0 disables retry. +tasks: # required. list of task objects, order-insignificant (DAG). + - +validation: # optional. list of post-hoc validation rules. + - +``` + +All field names are lowercase snake_case. Unknown top-level fields are rejected. 
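
For orientation, a minimal conforming plan is sketched below. Everything in it is illustrative: the task ids are made up, the input names must in reality match each skill's documented `Input` section, and the schema name in the validation rule is assumed to exist in `pharaoh-output-validate`.

```yaml
name: example-feat-pipeline
version: 1
project_root: /abs/path/to/project
defaults:
  execution_mode: subagents
  retry_on_validation_fail: 1
tasks:
  - id: draft_feats
    skill: pharaoh-feat-draft-from-docs
    inputs:
      project_root: ${project_root}
    expected_output_schema: json_obj
  - id: map_files
    skill: pharaoh-feat-file-map
    foreach: ${draft_feats.feats}
    inputs:
      feat_id: ${item.id}
      feat_title: ${item.title}
validation:
  - task_output: ${map_files.*}
    schema: yaml_map
    on_fail: skip_dependents
```

Note that `map_files` carries no explicit `depends_on`: the `${draft_feats.feats}` ref adds the implicit dependency.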
+ +## Task object + +```yaml +- id: # required. unique within plan. matches `^[a-z][a-z0-9_]*$`. + skill: # required. name of an atomic skill in pharaoh/skills/ or papyrus/skills/. + inputs: # required. map of skill-input-name → value or ref. + : + depends_on: [, ...] # optional. explicit dependencies. implicit deps via ${ref} are always added. + foreach: # optional. expands this task to N instances, one per item in the referenced list. + parallel_group: # optional. tasks sharing a group execute concurrently when execution_mode=subagents. + execution_mode: # optional. overrides defaults.execution_mode for this task. "inline" | "subagents" | "family-bundle" | "ask". + bundle_key: # required iff execution_mode == "family-bundle". ref-grammar expression evaluated per foreach instance; instances sharing the same evaluated key dispatch in one subagent. See Execution modes below. + retry_on_validation_fail: # optional. overrides defaults.retry_on_validation_fail. + expected_output_schema: # optional. named schema from pharaoh-output-validate (e.g. "rst_directive"). hint for executor; does NOT replace the validation block. + outputs: # optional. declares fields the task is expected to produce. used by the executor to bind parsed skill output into the artefact store; see Outputs binding below. + : +``` + +Unknown task-level fields are rejected. Either `inputs` is empty dict or a non-empty map; missing `inputs` is rejected (explicit is better than implicit). + +## Ref resolution (`${...}`) + +A string value that matches `^\$\{[^}]+\}$` is a reference. A value may ONLY be wholly a ref (no interpolation inside larger strings). Supported forms: + +| Form | Meaning | +| --------------------------------- | ------------------------------------------------------------------------------------------------- | +| `${task_id}` | Shorthand for `${task_id.output}` — the task's single-output default. | +| `${task_id.field}` | Field from a task's output mapping. 
| +| `${task_id.field.subfield}` | Nested field access. Max depth 4. Executor rejects deeper refs. | +| `${item}` | Only valid inside a `foreach`-expanded task. Current iteration's item. | +| `${item.field}` | Field of the current foreach item when the iterated value is a mapping. | +| `${workspace}` | Resolves to `workspace_dir`. | +| `${project_root}` | Resolves to top-level `project_root`. | +| `${heuristics.()}` | Pure helper function. `` is either a bare dotted ref path (e.g. `item.file`) or a double-quoted string literal (e.g. `"src/foo.py"`). Single argument only. No `${}` inside the parens. | +| `${task_id.field \| helper}` | Pipe a value through a helper (filter-style). Same helper set as above. Helper arguments use parens: `\| by_stem(item.stem)` or `\| by_stem("foo")`. | + +No other ref forms are permitted. Arithmetic, string concatenation, env-var lookups, shell interpolation — all rejected. If a consumer needs richer composition, add a dedicated helper. + +### Static ref validation + +Before any task runs, the executor walks every task's `inputs`, `depends_on`, and `foreach` fields and: + +1. Parses every ref syntactically. Syntax errors fail the plan. +2. Resolves the producing task. Missing producers fail the plan. +3. Checks the producing task's declared `outputs` map (if present) for the referenced field. Unknown fields are WARNINGs when `outputs` is present, not failures — the binding happens at runtime (see Outputs binding) and may include fields not declared in the map (documentary declaration is a hint, not an allowlist). +4. Builds a DAG from `depends_on` plus implicit deps from refs. Detects cycles; cycles fail the plan. + +Static validation runs fast (no skill dispatch) so ref bugs fail before a single LLM call. + +### Runtime ref resolution + +At dispatch time for task T: + +1. For each ref in T's inputs, look up the producing task's actual output from the in-memory artefact store. +2. 
If the producing task is a foreach-expanded task, the ref resolves to a list of all iterations' outputs (order matches foreach input order). +3. Apply helper if piped. +4. Substitute into input map. +5. If any ref is unresolvable at runtime (e.g., upstream task failed and executor still attempted dispatch due to a race), the task fails with `status=BLOCKED` and the error `unresolved_ref: `. + +## Outputs binding + +When a task completes, the executor takes the task's raw skill output and populates the in-memory artefact store under the task id. The binding rule depends on the emitted shape: + +1. **JSON object output** (either directly, or because `expected_output_schema: json_obj` / `yaml_map` succeeded and `pharaoh-output-validate` returned `parsed`): every top-level key of the parsed object becomes a field on the task record. A downstream `${task_id.field}` ref resolves to the value at that key. Example: `pharaoh-feat-draft-from-docs` emits `{"feats": [...]}`; `${draft_feats.feats}` resolves to the list. +2. **Plain-text output** (emitter returns raw text not matching a parseable schema, or `expected_output_schema` is omitted): the whole output is bound to the default field. `${task_id}` (shorthand) and `${task_id.output}` both resolve to the raw string. No sub-field access. +3. **Validation failure** (output did not parse as the declared `expected_output_schema`): no binding occurs; the task is retried per `retry_on_validation_fail`, then failed per the `validation` rule's `on_fail` policy. + +For a foreach task, binding runs per-instance. Cross-instance resolution follows the rules in [Foreach](#foreach): `${task_id.field}` with no index returns the list of per-instance field values, order-preserving. + +The task-level `outputs:` map is a documentary hint about the expected keys. 
The executor does NOT reject a skill output whose keys are a strict superset (extra keys are bound silently) or a strict subset (missing keys resolve as unavailable at runtime, raising `unresolved_ref: ` when a downstream task tries to read one). Plan authors who care about completeness enforce it via `pharaoh-output-validate` in a `validation:` rule. + +## Foreach + +```yaml +- id: map_files + skill: pharaoh-feat-file-map + foreach: ${draft_feats.feats} + inputs: + feat_id: ${item.id} + feat_title: ${item.title} +``` + +Semantics: +- `foreach` takes exactly one ref whose resolved value must be a list. +- Executor instantiates one logical task per item, with `${item}` bound to that item. +- Each instance gets a concrete id formed as `[]` (0-indexed). That ID is the key in the artefact store. +- When a downstream task references `${map_files.files}` (no index), it receives a list of all instances' `files` fields, order preserved. +- When a downstream task has its own `foreach: ${map_files}`, the downstream receives the list-of-outputs directly and iterates. +- A `parallel_group` on a foreach task applies to every instance: all N instances share the group and run concurrently. +- Foreach over an empty list produces zero instances. Downstream refs resolve to empty lists. This is not a failure. + +Nested foreach (foreach depending on a foreach's output) is permitted only when the downstream expands the flat list of upstream outputs, not when it tries to expand per-upstream-item. Exactly: + +```yaml +# OK: flat expansion over all feats × files. +- id: reqs + foreach: ${map_files.files_flat} # flat list of {feat_id, file} pairs. + inputs: + file_path: ${item.file} + +# NOT OK: forbidden — executor rejects at static validation. +- id: reqs_per_feat + foreach: ${map_files} # list of lists — would need 2-level iteration + inputs: + ... +``` + +The producer is expected to emit a flat list (e.g., `files_flat`) for cross-product iteration. This keeps executor logic simple. 
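
The static ref validation described above (parse refs, resolve producers, build the DAG, reject cycles) is mechanical. The sketch below is illustrative rather than the executor's actual implementation: it assumes the plan is already parsed into Python dicts, covers steps 1, 2, and 4, and skips the declared-outputs warning of step 3.

```python
import re
from collections import defaultdict

# A value is a ref only when it is WHOLLY a ref (no interpolation).
REF_RE = re.compile(r"^\$\{([^}]+)\}$")

def refs_in(value):
    """Yield inner ref paths found in a scalar, list, or mapping."""
    if isinstance(value, str):
        m = REF_RE.match(value)
        if m:
            yield m.group(1)
    elif isinstance(value, list):
        for v in value:
            yield from refs_in(v)
    elif isinstance(value, dict):
        for v in value.values():
            yield from refs_in(v)

def implicit_deps(task):
    """Producer task ids implied by ${...} refs in inputs + foreach."""
    sources = list(task.get("inputs", {}).values())
    if "foreach" in task:
        sources.append(task["foreach"])
    deps = set()
    for path in refs_in(sources):
        # Drop any "| helper" pipe, then take the head segment.
        head = path.split("|")[0].strip().split(".")[0]
        if head not in ("item", "workspace", "project_root", "heuristics"):
            deps.add(head)
    return deps

def check_plan(tasks):
    """Fail on unknown producers or cycles; return the dependency graph."""
    ids = {t["id"] for t in tasks}
    graph = {t["id"]: implicit_deps(t) | set(t.get("depends_on", []))
             for t in tasks}
    for tid, deps in graph.items():
        missing = deps - ids
        if missing:
            raise ValueError(f"{tid}: unknown producer(s) {sorted(missing)}")
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)
    def visit(n):                      # DFS colouring for cycle detection
        colour[n] = GREY
        for d in graph[n]:
            if colour[d] == GREY:
                raise ValueError(f"cycle through {d}")
            if colour[d] == WHITE:
                visit(d)
        colour[n] = BLACK
    for n in graph:
        if colour[n] == WHITE:
            visit(n)
    return graph
```

Because `implicit_deps` also walks `foreach`, a task iterating `${draft_feats.feats}` depends on `draft_feats` without any explicit `depends_on` entry.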
+ +## Helpers + +Executor ships with a closed set of pure helpers. No user-defined functions; no sandbox. Adding a helper is a schema-version bump. + +| Helper | Purpose | +| ----------------------------------------------- | ------------------------------------------------------------------------------------------------------ | +| `${heuristics.split_strategy()}` | Returns `"single" \| "sections" \| "top_level_symbols"` per the heuristic (LOC + marker regex). | +| `${list \| flatten}` | `[[a,b],[c]] → [a,b,c]`. For foreach-output lists whose items are themselves lists. | +| `${list \| to_papyrus_seeds}` | Maps a list of feat-directive objects to the seeds format `pharaoh-decision-record` expects. Each seed is `{canonical_name: , body: }`. | +| `${list \| to_files_flat}` | Denormalises a list of feat-file-map outputs (each a flat mapping `{feat_id, files, rationale, entry_point?, shared_with?}`) into a flat list `[{file: , feat_id: , stem: , parents: []}, ...]`. Files appearing under multiple feats (across foreach instances) are emitted once with a populated `parents` list. | +| `${list \| to_id_requests}` | Converts a flat file list (output of `to_files_flat`) to `pharaoh-id-allocate` request shape `[{stem, count: 3, prefix, type, parent_feat_id: parents[0]}, ...]`. Takes default count=3 per file; prefix inferred from the plan's declared type. | +| `${mapping \| by_stem()}` | Given an id-allocate output mapping `{stem: [id1, id2, ...], ...}` and a stem, returns the list of ids for that stem. Used to thread allocated ids into per-file req tasks. | +| `${list \| with_entry_point}` | Filters a list of feat-file-map outputs to only those having an `entry_point` field set. | +| `${list \| unique}` | Dedup preserving first occurrence. | +| `${mapping \| keys}` | Emits the keys of a mapping as a list. | + +Any ref using an unknown helper fails static validation. + +## Parallel group + +Tasks tagged with `parallel_group: ` execute concurrently in `subagents` mode. 
In `inline` mode, `parallel_group` is informational (noted in report.yaml, but tasks still run sequentially). Membership rules: + +- All tasks in a group must share the same `depends_on` set (dependency-consistent). +- A group may not contain a task that depends on another task in the same group (no intra-group deps). Executor rejects at static validation. +- Groups are unordered relative to each other except via `depends_on`. + +## Execution modes + +Four modes are supported. The mode determines how a task (or the N instances of a foreach task) are dispatched: + +| Mode | Semantics | Atomicity | Typical cost | +| ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------- | +| `inline` | The controlling agent (the one running `pharaoh-execute-plan`) performs the skill in-context, task by task, sequentially. For foreach tasks, all N instances run in the same context. | No cross-instance isolation. The same agent sees every instance's inputs and outputs; per-instance caps are self-enforced only. | Cheapest. One conversation. | +| `subagents` | Same as `subagents-per-task` — one subagent dispatched per task. For foreach tasks, N instances → N subagents. Group members in a `parallel_group` dispatch in one turn when possible. | Full per-instance atomicity. Each subagent sees only its resolved inputs. | Most expensive. N dispatches. | +| `family-bundle` | Requires `bundle_key`. For foreach tasks, the executor evaluates `bundle_key` per instance, groups instances by the resolved key, and dispatches one subagent per bundle. 
The subagent runs the skill once per item in its bundle, in order. | Per-bundle atomicity only. Within a bundle, the subagent sees all items' inputs; per-instance caps are NOT enforced across the bundle. | Middle ground. M subagents where M = unique keys. | +| `ask` | The executor stops at the first wave that contains ambiguous foreach tasks (no task-level `execution_mode`, no plan-level concrete default) and prompts the user to choose one of the concrete modes above. Selection is recorded in `report.yaml`. See `pharaoh-execute-plan/SKILL.md` Step 3.5 for the exact prompt text and response handling. | Depends on the chosen mode. | Depends on the chosen mode. | + +When `defaults.execution_mode` is absent, the default is `"ask"` — the executor WILL pause for user input on the first ambiguous foreach. Plan authors who want unattended execution set `defaults.execution_mode` to a concrete mode (e.g. `"subagents"`), or set `execution_mode:` per-task. + +`bundle_key` rules: + +- Only meaningful when `execution_mode: family-bundle`. Rejected (schema error) on any other mode. +- Accepts the ref grammar used elsewhere. Typical: `${item.feat_id}`, `${item.family}`, `${heuristics.(item.file)}`. +- Evaluated per foreach instance at dispatch time. Result must be a string or scalar; lists/mappings are rejected. +- Instances whose key evaluates to the same value are dispatched as one bundle, up to one subagent per bundle. +- Single-instance bundles behave identically to `subagents` mode for that bundle. +- Foreach-expanded task in `family-bundle` mode with `parallel_group` set: the group semantics apply across bundles (bundles within the group dispatch concurrently), not within a bundle. + +## Validation block + +```yaml +validation: + - task_output: + schema: # name from pharaoh-output-validate. + on_fail: # "retry" (default) | "skip_dependents" | "abort_plan" + - task_output: ${reqs.*} # wildcard expands to every foreach instance's output. 
+ schema: rst_directive +``` + +- Executor runs these rules after each task completes. +- `on_fail: retry` — re-dispatch the task with the validator's error appended to the prompt, up to `retry_on_validation_fail` times. +- `on_fail: skip_dependents` — record failure, mark the task's transitive dependents as SKIPPED, continue with other branches of the DAG. +- `on_fail: abort_plan` — record failure, halt the executor, emit report.yaml with `status: aborted`. + +If no validation rule targets a task's output, only the task's own `expected_output_schema` hint is applied (with default `on_fail: retry`). + +## Failure modes (plan-level) + +| Condition | Executor behaviour | +| --------------------------------------------------------------- | -------------------------------------------- | +| YAML parse error | Reject plan. No tasks run. | +| Unknown top-level field | Reject plan. | +| Task id duplicate | Reject plan. | +| Unknown skill (not in pharaoh/ or papyrus/ skills dir) | Reject plan. | +| Cyclic dependency | Reject plan. | +| Unresolvable ref at static validation | Reject plan. | +| foreach over a ref whose value at runtime is not a list | Fail that task as BLOCKED; on_fail policy applies. | +| Static validation warnings (documentary outputs mismatch) | Log, continue. | + +## Report.yaml (executor output) + +```yaml +plan_name: +plan_version: 1 +started_at: +finished_at: +status: completed | aborted | partial +tasks: + : + status: completed | failed | skipped | blocked + started_at: + finished_at: + execution_mode: inline | subagents + retries: + validation: + - schema: + result: pass | fail + errors: [, ...] + artefact_path: # relative to workspace_dir + foreach_instances: # present only for foreach tasks + - index: 0 + status: completed + artefact_path: ... +``` + +The report is the single authoritative record of a plan run. No other file format is persisted by the executor. + +## Versioning + +The schema version is currently 1. 
Breaking changes (removing fields, changing ref grammar, changing helper signatures) require bumping `version: 2` and supporting both in the executor for one transition period. Additive changes (new helpers, new optional task fields) keep `version: 1`. + +## Non-goals + +- No loops other than `foreach` (no `while`, no fixed-count `repeat`). +- No conditionals (`if`/`when`). Branch by emitting different task lists at plan-writing time. +- No dynamic re-planning inside the executor. If discovery should reshape the plan, the controlling agent re-invokes `pharaoh-write-plan` with enriched inputs. +- No shell-outs, file I/O, or env reads from ref syntax. Any data the plan needs must arrive via `project_root`, `workspace_dir`, or another task's output. diff --git a/skills/pharaoh-fault-tree-diagram-draft/SKILL.md b/skills/pharaoh-fault-tree-diagram-draft/SKILL.md new file mode 100644 index 0000000..69d10dc --- /dev/null +++ b/skills/pharaoh-fault-tree-diagram-draft/SKILL.md @@ -0,0 +1,97 @@ +--- +name: pharaoh-fault-tree-diagram-draft +description: Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). Typical ISO 26262 usage — Part 3 Hazard Analysis & Risk Assessment, and Part 5 supporting hardware architectural metrics. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-fault-tree-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-fault-tree-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.fault_tree]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). 
Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one fault tree. Captures top-down deductive decomposition of one **top hazard event** via logical gates (AND, OR, NOT, inhibit, priority-AND, exclusive-OR) into **intermediate events** and **basic events** with optional failure probabilities. + +Typical ISO 26262 context: +- **Part 3 HARA**: qualitative fault trees identifying top-level hazards. +- **Part 5 §9 Hardware architectural metrics (SPFM, LFM, PMHF)**: quantitative fault trees — each basic event has a failure rate λ; the tree propagates to the top event. +- **Safety case argumentation**: showing how a safety goal violation would have to occur, and what barriers prevent it. + +## Atomicity + +- (a) One top event in → one tree out. +- (b) Input: `{view_title: str, top_event: EventSpec, gates: list[GateSpec], basic_events: list[BasicEventSpec], edges: list[TreeEdgeSpec], project_root: str, show_probabilities?: bool, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `EventSpec = {id: str, label: str, probability?: float}`, `GateSpec = {id: str, kind: "AND"|"OR"|"NOT"|"INHIBIT"|"PAND"|"XOR", label?: str}`, `BasicEventSpec = {id: str, label: str, probability?: float, kind?: "hardware"|"software"|"human"|"environmental"}`, `TreeEdgeSpec = {from: str, to: str}` (tree edges go from parent to child, parent = top event or gate, child = gate or basic event). Output: one RST directive block. +- (c) Reward: fixture — top event "Unintended Acceleration", OR gate decomposing to [ECU software fault, sensor stuck signal], sensor fault AND-gated by [sensor failure, fallback disabled]. Scorer: + 1. 
Output starts with renderer directive.
+  2. Top event appears at the graph root (no incoming edges).
+  3. Every gate rendered with UML / FTA-standard shape: AND = flat-bottom D, OR = curved-bottom D, NOT = triangle with bar, etc. (Mermaid approximation: labeled diamond + annotation.)
+  4. Basic events rendered as circles (standard FTA notation) or leaves.
+  5. With `show_probabilities=true`, every event/basic-event with a `probability` shows it numerically; gates show computed result if all children have probabilities.
+  6. No basic event has outgoing edges (leaves).
+  7. Every non-leaf node has ≥1 outgoing edge (gates can't be childless).
+
+  Pass = all 7.
+- (d) Reusable across safety-critical domains (automotive, medical, aerospace, industrial).
+- (e) One tree per call.
+
+## Dangling edges / cycles
+
+- FAIL on edge endpoint not in `{top_event} ∪ gates ∪ basic_events`.
+- FAIL on cycle (fault trees are DAGs; a cycle means the model is wrong).
+- FAIL if the top event has an incoming edge (root must be the top).
+
+## Output
+
+**PlantUML (has dedicated FTA symbols via GraphViz DOT syntax embedded):**
+```rst
+.. uml::
+   :caption:
+
+   @startuml
+   skinparam defaultFontSize 11
+   rectangle "TOP:\nUnintended Acceleration\nλ=1e-8/h" as TOP
+   rectangle "OR" as G1
+   rectangle "AND" as G2
+   circle "Sensor stuck\nλ=5e-7/h" as BE1
+   circle "ECU SW fault\nλ=2e-8/h" as BE2
+   circle "Fallback disabled\nλ=1e-6/h" as BE3
+   TOP --> G1
+   G1 --> BE2
+   G1 --> G2
+   G2 --> BE1
+   G2 --> BE3
+   @enduml
+```
+
+**Mermaid (flowchart approximation — no native FTA gate shapes):**
+```rst
+.. mermaid::
+   :caption:
+
+   flowchart TD
+   TOP["TOP: Unintended Acceleration<br/>λ=1e-8/h"]
+   G1{{OR}}
+   G2{{AND}}
+   BE1(("Sensor stuck<br/>λ=5e-7/h"))
+   BE2(("ECU SW fault<br/>λ=2e-8/h"))
+   BE3(("Fallback disabled<br/>λ=1e-6/h"))
+   TOP --> G1
+   G1 --> BE2
+   G1 --> G2
+   G2 --> BE1
+   G2 --> BE3
+```
+
+## Interaction with `pharaoh-fmea`
+
+FMEA and FTA are complementary: FMEA is bottom-up (component → effect), FTA is top-down (hazard → component). Pharaoh already has `pharaoh-fmea` for FMEA entries. A future orchestrator (`pharaoh-hazard-analysis`) may pair the two: extract top hazards from FMEA entries with high RPN, then generate FTA per hazard. Out of scope here.
+
+## Non-goals
+
+- No cut-set minimization — quantitative FTA tools (e.g. FaultTree+, CAFTA) handle this; this skill just emits the tree structure.
+- No probability computation beyond trivial AND-of-independents / OR-of-independents — caller provides computed probabilities if needed.
+- No dynamic fault trees (Markov chains, repair rates) — static FT only.
+- No common-cause-failure (CCF) modeling — would need extra node kind; a future extension.
diff --git a/skills/pharaoh-feat-balance/SKILL.md b/skills/pharaoh-feat-balance/SKILL.md
new file mode 100644
index 0000000..573ede7
--- /dev/null
+++ b/skills/pharaoh-feat-balance/SKILL.md
@@ -0,0 +1,111 @@
+---
+name: pharaoh-feat-balance
+description: Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). Reports health and suggestions; does not mutate.
+---
+
+# pharaoh-feat-balance
+
+## When to use
+
+Invoke after a composition emits feats + comp_reqs, before running `pharaoh-quality-gate`, to catch granularity issues that quality-gate thresholds don't cover. Two typical failure shapes: a feat with an outsized CREQ count (the feat spans multiple capabilities and should be split) and a feat whose title contains a "utilities" / "misc" / "helpers" smell (two unrelated subcommands fused under one name).
This skill surfaces both. + +Do NOT use to reshape the feature set — it only reports. Caller acts on suggestions manually. + +## Atomicity + +- (a) Indivisible — one distribution in → one report out. No mutations. No parallel fan-out. +- (b) Input: `{distribution_path: str, thresholds?: {max_reqs_per_feat: int, min_reqs_per_feat: int, name_smell_patterns: list[str], redundancy_title_overlap_min: float, redundancy_count_tolerance: float}}`. Output: YAML report with summary, outliers, redundancy_candidates, overall_health. Defaults: max=15, min=3, name_smell=["utilities","helpers","misc","other","general"], title_overlap=0.5, count_tolerance=0.20. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-balance/input_distribution.yaml` modelled on a skewed-distribution example. Skill run against it with defaults produces output byte-exact matching `expected_output_skewed.yaml`: + - `FEAT_reqif_export` flagged `too_many` (19 > 15). + - `FEAT_jama_utilities` flagged `fused_subfeatures` (title matches smell pattern). + - `(FEAT_csv_export, FEAT_csv_import)` flagged as redundancy candidate (symmetric import/export with matching counts). + - `overall_health: "skewed"` (any flag = skewed). +- (d) Reusable on any feature catalogue. +- (e) Composable: never calls other skills. + +## Input + +- `distribution_path`: absolute path to a YAML file containing the feature distribution. Expected shape: + ```yaml + features: + - feat_id: FEAT_csv_export + title: "CSV Export" + reqs_count: 12 + - feat_id: FEAT_reqif_export + title: "ReqIF Export" + reqs_count: 19 + ... + ``` +- `thresholds` (optional): override defaults. Partial override supported (missing keys use defaults). 
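
The flagging and redundancy heuristics this skill applies (Steps 2 and 3 of the process, using the default thresholds from the atomicity contract) reduce to a few lines. The sketch below is illustrative, not the normative implementation; the feature titles and counts are hypothetical. Overlap is Jaccard over the token union, exactly as Step 3 defines it.

```python
import re

DEFAULTS = {
    "max_reqs_per_feat": 15,
    "min_reqs_per_feat": 3,
    "name_smell_patterns": ["utilities", "helpers", "misc", "other", "general"],
    "redundancy_title_overlap_min": 0.5,
    "redundancy_count_tolerance": 0.20,
}

def flags_for(feat, t=DEFAULTS):
    """Step 2: one flag string per triggered rule, possibly several."""
    flags = []
    if feat["reqs_count"] > t["max_reqs_per_feat"]:
        flags.append("too_many")
    if feat["reqs_count"] < t["min_reqs_per_feat"]:
        flags.append("too_few")
    title = feat["title"].lower()
    if any(p in title for p in t["name_smell_patterns"]):
        flags.append("fused_subfeatures")
    return flags

def _tokens(title):
    # Tokenize on whitespace/punctuation, lowercase, drop empties.
    return {tok for tok in re.split(r"[\W_]+", title.lower()) if tok}

def redundancy_candidates(features, t=DEFAULTS):
    """Step 3: pairs with high title overlap AND near-symmetric counts."""
    cands = []
    for i, a in enumerate(features):
        for b in features[i + 1:]:
            ta, tb = _tokens(a["title"]), _tokens(b["title"])
            overlap = len(ta & tb) / len(ta | tb)
            lo = min(a["reqs_count"], b["reqs_count"])
            hi = max(a["reqs_count"], b["reqs_count"])
            ratio = lo / hi if hi else 0.0
            if (overlap >= t["redundancy_title_overlap_min"]
                    and ratio >= 1 - t["redundancy_count_tolerance"]):
                cands.append(sorted([a["feat_id"], b["feat_id"]]))
    return cands
```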
+ +## Output + +```yaml +summary: + feat_count: + total_reqs: + mean_reqs_per_feat: + median: + min: + max: + stdev: + +outliers: + - feat_id: + reqs_count: + flag: too_many | too_few | fused_subfeatures + suggestion: + +redundancy_candidates: + - feats: [, ] + reason: + +overall_health: healthy | skewed | critical +``` + +## Process + +### Step 1: Load + compute summary + +Read `distribution_path` via `yaml.safe_load`. Compute `feat_count`, `total_reqs`, `mean`, `median`, `min`, `max`, `stdev` across `features[*].reqs_count`. Round mean/stdev to one decimal. + +### Step 2: Flag outliers + +For each feature: +- If `reqs_count > thresholds.max_reqs_per_feat` (default 15) → flag `too_many`. Suggestion: `"Consider splitting: reqs suggests the feature spans multiple distinct capabilities. Look for natural boundaries (e.g. )."` +- If `reqs_count < thresholds.min_reqs_per_feat` (default 3) → flag `too_few`. Suggestion: `"Feature has only req(s) — verify it's a distinct capability and not a stub. Consider merging into a parent feature if the scope is thin."` +- If feature title (lowercased) matches any `thresholds.name_smell_patterns` (default `["utilities","helpers","misc","other","general"]`) as a substring → flag `fused_subfeatures`. Suggestion: `"Feature title is a code smell — 'utilities' and similar names often lump unrelated capabilities. Consider splitting by the specific capabilities it includes."` + +One feature may carry multiple flags — emit one outlier entry per flag. + +### Step 3: Detect redundancy candidates + +For each pair of features `(A, B)` where A != B: +- Compute title-token overlap: tokenize both titles on whitespace/punctuation, lowercase; `overlap = len(common_tokens) / len(tokens_A ∪ tokens_B)`. +- Compute count ratio: `ratio = min(A.count, B.count) / max(A.count, B.count)`. 
+- If `overlap >= thresholds.redundancy_title_overlap_min` (default 0.5) AND `ratio >= (1 - thresholds.redundancy_count_tolerance)` (default tolerance 0.20 → ratio ≥ 0.80) → add to `redundancy_candidates`.
+
+Deduplicate by sorting the pair (`[A.id, B.id]` lexicographic).
+
+For each candidate, compose a reason: `"Same title-token overlap (<overlap:.0%>), symmetric counts (<A.count> vs <B.count>). Consider <merged_name>."` where `merged_name` replaces the differing tokens with a unifying term (e.g. `csv_export` + `csv_import` → `csv_exchange`).
+
+### Step 4: Determine overall_health
+
+- `healthy`: zero outliers AND zero redundancy_candidates.
+- `skewed`: at least one outlier OR redundancy_candidate.
+- `critical`: > 25% of features flagged (outliers only; redundancy does not count toward this).
+
+### Step 5: Return
+
+Return the YAML report.
+
+## Failure modes
+
+- `distribution_path` not readable → FAIL.
+- Distribution parses but has no `features` key or empty list → FAIL.
+
+## Non-goals
+
+- No mutation of the feature set.
+- No re-draft suggestions — this skill describes the shape problem, not the fix.
+- No cross-project comparison — one distribution per invocation.
diff --git a/skills/pharaoh-feat-component-extract/SKILL.md b/skills/pharaoh-feat-component-extract/SKILL.md
new file mode 100644
index 0000000..4321a00
--- /dev/null
+++ b/skills/pharaoh-feat-component-extract/SKILL.md
@@ -0,0 +1,156 @@
+---
+name: pharaoh-feat-component-extract
+description: Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. Walks import edges between the listed files and emits a Mermaid or PlantUML diagram whose output shape is compatible with pharaoh-component-diagram-draft. Does NOT hand-author nodes or edges; extraction is rule-based.
+--- + +# pharaoh-feat-component-extract + +## When to use + +Invoke after `pharaoh-feat-file-map` has produced a `{feat_id, files}` mapping, when you want a static architecture view of the feature showing which modules/classes compose it and how they depend on each other. The diagram output shape matches `pharaoh-component-diagram-draft` so downstream tooling (sphinx-needs rendering, diff review) treats auto-extracted diagrams identically to hand-authored ones. + +Do NOT use to draft a diagram from scratch when you already have explicit node+edge data — that is `pharaoh-component-diagram-draft`. Do NOT use to extract runtime flow — that is `pharaoh-feat-flow-extract`. + +## Tailoring awareness + +Shared tailoring rules: see `shared/diagram-tailoring.md`. Reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.component]` from the consumer project's `pharaoh.toml` for renderer choice and styling. Respects `on_missing_config` per the shared `check → propose → confirm` pattern. + +Safe-label rules: see `shared/diagram-safe-labels.md`. Node IDs derived from file paths MUST be aliased (path characters `/` and `.` are invalid in Mermaid / PlantUML identifier positions). Edge labels MUST be sanitised — call-labels like `foo(arg1; arg2)` become `foo(arg1, arg2)` before emit. A parse failure in the emitted block is invisible under `sphinx-build` and surfaces only at browser render time; `pharaoh-diagram-lint` (run as part of `pharaoh-quality-gate`) is the second guard. + +## Atomicity + +- (a) Indivisible — one feat + one file list in → one diagram RST block out. No multi-feat bundling. No mutation of source files. No req emission. +- (b) Input: `{feat_id: str, feat_title: str, files: list[str], project_root: str, src_root: str, renderer_override?: "mermaid"|"plantuml", include_external?: bool, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. Output: one RST directive block (`.. mermaid::` or `.. 
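
The aliasing and sanitisation rules above can be sketched as follows. Dropping the file extension before replacing separators is an assumption inferred from node ids like `commands_csv` in the Output example below, and the `;` to `,` rewrite follows the call-label example in the preceding paragraph; neither function is the skill's actual implementation.

```python
import re
from pathlib import PurePosixPath

def node_id(path: str) -> str:
    """Alias a file path to an identifier-safe node id for Mermaid/PlantUML.
    Assumes the extension is dropped first, then every character that is
    invalid in identifier position ('/' and '.' included) becomes '_'."""
    stem = str(PurePosixPath(path).with_suffix(""))
    return re.sub(r"[^A-Za-z0-9_]", "_", stem)

def safe_edge_label(label: str) -> str:
    """';' terminates statements in both renderers, so argument
    separators in call-style edge labels are rewritten to ','."""
    return label.replace(";", ",")
```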
uml::`) with caption `<feat_id> — component composition`. No surrounding prose.
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-component-extract/`:
+  - `input_feat.yaml` declares `feat_id: FEAT_csv_export`, `feat_title: "CSV Export"`, `files: [csv/export.py, csv/writer.py, commands/csv.py]`.
+  - `input_files/` contains three Python files with explicit imports: `commands/csv.py` imports `from csv.export import run_export`; `csv/export.py` imports `from csv.writer import CSVWriter`; `csv/writer.py` has no project-internal imports.
+  - Expected diagram at `expected_diagram.rst` has 3 nodes (one per file), 2 directed edges (`commands/csv.py → csv/export.py`, `csv/export.py → csv/writer.py`), no external nodes.
+
+  Scorer:
+  1. Output starts with the renderer directive.
+  2. All 3 nodes appear by label.
+  3. Both directed edges render with correct arrow syntax.
+  4. With default `include_external=false`, no nodes appear beyond the 3 in-scope files.
+  5. With `include_external=true`, imports resolving inside `src_root` but outside `files` render as ghost nodes (dashed outline, muted color, `<<external>>` stereotype); stdlib and third-party imports (e.g. `typer`, `pathlib`) are still ignored.
+  6. Output matches `pharaoh-component-diagram-draft` output shape (same directive, same caption format, same node/edge syntax).
+
+  Pass = all 6.
+- (d) Reusable for any language whose import graph the extractor supports. Python is the initial target; import detection is regex-based, so adding Rust or TypeScript is a configuration-table entry, not a rewrite.
+- (e) Composable: one feat per call. A plan emitted by `pharaoh-write-plan` may include a `foreach` task over feats that dispatches N instances (one per feat) in parallel via `pharaoh-execute-plan`. This skill never invokes other skills.
+
+## Input
+
+- `feat_id`: the feature's sphinx-needs ID, used as the diagram caption prefix.
+- `feat_title`: human-readable title, shown in the caption.
+- `files`: list of source file paths relative to `src_root`. These become the diagram's in-scope nodes.
+- `project_root`: absolute path, for `pharaoh.toml` tailoring lookup. +- `src_root`: absolute path, the import-graph resolution root. `files[*]` resolve as `<src_root>/<file>`. +- `renderer_override` (optional): per shared doc. +- `include_external` (optional): if `true`, imports that resolve outside `files` but inside `src_root` become ghost nodes. Imports resolving outside `src_root` entirely (stdlib, third-party) are ignored regardless. Default `false`. +- `on_missing_config` (optional): per shared doc. Default `"prompt"`. +- `papyrus_workspace` (optional): for consistent node labeling with other skills that reference the same files. +- `reporter_id`: short agent identifier. + +## Output + +**Mermaid (default):** +```rst +.. mermaid:: + :caption: FEAT_csv_export — component composition + + graph TD + commands_csv[commands/csv.py<br/>run_export] + csv_export[csv/export.py<br/>run_export] + csv_writer[csv/writer.py<br/>CSVWriter] + commands_csv --> csv_export + csv_export --> csv_writer +``` + +Node IDs (left-hand side of the bracket) are sanitized forms of the file path (replace `/` and `.` with `_`). Node labels show the file path plus the primary symbol (largest top-level def/class, or the one whose name matches feat title tokens). + +**PlantUML:** +```rst +.. uml:: + :caption: FEAT_csv_export — component composition + + @startuml + component "commands/csv.py\n(run_export)" as commands_csv + component "csv/export.py\n(run_export)" as csv_export + component "csv/writer.py\n(CSVWriter)" as csv_writer + commands_csv --> csv_export + csv_export --> csv_writer + @enduml +``` + +## Process + +### Step 1: Enumerate nodes + +For each file in `files`, read via absolute path (`<src_root>/<file>`). Parse top-level symbol declarations: + +- Python: `^class <Name>`, `^def <Name>`, `^async def <Name>`. +- Rust: `^(pub )?(fn|struct|enum|trait|impl) <Name>`. +- JS/TS: `^(export )?(function|class|const|let|var) <Name>`. +- Go: `^func (<Receiver>) <Name>` / `^type <Name>`. 
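The per-language symbol table above can be sketched as a small regex dispatcher. This is illustrative only — names like `SYMBOL_PATTERNS` and `top_level_symbols` are not part of the skill contract:

```python
import re

# Illustrative per-language table mirroring the bullet list above;
# each <Name> placeholder becomes a named capture group.
SYMBOL_PATTERNS = {
    "python": re.compile(r"^(?:async\s+def|def|class)\s+(?P<name>\w+)"),
    "rust": re.compile(r"^(?:pub\s+)?(?:fn|struct|enum|trait|impl)\s+(?P<name>\w+)"),
    "js": re.compile(r"^(?:export\s+)?(?:function|class|const|let|var)\s+(?P<name>\w+)"),
    "go": re.compile(r"^(?:func(?:\s*\([^)]*\))?|type)\s+(?P<name>\w+)"),
}

def top_level_symbols(source: str, language: str) -> list[str]:
    """Collect top-level symbol names in declaration order."""
    pattern = SYMBOL_PATTERNS[language]
    names = []
    for line in source.splitlines():
        match = pattern.match(line)
        if match:
            names.append(match.group("name"))
    return names
```

Because the patterns are anchored at column zero, indented (nested) declarations are skipped for free — which is exactly the "top-level" restriction Step 1 asks for.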
+
+Pick the primary symbol per file: longest body OR name matching `feat_title` tokens (case-insensitive substring match on any token). If ambiguous, pick the one defined earliest.
+
+Node label: `<file>\n(<primary_symbol>)` (Mermaid uses `<br/>`, PlantUML uses `\n`).
+Node ID: `<file>` with `/` → `_` and `.` → `_`, then stripped of the trailing extension marker (e.g. `_py`).
+
+### Step 2: Enumerate edges
+
+For each file, parse imports via language-specific regex:
+
+- Python: `^(?:from|import) (?P<module>[\w.]+)` (one named group covers both the `import x` and `from x import y` forms).
+- Rust: `^use (?P<module>[\w:]+)`.
+- JS/TS: `^import .* from ["'](?P<module>[^"']+)["']`.
+- Go: `^\s*"(?P<module>[\w/.]+)"` within an `import (...)` block.
+
+For each imported module, resolve to a file path:
+- Try `<src_root>/<module_path>.py` (replacing `.` with `/`).
+- Try `<src_root>/<module_path>/__init__.py`.
+- Try other language-appropriate conventions.
+
+If the resolved path is in `files`, emit an edge `<importer_file> → <resolved_file>`.
+
+If the resolved path is outside `files` but inside `src_root` AND `include_external=true`, add a ghost node `external::<module>` and emit the edge.
+
+If the import resolves outside `src_root` entirely (stdlib, third-party), drop silently.
+
+### Step 3: Emit diagram
+
+Resolve renderer per the shared doc's resolution order (`renderer_override` → `pharaoh.toml [pharaoh.diagrams].renderer` → default `mermaid`).
+
+Emit the diagram with direction `TD` (top-down, showing call depth). Caption: `<feat_id> — component composition`.
+
+For ghost nodes (when `include_external=true`), group them visually apart where the renderer supports it:
+- Mermaid: separate `subgraph External` block.
+- PlantUML: `package "external" { ... }` block.
+
+Ghost-node styling: dashed outline + muted color (specifics per renderer — consult `shared/diagram-tailoring.md` for the type_styles lookup).
+
+### Step 4: Return
+
+Single RST block. No prose before or after.
+
+## Failure modes
+
+- `files` empty → FAIL.
+- Any file in `files` unreadable → log + skip that file (do not abort unless all files unreadable). +- Cycles in the import graph → emit the diagram anyway (Mermaid/PlantUML handle cycles); log a note. +- Zero edges resolved inside `files` → emit nodes only, log a note ("no intra-scope edges detected — check files list or use include_external=true"). +- `include_external=true` AND zero in-scope edges AND zero external edges → still emit nodes only, log note. + +## Non-goals + +- No runtime / call-graph inference — that is `pharaoh-feat-flow-extract`. +- No type hierarchy — that is `pharaoh-class-diagram-draft` (hand-authored) or a future `pharaoh-feat-class-extract`. +- No transitive import resolution beyond one hop — depth > 1 explodes scope. +- No dead-code detection — every file in `files` is a node, whether imported or not. + +## Last step + +After emitting the artefact, invoke `pharaoh-diagram-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-diagram-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. 
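The safe-label transforms this skill requires (node-ID aliasing from file paths, edge-label sanitisation) reduce to two small functions. A sketch — the exact extension list and the `;` → `,` swap are taken from this spec's rules, not from the renderers' grammars:

```python
import re

def node_id(file_path: str) -> str:
    """Alias a file path into a renderer-safe identifier:
    '/' and '.' become '_', then the trailing extension marker is dropped."""
    ident = file_path.replace("/", "_").replace(".", "_")
    return re.sub(r"_(?:py|rs|ts|js|go)$", "", ident)

def edge_label(label: str) -> str:
    """Sanitise call-labels: ';' terminates a Mermaid statement,
    so it becomes ',' before emit (per shared/diagram-safe-labels.md)."""
    return label.replace(";", ",")
```

Running these before emit is cheap insurance: a bad identifier is invisible to `sphinx-build` and only fails at browser render time, as noted above.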
diff --git a/skills/pharaoh-feat-draft-from-docs/SKILL.md b/skills/pharaoh-feat-draft-from-docs/SKILL.md new file mode 100644 index 0000000..94af426 --- /dev/null +++ b/skills/pharaoh-feat-draft-from-docs/SKILL.md @@ -0,0 +1,187 @@ +--- +name: pharaoh-feat-draft-from-docs +description: Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. Does NOT read source code. Does NOT emit component requirements. Does NOT map features to files — that is `pharaoh-feat-file-map`. +--- + +# pharaoh-feat-draft-from-docs + +## When to use + +Invoke when a project has unstructured documentation (e.g. `docs/source/features/*.rst`, `README.md`, product overview pages) that describes user-facing capabilities in prose, and you need to extract those capabilities as sphinx-needs `feat` (or equivalent) directives. This is the first step of reverse-engineering a requirements model from an existing project: docs → features. The follow-up skill `pharaoh-feat-file-map` maps each emitted feature to source files; `pharaoh-req-from-code` then generates component requirements per-file from code. + +Do NOT use to draft features from scratch (that is `pharaoh-req-draft` with `target_level="feat"`). Do NOT use to emit reqs from code (that is `pharaoh-req-from-code`). Do NOT use to generate architecture diagrams (a separate future skill). + +## Tailoring awareness + +The emitted directive name and ID prefix come from the consumer project's `ubproject.toml` `[[needs.types]]` (or `.pharaoh/project/id-conventions.yaml` if present). The caller passes `target_level` — use it verbatim as the directive name. Do NOT hardcode `feat` as the only acceptable type. Projects may call their top-level artefact `story`, `capability`, `feature`, `use_case`, etc. 
+ +## Atomicity + +- (a) Indivisible — one invocation reads `doc_files` and emits N feature directives. No source-code reads. No file-mapping. No inter-feature dependency analysis. One artefact × one phase. +- (b) Input: `{doc_files: list[str], target_level: str, project_root: str, papyrus_workspace?: str, reporter_id: str, on_missing_config?: "fail"|"prompt"|"use_default"}`. Output: single JSON object `{"feats": [{"id", "title", "type", "body", "source_doc", "raw_rst"}, ...]}`. The `raw_rst` field of each feat is the full RST directive block; downstream skills that want raw RST read it from there. On `on_missing_config="prompt"` with `target_level` undeclared → single JSON object `{status: "needs_confirmation", proposal: ...}`. +- (c) Reward: deterministic fixture — a 2-file doc tree with known feature vocabulary (e.g. `features/csv.rst` mentioning "CSV import" and "CSV export"; `features/jama.rst` mentioning "Jama pull" and "Jama push"). After skill runs, scorer checks: + 1. Every emitted block uses `target_level` as the directive name. + 2. Every emitted block has a `:id:` option. + 3. Every emitted ID prefix equals the ID prefix resolved from the project's tailoring (see Output). + 4. Every emitted block contains a `:source_doc:` option pointing to one of the `doc_files` paths. + 5. For each fixture doc paragraph marked as "must_yield_feat" in the fixture metadata, at least one emitted block's title or body mentions the paragraph's canonical vocabulary (substring match, case-insensitive). + 6. At least 1 feat is emitted (non-empty output). + + Pass = all 6 checks pass. +- (d) Reusable: any reverse-engineering workflow on projects with existing prose docs; migration from README-only to sphinx-needs; extracting features from product specs. +- (e) Composable: strictly one phase (docs → feat directives). Never invokes `pharaoh-req-from-code`, `pharaoh-arch-draft`, or `pharaoh-feat-file-map`. 
A plan emitted by `pharaoh-write-plan` composes this skill with `pharaoh-feat-file-map` and downstream req-emission tasks — not vice versa. + +## Input + +- `doc_files`: list of absolute paths to documentation files to read. Typically `.rst`, `.md`, or `.txt`. At least one must be provided. Files are read but not modified. +- `target_level`: directive name for the emitted features. Must match a `[[needs.types]].directive` in the consumer project's `ubproject.toml` (e.g. `"feat"`, `"story"`, `"capability"`). The emitted directive uses this name verbatim. +- `project_root`: absolute path to the consumer project's root, used to resolve the ID prefix from `ubproject.toml` (`[[needs.types]]` entry whose `directive` equals `target_level`). If the `ubproject.toml` does not declare a prefix for `target_level`, fall back to `<target_level>__` (double-underscore convention). +- `papyrus_workspace` (optional): path to `.papyrus/` directory for canonical-term coordination with concurrent agents. If omitted, the skill operates in no-memory mode. +- `reporter_id`: short identifier for this agent (e.g. `feat-draft-from-docs:features`). Passed to `pharaoh-decision-record` calls. +- `granularity` (optional): `"doc" | "top_section" | "manual_hint"`. Default `"doc"`. Controls decomposition of each doc file into feats: + - `"doc"` — one feat per input doc file. Simplest; right for "one topic per doc" layouts. Current default and the shape that has been stable since initial dogfooding. + - `"top_section"` — split each doc at its top-level headings (RST: title underlined with `===`; Markdown: a line starting with a single `#`). Emit one feat per top-level section. Right for docs that cover multiple capabilities under one roof (e.g. a ReqIF connector doc that covers both export and import — granularity `top_section` produces `FEAT_reqif_export` + `FEAT_reqif_import` instead of a single fused `FEAT_reqif_exchange`). 
+ - `"manual_hint"` — look for explicit split markers inside each doc: `.. feat-split::` comment-directive (RST) or `<!-- feat-split -->` marker (Markdown). Emit one feat per segment separated by those markers. Caller-controlled; useful when prose organisation does not match the desired feat boundary. +- `on_missing_config` (optional): `"fail" | "prompt" | "use_default"`. Default `"prompt"`. Determines behavior when `target_level` is not declared in `ubproject.toml`. See shared `check → propose → confirm` pattern in `shared/diagram-tailoring.md` (same semantics, different subject matter). + +## Output + +A single JSON object with one top-level key `feats` (list of feat objects). One feat object per emitted feature. Shape: + +```json +{ + "feats": [ + { + "id": "<id_prefix><snake_case_id>", + "title": "<short_title>", + "type": "<target_level>", + "body": "<one-sentence feature statement in user-facing language>", + "source_doc": "<relative_path_to_doc_file>", + "raw_rst": ".. <target_level>:: <short_title>\n :id: ...\n :status: draft\n :source_doc: ...\n\n <body>\n" + } + ] +} +``` + +The `raw_rst` field MUST be exactly the directive block as it would appear if pasted into an RST file. Downstream skills (e.g. `pharaoh-req-review`, `pharaoh-feat-review`) read `raw_rst` when they need the directive text; helpers that consume `feats` (e.g. `to_papyrus_seeds`) read `id`, `title`, `body`. + +`<id_prefix>` resolution: +1. Read `<project_root>/ubproject.toml`. +2. Find the `[[needs.types]]` entry whose `directive` equals `target_level`. +3. If it has a `prefix` field, use that verbatim (e.g. `prefix = "FEAT_"` → `FEAT_csv_import`). +4. Otherwise use `<target_level>__` (e.g. `feat__csv_import`). + +`<snake_case_id>` is derived from the feature's short_title (lowercase, spaces → underscores, non-alphanumeric stripped). + +`source_doc` — relative path (from `project_root`) to the doc file this feature was derived from. This is a Pharaoh convention for provenance. 
`pharaoh-bootstrap` declares `source_doc` under `[[needs.extra_options]]` by default so sphinx-needs does not warn under `-nW`; callers who opted out of the default must declare it manually or accept the warnings. Downstream skills (`pharaoh-feat-file-map`, plans emitted by `pharaoh-write-plan`) read this to group features by source doc. + +The output is one JSON object — no surrounding prose, no concatenated RST outside the JSON. + +## Output schema + +Validated as `json_obj` by `pharaoh-output-validate`. Validator checks: +1. Top-level is a JSON object with exactly one required key `feats` (list). +2. Every `feats[*]` has the keys `id`, `title`, `type`, `body`, `source_doc`, `raw_rst`. +3. `feats[*].type` equals input `target_level` (default `feat`). +4. `feats[*].source_doc` references a path present in the input `doc_files` list. +5. `feats[*].raw_rst` matches the RST directive Stage 1 + Stage 2 regex from `pharaoh-req-from-code` `## Output schema`, with directive name = `feats[*].type` and `:id:` / `:status:` / `:source_doc:` options present. +6. `feats[*].id` matches the resolved `<id_prefix><snake_case_id>` pattern. + +## Process + +### Step 1: MANDATORY — query Papyrus for canonical feature names (if workspace provided) + +For each feature concept you identify in the docs, query `pharaoh-context-gather` with a semantic description ("the capability that exports needs to CSV"). If a canonical feature name already exists, reuse it verbatim. This prevents drift when the same doc is re-processed or when multiple docs describe overlapping capabilities. + +Skip this step if `papyrus_workspace` is not provided (no-memory mode). + +### Step 2: Read all doc_files + +Read every file in `doc_files`. Concatenate into working memory. Identify user-facing capability boundaries: + +- Section headers often signal capability boundaries ("## Import from ReqIF", "## Export to Jama"). 
+- Imperative verbs describing what users can do with the product ("You can import …", "Users can export …"). +- Top-level bullet lists in README "Features" sections. +- sphinx-design cards with short capability labels. + +Ignore: +- Installation/setup instructions. +- Contributing guidelines. +- License text. +- Changelog entries. + +### Step 3: Resolve ID prefix from tailoring (with check → propose → confirm) + +Read `<project_root>/ubproject.toml`. Find the `[[needs.types]]` entry matching `target_level`. Extract its `prefix` field. Three resolution paths: + +1. **Type declared, prefix present** → use the declared prefix. Proceed. +2. **Type declared, prefix absent** → use `<target_level>__` silently (this is a minor-enough gap to default). +3. **Type NOT declared**, OR `ubproject.toml` missing entirely → branch on `on_missing_config`: + - `"fail"` → FAIL with: `"target_level=<value> not declared in <project_root>/ubproject.toml. Run pharaoh-bootstrap first, or pass on_missing_config='prompt' to negotiate."` + - `"prompt"` (default) → emit a `needs_confirmation` proposal: + ```json + { + "status": "needs_confirmation", + "proposal": { + "target_level": "<value>", + "proposed_prefix": "<uppercase value>_", + "rationale": "target_level is not declared as a type in ubproject.toml. Propose adding it so downstream skills have a stable type.", + "tailoring_patch": { + "target_file": "ubproject.toml", + "table": "[[needs.types]]", + "entry": {"directive": "<value>", "title": "<Title Case value>", "prefix": "<uppercase>_"} + } + } + } + ``` + Return without emitting features. The caller confirms, runs `pharaoh-tailor-fill` or edits manually, then re-invokes with `on_missing_config="use_default"`. + - `"use_default"` → synthesize defaults silently: treat `target_level` as declared with prefix `<target_level>__`. Proceed. + +### Step 4: Record newly surfaced canonical feature names in Papyrus + +Only if `papyrus_workspace` is provided. 
For each feature concept you will emit that was NOT returned by Step 1, invoke `pharaoh-decision-record` with: + +- `type`: `"fact"` +- `canonical_name`: the short_title you chose for this feature (space-separated, Title Case — e.g. `"CSV Import"`) +- `body`: one sentence describing the capability +- `reporter_id`: your `reporter_id` input +- `tags`: `["origin:feat-draft-from-docs", "doc:<doc_basename>"]` + +If `pharaoh-decision-record` returns `"duplicate"`, re-query and adopt the existing canonical name. + +### Step 5: Emit feature directives + +The set of emitted capabilities depends on `granularity`: + +- `"doc"` (default): one feat per input doc file. If a doc covers multiple topics, pick the dominant theme for the title/body and rely on downstream skills (e.g. `pharaoh-feat-balance`) to flag under-decomposition. +- `"top_section"`: enumerate top-level headings across all input docs. For RST, a top-level heading is a line followed by a line of `=` characters of matching length. For Markdown, a top-level heading is a line starting with `# ` (single hash, not `##`). Each top-level section becomes one feat; the section's prose is the body source. A doc with no top-level headings falls back to one feat for the whole doc (same as `"doc"`). +- `"manual_hint"`: scan each doc for split markers. RST: lines of form `.. feat-split::` (optionally followed by a feat title on the same line, e.g. `.. feat-split:: CSV Import`). Markdown: lines exactly matching `<!-- feat-split -->` or `<!-- feat-split: Title -->`. Segments between markers (and the implicit segment before the first marker, if any) each become one feat. A doc with zero markers falls back to one feat for the whole doc. + +Emit one block per the Output shape per resolved capability. Target: 3-15 features total across all `doc_files`. Fewer than 3 suggests under-decomposition (lumping); more than 15 suggests over-decomposition (every button becomes a feature). 
If you hit these bounds, log a warning and proceed anyway — the eval will flag it.
+
+Body text must be user-facing: "The system shall import needs from ReqIF files", not "The `from_reqif` command parses XML via lxml". Implementation detail belongs in `comp_req`, not `feat`.
+
+### Step 6: Return
+
+Emit one JSON object `{"feats": [...]}` per the Output shape. For each emitted capability build the per-feat mapping with `id`, `title`, `type` (= `target_level`), `body`, `source_doc`, and `raw_rst` (the literal RST block that would render the directive). Nothing else on stdout — no prose wrapper, no fenced code block.
+
+## No-memory mode
+
+If `papyrus_workspace` is absent, skip Steps 1 and 4. Proceed directly to 2, 3, 5, 6.
+
+## Failure modes
+
+- `doc_files` empty → FAIL: "At least one doc file required."
+- Any file in `doc_files` unreadable → log and skip that file; do not abort unless all files are unreadable.
+- `ubproject.toml` missing or `target_level` undeclared → handled per Step 3's `on_missing_config` branch (hard FAIL only when `on_missing_config="fail"`).
+- `pharaoh-context-gather` / `pharaoh-decision-record` errors → log and proceed as if no match (never abort on memory-layer issues).
+
+## Last step
+
+After emitting the artefact, invoke `pharaoh-feat-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-feat-regenerate` if available, or by re-invoking this skill with the findings as input).
+
+See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`.
+ +## Composition + +A plan emitted by `pharaoh-write-plan` calls this skill once with the full `doc_files` list in the initial wave, then a foreach task dispatches `pharaoh-feat-file-map` once per emitted feature to produce the feat→files mapping. diff --git a/skills/pharaoh-feat-file-map/SKILL.md b/skills/pharaoh-feat-file-map/SKILL.md new file mode 100644 index 0000000..b565e35 --- /dev/null +++ b/skills/pharaoh-feat-file-map/SKILL.md @@ -0,0 +1,172 @@ +--- +name: pharaoh-feat-file-map +description: Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. Reads the source tree, returns a YAML entry `{feat_id: {files: [...], rationale: "..."}}`. Does NOT read docs. Does NOT emit reqs. Does NOT create or modify source files. +--- + +# pharaoh-feat-file-map + +## When to use + +Invoke after `pharaoh-feat-draft-from-docs` has emitted one or more feature directives, when you need to know which source files implement each feature. The emitted mapping feeds downstream `pharaoh-req-from-code` tasks (one invocation per file, with `parent_feat_ids` set from this mapping), producing `comp_req` directives that link back to the parent feature via `:satisfies:`. + +One invocation handles exactly one feature. To map N features, a plan emitted by `pharaoh-write-plan` uses a `foreach` task over feats to dispatch N instances concurrently. + +Do NOT use to draft features (that is `pharaoh-feat-draft-from-docs`). Do NOT use to emit reqs (that is `pharaoh-req-from-code`). Do NOT modify source files (that is a future bidirectional-trace skill). + +## Tailoring awareness + +This skill does not emit RST directives, so it is type-agnostic. It does, however, respect the consumer project's source layout: if `pharaoh.toml` or `ubproject.toml` declares a `[pharaoh.codelinks]` section or a sphinx-codelinks `source_discover.src_dir`, the skill uses that as the default `src_root`. Otherwise the caller must pass `src_root` explicitly. 
+ +## Atomicity + +- (a) Indivisible — one feature in → one YAML entry out. No RST emit. No other feature analysis. One artefact × one phase. +- (b) Input: `{feat_id: str, feat_title: str, feat_body: str, src_root: str, file_glob?: str, exclude_glob?: list[str], papyrus_workspace?: str, reporter_id: str}`. Output: a single YAML object in FLAT shape `{feat_id: <str>, files: [<relative_path>, ...], rationale: "<one-sentence explanation>", entry_point?: <mapping>, shared_with?: [<feat_id>]}`. No wrapping prose, no outer `{<feat_id>: ...}` key — the `feat_id` lives as a sibling scalar alongside `files` and `rationale` so downstream aggregation over foreach results is a trivial list-of-mappings, not a merge of single-key mappings. +- (c) Reward: deterministic fixture — a 5-file source tree where 3 files clearly implement feature "FEAT_csv_export" (e.g. `csv/export.py`, `csv/writer.py`, `commands/csv.py`) and 2 are unrelated (`jama/client.py`, `reqif/parser.py`). After skill runs, scorer checks: + 1. Output is valid YAML parseable by PyYAML. + 2. Output has top-level keys including `feat_id` (equal to input `feat_id`), `files` (list), `rationale` (string). + 3. No other top-level keys are present beyond the optional `entry_point` and `shared_with`. + 4. Every path in `files` exists under `src_root`. + 5. Precision: of emitted files, ≥80% are in the fixture's ground-truth positive set. + 6. Recall: of the fixture's ground-truth positive files, ≥60% are in emitted `files`. + + Precision and recall targets are deliberately asymmetric — we accept more false positives than false negatives because downstream `pharaoh-req-from-code` can tolerate an extra file (just produces an extra req the human can delete), but missing a file means a behavior gets no requirement at all. +- (d) Reusable: any reverse-engineering workflow; impact analysis ("which files does this feature touch?"); rough component boundary detection. +- (e) Composable: one feature per call. 
A plan emitted by `pharaoh-write-plan` dispatches N instances via `foreach` when multiple feats exist. This skill never calls `pharaoh-feat-draft-from-docs` or `pharaoh-req-from-code`. + +## Input + +- `feat_id`: the feature's sphinx-needs ID (e.g. `"FEAT_csv_export"` or `"feat__csv_export"`). Used verbatim as the YAML key. +- `feat_title`: the feature's short title (e.g. `"CSV Export"`). Used for semantic reasoning about file relevance. +- `feat_body`: the feature's one-sentence statement (e.g. `"The system shall export needs to CSV files."`). Used for semantic reasoning. +- `src_root`: absolute path to the source tree to scan. All emitted file paths are relative to this root. +- `file_glob` (optional): glob pattern for candidate files. Default: `"**/*"` minus common excludes (see `exclude_glob`). Callers for a Python project may pass `"**/*.py"`; for a polyglot project, a combined pattern. +- `exclude_glob` (optional): list of glob patterns to exclude. Default: `["**/__pycache__/**", "**/.git/**", "**/node_modules/**", "**/*.pyc", "**/tests/**", "**/test_*.py", "**/*_test.py"]`. Tests are excluded by default because they describe verification, not implementation; a separate skill can map tests to features if needed. +- `papyrus_workspace` (optional): path to `.papyrus/` directory. If provided, the skill queries for prior knowledge about which files implement which concepts (enables cross-run consistency). +- `reporter_id`: short identifier for this agent (e.g. `feat-file-map:FEAT_csv_export`). + +## Output + +A single YAML document, no prose wrapper: + +```yaml +feat_id: FEAT_csv_export +files: + - csv/export.py + - csv/writer.py + - commands/csv.py +rationale: "Export pipeline: export.py orchestrates, writer.py serializes rows, commands/csv.py registers the CLI entrypoint." +``` + +Top-level keys: `feat_id` (equal to input), `files`, `rationale`. 
Optional top-level keys: + +- `shared_with: list[feat_id]` — populated by the orchestrator when the same file serves multiple features (see below). +- `entry_point: {file: str, symbol: str}` — names the file + symbol where feature flow begins (typically a CLI command, HTTP route, test entry, event handler). Downstream `pharaoh-feat-flow-extract` reads this to know where to start the call-chain walk. Leave absent when no single entry point applies (e.g. the feature is a pure data model with no orchestrating function). + +`files` is a list of strings (each a path relative to `src_root`) and `rationale` is a one-sentence string explaining why these files were chosen. + +Example with entry_point (recommended when one clearly exists): + +```yaml +feat_id: FEAT_csv_export +files: + - csv/export.py + - csv/writer.py + - commands/csv.py +rationale: "Export pipeline from CLI through the writer." +entry_point: + file: commands/csv.py + symbol: export +``` + +When a file implements behavior across multiple features (e.g. `commands/reqif.py` serves both ReqIF import and export), the `to_files_flat` helper in a plan emitted by `pharaoh-write-plan` detects this by seeing the same path appear under multiple feat entries (each entry a flat mapping produced by one `foreach` instance of this skill). It denormalises so the file appears once with `parents: [<feat_ids>]` listing all parents. Example (two instances from different foreach iterations): + +```yaml +# instance 1 (feat: FEAT_reqif_export) +feat_id: FEAT_reqif_export +files: + - reqif/needs2reqif.py + - commands/reqif.py +rationale: "..." + +# instance 2 (feat: FEAT_reqif_import) +feat_id: FEAT_reqif_import +files: + - reqif/reqif2needs.py + - commands/reqif.py # shared with FEAT_reqif_export +rationale: "..." +``` + +This atomic skill emits one entry at a time; cross-entry consolidation happens in the plan via `to_files_flat`, not in this skill. 
+ +If no files match, emit: + +```yaml +feat_id: <input feat_id> +files: [] +rationale: "No source files matched this feature — check whether the feature is implemented in src_root or whether file_glob/exclude_glob are too restrictive." +``` + +Empty `files` is a valid output; the orchestrator decides whether to surface it as a warning. + +## Output schema + +Output must parse as a YAML document via `yaml.safe_load`. Validator checks: +1. Parsed root is a mapping with required keys `feat_id` (string equal to input `feat_id`), `files` (list of strings), `rationale` (non-empty string). +2. Optional top-level keys `shared_with` (list of strings) and `entry_point` (mapping with required `file: str` and `symbol: str`) are permitted; no other top-level keys accepted. +3. Every entry in `files` is a non-empty string. +4. `rationale` is a non-empty string. + +## Process + +### Step 1: Query Papyrus for prior file associations (if workspace provided) + +Query `pharaoh-context-gather` with `feat_title + " " + feat_body` against `papyrus_workspace`. If any prior memories link this feature (or a canonically-equivalent one) to specific files, bias toward those files in Step 3. If not, proceed. + +### Step 2: Enumerate candidate files + +Apply `file_glob` under `src_root`, then filter out everything matching `exclude_glob`. Read the resulting list of candidate files. + +### Step 3: Score each candidate for relevance to the feature + +For each candidate, read the first ~200 lines (or full file if smaller). Reason about relevance: + +- Strong positive signals: file name matches feature keywords (e.g. `csv_export.py` for CSV export); top-level function/class names use feature keywords; docstrings mention the feature's capability. +- Weak positive signals: imports from modules whose names match feature keywords; file is in a subdirectory whose name matches feature keywords. 
+- Negative signals: file name matches a different feature's keywords; file is clearly a helper/utility imported by many unrelated modules. + +Do NOT use file size as a signal. Do NOT use modification date as a signal. Do NOT follow imports transitively (that explodes scope). + +Assign each candidate an internal relevance score (high / medium / low / none). Emit all `high` and `medium` files. Drop `low` and `none`. + +### Step 4: Write rationale + +One sentence, ≤ 25 words, explaining the emitted file set. Example: `"reqif/reqif2needs.py parses XML, reqif/section.py handles section groups, commands/reqif.py wires the CLI."` + +Do NOT list every file in the rationale (that duplicates `files`). Instead describe the ROLE each file plays. + +### Step 4b: Identify entry_point (optional) + +After selecting the emitted files, identify the "entry point" — the file + symbol where user-facing flow begins. Heuristics, in order of preference: + +1. File in a directory named `commands/`, `cli/`, `api/`, `routes/`, `handlers/`, or `entrypoints/`. +2. File whose primary symbol name matches feat title tokens (case-insensitive substring). +3. File with a decorator-style entry marker (`@app.command()`, `@click.command()`, `@router.get()`, `@fastapi.*`). + +If exactly one candidate matches, emit `entry_point`. If multiple match, pick the one closest to the feat title tokens. If zero match, OMIT `entry_point` entirely (downstream skill detects absence and skips flow extraction). + +Do NOT invent an entry_point when the feat is a data model, a shared utility, or a configuration artefact with no orchestrating function. + +### Step 5: Return YAML + +Return the YAML object. No prose before or after. + +## Failure modes + +- `src_root` not readable → FAIL: "src_root unreadable: <path>". +- `feat_id` missing or not a string → FAIL: "feat_id must be a non-empty string". +- Zero candidate files after glob filtering → emit empty `files` with explanatory `rationale`, do NOT fail. 
+- `pharaoh-context-gather` errors → log and proceed without Papyrus bias. + +## Composition + +A plan emitted by `pharaoh-write-plan` calls `pharaoh-feat-draft-from-docs` once, then uses a `foreach` task to dispatch one `pharaoh-feat-file-map` per emitted feature in parallel. Merging / denormalisation to a flat file list happens in the plan via the `to_files_flat` helper — this skill never reads or writes a merged file. diff --git a/skills/pharaoh-feat-flow-extract/SKILL.md b/skills/pharaoh-feat-flow-extract/SKILL.md new file mode 100644 index 0000000..b5e363b --- /dev/null +++ b/skills/pharaoh-feat-flow-extract/SKILL.md @@ -0,0 +1,139 @@ +--- +name: pharaoh-feat-flow-extract +description: Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. Walks the call graph up to a bounded depth and emits a Mermaid or PlantUML sequence diagram whose output shape matches pharaoh-sequence-diagram-draft. Complements pharaoh-feat-component-extract (static view); this is the dynamic view. +--- + +# pharaoh-feat-flow-extract + +## When to use + +Invoke after `pharaoh-feat-file-map` has produced a `{feat_id, files, entry_point}` mapping (with `entry_point` naming the file+symbol where flow begins — typically a CLI command, HTTP route, test entry, or event handler), when you want to see the call chain that realizes the feat. Output is a sequence diagram matching `pharaoh-sequence-diagram-draft`'s shape so downstream tooling treats it identically. + +Do NOT use for static architecture — that is `pharaoh-feat-component-extract`. Do NOT use when `entry_point` is not known — the skill fails fast rather than inventing one. + +## Tailoring awareness + +Shared tailoring rules: see `shared/diagram-tailoring.md`. Reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.sequence]` from `pharaoh.toml` for renderer choice. Respects `on_missing_config` per shared `check → propose → confirm`. 
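The safe-label rules applied before emit (detailed in the next paragraph; `shared/diagram-safe-labels.md` is authoritative) can be sketched as — function names are illustrative, not part of any shared API:

```python
import re

def sanitise_label(label: str) -> str:
    """Make a message label safe for Mermaid 11: no ';' terminators, no backticks."""
    return label.replace(";", ",").replace("`", "")

def participant_alias(path: str) -> str:
    """Derive a bare-word participant ID from a file path (raw paths are not valid IDs)."""
    stem = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return re.sub(r"\W", "_", stem).title().replace("_", "")

assert sanitise_label("filter by type; skip SET/Folder") == "filter by type, skip SET/Folder"
assert participant_alias("csv/export.py") == "Export"
```

The alias is then declared with the path as its display name, e.g. `participant Export as csv/export.py`, so the diagram stays readable while the ID stays parseable.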
+ +Safe-label rules: see `shared/diagram-safe-labels.md`. **Critical for this skill:** message labels derived from call expressions and inline annotations often contain `;`, which Mermaid 11 treats as a statement terminator, and path fragments like `csv/export.py` are not valid participant IDs. Rules to apply before emit: (a) replace `;` in any message label with `,`; (b) use participant aliases (`participant Export as csv/export.py`), never raw paths as IDs; (c) strip backticks from symbol names. A message label containing `;` (e.g. `J->>J: filter by type; skip SET/Folder`) parses cleanly under `sphinx-build -nW` but renders as `Syntax error in text` in the browser — sanitisation catches this class before emit. + +## Atomicity + +- (a) Indivisible — one feat + one file list + one entry_point in → one sequence diagram out per scenario. Scenarios are emitted as separate diagram entries, never bundled into one diagram. No mutation. No req emission. +- (b) Input: `{feat_id: str, feat_title: str, files: list[str], entry_point: {file: str, symbol: str}, project_root: str, src_root: str, renderer_override?: "mermaid"|"plantuml", max_depth?: int, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. Output: a JSON document with one `diagrams[]` entry per scenario; each entry's `diagram_block` is an RST directive matching `pharaoh-sequence-diagram-draft`'s output shape. Default `max_depth=5`. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-flow-extract/`: + - `input_feat.yaml` declares `entry_point: {file: commands/csv.py, symbol: export}`. + - `input_files/` is shared with `pharaoh-feat-component-extract` — the call chain `commands.csv:export → csv.export:run_export → csv.writer:CSVWriter.write_header / write_rows`. + - `expected_diagram.rst` has 3 participants (one per file touched) and the 4 messages representing the call chain. + + Scorer: + 1. Each `diagram_block` starts with the renderer's sequence-diagram directive. + 2. Every participant in the call chain appears (one per distinct file). + 3. Messages render in call order with correct arrow syntax. + 4.
Message count equals call count at `max_depth=5` (should resolve to 4 for this fixture). + 5. Participants are declared in first-seen order (entry point first). + 6. Output shape matches `pharaoh-sequence-diagram-draft`. + + Pass = all 6. +- (d) Reusable for any language whose call-graph the extractor supports. Python initial target (AST or regex). +- (e) Composable: one feat per call. A plan emitted by `pharaoh-write-plan` may include a `foreach` task over feats (with entry_point set) that dispatches this skill alongside `pharaoh-feat-component-extract` in the same `parallel_group`. Never invokes other skills. + +## Input + +- `feat_id`: diagram caption prefix. +- `feat_title`: human-readable, shown in caption. +- `files`: list of source file paths relative to `src_root`. Only calls resolving to files in this list are traced; calls to stdlib / third-party / out-of-scope files are silently dropped (they are not part of the feature). +- `entry_point`: + - `file`: path relative to `src_root` — must be in `files`. + - `symbol`: name of the function or method where flow begins. +- `project_root`, `src_root`: as in Task 19's skill. +- `renderer_override` (optional): per shared doc. +- `max_depth` (optional): maximum recursion depth when walking the call chain. Default `5`. +- `on_missing_config`, `papyrus_workspace`, `reporter_id`: standard. +- `scenarios` (optional): list of scenario names, default `["default"]`. Each scenario produces one diagram block. Scenario names drive annotations in the output (e.g. `:caption: FEAT_x — flow, scenario: error_handling`). Project tailoring declares the canonical scenario set via `.pharaoh/project/diagram-conventions.yaml > dynamic_view_scenarios`. See [`shared/diagram-view-selection.md`](../shared/diagram-view-selection.md). + +## Output + +Output is a JSON document with shape: + +```json +{ + "diagrams": [ + { + "scenario": "default", + "diagram_block": ".. 
mermaid::\n :caption: FEAT_x — flow\n\n sequenceDiagram\n ...", + "element_count": 7, + "renderer": "mermaid" + } + ] +} +``` + +One entry per scenario. Callers invoke `pharaoh-diagram-review` per entry (plan template foreach-expands over `diagrams[]`). + +## Process + +### Step 1: Locate entry point + +Read `<src_root>/<entry_point.file>`. Locate the definition of `<entry_point.symbol>` via regex: + +- Python: `^(\s*)(def|async def|class) <symbol>\b`. +- Other languages per shared doc. + +If not found → FAIL: `"entry_point.symbol <symbol> not found in <entry_point.file>"`. + +Capture the body of the symbol (lines until the next line with indentation ≤ the symbol's definition line). + +### Step 2: Walk call chain up to max_depth + +Starting from the entry symbol's body, identify direct function/method calls. Regex (Python): + +- Bare calls: `(?<!\.)(?P<name>\w+)\(` (a bare identifier followed by `(`, not preceded by `.`). +- Method calls: `\.(?P<name>\w+)\(`. +- Constructor calls: `(?P<name>[A-Z]\w+)\(` (uppercased → probably a class instantiation). + +For each call, resolve the target: + +- Check if the name matches a top-level symbol defined in any of the `files` (use the primary-symbol detection from Task 19's skill). If so, record `(from_file, to_file, call_label)` where `call_label` is `<symbol>()` or `<method_name>()`. +- If the call resolves to a stdlib / third-party / imported-external symbol, drop it silently. +- If the call resolves to a local helper within the same file, drop it (same-file calls clutter the diagram; the participant-per-file abstraction collapses them). + +Recurse into resolved cross-file calls up to `max_depth` (default 5). Collect all resolved cross-file calls in call order (the order they appear in the body, top-to-bottom). + +### Step 3: Emit sequence diagram + +Resolve renderer per shared doc. Declare one participant per distinct file encountered in the call chain, in first-seen order. Emit messages in the order collected. 
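The emission step above can be sketched for the Mermaid renderer, assuming the Step 2 walk produced ordered `(from_file, to_file, label)` tuples — a hedged sketch, with names and the exact directive indentation illustrative:

```python
def emit_mermaid_sequence(calls, caption):
    """calls: ordered (from_file, to_file, label) tuples from the call-chain walk."""
    def alias(path):
        # participant IDs must not be raw paths; derive a bare-word alias
        return path.rsplit("/", 1)[-1].split(".")[0].replace("-", "_")

    participants = []  # first-seen order: the entry point's file arrives first
    for frm, to, _ in calls:
        for path in (frm, to):
            if path not in participants:
                participants.append(path)

    lines = [".. mermaid::", f"   :caption: {caption}", "", "   sequenceDiagram"]
    for path in participants:
        lines.append(f"   participant {alias(path)} as {path}")
    for frm, to, label in calls:
        safe = label.replace(";", ",").replace("`", "")  # per safe-label rules
        lines.append(f"   {alias(frm)}->>{alias(to)}: {safe}")
    return "\n".join(lines)

block = emit_mermaid_sequence(
    [("commands/csv.py", "csv/export.py", "run_export()"),
     ("csv/export.py", "csv/writer.py", "write_header()"),
     ("csv/export.py", "csv/writer.py", "write_rows()")],
    "FEAT_csv_export — flow from entry point",
)
assert block.splitlines()[4].lstrip() == "participant csv as commands/csv.py"
```

Note that `csv/writer.py` is declared once even though it receives two messages — participants are keyed by distinct file, not by call.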
+ +Arrow syntax: + +- Synchronous call: Mermaid `->>`, PlantUML `->`. +- If the call target is `async def`, use async arrow: Mermaid `-)`, PlantUML `->>`. + +No return arrows are emitted by default — they clutter at this granularity. Callers who want them can use `pharaoh-sequence-diagram-draft` with explicit messages. + +Caption: `<feat_id> — flow from entry point`. + +### Step 4: Return + +Return the JSON document described in `## Output` — one `diagrams[]` entry per scenario, each carrying its rendered `diagram_block`. No prose. + +## Failure modes + +- `entry_point.file` not in `files` → FAIL: `"entry_point.file <file> is not in the files list"`. +- `entry_point.symbol` not found in `entry_point.file` → FAIL per Step 1. +- Zero calls detected from the entry symbol's body → emit a minimal diagram with one participant (the entry point's file) and a self-note `Note over <participant>: entry point has no cross-file calls` instead of failing. +- Max depth exceeded → truncate at depth, log a note. + +## Non-goals + +- No return-arrow inference. Use `pharaoh-sequence-diagram-draft` if needed. +- No activation-bar insertion (PlantUML activates/deactivates). +- No concurrent / async branch handling beyond marking the arrow shape. Complex async flow is hand-authored via `pharaoh-sequence-diagram-draft`. +- No multi-entry-point diagrams. One entry → one diagram. If a feat has multiple entry points (e.g. a CLI with two subcommands), the orchestrator dispatches the skill twice. +- No code-to-sequence inference below function granularity (no per-statement trace). The unit of traceability is a function/method call crossing file boundaries. + +## Last step + +After emitting the artefact, invoke `pharaoh-diagram-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`.
If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-diagram-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/skills/pharaoh-feat-review/SKILL.md b/skills/pharaoh-feat-review/SKILL.md new file mode 100644 index 0000000..57fe409 --- /dev/null +++ b/skills/pharaoh-feat-review/SKILL.md @@ -0,0 +1,57 @@ +--- +name: pharaoh-feat-review +description: Use when auditing a single feature-level need (feat) against the generic feat review axes in `shared/checklists/feat.md` plus any per-project addenda in `.pharaoh/project/checklists/feat.md`. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes. Mirrors `pharaoh-req-review`'s shape for feat-level artefacts. +chains_from: [pharaoh-feat-draft-from-docs] +chains_to: [] +--- + +# pharaoh-feat-review + +## When to use + +Invoke after `pharaoh-feat-draft-from-docs` emitted a feat, or on an existing feat need-id in needs.json. Part of the self-review invariant — see [`shared/self-review-invariant.md`](../shared/self-review-invariant.md). + +Do NOT review comp_reqs or architecture — use `pharaoh-req-review` or `pharaoh-arch-review`. Do NOT re-author — invoke `pharaoh-feat-draft-from-docs` again with the review findings as input if regeneration is needed. + +## Atomicity + +- (a) One feat + one checklist in → one findings JSON out. +- (b) Input: `{target: <feat_directive_rst_or_need_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON with per-axis entries. 
+- (c) Reward: fixtures mirror `pharaoh-req-review` — one `passing.rst` and one `failing.rst` feat with expected findings JSON. +- (d) Reusable by any flow emitting feats. +- (e) Read-only. + +## Input + +- `target`: RST directive block for a feat, OR a `need_id` with `type: feat` present in needs.json. +- `checklist_path`: absolute path to `shared/checklists/feat.md`. Per-project extensions in `.pharaoh/project/checklists/feat.md` are appended if present. +- `tailoring_path`: absolute path to `.pharaoh/project/` root. Reads `artefact-catalog.yaml` for required/optional fields per the feat artefact type. + +## Output + +```json +{ + "need_id": "FEAT_example", + "type": "feat", + "axes": { + "trace_to_parent_or_workflow": {"passed": true, "reason": "links to wf__onboarding via :satisfies:"}, + "single_user_capability": {"score": 3, "reason": "scope is one feature"}, + "source_doc_present_and_valid": {"passed": true, "reason": "source_doc=docs/source/features/x.rst exists"}, + "required_fields_complete": {"passed": true, "reason": "id, status, source_doc present"}, + "shall_clause_user_observable": {"score": 2, "reason": "minor: names internal module"}, + "body_length_within_bounds": {"passed": true, "reason": "body=8 lines, limit=15"}, + "no_comp_level_mechanism_leak": {"score": 3, "reason": "no class / method names in body"}, + "naming_clarity": {"score": 3, "reason": "FEAT_reqif_export — clear"} + }, + "overall": "pass", + "actions": [] +} +``` + +## Review axes + +See [`shared/checklists/feat.md`](../shared/checklists/feat.md) for the canonical axis list and rubric. Per-project extensions (e.g. Score's ASIL-level guidance, connector-family consistency (project-specific example)) are appended from `.pharaoh/project/checklists/feat.md` if present, with their own axis keys namespaced under `tailoring.*`. + +## Composition + +Invoked by `pharaoh-write-plan`-generated plans after every `pharaoh-feat-draft-from-docs` task. 
Also invoked ad-hoc per the self-review invariant. Coverage enforced by `pharaoh-self-review-coverage-check`. diff --git a/skills/pharaoh-finding-record/SKILL.md b/skills/pharaoh-finding-record/SKILL.md index d5198a5..faecd7e 100644 --- a/skills/pharaoh-finding-record/SKILL.md +++ b/skills/pharaoh-finding-record/SKILL.md @@ -22,7 +22,7 @@ Do NOT invoke for new audit categories not in the known list — category must b ## Input - `category`: one of the known categories listed above. -- `subject_id`: the Score need ID affected by the finding (e.g. `arch__orphan_0`). +- `subject_id`: the project need ID affected by the finding (e.g. `arch__orphan_0`). - `finding_text`: 1-3 sentence description. Used as the Papyrus need body. - `reporter_id`: the calling subagent's area tag (e.g. `coverage-gap`, `lifecycle-check`). Stored in the Papyrus need `source` field for traceability. diff --git a/skills/pharaoh-fmea-review/SKILL.md b/skills/pharaoh-fmea-review/SKILL.md new file mode 100644 index 0000000..4be705c --- /dev/null +++ b/skills/pharaoh-fmea-review/SKILL.md @@ -0,0 +1,52 @@ +--- +name: pharaoh-fmea-review +description: Use when auditing a single FMEA entry (failure-mode row) against the generic FMEA review axes in `shared/checklists/fmea.md` plus per-project addenda. Checks severity/occurrence/detection scales, RPN computation, cause/effect well-formedness, traceability to the analyzed artefact. Emits structured findings JSON. +chains_from: [pharaoh-fmea] +chains_to: [] +--- + +# pharaoh-fmea-review + +## When to use + +Invoke after `pharaoh-fmea` emitted a single failure-mode entry. Part of the self-review invariant. + +Do NOT review sets of FMEA rows — this skill reviews one entry. A fleet review is a separate flow that invokes this skill per entry. + +## Atomicity + +- (a) One FMEA entry + one checklist in → one findings JSON out. +- (b) Input: `{target: <fmea_entry_json_or_need_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON. 
+- (c) Reward: fixtures `passing-fmea.json` + `failing-fmea.json` with expected findings. +- (d) Reusable. +- (e) Read-only. + +## Input + +- `target`: JSON object with the FMEA entry shape emitted by `pharaoh-fmea`, OR a need_id with type `fmea` in needs.json. +- `checklist_path`: `shared/checklists/fmea.md`. +- `tailoring_path`: `.pharaoh/project/` for optional scale extensions. + +## Output + +```json +{ + "need_id": "fmea__example_01", + "type": "fmea", + "axes": { + "trace_to_analyzed_artefact": {"passed": true}, + "severity_in_range": {"passed": true, "reason": "sev=7, scale=1..10"}, + "occurrence_in_range": {"passed": true, "reason": "occ=4"}, + "detection_in_range": {"passed": true, "reason": "det=3"}, + "rpn_computed_correctly": {"passed": true, "reason": "7*4*3=84, entry reports 84"}, + "cause_well_formed": {"score": 3}, + "effect_well_formed": {"score": 3}, + "mitigation_proposed_if_rpn_high": {"score": 2, "reason": "RPN 84 > threshold 60; mitigation text thin"} + }, + "overall": "pass" +} +``` + +## Review axes + +See [`shared/checklists/fmea.md`](../shared/checklists/fmea.md). diff --git a/skills/pharaoh-fmea/SKILL.md b/skills/pharaoh-fmea/SKILL.md index 1085aab..3165da5 100644 --- a/skills/pharaoh-fmea/SKILL.md +++ b/skills/pharaoh-fmea/SKILL.md @@ -322,3 +322,9 @@ failure_mode = "no ABS pump activation on slip threshold exceedance" } } ``` + +## Last step + +After emitting the artefact, invoke `pharaoh-fmea-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-fmea-regenerate` if available, or by re-invoking this skill with the findings as input). 
+ +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/skills/pharaoh-gate-advisor/SKILL.md b/skills/pharaoh-gate-advisor/SKILL.md new file mode 100644 index 0000000..18ea0bc --- /dev/null +++ b/skills/pharaoh-gate-advisor/SKILL.md @@ -0,0 +1,159 @@ +--- +name: pharaoh-gate-advisor +description: Use when reading a project's `pharaoh.toml` to report which phased-enablement ladder step is the recommended next gate to switch on. Single mechanical advisory check — parses five flags (`strictness`, `require_verification`, `require_change_analysis`, `require_mece_on_release`, `codelinks.enabled`), walks the fixed ladder in order, and emits the first unmet step plus its blocker note. Read-only; never edits `pharaoh.toml`. +--- + +# pharaoh-gate-advisor + +## When to use + +Invoke after `pharaoh-bootstrap` + `pharaoh-setup` have landed a `pharaoh.toml`, whenever an auditor asks "which gate should we switch on next?", or as a recurring prompt in a project-health review. Reads `pharaoh.toml`, reports the current state of the five ladder knobs, and names the FIRST ladder step whose flag is not yet enabled along with the pre-work that blocks enabling it. When every step is satisfied, returns `recommended_next_gate: null` with `rationale: "ladder complete"`. + +The ladder is fixed and ordered by value / cost ratio — cheapest-and-most-effective first, hardest-and-most-disruptive last. Advancing one step at a time makes the transition from "advisory everywhere" to "enforcing everywhere" debuggable — a project that flips `strictness = "enforcing"` before any individual gate is on ships a gate that enforces nothing, then gets blamed when a later flip fails. + +Do NOT invoke to modify `pharaoh.toml` — this skill is advisory, read-only. 
Auto-enablement belongs in `pharaoh-setup` or a future `pharaoh-setup-reconfigure`, not here. Do NOT invoke to grade the QUALITY of the gates' effects (whether review coverage is good, whether MECE is clean) — that is `pharaoh-quality-gate`. Do NOT invoke to reason about gates not in the ladder (e.g. `[pharaoh.traceability]`); the ladder is deliberately five steps. + +## Atomicity + +- (a) Indivisible: one `pharaoh.toml` in → one findings JSON out. No file writes, no dispatch of other skills, no reasoning about anything besides the five ladder flags. +- (b) Input: `{pharaoh_toml_path: str}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-gate-advisor/fixtures/` — one per ladder outcome: + 1. `fresh-from-bootstrap/` — every flag at its advisory default (`strictness = "advisory"`, all four booleans `false`). Expected: `recommended_next_gate == "require_verification"`, rationale names step 1 as the lowest-cost enablement, `ladder[0].blocker == "none — safe to enable now"`. + 2. `step-1-enabled/` — `require_verification = true`, the remaining three booleans `false`, strictness advisory. Expected: `recommended_next_gate == "require_change_analysis"`, rationale names step 2 and the pharaoh-change tailoring blocker. + 3. `all-steps-enabled/` — `strictness = "enforcing"`, all four booleans `true`. Expected: `recommended_next_gate == null`, rationale `"ladder complete"`, every ladder entry reports its flag as enabled. + + Pass = each fixture's actual output matches `expected-output.json` byte-for-byte (the ladder array is fixed and deterministic). +- (d) Reusable across projects — the ladder ships with zero project-specific vocabulary. Only `pharaoh.toml`'s own key names appear, and those are the same for every Pharaoh consumer. Tailoring extension point: projects may override `rationale` text via `tailoring.gate_advisor_rationale_overrides` if they want house-style blocker notes, but the ladder ORDER is fixed. 
+- (e) Read-only. Does not modify `pharaoh.toml`, `pharaoh.toml.example`, or any on-disk state. Running twice on identical input yields byte-identical output. + +## Input + +- `pharaoh_toml_path`: absolute path to the project's `pharaoh.toml`. The skill reads exactly five keys: + - `[pharaoh].strictness` — string; treated as `"advisory"` unless the value is exactly `"enforcing"`. + - `[pharaoh.workflow].require_verification` — boolean; default `false` when absent. + - `[pharaoh.workflow].require_change_analysis` — boolean; default `false` when absent. + - `[pharaoh.workflow].require_mece_on_release` — boolean; default `false` when absent. + - `[pharaoh.codelinks].enabled` — boolean; default `false` when absent. + +Edge cases: +- `pharaoh_toml_path` missing or unreadable → emit `overall: "error"` with `errors: ["pharaoh.toml unresolved: <path>"]` and no other keys. Callers branch on `overall` first. No ladder array is emitted on this path — the ladder is meaningful only when the file parsed. +- TOML parse error (syntax bad) → same `overall: "error"` shape with the parser message included. +- Keys present but with unexpected types (e.g. `require_verification = "yes"` as a string) → treat as their typed default (`false` for booleans, `"advisory"` for strictness) and add a note `"unexpected type for <key>; treated as default"` in `notes`. +- Entire `[pharaoh.workflow]` or `[pharaoh.codelinks]` section absent → every flag in that section resolves to its `false` default; no error. + +## Output + +```json +{ + "current_state": { + "strictness": "advisory", + "require_verification": false, + "require_change_analysis": false, + "require_mece_on_release": false, + "codelinks_enabled": false + }, + "recommended_next_gate": "require_verification", + "rationale": "require_verification = true is the highest-value, lowest-cost step — it wires the review skills that are already ship-ready into the release gate and catches every PARTIAL finding via pharaoh-req-review. 
No pre-work required.", + "ladder": [ + {"step": 1, "gate": "require_verification = true", "blocker": "none — safe to enable now"}, + {"step": 2, "gate": "require_change_analysis = true", "blocker": "needs pharaoh-change to be tailored"}, + {"step": 3, "gate": "require_mece_on_release = true", "blocker": "needs release-gate workflow"}, + {"step": 4, "gate": "codelinks.enabled = true", "blocker": "needs codelink annotations in source"}, + {"step": 5, "gate": "strictness = enforcing", "blocker": "requires steps 1-4 satisfied"} + ], + "overall": "pass", + "notes": [] +} +``` + +Fields (in canonical order): +- `current_state`: echo of the five parsed flags, using the canonical key names above. `codelinks_enabled` is underscored here even though the TOML key is `codelinks.enabled`, so the JSON is flat and one-shape. +- `recommended_next_gate`: the canonical key name of the FIRST ladder step whose corresponding config field is not yet at its enabled value. One of `"require_verification"`, `"require_change_analysis"`, `"require_mece_on_release"`, `"codelinks_enabled"`, `"strictness_enforcing"`, or `null` when the ladder is complete. +- `rationale`: one or two sentences naming why this step is the next one — what it unlocks and what (if anything) blocks enabling it right now. On `null` recommendation, the string is exactly `"ladder complete"`. +- `ladder`: the fixed five-entry array shown above, shipped verbatim in every non-error response. Each entry has `step` (1–5), `gate` (the TOML-style line the project would add), and `blocker` (the pre-work the project must complete first, or `"none — safe to enable now"` for step 1). +- `overall`: `"pass"` when the file parsed and the ladder computed. `"error"` when the file failed to resolve or parse (see Edge cases). +- `notes`: any non-fatal observations (e.g. mistyped value, absent section treated as default). Empty list when clean. + +## Detection rule + +Two passes over the input; both mechanical, no LLM judgement. + +### 1. 
Parse the five flags + +**Check:** Load `pharaoh.toml` as TOML. Read each of the five keys per the `## Input` section. Apply defaults for missing keys. Coerce unexpected types to their defaults and add a note. + +**Detection:** +```python +import tomllib + +with open(pharaoh_toml_path, "rb") as fh: + data = tomllib.load(fh) + +strictness = data.get("pharaoh", {}).get("strictness", "advisory") +if strictness != "enforcing": + strictness = "advisory" + +wf = data.get("pharaoh", {}).get("workflow", {}) +rv = wf.get("require_verification", False) is True +rca = wf.get("require_change_analysis", False) is True +rmr = wf.get("require_mece_on_release", False) is True + +cl = data.get("pharaoh", {}).get("codelinks", {}) +ce = cl.get("enabled", False) is True +``` + +### 2. Walk the fixed ladder + +**Check:** Iterate the five ladder entries in order. The first entry whose corresponding flag is NOT at its enabled value is the `recommended_next_gate`. If all five are at their enabled value, `recommended_next_gate` is `null`. + +Enabled values per step (canonical): +1. `require_verification` enabled iff `rv is True`. +2. `require_change_analysis` enabled iff `rca is True`. +3. `require_mece_on_release` enabled iff `rmr is True`. +4. `codelinks_enabled` enabled iff `ce is True`. +5. `strictness_enforcing` enabled iff `strictness == "enforcing"`. 
+ +**Detection:** +```python +LADDER = [ + ("require_verification", rv, "none — safe to enable now"), + ("require_change_analysis", rca, "needs pharaoh-change to be tailored"), + ("require_mece_on_release", rmr, "needs release-gate workflow"), + ("codelinks_enabled", ce, "needs codelink annotations in source"), + ("strictness_enforcing", strictness == "enforcing", "requires steps 1-4 satisfied"), +] + +recommended = next((name for name, enabled, _ in LADDER if not enabled), None) +``` + +The ladder array in the output is derived once from a static template (see `## Output`); only `recommended_next_gate`, `rationale`, and `current_state` vary per input. `overall` is `"pass"` on any successful parse. + +`rationale` text is drawn from a static map keyed by `recommended_next_gate`: + +| `recommended_next_gate` | Default rationale | +|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------| +| `require_verification` | "require_verification = true is the highest-value, lowest-cost step — it wires the review skills that are already ship-ready into the release gate and catches every PARTIAL finding via pharaoh-req-review. No pre-work required." | +| `require_change_analysis` | "require_change_analysis = true is the next step. Blocker: pharaoh-change must be tailored for this project before the gate is meaningful — otherwise every authoring task will trip an alarm with no mitigation path." | +| `require_mece_on_release` | "require_mece_on_release = true is the next step. Blocker: the project needs a release-gate workflow that understands how to invoke pharaoh-mece and act on its findings." | +| `codelinks_enabled` | "codelinks.enabled = true is the next step. Blocker: the source tree needs codelink annotations (`@req`, `@impl`, etc.) on the symbols this project wants to trace, otherwise the flag activates an empty traceability view." 
| +| `strictness_enforcing` | "strictness = enforcing is the final step. Blocker: steps 1-4 must all be satisfied first — flipping strictness before the individual gates are on ships a gate that enforces nothing." | +| `null` | "ladder complete" | + +Projects override any row via `tailoring.gate_advisor_rationale_overrides[<key>]` in `.pharaoh/project/checklists/gate-advisor.md` (optional). The ladder ORDER and the `gate` / `blocker` strings are fixed and not overridable. + +## Tailoring extension point + +- `tailoring.gate_advisor_rationale_overrides`: map of `{recommended_next_gate: rationale_string}` that replaces the default rationale when emitted. A project that prefers short blocker notes, or that wants to surface internal-ticket links in the rationale, uses this. The key set must match the canonical `recommended_next_gate` names above; unknown keys are ignored with a `notes` entry. + +No other knobs are exposed. The ladder itself is the shared reference `skills/shared/gate-enablement.md` — a project that disagrees with the ladder order should file an issue against the shared reference, not fork this atom. + +## Composition + +Role: `atom-check`. + +Called standalone by auditors, by `pharaoh-process-audit` as an optional health check, or from a CI job that wants a deterministic recommendation in the project dashboard. Never invoked by `pharaoh-quality-gate` (this atom is advisory, not a gate invariant — the gate invariants check the effects of the flags, not whether the flags themselves are set). Never dispatches other skills. Never modifies `pharaoh.toml`. + +Related but distinct: +- `pharaoh-setup` ships step 1 (`require_verification = true`) on by default — so a fresh project running this skill straight after setup lands on step 2 as the recommendation. +- `shared/gate-enablement.md` documents the rationale for the ladder order; projects read it to understand WHY each step is where it is. 
+- `pharaoh-quality-gate` runs the invariants that the ladder flags control — it answers "are my gates passing?", not "which gate should I enable next?". diff --git a/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/README.md b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/README.md new file mode 100644 index 0000000..39fc618 --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/README.md @@ -0,0 +1,7 @@ +# all-steps-enabled + +Represents a mature steady-state project that has walked the full ladder — every individual gate flag is `true`, codelinks are on, and `strictness` has been flipped to `"enforcing"`. This is the terminal state of the ladder. + +Expected outcome: advisor walks the ladder, finds every entry already satisfied, and returns `recommended_next_gate: null` with the canonical rationale `"ladder complete"`. The ladder array is still echoed verbatim — callers that render a dashboard want to see the full walk so the "done" signal is visible alongside the history. + +Exercise: the ladder-complete branch of the detection rule — proves the `next((… for … if not enabled), None)` short-circuit returns `None` when every flag is on, and that the rationale map's `None` key is wired correctly. 
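+
+A minimal sketch of the branch this fixture exercises (names follow the SKILL's detection snippet; the one-entry rationale map is illustrative, not the full table):
+
```python
# All five flags are on, mirroring input-pharaoh.toml in this fixture.
LADDER = [
    ("require_verification", True, "none — safe to enable now"),
    ("require_change_analysis", True, "needs pharaoh-change to be tailored"),
    ("require_mece_on_release", True, "needs release-gate workflow"),
    ("codelinks_enabled", True, "needs codelink annotations in source"),
    ("strictness_enforcing", True, "requires steps 1-4 satisfied"),
]

# No entry fails the `if not enabled` filter, so next() returns its default.
recommended = next((name for name, enabled, _ in LADDER if not enabled), None)

# The rationale map's terminal key is None (JSON null).
RATIONALE = {None: "ladder complete"}
print(recommended, "->", RATIONALE[recommended])
# None -> ladder complete
```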
diff --git a/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/expected-output.json b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/expected-output.json new file mode 100644 index 0000000..0252bc7 --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/expected-output.json @@ -0,0 +1,20 @@ +{ + "current_state": { + "strictness": "enforcing", + "require_verification": true, + "require_change_analysis": true, + "require_mece_on_release": true, + "codelinks_enabled": true + }, + "recommended_next_gate": null, + "rationale": "ladder complete", + "ladder": [ + {"step": 1, "gate": "require_verification = true", "blocker": "none — safe to enable now"}, + {"step": 2, "gate": "require_change_analysis = true", "blocker": "needs pharaoh-change to be tailored"}, + {"step": 3, "gate": "require_mece_on_release = true", "blocker": "needs release-gate workflow"}, + {"step": 4, "gate": "codelinks.enabled = true", "blocker": "needs codelink annotations in source"}, + {"step": 5, "gate": "strictness = enforcing", "blocker": "requires steps 1-4 satisfied"} + ], + "overall": "pass", + "notes": [] +} diff --git a/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/input-pharaoh.toml b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/input-pharaoh.toml new file mode 100644 index 0000000..84853b5 --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/all-steps-enabled/input-pharaoh.toml @@ -0,0 +1,24 @@ +# all-steps-enabled fixture — every ladder step satisfied. strictness = "enforcing" plus every +# individual gate flag set to true plus codelinks enabled. pharaoh-gate-advisor must return +# recommended_next_gate: null with rationale "ladder complete". 
+ +[pharaoh] +strictness = "enforcing" + +[pharaoh.id_scheme] +pattern = "{TYPE}-{MODULE}-{NUMBER}" +auto_increment = true + +[pharaoh.workflow] +require_change_analysis = true +require_verification = true +require_mece_on_release = true + +[pharaoh.traceability] +required_links = [ + "req -> spec", + "spec -> impl", +] + +[pharaoh.codelinks] +enabled = true diff --git a/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/README.md b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/README.md new file mode 100644 index 0000000..f4bb70f --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/README.md @@ -0,0 +1,7 @@ +# fresh-from-bootstrap + +Represents the pre-2026-04-22 shape — every gate shipped at its advisory default (`strictness = "advisory"`, all four booleans `false`). This is the baseline the ladder was designed to unstick. + +Expected outcome: `pharaoh-gate-advisor` walks the ladder, finds step 1 (`require_verification`) still off, and recommends it. `blocker` for step 1 is `"none — safe to enable now"` — the review skills are ready out of the box, so the project can flip this flag immediately. The remaining four ladder entries ship verbatim in the output so the user sees the full path, not just the next step. + +Exercise: the no-flags-enabled happy path of the detection rule's step-1 iteration over the fixed ladder. 
diff --git a/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/expected-output.json b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/expected-output.json new file mode 100644 index 0000000..426f504 --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/expected-output.json @@ -0,0 +1,20 @@ +{ + "current_state": { + "strictness": "advisory", + "require_verification": false, + "require_change_analysis": false, + "require_mece_on_release": false, + "codelinks_enabled": false + }, + "recommended_next_gate": "require_verification", + "rationale": "require_verification = true is the highest-value, lowest-cost step — it wires the review skills that are already ship-ready into the release gate and catches every PARTIAL finding via pharaoh-req-review. No pre-work required.", + "ladder": [ + {"step": 1, "gate": "require_verification = true", "blocker": "none — safe to enable now"}, + {"step": 2, "gate": "require_change_analysis = true", "blocker": "needs pharaoh-change to be tailored"}, + {"step": 3, "gate": "require_mece_on_release = true", "blocker": "needs release-gate workflow"}, + {"step": 4, "gate": "codelinks.enabled = true", "blocker": "needs codelink annotations in source"}, + {"step": 5, "gate": "strictness = enforcing", "blocker": "requires steps 1-4 satisfied"} + ], + "overall": "pass", + "notes": [] +} diff --git a/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/input-pharaoh.toml b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/input-pharaoh.toml new file mode 100644 index 0000000..ed1779f --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/fresh-from-bootstrap/input-pharaoh.toml @@ -0,0 +1,22 @@ +# fresh-from-bootstrap fixture — pre-2026-04-22 default where every gate shipped advisory. +# Represents what used to land on disk after bootstrap + setup before the ladder-default fix. 
+# pharaoh-gate-advisor must recommend step 1 (require_verification = true) as the next gate +# to enable, because the ladder's lowest-cost step is still off. + +[pharaoh] +strictness = "advisory" + +[pharaoh.id_scheme] +pattern = "{TYPE}-{MODULE}-{NUMBER}" +auto_increment = true + +[pharaoh.workflow] +require_change_analysis = false +require_verification = false +require_mece_on_release = false + +[pharaoh.traceability] +required_links = [] + +[pharaoh.codelinks] +enabled = false diff --git a/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/README.md b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/README.md new file mode 100644 index 0000000..b7abb2e --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/README.md @@ -0,0 +1,7 @@ +# step-1-enabled + +Represents the post-2026-04-22 bootstrap-default shape. `pharaoh-setup` now lands `require_verification = true` out of the box (step 1 of the ladder), so a fresh project that runs `pharaoh-setup` → `pharaoh-gate-advisor` lands on this fixture's state, not `fresh-from-bootstrap/`. + +Expected outcome: advisor walks the ladder, sees step 1 already enabled, advances to step 2 (`require_change_analysis`), and returns it as the next recommended gate. The rationale names the blocker — `pharaoh-change` must be tailored for the project before flipping the flag is meaningful, otherwise every authoring task alarms without a mitigation path. + +Exercise: the step-1-already-enabled branch of the detection rule — proves the ladder walk advances past enabled flags instead of re-recommending step 1. 
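+
+The advance-past-enabled walk this fixture proves can be sketched as (flag values mirror `input-pharaoh.toml`; the dict is an illustrative stand-in for the parsed config):
+
```python
# current_state from this fixture: step 1 on, everything else still off.
flags = {
    "require_verification": True,
    "require_change_analysis": False,
    "require_mece_on_release": False,
    "codelinks_enabled": False,
    "strictness_enforcing": False,
}

# The walk skips enabled flags and stops at the first disabled one.
recommended = next((name for name, on in flags.items() if not on), None)
print(recommended)
# require_change_analysis
```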
diff --git a/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/expected-output.json b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/expected-output.json new file mode 100644 index 0000000..53ea178 --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/expected-output.json @@ -0,0 +1,20 @@ +{ + "current_state": { + "strictness": "advisory", + "require_verification": true, + "require_change_analysis": false, + "require_mece_on_release": false, + "codelinks_enabled": false + }, + "recommended_next_gate": "require_change_analysis", + "rationale": "require_change_analysis = true is the next step. Blocker: pharaoh-change must be tailored for this project before the gate is meaningful — otherwise every authoring task will trip an alarm with no mitigation path.", + "ladder": [ + {"step": 1, "gate": "require_verification = true", "blocker": "none — safe to enable now"}, + {"step": 2, "gate": "require_change_analysis = true", "blocker": "needs pharaoh-change to be tailored"}, + {"step": 3, "gate": "require_mece_on_release = true", "blocker": "needs release-gate workflow"}, + {"step": 4, "gate": "codelinks.enabled = true", "blocker": "needs codelink annotations in source"}, + {"step": 5, "gate": "strictness = enforcing", "blocker": "requires steps 1-4 satisfied"} + ], + "overall": "pass", + "notes": [] +} diff --git a/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/input-pharaoh.toml b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/input-pharaoh.toml new file mode 100644 index 0000000..0671cfe --- /dev/null +++ b/skills/pharaoh-gate-advisor/fixtures/step-1-enabled/input-pharaoh.toml @@ -0,0 +1,22 @@ +# step-1-enabled fixture — the post-2026-04-22 shape that bootstrap + setup now ship by default. +# require_verification = true is on (step 1 of the ladder), everything else remains advisory. +# pharaoh-gate-advisor must recommend step 2 (require_change_analysis = true) next, naming +# the pharaoh-change-tailoring blocker. 
+ +[pharaoh] +strictness = "advisory" + +[pharaoh.id_scheme] +pattern = "{TYPE}-{MODULE}-{NUMBER}" +auto_increment = true + +[pharaoh.workflow] +require_change_analysis = false +require_verification = true +require_mece_on_release = false + +[pharaoh.traceability] +required_links = [] + +[pharaoh.codelinks] +enabled = false diff --git a/skills/pharaoh-id-allocate/SKILL.md b/skills/pharaoh-id-allocate/SKILL.md new file mode 100644 index 0000000..67481f6 --- /dev/null +++ b/skills/pharaoh-id-allocate/SKILL.md @@ -0,0 +1,78 @@ +--- +name: pharaoh-id-allocate +description: Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. Each subagent receives its pre-allocated pool and emits only from that pool, so parallel agents cannot collide on stem choice. Does NOT invoke emitters, does NOT write RST. +--- + +# pharaoh-id-allocate + +## When to use + +Invoke from a plan emitted by `pharaoh-write-plan` (executed via `pharaoh-execute-plan`) before any task that fans out req-emission. Produces a deterministic mapping from `(parent_feat_id, stem)` to a unique list of IDs, so each req-emission task gets its own pre-reserved slots. Without this step, parallel req-emitters choose stems independently and may emit colliding IDs. + +Do NOT use to rename existing IDs. Do NOT use to emit reqs. Do NOT use to delete IDs. + +## Atomicity + +- (a) Indivisible — one request set in → one list of unique IDs out. No subagent dispatch. No file writes. No mutation of the source of `existing_ids`. +- (b) Input: `{existing_ids_file?: str, existing_ids?: list[str], requests: list[{parent_feat_id: str, stem: str, count: int, type: str, prefix: str}]}`. Output: list of allocated ID strings, one per requested slot, in request order. Globally unique across `existing_ids` AND within the returned list. 
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-id-allocate/input_spec.json` with 27 planned IDs across 3 features. When `existing_ids` contains `CREQ_writer_01`, the allocator's output for the first `writer` request starts at `CREQ_writer_02`. Output list length equals sum of `requests[].count`.
+- (d) Reusable: any fan-out workflow where subagents emit IDs; CI allocators; renumbering utilities.
+- (e) Composable: a pure function. No side effects. No cross-skill calls.
+
+## Input
+
+- `existing_ids_file` (optional): path to a `needs.json` file. The allocator reads every need's `id` field into the existing-id set. If not provided, falls back to `existing_ids`.
+- `existing_ids` (optional): explicit list of IDs to treat as already-allocated. Used when no `needs.json` is available.
+- `requests`: list of allocation requests. Each request has:
+  - `parent_feat_id`: the parent feature this batch belongs to. Used for log messages only; IDs do not include it.
+  - `stem`: the per-file / per-symbol disambiguator (e.g. `writer`, `cli`, `exporter`). Usually the file stem normalized to snake_case.
+  - `count`: how many IDs to allocate in this batch.
+  - `type`: the sphinx-needs directive name (e.g. `comp_req`, `feat`) — recorded in log, not used for ID generation.
+  - `prefix`: the ID prefix (e.g. `CREQ_`, `FEAT_`). Determines the allocated ID format.
+
+At least one of `existing_ids_file` or `existing_ids` MUST be provided. Pass `existing_ids=[]` to signal a greenfield project.
+
+## Output
+
+A JSON array of ID strings, in request order. For `requests = [{stem: "a", count: 2, prefix: "CREQ_"}, {stem: "b", count: 1, prefix: "CREQ_"}]`, the output is exactly:
+
+```json
+["CREQ_a_01", "CREQ_a_02", "CREQ_b_01"]
+```
+
+(assuming no collisions with existing IDs). Callers parse with `json.loads` — no line-oriented or comma-separated alternative.
+
+On any collision, the allocator advances an independent per-stem sequence counter until a free slot is found, then emits exactly `count` IDs per request. If the per-stem counter reaches 99 without emitting `count` free slots, FAIL — excessive collision means the caller chose a poor stem (too generic, reused across many features). The cap aligns with the 2-digit `_<seq:02d>` format: emitted `seq` values stay in `01..99`, never wider.
+
+## Process
+
+### Step 1: Collect existing IDs
+
+If `existing_ids_file` is provided, read it, parse as JSON, extract every `needs[*].id` value into an existing-id set. If `existing_ids` is provided, union its contents into the set. If both are missing, FAIL (caller error).
+
+### Step 2: Allocate per request
+
+Maintain an "allocated in this call" set alongside `existing_ids`. For each request, keep a per-stem `seq` counter (starts at 1). Produce exactly `request.count` IDs by looping `slots_emitted` from 0 to `count - 1`:
+
+1. Generate candidate `<prefix><stem>_<seq:02d>` (e.g. `CREQ_writer_01`).
+2. If candidate collides with either set, increment `seq` and retry — the slot is not consumed. If `seq` exceeds 99 without emitting `count` free IDs for this request, FAIL naming the stem and the collision rate.
+3. On a non-colliding candidate: add to the "allocated in this call" set, append to the output list, increment `seq`, increment `slots_emitted`.
+
+Exactly `count` IDs per request end up in the output. The per-stem `seq` counter is independent of `slots_emitted` — `seq` advances on every candidate, colliding or not; `slots_emitted` advances only on a successful emit.
+
+### Step 3: Return
+
+Emit the output as a JSON array of strings (per the wire format declared above). Nothing else on stdout.
+
+## Failure modes
+
+- Neither `existing_ids_file` nor `existing_ids` provided → FAIL.
+- `existing_ids_file` path unreadable → FAIL.
+- Counter exceeds 99 for any request → FAIL naming the stem.
+- Any request has `count < 1` → FAIL.
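+
+The Step 1–3 loop and the 99-cap failure mode can be sketched as follows — a minimal reference, assuming dict-shaped requests as declared under `## Input`:
+
```python
def allocate(existing_ids, requests):
    """Allocate globally-unique IDs per request, in request order."""
    taken = set(existing_ids)          # existing IDs + allocated-in-this-call
    out = []
    for req in requests:
        seq, emitted = 1, 0            # seq: candidate counter; emitted: slots filled
        while emitted < req["count"]:
            if seq > 99:               # 2-digit format exhausted -> FAIL
                raise RuntimeError(f"stem {req['stem']!r}: exceeded 99 candidates")
            cand = f"{req['prefix']}{req['stem']}_{seq:02d}"
            seq += 1                   # seq advances on every candidate
            if cand in taken:
                continue               # collision: slot not consumed
            taken.add(cand)
            out.append(cand)
            emitted += 1               # emitted advances only on success
    return out

print(allocate(["CREQ_writer_01"],
               [{"stem": "writer", "count": 2, "prefix": "CREQ_"}]))
# ['CREQ_writer_02', 'CREQ_writer_03']
```
+
+Running the sketch against the reward fixture's shape shows the collision behaviour: with `CREQ_writer_01` already taken, the first `writer` slot lands on `CREQ_writer_02`.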
+ +## Non-goals + +- No ID minting strategy beyond sequential numbering — if a project wants UUID-based IDs, this skill is not the right fit. +- No bulk renumbering of existing IDs — this skill only allocates new ones. +- No cross-project uniqueness — scoped to the one project whose `needs.json` (or `existing_ids`) was provided. diff --git a/skills/pharaoh-id-convention-check/SKILL.md b/skills/pharaoh-id-convention-check/SKILL.md new file mode 100644 index 0000000..30427d9 --- /dev/null +++ b/skills/pharaoh-id-convention-check/SKILL.md @@ -0,0 +1,135 @@ +--- +name: pharaoh-id-convention-check +description: Use when verifying that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Single mechanical structural check — applies the tailored per-type regex, emits a list of violations. Does NOT auto-detect how many schemes coexist — scheme policy is the tailoring author's responsibility (declare an alternation to allow multiple forms). +--- + +# pharaoh-id-convention-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant: `id-convention-consistent`) or directly after a build to confirm the corpus obeys its declared id scheme. Reads `id-conventions.yaml` + `needs.json`, returns findings JSON listing every need whose `id` does not match the regex for its `type`. + +Do NOT use to discover or count id schemes — the tailoring author declares ONE canonical regex per type and this atom only reports violations against that regex. If multiple forms are legal (e.g. legacy plus new prefix), the tailoring author encodes that as an alternation in the regex (`^CREQ_.+$|^gd_req__.+$`). Do NOT use to rename ids or mutate the corpus — read-only. + +## Atomicity + +- (a) Indivisible: one `id-conventions.yaml` + one `needs.json` in → one findings JSON out. No scheme counting, no regex inference, no id rewriting, no dispatch of other skills. 
+- (b) Input: `{id_conventions_path: str, needs_json_path: str}`. Output: JSON `{needs_checked: int, violations: [{need_id, type, expected_regex, reason}], overall: "pass" | "fail"}`. +- (c) Reward: fixtures under `skills/pharaoh-id-convention-check/fixtures/` — one per outcome: + 1. `all-conform/` — every id matches its type's regex → matches `expected-output.json` (`overall: "pass"`, empty `violations`, `needs_checked == len(needs)`). + 2. `some-violate/` — mix of conforming and non-conforming ids across two types → `overall: "fail"`, `violations` lists each offender with its `type`, the `expected_regex` applied, and a short `reason`. + 3. `alternation-regex/` — tailoring declares `^CREQ_.+$|^gd_req__.+$` and both forms are used in the corpus → `overall: "pass"` because the alternation matches both. + + Pass = all 3 fixture outputs match `expected-output.json` exactly (modulo ordering of `violations`, which is sorted by `need_id` in the emitted output). +- (d) Reusable across projects — the regex is data-driven via tailoring; no project-specific prefix or separator is hardcoded. Works for any sphinx-needs corpus with an `id-conventions.yaml`. +- (e) Read-only. No side effects. Does not modify the tailoring file or the needs corpus. Running twice on identical inputs yields byte-identical output. + +## Input + +- `id_conventions_path`: absolute path to the tailoring file `.pharaoh/project/id-conventions.yaml`. Schema accepted: + + ```yaml + # top-level default regex applied to any type without an override + id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + + # per-type overrides — the regex applied to needs of that type + id_regex_by_type: + comp_req: "^CREQ_[a-z]+_[a-z]+_[a-z]+$" + gd_req: "^CREQ_.+$|^gd_req__.+$" + ``` + + Resolution order for a need of type `T`: `id_regex_by_type[T]` if declared, else `id_regex` (top-level default), else fail the whole check with `reason: "no regex declared for type <T>"` on every need of that type. 
+ +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {<id>: {id, type, ...}, ...}}` shape or the versioned `{"versions": {"<v>": {"needs": {...}}}}` shape (uses `current_version` if declared, else the latest key). Each need object must carry at least `id` and `type`; needs missing either field are reported as violations with `reason: "missing id or type field"`. + +Edge cases: +- Empty corpus (`needs` is `{}`) → `needs_checked: 0, violations: [], overall: "pass"` (vacuously true). +- `id-conventions.yaml` has neither `id_regex` nor `id_regex_by_type` → every need is a violation with `reason: "no regex declared for type <T>"`. +- Regex compilation error (invalid Python regex syntax in the tailoring) → `overall: "fail"` with a single violation `{need_id: "*", type: "<T>", expected_regex: "<bad regex>", reason: "regex compile error: <python error>"}` and `needs_checked: 0`. +- Need `type` not mentioned in `id_regex_by_type` and no top-level default → violation with `reason: "no regex declared for type <T>"`. + +## Output + +```json +{ + "needs_checked": 44, + "violations": [ + { + "need_id": "comp_req__login_ok", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + }, + { + "need_id": "CREQ_a", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + } + ], + "overall": "fail" +} +``` + +`overall` is `"pass"` iff `violations` is empty. `needs_checked` counts every need that was read from `needs.json` (including ones that triggered a "no regex declared" violation — they are still counted). `violations` is sorted by `need_id` ascending for deterministic fixture comparison. `reason` is a short human string: one of `"does not match"`, `"missing id or type field"`, `"no regex declared for type <T>"`, or `"regex compile error: <python error>"`. 
+ +## Detection rule + +For every need `N` in the flattened needs map: + +1. Read `N.id` and `N.type`. If either is absent, emit violation `{need_id: <whatever id is, or "<missing>">, type: <or "<missing>">, expected_regex: null, reason: "missing id or type field"}` and continue. +2. Resolve the regex for `N.type`: first `id_regex_by_type[N.type]`, else top-level `id_regex`. If neither is declared, emit violation `{need_id: N.id, type: N.type, expected_regex: null, reason: "no regex declared for type <N.type>"}` and continue. +3. Compile the regex with Python `re.compile(pattern)`. On `re.error`, emit a single synthetic violation (see Edge cases above) and abort. +4. Apply `re.fullmatch(pattern, N.id)`. If `None`, emit violation `{need_id: N.id, type: N.type, expected_regex: <pattern>, reason: "does not match"}`. + +`fullmatch` (not `search` or `match`) is load-bearing: the regex describes the entire id, anchors or not. This rule is what lets the tailoring author write `^CREQ_.+$|^gd_req__.+$` and have both forms pass without the alternation implicitly anchoring only the first branch. 
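+
+The load-bearing distinction can be checked directly (ids taken from the alternation fixture):
+
```python
import re

pat = r"^CREQ_.+$|^gd_req__.+$"

# fullmatch applies the whole pattern to the whole id; each alternation
# branch carries its own anchors, so both legal forms pass.
assert re.fullmatch(pat, "CREQ_reqif_cli_load")
assert re.fullmatch(pat, "gd_req__brake_activation")

# A partial pattern that re.match would accept is rejected by fullmatch —
# this is why search/match-style substring patterns misbehave here.
assert re.match(r"^CREQ_", "CREQ_trailing junk")
assert re.fullmatch(r"^CREQ_", "CREQ_trailing junk") is None
```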
+
+Minimum viable Python reference implementation (≈ 35 lines):
+
+```python
+import json, re, yaml, sys
+
+conv = yaml.safe_load(open(id_conventions_path))
+nj = json.load(open(needs_json_path))
+if "needs" in nj:
+    needs = nj["needs"]
+else:  # versioned shape: honour current_version, else fall back to the latest key
+    vers = nj.get("versions", {})
+    vkey = nj.get("current_version") or (sorted(vers)[-1] if vers else None)
+    needs = vers.get(vkey, {}).get("needs", {}) if vkey else {}
+
+default = conv.get("id_regex")
+by_type = conv.get("id_regex_by_type", {}) or {}
+
+violations = []
+for nid, n in needs.items():
+    t = n.get("type"); i = n.get("id")  # no fallback to the map key — a missing id field is a violation
+    if not t or not i:
+        violations.append({"need_id": i or "<missing>", "type": t or "<missing>",
+                           "expected_regex": None, "reason": "missing id or type field"}); continue
+    pat = by_type.get(t, default)
+    if pat is None:
+        violations.append({"need_id": i, "type": t, "expected_regex": None,
+                           "reason": f"no regex declared for type {t}"}); continue
+    try:
+        rx = re.compile(pat)
+    except re.error as e:
+        print(json.dumps({"needs_checked": 0, "violations": [
+            {"need_id": "*", "type": t, "expected_regex": pat,
+             "reason": f"regex compile error: {e}"}], "overall": "fail"})); sys.exit(0)
+    if not rx.fullmatch(i):
+        violations.append({"need_id": i, "type": t, "expected_regex": pat, "reason": "does not match"})
+
+violations.sort(key=lambda v: v["need_id"])
+print(json.dumps({"needs_checked": len(needs),
+                  "violations": violations,
+                  "overall": "pass" if not violations else "fail"}))
+```
+
+## Failure modes
+
+- **Scheme auto-detection is explicitly out of scope.** This atom does NOT answer "how many id schemes exist in this corpus?" — that is a tailoring-authoring concern, served by `pharaoh-tailor-detect`. If a project wants to allow two prefixes, the tailoring author writes an alternation regex; this check applies whatever regex is declared.
+- **No Unicode normalisation.** Ids are matched byte-for-byte against the regex. Non-ASCII ids work only if the regex accounts for them. Sphinx-needs ids are ASCII in practice, so this is not a blocker.
+- **No type-name validation against `artefact-catalog.yaml`.** An id of type `T` whose `T` is absent from the artefact catalog will still be checked against the default regex (or flagged with "no regex declared"). Cross-file consistency of type names is `pharaoh-tailor-review`'s job, not this atom's. +- **`fullmatch` semantics.** Writers of the tailoring must know their regex will be `fullmatch`-ed. Adding redundant anchors (`^...$`) is harmless; omitting anchors also works. Using `re.search`-style partial patterns that were intended to match substrings will misbehave — document this in project tailoring. + +## Composition + +Role: `atom-check`. + +Called by `pharaoh-quality-gate` when `required_checks` contains `id_convention_consistent: true`, under the invariant delegation entry `id-convention-consistent`. Never invokes other skills; never dispatched from emission skills. May also be invoked directly by a human auditor inspecting a corpus. diff --git a/skills/pharaoh-id-convention-check/fixtures/all-conform/README.md b/skills/pharaoh-id-convention-check/fixtures/all-conform/README.md new file mode 100644 index 0000000..a06326a --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/all-conform/README.md @@ -0,0 +1,3 @@ +# all-conform + +Canonical happy path. The tailoring declares a per-type regex for `comp_req` (`^CREQ_[a-z]+_[a-z]+_[a-z]+$`) and `test_case` (`^tc__[a-z0-9_]+$`), plus a sensible top-level default. Every need in `input-needs.json` matches the regex for its type, so `overall == "pass"` and `violations` is empty. Exercises the per-type resolution path and confirms `fullmatch` accepts well-formed ids. 
diff --git a/skills/pharaoh-id-convention-check/fixtures/all-conform/expected-output.json b/skills/pharaoh-id-convention-check/fixtures/all-conform/expected-output.json new file mode 100644 index 0000000..7e4bb61 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/all-conform/expected-output.json @@ -0,0 +1,5 @@ +{ + "needs_checked": 5, + "violations": [], + "overall": "pass" +} diff --git a/skills/pharaoh-id-convention-check/fixtures/all-conform/input-id-conventions.yaml b/skills/pharaoh-id-convention-check/fixtures/all-conform/input-id-conventions.yaml new file mode 100644 index 0000000..dfa6dcb --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/all-conform/input-id-conventions.yaml @@ -0,0 +1,8 @@ +# all-conform fixture — two types, each with its own per-type regex. +# Every id in input-needs.json matches the regex for its type. + +id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + +id_regex_by_type: + comp_req: "^CREQ_[a-z]+_[a-z]+_[a-z]+$" + test_case: "^tc__[a-z0-9_]+$" diff --git a/skills/pharaoh-id-convention-check/fixtures/all-conform/input-needs.json b/skills/pharaoh-id-convention-check/fixtures/all-conform/input-needs.json new file mode 100644 index 0000000..1c1344a --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/all-conform/input-needs.json @@ -0,0 +1,29 @@ +{ + "needs": { + "CREQ_reqif_cli_load": { + "id": "CREQ_reqif_cli_load", + "type": "comp_req", + "title": "Load a ReqIF file via the CLI" + }, + "CREQ_reqif_cli_export": { + "id": "CREQ_reqif_cli_export", + "type": "comp_req", + "title": "Export a ReqIF file via the CLI" + }, + "CREQ_csv_reader_parse": { + "id": "CREQ_csv_reader_parse", + "type": "comp_req", + "title": "Parse a CSV file into needs" + }, + "tc__reqif_cli_load_happy_path": { + "id": "tc__reqif_cli_load_happy_path", + "type": "test_case", + "title": "Happy path for ReqIF CLI load" + }, + "tc__csv_reader_parse_empty": { + "id": "tc__csv_reader_parse_empty", + "type": "test_case", + "title": "CSV reader on 
empty input" + } + } +} diff --git a/skills/pharaoh-id-convention-check/fixtures/alternation-regex/README.md b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/README.md new file mode 100644 index 0000000..7eb878f --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/README.md @@ -0,0 +1,8 @@ +# alternation-regex + +Exercises the "two legal forms" policy the skill supports via regex alternation, not via auto-detection. The tailoring declares a single regex for `comp_req` — `^CREQ_.+$|^gd_req__.+$` — which explicitly accepts both the new `CREQ_*` prefix and the legacy `gd_req__*` prefix. + +All four ids in `input-needs.json` are valid: two match the first branch, two match the second. Expected output: `overall: "pass"`, empty `violations`. This fixture confirms: + +1. `re.fullmatch` is applied — both alternation branches have explicit anchors and both match. +2. The atom does not report "mixed schemes detected" — scheme policy is the tailoring author's decision, encoded in the regex value. The atom only applies the regex. diff --git a/skills/pharaoh-id-convention-check/fixtures/alternation-regex/expected-output.json b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/expected-output.json new file mode 100644 index 0000000..d18e68c --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/expected-output.json @@ -0,0 +1,5 @@ +{ + "needs_checked": 4, + "violations": [], + "overall": "pass" +} diff --git a/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-id-conventions.yaml b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-id-conventions.yaml new file mode 100644 index 0000000..9b6ab57 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-id-conventions.yaml @@ -0,0 +1,7 @@ +# alternation-regex fixture — the tailoring author explicitly allows two +# legal forms via an alternation. Both forms must pass. 
+ +id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + +id_regex_by_type: + comp_req: "^CREQ_.+$|^gd_req__.+$" diff --git a/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-needs.json b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-needs.json new file mode 100644 index 0000000..b39c700 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/alternation-regex/input-needs.json @@ -0,0 +1,24 @@ +{ + "needs": { + "CREQ_reqif_cli_load": { + "id": "CREQ_reqif_cli_load", + "type": "comp_req", + "title": "New-style prefix — matches the first branch of the alternation" + }, + "CREQ_csv_reader_parse": { + "id": "CREQ_csv_reader_parse", + "type": "comp_req", + "title": "Another new-style id" + }, + "gd_req__brake_activation": { + "id": "gd_req__brake_activation", + "type": "comp_req", + "title": "Legacy prefix — matches the second branch of the alternation" + }, + "gd_req__safety_goal_1": { + "id": "gd_req__safety_goal_1", + "type": "comp_req", + "title": "Another legacy id" + } + } +} diff --git a/skills/pharaoh-id-convention-check/fixtures/some-violate/README.md b/skills/pharaoh-id-convention-check/fixtures/some-violate/README.md new file mode 100644 index 0000000..9e433c2 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/some-violate/README.md @@ -0,0 +1,9 @@ +# some-violate + +Mixed corpus exercising the failure path. Five needs across two types: + +- 2 conforming (`CREQ_reqif_cli_load`, `tc__reqif_cli_load_happy_path`) — do not appear in the output. +- 2 non-conforming `comp_req` ids (`CREQ_reqif_cli` has only two segments after the prefix; `CREQ_a` is a single-segment tail). +- 1 non-conforming `test_case` id (`TC_REQIF_CLI_LOAD` uses the wrong case and separator). + +Expected output: `overall: "fail"`, `needs_checked: 5`, `violations` sorted by `need_id` with three entries each naming the regex actually applied and the `"does not match"` reason. 
Confirms per-type resolution selects the right regex for each offender and that the output order is deterministic. diff --git a/skills/pharaoh-id-convention-check/fixtures/some-violate/expected-output.json b/skills/pharaoh-id-convention-check/fixtures/some-violate/expected-output.json new file mode 100644 index 0000000..7015490 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/some-violate/expected-output.json @@ -0,0 +1,24 @@ +{ + "needs_checked": 5, + "violations": [ + { + "need_id": "CREQ_a", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + }, + { + "need_id": "CREQ_reqif_cli", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + }, + { + "need_id": "TC_REQIF_CLI_LOAD", + "type": "test_case", + "expected_regex": "^tc__[a-z0-9_]+$", + "reason": "does not match" + } + ], + "overall": "fail" +} diff --git a/skills/pharaoh-id-convention-check/fixtures/some-violate/input-id-conventions.yaml b/skills/pharaoh-id-convention-check/fixtures/some-violate/input-id-conventions.yaml new file mode 100644 index 0000000..14b360d --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/some-violate/input-id-conventions.yaml @@ -0,0 +1,7 @@ +# some-violate fixture — same schema as all-conform, but some ids diverge. 
+ +id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + +id_regex_by_type: + comp_req: "^CREQ_[a-z]+_[a-z]+_[a-z]+$" + test_case: "^tc__[a-z0-9_]+$" diff --git a/skills/pharaoh-id-convention-check/fixtures/some-violate/input-needs.json b/skills/pharaoh-id-convention-check/fixtures/some-violate/input-needs.json new file mode 100644 index 0000000..ade4810 --- /dev/null +++ b/skills/pharaoh-id-convention-check/fixtures/some-violate/input-needs.json @@ -0,0 +1,29 @@ +{ + "needs": { + "CREQ_reqif_cli_load": { + "id": "CREQ_reqif_cli_load", + "type": "comp_req", + "title": "Well-formed comp_req" + }, + "CREQ_reqif_cli": { + "id": "CREQ_reqif_cli", + "type": "comp_req", + "title": "Only two underscore-separated segments — one short of the required three" + }, + "CREQ_a": { + "id": "CREQ_a", + "type": "comp_req", + "title": "Single-segment tail — rejected" + }, + "tc__reqif_cli_load_happy_path": { + "id": "tc__reqif_cli_load_happy_path", + "type": "test_case", + "title": "Well-formed test_case" + }, + "TC_REQIF_CLI_LOAD": { + "id": "TC_REQIF_CLI_LOAD", + "type": "test_case", + "title": "Upper-case prefix — violates the tc__ convention" + } + } +} diff --git a/skills/pharaoh-link-completeness-check/SKILL.md b/skills/pharaoh-link-completeness-check/SKILL.md new file mode 100644 index 0000000..75763c7 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/SKILL.md @@ -0,0 +1,139 @@ +--- +name: pharaoh-link-completeness-check +description: Use when verifying outgoing-link coverage across a full needs.json graph. For each declared link type in `artefact-catalog.yaml`, confirms every need of the governed type carries a non-empty value AND every target id resolves to an existing need. Closes the "catalogue declares `verifies` required but half the reqs ship without it" failure class. 
+--- + +# pharaoh-link-completeness-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant `link-types-covered`) or from any corpus-level lint that wants to fail the build when declared link coverage slips. Reads one `artefact-catalog.yaml` + one `needs.json`, returns coverage metrics per link type plus the list of uncovered need ids. + +Scope clarification — NOT a schema check for individual directive blocks. Use `pharaoh-output-validate` for block-level schema validation (required fields present, no unknown options, well-formed RST / YAML / JSON). This atom operates on the full needs.json graph: coverage of link types across all needs of each type, link-target resolution, per-type policy enforcement. + +Do NOT use to author the catalog (that is `pharaoh-tailor-fill`). Do NOT use to re-link or patch missing links (read-only). Do NOT use to grade prose quality of the linked needs. + +## Atomicity + +- (a) Indivisible: one artefact catalog + one needs.json in → one findings JSON out. No re-linking, no re-authoring, no dispatch of other skills. +- (b) Input: `{artefact_catalog_path: str, needs_json_path: str}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-link-completeness-check/fixtures/`: + 1. `all-covered/` — every need of every governed type carries every declared-required outgoing link AND every target resolves → `expected-output.json` with `overall: "pass"`, zero `missing` across `coverage_by_link_type`, empty `uncovered_needs`. + 2. `partial-coverage/` — some `comp_req` needs lack `:verifies:` and one need points at a non-existent id → `overall: "fail"`, `coverage_by_link_type.verifies.missing > 0`, `uncovered_needs` lists every failing id once. + 3. 
`tailoring-declares-verifies-optional/` — artefact-catalog marks `verifies` as optional for `comp_req`; those needs have no `:verifies:` field → `overall: "pass"`, `coverage_by_link_type.verifies.required: false`, no entries in `uncovered_needs` for that link type. + + Pass = each fixture's actual output matches `expected-output.json` modulo ordering of list elements. +- (d) Reusable across projects — consumes only the generic `artefact-catalog.yaml` + `needs.json` shapes. No project-specific link names or prefixes baked in. Tailoring extension point: the set of governed types and their `required_links` / `optional_links` is declared entirely in the catalog. +- (e) Read-only. Does not modify catalog, needs, or any on-disk state. Running twice on identical inputs produces identical output. + +## Input + +- `artefact_catalog_path`: absolute path to `artefact-catalog.yaml`. Each top-level key is a need `type`. Each type may declare `required_links: [<link_name>, ...]` and `optional_links: [<link_name>, ...]`. If a type declares neither, it is skipped (no policy, no failures). If a link name appears in both lists, `required_links` wins. +- `needs_json_path`: absolute path to `needs.json` produced by `sphinx-build`. Must contain a top-level `needs` object keyed by need id. Each need dict carries at least `type`, `id`, and any link-name keys whose values are lists of target ids. + +Edge cases: empty `needs.json` (no needs) → `overall: "pass"`, `needs_checked: 0`, empty `coverage_by_link_type`; missing `artefact-catalog.yaml` → fail with `overall: "error"`, `errors: ["artefact_catalog not found: <path>"]`; malformed YAML / JSON → fail with `overall: "error"` and the parser message; needs of a type not declared in the catalog → counted in `needs_checked` but contribute no link-coverage rows. 
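The both-lists precedence rule above ("`required_links` wins") can be sketched as a tiny normalization pass over one catalog entry. The helper name is hypothetical and not part of the atom's contract:

```python
def normalize_policy(entry):
    """Resolve one catalog type entry; a link named in both lists stays required."""
    required = set(entry.get("required_links") or [])
    # required_links wins: drop any duplicate from the optional set
    optional = set(entry.get("optional_links") or []) - required
    return {"required_links": sorted(required), "optional_links": sorted(optional)}

# `verifies` is (incorrectly) listed in both lists; it stays required.
policy = normalize_policy({
    "required_links": ["satisfies", "verifies"],
    "optional_links": ["verifies", "refines"],
})
```

A type that declares neither list normalizes to two empty lists, matching the "no policy, no failures" skip rule above.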
+ +## Output + +```json +{ + "needs_checked": 40, + "coverage_by_link_type": { + "satisfies": {"required": true, "covered": 40, "missing": 0}, + "verifies": {"required": true, "covered": 38, "missing": 2} + }, + "uncovered_needs": ["comp_req__auth_login", "comp_req__auth_logout"], + "unresolved_targets": [ + {"need_id": "comp_req__auth_refresh", "link": "verifies", "target": "tc__auth_refresh_ok", "reason": "target id not in needs.json"} + ], + "overall": "fail" +} +``` + +`overall` is `"pass"` iff every required link type has `missing == 0` AND `unresolved_targets` is empty. A single required-link gap OR a single unresolved target promotes `overall: "fail"`. Optional link types are reported in `coverage_by_link_type` with `required: false` and never contribute to the gate outcome — their `missing` counts are informational only. Needs whose type is absent from the catalog are counted in `needs_checked` but never populate `uncovered_needs`. + +`uncovered_needs` lists each need id at most once, even when it misses more than one required link type. `unresolved_targets` enumerates every broken target separately so the caller can name each dangling pointer. + +On input errors, the shape is `{"overall": "error", "errors": [<msg>, ...]}` with no other keys — callers branch on `overall` first. + +## Detection rule + +Four mechanical steps over the inputs; no LLM judgement. + +### 1. Load and index + +**Check:** Parse `artefact-catalog.yaml` into `{type: {required_links: set, optional_links: set}}`. Parse `needs.json` into `{need_id: need_dict}`. Build `known_ids = set(needs.keys())`. + +**Detection:** +```python +catalog = yaml.safe_load(open(artefact_catalog_path)) +needs = json.load(open(needs_json_path))["needs"] +known_ids = set(needs.keys()) +``` + +### 2. Per-need outgoing-link coverage + +**Check:** For each need, look up the catalog entry for its `type`.
For every link name in `required_links`, the need's dict must have that key AND the value must be a non-empty list. Missing key OR empty list records the need id in `uncovered_needs` and increments `coverage_by_link_type[<link>].missing`. + +Needs whose type is not declared in the catalog contribute to `needs_checked` but generate no coverage rows. Optional links that are absent do not fail; when present, their targets are still resolved (step 3). + +**Detection:** +```python +for nid, need in needs.items(): + policy = catalog.get(need["type"]) + if not policy: + continue + for link_name in policy.get("required_links", []): + value = need.get(link_name) or [] + if not value: + uncovered.add(nid) + coverage[link_name]["missing"] += 1 + else: + coverage[link_name]["covered"] += 1 +``` + +### 3. Target resolution + +**Check:** For every link value (required OR optional) whose list is non-empty, each target id must appear in `known_ids`. A target absent from `known_ids` records an entry in `unresolved_targets` with `{need_id, link, target, reason}`. Unresolved targets count as coverage failures even when the link itself is present and non-empty — a link that points nowhere is worse than no link. + +**Detection:** +```python +for nid, need in needs.items(): + policy = catalog.get(need["type"]) + if not policy: + continue + all_links = set(policy.get("required_links", [])) | set(policy.get("optional_links", [])) + for link_name in all_links: + for target in need.get(link_name) or []: + if target not in known_ids: + unresolved.append({ + "need_id": nid, + "link": link_name, + "target": target, + "reason": "target id not in needs.json", + }) +``` + +### 4. Aggregate + +**Check:** `overall = "pass"` iff every required link has `missing == 0` AND `unresolved_targets == []`. Otherwise `"fail"`. `uncovered_needs` is the deduplicated sorted list of need ids that missed at least one required link. 
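Step 4 carries no **Detection** snippet; a minimal sketch, assuming the `coverage`, `uncovered`, and `unresolved` structures that steps 2 and 3 build (the sample values here are hypothetical, mirroring the `partial-coverage` fixture):

```python
# Hypothetical values, as steps 2 and 3 would leave them.
coverage = {
    "satisfies": {"required": True, "covered": 4, "missing": 0},
    "verifies": {"required": True, "covered": 3, "missing": 2},
}
uncovered = {"comp_req__billing_refund", "comp_req__billing_dispute"}
unresolved = [{"need_id": "comp_req__billing_export", "link": "verifies",
               "target": "tc__does_not_exist",
               "reason": "target id not in needs.json"}]

required_gap = any(row["required"] and row["missing"] > 0
                   for row in coverage.values())
result = {
    "needs_checked": 6,
    "coverage_by_link_type": coverage,
    "uncovered_needs": sorted(uncovered),  # set gives at-most-once, sort gives determinism
    "unresolved_targets": unresolved,
    "overall": "fail" if required_gap or unresolved else "pass",
}
```

Keeping `uncovered` as a set during steps 2 and 3 gives the at-most-once guarantee for free; sorting at the end keeps the output deterministic for fixture comparison.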
+ +## Tailoring extension point + +All policy is declared in `artefact-catalog.yaml` — no frontmatter knobs on this skill. Projects add or remove link types by editing the catalog entry for each type: + +```yaml +comp_req: + required_links: [satisfies, verifies] + optional_links: [refines, supersedes] +``` + +Moving a link name from `required_links` to `optional_links` (or vice versa) is the single tailoring lever. The base skill ships with zero hardcoded link names. + +## Composition + +Role: `atom-check`. + +Called from `pharaoh-quality-gate.required_checks` under the invariant key `link-types-covered`, which passes iff `overall == "pass"`. Also directly invokable from any corpus-level lint or CI job that produces a `needs.json`. + +Never invoked by end users mid-authoring — authoring-time link checks belong in `pharaoh-req-review` / `pharaoh-arch-review` for the single artefact in hand. This atom is for full-graph sweeps after `sphinx-build` has produced `needs.json`. diff --git a/skills/pharaoh-link-completeness-check/fixtures/all-covered/README.md b/skills/pharaoh-link-completeness-check/fixtures/all-covered/README.md new file mode 100644 index 0000000..357507e --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/all-covered/README.md @@ -0,0 +1,5 @@ +# all-covered + +Canonical happy path. The catalog declares `satisfies` and `verifies` required for `comp_req`, and `verifies` required for `tc`. Every `comp_req` in the corpus carries both links with non-empty lists, every `tc` carries `verifies`, and every target id resolves to an existing need. `feat` has no required outgoing links so it contributes to `needs_checked` but to no coverage row. + +Expected: `overall: "pass"`, zero `missing` across `coverage_by_link_type`, empty `uncovered_needs`, empty `unresolved_targets`. 
diff --git a/skills/pharaoh-link-completeness-check/fixtures/all-covered/expected-output.json b/skills/pharaoh-link-completeness-check/fixtures/all-covered/expected-output.json new file mode 100644 index 0000000..287a6e2 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/all-covered/expected-output.json @@ -0,0 +1,10 @@ +{ + "needs_checked": 5, + "coverage_by_link_type": { + "satisfies": {"required": true, "covered": 2, "missing": 0}, + "verifies": {"required": true, "covered": 4, "missing": 0} + }, + "uncovered_needs": [], + "unresolved_targets": [], + "overall": "pass" +} diff --git a/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-artefact-catalog.yaml b/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-artefact-catalog.yaml new file mode 100644 index 0000000..c621dc4 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-artefact-catalog.yaml @@ -0,0 +1,13 @@ +# all-covered/input-artefact-catalog.yaml +# Canonical catalog: feat is a top-level capability (no outgoing links required), +# comp_req must satisfy a parent feat and be verified by at least one tc, +# tc must verify at least one comp_req. 
+feat: + required_links: [] + optional_links: [refines] +comp_req: + required_links: [satisfies, verifies] + optional_links: [] +tc: + required_links: [verifies] + optional_links: [] diff --git a/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-needs.json b/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-needs.json new file mode 100644 index 0000000..70152b2 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/all-covered/input-needs.json @@ -0,0 +1,40 @@ +{ + "needs": { + "feat__inventory": { + "id": "feat__inventory", + "type": "feat", + "title": "Inventory management", + "status": "valid" + }, + "comp_req__inventory_read": { + "id": "comp_req__inventory_read", + "type": "comp_req", + "title": "Read inventory file", + "status": "valid", + "satisfies": ["feat__inventory"], + "verifies": ["tc__inventory_read_ok"] + }, + "comp_req__inventory_write": { + "id": "comp_req__inventory_write", + "type": "comp_req", + "title": "Write inventory file", + "status": "valid", + "satisfies": ["feat__inventory"], + "verifies": ["tc__inventory_write_ok"] + }, + "tc__inventory_read_ok": { + "id": "tc__inventory_read_ok", + "type": "tc", + "title": "Read inventory happy path", + "status": "valid", + "verifies": ["comp_req__inventory_read"] + }, + "tc__inventory_write_ok": { + "id": "tc__inventory_write_ok", + "type": "tc", + "title": "Write inventory happy path", + "status": "valid", + "verifies": ["comp_req__inventory_write"] + } + } +} diff --git a/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/README.md b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/README.md new file mode 100644 index 0000000..206c98a --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/README.md @@ -0,0 +1,11 @@ +# partial-coverage + +Exercises the three failure modes of this atom: + +1. `comp_req__billing_refund` is missing the `:verifies:` key entirely — counted as missing. +2. 
`comp_req__billing_dispute` has `verifies: []` — empty list also counts as missing. +3. `comp_req__billing_export` carries a non-empty `verifies` but its only target (`tc__does_not_exist`) is not in `needs.json` — recorded in `unresolved_targets`. + +`tc` coverage is computed on `tc__billing_invoice_ok` alone (one of four reqs has a matching tc). `satisfies` is fully covered because every `comp_req` points at `feat__billing`, which exists. + +Expected: `overall: "fail"`, `coverage_by_link_type.verifies.missing == 2`, two ids in `uncovered_needs` (deduplicated, sorted), one entry in `unresolved_targets`. diff --git a/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/expected-output.json b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/expected-output.json new file mode 100644 index 0000000..632cf26 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/expected-output.json @@ -0,0 +1,12 @@ +{ + "needs_checked": 6, + "coverage_by_link_type": { + "satisfies": {"required": true, "covered": 4, "missing": 0}, + "verifies": {"required": true, "covered": 3, "missing": 2} + }, + "uncovered_needs": ["comp_req__billing_dispute", "comp_req__billing_refund"], + "unresolved_targets": [ + {"need_id": "comp_req__billing_export", "link": "verifies", "target": "tc__does_not_exist", "reason": "target id not in needs.json"} + ], + "overall": "fail" +} diff --git a/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-artefact-catalog.yaml b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-artefact-catalog.yaml new file mode 100644 index 0000000..7d3d644 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-artefact-catalog.yaml @@ -0,0 +1,11 @@ +# partial-coverage/input-artefact-catalog.yaml +# Same policy as all-covered: comp_req requires satisfies + verifies, tc requires verifies.
+feat: + required_links: [] + optional_links: [] +comp_req: + required_links: [satisfies, verifies] + optional_links: [] +tc: + required_links: [verifies] + optional_links: [] diff --git a/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-needs.json b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-needs.json new file mode 100644 index 0000000..17696a6 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/partial-coverage/input-needs.json @@ -0,0 +1,48 @@ +{ + "needs": { + "feat__billing": { + "id": "feat__billing", + "type": "feat", + "title": "Billing", + "status": "valid" + }, + "comp_req__billing_invoice": { + "id": "comp_req__billing_invoice", + "type": "comp_req", + "title": "Issue invoice", + "status": "valid", + "satisfies": ["feat__billing"], + "verifies": ["tc__billing_invoice_ok"] + }, + "comp_req__billing_refund": { + "id": "comp_req__billing_refund", + "type": "comp_req", + "title": "Issue refund", + "status": "valid", + "satisfies": ["feat__billing"] + }, + "comp_req__billing_dispute": { + "id": "comp_req__billing_dispute", + "type": "comp_req", + "title": "Handle dispute", + "status": "valid", + "satisfies": ["feat__billing"], + "verifies": [] + }, + "comp_req__billing_export": { + "id": "comp_req__billing_export", + "type": "comp_req", + "title": "Export billing history", + "status": "valid", + "satisfies": ["feat__billing"], + "verifies": ["tc__does_not_exist"] + }, + "tc__billing_invoice_ok": { + "id": "tc__billing_invoice_ok", + "type": "tc", + "title": "Invoice happy path", + "status": "valid", + "verifies": ["comp_req__billing_invoice"] + } + } +} diff --git a/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/README.md b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/README.md new file mode 100644 index 0000000..a8bc66f --- /dev/null +++ 
b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/README.md @@ -0,0 +1,7 @@ +# tailoring-declares-verifies-optional + +Demonstrates the one tailoring lever: moving `verifies` from `required_links` to `optional_links` on `comp_req` downgrades its absence from a gate failure to an informational metric. + +Both `comp_req` needs omit `:verifies:`. Because the catalog marks it optional, `coverage_by_link_type.verifies.required` is `false` and the two missing values do not populate `uncovered_needs` nor flip `overall`. `satisfies` remains required and both reqs carry it, so the gate passes. + +Expected: `overall: "pass"`, `verifies.required: false`, `verifies.missing: 2` (informational), `uncovered_needs` empty, `unresolved_targets` empty. diff --git a/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/expected-output.json b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/expected-output.json new file mode 100644 index 0000000..1e3ecc7 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/expected-output.json @@ -0,0 +1,10 @@ +{ + "needs_checked": 3, + "coverage_by_link_type": { + "satisfies": {"required": true, "covered": 2, "missing": 0}, + "verifies": {"required": false, "covered": 0, "missing": 2} + }, + "uncovered_needs": [], + "unresolved_targets": [], + "overall": "pass" +} diff --git a/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-artefact-catalog.yaml b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-artefact-catalog.yaml new file mode 100644 index 0000000..ceb47c6 --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-artefact-catalog.yaml @@ -0,0 +1,9 @@ +# tailoring-declares-verifies-optional/input-artefact-catalog.yaml +# Project-level decision: `verifies` is 
advisory during early-stage authoring. +# Catalog marks it optional for comp_req; its absence must NOT fail the gate. +feat: + required_links: [] + optional_links: [] +comp_req: + required_links: [satisfies] + optional_links: [verifies] diff --git a/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-needs.json b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-needs.json new file mode 100644 index 0000000..c06935b --- /dev/null +++ b/skills/pharaoh-link-completeness-check/fixtures/tailoring-declares-verifies-optional/input-needs.json @@ -0,0 +1,24 @@ +{ + "needs": { + "feat__search": { + "id": "feat__search", + "type": "feat", + "title": "Search", + "status": "valid" + }, + "comp_req__search_query": { + "id": "comp_req__search_query", + "type": "comp_req", + "title": "Accept query string", + "status": "draft", + "satisfies": ["feat__search"] + }, + "comp_req__search_results": { + "id": "comp_req__search_results", + "type": "comp_req", + "title": "Rank search results", + "status": "draft", + "satisfies": ["feat__search"] + } + } +} diff --git a/skills/pharaoh-output-validate/SKILL.md b/skills/pharaoh-output-validate/SKILL.md new file mode 100644 index 0000000..980cb9b --- /dev/null +++ b/skills/pharaoh-output-validate/SKILL.md @@ -0,0 +1,233 @@ +--- +name: pharaoh-output-validate +description: Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). Returns {valid, errors, parsed, recovery}. Callers gate subagent output through this before writing anything to disk. +--- + +# pharaoh-output-validate + +## When to use + +Invoke from `pharaoh-execute-plan` after each dispatched task returns, to check that the task's raw output matches the emitting skill's declared `## Output schema` section. 
Also invoke directly from any other skill or human checking emitted content. Reject output that fails validation; optionally retry with stricter prompt; never write drifted output to disk. + +Do NOT use to generate output (that is the emitting skill). Do NOT use to parse output that already passed validation (the `parsed` field carries the structured form for you). + +## Atomicity + +- (a) Indivisible — one target description in → one validation result out. The atom has a single responsibility: **"validate required fields for this artefact type are present and well-formed."** Two input shapes are exposed via the `mode` input: + - `mode: "block"` (default; backward-compatible): one output string + one target schema + schema context. Validates one directive block against a declared schema. + - `mode: "graph"`: one `needs.json` + one `artefact-catalog.yaml` path. Validates every need's tailored `required_metadata_fields` across the full graph. + + The mode toggle selects the input shape — it does NOT add a second responsibility. In both modes the atom asks the same question per-artefact ("does this need carry the required fields declared for its type?") and returns the same verdict axis. No mutation of inputs. No re-dispatch. No logging beyond the structured return. +- (b) Input: + - block mode: `{mode?: "block", output_text: str, target_schema: "rst_directive"|"codelinks_comment"|"yaml_map"|"json_obj", schema_context: dict, strip_fences?: bool}`. `schema_context` fields vary per `target_schema`; documented in `## Schema context`. + - graph mode: `{mode: "graph", needs_json_path: str, artefact_catalog_path: str}`. + + Output (shared shape; `parsed` / `recovery` are block-mode-only): + - block mode: `{valid: bool, errors: list[str], parsed: object|null, recovery: {stripped_text: str|null}}`. + - graph mode: `{valid: bool, errors: list[str], needs_checked: int, violations: [{need_id, type, missing_fields: [str]}]}`. 
+- (c) Reward: block-mode fixtures in `pharaoh-validation/fixtures/pharaoh-output-validate/` + graph-mode fixtures in `skills/pharaoh-output-validate/fixtures/`. + + Block-mode (4 fixtures): + 1. `sample_clean.rst` with `target_schema="rst_directive"`, `schema_context={directive: "feat", required_options: ["id", "status", "source_doc"]}` → `valid=true`, `parsed` contains one block with the expected fields. + 2. `sample_fenced.md` with the same schema → `valid=false` without `strip_fences`; with `strip_fences=true` → `valid=true` and `recovery.stripped_text` set. + 3. `sample_prose_wrapped.rst` → `valid=false` regardless of `strip_fences` (prose is not a fence). Errors name the surrounding prose. + 4. `sample_typo_option.rst` with `schema_context={directive: "comp_req", required_options: ["id", "status"], allowed_options: ["id", "status", "satisfies"]}` → `valid=false`. Errors name `subsatisfies` as unknown. + + Graph-mode (3 fixtures in `skills/pharaoh-output-validate/fixtures/`): + 5. `graph-all-metadata-present/` — catalog declares `required_metadata_fields` for each type; every need carries every field non-empty → `valid=true`, empty `violations`. + 6. `graph-missing-tags/` — catalog declares `:tags:` required for `comp_req`; several `comp_req` needs lack `tags` → `valid=false`, `violations` lists the offenders with `missing_fields: ["tags"]`. + 7. `graph-empty-required-list/` — catalog declares `required_metadata_fields: []` (or omits the key) for a type; that type's needs carry no metadata → `valid=true` (nothing to check). + + Pass = all 7 produce the stated result. +- (d) Reusable: any composition skill that dispatches emission subagents needs this. +- (e) Composable: this skill never calls emission skills back. It is purely a parser. + +## Input + +- `mode` (optional, default `"block"`): one of `"block"` or `"graph"`. Selects the input shape and processing branch. Existing callers that omit `mode` get block mode — fully backward-compatible. 
+ +### Block-mode fields (used when `mode == "block"`) + +- `output_text`: the raw text the subagent returned. May include prefixes like `# emit=rst` from `pharaoh-req-from-code` — the validator strips documented prefixes before parsing. +- `target_schema`: one of: + - `"rst_directive"` — expect one or more RST directive blocks per `pharaoh-req-from-code`'s Output schema Stage 1 / Stage 2 regex. + - `"codelinks_comment"` — expect one or more sphinx-codelinks one-line comments parseable by the tailored `oneline_comment_style`. + - `"yaml_map"` — expect a YAML document with a specific top-level key shape. + - `"json_obj"` — expect a JSON object with specific required keys. +- `schema_context`: schema-specific context. See `## Schema context`. +- `strip_fences` (optional, default `false`): if `true`, one automatic recovery attempt strips a leading/trailing triple-backtick fence (with optional language hint) before re-validating. + +### Graph-mode fields (used when `mode == "graph"`) + +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {<id>: {...}, ...}}` shape or the versioned `{"versions": {"<v>": {"needs": {...}}}}` shape (uses `current_version` if declared, else the latest key). +- `artefact_catalog_path`: absolute path to `.pharaoh/project/artefact-catalog.yaml`. Each top-level key is a need `type`; the validator reads `required_metadata_fields: [<field_name>, ...]` per type. Empty list → no metadata check for that type. Absent key → treated as empty (no check, not an error). + +## Schema context + +Per `target_schema`: + +- `"rst_directive"`: `{directive: str, required_options: list[str], allowed_options?: list[str], parent_ids?: list[str]}`. `allowed_options` extends the built-in sphinx-needs options + `source_doc` Pharaoh convention. If `parent_ids` is non-empty, the validator checks that `satisfies` (or tailored link name) is present and lists every id. 
+- `"codelinks_comment"`: `{oneline_style: {start_sequence: str, field_split_char: str, needs_fields: list[dict]}}` — exact shape of `[codelinks.projects.<name>.analyse.oneline_comment_style]`. +- `"yaml_map"`: `{required_top_level_key: str, required_sub_keys: list[str], allowed_sub_keys: list[str]}`. +- `"json_obj"`: `{required_keys: list[str], allowed_unknown_keys: bool}`. + +## Output + +### Block mode + +```json +{ + "valid": true, + "errors": [], + "parsed": [ + { + "directive": "feat", + "title": "CSV Export", + "options": {"id": "FEAT_csv_export", "status": "draft", "source_doc": "features/csv.rst"}, + "body": "The system shall export sphinx-needs data to CSV files." + } + ], + "recovery": {"stripped_text": null} +} +``` + +On `valid=false`, `parsed` is `null`. `errors` is a list of human-readable strings naming each violation with line numbers where possible. + +### Graph mode + +```json +{ + "valid": false, + "errors": [], + "needs_checked": 44, + "violations": [ + {"need_id": "comp_req__auth_login", "type": "comp_req", "missing_fields": ["tags"]}, + {"need_id": "comp_req__auth_logout", "type": "comp_req", "missing_fields": ["tags", "priority"]} + ] +} +``` + +`valid` is `true` iff `violations` is empty. `needs_checked` counts every need read from `needs.json` (including ones whose type has no `required_metadata_fields` declared — they are still counted). `violations` is sorted by `need_id` ascending for deterministic fixture comparison. `errors` is reserved for structural problems (missing / unparseable input files) and is disjoint from `violations`: an error short-circuits with `valid: false`, empty `violations`, and `needs_checked: 0`. + +## Recovery modes + +Strict by default. One automatic recovery when `strip_fences=true`: + +- If `output_text` starts with a triple-backtick fence (optionally with language hint) and ends with closing fence, strip fences and re-validate. If re-validation passes, return `valid=true` with `recovery.stripped_text` set. 
If it still fails, return `valid=false` with both original and stripped errors. + +The validator never silently recovers from prose wrapping or option typos — those are always `valid=false`. The caller decides whether to re-dispatch the subagent or fail. + +## Process + +If `mode == "graph"`, skip directly to `## Graph mode` below. The steps in this section apply to block mode only. + +### Step 1: Strip emit-header prefix if present + +If `output_text` starts with `# emit=rst\n` or `# emit=codelinks_comment\n`, remove that line. Record what was stripped (for error messages). + +### Step 2: Handle fence recovery (if `strip_fences=true`) + +If `output_text` (after emit-header strip) matches `^```[a-z]*\n(.+?)\n```\s*$` (with `re.DOTALL`), capture the inner content. Validate the inner content as if it were the original. If it validates, return `valid=true` with `recovery.stripped_text` set. If it does not, fall through to validate the original and include both error sets. + +### Step 3: Dispatch to schema-specific parser + +Per `target_schema`, apply the parser: + +- `rst_directive`: Stage 1 + Stage 2 regex from `pharaoh-req-from-code` `## Output schema`. Iterate blocks, enumerate options per block. +- `codelinks_comment`: invoke sphinx-codelinks' own `oneline_parser.parse_line()` per line. +- `yaml_map`: `yaml.safe_load`, check shape. +- `json_obj`: `json.loads`, check keys. + +### Step 4: Apply schema-specific checks + +Per `target_schema` and `schema_context`: + +- `rst_directive`: directive equals `directive`; every `required_options` present; no option outside `allowed_options ∪ {required_options}`; if `parent_ids` given, `satisfies` value contains each; no non-blank content after last block. +- `codelinks_comment`: `parse_line()` returns a dict with every `needs_fields[].name` populated (or default applied). 
+- `yaml_map`: exactly one top-level key equal to `required_top_level_key`; sub-keys include every `required_sub_keys`; no sub-key outside `allowed_sub_keys ∪ required_sub_keys`. +- `json_obj`: every `required_keys` present; if `allowed_unknown_keys` is `false`, no unknown keys. + +### Step 5: Return + +```json +{"valid": true|false, "errors": [...], "parsed": ..., "recovery": {"stripped_text": ...}} +``` + +## Graph mode + +Graph mode validates the tailored `required_metadata_fields` across every need in `needs.json`. It is the delegated check for the `metadata_fields_present` invariant in `pharaoh-quality-gate`. + +### Process + +1. **Load.** Parse `artefact_catalog_path` via `yaml.safe_load` into `{type: {required_metadata_fields: [str]}}`. Parse `needs_json_path` via `json.load` and extract the needs map (handle flat `needs` key or versioned `versions` shape). On either parse failure or missing file, return `{valid: false, errors: ["<message>"], needs_checked: 0, violations: []}`. +2. **Resolve per-type required-field lists.** For each type `T` present in `needs.json`, look up `catalog[T].required_metadata_fields`. Absent type or absent key → treat as `[]` (no check for that type; this is not an error). Empty list → no check for that type. +3. **Iterate needs.** For each need `N`: + - Let `required = catalog[N.type].required_metadata_fields` (resolved per step 2; defaults to `[]`). + - For each `field` in `required`, check the need dict. The field counts as **present and non-empty** when `field` is a key on the need AND the value is neither `None`, `""`, nor `[]`. + - Collect all missing/empty field names for this need into `missing_fields`. + - If `missing_fields` is non-empty, append `{need_id: N.id, type: N.type, missing_fields: <sorted>}` to `violations`. +4. **Aggregate.** Sort `violations` by `need_id` ascending for deterministic output. Set `valid = len(violations) == 0`. 
+ +### Detection rule (reference) + +```python +import json, yaml + +catalog = yaml.safe_load(open(artefact_catalog_path)) or {} +nj = json.load(open(needs_json_path)) +needs = nj.get("needs") or next(iter(nj.get("versions", {}).values()), {}).get("needs", {}) + +violations = [] +for nid, n in needs.items(): + t = n.get("type") + required = (catalog.get(t) or {}).get("required_metadata_fields") or [] + missing = [f for f in required + if n.get(f) in (None, "", []) or f not in n] + if missing: + violations.append({"need_id": nid, "type": t, "missing_fields": sorted(missing)}) + +violations.sort(key=lambda v: v["need_id"]) +result = { + "valid": len(violations) == 0, + "errors": [], + "needs_checked": len(needs), + "violations": violations, +} +``` + +### Tailoring extension point + +The full policy lives in `artefact-catalog.yaml`. Each type declares its own `required_metadata_fields` independently: + +```yaml +comp_req: + required_metadata_fields: [tags, priority] +feat: + required_metadata_fields: [tags] +tc: + required_metadata_fields: [] # explicitly no check +gd_req: + # required_metadata_fields omitted # treated as empty, no check +``` + +No hardcoded field names in the base skill. Projects that do not care about metadata completeness either set empty lists or omit the key — either way, graph mode returns `valid: true` with no violations. + +## Failure modes + +Block mode: +- `output_text` empty → `valid=false`, errors=["empty output"]. +- `target_schema` unknown → FAIL (caller error). +- `schema_context` missing required fields → FAIL (caller error). +- Parser throws (malformed YAML/JSON/RST) → `valid=false`, errors=["parser exception: <message>"]. + +Graph mode: +- `needs_json_path` or `artefact_catalog_path` missing or unparseable → `valid=false`, `errors` names the offending path, `needs_checked=0`, empty `violations`. +- Empty corpus (`needs` is `{}`) → `valid=true`, `needs_checked=0`, empty `violations` (vacuously true). 
+- `mode` value not in `{"block", "graph"}` → FAIL (caller error). + +## Non-goals + +- No side effects — never writes files, never dispatches subagents, never retries. +- No semantic validation beyond option-name/key-name presence — e.g. does not check whether `parent_feat_id` values exist in the project; that is a downstream concern. +- No repair — output is either valid, fence-strippable, or rejected. +- Graph mode does NOT validate link-target resolution, id convention, or status lifecycle — those live in `pharaoh-link-completeness-check`, `pharaoh-id-convention-check`, and `pharaoh-status-lifecycle-check` respectively. Graph mode only checks tailored required-metadata-field presence, keeping the atom's single responsibility intact. diff --git a/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/README.md b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/README.md new file mode 100644 index 0000000..9b0938d --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/README.md @@ -0,0 +1,5 @@ +# graph-all-metadata-present + +Canonical happy path for graph mode. The catalog declares `required_metadata_fields` for both `feat` and `comp_req`; every need in the corpus carries every declared field with a non-empty value. + +Expected: `valid: true`, empty `violations`, `needs_checked` equals the total number of needs in the corpus. 
diff --git a/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/expected-output.json b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/expected-output.json new file mode 100644 index 0000000..37ea376 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/expected-output.json @@ -0,0 +1,6 @@ +{ + "valid": true, + "errors": [], + "needs_checked": 3, + "violations": [] +} diff --git a/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-artefact-catalog.yaml b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-artefact-catalog.yaml new file mode 100644 index 0000000..7a088d7 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-artefact-catalog.yaml @@ -0,0 +1,7 @@ +# graph-all-metadata-present/input-artefact-catalog.yaml +# Both governed types declare non-empty required_metadata_fields; every need +# in the corpus carries every listed field. +feat: + required_metadata_fields: [tags] +comp_req: + required_metadata_fields: [tags, priority] diff --git a/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-needs.json b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-needs.json new file mode 100644 index 0000000..7887e71 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-all-metadata-present/input-needs.json @@ -0,0 +1,27 @@ +{ + "needs": { + "feat__inventory": { + "id": "feat__inventory", + "type": "feat", + "title": "Inventory management", + "status": "valid", + "tags": ["inventory"] + }, + "comp_req__inventory_read": { + "id": "comp_req__inventory_read", + "type": "comp_req", + "title": "Read inventory file", + "status": "valid", + "tags": ["inventory", "io"], + "priority": "high" + }, + "comp_req__inventory_write": { + "id": "comp_req__inventory_write", + "type": "comp_req", + "title": "Write inventory file", + "status": "valid", + "tags": ["inventory", "io"], + 
"priority": "medium" + } + } +} diff --git a/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/README.md b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/README.md new file mode 100644 index 0000000..0189ee8 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/README.md @@ -0,0 +1,5 @@ +# graph-empty-required-list + +The catalog declares `required_metadata_fields: []` for `tc` (explicitly no metadata check) and omits the key entirely for `feat` (treated as empty, also no check). Every `tc` and `feat` need in the corpus has no metadata fields. Graph mode must pass: an empty required-list — whether explicit or implicit — means "nothing to check for this type". + +Expected: `valid: true`, empty `violations`, `needs_checked` equals the total number of needs. diff --git a/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/expected-output.json b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/expected-output.json new file mode 100644 index 0000000..37ea376 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/expected-output.json @@ -0,0 +1,6 @@ +{ + "valid": true, + "errors": [], + "needs_checked": 3, + "violations": [] +} diff --git a/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-artefact-catalog.yaml b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-artefact-catalog.yaml new file mode 100644 index 0000000..48eede0 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-artefact-catalog.yaml @@ -0,0 +1,6 @@ +# graph-empty-required-list/input-artefact-catalog.yaml +# Explicit empty list for tc; feat key omitted entirely — both must short-circuit +# to "no check" without error. +tc: + required_metadata_fields: [] +# feat: intentionally omitted — absent key is treated as empty required list. 
diff --git a/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-needs.json b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-needs.json new file mode 100644 index 0000000..bf430ce --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-empty-required-list/input-needs.json @@ -0,0 +1,22 @@ +{ + "needs": { + "feat__search": { + "id": "feat__search", + "type": "feat", + "title": "Search", + "status": "valid" + }, + "tc__search_ok": { + "id": "tc__search_ok", + "type": "tc", + "title": "Search happy path", + "status": "valid" + }, + "tc__search_empty": { + "id": "tc__search_empty", + "type": "tc", + "title": "Search with no results", + "status": "valid" + } + } +} diff --git a/skills/pharaoh-output-validate/fixtures/graph-missing-tags/README.md b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/README.md new file mode 100644 index 0000000..19b46da --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/README.md @@ -0,0 +1,5 @@ +# graph-missing-tags + +The catalog declares `tags` required on `comp_req`; two of the three `comp_req` needs ship without `tags` (one omits the field entirely, the other declares it as an empty list). The third `comp_req` has `tags` populated. The `feat` need has no `required_metadata_fields` declared, so it is counted but contributes no violation. + +Expected: `valid: false`, `violations` sorted by `need_id` listing the two offenders with `missing_fields: ["tags"]`, `needs_checked: 4`. 
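Both offenders trip the same presence rule: a field counts as missing when the key is absent from the need dict or when its value is `None`, `""`, or `[]`. A minimal sketch of that rule applied to this corpus (the `is_missing` helper is illustrative only, not part of the skill):

```python
def is_missing(need: dict, field: str) -> bool:
    # Absent key, None, empty string, and empty list all count as missing.
    return field not in need or need.get(field) in (None, "", [])

# The two offenders in this fixture: one omits tags, one has an empty list.
refund = {"id": "comp_req__billing_refund", "type": "comp_req"}
void = {"id": "comp_req__billing_void", "type": "comp_req", "tags": []}
# The compliant need carries a non-empty tags value.
charge = {"id": "comp_req__billing_charge", "type": "comp_req",
          "tags": ["billing", "payment"]}

assert is_missing(refund, "tags")
assert is_missing(void, "tags")
assert not is_missing(charge, "tags")
```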
diff --git a/skills/pharaoh-output-validate/fixtures/graph-missing-tags/expected-output.json b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/expected-output.json new file mode 100644 index 0000000..1924860 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/expected-output.json @@ -0,0 +1,9 @@ +{ + "valid": false, + "errors": [], + "needs_checked": 4, + "violations": [ + {"need_id": "comp_req__billing_refund", "type": "comp_req", "missing_fields": ["tags"]}, + {"need_id": "comp_req__billing_void", "type": "comp_req", "missing_fields": ["tags"]} + ] +} diff --git a/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-artefact-catalog.yaml b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-artefact-catalog.yaml new file mode 100644 index 0000000..582cd42 --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-artefact-catalog.yaml @@ -0,0 +1,4 @@ +# graph-missing-tags/input-artefact-catalog.yaml +# comp_req must carry :tags:; feat has no metadata policy at all. 
+comp_req: + required_metadata_fields: [tags] diff --git a/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-needs.json b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-needs.json new file mode 100644 index 0000000..6a534fe --- /dev/null +++ b/skills/pharaoh-output-validate/fixtures/graph-missing-tags/input-needs.json @@ -0,0 +1,30 @@ +{ + "needs": { + "feat__billing": { + "id": "feat__billing", + "type": "feat", + "title": "Billing", + "status": "valid" + }, + "comp_req__billing_charge": { + "id": "comp_req__billing_charge", + "type": "comp_req", + "title": "Charge the customer card", + "status": "valid", + "tags": ["billing", "payment"] + }, + "comp_req__billing_refund": { + "id": "comp_req__billing_refund", + "type": "comp_req", + "title": "Refund a settled charge", + "status": "valid" + }, + "comp_req__billing_void": { + "id": "comp_req__billing_void", + "type": "comp_req", + "title": "Void a pending charge", + "status": "valid", + "tags": [] + } + } +} diff --git a/skills/pharaoh-papyrus-non-empty-check/SKILL.md b/skills/pharaoh-papyrus-non-empty-check/SKILL.md new file mode 100644 index 0000000..c862f89 --- /dev/null +++ b/skills/pharaoh-papyrus-non-empty-check/SKILL.md @@ -0,0 +1,67 @@ +--- +name: pharaoh-papyrus-non-empty-check +description: Use when verifying that a Papyrus workspace actually received writes during a plan run. Single mechanical check — counts directives across `.papyrus/memory/*.rst` and returns pass/fail against a configured minimum. Wired into `pharaoh-quality-gate` to detect the "LLM-executor skipped the atomic Papyrus writes" failure class observed in prior dogfooding. +--- + +# pharaoh-papyrus-non-empty-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan that declared Papyrus writes (every plan produced by `pharaoh-write-plan` where `preseed_papyrus: true`). Returns `{passed: bool, actual_count: int, required_min: int}` so the gate can decide. 
+ +Do NOT use to read or interpret memory content — that is `papyrus-query` / `pharaoh-context-gather`. This skill only counts. + +## Atomicity + +- (a) Indivisible: one workspace path + one minimum count in → one pass/fail + actual count out. No memory classification, no dedup check, no content inspection. +- (b) Input: `{workspace_path: str, required_min: int}`. Output: JSON `{passed: bool, actual_count: int, required_min: int, workspace_path: str}`. +- (c) Reward: fixtures `pharaoh-validation/fixtures/pharaoh-papyrus-non-empty-check/`: + 1. `empty-workspace/` (7 `.rst` files, only headers, 0 directives) + `required_min: 1` → matches `expected-empty-fail.json` (`passed: false, actual_count: 0`). + 2. `populated-workspace/` (facts.rst with 3 `.. fact::` directives) + `required_min: 1` → matches `expected-populated-pass.json` (`passed: true, actual_count: 3`). + 3. Missing `.papyrus/` directory under workspace_path → `passed: false, actual_count: 0, note: "no papyrus workspace"` (same shape, extra field). + 4. Idempotent: same inputs produce same output. + + Pass = all 4. +- (d) Reusable by any composition that declared Papyrus writes. +- (e) Read-only. No side effects. + +## Input + +- `workspace_path`: absolute path to a directory containing `.papyrus/memory/*.rst`. If the directory does not exist, check returns `passed: false, actual_count: 0, note: "no papyrus workspace"`. +- `required_min`: integer ≥ 0. Minimum number of directives (lines matching `^\.\.\s+[a-z_]+::`) across all `.papyrus/memory/*.rst` files summed together. 
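Under the input rules above, the whole check can be sketched in Python. This is a minimal, non-normative sketch; the function name `papyrus_non_empty_check` is illustrative, and the grep one-liner under `## Counting rule` below remains the reference:

```python
import glob
import os
import re

# An RST directive line: "..", whitespace, lowercase name, "::".
DIRECTIVE = re.compile(r"^\.\.\s+[a-z_]+::")

def papyrus_non_empty_check(workspace_path: str, required_min: int) -> dict:
    memory_dir = os.path.join(workspace_path, ".papyrus", "memory")
    result = {
        "passed": False,
        "actual_count": 0,
        "required_min": required_min,
        "workspace_path": workspace_path,
        "note": None,
    }
    if not os.path.isdir(memory_dir):
        # Missing workspace is a fail with an explanatory note, same shape.
        result["note"] = "no papyrus workspace"
        return result
    count = 0
    for path in sorted(glob.glob(os.path.join(memory_dir, "*.rst"))):
        with open(path, encoding="utf-8") as fh:
            # Header underlines and blank lines never match DIRECTIVE.
            count += sum(1 for line in fh if DIRECTIVE.match(line))
    result["actual_count"] = count
    result["passed"] = count >= required_min
    return result
```

The sketch is pure and deterministic: calling it twice with the same inputs yields the same dict, which is the idempotency fixture in the atomicity contract.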
+ +## Output + +```json +{ + "passed": true, + "actual_count": 3, + "required_min": 1, + "workspace_path": "/absolute/path/to/workspace", + "note": null +} +``` + +On missing workspace: + +```json +{ + "passed": false, + "actual_count": 0, + "required_min": 1, + "workspace_path": "/absolute/path/to/workspace", + "note": "no papyrus workspace" +} +``` + +## Counting rule + +```bash +grep -rEh '^\.\.\s+[a-z_]+::' <workspace_path>/.papyrus/memory/*.rst 2>/dev/null | wc -l +``` + +An RST directive line must match `^\.\.\s+[a-z_]+::`. Header underlines (`====`) and blank lines do not count. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `papyrus_non_empty: {required_min: N}`. Never called directly by user-facing flows. diff --git a/skills/pharaoh-process-audit/SKILL.md b/skills/pharaoh-process-audit/SKILL.md index 180bcb3..5fbb215 100644 --- a/skills/pharaoh-process-audit/SKILL.md +++ b/skills/pharaoh-process-audit/SKILL.md @@ -38,7 +38,7 @@ A single JSON document — no prose wrapper. Shape: ```json { - "project_path": "examples/score", + "project_path": "examples/my-project", "needs_total": 401, "gaps": [ { @@ -224,7 +224,7 @@ After the gap report, if `gaps` is non-empty: ## Worked example -**Input:** `project_root = examples/score` +**Input:** `project_root = examples/my-project` **Step 0:** `.pharaoh/project/` found; needs.json found with 401 needs total. @@ -253,7 +253,7 @@ After the gap report, if `gaps` is non-empty: ```json { - "project_path": "examples/score", + "project_path": "examples/my-project", "needs_total": 401, "gaps": [ { @@ -321,6 +321,6 @@ After the gap report, if `gaps` is non-empty: } ``` -Run `pharaoh-coverage-gap examples/score broken_back_link` for the full match list. -Run `pharaoh-coverage-gap examples/score orphan_arch` for the full match list. -Run `pharaoh-coverage-gap examples/score unverified_req` for the full match list. 
+Run `pharaoh-coverage-gap examples/my-project broken_back_link` for the full match list. +Run `pharaoh-coverage-gap examples/my-project orphan_arch` for the full match list. +Run `pharaoh-coverage-gap examples/my-project unverified_req` for the full match list. diff --git a/skills/pharaoh-prose-migrate/SKILL.md b/skills/pharaoh-prose-migrate/SKILL.md new file mode 100644 index 0000000..5efec56 --- /dev/null +++ b/skills/pharaoh-prose-migrate/SKILL.md @@ -0,0 +1,126 @@ +--- +name: pharaoh-prose-migrate +description: Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. Produces a sentence-by-sentence migration proposal — keep-as-user-guide, merge-into-feat-body, discard. Does NOT overwrite anything; the caller applies the proposal manually. +--- + +# pharaoh-prose-migrate + +## When to use + +Invoke when a feature-extraction run is about to write `features/<stem>.rst` into a directory that already contains a human-authored prose file with a colliding stem (e.g. `features/reqif.rst` was written by a human as user documentation; the orchestrator is about to emit `features/reqif_export.rst` and `features/reqif_import.rst`). Without migration guidance, both files end up in the tree with no cross-reference and unclear canonicity — the exact confusion observed during dogfooding. + +Do NOT use to apply a migration (that is a future `pharaoh-prose-apply` skill). Do NOT use to generate new prose — this skill only processes existing content. + +## Atomicity + +- (a) Indivisible — one prose file + one set of emitted feats → one migration proposal. No file mutation. No deletion. No writes to the new feat RSTs. +- (b) Input: `{prose_file: str, emitted_feats: list[{id: str, title: str, body: str, source_doc: str}]}`. Output: YAML migration proposal (see Output schema). 
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-prose-migrate/` contains `input_reqif.rst` (a prose file describing ReqIF usage) and `expected_proposal.yaml`. Scorer: + 1. Output parses as YAML. + 2. `decisions` covers every sentence (sentence count in `decisions[*].source_sentences` sums to the total sentence count of `input_reqif.rst`). + 3. Sentences classified as `merge_into_feat_body` target an `emitted_feats[*].id`. + 4. Sentences classified as `keep_as_user_guide` have a `target_file` under `user_guide/`. + 5. Sentences classified as `discard` include a rationale naming why (boilerplate, outdated, changelog-like). + 6. `summary` totals match the `decisions` aggregation. + + Pass = all 6. +- (d) Reusable for any project with pre-existing prose in its docs/source/features/ directory. +- (e) Composable: emits only a proposal. Never mutates. Caller decides whether to apply. + +## Input + +- `prose_file`: absolute path to the existing prose RST file to migrate. +- `emitted_feats`: list of features just emitted by `pharaoh-feat-draft-from-docs` that may cover content in `prose_file`. Each entry has: + - `id`: feat ID (e.g. 
`FEAT_reqif_export`) + - `title`: short title + - `body`: one-sentence feat statement + - `source_doc`: the doc the feat was derived from (used to distinguish "this feat's source was this prose file" from "this feat came from elsewhere") + +## Output + +```yaml +source_file: <relative path> +decisions: + - type: keep_as_user_guide + target_file: user_guide/<stem>.rst + source_sentences: [<line-numbers or sentence-indices>] + content_preview: "<first 40 chars of the preserved sentences>" + rationale: <string> + - type: merge_into_feat_body + target_feat_id: FEAT_<name> + source_sentences: [<indices>] + content_preview: "<first 40 chars>" + rationale: <string> + - type: discard + source_sentences: [<indices>] + content_preview: "<first 40 chars>" + rationale: <string> +summary: + total_sentences: <int> + keep_as_user_guide: <int> + merge_into_feat: <int> + discard: <int> +``` + +## Process + +### Step 1: Read and sentence-split + +Read `prose_file`. Strip RST directives (lines starting `.. ` and their indented bodies) from the split candidates — directives are not migratable prose. On the remaining text, split into sentences via a simple splitter: + +```python +re.split(r'(?<=[.!?])\s+(?=[A-Z])', text) +``` + +Index sentences 1..N. Preserve their original text for `content_preview`. + +### Step 2: Classify each sentence + +Apply classification rules in order; first match wins: + +**discard** — matches any of: +- Contains `TODO`, `FIXME`, `XXX`, `deprecated`, `see also`, or matches a changelog pattern (`Version \d+\.\d+`, `- Added:`, `- Fixed:`). +- Is pure boilerplate: ≤ 5 words with no noun-phrase content (e.g. "More details below.", "The following sections explain this."). +- References an outdated feature that is not in `emitted_feats`. + +**keep_as_user_guide** — matches any of: +- Imperative voice addressing the user ("You can ...", "Users should ...", "To import, run ...", "Invoke the CLI with ..."). 
+- Describes CLI commands, config syntax, or step-by-step instructions.
+- Contains an RST code block reference (fenced with `::` or `.. code-block::`).
+- Describes usage scenarios with concrete inputs/outputs.
+
+Target file: `user_guide/<stem>.rst` where `<stem>` is derived from `prose_file`'s basename (e.g. `reqif.rst` → `user_guide/reqif.rst`). This keeps user documentation co-located with the feature name.
+
+**merge_into_feat_body** — matches any of:
+- Describes what the system does in declarative voice ("The system imports ReqIF files.", "ReqIF export preserves hierarchy.").
+- Overlaps semantically with one of `emitted_feats[*].title + body`. Use substring match on feat title keywords as a first pass; if multiple feats match, pick the one with the highest keyword overlap.
+
+Target: the matching feat's `id`.
+
+**discard (fallthrough)** — if none of the three categories above matches the sentence, classify as `discard` with rationale `"Sentence did not fit any migration category — likely boilerplate or stale."`
+
+### Step 3: Group consecutive same-class sentences
+
+Adjacent sentences with the same `type` and same `target_file`/`target_feat_id` are grouped into one `decisions` entry with `source_sentences` listing all their indices.
+
+### Step 4: Emit summary
+
+Count per-type:
+- `total_sentences` = sum of all `source_sentences` lengths.
+- `keep_as_user_guide`, `merge_into_feat`, `discard` = per-type counts.
+
+### Step 5: Return
+
+Return the YAML proposal.
+
+## Failure modes
+
+- `prose_file` not readable → FAIL.
+- `emitted_feats` empty and no `discard`-only output possible → FAIL: `"no feats provided to merge into, and prose_file has no discardable content"`.
+- All sentences fall through to `discard` → emit the proposal anyway with 100% discard. The caller likely misinvoked this skill; the output makes that clear.
+
+## Non-goals
+
+- No automatic application.
The caller reviews the proposal, manually moves sentences, deletes the legacy prose file (or leaves it — skill does not prescribe). +- No cross-file prose migration. One prose file per invocation. +- No LLM re-writing of sentences. The proposal is sentence-by-sentence verbatim; the caller edits after applying. +- No user_guide/ directory creation. If the caller applies `keep_as_user_guide` decisions, they create the directory themselves. diff --git a/skills/pharaoh-quality-gate/SKILL.md b/skills/pharaoh-quality-gate/SKILL.md new file mode 100644 index 0000000..c02a6c3 --- /dev/null +++ b/skills/pharaoh-quality-gate/SKILL.md @@ -0,0 +1,211 @@ +--- +name: pharaoh-quality-gate +description: Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). Consumes an aggregated review+mece+coverage summary plus a gate spec; returns pass/fail with named breaches. Never produces summaries itself — thin gate layer over upstream atomic checkers. +--- + +# pharaoh-quality-gate + +## When to use + +Invoke as the terminal task of a plan (emitted by `pharaoh-write-plan`, executed by `pharaoh-execute-plan`) after `pharaoh-req-review`, `pharaoh-mece`, and `pharaoh-coverage-gap` tasks have produced their reports. This skill aggregates their findings against configured thresholds and decides whether the run may declare itself "complete". Without this gate, a plan that emits N artefacts with zero quality checks can return success on `sphinx-build exit 0` alone — the exact failure mode observed during dogfooding. + +Do NOT use to produce the reports it consumes — that is upstream atomic skills. Do NOT use to halt execution — this skill returns pass/fail; the plan's `on_fail` policy or the human decides what to do with that. + +## Atomicity + +- (a) Indivisible — one artefacts summary + one gate spec + one project_root in → one pass/fail report out. No new review judgment, no need-file reads, no MECE analysis. 
Pure threshold check. +- (b) Input: `{artefacts_summary_path: str, gate_spec_path: str, project_root: str}`. Output: JSON `{pass: bool, breaches: list[str], report_path: str}`. +- (c) Reward: fixtures `pharaoh-validation/fixtures/pharaoh-quality-gate/`: + 1. `input_artefacts.yaml` + `gate_spec.yaml` where all thresholds pass → output `pass: true`, `breaches: []`, report written to `<project_root>/.pharaoh/quality-gate-report-<timestamp>.yaml` matching `expected_report_pass.yaml` (timestamp masked). + 2. Same input_artefacts with a gate_spec variant where `testability_fail_rate_max: 0.20` but observed is `0.25` → `pass: false`, `breaches` names that threshold, report matches `expected_report_fail.yaml`. + 3. Idempotent: same inputs produce same output content (up to timestamp). + 4. Missing `artefacts_summary_path` → FAIL. + 5. `gate_spec.invariants.self_review_coverage.enabled: true` + runs path where one artefact is missing its review → `pass: false`, `breaches` includes entry naming the specific artefact and referring to `pharaoh-self-review-coverage-check` output. Same structure for `papyrus_non_empty` and `dispatch_signal_matches_plan`. + + Gate aggregates by calling each configured invariant check as a separate delegated skill, never duplicating the check logic itself — atomicity (a) preserved. + + Pass = all 5. +- (d) Reusable by any composition skill that has upstream review/mece/coverage reports. +- (e) Composable: composition skills invoke this at end; this skill never calls composition or atomic skills back. + +## Input + +- `artefacts_summary_path` (optional on plans that ran no review/mece/coverage tasks): absolute path to a YAML document produced by aggregating `pharaoh-req-review`, `pharaoh-mece`, and `pharaoh-coverage-gap` reports. Must parse via `yaml.safe_load`. Expected shape: + ```yaml + review_axis_fail_rates: + <axis_name>: <float 0..1> + ... 
+ duplicate_rate: <float 0..1> + orphan_rate: <float 0..1> + unverified_rate: <float 0..1> + ``` +- `gate_spec_path` (optional): absolute path to a YAML document declaring thresholds. Shape: + ```yaml + thresholds: + review_axis_fail_rate_max: <float 0..1> + duplicate_rate_max: <float 0..1> + orphan_rate_max: <float 0..1> + unverified_rate_max: <float 0..1> + diagram_lint_errors_max: <int> # default 0; any error finding breaches + sampling: + method: stratified + per_feat_min: <int> + per_feat_fraction: <float 0..1> + ``` +- `diagram_lint_findings` (optional, inline): list of finding objects as produced by `pharaoh-diagram-lint`. Each entry matches the shape `{file, line, renderer, block_index, parser_exit_code, parser_stderr, severity}`. Passed by ref from the plan (e.g. `diagram_lint_findings: ${diagram_lint.findings}`), not via file path. When absent, diagram lint is assumed not run and no diagram breach is evaluated. +- `diagram_lint_status` (optional, inline): one of `"pass" | "fail" | "degraded"` as reported by `pharaoh-diagram-lint`. Used by the report's `diagram_lint` section for transparency (a `degraded` status surfaces as a warning in the report, not a breach). +- `project_root`: absolute path used to resolve the report output location (`<project_root>/.pharaoh/quality-gate-report-<timestamp>.yaml`). + +### Invariants + +Invariant checks are delegated to atomic check skills. Added to close the "skipped atomic step" class of failure observed during dogfooding, plus the structural-lint gaps (ID convention, link coverage, status lifecycle, metadata fields) surfaced by prior catalogue reviews. 
+ +```yaml +# gate_spec.yaml — invariants block +invariants: + papyrus_non_empty: + enabled: true # default true when preseed_papyrus was used; false otherwise + required_min: 1 # minimum directive count across .papyrus/memory/*.rst + dispatch_signal_matches_plan: + enabled: true # default true + self_review_coverage: + enabled: true # default true + self_review_map_path: skills/shared/self-review-map.yaml # resolved relative to pharaoh/ + id_convention_consistent: + enabled: true # default true when id-conventions.yaml exists + id_conventions_path: .pharaoh/project/id-conventions.yaml + needs_json_path: docs/_build/needs/needs.json + link_types_covered: + enabled: true # default true when artefact-catalog.yaml declares required_links + artefact_catalog_path: .pharaoh/project/artefact-catalog.yaml + needs_json_path: docs/_build/needs/needs.json + status_lifecycle_healthy: + enabled: false # default false (advisory); release pipelines override to true + workflow_path: .pharaoh/project/workflows.yaml + needs_json_path: docs/_build/needs/needs.json + enforce: true # release-gate only — binary pass/fail on zero drafts + metadata_fields_present: + enabled: true # default true when artefact-catalog.yaml declares required_metadata_fields + artefact_catalog_path: .pharaoh/project/artefact-catalog.yaml + needs_json_path: docs/_build/needs/needs.json + api_coverage_clean: + enabled: true # default true when any source file under source_doc tree is declared + needs_json_path: docs/_build/needs/needs.json + source_file: null # resolved per-file by the plan's scatter-gather; null here means "no default — template must supply" + language: auto + task_output_present: + enabled: true # default true — independent second signal against "completed but no output" tasks + report_path: .pharaoh/runs/<latest>/report.yaml + workspace_dir: .pharaoh/runs/<latest> +``` + +Every new key follows the same pattern as the existing three: a boolean `enabled` plus whatever paths the delegated check 
needs. Adding a future invariant is a config-only change to this block plus one row in the delegation table below. + +## Invariant delegation + +For every key under `gate_spec.invariants.*` where `enabled: true`, the gate invokes the correspondingly named atomic check: + +| Invariant key | Delegated skill | Pass requirement | +| ------------------------------- | ------------------------------------------ | ------------------------------------------------------------ | +| `papyrus_non_empty` | `pharaoh-papyrus-non-empty-check` | `passed == true` | +| `dispatch_signal_matches_plan` | `pharaoh-dispatch-signal-check` | `passed == true` | +| `self_review_coverage` | `pharaoh-self-review-coverage-check` | `passed == true` | +| `id_convention_consistent` | `pharaoh-id-convention-check` | `overall == "pass"` | +| `link_types_covered` | `pharaoh-link-completeness-check` | `overall == "pass"` | +| `status_lifecycle_healthy` | `pharaoh-status-lifecycle-check` | `overall == "pass"` (release-gate only; `enforce=true` is typically supplied by the release pipeline) | +| `metadata_fields_present` | `pharaoh-output-validate` (graph mode) | every need carries the tailored `required_metadata_fields` for its type (delegated atom returns `valid == true`) | +| `api_coverage_clean` | `pharaoh-api-coverage-check` | `overall ∈ {"pass", "skipped"}`; invoked per source file, aggregated pass = every behavioral file has both a citing CREQ and every raised exception class named in some CREQ, non-behavioral files are skipped | +| `task_output_present` | inline check (no delegate) — re-runs `pharaoh-execute-plan` Step 4.10 audit against `report_path` + `workspace_dir` | every task with `status: completed` in the report has a non-empty artefact or `return.json` on disk at the declared path; any `reporting_error` status fails the gate | + +Each delegated check returns either `{passed: bool, ...}` or the atom's native `{overall: "pass"|"fail", ...}` / `{valid: bool, ...}` shape. 
The gate normalises each return against the pass requirement in the table and, on failure, merges the atom's breach fields into its top-level `breaches` list under a namespaced prefix (`invariant.<invariant_key>.<field>`). This keeps the gate itself a pure aggregator — atomicity (a) is preserved because the check logic lives in the delegated skills, not here. + +`metadata_fields_present` delegates to the existing `pharaoh-output-validate` atom invoked in `mode: "graph"` (see that skill's `## Graph mode`). The tailored `required_metadata_fields` list is declared per-type in `artefact-catalog.yaml`; empty list disables the check for that type, absent key is treated as empty. No new atom is introduced for this invariant — graph mode is a second input-shape on the existing block-validator. + +If a delegated check is not yet implemented in the skill tree, the gate records a warning in the report but does not fail — so that adding new invariants in future is a config-only change. + +## Output + +```json +{ + "pass": false, + "breaches": [ + "review_axis 'testability' fail rate 0.25 exceeds 0.20", + "orphan_rate 0.02 exceeds 0.00" + ], + "report_path": "/abs/path/.pharaoh/quality-gate-report-2026-04-20T14:03:12Z.yaml" +} +``` + +On `pass: true`, `breaches` is `[]` but the report file is still written. + +## Process + +### Step 1: Load inputs + +Read `artefacts_summary_path` (if provided) and `gate_spec_path` (if provided) via `yaml.safe_load`. If `artefacts_summary_path` is provided but the file is missing or malformed, FAIL naming the path. Same for `gate_spec_path`. When both are absent (plan did not run review/mece/coverage), the gate degrades to a diagram-lint-only pass/fail and `thresholds_evaluated` in the report will be empty for the review/mece/coverage axes. + +### Step 2: Check each threshold + +For each threshold in `gate_spec.thresholds` (if gate spec loaded): + +- `review_axis_fail_rate_max`: iterate `artefacts_summary.review_axis_fail_rates`. 
For each axis where observed > max, add `"review_axis '<axis>' fail rate <observed> exceeds <max>"` to breaches. +- `duplicate_rate_max`: if observed > max, add breach. +- `orphan_rate_max`: if observed > max, add breach. +- `unverified_rate_max`: if observed > max, add breach. If max is 1.00, this threshold is inactive (skip — it's a no-op). + +Sampling thresholds (`per_feat_min`, `per_feat_fraction`) are informational — they constrain upstream sampling in `pharaoh-req-review`, not checks here. Do not evaluate. + +### Step 2.5: Check diagram-lint findings (if provided) + +If `diagram_lint_findings` is non-null, count findings with `severity == "error"`. Compare against `gate_spec.thresholds.diagram_lint_errors_max` (default `0`): + +- `error_count > max` → add breach `"diagram_lint emitted <error_count> parser-error finding(s), exceeds max <max>"` followed by one sub-breach per finding of shape `"diagram_lint: <file>:L<line> (<renderer>) — <parser_stderr first 120 chars>"`. + +If `diagram_lint_status == "degraded"`, add a WARNING (not a breach) to the report: `"diagram_lint ran in degraded mode — at least one renderer CLI was missing; lint coverage is incomplete"`. Warnings surface in the report's `warnings` field but do not flip `pass` to `false`. + +### Step 3: Compute pass + +`pass = len(breaches) == 0`. + +### Step 4: Write report + +Write a full report to `<project_root>/.pharaoh/quality-gate-report-<iso8601_timestamp>.yaml` with: + +```yaml +timestamp: <iso8601> +pass: <bool> +breaches: [...] +warnings: [...] # non-breach issues (e.g. diagram_lint degraded mode) +thresholds_evaluated: + <threshold_name>: {max: <float>, observed: <float>} + ... +diagram_lint: # omit this section if diagram_lint_findings was null + status: <"pass"|"fail"|"degraded"> + errors_count: <int> + findings: + - {file, line, renderer, block_index, parser_exit_code, parser_stderr, severity} + ... 
+inputs: + artefacts_summary_path: <abs_path or null> + gate_spec_path: <abs_path or null> + diagram_lint_findings_count: <int> +``` + +Create `.pharaoh/` directory if it does not exist. + +### Step 5: Return + +Return the JSON object. `report_path` is the absolute path of the file written in Step 4. + +## Failure modes + +- `artefacts_summary_path` missing or unparseable → FAIL. +- `gate_spec_path` missing or unparseable → FAIL. +- `project_root/.pharaoh/` unwritable → FAIL. + +## Non-goals + +- Does not produce review / mece / coverage reports — those are `pharaoh-req-review`, `pharaoh-mece`, `pharaoh-coverage-gap`. +- Does not DECIDE thresholds — that's the gate spec authored by the project. +- Does not HALT anything — returns pass/fail; the orchestrator decides. +- No tiered thresholds (e.g. "soft" and "hard" gates) — everything is a hard threshold. diff --git a/skills/pharaoh-reproducibility-check/SKILL.md b/skills/pharaoh-reproducibility-check/SKILL.md new file mode 100644 index 0000000..16df327 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/SKILL.md @@ -0,0 +1,213 @@ +--- +name: pharaoh-reproducibility-check +description: Use when diffing two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running is the caller's responsibility (`pharaoh-execute-plan`). +--- + +# pharaoh-reproducibility-check + +## When to use + +Invoke from a reproducibility-audit CI job (or directly by a human) after the caller has produced two output directories from two independent runs of the same plan. 
Takes the two directories plus an optional list of `mask_rules` for known-non-deterministic fields and emits a findings JSON listing which files drifted and which fields inside them changed. Passes when every file is byte-identical after masking; fails when at least one file differs. + +**This skill does NOT run the plan.** Running the plan twice is the caller's responsibility — `pharaoh-execute-plan` is the atom that executes plans, and the orchestrator that calls `pharaoh-execute-plan` twice and then this check is future work (deferred from this plan's scope). This atom only diffs two pre-existing output directories. + +Do NOT use to re-author artefacts, to regenerate the rerun directory, or to repair drift — read-only. Do NOT use to mask the baseline in place or rewrite it with placeholders — the masking is done on in-memory copies for the comparison only. Do NOT use to infer mask rules automatically — the caller declares them; no hardcoded Pharaoh-specific masks. + +## Atomicity + +- (a) Indivisible: one baseline directory + one rerun directory + optional mask rules in → one drift report out. No plan execution, no artefact emission, no side effects. +- (b) Input: `{plan_path: str, baseline_output_dir: str, rerun_output_dir: str, mask_rules: list[{path: str, field: str, regex: str}]}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-reproducibility-check/fixtures/` — one per outcome: + 1. `identical-output/` — baseline and rerun are byte-identical after masking (timestamps masked out; everything else matches) → `overall: "pass"`, `drifted_files: []`, empty `drift_summary`. + 2. `drifted-titles/` — rerun has different need titles (e.g. `"Login requirement"` → `"Login req"`) that no mask rule targets → `overall: "fail"`, `drifted_files` names the file, `drift_summary[file].fields_changed` lists the `.title` paths of the drifted records. + 3. 
`drifted-ids-but-masked/` — rerun carries different runtime-generated `generated_run_id` tokens (`run_a3f7c09e` vs `run_b18d4f22`) but `mask_rules` includes an entry that replaces any matching token with a placeholder; after masking the files are equal → `overall: "pass"`.
+
+  Pass = each fixture's actual output matches `expected-output.json` modulo ordering of `drifted_files` (sorted ascending) and `fields_changed` (also sorted ascending).
+- (d) Reusable across projects — the diff is tree-of-files generic and the mask rules are data-driven. No Pharaoh-specific field names, id shapes, or timestamp formats are baked in. Works for any plan whose output directory is a tree of JSON / YAML / text files.
+- (e) Read-only. Does not modify the baseline or rerun directories, does not write the masked copies to disk, does not touch the plan file. Running twice on identical inputs yields byte-identical output.
+
+## Input
+
+- `plan_path`: absolute path to the plan YAML the two runs came from. Used as diagnostic metadata in the emitted report (echoed under the plan key in a future shape) but is NOT semantically load-bearing for the diff itself — the skill does not re-read or re-execute the plan. An unreadable or missing path is noted in diagnostics but neither aborts the diff nor counts as a blocker, provided both output directories are readable.
+- `baseline_output_dir`: absolute path to the output directory produced by the first plan run. Must exist and be readable.
+- `rerun_output_dir`: absolute path to the output directory produced by the second plan run on the same plan. Must exist and be readable.
+- `mask_rules`: optional list of `{path: str, field: str, regex: str}` entries. Each entry declares that, inside every file matched by `path` (a glob relative to the output-dir root), before comparing, replace the value at `field` (a dotted JSON-path into the parsed file) with the placeholder string `"<masked>"` if the current value matches `regex`. Defaults to `[]` (no masking).
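For illustration, a caller masking ISO-8601 timestamps and runtime-generated run ids across every JSON file might pass a list like the following (the file and field names here are invented examples, not a fixed schema):

```yaml
# Illustrative mask_rules — every name below is an example, not a contract.
- path: "*.json"                 # glob relative to the output-dir root
  field: "needs.*.created_at"    # dotted path; `*` wildcards one key level
  regex: "\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z"
- path: "*.json"
  field: "needs.*.run_id"
  regex: "[0-9a-f]{8}"           # matched with re.search, so it fires anywhere in the value
```

Each entry is applied independently; an entry whose glob or field path matches nothing is simply a no-op for that file.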
+ +Edge cases: +- `baseline_output_dir` or `rerun_output_dir` missing → `overall: "fail"`, `blockers: ["baseline_output_dir unresolved: <path>"]` (or the rerun equivalent). +- One side contains files the other does not — file-level drift: the absent file is listed in `drifted_files` with `drift_summary[file] = {"fields_changed": [], "reason": "file only present in <baseline|rerun>"}`. +- A `mask_rules` entry's `regex` fails to compile → `overall: "fail"`, blocker `"mask regex invalid: <entry>"`; no files are diffed. +- A mask rule targets a path that no file matches, or a field that no parsed record carries → silently ignored for that file (masking is best-effort per-entry). +- Non-parseable files (binary, malformed JSON) are compared byte-for-byte; masking is skipped for them and any bytes-difference is reported as `fields_changed: ["<byte-diff>"]`. + +## Output + +```json +{ + "baseline": "/abs/path/baseline/", + "rerun": "/abs/path/rerun/", + "drifted_files": [ + "docs/_build/needs/needs.json" + ], + "drift_summary": { + "docs/_build/needs/needs.json": { + "fields_changed": [ + "comp_req__foo_01.title" + ], + "count": 1 + } + }, + "overall": "fail" +} +``` + +Fields (in canonical order): +- `baseline`: echo of the input `baseline_output_dir`. +- `rerun`: echo of the input `rerun_output_dir`. +- `drifted_files`: list of file paths (relative to the respective output-dir roots) that differ after masking, sorted ascending. +- `drift_summary`: mapping from each drifted file path to `{fields_changed: list[str], count: int}`. `fields_changed` is the sorted list of dotted field paths whose values changed; `count` is `len(fields_changed)`. For files that exist on only one side, `fields_changed` is empty and an extra `reason` field explains the asymmetry. For byte-level diffs on non-parseable files, `fields_changed` is `["<byte-diff>"]`. +- `overall`: `"pass"` iff `drifted_files` is empty AND no blocker fired. `"fail"` otherwise. 
+ +On input errors (unresolved paths, invalid mask regex) the shape still carries every field with empty `drifted_files`, empty `drift_summary`, `overall: "fail"`, plus a top-level `blockers` list containing the error strings, so downstream callers can diff one shape. + +**What counts as drift.** Drift is reported at two granularities: the outer `drifted_files` list names files at file-level (present on both sides but differing, OR present on only one side), and the inner `drift_summary` reports field-level detail for each drifted parseable file. The gate is file-level (any entry in `drifted_files` fails the check); the per-field detail exists so the caller can see WHAT drifted without re-running the diff. + +## Process + +### Step 1: Validate inputs + +Resolve `baseline_output_dir` and `rerun_output_dir`. If either is missing or unreadable, populate `blockers` and emit the error shape. Compile every `mask_rules[i].regex` eagerly; on any `re.error`, populate `blockers` with `"mask regex invalid: <entry>"` and emit the error shape. `plan_path` is echoed into diagnostic logs but validation is soft — a missing plan file does not abort the diff. + +### Step 2: Enumerate files + +Walk `baseline_output_dir` recursively, collect the relative path of every file. Do the same for `rerun_output_dir`. Compute the union of the two sets. For each file path in the union: + +- If present on only one side, flag it as drifted with `reason: "file only present in <baseline|rerun>"`. +- If present on both sides, continue to Step 3. + +### Step 3: Load and mask + +For each file present on both sides: + +1. Attempt to parse both copies (JSON for `*.json`, YAML for `*.yaml`/`*.yml`, plain text otherwise). Non-parseable files short-circuit to byte-comparison (Step 4b). +2. For each `mask_rules` entry whose `path` glob matches the current file's relative path, apply the mask: traverse `field` (dotted JSON-path, e.g. 
`needs.comp_req__foo_01.created_at`; supports `*` wildcard segments for per-item masking like `needs.*.created_at`) on the parsed structure. At each leaf the mask visits, if the current value is a string matching `regex`, replace it with `"<masked>"`. Apply masks to both the baseline and rerun copies in memory. +3. Proceed to Step 4a. + +### Step 4: Compare + +**4a (parseable files):** Deep-compare the two masked structures. Any field whose value differs is added to `fields_changed` for this file, expressed as a dotted path (`<top-key>.<sub-key>...`). Added or removed keys are reported as `<path>` with a trailing `+` or `-` respectively. If `fields_changed` is non-empty, the file is drifted. + +**4b (byte-comparable files):** Byte-compare the two files. If they differ, the file is drifted with `fields_changed: ["<byte-diff>"]`. + +### Step 5: Emit the findings JSON + +Populate every field per the `## Output` shape. Sort `drifted_files` ascending; sort each `fields_changed` ascending. `overall` is `"pass"` iff `drifted_files` is empty and no blocker fired; `"fail"` otherwise. + +## Detection rule + +One mechanical check, implemented as the five-step process above. No LLM judgement. + +Minimum viable Python reference implementation (≤ 60 lines, omitting glob and dotted-path helpers for brevity): + +```python +import json, os, re, fnmatch, yaml +from pathlib import Path + +def walk(root): + root = Path(root) + return {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()} + +def load(p): + s = open(p, "rb").read() + try: + if p.endswith(".json"): + return "parsed", json.loads(s) + if p.endswith((".yaml", ".yml")): + return "parsed", yaml.safe_load(s) + except Exception: + pass + return "bytes", s + +def apply_masks(obj, field_path, regex): + # Traverse dotted field_path (with `*` wildcards). At each leaf, if the + # current value is a string matching regex, replace it with "<masked>". 
+ segs = field_path.split(".") + def visit(node, i): + if i == len(segs): + return "<masked>" if isinstance(node, str) and regex.search(node) else node + if segs[i] == "*" and isinstance(node, dict): + return {k: visit(v, i + 1) for k, v in node.items()} + if isinstance(node, dict) and segs[i] in node: + node[segs[i]] = visit(node[segs[i]], i + 1) + return node + return visit(obj, 0) + +def diff(a, b, prefix=""): + changed = [] + if type(a) != type(b): + return [prefix or "<root>"] + if isinstance(a, dict): + for k in sorted(set(a) | set(b)): + p = f"{prefix}.{k}" if prefix else k + if k not in a: changed.append(p + "+") + elif k not in b: changed.append(p + "-") + else: changed += diff(a[k], b[k], p) + return changed + if a != b: return [prefix or "<root>"] + return [] + +# Main +compiled = [(r["path"], r["field"], re.compile(r["regex"])) for r in mask_rules] +b_files, r_files = walk(baseline), walk(rerun) +drifted, summary = [], {} + +for rel in sorted(b_files | r_files): + if rel not in b_files: + drifted.append(rel); summary[rel] = {"fields_changed": [], "count": 0, + "reason": "file only present in rerun"}; continue + if rel not in r_files: + drifted.append(rel); summary[rel] = {"fields_changed": [], "count": 0, + "reason": "file only present in baseline"}; continue + + kind_b, a = load(os.path.join(baseline, rel)) + kind_r, c = load(os.path.join(rerun, rel)) + if kind_b != kind_r or kind_b == "bytes": + if a != c: + drifted.append(rel); summary[rel] = {"fields_changed": ["<byte-diff>"], "count": 1} + continue + + for glob, field, rx in compiled: + if fnmatch.fnmatch(rel, glob): + a = apply_masks(a, field, rx); c = apply_masks(c, field, rx) + + fc = sorted(diff(a, c)) + if fc: + drifted.append(rel); summary[rel] = {"fields_changed": fc, "count": len(fc)} + +overall = "pass" if not drifted else "fail" +``` + +The full implementation adds the blocker propagation for unresolved paths, the eager regex compilation, and the canonical-field emission order. 
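As a sanity check of those semantics, the two helpers can be exercised standalone on a toy pair of parsed trees (the need id `r1` and all values below are invented for illustration; the helper bodies repeat the sketch above):

```python
import re

def apply_masks(obj, field_path, regex):
    # Walk the dotted field_path; replace matching string leaves with "<masked>".
    segs = field_path.split(".")
    def visit(node, i):
        if i == len(segs):
            return "<masked>" if isinstance(node, str) and regex.search(node) else node
        if segs[i] == "*" and isinstance(node, dict):
            return {k: visit(v, i + 1) for k, v in node.items()}
        if isinstance(node, dict) and segs[i] in node:
            node[segs[i]] = visit(node[segs[i]], i + 1)
        return node
    return visit(obj, 0)

def diff(a, b, prefix=""):
    # Sorted dotted paths whose values differ; "+"/"-" suffix added/removed keys.
    if type(a) != type(b):
        return [prefix or "<root>"]
    if isinstance(a, dict):
        changed = []
        for k in sorted(set(a) | set(b)):
            p = f"{prefix}.{k}" if prefix else k
            if k not in a:
                changed.append(p + "+")
            elif k not in b:
                changed.append(p + "-")
            else:
                changed += diff(a[k], b[k], p)
        return changed
    return [] if a == b else [prefix or "<root>"]

# Toy trees standing in for two parsed needs.json copies (values invented).
baseline = {"needs": {"r1": {"title": "Login ok", "created_at": "2026-04-22T08:15:03Z"}}}
rerun    = {"needs": {"r1": {"title": "Login OK", "created_at": "2026-04-22T09:47:51Z"}}}

rx = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")
a = apply_masks(baseline, "needs.*.created_at", rx)
b = apply_masks(rerun, "needs.*.created_at", rx)

print(diff(a, b))  # → ['needs.r1.title']  (the timestamp drift is masked away)
```

The title drift survives because no rule targets `title`; the timestamp drift is hidden because both values are rewritten to `"<masked>"` before the deep compare.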
+ +## Failure modes + +- **Dotted field paths are a simplified JSON-pointer.** Segments are literal keys; `*` wildcards any key at that level; arrays are addressed by index (`needs.0.title`). Projects whose data has keys containing literal dots must split those keys before emitting the output — documented limitation, acceptable for every Pharaoh output shape observed to date. +- **Masking is per-leaf, not per-subtree.** A mask rule targeting `needs.*.created_at` replaces only the `created_at` scalar, not the whole need record. Projects wanting to mask out entire subtrees should declare a rule per leaf field or pre-process the output. +- **Regex matching is `re.search`, not `re.fullmatch`.** The rule fires when the regex finds a match anywhere in the string value; this is deliberate so a regex like `\d{10,}` can mask out Unix timestamps without requiring the field value to be exactly a timestamp. +- **Binary or malformed files fall back to byte compare.** A corrupt JSON on either side is compared byte-for-byte. That is usually what the caller wants (a malformed file is itself drift), but a project relying on lenient parsing should repair the file before invoking this check. +- **`plan_path` is metadata-only.** The skill does NOT parse or execute the plan; it does not verify that the two output directories actually came from it. Callers that need that assurance should assert it before invoking. +- **File-level is the gate.** Any drifted file fails the check. The per-field detail does not downgrade a one-field diff to a warning — reproducibility is binary. Projects that want per-field tolerance should encode it via mask rules. + +## Tailoring extension point + +- `tailoring.reproducibility_mask_rules`: projects can declare a canonical list of mask rules in their tailoring and pipe it into this skill's `mask_rules` input. Typical entries cover timestamps (`created_at`, `updated_at`, `build_timestamp`) and randomly-generated ids (`run_id`, `session_id`). 
No other knobs are exposed.
+
+The skill is deliberately a thin diff engine — every policy decision (what to mask, what threshold) lives in the caller or the tailoring.
+
+## Composition
+
+Role: `atom-check`.
+
+Callable standalone from any CI job that already holds two output directories plus a mask-rule list. The orchestrator that invokes `pharaoh-execute-plan` twice and then this check is out of scope for this atom. Never dispatches other skills. Never modifies the baseline or rerun directories.
+
+Complements `pharaoh-dispatch-signal-check` (which audits whether a plan's declared execution mode was respected in `runs/`) — that skill checks run structure, this skill checks output-byte stability across reruns. The two atoms operate on different artefacts and neither dispatches the other.
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/README.md b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/README.md
new file mode 100644
index 0000000..3a9f6e2
--- /dev/null
+++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/README.md
@@ -0,0 +1,10 @@
+# drifted-ids-but-masked
+
+Expected-pass case that demonstrates the `mask_rules` escape hatch for randomly-generated fields. Baseline and rerun share the same outer need-ids (stable dict keys), but each need carries a `generated_run_id` field whose value is a hex token produced by the plan executor at runtime — the two runs produced different tokens (`a3f7c09e` vs `b18d4f22`). Without masking, every need would drift on that field and the file would fail the check. With a single mask rule targeting `needs.*.generated_run_id` and a hex-token regex, both sides are rewritten in-memory to `"<masked>"` before the diff, so the compared structures are equal.
+
+Expected verdict: `overall: "pass"`, empty `drifted_files`.
+
+Exercises:
+- the mask-rule escape hatch: a value that really differs between runs is hidden from the diff when the caller declares it non-deterministic
+- the wildcard `*` in the dotted field path
+- regex gating (the anchored `^run_[0-9a-f]{8}$` pattern replaces only values that are exactly a run token)
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/baseline/needs.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/baseline/needs.json
new file mode 100644
index 0000000..0a6b182
--- /dev/null
+++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/baseline/needs.json
@@ -0,0 +1,18 @@
+{
+  "needs": {
+    "comp_req__login_ok": {
+      "id": "comp_req__login_ok",
+      "type": "comp_req",
+      "title": "Login succeeds on valid credentials",
+      "content": "The system shall authenticate when credentials match the store.",
+      "generated_run_id": "run_a3f7c09e"
+    },
+    "comp_req__logout_ok": {
+      "id": "comp_req__logout_ok",
+      "type": "comp_req",
+      "title": "Logout clears session",
+      "content": "The system shall discard the session token on logout.",
+      "generated_run_id": "run_51bb27de"
+    }
+  }
+}
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/expected-output.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/expected-output.json
new file mode 100644
index 0000000..533bf2c
--- /dev/null
+++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/expected-output.json
@@ -0,0 +1,7 @@
+{
+  "baseline": "baseline/",
+  "rerun": "rerun/",
+  "drifted_files": [],
+  "drift_summary": {},
+  "overall": "pass"
+}
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/input-mask-rules.yaml b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/input-mask-rules.yaml
new file mode 100644
index 0000000..822c623
--- /dev/null
+++ 
b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/input-mask-rules.yaml
@@ -0,0 +1,3 @@
+- path: "needs.json"
+  field: "needs.*.generated_run_id"
+  regex: "^run_[0-9a-f]{8}$"
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/rerun/needs.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/rerun/needs.json
new file mode 100644
index 0000000..e694d2e
--- /dev/null
+++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-ids-but-masked/rerun/needs.json
@@ -0,0 +1,18 @@
+{
+  "needs": {
+    "comp_req__login_ok": {
+      "id": "comp_req__login_ok",
+      "type": "comp_req",
+      "title": "Login succeeds on valid credentials",
+      "content": "The system shall authenticate when credentials match the store.",
+      "generated_run_id": "run_b18d4f22"
+    },
+    "comp_req__logout_ok": {
+      "id": "comp_req__logout_ok",
+      "type": "comp_req",
+      "title": "Logout clears session",
+      "content": "The system shall discard the session token on logout.",
+      "generated_run_id": "run_9c0ea6f1"
+    }
+  }
+}
diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/README.md b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/README.md
new file mode 100644
index 0000000..ed774e5
--- /dev/null
+++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/README.md
@@ -0,0 +1,9 @@
+# drifted-titles
+
+Expected-fail case. Baseline and rerun share identical need-ids (the fixture's mask rule targets `needs.*.created_at`, a field no record carries, so masking is a deliberate no-op), but the rerun has paraphrased two of the three `title` fields. No mask rule targets `title`, so every title change is real drift. Expected verdict is `overall: "fail"` with one drifted file (`needs.json`) and two entries under `fields_changed` — the dotted paths to the two changed titles. The third, unchanged title does not show up. 
+ +Exercises: +- deep dict-compare on the parsed JSON +- per-field drift reporting under `drift_summary` at path-into-record granularity +- the file-level drift gate (one file drifted → `overall: "fail"`) +- no-op masking (the `mask_rules` list is present but no entry fires, so the rerun's title changes are fully visible in the diff) diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/baseline/needs.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/baseline/needs.json new file mode 100644 index 0000000..bdd2bfb --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/baseline/needs.json @@ -0,0 +1,22 @@ +{ + "needs": { + "comp_req__login_ok": { + "id": "comp_req__login_ok", + "type": "comp_req", + "title": "Login succeeds on valid credentials", + "content": "The system shall authenticate when credentials match the store." + }, + "comp_req__logout_ok": { + "id": "comp_req__logout_ok", + "type": "comp_req", + "title": "Logout clears session", + "content": "The system shall discard the session token on logout." + }, + "comp_req__lock_ok": { + "id": "comp_req__lock_ok", + "type": "comp_req", + "title": "Lock after five failed attempts", + "content": "The system shall lock the account after five consecutive failed logins." 
+ } + } +} diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/expected-output.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/expected-output.json new file mode 100644 index 0000000..f4e0715 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/expected-output.json @@ -0,0 +1,17 @@ +{ + "baseline": "baseline/", + "rerun": "rerun/", + "drifted_files": [ + "needs.json" + ], + "drift_summary": { + "needs.json": { + "fields_changed": [ + "needs.comp_req__login_ok.title", + "needs.comp_req__logout_ok.title" + ], + "count": 2 + } + }, + "overall": "fail" +} diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/input-mask-rules.yaml b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/input-mask-rules.yaml new file mode 100644 index 0000000..00bd5c2 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/input-mask-rules.yaml @@ -0,0 +1,3 @@ +- path: "needs.json" + field: "needs.*.created_at" + regex: "^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$" diff --git a/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/rerun/needs.json b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/rerun/needs.json new file mode 100644 index 0000000..32ae30f --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/drifted-titles/rerun/needs.json @@ -0,0 +1,22 @@ +{ + "needs": { + "comp_req__login_ok": { + "id": "comp_req__login_ok", + "type": "comp_req", + "title": "Login succeeds with valid credentials", + "content": "The system shall authenticate when credentials match the store." + }, + "comp_req__logout_ok": { + "id": "comp_req__logout_ok", + "type": "comp_req", + "title": "Logout discards the session", + "content": "The system shall discard the session token on logout." 
+ }, + "comp_req__lock_ok": { + "id": "comp_req__lock_ok", + "type": "comp_req", + "title": "Lock after five failed attempts", + "content": "The system shall lock the account after five consecutive failed logins." + } + } +} diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/README.md b/skills/pharaoh-reproducibility-check/fixtures/identical-output/README.md new file mode 100644 index 0000000..5ccddc5 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/README.md @@ -0,0 +1,8 @@ +# identical-output + +Canonical happy path. Baseline and rerun each contain one `needs.json` plus one small RST file. The only difference between the two directories is the `created_at` timestamp on each need — a mask rule targeting `needs.*.created_at` with a datetime regex masks both to `"<masked>"` before diffing. After masking the two trees are byte-identical, so `overall: "pass"` and `drifted_files` is empty. + +Exercises: +- the in-memory masking path on JSON files +- the wildcard segment in a dotted field path (`needs.*.created_at`) +- the vacuous-pass condition when every parseable leaf matches after masking diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/module.rst b/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/module.rst new file mode 100644 index 0000000..30f84b0 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/module.rst @@ -0,0 +1,4 @@ +Login module +============ + +Handles authentication and session lifecycle. 
diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/needs.json b/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/needs.json new file mode 100644 index 0000000..9fdfb4f --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/baseline/needs.json @@ -0,0 +1,18 @@ +{ + "needs": { + "comp_req__login_ok": { + "id": "comp_req__login_ok", + "type": "comp_req", + "title": "Login succeeds on valid credentials", + "content": "The system shall authenticate when credentials match the store.", + "created_at": "2026-04-22T08:15:03Z" + }, + "comp_req__logout_ok": { + "id": "comp_req__logout_ok", + "type": "comp_req", + "title": "Logout clears session", + "content": "The system shall discard the session token on logout.", + "created_at": "2026-04-22T08:15:04Z" + } + } +} diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/expected-output.json b/skills/pharaoh-reproducibility-check/fixtures/identical-output/expected-output.json new file mode 100644 index 0000000..533bf2c --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/expected-output.json @@ -0,0 +1,7 @@ +{ + "baseline": "baseline/", + "rerun": "rerun/", + "drifted_files": [], + "drift_summary": {}, + "overall": "pass" +} diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/input-mask-rules.yaml b/skills/pharaoh-reproducibility-check/fixtures/identical-output/input-mask-rules.yaml new file mode 100644 index 0000000..00bd5c2 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/input-mask-rules.yaml @@ -0,0 +1,3 @@ +- path: "needs.json" + field: "needs.*.created_at" + regex: "^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$" diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/module.rst b/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/module.rst new file mode 100644 index 0000000..30f84b0 --- 
/dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/module.rst @@ -0,0 +1,4 @@ +Login module +============ + +Handles authentication and session lifecycle. diff --git a/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/needs.json b/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/needs.json new file mode 100644 index 0000000..105b4c4 --- /dev/null +++ b/skills/pharaoh-reproducibility-check/fixtures/identical-output/rerun/needs.json @@ -0,0 +1,18 @@ +{ + "needs": { + "comp_req__login_ok": { + "id": "comp_req__login_ok", + "type": "comp_req", + "title": "Login succeeds on valid credentials", + "content": "The system shall authenticate when credentials match the store.", + "created_at": "2026-04-22T09:47:51Z" + }, + "comp_req__logout_ok": { + "id": "comp_req__logout_ok", + "type": "comp_req", + "title": "Logout clears session", + "content": "The system shall discard the session token on logout.", + "created_at": "2026-04-22T09:47:52Z" + } + } +} diff --git a/skills/pharaoh-req-code-grounding-check/SKILL.md b/skills/pharaoh-req-code-grounding-check/SKILL.md new file mode 100644 index 0000000..4d1e829 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/SKILL.md @@ -0,0 +1,235 @@ +--- +name: pharaoh-req-code-grounding-check +description: Use when verifying a single drafted requirement against the source file it cites via `:source_doc:`. Single mechanical fidelity check — compares the CREQ's claims about exceptions, triggers, types, structural symbols, backtick-quoted identifiers, grounding density, adjectives, quantifiers, and branch count against the cited source, returning per-axis findings JSON. Complements `pharaoh-req-review` (which grades prose quality) with code-grounded axes. +--- + +# pharaoh-req-code-grounding-check + +## When to use + +Invoke as a sibling review alongside `pharaoh-req-review` whenever an emission skill (e.g. 
`pharaoh-req-from-code`) has just produced a requirement that declares `:source_doc:`. Reads the RST directive block + the cited source file, emits findings JSON with per-axis pass/fail so the caller can decide whether to finalize, regenerate, or reject the requirement. + +Do NOT use to grade prose quality (atomicity, verifiability, ambiguity) — that is `pharaoh-req-review`. Do NOT use for requirements lacking `:source_doc:` — axis #8 will fail immediately and the remaining axes cannot be evaluated. Do NOT use to re-author or modify the requirement — this skill is read-only and emits findings only. + +## Atomicity + +- (a) Indivisible: one CREQ + one source file in → one findings JSON out. No re-authoring, no set-level analysis, no dispatch of other skills. +- (b) Input: `{target: <need_id_or_rst>, source_doc_path: <str>, tailoring_path: <str>}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-req-code-grounding-check/fixtures/` — one per failure mode: + 1. `passing-case/` — all axes pass; matches `expected-output.json` (`overall: "pass"`, empty `blockers`). + 2. `dead-exception/` — CREQ names 5-class hierarchy; source raises 2 of 5 → `exception_raise_sites_exist` fails with 3 missing names in `evidence`. + 3. `inverted-trigger/` — CREQ says `when origin == "Sphinx-Needs"`; source has `if origin != "Sphinx-Needs"` → `trigger_condition_literal_match` fails. + 4. `pydantic-halluc/` — CREQ says "Pydantic model"; source imports `dataclasses` → `type_framework_matches_imports` fails. + 5. `weasel-adjectives/` — CREQ body contains `structured`, `comprehensive`, `full` → `no_weasel_adjectives` fails with the 3 matches in `evidence`. + 6. `unbounded-all/` — CREQ says "all validation errors" without enumeration → `quantifier_enumerated` fails. + 7. `collapsed-branches/` — CREQ is one shall-clause; source function has 4 visible branches → `branch_count_aligned` scores 1. + 8. 
`misattributed-config-field/` — CREQ body backtick-cites a default literal and a config field name; declared source_doc is the consumer module which uses them only through attribute access. Fixture ships a `code-grounding-filters.yaml` enabling the `cross_file_literal_default` strategy so the skill can emit the actionable "lives in config, cite attribute instead" evidence. Without the YAML the tokens would still fail axis #5 with the generic "not in source_doc" message. + 9. `typer-kebab-filter/` — CREQ body cites ``--license-key``; source defines `license_key` as a Typer parameter. Fixture ships a `code-grounding-filters.yaml` enabling `kebab_to_snake_or_pascal` with `morphology_prefixes: ["Opt"]`. The filter resolves; without the YAML, universal filters do not cover this pattern and the axis would fail. + 10. `toml-section-filter/` — CREQ body cites ``[myapp.export_config]``; skipped by universal filter #1 (TOML section). No tailoring YAML required. + 11. `external-dotted-path/` — CREQ body cites ``rich.console.Console``; source imports `from rich.console import Console`. Fixture ships a `code-grounding-filters.yaml` enabling `dotted_import_resolution` with the Python separator / import patterns. Without the YAML, universal filters do not cover this pattern. + 12. `env-var-glob/` — CREQ body cites ``JAMA_*``; source defines `JAMA_URL_ENV`, `JAMA_USERNAME_ENV`, etc. Fixture ships a `code-grounding-filters.yaml` enabling `prefix_glob_expansion`. Without the YAML, universal filters do not cover this pattern. + 13. `abstract-prose/` — CREQ body uses only "the component shall" / "caller-configured" with zero backtick-quoted identifiers; fails axis #8 (`source_doc_resolves`) because the file contains no symbols the shall clause names, exposing that the CREQ is untestable against the cited file. No tailoring YAML required — axis mechanics are language-agnostic. + + Pass = all 13 fixture outputs match `expected-output.json` modulo `evidence` field substring match. 
+- (d) Reusable across projects — any corpus whose CREQs declare `:source_doc:`. Two extension points, both optional: (i) weasel blacklist via `tailoring.weasel_extra`; (ii) axis-#5 pluggable language-specific filter chain via `code-grounding-filters.yaml` (schema: [`shared/code-grounding-filters.md`](../shared/code-grounding-filters.md)). Without any tailoring the skill runs three universal axis-#5 filters and the base weasel blacklist — stricter signal, language-agnostic, usable in any project out of the box. +- (e) Read-only. Does not modify the CREQ RST or the source file. Never invokes other skills (caller runs `pharaoh-req-review` as a sibling). + +## Input + +- `target`: either a `need_id` resolvable in `needs.json`, or a raw RST directive block for one CREQ. The block must contain the `:source_doc:` option; if absent, axis #8 (`source_doc_resolves`) fails with `"source_doc missing — cannot ground check"` and every other axis records `passed: "n/a"`. +- `source_doc_path` (optional when `target` is an RST block): path to the cited source file. Accepts either an absolute path or a path relative to `project_root`; relative paths are joined with `project_root` before opening. Extension determines the raise-site / import regex flavour (Python MVP; other languages via `shared/public-symbol-patterns.md`). If the resolved path does not exist, axis #8 fails with `"source_doc unresolved"`. When `target` is a raw RST block AND `source_doc_path` is omitted, the skill auto-derives it from the block's `:source_doc:` option and resolves via `project_root`. +- `project_root` (optional, required when `source_doc_path` is relative or omitted): absolute path to the consumer project's root. Used to resolve relative or auto-derived source docs to absolute paths before opening. +- `tailoring_path`: absolute path to the project's tailoring directory (`.pharaoh/project/`). 
Two files are read: + - `checklists/requirement.md` frontmatter for `tailoring.weasel_extra: [<word>, ...]` (axis #6 extension). + - `code-grounding-filters.yaml` for axis #5's pluggable language-specific filter chain; schema in [`shared/code-grounding-filters.md`](../shared/code-grounding-filters.md). Missing, empty, or malformed YAML is acceptable — only the three universal filters apply. + +Edge cases: empty source file → axes #1, #2, #3, #4, #5, #9 fail with `"source file empty"` evidence (axes that read the source body); missing tailoring file → base blacklist applies silently, no pluggable filters load; malformed `code-grounding-filters.yaml` → skill logs a warning in `notes`, falls back to universal filters only; language-specific axes (#1 raise-sites, #2 trigger, #3 named-symbol) use the Python MVP regex by default and record `passed: "n/a", reason: "language not yet supported"` for non-Python sources where the regex does not apply. + +## Output + +```json +{ + "need_id": "REQ_example_read", + "source_doc": "src/example/reader.py", + "axes": { + "exception_raise_sites_exist": {"passed": false, "evidence": "ReadError cited; 1 raise site found in reader.py:159 — PASS; UnicodeDecodeError cited as raised, 0 raise sites — FAIL"}, + "trigger_condition_literal_match": {"passed": true}, + "named_symbol_exists": {"passed": true}, + "type_framework_matches_imports": {"passed": "n/a", "reason": "no type-framework claim in body"}, + "backtick_symbol_in_source_doc": {"passed": false, "evidence": "4 backtick tokens checked: 'read_file' ✓, 'strict' ✓, 'uuid_target' ✓, 'reqif_uuid' ✗ (not in reader.py, lives in config/reqif_config.py — cross-file leak)"}, + "no_weasel_adjectives": {"passed": false, "evidence": "'structured diagnostic' — 'structured' blacklisted"}, + "quantifier_enumerated": {"passed": false, "evidence": "'all unrecoverable input failures' — unbounded 'all' without enumeration"}, + "source_doc_resolves": {"passed": true}, + "branch_count_aligned": {"score": 
1, "evidence": "function read_file has 4 visible branches (encoding err / parse err / empty / success); CREQ is one shall-clause"} + }, + "overall": "fail", + "blockers": ["exception_raise_sites_exist", "backtick_symbol_in_source_doc", "no_weasel_adjectives", "quantifier_enumerated"], + "actions": [ + "Remove 'UnicodeDecodeError raised' claim OR add raise site for UnicodeDecodeError in reader.py", + "Replace 'reqif_uuid' with the consumer-side attribute form (e.g. 'uuid_target' accessed via 'self.config.uuid_target') OR change :source_doc: to config/reqif_config.py", + "Replace 'structured diagnostic' with concrete term (list[str])", + "Enumerate 'all unrecoverable input failures' or replace with specific classes" + ] +} +``` + +`overall` is `"pass"` iff every mechanical axis has `passed: true` (or `"n/a"`) AND `branch_count_aligned.score >= 2`. Any mechanical `passed: false` OR a branch-count score of 0 or 1 promotes the axis name into `blockers` and sets `overall: "fail"`. `actions` enumerates one remediation per blocker, with enough specificity to guide regeneration. + +## Detection rule + +Eight mechanical axes plus one subjective axis. Every mechanical axis resolves to a grep over the CREQ body and/or the source file; no LLM judgement on mechanical axes. Axes are listed in the order they should execute — cheap greps first, so a failing body short-circuits before expensive AST / import resolution on axes 4, 7, and 9. + +### 1. `exception_raise_sites_exist` + +**Check:** For each class name `X` mentioned in a `raises X` / `shall raise X` / `throws X` clause in the CREQ body, grep `raise X(` in the source file. Each cited class must have ≥1 raise site. Missing raise sites promote the axis to `passed: false` with the evidence listing every missing class name. 
+ +**Detection:** +```bash +# Extract cited exceptions from CREQ body (plain ERE groups; grep -E does not support the PCRE (?:) syntax): +grep -oE '(raises?|throws?|shall raise)\s+(the\s+|an?\s+)?[A-Z][A-Za-z0-9_]+' <creq> \ + | awk '{print $NF}' | sort -u +# For each, verify raise site in source: +grep -cE "raise\s+<X>\s*\(" <source_doc> +``` + +### 2. `trigger_condition_literal_match` + +**Check:** Detect `when <field> == "<value>"` / `when <field> is <value>` in the CREQ body. Extract `<field>` and `<value>`. Grep source for `<field>\s*==\s*"<value>"` vs `<field>\s*!=\s*"<value>"`. Mismatch between claimed operator / value and the source code fails. + +**Detection:** +```bash +grep -oE 'when\s+[a-z_]+\s*(==|is)\s*"[^"]*"' <creq> +# then in source: +grep -E '<field>\s*(==|!=)\s*"<value>"' <source_doc> +``` + +### 3. `named_symbol_exists` + +**Check:** Extract symbol names from the CREQ body ONLY in bounded structural contexts: + +- Verb-prefix pattern: `(?:raises?|throws?|uses?|wraps?|calls?|invokes?|extends?|subclasses?)\s+(?:the\s+|an?\s+)?(?P<sym>[A-Z][A-Za-z0-9_]+)` +- Function-call shape: `(?P<fn>[a-z_][a-z0-9_]+)\(` + +Every extracted `sym` / `fn` must appear as a definition or call site in the source file. This narrowing — verb prefix OR trailing parens — is load-bearing: unrestricted `[A-Z][a-zA-Z0-9]+` matching produces false positives on stdlib generics (`List`, `Dict`, `Optional`) and sentence-initial capitalization (`Parser`, `User`). 
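The two extraction shapes can be sanity-checked in isolation; a minimal sketch (the sample sentence is illustrative, not drawn from the fixtures):

```python
import re

# Verb-prefix pattern: a capitalized name counts only when a structural verb precedes it.
VERB_SYM = re.compile(
    r"(?:raises?|throws?|uses?|wraps?|calls?|invokes?|extends?|subclasses?)"
    r"\s+(?:the\s+|an?\s+)?(?P<sym>[A-Z][A-Za-z0-9_]+)"
)
# Function-call shape: a snake_case name immediately followed by an opening paren.
FN_CALL = re.compile(r"(?P<fn>[a-z_][a-z0-9_]+)\(")

body = "The parser shall raise ReadError and call read_file() on each List entry."
syms = [m.group("sym") for m in VERB_SYM.finditer(body)]
fns = [m.group("fn") for m in FN_CALL.finditer(body)]

# 'List' survives neither shape: no verb prefix, no trailing paren.
print(syms, fns)  # ['ReadError'] ['read_file']
```

Each surviving name then goes through the def / class / call-site lookup in the source file.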
"TypedDict" → `from typing import TypedDict` or `typing.TypedDict`. Mismatch between the claim and the imports fails. + +**Detection:** +```bash +grep -oEi 'pydantic|dataclass|attrs\s+class|typeddict' <creq> +# Python imports: +grep -E '^(from|import)\s+(pydantic|dataclasses|attr|typing)' <source_doc> +grep -E '^@(dataclass|attr\.s|attrs\.define)' <source_doc> +``` + +### 5. `backtick_symbol_in_source_doc` + +**Check:** For every backtick-quoted token ``` ``X`` ``` in the CREQ body, verify that `X` appears as a literal substring in the declared `:source_doc:`. This catches cross-file leaks that the verb-prefixed axis #3 (`named_symbol_exists`) misses, because many backtick-cited identifiers sit in running prose without a structural verb in front of them (e.g. "honouring ``include_links`` and per-field delimiters"). + +The check runs **after** normalising the token through a two-tier filter chain: three universal filters built into the base skill plus zero or more pluggable language-specific filters loaded from `<tailoring_path>/code-grounding-filters.yaml`. Tokens surviving both tiers are looked up in the source file; the first unresolved token fails the axis with `evidence` naming each unresolved token and — when the token is found elsewhere in the project source tree — the file it actually lives in, so the caller knows whether to retarget `:source_doc:` or rewrite the CREQ. + +#### Universal filters (always active, language-agnostic) + +Apply in order; a token that matches any step is counted as resolved (or skipped) without penalty. + +1. **TOML section / table header** — token matches `^\[[a-z_][\w.]*\]$` (e.g. ``[myapp.export_config]``, ``[foo]``). Not a code identifier in any language; skip. +2. **File path / command-string** — token contains `/` or a space (e.g. ``commands/csv.py``, ``jama check``). Skip; file paths are covered by axis #8 (`source_doc_resolves`) and multi-word strings are user-facing UI text, not code symbols. 
Language-agnostic — `/` and whitespace are not valid identifier characters in any mainstream language. +3. **Short-prose guard** — tokens that are lowercase English words under 4 chars (e.g. ``id``, ``to``, ``or``) OR all-caps domain acronyms in a closed list (``API``, ``CSV``, ``JSON``, ``REST``, ``TOML``, ``CLI``, ``URL``, ``HTTP``) are treated as prose, not symbols. Skip. The acronym list is conservative and stays in the base skill. + +#### Pluggable filters (from tailoring YAML) + +After the three universal filters run, the skill loads `<tailoring_path>/code-grounding-filters.yaml` (if present) and applies every filter declared there in order. The YAML schema and the four supported strategies (`kebab_to_snake_or_pascal`, `prefix_glob_expansion`, `dotted_import_resolution`, `cross_file_literal_default`) are documented in [`shared/code-grounding-filters.md`](../shared/code-grounding-filters.md). Each strategy is a parameterised shape; projects supply the language-specific regex / separator / patterns per strategy. Python / Typer projects get CLI-kebab + env-glob + stdlib-import + dataclass-default filters; Rust / Clap projects get the same strategies with different patterns (`use X::Y`, `#[serde(default=...)]`). An absent YAML means only the universal filters run — acceptable default, stricter signal, no wrong-language false negatives. + +Any token that fails every filter AND does not literally appear in `:source_doc:` fails the axis. 
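For orientation, a tailoring file enabling a single strategy can be as small as the following sketch (it mirrors the `env-var-glob` fixture shipped with this skill; the authoritative schema is `shared/code-grounding-filters.md`):

```yaml
# <tailoring_path>/code-grounding-filters.yaml — one pluggable filter for axis #5.
filters:
  - name: env_var_glob
    strategy: prefix_glob_expansion        # one of the four supported strategies
    token_regex: "^[A-Z][A-Z0-9_]*_?\\*$"  # backtick tokens this filter claims
    separator_character: "_"
```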
+ +**Detection (pseudocode):** +```python +for tok in re.findall(r'``([^`]+)``', creq_body): + # Tier 1 — universal filters (always active) + if match_toml_section(tok): continue + if '/' in tok or ' ' in tok: continue + if is_short_prose(tok): continue + + # Tier 2 — pluggable filters from tailoring + if any(f.resolves(tok, source_text, project_root) + for f in tailored_filters): + continue + + # Baseline — literal substring match + if tok in source_text: continue + + # Not resolved — record with cross-file lookup for evidence + elsewhere = locate_in_project(tok, project_root) + violations.append({"token": tok, "found_in": elsewhere}) +``` + +The ordering is load-bearing: universal filters short-circuit before the YAML is even opened, so missing-tailoring projects pay zero cost for the three cheap greps. Pluggable filters run before the baseline substring check so that language-specific resolutions take precedence over accidental substring coincidences. + +### 6. `no_weasel_adjectives` + +**Check:** Grep the CREQ body against the base blacklist: + +``` +structured, comprehensive, full, absolute, paginated, robust, complete, proper +``` + +Any match fails with the matched word in evidence. These words imply mechanised behaviour without grounding. Tailoring extension: `tailoring.weasel_extra` (list) is union-merged with the base before the grep. + +**Detection:** +```bash +grep -iwE '\b(structured|comprehensive|full|absolute|paginated|robust|complete|proper)\b' <creq> +``` + +### 7. `quantifier_enumerated` + +**Check:** Narrow, mechanical quantifier detection only. 
Regex: + +``` +\b(?:all|every|each)\s+(?:[a-z]+\s+){0,3}(?:errors?|exceptions?|failures?|cases?|commands?|branch(?:es)?|modes?|validators?)\b +``` + +If matched, the same sentence or the next sentence must contain either: +- `:\s` (enumeration colon), OR +- ` namely ` / ` specifically ` / ` including ` (enumeration marker), OR +- a Sphinx list directive (`.. list-table::` / `- ` bullet in an adjacent block). + +Otherwise fail. Broader quantifier judgement — "does 'the system' implicitly quantify?" — is deferred to the subjective `unambiguity_prose` axis in `pharaoh-req-review`. This axis catches only the specific pattern where the noun signals an expected enumeration that is missing. + +### 8. `source_doc_resolves` + +**Check:** The CREQ's `:source_doc:` option must (a) be present, (b) point at an existing file, and (c) the file must contain at least one symbol the CREQ names (per axis #3 extraction). Three fail modes: + +- `:source_doc:` absent → `passed: false, evidence: "source_doc missing — cannot ground check"`. +- Path does not exist → `passed: false, evidence: "source_doc unresolved: <path>"`. +- Symbols from CREQ body absent in file → `passed: false, evidence: "source_doc-symbol mismatch: none of [<sym>, ...] found in <path>"`. + +### 9. `branch_count_aligned` (subjective, 0-3) + +**Check:** Count `if` / `elif` / `else` / `match` branches in the function the CREQ names within the `:source_doc:` file (parse via Python `ast` where available, regex fallback). If CREQ is one shall-clause but the function has ≥3 branches producing visibly different outputs, score ≤ 2. If CREQ enumerates branches or is a set of short CREQs covering them, score 3. + +Rubric: +- 3 — CREQ structure matches source branch count (1 shall per branch, or a single CREQ with explicit per-branch enumeration). +- 2 — CREQ groups branches under a justified umbrella (e.g. "validation errors" for 2-3 similar branches). 
+- 1 — CREQ collapses ≥3 distinct branches into one shall-clause with no enumeration; different projects would reasonably want these split. +- 0 — CREQ omits entire branches that produce observable output. + +## Tailoring extension point + +See [`shared/checklists/requirement.md`](../shared/checklists/requirement.md) — the canonical location of the `tailoring.weasel_extra` frontmatter key consumed by axis #6 (`no_weasel_adjectives`). No other project-specific state in the base skill; all regulatory-standard vocabulary (ASIL, ARC, ASPICE process IDs) stays out of the base. + +## Composition + +Role: `atom-check`. + +Called as a sibling alongside `pharaoh-req-review` from the `## Last step` of any emission skill that drafts requirements with `:source_doc:` (e.g. `pharaoh-req-from-code`). The two atoms run independently; neither dispatches the other. The caller merges findings under `review.iso_axes` (from req-review) and `review.code_grounding` (from this skill). Emission fails if either atom returns a mechanical-axis failure. + +Never invoked directly by end users — always from an emission skill's Last step or from `pharaoh-quality-gate.required_checks` in invariant-delegation mode. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/README.md new file mode 100644 index 0000000..abeef96 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/README.md @@ -0,0 +1,16 @@ +# abstract-prose + +Axis-8 failure case — the abstract-prose regression guard. + +The CREQ body is lifted verbatim from an abstract-prose style observed +during a prior dogfooding session: "the component shall read the input +CSV file using the caller-configured column delimiter and text encoding, +surfacing a read error to its caller when...". Zero backticks, zero +verb-prefixed class names, zero function-call shapes. 
Every other +mechanical axis has nothing to complain about — but the CREQ adds no +verification information the source code does not already carry. + +Axis 8 (`source_doc_resolves`) fails because the body names no symbols +that appear in the declared file — the CREQ is untestable against the +cited source. The failure surfaces that a fully abstract CREQ does not +ground-truth against the code even when prose mechanics pass. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/expected-output.json new file mode 100644 index 0000000..70b896b --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/expected-output.json @@ -0,0 +1,49 @@ +{ + "need_id": "CREQ_csv_import_01", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "body describes 'a read error' abstractly without naming a class" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed [A-Z]\\w+ symbol and no function-call shape in body \u2014 fully abstract prose" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier matches the narrow noun list" + }, + "source_doc_resolves": { + "passed": false, + "evidence": "source_doc file exists, but no symbols from the CREQ body are named in it \u2014 CREQ is too abstract to verify against the file" + }, + "branch_count_aligned": { + "score": 1, + "evidence": "function read_csv has 2 branches (success / error wrapping); CREQ is one 
shall-clause fusing them, no enumeration" + } + }, + "overall": "fail", + "blockers": [ + "source_doc_resolves", + "branch_count_aligned" + ], + "actions": [ + "Rewrite CREQ to cite at least one concrete symbol that appears in input-source.py: the function name (``read_csv``), the parameter names (``delimiter``, ``encoding``), and the raised exception class (``CSVReadError``). Pure abstract prose cannot be verified against code and provides no review value." + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-creq.rst new file mode 100644 index 0000000..6680e88 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-creq.rst @@ -0,0 +1,9 @@ +.. comp_req:: Ingest CSV rows honouring caller-chosen delimiter and encoding + :id: CREQ_csv_import_01 + :status: draft + :source_doc: input-source.py + + The component shall read the input CSV file using the caller-configured + column delimiter and text encoding, surfacing a read error to its caller + when the file cannot be parsed or when the bytes cannot be decoded with + the configured encoding rather than silently coercing the data. 
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-source.py new file mode 100644 index 0000000..02a2bfa --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/abstract-prose/input-source.py @@ -0,0 +1,15 @@ +"""CSV importer — concrete implementation that the abstract CREQ ignores.""" + +import csv + + +class CSVReadError(Exception): + pass + + +def read_csv(path: str, delimiter: str = ",", encoding: str = "utf-8") -> list[dict]: + try: + with open(path, encoding=encoding) as fh: + return list(csv.DictReader(fh, delimiter=delimiter)) + except (OSError, UnicodeDecodeError, csv.Error) as exc: + raise CSVReadError(str(exc)) from exc diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/README.md new file mode 100644 index 0000000..f53482c --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/README.md @@ -0,0 +1,3 @@ +# collapsed-branches + +Demonstrates the subjective `branch_count_aligned` failure mode. The function `read_csv` has four observably different branches (encoding error, csv.Error, empty result, success path), but the CREQ is a single shall-clause that only describes the success path. Score 1: the mechanical axes all pass, but the subjective axis flags the collapse. Remediation splits the CREQ or enumerates the four branches. 
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/expected-output.json new file mode 100644 index 0000000..e6fd01f --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/expected-output.json @@ -0,0 +1,47 @@ +{ + "need_id": "CREQ_csv_read", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "read_csv found as def in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 1, + "evidence": "function read_csv has 4 visible branches (encoding err / csv.Error / empty / success) producing observably different outputs; CREQ is one shall-clause covering only the success path" + } + }, + "overall": "fail", + "blockers": [ + "branch_count_aligned" + ], + "actions": [ + "Split into four CREQs (success path plus three error classes), or rewrite this CREQ to enumerate the four observable branches explicitly" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-creq.rst new file mode 100644 index 0000000..ca8c3fa --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-creq.rst @@ -0,0 +1,6 @@ 
+.. comp_req:: Read CSV with error handling + :id: CREQ_csv_read + :status: draft + :source_doc: input-source.py + + The read_csv function shall return a list of row dicts from the input file. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-source.py new file mode 100644 index 0000000..4134b4d --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/collapsed-branches/input-source.py @@ -0,0 +1,23 @@ +"""CSV reader with four distinct observable branches.""" + +import csv + + +class CSVReadError(Exception): + pass + + +def read_csv(path: str) -> list[dict]: + try: + with open(path, encoding="utf-8") as fh: + reader = csv.DictReader(fh) + rows = list(reader) + except UnicodeDecodeError as exc: + raise CSVReadError(f"encoding failure: {exc}") from exc + except csv.Error as exc: + raise CSVReadError(f"csv parse failure: {exc}") from exc + + if not rows: + raise CSVReadError("input file produced zero rows") + + return rows diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/README.md new file mode 100644 index 0000000..352849e --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/README.md @@ -0,0 +1,3 @@ +# dead-exception + +Demonstrates the dead-class failure mode observed in the pilot: a CREQ advertises a five-class exception hierarchy, but the source only actually raises two of them. `exception_raise_sites_exist` fails with evidence naming all three classes missing raise sites. Remaining axes pass — named-symbol existence still holds because the classes are defined (just never raised), and the CREQ explicitly enumerates each case so the branch-count axis scores 3. 
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/expected-output.json new file mode 100644 index 0000000..d1ca41c --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/expected-output.json @@ -0,0 +1,49 @@ +{ + "need_id": "CREQ_upload_errors", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": false, + "evidence": "UploadAuthError cited, 1 raise site \u2014 PASS; UploadTransportError cited, 1 raise site \u2014 PASS; UploadArtifactTypeError cited as raised, 0 raise sites \u2014 FAIL; UploadSkippedValueError cited as raised, 0 raise sites \u2014 FAIL; UploadValueMapError cited as raised, 0 raise sites \u2014 FAIL" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "all cited exception class names resolve as class defs in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "CREQ enumerates 5 distinct exception cases; covers each branch explicitly" + } + }, + "overall": "fail", + "blockers": [ + "exception_raise_sites_exist" + ], + "actions": [ + "Remove 'UploadArtifactTypeError raised' claim OR add a raise site for UploadArtifactTypeError in input-source.py", + "Remove 'UploadSkippedValueError raised' claim OR add a raise site for UploadSkippedValueError in input-source.py", + "Remove 'UploadValueMapError raised' claim OR add a raise 
site for UploadValueMapError in input-source.py" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-creq.rst new file mode 100644 index 0000000..481c35a --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-creq.rst @@ -0,0 +1,9 @@ +.. comp_req:: Upload client error hierarchy + :id: CREQ_upload_errors + :status: draft + :source_doc: input-source.py + + The upload client shall raise UploadAuthError on authentication failure, + raise UploadArtifactTypeError on mismatched artefact types, raise + UploadSkippedValueError on skipped values, raise UploadValueMapError on + value-map failures, and raise UploadTransportError on transport failures. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-source.py new file mode 100644 index 0000000..0fa5398 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/dead-exception/input-source.py @@ -0,0 +1,31 @@ +"""Upload client — exception hierarchy declared, only two actually raised.""" + + +class UploadAuthError(Exception): + """Raised on authentication failure.""" + + +class UploadArtifactTypeError(Exception): + """Declared but never raised — dead class.""" + + +class UploadSkippedValueError(Exception): + """Declared but never raised — dead class.""" + + +class UploadValueMapError(Exception): + """Declared but never raised — dead class.""" + + +class UploadTransportError(Exception): + """Raised on transport failure.""" + + +def authenticate(token: str) -> None: + if not token: + raise UploadAuthError("missing token") + + +def send(payload: bytes) -> None: + if not payload: + raise UploadTransportError("empty payload") diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/README.md 
b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/README.md new file mode 100644 index 0000000..639c327 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/README.md @@ -0,0 +1,14 @@ +# env-var-glob + +Axis-5 filter-step-#4 exercise. The CREQ cites `JAMA_*` as a glob +pattern over a family of environment variables. The literal string +`JAMA_*` is not a Python identifier and would fail a naive substring +lookup. + +The env-glob filter detects the trailing `*`, strips it, compiles a +`\bJAMA_\w+\b` regex, and runs it against the source. At least one +match in the source (here: `JAMA_URL_ENV`, `JAMA_USERNAME_ENV`, etc.) +resolves the token. + +Also exercises filter step #2 (file path / CLI command): `jama check` +is a user-facing command string, not a Python symbol — skipped. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/code-grounding-filters.yaml b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/code-grounding-filters.yaml new file mode 100644 index 0000000..e215e78 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/code-grounding-filters.yaml @@ -0,0 +1,5 @@ +filters: + - name: env_var_glob + strategy: prefix_glob_expansion + token_regex: "^[A-Z][A-Z0-9_]*_?\\*$" + separator_character: "_" diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/expected-output.json new file mode 100644 index 0000000..37530ed --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/expected-output.json @@ -0,0 +1,43 @@ +{ + "need_id": "CREQ_jama_cli_env_fallback", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + 
"named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed symbol in body" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": true, + "evidence": "2 backtick tokens checked: 'JAMA_*' \u2713 (tailored filter 'env_var_glob' expanded to /\\bJAMA_\\w+\\b/ and matched JAMA_URL_ENV + 3 siblings in source), 'jama check' \u2014 universal filter #2 (file path / command string) applied, skipped as user-facing command" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier matches the narrow noun list" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "resolve_credential has 2 branches (cli_value truthy / env fallback); CREQ describes the fallback branch explicitly" + } + }, + "overall": "pass", + "blockers": [], + "actions": [] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-creq.rst new file mode 100644 index 0000000..19a1e7c --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-creq.rst @@ -0,0 +1,7 @@ +.. comp_req:: Resolve Jama credentials from environment + :id: CREQ_jama_cli_env_fallback + :status: draft + :source_doc: input-source.py + + The CLI shall fall back to ``JAMA_*`` environment variables for + unset credential fields before failing a ``jama check`` call. 
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-source.py new file mode 100644 index 0000000..af419ea --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/env-var-glob/input-source.py @@ -0,0 +1,15 @@ +"""Jama CLI wiring with env-var fallbacks.""" + +import os + +JAMA_URL_ENV = "JAMA_URL" +JAMA_USERNAME_ENV = "JAMA_USERNAME" +JAMA_PASSWORD_ENV = "JAMA_PASSWORD" +JAMA_PROJECT_ID_ENV = "JAMA_PROJECT_ID" + + +def resolve_credential(field: str, cli_value: str) -> str: + if cli_value: + return cli_value + env_name = f"JAMA_{field.upper()}" + return os.environ.get(env_name, "") diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/README.md new file mode 100644 index 0000000..e8093de --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/README.md @@ -0,0 +1,10 @@ +# external-dotted-path + +Axis-5 filter-step-#5 exercise. The CREQ cites `rich.console.Console`, +a third-party import path. The literal string does not appear in the +source `.py` as an identifier (the source has `from rich.console +import Console` and then uses bare `Console`). + +The dotted-path filter splits the token into module + attribute and +checks for a matching `from <module> import <attr>` clause. When +present, the token is treated as resolved. 
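The `dotted_import_resolution` strategy described in the README above could be sketched like this, mirroring the `import_patterns` list in the fixture's `code-grounding-filters.yaml`. The function name is invented for illustration; the real skill is prompt-driven and ships no implementation.

```python
import re


def dotted_import_resolves(token: str, source: str) -> bool:
    # dotted_import_resolution sketch: split 'pkg.mod.Attr' into
    # module + attribute at the last dot, then accept the token if the
    # source has a matching 'from <mod> import <attr>' clause, a bare
    # 'import <mod>', or the full dotted path verbatim.
    mod, _, attr = token.rpartition(".")
    patterns = [
        rf"from\s+{re.escape(mod)}\s+import\s+.*\b{re.escape(attr)}\b",
        rf"import\s+{re.escape(mod)}\b",
        re.escape(token),
    ]
    return any(re.search(p, source) for p in patterns)


src = "from rich.console import Console\n"
print(dotted_import_resolves("rich.console.Console", src))  # True
```

This matches the fixture: `rich.console.Console` never appears verbatim in the source, but the `from rich.console import Console` clause resolves it.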
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/code-grounding-filters.yaml b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/code-grounding-filters.yaml new file mode 100644 index 0000000..43a0782 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/code-grounding-filters.yaml @@ -0,0 +1,9 @@ +filters: + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/expected-output.json new file mode 100644 index 0000000..fd57996 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/expected-output.json @@ -0,0 +1,43 @@ +{ + "need_id": "CREQ_core_logger", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed [A-Z]\\w+ symbol in body (wrap-verb target is a dotted path, handled by filter step #5)" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": true, + "evidence": "1 backtick token checked: 'rich.console.Console' \u2713 (tailored filter 'python_import' matched 'from rich.console import Console' in source)" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + 
"branch_count_aligned": { + "score": 2, + "evidence": "Logger.info has 1 branch but CREQ describes TTY/pipe dispatch that rich.console.Console handles internally \u2014 justified delegation" + } + }, + "overall": "pass", + "blockers": [], + "actions": [] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-creq.rst new file mode 100644 index 0000000..90bd115 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-creq.rst @@ -0,0 +1,7 @@ +.. comp_req:: Rich-console logger wraps stdout + :id: CREQ_core_logger + :status: draft + :source_doc: input-source.py + + The logger shall wrap ``rich.console.Console`` so that log records + render with colour markup on a TTY and plain text on a pipe. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-source.py new file mode 100644 index 0000000..b0a5620 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/external-dotted-path/input-source.py @@ -0,0 +1,11 @@ +"""Logger wrapper around rich.console.Console.""" + +from rich.console import Console + + +class Logger: + def __init__(self) -> None: + self.console = Console() + + def info(self, message: str) -> None: + self.console.print(message) diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/README.md new file mode 100644 index 0000000..e96e599 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/README.md @@ -0,0 +1,3 @@ +# inverted-trigger + +Demonstrates the inverted-trigger failure mode. The CREQ asserts the routing happens when `origin_field == "External"`; the source actually branches on `origin_field != "Sphinx-Needs"`. 
Both the operator (`==` vs `!=`) and the string literal diverge. `trigger_condition_literal_match` fails with evidence naming the divergence. The CREQ's named symbol (`dispatch_item`) does exist in the source, so axis #3 still passes. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/expected-output.json new file mode 100644 index 0000000..37bd262 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/expected-output.json @@ -0,0 +1,47 @@ +{ + "need_id": "CREQ_dispatch_external", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": false, + "evidence": "CREQ claims origin_field == \"External\"; source has origin_field != \"Sphinx-Needs\" \u2014 operator and value both diverge from the claim" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "dispatch_item found as def in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 2, + "evidence": "function dispatch_item has 2 branches; CREQ is one shall-clause covering the positive path" + } + }, + "overall": "fail", + "blockers": [ + "trigger_condition_literal_match" + ], + "actions": [ + "Rewrite the trigger to match the source: either 'when origin_field != \"Sphinx-Needs\"' (mirror the code) or change the source to test equality with \"External\"" + ] +} diff --git 
a/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-creq.rst new file mode 100644 index 0000000..b3d5606 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-creq.rst @@ -0,0 +1,7 @@ +.. comp_req:: Dispatch external origin items + :id: CREQ_dispatch_external + :status: draft + :source_doc: input-source.py + + The dispatcher shall route each item when origin_field == "External". + The dispatch_item function performs the routing. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-source.py new file mode 100644 index 0000000..1517d33 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/inverted-trigger/input-source.py @@ -0,0 +1,8 @@ +"""Dispatcher — routes items by origin, but not for the origin the CREQ claims.""" + + +def dispatch_item(item: dict) -> str: + origin_field = item.get("origin_field", "") + if origin_field != "Sphinx-Needs": + return "routed-to-default" + return "skipped" diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/README.md new file mode 100644 index 0000000..6c8fb30 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/README.md @@ -0,0 +1,22 @@ +# misattributed-config-field + +Axis-5 failure case. The CREQ cites three backtick tokens +(`outpath`, `default_format`, `.archive`) that the declared `source_doc` +(the consumer module) does not contain — those names live in the +project's config module and the consumer reaches them only through +attribute access (`self.config.target_path`, `self.config.uuid_target`). 
+ +This is a direct replay of an observed dogfooding failure pattern: +authors cite field-default literals when the consumer actually uses +the attribute name, causing a cross-file leak. Axis 5 flags each +token with a retarget hint. + +The fixture ships: + +- `input-source.py` — the consumer module cited by `:source_doc:`. +- `config/export_config.py` — the schema module carrying the + default-value literals the CREQ cites. +- `code-grounding-filters.yaml` — enables the + `cross_file_literal_default` strategy so the skill emits the + actionable "cite attribute instead / retarget source_doc" evidence + rather than a generic "not found in source" message. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/code-grounding-filters.yaml b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/code-grounding-filters.yaml new file mode 100644 index 0000000..bc566ce --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/code-grounding-filters.yaml @@ -0,0 +1,6 @@ +filters: + - name: python_dataclass_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/" + field_regex: "field\\(default=[\"']${tok}[\"']\\)" diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/config/export_config.py b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/config/export_config.py new file mode 100644 index 0000000..e14038a --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/config/export_config.py @@ -0,0 +1,13 @@ +"""Schema module — defines the config fields the consumer reads.""" + +from __future__ import annotations + +from dataclasses import dataclass, field +from pathlib import Path + + +@dataclass +class ExportConfig: + target_path: Path = field(default=Path("needs.archive")) + uuid_target: str = field(default="default_format") + 
archive_suffix: str = field(default=".archive") diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/expected-output.json new file mode 100644 index 0000000..7bf695b --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/expected-output.json @@ -0,0 +1,49 @@ +{ + "need_id": "CREQ_export_archive_pack", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed symbol or function-call shape in body" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": false, + "evidence": "3 backtick tokens checked: '.archive' ✗ (universal filter #2 does not match — no slash, no space; no tailored filter resolves; token not in input-source.py), 'outpath' ✗ (not in input-source.py; source uses 'target_path' attribute), 'default_format' ✗ (tailored filter 'python_dataclass_default' matched: literal default of a dataclass field discovered in config/export_config.py; consumer source_doc only accesses self.config.uuid_target). Cite the consumer-side attribute form (e.g. 
'uuid_target' via 'self.config.uuid_target') or retarget :source_doc: to config/export_config.py" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 2, + "evidence": "Exporter.export has 1 observable branch (iteration over needs); CREQ is a single shall-clause describing the export — acceptable umbrella" + } + }, + "overall": "fail", + "blockers": [ + "backtick_symbol_in_source_doc" + ], + "actions": [ + "Replace 'outpath' with the consumer-side attribute form 'target_path' (accessed via self.config.target_path) OR change :source_doc: to config/export_config.py", + "Replace 'default_format' (default-value literal) with 'uuid_target' (the config field name the consumer actually reads via self.config.uuid_target)", + "Drop '.archive' from the CREQ body and move it into a config-scoped CREQ whose :source_doc: is config/export_config.py" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-creq.rst new file mode 100644 index 0000000..ed0a585 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-creq.rst @@ -0,0 +1,8 @@ +.. comp_req:: Pack needs into an archive + :id: CREQ_export_archive_pack + :status: draft + :source_doc: input-source.py + + The exporter shall write a ``.archive`` file to the configured + ``outpath`` and tag every exported item with a ``default_format`` + derived from its sphinx-needs identifier. 
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-source.py new file mode 100644 index 0000000..7a40d5e --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/misattributed-config-field/input-source.py @@ -0,0 +1,22 @@ +"""Exporter consumes a config object via attribute access only. + +Literal default values (e.g. field defaults) live in the config module +under config/ — not in this consumer module. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from pathlib import Path + + +@dataclass +class Exporter: + config: object + + def export(self, needs: list[dict]) -> Path: + target = self.config.target_path + for need in needs: + need[self.config.uuid_target] = need["id"] + target.write_bytes(b"<reqif/>") + return target diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/README.md new file mode 100644 index 0000000..5202641 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/README.md @@ -0,0 +1,3 @@ +# passing-case + +Canonical happy path: one CREQ cites one exception class that is raised, names one function that is defined, declares no type-framework, no weasel words, no unbounded quantifier. All seven mechanical axes pass; the subjective branch-count axis scores 3 because the single shall-clause matches the function's single observable branch. `overall == "pass"` and `blockers` is empty. 
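The `exception_raise_sites_exist` axis that this fixture (and `dead-exception` above) exercises amounts to counting textual raise sites per cited class. A minimal sketch, with an invented function name, purely for illustration:

```python
import re


def raise_sites(exc_name: str, source: str) -> int:
    # exception_raise_sites_exist sketch: count 'raise <ExcName>' sites
    # in the source text; zero sites for a cited exception is a blocker.
    return len(re.findall(rf"\braise\s+{re.escape(exc_name)}\b", source))


src = (
    "class InventoryValidationError(Exception):\n"
    "    pass\n\n"
    "def read_inventory(path, strict=False):\n"
    "    if strict:\n"
    '        raise InventoryValidationError("missing sku")\n'
)
print(raise_sites("InventoryValidationError", src))  # 1 -> axis passes
print(raise_sites("UploadArtifactTypeError", src))   # 0 -> would be a blocker
```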
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/expected-output.json new file mode 100644 index 0000000..8db09dd --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/expected-output.json @@ -0,0 +1,43 @@ +{ + "need_id": "CREQ_inventory_read", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": true, + "evidence": "InventoryValidationError cited; 1 raise site found" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "read_inventory found as def in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": true, + "evidence": "3 backtick tokens checked: 'read_inventory' \u2713, 'strict' \u2713, 'csv' \u2713 \u2014 all resolve in input-source.py" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "function read_inventory has 1 observable branch (strict=true); CREQ enumerates it explicitly" + } + }, + "overall": "pass", + "blockers": [], + "actions": [] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-creq.rst new file mode 100644 index 0000000..06fe64f --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-creq.rst @@ -0,0 +1,8 @@ +.. 
comp_req:: Read inventory CSV and emit rows + :id: CREQ_inventory_read + :status: draft + :source_doc: input-source.py + + The component shall call ``read_inventory`` to parse the input file. + When the ``strict`` flag is set, the component shall raise InventoryValidationError + on malformed rows. The parser uses the standard ``csv`` module. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-source.py new file mode 100644 index 0000000..4b7a71e --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/passing-case/input-source.py @@ -0,0 +1,18 @@ +"""Read inventory CSV files into row dicts.""" + +import csv + + +class InventoryValidationError(Exception): + """Raised when a row fails validation in strict mode.""" + + +def read_inventory(path: str, strict: bool = False) -> list[dict]: + rows: list[dict] = [] + with open(path, newline="", encoding="utf-8") as fh: + reader = csv.DictReader(fh) + for row in reader: + if strict and not row.get("sku"): + raise InventoryValidationError(f"missing sku in row: {row}") + rows.append(row) + return rows diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/README.md new file mode 100644 index 0000000..4f503aa --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/README.md @@ -0,0 +1,3 @@ +# pydantic-halluc + +Demonstrates the type-framework hallucination failure mode. The CREQ advertises `RowRecord` as a Pydantic model, but the source has `@dataclass` and imports `dataclasses`, not `pydantic`. `type_framework_matches_imports` fails with evidence naming the mismatch. The class name and function both exist in the source, so `named_symbol_exists` passes — the only failure is the false framework claim. 
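The `type_framework_matches_imports` check this fixture demonstrates reduces to verifying that a claimed framework is backed by a matching import. A hedged sketch; the claim-to-module table and function name are assumptions made for this example, not the skill's actual vocabulary:

```python
import re

# Assumed mapping from prose framework claims to the import that must
# back them; the real skill may recognise a wider vocabulary.
FRAMEWORK_IMPORTS = {
    "pydantic model": "pydantic",
    "dataclass": "dataclasses",
}


def framework_claim_matches(claim: str, source: str) -> bool:
    # type_framework_matches_imports sketch: a framework claim in the
    # CREQ body must correspond to an import statement in the source.
    module = FRAMEWORK_IMPORTS[claim.lower()]
    return re.search(rf"^\s*(from|import)\s+{module}\b", source, re.M) is not None


src = "from dataclasses import dataclass\n"
print(framework_claim_matches("dataclass", src))       # True
print(framework_claim_matches("Pydantic model", src))  # False -> blocker
```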
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/expected-output.json new file mode 100644 index 0000000..8a97ae9 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/expected-output.json @@ -0,0 +1,47 @@ +{ + "need_id": "CREQ_row_record", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "RowRecord found as class def; parse_row found as def" + }, + "type_framework_matches_imports": { + "passed": false, + "evidence": "CREQ claims 'Pydantic model'; source imports dataclasses and uses @dataclass \u2014 no pydantic import present" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "function parse_row has 1 branch (unconditional construction); CREQ matches" + } + }, + "overall": "fail", + "blockers": [ + "type_framework_matches_imports" + ], + "actions": [ + "Rewrite 'Pydantic model' as 'dataclass' to match the source imports, OR migrate RowRecord to pydantic BaseModel if Pydantic semantics are actually required" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-creq.rst new file mode 100644 index 0000000..e6369c9 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-creq.rst @@ -0,0 
+1,7 @@ +.. comp_req:: Row record schema + :id: CREQ_row_record + :status: draft + :source_doc: input-source.py + + The RowRecord Pydantic model shall validate each CSV row. The parse_row + function constructs the model from a raw dict. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-source.py new file mode 100644 index 0000000..814cf35 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/pydantic-halluc/input-source.py @@ -0,0 +1,18 @@ +"""Row record — dataclass, NOT pydantic.""" + +from dataclasses import dataclass + + +@dataclass +class RowRecord: + sku: str + quantity: int + price: float + + +def parse_row(raw: dict) -> RowRecord: + return RowRecord( + sku=str(raw["sku"]), + quantity=int(raw["quantity"]), + price=float(raw["price"]), + ) diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/README.md new file mode 100644 index 0000000..11e0b3e --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/README.md @@ -0,0 +1,11 @@ +# toml-section-filter + +Axis-5 filter-step-#1 exercise. The CREQ cites `[myapp.to_format]` +in backticks — a TOML section header, not a Python identifier. Without +the filter, the token would fail axis 5 because it cannot be found in +any Python source. With the filter, it is skipped from the symbol +lookup (but the CREQ still has two other resolving tokens, so density +is healthy). + +This covers the false-positive class where authors legitimately cite +configuration file anchors. 
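The filter-step-#1 skip that this fixture covers can be sketched as a single shape test. The name `should_skip` and the exact regex are illustrative assumptions; the point is only that bracketed section headers never enter Python-symbol resolution:

```python
import re

# A backtick token shaped like a bracketed dotted section header is a
# config-file anchor (e.g. a TOML table name), not a Python symbol.
TOML_SECTION = re.compile(r"^\[[\w.-]+\]$")


def should_skip(token: str) -> bool:
    # Filter step #1 sketch: skip TOML section headers from symbol
    # lookup instead of failing them as unresolvable identifiers.
    return bool(TOML_SECTION.match(token))


print(should_skip("[myapp.to_format]"))  # True  -> skipped, not a failure
print(should_skip("to_format"))          # False -> normal symbol lookup
```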
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/expected-output.json new file mode 100644 index 0000000..3c7494a --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/expected-output.json @@ -0,0 +1,43 @@ +{ + "need_id": "CREQ_format_cli_commands", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed symbol in body" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": true, + "evidence": "3 backtick tokens checked: 'to_format' ✓ (function def present), '[myapp.to_format]' ✓ (TOML-section filter — skipped from Python-symbol resolution), 'ExportConfig' ✓ (class def present)" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier in body" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "to_format has 1 branch; CREQ matches" + } + }, + "overall": "pass", + "blockers": [], + "actions": [] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-creq.rst new file mode 100644 index 0000000..5857571 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-creq.rst @@ -0,0 +1,8 @@ +.. 
comp_req:: Resolve export configuration from TOML + :id: CREQ_format_cli_commands + :status: draft + :source_doc: input-source.py + + The ``to_format`` command shall merge CLI overrides with the + ``[myapp.to_format]`` section of the project TOML before + constructing an ``ExportConfig``. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-source.py new file mode 100644 index 0000000..8f7d1c2 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/toml-section-filter/input-source.py @@ -0,0 +1,17 @@ +"""CLI entry point for the format export command. + +The TOML section name is parsed upstream; this module only consumes the +already-merged ExportConfig object. +""" + +from dataclasses import dataclass + + +@dataclass +class ExportConfig: + prefix: str + target_path: str + + +def to_format(config: ExportConfig) -> None: + _ = config.target_path diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/README.md new file mode 100644 index 0000000..2494078 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/README.md @@ -0,0 +1,11 @@ +# typer-kebab-filter + +Axis-5 filter-step-#3 exercise. Typer renders `license_key: str = +typer.Option(...)` on the CLI as `--license-key`. Authors prose-cite +the kebab form, which does not literally appear in the source. The +kebab filter strips `--`, converts `-` → `_`, and re-checks; all three +flags resolve this way. + +Covers a false-positive class that would otherwise poison any project +using Typer or Click — and prevents the check skill from fighting a +convention upstream of Pharaoh.
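The kebab filter in the README above can be sketched as a candidate-respelling step. A sketch only: the function name is invented, and the PascalCase candidate is an extrapolation from the strategy name `kebab_to_snake_or_pascal` in the fixture's YAML rather than anything the README states.

```python
def kebab_candidates(token: str) -> list[str]:
    # kebab_to_snake_or_pascal sketch: strip a leading '--', then offer
    # snake_case and PascalCase respellings to re-check against the
    # source. Any candidate resolving counts as a match.
    bare = token.removeprefix("--")
    snake = bare.replace("-", "_")
    pascal = "".join(part.capitalize() for part in bare.split("-"))
    return [snake, pascal]


print(kebab_candidates("--license-key"))  # ['license_key', 'LicenseKey']
```

In the fixture, `license_key` resolves as a `run_command` parameter, so the flag token passes without the source ever containing the literal `--license-key`.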
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/code-grounding-filters.yaml b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/code-grounding-filters.yaml new file mode 100644 index 0000000..013b511 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/code-grounding-filters.yaml @@ -0,0 +1,6 @@ +filters: + - name: typer_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: ["Opt"] diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/expected-output.json new file mode 100644 index 0000000..4a753ed --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/expected-output.json @@ -0,0 +1,43 @@ +{ + "need_id": "CREQ_cli_license_options", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": "n/a", + "reason": "no verb-prefixed symbol; 'LicenseContext' is sentence-internal without a structural verb" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": true, + "evidence": "4 backtick tokens checked: '--license-key' \u2713 (tailored filter 'typer_kebab' \u2192 'license_key' resolves as parameter), '--license-user' \u2713 (\u2192 'license_user'), '--license-stage' \u2713 (\u2192 'license_stage'), 'LicenseContext' \u2713 (class def present, universal substring match)" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded 
quantifier matches the narrow noun list" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "run_command is a single-branch constructor; CREQ is a single shall-clause enumerating three parameters \u2014 matches the source shape" + } + }, + "overall": "pass", + "blockers": [], + "actions": [] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-creq.rst new file mode 100644 index 0000000..6bd4f10 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-creq.rst @@ -0,0 +1,8 @@ +.. comp_req:: License-aware CLI options + :id: CREQ_cli_license_options + :status: draft + :source_doc: input-source.py + + The CLI shall expose ``--license-key``, ``--license-user``, and + ``--license-stage`` options that map to a ``LicenseContext`` + bundled into every command invocation. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-source.py new file mode 100644 index 0000000..f948743 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/typer-kebab-filter/input-source.py @@ -0,0 +1,18 @@ +"""Typer-style CLI with snake_case parameters rendered as kebab flags.""" + +from dataclasses import dataclass + + +@dataclass +class LicenseContext: + license_key: str + license_user: str + license_stage: str + + +def run_command( + license_key: str = "", + license_user: str = "", + license_stage: str = "", +) -> LicenseContext: + return LicenseContext(license_key, license_user, license_stage) diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/README.md b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/README.md new file mode 100644 index 0000000..5763ce7 --- /dev/null +++ 
b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/README.md @@ -0,0 +1,3 @@ +# unbounded-all + +Demonstrates the unbounded-quantifier failure mode. The CREQ says "all validation errors" but the source has thirteen distinct validator functions and the CREQ does not enumerate them. `quantifier_enumerated` fails — the narrow `all + errors` pattern matched, and neither an enumeration colon nor the `namely / specifically / including` markers appear in the same or next sentence. Remediation enumerates the 13 validators by name or splits the CREQ. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/expected-output.json new file mode 100644 index 0000000..0493842 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/expected-output.json @@ -0,0 +1,47 @@ +{ + "need_id": "CREQ_validation_pipeline", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "validate found as def in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": true + }, + "quantifier_enumerated": { + "passed": false, + "evidence": "'all validation errors' \u2014 unbounded 'all' before 'errors' without enumeration colon, 'namely / specifically / including' marker, or list directive in the same or next sentence" + }, + "source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 2, + "evidence": "validate iterates 13 validators; CREQ groups them under a 
single quantifier rather than enumerating \u2014 justified umbrella, but missing enumeration elsewhere" + } + }, + "overall": "fail", + "blockers": [ + "quantifier_enumerated" + ], + "actions": [ + "Enumerate the 13 validators (e.g. 'namely: sku, quantity_positive, quantity_int, price_positive, price_float, currency_set, currency_iso, supplier_set, supplier_known, category_set, category_known, timestamp_set, timestamp_iso') or split the CREQ into one shall-clause per validator class" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-creq.rst new file mode 100644 index 0000000..9f9b69d --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-creq.rst @@ -0,0 +1,6 @@ +.. comp_req:: Validation pipeline + :id: CREQ_validation_pipeline + :status: draft + :source_doc: input-source.py + + The validate function shall report all validation errors before returning. 
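The `quantifier_enumerated` axis this fixture fails can be sketched like this. The noun list and marker set are illustrative assumptions; the skill's canonical tables may differ.

```python
import re

# Illustrative narrow noun list and enumeration markers — assumptions, not the real tables.
NARROW_NOUNS = "errors|validators|fields|outcomes"
UNBOUNDED = re.compile(rf"\ball\s+(?:\w+\s+)?(?:{NARROW_NOUNS})\b", re.IGNORECASE)
ENUM_MARKER = re.compile(r"\b(?:namely|specifically|including)\b|:", re.IGNORECASE)


def quantifier_enumerated(body: str) -> bool:
    """Pass when no narrow unbounded quantifier matches, or an enumeration follows one."""
    match = UNBOUNDED.search(body)
    if match is None:
        return True  # axis would be n/a; treated as a pass in this sketch
    # Require an enumeration marker somewhere after the quantifier phrase.
    return ENUM_MARKER.search(body[match.end():]) is not None
```

Note the sketch collapses the spec's "same or next sentence" window into "anywhere after the match" for brevity; a faithful implementation would bound the search to that window.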
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-source.py new file mode 100644 index 0000000..9d0e57c --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/unbounded-all/input-source.py @@ -0,0 +1,78 @@ +"""Validator pipeline with thirteen distinct validators.""" + + +def _check_sku(row): + return row.get("sku") and True + + +def _check_quantity_positive(row): + return row.get("quantity", 0) > 0 + + +def _check_quantity_int(row): + return isinstance(row.get("quantity"), int) + + +def _check_price_positive(row): + return row.get("price", 0) > 0 + + +def _check_price_float(row): + return isinstance(row.get("price"), (int, float)) + + +def _check_currency_set(row): + return "currency" in row + + +def _check_currency_iso(row): + return row.get("currency", "") in {"USD", "EUR", "GBP"} + + +def _check_supplier_set(row): + return "supplier" in row + + +def _check_supplier_known(row): + return row.get("supplier", "") != "" + + +def _check_category_set(row): + return "category" in row + + +def _check_category_known(row): + return row.get("category", "") in {"A", "B", "C"} + + +def _check_timestamp_set(row): + return "timestamp" in row + + +def _check_timestamp_iso(row): + return str(row.get("timestamp", "")).count("-") == 2 + + +VALIDATORS = [ + _check_sku, + _check_quantity_positive, + _check_quantity_int, + _check_price_positive, + _check_price_float, + _check_currency_set, + _check_currency_iso, + _check_supplier_set, + _check_supplier_known, + _check_category_set, + _check_category_known, + _check_timestamp_set, + _check_timestamp_iso, +] + + +def validate(row: dict) -> list[str]: + errors: list[str] = [] + for fn in VALIDATORS: + if not fn(row): + errors.append(fn.__name__) + return errors diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/README.md 
b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/README.md new file mode 100644 index 0000000..65fe9ba --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/README.md @@ -0,0 +1,3 @@ +# weasel-adjectives + +Demonstrates the weasel-adjective failure mode. The CREQ body contains three blacklisted adjectives — `structured`, `full`, `comprehensive` — that imply mechanised behaviour without grounding. `no_weasel_adjectives` fails with three evidence lines, one per match. The named function exists, so other mechanical axes pass. Remediation replaces each weasel word with a concrete term. diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/expected-output.json b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/expected-output.json new file mode 100644 index 0000000..a4b717c --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/expected-output.json @@ -0,0 +1,50 @@ +{ + "need_id": "CREQ_diagnostic_report", + "source_doc": "input-source.py", + "axes": { + "exception_raise_sites_exist": { + "passed": "n/a", + "reason": "no exception raise claim in body" + }, + "trigger_condition_literal_match": { + "passed": "n/a", + "reason": "no literal == / is trigger in body" + }, + "named_symbol_exists": { + "passed": true, + "evidence": "write_report found as def in source" + }, + "type_framework_matches_imports": { + "passed": "n/a", + "reason": "no type-framework claim in body" + }, + "backtick_symbol_in_source_doc": { + "passed": "n/a", + "reason": "no backtick-quoted tokens in body" + }, + "no_weasel_adjectives": { + "passed": false, + "evidence": "'structured diagnostic' \u2014 'structured' blacklisted; 'full set' \u2014 'full' blacklisted; 'comprehensive per-row detail' \u2014 'comprehensive' blacklisted" + }, + "quantifier_enumerated": { + "passed": "n/a", + "reason": "no unbounded quantifier matches the narrow noun list" + }, + 
"source_doc_resolves": { + "passed": true + }, + "branch_count_aligned": { + "score": 3, + "evidence": "function write_report has 1 branch (unconditional write); CREQ matches" + } + }, + "overall": "fail", + "blockers": [ + "no_weasel_adjectives" + ], + "actions": [ + "Replace 'structured diagnostic' with a concrete term (e.g. list[dict] of per-row outcomes)", + "Replace 'full set' with enumerated outcomes or a specific count", + "Replace 'comprehensive per-row detail' with named fields (sku, quantity, price, status)" + ] +} diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-creq.rst b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-creq.rst new file mode 100644 index 0000000..f5919d5 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-creq.rst @@ -0,0 +1,7 @@ +.. comp_req:: Write structured diagnostic report + :id: CREQ_diagnostic_report + :status: draft + :source_doc: input-source.py + + The write_report function shall emit a structured diagnostic report + covering the full set of parse outcomes with comprehensive per-row detail. 
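The adjective scan behind `no_weasel_adjectives` can be sketched as below. The blacklist shown is a guessed subset; the tailored list in the consuming project governs.

```python
# Guessed subset of the blacklist — the project's tailored list is authoritative.
WEASEL_ADJECTIVES = {"structured", "full", "comprehensive", "robust", "appropriate"}


def weasel_hits(body: str) -> list[str]:
    """Return blacklisted adjectives in body order, one entry per occurrence."""
    tokens = (word.strip(".,;:()").lower() for word in body.split())
    return [t for t in tokens if t in WEASEL_ADJECTIVES]
```

One entry per match keeps the evidence line count aligned with the fixture's expectation of three findings.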
diff --git a/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-source.py b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-source.py new file mode 100644 index 0000000..508be50 --- /dev/null +++ b/skills/pharaoh-req-code-grounding-check/fixtures/weasel-adjectives/input-source.py @@ -0,0 +1,8 @@ +"""Diagnostic report writer — emits a list[dict] per row.""" + +import json + + +def write_report(rows: list[dict], out_path: str) -> None: + with open(out_path, "w", encoding="utf-8") as fh: + json.dump(rows, fh, indent=2) diff --git a/skills/pharaoh-req-codelink-annotate/SKILL.md b/skills/pharaoh-req-codelink-annotate/SKILL.md new file mode 100644 index 0000000..95808c8 --- /dev/null +++ b/skills/pharaoh-req-codelink-annotate/SKILL.md @@ -0,0 +1,323 @@ +--- +name: pharaoh-req-codelink-annotate +description: Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. Two modes — `codelinks` (sphinx-codelinks-compatible multi-field `@ title, id, type, [links]` form; the comment IS the need) and `backref` (minimal `@req ID: title` pointer back to an RST-hosted need). Mode is tailored via `ubproject.toml` / `pharaoh.toml`, not hardcoded. +--- + +# pharaoh-req-codelink-annotate + +## When to use + +Invoke when you want a source file to carry a machine- and human-readable reference to a requirement. The resulting comment either: + +- **`codelinks` mode** — contains the full need definition in sphinx-codelinks one-line format (e.g. `# @ Write CSV header row, CREQ_csv_export_01, comp_req, [FEAT_csv_export]`). When the project builds Sphinx with `sphinx_codelinks` loaded, this comment becomes an actual need directive at render time. The comment IS the source of truth. +- **`backref` mode** — contains only a pointer to a need defined elsewhere (e.g. `# @req CREQ_csv_export_01: Write CSV header row`). 
The RST block (e.g. from `pharaoh-req-from-code`) is the source of truth; the comment exists for grep/IDE navigation only. + +Mode is not a per-call preference — it is a project-level decision baked into `ubproject.toml` / `pharaoh.toml` (see Tailoring awareness). The caller may override per invocation, but the default MUST come from tailoring so that the whole codebase stays consistent. + +Do NOT use to generate the requirement's text (that is `pharaoh-req-from-code` with appropriate `emit` mode). Do NOT use to delete or update comments after the fact — this skill only inserts, per atomicity. + +## Tailoring awareness — mode resolution + +The skill resolves `mode` in this order: + +1. Input `mode_override` parameter (per-call, highest precedence). +2. `pharaoh.toml` → `[pharaoh.codelink_comments].mode` (explicit project choice). +3. Auto-detect: if `ubproject.toml` contains `[codelinks.projects.*]` (i.e. sphinx-codelinks is in use) → `"codelinks"`; otherwise `"backref"`. +4. Fallback: `"backref"` (conservative — a dumb grep-able pointer is safe even if the project later adopts sphinx-codelinks). + +### `codelinks` mode configuration + +Comes from `ubproject.toml` under `[codelinks.projects.<name>.analyse.oneline_comment_style]` — exactly the table sphinx-codelinks itself reads. The skill does NOT re-invent this schema; it reads the same `start_sequence`, `end_sequence`, `field_split_char`, and `needs_fields` that sphinx-codelinks will use to parse the comment at build time. This guarantees round-trip safety: what this skill writes, sphinx-codelinks reads. + +The caller must indicate which codelinks project the file belongs to, via input `codelinks_project_name`. If omitted, the skill tries to infer from `file_path` + each project's `source_discover.src_dir`; if exactly one project matches, use it; if zero or multiple match → FAIL asking the caller to be explicit. This keeps the skill atomic (no hidden "which project" guessing) while staying ergonomic. 
+ +### `backref` mode configuration + +Comes from `pharaoh.toml`: + +```toml +[pharaoh.codelink_comments] +mode = "backref" +prefix = "@req" # marker for grep +format = "{prefix} {id}: {title}" # template +``` + +If `pharaoh.codelink_comments` is absent, defaults: `prefix = "@req"`, `format = "{prefix} {id}: {title}"`. + +### `check → propose → confirm` + +If `on_missing_config == "prompt"` (default) AND no tailoring is found (neither `pharaoh.codelink_comments` nor `[codelinks.projects.*]`), the skill does NOT silent-default. It returns a structured proposal object: + +```json +{ + "status": "needs_confirmation", + "proposal": { + "mode": "backref", + "rationale": "No [codelinks.projects.*] table in ubproject.toml — project does not appear to use sphinx-codelinks. Proposing minimal backref mode.", + "tailoring_patch": { + "target_file": "pharaoh.toml", + "section": "[pharaoh.codelink_comments]", + "patch": {"mode": "backref", "prefix": "@req", "format": "{prefix} {id}: {title}"} + } + } +} +``` + +The caller confirms (humans or an outer LLM), the tailoring gets written (typically via `pharaoh-tailor-fill`), and the skill is re-invoked — now finding the config and proceeding silently with `on_missing_config="use_default"` semantics. + +### Language-to-comment-syntax mapping + +Derived from file extension (both modes): + +| Extension | Prefix | +|---|---| +| `.py`, `.rb`, `.sh`, `.toml`, `.yaml`, `.yml` | `#` | +| `.c`, `.cpp`, `.cxx`, `.cc`, `.h`, `.hpp`, `.hxx`, `.ts`, `.tsx`, `.js`, `.jsx`, `.rs`, `.go`, `.java`, `.kt`, `.swift`, `.scala`, `.groovy`, `.dart` | `//` | +| `.sql`, `.hs`, `.lua`, `.ada` | `--` | + +Unknown extension → FAIL rather than guess. + +## Atomicity + +- (a) Indivisible — one (req_id, file_path, anchor) triple in → one comment line inserted. No multi-req batching, no req text modification, no RST file modification. 
+- (b) Input: `{req_id: str, req_title: str, req_type: str, file_path: str, anchor: AnchorSpec, project_root: str, parent_links?: list[str], mode_override?: "codelinks"|"backref", codelinks_project_name?: str, on_missing_config?: "fail"|"prompt"|"use_default", dry_run?: bool}`. Output: JSON `{mode_used: "codelinks"|"backref", inserted_line: int, inserted_text: str, file_modified: bool}`. On `dry_run=true` → `file_modified=false`, no write. On `on_missing_config="prompt"` with no tailoring → `{status: "needs_confirmation", proposal: ...}` (see Tailoring awareness). +- (c) Reward: two deterministic fixtures, one per mode. + + **Fixture A — `codelinks` mode** (project tailored with `[codelinks.projects.demo.analyse.oneline_comment_style]` matching sphinx-codelinks defaults: `start_sequence="@"`, `field_split_char=","`, `needs_fields=[title, id, type, links]`). 20-line `.py` file, known anchor, known (req_id, req_title, req_type, parent_links=["FEAT_X"]). Scorer: + 1. File is still syntactically valid Python. + 2. Exactly one line added. + 3. Inserted line starts with `# @ ` (comment prefix + start_sequence + space). + 4. Parsing the inserted line with sphinx-codelinks' own oneline parser (or a faithful reimplementation) yields a need with `title==req_title`, `id==req_id`, `type==req_type`, `links==parent_links`. + 5. `mode_used == "codelinks"` in output. + 6. Idempotent re-run: second invocation detects `req_id` substring, no-op, `file_modified=false`. + + **Fixture B — `backref` mode** (project without `[codelinks.*]` and without `[pharaoh.codelink_comments]`, `on_missing_config="use_default"`). Same file shape. Scorer: + 1. File is still syntactically valid Python. + 2. Exactly one line added. + 3. Inserted line starts with `# @req ` (default backref prefix). + 4. Inserted line contains both `req_id` and `req_title` as substrings. + 5. `mode_used == "backref"` in output. + 6. Idempotent re-run: no-op. 
+ + **Fixture C — prompt mode** (no tailoring, `on_missing_config="prompt"`). Scorer: + 1. Output has `status == "needs_confirmation"`. + 2. Output has `proposal.mode == "backref"` (conservative default). + 3. Output has `proposal.tailoring_patch` pointing at `pharaoh.toml`. + 4. File was NOT modified. + + Pass = all checks in all three fixtures. +- (d) Reusable: any source tree in any supported language; bidirectional trace for reverse-engineered reqs; IDE-navigable "where is this req implemented" queries via grep. +- (e) Composable: strictly one phase (source mutation, one comment). Never modifies RST, never calls `pharaoh-req-from-code` or other skills. A plan emitted by `pharaoh-write-plan` MAY include a foreach task over req-emission outputs that dispatches this skill per req, but this skill itself does not orchestrate. + +## Input + +- `req_id`: the requirement's sphinx-needs ID (e.g. `"CREQ_csv_export_01"`). +- `req_title`: the requirement's short title (e.g. `"Write CSV header row"`). +- `req_type`: the requirement's directive name (e.g. `"comp_req"`, `"impl"`). In `codelinks` mode used for the `type` field. In `backref` mode used only if the tailored `format` template includes `{type}`. +- `file_path`: path to the source file to annotate. Accepts either an absolute path or a path relative to `project_root`; relative paths are joined with `project_root` before the file is opened. +- `anchor`: `AnchorSpec` — one of: + - `{type: "top_of_file"}` — insert after shebang/encoding lines but before any other content. + - `{type: "before_symbol", symbol: "<name>"}` — insert immediately before the line where `<name>` is defined. Regex-based detection, not AST-level. + - `{type: "before_line", line: <n>}` — insert before line `<n>` (1-indexed). +- `project_root`: absolute path to the consumer project's root. Used to locate `ubproject.toml` (for `codelinks` mode config) and `pharaoh.toml` (for `backref` mode config and mode selection). 
+- `parent_links` (optional): list of parent IDs (e.g. `["FEAT_csv_export"]`). In `codelinks` mode used for the `links` field verbatim. In `backref` mode used only if the tailored `format` template includes `{parent_links}`. +- `mode_override` (optional): `"codelinks"` or `"backref"`. Forces mode for this call. If omitted, resolution follows the order in Tailoring awareness. +- `codelinks_project_name` (optional): which entry under `[codelinks.projects.*]` in `ubproject.toml` this file belongs to. Required when multiple projects are defined and their `source_discover.src_dir` values would both match `file_path`. Ignored in `backref` mode. +- `on_missing_config` (optional): `"fail" | "prompt" | "use_default"`. Default `"prompt"`. Determines behavior when tailoring is missing (see Tailoring awareness). +- `dry_run` (optional): if `true`, skill returns what WOULD be written without touching the file. Default `false`. + +## Output + +A single JSON object, one of three shapes: + +**Success shape (file modified):** + +```json +{ + "mode_used": "backref", + "inserted_line": 15, + "inserted_text": "# @req CREQ_csv_export_01: Write CSV header row", + "file_modified": true +} +``` + +**Idempotent re-run (comment already present):** + +```json +{ + "mode_used": "backref", + "inserted_line": 15, + "inserted_text": "# @req CREQ_csv_export_01: Write CSV header row", + "file_modified": false +} +``` + +**Needs-confirmation (no tailoring, `on_missing_config="prompt"`):** + +```json +{ + "status": "needs_confirmation", + "proposal": { + "mode": "backref", + "rationale": "No [codelinks.projects.*] table in ubproject.toml — proposing minimal backref mode.", + "tailoring_patch": { + "target_file": "pharaoh.toml", + "section": "[pharaoh.codelink_comments]", + "patch": {"mode": "backref", "prefix": "@req", "format": "{prefix} {id}: {title}"} + } + } +} +``` + +On `dry_run=true`, `file_modified` is always `false`. + +## Output schema + +Output must parse as JSON via `json.loads`. 
Validator checks one of two shapes: + +**Success shape:** +- Required keys: `mode_used` (one of `"codelinks"`, `"backref"`), `inserted_line` (int ≥ 1), `inserted_text` (non-empty str), `file_modified` (bool). +- Unknown keys are permitted and surface as a warning, not a rejection, to allow forward-compatible evolution. + +**Needs-confirmation shape:** +- Required keys: `status == "needs_confirmation"`, `proposal` (mapping). See `## Tailoring awareness` for proposal details. +- Mutually exclusive with the success shape (a response has one or the other, never both). + +## Process + +### Step 1: Resolve mode + +Resolve `mode` per the Tailoring awareness order: +1. `mode_override` → use directly. +2. `pharaoh.toml [pharaoh.codelink_comments].mode` → use. +3. `ubproject.toml` has any `[codelinks.projects.*]` table → `"codelinks"`. +4. Fallback → `"backref"`. + +If resolution step 2 and 3 both yield nothing AND `on_missing_config == "prompt"` → emit the `needs_confirmation` proposal object described in Tailoring awareness and return without modifying the file. + +If `on_missing_config == "fail"` and no config → FAIL. + +If `on_missing_config == "use_default"` and no config → proceed with `"backref"` silently. + +### Step 2a: Format the comment — `codelinks` mode + +1. Determine the codelinks project: use `codelinks_project_name` if provided; else infer by matching `file_path` against each `[codelinks.projects.*].source_discover.src_dir`. If exactly one matches, use that project. If zero or multiple → FAIL. +2. Read `[codelinks.projects.<name>.analyse.oneline_comment_style]` from `ubproject.toml`: + - `start_sequence` (e.g. `"@"`) + - `end_sequence` (optional; default empty) + - `field_split_char` (e.g. `","`) + - `needs_fields`: ordered list of `{name, type?, default?}` entries. +3. 
Build the field values in declared order: + - For each field, pick the value from the mapping: `title`=`req_title`, `id`=`req_id`, `type`=`req_type`, `links`=`parent_links` (rendered as `[a, b, c]` per sphinx-codelinks list-of-strings syntax). + - If a declared field has no matching value AND has a `default` → omit (sphinx-codelinks will fill it). Else FAIL naming the missing field. + - Escape `field_split_char` and `[`/`]` characters in string values per sphinx-codelinks escaping rules (backslash prefix). +4. Join fields with `" " + field_split_char + " "` (a space, separator, space — matching sphinx-codelinks' own formatting). +5. Prepend `start_sequence + " "` (e.g. `"@ "`). +6. Append `end_sequence` if non-empty. +7. The result is the comment text body. Example: `"@ Write CSV header row, CREQ_csv_export_01, comp_req, [FEAT_csv_export]"`. + +### Step 2b: Format the comment — `backref` mode + +Read `<project_root>/pharaoh.toml` if present. Extract `[pharaoh.codelink_comments]`: +- `prefix` (default `"@req"`) +- `format` (default `"{prefix} {id}: {title}"`) + +Substitute placeholders: +- `{prefix}` → the prefix value +- `{id}` → `req_id` +- `{title}` → `req_title` +- `{type}` → `req_type` +- `{parent_links}` → `", ".join(parent_links)` or empty string if not provided + +### Step 3: Resolve comment syntax from file extension + +Map file extension → comment prefix: + +| Extension | Prefix | +|---|---| +| `.py`, `.rb`, `.sh`, `.toml`, `.yaml`, `.yml` | `#` | +| `.c`, `.cpp`, `.cxx`, `.cc`, `.h`, `.hpp`, `.hxx`, `.ts`, `.tsx`, `.js`, `.jsx`, `.rs`, `.go`, `.java`, `.kt`, `.swift`, `.scala`, `.groovy`, `.dart` | `//` | +| `.sql`, `.hs`, `.lua`, `.ada` | `--` | + +Unknown extension → FAIL: `"Cannot determine comment syntax for extension <ext>. Add the extension to the mapping or pass a file with a known extension."`. + +The inserted line is: `<comment_prefix> <formatted_text>`. + +### Step 4: Read the file + +Read `file_path`. 
Split into lines (preserve trailing newline state for round-trip). + +### Step 5: Idempotency check + +Scan every line for `req_id` as a substring. If any line contains it, the comment is already present (or a different reference to the same req exists). Return without modifying the file: + +```json +{"mode_used": "<resolved_mode>", "inserted_line": <matched_line_index>, "inserted_text": "<matched_line>", "file_modified": false} +``` + +This is the idempotency guarantee from reward check #6. Do NOT re-insert, do NOT duplicate, do NOT update a stale title (if the title changed, the human is responsible for deleting the old line; auto-update risks corrupting hand-edited comments). + +### Step 6: Resolve anchor to a line index + +- `top_of_file`: skip shebang (`#!`) and encoding lines (`# -*- coding: ... -*-`, `# coding: ...`), stop at the first content line. Insert immediately before it. +- `before_symbol`: regex-scan for the symbol's declaration line. Language-specific patterns: + - Python: `^\s*(def|class|async\s+def)\s+<name>\b` + - JS/TS: `^\s*(function|class|const|let|var)\s+<name>\b`, also `^\s*<name>\s*(?::|=)` for object-method shorthand + - Rust: `^\s*(pub\s+)?(fn|struct|enum|trait|impl)\s+<name>\b` + - Go: `^\s*func\s+(\([^)]*\)\s+)?<name>\b`, also `^\s*(type\s+<name>\b|var\s+<name>\b)` + - C/C++: `^\s*[\w:*&<>\s]+\s+<name>\s*\(` (function) or `^\s*(class|struct|enum)\s+<name>\b` (type) + + If multiple matches, warn and use the first. If zero matches, FAIL: `"Symbol <name> not found in <file_path>."`. +- `before_line`: validate `1 <= line <= len(lines) + 1`. FAIL if out of range. + +### Step 7: Insert the comment + +If `dry_run=true`, return without writing: + +```json +{"mode_used": "<resolved_mode>", "inserted_line": <resolved_line>, "inserted_text": "<formatted_comment>", "file_modified": false} +``` + +Otherwise, insert the comment line at the resolved index. Preserve indentation — the comment inherits the indentation of the line it precedes (so it stays at the correct nesting level for `before_symbol` anchors).
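The indentation-inheriting insert can be sketched as follows, on a list-of-lines model. The helper name is hypothetical; it is a sketch of the behavior, not the skill's code.

```python
import re


def insert_comment(lines: list[str], index: int, comment_body: str) -> list[str]:
    """Insert comment_body before lines[index], inheriting that line's indentation."""
    indent = ""
    if index < len(lines):
        # "[ \t]*" always matches (possibly empty), so .group(0) is safe.
        indent = re.match(r"[ \t]*", lines[index]).group(0)
    return lines[:index] + [indent + comment_body] + lines[index:]
```

A comment inserted before a nested `def` thus lands at the method's own indentation level rather than at column zero.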
+ +Write the file back. Preserve original EOL convention (LF vs CRLF) and final-newline state. + +### Step 8: Return + +Return the JSON object per the Output shape with `file_modified=true`. + +## Last step + +No dedicated `*-review` atom exists for codelink annotation; the operation is a one-line insert whose correctness is structural rather than prose-judgement. This skill therefore performs an inline self-verification in Step 8 before returning `file_modified=true`: + +1. Re-read `file_path` and confirm the line at `inserted_line` is byte-for-byte equal to `inserted_text`. +2. Confirm the file has at most one line starting with the tailored `start_sequence + <separator>` bearing `req_id` (idempotence — a subsequent run with the same inputs must be a no-op, not a duplicate insert). +3. If either check fails, roll back the write (restore the original file) and return `status: "failed"` with evidence. + +Coverage is mechanically enforced at plan level by `pharaoh-quality-gate`'s `link_types_covered` invariant (verifies every required link type referenced by the project's artefact catalog has at least one non-empty value across the emitted corpus). See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale. + +## Failure modes + +- `file_path` not readable → FAIL: `"file not readable: <path>"`. +- Unknown extension → FAIL per Step 3. +- Symbol not found for `before_symbol` anchor → FAIL per Step 6. +- Line out of range for `before_line` anchor → FAIL per Step 6. +- File is in `.git/`, `node_modules/`, `__pycache__/`, or a build output directory (detected by path segment) → FAIL: `"Refusing to annotate generated/vendored file: <path>"`. This protects against accidental writes into machine-generated code. + +## Non-goals + +- No AST-level insertion — regex is deliberately simple to keep the skill language-agnostic.
Callers who need AST-precise placement should use a language-specific tool and pass `before_line` with the exact line number. +- No multi-comment insertion in one call — this skill is atomic per (req, file, anchor). Callers who need N comments make N calls. +- No cross-file traceability validation — a separate skill (`pharaoh-codelink-validate`, not yet implemented) will scan for orphan back-references whose req no longer exists. +- No removal — if a req is deleted, the comment stays until a human or a dedicated cleanup skill removes it. This skill only inserts. + +## Composition + +The typical flow: + +1. A plan emitted by `pharaoh-write-plan` includes a foreach task over req-from-code outputs; `pharaoh-execute-plan` dispatches N `pharaoh-req-from-code` instances that generate `comp_req` blocks. +2. Caller (human or the same plan) reviews and accepts the reqs. +3. A downstream foreach task in the same plan (typically `id: codelink_annotate`) dispatches this skill per accepted req with: + - `req_id`, `req_title`, `req_type` from the RST block + - `file_path` = the source file the req was derived from (available from the `req-from-code:<filename>` `reporter_id` used by the upstream task) + - `anchor` = `{type: "top_of_file"}` for coarse placement, or `{type: "before_symbol", symbol: <primary_symbol>}` when the req is clearly about a specific function/class. diff --git a/skills/pharaoh-req-draft/SKILL.md b/skills/pharaoh-req-draft/SKILL.md index 049f7ee..bd484de 100644 --- a/skills/pharaoh-req-draft/SKILL.md +++ b/skills/pharaoh-req-draft/SKILL.md @@ -67,7 +67,7 @@ Read the entry for the resolved prefix key. Record: - `optional_fields` — fields that may appear - `lifecycle` — valid values for `:status:` -For Score `gd_req`: required = `[id, status, satisfies]`; optional = `[complies, tags, rationale, verification]`; lifecycle = `[draft, valid, inspected]`. 
+Built-in default profile (bundled example): required = `[id, status, satisfies]`; optional = `[complies, tags, rationale, verification]`; lifecycle = `[draft, valid, inspected]`. **1c. `checklists/requirement.md`** @@ -161,16 +161,19 @@ Write a single sentence that: 3. Specifies a condition or measurable criterion where the feature_context provides one 4. Contains no coordinating conjunctions (`and`, `or`, `but`) within the `shall` clause 5. Does not interpret or expand the feature_context beyond what is stated — if the context is too vague to write a specific shall clause, see Guardrails +6. Describes **observable behavior at the component boundary**, not internal mechanism. Do NOT name internal methods, classes, private variables, field names, or module-local symbols inside the shall body. External API names (published HTTP routes, CLI flags, pypi packages, protocol names, algorithm names) ARE observable and are fine. Rationale: the prior dogfooding audit showed ~7% (3/40) of LLM-drafted shall clauses named internal symbols AND got the described mechanism wrong — internal-name mentions rot on rename and are a primary accuracy-failure class. Keep traceability to internal symbols in `pharaoh-req-codelink-annotate` output, not in the shall body. Good patterns: - `The <system> shall <action> when <condition>.` - `The <system> shall <action> within <measurable criterion>.` - `The <component> shall <provide/reject/signal> <object> <constraint>.` +- `The exporter shall use HMAC-SHA256 to sign each outgoing request.` ← algorithm is observable at the boundary, fine to name. 
Bad patterns (reject these in Step 6): - Two verbs joined by `and`: `The system shall detect and report...` → FAIL - Implicit plural: `The system shall check all sensors...` → acceptable only if "all" is intentional scope - Vague quantity: `The system shall respond quickly` → too vague; note in output +- Internal symbol in the shall body: `The system shall use global_id to drive create-vs-update.` → internal variable name, will rot; rewrite as `The exporter shall decide create-vs-update based on whether a tracked identifier is already known for the need.` --- @@ -194,6 +197,17 @@ Extract the shall clause (text from `shall` to end of sentence). Check for `, an If found: split into the primary action only. Re-draft. +**Check B.bis — no internal symbol in shall clause** + +Flag any of the following patterns inside the shall body: +- An identifier in backticks (`` `foo_bar` ``) that is NOT one of the external-surface classes whitelisted for backticks: CLI flags (``--host``), env vars (``APP_LICENSE_KEY``), TOML config keys / section headers (``[myapp.export_config]``, ``links_delimiter``), protocol tokens (``HMAC-SHA256``), HTTP routes (``/itemtypes``). Internal function / method names, private variables, and implementation identifiers (`lower_snake` / `camelCase` symbols not in the whitelist) stay bare. +- A function-call shape like `some_func(...)` in the shall body. +- A private-looking variable reference (`self.x`, `obj._y`). + +The whitelist mirrors `pharaoh-req-from-code` Rule 7. Project-internal TOML keys that never appear in public docs are still acceptable in backticks — a tester must copy-paste them verbatim — so do NOT require public-doc evidence on the whitelisted classes. + +If found, re-draft to describe the observable behavior without naming the internal symbol (see Step 5 guideline 6). 
After 2 retries, emit with `[DIAGNOSTIC] shall body names internal symbol — post-emit revision required.` + **Check C — parent resolves** Confirm `satisfies` ID is present in needs.json (already checked in Step 3, re-confirm before emit). @@ -205,7 +219,7 @@ Confirm chosen ID does not appear in needs.json (already checked in Step 4, re-c **Check E — required fields present** Verify the directive block includes every field from `required_fields` in artefact-catalog.yaml. -For Score `gd_req`: `id`, `status`, `satisfies` must all be present. +For the built-in default profile: `id`, `status`, `satisfies` must all be present. --- @@ -332,3 +346,9 @@ Do not show this if the emit included a `[DIAGNOSTIC]` (the user has a more urge Consider running `pharaoh-req-review gd_req__abs_pump_activation` to audit against ISO 26262-8 §6 axes. ``` + +## Last step + +After emitting the artefact, invoke `pharaoh-req-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-req-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/skills/pharaoh-req-from-code/SKILL.md b/skills/pharaoh-req-from-code/SKILL.md index 0be27df..0dffceb 100644 --- a/skills/pharaoh-req-from-code/SKILL.md +++ b/skills/pharaoh-req-from-code/SKILL.md @@ -1,99 +1,293 @@ --- name: pharaoh-req-from-code -description: Use when reading one source file and emitting one or more comp_req RST directives describing the observable behavior in that file. 
Queries shared Papyrus for canonical terms before naming concepts; writes newly surfaced concepts back. Does not draft architecture, plans, or FMEA. +description: Use when reading one source file and emitting one or more requirement RST directives (typed by `target_level`) describing the observable behavior in that file. Queries shared Papyrus for canonical terms before naming concepts; writes newly surfaced concepts back. Does not draft architecture, plans, or FMEA. --- # pharaoh-req-from-code +## Shall-clause rules + +The seven rules below govern what a CREQ's body looks like. All seven apply to every emission; violations of any rule mean the emission is a draft, not a valid CREQ. + +### Rule 1 — Subject is the component (or an external actor) + +The grammatical subject of the shall clause is either: + +1. the component / capability (e.g. "The CSV Importer", "The export CLI", "The API client"). The component name comes from the parent feat title, from a named role inside the feat scope, or from the project's `artefact-catalog.yaml`. +2. an external actor (user, operator, caller, third-party service) whose action the component shall respond to — acceptable when the feat is an interactive CLI, an API, or any user-facing interface. Example: "The authenticated user shall receive a non-zero exit code on malformed input." + +Never a Python function, class, private method, or file. 
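+Rule 1 is partly mechanizable. A minimal sketch of a subject check, assuming the shall clause has been extracted as a plain string — the regex is illustrative, not the skill's normative grammar:

```python
import re

# Accepts component-phrase subjects ("The CSV Importer shall ...") and
# external-actor subjects ("The authenticated user shall ..."); rejects
# code-identifier subjects (backticked names, dotted paths, snake_case
# symbols). Digits and underscores are excluded from subject words on
# purpose: component names in prose do not carry them.
SUBJECT = re.compile(
    r"^The [A-Za-z][A-Za-z-]*(?: [A-Za-z][A-Za-z-]*)* shall\b"
)

def subject_ok(clause: str) -> bool:
    """True when the clause's grammatical subject looks like a component
    or an external actor rather than a code symbol."""
    return bool(SUBJECT.match(clause.strip()))
```

A heuristic sketch only: it will not catch a component name that happens to contain an underscore, and it accepts any capitalized noun phrase after "The".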
+ +Code-narration subjects (❌) vs component subjects (✅): + +| Bad | Good | +|---|---| +| ``from_source_a`` shall call ``check_license`` | The Source A Connector shall reject unlicensed use | +| ``_apply_cli_overrides`` shall override credentials | The Source A CLI shall accept server credentials on the command line | +| ``FooClient._classify_connection_error`` shall raise ``FooAuthenticationError`` on HTTP 401 | The Source A Connector shall signal an authentication failure when the server rejects the configured credentials | +| ``ItemLoader`` shall load items from ``ImporterConfig.input_path`` via ``load_items`` | The Importer shall read items from the configured input path | + +Tests: the component form is falsifiable by a tester who can't read the source (black-box), stable across refactors (renaming `_apply_cli_overrides` does not invalidate the CREQ), and traceable to the feat via `:satisfies:`. + +### Rule 2 — No internal implementation details in the body + +No internal / private function names, no leading-`_` methods, no class-dot-method references, no file paths, no line numbers ever (`around line 165`, `in commands/foo.py`, `at the top of the module` — all banned). Traceability to code lives in `:source_doc:`; the shall clause carries behavior. These two jobs stay separate. + +Known prior failures this rule catches: + +- `"record_key drives the create-vs-update decision"` — names an internal field AND gets the mechanism wrong (actual decision variable is a different id attribute on the same record). +- `"parse_timestamp raises on unparseable input"` — names an internal function AND misstates the behavior (the function returns `None`; the caller raises). + +A clean behavioral shall with zero backticks and one `:source_doc:` is preferred over a code-narration shall with ten backticks. 
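+A rough lint for Rule 2 violations can be sketched as follows — the pattern list is illustrative and deliberately conservative, not the normative definition of "internal detail":

```python
import re

# Each pattern flags one class of internal detail banned from shall bodies.
BANNED_PATTERNS = [
    re.compile(r"\b_[a-z]\w*"),                    # leading-underscore helpers
    re.compile(r"\b[A-Z]\w*\.\w+"),                # Class.method references
    re.compile(r"\b[\w/-]+\.(?:py|rs|ts|go|c|cpp|java)\b"),  # file paths
    re.compile(r"\bline \d+", re.IGNORECASE),      # line-number references
]

def rule2_hits(shall_body: str) -> list[str]:
    """Return the patterns that matched; an empty list means the body is
    clean with respect to this (incomplete) heuristic."""
    return [p.pattern for p in BANNED_PATTERNS if p.search(shall_body)]
```

Usage note: a non-empty result marks the emission as a draft per the rule above, not an automatic rejection — a human reviewer decides whether a flagged token is actually an external-surface identifier.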
+ +### Rule 3 — `:source_doc:` must point at the implementing source code file + +Every emitted CREQ carries `:source_doc:` pointing at a real source file — typically `.py`, `.rs`, `.ts`, `.go`, `.c`, `.cpp`, `.java` under the project's source tree (e.g. `src/<project>/csv/csv2needs.py`). Pointing `:source_doc:` at the spec RST file itself or at a prose feature doc is a validation failure — the spec RST is where the requirement lives, not where the behavior is implemented. + +When a CREQ's behavior spans multiple source files, pick the file that owns the primary observable (usually the converter module, not a CLI dispatcher). `pharaoh-req-code-grounding-check` axis #8 (`source_doc_resolves`) fails if the cited file is the spec RST or missing entirely. + +### Rule 4 — CREQ adds constraint beyond the parent feat + +A CREQ whose shall clause paraphrases the feat capability with the same subject, verb, and scope is tautological and MUST NOT be emitted. + +Test before every emission: *what constraint does this CREQ impose that the feat body alone does not?* Answers that count: a concrete pre-condition, a post-condition, an error contract, a default value, an ordering guarantee, a quantitative bound, a specific field / flag / command name. Answers that don't: just naming a sub-capability in the imperative. + +Bad (tautology) — feat says "The CSV Connector enables bidirectional exchange between Sphinx-Needs and CSV files"; CREQ says "The CSV Importer shall convert a user-supplied CSV file into a needs.json file when the user invokes `<cli> <subcmd> from-csv`" — zero added constraint. + +Good — "The CSV Importer shall fail with a non-zero exit code and a single-line error message on the first row whose mapped `id` column is missing or empty, without writing a partial `needs.json`" — specific precondition, specific post-condition, specific boundary observable. 
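+The `:source_doc:` constraint from Rule 3 reduces to a suffix check plus an existence check; a sketch of the suffix half (existence and repo-root resolution are left to the real grounding-check axis):

```python
from pathlib import Path

# Extensions treated as implementing source code. Citing the spec RST
# itself or a prose feature doc must fail. The set is illustrative, not
# closed -- extend it per project language mix.
SOURCE_EXTS = {".py", ".rs", ".ts", ".go", ".c", ".cpp", ".java"}

def source_doc_suffix_ok(source_doc: str) -> bool:
    return Path(source_doc).suffix in SOURCE_EXTS
```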
+ +### Rule 5 — Enumerate boundary-observable code structures exhaustively + +For each `:source_doc:` file, enumerate and emit one CREQ per boundary-observable structure: + +1. **Every raised exception class that escapes the module's public surface.** "The component emits `FooError` when <condition>." +2. **Every published config key the module reads.** TOML keys, env vars, dataclass fields — one CREQ per key, naming the key and the default. +3. **Every public function or CLI subcommand exposed by the module.** CLI subcommand = one CREQ; exported library function = one CREQ. + +Expected floor per typical connector module (200-500 LOC, 3-8 exception classes, 5-10 config keys, 1-3 public functions): **12-20 CREQs per module**. Under-decomposition below this floor means structures got bundled into compound shalls — split them. + +Each emitted block's body has exactly one `shall` clause. Zero intra-clause conjunctions joining modal-verb phrases (`, and shall` / ` and shall` / ` or raise` / `, or ` — all splits). Multiple observable behaviors = multiple CREQs. Intra-clause conjunctions are a hard fail regardless of behavioral quality; split the block before returning. + +### Rule 6 — `:verification:` field is required + +Every emitted CREQ carries `:verification:` with at minimum the placeholder `tc__TBD`. Absence is a schema failure. If the project uses a different link name for the req→test relation (`verifies`, `covered_by`), declare it in `[[needs.extra_links]]`; the default placeholder stays required. + +### Rule 7 — Backticks are for code / protocol tokens only + +Backticks signal "copy this string verbatim — it is a code symbol, config key, or protocol token". NOT for format acronyms (``CSV``, ``JSON``, ``XML``, ``TOML``, ``HTML``), document-type nouns (``document``, ``file``, ``row``), or emphasis (``default``, ``required``). + +Test: would a tester copy-paste this string into test code or configuration? If yes, backtick it. If not, leave it bare. 
+ +Backticks ARE acceptable on external-surface identifiers: CLI flags (``--host``), env vars (``APP_LICENSE_KEY``), TOML config keys (``[myapp.export_config]``, ``links_delimiter``), HTTP routes (``/itemtypes``), protocol tokens (``HMAC-SHA256``). + +### Config-value citation (see Rule 3 + Rule 7) + +Example: a consumer module that reads `self.config.output_format` cites ``output_format`` (its own reference form), not the default-value literal (``default_format``) which lives in the config module. Citing the default-value form creates a false paper-trail — the grounding-check axis will fail because the shall clause names a symbol the cited file does not contain. + ## When to use -Invoke with a single C++ source file assigned to this agent and (optionally) a shared Papyrus workspace for cross-agent terminology coordination. Emit one comp_req per distinct observable behavior expressed in the file. Do NOT emit reqs for behavior not grounded in the file (that is drafting, not reverse-engineering). Do NOT attempt architecture, verification plans, or FMEA — those are separate skills. +Invoke with a single source file (any language) assigned to this agent and (optionally) a shared Papyrus workspace for cross-agent terminology coordination. Emit one requirement (of type `target_level`) per distinct boundary-observable behavior expressed in the file. Do NOT emit reqs for behavior not grounded in the file (that is drafting, not reverse-engineering). Do NOT attempt architecture, verification plans, or FMEA — those are separate skills. + +## Tailoring awareness + +Two axes are tailored, both read at runtime from the consumer project's `ubproject.toml` / `pharaoh.toml`: + +**Type axis** — need types and ID conventions are project-specific. Read `[[needs.types]]` entries from `ubproject.toml` (or `.pharaoh/project/id-conventions.yaml` if present) — each has `directive` and `prefix`. Do NOT hardcode `comp_req` as the only acceptable type. 
The caller passes `target_level` — use it verbatim as the directive name (in `rst` emit) or as the `type` field (in `codelinks_comment` emit). + +**Emit axis** — whether to emit RST directive blocks or sphinx-codelinks-compatible one-line comments. Resolution order: + +1. `emit_override` input (per-call). +2. `pharaoh.toml [pharaoh.codelink_comments].mode` — `"codelinks"` → `codelinks_comment`; `"backref"` or absent → `rst`. +3. Auto-detect: `ubproject.toml` contains `[codelinks.projects.*]` → `"codelinks_comment"`; otherwise `"rst"`. +4. Fallback: `"rst"`. + +If `on_missing_config == "prompt"` (default) AND tailoring is missing (no `target_level` in `[[needs.types]]`, or emit mode unresolvable), the skill returns `{status: "needs_confirmation", proposal: {...}}` with a tailoring patch the caller can confirm. Caller confirms → tailoring gets written (typically via `pharaoh-tailor-fill`) → re-invoke with `on_missing_config="use_default"` for silent proceed. ## Atomicity -- (a) Indivisible — one file in → N reqs out. No I/O beyond file read + optional Papyrus query/write + RST emit. -- (b) Input: `{file_path: str, target_level: str, shared_context_path?: str, papyrus_workspace?: str, reporter_id: str}`. Output: list of RST `comp_req` directive blocks as strings, separated by blank lines. -- (c) Reward: fixture — given `test_fixture.cpp` containing exactly 3 named types (`FooBar`, `BazQux`, `Quux`), emitted reqs must mention all 3 by canonical name. Deterministic scorer via `concept_extractor`. -- (d) Reusable: any reverse-engineering workflow; standalone CI "are there reqs for this code?" gate; spec drafting. -- (e) Composable: strictly one phase. Never invokes `pharaoh-arch-draft`, `pharaoh-fmea`, `pharaoh-plan`, etc. +- (a) Indivisible — one file in → N reqs out. No I/O beyond file read + optional Papyrus query/write + req emit. Emits in exactly one representation per call (`rst` OR `codelinks_comment`). 
+- (b) Input: `{file_path, target_level, shared_context_path?, papyrus_workspace?, reporter_id, parent_feat_ids?, emit_override?, codelinks_project_name?, on_missing_config?, allowed_ids?, split_strategy?}`. Output: single JSON object `{"reqs": [{"id", "title", "type", "body", "source_doc", "satisfies", "verification", "raw_rst"}, ...]}` for `emit=rst`, or `{"codelinks": [str, ...]}` for `emit=codelinks_comment`. On missing tailoring with `on_missing_config=prompt`: single JSON object `{status: "needs_confirmation", proposal}`. +- (c) Reward: language-parametric fixture — given `test_fixture.<ext>` (`.py` / `.cpp` / `.rs` / `.ts`) containing exactly 3 named symbols (`FooBar`, `BazQux`, `Quux`), emitted reqs must mention all 3 by canonical name. Directive name must equal `target_level`. If `parent_feat_ids` is non-empty, every emitted block MUST contain `:satisfies: <id1>, <id2>, ...` with all parents comma-joined. +- (d) Reusable across reverse-engineering workflows, spec drafting, standalone CI "are there reqs for this code?" gates. +- (e) Composable — strictly one phase. Never invokes `pharaoh-arch-draft`, `pharaoh-fmea`, `pharaoh-plan`. ## Input -- `file_path`: absolute path to the C++ source file to reverse-engineer. -- `target_level`: the requirement artefact prefix (e.g. `comp_req` for component-level). -- `shared_context_path` (optional): path to a companion source file read by all agents in the fan-out (e.g. `common.cpp`). Read but NOT reverse-engineered into reqs by this agent. -- `papyrus_workspace` (optional): path to `.papyrus/` directory for canonical-term coordination. If omitted, the skill operates in no-memory mode (does not call `pharaoh-context-gather` or `pharaoh-decision-record`). -- `reporter_id`: short identifier for this agent (e.g. `req-from-code:health_monitor.cpp`). Passed to `pharaoh-decision-record` calls. +- `file_path`: absolute path to the source file (any language). 
+- `target_level`: requirement artefact directive name as declared in the consumer project's `ubproject.toml` (e.g. `"comp_req"`, `"impl"`, `"spec"`). ID prefix is `target_level` + `__` unless `[[needs.types]].prefix` overrides. +- `shared_context_path` (optional): companion source file read by all agents in the fan-out (e.g. `common.cpp`). Read but NOT reverse-engineered. +- `papyrus_workspace` (optional): path to `.papyrus/` for canonical-term coordination. Absent → no-memory mode (skip Steps 1 and 3). +- `reporter_id`: short identifier for this agent (e.g. `req-from-code:csv2needs.py`). +- `parent_feat_ids` (optional): list of parent feature IDs. When non-empty, every emitted block gets `:satisfies: <id1>, <id2>, ...` comma-joined. +- `allowed_ids` (optional): pre-allocated ID list. When provided, emitter MUST NOT invent IDs outside this list; emits only `len(allowed_ids)` reqs max; overflow logged as a warning comment. +- `split_strategy` (optional): `"single"` (default, whole file as one scope, target 1-5 reqs), `"top_level_symbols"` (per top-level symbol, target 1-3 reqs/symbol), or `"sections"` (per `# ---` / `// ===` horizontal-rule marker, target 1-3 reqs/section). Plans supply this via `${heuristics.split_strategy(...)}`. ## Output -Zero or more RST `comp_req` directive blocks, one behavior per block. Each block: +A single JSON object. The top-level key names the emit mode: `reqs` for `emit=rst`, `codelinks` for `emit=codelinks_comment`. Downstream skills key off the presence of one or the other. + +### `emit=rst` + +```json +{ + "reqs": [ + { + "id": "<id_prefix><snake_case_id>", + "title": "<short_title>", + "type": "<target_level>", + "body": "The <Component subject> shall <observable behavior>.", + "source_doc": "<path to implementing source file>", + "satisfies": ["<parent_1>", "<parent_2>"], + "verification": "tc__TBD", + "raw_rst": ".. 
<target_level>:: <short_title>\n :id: ...\n :status: draft\n :satisfies: ...\n :source_doc: ...\n :verification: tc__TBD\n\n <body>\n" + } + ] +} +``` + +Field semantics: + +- `id` — `<id_prefix><snake_case_id>`. `<id_prefix>` defaults to `target_level` (`comp_req` → `comp_req__foo_01`). If `[[needs.types]].prefix` declares `"CREQ_"`, use `CREQ_foo_01`. +- `type` — equals input `target_level`. +- `satisfies` — list of parent feat ids. Empty list when `parent_feat_ids` was empty. Always present (use `[]`). +- `raw_rst` — exactly the RST directive block as it would appear in an `.rst` file. Downstream review / annotation skills read `raw_rst` when they need the directive text; helpers that consume `reqs` (e.g. by-stem grouping) read `id` / `source_doc`. + +### `emit=codelinks_comment` + +```json +{ + "codelinks": [ + "@ <title>, <id>, <target_level>, [<parent_1>, <parent_2>, ...]" + ] +} +``` + +Each `codelinks[i]` is one comment line matching the project's `[codelinks.projects.<name>.analyse.oneline_comment_style]`: + +- Tailored `start_sequence` (default `@`). +- Tailored `field_split_char` (default `,`) with surrounding spaces. +- Field order matches tailored `needs_fields`. +- Values escaped per sphinx-codelinks rules. +- No language comment prefix (that is `pharaoh-req-codelink-annotate`'s job). + +### `status == "needs_confirmation"` + +When tailoring is missing and `on_missing_config == "prompt"`, output is a single JSON object `{"status": "needs_confirmation", "proposal": {...}}`. Downstream consumers check for this shape before parsing as `reqs` / `codelinks`. + +## Output schema +Validated as `json_obj` by `pharaoh-output-validate`. The validator checks the top-level shape, then per-item shape against the regexes below. + +**Stage 1 — block recognizer (Python regex, `re.MULTILINE`):** + +```regex +^\.\. (?P<directive>[a-z_]+)::\s+(?P<title>.+)$ +(?P<options>(?:^ :[a-z_]+:.*$\n?)+) +(?:^\n (?P<body>[\s\S]+?))? +(?=^\.\.\s|\Z) ``` -.. 
comp_req:: <short_title> - :id: comp_req__<snake_case_id> - :status: draft - The <subject> shall <observable_behavior>. +Identifies one directive block bounded by the next `.. ` at column 0 or end of input. `re.MULTILINE` without `re.DOTALL` keeps `.` line-bounded; options cannot leak into adjacent blocks. + +**Stage 2 — option enumeration (on the recognizer's `options` capture):** + +```regex +^ :(?P<option>[a-z_]+):\s*(?P<value>.*)$ ``` -Blocks are separated by one blank line. No surrounding prose outside the blocks, no final summary. +`re.finditer` with `re.MULTILINE` enumerates every option/value pair. -## Process +**Validator checks (per `reqs[*]`):** -### Step 1: MANDATORY — query Papyrus for canonical terms BEFORE naming anything +1. `raw_rst` matches Stage 1 + Stage 2 — block is well-formed. +2. `raw_rst` directive name equals `type` and equals input `target_level`. +3. Stage 2 on `raw_rst` yields at least `id`, `status`, `source_doc`, and `verification`; values match the corresponding top-level fields. +4. If `parent_feat_ids` was provided: `satisfies` field is non-empty and lists every parent id; `raw_rst` `:satisfies:` (or tailored child→parent link name) value matches. +5. Every option in `raw_rst` is either declared in `ubproject.toml` `[[needs.types]]`, a built-in sphinx-needs option, or a Pharaoh convention option. Reject unknown names (catches typos like `subsatisfies`). +6. If `allowed_ids` was provided: every `reqs[*].id` is a member of `allowed_ids`. -**Hard prompt clause (load-bearing for cross-agent first-writer-wins):** +**`emit=codelinks_comment`** — each `codelinks[*]` string must parse via sphinx-codelinks `oneline_parser.parse_line()` against the tailored `oneline_comment_style`. -> Before naming any type, function, or concept in your emitted reqs, call `pharaoh-context-gather` with a semantic query describing the concept. If a canonical name already exists in Papyrus, use it verbatim — do not coin synonyms or variants. 
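+The Stage 1 / Stage 2 recognizer described in the Output schema can be exercised with a simplified sketch — body capture and the block-boundary lookahead are omitted, and option indentation is fixed at three spaces for the demo:

```python
import re

# Stage 1 (simplified): directive line plus the run of option lines.
BLOCK = re.compile(
    r"^\.\. (?P<directive>[a-z_]+):: (?P<title>.+)\n"
    r"(?P<options>(?:   :[a-z_]+:.*\n)+)",
    re.MULTILINE,
)
# Stage 2: enumerate option/value pairs inside the options capture.
OPTION = re.compile(r"^   :(?P<option>[a-z_]+):\s*(?P<value>.*)$", re.MULTILINE)

raw_rst = (
    ".. comp_req:: Reject malformed rows\n"
    "   :id: comp_req__csv2needs_01\n"
    "   :status: draft\n"
    "   :source_doc: src/myproj/csv/csv2needs.py\n"
    "   :verification: tc__TBD\n"
    "\n"
    "   The CSV Importer shall fail with a non-zero exit code on the first\n"
    "   row whose mapped id column is missing or empty.\n"
)

match = BLOCK.match(raw_rst)
options = dict(OPTION.findall(match.group("options")))
```

Because Stage 2 runs only on the `options` capture, option lines from an adjacent directive block can never leak into this block's option dict.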
+## Process + +### Step 1: Query Papyrus for canonical terms BEFORE naming -Only applies if `papyrus_workspace` is provided. For each type / function / entity you observe in the file that you might name in a req: +Only applies if `papyrus_workspace` is provided. For each type / function / concept you may name in a req: -1. Form a short semantic query (e.g. "what do we call the subsystem that supervises other monitors"). -2. Invoke `pharaoh-context-gather` with that query against `papyrus_workspace`. -3. If a matching canonical appears in the top-3 results, use that exact spelling in your req (preserve case exactly — e.g. `HealthMonitor` not `health_monitor`). -4. If no match, plan to introduce a new canonical name in Step 3. +1. Form a short semantic query ("what do we call the subsystem that supervises other monitors"). +2. Invoke `pharaoh-context-gather` with `mode="semantic"`. Semantic mode is required — substring recall silently misses morphological synonyms. +3. If a canonical appears in the top-3 results, use its exact spelling (preserve case). +4. If no match, plan to introduce a new canonical in Step 3. ### Step 2: Read the source file -Read `file_path` and, if provided, `shared_context_path`. Identify observable behaviors: things the code DOES that a spec could describe, grounded in the actual control flow and data flow in the file. Ignore implementation detail that is not observable at the component boundary (internal helpers, log messages, assertion text). +Read `file_path` and, if provided, `shared_context_path`. Identify boundary-observable behaviors grounded in control flow and data flow. Ignore internal helpers, log messages, assertion text. + +Apply `split_strategy`: + +- `"single"` (default): whole file as one scope. Target 1-5 reqs. +- `"top_level_symbols"`: enumerate top-level symbols via the patterns in [`../shared/public-symbol-patterns.md`](../shared/public-symbol-patterns.md). Emit per symbol. Target 1-3 reqs per symbol. 
+- `"sections"`: split at `^#\s*={3,}` / `^//\s*={3,}` markers. Target 1-3 reqs per section. + +In plan-driven runs the `${heuristics.split_strategy(...)}` helper picks per-file by LOC (≤500 → single; 500-2000 with markers → sections; else → top_level_symbols). ### Step 3: Record newly surfaced concepts in Papyrus -Only applies if `papyrus_workspace` is provided. For each type / function / concept that (a) you will mention in a req and (b) was NOT already returned by Step 1, invoke `pharaoh-decision-record` with: +Only applies if `papyrus_workspace` is provided. For each concept you will mention and that Step 1 did not return, invoke `pharaoh-decision-record`: - `type`: `"fact"` -- `canonical_name`: your chosen name in the code's native casing style (preserve CamelCase for types, snake_case for functions/fields) -- `body`: one sentence describing the concept -- `reporter_id`: your `reporter_id` input -- `tags`: `["origin:req-from-code", "file:<basename>"]` +- `canonical_name`: idiomatic casing for the source language (CamelCase for types; snake_case for functions/fields in Python/Rust/C; camelCase in TypeScript/Java). Preserve what the source uses. +- `body`: one sentence. +- `reporter_id`: your `reporter_id` input. +- `tags`: `["origin:req-from-code", "file:<basename>"]`. + +On `"duplicate"`: a concurrent agent raced you; re-query via `pharaoh-context-gather`, adopt the existing spelling, rewrite your draft to match. + +### Step 4: Resolve tailoring (type + emit mode) + +Read `<project_root>/ubproject.toml` and `<project_root>/pharaoh.toml`. -If `pharaoh-decision-record` returns `"duplicate"`, that means a concurrent agent raced you to that canonical; in that case re-query via `pharaoh-context-gather`, adopt the existing canonical spelling, and rewrite your draft req(s) to use it. +**Type resolution:** find `[[needs.types]]` entry where `directive == target_level`. Extract `prefix`. If not declared: +- `on_missing_config == "fail"` → FAIL. 
+- `on_missing_config == "prompt"` → emit `{status: "needs_confirmation", proposal: ...}` and return without emitting reqs. +- `on_missing_config == "use_default"` → use `<target_level>__` silently. -### Step 4: Emit comp_req directives +**Emit mode:** per the Tailoring awareness order. Log the resolved mode to the skill's diagnostic log, not to stdout; Step 6 forbids a header line in the emitted output. -For each observable behavior in the file, emit one `comp_req` block: +**Codelinks format** (only if `emit == "codelinks_comment"`): resolve `[codelinks.projects.<name>.analyse.oneline_comment_style]` via `codelinks_project_name` or by matching `file_path` against each project's `source_discover.src_dir`. Zero or multiple matches with `on_missing_config != "fail"` → `needs_confirmation`. `on_missing_config == "fail"` → FAIL. + +### Step 5a: Emit — `rst` mode + +For each boundary-observable behavior (per Rule 5 enumeration): - `<short_title>` — 3-6 word summary. -- `:id: comp_req__<snake_case_id>` — include file basename as a disambiguator: `comp_req__<filename_stem>_<n>`, e.g. `comp_req__deadline_monitor_01`. -- Body — one sentence, single `shall` clause, using canonical names from Steps 1/3 (preserve original casing). +- `:id: <id_prefix><filename_stem>_<n>` — `<id_prefix>` resolved in Step 4. File basename (stem, snake_case) as disambiguator. Examples: `comp_req__csv2needs_01`, `CREQ_csv2needs_01`. +- `:status: draft`. +- `:satisfies: <parent_1>, ...` — iff `parent_feat_ids` non-empty. All parents comma-joined. If `[[needs.extra_links]]` declares a different outgoing name (e.g. `realizes`), use that instead. +- `:source_doc: <path to implementing source file>` — per Rule 3. +- `:verification: tc__TBD` — per Rule 6. +- Body — single shall clause, component subject (Rule 1), no internals (Rule 2), adds constraint (Rule 4), atomicity + no conjunctions (Atomicity rule above). Canonical names from Steps 1/3. -Target: 1-5 reqs per file. Fewer than 1 only if the file has no observable behavior (e.g.
pure private implementation detail); more than 5 suggests the skill is being asked to over-decompose — stop at 5 and defer. +### Step 5b: Emit — `codelinks_comment` mode -### Step 5: Return +For each behavior, emit one line that sphinx-codelinks' oneline parser would read back into a need equivalent to what `rst` mode would produce. Follow tailored `needs_fields` order and escape rules. Do NOT include the language comment prefix — that is `pharaoh-req-codelink-annotate`'s concern. -Return the concatenation of directive blocks with blank-line separators. No prose. +The `links` field renders as `[<parent_1>, ...]` when `parent_feat_ids` non-empty, else `[]` (or omitted if tailored `default = []`). The body shall-clause does NOT fit on a one-line comment — implied by the title and lost in this mode. For full shall-clause text use `emit="rst"`. -## No-memory mode (when `papyrus_workspace` is absent) +Target: 1-5 reqs per file (per split_strategy). Fewer than 1 only if the file has no observable behavior; more than 5 suggests over-decomposition. -Skip Steps 1 and 3. Proceed directly to Steps 2, 4, 5. This is how variant C exercises the skill. +### Step 6: Return + +Emit one JSON object per the Output shape (`{"reqs": [...]}` for `emit=rst`, `{"codelinks": [...]}` for `emit=codelinks_comment`). Build each `reqs[i]` by populating `id`, `title`, `type`, `body`, `source_doc`, `satisfies` (use `[]` when empty), `verification`, and `raw_rst` (the literal RST block that would render the directive). Nothing else on stdout — no `# emit=...` header line, no prose wrapper, no fenced code block. ## Failure modes - `file_path` not readable → return empty output (no reqs). -- `pharaoh-context-gather` errors → log and proceed as if no match found (do not abort). -- `pharaoh-decision-record` returns `"error"` (not `"duplicate"`, which is normal) → log and proceed. Do not retry. +- `pharaoh-context-gather` errors → log and proceed as if no match found. 
+- `pharaoh-decision-record` returns `"error"` (not `"duplicate"`) → log and proceed. Do not retry. ## Composition -Orchestrator `pharaoh-reqs-from-module` dispatches N instances of this skill in parallel, one per file in the module's source directory, sharing the same `papyrus_workspace`. +Under `pharaoh-execute-plan`, a plan emitted by `pharaoh-write-plan` dispatches N instances of this skill via a `foreach` task over the file list. Per-CREQ review is scheduled as explicit top-level `review_comp_reqs` + `grounding_check_comp_reqs` + `api_coverage_comp_reqs` plan tasks (see `pharaoh-write-plan` templates); the plan DAG enforces them as dependencies of `quality_gate`. Direct out-of-plan invocation by a human auditor is acceptable; the caller is responsible for running the sibling reviews if coverage matters. + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` reading the explicit plan-task output files. diff --git a/skills/pharaoh-req-regenerate/SKILL.md b/skills/pharaoh-req-regenerate/SKILL.md index 71da6d3..9b5ea67 100644 --- a/skills/pharaoh-req-regenerate/SKILL.md +++ b/skills/pharaoh-req-regenerate/SKILL.md @@ -63,14 +63,15 @@ Extract for the artefact type matching the directive prefix (e.g. `gd_req`): - `required_fields` — every field that must be present - `id_regex` — regex the output id must match -If tailoring files are missing, fall back to Score defaults (`gd_req` required: -`[id, status, satisfies]`, id_regex: `^[a-z][a-z_]*__[a-z0-9_]+$`). +If tailoring files are missing, fall back to built-in defaults (bundled example +requirement profile — `req` required: `[id, status, satisfies]`, id_regex: +`^[a-z][a-z_]*__[a-z0-9_]+$`). --- ### Step 2: Parse findings_json -Parse the findings JSON. If malformed (missing `axes` key, invalid JSON syntax, axis count < 10), +Parse the findings JSON. 
If malformed (missing `axes` key, invalid JSON syntax, axis count < 11), FAIL immediately — do not attempt partial regeneration: ``` @@ -152,7 +153,7 @@ After rewriting, run the same checks as `pharaoh-req-review` Step 3 (binary axes - Atomicity: exactly one `shall`, no conjunction in shall clause - Internal consistency: no self-contradiction - Schema: all required fields present and non-empty -- Verifiability: `:verification:` present (it is acceptable for it to point to `tc__TBD`) +- Verifiability: `:verification:` present. On a `status: draft` requirement, a placeholder value matching `^(tc|test_case)__TBD$` (or the project's `tailoring.verification_placeholder_regex`) scores 0.5 on review's verifiability axis — passing the binary gate and terminating the regen loop. Once status advances past draft, the placeholder stops passing and a real test-case id is required. If any binary check still fails after one rewrite attempt, attempt one further rewrite targeting only the still-failing axis. If it still fails after two total attempts, emit the directive with: diff --git a/skills/pharaoh-req-review/SKILL.md b/skills/pharaoh-req-review/SKILL.md index 55d2399..b2e303e 100644 --- a/skills/pharaoh-req-review/SKILL.md +++ b/skills/pharaoh-req-review/SKILL.md @@ -1,6 +1,6 @@ --- name: pharaoh-req-review -description: Use when auditing a single sphinx-needs requirement against the 10 ISO 26262 Part 8 §6 axes. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes, with action items for any failure. +description: Use when auditing a single sphinx-needs requirement against the 11 ISO 26262 Part 8 §6 axes. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes, with action items for any failure. chains_from: [pharaoh-req-draft, pharaoh-req-regenerate] chains_to: [pharaoh-req-regenerate] --- @@ -22,7 +22,7 @@ Do NOT re-author or fix — invoke `pharaoh-req-regenerate` after reviewing. 
- **target**: either an RST directive block (from `pharaoh-req-draft`) OR a need-id present in needs.json - **tailoring** (from `.pharaoh/project/`): - - `checklists/requirement.md` — 10 ISO 26262-8 §6 axes + - `checklists/requirement.md` — 11 ISO 26262-8 §6 axes - `artefact-catalog.yaml` — required/optional fields per artefact type - `id-conventions.yaml` — ID regex and prefix map - **needs.json**: required for link resolution on the verifiability axis @@ -63,9 +63,11 @@ description and records its verdict; the harness compares skill verdict to score |---|---|---| | `atomicity` | body contains more than one `shall`, or a coordinating conjunction joins modal verbs within the shall clause | body contains exactly one `shall`; no `, and`/`, or`/` and `/ or ` within the shall clause | | `internal_consistency` | body contains a self-contradictory statement (e.g. "shall always … unless required not to") | no self-contradiction detectable within this requirement | -| `verifiability` | `:verification:` field absent, empty, or link does not resolve in needs.json | `:verification:` present and resolves to a real need-id in needs.json | +| `verifiability` | `:verification:` field absent, empty, or link does not resolve in needs.json (and does not match a recognised placeholder) | `:verification:` present and resolves to a real need-id in needs.json | | `schema` | any field listed under `required_fields` in artefact-catalog.yaml is missing from the directive | all required fields present and non-empty | +**Verifiability placeholder pathway (score 0.5):** a drafted req with `status: draft` AND `:verification:` set to a recognised placeholder (matching `^(tc|test_case)__TBD$` by default, or the pattern declared under `tailoring.verification_placeholder_regex` in `checklists/requirement.md`) scores 0.5, not 0. This lets a regenerate loop terminate on an iteratively improved draft that still lacks a concrete test-case id. 
The placeholder pathway does not apply once status has advanced past `draft`. For `overall`, treat 0.5 as passing the binary gate but append `"verifiability: placeholder-only"` to `action_items`. + **Ordinal (0–3) — subjective LLM-judge axes:** | Axis | 0 | 1 | 2 | 3 | @@ -93,7 +95,7 @@ and observing convergence. Record as `{"score": null, "reason": "chain-level axi Computed from the non-deferred, non-null axes only (atomicity, internal_consistency, verifiability, schema, unambiguity_prose, comprehensibility, feasibility): -- `"pass"` — all binary axes score 1, all subjective axes score ≥ 2 +- `"pass"` — all binary axes score 1 (or `verifiability` scores 0.5 via the placeholder pathway), all subjective axes score ≥ 2 - `"needs_work"` — no binary axis fails, but ≥ 1 subjective axis scores < 2 - `"fail"` — ≥ 1 binary axis scores 0 @@ -106,12 +108,13 @@ schema, unambiguity_prose, comprehensibility, feasibility): Read `.pharaoh/project/checklists/requirement.md`, `.pharaoh/project/artefact-catalog.yaml`, and `.pharaoh/project/id-conventions.yaml`. Extract: -- Axis definitions (confirm the 10 axes match the expected set) +- Axis definitions (confirm the 11 axes match the expected set) - `required_fields` for the target artefact type (used in schema axis) - `id_regex` (used to verify the need-id format if target is an RST block) -If any tailoring file is missing, proceed with Score defaults (gd_req required fields: -`[id, status, satisfies]`). Note the fallback in the output. +If any tailoring file is missing, proceed with built-in defaults (bundled example +profile — generic `req` required fields: `[id, status, satisfies]`). Note the +fallback in the output. ### Step 2: Resolve target @@ -151,8 +154,8 @@ need-id in needs.json. Score 1 if present and resolves; score 0 otherwise. **Schema:** Check that every field in `required_fields` from artefact-catalog.yaml is present and non-empty in -the directive. For Score `gd_req`: `id`, `status`, `satisfies` must all be present. 
Score 1 if all -present; score 0 with reason listing the missing field(s). +the directive. For the built-in default profile: `id`, `status`, `satisfies` must all be present. +Score 1 if all present; score 0 with reason listing the missing field(s). ### Step 4: Evaluate subjective axes @@ -210,7 +213,7 @@ Do not emit partial JSON. Return only the FAIL message. **G2 — Malformed JSON output** -If the emitted JSON is syntactically invalid or missing any of the 10 axis keys, self-correct once: +If the emitted JSON is syntactically invalid or missing any of the 11 axis keys, self-correct once: re-emit the full JSON document. If still malformed after one self-correction attempt, emit: ```json diff --git a/skills/pharaoh-reqs-from-module/SKILL.md b/skills/pharaoh-reqs-from-module/SKILL.md deleted file mode 100644 index a876edb..0000000 --- a/skills/pharaoh-reqs-from-module/SKILL.md +++ /dev/null @@ -1,56 +0,0 @@ ---- -name: pharaoh-reqs-from-module -description: Use when reverse-engineering comp_reqs for an entire module in parallel by dispatching pharaoh-req-from-code subagents, one per source file, sharing a Papyrus workspace for cross-agent terminology coordination. Aggregates into a single RST document. ---- - -# pharaoh-reqs-from-module - -## When to use - -Invoke when the user wants comp_req coverage for a bounded subsystem (one directory of cohesive source files) and wants cross-file terminology consistency. Works best on 3-8 files totaling up to ~1000 LOC. - -## Compositional structure (not atomic by design) - -This skill is explicitly a COMPOSITION of other atomic skills. It does not add new reward-mechanizable behavior of its own; it coordinates other atomics. Therefore it is exempt from criterion (a) (indivisibility) per the atomic-skills refactor. The constituent atomics (`pharaoh-context-gather`, `pharaoh-req-from-code`, `pharaoh-decision-record`) each pass (a)-(e). - -## Input - -- `module_dir`: directory containing the source files to reverse-engineer. 
-- `file_list`: list of filenames (relative to `module_dir`) to assign one-per-agent. -- `shared_context_file` (optional): a file all agents read for shared context but none reverse-engineers. -- `target_level`: requirement artefact prefix (e.g. `comp_req`). -- `papyrus_workspace`: path to `.papyrus/` workspace (may be empty or preseeded). - -## Output - -A single RST document concatenating all emitted `comp_req` directives from all agents, with a section header per source file for human readability. - -## Process (reference orchestration) - -### Step 1: Prepare Papyrus workspace - -If `papyrus_workspace` does not exist, initialize it empty. If preseeding is desired (variant B_seeded), the CALLER is responsible for doing so before invoking this orchestrator. - -### Step 2: Dispatch N parallel agents - -For each file in `file_list`, dispatch one instance of `pharaoh-req-from-code` with: -- `file_path = module_dir / file` -- `shared_context_path = module_dir / shared_context_file` (if provided) -- `papyrus_workspace` (passed through) -- `reporter_id = "req-from-code:" + file` - -Agents MUST run concurrently so that first-writer-wins ordering in Papyrus is exercised. - -### Step 3: Aggregate - -Concatenate agent outputs into one document with per-file section headers. Do NOT filter duplicates at this layer; dedup is a scoring concern, not an orchestration concern. - -## Harness-driven variant - -In Phase 4c measurement, parallel dispatch happens in `pharaoh-validation/harness/run_phase4c.py`, not inside an agent running this skill. Reason: clean measurement and cost attribution. Future versions may move dispatch into an in-agent MCP subagent call — this SKILL.md is the spec for that. - -## Non-goals - -- No fan-out to other phases (arch, vplan, fmea) — that is a separate orchestrator. -- No cross-module coordination — one module per invocation. -- No output filtering or dedup — done downstream in scoring or integration. 
diff --git a/skills/pharaoh-review-completeness/SKILL.md b/skills/pharaoh-review-completeness/SKILL.md index 018c9d5..4c87fd1 100644 --- a/skills/pharaoh-review-completeness/SKILL.md +++ b/skills/pharaoh-review-completeness/SKILL.md @@ -23,7 +23,7 @@ Do NOT invoke for artefact types whose project tailoring does NOT list `:reviewe ### Step 1: Load artefact catalog -Read `<project_dir>/.pharaoh/project/artefact-catalog.yaml` (or `examples/score/.pharaoh/project/artefact-catalog.yaml` if testing against Score). For each artefact type, extract `required_roles` — which may include `reviewer`, `approved_by`, or be absent (no review required). +Read `<project_dir>/.pharaoh/project/artefact-catalog.yaml`. For each artefact type, extract `required_roles` — which may include `reviewer`, `approved_by`, or be absent (no review required). ### Step 2: Load needs diff --git a/skills/pharaoh-self-review-coverage-check/SKILL.md b/skills/pharaoh-self-review-coverage-check/SKILL.md new file mode 100644 index 0000000..ec80e40 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/SKILL.md @@ -0,0 +1,76 @@ +--- +name: pharaoh-self-review-coverage-check +description: Use when verifying that every artefact emitted during a plan run received a matching review. For every drafted artefact in `runs/`, confirms a matching `<id>_review.json` exists and is non-empty. Closes the "draft emitted but review was skipped" failure class. +--- + +# pharaoh-self-review-coverage-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan that emitted drafts (reqs, feats, archs, vplans, fmeas, decisions, diagrams). Uses the draft↔review mapping in `shared/self-review-map.yaml` to determine which review skill was supposed to be invoked. + +Do NOT use to re-invoke missing reviews — this skill only observes. Remediation is up to the plan's `on_fail` policy. + +## Atomicity + +- (a) Indivisible: one runs directory + one self-review map in → pass/fail + uncovered list out. 
+- (b) Input: `{runs_path: str, self_review_map_path: str}`. Output: JSON `{passed: bool, uncovered: list[{artefact_id, draft_skill, expected_review_skill}]}`. +- (c) Reward: fixtures in `pharaoh/skills/pharaoh-self-review-coverage-check/fixtures/`: + 1. `fully-covered/`: 2 `*_draft` return.json files + 2 matching `*_review.json` → `expected-fully-covered-pass.json` (`passed: true, uncovered: []`). + 2. `missing-review/`: 2 draft files + only 1 review file → `expected-missing-review-fail.json` (`passed: false`, `uncovered` names the missing pair). + 3. Empty review file (`{}`) counts as missing → failure with `reason: "review JSON is empty"`. + 4. Idempotent. + 5. `scalar-mapped/`: emission skill maps to a single scalar review skill; that review is invoked in the emission's `## Last step`. Expected: pass (backward-compatibility check — scalar mappings still work). + 6. `list-mapped-complete/`: emission skill maps to a list `[A, B]`; both A and B are invoked in the emission's `## Last step`. Expected: pass. + 7. `list-mapped-partial/`: emission skill maps to a list `[A, B]`; only A is invoked. Expected: fail with `uncovered` naming B as the missing `expected_review_skill`. + + Pass = all 7. +- (d) Reusable by any plan. +- (e) Read-only. + +## Input + +- `runs_path`: absolute path to runs directory. Must contain `*_draft.json` files (draft outputs) and `*_review.json` files (review outputs). Files may be under per-task subdirectories. +- `self_review_map_path`: absolute path to `shared/self-review-map.yaml`. Maps each draft skill to its review skill. + +## Output + +```json +{ + "passed": false, + "uncovered": [ + { + "artefact_id": "REQ_example_02", + "draft_skill": "pharaoh-req-draft", + "expected_review_skill": "pharaoh-req-review", + "reason": "no matching *_review.json found" + } + ] +} +``` + +## Detection rule + +For every `<run_dir>/**/<id>_draft.json` OR every entry in `<run_dir>/**/return.json` with `emitted: [...]`: + +1. 
Identify the emission skill for the artefact (from the `draft_skill` field in the run record or the emission task name in the plan). +2. Look up the emission skill in `self_review_map.map`. The mapped value is either: + - a **scalar** (string): the name of the single review skill expected to be invoked from the emission skill's `## Last step`. + - a **list** of strings: every review skill in the list is expected to be invoked. + + Branch on type: `isinstance(value, list)` vs scalar. Lists iterate; scalars are treated as a single-element check. (Dict values are out of scope; treat as a schema error.) + +3. For each expected review skill name: + - **Only source**: a matching `<id>_<review_skill_short>.json` under `<run_dir>/**`, produced by an explicit plan task (e.g. `review_comp_reqs`, `grounding_check_comp_reqs`) that ran the review skill. Expected filename shapes: `<id>_review.json` for `pharaoh-req-review`, `<id>_code_grounding.json` for `pharaoh-req-code-grounding-check`, `<id>_diagram_review.json` for `pharaoh-diagram-review`, etc. + - Load the file. If missing, empty object `{}`, or unparseable → record as uncovered with the specific `expected_review_skill` name. + - **Never accept "inlined" / "covered in emission skill's Last step" / "semantically satisfied" as completion evidence.** The only evidence is a non-empty JSON file on disk at the expected path. The emission skill's `## Last step` clause is explicitly deferred under plan execution (see `pharaoh-req-from-code` SKILL.md); coverage is determined exclusively by the presence of explicit plan-task output files. An uncovered finding indicates the plan did not schedule the review task, the executor skipped it, or the executor claimed "completed" without producing output (which `pharaoh-execute-plan` Step 4.10 should have already caught as a `reporting_error`, but this check provides a second independent signal). + +4. 
The emission is covered only when **every** expected review skill (all list members, or the single scalar) is invoked AND its produced JSON is non-empty. Missing one out of N list members fails, with the missing entry named in `uncovered[].expected_review_skill`. + +Backward compatibility: existing scalar mappings (e.g. `pharaoh-req-draft: pharaoh-req-review`) continue to pass under the same check — the scalar is treated as a one-element list. + +Use `self_review_map` to label `expected_review_skill` in the uncovered entries. When multiple review skills are expected, emit one `uncovered` entry per missing review skill so the caller sees every missing pair separately. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `self_review_coverage: true`. diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/README.md b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/README.md new file mode 100644 index 0000000..4a4d4a1 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/README.md @@ -0,0 +1,3 @@ +# list-mapped-complete + +Exercises the list-valued branch of the detection rule with a complete invocation set. The emission skill `pharaoh-req-from-code` maps to the two-element list `[pharaoh-req-review, pharaoh-req-code-grounding-check]`, and both review skills are invoked in the emission skill's `## Last step` section. Expected output: `passed: true` with an empty `uncovered` list. Pairs with `list-mapped-partial/` which removes one invocation to force a failure. 
diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/expected-output.json b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/expected-output.json new file mode 100644 index 0000000..3e8e070 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/expected-output.json @@ -0,0 +1,4 @@ +{ + "passed": true, + "uncovered": [] +} diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-map.yaml b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-map.yaml new file mode 100644 index 0000000..b6a59fa --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-map.yaml @@ -0,0 +1,6 @@ +version: 1 + +map: + pharaoh-req-from-code: + - pharaoh-req-review + - pharaoh-req-code-grounding-check diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-skill.md b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-skill.md new file mode 100644 index 0000000..0809d0b --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-complete/input-skill.md @@ -0,0 +1,12 @@ +--- +name: pharaoh-req-from-code +description: Stub emission SKILL.md used as fixture input — only the frontmatter and the `## Last step` section are load-bearing for the coverage check. +--- + +# pharaoh-req-from-code + +## Last step + +After emitting the artefact, invoke `pharaoh-req-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. + +Additionally, for each emitted CREQ that has `:source_doc:`, invoke `pharaoh-req-code-grounding-check`. Attach its findings JSON under the key `code_grounding`. If either atom returns a mechanised-axis failure, do NOT finalize the artefact. 
diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/README.md b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/README.md new file mode 100644 index 0000000..af9ce90 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/README.md @@ -0,0 +1,3 @@ +# list-mapped-partial + +Exercises the failure path of the list-valued branch. The emission skill `pharaoh-req-from-code` maps to `[pharaoh-req-review, pharaoh-req-code-grounding-check]` but only the first review is invoked in its `## Last step`. Expected output: `passed: false` with a single `uncovered` entry naming `pharaoh-req-code-grounding-check` as the missing `expected_review_skill`. Demonstrates that a partial invocation of a list-valued map fails with a specific missing entry rather than a generic mismatch. diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/expected-output.json b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/expected-output.json new file mode 100644 index 0000000..fc85934 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/expected-output.json @@ -0,0 +1,11 @@ +{ + "passed": false, + "uncovered": [ + { + "artefact_id": "CREQ_fixture_01", + "draft_skill": "pharaoh-req-from-code", + "expected_review_skill": "pharaoh-req-code-grounding-check", + "reason": "review skill mapped in self-review-map list not invoked in emission skill's ## Last step" + } + ] +} diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-map.yaml b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-map.yaml new file mode 100644 index 0000000..b6a59fa --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-map.yaml @@ -0,0 +1,6 @@ +version: 1 + +map: + pharaoh-req-from-code: + - pharaoh-req-review + - pharaoh-req-code-grounding-check diff --git 
a/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-skill.md b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-skill.md new file mode 100644 index 0000000..7df52d4 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/list-mapped-partial/input-skill.md @@ -0,0 +1,10 @@ +--- +name: pharaoh-req-from-code +description: Stub emission SKILL.md used as fixture input — only the `## Last step` section is load-bearing. Here only one of the two expected review skills is invoked, so the coverage check must fail. +--- + +# pharaoh-req-from-code + +## Last step + +After emitting the artefact, invoke `pharaoh-req-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/README.md b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/README.md new file mode 100644 index 0000000..3a0d782 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/README.md @@ -0,0 +1,3 @@ +# scalar-mapped + +Backward-compatibility fixture for the scalar branch of the detection rule. The emission skill `pharaoh-req-draft` maps to a single string value `pharaoh-req-review`, which is invoked in the emission skill's `## Last step`. Expected output: `passed: true` with an empty `uncovered` list. Proves that extending the map schema to allow list values does not regress existing scalar entries. 
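Taken together with the list-mapped fixtures, this implies a single map file may mix both value shapes. A hypothetical combined sketch (not itself a shipped fixture):

```yaml
version: 1

map:
  # scalar: one review skill, treated as a one-element list
  pharaoh-req-draft: pharaoh-req-review
  # list: every member must produce a non-empty review JSON
  pharaoh-req-from-code:
    - pharaoh-req-review
    - pharaoh-req-code-grounding-check
```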
diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/expected-output.json b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/expected-output.json new file mode 100644 index 0000000..3e8e070 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/expected-output.json @@ -0,0 +1,4 @@ +{ + "passed": true, + "uncovered": [] +} diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-map.yaml b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-map.yaml new file mode 100644 index 0000000..bb4382e --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-map.yaml @@ -0,0 +1,4 @@ +version: 1 + +map: + pharaoh-req-draft: pharaoh-req-review diff --git a/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-skill.md b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-skill.md new file mode 100644 index 0000000..24de6c0 --- /dev/null +++ b/skills/pharaoh-self-review-coverage-check/fixtures/scalar-mapped/input-skill.md @@ -0,0 +1,10 @@ +--- +name: pharaoh-req-draft +description: Stub emission SKILL.md used as fixture input for backward-compatibility. Scalar-valued mapping — one review skill invoked in `## Last step`. +--- + +# pharaoh-req-draft + +## Last step + +After emitting the artefact, invoke `pharaoh-req-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. diff --git a/skills/pharaoh-sequence-diagram-draft/SKILL.md b/skills/pharaoh-sequence-diagram-draft/SKILL.md new file mode 100644 index 0000000..5c8869a --- /dev/null +++ b/skills/pharaoh-sequence-diagram-draft/SKILL.md @@ -0,0 +1,99 @@ +--- +name: pharaoh-sequence-diagram-draft +description: Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. 
Renderer tailored via `pharaoh.toml`. Does NOT emit component, class, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-sequence-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-sequence-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.sequence]` for per-type overrides. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / edge label / message text MUST be sanitised per that rule set before the block leaves this skill. **Extra sharp for sequence diagrams:** Mermaid 11 treats `;` inside a message label as a statement terminator — prior dogfooding shipped a `J->>J: filter by type; skip SET/Folder` that parsed cleanly under `sphinx-build -nW` but rendered as `Syntax error` in the browser. Always replace `;` with `,` in message labels. + +## Purpose + +One invocation → one sequence diagram. Captures **ordered interactions over time** between a bounded set of participants. Typical inputs: a feature's "happy path" flow, an interface's request/response trace, an incident timeline reconstructed from logs. + +Does NOT capture static containment (→ `pharaoh-component-diagram-draft`). Does NOT capture type relationships (→ `pharaoh-class-diagram-draft`). Does NOT capture state transitions (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One interaction in → one diagram out. No multi-scenario bundling (alt paths in one diagram are OK — they are part of one scenario; but two independent scenarios are two skill invocations). 
+- (b) Input: `{view_title: str, participants: list[ParticipantSpec], messages: list[MessageSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ParticipantSpec = {id: str, label: str, kind?: "actor"|"component"|"boundary"|"database"|"external"}` and `MessageSpec = {from: str, to: str, label: str, kind?: "sync"|"async"|"return"|"self", fragment?: FragmentSpec}`. Output: one RST directive block. +- (c) Reward: fixture with 3 participants (User, API, DB) and 4 messages (User→API: request, API→DB: query, DB→API: result, API→User: response). Scorer: + 1. Output starts with renderer-specific directive. + 2. Every participant id in `participants` is declared in the diagram body. + 3. Every message appears in order (renderer syntax: `User->>API: request` for Mermaid). + 4. Message count in output = `len(messages)`. + 5. Sync vs async arrow differs syntactically (`->>` vs `-)` in Mermaid; `->` vs `->>` in PlantUML). + 6. Self-message (kind=`self`) renders as a self-loop on the participant. + + Pass = all 6. +- (d) Reusable: any interaction diagram. Especially valuable for interface/API specs. +- (e) One diagram kind per skill. + +## Input + +- `view_title`: diagram caption. +- `participants`: ordered list; order = left-to-right placement in the diagram. +- `messages`: ordered list; order = top-to-bottom time axis. Each message references participants by id. +- `project_root`: for tailoring lookup. +- `renderer_override`, `on_missing_config`, `papyrus_workspace`, `reporter_id`: as in shared doc. + +### FragmentSpec (optional per message) + +```json +{"type": "alt"|"opt"|"loop"|"par"|"critical"|"break", "condition": "<string>"} +``` + +Groups consecutive messages under a fragment (e.g. `alt`: alternative paths; `loop`: repeated block). If `messages[i].fragment` is non-null, it opens a fragment that stays open until a later message with `fragment = null` or a different fragment type. 
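The open/close semantics above can be sketched for Mermaid. This is a hypothetical rendering, assuming the `condition` string becomes the `alt` label and the first `fragment = null` message after the group closes it with `end`:

```rst
.. mermaid::
   :caption: <view_title>

   sequenceDiagram
       participant User
       participant API
       User->>API: login
       alt credentials valid
           API-->>User: token
       end
       API-->>User: session closed
```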
+ +This is the one piece of sequence-diagram structure that has no analogue in component diagrams — hence sequence gets its own skill. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + sequenceDiagram + participant User + participant API + participant DB + User->>API: request + API->>DB: query + DB-->>API: result + API-->>User: response +``` + +**PlantUML:** +```rst +.. uml:: + :caption: <view_title> + + @startuml + actor User + participant API + database DB + User -> API : request + API -> DB : query + DB --> API : result + API --> User : response + @enduml +``` + +## Process (sketch) + +1. Resolve tailoring per shared doc. +2. Emit participant declarations in order (Mermaid: `participant X`; PlantUML: `actor X`/`participant X`/`database X` keyed on `ParticipantSpec.kind`). +3. Emit messages in order. Map `kind` → renderer syntax (sync/async/return/self). +4. Handle fragments: open `alt`/`opt`/`loop` as messages are emitted; close at end of fragment. +5. Wrap in RST directive. + +## Non-goals + +- No auto-extraction of sequences from code/logs — the caller provides `participants` and `messages` explicitly. A separate future skill (`pharaoh-sequence-from-trace`) could infer these from runtime logs, but that is a different concern. +- No return-arrow inference — if the caller wants a return, they include it as a message with `kind="return"`. +- No activation-bar auto-insertion (PlantUML activates/deactivates) — caller can add via `fragment` or future extension. diff --git a/skills/pharaoh-setup/SKILL.md b/skills/pharaoh-setup/SKILL.md index 18b1db5..068faa2 100644 --- a/skills/pharaoh-setup/SKILL.md +++ b/skills/pharaoh-setup/SKILL.md @@ -33,11 +33,15 @@ Follow the full detection algorithm defined in `skills/shared/data-access.md`. T #### 1a. Find Sphinx project roots -Search for `ubproject.toml` files in the workspace root and up to two levels of subdirectories using Glob with pattern `**/ubproject.toml`. 
Each location is a project root. +Search for `ubproject.toml` files in the workspace root and up to two levels of subdirectories using Glob with pattern `**/ubproject.toml`. Each location is a candidate root. -If no `ubproject.toml` is found, search for `conf.py` files containing sphinx-needs configuration using Grep with pattern `sphinx_needs|needs_types|needs_from_toml` in `**/conf.py`. Each matching `conf.py` location is a project root. +For each candidate root, verify sphinx-needs is actually configured by checking either (a) a `[needs]` section or `[[needs.types]]` tables in `ubproject.toml`, or (b) `sphinx_needs` in the `extensions` list of a co-located `conf.py`. Candidates that fail this check are classified as **plain-Sphinx candidates** (no sphinx-needs), not sphinx-needs project roots. -Record every project root path. +If no `ubproject.toml` match is a true sphinx-needs root, search for `conf.py` files containing sphinx-needs configuration using Grep with pattern `sphinx_needs|needs_types|needs_from_toml` in `**/conf.py`. Each matching `conf.py` location is a sphinx-needs project root. + +If no sphinx-needs roots are found at all, do a final pass: Glob `**/conf.py` and record every match as a **plain-Sphinx candidate** (these exist but do not load sphinx-needs). + +Record every sphinx-needs root path and every plain-Sphinx candidate separately. #### 1b. Read need types @@ -121,20 +125,35 @@ Data access: Fallback: raw file parsing (always available) ``` -If no project roots were found, report the issue clearly: +If no sphinx-needs project roots were found, branch on whether plain-Sphinx candidates exist: + +**Case A — No Sphinx project at all (no `conf.py` anywhere):** + +``` +No Sphinx project detected in this workspace. + +Run `sphinx-quickstart` to create a Sphinx project, or provide the path +to an existing one. +``` + +**Case B — Plain-Sphinx candidates exist but none loads sphinx-needs:** ``` -No sphinx-needs project detected in this workspace. 
+Sphinx project(s) detected at: + - <path> + ... -Looked for: - - ubproject.toml files (up to 2 levels deep) - - conf.py files with sphinx_needs configuration +sphinx-needs is not configured in any of them. -Please ensure this workspace contains a sphinx-needs project, or -provide the path to your project root. +Pharaoh requires sphinx-needs to be loaded as an extension and at least +one need type to be declared. + +Run `pharaoh-bootstrap` first to inject the minimum sphinx-needs +configuration into the chosen project, then re-run this skill to author +pharaoh.toml. ``` -Then ask the user for the project root path before proceeding. +In either case, ask the user how to proceed before writing any files. --- @@ -159,6 +178,51 @@ Which mode would you like? [advisory/enforcing] If the user does not specify, default to `"advisory"`. +#### 2a.bis. Detect and confirm project mode + +Pharaoh's workflow gates (`require_change_analysis`, `require_verification`, `require_mece_on_release`) have different natural defaults depending on where the project sits in its lifecycle. Hardcoding the example's values is what produced the pilot feedback: a reverse-engineering project had `require_change_analysis = true` on day one, alarming every newly-drafted need because there was no Pharaoh change issue yet. + +Classify the project into one of three modes using the following heuristic (first matching branch wins): + +| Signal | Inferred mode | +| ------------------------------------------------------------------------------------------------ | -------------- | +| `needs.json` exists (e.g. `docs/_build/needs/needs.json`) and contains ≥10 needs. | `steady-state` | +| No `needs.json` or <10 needs, AND the source tree has ≥5 code files AND `docs/` has prose files with section headers that read like user-facing features (e.g. imperative verbs, capability lists). | `reverse-eng` | +| Otherwise (thin project: no needs, minimal src, placeholder docs). 
| `greenfield` | + +Present the detected mode and ask the user to confirm or override: + +``` +Detected project mode: <reverse-eng | greenfield | steady-state> + + reverse-eng - Codebase exists and has feature-level documentation, but + sphinx-needs artefacts are being created now. Workflow + gates start permissive; tighten them once the catalogue + stabilises. + greenfield - Minimal scaffolding. Verification matters from day one + (every new need should have a verification path), but + change-analysis and MECE gates are noise until the + catalogue grows. + steady-state - Mature catalogue (≥10 needs). Full gating: change + analysis before edits, verification required, MECE at + release. + +Confirm detected mode, or choose a different one +[reverse-eng/greenfield/steady-state]? +``` + +Record the chosen mode. Per-mode `[pharaoh.workflow]` defaults (applied in Step 2b): + +| Mode | `require_change_analysis` | `require_verification` | `require_mece_on_release` | +| -------------- | ------------------------- | ---------------------- | ------------------------- | +| `reverse-eng` | `false` | `true` | `false` | +| `greenfield` | `false` | `true` | `false` | +| `steady-state` | `true` | `true` | `true` | + +`require_verification = true` is uniform across all three modes — step 1 of the gate-enablement ladder (see `skills/shared/gate-enablement.md`) is safe to enable out of the box because the review skills are ship-ready and read-only. A project that runs `pharaoh-setup` → `pharaoh-gate-advisor` immediately lands on step 2 as its next recommendation, not step 1. Mode still differentiates `require_change_analysis` and `require_mece_on_release` because those gates have pre-work that is not safe to assume on every project. + +A caller running this skill non-interactively MAY pass `mode` as an explicit override input. When present, Step 2a.bis uses that value and skips the confirmation prompt. + #### 2b. 
Build pharaoh.toml content

Generate the `pharaoh.toml` content using the detected project data. Use `pharaoh.toml.example` as the structural template, but populate values from detection results.

@@ -173,26 +237,31 @@ Generate the `pharaoh.toml` content using the detected project data. Use `pharao

- Set `auto_increment = true`.

**`[pharaoh.workflow]` section:**
-- Use the defaults from `pharaoh.toml.example`:
-  - `require_change_analysis = true`
-  - `require_verification = true`
-  - `require_mece_on_release = false`
+- Populate the three flags from the mode table in Step 2a.bis based on the mode the user confirmed. Do NOT blindly copy values from `pharaoh.toml.example` — that file documents the steady-state shape, not the day-one defaults for every mode.
+- Emit a one-line comment above the three flags naming the chosen mode, so a later reader of `pharaoh.toml` can see what assumption produced these values:
+  ```toml
+  [pharaoh.workflow]
+  # mode: reverse-eng — tighten as the catalogue stabilises
+  require_change_analysis = false
+  require_verification = true
+  require_mece_on_release = false
+  ```

**`[pharaoh.traceability]` section:**
-- Build `required_links` from the detected extra link types.
-- For each extra link type, determine the source and target types by examining the link's usage in existing need directives. If the link name is `implements`, and it appears on `impl` directives pointing to `spec` directives, generate `"spec -> impl"`.
+- Build `required_links` from the detected extra link types, but **only for type pairs where BOTH types are declared in `ubproject.toml` `[[needs.types]]`.** A chain `comp_req -> test` where `test` is not a declared type is dead config — it alarms on every `comp_req` from day one. Skip it.
+- For each extra link type, determine the source and target types by examining the link's usage in existing need directives.
If the link name is `implements`, and it appears on `impl` directives pointing to `spec` directives, generate `"spec -> impl"` only if both `impl` and `spec` are declared. - If usage cannot be determined from existing needs, infer from naming conventions: - `implements` or `realizes` -> `"spec -> impl"` - `tests` or `verifies` -> `"impl -> test"` - `satisfies` or `fulfills` -> `"req -> spec"` - `derives` or `derives_from` -> `"req -> req"` (parent to child) - Also check for standard `links` usage to detect implicit traceability chains (e.g., specs linking to reqs via `:links:`). -- If the project has a clear type hierarchy (e.g., req -> spec -> impl -> test), generate the full chain: +- If the project has a clear type hierarchy (e.g., req -> spec -> impl -> test), generate the full chain — but filter out any edges whose target type is not declared: ```toml required_links = [ "req -> spec", "spec -> impl", - "impl -> test", + # "impl -> test", # SKIPPED: 'test' is not declared in [[needs.types]] ] ``` - If no link types are detected, leave `required_links` as an empty array with a comment explaining how to add entries. @@ -255,36 +324,26 @@ If the user declines, skip to Step 4. #### 3b. Locate Copilot templates -The Copilot templates live in the Pharaoh plugin directory under `copilot/`. Locate this directory relative to the plugin installation path. +The Copilot templates live in the Pharaoh plugin directory under `.github/`. Pharaoh dogfoods its own agents — the same `.github/` tree it copies out is the one it uses on itself. Locate this directory relative to the plugin installation path. 
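As a sketch of what that runtime enumeration looks like in practice (helper name illustrative; the skill itself uses its Glob tool rather than Python):

```python
from pathlib import Path

def discover_templates(plugin_dir: str) -> dict[str, list[Path]]:
    """Enumerate agent/prompt templates at runtime instead of hardcoding
    the file list, so newly landed atomic skills are picked up automatically."""
    root = Path(plugin_dir) / ".github"
    return {
        "agents": sorted(root.glob("agents/pharaoh.*.agent.md")),
        "prompts": sorted(root.glob("prompts/pharaoh.*.prompt.md")),
    }
```

Copying then reduces to iterating both lists and mirroring each path under the user's project.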
The expected template structure is: ``` -copilot/ +.github/ agents/ - pharaoh.setup.agent.md - pharaoh.change.agent.md - pharaoh.trace.agent.md - pharaoh.mece.agent.md - pharaoh.author.agent.md - pharaoh.verify.agent.md - pharaoh.release.agent.md - pharaoh.plan.agent.md + pharaoh.*.agent.md (discovered via glob, not hardcoded) prompts/ - pharaoh.change.prompt.md - pharaoh.trace.prompt.md - pharaoh.mece.prompt.md - pharaoh.author.prompt.md - pharaoh.verify.prompt.md - pharaoh.release.prompt.md - pharaoh.plan.prompt.md + pharaoh.*.prompt.md (discovered via glob, not hardcoded) copilot-instructions.md ``` -If the template directory is not found, inform the user: +Do NOT hardcode the agent or prompt file list in the skill — enumerate them at runtime with Glob on `.github/agents/pharaoh.*.agent.md` and `.github/prompts/pharaoh.*.prompt.md`. The set grows as new atomic skills land; a hardcoded list rots on every release. + +If the `.github/agents/` directory is not found in the plugin dir, inform the user: ``` -Copilot templates not found in the Pharaoh plugin directory. +Copilot templates not found in the Pharaoh plugin directory +(expected .github/agents/ and .github/prompts/). This may indicate an incomplete installation. Skipping Copilot setup. You can manually create Copilot agents later by running pharaoh:setup again @@ -310,32 +369,21 @@ For files that do not exist, list them as new files to be created. #### 3d. Present file list and copy -Show a summary of all files that will be created or updated: +Enumerate the actual template files via Glob (see Step 3b) and show a summary. 
Example shape (exact list depends on the current plugin version): ``` The following files will be created in your project: - New files: - .github/agents/pharaoh.setup.agent.md - .github/agents/pharaoh.change.agent.md - .github/agents/pharaoh.trace.agent.md - .github/agents/pharaoh.mece.agent.md - .github/agents/pharaoh.author.agent.md - .github/agents/pharaoh.verify.agent.md - .github/agents/pharaoh.release.agent.md - .github/agents/pharaoh.plan.agent.md - .github/prompts/pharaoh.change.prompt.md - .github/prompts/pharaoh.trace.prompt.md - .github/prompts/pharaoh.mece.prompt.md - .github/prompts/pharaoh.author.prompt.md - .github/prompts/pharaoh.verify.prompt.md - .github/prompts/pharaoh.release.prompt.md - .github/prompts/pharaoh.plan.prompt.md + New files (N agents, M prompts): + .github/agents/pharaoh.<name>.agent.md × N + .github/prompts/pharaoh.<name>.prompt.md × M .github/copilot-instructions.md Proceed? [yes/no] ``` +Show the full enumerated list to the user — do not print the `× N` shorthand. The shorthand above is just for this skill spec; the runtime output must list every file by name so the user can review before confirming. + After user confirms, create the necessary directories (`.github/agents/`, `.github/prompts/`) and copy each template file to the user's project. --- @@ -346,21 +394,46 @@ After user confirms, create the necessary directories (`.github/agents/`, `.gith Look for a `.gitignore` file in the workspace root. -#### 4b. Add .pharaoh/ entry +#### 4b. Add Pharaoh ephemeral paths (narrow, not wholesale) + +`.pharaoh/` contains a mix of committed tailoring and ephemeral run state. Ignoring the whole tree is wrong — it hides `.pharaoh/project/` tailoring which IS shared across the team. The skill ignores only the ephemeral subpaths: + +| Path | Purpose | Commit? 
| +| ----------------------- | -------------------------------------------------------- | ------- | +| `.pharaoh/project/` | Tailoring: workflows, id-conventions, artefact-catalog, checklists | **yes** | +| `.pharaoh/runs/` | `pharaoh-execute-plan` run artefacts (report.yaml, staged RST) | no | +| `.pharaoh/plans/` | plan.yaml files emitted by `pharaoh-write-plan` | no | +| `.pharaoh/session.json` | Session / gate state | no | +| `.pharaoh/cache/` | Derived caches | no | + +Emitted entries: -If `.gitignore` exists, read its contents and check whether `.pharaoh/` is already listed (matching the exact string `.pharaoh/` or `.pharaoh` on its own line). +``` +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ +``` + +If `.gitignore` exists, read its contents and branch: -- If already present, do nothing. Report: `".pharaoh/" already in .gitignore -- no changes needed.` -- If not present, append `.pharaoh/` to the file on a new line. If the file does not end with a newline, add one before the entry. Report: `Added ".pharaoh/" to .gitignore.` +1. **Wide form already present.** If the file contains a bare `.pharaoh/` or `.pharaoh` line (no trailing path segment), emit a warning and leave it alone — do not auto-migrate, respect user control: + > `.pharaoh/ is ignored as a whole — this hides .pharaoh/project/ tailoring which should be committed. Consider narrowing to: .pharaoh/runs/, .pharaoh/plans/, .pharaoh/session.json, .pharaoh/cache/.` + Report: `".pharaoh/" entry is too wide; left in place with a warning.` +2. **All four narrow entries already present.** Do nothing. Report: `".pharaoh/ ephemeral paths already ignored -- no changes needed."` +3. **Some narrow entries missing.** Append the missing entries on new lines. If the file does not end with a newline, add one first. 
Report: `"Added <count> Pharaoh ephemeral-path entries to .gitignore."` -If `.gitignore` does not exist, create it with the following content: +If `.gitignore` does not exist, create it with: ``` -# Pharaoh session state (ephemeral, do not commit) -.pharaoh/ +# Pharaoh ephemeral state (do not commit). Project tailoring at .pharaoh/project/ IS committed. +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ ``` -Report: `Created .gitignore with ".pharaoh/" entry.` +Report: `Created .gitignore with Pharaoh ephemeral-path entries.` --- @@ -435,6 +508,16 @@ Determine the current tier: --- +### Step 5b: Bootstrap tailoring from declared types + +After `pharaoh.toml` is written, invoke `pharaoh-tailor-bootstrap` with `project_root` = the workspace root and `on_missing_config` = `"prompt"` (so the user confirms the generated content). + +This produces `.pharaoh/project/{workflows,id-conventions,artefact-catalog}.yaml` plus `checklists/<type>.md` per declared type. Without this step, every emitted need has `:status: draft` forever with no defined lifecycle transitions. + +If the user rejects the proposal, skip — the caller may run `pharaoh-tailor-fill` later (after needs exist) as the alternative path. 
+ +--- + ### Step 6: Summary Present a final summary of everything that was configured: @@ -446,6 +529,8 @@ Pharaoh Setup Complete Configuration: pharaoh.toml: <created | updated | skipped> (<path>) Strictness: <advisory | enforcing> + Mode: <reverse-eng | greenfield | steady-state> + Workflow: change=<on|off>, verification=<on|off>, mece=<on|off> Codelinks: <enabled | disabled> Traceability: <N required link chains | no required links> @@ -461,24 +546,30 @@ Detected projects: Links: <comma-separated> Available skills (Claude Code): - pharaoh:setup - This skill (project setup and configuration) - pharaoh:change - Analyze impact of a requirement change - pharaoh:trace - Navigate traceability links in any direction - pharaoh:mece - Check for gaps, redundancies, and inconsistencies - pharaoh:req-draft - Draft new requirements as traceable sphinx-needs directives - pharaoh:req-review - Validate requirements against linked specs and implementations - pharaoh:release - Generate changelogs and release summaries - pharaoh:plan - Break changes into structured implementation tasks + <enumerate from `skills/pharaoh-*/SKILL.md` frontmatter at runtime — + do not hardcode. The skill list has grown beyond the original 8 happy-path + agents to include atomic skills like pharaoh:req-draft, pharaoh:req-review, + pharaoh:arch-draft, pharaoh:arch-review, pharaoh:vplan-draft, + pharaoh:vplan-review, pharaoh:fmea, pharaoh:tailor-detect, + pharaoh:tailor-fill, pharaoh:audit-fanout, and others.> ``` If Copilot agents were installed, also show: ``` Available agents (GitHub Copilot): - @pharaoh.setup @pharaoh.change @pharaoh.trace @pharaoh.mece - @pharaoh.author @pharaoh.verify @pharaoh.release @pharaoh.plan - -Workflow: @pharaoh.change -> @pharaoh.author -> @pharaoh.verify -> @pharaoh.release + <enumerate from the copied .github/agents/pharaoh.*.agent.md files — + do not hardcode. 
One entry per installed agent, formatted as @pharaoh.<name>.> + +Orchestration agents (coordinate atomic agents for end-to-end flows): + @pharaoh.flow, @pharaoh.process-audit, @pharaoh.write-plan, @pharaoh.execute-plan, ... + (again, discover from installed agents rather than hardcoding) + +For reverse-engineering requirements or architecture from code, use + @pharaoh.write-plan to generate a plan.yaml (choose a template such as + reverse-engineer-project or reverse-engineer-module) and @pharaoh.execute-plan + to run it. The deleted @pharaoh.reqs-from-module skill has been replaced by + this plan-based flow. ``` End with a recommendation to run the MECE check: diff --git a/skills/pharaoh-sphinx-extension-add/SKILL.md b/skills/pharaoh-sphinx-extension-add/SKILL.md new file mode 100644 index 0000000..f520dec --- /dev/null +++ b/skills/pharaoh-sphinx-extension-add/SKILL.md @@ -0,0 +1,156 @@ +--- +name: pharaoh-sphinx-extension-add +description: Use when you need to idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. Invoked by plans produced by pharaoh-write-plan when a diagram-emitting task requires a renderer extension that `conf.py` does not yet load. Does NOT emit RST. Does NOT build. +--- + +# pharaoh-sphinx-extension-add + +## When to use + +Invoke when a plan requires a Sphinx extension that `conf.py` does not currently load (e.g. `sphinxcontrib.mermaid` for Mermaid diagrams, `sphinxcontrib.plantuml` for PlantUML, `sphinx_needs` for sphinx-needs itself). Typical caller: `pharaoh-execute-plan` executing a task that `pharaoh-write-plan` inserted as a prerequisite to a diagram-emitting task when Step 3.5's dep probe found a missing extension. + +Do NOT invoke to set arbitrary `conf.py` variables — this skill only touches the `extensions` list (and optionally triggers a pypi install). 
Do NOT invoke to load `sphinx_needs` on a project that never had sphinx-needs — that is `pharaoh-bootstrap`'s indivisible concern (which already includes extension injection as part of the bootstrap transaction). + +## Atomicity + +- (a) **Indivisible.** One `conf.py` + one extension list in → one updated `conf.py` (and optionally one package-manager install) out. No other `conf.py` mutation. No RST edits. No downstream skill invocation. +- (b) **Typed I/O.** + - Input: `{conf_py: str, extensions: list[str], install_if_missing: bool, on_package_manager_missing?: "fail"|"warn"|"skip", reporter_id: str}`. + - Output: `{files_modified: list[str], extensions_added: list[str], extensions_already_present: list[str], install_command_used: str | null, packages_installed: list[str], warnings: list[str]}`. Idempotent: when the extensions are already present AND (installed OR `install_if_missing == false`), `files_modified` and `install_command_used` are empty. +- (c) **Execution-based reward.** Fixture `pharaoh-validation/fixtures/pharaoh-sphinx-extension-add/`: + - `case_fresh/conf.py` — has `extensions = ['sphinx_needs']`. Call with `extensions: ['sphinxcontrib.mermaid']`, `install_if_missing: true`. Scorer asserts (1) `conf.py` now has both entries, (2) `sphinxcontrib-mermaid` is importable, (3) `extensions_added == ['sphinxcontrib.mermaid']`, (4) `packages_installed == ['sphinxcontrib-mermaid']`. + - `case_already_present/conf.py` — has `['sphinx_needs', 'sphinxcontrib.mermaid']`. Same call. Scorer asserts (1) `conf.py` unchanged (byte-identical), (2) `extensions_added == []`, `extensions_already_present == ['sphinxcontrib.mermaid']`, (3) `install_command_used is null`. + - `case_no_install/conf.py` — has `['sphinx_needs']`, extension `sphinxcontrib.plantuml` NOT installed. Call with `install_if_missing: false`. Scorer asserts (1) `conf.py` now has the entry, (2) `packages_installed == []`, (3) `warnings` contains one entry naming the missing package. 
+ - Idempotence: re-running any case returns `files_modified == []`, `extensions_added == []`. +- (d) **Reusable.** Any Sphinx project, any extension. Not tied to diagrams — a future use case might be adding `sphinxcontrib.bibtex` or `myst_parser`. +- (e) **Composable.** Invoked inline (by `pharaoh-execute-plan` per plan task) or by humans via the CLI. Does not call other skills. + +## Input + +- `conf_py` (required): absolute path to a Sphinx `conf.py`. Must exist and be parseable Python. +- `extensions` (required): list of extension module paths (the strings that go inside `extensions = [...]`). Example: `["sphinxcontrib.mermaid"]`, `["sphinxcontrib.plantuml", "myst_parser"]`. +- `install_if_missing` (required): bool. If `true` and an extension module is not importable, attempt a package install before editing `conf.py` (order: install first, then edit, so a failed install does not leave `conf.py` referencing a missing module). If `false`, edit `conf.py` regardless of importability; record a warning per missing module. +- `on_package_manager_missing` (optional): `"fail"` | `"warn"` | `"skip"`. Default `"warn"`. Applies only when `install_if_missing` is `true` and no package manager is detectable (see package-manager detection table below). + - `"fail"`: abort before any edit. + - `"warn"`: log warning, proceed to edit `conf.py` anyway (user will install manually). + - `"skip"`: silently proceed to edit (no warning). Used by callers that intentionally edit `conf.py` in environments where pypi installation is handled elsewhere (e.g. CI build image pre-baked). +- `reporter_id` (required): short agent id, for audit logs. 
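A minimal example invocation input matching the schema above; the path and reporter value are illustrative:

```json
{
  "conf_py": "/work/myproject/docs/conf.py",
  "extensions": ["sphinxcontrib.mermaid"],
  "install_if_missing": true,
  "on_package_manager_missing": "warn",
  "reporter_id": "pharaoh-execute-plan"
}
```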
+ +## Output + +```json +{ + "files_modified": ["docs/conf.py"], + "extensions_added": ["sphinxcontrib.mermaid"], + "extensions_already_present": [], + "install_command_used": "uv pip install sphinxcontrib-mermaid", + "packages_installed": ["sphinxcontrib-mermaid"], + "warnings": [] +} +``` + +`install_command_used` is `null` when nothing was installed. + +`packages_installed` lists the pypi package names (not the extension module paths — those differ: module `sphinxcontrib.mermaid` ships in pypi package `sphinxcontrib-mermaid`; see the extension → package resolution table). + +## Process + +### Step 0: Parse `conf.py`'s current extensions list + +Read `conf_py`. Locate the `extensions = [...]` assignment. Three cases: + +1. **Assignment present and parseable.** Extract the current list as a Python list of strings. +2. **Assignment missing.** Record empty list; the Edit step will append a new assignment. +3. **Parse error on the assignment** (e.g. `extensions = get_extensions()`). Abort: + ``` + FAIL: extensions = ... in conf.py is not a literal list. This skill cannot safely mutate computed extension lists. Edit manually. + ``` + +### Step 1: Classify each requested extension + +For each entry in input `extensions`: + +- **Already present** in the parsed current list → add to `extensions_already_present`, skip. +- **Missing AND importable** (`python -c "import <module_path>"` exits zero) → target for edit only, no install. +- **Missing AND not importable** → target for install + edit (if `install_if_missing`), or edit + warn (if not). + +### Step 2: Install (conditional) + +Only if `install_if_missing == true` AND the target set from Step 1 includes one or more non-importable modules. + +**2a. 
Resolve pypi package names.** Use the extension → pypi resolution table:
+
+| Extension module | Pypi package |
+| ------------------------- | ------------------------ |
+| `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` |
+| `sphinxcontrib.plantuml` | `sphinxcontrib-plantuml` |
+| `sphinxcontrib.bibtex` | `sphinxcontrib-bibtex` |
+| `myst_parser` | `myst-parser` |
+| `sphinx_copybutton` | `sphinx-copybutton` |
+| `sphinx_design` | `sphinx-design` |
+| `sphinx_needs` | `sphinx-needs` |
+| `sphinx_codelinks` | `sphinx-codelinks` |
+| `sphinxcontrib.<name>` | `sphinxcontrib-<name>` (default rule when not otherwise listed) |
+| `<other>` | `<other>` with `_` → `-` (default rule) |
+
+Unknown extensions use the default rule. A caller who is certain of the pypi name can pass it as the module path anyway — the skill treats the input as authoritative and derives the install target via the rule above (a name that already contains hyphens maps to itself under the default rule), and a wrong derivation fails visibly at install time, so the outcome is self-correcting.
+
+**2b. Detect package manager.** Same six-row table as `pharaoh-bootstrap` Step 0c (rye / uv / poetry / pipenv / pdm / pip-venv). Closer indicator wins.
+
+If no package manager is detected, branch on `on_package_manager_missing`:
+- `"fail"` → abort before editing `conf.py`.
+- `"warn"` → emit warning; go to Step 3 (edit `conf.py`); `packages_installed` stays empty.
+- `"skip"` → go to Step 3 silently.
+
+**2c. Run install.** For each pypi package not yet installed, run the add/install command (e.g. `rye add sphinxcontrib-mermaid`, `uv pip install sphinxcontrib-mermaid`). Capture exit code per package. If any install fails:
+
+- If other packages in the batch succeeded, record the failure in `warnings` but proceed to edit `conf.py` for the successful ones; skip `conf.py` entry for the failed ones.
+- If ALL installs failed, abort without editing. Record all failures in `warnings`.
+
+### Step 3: Edit `conf.py`
+
+For each target in Step 1's "missing" set that passed Step 2 (installed or skipped by design):
+
+1.
**Extensions assignment exists.** Insert the extension string as the last entry, preserving indentation and trailing-comma conventions. If the existing list is on one line, append inline; if multi-line, append as a new line matching the indent of the last existing entry. +2. **Extensions assignment missing.** Append a new line `extensions = ["<ext>"]` after the last existing top-level assignment. Add a blank line before for readability. + +Preserve comments and blank lines around the assignment. Do NOT reorder existing entries. + +### Step 4: Verify the edit + +Re-read `conf_py`. Parse the `extensions = [...]` assignment again. Confirm every requested extension is present. If any is missing (edit did not take effect), abort with `FAIL: edit verification failed for <ext>; conf.py may be in an inconsistent state`. + +### Step 5: Return + +Emit the output JSON. Populate: + +- `files_modified`: `[conf_py]` if any edit happened; `[]` otherwise. +- `extensions_added`: extensions the edit introduced. +- `extensions_already_present`: extensions that were already in the list. +- `install_command_used`: the package-manager-specific command (e.g. `uv pip install sphinxcontrib-mermaid`) if any install ran; `null` otherwise. If multiple packages installed in separate commands, this is the last one (kept simple — callers who want the full history read `packages_installed`). +- `packages_installed`: pypi names of packages actually installed. +- `warnings`: any warning surfaced along the way. + +## Failure modes + +| Condition | Response | +| ------------------------------------------------------- | ----------------------------------------------------------- | +| `conf_py` missing | FAIL naming the path. | +| `extensions` empty list | FAIL: `"extensions input must contain at least one entry"`. | +| `extensions = ...` in `conf.py` is not a literal list | FAIL per Step 0. | +| All installs fail | FAIL without editing. Record failures in warnings. 
| +| Partial install failure | Edit for the successes; warn for the failures; no edit for failures. | +| Package manager not detected AND `on_package_manager_missing == "fail"` | FAIL before editing. | + +## Non-goals + +- **No `conf.py` mutation outside the `extensions` list.** Related settings (`mermaid_output_format`, `plantuml` path config) are deliberately not touched. Callers that need those set should invoke a different skill (or author a future `pharaoh-sphinx-option-set`). +- **No multi-file edits.** Only the named `conf_py` file. Multi-project Sphinx trees with multiple `conf.py` files need one invocation per file. +- **No `pyproject.toml` pinning.** The install command may or may not persist the dependency to `pyproject.toml` depending on the package manager (rye/uv/poetry/pdm persist; raw `pip install` does not). The skill does not second-guess the caller's pinning strategy. +- **No dry-run mode.** If the caller wants to preview changes, they can diff `conf.py` after the call — the skill is fast and idempotent, so a "run, review, revert" loop is cheaper than a separate dry-run code path. + +## Composition + +- `pharaoh-write-plan` Step 3.5 (dep probe) transitions from warn-only to task-insertion: when `conf.py` is missing a renderer extension required by a diagram-emitting task, the plan emits a `pharaoh-sphinx-extension-add` task as a dependency of the diagram task (or group of diagram tasks). The probe's warnings still include the install command as a human-readable handoff. +- `pharaoh-bootstrap` remains the authoritative entry for `sphinx_needs` itself (the bootstrap transaction covers extension + types + `needs_from_toml` as one atomic step). This skill is for post-bootstrap additions. +- `pharaoh-quality-gate` does NOT run this skill. Gate is read-only; extension adds are plan tasks. 
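The name-resolution rule (Step 2a) and the literal-list edit (Steps 0 and 3) described above can be sketched as below. This is a minimal sketch: unlike the skill, it does not preserve multi-line formatting or comments around the `extensions` assignment.

```python
import ast
import re

# Overrides from the resolution table; anything absent falls through to the
# default rules (sphinxcontrib.<name> -> sphinxcontrib-<name>, else _ -> -).
_KNOWN = {
    "myst_parser": "myst-parser",
    "sphinx_copybutton": "sphinx-copybutton",
    "sphinx_design": "sphinx-design",
    "sphinx_needs": "sphinx-needs",
    "sphinx_codelinks": "sphinx-codelinks",
}

def pypi_name(module: str) -> str:
    """Resolve an extension module path to its pypi package name."""
    if module in _KNOWN:
        return _KNOWN[module]
    if module.startswith("sphinxcontrib."):
        return "sphinxcontrib-" + module.split(".", 1)[1].replace("_", "-")
    return module.replace("_", "-")

def add_extensions(conf_source: str, wanted: list[str]) -> tuple[str, list[str]]:
    """Return (new_conf_source, extensions_added). Raises ValueError when
    the assignment is not a literal list (Step 0, case 3)."""
    current = None
    for node in ast.parse(conf_source).body:
        if isinstance(node, ast.Assign) and any(
            getattr(t, "id", None) == "extensions" for t in node.targets
        ):
            if not (isinstance(node.value, ast.List)
                    and all(isinstance(e, ast.Constant) for e in node.value.elts)):
                raise ValueError("extensions = ... is not a literal list")
            current = [e.value for e in node.value.elts]
    added = [m for m in wanted if current is None or m not in current]
    if not added:
        return conf_source, []  # idempotent: nothing to do
    new_line = "extensions = [" + ", ".join(
        repr(m) for m in (current or []) + added) + "]"
    if current is None:
        # Assignment missing: append a fresh one (Step 3, case 2).
        return conf_source.rstrip("\n") + "\n\n" + new_line + "\n", added
    # Assignment present: rewrite it in place (Step 3, case 1 — simplified).
    return re.sub(r"extensions\s*=\s*\[[^\]]*\]", new_line,
                  conf_source, count=1), added
```

Because `add_extensions` is a pure function over the source text, the Step 4 verification reduces to re-running the parse on its return value.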
diff --git a/skills/pharaoh-standard-conformance/SKILL.md b/skills/pharaoh-standard-conformance/SKILL.md index d0f5360..4c379a3 100644 --- a/skills/pharaoh-standard-conformance/SKILL.md +++ b/skills/pharaoh-standard-conformance/SKILL.md @@ -145,8 +145,8 @@ Supply either a need-id or an RST directive block. Read `.pharaoh/project/artefact-catalog.yaml` and `.pharaoh/project/id-conventions.yaml`. Extract `required_fields` for the artefact type and `id_regex` for the type prefix. -If tailoring files are missing, apply Score defaults: -- `gd_req` required fields: `[id, status, satisfies]` +If tailoring files are missing, apply the built-in defaults (bundled example profile): +- `req` required fields: `[id, status, satisfies]` - `arch` required fields: `[id, status, satisfies, type]` - `tc` required fields: `[id, status, verifies]` diff --git a/skills/pharaoh-state-diagram-draft/SKILL.md b/skills/pharaoh-state-diagram-draft/SKILL.md new file mode 100644 index 0000000..c36f7b7 --- /dev/null +++ b/skills/pharaoh-state-diagram-draft/SKILL.md @@ -0,0 +1,95 @@ +--- +name: pharaoh-state-diagram-draft +description: Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or class diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). +--- + +# pharaoh-state-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-state-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](../shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.state]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md). Every emitted label / node id / transition label MUST be sanitised per that rule set before the block leaves this skill. 
`sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one state-machine diagram. Captures **discrete states** of a component/entity and **labeled transitions** between them, with optional events, guards, and actions per transition. + +Does NOT capture static structure (→ `pharaoh-component-diagram-draft`, `pharaoh-class-diagram-draft`). Does NOT capture ordered multi-participant interactions (→ `pharaoh-sequence-diagram-draft`). + +## Atomicity + +- (a) One state machine in → one diagram out. Nested state machines (composite states) are one machine; two independent machines = two skill invocations. +- (b) Input: `{view_title: str, states: list[StateSpec], transitions: list[TransitionSpec], initial_state: str, terminal_states?: list[str], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `StateSpec = {id: str, label: str, kind?: "simple"|"composite"|"choice"|"junction", sub_states?: list[StateSpec], entry?: str, exit?: str}`, `TransitionSpec = {from: str, to: str, event?: str, guard?: str, action?: str}`. Output: one RST directive block. +- (c) Reward: fixture — lifecycle `draft → in_review → approved → published`, plus `rejected` terminal off `in_review`. Scorer: + 1. Output starts with renderer-specific directive. + 2. Exactly one initial-state marker (`[*] -->` Mermaid, `[*] -->` PlantUML). + 3. `initial_state` is the target of the initial-state arrow. + 4. Every ID in `states` appears as a state node. + 5. Every transition renders with correct arrow and label (`event [guard] / action`). + 6. Every ID in `terminal_states` has a transition `→ [*]`. + 7. With a composite state containing sub_states, the sub-states are nested inside the composite (Mermaid: `state Foo { ... }`; PlantUML: `state Foo { ... 
}`). + + Pass = all 7. +- (d) Reusable: any lifecycle (workflow states, device modes, protocol states, order status machine). +- (e) One machine per call. + +## Input highlights + +- `states`: all states, possibly nested via `sub_states`. Composite states declare `kind = "composite"`. +- `transitions`: `from`/`to` reference state IDs, including sub-state IDs (cross-boundary transitions supported). +- `initial_state`: REQUIRED. Must be an ID in `states`. There is exactly one initial state; if the machine has multiple "entry points" from outer context, model them via transitions from `[*]` at the composite level. +- `terminal_states` (optional): list of IDs that have implicit transition to `[*]` (final pseudo-state). A machine may have zero terminal states (infinite loop) — valid. + +## Transition label format + +Renderer-independent format: `event [guard] / action`. +- `event` optional (unlabeled transition = auto). +- `guard` in square brackets, optional. +- `action` after slash, optional. + +If all three are absent, render an unlabeled arrow. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + stateDiagram-v2 + [*] --> draft + draft --> in_review : submit + in_review --> approved : approve [reviewer_count >= 2] + in_review --> rejected : reject / notify_author + approved --> published : publish + rejected --> [*] + published --> [*] +``` + +**PlantUML:** +```rst +.. uml:: + :caption: <view_title> + + @startuml + [*] --> draft + draft --> in_review : submit + in_review --> approved : approve [reviewer_count >= 2] + in_review --> rejected : reject / notify_author + approved --> published : publish + rejected --> [*] + published --> [*] + @enduml +``` + +## Non-goals + +- No state-from-code extraction — callers supply states and transitions explicitly. A future `pharaoh-states-from-source` skill could infer from match statements / switch / FSM libraries, but is a separate concern. 
+- No timing annotations (real-time deadlines, timer events) — sequence diagrams are a better fit for temporal constraints. +- No concurrency regions by default — a future extension may add orthogonal regions; for now, sub_states are strictly hierarchical. +- No auto-detection of terminal states — caller provides them. + +## Interaction with tailoring + +Some projects (e.g. workflow-heavy ubproject.toml with lifecycle state enums) already declare state machines implicitly — sphinx-needs `status` enum is a two-line state machine. A caller might want to derive the diagram from the project's `workflows.yaml` (when present). That derivation is NOT this skill's concern; the caller invokes `pharaoh-state-diagram-draft` with explicit `states` and `transitions`. A wrapper that reads `workflows.yaml` and calls this skill is orchestration, not atomic. diff --git a/skills/pharaoh-status-lifecycle-check/SKILL.md b/skills/pharaoh-status-lifecycle-check/SKILL.md new file mode 100644 index 0000000..efe7f45 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/SKILL.md @@ -0,0 +1,113 @@ +--- +name: pharaoh-status-lifecycle-check +description: Use when running a release-gate check over a full sphinx-needs corpus to confirm that zero needs remain in the initial `draft` status. Single mechanical binary gate — aggregates `status` across every need in `needs.json`, compares against the initial-state declaration in `workflows.yaml`, and returns pass/fail plus per-status counts. Advisory by default (pre-release development passes); release pipelines override `enforce=true` so any draft blocks the gate. +--- + +# pharaoh-status-lifecycle-check + +## When to use + +Invoke from a release pipeline, from `pharaoh-quality-gate` as the delegated check for the `status-lifecycle-healthy` invariant, or standalone when auditing whether a corpus is past the draft stage. 
Reads the project's `workflows.yaml` (for the initial-state name) plus `needs.json` (for each need's current `status`), buckets the needs, and emits a single findings JSON with `draft_count` and an `overall` verdict. + +Do NOT invoke to check whether a single transition `from → to` is legal — that is the per-need state-machine question answered by `pharaoh-lifecycle-check`. That skill checks one need, one proposed transition, consults the `transitions` list with `requires:` prerequisites, and answers "is this move allowed right now?". This skill answers a different question over a different input: given the whole corpus, how many needs are still in the initial `draft` bucket, and does the release policy tolerate any? No per-need transition walk, no prerequisite resolution — only the binary "current status bucket" aggregation. + +Do NOT invoke to score percentage thresholds like "≥50% past draft". The plan that commissioned this atom explicitly rejects fuzzy thresholds for release gates. Under `enforce=true` the gate is binary: zero drafts pass, one draft fails. Under `enforce=false` the output reports counts without failing, so callers still see the distribution. + +Do NOT invoke to transition needs. Read-only audit. + +## Atomicity + +- (a) Indivisible: one `workflows.yaml` + one `needs.json` in → one findings JSON out. No per-need transition checks, no set-level re-authoring, no dispatch of other skills. +- (b) Input: `{workflow_path: <str>, needs_json_path: <str>, enforce: bool (default false)}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-status-lifecycle-check/fixtures/` — one per outcome branch: + 1. `all-draft-enforcing/` — every need `status: draft`, `enforce: true`. Expected: `overall: "fail"`, `draft_count` equals total, `blockers` lists the draft need ids. + 2. `all-draft-advisory/` — every need `status: draft`, `enforce: false`. 
Expected: `overall: "pass"` with an advisory `blockers` entry describing the drafts without failing the gate.
+  3. `mixed-enforcing/` — some drafts, some past draft, `enforce: true`. Expected: `overall: "fail"`, `blockers` lists only the draft need ids.
+  4. `fully-reviewed-enforcing/` — every need past draft, `enforce: true`. Expected: `overall: "pass"`, empty `blockers`.
+
+  Pass = each fixture's actual output matches `expected-output.json` modulo ordering of need ids inside the `blockers` list.
+- (d) Reusable across projects — lifecycle state names come from `workflows.yaml` (the project declares them); only the bucket named by `initial_state` (or the literal `draft` fallback) triggers the gate. No project-specific vocabulary in the base.
+- (e) Read-only. Does not modify `workflows.yaml`, `needs.json`, or any need status.
+
+## Input
+
+- `workflow_path`: absolute path to the project's `workflows.yaml` (typically `.pharaoh/project/workflows.yaml`). The skill reads two keys:
+  - `initial_state` (optional): the state name that signals "not yet reviewed". If absent, the skill falls back to the literal string `"draft"` and records a note.
+  - `lifecycle_states` (optional): map of declared state names. Used only to validate that every observed status is declared; unknown statuses surface in `notes` but do not change the verdict.
+- `needs_json_path`: absolute path to the project's `needs.json` (typically `docs/_build/needs/needs.json`). The skill reads the ID map (nested under `versions.<v>.needs` in sphinx-needs output) and inspects each entry's `status` field.
+- `enforce`: boolean, default `false`. When `false`, the skill runs in advisory mode — counts and lists drafts but always emits `overall: "pass"`. When `true`, any draft flips `overall` to `"fail"`.
+
+Edge cases:
+- `workflow_path` missing or unparseable → emit `overall: "fail"` with blocker `"workflows.yaml unresolved: <path>"` regardless of `enforce` (cannot decide without the initial-state name).
+- `needs_json_path` missing or unparseable → emit `overall: "fail"` with blocker `"needs.json unresolved: <path>"` regardless of `enforce`. +- `needs.json` contains zero needs → emit `overall: "pass"`, `draft_count: 0`, `notes: ["needs.json empty — nothing to gate"]`. +- Need lacks a `status` field → bucket it under the literal key `"<missing>"`, count it as past-draft for gate purposes, and surface it in `notes`. This avoids crashing on malformed corpora while keeping the gate focused on `draft`. + +## Output + +```json +{ + "needs_by_status": {"draft": 40, "reviewed": 0, "approved": 0, "released": 0}, + "draft_count": 40, + "enforce": true, + "overall": "fail", + "blockers": [ + "40 needs still in draft status; release gate requires zero drafts", + "comp_req__example_a", + "comp_req__example_b", + "..." + ], + "notes": [] +} +``` + +Fields: +- `needs_by_status`: bucket counts keyed by every status value observed in `needs.json`. Entries with zero are included for states declared in `workflows.yaml.lifecycle_states` so downstream dashboards see a stable shape; observed-but-undeclared statuses are included with their actual count and added to `notes`. +- `draft_count`: count of needs whose status equals the `initial_state` from `workflows.yaml` (fallback literal `"draft"`). +- `enforce`: echo of the input flag so downstream callers can distinguish advisory from release runs without re-reading their own config. +- `overall`: `"pass"` when `enforce=false` OR `draft_count == 0`. `"fail"` otherwise, or when preconditions (workflow/needs files) failed to resolve. +- `blockers`: in `enforce=true` mode with `draft_count > 0`, one summary line plus one entry per draft need id (capped at the first 500 ids to bound output size; overflow surfaces as a single `"... and N more"` line). In advisory mode with `draft_count > 0`, a single informational entry like `"advisory: 40 needs in draft; release gate not enforced"` — `overall` stays `"pass"`. In pass cases, empty list. 
+- `notes`: informational observations — fallback `initial_state` used, undeclared statuses observed, missing `status` fields, empty corpus.
+
+## Detection rule
+
+One mechanical check. No LLM judgement.
+
+### 1. `draft_count_against_enforce`
+
+**Check:** Parse `workflows.yaml` for `initial_state` (fallback `"draft"`). Iterate `needs.json`, count needs whose `status` equals the initial-state name. If `enforce=true` and the count is non-zero, fail. Otherwise pass.
+
+**Detection:**
+```bash
+# Initial state name (fallback to literal "draft"; tolerate an empty YAML file):
+initial=$(python -c "import yaml; d=yaml.safe_load(open('$workflow_path')); print((d or {}).get('initial_state','draft'))")
+
+# Count drafts:
+python -c "
+import json
+data = json.load(open('$needs_json_path'))
+if isinstance(data, dict) and 'versions' in data:
+    # sphinx-needs nests the ID map under versions.<v>.needs (see fixtures)
+    items = [n for v in data['versions'].values() for n in v.get('needs', {}).values()]
+else:
+    items = data if isinstance(data, list) else data.get('needs', data)
+    if isinstance(items, dict):
+        items = list(items.values())
+count = sum(1 for n in items if n.get('status') == '$initial')
+print(count)
+"
+
+# Gate:
+# enforce == true AND count > 0 → fail
+# else → pass
+```
+
+The skill performs the same extraction in whatever runtime the caller invokes (direct tool use, subagent shell); the snippet above is the reference implementation.
+
+## Tailoring extension point
+
+The `initial_state` name is read from `workflows.yaml` — the project tailors what "draft" means by declaring the state there. No other knobs are exposed on this skill; a project that wants percentage-based reporting should wire a separate metric collector rather than weaken this binary gate.
+
+## Composition
+
+Role: `atom-check`.
+
+Called from `pharaoh-quality-gate` as the delegated check for the `status-lifecycle-healthy` invariant (pass requirement: `overall == "pass"` with `enforce=true` set by the release pipeline). Also callable standalone from any release workflow that wants the binary gate without the full quality-gate pipeline. Never dispatches other skills. Never modifies tailoring, needs.json, or need status.
+ +Distinct from `pharaoh-lifecycle-check`: that skill is per-need and consults the `transitions` list (`from → to` legality with `requires:` prerequisites); this skill is corpus-wide and consults only the initial-state bucket. diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/README.md b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/README.md new file mode 100644 index 0000000..e40f0b5 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/README.md @@ -0,0 +1,3 @@ +# all-draft-advisory + +Advisory mode over the same fully unreviewed corpus. Every need is `status: draft`; `enforce: false`. Expected verdict is `overall: "pass"` — the gate does not block pre-release development. `draft_count` still reports the true count so consumers (dashboards, pharaoh-quality-gate summaries) see the distribution, and `blockers` carries a single informational line flagging the advisory drafts rather than failing. diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/expected-output.json b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/expected-output.json new file mode 100644 index 0000000..186b636 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/expected-output.json @@ -0,0 +1,10 @@ +{ + "needs_by_status": {"draft": 3, "reviewed": 0, "approved": 0, "released": 0}, + "draft_count": 3, + "enforce": false, + "overall": "pass", + "blockers": [ + "advisory: 3 needs in draft; release gate not enforced" + ], + "notes": [] +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-needs.json b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-needs.json new file mode 100644 index 0000000..fd0dbf1 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-needs.json @@ -0,0 +1,26 @@ +{ + "versions": { + "1.0": { + "needs": { + 
"CREQ_inventory_read": { + "id": "CREQ_inventory_read", + "type": "comp_req", + "title": "Inventory reader", + "status": "draft" + }, + "CREQ_inventory_validate": { + "id": "CREQ_inventory_validate", + "type": "comp_req", + "title": "Inventory validation", + "status": "draft" + }, + "CREQ_inventory_export": { + "id": "CREQ_inventory_export", + "type": "comp_req", + "title": "Inventory export", + "status": "draft" + } + } + } + } +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-workflows.yaml b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-workflows.yaml new file mode 100644 index 0000000..9d22ab3 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-advisory/input-workflows.yaml @@ -0,0 +1,10 @@ +initial_state: draft +lifecycle_states: + draft: {} + reviewed: {} + approved: {} + released: {} +transitions: + - {from: draft, to: reviewed, requires: [independent_review_complete]} + - {from: reviewed, to: approved, requires: [inspection_record_present]} + - {from: approved, to: released, requires: []} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/README.md b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/README.md new file mode 100644 index 0000000..09fe00c --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/README.md @@ -0,0 +1,3 @@ +# all-draft-enforcing + +Release-gate mode with a fully unreviewed corpus. Every need is `status: draft`; `enforce: true`. Expected verdict is `overall: "fail"` — `draft_count` equals the total, and `blockers` carries a summary line plus one entry per draft need id. The `needs_by_status` map shows declared states with zero buckets for the ones nothing has reached yet, proving the skill reports the full shape rather than only observed keys. 
diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/expected-output.json b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/expected-output.json new file mode 100644 index 0000000..887b5a0 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/expected-output.json @@ -0,0 +1,13 @@ +{ + "needs_by_status": {"draft": 3, "reviewed": 0, "approved": 0, "released": 0}, + "draft_count": 3, + "enforce": true, + "overall": "fail", + "blockers": [ + "3 needs still in draft status; release gate requires zero drafts", + "CREQ_inventory_read", + "CREQ_inventory_validate", + "CREQ_inventory_export" + ], + "notes": [] +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-needs.json b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-needs.json new file mode 100644 index 0000000..fd0dbf1 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-needs.json @@ -0,0 +1,26 @@ +{ + "versions": { + "1.0": { + "needs": { + "CREQ_inventory_read": { + "id": "CREQ_inventory_read", + "type": "comp_req", + "title": "Inventory reader", + "status": "draft" + }, + "CREQ_inventory_validate": { + "id": "CREQ_inventory_validate", + "type": "comp_req", + "title": "Inventory validation", + "status": "draft" + }, + "CREQ_inventory_export": { + "id": "CREQ_inventory_export", + "type": "comp_req", + "title": "Inventory export", + "status": "draft" + } + } + } + } +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-workflows.yaml b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-workflows.yaml new file mode 100644 index 0000000..9d22ab3 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/all-draft-enforcing/input-workflows.yaml @@ -0,0 +1,10 @@ +initial_state: draft +lifecycle_states: + draft: {} + reviewed: {} + approved: {} + released: {} +transitions: + - 
{from: draft, to: reviewed, requires: [independent_review_complete]} + - {from: reviewed, to: approved, requires: [inspection_record_present]} + - {from: approved, to: released, requires: []} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/README.md b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/README.md new file mode 100644 index 0000000..77d0644 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/README.md @@ -0,0 +1,3 @@ +# fully-reviewed-enforcing + +Release-gate happy path. Every need is past `draft` — one `reviewed`, one `approved`, one `released`; `enforce: true`. Expected verdict is `overall: "pass"` with `draft_count: 0` and an empty `blockers` list. Confirms the binary gate lets a fully-promoted corpus through and does not accidentally fail on needs that stopped at `reviewed` or `approved` (past draft is sufficient; the gate does not require `released`). diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/expected-output.json b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/expected-output.json new file mode 100644 index 0000000..f236f9c --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/expected-output.json @@ -0,0 +1,8 @@ +{ + "needs_by_status": {"draft": 0, "reviewed": 1, "approved": 1, "released": 1}, + "draft_count": 0, + "enforce": true, + "overall": "pass", + "blockers": [], + "notes": [] +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-needs.json b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-needs.json new file mode 100644 index 0000000..4990ae4 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-needs.json @@ -0,0 +1,26 @@ +{ + "versions": { + "1.0": { + "needs": { + "CREQ_inventory_read": { + "id": "CREQ_inventory_read", + 
"type": "comp_req", + "title": "Inventory reader", + "status": "reviewed" + }, + "CREQ_inventory_validate": { + "id": "CREQ_inventory_validate", + "type": "comp_req", + "title": "Inventory validation", + "status": "approved" + }, + "CREQ_inventory_export": { + "id": "CREQ_inventory_export", + "type": "comp_req", + "title": "Inventory export", + "status": "released" + } + } + } + } +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-workflows.yaml b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-workflows.yaml new file mode 100644 index 0000000..9d22ab3 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/fully-reviewed-enforcing/input-workflows.yaml @@ -0,0 +1,10 @@ +initial_state: draft +lifecycle_states: + draft: {} + reviewed: {} + approved: {} + released: {} +transitions: + - {from: draft, to: reviewed, requires: [independent_review_complete]} + - {from: reviewed, to: approved, requires: [inspection_record_present]} + - {from: approved, to: released, requires: []} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/README.md b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/README.md new file mode 100644 index 0000000..ba389df --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/README.md @@ -0,0 +1,3 @@ +# mixed-enforcing + +Release-gate mode over a partially reviewed corpus. Two needs are `draft`, one is `reviewed`, one is `approved`; `enforce: true`. Expected verdict is `overall: "fail"` — the binary gate blocks on any remaining drafts. `blockers` lists only the two draft need ids, not the reviewed or approved ones, proving the skill enumerates drafts rather than "everything not released". `needs_by_status` reports the four buckets with observed counts. 
diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/expected-output.json b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/expected-output.json new file mode 100644 index 0000000..cdd50ce --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/expected-output.json @@ -0,0 +1,12 @@ +{ + "needs_by_status": {"draft": 2, "reviewed": 1, "approved": 1, "released": 0}, + "draft_count": 2, + "enforce": true, + "overall": "fail", + "blockers": [ + "2 needs still in draft status; release gate requires zero drafts", + "CREQ_inventory_validate", + "CREQ_inventory_checksum" + ], + "notes": [] +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-needs.json b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-needs.json new file mode 100644 index 0000000..3866cdb --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-needs.json @@ -0,0 +1,32 @@ +{ + "versions": { + "1.0": { + "needs": { + "CREQ_inventory_read": { + "id": "CREQ_inventory_read", + "type": "comp_req", + "title": "Inventory reader", + "status": "reviewed" + }, + "CREQ_inventory_validate": { + "id": "CREQ_inventory_validate", + "type": "comp_req", + "title": "Inventory validation", + "status": "draft" + }, + "CREQ_inventory_export": { + "id": "CREQ_inventory_export", + "type": "comp_req", + "title": "Inventory export", + "status": "approved" + }, + "CREQ_inventory_checksum": { + "id": "CREQ_inventory_checksum", + "type": "comp_req", + "title": "Inventory checksum", + "status": "draft" + } + } + } + } +} diff --git a/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-workflows.yaml b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-workflows.yaml new file mode 100644 index 0000000..9d22ab3 --- /dev/null +++ b/skills/pharaoh-status-lifecycle-check/fixtures/mixed-enforcing/input-workflows.yaml @@ -0,0 +1,10 @@ +initial_state: draft 
+lifecycle_states: + draft: {} + reviewed: {} + approved: {} + released: {} +transitions: + - {from: draft, to: reviewed, requires: [independent_review_complete]} + - {from: reviewed, to: approved, requires: [inspection_record_present]} + - {from: approved, to: released, requires: []} diff --git a/skills/pharaoh-tailor-bootstrap/SKILL.md b/skills/pharaoh-tailor-bootstrap/SKILL.md new file mode 100644 index 0000000..7163540 --- /dev/null +++ b/skills/pharaoh-tailor-bootstrap/SKILL.md @@ -0,0 +1,114 @@ +--- +name: pharaoh-tailor-bootstrap +description: Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows.yaml, id-conventions.yaml, artefact-catalog.yaml, and per-type checklists — without requiring any needs to exist. Complements pharaoh-tailor-detect which requires ≥10 needs. +--- + +# pharaoh-tailor-bootstrap + +## When to use + +Invoke immediately after `pharaoh-bootstrap` + `pharaoh-setup` on a greenfield sphinx-needs project. The project has declared need types in `ubproject.toml` but has zero needs yet, so `pharaoh-tailor-detect` (which requires ≥10 needs to infer conventions) fails by design. This skill fills the "tailoring donut hole" — it emits minimal but valid tailoring derived from the bootstrap inputs, so: + +- Every need gets a defined lifecycle (`:status: draft` can transition to `reviewed` and `approved`). +- ID patterns are machine-checkable before the first need lands. +- `pharaoh-quality-gate` has a gate spec to evaluate against. +- Review checklists exist per type for `pharaoh-req-review` to consume. + +Do NOT use on a project that already has `.pharaoh/project/*.yaml` files — this skill never overwrites without explicit `overwrite=true`. Run `pharaoh-tailor-detect` / `pharaoh-tailor-fill` on matured projects instead. 
+
+## Atomicity
+
+- (a) Indivisible — one `project_root` in → up to 6 files out in `.pharaoh/project/`. No needs authoring, no RST mutation, no setup-level edits. One tailoring phase × one project.
+- (b) Input: `{project_root: str, overwrite?: bool}`. Output: JSON `{files_created: list[str], files_skipped: list[str], warnings: list[str]}`. Default `overwrite=false`.
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-tailor-bootstrap/input_ubproject.toml` declares two types (`feat` FEAT_, `comp_req` CREQ_) and one extra_link (`satisfies`). Run skill against it. Scorer checks:
+  1. Six files created under `.pharaoh/project/`: `workflows.yaml`, `id-conventions.yaml`, `artefact-catalog.yaml`, `checklists/feat.md`, `checklists/comp_req.md`, `checklists/requirement.md`.
+  2. Each emitted file's content matches the corresponding `expected_output/` fixture byte-exact (YAML canonical-sorted for the YAML files; md exact for the checklists).
+  3. Re-running the skill with `overwrite=false` on the now-populated `.pharaoh/project/` is idempotent: `files_created=[]`, `files_skipped` lists all six, `warnings=[]`.
+  4. Running with `overwrite=true` on populated target regenerates all six (byte-exact equality with fixture preserved).
+
+  Pass = all 4 checks.
+- (d) Reusable on any first-time Pharaoh project.
+- (e) Composable: callable by `pharaoh-setup` post-bootstrap. Never calls other skills.
+
+## Input
+
+- `project_root`: absolute path. Must contain `ubproject.toml` with at least one `[[needs.types]]` entry.
+- `overwrite` (optional): if `true`, regenerate existing tailoring files. Default `false` (skip + warn on collision).
+ +## Output + +```json +{ + "files_created": [ + ".pharaoh/project/workflows.yaml", + ".pharaoh/project/id-conventions.yaml", + ".pharaoh/project/artefact-catalog.yaml", + ".pharaoh/project/checklists/feat.md", + ".pharaoh/project/checklists/comp_req.md", + ".pharaoh/project/checklists/requirement.md" + ], + "files_skipped": [], + "warnings": [] +} +``` + +Paths are relative to `project_root`. + +## Process + +### Step 1: Read ubproject.toml + +Read `<project_root>/ubproject.toml`. Extract every `[[needs.types]]` entry's `directive` and `prefix`. Extract every `[[needs.extra_links]]` entry's `option`, `incoming`, `outgoing`. + +If zero types declared, FAIL: `"no [[needs.types]] in ubproject.toml; run pharaoh-bootstrap first"`. + +### Step 2: Emit workflows.yaml + +For each declared type, emit a block with states `draft`, `reviewed`, `approved`, transitions `draft→reviewed` (gate `reviewer_present`), `reviewed→approved` (gate `approver_present`), `reviewed→draft` (gate `reviewer_rejected`), initial `draft`, final `approved`. + +See `expected_output/workflows.yaml` in the fixture for exact format. + +### Step 3: Emit id-conventions.yaml + +`prefixes`: mapping from each directive to its prefix. `id_regex`: OR-join of all prefixes as a regex anchored to start + snake_case tail. `separator`: `"_"`. + +See fixture for exact format. + +### Step 4: Emit artefact-catalog.yaml + +For each declared type, emit: +- `required_fields`: at minimum `id`, `status`. +- `optional_fields`: `reviewer`, `approved_by`, plus `source_doc` for types that typically carry provenance (heuristic: top-level types like `feat`, `story`, `use_case` — if unsure, include it). +- `child_of`: list of parent types inferred from extra_links. Rule: if `satisfies` link exists, types that commonly use it (`comp_req`, `spec`, `impl`) list their parent (`feat`, `story`, etc.). If no clear inference, `child_of: []` — caller tunes later. +- `lifecycle_ref`: `workflows.yaml#<type>`. 
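The prefix-to-regex derivation in Step 3 can be sketched as follows — `build_id_regex` and the exact snake_case tail pattern are assumptions for illustration; the fixture's `expected_output/id-conventions.yaml` defines the canonical format:

```python
import re

def build_id_regex(prefixes: list[str]) -> str:
    """OR-join the declared prefixes, anchored to the start, followed
    by a snake_case tail (illustrative tail pattern)."""
    alternation = "|".join(re.escape(p) for p in sorted(prefixes))
    return rf"^({alternation})[a-z0-9]+(?:_[a-z0-9]+)*$"

pattern = build_id_regex(["FEAT_", "CREQ_"])
assert re.match(pattern, "CREQ_inventory_read")
assert re.match(pattern, "FEAT_login")
assert not re.match(pattern, "XREQ_other")
```

Sorting the prefixes before joining keeps the emitted regex stable across runs, which matters for the byte-exact idempotency check in the reward.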
+ +### Step 5: Emit per-type checklists + +For each declared type, write `checklists/<directive>.md` with frontmatter `applies_to: <directive>`, `required_before: [reviewed]`, and a short review checklist body. The content is type-generic for `comp_req` and `feat`; see fixtures for exact content. + +For types not covered by built-in templates (anything beyond `feat`, `comp_req`, `story`, `use_case`, `spec`, `impl`, `test`), emit a minimal checklist with `- [ ] Review this <type> for clarity, correctness, and traceability.` + +Additionally, emit `checklists/requirement.md` as a canonical alias for the primary requirement-type checklist (the `comp_req` checklist if declared, otherwise whichever declared type has prefix `REQ_` / role `requirement` per artefact-catalog.yaml). The alias is a one-line redirect `# See [<directive>.md](<directive>.md) — canonical requirement checklist` plus the frontmatter block. Downstream skills (`pharaoh-tailor-review`, `pharaoh-req-review`) reference `checklists/requirement.md` as the well-known filename, so the alias keeps the interop contract stable regardless of the project's directive naming. + +### Step 6: Check overwrite + write + +For each target file, check existence. If exists and `overwrite=false`, add to `files_skipped`, emit warning `"<path> already exists; skipping (use overwrite=true to regenerate)"`. Otherwise write the file and add to `files_created`. + +Create intermediate directories (`.pharaoh/project/`, `.pharaoh/project/checklists/`) as needed. + +### Step 7: Return + +Return the JSON report. + +## Failure modes + +- `ubproject.toml` missing → FAIL. +- Zero types declared → FAIL per Step 1. +- `.pharaoh/project/` unwritable → FAIL. + +## Non-goals + +- No tailoring inference from corpus statistics. For that, use `pharaoh-tailor-detect` + `pharaoh-tailor-fill` on matured projects. +- No pharaoh.toml generation. That is `pharaoh-setup`. +- No sphinx-needs config generation. That is `pharaoh-bootstrap`. 
+- No checklist customization per project — checklists are built-in templates. Caller edits after generation. diff --git a/skills/pharaoh-tailor-code-grounding-filters/SKILL.md b/skills/pharaoh-tailor-code-grounding-filters/SKILL.md new file mode 100644 index 0000000..83c1a93 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/SKILL.md @@ -0,0 +1,188 @@ +--- +name: pharaoh-tailor-code-grounding-filters +description: Use when authoring a project's `code-grounding-filters.yaml` from observed stack conventions. Detects language + CLI framework + config-object style in the project source tree and emits a tailoring YAML populated with the four parameterised filter strategies. Does not invoke `pharaoh-req-code-grounding-check`; purely produces tailoring. +chains_to: [pharaoh-tailor-review] +--- + +# pharaoh-tailor-code-grounding-filters + +## When to use + +Invoke once per project, before running `pharaoh-req-code-grounding-check` at scale, to populate `.pharaoh/project/code-grounding-filters.yaml`. Inspects a project source tree, detects which CLI framework / import syntax / config-default idiom the code uses, and emits a filter YAML whose `filters:` entries wire up the four strategies in [`../shared/code-grounding-filters.md`](../shared/code-grounding-filters.md) to the detected stack. + +Do NOT invoke to validate an existing filter YAML — that is part of `pharaoh-tailor-review`. Do NOT invoke to apply filters to a CREQ — that is `pharaoh-req-code-grounding-check`. This skill only reads the codebase and writes one YAML. + +## Atomicity + +- (a) Indivisible: one source-tree in → one filter-YAML + detection report out. No CREQ scoring, no plan dispatch, no coupling to other tailor skills beyond emitting YAML in the format the target skill reads. +- (b) Input: `{project_root: str, output_path?: str, on_existing?: "fail"|"overwrite"|"skip"}`. 
Output: JSON `{detected: {languages, cli_framework, config_default_idiom, env_var_style, detected_env_prefix}, emitted_filters: [...], yaml_path: str, warnings: [...]}` plus the YAML written to `output_path`. +- (c) Reward: fixtures under `skills/pharaoh-tailor-code-grounding-filters/fixtures/`: + 1. `python-typer/` — source tree with `import typer`, `def from_csv(...)`, `TypeAlias = Annotated[..., typer.Option(...)]` markers → emits `typer_kebab` (with `Opt` morphology), `python_import`, `python_dataclass_default`, and `env_var_glob` (if `os.environ` or `envvar=` usages detected). + 2. `python-click-click/` — source tree with `import click`, `@click.command()`, `click.option(...)` markers → emits `click_kebab` (no `Opt` morphology because Click projects do not use the `Opt` TypeAlias convention), `python_import`, `python_dataclass_default`. + 3. `rust-clap/` — source tree with `use clap::...`, `#[derive(Parser)]`, `#[arg(long)]` markers → emits `clap_kebab`, `rust_use_clause`, `rust_serde_default` (when `serde(default="...")` is detected). + + Pass = each fixture's emitted YAML matches `expected-filters.yaml` by substring inclusion of every declared filter's `strategy` + `name`, and the detection report matches `expected-report.json` on `detected.*` fields. +- (d) Reusable: any project regardless of language — the detection matrix is a bounded table (roughly: Python, Rust, TypeScript, Go), each detection slot independently optional. +- (e) Read-only wrt source code. Writes one YAML file at `output_path`. Running twice with `on_existing="skip"` is a no-op. + +## Input + +- `project_root`: absolute path to the project root. The skill walks up to 3 levels deep looking for source files; skips `node_modules`, `target`, `.venv`, `dist`, `build`, `__pycache__`. +- `output_path`: absolute path to write the YAML. Default: `<project_root>/.pharaoh/project/code-grounding-filters.yaml`. 
+- `on_existing`: `"fail"` (default — refuse to overwrite), `"overwrite"` (replace the file), `"skip"` (if file exists, return without writing; still returns the detection report for review). + +## Detection matrix + +The skill walks the source tree and greps for language + framework markers. Each detection is boolean (present / absent). Multiple languages can coexist; the emitted YAML gets filters for every detected stack. + +### Language detection + +| language | markers (any-of) | +|---|---| +| python | file extension `.py` AND (`^import\s` OR `^from\s.*\simport\s` OR `^def\s` OR `^class\s`) | +| rust | file extension `.rs` AND (`^use\s` OR `^fn\s` OR `^struct\s` OR `^enum\s`) | +| typescript | file extension `.ts` or `.tsx` AND (`^import\s` OR `^export\s`) | +| go | file extension `.go` AND (`^package\s` OR `^import\s`) | + +### CLI framework detection (Python) + +| framework | markers (any-of) | +|---|---| +| typer | `import typer` OR `typer.Typer()` OR `typer.Option` OR `@.*_app\.command` | +| click | `import click` OR `@click\.command` OR `click\.option` | +| argparse | `argparse\.ArgumentParser` | +| none | no matches | + +### CLI framework detection (Rust) + +| framework | markers (any-of) | +|---|---| +| clap | `use clap::` OR `#\[derive\(.*Parser.*\)\]` OR `#\[arg\(` OR `#\[command\(` | +| structopt | `use structopt::` OR `#\[derive\(.*StructOpt.*\)\]` | +| none | no matches | + +### CLI framework detection (Go) + +| framework | markers (any-of) | +|---|---| +| cobra | `github.com/spf13/cobra` OR `cobra.Command` | +| urfave-cli | `github.com/urfave/cli` | +| flag | `"flag"` import | +| none | no matches | + +### Config-default idiom detection + +| idiom | markers (any-of) | +|---|---| +| python_dataclass | `@dataclass` AND `field\(default=` | +| python_pydantic | `from pydantic` AND `Field\(default=` | +| python_attrs | `import attr` OR `@attr.s` | +| rust_serde | `#\[derive\(.*Deserialize.*\)\]` AND `#\[serde\(default` | +| go_struct_tag | `` `json:"` 
`` with `default:` stanza | +| none | no matches | + +### Env-var convention detection + +| style | markers | +|---|---| +| uppercase-prefix | ≥3 distinct `[A-Z][A-Z0-9_]+_\w+` identifiers declared as env var strings, sharing a common prefix of ≥3 chars | +| none | fewer than 3 | + +If uppercase-prefix style IS detected, also detect the **prefix** — the longest common uppercase run across observed env-var identifiers (e.g. `JAMA_` from `JAMA_URL_ENV`, `JAMA_USERNAME_ENV`, ...). The prefix is NOT encoded into the filter (prefix is per-call from the CREQ token), only the strategy itself is enabled. + +## Emission rules + +After detection, build the `filters:` list in this order (deterministic for fixture comparison): + +1. **Kebab filter** — one entry if any CLI framework was detected. Name the filter after the framework (`typer_kebab`, `click_kebab`, `clap_kebab`, `cobra_kebab`). Parameters: + - `token_regex`: `"^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$"` + - `strip_leading`: `["--"]` + - `morphology_prefixes`: `["Opt"]` only for `typer` (that convention is Typer-specific); `[]` otherwise. + +2. **Env-var glob** — one entry if uppercase-prefix style was detected. Name `env_var_glob`. Parameters: + - `token_regex`: `"^[A-Z][A-Z0-9_]*_?\\*$"` + - `separator_character`: `"_"` + +3. **Dotted import resolution** — one entry per language with a known import syntax. Parameters differ by language: + + **Python:** + ```yaml + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." 
+ import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" + ``` + + **Rust:** + ```yaml + - name: rust_use_clause + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w]*(::[\\w]+)+$" + separator: "::" + import_patterns: + - "use\\s+${tok}" + - "use\\s+${mod}::\\{[^}]*\\b${attr}\\b[^}]*\\}" + - "${tok}" + ``` + + **TypeScript:** + ```yaml + - name: ts_named_import + strategy: dotted_import_resolution + token_regex: "^@?[a-z][\\w/.-]*:[A-Z]\\w+$" + separator: ":" + import_patterns: + - "import\\s*\\{[^}]*\\b${attr}\\b[^}]*\\}\\s*from\\s*['\"]${mod}['\"]" + ``` + + **Go:** + ```yaml + - name: go_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w/.-]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "\"${mod}\"\\s*$" + - "${tok}" + ``` + +4. **Literal-default** — one entry per detected config-default idiom: + + - `python_dataclass` → `python_dataclass_default` with `field\\(default=[\"']${tok}[\"']\\)` and `hint_dir_pattern: "config/"`. + - `python_pydantic` → `python_pydantic_default` with `Field\\(default=[\"']${tok}[\"']\\)` and `hint_dir_pattern: "config/|models/"`. + - `rust_serde` → `rust_serde_default` with `#\\[serde\\(default\\s*=\\s*\"${tok}\"\\)\\]` and `hint_dir_pattern: "config/|src/config/"`. + - Absent → omit this filter (the skill's other axes still catch the CREQ; the actionable evidence is just missing). 
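
The `${mod}` / `${attr}` / `${tok}` placeholders in the patterns above are filled in by the consuming skill when it checks a concrete token. A minimal sketch of that expansion, assuming the consumer splits the token on the last occurrence of the configured separator and regex-escapes each part (the function name and the split-on-last-separator choice are illustrative, not part of this spec):

```python
import re


def expand_import_patterns(token: str, separator: str, patterns: list[str]) -> list[str]:
    # Split "requests.sessions.Session" on the LAST separator:
    # mod = "requests.sessions", attr = "Session".
    mod, _, attr = token.rpartition(separator)
    subs = {"${mod}": re.escape(mod), "${attr}": re.escape(attr), "${tok}": re.escape(token)}
    expanded = []
    for pattern in patterns:
        for placeholder, escaped in subs.items():
            pattern = pattern.replace(placeholder, escaped)
        expanded.append(pattern)
    return expanded


python_patterns = [
    r"from\s+${mod}\s+import\s+${attr}",
    r"import\s+${mod}\b",
    r"${tok}",
]
regexes = expand_import_patterns("requests.sessions.Session", ".", python_patterns)
```

With these inputs, `regexes[0]` matches source text like `from requests.sessions import Session`, grounding the dotted token against an actual import site.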
+ +## Output + +```json +{ + "detected": { + "languages": ["python"], + "cli_framework": "typer", + "config_default_idiom": "python_dataclass", + "env_var_style": "uppercase-prefix", + "detected_env_prefix": "JAMA_" + }, + "emitted_filters": [ + "typer_kebab", + "env_var_glob", + "python_import", + "python_dataclass_default" + ], + "yaml_path": "/abs/path/to/.pharaoh/project/code-grounding-filters.yaml", + "warnings": [] +} +``` + +`warnings` surfaces any ambiguity: two CLI frameworks co-detected (emits filters for both and warns), no language detected (emits empty `filters:` and warns), config idiom detected without matching dirs on disk (emits filter, warns that `hint_dir_pattern` may not match anything at review time). + +## Composition + +Role: `tailor-author`. + +Runs once per project during bootstrap, typically chained after `pharaoh-tailor-detect` and before `pharaoh-tailor-review`. Its output feeds `pharaoh-req-code-grounding-check` via the `tailoring_path` input. Never invokes emission or review skills; produces YAML that other skills consume. diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/README.md b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/README.md new file mode 100644 index 0000000..5c8b369 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/README.md @@ -0,0 +1,7 @@ +# python-click-click + +Click-based CLI — contrasts with Typer fixture by NOT emitting the `Opt` +morphology in the kebab filter. Validates that the `morphology_prefixes` +parameter is framework-specific, not Python-wide. Also validates the +"env-var style: none" branch when fewer than 3 uppercase-prefix identifiers +are present. 
diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-filters.yaml b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-filters.yaml new file mode 100644 index 0000000..0a90d81 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-filters.yaml @@ -0,0 +1,21 @@ +filters: + - name: click_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: [] + + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" + + - name: python_dataclass_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/" + field_regex: "field\\(default=[\"']${tok}[\"']\\)" diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-report.json b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-report.json new file mode 100644 index 0000000..86df86e --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/expected-report.json @@ -0,0 +1,15 @@ +{ + "detected": { + "languages": ["python"], + "cli_framework": "click", + "config_default_idiom": "python_dataclass", + "env_var_style": "none", + "detected_env_prefix": null + }, + "emitted_filters": [ + "click_kebab", + "python_import", + "python_dataclass_default" + ], + "warnings": [] +} diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/src/cli.py b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/src/cli.py new file mode 100644 index 0000000..4fc6dd1 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-click-click/src/cli.py @@ -0,0 +1,16 @@ 
+"""Click CLI — no Opt TypeAlias convention.""" + +from dataclasses import dataclass, field + +import click + + +@dataclass +class RunConfig: + mode: str = field(default="prod") + + +@click.command() +@click.option("--license-key", help="License key") +def from_csv(license_key: str) -> None: + _ = RunConfig() diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/README.md b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/README.md new file mode 100644 index 0000000..2c86b68 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/README.md @@ -0,0 +1,11 @@ +# python-typer + +Typer + dataclass + uppercase env-var convention. Exercises all four filter +strategies being emitted together. Validates that: + +- Kebab filter includes `morphology_prefixes: ["Opt"]` (Typer-specific). +- Env-var glob is emitted because ≥3 `JAMA_*` identifiers share a prefix. +- Python import strategy is emitted because `import` / `from … import` are + observed. +- Dataclass-default literal strategy is emitted because `field(default=...)` + is observed. diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-filters.yaml b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-filters.yaml new file mode 100644 index 0000000..cdab5be --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-filters.yaml @@ -0,0 +1,26 @@ +filters: + - name: typer_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: ["Opt"] + + - name: env_var_glob + strategy: prefix_glob_expansion + token_regex: "^[A-Z][A-Z0-9_]*_?\\*$" + separator_character: "_" + + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." 
+ import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" + + - name: python_dataclass_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/" + field_regex: "field\\(default=[\"']${tok}[\"']\\)" diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-report.json b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-report.json new file mode 100644 index 0000000..5581cd8 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/expected-report.json @@ -0,0 +1,16 @@ +{ + "detected": { + "languages": ["python"], + "cli_framework": "typer", + "config_default_idiom": "python_dataclass", + "env_var_style": "uppercase-prefix", + "detected_env_prefix": "JAMA_" + }, + "emitted_filters": [ + "typer_kebab", + "env_var_glob", + "python_import", + "python_dataclass_default" + ], + "warnings": [] +} diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/cli.py b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/cli.py new file mode 100644 index 0000000..04c409e --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/cli.py @@ -0,0 +1,24 @@ +"""Typer CLI entrypoint — fixture source.""" + +from dataclasses import dataclass, field +from typing import Annotated, TypeAlias + +import typer + +app = typer.Typer() + + +OptLicenseKey: TypeAlias = Annotated[ + str | None, + typer.Option(help="License key", envvar="UBCONNECT_LICENSE_KEY"), +] + + +@dataclass +class RunConfig: + mode: str = field(default="prod") + + +@app.command() +def from_csv(license_key: OptLicenseKey = None) -> None: + _ = RunConfig() diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/env.py b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/env.py new file mode 100644 index 0000000..883adb1 --- 
/dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/python-typer/src/env.py @@ -0,0 +1,6 @@ +"""Env-var constants — triggers uppercase-prefix detection.""" + +JAMA_URL_ENV = "JAMA_URL" +JAMA_USERNAME_ENV = "JAMA_USERNAME" +JAMA_PASSWORD_ENV = "JAMA_PASSWORD" +JAMA_PROJECT_ID_ENV = "JAMA_PROJECT_ID" diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/README.md b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/README.md new file mode 100644 index 0000000..df229a5 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/README.md @@ -0,0 +1,8 @@ +# rust-clap + +Rust + Clap + serde. Validates that the skill generalises across languages: +different separator (`::` not `.`), different import pattern (`use X::Y` not +`from X import Y`), different literal-default idiom (`#[serde(default=...)]` +not `field(default=...)`), different default directory hint (`src/config/` +included). Demonstrates that the four strategies carry enough parameters to +handle a language stack that shares zero syntax with Python. 
diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-filters.yaml b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-filters.yaml new file mode 100644 index 0000000..beb05c7 --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-filters.yaml @@ -0,0 +1,21 @@ +filters: + - name: clap_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: [] + + - name: rust_use_clause + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w]*(::[\\w]+)+$" + separator: "::" + import_patterns: + - "use\\s+${tok}" + - "use\\s+${mod}::\\{[^}]*\\b${attr}\\b[^}]*\\}" + - "${tok}" + + - name: rust_serde_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/|src/config/" + field_regex: "#\\[serde\\(default\\s*=\\s*\"${tok}\"\\)\\]" diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-report.json b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-report.json new file mode 100644 index 0000000..3aa612a --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/expected-report.json @@ -0,0 +1,15 @@ +{ + "detected": { + "languages": ["rust"], + "cli_framework": "clap", + "config_default_idiom": "rust_serde", + "env_var_style": "none", + "detected_env_prefix": null + }, + "emitted_filters": [ + "clap_kebab", + "rust_use_clause", + "rust_serde_default" + ], + "warnings": [] +} diff --git a/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/src/main.rs b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/src/main.rs new file mode 100644 index 0000000..a1051cb --- /dev/null +++ b/skills/pharaoh-tailor-code-grounding-filters/fixtures/rust-clap/src/main.rs @@ -0,0 +1,23 @@ +use clap::Parser; +use serde::Deserialize; + +#[derive(Parser)] 
+#[command(version, about)] +struct Cli { + #[arg(long)] + license_key: Option<String>, +} + +#[derive(Deserialize)] +struct RunConfig { + #[serde(default = "default_mode")] + mode: String, +} + +fn default_mode() -> String { + "prod".to_string() +} + +fn main() { + let _cli = Cli::parse(); +} diff --git a/skills/pharaoh-tailor-review/SKILL.md b/skills/pharaoh-tailor-review/SKILL.md index 75662f9..7eb8aca 100644 --- a/skills/pharaoh-tailor-review/SKILL.md +++ b/skills/pharaoh-tailor-review/SKILL.md @@ -119,7 +119,7 @@ but do not replace them. For each entry in `prefixes`, the key must be a non-empty string and the value must be a non-empty string (the description). See -`examples/score/.pharaoh/project/schemas/id-conventions.schema.json` for the authoritative +`examples/my-project/.pharaoh/project/schemas/id-conventions.schema.json` for the authoritative JSON Schema. **workflows.yaml required keys:** @@ -134,7 +134,7 @@ For each transition in `transitions`: - `from` and `to` must be non-empty strings. - `requires` must be a list (may be empty). -See `examples/score/.pharaoh/project/schemas/workflows.schema.json` for the authoritative +See `examples/my-project/.pharaoh/project/schemas/workflows.schema.json` for the authoritative JSON Schema. **artefact-catalog.yaml required structure:** @@ -148,14 +148,14 @@ Top level must be a map of artefact-type keys. For each artefact type: | `required_body_sections` | list | Optional; entries are top-level heading names that must appear inside the directive body prose (e.g. `Inputs`, `Steps`, `Expected` for `tc`). Validated as body prose, not as `:key:` options. | | `lifecycle` | list | Optional; if present must be non-empty | -See `examples/score/.pharaoh/project/schemas/artefact-catalog.schema.json` for the +See `examples/my-project/.pharaoh/project/schemas/artefact-catalog.schema.json` for the authoritative JSON Schema. 
**checklists/*.md frontmatter:** YAML frontmatter (delimited by `---`) at the top of a checklist file is **optional**. When present, it is validated against -`examples/score/.pharaoh/project/schemas/checklists-frontmatter.schema.json`: +`examples/my-project/.pharaoh/project/schemas/checklists-frontmatter.schema.json`: | Key | Rule | |---|---| diff --git a/skills/pharaoh-toctree-emit/SKILL.md b/skills/pharaoh-toctree-emit/SKILL.md new file mode 100644 index 0000000..ac4f8e2 --- /dev/null +++ b/skills/pharaoh-toctree-emit/SKILL.md @@ -0,0 +1,112 @@ +--- +name: pharaoh-toctree-emit +description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree — that is a caller concern. +--- + +# pharaoh-toctree-emit + +## When to use + +Invoke at the end of a plan (emitted by `pharaoh-write-plan`, executed by `pharaoh-execute-plan`) that writes N RST files into a directory (e.g. `docs/source/features/`). Without an `index.rst` with a matching toctree, Sphinx treats every generated file as orphan and `sphinx-build -W` fails. This skill writes that index, and only that index. + +Do NOT use to wire the emitted directory into its parent (e.g. updating the project-root `index.rst` to reference `features/index`). That is a separate caller concern — changes to the parent toctree are outside this skill's scope. + +Do NOT use to patch an existing `index.rst` with different content — if an existing index disagrees with what this skill would write, the skill warns and does not overwrite. Caller decides whether to delete the existing file or merge manually. + +## Atomicity + +- (a) Indivisible — one directory + one glob + one caption in → one `index.rst` file written (or no-op if already matches). 
No modifications to other RST files. No parent-toctree edits. +- (b) Input: `{target_dir: str, file_glob: str, parent_caption: str, maxdepth?: int, exclude?: list[str]}`. Output: JSON `{toctree_path: str, files_included: list[str], files_modified: bool, warnings: list[str]}`. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-toctree-emit/input_dir/` containing three files (`csv_export.rst`, `jama_pull.rst`, `reqif_import.rst`). With `target_dir` = that path, `file_glob` = `"*.rst"`, `parent_caption` = `"Features"`, `maxdepth` = `1` → skill writes `<target_dir>/index.rst` whose content exactly matches the `expected_index.rst` fixture (same alphabetical ordering of stems, same caption, same maxdepth, same blank lines). +- (d) Reusable for any sphinx-needs project with dynamically emitted RST sets (features, modules, decisions, releases). +- (e) Composable: one directory per call. A plan emitted by `pharaoh-write-plan` may include N toctree-emit tasks (one per target_dir) dispatched by `pharaoh-execute-plan`, but this skill itself handles exactly one. + +## Input + +- `target_dir`: absolute path to the directory whose RST files should be indexed. +- `file_glob`: glob pattern applied within `target_dir` (e.g. `"*.rst"`, `"*.md"`). Non-recursive — toctrees are one level deep by convention. +- `parent_caption`: human-readable heading shown above the toctree (`Features`, `Modules`, `Decisions`, etc.). +- `maxdepth` (optional): `:maxdepth:` option for the toctree. Default `1`. +- `exclude` (optional): list of filename globs to exclude from the toctree. Default `["index.rst"]` (never self-reference). + +## Output + +```json +{ + "toctree_path": "/abs/path/to/target_dir/index.rst", + "files_included": ["csv_export", "jama_pull", "reqif_import"], + "files_modified": true, + "warnings": [] +} +``` + +`files_included` contains stems (filename without `.rst` extension), in the order they appear in the emitted toctree (alphabetical). 
+ +`files_modified`: +- `true` if the skill wrote a new `index.rst` (the file did not previously exist; per Step 3 this is the only path that writes). +- `false` if `index.rst` already existed with CONTENT MATCHING what this skill would have written — no-op. +- `false` PLUS a warning in the return if `index.rst` exists with different content — skill did not overwrite; caller must handle merge. + +## Process + +### Step 1: Enumerate files + +Glob `target_dir/<file_glob>` non-recursively. Subtract `exclude` matches. Sort alphabetically by filename. Strip file extensions to produce toctree entries (Sphinx toctree entries omit `.rst`). + +If zero files remain, FAIL: `"no files matched <file_glob> under <target_dir>; nothing to index"`. + +### Step 2: Build toctree content + +Emit content in this exact shape: + +```rst +<parent_caption> +<underline of = characters, exact length of parent_caption> + +.. toctree:: + :maxdepth: <maxdepth> + + <stem_1> + <stem_2> + <stem_3> +``` + +- Single blank line between the caption underline and `.. toctree::`. +- Single blank line between `:maxdepth:` line and the first stem. +- Stems indented with 3 spaces (Sphinx toctree convention). +- No trailing blank lines after the last stem. + +### Step 3: Check existing index.rst + +If `target_dir/index.rst` does not exist → write the new content, return `files_modified=true`. + +If it exists and its content (after normalizing line endings and trailing whitespace) exactly equals what would be written → no-op, return `files_modified=false`, `warnings=[]`. + +If it exists with different content → no-op, return `files_modified=false`, warnings include `"index.rst exists with different content — not overwriting; delete manually or merge to regenerate"`. + +### Step 4: Return + +Return the JSON shape per Output section. 
+ +## Last step + +No dedicated `*-review` atom exists for toctree emission; the operation is structural (write one `index.rst` listing N files) and its correctness is checked mechanically rather than via a prose-judgement review atom. This skill therefore performs an inline self-verification in Step 4 before returning: + +1. Every entry in the emitted `toctree` block resolves to an existing file under `target_dir` (no dangling references). +2. The emitted `index.rst` contains exactly one `.. toctree::` directive (no accidental duplication). +3. No entry appears twice in the toctree body. + +If any check fails, do not write `index.rst`; return `status: "failed"` with evidence. + +Coverage is mechanically enforced at plan level by `pharaoh-quality-gate`'s orphan / link-completeness invariants plus `sphinx-build -W` itself (which fails on orphan RST files). See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale. + +## Failure modes + +- `target_dir` does not exist → FAIL. +- `target_dir` is not a directory → FAIL. +- Zero files matched glob (after exclude) → FAIL per Step 1. + +## Non-goals + +- No parent-toctree updates. The caller (human or a future composition skill) wires `<target_dir>/index.rst` into the project root's `index.rst` manually. +- No inter-file toctree. This skill assumes a flat glob over one directory — nested toctrees require separate invocations. +- No .md support beyond passing `*.md` as `file_glob`. Sphinx's myst_parser must already be configured in the project for Markdown files to resolve. 
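
The three inline self-checks from the Last step can be expressed mechanically. A sketch, assuming the entry-parsing heuristic (3-space-indented, non-option lines) that matches the exact emission shape of Step 2 — the helper name is illustrative:

```python
from pathlib import Path


def verify_index(target_dir: str, index_text: str) -> list[str]:
    findings = []
    # Check 2: exactly one toctree directive (no accidental duplication).
    if index_text.count(".. toctree::") != 1:
        findings.append("expected exactly one '.. toctree::' directive")
    # Entries are the 3-space-indented lines that are not options like :maxdepth:.
    entries = [ln.strip() for ln in index_text.splitlines()
               if ln.startswith("   ") and not ln.strip().startswith(":")]
    # Check 3: no entry appears twice.
    if len(entries) != len(set(entries)):
        findings.append("duplicate toctree entry")
    # Check 1: every entry resolves to an existing file under target_dir.
    for stem in entries:
        if not list(Path(target_dir).glob(stem + ".*")):
            findings.append(f"dangling reference: {stem}")
    return findings
```

An empty findings list means the emitted index may be written; any finding maps to the `status: "failed"` return with the finding as evidence.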
diff --git a/skills/pharaoh-use-case-diagram-draft/SKILL.md b/skills/pharaoh-use-case-diagram-draft/SKILL.md new file mode 100644 index 0000000..f410f4b --- /dev/null +++ b/skills/pharaoh-use-case-diagram-draft/SKILL.md @@ -0,0 +1,120 @@ +--- +name: pharaoh-use-case-diagram-draft +description: Use when drafting one use-case diagram for a single feat — actors (primary, secondary, external systems), use cases (one per user-facing capability), and system boundary. Renderer-aware (mermaid or plantuml per `.pharaoh/project/diagram-conventions.yaml`). First concrete `*-diagram-draft` skill — others follow the same shape. +chains_from: [pharaoh-feat-draft-from-docs] +chains_to: [pharaoh-diagram-review] +--- + +# pharaoh-use-case-diagram-draft + +## When to use + +Invoke when a feat (user-facing capability) needs a use-case view. Typical caller: a plan emitted by `pharaoh-write-plan` that selects this skill via the view_map (see `shared/diagram-view-selection.md`). + +Do NOT use for component decomposition — that's `pharaoh-component-diagram-draft`. Do NOT use for interaction flow — that's `pharaoh-sequence-diagram-draft`. One skill per diagram kind — see the atomic-skill criteria. + +## Atomicity + +- (a) One feat context + one actor list + one renderer in → one use-case diagram block out. +- (b) Input: `{feat_id: str, actors: list[{name, role, kind}], use_cases: list[str], external_systems: list[str], renderer: "mermaid" | "plantuml", tailoring_path: str}`. Output: JSON `{diagram_block: <rst_directive_str>, element_count: int}`. +- (c) Reward: fixtures in `pharaoh-validation/fixtures/pharaoh-use-case-diagram-draft/` — canonical inputs + expected outputs for both renderers. Parser gate: emitted block passes `mmdc` (mermaid) or `plantuml -checkonly` (plantuml). Required-elements gate: ≥1 actor, 1 system boundary, ≥1 use case. Element count ≤ `element_count_max` from tailoring. +- (d) Reusable for every feat that needs a use-case view. +- (e) Emits only the directive block. 
Does not touch conf.py, does not mutate tailoring, does not dispatch other skills (except the mandated review as last step). + +## Input + +- `feat_id`: the need_id of the parent feat. Used as the diagram's `:caption:` hook for `trace_to_parent`. +- `actors`: list of actor specs. Each: + - `name`: short label (e.g. "User", "CSV file", "Jama REST API"). + - `role`: "primary" or "secondary". + - `kind`: "human" or "system". +- `use_cases`: list of user-facing capability strings. Each becomes a use case node inside the system boundary. Derived from the feat's body (shall-clauses) or supplied by the caller. +- `external_systems`: list of labels for external participants shown outside the system boundary (databases, third-party APIs, file formats). +- `renderer`: "mermaid" or "plantuml". If unspecified, read from `tailoring_path/diagram-conventions.yaml > renderer`; if that is also unspecified, default to "mermaid". +- `tailoring_path`: absolute path to `.pharaoh/project/`. Reads `diagram-conventions.yaml` for renderer + `element_count_max` + `stereotype_aliases`. + +## Output + +JSON document, no prose wrapper: + +```json +{ + "diagram_block": ".. mermaid::\n :caption: FEAT_example — use case diagram\n\n flowchart TB\n ...", + "element_count": 5, + "renderer": "mermaid" +} +``` + +## How to emit — mermaid + +Mermaid does not have a first-class use-case diagram type. Use `flowchart TB` with stereotype-labelled nodes. Shape template: + +```mermaid +flowchart TB + %% Actors + actor_user(("User")) + actor_jama[("Jama REST API")] + + %% System boundary + subgraph SYS["<<system>> <project>"] + uc1["Fetch Jama items"] + uc2["Convert to Sphinx-Needs"] + uc3["Export needs.json"] + end + + actor_user --> uc1 + uc1 --> actor_jama + uc1 --> uc2 + uc2 --> uc3 +``` + +Conventions: +- `(( ))` shape = human actor. +- `[( )]` cylinder shape = external data/system actor. +- `subgraph` with a `<<system>>` prefix in its label = system boundary. 
+- Arrows connect actors to use cases (primary: actor → uc; secondary: uc → external). + +## How to emit — plantuml + +PlantUML has a first-class use-case syntax. Use it: + +```plantuml +@startuml +left to right direction +actor "User" as user +actor "Jama REST API" as jama <<external>> + +rectangle "<project>" { + usecase "Fetch Jama items" as uc1 + usecase "Convert to Sphinx-Needs" as uc2 + usecase "Export needs.json" as uc3 +} + +user --> uc1 +uc1 --> jama +uc1 --> uc2 +uc2 --> uc3 +@enduml +``` + +Conventions: +- `actor "..." as <alias>` for human actors. +- `actor "..." as <alias> <<external>>` for system actors. +- `rectangle "SystemName" { ... }` for the system boundary. +- `usecase "..." as <alias>` inside the rectangle. + +## Safe labels + +Every emitted label obeys [`shared/diagram-safe-labels.md`](../shared/diagram-safe-labels.md) — no `;`, no `|`, no unescaped `"` in labels, etc. The draft output runs through `pharaoh-diagram-lint` before success. + +## Relationship semantics + +Use-case diagrams use association arrows (`-->`) from actor to use case, and include / extend between use cases if the scope requires. See [`shared/uml-relationship-semantics.md`](../shared/uml-relationship-semantics.md) for the full decision matrix if include / extend are needed (rare at feat level). + +## Last step + +After emitting the diagram block, invoke `pharaoh-diagram-review` with `diagram_type: use_case` and `parent_need_id: <feat_id>`. Attach the returned review JSON to this skill's output under the key `review`. If review emits any critical finding, return non-success with the findings verbatim. See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md). + +## Composition + +Invoked as a task in plans emitted by `pharaoh-write-plan` when the plan includes a feat that selects use-case as its primary view (per `shared/diagram-view-selection.md`). 
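
As a worked illustration of the mermaid shape template above, the assembly can be sketched like this. The aliasing scheme, the simple primary-actor → `uc1` association, the linear use-case chaining, and the reuse of `feat_id` in the subgraph label (the template shows the project name there) are all assumptions of the sketch, not mandated by this spec:

```python
def emit_mermaid_use_case(feat_id: str, actors: list[dict], use_cases: list[str]) -> str:
    body = ["flowchart TB", "    %% Actors"]
    for i, actor in enumerate(actors, start=1):
        # (( )) = human actor, [( )] = external data/system actor.
        shape = '(("{}"))' if actor["kind"] == "human" else '[("{}")]'
        body.append(f"    actor_{i}" + shape.format(actor["name"]))
    body.append(f'    subgraph SYS["<<system>> {feat_id}"]')
    for i, uc in enumerate(use_cases, start=1):
        body.append(f'        uc{i}["{uc}"]')
    body.append("    end")
    # Primary actors associate with the first use case; use cases chain in order.
    for i, actor in enumerate(actors, start=1):
        if actor["role"] == "primary":
            body.append(f"    actor_{i} --> uc1")
    for i in range(1, len(use_cases)):
        body.append(f"    uc{i} --> uc{i + 1}")
    indented = "\n".join("   " + ln for ln in body)
    return f".. mermaid::\n   :caption: {feat_id} — use case diagram\n\n{indented}"
```

The returned string is the `diagram_block` value; labels would still need to pass through the safe-label rules before the block is handed to `pharaoh-diagram-review`.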
diff --git a/skills/pharaoh-vplan-draft/SKILL.md b/skills/pharaoh-vplan-draft/SKILL.md index 696e0ad..5ffdda4 100644 --- a/skills/pharaoh-vplan-draft/SKILL.md +++ b/skills/pharaoh-vplan-draft/SKILL.md @@ -338,3 +338,9 @@ concrete ("ABS pump output signal activates within 50 ms"). All pass. Consider running `pharaoh-vplan-review tc__abs_pump_activation_system` to audit against per-axis criteria. ``` + +## Last step + +After emitting the artefact, invoke `pharaoh-vplan-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-vplan-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/skills/pharaoh-write-plan/SKILL.md b/skills/pharaoh-write-plan/SKILL.md new file mode 100644 index 0000000..9ed35b2 --- /dev/null +++ b/skills/pharaoh-write-plan/SKILL.md @@ -0,0 +1,240 @@ +--- +name: pharaoh-write-plan +description: Use when you have an intent (e.g. "reverse-engineer features and reqs from this module") and need a concrete plan.yaml that pharaoh-execute-plan can run. Picks a plan template by intent, fills project-specific values, emits a plan that validates against schema.md. Does NOT execute anything. +--- + +# pharaoh-write-plan + +## When to use + +Invoke when you need a plan.yaml and do not already have one. Typical inputs: a short natural-language intent plus a project root and its tailoring files. Typical output: a plan ready to hand to `pharaoh-execute-plan`. 
+ +Do NOT use to execute plans (that is `pharaoh-execute-plan`). Do NOT use to review emitted artefacts (that is `pharaoh-quality-gate` or `pharaoh-req-review`). Do NOT use to discover feats or files at scale — discovery is expressed as tasks in the plan itself. + +## Why this skill exists + +The deleted composition skills (`pharaoh-feats-from-project`, `pharaoh-reqs-from-module`) attempted to orchestrate 6-12 atomic skills via prose. In practice an LLM executing them flattened the process and dropped steps. This skill replaces that pattern by emitting the orchestration as data (plan.yaml), consumed by a generic executor. The domain heuristics those skills carried — split_strategy choice, preseed ordering, quality-gate terminal placement, id-allocate positioning — live here, but they decide plan content, not runtime behaviour. + +## Atomicity + +- (a) **Indivisible.** One intent → one plan.yaml. Does not execute tasks. Does not write artefacts. Does not mutate `.papyrus/` or `.pharaoh/`. Pure transformation: intent + project state → plan text. +- (b) **Typed I/O.** + - Input: `{intent: str, project_root: str, tailoring: {ubproject_toml_path?: str, pharaoh_toml_path?: str}, template_name?: str, vars?: dict[str, any]}`. + - Output: `{plan_yaml: str, template_used: str, warnings: list[str]}`. +- (c) **Execution-based reward.** Fixture in `pharaoh-validation/fixtures/write-plan-smoke/` contains a minimal project (docs/ with 1 RST file, src/ with 3 Python files, `ubproject.toml` declaring `feat` + `comp_req` types). Scorer runs write-plan with intent `"reverse-engineer features and reqs"`. Assertions: + 1. Output is valid YAML parseable by PyYAML. + 2. Output passes `pharaoh-execute-plan/schema.md` static validation (ref parsing, skill existence, cycle detection). + 3. `preseed_papyrus` task appears before any task invoking `pharaoh-req-from-code`. + 4. `pharaoh-quality-gate` is the last task (no downstream deps). + 5. 
`pharaoh-id-allocate` appears before any `pharaoh-req-from-code`. + 5a. Every plan that schedules `pharaoh-req-from-code` also schedules a `review_comp_reqs` task invoking `pharaoh-req-review` AND a `grounding_check_comp_reqs` task invoking `pharaoh-req-code-grounding-check`, both with `foreach: ${reqs_from_code.emitted_ids}` and `depends_on: [reqs_from_code]` (or equivalent per-file dependency set). Both tasks must appear in `quality_gate.depends_on`. These are **explicit** plan tasks — they are NOT replaced by the skill's in-body `## Last step` self-invocation, which the LLM-executor drops under foreach fan-out. + 6. Every `skill:` references a directory present under `<pharaoh>/skills/` or `<papyrus>/skills/`. + 7. **Dep-probe enrichment with prerequisite insertion.** Fixture's `conf.py` declares only `['sphinx_needs']` (no mermaid extension). Scorer checks: (a) the emitted plan still contains every diagram-emitting task (`pharaoh-feat-component-extract`, `pharaoh-feat-flow-extract`) — tasks are NOT stripped; (b) the plan contains exactly one new task with `skill: pharaoh-sphinx-extension-add` whose `extensions` input lists `sphinxcontrib.mermaid`; (c) every diagram-emitting task's `depends_on` list includes the new prerequisite task id; (d) `warnings` contains at least one human-readable entry naming the missing `sphinxcontrib.mermaid` module and pointing to the prerequisite task. +- (d) **Reusable.** Any intent matching an available template. Adding a new workflow = adding a new template, not modifying this skill. +- (e) **Composable.** Called by humans or by future wrapper skills (`pharaoh-reverse-engineer` could chain write-plan → execute-plan → quality-gate interpretation). + +## Input + +- `intent`: short phrase, normalised against the template index. 
Accepted phrasings map to templates: + - `"reverse-engineer project"`, `"reverse-engineer features and reqs"`, `"rev-eng full project"` → `templates/reverse-engineer-project.yaml.j2` + - `"reverse-engineer module"`, `"reqs from module"`, `"extract reqs from files"` → `templates/reverse-engineer-module.yaml.j2` + - When the intent does not match any template, emit a `warnings` entry and return `template_used: none`; do not fabricate a plan. +- `project_root`: absolute path to the target project. +- `tailoring.ubproject_toml_path`: path to `ubproject.toml` for type/prefix lookup. Optional if the project has no tailoring (use baseline defaults from `pharaoh-bootstrap`). +- `tailoring.pharaoh_toml_path`: path to `pharaoh.toml` for source-layout discovery. +- `template_name` (optional): overrides intent-based dispatch; names a template file under `templates/` without the `.yaml.j2` suffix. +- `vars` (optional): dict of additional template variables (e.g. `{"docs_root": "docs/source", "src_root": "src/<project>"}`). Caller-provided values win over skill-inferred ones. Notable optional vars consumed by the reverse-engineer-project template: + - `target_docs_path`: where emitted artefacts finally live (`toctree-emit` and `quality-gate` read from this path). Default `${workspace}/artefacts`. Set this when the caller wants reverse-engineered spec to land directly under the project's docs tree (e.g. `docs/source/spec/feature/`) without having to override `workspace_dir`. + +## Output + +```yaml +plan_yaml: | + name: ... + version: 1 + ... +template_used: reverse-engineer-project +warnings: + - "inferred docs_root as docs/source (no explicit value in pharaoh.toml)" +``` + +`plan_yaml` is the full text to hand to `pharaoh-execute-plan`. `template_used` records provenance for audit. `warnings` surfaces inference decisions (e.g., guessed paths, missing optional inputs). + +## Templates + +Templates live under `templates/` with filename `<name>.yaml.j2`. Each template: + +1. 
Begins with a YAML front-matter block (actual YAML, not Jinja) declaring `required_vars` and `optional_vars`. +2. Is a Jinja2-style text template restricted to `{{ var }}` placeholders and `{% for %}` / `{% if %}` blocks. +3. Produces a plan.yaml body below the front-matter. + +Supported Jinja constructs: +- `{{ var }}` simple substitution. +- `{% for item in list %}` ... `{% endfor %}` iteration (rare — most iteration should be expressed as `foreach:` in the emitted plan, not unrolled at write time). +- `{% if cond %}` ... `{% endif %}` for optional tasks (e.g., include diagram tasks only when tailoring declares a diagram renderer). + +No arbitrary Python expressions, no filters beyond `| default(...)`. If a template needs richer logic, split it into two templates. + +## Process + +### Step 1: Resolve template + +1. If `template_name` is provided, use it directly. +2. Else normalise `intent` (lowercase, strip punctuation, collapse whitespace) and look it up in the intent→template map above. +3. If no match, return `template_used: "none"`, `plan_yaml: ""`, and add a warning `"no template matched intent '<intent>'; valid intents: <list>"`. + +### Step 2: Gather variables + +Combine variables in this precedence (later entries override earlier ones): + +1. Defaults baked into the template's front-matter. +2. Inferred from `tailoring.pharaoh_toml_path` (if present): + - `src_root` from `[pharaoh.codelinks].src_dir` or `[source_discover].src_dir`. + - `docs_root` from sphinx conf lookup (`docs/source/conf.py`, `docs/conf.py`, `conf.py`). +3. Inferred from `tailoring.ubproject_toml_path` (if present): + - `feat_directive`, `feat_prefix`, `comp_req_directive`, `comp_req_prefix` from the `[[needs.types]]` array. +4. Inferred from `docs_root` (after it resolves): + - `docs`: list of relative paths (relative to `project_root`) produced by globbing `<project_root>/<docs_root>/**/*.rst` and `**/*.md`, sorted alphabetically. Excludes `index.rst` / `index.md` (toctree parent is not a feature doc). 
Empty list if the directory is absent. This satisfies the `doc_files` shape that `pharaoh-feat-draft-from-docs` expects; templates iterate `docs` at write time rather than passing `docs_root` through to the skill. +5. Caller-supplied `vars`. + +Any required_var missing after this merge → add a warning, leave placeholder intact in the emitted plan (caller must fill before executing), do not fabricate a value. + +### Step 3: Render template + +Substitute `{{ var }}` tokens. Evaluate `{% if %}`/`{% for %}` blocks. Emit the rendered body (the part below the template's front-matter) as the plan. + +### Step 3.5: Probe required sphinx extensions and insert prerequisite tasks + +Before validating the rendered plan, probe `conf.py` to verify that the renderers required by diagram-emitting tasks are loaded. When a required extension is missing, this step enriches the plan with a `pharaoh-sphinx-extension-add` prerequisite task — it does NOT strip the diagram task. The plan body gains a task; no task is removed. This preserves the B1 invariant ("enrich, never strip") while also giving the executor an actionable step instead of a human handoff. + +**Probe procedure:** + +1. Resolve `conf.py` using the same lookup chain as Step 2 (`<docs_root>/conf.py` → `docs/source/conf.py` → `docs/conf.py` → `<project_root>/conf.py`). If absent, emit a warning and skip this step. +2. Parse `extensions = [...]` from the resolved `conf.py`. Flatten to a set of imported extension module paths. +3. Scan the rendered plan for diagram-emitting skills. 
Each has a fixed renderer surface: + + | Skill | Default renderer | Required extension module | pypi package | + | ------------------------------ | ---------------- | ------------------------- | ---------------------- | + | `pharaoh-feat-component-extract` | mermaid (or plantuml) | `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` | + | `pharaoh-feat-flow-extract` | mermaid (or plantuml) | `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` | + | `pharaoh-diagram-lint` | both | `sphinxcontrib.mermaid` AND/OR `sphinxcontrib.plantuml` | same, by renderer | + + If a task's inputs include `renderer_override: "plantuml"`, the required extension becomes `sphinxcontrib.plantuml` (pypi: `sphinxcontrib-plantuml`). If the template does not set `renderer_override`, fall back to `pharaoh.toml [pharaoh.diagrams].renderer` when readable, else `mermaid`. +4. Collect the set of missing extensions (present in the "required" column for some diagram task but absent from `conf.py` extensions list). If empty, skip to Step 4. +5. **Insert a prerequisite task into the plan body.** For all missing extensions (batched into one task invocation — `pharaoh-sphinx-extension-add` accepts a list), append a task with deterministic id `sphinx_extension_add`: + + ```yaml + - id: sphinx_extension_add + skill: pharaoh-sphinx-extension-add + inputs: + conf_py: <resolved conf_py path> + extensions: [<list of missing extension modules>] + install_if_missing: true + on_package_manager_missing: warn + reporter_id: "write-plan:sphinx-extension-add" + depends_on: [] + ``` + + Place this task before any diagram-emitting task in the plan's task list. + +6. **Rewrite `depends_on` of every diagram-emitting task** so it includes `sphinx_extension_add` as a dependency. This preserves the diagram task's existing dependencies and adds `sphinx_extension_add` to the list. +7. 
For each missing extension, ALSO append a warning entry (human-readable handoff in case someone inspects the plan without running it): + + ``` + diagram task '<task_id>' emits <renderer> blocks but conf.py does not load '<ext_module>'. Plan includes prerequisite task 'sphinx_extension_add' that will install '<pypi_pkg>' and update conf.py before diagram tasks run. Requires a resolvable package manager in the execution environment. + ``` + +**Design notes:** + +- No task is ever REMOVED. This step is plan-body enrichment: one new task + `depends_on` additions, no deletions. This preserves the B1 invariant. +- If `pharaoh-sphinx-extension-add` is itself missing from the skills tree (e.g. an old installation), log a warning and fall back to warn-only mode (do not insert the task). +- If probing fails (e.g. `conf.py` unparseable), emit one warning naming the parse failure and proceed without insertion. Do not abort. +- The prerequisite task is always batched (one task per plan, not one per missing extension), keeping the plan's task count bounded. + +### Step 4: Validate against schema + +Invoke the static-validation portion of `pharaoh-execute-plan/schema.md` mentally: + +1. Parse rendered text as YAML. +2. Confirm required top-level fields present. +3. Confirm every `skill:` references an existing skill directory under `<pharaoh>/skills/` or `<papyrus>/skills/`. +4. **Terminal quality-gate invariant (unconditional).** Compute the set of tasks with no downstream dependents (no other task lists them in `depends_on`). That set must be non-empty and every task in it must have `skill: pharaoh-quality-gate`. Rationale: when quality gate is absent or non-terminal, executors skip it under cost pressure and ID/body/satisfies checks go unenforced. Every reverse-engineering template ships with a terminal `quality_gate`; this check prevents custom templates from drifting away from that invariant. +5. 
Confirm the ordering invariants specific to reverse-engineering intents: + - preseed_papyrus before any req-emission task. + - id-allocate before any req-emission task referring to allocated IDs. +6. On any violation: abort, return empty `plan_yaml`, add the violation to `warnings`. + +### Step 5: Return + +Return `{plan_yaml, template_used, warnings}`. + +## Heuristics carried from deleted composition skills + +These are the domain bits that used to live in prose inside `pharaoh-feats-from-project` and `pharaoh-reqs-from-module`. They now live in template content and in the variable-inference logic above. Enumerated for auditability: + +1. **split_strategy per file.** Templates emit `split_strategy: ${heuristics.split_strategy(${item.file})}` on every `pharaoh-req-from-code` task, so the executor's helper evaluates it at dispatch time (no LOC counting at write time). The helper's rule: LOC ≤ 500 → single; 500 < LOC ≤ 2000 and section markers → sections; else → top_level_symbols. +2. **preseed Papyrus before reqs (unconditional).** Templates always emit a `preseed_papyrus` task (using `pharaoh-decision-record` or a dedicated future skill) with `depends_on: []` and a `depends_on: [preseed_papyrus]` on every req-emission task. Preseed registers canonical feat names in Papyrus, not file-level associations. Every feat-emission agent queries the same canonical-name namespace regardless of which files it touches, so preseed is always useful even when concurrent agents work on disjoint file sets. +3. **id-allocate before reqs.** Templates emit `pharaoh-id-allocate` after file discovery (feat-file-map or module-file-enum), producing `allowed_ids` that req-emission tasks consume via ref. +4. **Multi-parent reqs.** When a file maps to multiple feats (feat-file-map's `shared_with`), the template emits a single req-from-code task with `parent_feat_ids: ${item.parents}` (a list), not one task per parent. +5. 
**Quality-gate terminal.** Templates always include `pharaoh-quality-gate` as the last task, taking `artefacts_dir: ${workspace}/artefacts`. +6. **file-per-feature layout.** Templates produce one artefact per feat via the `pharaoh-toctree-emit` task's inputs, matching the layout committed in `pharaoh-feats-from-project` Task 6 of the prior plan. +7. **Diagram dep probe (enrich, never strip).** Write-plan reads `conf.py` extensions and, when a plan includes diagram-emitting tasks whose renderer extension is absent, inserts a `pharaoh-sphinx-extension-add` prerequisite task into the plan body (see Step 3.5). The diagram tasks remain in place and gain the new task as a dependency. A warning is still emitted alongside for human inspection. The contract is: write-plan informs AND enriches, never drops; the diagram deliverable is both visible and actually runnable end-to-end. + +## Review invariant (self-review) + +Every emission task in a plan is followed by its matching review task in the DAG. Mapping in [`shared/self-review-map.yaml`](../shared/self-review-map.yaml). The template handles this automatically — the user does not need to request review tasks. + +The terminal `pharaoh-quality-gate` task lists all review tasks in its `depends_on` and configures `gate_spec.invariants.self_review_coverage: true` so that missing reviews fail the gate. See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md). + +Two companion invariants are enabled in every template-emitted plan: + +- `papyrus_non_empty` — enabled whenever the plan ran `preseed_papyrus`, which every reverse-engineering template does (heuristic 2 above). Catches the LLM-executor-skipped-Papyrus-writes failure class. +- `dispatch_signal_matches_plan` — always enabled. Catches the LLM-executor-collapsed-parallel-to-inline failure class. + +Both delegate to atomic check skills; the gate itself stays a pure aggregator. 
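
The split_strategy rule carried over in heuristic 1 above is small enough to sketch. A minimal version, assuming a naive section-marker probe (`SECTION_MARKERS` is an illustrative assumption; the executor's helper defines its own marker detection):

```python
from pathlib import Path

# Sketch of heuristic 1's rule:
#   LOC <= 500                            -> "single"
#   500 < LOC <= 2000 with section markers -> "sections"
#   otherwise                              -> "top_level_symbols"
# SECTION_MARKERS is an assumption for illustration only.
SECTION_MARKERS = ("# ---", "# ===", "#region")

def split_strategy(file_path: str) -> str:
    text = Path(file_path).read_text(encoding="utf-8")
    loc = text.count("\n") + 1
    if loc <= 500:
        return "single"
    if loc <= 2000 and any(m in text for m in SECTION_MARKERS):
        return "sections"
    return "top_level_symbols"
```

Templates defer this decision to dispatch time via `${heuristics.split_strategy(...)}`, so a helper like this runs on the executor side; write-plan itself never counts LOC.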
+ +## Failure modes + +| Condition | Response | +| ------------------------------------------- | -------------------------------------------------------- | +| Intent matches no template | Return empty plan; warning enumerating valid intents. | +| Required var missing after merge | Leave placeholder; warning; caller must fill. | +| Template references a non-existent skill | Abort with warning; return empty plan. | +| Rendered YAML fails schema validation | Abort with warning; return empty plan. | +| tailoring paths unreadable | Proceed with defaults; warning per missing file. | + +## Framework for remaining *-diagram-draft skills + +`pharaoh-use-case-diagram-draft` is the first concrete implementation of the `*-diagram-draft` family. It demonstrates the pattern every future draft skill must follow: + +1. Frontmatter `name`, `description` starting "Use when", `chains_from` / `chains_to: [pharaoh-diagram-review]`. +2. Atomicity section showing (a)-(e). +3. Input section naming `renderer` + `tailoring_path` + per-diagram-type scope inputs. +4. Output section with `{diagram_block, element_count, renderer}` shape. +5. Two "How to emit" subsections — one per renderer (mermaid, plantuml). +6. "Safe labels" subsection linking to `shared/diagram-safe-labels.md`. +7. "Relationship semantics" subsection linking to `shared/uml-relationship-semantics.md` if the diagram kind uses structural relationships. +8. "Last step" subsection invoking `pharaoh-diagram-review` per the self-review invariant. + +Agent cross-ref in `.github/agents/pharaoh.<name>.agent.md` required for CI. + +Diagram-draft skill catalogue (one per UML / SysML view the emitter supports): + +- `pharaoh-use-case-diagram-draft` — **shipped**, runnable end-to-end. +- `pharaoh-sequence-diagram-draft` — design-only scaffold. +- `pharaoh-component-diagram-draft` — design-only scaffold. +- `pharaoh-class-diagram-draft` — design-only scaffold. +- `pharaoh-state-diagram-draft` — design-only scaffold. 
+- `pharaoh-activity-diagram-draft` — design-only scaffold. +- `pharaoh-block-diagram-draft` — design-only scaffold. +- `pharaoh-deployment-diagram-draft` — design-only scaffold. +- `pharaoh-fault-tree-diagram-draft` — design-only scaffold. + +Shipped skills follow the canonical skeleton (frontmatter → input → output → process → review invocation) and have a matching agent under `.github/agents/`. Design-only scaffolds declare the same frontmatter + agent pair for plan-authoring and CI validation, but their process body is a `DESIGN ONLY` placeholder with a sentinel FAIL — implementation lands per-kind when a flow actually needs that view. Use-case extraction is handled end-to-end by `pharaoh-use-case-diagram-draft`; component and flow views are currently extracted directly from code by `pharaoh-feat-component-extract` and `pharaoh-feat-flow-extract` (not by the draft skills). + +## Non-goals + +- No skill discovery beyond checking directory existence. If a template references `pharaoh-foo` and that directory exists, write-plan trusts its contents are valid. +- No file enumeration. File lists come from tasks in the plan at execute time (e.g., `pharaoh-feat-file-map`). +- No artefact emission. This skill emits a plan, not artefacts. +- No branching at runtime. If an intent has two variants ("with diagrams" vs "without"), add two templates. diff --git a/skills/pharaoh-write-plan/templates/reverse-engineer-module.yaml.j2 b/skills/pharaoh-write-plan/templates/reverse-engineer-module.yaml.j2 new file mode 100644 index 0000000..8099fba --- /dev/null +++ b/skills/pharaoh-write-plan/templates/reverse-engineer-module.yaml.j2 @@ -0,0 +1,126 @@ +--- +template_name: reverse-engineer-module +description: | + Single-module reverse-engineering: caller supplies a file list and + optional parent feat ids, plan emits preseed, id-allocate, per-file + req-from-code, quality-gate. No feat-draft, no file-map, no diagrams. 
+ Use when feats already exist and you just want comp_reqs for a module. + + Files are UNROLLED into concrete per-file tasks at write time (no + runtime foreach), so this template suits small modules (3-12 files). + For larger sets use reverse-engineer-project, which discovers files + via feat-file-map and foreach-expands. +required_vars: + - project_root + - module_name # slug for plan.name + - module_dir # relative to project_root + - docs_root # relative to project_root (e.g. "docs/source"); used to locate needs.json + - files # list[{filename, stem}] relative to module_dir; caller supplies stems + - comp_req_directive # from ubproject.toml + - comp_req_prefix # e.g. "CREQ_" +optional_vars: + - parent_feat_ids # list[str], defaults to empty + - papyrus_seeds # list[{canonical_name, body}], caller-supplied + - execution_mode # default "inline" (small N) + - papyrus_workspace # default <project_root>/.papyrus + - workspace_dir # default <project_root>/.pharaoh/runs/<module_name>-<ts>/ +--- +name: reverse-engineer-{{ module_name }} +version: 1 +project_root: {{ project_root }} +{% if workspace_dir %}workspace_dir: {{ workspace_dir }}{% endif %} +defaults: + execution_mode: {{ execution_mode | default("inline") }} + retry_on_validation_fail: 1 + +tasks: +{% if papyrus_seeds %} +{% for seed in papyrus_seeds %} + - id: preseed_{{ loop.index0 }} + skill: pharaoh-decision-record + inputs: + type: fact + canonical_name: "{{ seed.canonical_name }}" + body: "{{ seed.body }}" + reporter_id: "write-plan:preseed-module" + tags: ["origin:preseed"] +{% endfor %} +{% endif %} + + - id: id_allocate + skill: pharaoh-id-allocate + inputs: + requests: +{% for f in files %} + - stem: {{ f.stem }} + count: 3 + prefix: {{ comp_req_prefix }} + type: {{ comp_req_directive }} + parent_feat_id: {{ (parent_feat_ids | default([]))[0] | default("null") }} +{% endfor %} + existing_ids_file: {{ project_root }}/{{ docs_root }}/_build/needs/needs.json + depends_on: {% if papyrus_seeds %}[{% for 
seed in papyrus_seeds %}preseed_{{ loop.index0 }}{% if not loop.last %}, {% endif %}{% endfor %}]{% else %}[]{% endif %} + outputs: + allocations: mapping + +{% for f in files %} + - id: reqs_{{ f.stem }} + skill: pharaoh-req-from-code + inputs: + file_path: {{ module_dir }}/{{ f.filename }} + parent_feat_ids: {{ parent_feat_ids | default([]) }} + target_level: {{ comp_req_directive }} + split_strategy: ${heuristics.split_strategy("{{ module_dir }}/{{ f.filename }}")} + allowed_ids: ${id_allocate.allocations | by_stem("{{ f.stem }}")} + papyrus_workspace: {{ papyrus_workspace | default(project_root + "/.papyrus") }} + project_root: {{ project_root }} + reporter_id: "write-plan:req-from-code-{{ f.stem }}" + depends_on: [id_allocate] + parallel_group: reqs_from_code + expected_output_schema: json_obj + outputs: + reqs: list # [{id, title, type, body, source_doc, satisfies, verification, raw_rst}, ...] +{% endfor %} + + # Per-comp_req reviews and grounding checks — rendered per-stem since + # the module template has no aggregate `reqs_from_code` task (each file + # gets its own `reqs_<stem>`). Each stem gets its own review + grounding + # task foreach-ing over that stem's reqs list. 
+{% for f in files %} + - id: review_{{ f.stem }} + skill: pharaoh-req-review + foreach: "{% raw %}${reqs_{% endraw %}{{ f.stem }}{% raw %}.reqs}{% endraw %}" + execution_mode: subagents + depends_on: [reqs_{{ f.stem }}] + inputs: + target: ${item.raw_rst} + checklist_path: skills/shared/checklists/requirement.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: review_comp_reqs + + - id: grounding_{{ f.stem }} + skill: pharaoh-req-code-grounding-check + foreach: "{% raw %}${reqs_{% endraw %}{{ f.stem }}{% raw %}.reqs}{% endraw %}" + execution_mode: subagents + depends_on: [reqs_{{ f.stem }}] + inputs: + target: ${item.raw_rst} + source_doc_path: ${item.source_doc} + project_root: ${project_root} + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: grounding_check_comp_reqs +{% endfor %} + + - id: quality_gate + skill: pharaoh-quality-gate + inputs: + artefacts_dir: ${workspace}/artefacts + project_root: {{ project_root }} + depends_on: [{% for f in files %}reqs_{{ f.stem }}, review_{{ f.stem }}, grounding_{{ f.stem }}{% if not loop.last %}, {% endif %}{% endfor %}] + +validation: +{% for f in files %} + - task_output: ${reqs_{{ f.stem }}} + schema: json_obj + on_fail: skip_dependents +{% endfor %} diff --git a/skills/pharaoh-write-plan/templates/reverse-engineer-project.yaml.j2 b/skills/pharaoh-write-plan/templates/reverse-engineer-project.yaml.j2 new file mode 100644 index 0000000..b36521b --- /dev/null +++ b/skills/pharaoh-write-plan/templates/reverse-engineer-project.yaml.j2 @@ -0,0 +1,346 @@ +--- +# Template front-matter. Parsed by pharaoh-write-plan, stripped before render. +template_name: reverse-engineer-project +description: | + Full reverse-engineering pipeline: prose docs -> feats -> per-feat file map + -> id allocation -> per-file req emission -> optional diagrams -> codelinks + -> toctree -> quality gate. 
+ + Upstream shape: pharaoh-feat-file-map emits FLAT mappings + {feat_id, files, rationale, entry_point?, shared_with?} per invocation + (one per foreach iteration). The to_files_flat helper denormalises a list + of those flat mappings into a file-centric list with populated `parents` + for files shared across feats. +required_vars: + - project_root # absolute path + - project_name # slug used in plan.name + - docs_root # relative to project_root (e.g. "docs/source"); used by write-plan to enumerate `docs` + - docs # list[str] of relative doc paths; populated by write-plan Step 2 from docs_root + - src_root # relative to project_root (e.g. "src/myproject") + - feat_directive # from ubproject.toml (e.g. "feat") + - comp_req_directive # from ubproject.toml (e.g. "comp_req") +optional_vars: + - workspace_dir # default: <project_root>/.pharaoh/runs/<project_name>-<ts>/ + - target_docs_path # where emitted artefacts (RST, diagrams) finally live; default: ${workspace}/artefacts. Use this to land reverse-engineered spec under the project's docs tree (e.g. docs/source/spec/feature/) without overriding workspace_dir. 
+ - emit_component_diagrams # bool, default true + - emit_flow_diagrams # bool, default true + - emit_codelinks # bool, default true + - execution_mode # "inline" | "subagents", default "subagents" + - papyrus_workspace # default <project_root>/.papyrus +--- +name: reverse-engineer-{{ project_name }} +version: 1 +project_root: {{ project_root }} +{% if workspace_dir %}workspace_dir: {{ workspace_dir }}{% endif %} +defaults: + execution_mode: {{ execution_mode | default("subagents") }} + retry_on_validation_fail: 1 + +tasks: + - id: draft_feats + skill: pharaoh-feat-draft-from-docs + inputs: + doc_files: +{% for d in docs %} + - {{ d }} +{% endfor %} + target_level: {{ feat_directive }} + project_root: {{ project_root }} + papyrus_workspace: {{ papyrus_workspace | default(project_root + "/.papyrus") }} + reporter_id: "write-plan:draft-feats" + outputs: + feats: list # [{id, title, type, body, source_doc, raw_rst}, ...] + expected_output_schema: json_obj + + - id: preseed_papyrus + skill: pharaoh-decision-record + foreach: ${draft_feats.feats | to_papyrus_seeds} + inputs: + type: fact + canonical_name: ${item.canonical_name} + body: ${item.body} + reporter_id: "write-plan:preseed" + tags: ["origin:preseed"] + depends_on: [draft_feats] + + - id: map_files + skill: pharaoh-feat-file-map + foreach: ${draft_feats.feats} + inputs: + feat_id: ${item.id} + feat_title: ${item.title} + feat_body: ${item.body} + src_root: {{ src_root }} + papyrus_workspace: {{ papyrus_workspace | default(project_root + "/.papyrus") }} + reporter_id: "write-plan:feat-file-map" + depends_on: [preseed_papyrus] + parallel_group: map_files + outputs: + feat_id: str + files: list + rationale: str + entry_point: mapping + shared_with: list + + - id: id_allocate + skill: pharaoh-id-allocate + inputs: + requests: ${map_files | to_files_flat | to_id_requests} + existing_ids_file: {{ project_root }}/{{ docs_root }}/_build/needs/needs.json + default_prefix: {{ comp_req_prefix | default("CREQ_") }} + type: 
{{ comp_req_directive }} + depends_on: [map_files] + outputs: + allocations: mapping + + - id: reqs_from_code + skill: pharaoh-req-from-code + foreach: ${map_files | to_files_flat} + inputs: + file_path: ${item.file} + parent_feat_ids: ${item.parents} + target_level: {{ comp_req_directive }} + split_strategy: ${heuristics.split_strategy(item.file)} + allowed_ids: ${id_allocate.allocations | by_stem(item.stem)} + papyrus_workspace: {{ papyrus_workspace | default(project_root + "/.papyrus") }} + project_root: {{ project_root }} + reporter_id: "write-plan:req-from-code" + depends_on: [id_allocate] + parallel_group: reqs_from_code + expected_output_schema: json_obj + outputs: + reqs: list # [{id, title, type, body, source_doc, satisfies, verification, raw_rst}, ...] + + # Per-comp_req prose review — explicit top-level task so the executor does + # not silently skip it. The emitter's `## Last step` used to self-invoke + # this but dozens of emissions under foreach fan-out let the LLM drop the + # self-invoke. Scheduling it here as a top-level task with its own DAG + # dependency is harder to skip. + - id: review_comp_reqs + skill: pharaoh-req-review + foreach: ${reqs_from_code.reqs | flatten} + execution_mode: subagents + depends_on: [reqs_from_code] + inputs: + target: ${item.raw_rst} + checklist_path: skills/shared/checklists/requirement.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: review_comp_reqs + + # Per-comp_req code-grounding check — explicit top-level task (same + # rationale as review_comp_reqs). The skill auto-derives source_doc_path + # from the RST block's `:source_doc:` option, resolved against project_root + # for absolute paths. Pluggable filter YAML resolved from tailoring_path. 
+ - id: grounding_check_comp_reqs + skill: pharaoh-req-code-grounding-check + foreach: ${reqs_from_code.reqs | flatten} + execution_mode: subagents + depends_on: [reqs_from_code] + inputs: + target: ${item.raw_rst} + source_doc_path: ${item.source_doc} + project_root: ${project_root} + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: grounding_check_comp_reqs + + # Per-source-file coverage — reverse direction of reqs_from_code. Asks + # "did the emitter cover every public symbol and every raise-site + # exception this file exposes?". Catches the under-decomposition failure + # mode where a compound CREQ bundles 5 behaviours and the other 4 + # exception classes never get CREQs of their own. Uses the existing + # pharaoh-api-coverage-check atom. Iterates directly over the flat file + # list from to_files_flat ({file, stem, parents, ...}). + - id: api_coverage_comp_reqs + skill: pharaoh-api-coverage-check + foreach: ${map_files | to_files_flat} + execution_mode: subagents + depends_on: [reqs_from_code] + inputs: + source_file: ${item.file} + needs_json_path: {{ project_root }}/{{ docs_root }}/_build/needs/needs.json + project_root: ${project_root} + language: auto + parallel_group: api_coverage_comp_reqs + + # Note: a regenerate_comp_reqs loop is intentionally omitted here. Wiring + # regenerate into the DAG requires pairing each review's findings_json + # with its original RST, which needs either a pairing helper in schema.md + # or an RST-emitting review output. Until one of those lands, operators + # invoke pharaoh-req-regenerate manually per-CREQ on review failures. 
+ +{% if emit_codelinks | default(true) %} + - id: codelink_annotate + skill: pharaoh-req-codelink-annotate + foreach: ${reqs_from_code.reqs | flatten} + inputs: + req_id: ${item.id} + req_title: ${item.title} + req_type: ${item.type} + file_path: ${item.source_doc} + anchor: {type: top_of_file} + parent_links: ${item.satisfies} + project_root: ${project_root} + depends_on: [reqs_from_code] + parallel_group: codelinks + expected_output_schema: json_obj +{% endif %} + + - id: toctree + skill: pharaoh-toctree-emit + inputs: + artefacts_dir: {{ target_docs_path | default("${workspace}/artefacts") }} + docs_root: {{ docs_root }} + feat_directive: {{ feat_directive }} + depends_on: [reqs_from_code{% if emit_component_diagrams | default(true) %}, extract_component_diagrams{% endif %}{% if emit_flow_diagrams | default(true) %}, extract_flow_diagrams{% endif %}] + +{% if emit_component_diagrams | default(true) or emit_flow_diagrams | default(true) %} + - id: diagram_lint + skill: pharaoh-diagram-lint + inputs: + docs_dir: {{ target_docs_path | default("${workspace}/artefacts") }} + strictness: report_only + reporter_id: "write-plan:diagram-lint" + papyrus_workspace: {{ papyrus_workspace | default(project_root + "/.papyrus") }} + depends_on: [toctree] + outputs: + findings: list + status: str +{% endif %} + + # Per-feat review — self-review invariant (shared/self-review-invariant.md) + - id: review_feats + skill: pharaoh-feat-review + depends_on: [draft_feats] + execution_mode: subagents + foreach: ${draft_feats.feats} + inputs: + target: ${item.raw_rst} + checklist_path: skills/shared/checklists/feat.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + + # Diagram emission — use-case view. Iterates over feats; the skill + # derives actors / use-cases / external-systems from the feat body at + # dispatch time. 
+ - id: draft_use_case_diagrams + skill: pharaoh-use-case-diagram-draft + depends_on: [draft_feats] + execution_mode: subagents + foreach: ${draft_feats.feats} + inputs: + feat_id: ${item.id} + tailoring_path: "{{ project_root }}/.pharaoh/project" + +{% if emit_component_diagrams | default(true) %} + # Diagram emission — static view (component extract from code). Iterates + # over map_files's per-feat outputs ({feat_id, files, ...}). + - id: extract_component_diagrams + skill: pharaoh-feat-component-extract + depends_on: [map_files] + execution_mode: subagents + foreach: ${map_files} + inputs: + feat_id: ${item.feat_id} + files: ${item.files} + project_root: {{ project_root }} + src_root: {{ src_root }} + tailoring_path: "{{ project_root }}/.pharaoh/project" +{% endif %} + +{% if emit_flow_diagrams | default(true) %} + # Diagram emission — dynamic view (flow extract from call graph). Uses + # with_entry_point to skip feats that have no declared entry_point, since + # flow tracing requires one. Per-scenario enumeration is handled inside + # the skill from project.diagram_conventions.dynamic_view_scenarios — + # plan-level foreach is single-axis per schema.md. + - id: extract_flow_diagrams + skill: pharaoh-feat-flow-extract + depends_on: [map_files] + execution_mode: subagents + foreach: ${map_files | with_entry_point} + inputs: + feat_id: ${item.feat_id} + files: ${item.files} + entry_point: ${item.entry_point} + project_root: {{ project_root }} + src_root: {{ src_root }} + tailoring_path: "{{ project_root }}/.pharaoh/project" +{% endif %} + + # Per-diagram review — self-review invariant. Split per diagram kind + # because plan.yaml ref grammar disallows list-concatenation across + # multiple producer tasks. Each review task iterates the matching + # drafter's outputs. Executor runs them in parallel via the shared + # review_diagrams parallel_group. 
+ - id: review_use_case_diagrams + skill: pharaoh-diagram-review + depends_on: [draft_use_case_diagrams] + execution_mode: subagents + foreach: ${draft_use_case_diagrams} + inputs: + diagram_block: ${item} + diagram_type: use_case + checklist_path: skills/shared/checklists/diagram.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: review_diagrams + +{% if emit_component_diagrams | default(true) %} + - id: review_component_diagrams + skill: pharaoh-diagram-review + depends_on: [extract_component_diagrams] + execution_mode: subagents + foreach: ${extract_component_diagrams} + inputs: + diagram_block: ${item} + diagram_type: feat_component_extract + checklist_path: skills/shared/checklists/diagram.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: review_diagrams +{% endif %} + +{% if emit_flow_diagrams | default(true) %} + - id: review_flow_diagrams + skill: pharaoh-diagram-review + depends_on: [extract_flow_diagrams] + execution_mode: subagents + foreach: ${extract_flow_diagrams} + inputs: + diagram_block: ${item} + diagram_type: feat_flow_extract + checklist_path: skills/shared/checklists/diagram.md + tailoring_path: "{{ project_root }}/.pharaoh/project" + parallel_group: review_diagrams +{% endif %} + + - id: quality_gate + skill: pharaoh-quality-gate + inputs: + artefacts_dir: {{ target_docs_path | default("${workspace}/artefacts") }} + project_root: {{ project_root }} +{% if emit_component_diagrams | default(true) or emit_flow_diagrams | default(true) %} + diagram_lint_findings: ${diagram_lint.findings} + diagram_lint_status: ${diagram_lint.status} +{% endif %} + gate_spec: + invariants: + papyrus_non_empty: + enabled: true + required_min: 1 + dispatch_signal_matches_plan: + enabled: true + self_review_coverage: + enabled: true + self_review_map_path: skills/shared/self-review-map.yaml + depends_on: [toctree{% if emit_component_diagrams | default(true) or emit_flow_diagrams | default(true) %}, diagram_lint{% 
endif %}{% if emit_codelinks | default(true) %}, codelink_annotate{% endif %}, review_feats, review_comp_reqs, grounding_check_comp_reqs, api_coverage_comp_reqs, draft_use_case_diagrams{% if emit_component_diagrams | default(true) %}, extract_component_diagrams{% endif %}{% if emit_flow_diagrams | default(true) %}, extract_flow_diagrams{% endif %}, review_use_case_diagrams{% if emit_component_diagrams | default(true) %}, review_component_diagrams{% endif %}{% if emit_flow_diagrams | default(true) %}, review_flow_diagrams{% endif %}] + +validation: + - task_output: ${draft_feats} + schema: json_obj + on_fail: retry + - task_output: ${reqs_from_code.*} + schema: json_obj + on_fail: skip_dependents +{% if emit_codelinks | default(true) %} + - task_output: ${codelink_annotate.*} + schema: json_obj + on_fail: skip_dependents +{% endif %} diff --git a/skills/shared/checklists/decision.md b/skills/shared/checklists/decision.md new file mode 100644 index 0000000..cf3e617 --- /dev/null +++ b/skills/shared/checklists/decision.md @@ -0,0 +1,40 @@ +--- +name: decision +applies_to: decision +axes: + - context_section_present + - alternatives_listed + - consequences_section_present + - trace_to_affected_artefacts + - canonical_name_unique + - rationale_quality +--- + +# Decision review checklist + +Generic baseline for reviewing a single recorded decision (DR / ADR / design note). + +## Mechanized axes + +### context_section_present +Body contains a `Context` section heading (`Context\n-------` or `**Context:**`). + +### alternatives_listed +Body contains an `Alternatives` section with at least two bullet items OR an explicit "alternatives considered" table. + +### consequences_section_present +Body contains a `Consequences` section (positive + negative bullets, or prose). + +### trace_to_affected_artefacts +Directive has at least one outgoing link (`:affects:`, `:supersedes:`, `:relates_to:`, or the project's declared decision-link option) that resolves. 
+ +### canonical_name_unique +If the decision is Papyrus-backed, the `(type, canonical_name)` tuple appears exactly once in the Papyrus workspace. Mechanical dedup check. + +## Subjective axis + +### rationale_quality (0-3) +- 3 — Rationale names the driving constraint or value and ties it explicitly to the chosen alternative. +- 2 — Rationale present but weak link to the choice. +- 1 — Rationale restates the choice without justification. +- 0 — Rationale missing or placeholder. diff --git a/skills/shared/checklists/diagram.md b/skills/shared/checklists/diagram.md new file mode 100644 index 0000000..24ff3b3 --- /dev/null +++ b/skills/shared/checklists/diagram.md @@ -0,0 +1,117 @@ +--- +name: diagram +applies_to: diagram +axes: + - trace_to_parent + - caption_present + - element_count_within_bounds + - parser_clean + - required_elements_for_type + - conditional_branches_marked + - external_library_participant + - returns_match_call_stack + - purpose_clarity + - granularity_consistency + - naming_clarity +--- + +# Diagram review checklist + +Generic baseline for reviewing a single diagram block (Mermaid or PlantUML). Domain-agnostic — no references to specific standards (ISO 26262, ASPICE) or project-specific conventions (safety markers, connector-family terms, etc.). Projects add tailoring addenda. + +## Mechanized axes + +### trace_to_parent +Diagram's `:caption:` option OR an annotation inside the block names the parent need_id (matches regex `[A-Z_]+(__|_)[a-z0-9_]+` if the project follows sphinx-needs ID conventions). Orphan diagram (no named parent) → fail. + +**Detection rule:** Extract `:caption:` value; look for the parent_need_id substring, case-sensitive. If absent, also search the diagram body for a comment or note referencing the parent. + +### caption_present +The RST directive has a non-empty `:caption:` option. + +**Detection rule:** grep directive options for `:caption:\s+\S`. 
+
+### element_count_within_bounds
+Element count ≤ `tailoring.diagram.element_count_max` (default 7, tailorable). Element definition is renderer-specific:
+- Mermaid: count of nodes (non-edge lines that introduce a new participant/node/state/class).
+- PlantUML: count of `component`, `class`, `participant`, `state`, `usecase`, `rectangle` declarations.
+
+Above the limit → fail. The >N rule signals need for decomposition.
+
+### parser_clean
+`pharaoh-diagram-lint` passes on this block. Mechanical delegation.
+
+**Detection rule:** Run the lint atom; propagate its pass/fail.
+
+### required_elements_for_type
+Per `diagram_type`, check presence of the canonical element set declared in `pharaoh-diagram-review/SKILL.md > Per-type required-elements` table. Mechanical grep on known renderer keywords.
+
+### conditional_branches_marked
+Applies only when `diagram_type == sequence`. If the function named by `:source_doc:` contains two or more `if` / `elif` / `else` branches whose bodies produce observable calls, the diagram body must contain at least one conditional block marker. Markers are renderer-agnostic in effect: Mermaid uses `alt` / `opt` / `loop`; PlantUML uses `alt` / `opt` / `group`. Two of each renderer's three tokens (`alt`, `opt`) overlap, so the union holds four tokens; the detection rule looks for any of the four at line-start, covering both renderers with the same grep.
+
+If the source function has branching but the diagram body presents an unconditional call sequence → fail.
+
+**Check:** source function is branching; diagram must expose that branching.
+
+**Detection rule:** parse `:source_doc:` with the `ast` module (Python source), count `ast.If` nodes inside the named function plus any `elif`/`else` clauses they carry; if the total branch count is ≥ 2, grep the diagram body (case-sensitive, line-start after optional whitespace) for `^\s*(alt|opt|loop|group)\b`. No token found → fail with evidence `"source has N conditional branches; diagram body has no alt/opt/loop/group marker"`.
For non-Python sources fall back to regex `^\s*(if|elif|else\s*if|else)\b` in the function body slice delimited by the next sibling def. PlantUML-only `group` is intentionally included here and ignored for Mermaid parsing because Mermaid does not recognise `group` as a keyword — a stray `group` in Mermaid would already fail `parser_clean`, so this axis over-matches safely. + +### external_library_participant +Applies to `sequence` and `block` diagrams. If the function named by `:source_doc:` imports a non-stdlib package and calls it inside the function body, the library name (or a documented alias for it) must appear as a participant in the diagram. + +If an external dependency is silently elided from the interaction → fail. + +**Check:** external dependencies used in the function surface as diagram participants. + +**Detection rule:** parse `:source_doc:` with `ast`, collect every top-level `Import` / `ImportFrom` target whose root package is NOT a member of `sys.stdlib_module_names` (Python 3.10+) and not one of the well-known local-root markers declared in `tailoring.internal_packages` (defaults: project root package). For each such external package, confirm at least one `ast.Call` inside the named function references the imported name (attribute chain or bare name). For each confirmed external call, grep the diagram body for the library name as a participant declaration: +- Mermaid sequence: `^\s*participant\s+<lib>\b` or `^\s*<lib>\s*->>|^\s*<lib>\s*->>?\+?` (actor shorthand). +- PlantUML sequence: `^\s*(participant|actor|boundary|control|entity|database|queue|collections)\s+"?<lib>"?\b`. +- Mermaid block / flowchart: node id or label containing `<lib>` (`\b<lib>\b` anywhere in a node declaration line starting with a non-edge token). +- PlantUML block / component: `^\s*(component|rectangle|package|node|artifact)\s+"?<lib>"?\b`. 
+ +If at least one external-call library is unreferenced in the diagram body → fail with evidence `"<lib> imported and called at line N; absent from participant list"`. A single regex union `^\s*(participant|actor|component|rectangle|package|node|artifact|boundary|control|entity|database|queue|collections)\s+"?<lib>"?\b` covers both renderers' declaration forms; fall-through node-label match handles block-diagram flowchart syntax where participants are implicit. + +Tailoring: `tailoring.internal_packages` (list of package roots to treat as first-party and therefore skip); `tailoring.external_alias_map` (dict from import-name to allowed diagram-label alias, e.g. `{"jira": "Jira API"}`). + +### returns_match_call_stack +Applies only when `diagram_type == sequence`. Every return arrow must terminate at a participant that previously appeared as the source of a call (an entry in the live call stack) or at the diagram's declared entrypoint participant. Return arrows terminating at a free-floating participant — one that never issued a call and is not the declared entrypoint — fail the axis. Declaring `User` / `Actor` as the entrypoint is legitimate and passes; silently routing returns to an invented `User` that never called anything is the failure mode. + +**Check:** return arrows respect the call stack induced by call arrows; no invented return destinations. + +**Detection rule:** tokenise the diagram body line by line. For each line, classify: +- Mermaid call: `^\s*(?P<src>\w+)\s*->>\+?\s*(?P<dst>\w+)\s*:` (also `->>` without `+`). +- Mermaid return: `^\s*(?P<src>\w+)\s*-->>-?\s*(?P<dst>\w+)\s*:`. +- PlantUML call: `^\s*(?P<src>\w+)\s*->\s*(?P<dst>\w+)\s*:` (solid arrow). +- PlantUML return: `^\s*(?P<src>\w+)\s*-->\s*(?P<dst>\w+)\s*:` (dashed arrow). 
+
+Walk the token stream maintaining a stack: push `src` on every call arrow; on every return arrow check that `dst` equals either the top-of-stack caller (ideal — balanced return) OR any caller earlier on the stack (tolerated — collapsed returns) OR the declared entrypoint (first participant in `participant` / `actor` declarations, or the source of the first call). On match, pop down to that caller; on mismatch record `{"return_from": src, "return_to": dst, "stack": [...]}` in evidence and fail the axis. The entrypoint exemption ensures diagrams that legitimately start from a `User` actor pass; the stack check ensures returns to a never-seen `User` fail.
+
+Because Mermaid's `-->>` and PlantUML's `-->` both denote returns in their respective renderers, the tokeniser can try both patterns in parallel with no explicit renderer switch: Mermaid does not parse `-->` as a return, and while PlantUML does accept `-->>` (a dashed arrow with a thin head), a dashed arrow is still a return in PlantUML, so a line matching either return pattern is classified correctly regardless of renderer. Lines matching neither are skipped (notes, comments, `activate` / `deactivate`, block markers handled by `conditional_branches_marked`).
+
+## Subjective axes (0-3, LLM-judge fallback)
+
+### purpose_clarity
+- 3 — Caption + diagram together make the purpose obvious: what question does this diagram answer?
+- 2 — Purpose inferrable from surrounding text.
+- 1 — Purpose unclear without reading the source code.
+- 0 — No identifiable purpose.
+
+### granularity_consistency
+- 3 — All elements at the same level of abstraction (e.g. all files, or all classes, not both mixed).
+- 2 — One element out of level.
+- 1 — Multiple levels mixed, confusing the reader.
+- 0 — Level incoherent throughout.
+
+### naming_clarity
+- 3 — Every node/edge label is self-explanatory.
+- 2 — One opaque label.
+- 1 — Multiple opaque labels.
+- 0 — Labels are meaningless or absent.
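The stack walk specified under `returns_match_call_stack` can be sketched in Python. This is a minimal illustration, not part of any Pharaoh skill's API: the function name `check_returns`, the findings shape, and the simple `participant`/`actor` declaration regex are assumptions; only the arrow forms listed in the detection rule are handled.

```python
import re

# Arrow grammars from the returns_match_call_stack detection rule. The Mermaid
# and PlantUML forms do not collide on the same line, so one alternation per
# arrow kind suffices and no renderer switch is needed.
CALL_RE = re.compile(
    r'^\s*(?P<src>\w+)\s*->>\+?\s*(?P<dst>\w+)\s*:'    # Mermaid call
    r'|^\s*(?P<psrc>\w+)\s*->\s*(?P<pdst>\w+)\s*:'     # PlantUML call
)
RETURN_RE = re.compile(
    r'^\s*(?P<src>\w+)\s*-->>-?\s*(?P<dst>\w+)\s*:'    # Mermaid return
    r'|^\s*(?P<psrc>\w+)\s*-->\s*(?P<pdst>\w+)\s*:'    # PlantUML return
)
DECL_RE = re.compile(r'^\s*(?:participant|actor)\s+"?(\w+)"?')


def check_returns(diagram_body):
    """Return (passed, findings) for the returns_match_call_stack axis."""
    stack, findings, entrypoint = [], [], None
    for line in diagram_body.splitlines():
        decl = DECL_RE.match(line)
        if decl:
            if entrypoint is None:              # first declared participant
                entrypoint = decl.group(1)
            continue
        ret = RETURN_RE.match(line)
        if ret:
            src = ret.group('src') or ret.group('psrc')
            dst = ret.group('dst') or ret.group('pdst')
            if dst in stack:                    # balanced or collapsed return
                last = len(stack) - 1 - stack[::-1].index(dst)
                del stack[last:]                # pop down to that caller
            elif dst != entrypoint:             # invented return destination
                findings.append({"return_from": src, "return_to": dst,
                                 "stack": list(stack)})
            continue
        call = CALL_RE.match(line)
        if call:                                # push caller on every call
            stack.append(call.group('src') or call.group('psrc'))
            if entrypoint is None:              # no declarations: first caller
                entrypoint = stack[0]
    return (not findings), findings
```

Lines matching none of the patterns (notes, comments, block markers) fall through untouched, matching the skip rule above; a return to a participant that never issued a call and is not the entrypoint lands in `findings` with the live stack as evidence.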
+ +## Tailoring extension point + +Per-project addenda under `.pharaoh/project/checklists/diagram.md` add axes with keys prefixed `tailoring.*`. Examples: + +- Safety-critical project: `tailoring.asil_marker_present`, `tailoring.scenario_coverage` (N scenarios per dynamic view), `tailoring.supplier_manual_coverage`. +- Single-renderer project: `tailoring.renderer_is_mermaid` (project uses mermaid only), `tailoring.external_system_labelled`. + +Architecture-rule examples (e.g. "every diagram must trace to requirements" → `tailoring.trace_to_requirements_full`; ">3 children requires rationale" → `tailoring.simplicity_with_rationale` parametrized with `threshold: 3`) live in project tailoring files, never in this base checklist. diff --git a/skills/shared/checklists/feat.md b/skills/shared/checklists/feat.md new file mode 100644 index 0000000..b15ebca --- /dev/null +++ b/skills/shared/checklists/feat.md @@ -0,0 +1,86 @@ +--- +name: feat +applies_to: feat +axes: + - trace_to_parent_or_workflow + - single_user_capability + - source_doc_present_and_valid + - required_fields_complete + - shall_clause_user_observable + - body_length_within_bounds + - no_comp_level_mechanism_leak + - naming_clarity +--- + +# Feat review checklist + +Generic baseline for reviewing feat-level needs. Project-specific addenda in `.pharaoh/project/checklists/feat.md` extend but never replace these. + +## Mechanized axes (pass/fail, execution-based reward) + +### trace_to_parent_or_workflow + +**Check:** The feat has `:satisfies:` or `:realizes:` linking to a parent workflow (`wf__*`), a higher-level feat, or a stakeholder need. An orphan feat (no parent link) fails. + +**Detection rule:** In the feat directive, grep for any outgoing link option declared in `artefact-catalog.yaml` under the feat type. At least one must resolve. + +### source_doc_present_and_valid + +**Check:** The feat has a `:source_doc:` field, and the referenced path exists in the repository. 
+ +**Detection rule:** Read `:source_doc:` value; assert file exists relative to the docs root. + +### required_fields_complete + +**Check:** All fields listed under `artefact-catalog.yaml > feat > required_fields` are present in the directive options. + +**Detection rule:** Parse directive options; diff against required_fields list. + +### body_length_within_bounds + +**Check:** The feat body (prose + shall-clauses) is between 3 and 15 non-blank lines. Shorter = under-specified. Longer = likely two feats fused. + +**Detection rule:** Count non-blank lines between the directive header and the next RST section or directive. + +## Subjective axes (0-3 score, LLM-judge fallback) + +### single_user_capability (0-3) + +**Rubric:** +- 3 — Describes exactly one user-facing capability. Name and body refer to one action, one data flow, one exit. +- 2 — One capability with minor overlap into a neighbour (e.g. "export and notify"). +- 1 — Two or more capabilities fused (e.g. "reqif_exchange" covers both export and import). +- 0 — Capability not identifiable; body describes implementation rather than behavior. + +### shall_clause_user_observable (0-3) + +**Rubric:** +- 3 — Shall-clause expresses observable external behavior: input X → output Y, no internal mechanism named. +- 2 — Minor mechanism reference that does not obscure the external behavior. +- 1 — Shall-clause names an internal module, class, or function and the user-observable behavior is unclear. +- 0 — Shall-clause is implementation description, not a behavioral claim. + +### no_comp_level_mechanism_leak (0-3) + +**Rubric:** +- 3 — Body contains no references to specific classes, methods, file paths, or data structures. Pure user / system / external-interface vocabulary. +- 2 — One or two stray references that do not dominate. +- 1 — Body half-describes mechanism (e.g. "the JamaClient fetches items and the Jama2Needs processor converts..."). +- 0 — Body reads like a method-level docstring. 
+ +### naming_clarity (0-3) + +**Rubric:** +- 3 — ID and title are self-explanatory to a reviewer unfamiliar with the codebase. +- 2 — Mostly clear, one abbreviation or acronym without expansion. +- 1 — Opaque abbreviations dominate; title does not disambiguate siblings. +- 0 — Generic placeholder names (`FEAT_stuff`, `FEAT_misc`). + +## Tailoring extension point + +Per-project addenda under `.pharaoh/project/checklists/feat.md` add axes with keys prefixed `tailoring.*`. Examples: + +- Safety-critical project: `tailoring.asil_marker_present` (every feat carries the project's safety-classification marker). +- Connector project: `tailoring.connector_family_named` (every connector feat names its family per the project's controlled vocabulary). + +Extension axes follow the same mechanized vs subjective split. The base axes are always run; tailoring axes are run only if the project defines them. diff --git a/skills/shared/checklists/fmea.md b/skills/shared/checklists/fmea.md new file mode 100644 index 0000000..8655fd3 --- /dev/null +++ b/skills/shared/checklists/fmea.md @@ -0,0 +1,45 @@ +--- +name: fmea +applies_to: fmea +axes: + - trace_to_analyzed_artefact + - severity_in_range + - occurrence_in_range + - detection_in_range + - rpn_computed_correctly + - cause_well_formed + - effect_well_formed + - mitigation_proposed_if_rpn_high +--- + +# FMEA review checklist + +Generic baseline for reviewing a single FMEA row. + +## Mechanized axes + +### trace_to_analyzed_artefact +FMEA entry has `:analyzes:` or `:satisfies:` linking to the requirement / architecture element under analysis. Orphan FMEA → fail. + +### severity_in_range / occurrence_in_range / detection_in_range +Each of S/O/D is integer within the configured scale (default 1..10). Scale tailorable via `.pharaoh/project/artefact-catalog.yaml > fmea > scales`. + +### rpn_computed_correctly +`rpn == severity * occurrence * detection`. Recompute; fail on mismatch. 
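The three range checks and the RPN recomputation can be sketched as a small row validator. A hedged illustration: the dict keys, the `(1, 10)` default scale, and the findings format are assumptions; a real runner reads the scale from `.pharaoh/project/artefact-catalog.yaml > fmea > scales`.

```python
def review_fmea_row(row, scale=(1, 10)):
    """Mechanized checks for one FMEA row: S/O/D in range, RPN recomputed.

    `row` is a dict of the directive's option values; key names illustrative.
    """
    findings = []
    lo, hi = scale
    # severity_in_range / occurrence_in_range / detection_in_range
    for axis in ("severity", "occurrence", "detection"):
        value = row.get(axis)
        if not (isinstance(value, int) and lo <= value <= hi):
            findings.append(f"{axis}_in_range: {value!r} outside {lo}..{hi}")
    # rpn_computed_correctly: recompute; fail on mismatch
    expected = (row.get("severity", 0) * row.get("occurrence", 0)
                * row.get("detection", 0))
    if row.get("rpn") != expected:
        findings.append(f"rpn_computed_correctly: stated {row.get('rpn')},"
                        f" recomputed {expected}")
    return findings
```

An empty findings list means the row passes the mechanized axes; each string names the failing axis so the review report can be grouped per axis.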
+ +### mitigation_proposed_if_rpn_high +If `rpn >= threshold` (default 60, tailorable), a `:mitigation:` field must be present and non-empty. Below threshold: mitigation optional. + +## Subjective axes + +### cause_well_formed (0-3) +- 3 — Root cause named, not a symptom; physically / logically plausible. +- 2 — Cause named but at symptom level. +- 1 — Cause vague or circular. +- 0 — Cause missing or unparseable. + +### effect_well_formed (0-3) +- 3 — Effect describes observable external consequence (functional, safety, security). +- 2 — Effect partially internal. +- 1 — Effect is a synonym of the failure mode itself. +- 0 — Effect missing or unparseable. diff --git a/skills/shared/checklists/requirement.md b/skills/shared/checklists/requirement.md new file mode 100644 index 0000000..6a345f1 --- /dev/null +++ b/skills/shared/checklists/requirement.md @@ -0,0 +1,251 @@ +--- +name: requirement +applies_to: comp_req +axes: + - atomicity + - internal_consistency + - verifiability + - schema + - source_doc_resolves + - unambiguity_prose + - comprehensibility + - feasibility + - completeness + - external_consistency + - no_duplication + - maintainability + - exception_raise_sites_exist + - trigger_condition_literal_match + - named_symbol_exists + - type_framework_matches_imports + - backtick_symbol_in_source_doc + - no_weasel_adjectives + - quantifier_enumerated + - branch_count_aligned +--- + +# Requirement review checklist + +Generic baseline for reviewing a single requirement-level need against ISO 26262-8 §6 axes plus +code-grounding fidelity axes. Domain-neutral — no regulatory-standard-specific vocabulary, no +downstream-consumer link names. Projects add tailoring addenda under +`.pharaoh/project/checklists/requirement.md`. + +## Mechanized axes (pass/fail, execution-based reward) + +### atomicity + +**Check:** The body contains exactly one `shall` clause and no coordinating conjunction joins modal +verbs within that clause. 
+ +**Detection rule:** +```bash +grep -cE '\bshall\b' <creq_body> # must equal 1 +grep -E 'shall .*(, and | and | or |, or )' <creq_body> # must return no match +``` + +### internal_consistency + +**Check:** The body contains no self-contradictory phrasing (simultaneous "always" and "unless not +required", or an exception that negates the main clause). + +**Detection rule:** +```bash +grep -E '\b(always|must)\b.*\bunless\b|\balways\b.*\bnever\b' <creq_body> +``` + +### verifiability + +**Check:** The directive has a `:verification:` (or project-declared equivalent) option, non-empty, +whose target resolves in `needs.json`. + +**Detection rule:** extract `:verification:` option; look up the value in the needs map; pass iff +the key exists. + +### schema + +**Check:** Every field listed under `required_fields` for this artefact type in +`artefact-catalog.yaml` is present and non-empty in the directive options. + +**Detection rule:** parse directive options; diff keys against the catalog's required list; pass +iff the diff is empty. + +### source_doc_resolves + +**Check:** If the artefact catalog declares `:source_doc:` for this type, the option must be +present, point at an existing file, and the file must contain at least one symbol named in the +requirement body. Runs whenever `:source_doc:` is declared, independent of any grounding-check +invocation. See `../../pharaoh-req-code-grounding-check/SKILL.md#axes` for symbol-extraction +details. + +**Detection rule:** +```bash +grep -oE ':source_doc:\s+\S+' <creq> # option present +test -f "<source_doc_value>" # path resolves +grep -qE '<extracted_symbol>' "<source_doc_value>" # ≥1 symbol hit +``` + +## Subjective axes (0-3, LLM-judge fallback) + +### unambiguity_prose + +- 3 — Single interpretation; measurable terms; no weasel adjectives. +- 2 — Single interpretation; minor phrasing issues. +- 1 — Two plausible interpretations; a vague term like "sufficient" without a threshold. 
+- 0 — Multiple conflicting interpretations. + +### comprehensibility + +- 3 — Reader at adjacent abstraction level understands without supporting documents. +- 2 — Mostly clear; minor jargon or an undefined acronym. +- 1 — Mostly unclear without extra context. +- 0 — Reader at adjacent level cannot follow. + +### feasibility + +- 3 — Clearly feasible, tightly bounded. +- 2 — Feasible with known engineering effort. +- 1 — Feasible but significant unknowns. +- 0 — Obviously infeasible within item-development constraints. + +## Deferred / chain-level axes + +`completeness`, `external_consistency`, `no_duplication` require the full set of sibling +requirements — assessed by a set-level review atom, not here. Record as +`{"score": "deferred", "reason": "set-level axis"}`. `maintainability` requires observing +regeneration convergence — record as `{"score": null, "reason": "chain-level axis"}`. + +## Mechanised axes — code-grounded + +Runs only when the requirement declares `:source_doc:`. Paired with the runner at +`../../pharaoh-req-code-grounding-check/SKILL.md` — this checklist is the rubric reference, the +skill is the runner. See `../../pharaoh-req-code-grounding-check/SKILL.md#axes` for full axis +prose; entries below are the grep-able detection rules the runner applies. + +### exception_raise_sites_exist + +**Check:** Every exception class named in a `raises X` / `shall raise X` / `throws X` clause in the +body has at least one raise site in `:source_doc:`. + +**Detection rule:** +```bash +grep -oE '(raises?|throws?|shall raise)\s+(the\s+|an?\s+)?[A-Z][A-Za-z0-9_]+' <creq_body> \ + | awk '{print $NF}' | sort -u +# for each extracted class X, verify: +grep -cE "raise\s+X\s*\(" <source_doc> # must be ≥ 1 +``` + +### trigger_condition_literal_match + +**Check:** For each `when <field> == "<value>"` / `when <field> is <value>` clause in the body, the +source file contains the matching operator and literal — not an inverted `!=` / different value. 
+
+**Detection rule:**
+```bash
+grep -oE 'when\s+[a-z_]+\s*(==|is)\s*"[^"]*"' <creq_body>
+# locate the comparison in <source_doc>; fail if the operator found is the
+# inverted != or the literal differs from the one cited in the body:
+grep -E '<field>\s*(==|!=)\s*"<value>"' <source_doc>
+```
+
+### named_symbol_exists
+
+**Check:** Every symbol mentioned in a structural context (after `raises/throws/uses/wraps/calls/
+invokes/extends/subclasses`, or as a function-call with parens) exists as a definition or call
+site in `:source_doc:`. Bounded extraction avoids false positives on stdlib generics and narrative
+capitalization.
+
+**Detection rule:**
+```bash
+grep -E '(raises?|throws?|uses?|wraps?|calls?|invokes?|extends?|subclasses?)\s+(the\s+|an?\s+)?[A-Z][A-Za-z0-9_]+' <creq_body>
+grep -E '[a-z_][a-z0-9_]+\(' <creq_body>
+# each extracted name must appear in <source_doc>:
+grep -E '(def|class)\s+<name>|<name>\s*\(' <source_doc>
+```
+
+### type_framework_matches_imports
+
+**Check:** If the body cites a type-framework (`Pydantic model`, `dataclass`, `attrs class`,
+`TypedDict`), the source file's imports match. Naming a `FooError`-shaped Pydantic model while the
+source uses `@dataclass` fails.
+
+**Detection rule:**
+```bash
+grep -oEi 'pydantic|dataclass|attrs\s+class|typeddict' <creq_body>
+grep -E '^(from|import)\s+(pydantic|dataclasses|attr|typing)' <source_doc>
+grep -E '^@(dataclass|attr\.s|attrs\.define)' <source_doc>
+```
+
+### backtick_symbol_in_source_doc
+
+**Check:** Every backtick-quoted token in the body — surviving the filter chain (TOML section,
+file path / CLI string, Typer kebab flag, env glob, external dotted path, short-prose acronym) —
+must appear as a literal substring in the file named by `:source_doc:`. Catches the cross-file
+leak where a CREQ cites a config-side default literal (e.g. ``reqif_uuid``) while its
+`:source_doc:` points at the consumer module that only sees the attribute (`self.config.uuid_target`).
+ +**Detection rule:** delegated to `pharaoh-req-code-grounding-check` axis #5 — see that skill for the +full filter chain and evidence format. The check runs per-CREQ during the sibling-review phase. + +### no_weasel_adjectives + +**Check:** The body contains no word from the base blacklist +`structured, comprehensive, full, absolute, paginated, robust, complete, proper` nor any entry +added via `tailoring.weasel_extra`. These words imply mechanised behaviour without grounding. + +**Detection rule:** +```bash +grep -iwE '\b(structured|comprehensive|full|absolute|paginated|robust|complete|proper)\b' <creq_body> +# extended blacklist resolved at runtime from tailoring.weasel_extra +``` + +### quantifier_enumerated + +**Check:** If the body contains an unbounded quantifier over an enumerable noun +(`all errors`, `every validator`, `each branch`, ...), the same or next sentence must enumerate +the members (colon + list, `namely`, `specifically`, `including`, or a Sphinx list directive). + +**Detection rule:** +```bash +grep -oE '\b(all|every|each)\s+([a-z]+\s+){0,3}(error|errors|exception|exceptions|failure|failures|case|cases|command|commands|branch|branches|mode|modes|validator|validators)s?\b' <creq_body> +# then require in the same or next sentence: +grep -E ':\s|\s(namely|specifically|including)\s|\.\.\s+list-table::|^\s*-\s' <adjacent_block> +``` + +### branch_count_aligned (subjective, 0-3) + +**Check:** Count `if` / `elif` / `else` / `match` branches in the function named by `:source_doc:`. +Score by how well the requirement structure reflects the branch count. + +**Rubric:** +- 3 — One shall per branch, or a single requirement with explicit per-branch enumeration. +- 2 — Branches grouped under a justified umbrella (e.g. "validation errors" for 2-3 similar branches). +- 1 — Requirement collapses ≥3 distinct branches into one shall-clause with no enumeration. +- 0 — Requirement omits entire branches that produce observable output. 
+ +**Detection rule:** +```bash +python3 -c "import ast,sys; t=ast.parse(open(sys.argv[1]).read()); \ + print(sum(isinstance(n,(ast.If,ast.Match)) for n in ast.walk(t)))" <source_doc> +# regex fallback for non-Python: +grep -cE '^\s*(if|elif|else|match)\b' <source_doc> +``` + +## Tailoring extension point + +Per-project addenda under `.pharaoh/project/checklists/requirement.md` add axes with keys prefixed +`tailoring.*`. Frontmatter-level tailoring knobs extend the base axes in place: + +```yaml +tailoring: + weasel_extra: ["comprehensively", "seamlessly", "flexibly"] +``` + +`weasel_extra` is union-merged with the base blacklist before `no_weasel_adjectives` runs. +Projects may also add their own axes: + +- `tailoring.link_to_safety_goal` — for safety-critical domains that require a dedicated upstream + link option beyond the base `:satisfies:`. +- `tailoring.controlled_vocabulary_used` — for corpora that mandate a glossary-backed term set. + +Extension axes follow the same mechanized vs subjective split. The base axes are always run; +tailoring axes run only if the project defines them. diff --git a/skills/shared/code-grounding-filters-python.yaml.example b/skills/shared/code-grounding-filters-python.yaml.example new file mode 100644 index 0000000..67a5abf --- /dev/null +++ b/skills/shared/code-grounding-filters-python.yaml.example @@ -0,0 +1,45 @@ +# Reference tailoring file for a Python + Typer project. +# Drop this into .pharaoh/project/code-grounding-filters.yaml and adjust to +# your project's conventions. Strategies and their parameters are documented +# in shared/code-grounding-filters.md. + +filters: + # Typer / Click style CLI — kebab flags and bare kebab subcommands. + # Covers --license-key ↔ license_key = typer.Option(...) + # and from-csv ↔ def from_csv(...) + # and --license-key ↔ OptLicenseKey: TypeAlias = Annotated[...] 
+ - name: typer_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: ["Opt"] + + # Uppercase env-var families cited via trailing * (JAMA_*). + - name: env_var_glob + strategy: prefix_glob_expansion + token_regex: "^[A-Z][A-Z0-9_]*_?\\*$" + separator_character: "_" + + # External imports cited via dotted path (rich.console.Console). + # Resolves when source contains `from rich.console import Console`, plain + # `import rich.console`, or the literal `rich.console.Console` attribute + # access. + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" + + # Dataclass default literal vs. consumer-side attribute access. + # Catches the CREQ pattern where the body cites `reqif_uuid` (a literal + # string default of a dataclass field) while :source_doc: points at the + # consumer module that only reads `self.config.uuid_target`. + - name: python_dataclass_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/" + field_regex: "field\\(default=[\"']${tok}[\"']\\)" + search_root: "" # inferred from :source_doc: package root diff --git a/skills/shared/code-grounding-filters.md b/skills/shared/code-grounding-filters.md new file mode 100644 index 0000000..0ba0bbd --- /dev/null +++ b/skills/shared/code-grounding-filters.md @@ -0,0 +1,229 @@ +# Code-grounding filter schema + +Shared schema for the pluggable false-positive filter chain used by +`pharaoh-req-code-grounding-check` axis #5 (`backtick_symbol_in_source_doc`). + +The base skill ships three universal filters (TOML section header, file-path / +command-string, short-prose acronym). 
Everything else — language-specific +import syntax, CLI framework conventions, env-var globs, literal-default +heuristics — comes from the project's `.pharaoh/project/code-grounding-filters.yaml`. +A project without the YAML runs only the three universal filters; authors get +more signal but more false positives, and the skill stays usable in any +language. + +This file defines four parameterised **strategies**. The YAML lists which +strategies to enable and how to configure them. Strategies are the universal +shape (kebab transform, glob expansion, import-clause lookup, cross-file +default resolution); the parameters make them language-specific. + +## Schema + +```yaml +# .pharaoh/project/code-grounding-filters.yaml +filters: + - name: <human-readable identifier> + strategy: <one of: kebab_to_snake_or_pascal | + prefix_glob_expansion | + dotted_import_resolution | + cross_file_literal_default> + token_regex: <Python regex matched against the bare backtick-quoted token> + # per-strategy parameters (see below) +``` + +Filters run in declaration order; the first filter whose `token_regex` matches +AND whose strategy resolves the token short-circuits the chain. A token that +matches no filter and is not literally in `:source_doc:` fails axis #5. + +## Strategy 1 — `kebab_to_snake_or_pascal` + +Purpose: CREQ prose cites CLI flag or subcommand names in kebab form +(`--license-key`, `from-csv`) which CLI frameworks render from snake-case or +CamelCase identifiers in source. + +Parameters: + +| parameter | purpose | example | +|---|---|---| +| `morphology_prefixes` | list of CamelCase prefixes to also try (e.g. `["Opt"]` for Typer `TypeAlias` wrappers, `["Cmd"]` for some Clap patterns) | `["Opt"]` | +| `strip_leading` | characters to strip from the token before transform; empty list = no strip | `["--"]` | + +Resolution: strip leading chars from token, replace `-` with `_`, check if +resulting `snake_case` string is in source. 
Additionally for each +`morphology_prefix`, check if `<prefix><PascalCase-snake-converted>` is in +source. Any hit resolves. + +Example (Python / Typer): + +```yaml +- name: typer_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: ["Opt"] +``` + +Example (Rust / Clap — same strategy, no morphology prefix): + +```yaml +- name: clap_kebab + strategy: kebab_to_snake_or_pascal + token_regex: "^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$" + strip_leading: ["--"] + morphology_prefixes: [] +``` + +## Strategy 2 — `prefix_glob_expansion` + +Purpose: CREQ cites a family of identifiers sharing a prefix via a trailing +`*` (`JAMA_*`, `OPT_*`). + +Parameters: + +| parameter | purpose | example | +|---|---|---| +| `separator_character` | char or empty string before `*` to strip in addition to the `*` itself | `"_"` | + +Resolution: strip `*` and the `separator_character` suffix from the token, +compile regex `\b<prefix>[<separator>]?\w+\b`, search source. At least one +match resolves. + +Example: + +```yaml +- name: env_var_glob + strategy: prefix_glob_expansion + token_regex: "^[A-Z][A-Z0-9_]*_?\\*$" + separator_character: "_" +``` + +## Strategy 3 — `dotted_import_resolution` + +Purpose: CREQ cites an external symbol via a dotted path +(`rich.console.Console`, `std::fs::File`, `@nestjs/common:Injectable`) which +the source imports under a different surface form. + +Parameters: + +| parameter | purpose | example | +|---|---|---| +| `separator` | dotted-path separator in the CREQ token | `"."` | +| `import_patterns` | list of regex patterns that resolve the token, using `${mod}`, `${attr}`, `${tok}` placeholders | see below | + +Resolution: split the token on `separator` into `mod` (everything before the +last separator) and `attr` (last segment). Substitute placeholders into each +pattern, compile regex, search source. Any hit resolves. 
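A minimal sketch of this resolution step, assuming plain `re`-based matching. The function name and return shape are illustrative; the `${mod}` / `${attr}` / `${tok}` placeholder handling mirrors the schema above, with extracted segments escaped because they are literal text, not regex:

```python
import re

def resolve_dotted_import(token: str, source: str, separator: str,
                          import_patterns: list[str]) -> bool:
    """Split token on the separator, substitute placeholders, search source."""
    # mod = everything before the last separator, attr = last segment
    mod, _, attr = token.rpartition(separator)
    for pattern in import_patterns:
        # Placeholders carry literal identifiers, so escape them for regex use.
        regex = (pattern
                 .replace("${mod}", re.escape(mod))
                 .replace("${attr}", re.escape(attr))
                 .replace("${tok}", re.escape(token)))
        if re.search(regex, source):
            return True  # any hit resolves
    return False
```

With the Python patterns below, `resolve_dotted_import("rich.console.Console", "from rich.console import Console", ".", [...])` resolves via the first pattern.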
+ +Example (Python): + +```yaml +- name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" +``` + +Example (Rust): + +```yaml +- name: rust_use_clause + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w]*(::[\\w]+)+$" + separator: "::" + import_patterns: + - "use\\s+${tok}" + - "use\\s+${mod}::\\{[^}]*\\b${attr}\\b[^}]*\\}" + - "${tok}" +``` + +Example (TypeScript): + +```yaml +- name: ts_named_import + strategy: dotted_import_resolution + token_regex: "^@?[a-z][\\w/.-]*:[A-Z]\\w+$" + separator: ":" + import_patterns: + - "import\\s*\\{[^}]*\\b${attr}\\b[^}]*\\}\\s*from\\s*['\"]${mod}['\"]" +``` + +## Strategy 4 — `cross_file_literal_default` + +Purpose: CREQ cites a default-value literal (e.g. `"reqif_uuid"`) that lives +only in the config module, NOT in the consumer module named by `:source_doc:`. +Raised specifically for this flavour because the literal IS somewhere in the +project tree — just not in the cited file — and the review should distinguish +"token absent" from "token cited against the wrong file". + +Parameters: + +| parameter | purpose | example | +|---|---|---| +| `hint_dir_pattern` | directory glob where the defining module is expected to live; evidence names it when the heuristic fires | `"config/"` | +| `field_regex` | regex with `${tok}` placeholder, matched against the suspected defining file; captures the "this IS a default-value literal" signal | `field\\(default=["']${tok}["']\\)` | +| `search_root` | project subtree to scan for the defining module; default: inferred as the package root containing `:source_doc:` | `"src/"` | + +Resolution: scan `search_root`. For each file whose path matches +`hint_dir_pattern`, apply `field_regex` with `${tok}` substituted. 
If any +file matches, DO NOT resolve — instead, emit evidence +`"'${tok}' is a default-value literal; lives in <matched-file>. Cite the +consumer-side attribute or retarget :source_doc: to <matched-file>."` so the +author knows which remediation to apply. + +This is the one strategy that explicitly FAILS rather than resolves — because +the finding IS the value. A token that matches this strategy fails axis #5 +with actionable evidence; without this strategy, the same token would fail +with the generic `"token not in source_doc"` message and the author would not +know where to look. + +Example (Python dataclass `field(default=...)`): + +```yaml +- name: python_dataclass_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "config/" + field_regex: "field\\(default=[\"']${tok}[\"']\\)" +``` + +Example (Rust `serde(default = "...")`): + +```yaml +- name: rust_serde_default + strategy: cross_file_literal_default + token_regex: "^[a-z_][a-z0-9_]*$" + hint_dir_pattern: "src/config/" + field_regex: "#\\[serde\\(default\\s*=\\s*\"${tok}\"\\)\\]" +``` + +## Authoring + +A fresh project obtains the YAML via `pharaoh-tailor-code-grounding-filters`. +That skill scans the codebase, detects which CLI framework / config style is +in use, and proposes a filter set. Humans review and accept. + +Projects that want zero filters simply omit the YAML — the three universal +filters in the base skill cover the highest-signal false positives +(TOML section, file path, short prose) and the rest runs as plain substring +lookup in `:source_doc:`. + +## Running without tailoring + +Absent-or-empty `code-grounding-filters.yaml` is a legitimate configuration. +The base skill's three universal filters run unconditionally; nothing else +applies. The corpus gets stricter checking: tokens that would otherwise +resolve via language-specific conventions now fail axis #5 with +`"token not in source_doc"`. 
That is the correct default — the skill does +not guess the language / framework. + +## Strategy addition + +New strategies are added by extending this file and the base skill's +implementation. Do not encode project-specific semantics in strategy +parameters beyond the four listed above — projects that need a fundamentally +different resolution shape (e.g. macro expansion, cross-repo symbol lookup) +should land a new named strategy here rather than overloading one of the +existing four. diff --git a/skills/shared/diagram-conventions.schema.json b/skills/shared/diagram-conventions.schema.json new file mode 100644 index 0000000..f690730 --- /dev/null +++ b/skills/shared/diagram-conventions.schema.json @@ -0,0 +1,48 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://useblocks.com/pharaoh/diagram-conventions.schema.json", + "title": "Pharaoh diagram conventions (per-project tailoring)", + "description": "Schema for .pharaoh/project/diagram-conventions.yaml. Consumed by pharaoh-write-plan and every *-diagram-draft skill.", + "type": "object", + "additionalProperties": false, + "properties": { + "version": {"type": "integer", "const": 1}, + "renderer": { + "type": "string", + "enum": ["mermaid", "plantuml"], + "default": "mermaid", + "description": "Default renderer for all diagram blocks emitted by this project." + }, + "view_map": { + "type": "object", + "additionalProperties": { + "type": "string", + "description": "Skill name overriding the shared/diagram-view-selection.md default for a given view." + }, + "description": "Per-view override. Keys are view names (e.g. 'feat_arc_sta'); values are Pharaoh skill names (e.g. 'pharaoh-component-diagram-draft'). Missing keys fall back to the shared default." + }, + "dynamic_view_scenarios": { + "type": "array", + "items": {"type": "string"}, + "default": ["default"], + "description": "Scenarios for which dynamic views emit one diagram each. 
Example: ['success_path', 'error_handling', 'operation_lifecycle']." + }, + "stereotype_aliases": { + "type": "object", + "additionalProperties": {"type": "string"}, + "description": "Maps canonical names to renderer-specific stereotype tokens. Example: {'block': '<<block>>', 'component': '<<component>>'}." + }, + "element_count_max": { + "type": "integer", + "minimum": 1, + "default": 7, + "description": "Default maximum element count per diagram (mechanized axis in shared/checklists/diagram.md). Above this, diagram-review flags as over-decomposed." + }, + "safety_markers": { + "type": "object", + "additionalProperties": {"type": "string"}, + "description": "Project-specific safety classification → visual marker map. Example: {'ASIL_B': 'red', 'ASIL_D': 'red_bold'}." + } + }, + "required": ["version"] +} diff --git a/skills/shared/diagram-safe-labels.md b/skills/shared/diagram-safe-labels.md new file mode 100644 index 0000000..5e09c18 --- /dev/null +++ b/skills/shared/diagram-safe-labels.md @@ -0,0 +1,68 @@ +# Diagram safe-label rule + +Shared guidance for every Pharaoh skill that emits Mermaid or PlantUML diagram blocks (`pharaoh-feat-component-extract`, `pharaoh-feat-flow-extract`, and every `pharaoh-*-diagram-draft`). + +## Why this exists + +A dogfooding iteration shipped Mermaid sequence diagrams that passed `sphinx-build -nW --keep-going -b html` with zero warnings but rendered as `Syntax error in text` in the browser. sphinx-build treats the diagram body as an opaque literal and hands it to the Mermaid runtime at render time; any parse failure surfaces only when a human opens the page. + +Classic silent failure. The root cause on one diagram was a semicolon inside a sequence message label: + +```mermaid +sequenceDiagram + J->>J: filter by type; skip SET/Folder +``` + +Mermaid 11's sequence grammar treats `;` as a statement terminator. Parser sees `filter by type` as the message, then hits `skip SET/Folder` and fails with `Expecting SOLID_ARROW ... 
got NEWLINE`.
+
+## The rule
+
+Inside any emitted label (message labels, edge labels, node labels, participant aliases, notes), the following characters MUST be avoided or replaced:
+
+| Character | Where it breaks | Safe replacement |
+| ----------- | ---------------------------------------------------------------------------- | -------------------------------------------- |
+| `;` | Sequence message labels (Mermaid 11 statement terminator) | `,` or `·` or rephrase |
+| `\|` | Flowchart edge labels (Mermaid pipe-delimited edge label syntax) | `/` or the word "or" |
+| Backtick | Mermaid code-span in labels (lexer slices on the first pair) | Drop or use single quotes |
+| `"` (unescaped) | Mermaid label quoting; unmatched double-quote breaks the label parser | Use `#quot;` in node labels; rephrase or drop the quote in messages |
+| `->`, `-->`, `->>` inside label text | Mermaid arrow tokens; any occurrence in a label confuses the edge scanner | Use `to`, `leads to`, or `→` (UTF-8) |
+| Literal newline (`\n` intended as content) | Breaks statement boundaries in both Mermaid and PlantUML | Use `<br/>` (Mermaid) or `\n` escape (PlantUML) |
+| Leading/trailing whitespace in labels | PlantUML trims, Mermaid preserves — inconsistent across renderers | Trim before emitting |
+
+Additional rules for sequence diagrams:
+
+- Participant IDs must be valid identifiers: `[A-Za-z_][A-Za-z0-9_]*`. File paths like `csv/export.py` are NOT valid IDs — use an alias: `participant Export as csv/export.py`.
+- Message labels should be a single short phrase. If you need a subordinate clause, use a `note over`/`note right of` block instead of stuffing it into the arrow label.
+
+Additional rules for flowcharts (`graph` / `flowchart`):
+
+- Edge labels wrapped in `|...|` cannot themselves contain `|`. Use quoted-label form `-- "label with pipe" -->` if you genuinely need one.
+- Node IDs follow the same identifier rule as participants. Put path-shaped or symbol-shaped content in the `[label]`, not the ID.
+ +Additional rules for PlantUML: + +- Participant/entity names with spaces or punctuation MUST be double-quoted: `participant "My Service" as MS`. +- Don't use `@startuml`/`@enduml` more than once per block. + +## How to apply + +Every diagram-emitting skill MUST sanitise labels BEFORE emitting the block. The minimum viable sanitiser: + +1. For every string that will land inside a label position (message labels, edge labels, node labels, participant aliases, notes): + - Replace `;` with `,`. + - Replace `|` with `/`. + - Strip backticks. + - Escape unescaped `"` per the renderer's conventions. + - Replace literal newlines with `<br/>` (Mermaid) or `\n` (PlantUML). + - Trim leading and trailing whitespace. +2. For every ID position (participant id, node id), verify the string matches `[A-Za-z_][A-Za-z0-9_]*`. If not, generate an alias (e.g. `P1`, `P2`, ...) and emit `as <original>` if the renderer supports an alias clause. + +## Parser validation is the only truth + +Sanitisation catches the known classes above. It does not catch everything the parser rejects (diagram-specific syntax rules evolve with Mermaid / PlantUML versions). The only reliable end-to-end check is to run the emitted block through the real parser. + +The `pharaoh-diagram-lint` atomic skill does this batch-style over an emitted docs tree. Every diagram-emitting skill's reward criteria SHOULD include a per-emitted-block `mmdc -i tmp.mmd -o /dev/null` (Mermaid) or `plantuml -checkonly tmp.puml` (PlantUML) pass. Sanitisation without parser validation is a false sense of safety. + +## Why not move this into a helper? + +The sanitisation rules are short and diagram-specific. Hardcoding them inside each emitter's output path keeps the skill atomic (Criterion (e)) — no inter-skill call graph, no shared runtime dependency. If the ruleset grows (e.g. a third renderer lands), extract into a per-language helper then. 
diff --git a/skills/shared/diagram-tailoring.md b/skills/shared/diagram-tailoring.md
new file mode 100644
index 0000000..1a2e119
--- /dev/null
+++ b/skills/shared/diagram-tailoring.md
@@ -0,0 +1,93 @@
+# Diagram tailoring — shared contract for all `pharaoh-*-diagram-draft` skills
+
+Every atomic diagram skill (`pharaoh-component-diagram-draft`, `pharaoh-sequence-diagram-draft`, `pharaoh-class-diagram-draft`, `pharaoh-state-diagram-draft`, future additions) reads renderer and styling config from the consumer project's `pharaoh.toml`. This document is the single source of truth for how that config is shaped and resolved. Every diagram SKILL.md references this file instead of re-declaring the rules.
+
+## Renderer resolution
+
+The renderer choice is **tailored**, not hardcoded. Each skill resolves the renderer in this order:
+
+1. `renderer_override` input parameter (explicit per-call override)
+2. `pharaoh.toml` → `[pharaoh.diagrams.<type>].renderer` (per-diagram-type override)
+3. `pharaoh.toml` → `[pharaoh.diagrams].renderer` (project-wide default)
+4. Built-in fallback: `"mermaid"` (no JRE dependency, easiest first-time setup)
+
+Per-type override wins over project-wide: a project can set `[pharaoh.diagrams].renderer = "mermaid"` globally but `[pharaoh.diagrams.sequence].renderer = "plantuml"` because PlantUML sequence diagrams are richer.
+
+Supported renderers (initial set): `"mermaid"`, `"plantuml"`. Adding a third (e.g. D2, Graphviz) is a matter of adding one branch per skill, never a rewrite.
+ +## Which diagram types a project requires + +A project declares which diagram types it wants in `pharaoh.toml`: + +```toml +[pharaoh.diagrams] +renderer = "mermaid" +required = ["component", "sequence"] # optional: gates pharaoh:mece or CI + +# Optional per-type overrides: +[pharaoh.diagrams.component] +direction = "TB" + +[pharaoh.diagrams.sequence] +renderer = "plantuml" + +[pharaoh.diagrams.state] +# state-specific tailoring +``` + +`required` is consulted by `pharaoh-mece` / future CI skills to flag missing diagrams for features or modules. Individual diagram skills do NOT consult `required` — they emit exactly what the caller asks for. + +## Node styling by need type + +Optional. Lives in `pharaoh.toml`: + +```toml +[pharaoh.diagrams.type_styles] +feat = { shape = "stadium", color = "#4ECDC4" } +comp_req = { shape = "rect", color = "#BFD8D2" } +arch = { shape = "hexagon", color = "#F7B2B7" } +``` + +Each diagram skill that emits typed nodes looks up the corresponding entry and applies the renderer-specific equivalent. If absent → renderer default. + +## Layout drift + +Known, accepted. Mermaid's layout engine can produce non-identical output on identical input across versions, causing noisy diffs in VCS. **On this iteration we do not canonicalize output.** Users who need stable diagrams should (a) pin the renderer version, (b) commit the rendered image alongside the RST, or (c) live with diff noise. A future `pharaoh-diagram-canonicalize` skill may address this. + +## check → propose → confirm (tailoring-aware skills) + +Any diagram skill invoked on a project where `[pharaoh.diagrams]` is absent MUST follow the shared `check → propose → confirm` pattern (see `shared/data-access.md` for data-access flavor of the same pattern). Concretely: + +1. Skill reads `pharaoh.toml`. +2. 
If `[pharaoh.diagrams]` is missing → emit a structured proposal in the output (not plain prose), including the default renderer choice, any per-type defaults, and a prompt for confirmation. +3. Caller (human or outer LLM) either confirms or rejects. +4. On confirm: the caller (or `pharaoh-tailor-fill`) patches `pharaoh.toml`, then re-invokes the skill with the now-present config. +5. The skill never silently picks a default on first run — the default is only "silent" on runs after the user has confirmed and the config is committed. + +Parameter name for this: `on_missing_config: "fail" | "prompt" | "use_default"`, default `"prompt"`. + +## Edge-handling across diagram types + +Edges derive from sphinx-needs link options (`:links:`, `:satisfies:`, `:verifies:`, tailored extra links). When an edge endpoint is outside the diagram's `scope_ids`, behavior depends on diagram kind: + +**Component diagrams** (`pharaoh-component-diagram-draft`): **ghost node by default.** Rendered with a distinct visual marker (dashed outline, muted color, "external" stereotype) that unambiguously separates "our scope" from "external dependencies." The diagram retains the trace information instead of hiding it. Callers who want a clean scope-only view opt out via `ghost_nodes = false` → warn + drop. + +**Class diagrams** (`pharaoh-class-diagram-draft`): **FAIL on dangling.** A class diagram is a closed type model; an edge to a class not in `classes` is a caller error, not an external dependency to hint at. The caller must either add the class or remove the relationship. + +**Sequence diagrams** (`pharaoh-sequence-diagram-draft`): **FAIL on dangling.** Every `from`/`to` in `messages` MUST reference an id in `participants`. External actors should be explicitly declared as `kind = "external"` participants. + +**State diagrams** (`pharaoh-state-diagram-draft`): **FAIL on dangling.** Every transition endpoint MUST be a state in `states` (or `[*]`). 
A state machine with a transition to an undeclared state is an incomplete model, not an external dependency. + +Always log a warning in addition to the default action so diffs surface the fact that "the full architecture is richer than this view" or "the caller passed an incomplete model." + +Per-diagram SKILL.md re-states its dangling-edge contract to avoid ambiguity; this section is the reason WHY the contracts differ. + +## What every per-type SKILL.md still has to specify + +- (a) indivisibility: one diagram per call, one diagram kind, no multi-kind bundling +- (b) input contract: diagram-specific (participants+messages for sequence; classes+relationships for class; states+transitions for state; nodes+edges for component) +- (c) reward: deterministic fixture with diagram-specific validity checks (e.g. sequence must have every participant appear in at least one message; state must have exactly one initial state and at least one terminal) +- (d) reusable across projects +- (e) composable — one skill emits one block, orchestration is a caller's problem + +This shared doc covers only the CROSS-CUTTING concerns (renderer, tailoring, edge handling). Per-type semantics always live in the per-type SKILL.md. diff --git a/skills/shared/diagram-view-selection.md b/skills/shared/diagram-view-selection.md new file mode 100644 index 0000000..45eea41 --- /dev/null +++ b/skills/shared/diagram-view-selection.md @@ -0,0 +1,79 @@ +# Diagram view → diagram type selection + +Shared reference mapping architectural views to diagram types, consumed by `pharaoh-write-plan` when emitting diagram tasks in a plan DAG. + +Defaults below are generic. Projects override via `.pharaoh/project/diagram-conventions.yaml`. Any project not overriding picks up the defaults. 
+ +## Default view → diagram-type map + +| View (architectural intent) | Default diagram skill | Notes | +| ---------------------------------------------- | -------------------------------------------- | ------------------------------------------------------------- | +| Use case (feat level, user-facing) | `pharaoh-use-case-diagram-draft` | Actors + system boundary + use cases | +| Feature static (feat level, building blocks) | `pharaoh-component-diagram-draft` | Logical components and interfaces | +| Feature dynamic (feat level, interaction) | `pharaoh-sequence-diagram-draft` (+ `pharaoh-activity-diagram-draft` for error flows) | Scenario coverage knob in sequence-draft | +| Feature interface | `pharaoh-component-diagram-draft` | With logical interface stereotypes | +| Component static (comp level, white-box) | `pharaoh-class-diagram-draft` OR `pharaoh-component-diagram-draft` | Class for OO languages; component for services | +| Component dynamic | `pharaoh-sequence-diagram-draft` | Internal interactions | +| State behaviour | `pharaoh-state-diagram-draft` | Per stateful component | +| Deployment | `pharaoh-deployment-diagram-draft` | Physical nodes + artefacts | +| Fault analysis (FTA) | `pharaoh-fault-tree-diagram-draft` | Top event + gates | +| Reverse-engineered static | `pharaoh-feat-component-extract` | Automatic — import graph | +| Reverse-engineered dynamic (call graph) | `pharaoh-feat-flow-extract` | Automatic — call graph | + +## Scenario coverage + +Dynamic views (`feat_arc_dyn`, `comp_arc_dyn`) may require multiple diagrams covering different scenarios. Projects declare the scenarios in `.pharaoh/project/diagram-conventions.yaml` under `dynamic_view_scenarios`. Defaults: + +```yaml +dynamic_view_scenarios: + - default +``` + +Single diagram per dynamic view. 
Projects requiring fuller coverage override, e.g.: + +```yaml +dynamic_view_scenarios: + - success_path + - error_handling + - operation_lifecycle # start-up, shut-down + - critical_interface +``` + +`pharaoh-sequence-diagram-draft` and `pharaoh-feat-flow-extract` accept a `scenarios: [...]` input. `pharaoh-write-plan` foreach-expands per scenario when emitting dynamic-view tasks. + +## Renderer selection + +Default renderer is `mermaid`. Projects override via `.pharaoh/project/diagram-conventions.yaml > renderer`: + +```yaml +renderer: plantuml # or: mermaid +``` + +All `*-diagram-draft` skills accept `renderer` as input. If unspecified, they read it from the tailoring file. + +## Stereotype aliases + +Projects using SysML-style stereotypes override the defaults. Example: + +```yaml +stereotype_aliases: + block: "<<block>>" + component: "<<component>>" + interface: "<<interface>>" +``` + +Draft skills consult the alias map when emitting. Defaults: plain UML stereotypes as listed. + +## Safety and security markers + +Not in the base. Projects with safety/security concerns add axes via tailoring (Score's ASIL markers are an example — see `.pharaoh/project/diagram-conventions.yaml` + `.pharaoh/project/checklists/diagram.md`). + +## How plans use this + +`pharaoh-write-plan` reads `.pharaoh/project/diagram-conventions.yaml` at plan-write time. For every artefact in the plan that requires a diagram view, it: +1. Looks up `diagram_type` from this mapping (or from project override). +2. Inserts a task invoking the mapped `*-diagram-draft` skill. +3. For dynamic views, foreach-expands over `dynamic_view_scenarios`. +4. Inserts a `pharaoh-diagram-review` task depending on each diagram-draft task. + +Review always follows draft — see `shared/self-review-invariant.md`. 
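The plan-time steps above can be sketched as follows. The `DEFAULT_VIEW_MAP` subset and the task dict shape are assumptions of this sketch, not the real plan schema; the foreach-expansion over scenarios and the draft-then-review pairing follow the steps as written:

```python
DEFAULT_VIEW_MAP = {
    "feat_arc_sta": "pharaoh-component-diagram-draft",
    "feat_arc_dyn": "pharaoh-sequence-diagram-draft",
    "comp_arc_dyn": "pharaoh-sequence-diagram-draft",
}
DYNAMIC_VIEWS = {"feat_arc_dyn", "comp_arc_dyn"}

def expand_diagram_tasks(view: str, conventions: dict) -> list[dict]:
    # 1. project override (view_map) wins over the shared default map
    skill = conventions.get("view_map", {}).get(view, DEFAULT_VIEW_MAP[view])
    # 3. dynamic views foreach-expand over the declared scenarios
    scenarios = (conventions.get("dynamic_view_scenarios", ["default"])
                 if view in DYNAMIC_VIEWS else ["default"])
    tasks = []
    for sc in scenarios:
        # 2. one task per diagram invoking the mapped draft skill
        draft = {"skill": skill, "view": view, "scenario": sc}
        # 4. a review task depending on each draft task
        review = {"skill": "pharaoh-diagram-review", "depends_on": draft}
        tasks += [draft, review]
    return tasks
```

A dynamic view with two declared scenarios therefore yields four tasks: two drafts, each followed by its review.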
diff --git a/skills/shared/gate-enablement.md b/skills/shared/gate-enablement.md new file mode 100644 index 0000000..2ba5207 --- /dev/null +++ b/skills/shared/gate-enablement.md @@ -0,0 +1,48 @@ +# Gate enablement ladder + +Shared reference documenting the fixed five-step ladder that Pharaoh projects walk to move from "advisory everywhere" to "enforcing everywhere". Consumed by `pharaoh-gate-advisor` (which walks it mechanically and reports the next unmet step) and by `pharaoh-setup` / `pharaoh-bootstrap` (which ship step 1 enabled by default). Treated as documentation, not a skill. + +## The rule + +The ladder is fixed, ordered, and five steps. Projects advance one step at a time. Advancing more than one step per change makes failure-to-enable debugging ambiguous — if two flips land together and the build alarms, the project cannot cheaply tell which flip is to blame. + +| Step | Gate (TOML line) | Blocker (pre-work required) | Rationale | +|------|----------------------------------|----------------------------------------------------|--------------------------------------------------------------------------------------------------------------| +| 1 | `require_verification = true` | none — safe to enable now | Review skills (`pharaoh-req-review`, `pharaoh-arch-review`, etc.) are ship-ready and read-only. Enabling this catches every PARTIAL finding immediately at no implementation cost. Highest value, lowest cost. | +| 2 | `require_change_analysis = true` | `pharaoh-change` must be tailored for this project | The change-analysis skill needs the project's impact model (which needs touch which code, which tests cover which reqs). Flipping the flag before the model exists alarms every authoring task with no mitigation path. | +| 3 | `require_mece_on_release = true` | release-gate workflow exists | MECE checks run at release time, not mid-authoring. 
The flag is meaningful only when there's a release pipeline that actually invokes `pharaoh-quality-gate` and acts on its `pass`/`fail` verdict. | +| 4 | `codelinks.enabled = true` | source tree has codelink annotations | Enabling the flag activates a traceability view that grows as source gets annotated. On an unannotated source tree the view is empty; the flag does no harm, but it also signals nothing useful until annotations land. | +| 5 | `strictness = "enforcing"` | steps 1–4 all satisfied | The master flag. Flipping strictness to `enforcing` before the individual gates are on ships a gate that enforces nothing — then later flips of the individual gates cause surprise blocks because strictness was the only part already hot. | + +## Why this order + +The common alternative — "turn strictness enforcing first, then enable individual gates" — is what produced a dogfooding failure pattern. When strictness flipped to enforcing on day one, the individual gate flags were all `false`, so enforcing had zero teeth. Later, flipping one gate such as `require_change_analysis = true` immediately blocked every authoring task, because the other half of the gate (strictness) had been enforcing the whole time, just silently. + +Walking the ladder in value-cost order avoids that surprise: +- Step 1 has no cost and immediate value → enable immediately. +- Steps 2–4 each require concrete pre-work whose absence is easy to check (did you tailor `pharaoh-change`? do your releases invoke MECE? is your source annotated?). A project that CAN enable the step should; a project that CANNOT should ship the pre-work first. +- Step 5 is a synthesis step. Its enabling condition is "all four previous flags are already `true`". Flipping it is low risk at that point because every gate it governs has already been exercised independently. + +## Bootstrap defaults + +`pharaoh-setup` ships `require_verification = true` out of the box — step 1 of the ladder enabled by default. 
A fresh project that runs setup and then `pharaoh-gate-advisor` immediately receives step 2 as its next recommendation, not step 1. This is a deliberate reversal of the pre-2026-04-22 default where every gate shipped `false`; a shipped-as-advisory-everywhere config teaches the user that gates are optional cosmetic knobs, which is not the signal Pharaoh wants to send. + +All other flags still ship at their advisory defaults (`false` / `"advisory"`) because each one has pre-work that projects must clear before enabling is meaningful. Step 1 is the only step with no pre-work, so it is the only step safe to ship hot. + +## Per-mode interaction + +`pharaoh-setup` also classifies the project into `reverse-eng` / `greenfield` / `steady-state` modes (see `pharaoh-setup` Step 2a.bis). The mode table sets the three workflow flags' initial values. After 2026-04-22 the table looks like: + +| Mode | `require_change_analysis` | `require_verification` | `require_mece_on_release` | +|----------------|---------------------------|------------------------|---------------------------| +| `reverse-eng` | `false` | `true` | `false` | +| `greenfield` | `false` | `true` | `false` | +| `steady-state` | `true` | `true` | `true` | + +`require_verification = true` is now uniform across all three modes — the ladder says step 1 is safe everywhere, and the mode-specific defaults defer to that. Mode still differentiates the other two flags (reverse-eng and greenfield keep change-analysis and MECE off until the catalogue stabilises). + +## What this reference is NOT + +- NOT a gate itself. `pharaoh-quality-gate` runs invariants against the effects of the flags (review coverage, draft-lifecycle, id-convention, etc.). The ladder talks about WHICH FLAGS to turn on, not whether those flags' effects are passing. +- NOT a substitute for `pharaoh.toml`'s own documentation. Each key is documented in `pharaoh.toml.example` and `skills/shared/strictness.md`. 
This reference layers phasing on top of that documentation; read both. +- NOT a tailoring extension point. Projects do not fork the ladder; they ship rationale overrides via `tailoring.gate_advisor_rationale_overrides` if they want house-style blocker notes, but the ladder ORDER is fixed. If your project disagrees with the order, file an issue against this reference — forking it per-project silently defeats the shared-meaning-across-projects signal that the ladder is trying to carry. diff --git a/skills/shared/public-symbol-patterns.md b/skills/shared/public-symbol-patterns.md new file mode 100644 index 0000000..b200af5 --- /dev/null +++ b/skills/shared/public-symbol-patterns.md @@ -0,0 +1,41 @@ +# Public-symbol regex patterns per language + +Shared regex table consumed by every Pharaoh skill that needs to enumerate top-level public symbols in a source file — currently `pharaoh-req-from-code` (file → reqs, via `split_strategy: "top_level_symbols"`) and `pharaoh-api-coverage-check` (file classification as behavioral vs non-behavioral, plus raise-site extraction). Any future upgrade (tree-sitter, libclang, per-language AST) replaces this one table and both consumers benefit automatically. 
+ +## Table + +| language | extension globs | public symbol regex (named capture `name`) | private-prefix rule | +|------------|--------------------------------------|---------------------------------------------------------------------------------------------------------|----------------------------| +| python | `*.py` | `^(?:async\s+)?(?:def|class)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)\s*[(:]` | leading `_` | +| rust | `*.rs` | `^pub\s+(?:fn|struct|enum|trait)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)` | n/a (only `pub` items match)| +| typescript | `*.ts`, `*.tsx` | `^export\s+(?:async\s+)?(?:function|class|const|let|interface)\s+(?P<name>[A-Za-z_$][A-Za-z0-9_$]*)` | n/a (only `export` items match)| +| go | `*.go` | `^(?:func|type)\s+(?P<name>[A-Z][A-Za-z0-9_]*)` | lowercase first letter | +| c | `*.c`, `*.h` | `^[A-Za-z_][A-Za-z0-9_\s\*]*\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)\s*\([^)]*\)\s*\{?` | leading `_` or `static` | +| cpp | `*.cpp`, `*.hpp`, `*.cc`, `*.h` | `^(?:class|struct)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)` plus the C free-function regex for free fns | leading `_` or `private:` section | +| java | `*.java` | `^\s*public\s+(?:static\s+)?(?:class|interface|[A-Za-z_][A-Za-z0-9_<>]*)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)` | anything not declared `public` | + +## How consumer skills read the table + +1. The consumer is given (or infers) a `language` value (either explicit input or resolved from `file_path` extension against the globs column). +2. The consumer compiles the row's regex (Python `re.MULTILINE`, or an equivalent multi-line flag in whatever runtime). The pattern exposes one named capture group, `name`, which carries the public-symbol identifier. +3. The consumer runs the pattern over the source file's text and collects every `name` match. +4. The private-prefix rule is applied AFTER the match: matches whose `name` begins with the declared prefix (or otherwise satisfies the rule, e.g. 
lowercase-first-letter for Go, non-`public` for Java) are dropped from the public-symbol set. Rust and TypeScript rows do not need a post-filter because the regex itself already anchors on `pub` / `export`; the Go row's uppercase anchor likewise enforces its rule at match time, so its post-filter is a no-op.
+5. The consumer treats the resulting set as the canonical list of public symbols for that file.
+
+Language inference from `file_path` uses the globs column: the first row whose glob list contains the file's extension wins. If no row matches, the consumer fails with `unsupported language` — it does not silently fall back to a default.
+
+## MVP accuracy tradeoff
+
+These are line-anchored regexes, not language-aware parsers. Known false-positive pattern: any of the regex keywords appearing inside a comment or string literal at column 0 will be matched as if it were a real definition.
+
+```python
+# "class Foo" mentioned in a docstring at the start of a line matches the python row
+```
+
+```rust
+// pub fn example_in_doc_comment() — matches the rust row if the leading // is stripped by the host
+```
+
+For the coverage-gate use case this is acceptable: false-positive uncovered symbols show up as a small handful of ghost names that the author can manually dismiss, but false-negative uncovered symbols (a real public symbol that the regex misses) would silently hide API surface from the check. The regexes here are conservative in the direction that matters — they over-report candidates. They are not miss-proof, though: modifier forms the rows do not anchor (`pub async fn` in Rust, `export default` in TypeScript) are known gaps, to be closed by widening the affected row when such a miss surfaces.
+
+When AST-precision parsing lands (tree-sitter is the likely path), it replaces this file and both consumers inherit the upgrade. The consumer contract (named capture `name`, glob-based language resolution, post-filter for private rule) is designed to remain stable across that swap.
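The consumer contract (steps 1–5) can be sketched end-to-end for the python row. The row dict and function name below are illustrative, not part of either consumer skill's API; the regex is copied verbatim from the table.

```python
import re

# Row data copied from the python row of the table above.
ROW = {
    "globs": [".py"],
    "regex": r"^(?:async\s+)?(?:def|class)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)\s*[(:]",
    "private_prefix": "_",
}

def public_symbols(source: str) -> list[str]:
    """Return the canonical public-symbol list for one source file (step 5)."""
    pattern = re.compile(ROW["regex"], re.MULTILINE)              # step 2
    names = [m.group("name") for m in pattern.finditer(source)]   # step 3
    # Step 4: the private-prefix rule is applied AFTER the match.
    return [n for n in names if not n.startswith(ROW["private_prefix"])]
```

Note that the `^` anchor with `re.MULTILINE` restricts matches to column-0 definitions, so indented methods are excluded, exactly as the top-level-symbols contract requires.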
diff --git a/skills/shared/self-review-invariant.md b/skills/shared/self-review-invariant.md new file mode 100644 index 0000000..2c7c9c6 --- /dev/null +++ b/skills/shared/self-review-invariant.md @@ -0,0 +1,46 @@ +# Self-review invariant + +Shared invariant for every Pharaoh skill that emits an artefact (draft, extract, record, annotation). + +## The rule + +Before returning success, an emission skill MUST arrange for its matching review skill (see `shared/self-review-map.yaml`) to run against the emitted artefact. This applies in three operating contexts: + +1. **Invoked via a plan** (`pharaoh-execute-plan`): the plan DAG must contain a `*_review` task that depends on the `*_draft` task. `pharaoh-write-plan` enforces this when templating. +2. **Invoked ad-hoc outside a plan**: the emission skill's "Last step" instructs the caller (human or agent) to invoke the mapped review skill and attach the review JSON alongside the emitted artefact. +3. **Invoked by another orchestrator skill**: the orchestrator is responsible for calling the mapped review skill before returning success. + +In all three contexts, the `pharaoh-self-review-coverage-check` invariant (wired into `pharaoh-quality-gate`) detects violations after the fact. + +## Why it exists + +Reviewing drafts is possible via the per-artefact review skills (`pharaoh-req-review`, `pharaoh-arch-review`, `pharaoh-vplan-review`, `pharaoh-feat-review`, `pharaoh-fmea-review`, `pharaoh-decision-review`, `pharaoh-diagram-review`). But dogfooding surfaced that an LLM running a plan end-to-end routinely skips review tasks as a cost-saving improvisation. The invariant upgrades "review is available" to "review always runs and its absence is a gate failure". + +## How emission skills reference this + +Every emission skill's body includes a **Last step** section of the form: + +```markdown +## Last step + +After emitting the artefact, invoke `pharaoh-<TYPE>-review` on it. 
Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output as `review: <findings JSON>`. See [`shared/self-review-invariant.md`](../shared/self-review-invariant.md) for the rationale and enforcement mechanism. +``` + +with `<TYPE>` resolved via `shared/self-review-map.yaml`. + +## Failure closed vs failure open + +Self-review is **failure-closed** on critical findings. If the review emits any axis with `score: 0` (failing, per ISO 26262 §6 rubric) or explicit `severity: critical`, the emission skill returns a non-success status with the review findings verbatim. Non-critical findings (score ≥ 1 or informational) do not fail emission — they are attached to the output for the reviewer to act on. + +## Exclusions + +Skills that do NOT emit artefacts are not in scope: + +- Validators (`pharaoh-output-validate`, `pharaoh-diagram-lint`) +- Gates (`pharaoh-quality-gate`) +- Checks (`pharaoh-papyrus-non-empty-check`, `pharaoh-dispatch-signal-check`, `pharaoh-self-review-coverage-check`, `pharaoh-lifecycle-check`, `pharaoh-review-completeness`) +- Retrieval (`pharaoh-context-gather`, `papyrus-query`, `papyrus-drill`) +- ID allocation (`pharaoh-id-allocate`) +- The review skills themselves (they ARE the review) + +Shared references (not skills): `shared/diagram-safe-labels.md`, `shared/uml-relationship-semantics.md`, `shared/data-access.md`, `shared/strictness.md` — out of scope. diff --git a/skills/shared/self-review-map.yaml b/skills/shared/self-review-map.yaml new file mode 100644 index 0000000..c839a9d --- /dev/null +++ b/skills/shared/self-review-map.yaml @@ -0,0 +1,56 @@ +# Draft ↔ review skill mapping. +# Used by: +# - pharaoh-self-review-coverage-check (as fixture input) +# - pharaoh-write-plan (to insert *_review tasks after *_draft tasks) +# - shared/self-review-invariant.md (documentation) +# +# Keep in sync when adding a new emission skill. 
Adding an emission skill +# WITHOUT a corresponding review skill is a design failure — every +# artefact type must have a review atom. + +version: 1 + +map: + # Requirement emission + pharaoh-req-draft: pharaoh-req-review + pharaoh-req-from-code: + - pharaoh-req-review + - pharaoh-req-code-grounding-check + pharaoh-req-regenerate: pharaoh-req-review + + # Architecture emission + pharaoh-arch-draft: pharaoh-arch-review + + # Verification plan emission + pharaoh-vplan-draft: pharaoh-vplan-review + + # Feature emission + pharaoh-feat-draft-from-docs: pharaoh-feat-review + + # FMEA emission + pharaoh-fmea: pharaoh-fmea-review + + # Decision emission + pharaoh-decision-record: pharaoh-decision-review + + # Diagram emission — extractors + pharaoh-feat-component-extract: pharaoh-diagram-review + pharaoh-feat-flow-extract: pharaoh-diagram-review + + # Diagram emission — drafts (one entry per *-diagram-draft skill) + pharaoh-use-case-diagram-draft: pharaoh-diagram-review + pharaoh-sequence-diagram-draft: pharaoh-diagram-review + pharaoh-component-diagram-draft: pharaoh-diagram-review + pharaoh-class-diagram-draft: pharaoh-diagram-review + pharaoh-state-diagram-draft: pharaoh-diagram-review + pharaoh-activity-diagram-draft: pharaoh-diagram-review + pharaoh-block-diagram-draft: pharaoh-diagram-review + pharaoh-deployment-diagram-draft: pharaoh-diagram-review + pharaoh-fault-tree-diagram-draft: pharaoh-diagram-review + +# Skills intentionally NOT in the map (no artefact emission, no review needed): +# pharaoh-output-validate, pharaoh-diagram-lint, pharaoh-quality-gate, +# pharaoh-id-allocate, pharaoh-papyrus-non-empty-check, +# pharaoh-dispatch-signal-check, pharaoh-self-review-coverage-check, +# pharaoh-lifecycle-check, pharaoh-review-completeness, +# pharaoh-context-gather, pharaoh-toctree-emit, pharaoh-req-codelink-annotate. 
diff --git a/skills/shared/strictness.md b/skills/shared/strictness.md index 7b32961..88a290e 100644 --- a/skills/shared/strictness.md +++ b/skills/shared/strictness.md @@ -18,9 +18,9 @@ consult `.pharaoh/session.json`, and do **not** block on gate conditions. This k indivisible (atomicity criterion a) and composable into arbitrary flows.

**Orchestrator / composite skills** (`pharaoh:flow`, `pharaoh:audit-fanout`, -`pharaoh:reqs-from-module`, `pharaoh:plan`, `pharaoh:change`, `pharaoh:release`, and the -legacy top-level skills `pharaoh:decide`, `pharaoh:spec`, `pharaoh:mece`, `pharaoh:trace`, -`pharaoh:setup`) are responsible for: +`pharaoh:plan`, `pharaoh:change`, `pharaoh:release`, and the legacy top-level skills +`pharaoh:decide`, `pharaoh:spec`, `pharaoh:mece`, `pharaoh:trace`, `pharaoh:setup`) +are responsible for:

1. Reading `pharaoh.toml` and determining the strictness level (Section 1). 2. Reading `.pharaoh/session.json` to check whether gate prerequisites are met @@ -354,3 +354,15 @@ Warning: Skipping change analysis gate at user request. Workflow compliance is n ``` Then execute the skill normally. Do not update session state to indicate the prerequisite was met. +
+---
+
+## Required-link chains with empty targets
+
+`pharaoh.toml`'s `[pharaoh.traceability].required_links` declares chains like `"comp_req -> test"`. If the target type (`test`) has zero needs in the project — because no test case has been authored yet — a naive check flags 100% of `comp_req` needs as unverified. That is a false alarm, not signal.
+
+**Rule for `pharaoh-mece` and `pharaoh-coverage-gap`:** treat chains whose target type has zero declared needs as **inactive**. Warn once at project init (`"chain <chain> is declared but target type <type> has no needs yet — chain is inactive"`), but do not flag individual source-type needs as unverified. As soon as the first need of the target type lands, the chain becomes active and starts flagging.
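The inactive-chain rule can be sketched as follows, assuming `needs.json` has already been loaded into need dicts carrying a `type` key; the function name is illustrative, not a skill API.

```python
def active_chains(required_links: list[str], needs: list[dict]) -> list[str]:
    """Return only the chains whose target type has at least one declared need."""
    declared_types = {n["type"] for n in needs}
    active = []
    for chain in required_links:
        # Chain strings follow the "comp_req -> test" form.
        source_type, target_type = (part.strip() for part in chain.split("->"))
        if target_type in declared_types:
            active.append(chain)
        # else: chain is inactive (warn once at init; flag no individual needs)
    return active
```

As soon as the first need of the target type lands, the chain appears in the returned list and the coverage check starts flagging, exactly as the rule above states.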
+
+**Rule for `pharaoh-quality-gate` defaults:** the `unverified_rate_max` threshold defaults to `1.00` (inactive gate) — projects opt into stricter thresholds as their test corpus grows. The quality-gate evaluator skips any threshold whose observed rate corresponds to an inactive chain.
+
+**Reason this rule exists:** an observed dogfooding project declared a `test` type and a `verifies` link before any test cases were written. Day-one `pharaoh-mece` would have flagged every `comp_req` need as unverified — noise, not signal. The fix ships in two parts: `pharaoh-bootstrap` stops declaring speculative types (only types with immediate use), and `pharaoh.toml` generation in `pharaoh-setup` filters out chains whose target type is not declared. The inactive-chain rule above protects projects that legitimately have the target type declared but no needs of that type yet. diff --git a/skills/shared/tailoring-access.md b/skills/shared/tailoring-access.md new file mode 100644 index 0000000..4a683e9 --- /dev/null +++ b/skills/shared/tailoring-access.md @@ -0,0 +1,21 @@ +# tailoring-access
+
+Shared helper module documenting how Pharaoh skills resolve project tailoring paths. Referenced by `pharaoh-req-draft`, `pharaoh-req-regenerate`, `pharaoh-arch-draft`, `pharaoh-vplan-draft`, and any other skill whose input includes `tailoring_path`.
+
+## Resolution order
+
+Given a `tailoring_path` input (typically the absolute path to `.pharaoh/project/`):
+
+1. **Artefact catalogue**: read `<tailoring_path>/artefact-catalog.yaml`. Contains `required_fields`, `optional_fields`, `child_of`, and `lifecycle_ref` per declared type. Produced by `pharaoh-tailor-bootstrap` or hand-authored.
+2. **ID conventions**: read `<tailoring_path>/id-conventions.yaml`. Contains `prefixes` (directive → prefix map), `id_regex` (validation regex), `separator`. Produced by `pharaoh-tailor-bootstrap`.
+3. **Workflows**: read `<tailoring_path>/workflows.yaml`.
Per-type state machine (`states`, `transitions`, `initial`, `final`). Produced by `pharaoh-tailor-bootstrap`.
+4. **Checklists**: read `<tailoring_path>/checklists/<directive>.md` for per-type checklists. Use `<tailoring_path>/checklists/requirement.md` as the canonical-alias filename when the caller wants "the" requirement checklist without knowing the project's directive name (alias is emitted by `pharaoh-tailor-bootstrap`).
+5. **Project-level config** (outside tailoring path, but same resolution family): `<project_root>/ubproject.toml` defines `[[needs.types]]` and `[[needs.extra_links]]` used upstream by `pharaoh-tailor-bootstrap`. `<project_root>/pharaoh.toml` carries the strictness setting read by orchestrators (see `shared/strictness.md`).
+
+## Fallback behaviour
+
+A missing file is not automatically an error — each calling skill documents whether it requires the file strictly or falls back to built-in defaults. Skills that operate without tailoring MUST emit a `"note"` in their output naming the missing file and the default applied, so the caller can tell tailored runs apart from fallback runs.
+
+## Non-goals
+
+This helper does not read or parse the files; it only documents resolution order. Each skill does its own YAML / TOML parse and validation. diff --git a/skills/shared/uml-relationship-semantics.md b/skills/shared/uml-relationship-semantics.md new file mode 100644 index 0000000..869658f --- /dev/null +++ b/skills/shared/uml-relationship-semantics.md @@ -0,0 +1,65 @@ +# UML relationship semantics cheatsheet
+
+Shared reference for diagram-emitting skills that deal with structural UML diagrams: `pharaoh-class-diagram-draft`, `pharaoh-component-diagram-draft`, `pharaoh-block-diagram-draft`, `pharaoh-feat-component-extract`.
+
+LLMs emitting UML diagrams generally get the syntax right (Mermaid and PlantUML are both well represented in training data).
They generally get the semantics wrong — every relationship becomes a generic `-->` arrow regardless of whether the concept is composition, aggregation, or a plain use dependency. This cheatsheet gives the minimum rules. + +## Relationship types and when to use which + +### Composition (strong ownership) +- **Semantics:** The whole OWNS the part. If the whole is destroyed, the part ceases to exist. One part belongs to exactly one whole. +- **Mermaid:** `Whole *-- Part` +- **PlantUML:** `Whole *-- Part` +- **Examples:** Car ↔ Engine (engine lives and dies with the car instance). File system Directory ↔ File entry. + +### Aggregation (weak ownership) +- **Semantics:** The whole has a reference to the part but does not own it. Part can outlive the whole. Shared references allowed. +- **Mermaid:** `Whole o-- Part` +- **PlantUML:** `Whole o-- Part` +- **Examples:** University Department ↔ Professor (professor can switch departments). Playlist ↔ Song. + +### Association (uses / knows about) +- **Semantics:** A knows about B, can call its methods. No ownership. No lifetime coupling. +- **Mermaid:** `A --> B` +- **PlantUML:** `A --> B` or `A -- B` +- **Examples:** OrderProcessor → PaymentGateway. JamaClient → HttpClient. + +### Dependency (transient use) +- **Semantics:** A needs B at some point (parameter, local variable, return type) but does not keep a long-term reference. +- **Mermaid:** `A ..> B` +- **PlantUML:** `A ..> B` +- **Examples:** ReportGenerator ..> PdfSerializer (used once per report). A function taking a logger as a parameter. + +### Inheritance (generalization) +- **Semantics:** Subtype is a kind of Supertype. Liskov-substitutable. +- **Mermaid:** `Subtype --|> Supertype` +- **PlantUML:** `Subtype --|> Supertype` (or `Supertype <|-- Subtype`) +- **Examples:** Dog --|> Animal. HTTPError --|> Exception. + +### Interface realization (implementation) +- **Semantics:** Concrete class implements abstract interface contract. 
+- **Mermaid:** `Class ..|> Interface` +- **PlantUML:** `Class ..|> Interface` (or `Interface <|.. Class`) +- **Examples:** ArrayList ..|> List. JamaClient ..|> IRemoteClient. + +## Decision matrix + +| If the source ... | Use | +| ------------------------------------------------------- | -------------- | +| owns the target and its lifetime is bound to the source | Composition | +| has a long-lived reference to the target, which is shared | Aggregation | +| keeps a long-lived reference to the target | Association | +| uses the target transiently (parameter, local) | Dependency | +| IS a kind of the target | Inheritance | +| implements the target's contract | Interface realization | + +## Common mistakes to avoid + +1. **Every arrow as `-->`** — this defaults everything to association, losing ownership / dependency distinctions. Draft skills MUST choose per above. +2. **Composition vs aggregation flipped** — composition is STRONGER (filled diamond, `*--`). Aggregation is WEAKER (hollow diamond, `o--`). "Whole owns part and shares the part's lifetime" is composition. "Whole just references the part" is aggregation. +3. **Inheritance for HAS-A** — inheritance means IS-A. If the source has-a, use composition/aggregation/association, not inheritance. +4. **Mixing directions in PlantUML** — both `Subtype --|> Supertype` and `Supertype <|-- Subtype` are legal in PlantUML. Pick one direction convention per diagram and stick to it. + +## Scope + +This cheatsheet is for STRUCTURAL UML diagrams only. Sequence, state, activity, use-case, deployment, and fault-tree diagrams have their own grammars — see each draft skill's body for per-type guidance.
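The decision matrix lends itself to a mechanical encoding. A hypothetical sketch: the evidence flags are invented names, and the returned strings are the Mermaid forms listed per relationship type above.

```python
def mermaid_arrow(*, owns=False, lifetime_bound=False, shared=False,
                  transient=False, is_a=False, implements=False) -> str:
    """Map relationship evidence to the Mermaid arrow, per the decision matrix."""
    if is_a:
        return "--|>"   # inheritance: Subtype --|> Supertype (IS-A, never HAS-A)
    if implements:
        return "..|>"   # interface realization: Class ..|> Interface
    if transient:
        return "..>"    # dependency: parameter / local variable, no kept reference
    if owns and lifetime_bound:
        return "*--"    # composition: filled diamond, part dies with the whole
    if shared:
        return "o--"    # aggregation: hollow diamond, part can outlive the whole
    return "-->"        # association: long-lived reference, no ownership
```

The ordering of the checks encodes the cheatsheet's priorities: IS-A and contract implementation first, then lifetime/ownership distinctions, with plain association as the deliberate fallback rather than the default for every arrow.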