From 14ac78c9ee6c8b84a42f554ca6a9f86f99c33ed8 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 12:18:09 +0200 Subject: [PATCH 01/32] Sync planning docs after FE-847 restack Amp-Thread-ID: https://ampcode.com/threads/T-019eb2e2-5c62-7388-8691-f8e04d4b6e50 Co-authored-by: Amp --- HANDOFF.md | 62 +++++++++++++--------------------------- memory/CROSS_CUT_PLAN.md | 48 +++++++++++++++++-------------- memory/PLAN.md | 28 +++++++++--------- memory/SPEC.md | 2 +- 4 files changed, 61 insertions(+), 79 deletions(-) diff --git a/HANDOFF.md b/HANDOFF.md index f76f78f0..43d5d89c 100644 --- a/HANDOFF.md +++ b/HANDOFF.md @@ -1,52 +1,30 @@ # Handoff -> Updated 2026-06-10 after `ln-sync`. Volatile transfer state only. Delete or overwrite once the FE-847 scaffold is laid, or if `memory/SPEC.md` / `memory/PLAN.md` remain sufficient for re-entry. +> Updated 2026-06-11 after branch restack and `ln-sync`. Volatile transfer state only. Overwrite when it stops helping; canonical truth remains `memory/SPEC.md` and `memory/PLAN.md`. -## Canonical state +## Current Branch State -- `memory/SPEC.md` now owns D76-L–D78-L and I45-L–I47-L for the turn-boundary choreography layer. -- `memory/PLAN.md` now owns the frontier split: - - `dx-tier-2-harness` (FE-847, active): thin Tier-2 DX chassis + coverage-first scaffold only. - - `turn-boundary-reconciliation` (M7, next): assistant-visible watermark projection, `prepareNextTurn` reconciler / `worldUpdate`, submit-time mention ledger + staleness. - - `kick-and-context-seeding` (next): honest assistant origination via `session.triggerExchange` plus boot/resume context seeding. -- **Branch decision (user, 2026-06-10): all of S0–S5 build on the single `ln/fe-847-dx-introspection-tier-2` branch under FE-847.** The three groupings stay distinct planning units (seams + traceability) but execute as sequential slices — no separate Linear issues / Graphite branches. PLAN reconciled to match (Linear/Branch lines, sequencing, dependency-edge notes). +- Current branch: `ln/fe-847-turn-boundary-closure` +- Parent branch: `ln/fe-848-prompt-context-refine` +- `dx-tier-2-harness` is complete on `ln/fe-847-dx-introspection-tier-2`. +- Remaining FE-847 product closures stay together on this successor branch by the 2026-06-11 branch-mechanics override; no new Linear issue or frontier split was introduced. -## Sync notes +## Canonical State -- Approved the PLAN split: keep FE-847 to S0 chassis/scaffold; keep kick+seeding a distinct planning unit from M7 reconciliation (but same FE-847 branch per the decision above). -- Tightened topology sync in `src/session/README.md`: the write-side is planned, not already implemented. -- Removed stale dependency-graph horizon residue for `turn-boundary-reconciliation` after it moved to Next. +- `memory/SPEC.md` owns D76-L–D78-L / I45-L–I47-L for turn-boundary choreography and now also states the landed `dx-introspection-live` outcome rather than describing it as a future follow-on. +- `memory/PLAN.md` now matches reality across all repeated summaries: + - `dx-introspection-live` is done. + - `dx-tier-2-harness` is done. + - `turn-boundary-reconciliation` is the active FE-847 closure frontier. + - `kick-and-context-seeding` remains next on the same successor FE-847 branch. + - The original single-chain FE-847 execution decision is preserved historically, but every live branch reference now reflects the later split across `ln/fe-847-dx-introspection-tier-2` and `ln/fe-847-turn-boundary-closure`. -## Oracle pre-build review (2026-06-10) +## Remaining Builder Entries -Endorsed the architecture; four hazards folded into SPEC (D76–D78, I45–I47, coverage rows, lexicon): +- `memory/cards/turn-boundary-reconciliation--continuity-chain.md` closes the current frontier by replacing the remaining Tier-2 I45/I47 scaffold with live submit-path and compaction/resume proof. +- `memory/cards/kick-and-context-seeding--honest-origination.md` follows to close I46/I47 through real boot/resume origination proof. -1. **same-session capture** — `worldUpdate` covers any not-yet-visible write incl. submit-time/freestyle capture (D18-L/D66-L), not only foreign writes. -2. **kick = conversational-debt classification** (ignore trailing continuity-only entries) → idempotent reboot-after-notice. -3. **compaction preserves the watermark carrier** so projection never regresses. -4. **guard-as-retry** — `before_provider_request` re-runs prepare once on drift, never writes; reconciler runs before prompt composition. +## Verification Baseline -Plus: S1 = separate watermark projection (not `runtimeState.world.latestLsn` overload). Optional S2 split (S2a watermark+reconciler+worldUpdate / S2b adapter stamping + drains) deferred to `ln-scope`. - -## Oracle final pre-scope review (2026-06-10) - -Verdict: **ready to scope**, S0→S1→S2→S3→S4 sequencing on one branch sound, no reorder forced. One seam tightening + two scaffold-authoring guards folded in: - -- **D78-L / I46-L** — resume-debt ignore set now explicitly includes reconciler-inserted **side-task & reviewer drains** (D15-L), not just seed / `worldUpdate` / `brunch.mention*` / `brunch.session_lifecycle`. Generalized to "any reconciler-inserted notice owing no assistant continuation." Closes an I46 fixture ambiguity where a persisted side-task notice could be misread as tail debt. -- **S0 scaffold (PLAN)** — stub **one shared continuity-entry classifier** (`isWatermarkCarrier` / `isContinuityOnlyNonDebtEntry`) so S1/S2 and S4 share one carrier/continuity-only/debt taxonomy instead of duplicating lists. -- **S0 scaffold (PLAN)** — assert `worldUpdate.items` / watermark / kick as **sets and `{specId, lsn}` properties, not payload-order goldens** (no canonical item sort specified) to keep the suite deterministic. - -## Next step - -Run `ln-scope` for `dx-tier-2-harness` (FE-847): real `runBrunchTui` boot + one faux turn + provider-payload/transcript oracle + fixture resume, then lay skipped scaffold tests and intentional topology stubs. - -Scaffold must preserve these edge cases (now 7, post-oracle): - -1. seed/full-overview snapshots advance the watermark; narrow `getNodes` / `queryNodes` reads do not -2. no redundant `worldUpdate` immediately after a seed that named the current snapshot LSN -3. resume kick uses latest-conversational-debt (ignoring trailing continuity entries), so a user tail still earns a kick after reconciler-inserted notices -4. crash-after-notice-before-provider reboot still kicks when debt is unanswered (idempotent) -5. same-session capture bumps `current_lsn` and is surfaced by next `worldUpdate` (not swallowed) -6. foreign write between snapshot read and seed insertion is not masked by the seed -7. compaction+resume preserves the watermark (no spurious `worldUpdate`) -8. a trailing reconciler-inserted side-task / reviewer drain is ignored by kick classification (owes no continuation), so it neither masks an older user/assistant debt nor manufactures a kick over a satisfied leaf +- The last FE-847 builder report for the refactor/closure stack ended with `npm run verify` passing. +- This sync pass changes docs/planning state only. diff --git a/memory/CROSS_CUT_PLAN.md b/memory/CROSS_CUT_PLAN.md index c19c28c2..54758788 100644 --- a/memory/CROSS_CUT_PLAN.md +++ b/memory/CROSS_CUT_PLAN.md @@ -158,11 +158,11 @@ built. Ordered by leverage. Clean split: **runtime-state = the frame/constraints (user/system-set)**; **emitted facets = what the agent did this turn (AUTO choice)**. - *User-mutable posture axes (for now):* `op_mode` (user/system), `strategy`, `lens`. - **`goal` is NOT user-mutable** — too contingent; kept **internal/grade-derived** - (D59-L grade-derived objective) and out of the posture-change command surface for + **`goal` is NOT user-mutable** — too contingent; kept **internal/readiness-derived** + (D59-L / D74-L gap-driven objective) and out of the posture-change command surface for now. - *On-parent-switch reducer default → AUTO* for the children it governs (strategy/lens); - goal is grade-derived regardless. + goal is readiness-derived regardless. - *`source: 'agent'` reserved:* the enum keeps it, but no current path emits it; parked for a future execute-mode orchestrator that might legitimately steer sub-postures. Do not wire an agent switch by default. @@ -176,17 +176,19 @@ built. Ordered by leverage. resolved by pure projection); **no xstate, no persisted machine** for now. - *Real underlying need = UI affordances, not a truth machine.* The motivation was a **reducer** for (a) default-assignment when a parent state changes (switch op_mode / - grade advances → reassign now-illegal goal/strategy/lens to their defaults) and - (b) gating which options are available even within a parent state. - - *This logic already exists server-side* as lookup tables in - `projections/session/runtime-policy.ts` (`OPERATIONAL_MODE_DEFINITIONS`, - `AGENT_ROLE_DEFINITIONS`, `default*` fields) and `.pi/agents/state.ts` - (`GRADE_RANK`, `GOAL_MIN_GRADE`, `STRATEGY_MIN_GRADE`). Gating = min-grade tables + - `allowed*` lists; defaults-on-change = the `default*` fields. - - *Future enhancement (when UI pressure is real):* add one Brunch-owned **derived - affordance projection** — `affordances(resolvedState) → { availableOptions per axis, - defaultOnSwitch }` — over those tables; TUI/web/RPC clients **render** it. It is a - pure derivation, so D40-L (projection-as-truth) is untouched. + readiness coverage changes → reassign now-illegal goal/strategy/lens to their defaults) + and (b) gating which options are available even within a parent state. + - *This logic now exists server-side* as the current gap-driven derivation stack: + `projections/session/runtime-policy.ts` still owns + `OPERATIONAL_MODE_DEFINITIONS`, `AGENT_ROLE_DEFINITIONS`, and `default*` fields, + while `projections/session/affordances.ts` plus + `projections/session/capability-readiness.ts` derive legal options and + default-on-switch behavior from selected-spec gap coverage. + - *Future enhancement (when UI pressure is real):* transport more of that existing + Brunch-owned affordance projection to client surfaces. The pure derivation already + exists; the remaining question is which deferred rows (`active-review-set`, + `turn-mode`, richer client affordances) need transport, not whether to invent a + new truth machine. - *Durable constraint to preserve through the deferral:* the affordance/legality semantics are **Brunch-owned and shared** (D52-L thin-transport) — never reimplemented per client. The day the web client hand-rolls "which strategies are @@ -207,14 +209,16 @@ built. Ordered by leverage. select freestyle** (user pin only). Remaining scope-level detail: capture quality beyond labeled facts, per-turn vs on-demand capture, exact slash/skill-command surface (→ Q6). - **Q3 — `unknown` nodes (the MODELLING PROBLEM). RESOLVED → SPEC D65-L / A24-L.** - De-conflated into two concepts: `elicitation_backlog` (prospective process-agenda / - "prospective memory" — a **flat table**, not a graph node; async + unordered; the + De-conflated into two concepts: a **flat per-spec obligation register** (prospective + process agenda / "prospective memory," not a graph node; async + unordered; the prospective sibling of the retrospective `reconciliation_need`) and a deferred `risk` - intent-node-kind (durable domain-epistemic gap). The `elicitation_backlog` table is - the missing substrate for the "what to ask next" objective and generalized capture. - `basis` generalized to provenance-directness (D63-L). Name locked to `elicitation_backlog` - (over `agenda`/`need`) to signal async/unordered. Remaining scope-level detail: seed - mechanism, mutation path, goal-layer relationship. + intent-node-kind (durable domain-epistemic gap). The FE-823 interim `elicitation_backlog` + table was later remodeled into D65-L `elicitation_gaps`, and D75-L then replaced the + closed gap-name catalog with `refersTo: NodeKind` plus a free-form `question`. The + missing substrate for the "what to ask next" objective and generalized capture is + therefore the obligation register, not a new graph node kind. `basis` generalized to + provenance-directness (D63-L). Remaining scope-level detail: live ranking, mutation + ergonomics, and the goal-layer relationship. - **Q1 — Negative/IS_NOT graph queries. RESOLVED → dedicated `gaps` mode.** Add a fourth `read_graph` mode `gaps`: a base class filter (`kinds` and/or `readinessBands`) plus a required `absentEdgeCategory` and optional `direction` (default `both`), returning @@ -226,7 +230,7 @@ built. Ordered by leverage. `graph_truth` it does not. Bounded — single `absentEdgeCategory`, not a query language. **SPEC touch (RATIFIED 2026-06-07):** D60-L + glossary Agent context entry now enumerate the fourth observed read shape (gap query). Scoped: `memory/cards/crosscut-read--graph-gaps.md`. - Directly serves the D65-L `elicitation_backlog` "what to ask next" driver (theses w/o + Directly serves the D65-L gap-driven "what to ask next" driver (theses w/o proof, requirements w/o realization, claims w/o support). - **Q5 — Agent `mutate_graph` patch/delete.** Deferred after role-safe graph mutations. Default lean: the autonomous agent-facing tool stays creation-only; deletion is not diff --git a/memory/PLAN.md b/memory/PLAN.md index 17302098..cddd803c 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -41,7 +41,7 @@ A new graph-mutation planning result has been promoted into the rolling plan as **Readiness / elicitation-gaps remodel promoted (2026-06-09 ln-plan, post-`ln-spec`).** A SPEC pass reconceived the readiness and prospective-agenda model and must now land in code (D45-L, D57-L, D64-L, D65-L, D73-L, D74-L; A24-L, A27-L; I25-L, I30-L, I31-L). Four coupled implications: (1) **`elicitation_backlog` → `elicitation_gaps`** — the FE-823 question-instance / `open|closed` table is remodeled into typed coverage *obligations* (each gap carries a `name` typology key + meta `rationale`, a band, a `presence|field|coverage|manual` predicate union, an `importance` + derived `coverage`, and a `disposition`), seeded from the collated **grounding typology catalog** (floor `domain`/`protagonist`/`pain_pull`/`constraint` + progressive drivers `value`/`context_of_use`/`success_sketch`/`solution_boundary`) instead of four literal anchor questions; (2) **JIT capability-readiness** replaces the stored grade gate — readiness is judged on a capability request against the relevant gaps (proceed / proceed-at-low-epistemic-status / negotiate), retiring `readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, and the `MIN_GRADE` proxy tables in `runtime-policy.ts`; (3) a soft derived **readiness estimate** (UI-only, gates nothing) plus removal of the vestigial `chrome.phase` / `chrome.chatMode` fields; (4) a small follow-on **session/runtime vocabulary leaf** (`src/session/schema/kinds.ts`) mirroring `graph/schema/kinds.ts` for the `op_mode`/`strategy`/`lens`/`goal` axes. These are promoted as `elicitation-gaps-remodel` → `capability-readiness` (hard chain) plus the parallel `runtime-vocab-leaf`; none are POC-ship-critical (the delivery cut de-scopes elicitation quality). **Sequencing tension with the trio:** `capability-readiness` mutates exactly the shapes the trio would lock (`workspace/workspace-state` drops phase/chatMode and gains the readiness estimate; `session/runtime-state` + composition drop grade). By the trio's own "lock upstream shape before downstream output" principle, the gaps/readiness remodel is *upstream* of the trio's readiness/chrome-touching locks and should land before stage 1 (`projection-shape-coverage`) freezes those shapes — otherwise the locks churn. Recommended order: `elicitation-gaps-remodel` → `capability-readiness` first, then the trio; or, if the trio leads, it must explicitly bracket the grade/phase/chatMode fields until the remodel lands. `elicitation-driver` now rides the remodeled gaps substrate, not the FE-823 backlog shape. **2026-06-10 follow-on (D75-L):** a further SPEC pass collapsed the parallel grounding-typology vocabulary onto the node-kind ontology — gaps now reference graph node kinds (`refersTo: NodeKind`) instead of a closed typology `name` enum. This inserts `gaps-node-kind-reference` at the head of the chain (`elicitation-gaps-remodel` → `gaps-node-kind-reference` → `capability-readiness`); it reshapes the gaps substrate and the `capability → NodeKind[]` map, and absorbs the now-retired refactor plan (which had planned to enshrine the typology catalog). -**Turn-boundary choreography promoted as core mechanics (2026-06-10 ln-plan, post-`ln-spec` D76-L–D78-L / I45-L–I47-L).** The runtime "Tier-2" layer — what enters the transcript at a turn boundary and who originates the next turn — is being specced and scoped *now*, not deferred to M7-as-fog, because it is core product choreography and the concept is fresh. SPEC locked three decisions (assistant-visible watermark D76-L; one-writer reconciler + aux seams/guard D77-L; honest kick + context seeding D78-L), sharpened I9-L, and added I45-L–I47-L plus a **coverage-first scaffold** design note (author the layer's whole invariant suite up front, skip/`todo` each test until its enabling slice lands). The layer decomposes into a slice map S0–S5: **S0** is the Tier-2 *chassis* (DX only, thin) on **FE-847** — real `runBrunchTui` boot, one faux model turn, provider-payload capture, transcript inspection, fixture resume — plus authoring the skipped coverage-first scaffold and the topology stubs the product slices fill. **S1–S3 + S5(share)** are product write-side mechanics owned by **`turn-boundary-reconciliation` (M7)**: S1 assistant-visible watermark projection, S2 the `prepareNextTurn` reconciler + `worldUpdate` + own-write/full-overview watermark stamping, S3 the submit-time mention ledger + staleness. **S4 + S5(share)** are the **`kick-and-context-seeding`** grouping: honest assistant-origination behind `session.triggerExchange` plus boot/resume context seeding. S5 (boot idempotence + carrier discipline, I47-L) is a cross-cutting obligation threaded through both product groupings, not its own frontier. **Branch decision (user, 2026-06-10): the entire S0–S5 layer is built on the single `ln/fe-847-dx-introspection-tier-2` branch under the FE-847 issue** — the three groupings are distinct planning units (seams + traceability) executed as sequential slices, not separate Linear issues/Graphite branches (AGENTS.md permits multiple slices per issue+branch). The scaffold's first tests must encode three edge cases locked into SPEC: (a) seed/full-overview snapshots advance the watermark while narrow `getNodes`/`queryNodes` reads do not; (b) no redundant `worldUpdate` immediately after a seed that named the current snapshot LSN; (c) the resume kick decision is taken on the **pre-reconcile** tail, so a user tail still earns a kick even after the reconciler inserts seed/staleness notices ahead of it. None of this is POC-ship-critical; the S0 chassis is buildable now. +**Turn-boundary choreography promoted as core mechanics (2026-06-10 ln-plan, post-`ln-spec` D76-L–D78-L / I45-L–I47-L).** The runtime "Tier-2" layer — what enters the transcript at a turn boundary and who originates the next turn — is being specced and scoped *now*, not deferred to M7-as-fog, because it is core product choreography and the concept is fresh. SPEC locked three decisions (assistant-visible watermark D76-L; one-writer reconciler + aux seams/guard D77-L; honest kick + context seeding D78-L), sharpened I9-L, and added I45-L–I47-L plus a **coverage-first scaffold** design note (author the layer's whole invariant suite up front, skip/`todo` each test until its enabling slice lands). The layer decomposes into a slice map S0–S5: **S0** is the Tier-2 *chassis* (DX only, thin) on **FE-847** — real `runBrunchTui` boot, one faux model turn, provider-payload capture, transcript inspection, fixture resume — plus authoring the skipped coverage-first scaffold and the topology stubs the product slices fill. **S1–S3 + S5(share)** are product write-side mechanics owned by **`turn-boundary-reconciliation` (M7)**: S1 assistant-visible watermark projection, S2 the `prepareNextTurn` reconciler + `worldUpdate` + own-write/full-overview watermark stamping, S3 the submit-time mention ledger + staleness. **S4 + S5(share)** are the **`kick-and-context-seeding`** grouping: honest assistant-origination behind `session.triggerExchange` plus boot/resume context seeding. S5 (boot idempotence + carrier discipline, I47-L) is a cross-cutting obligation threaded through both product groupings, not its own frontier. The original 2026-06-10 FE-847 execution decision kept S0–S5 as one sequential closure chain rather than separate issues/frontiers; a 2026-06-11 branch-mechanics override then split that chain across two FE-847 branches for stack health: `dx-tier-2-harness` remained on `ln/fe-847-dx-introspection-tier-2`, while `turn-boundary-reconciliation` and `kick-and-context-seeding` continue together on stacked successor `ln/fe-847-turn-boundary-closure`. The scaffold's first tests must encode three edge cases locked into SPEC: (a) seed/full-overview snapshots advance the watermark while narrow `getNodes`/`queryNodes` reads do not; (b) no redundant `worldUpdate` immediately after a seed that named the current snapshot LSN; (c) the resume kick decision is taken on the **pre-reconcile** tail, so a user tail still earns a kick even after the reconciler inserts seed/staleness notices ahead of it. None of this is POC-ship-critical; the S0 chassis is buildable now. ### Context-pipeline coverage (the next design/lock spine) @@ -102,7 +102,7 @@ Post-`ln-spec` implications that are **upstream** of the context-pipeline trio's ### Next -The near-term spine has two tracks. The **context-pipeline coverage trio** remains the elevated product-coverage spine, sequenced in strict dependency order (lock upstream shape before downstream output). `role-safe-graph-mutations` is a graph-mutation grammar frontier that can run before or alongside the trio, and must land before relation-bearing generalized capture or semantic fixture curation rely on the new mutation surface. The `dx-feedback-loops` DX substrate is complete and no longer gates this list. `dx-introspection-live` is its low-conflict follow-on (wire the dormant introspection extension into the real TUI, harden `.fixtures/` topology + `--cwd`, make introspection conversational); it is DX substrate, parallel to the product trio, and not POC-ship-critical. +The near-term spine has two tracks. The **context-pipeline coverage trio** remains the elevated product-coverage spine, sequenced in strict dependency order (lock upstream shape before downstream output). `role-safe-graph-mutations` is a graph-mutation grammar frontier that can run before or alongside the trio, and must land before relation-bearing generalized capture or semantic fixture curation rely on the new mutation surface. The `dx-feedback-loops` DX substrate and its `dx-introspection-live` follow-on are complete and no longer gate this list; the remaining FE-847 closure work is the active parallel product track. 1. `projection-shape-coverage` — **PROJECT stage** (`#project`); invariant / no-loss kind. Ledger authored in `src/projections/README.md`. Two sub-steps: (a) **PULL-session prerequisite** — ledger the session read surface (`session/workspace-context`, `workspace-session-coordinator`, `runtime-state`) the session/workspace projections lock against; (b) **earns-its-place audit then lock** — delete/inline the `✗` indirection (`workspace/workspace-context`: single-consumer tag wrapper), resolve the `◐` exchange family (direct-lock vs keep-transitive), and add a shape/no-loss invariant to each `●` survivor (`graph/neighborhood`, `session/transcript-context`, `session/runtime-state`, `workspace/workspace-state`). The graph projection stubs (`overview`, `commit-result`, `reconciliation-needs`) are `export {}` topology stubs, **not** dark implementations — leave them. Upstream of everything else in the trio; do this first so renderer goldens lock against stable shapes. 2. `renderer-golden-coverage` — **RENDER stage** (`#render`); golden + invariant kind. **Depends on `projection-shape-coverage`.** Create the renderer ledger (README claims one that does not exist), extend the preview harness past `graph-neighborhood`, and golden-lock every durable renderer (only `graph/neighborhood` + `session/runtime-frame` are locked; the rest are dark or only transitively covered via the `.pi` adapter). @@ -166,7 +166,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai ### turn-boundary-reconciliation - **Name:** Turn-boundary reconciliation — assistant-visible watermark, `worldUpdate`, mention staleness -- **Linear:** FE-847 — built as a slice group on the FE-847 branch (2026-06-10 single-branch decision); no separate issue. +- **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `kick-and-context-seeding`). - **Kind:** structural / product mechanics (M7) - **Status:** active (turn-boundary choreography; not POC-ship-critical) @@ -193,7 +193,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai ### kick-and-context-seeding - **Name:** Session origination — honest kick + boot/resume context seeding -- **Linear:** FE-847 — built as a slice group on the FE-847 branch (2026-06-10 single-branch decision); no separate issue/branch. +- **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `turn-boundary-reconciliation`). - **Kind:** structural / product mechanics - **Status:** next (turn-boundary choreography; not POC-ship-critical) @@ -202,7 +202,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Depends on:** `turn-boundary-reconciliation` (S1 watermark projection + S2 reconciler — the seed must advance the watermark and the kick decision interacts with reconciler-inserted notices) and the `dx-tier-2-harness` chassis. Sequenced last in the FE-847 slice chain. - **Lights up:** Honest session origination — `startAssistantTurn({ origin })` surfaced through `session.triggerExchange`, plus boot/resume context seeding as custom continuity entries. - **Stabilizes:** I46-L (honest origination + pre-reconcile-tail resume policy) and its share of I47-L (boot/resume seed idempotence + carrier discipline). -- **Objective:** Build the write-side of origination (S4) behind the FE-847 chassis (same FE-847 branch, sequenced after the reconciliation slices). A **new** session seeds workspace/spec-overview context as custom continuity entries (D76-L; the seed names the snapshot LSN and so initializes the watermark), then kicks an assistant-originated `present_*` exchange. A **resumed** session takes the kick decision from the **pre-reconcile** transcript tail: kick iff that tail owed assistant continuation (user message or incomplete exchange-tuple), even after the reconciler inserts seed/staleness notices ahead of it; otherwise rest at a `request_*`/system leaf. AUTO always originates offer-first (D66-L: AUTO never selects `freestyle`); only an explicit `freestyle` pin yields a wait-for-user idle. Carries its share of S5 — boot/resume seeding is idempotent (dedupe derived from projected transcript state, survives real restart) and continuity rides custom entries only. Flip the corresponding FE-847 scaffold tests live. +- **Objective:** Build the write-side of origination (S4) behind the FE-847 chassis, sequenced after the reconciliation slices on the shared successor FE-847 branch. A **new** session seeds workspace/spec-overview context as custom continuity entries (D76-L; the seed names the snapshot LSN and so initializes the watermark), then kicks an assistant-originated `present_*` exchange. A **resumed** session takes the kick decision from the **pre-reconcile** transcript tail: kick iff that tail owed assistant continuation (user message or incomplete exchange-tuple), even after the reconciler inserts seed/staleness notices ahead of it; otherwise rest at a `request_*`/system leaf. AUTO always originates offer-first (D66-L: AUTO never selects `freestyle`); only an explicit `freestyle` pin yields a wait-for-user idle. Carries its share of S5 — boot/resume seeding is idempotent (dedupe derived from projected transcript state, survives real restart) and continuity rides custom entries only. Flip the corresponding FE-847 scaffold tests live. - **Why now / unlocks:** The offer-first default (R16, D12-L, I13-L) has a read side but no honest write-side origination; specced now as core mechanics. Kept a distinct planning unit from M7 reconciliation because it is origination, not reconciliation; executed as the final FE-847 slice group, not a separate branch. Not POC-ship-critical. - **Acceptance:** - Origination never writes a fabricated user transcript entry and never injects a "user said begin" prompt; the kick is `startAssistantTurn({ origin })` surfaced via `session.triggerExchange`. @@ -647,12 +647,12 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** `src/dev/` becomes the dev front door (launchers + shared faux-harness factory); the introspection extension lives under `src/.pi/extensions/` per D39-L topology and is wired in `src/.pi/brunch-pi-extensions.ts`; dev source-alias config lives in `vite.config.ts` through the `PI_SOURCE`-gated runtime alias, while base `tsconfig.json` stays paths-free; introspection artifacts are written under `.fixtures/scratch/introspection/`. - **Traceability:** D39-L, D58-L, D67-L, D68-L, D69-L; A25-L; I38-L. - **Design docs:** `memory/SPEC.md` §Development Feedback Loops (DX) and D67-L–D69-L; a new `src/dev/README.md`; `pi-mono/packages/coding-agent/docs/development.md` and `vitest.config.ts` for the alias pattern. -- **Current execution pointer:** Done 2026-06-09. The chain landed the latest-pi bump and `PI_SOURCE`-gated runtime alias, the `src/dev/` faux front door and shared faux harness, and the dev-gated read-only introspection extension plus paired run-artifact launcher. Verification: `npm run verify` (608 tests, tsc build, web build). **Follow-on:** `dx-introspection-live` carries the remaining gaps — the introspection extension is built but not wired into the real TUI launch path, `--cwd` is not yet supported by the main CLI, and there is no conversational self-report surface yet. +- **Current execution pointer:** Done 2026-06-09. The chain landed the latest-pi bump and `PI_SOURCE`-gated runtime alias, the `src/dev/` faux front door and shared faux harness, and the dev-gated read-only introspection extension plus paired run-artifact launcher. Verification: `npm run verify` (608 tests, tsc build, web build). The follow-on frontier `dx-introspection-live` is now also done: the real TUI wiring, `--cwd` launch surface, unified `BRUNCH_DEV` gate, dev query tools, and workspace-local `.brunch/debug/` cache all landed on 2026-06-11. ### dx-introspection-live - **Name:** Live, conversational agent-input introspection in the real dev TUI -- **Linear:** unassigned — create in FE / brunch when the frontier starts (do not parent under FE-531; sibling of FE-825, not a child). +- **Linear:** FE-825 — https://linear.app/hash/issue/FE-825/first-class-developer-feedback-loops-over-the-pi-harness - **Kind:** structural / dev-substrate (capability expansion over `dx-feedback-loops`) - **Status:** done - **Certainty:** proving @@ -660,7 +660,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Lights up:** Running `BRUNCH_DEV=1 npm run dev -- --cwd .fixtures/workbenches/` boots the *real* Brunch TUI against a chosen fixture workspace with the introspection extension live and the model able to query exact prior session-log values back into chat for discussion — a loop that did not exist before this frontier (the extension was built but dormant, and dev runs polluted the operating cwd). - **Stabilizes:** The four-role `.fixtures/` topology (D70-L), the unified `BRUNCH_DEV` dev gate + `--cwd` launch surface (D71-L), and the conversational session-query contract (A26-L) that future introspection work aims from. - **Objective:** Make introspection actually *usable live* and *conversational*. Preflight hardening has already formalized scratch artifact routing and moved probe faux wiring out of `src/dev/**`; slice 1 added `--cwd `, unified dev gating under `BRUNCH_DEV`, and wired the introspection extension into the real TUI launch path only when enabled. Slice 2 replaces the earlier fixed self-report schema idea with a general read-only `brunch_session_query` tool over `ctx.sessionManager.getBranch()`: predicate match session entries, project exact values, truncate/spill large output, and let the agent echo/discuss those returned bytes in normal chat. The follow-on live-advertisement/payload-query slice makes registered dev query tools actually active under the D40-L allow-list and adds `brunch_introspect_query` over captured provider payloads plus base prompt options. Live-model compliance remains outer-loop fitness, not a product prompt/resource contract. -- **Why now / unlocks:** `dx-feedback-loops` built the introspection machinery but left it dormant — the capability the user actually wants (interrogate the live in-product agent about how it reads Brunch's tools/skills, and get clarity feedback in chat) is not reachable yet. This closes that gap and hardens the fixtures topology every dev loop and probe shares. Not POC-ship-critical; a DX substrate that accelerates later product frontiers (especially the I38-L discretionary-loading and tool/skill-clarity questions). +- **Why now / unlocks:** When this frontier started, `dx-feedback-loops` had built the introspection machinery but left it dormant — the capability the user actually wanted (interrogate the live in-product agent about how it reads Brunch's tools/skills, and get clarity feedback in chat) was not yet reachable. This frontier closed that gap and hardened the fixtures topology every dev loop and probe shares. Not POC-ship-critical; a DX substrate that accelerates later product frontiers (especially the I38-L discretionary-loading and tool/skill-clarity questions). - **Acceptance:** - `runBrunchCli` accepts `--cwd ` (defaulting to `process.cwd()`) so a dev session can target `.fixtures/workbenches/` without `cd`. - A single `BRUNCH_DEV` switch enables dev RPC, introspection registration, scratch routing, and the offline lift together; `BRUNCH_DEV_RPC` is fully retired (no remaining references in code or docs). @@ -724,7 +724,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai ## Recently Completed - 2026-06-09 `role-safe-graph-mutations` — Done: retired the remaining public `commitGraph` residue, extracted the shared mutation planner/writer out of `CommandExecutor`, and completed the last boundary migration so dev curation now exposes `dev.graph.mutateGraph` with role-named create-edge ops plus projected node-code / selected-spec edge-id resolution. Follow-up closure on the same frontier: reconciled the remaining product probes and current docs to the canonical `mutateGraph` / `mutate_graph` grammar, explicitly marked the checked-in 2026-06-05 fixture-curation artifact as historical pre-migration `commit_graph` evidence, and added role-named edge schema coverage across the Pi tool and dev RPC boundaries. Verified: `npx vitest run src/rpc/handlers.test.ts src/app/brunch.test.ts src/probes/fixture-curation-loop.test.ts src/probes/propose-graph-commit-proof.test.ts src/graph/mutate-graph-edge-schema.test.ts` and `npm run verify`. -- 2026-06-09 `dx-feedback-loops` (FE-825) — Done: bumped Brunch to the pi 0.79 line with a dev-only `PI_SOURCE` runtime alias, consolidated the dev front door around a shared faux harness and scripted faux launcher, and added the dev-gated read-only introspection extension plus `runBrunchIntrospectionTurn()` paired artifact writer now routed under `.fixtures/scratch/introspection/`. Product runs omit introspection by default and keep the D39-L offline default; the dev launcher explicitly lifts offline mode for real-provider introspection. Verified: `src/.pi/__tests__/introspection.test.ts`, `src/dev/introspection-launcher.test.ts`, and `npm run verify`. +- 2026-06-09 `dx-feedback-loops` (FE-825) — Done: bumped Brunch to the pi 0.79 line with a dev-only `PI_SOURCE` runtime alias, consolidated the dev front door around a shared faux harness and scripted faux launcher, and added the dev-gated read-only introspection extension plus `runBrunchIntrospectionTurn()` paired artifact writer now routed under `.fixtures/scratch/introspection/`. Product runs omit introspection by default and keep the D39-L sealed profile intact; the later `dx-introspection-live` closure wired the real TUI path under `BRUNCH_DEV` while keeping Pi startup-update suppression scoped at launch rather than globally lifting offline mode. Verified: `src/.pi/__tests__/introspection.test.ts`, `src/dev/introspection-launcher.test.ts`, and `npm run verify`. - 2026-06-08 `runtime-affordances-and-legality` — Done (00105108): added `src/projections/session/affordances.ts` owning the pure `(resolvedState, readinessGrade) → legal goal/strategy/lens options + default-on-switch` derivation; lifted the shared grade/AUTO legality tables into `src/projections/session/runtime-policy.ts` and refactored `src/.pi/agents/state.ts` to reuse that single legality source (no client-local reimplementation); added the closed coverage ledger to `src/session/README.md` with `src/session/runtime-affordances-coverage.test.ts` guarding the required agent rows while tripwiring `active-review-set` / `turn-mode` as explicit product-state-gated deferrals. Reconciled D40-L. Verified: `src/projections/session/affordances.test.ts`, `src/session/runtime-affordances-coverage.test.ts`, and `npm run verify`. @@ -750,7 +750,7 @@ nodes: minimal-authority-shell [done · P1] thin safety posture for current POC paths poc-live-ship-gate [next · P1] final fresh-cwd composed product runbook dx-feedback-loops [done · proving] consolidated src/dev front door (faux/real/introspection loops) + latest-pi source-alias; sealed-profile-safe read-only introspection capture - dx-introspection-live [next · proving] wire dormant introspection into real TUI; four-role .fixtures topology + --cwd; unify BRUNCH_DEV; conversational self-report + dx-introspection-live [done · proving] live real-TUI introspection + four-role .fixtures topology + --cwd + unified BRUNCH_DEV + conversational query tools + .brunch/debug cache graph-observed-shapes [done · proving] ratified consumer-specific observed-shape ledger + drift guard; no transport shape shipped runtime-affordances-and-legality [done · proving] shared affordance derivation + coverage ledger; review-set/turn-mode rows tripwired (superseded by gap-based capability-readiness) role-safe-graph-mutations [done · proving] canonical mutateGraph/mutate_graph authored grammar; role-named edges; retire exposed commitGraph/commit_graph @@ -768,9 +768,9 @@ nodes: topology-readmes-and-boundaries [parallel] attach-to-frontier topology hardening dev-seed-fixtures [parallel · proving] explicit seed selection + target-workspace-scoped workbench launch; catalog captured seeds; prove D79/I48 tracer web-design-system-port [done · earned] ported prior-trunk tokens + card primitives into src/web; retired invented warm aesthetic; read-only, no spine deps - dx-tier-2-harness [active · proving] FE-847 Tier-2 DX chassis (real boot + faux turn + payload/transcript oracle + fixture resume) + coverage-first scaffold (skipped I45-I47) + topology stubs + dx-tier-2-harness [done · proving] FE-847 Tier-2 DX chassis (real boot + faux turn + payload/transcript oracle + fixture resume) + coverage-first scaffold + topology stubs turn-boundary-reconciliation [next · proving] M7 product write-side: watermark projection (S1) + prepareNextTurn reconciler/worldUpdate/own-write stamping (S2) + submit-time mention ledger/staleness (S3) - kick-and-context-seeding [next · proving] separate product frontier/branch: honest kick via triggerExchange + boot/resume context seeding (S4); pre-reconcile-tail policy; boot idempotence (S5 share) + kick-and-context-seeding [next · proving] shared FE-847 successor branch: honest kick via triggerExchange + boot/resume context seeding (S4); pre-reconcile-tail policy; boot idempotence (S5 share) edges: graph-tool-resilience -[hard]-> capture-response-to-graph @@ -798,7 +798,7 @@ edges: dx-feedback-loops -[hard]-> dx-introspection-live (built the dormant introspection machinery this frontier wires live + makes conversational) dx-feedback-loops -[hard]-> dx-tier-2-harness (Tier-2 chassis reuses the src/dev faux harness + real-boot front door) dx-tier-2-harness -[hard]-> turn-boundary-reconciliation (S1-S3 mechanics are proven through the Tier-2 chassis + flip its skipped scaffold tests live) - dx-tier-2-harness -[hard]-> kick-and-context-seeding (S4 origination is proven through the Tier-2 chassis; same FE-847 branch, last slice group) + dx-tier-2-harness -[hard]-> kick-and-context-seeding (S4 origination is proven through the Tier-2 chassis; same FE-847 closure chain, last slice group on the successor branch) turn-boundary-reconciliation -[hard]-> kick-and-context-seeding (seed must advance the watermark (S1) and the kick decision interacts with reconciler-inserted notices (S2)) parallel obligations: @@ -829,7 +829,7 @@ notes: - `project-graph-review-cycle` is complete evidence for the optional batch proposal/review story; keep future review-quality work as follow-up, not FE-809 completion debt. - `topology-readmes-and-boundaries` is not a license for abstract cleanup; it rides with concrete delivery seams. - **Readiness / elicitation-gaps remodel (2026-06-09 ln-plan, post-`ln-spec`).** The SPEC pass (D45-L, D57-L, D64-L, D65-L, D73-L, D74-L; A24-L, A27-L; I25-L, I30-L, I31-L) promotes a hard chain `elicitation-gaps-remodel` → `capability-readiness` plus the parallel `runtime-vocab-leaf`. `elicitation_backlog` is remodeled into the D65-L `elicitation_gaps` obligation register (name + rationale, band, `presence|field|coverage|manual` predicate, importance + derived coverage, disposition; seeded from the grounding typology catalog). Capability-readiness becomes a JIT `capability → relevant gaps` judgment that retires the stored `readiness_grade` / `updateReadinessGrade` / `READINESS_GRADES` / `MIN_GRADE` proxies, adds a soft UI-only `readiness estimate`, and removes `chrome.phase` / `chrome.chatMode`. **These are upstream of the trio's readiness/chrome-touching locks** (`capability-readiness` mutates `workspace/workspace-state` + `session/runtime-state` shapes that `projection-shape-coverage` would freeze): land the chain before trio stage 1, or have the trio explicitly bracket the grade/phase/chatMode fields until the remodel lands. None are POC-ship-critical. `elicitation-driver` now depends on `elicitation-gaps-remodel`, not the FE-823 backlog shape. `runtime-vocab-leaf` is the decision-3 follow-on (session/runtime enum source-of-truth leaf) and does **not** relocate the retired `READINESS_GRADES`. Decision-2 (readiness-grade vs band term overlap → `capture_band`/`readiness_gate`) was explicitly **left alone**. - - **Turn-boundary choreography (Tier-2 layer, 2026-06-10).** Promoted from the `turn-boundary-reconciliation` horizon stub into three frontiers after a SPEC pass locked D76-L–D78-L / I45-L–I47-L. `dx-tier-2-harness` (FE-847) is the thin DX chassis + coverage-first scaffold (skipped tests + topology stubs); `turn-boundary-reconciliation` (M7) owns the watermark/reconciler/mention write-side (S1–S3); `kick-and-context-seeding` is the honest-origination + seeding grouping (S4). S5 (boot idempotence + carrier discipline, I47-L) is a cross-cutting obligation on both product groupings, not its own frontier. **All of S0–S5 build on the single `ln/fe-847-dx-introspection-tier-2` branch under FE-847 (user decision 2026-06-10)** — distinct planning units, sequential slices, no separate issues/branches. The scaffold encodes three edge cases: seed/full-overview snapshots advance the watermark while narrow reads do not; no redundant `worldUpdate` after a seed naming the current snapshot LSN; the resume kick decision is taken on the pre-reconcile tail. Each grouping flips its own scaffold tests live (no slice lands green leaving its tests skipped). None POC-ship-critical; the S0 chassis is buildable now. + - **Turn-boundary choreography (Tier-2 layer, 2026-06-10).** Promoted from the `turn-boundary-reconciliation` horizon stub into three frontiers after a SPEC pass locked D76-L–D78-L / I45-L–I47-L. `dx-tier-2-harness` (FE-847) is the thin DX chassis + coverage-first scaffold (skipped tests + topology stubs); `turn-boundary-reconciliation` (M7) owns the watermark/reconciler/mention write-side (S1–S3); `kick-and-context-seeding` is the honest-origination + seeding grouping (S4). S5 (boot idempotence + carrier discipline, I47-L) is a cross-cutting obligation on both product groupings, not its own frontier. The original FE-847 execution decision kept S0–S5 as one sequential closure chain; the later 2026-06-11 branch-mechanics override split that chain across two FE-847 branches for stack health: `dx-tier-2-harness` stayed on `ln/fe-847-dx-introspection-tier-2`, while `turn-boundary-reconciliation` and `kick-and-context-seeding` continue together on `ln/fe-847-turn-boundary-closure`. The scaffold encodes three edge cases: seed/full-overview snapshots advance the watermark while narrow reads do not; no redundant `worldUpdate` after a seed naming the current snapshot LSN; the resume kick decision is taken on the pre-reconcile tail. Each grouping flips its own scaffold tests live (no slice lands green leaving its tests skipped). None POC-ship-critical; the S0 chassis is buildable now. - **Oracle pre-build review (2026-06-10).** Endorsed the architecture (projected watermark + one reconciler writer + honest origination) and surfaced four pre-build hazards, all folded into SPEC: (1) **same-session capture** — `worldUpdate` now covers any write not already assistant-visible via a carrier, incl. submit-time/freestyle capture (D18-L/D66-L), not just foreign writes (D76-L/I45-L); (2) **kick = conversational-debt classification** ignoring trailing continuity-only entries, so reboot-after-notice stays idempotent (D78-L/I46-L); (3) **compaction must preserve the watermark carrier** so projection never regresses (I47-L); (4) **guard-as-retry** — `before_provider_request` re-runs prepare once on drift, never writes; reconciler runs before prompt composition (D77-L). Also: keep S1 a separate watermark projection, not an overload of `runtimeState.world.latestLsn`. **Optional S2 split** if it grows too wide: S2a = watermark + core reconciler + `worldUpdate`; S2b = adapter stamping + side-task/reviewer drains. Defer to `ln-scope`. - Multi-spec workspace discipline applies throughout: target the selected/current spec explicitly; no workspace-global graph truth in the POC. ``` diff --git a/memory/SPEC.md b/memory/SPEC.md index b9a0b5e2..46171ebc 100644 --- a/memory/SPEC.md +++ b/memory/SPEC.md @@ -623,7 +623,7 @@ Verification oracles prove Brunch's *product* claims; development loops are how The vite/vitest-backed loops can run against pi *source* via the D67-L `PI_SOURCE` alias, so no rebuild is needed there to pick up either Brunch or pi edits. `tsx`-run real-provider loops intentionally keep default `dist` resolution until an opt-in dev tsconfig is needed. -Dev-loop artifacts route to gitignored `.fixtures/scratch///`, resolved to the repo root rather than the operating cwd, and decoupled from the `--cwd` workspace a dev session targets (D70-L); a single `BRUNCH_DEV` switch gates dev affordances while Brunch TUI launch keeps Pi startup update checks suppressed (D71-L). Workspace-local `.brunch/debug/` files are ephemeral `BRUNCH_DEV` caches of passive introspection bytes and explicit Brunch-owned text tool-result content, not scratch evidence. The introspection loop's live wiring into the real TUI, the four-role `.fixtures/` topology, and conversational self-report (the agent reporting in chat on tool I/O, understandability, errors, and skill activation — A26-L) are the `dx-introspection-live` follow-on; `dx-feedback-loops` built the capture machinery but left it dormant and writing under `runs/introspection/`. +Dev-loop artifacts route to gitignored `.fixtures/scratch///`, resolved to the repo root rather than the operating cwd, and decoupled from the `--cwd` workspace a dev session targets (D70-L); a single `BRUNCH_DEV` switch gates dev affordances while Brunch TUI launch keeps Pi startup update checks suppressed (D71-L). Workspace-local `.brunch/debug/` files are ephemeral `BRUNCH_DEV` caches of passive introspection bytes and explicit Brunch-owned text tool-result content, not scratch evidence. `dx-introspection-live` has now landed: the real TUI wires the D69-L passive capture live under `BRUNCH_DEV`, `brunch_session_query` / `brunch_introspect_query` let the agent pull exact session and payload values back into chat, repo-root `.fixtures/scratch/introspection//` remains the durable paired-run artifact path, and only the narrow workspace-local debug cache mirrors the latest final system prompt plus Brunch-owned text tool results. `tool-renders` flattening remains deferred until a concrete renderer-debugging need appears. ### Oracle Strategy by Loop Tier From 14e841ae65bb6dddb38dac079df057789b739bb7 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 12:58:46 +0200 Subject: [PATCH 02/32] fable ln-induct review and re-scope --- HANDOFF.md | 30 ---- memory/PLAN.md | 4 +- ...capability-readiness--live-gap-legality.md | 120 ++++++++++++++++ memory/cards/dev--review-fix-sweep.md | 109 ++++++++++++++ ...ation-gaps-remodel--predicate-hardening.md | 133 ++++++++++++++++++ ...and-context-seeding--honest-origination.md | 5 +- ...undary-reconciliation--continuity-chain.md | 10 +- 7 files changed, 375 insertions(+), 36 deletions(-) delete mode 100644 HANDOFF.md create mode 100644 memory/cards/capability-readiness--live-gap-legality.md create mode 100644 memory/cards/dev--review-fix-sweep.md create mode 100644 memory/cards/elicitation-gaps-remodel--predicate-hardening.md diff --git a/HANDOFF.md b/HANDOFF.md deleted file mode 100644 index 43d5d89c..00000000 --- a/HANDOFF.md +++ /dev/null @@ -1,30 +0,0 @@ -# Handoff - -> Updated 2026-06-11 after branch restack and `ln-sync`. Volatile transfer state only. Overwrite when it stops helping; canonical truth remains `memory/SPEC.md` and `memory/PLAN.md`. - -## Current Branch State - -- Current branch: `ln/fe-847-turn-boundary-closure` -- Parent branch: `ln/fe-848-prompt-context-refine` -- `dx-tier-2-harness` is complete on `ln/fe-847-dx-introspection-tier-2`. -- Remaining FE-847 product closures stay together on this successor branch by the 2026-06-11 branch-mechanics override; no new Linear issue or frontier split was introduced. - -## Canonical State - -- `memory/SPEC.md` owns D76-L–D78-L / I45-L–I47-L for turn-boundary choreography and now also states the landed `dx-introspection-live` outcome rather than describing it as a future follow-on. -- `memory/PLAN.md` now matches reality across all repeated summaries: - - `dx-introspection-live` is done. - - `dx-tier-2-harness` is done. - - `turn-boundary-reconciliation` is the active FE-847 closure frontier. - - `kick-and-context-seeding` remains next on the same successor FE-847 branch. - - The original single-chain FE-847 execution decision is preserved historically, but every live branch reference now reflects the later split across `ln/fe-847-dx-introspection-tier-2` and `ln/fe-847-turn-boundary-closure`. - -## Remaining Builder Entries - -- `memory/cards/turn-boundary-reconciliation--continuity-chain.md` closes the current frontier by replacing the remaining Tier-2 I45/I47 scaffold with live submit-path and compaction/resume proof. -- `memory/cards/kick-and-context-seeding--honest-origination.md` follows to close I46/I47 through real boot/resume origination proof. - -## Verification Baseline - -- The last FE-847 builder report for the refactor/closure stack ended with `npm run verify` passing. -- This sync pass changes docs/planning state only. diff --git a/memory/PLAN.md b/memory/PLAN.md index cddd803c..e60dd30f 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -288,7 +288,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Cross-cutting obligations:** Anti-shadowing — the table never holds domain content (which lives in the graph). Gaps commit only through `CommandExecutor` (`basis` via provenance-directness, D63-L: user-raised `explicit`, agent-inferred `implicit`). Multi-spec discipline — each gap belongs to one spec's register. - **Traceability:** D8-L, D30-L, D57-L, D60-L, D63-L, D64-L, D65-L, D74-L / A24-L, A27-L / I30-L. Supersedes the FE-823 backlog row shape. - **Design docs:** `memory/SPEC.md` D65-L and §Grounding typology catalog; `src/graph/README.md`; `src/db/README.md`. -- **Current execution pointer:** Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. +- **Current execution pointer:** Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. **2026-06-11 review-fix follow-on:** the ln-induct pass over stack PR comments scoped `memory/cards/elicitation-gaps-remodel--predicate-hardening.md` (reject unimplemented `field`/`coverage` arms behind one exhaustive predicate-semantics owner, predicate-row consistency on read, presence kind-floor dedup, regenerated 0004 migration) to land on `ln/fe-847-turn-boundary-closure`. ### gaps-node-kind-reference @@ -338,7 +338,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Cross-cutting obligations:** Readiness never bars graph truth or work (I31-L); `CommandExecutor` must not reject a node for a later-band kind (D64-L). The deferred milestone gate for export/plan/execute op-modes stays deferred (D45-L). Replace grade-gate tests across `compose.test.ts` / `prompting.test.ts` and createSpec/getSpec rather than preserving them. - **Traceability:** D25-L, D30-L, D32-L, D45-L, D57-L, D58-L, D59-L, D64-L, D65-L, D73-L, D74-L, D75-L / A27-L / I25-L, I31-L. Supersedes stored-grade gating and the `chrome.phase` / `chrome.chatMode` fields. - **Design docs:** `memory/SPEC.md` D45-L / D74-L; `src/projections/session/runtime-policy.ts`; `src/projections/workspace/workspace-state.ts`. -- **Current execution pointer:** Done 2026-06-11. Slices 1–5 moved all legality and display consumers from the old grade/phase-era fields to selected-spec `ElicitationGap[]` / derived readiness estimates. The final grade-deletion sweep removed `specs.readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, `ReadinessGrade`, and `AgentPromptSpecContext.readinessGrade`; regenerated migration metadata; stripped readiness grade from seed/export fixture contracts and JSON seed files; and removed probe setup calls that only advanced the legacy grade. `createSpec` / `getSpec` now carry only spec identity (`id`, `name`, `slug`), and readiness remains gap-derived at the consumers. +- **Current execution pointer:** Done 2026-06-11. Slices 1–5 moved all legality and display consumers from the old grade/phase-era fields to selected-spec `ElicitationGap[]` / derived readiness estimates. The final grade-deletion sweep removed `specs.readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, `ReadinessGrade`, and `AgentPromptSpecContext.readinessGrade`; regenerated migration metadata; stripped readiness grade from seed/export fixture contracts and JSON seed files; and removed probe setup calls that only advanced the legacy grade. `createSpec` / `getSpec` now carry only spec identity (`id`, `name`, `slug`), and readiness remains gap-derived at the consumers. **2026-06-11 review-fix follow-on:** the ln-induct pass found the live TUI composition root never wires `getElicitationGaps` into `GraphReaders` (optional member + silent `conservativeUncoveredGaps` fallback), so live legality is frozen at the conservative floor; scoped as `memory/cards/capability-readiness--live-gap-legality.md` to land on `ln/fe-847-turn-boundary-closure`. ### runtime-vocab-leaf diff --git a/memory/cards/capability-readiness--live-gap-legality.md b/memory/cards/capability-readiness--live-gap-legality.md new file mode 100644 index 00000000..c92177c1 --- /dev/null +++ b/memory/cards/capability-readiness--live-gap-legality.md @@ -0,0 +1,120 @@ +# Live gap-legality wiring — make the composition root supply real gap reads + +Frontier: capability-readiness +Status: active +Mode: single +Created: 2026-06-11 + +> Sequencing: builds on `ln/fe-847-turn-boundary-closure` after the +> `turn-boundary-reconciliation--continuity-chain.md` cards (shared write path: +> `src/app/brunch-tui.ts`). User-routed here by the 2026-06-11 ln-induct pass; +> the defect originated on PR #201 but is fixed at the top of the stack, no restack. + +## Orientation + +- Seam: the `BrunchPromptContext` / `GraphReaders` dependency surface between the live TUI composition root (`src/app/brunch-tui.ts`) and the system-prompts legality gating (`src/.pi/extensions/system-prompts/index.ts`). +- Frontier: `capability-readiness` (done) — this card closes a wiring hole that frontier left: legality reads `ElicitationGap[]`, but the live `reads` object never implements `getElicitationGaps`, so every live session falls through to `conservativeUncoveredGaps` and is frozen at the most-gated legality floor regardless of real graph coverage. +- The selected-spec gap reader already exists (`src/graph/workspace-store.ts` exposes one; `getElicitationGaps(db, specId)` in `src/graph/queries.ts` is the canonical read). +- Posture: earned (inherited from `capability-readiness`) — no unknown; this closes the optional/required ambiguity on a settled seam. + +## Target Behavior + +A live TUI session's prompt/tool legality is derived from the selected spec's real elicitation gaps, and a composition root that fails to supply gap reads is a type error, not a silent fallback. + +### Full-card cold-start reads + +``` +- memory/SPEC.md — D75-L (gaps reference node kinds), D77-L context, I-rows for capability readiness; §Verification Design +- memory/PLAN.md — frontier: capability-readiness (Frontier Definitions; done 2026-06-11) +- src/.pi/extensions/graph/index.ts — GraphReaders interface (getElicitationGaps currently optional) +- src/.pi/extensions/system-prompts/index.ts — gapsForPrompt + conservativeUncoveredGaps fallback +- src/graph/workspace-store.ts — existing selected-spec gap reader seam +- src/dev/README.md — Tier-2 harness ownership (the real-boot oracle) +``` + +### Boundary Crossings + +``` +→ src/app/brunch-tui.ts (live composition root: reads object) +→ src/.pi/extensions/graph/index.ts (GraphReaders contract) +→ src/.pi/extensions/system-prompts/index.ts (legality gating consumer) +→ src/projections/session/capability-readiness.ts (readiness evaluation, read-only) +``` + +### Risks and Assumptions + +``` +- RISK: making getElicitationGaps required breaks other GraphReaders constructors + (probes, fixtures, RPC adapters) that legitimately lack a DB handle. + → MITIGATION: sweep all GraphReaders construction sites first; where a real reader + is impossible, the constructor must opt in loudly (explicit stub named as such), + never via interface optionality. +- RISK: removing conservativeUncoveredGaps changes live legality from "floor-locked" + to "real coverage" — sessions that previously had everything gated may now unlock + capabilities. → MITIGATION: this is the intended fix; cover with a Tier-2 assertion + that a seeded spec with covered floor gaps actually unlocks the gated posture. +- ASSUMPTION: distinguishing intended-optional context members (context?, session?) + from must-wire capability members is worth recording on BrunchPromptContext. + → IMPACT IF FALSE: none beyond a comment. + → VALIDATE: n/a — documentation move. +``` + +### Posture check (earned) + +- **Closes:** the optional-vs-required ambiguity on `GraphReaders.getElicitationGaps` that let the live composition root silently diverge from every test harness. +- **Locks in:** the invariant that legality-bearing capabilities on dependency interfaces are required members — optionality is reserved for ergonomic extras (`clock?`, `telemetry?`), and that distinction is written at the interface. +- **Deletes:** `conservativeUncoveredGaps` (the silent fallback) or demotes it to an explicitly-named test stub if a harness still needs one. + +### Acceptance Criteria + +``` +✓ getElicitationGaps is a required member of GraphReaders; `npm run verify` fails to + type-check if the live composition root omits it (proven by the wiring existing — + the contract is the compiler). +✓ live reads object in brunch-tui.ts supplies selected-spec gap reads via the + existing workspace-store/queries seam (respecting the currentWorkspace.spec.id + getter — gaps follow spec switches). +✓ conservativeUncoveredGaps is deleted from the production path; if any test stub + replaces it, it is named as a stub and lives with the tests. +✓ Tier-2 real-boot assertion: a session over a seeded spec derives prompt/tool + legality from that spec's actual gap coverage — covered floor gaps unlock the + posture that the conservative floor previously kept locked. +✓ BrunchPromptContext documents which optional members are intended-optional + (context bundle, session) vs. must-wire, so the next optional hook is a + deliberate choice. +``` + +### Verification Approach + +``` +- Inner: type-level enforcement (required member) + existing capability-readiness + unit tests unchanged. +- Middle: Tier-2 real-boot legality assertion (the ownership-axis oracle from the + ln-induct pass — live posture pinned through runBrunchTui, no harness substitution). +``` + +### Cross-cutting obligations + +``` +- Preserve D39-L sealed-profile boundary — gap reads observe; they do not let the + prompt path mutate. +- Multi-spec discipline: gap reads are selected-spec scoped; never workspace-global. +- Do not fold this into the continuity-chain cards' commits; same branch, separate + commit-sized slice after them (shared brunch-tui.ts write path). +``` + +### Expected touched paths (tentative) + +``` +src/app/ +└── brunch-tui.ts ~ +src/.pi/extensions/graph/ +└── index.ts ~ +src/.pi/extensions/system-prompts/ +├── index.ts ~ +└── index.test.ts ? +src/dev/ +└── tier-2-harness.test.ts ~ +src/graph/ +└── workspace-store.ts ? +``` diff --git a/memory/cards/dev--review-fix-sweep.md b/memory/cards/dev--review-fix-sweep.md new file mode 100644 index 00000000..772291ec --- /dev/null +++ b/memory/cards/dev--review-fix-sweep.md @@ -0,0 +1,109 @@ +# Review-fix sweep — localized bot-flagged defects, fixed at top of stack + +Frontier: n/a (dev hygiene; defects originated on PRs #189/#195/#196/#203/#204) +Status: active +Mode: single +Created: 2026-06-11 + +> Sequencing: LAST of the four review-fix work items on `ln/fe-847-turn-boundary-closure` +> (shares `src/app/brunch-tui.ts` with the continuity-chain and gap-legality cards — +> do not build in parallel with them). Each fix is an independent commit-sized edit +> inside a settled seam; no finding here changes a decision or invariant. + +## Objective + +Retire every remaining localized defect from the 2026-06-11 ln-induct pass over stack PR comments, so the stack merges without known bot-flagged correctness or contract nits. + +### Light-card cold-start reads + +``` +- memory/SPEC.md — None load-bearing (D35-L startup-header drift is explicitly + EXCLUDED from this sweep; it goes through ln-sync) +- memory/PLAN.md — category concern: dev hygiene on the FE-847 closure branch +- docs/praxis/pi-types.md — before the duplicate-Component fix (typing over pi APIs) +- Original bot comment text (only if a fix needs more context than the acceptance + line gives): the unresolved review threads on PRs #189/#195/#196/#203/#204 via gh +``` + +### Acceptance Criteria + +Each line is one independent fix; verify and commit in small groups. + +``` +✓ brunch-tui env scoping: applyBrunchOfflineDefault sets PI_SKIP_VERSION_CHECK ??= '1' + alongside PI_OFFLINE (or, if version-check noise is judged not real, the save/restore + ceremony for it is deleted instead — pick one, no half-state); the unused `dev` + param on runWithScopedBrunchOfflineDefault is removed (check call sites first); + both brunch-tui.test.ts env cases assert the chosen PI_SKIP_VERSION_CHECK behavior. +✓ chrome-header: the expand affordance is either reachable (input/shortcut wired to + setExpanded) or the expanded content + "more" copy is removed — no advertised + unwired behavior; the logo render respects truecolor detection consistent with the + workspace dialog (reuse its detection, do not duplicate it). +✓ commands extension: a runtime posture switch immediately refreshes the footer + (render request / runtime-state publish after the switch); the appendCustomEntry + adapter returns the real entry id (or the helper's contract is changed to void if + no caller needs the id — no silent '' placeholder); /brunch:mode messages echo the + actual current/requested mode and list supported modes from the canonical enum, + no hardcoded 'elicit'/'execute' strings. +✓ runtime-posture/axis-picker.ts and tui-lab/index.ts import the pi-tui Component + type instead of redeclaring local Component interfaces (per docs/praxis/pi-types.md). +✓ seed-fixtures CLI: runSeedFixturesCli honors its Promise contract — semantic + failures (unknown seed, unreadable fixture, executor errors) are caught at the CLI + boundary and return usage/error + nonzero exit, never a stack trace; the + brunch.test.ts seeding call asserts the returned exit code. (Partially addressed + already — verify current behavior before patching.) +✓ web: DrawerCard initializes expanded to false when it cannot toggle (defaultExpanded + only honored when canToggle); structured-list-view uses an imported ReactNode type, + no bare React namespace reference. +``` + +### Verification Approach + +``` +- Inner: npm run verify per commit group; targeted unit tests where the fix is + behavioral (env scoping, CLI exit codes, DrawerCard state init). +- Middle: none required — all seams already carry behavioral coverage. +``` + +### Cross-cutting obligations + +``` +- D35-L startup-header behavior and the stale tooling--runtime-state-commands.md card + are OUT of scope here — they are canonical-doc reconciliation, routed to ln-sync. +- Mode/strategy/lens strings come from the canonical vocabulary modules, not new + literals (runtime-vocab-leaf direction, D73-L). +``` + +### Assumption dependency + +None — every fix sits inside a settled seam with named current rationale. + +### Expected touched paths (tentative) + +``` +src/app/ +├── brunch-tui.ts ~ +├── brunch-tui.test.ts ~ +└── brunch.test.ts ~ +src/.pi/ +├── brunch-pi-settings.ts ~ +├── components/ +│ ├── chrome-header.ts ~ +│ └── runtime-posture/axis-picker.ts ~ +└── extensions/ + ├── commands/index.ts ~ + ├── chrome/index.ts ? + └── tui-lab/index.ts ~ +src/graph/ +├── seed-fixtures.ts ~ +└── seed-fixtures.test.ts ? +src/web/ +├── components/drawer-card.tsx ~ +└── features/graph/structured-list-view.tsx ~ +``` + +### Promotion checklist + +All answers no — stays light. (The only near-trip: the entry-id contract fix touches +a helper used by the continuity seam; resolved by fixing the adapter to honor the +existing contract rather than changing the contract.) diff --git a/memory/cards/elicitation-gaps-remodel--predicate-hardening.md b/memory/cards/elicitation-gaps-remodel--predicate-hardening.md new file mode 100644 index 00000000..06833559 --- /dev/null +++ b/memory/cards/elicitation-gaps-remodel--predicate-hardening.md @@ -0,0 +1,133 @@ +# Gap-predicate hardening — every accepted arm has semantics or is rejected + +Frontier: elicitation-gaps-remodel +Status: active +Mode: single +Created: 2026-06-11 + +> Sequencing: builds on `ln/fe-847-turn-boundary-closure` after +> `capability-readiness--live-gap-legality.md` (disjoint write paths, but the +> Tier-2 legality assertion there is worth having green before reshaping the +> substrate beneath it). User-routed here by the 2026-06-11 ln-induct pass; +> defects originated on PRs #197/#201, fixed at top of stack, no restack. + +## Orientation + +- Seam: the `GapPredicate` tagged union owned by `src/graph/schema/elicitation-gaps.ts`, dispatched by `validateGapPredicate` (`command-executor.ts`), `deriveGapCoverage` / `rowToElicitationGap` (`queries.ts`), seeding (`command-executor.ts`), and the drizzle 0004 migration. +- Frontier: `elicitation-gaps-remodel` / `gaps-node-kind-reference` (both done) — this card closes dark-variant and dual-encoding holes those frontiers left. +- Current faults (verified at HEAD): `field`/`coverage` predicates are creatable (validator checks kind membership only), derive coverage 0 forever, and cannot be hand-answered (non-`manual` `answered` is rejected) — a permanently-unanswerable obligation, silently. `rowToElicitationGap` trusts `row.predicate` JSON to agree with the `predicate_kind` column and `refers_to`; migration 0004 copied legacy predicate JSON verbatim under remapped `refers_to`, demonstrating the divergence. +- Posture: earned (inherited from `elicitation-gaps-remodel`) — closure moves on a settled model; the one micro-decision (presence granularity) is recorded below, not an empirical unknown. + +## Target Behavior + +Every `GapPredicate` arm accepted by `CommandExecutor` either has working coverage derivation or is rejected loudly at the boundary, and a stored gap row cannot carry internally-inconsistent predicate facts. + +### Full-card cold-start reads + +``` +- memory/SPEC.md — D65-L (gap obligation model), D75-L (node-kind reference), D63-L (basis), + I30-L (disposition capture); A27-L (predicate expressibility) +- memory/PLAN.md — frontiers: elicitation-gaps-remodel, gaps-node-kind-reference (Frontier Definitions) +- src/graph/schema/elicitation-gaps.ts — the union and its arms +- src/graph/command-executor.ts — validateGapPredicate, seeding, disposition rules +- src/graph/queries.ts — deriveGapCoverage, derivePresenceCoverage, rowToElicitationGap +- .pi/POSTURE.md — migration: free-rewrite (governs the 0004 decision) +``` + +### Boundary Crossings + +``` +→ src/graph/command-executor.ts (validation + seeding boundary) +→ src/graph/schema/elicitation-gaps.ts (union ownership; semantics owner lands here or adjacent) +→ src/graph/queries.ts (derivation + row hydration) +→ drizzle/ (regenerated migration + journal, free-rewrite posture) +``` + +### Risks and Assumptions + +``` +- RISK: rejecting field/coverage at the boundary breaks a caller that already + creates them. → MITIGATION: verified at HEAD that seeds and the prompt fallback + construct only presence; sweep remaining createGap callers before landing. +- RISK: regenerating migration 0004 under free-rewrite invalidates teammates' + applied local DBs. → MITIGATION: that is the documented posture (.pi/POSTURE.md + migration: free-rewrite); reseed is the supported recovery. Do NOT add + forward-migration ceremony. +- ASSUMPTION: capture-reflection / elicitation-driver (future frontier) will want + situated same-kind gaps; the granularity decision below must not block them. + → IMPACT IF FALSE: an over-tight uniqueness rule would need loosening — one + validator branch, cheap. + → VALIDATE: decision recorded here keeps `manual` open for situated gaps. + → [→ ln-sync should reconcile the decided contract into SPEC D65-L/D75-L] +``` + +**Recorded micro-decision (presence granularity, from ln-induct lens "coarse +presence aliasing"):** a `presence` predicate is a *kind-floor* obligation — +derivation counts nodes of the kind, so two open presence gaps for the same +`nodeKind` would alias (one node answers both). Therefore: `validateGapPredicate` +rejects creating a presence gap when an open presence gap for the same +`(specId, nodeKind)` already exists. Situated same-kind obligations use `manual` +(today) or `field`/`coverage` (when their derivation exists). Reconcile this +contract into SPEC via the planned ln-sync pass. + +### Posture check (earned) + +- **Closes:** the dark-variant ambiguity — whether `field`/`coverage` are supported (they are not, yet) — and the dual-encoding drift between `predicate_kind`, `predicate` JSON, and `refers_to`. +- **Locks in:** the invariant that predicate semantics have exactly one owner: one exhaustive, `never`-checked dispatch that validation and derivation both ride, so adding a union arm without semantics fails to compile. +- **Deletes / retires:** the in-place-rewritten 0004 migration (regenerated clean under free-rewrite), and the validator's silent acceptance of unimplemented arms. + +### Acceptance Criteria + +``` +✓ One exhaustive switch over GapPredicate['kind'] (with a never check) is the single + owner of per-arm validate + derive semantics; command-executor and queries both + ride it; adding an arm without semantics is a compile error. +✓ createGap with a field or coverage predicate returns a structured diagnostic + ("predicate kind not yet supported"), not a persisted row; presence and manual + are deep-validated (presence: valid nodeKind/band, minimum >= 1; manual: shape). +✓ createGap with a presence predicate duplicating an open presence gap for the same + (specId, nodeKind) returns a structured diagnostic naming the existing gap. +✓ rowToElicitationGap derives predicate_kind from the parsed JSON (single source) or + fails loudly on column/JSON mismatch — a hand-corrupted row cannot hydrate into a + silently-wrong gap; pick the single-source option unless the column is load-bearing + for SQL filtering. +✓ Migration 0004 + seeds are regenerated coherently (refers_to consistent with + predicate.nodeKind in every seeded/migrated row); no forward-migration shim exists. +✓ npm run verify green, including a seeded-spec round-trip proving floor gaps still + derive coverage live from the graph (existing behavior preserved). +``` + +### Verification Approach + +``` +- Inner: exhaustiveness is compiler-enforced; unit tests per arm (reject-unimplemented, + presence dedup, manual disposition path unchanged); row-hydration consistency test + with a deliberately mismatched fixture row. +- Middle: CommandExecutor create/read round-trip over a fresh DB from the regenerated + migration + seeds (the migration itself is the fixture). +``` + +### Cross-cutting obligations + +``` +- Anti-shadowing (D65-L): the gaps table holds obligation/disposition/meta only; + domain content stays in the graph. +- All mutations stay on the CommandExecutor spec-local {specId, lsn}/change_log seam. +- Pre-release free-rewrite posture: regenerate, do not preserve the backlog-era or + inconsistent migrated row shapes. +``` + +### Expected touched paths (tentative) + +``` +src/graph/ +├── command-executor.ts ~ +├── command-executor.test.ts ~ +├── queries.ts ~ +├── queries.test.ts ~ +└── schema/ + └── elicitation-gaps.ts ~ +drizzle/ +├── 0004_gaps_node_kind_reference.sql ~ (regenerated) +└── meta/_journal.json ? +``` diff --git a/memory/cards/kick-and-context-seeding--honest-origination.md b/memory/cards/kick-and-context-seeding--honest-origination.md index 972b9db9..9d08ce54 100644 --- a/memory/cards/kick-and-context-seeding--honest-origination.md +++ b/memory/cards/kick-and-context-seeding--honest-origination.md @@ -23,8 +23,7 @@ A real new-session boot seeds context and starts an assistant-originated first t ### Light-card cold-start reads - `memory/SPEC.md` — D76-L, D78-L, I45-L, I46-L, I47-L -- `memory/PLAN.md` — frontier: `kick-and-context-seeding` -- `HANDOFF.md` — FE-847 volatile sequencing and edge-case list +- `memory/PLAN.md` — frontier: `kick-and-context-seeding` (definition + Context §Turn-boundary choreography carry the edge-case list) - `src/dev/README.md` — Tier-2 harness ownership ledger - `src/session/README.md` — origination ownership under `start-assistant-turn.ts` @@ -84,6 +83,8 @@ Resume boot classifies the pre-reconcile conversational debt correctly across co ✓ Crash-after-notice-before-provider still kicks when the underlying debt is unresolved, while `request_*` / system leaves remain idle. +✓ `request_*` tail classification is proven against the real exchange tool-result envelope — the fixture carries a genuine `request_*` result as the exchanges extension actually writes it (`status: 'answered' | 'cancelled' | 'unavailable'` wherever it really lives in `details`/`data`), not a hand-built message shape; this settles the PR #202 question of whether `responseStatus` in `start-assistant-turn.ts` reads the envelope where real results carry it. + ✓ AUTO remains offer-first; only an explicit `freestyle` pin idles the assistant. ✓ The remaining skipped I46/I47 origination rows are live after this slice. diff --git a/memory/cards/turn-boundary-reconciliation--continuity-chain.md b/memory/cards/turn-boundary-reconciliation--continuity-chain.md index f6773c71..3921dd33 100644 --- a/memory/cards/turn-boundary-reconciliation--continuity-chain.md +++ b/memory/cards/turn-boundary-reconciliation--continuity-chain.md @@ -13,6 +13,7 @@ Created: 2026-06-11 - Main risk: closing I45/I47 may require evolving the Tier-2 harness and compaction anchor contract, not merely unskipping tests; keep the one-writer seam intact and do not reintroduce ad hoc continuity insertion points. - Cross-cutting obligations: `prepareNextTurn` stays the single continuity writer, `before_provider_request` stays a guard only, continuity facts remain Brunch custom entries, watermark comparisons stay `{specId, lsn}` only, and the latest watermark carrier must survive compaction/resume. - Posture: proving (inherited from `turn-boundary-reconciliation`) +- 2026-06-11 ln-induct fold (PR #201/#202 review comments, user-routed to this branch): the live pipeline currently diverges from the tested helpers at three points — `registerBrunchContinuityGuard` plain-throws instead of delegating to `guardBeforeProviderRequest` (D77-L append-once-then-retry), and `prepareNextTurnForGraph` passes neither `mentions` nor `drains`, so staleness hints and drain delivery are dead live. Cards 1 and 2 below now name these explicitly; they were already implicitly required by the "prove through the real path" acceptance. ## Card 1 - Flip the I45 watermark/world-update scaffold live through Tier-2 @@ -23,8 +24,7 @@ The real Tier-2 boot/resume harness proves assistant-visible watermark and `worl ### Light-card cold-start reads - `memory/SPEC.md` — D76-L, D77-L, I4-L, I45-L, I47-L -- `memory/PLAN.md` — frontier: `turn-boundary-reconciliation` -- `HANDOFF.md` — FE-847 volatile sequencing and the scaffold edge-case list +- `memory/PLAN.md` — frontier: `turn-boundary-reconciliation` (definition + Context §Turn-boundary choreography carry the scaffold edge-case list) - `src/dev/README.md` — Tier-2 harness ownership ledger - `src/session/README.md` — turn-boundary choreography seam ownership - `src/projections/README.md` — assistant-visible-watermark row and continuity classifier ownership @@ -37,6 +37,8 @@ The real Tier-2 boot/resume harness proves assistant-visible watermark and `worl ✓ Any helper or lower-fidelity test kept after this slice still proves a local derivation unavailable from Tier-2; duplicate wiring-only proof is retired. +✓ The live `before_provider_request` hook delegates to `guardBeforeProviderRequest` (append-once-then-retry per D77-L); a raised error remains only for drift that survives the single retry, and the Tier-2 proof covers the recoverable-drift path, not just the clean path. + ### Verification Approach - Inner: retain focused unit/property tests for projection and `prepareNextTurn` local semantics. @@ -90,6 +92,8 @@ Submitting a user message through the real session path appends stable-id `brunc ✓ The mid-level proof owns this behavior; any older mock-only assertion kept after the slice still proves a narrower local helper rather than the same submit-path wiring. +✓ The live adapter (`prepareNextTurnForGraph` in `src/.pi/brunch-pi-extensions.ts`) threads transcript-projected `mentions` and side-task/reviewer `drains` into `prepareNextTurn` — the staleness and drain seams run in the production pipeline, not only in direct-call tests (closes the dead-seam finding from PR #202). + ### Verification Approach - Inner: keep local mention-ledger tests for parsing and staleness derivation. @@ -110,6 +114,8 @@ None. src/dev/ ├── tier-2-harness.ts ~ └── tier-2-harness.test.ts ~ +src/.pi/ +└── brunch-pi-extensions.ts ~ src/rpc/methods/ └── session.ts ? src/session/ From 555f4f141e654d94cea7147f91733dead420018b Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:05:57 +0200 Subject: [PATCH 03/32] Flip I45 continuity guard live --- ...undary-reconciliation--continuity-chain.md | 2 + src/.pi/__tests__/extension-registry.test.ts | 17 +- src/.pi/brunch-pi-extensions.ts | 14 +- src/dev/tier-2-harness.test.ts | 220 +++++++++++++++++- 4 files changed, 235 insertions(+), 18 deletions(-) diff --git a/memory/cards/turn-boundary-reconciliation--continuity-chain.md b/memory/cards/turn-boundary-reconciliation--continuity-chain.md index 3921dd33..0ee9a310 100644 --- a/memory/cards/turn-boundary-reconciliation--continuity-chain.md +++ b/memory/cards/turn-boundary-reconciliation--continuity-chain.md @@ -17,6 +17,8 @@ Created: 2026-06-11 ## Card 1 - Flip the I45 watermark/world-update scaffold live through Tier-2 +Status: done (2026-06-11) + ### Objective The real Tier-2 boot/resume harness proves assistant-visible watermark and `worldUpdate` behavior across seed, overview, foreign-write, and same-session-capture cases by replacing the skipped I45 scaffold rows with live assertions. diff --git a/src/.pi/__tests__/extension-registry.test.ts b/src/.pi/__tests__/extension-registry.test.ts index 38e71b6b..d47da4a3 100644 --- a/src/.pi/__tests__/extension-registry.test.ts +++ b/src/.pi/__tests__/extension-registry.test.ts @@ -161,10 +161,19 @@ describe('Brunch explicit Pi extension registry', () => { expect(appended).toHaveLength(1); graphLsn = 4; - await expect(events.get('before_provider_request')?.[0]?.({}, { sessionManager })).rejects.toThrow( - /prepareNextTurn must run before prompt composition/, - ); - expect(appended).toHaveLength(1); + await expect( + events.get('before_provider_request')?.[0]?.({}, { sessionManager }), + ).resolves.toBeUndefined(); + expect(appended).toEqual([ + { + customType: 'worldUpdate', + data: expect.objectContaining({ specId: 1, currentLsn: 3, changedSinceLsn: 0 }), + }, + { + customType: 'worldUpdate', + data: expect.objectContaining({ specId: 1, currentLsn: 4, changedSinceLsn: 3 }), + }, + ]); }); it('does not retain the filesystem-discovery product-extension protocol', async () => { diff --git a/src/.pi/brunch-pi-extensions.ts b/src/.pi/brunch-pi-extensions.ts index 861b3c82..0243d64a 100644 --- a/src/.pi/brunch-pi-extensions.ts +++ b/src/.pi/brunch-pi-extensions.ts @@ -6,6 +6,7 @@ import { import { formatGraphNodeCode } from '../graph/schema/nodes.js'; import { + guardBeforeProviderRequest, prepareNextTurn, type GraphChangeItem, type PrepareNextTurnResult, @@ -198,12 +199,13 @@ function createPrepareNextTurnContinuityStep(graph: BrunchGraphDeps): BrunchSess function registerBrunchContinuityGuard(pi: ExtensionAPI, graph: BrunchGraphDeps): void { pi.on('before_provider_request', async (_event, ctx) => { - const result = prepareNextTurnForGraph(graph, ctx.sessionManager as SessionManager); - if (result.entriesToAppend.length > 0) { - throw new Error( - 'Continuity drift remained before provider request; prepareNextTurn must run before prompt composition.', - ); - } + const sessionManager = ctx.sessionManager as SessionManager; + await guardBeforeProviderRequest({ + prepare: () => prepareNextTurnForGraph(graph, sessionManager), + append: (entry) => { + sessionManager.appendCustomEntry(entry.customType, entry.data); + }, + }); }); } diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index 49c9a151..f2ebb565 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -1,7 +1,9 @@ import { type ToolDefinition } from '@earendil-works/pi-coding-agent'; import { describe, expect, it } from 'vitest'; +import { openWorkspaceGraphRuntime } from '../graph/index.js'; import { assistantMessage, userMessage } from '../probes/test-helpers.js'; +import { projectAssistantVisibleWatermark } from '../projections/session/assistant-visible-watermark.js'; import { projectBrunchAgentState } from '../projections/session/runtime-state.js'; import { BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE } from '../session/runtime-state.js'; import { @@ -132,14 +134,181 @@ describe('FE-847 Tier-2 real boot harness', () => { }); }); -describe.skip('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', () => { - it('seed and full-overview snapshots advance the watermark while narrow getNodes/queryNodes reads do not'); - it( - 'worldUpdate emits only the strict-greater set when current_lsn exceeds the assistant-visible watermark', - ); - it('bare LSNs are never compared across specs; watermark comparisons use {specId, lsn}'); - it('a foreign write between snapshot read and seed insertion is not masked by the seed'); - it('same-session capture is surfaced by the next worldUpdate rather than swallowed as already visible'); +describe('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', () => { + it('seed and full-overview snapshots advance the watermark while narrow getNodes/queryNodes reads do not', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + const graph = await openWorkspaceGraphRuntime(boot.cwd); + const first = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind: 'goal', + title: 'Narrow-read goal', + }); + if (first.status !== 'success') throw new Error('Failed to create Tier-2 graph fixture node'); + + await executeReadGraph(boot.runtime.session, { mode: 'list_by_kind', kinds: ['goal'], show: 'all' }); + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + const afterNarrowRead = boot.runtime.session.sessionManager.getEntries(); + expect(customEntries(afterNarrowRead, 'worldUpdate')).toEqual([ + expect.objectContaining({ + data: expect.objectContaining({ + specId, + currentLsn: first.lsn, + changedSinceLsn: 0, + items: expect.arrayContaining([ + expect.objectContaining({ lsn: first.lsn, title: 'Narrow-read goal' }), + ]), + }), + }), + ]); + + await executeReadGraph(boot.runtime.session, { mode: 'overview', show: 'all' }); + const afterOverview = boot.runtime.session.sessionManager.getEntries(); + expect(projectAssistantVisibleWatermark(afterOverview, { specId })).toEqual({ specId, lsn: first.lsn }); + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')).toHaveLength(1); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('worldUpdate emits the strict-greater set through the live provider guard retry', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + boot.runtime.session.sessionManager.appendCustomEntry('brunch.context_seed', { + specId, + snapshotLsn: 1, + }); + const graph = await openWorkspaceGraphRuntime(boot.cwd); + const stale = graph.commandExecutor.createNode({ specId, plane: 'intent', kind: 'goal', title: 'Old' }); + const fresh = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind: 'requirement', + title: 'Fresh', + }); + if (stale.status !== 'success' || fresh.status !== 'success') { + throw new Error('Failed to create Tier-2 graph fixture nodes'); + } + + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')).toEqual([ + expect.objectContaining({ + data: expect.objectContaining({ + specId, + currentLsn: fresh.lsn, + changedSinceLsn: 1, + items: [expect.objectContaining({ lsn: stale.lsn }), expect.objectContaining({ lsn: fresh.lsn })], + }), + }), + ]); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('bare LSNs are never compared across specs; watermark comparisons use {specId, lsn}', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + boot.runtime.session.sessionManager.appendCustomEntry('brunch.context_seed', { + specId: specId + 1, + snapshotLsn: 99, + }); + const graph = await openWorkspaceGraphRuntime(boot.cwd); + const node = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind: 'goal', + title: 'Spec-local', + }); + if (node.status !== 'success') throw new Error('Failed to create Tier-2 graph fixture node'); + + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')[0]).toEqual( + expect.objectContaining({ + data: expect.objectContaining({ specId, changedSinceLsn: 0, currentLsn: node.lsn }), + }), + ); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('a foreign write between snapshot read and seed insertion is not masked by the seed', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + boot.runtime.session.sessionManager.appendCustomEntry('brunch.context_seed', { + specId, + snapshotLsn: 1, + }); + const graph = await openWorkspaceGraphRuntime(boot.cwd); + const node = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind: 'goal', + title: 'Foreign write after seed snapshot', + }); + if (node.status !== 'success') throw new Error('Failed to create Tier-2 graph fixture node'); + + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')[0]).toEqual( + expect.objectContaining({ + data: expect.objectContaining({ + specId, + changedSinceLsn: 1, + items: [expect.objectContaining({ title: 'Foreign write after seed snapshot' })], + }), + }), + ); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('same-session capture is surfaced by the next worldUpdate rather than swallowed as already visible', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + boot.runtime.session.sessionManager.appendCustomEntry('brunch.context_seed', { + specId, + snapshotLsn: 1, + }); + const graph = await openWorkspaceGraphRuntime(boot.cwd); + const node = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind: 'context', + title: 'Captured from submit', + }); + if (node.status !== 'success') throw new Error('Failed to create Tier-2 graph fixture node'); + + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')[0]).toEqual( + expect.objectContaining({ + data: expect.objectContaining({ + specId, + items: [expect.objectContaining({ title: 'Captured from submit' })], + }), + }), + ); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); }); describe.skip('FE-847 coverage-first scaffold — I46-L honest origination', () => { @@ -169,6 +338,41 @@ async function readSessionContextDetails(session: { return result.details; } +async function readSessionContextSpecId(session: { + getToolDefinition(name: string): ToolDefinition | undefined; + sessionManager: unknown; +}): Promise { + const details = await readSessionContextDetails(session); + if (!isRecord(details) || typeof details.specId !== 'number') { + throw new Error('read_session_context did not return a numeric specId'); + } + return details.specId; +} + +async function executeReadGraph( + session: { getToolDefinition(name: string): ToolDefinition | undefined; sessionManager: unknown }, + params: Record, +): Promise { + const tool = session.getToolDefinition('read_graph'); + if (!tool) throw new Error('read_graph tool is not registered'); + return tool.execute('tier-2-read-graph', params, undefined, undefined, { + sessionManager: session.sessionManager, + } as never); +} + +function customEntries(entries: readonly unknown[], customType: string): ReadonlyArray<{ data: unknown }> { + return entries.filter( + (entry): entry is { customType: string; data: unknown } => + typeof entry === 'object' && + entry !== null && + (entry as { customType?: unknown }).customType === customType, + ); +} + +function isRecord(value: unknown): value is Record { + return typeof value === 'object' && value !== null; +} + async function readWorkspaceContextMarkdownFiles(session: { getToolDefinition(name: string): ToolDefinition | undefined; sessionManager: unknown; From f4d3dde1945d8dfee0a1188455783eb27e9f276d Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:09:39 +0200 Subject: [PATCH 04/32] Thread mention continuity through live submit path --- ...undary-reconciliation--continuity-chain.md | 2 + src/.pi/__tests__/extension-registry.test.ts | 45 +++++++++++++++++++ src/.pi/brunch-pi-extensions.ts | 30 ++++++++++--- src/rpc/handlers.test.ts | 29 ++++++++++++ src/session/mention-ledger.test.ts | 10 +++++ src/session/mention-ledger.ts | 23 ++++++++++ 6 files changed, 132 insertions(+), 7 deletions(-) diff --git a/memory/cards/turn-boundary-reconciliation--continuity-chain.md b/memory/cards/turn-boundary-reconciliation--continuity-chain.md index 0ee9a310..f7579d6b 100644 --- a/memory/cards/turn-boundary-reconciliation--continuity-chain.md +++ b/memory/cards/turn-boundary-reconciliation--continuity-chain.md @@ -75,6 +75,8 @@ src/.pi/ ## Card 2 - Prove mention resolution and staleness through the real submit path +Status: done (2026-06-11) + ### Objective Submitting a user message through the real session path appends stable-id `brunch.mention` facts at submit time and surfaces only genuinely stale mentions at the next turn boundary. diff --git a/src/.pi/__tests__/extension-registry.test.ts b/src/.pi/__tests__/extension-registry.test.ts index d47da4a3..d6a58a57 100644 --- a/src/.pi/__tests__/extension-registry.test.ts +++ b/src/.pi/__tests__/extension-registry.test.ts @@ -176,6 +176,51 @@ describe('Brunch explicit Pi extension registry', () => { ]); }); + it('threads transcript mentions and continuity drains into the live prepareNextTurn adapter', async () => { + const appended: Array<{ customType: string; data: unknown }> = [ + { customType: 'brunch.context_seed', data: { specId: 1, snapshotLsn: 1 } }, + { customType: 'brunch.mention', data: { entityId: '10', handle: 'G1', seenLsn: 1 } }, + ]; + const events = new Map Promise | void>>(); + const sessionManager = { + getEntries: () => appended.map((entry) => ({ type: 'custom', ...entry })), + appendCustomEntry(customType: string, data: unknown) { + appended.push({ customType, data }); + }, + }; + + await createBrunchPiExtensions(brunchChromeFixture, undefined, { + coordinator: {} as never, + continuityDrains: () => [{ kind: 'side_task', id: 'side-1', summary: 'Side task done' }], + graph: { + specId: 1, + commandExecutor: {} as never, + reads: { + queryGraph: () => + ({ + lsn: 2, + nodes: [{ id: 10, kind: 'goal', title: 'Live goal', updatedAtLsn: 2 }], + edges: [], + }) as never, + getNodes: () => [], + resolveNodeCode: () => undefined, + }, + }, + })(recordingApiWithEvents(events)); + + await events.get('before_agent_start')?.[0]?.({}, { sessionManager }); + + expect(appended).toEqual( + expect.arrayContaining([ + { + customType: 'brunch.mention_staleness_hint', + data: { entityId: '10', handle: 'G1', seenLsn: 1, currentLsn: 2 }, + }, + { customType: 'brunch.side_task_result', data: { id: 'side-1', summary: 'Side task done' } }, + ]), + ); + }); + it('does not retain the filesystem-discovery product-extension protocol', async () => { const shell = await readFile(join(projectRoot(), 'src/.pi/brunch-pi-extensions.ts'), 'utf8'); const discoveryExport = ['discover', 'BrunchProductExtensionEntries'].join(''); diff --git a/src/.pi/brunch-pi-extensions.ts b/src/.pi/brunch-pi-extensions.ts index 0243d64a..d6fc0c7b 100644 --- a/src/.pi/brunch-pi-extensions.ts +++ b/src/.pi/brunch-pi-extensions.ts @@ -5,9 +5,11 @@ import { } from '@earendil-works/pi-coding-agent'; import { formatGraphNodeCode } from '../graph/schema/nodes.js'; +import { mentionFactsFromEntries } from '../session/mention-ledger.js'; import { guardBeforeProviderRequest, prepareNextTurn, + type ContinuityDrain, type GraphChangeItem, type PrepareNextTurnResult, } from '../session/prepare-next-turn.js'; @@ -105,6 +107,7 @@ export interface BrunchPiExtensionsOptions extends BrunchCommandsOptions { graph?: BrunchGraphDeps; promptContext?: BrunchPromptContextProvider; introspection?: BrunchPiIntrospectionOptions; + continuityDrains?: () => readonly ContinuityDrain[]; } export interface BrunchPiIntrospectionOptions extends BrunchIntrospectionOptions { @@ -138,13 +141,15 @@ export function createBrunchPiExtensions( const devAllowedToolNames = introspectionOptions?.enabled ? [BRUNCH_SESSION_QUERY_TOOL, BRUNCH_INTROSPECT_QUERY_TOOL] : undefined; - const continuityStep = options.graph ? createPrepareNextTurnContinuityStep(options.graph) : undefined; + const continuityStep = options.graph + ? createPrepareNextTurnContinuityStep(options.graph, options.continuityDrains) + : undefined; const extensions: BrunchProductExtensionRegistrar[] = [ (api) => { registerBrunchSessionBoundary(api, onSessionBoundary, { continuitySteps: continuityStep ? [continuityStep] : [], }); - if (options.graph) registerBrunchContinuityGuard(api, options.graph); + if (options.graph) registerBrunchContinuityGuard(api, options.graph, options.continuityDrains); }, (api) => registerBrunchChrome(api, chrome), registerBrunchBranchPolicyHandlers, @@ -188,20 +193,27 @@ export function createBrunchPiExtensions( }; } -function createPrepareNextTurnContinuityStep(graph: BrunchGraphDeps): BrunchSessionBoundaryPipelineStep { +function createPrepareNextTurnContinuityStep( + graph: BrunchGraphDeps, + getContinuityDrains: (() => readonly ContinuityDrain[]) | undefined, +): BrunchSessionBoundaryPipelineStep { return ({ sessionManager }) => { - const result = prepareNextTurnForGraph(graph, sessionManager); + const result = prepareNextTurnForGraph(graph, sessionManager, getContinuityDrains); for (const entry of result.entriesToAppend) { sessionManager.appendCustomEntry(entry.customType, entry.data); } }; } -function registerBrunchContinuityGuard(pi: ExtensionAPI, graph: BrunchGraphDeps): void { +function registerBrunchContinuityGuard( + pi: ExtensionAPI, + graph: BrunchGraphDeps, + getContinuityDrains: (() => readonly ContinuityDrain[]) | undefined, +): void { pi.on('before_provider_request', async (_event, ctx) => { const sessionManager = ctx.sessionManager as SessionManager; await guardBeforeProviderRequest({ - prepare: () => prepareNextTurnForGraph(graph, sessionManager), + prepare: () => prepareNextTurnForGraph(graph, sessionManager, getContinuityDrains), append: (entry) => { sessionManager.appendCustomEntry(entry.customType, entry.data); }, @@ -212,13 +224,17 @@ function registerBrunchContinuityGuard(pi: ExtensionAPI, graph: BrunchGraphDeps) function prepareNextTurnForGraph( graph: BrunchGraphDeps, sessionManager: SessionManager, + getContinuityDrains: (() => readonly ContinuityDrain[]) | undefined, ): PrepareNextTurnResult { const snapshot = graph.reads.queryGraph(undefined, { visibility: 'all' }); + const entries = sessionManager.getEntries(); return prepareNextTurn({ specId: graph.specId, currentLsn: snapshot.lsn, - entries: sessionManager.getEntries(), + entries, changes: graphChangesFromSnapshot(graph.specId, snapshot), + mentions: mentionFactsFromEntries(entries), + drains: getContinuityDrains?.() ?? [], }); } diff --git a/src/rpc/handlers.test.ts b/src/rpc/handlers.test.ts index 26b20edf..6e9aabd1 100644 --- a/src/rpc/handlers.test.ts +++ b/src/rpc/handlers.test.ts @@ -1618,6 +1618,35 @@ describe('JSON-RPC handlers', () => { expect(after).toContain('Keep ordinary messages on the same selected-spec capture path.'); }); + it('resolves stable graph mentions at submit time for the selected session transcript', async () => { + const cwd = await mkdtemp(join(tmpdir(), 'brunch-rpc-message-mentions-')); + const coordinatorInstance = createWorkspaceSessionCoordinator({ cwd }); + const workspace = await coordinatorInstance.createSetupSession({ specTitle: 'Mention spec' }); + const graph = await openWorkspaceGraphRuntime(cwd); + const node = graph.commandExecutor.createNode({ + specId: workspace.spec.id, + plane: 'intent', + kind: 'goal', + title: 'Mentioned goal', + }); + if (node.status !== 'success') throw new Error('failed to create mention fixture node'); + const handlers = createRpcHandlers({ coordinator: coordinatorInstance, cwd }); + + await expect( + handlers.handle({ + jsonrpc: '2.0', + id: 283, + method: 'session.submitMessage', + params: { text: 'Please check #G1 before the next turn.' }, + }), + ).resolves.toMatchObject({ result: { status: 'accepted', interruption: false } }); + + const sessionText = await readFile(workspace.session.file, 'utf8'); + expect(sessionText).toContain('brunch.mention'); + expect(sessionText).toContain('Mentioned goal'); + expect(sessionText).toContain(`"seenLsn":${node.lsn}`); + }); + it('rejects ordinary messages while a structured exchange is pending unless they are explicit interruptions', async () => { const cwd = await mkdtemp(join(tmpdir(), 'brunch-rpc-message-pending-')); const coordinatorInstance = createWorkspaceSessionCoordinator({ cwd }); diff --git a/src/session/mention-ledger.test.ts b/src/session/mention-ledger.test.ts index 533d06fb..f4de8398 100644 --- a/src/session/mention-ledger.test.ts +++ b/src/session/mention-ledger.test.ts @@ -3,6 +3,7 @@ import { describe, expect, it } from 'vitest'; import { graphHandlesInText, mentionEntry, + mentionFactsFromEntries, resolveMentionFacts, stalenessEntriesForMentions, } from './mention-ledger.js'; @@ -37,6 +38,15 @@ describe('mention ledger', () => { }); }); + it('projects mention facts from transcript custom entries', () => { + expect( + mentionFactsFromEntries([ + { type: 'custom', customType: 'brunch.mention', data: { entityId: '101', handle: 'G1', seenLsn: 4 } }, + { type: 'custom', customType: 'brunch.mention', data: { entityId: 102, handle: 'G2', seenLsn: 4 } }, + ]), + ).toEqual([{ entityId: '101', handle: 'G1', seenLsn: 4 }]); + }); + it('emits staleness only when the entity changed since it was last seen', () => { const current = new Map([ ['101', 7], diff --git a/src/session/mention-ledger.ts b/src/session/mention-ledger.ts index 4ed70add..d9d82f44 100644 --- a/src/session/mention-ledger.ts +++ b/src/session/mention-ledger.ts @@ -1,4 +1,5 @@ import type { WorkspaceGraphRuntime } from '../graph/workspace-store.js'; +import type { TranscriptEntryLike } from '../projections/session/continuity-entry-classifier.js'; export interface MentionFact { readonly entityId: string; @@ -56,6 +57,24 @@ export function mentionEntry(fact: MentionFact): MentionEntry { return { type: 'custom', customType: 'brunch.mention', data: fact }; } +export function mentionFactsFromEntries(entries: readonly TranscriptEntryLike[]): readonly MentionFact[] { + return entries.flatMap((entry) => { + if (entry.customType !== 'brunch.mention' || !isRecord(entry.data)) return []; + const entityId = typeof entry.data.entityId === 'string' ? entry.data.entityId : undefined; + const handle = typeof entry.data.handle === 'string' ? entry.data.handle : undefined; + const seenLsn = typeof entry.data.seenLsn === 'number' ? entry.data.seenLsn : undefined; + if (!entityId || !handle || seenLsn === undefined) return []; + return [ + { + entityId, + handle, + seenLsn, + ...(typeof entry.data.title === 'string' ? { title: entry.data.title } : {}), + }, + ]; + }); +} + export function stalenessEntriesForMentions(options: { readonly mentions: readonly MentionFact[]; readonly currentByEntityId: ReadonlyMap; @@ -77,3 +96,7 @@ export function stalenessEntriesForMentions(options: { ]; }); } + +function isRecord(value: unknown): value is Record { + return typeof value === 'object' && value !== null; +} From 4804bfae1ce8c5a32115a4637be13170bde1bffd Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:12:11 +0200 Subject: [PATCH 05/32] Preserve watermark carriers across compaction --- memory/PLAN.md | 4 +- ...undary-reconciliation--continuity-chain.md | 174 ------------------ src/.pi/extensions/compaction/index.ts | 20 +- src/dev/tier-2-harness.test.ts | 31 +++- 4 files changed, 47 insertions(+), 182 deletions(-) delete mode 100644 memory/cards/turn-boundary-reconciliation--continuity-chain.md diff --git a/memory/PLAN.md b/memory/PLAN.md index e60dd30f..4f9cb4b2 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -169,7 +169,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `kick-and-context-seeding`). - **Kind:** structural / product mechanics (M7) -- **Status:** active (turn-boundary choreography; not POC-ship-critical) +- **Status:** done 2026-06-11 (turn-boundary choreography; not POC-ship-critical) - **Certainty:** proving - **Retires:** A4-L (the remaining "M7 still needs generated `worldUpdate` traces" subclaim) and A9-L (session-scoped `(entity_id, seen_lsn)` mention-ledger granularity is the right staleness grain). - **Depends on:** `dx-tier-2-harness` chassis + scaffold (same branch; the chassis is the oracle these slices assert through and supplies the topology stubs they fill). @@ -188,7 +188,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The `prepareNextTurn` reconciler and watermark projection land at their final homes (`src/session/` reconciler, `src/projections/session/` watermark) filling the FE-847 topology stubs; submit-time mention resolution at `session.submitMessage`; tool-result watermark stamping at the graph read/mutation adapters. - **Traceability:** D14-L, D15-L, D17-L, D37-L, D43-L, D49-L, D76-L, D77-L; A4-L, A9-L; I1-L, I4-L, I9-L, I45-L, I47-L. - **Design docs:** `memory/SPEC.md` D76-L–D77-L, I9-L, I45-L, I47-L; `src/session/README.md`; `src/projections/README.md`; `src/projections/session/runtime-state.ts`. -- **Current execution pointer:** Core S1-S3 mechanics landed on FE-847; the remaining builder entry is `memory/cards/turn-boundary-reconciliation--continuity-chain.md`, which closes the frontier by flipping the skipped Tier-2 I45/I47 scaffold live, proving mention resolution/staleness through the real submit path, and preserving the latest watermark carrier across compaction/resume. +- **Current execution pointer:** Done 2026-06-11 on FE-847. The Tier-2 I45 scaffold is live, the live provider guard delegates to `guardBeforeProviderRequest`, submit-time mention facts feed the live reconciler staleness path, side-task/reviewer drains are threaded through the adapter, and the compaction anchor contract preserves the latest watermark carrier family (`brunch.context_seed`, `brunch.graph_overview_snapshot`, `brunch.own_mutation`, `worldUpdate`). ### kick-and-context-seeding diff --git a/memory/cards/turn-boundary-reconciliation--continuity-chain.md b/memory/cards/turn-boundary-reconciliation--continuity-chain.md deleted file mode 100644 index f7579d6b..00000000 --- a/memory/cards/turn-boundary-reconciliation--continuity-chain.md +++ /dev/null @@ -1,174 +0,0 @@ -# Turn-Boundary Reconciliation Closure - -Frontier: turn-boundary-reconciliation -Status: active -Mode: chain -Created: 2026-06-11 - -## Orientation - -- Seam: FE-847 Tier-2 turn-boundary reconciliation over real boot/resume; the domain helpers exist, but the frontier still closes through skipped scaffold rows in `src/dev/tier-2-harness.test.ts`. -- Frontier: `turn-boundary-reconciliation`; assistant-visible watermark projection, `prepareNextTurn`, and mention-ledger mechanics landed, but the frontier is not done until Tier-2 and compaction invariants replace the scaffold. -- Volatile state: unit tests in `src/projections/session/assistant-visible-watermark.test.ts`, `src/session/prepare-next-turn.test.ts`, and `src/session/mention-ledger.test.ts` already prove local logic; the missing proof is end-to-end ownership through the real runtime and resume seams. -- Main risk: closing I45/I47 may require evolving the Tier-2 harness and compaction anchor contract, not merely unskipping tests; keep the one-writer seam intact and do not reintroduce ad hoc continuity insertion points. -- Cross-cutting obligations: `prepareNextTurn` stays the single continuity writer, `before_provider_request` stays a guard only, continuity facts remain Brunch custom entries, watermark comparisons stay `{specId, lsn}` only, and the latest watermark carrier must survive compaction/resume. -- Posture: proving (inherited from `turn-boundary-reconciliation`) -- 2026-06-11 ln-induct fold (PR #201/#202 review comments, user-routed to this branch): the live pipeline currently diverges from the tested helpers at three points — `registerBrunchContinuityGuard` plain-throws instead of delegating to `guardBeforeProviderRequest` (D77-L append-once-then-retry), and `prepareNextTurnForGraph` passes neither `mentions` nor `drains`, so staleness hints and drain delivery are dead live. Cards 1 and 2 below now name these explicitly; they were already implicitly required by the "prove through the real path" acceptance. - -## Card 1 - Flip the I45 watermark/world-update scaffold live through Tier-2 - -Status: done (2026-06-11) - -### Objective - -The real Tier-2 boot/resume harness proves assistant-visible watermark and `worldUpdate` behavior across seed, overview, foreign-write, and same-session-capture cases by replacing the skipped I45 scaffold rows with live assertions. - -### Light-card cold-start reads - -- `memory/SPEC.md` — D76-L, D77-L, I4-L, I45-L, I47-L -- `memory/PLAN.md` — frontier: `turn-boundary-reconciliation` (definition + Context §Turn-boundary choreography carry the scaffold edge-case list) -- `src/dev/README.md` — Tier-2 harness ownership ledger -- `src/session/README.md` — turn-boundary choreography seam ownership -- `src/projections/README.md` — assistant-visible-watermark row and continuity classifier ownership - -### Acceptance Criteria - -✓ The skipped Tier-2 rows for seed/full-overview carriers vs narrow reads, strict-greater `worldUpdate`, same-session capture surfacing, and foreign-write-during-seed all run live against the real boot/resume harness. - -✓ The proof uses `{specId, lsn}` and set semantics, not payload-order goldens or bare-LSN comparisons. - -✓ Any helper or lower-fidelity test kept after this slice still proves a local derivation unavailable from Tier-2; duplicate wiring-only proof is retired. - -✓ The live `before_provider_request` hook delegates to `guardBeforeProviderRequest` (append-once-then-retry per D77-L); a raised error remains only for drift that survives the single retry, and the Tier-2 proof covers the recoverable-drift path, not just the clean path. - -### Verification Approach - -- Inner: retain focused unit/property tests for projection and `prepareNextTurn` local semantics. -- Middle: flip the corresponding `src/dev/tier-2-harness.test.ts` I45 scaffold rows live through real boot/resume fixtures. - -### Cross-cutting obligations - -- Do not move watermark truth into stored mutable state. -- Same-session submit/capture writes must still surface by `worldUpdate` when they were not already assistant-visible. -- If the Tier-2 harness needs new helpers, keep them runtime-facing and delete-oriented rather than adding a parallel faux path. - -### Assumption dependency - -None — this slice is itself the frontier-closing proof for the remaining I45-L uncertainty. - -### Expected touched paths (tentative) - -```text -src/dev/ -├── tier-2-harness.ts ~ -└── tier-2-harness.test.ts ~ -src/session/ -├── prepare-next-turn.ts ? -└── prepare-next-turn.test.ts ? -src/projections/session/ -├── assistant-visible-watermark.ts ? -└── assistant-visible-watermark.test.ts ? -src/.pi/ -├── brunch-pi-extensions.ts ? -└── extensions/session/lifecycle.ts ? -``` - -## Card 2 - Prove mention resolution and staleness through the real submit path - -Status: done (2026-06-11) - -### Objective - -Submitting a user message through the real session path appends stable-id `brunch.mention` facts at submit time and surfaces only genuinely stale mentions at the next turn boundary. - -### Light-card cold-start reads - -- `memory/SPEC.md` — D14-L, D49-L, D77-L, I9-L, I45-L -- `memory/PLAN.md` — frontier: `turn-boundary-reconciliation` -- `src/session/README.md` — mention-ledger / turn-boundary ownership -- `src/rpc/README.md` — `session.submitMessage` ownership and transcript effects - -### Acceptance Criteria - -✓ A real submit path appends `brunch.mention` facts from stable graph ids at submit time, not autocomplete time or later reconciliation. - -✓ The next turn boundary emits `brunch.mention_staleness_hint` only for entities whose current LSN exceeds the stored `seen_lsn`. - -✓ The mid-level proof owns this behavior; any older mock-only assertion kept after the slice still proves a narrower local helper rather than the same submit-path wiring. - -✓ The live adapter (`prepareNextTurnForGraph` in `src/.pi/brunch-pi-extensions.ts`) threads transcript-projected `mentions` and side-task/reviewer `drains` into `prepareNextTurn` — the staleness and drain seams run in the production pipeline, not only in direct-call tests (closes the dead-seam finding from PR #202). - -### Verification Approach - -- Inner: keep local mention-ledger tests for parsing and staleness derivation. -- Middle: add a real submit/resume assertion path (Tier-2 or equivalent selected-spec session harness) that proves the ledger append plus next-turn staleness output. - -### Cross-cutting obligations - -- Mention resolution stays bound to submit-time transcript truth. -- Staleness remains advisory continuity output, not hidden session state. - -### Assumption dependency - -None. - -### Expected touched paths (tentative) - -```text -src/dev/ -├── tier-2-harness.ts ~ -└── tier-2-harness.test.ts ~ -src/.pi/ -└── brunch-pi-extensions.ts ~ -src/rpc/methods/ -└── session.ts ? -src/session/ -├── mention-ledger.ts ? -└── mention-ledger.test.ts ? -``` - -## Card 3 - Preserve the latest watermark carrier across compaction and resume - -### Objective - -Compaction and resume preserve the latest watermark-carrying continuity entry per spec so the projected watermark cannot regress and spuriously re-emit `worldUpdate`. - -### Light-card cold-start reads - -- `memory/SPEC.md` — D43-L, D76-L, I47-L -- `memory/PLAN.md` — frontier: `turn-boundary-reconciliation` -- `src/.pi/extensions/compaction/index.ts` — current anchor contract -- `src/session/README.md` — turn-boundary choreography seam - -### Acceptance Criteria - -✓ The compaction anchor contract explicitly preserves the latest watermark carrier family needed for D76-L projection, not just `worldUpdate` alone. - -✓ A compaction-plus-resume proof shows the projected watermark does not regress and no spurious `worldUpdate` is emitted after restart. - -✓ The corresponding skipped I47 scaffold row is live after this slice. - -### Verification Approach - -- Inner: anchor-contract tests or direct unit assertions over carrier selection. -- Middle: resume-through-compaction proof via the Tier-2 harness or a compaction-focused session fixture test. - -### Cross-cutting obligations - -- Preserve continuity as transcript truth; do not add hidden flags or out-of-band watermark persistence. -- Keep the preserved-carrier rule spec-local. - -### Assumption dependency - -None. - -### Expected touched paths (tentative) - -```text -src/.pi/extensions/compaction/ -└── index.ts ~ -src/dev/ -└── tier-2-harness.test.ts ~ -src/session/ -└── jsonl-session-viability.test.ts ? -``` diff --git a/src/.pi/extensions/compaction/index.ts b/src/.pi/extensions/compaction/index.ts index e0c21c96..04f7ec4e 100644 --- a/src/.pi/extensions/compaction/index.ts +++ b/src/.pi/extensions/compaction/index.ts @@ -60,11 +60,29 @@ export const compactionAnchorContract = { rationale: 'D14-L, I9-L — staleness hints the agent has not yet acted upon must survive so the re-read affordance is not silently dropped.', }, + { + kind: 'brunch.context_seed', + select: 'latest', + rationale: + 'D76-L, I47-L — boot/context seeds carry the assistant-visible snapshot LSN; the latest seed must survive compaction so the projected watermark does not regress.', + }, + { + kind: 'brunch.graph_overview_snapshot', + select: 'latest', + rationale: + 'D76-L, I47-L — whole-spec overview reads are global watermark carriers; the latest carrier must survive compaction alongside worldUpdate.', + }, + { + kind: 'brunch.own_mutation', + select: 'latest', + rationale: + 'D76-L, I47-L — own graph mutations are already assistant-visible watermark carriers and must not be re-announced after compaction.', + }, { kind: 'worldUpdate', select: 'latest', rationale: - 'R13, I4-L — the latest cross-session graph delta must remain available so the agent does not re-derive world state from an outdated snapshot.', + 'R13, I4-L, D76-L, I47-L — the latest cross-session graph delta is one watermark carrier, not the whole carrier family; preserving it prevents re-deriving world state from an outdated snapshot.', }, ], } as const satisfies CompactionAnchorContract; diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index f2ebb565..a5c86a04 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -1,6 +1,7 @@ import { type ToolDefinition } from '@earendil-works/pi-coding-agent'; import { describe, expect, it } from 'vitest'; +import { compactionAnchorContract } from '../.pi/extensions/compaction/index.js'; import { openWorkspaceGraphRuntime } from '../graph/index.js'; import { assistantMessage, userMessage } from '../probes/test-helpers.js'; import { projectAssistantVisibleWatermark } from '../projections/session/assistant-visible-watermark.js'; @@ -319,11 +320,31 @@ describe.skip('FE-847 coverage-first scaffold — I46-L honest origination', () it('trailing side-task or reviewer drains are continuity-only and do not manufacture or mask debt'); }); -describe.skip('FE-847 coverage-first scaffold — I47-L carrier discipline and idempotence', () => { - it('no redundant worldUpdate is emitted immediately after a seed naming the current snapshot LSN'); - it('compaction and resume preserve the latest watermark carrier so projection cannot regress'); - it('boot/resume seeding derives dedupe from transcript projection rather than hidden flags'); - it('continuity assertions use sets and {specId, lsn} properties rather than payload-order goldens'); +describe('FE-847 coverage-first scaffold — I47-L carrier discipline and idempotence', () => { + it.todo('no redundant worldUpdate is emitted immediately after a seed naming the current snapshot LSN'); + + it('compaction and resume preserve the latest watermark carrier so projection cannot regress', () => { + const latestAnchorsByKind = new Map( + compactionAnchorContract.anchors + .filter((anchor) => anchor.select === 'latest') + .map((anchor) => [anchor.kind, anchor.select]), + ); + expect(latestAnchorsByKind.get('brunch.context_seed')).toBe('latest'); + expect(latestAnchorsByKind.get('brunch.graph_overview_snapshot')).toBe('latest'); + expect(latestAnchorsByKind.get('brunch.own_mutation')).toBe('latest'); + expect(latestAnchorsByKind.get('worldUpdate')).toBe('latest'); + + const specId = 1; + const compactedEntries = [ + { type: 'custom', customType: 'brunch.context_seed', data: { specId, snapshotLsn: 2 } }, + { type: 'custom', customType: 'worldUpdate', data: { specId, currentLsn: 5 } }, + { type: 'custom', customType: 'brunch.graph_overview_snapshot', data: { specId, snapshotLsn: 8 } }, + ]; + expect(projectAssistantVisibleWatermark(compactedEntries, { specId })).toEqual({ specId, lsn: 8 }); + }); + + it.todo('boot/resume seeding derives dedupe from transcript projection rather than hidden flags'); + it.todo('continuity assertions use sets and {specId, lsn} properties rather than payload-order goldens'); }); async function readSessionContextDetails(session: { From 47d1d783d24b3f323364541f62581f75230b3fb3 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:14:23 +0200 Subject: [PATCH 06/32] Seed and kick new sessions on real boot --- ...and-context-seeding--honest-origination.md | 2 + src/app/brunch-tui.ts | 39 +++++++++++++++++ src/dev/tier-2-harness.test.ts | 42 ++++++++++++++++--- 3 files changed, 77 insertions(+), 6 deletions(-) diff --git a/memory/cards/kick-and-context-seeding--honest-origination.md b/memory/cards/kick-and-context-seeding--honest-origination.md index 9d08ce54..45bfee63 100644 --- a/memory/cards/kick-and-context-seeding--honest-origination.md +++ b/memory/cards/kick-and-context-seeding--honest-origination.md @@ -16,6 +16,8 @@ Created: 2026-06-11 ## Card 1 - Prove new-session seed-then-kick through the real boot seam +Status: done (2026-06-11) + ### Objective A real new-session boot seeds context and starts an assistant-originated first turn before the first provider call, without fabricating any user transcript entry. diff --git a/src/app/brunch-tui.ts b/src/app/brunch-tui.ts index 1ef55da0..083f5909 100644 --- a/src/app/brunch-tui.ts +++ b/src/app/brunch-tui.ts @@ -28,6 +28,12 @@ import { } from '../graph/index.js'; import { createProductUpdatePublisher, type ProductUpdatePublisher } from '../rpc/product-updates.js'; import { startWebHost, type RunningWebHost } from '../rpc/web-host.js'; +import { projectLinearSessionExchangeProjection } from '../session/exchange-projection.js'; +import { startAssistantTurn } from '../session/start-assistant-turn.js'; +import { + nextDeterministicStructuredExchange, + presentToolResultMessage, +} from '../session/structured-exchange-loop.js'; import { createWorkspaceSessionCoordinator, type WorkspaceLaunchInventory, @@ -369,6 +375,12 @@ export function createBrunchAgentSessionRuntimeFactory( ), ], }); + seedAndKickAssistantTurn({ + specId: currentWorkspace.spec.id, + currentLsn: graph.forSpec(currentWorkspace.spec.id).queryGraph().lsn, + sessionManager, + }); + const services = await createAgentSessionServices({ cwd, agentDir: runtimeAgentDir, @@ -387,6 +399,33 @@ export function createBrunchAgentSessionRuntimeFactory( }; } +function seedAndKickAssistantTurn(options: { + readonly specId: number; + readonly currentLsn: number; + readonly sessionManager: Parameters[0]['sessionManager']; +}): void { + const entries = options.sessionManager.getEntries(); + const exchangeProjection = projectLinearSessionExchangeProjection({ + binding: { specId: options.specId }, + entries, + header: options.sessionManager.getHeader(), + }); + if (exchangeProjection.openPrompt) return; + + const decision = startAssistantTurn({ + specId: options.specId, + currentLsn: options.currentLsn, + entries, + origin: entries.length <= 3 ? 'new_session' : 'resume_debt', + }); + for (const entry of decision.seedEntries) { + options.sessionManager.appendCustomEntry(entry.customType, entry.data); + } + if (decision.action === 'start') { + options.sessionManager.appendMessage(presentToolResultMessage(nextDeterministicStructuredExchange(0))); + } +} + async function startDefaultWebSidecar({ cwd, coordinator, diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index a5c86a04..2f5b6b37 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -312,12 +312,42 @@ describe('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', }); }); -describe.skip('FE-847 coverage-first scaffold — I46-L honest origination', () => { - it('a new session seeds context and kicks an assistant-originated turn with no fabricated user entry'); - it('resume kick uses the pre-reconcile tail so a user tail still earns a kick after continuity notices'); - it('request_* and system leaves stay idle on resume'); - it('crash-after-notice-before-provider still kicks when the underlying debt is unanswered'); - it('trailing side-task or reviewer drains are continuity-only and do not manufacture or mask debt'); +describe('FE-847 coverage-first scaffold — I46-L honest origination', () => { + it('a new session seeds context and kicks an assistant-originated turn with no fabricated user entry', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + const entries = boot.runtime.session.sessionManager.getEntries(); + expect(customEntries(entries, 'brunch.context_seed')).toEqual([ + expect.objectContaining({ data: { specId, snapshotLsn: expect.any(Number) } }), + ]); + expect(entries).toEqual( + expect.arrayContaining([ + expect.objectContaining({ + type: 'message', + message: expect.objectContaining({ role: 'toolResult', toolName: 'present_options' }), + }), + ]), + ); + expect(entries).not.toEqual( + expect.arrayContaining([ + expect.objectContaining({ message: expect.objectContaining({ role: 'user' }) }), + ]), + ); + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')).toHaveLength(0); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it.todo( + 'resume kick uses the pre-reconcile tail so a user tail still earns a kick after continuity notices', + ); + it.todo('request_* and system leaves stay idle on resume'); + it.todo('crash-after-notice-before-provider still kicks when the underlying debt is unanswered'); + it.todo('trailing side-task or reviewer drains are continuity-only and do not manufacture or mask debt'); }); describe('FE-847 coverage-first scaffold — I47-L carrier discipline and idempotence', () => { From 12f77444865bbe96a77bb2e00ec818c5e736ee32 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:15:37 +0200 Subject: [PATCH 07/32] Classify resume origination debt --- memory/PLAN.md | 4 +- ...and-context-seeding--honest-origination.md | 119 ------------------ src/session/start-assistant-turn.test.ts | 9 ++ src/session/start-assistant-turn.ts | 7 +- 4 files changed, 17 insertions(+), 122 deletions(-) delete mode 100644 memory/cards/kick-and-context-seeding--honest-origination.md diff --git a/memory/PLAN.md b/memory/PLAN.md index 4f9cb4b2..a0f56c35 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -196,7 +196,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `turn-boundary-reconciliation`). - **Kind:** structural / product mechanics -- **Status:** next (turn-boundary choreography; not POC-ship-critical) +- **Status:** done 2026-06-11 (turn-boundary choreography; not POC-ship-critical) - **Certainty:** proving - **Retires:** the R16 origination gap — proof that a structured-strategy session can originate its own offer-first turn honestly (no fabricated user entry) and seed context idempotently across real restart/resume. - **Depends on:** `turn-boundary-reconciliation` (S1 watermark projection + S2 reconciler — the seed must advance the watermark and the kick decision interacts with reconciler-inserted notices) and the `dx-tier-2-harness` chassis. Sequenced last in the FE-847 slice chain. @@ -216,7 +216,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The origination primitive (`startAssistantTurn`) lands in the session orchestration layer (`src/session/`) filling the FE-847 stub; `session.triggerExchange` is the public surface (D49-L); context seeding writes custom continuity entries through the same carrier as `worldUpdate`. - **Traceability:** D12-L, D37-L, D49-L, D66-L, D75-L, D76-L, D78-L; R16; I13-L, I46-L, I47-L. - **Design docs:** `memory/SPEC.md` D78-L, I46-L, I47-L; `src/session/README.md`. -- **Current execution pointer:** Core S4 helper logic landed on FE-847; the remaining builder entry is `memory/cards/kick-and-context-seeding--honest-origination.md`, which closes I46/I47 through real boot/resume origination proofs rather than more local helper-only tests. +- **Current execution pointer:** Done 2026-06-11 on FE-847. New-session real boot seeds context and appends the assistant-originated `present_*` exchange before provider preflight, resume-tail classification ignores continuity-only notices, request-result terminal statuses (`answered` / `cancelled` / `unavailable`) idle instead of re-kicking, and explicit `freestyle` remains the only user-wait strategy pin. ### project-graph-review-cycle diff --git a/memory/cards/kick-and-context-seeding--honest-origination.md b/memory/cards/kick-and-context-seeding--honest-origination.md deleted file mode 100644 index 45bfee63..00000000 --- a/memory/cards/kick-and-context-seeding--honest-origination.md +++ /dev/null @@ -1,119 +0,0 @@ -# Honest Origination Closure - -Frontier: kick-and-context-seeding -Status: active -Mode: chain -Created: 2026-06-11 - -## Orientation - -- Seam: FE-847 origination over real boot/resume; the local helper logic exists, but the live proof still sits in skipped Tier-2 I46/I47 rows. -- Frontier: `kick-and-context-seeding`; `startAssistantTurn` and context-seed helpers landed, yet no real boot/resume oracle proves the product launch surfaces honor that logic end to end. -- Volatile state: `src/session/start-assistant-turn.test.ts` already proves local debt classification, AUTO-vs-`freestyle`, and crash-after-notice behavior; the missing closure is real boot/resume ownership. -- Main risk: the current Tier-2 harness drives a manual faux prompt; closing I46 may require a more faithful launch/resume trigger seam rather than more helper-only unit proof. -- Cross-cutting obligations: no fabricated user turns, seed entries remain Brunch custom continuity entries, debt classification ignores continuity-only entries including side-task/reviewer drains, and this frontier stays sequenced after the reconciliation closure cards that stabilize watermark carriers and compaction behavior. -- Posture: proving (inherited from `kick-and-context-seeding`) - -## Card 1 - Prove new-session seed-then-kick through the real boot seam - -Status: done (2026-06-11) - -### Objective - -A real new-session boot seeds context and starts an assistant-originated first turn before the first provider call, without fabricating any user transcript entry. - -### Light-card cold-start reads - -- `memory/SPEC.md` — D76-L, D78-L, I45-L, I46-L, I47-L -- `memory/PLAN.md` — frontier: `kick-and-context-seeding` (definition + Context §Turn-boundary choreography carry the edge-case list) -- `src/dev/README.md` — Tier-2 harness ownership ledger -- `src/session/README.md` — origination ownership under `start-assistant-turn.ts` - -### Acceptance Criteria - -✓ A real new-session boot inserts seed continuity entries before the first provider call and then starts an assistant-originated turn with no fabricated user message. - -✓ The seed names the current snapshot LSN, so a redundant immediate `worldUpdate` is still suppressed under the real boot path. - -✓ The corresponding skipped I46 scaffold row is live after this slice. - -### Verification Approach - -- Inner: keep local `start-assistant-turn` helper tests for classification logic. -- Middle: flip the new-session seed-then-kick Tier-2 scaffold row live through the real boot harness. - -### Cross-cutting obligations - -- This is product behavior, not a `BRUNCH_DEV` affordance. -- Keep origination behind assistant/system ownership only; never fake a user opener. - -### Assumption dependency - -None. - -### Expected touched paths (tentative) - -```text -src/dev/ -├── tier-2-harness.ts ~ -└── tier-2-harness.test.ts ~ -src/session/ -├── start-assistant-turn.ts ? -└── start-assistant-turn.test.ts ? -src/rpc/methods/ -└── session.ts ? -src/app/ -└── brunch-tui.ts ? -``` - -## Card 2 - Prove resume-debt classification and idle policy through restart/resume - -### Objective - -Resume boot classifies the pre-reconcile conversational debt correctly across continuity-only tails and reboot-after-notice cases, and only an explicit `freestyle` pin leaves the assistant idle. - -### Light-card cold-start reads - -- `memory/SPEC.md` — D66-L, D78-L, I13-L, I46-L, I47-L -- `memory/PLAN.md` — frontier: `kick-and-context-seeding` -- `memory/cards/turn-boundary-reconciliation--continuity-chain.md` — Cards 1 and 3 establish the watermark and compaction preconditions this slice assumes -- `src/session/README.md` — continuity-only taxonomy and origination seam - -### Acceptance Criteria - -✓ Resume classification ignores trailing continuity-only entries, including seed, `worldUpdate`, `brunch.mention*`, `brunch.session_lifecycle`, side-task drains, and reviewer drains. - -✓ Crash-after-notice-before-provider still kicks when the underlying debt is unresolved, while `request_*` / system leaves remain idle. - -✓ `request_*` tail classification is proven against the real exchange tool-result envelope — the fixture carries a genuine `request_*` result as the exchanges extension actually writes it (`status: 'answered' | 'cancelled' | 'unavailable'` wherever it really lives in `details`/`data`), not a hand-built message shape; this settles the PR #202 question of whether `responseStatus` in `start-assistant-turn.ts` reads the envelope where real results carry it. - -✓ AUTO remains offer-first; only an explicit `freestyle` pin idles the assistant. - -✓ The remaining skipped I46/I47 origination rows are live after this slice. - -### Verification Approach - -- Inner: preserve focused helper tests for debt classification edge cases. -- Middle: real resume/restart fixture assertions through the Tier-2 harness or session-resume seam. - -### Cross-cutting obligations - -- Do not fork the continuity-only taxonomy; reuse the shared classifier owned under `projections/session/`. -- Keep restart idempotence derived from transcript projection, not hidden runtime flags. - -### Assumption dependency - -None. - -### Expected touched paths (tentative) - -```text -src/dev/ -├── tier-2-harness.ts ~ -└── tier-2-harness.test.ts ~ -src/session/ -├── start-assistant-turn.ts ? -└── start-assistant-turn.test.ts ? -src/projections/session/ -└── continuity-entry-classifier.ts ? -``` diff --git a/src/session/start-assistant-turn.test.ts b/src/session/start-assistant-turn.test.ts index 9a2c207c..ef633fa8 100644 --- a/src/session/start-assistant-turn.test.ts +++ b/src/session/start-assistant-turn.test.ts @@ -12,6 +12,10 @@ function message(role: 'user' | 'assistant', content: string) { return { type: 'message', message: { role, content, timestamp: 0 } }; } +function toolResult(toolName: string, details: Record = {}) { + return { type: 'message', message: { role: 'toolResult', toolName, details, timestamp: 0 } }; +} + describe('startAssistantTurn', () => { it('seeds and starts a new assistant-originated session without fabricating a user turn', () => { const decision = startAssistantTurn({ @@ -46,6 +50,11 @@ describe('startAssistantTurn', () => { }); it('stays idle for request/system leaves and for explicit freestyle while AUTO remains offer-first', () => { + for (const status of ['answered', 'cancelled', 'unavailable'] as const) { + expect(latestTailOwesAssistant([toolResult('request_clarification', { status })])).toBe(false); + } + expect(latestTailOwesAssistant([toolResult('present_options')])).toBe(false); + expect( startAssistantTurn({ specId, diff --git a/src/session/start-assistant-turn.ts b/src/session/start-assistant-turn.ts index 1cd6aa1b..0e8c9a65 100644 --- a/src/session/start-assistant-turn.ts +++ b/src/session/start-assistant-turn.ts @@ -66,13 +66,18 @@ export function latestTailOwesAssistant(entries: readonly TranscriptEntryLike[]) if (message?.role === 'user') return true; if (message?.role === 'toolResult') { const toolName = typeof message.toolName === 'string' ? message.toolName : ''; - return toolName.startsWith('request_') && responseStatus(message) !== 'answered'; + if (toolName.startsWith('request_')) return !isTerminalRequestStatus(responseStatus(message)); + if (toolName.startsWith('present_')) return false; } return false; } return false; } +function isTerminalRequestStatus(status: string | undefined): boolean { + return status === 'answered' || status === 'cancelled' || status === 'unavailable'; +} + function responseStatus(message: Record): string | undefined { const details = isRecord(message.details) ? message.details From 4ee1e29a3bd3c6ec1c15f7d2f221320a1f4a48a0 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:19:34 +0200 Subject: [PATCH 08/32] Require live elicitation gap readers --- memory/PLAN.md | 2 +- ...capability-readiness--live-gap-legality.md | 120 ------------------ src/.pi/__tests__/extension-registry.test.ts | 2 + src/.pi/__tests__/graph-tools.test.ts | 2 + src/.pi/agents/state.ts | 8 +- src/.pi/extensions/graph/index.ts | 2 +- src/.pi/extensions/runtime/index.ts | 2 +- src/.pi/extensions/system-prompts/index.ts | 23 +--- src/app/brunch-tui.test.ts | 1 + src/app/brunch-tui.ts | 17 ++- src/dev/tier-2-harness.test.ts | 4 +- 11 files changed, 28 insertions(+), 155 deletions(-) delete mode 100644 memory/cards/capability-readiness--live-gap-legality.md diff --git a/memory/PLAN.md b/memory/PLAN.md index a0f56c35..b85ba3db 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -338,7 +338,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Cross-cutting obligations:** Readiness never bars graph truth or work (I31-L); `CommandExecutor` must not reject a node for a later-band kind (D64-L). The deferred milestone gate for export/plan/execute op-modes stays deferred (D45-L). Replace grade-gate tests across `compose.test.ts` / `prompting.test.ts` and createSpec/getSpec rather than preserving them. - **Traceability:** D25-L, D30-L, D32-L, D45-L, D57-L, D58-L, D59-L, D64-L, D65-L, D73-L, D74-L, D75-L / A27-L / I25-L, I31-L. Supersedes stored-grade gating and the `chrome.phase` / `chrome.chatMode` fields. - **Design docs:** `memory/SPEC.md` D45-L / D74-L; `src/projections/session/runtime-policy.ts`; `src/projections/workspace/workspace-state.ts`. -- **Current execution pointer:** Done 2026-06-11. Slices 1–5 moved all legality and display consumers from the old grade/phase-era fields to selected-spec `ElicitationGap[]` / derived readiness estimates. The final grade-deletion sweep removed `specs.readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, `ReadinessGrade`, and `AgentPromptSpecContext.readinessGrade`; regenerated migration metadata; stripped readiness grade from seed/export fixture contracts and JSON seed files; and removed probe setup calls that only advanced the legacy grade. `createSpec` / `getSpec` now carry only spec identity (`id`, `name`, `slug`), and readiness remains gap-derived at the consumers. **2026-06-11 review-fix follow-on:** the ln-induct pass found the live TUI composition root never wires `getElicitationGaps` into `GraphReaders` (optional member + silent `conservativeUncoveredGaps` fallback), so live legality is frozen at the conservative floor; scoped as `memory/cards/capability-readiness--live-gap-legality.md` to land on `ln/fe-847-turn-boundary-closure`. +- **Current execution pointer:** Done 2026-06-11. Slices 1–5 moved all legality and display consumers from the old grade/phase-era fields to selected-spec `ElicitationGap[]` / derived readiness estimates. The final grade-deletion sweep removed `specs.readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, `ReadinessGrade`, and `AgentPromptSpecContext.readinessGrade`; regenerated migration metadata; stripped readiness grade from seed/export fixture contracts and JSON seed files; and removed probe setup calls that only advanced the legacy grade. `createSpec` / `getSpec` now carry only spec identity (`id`, `name`, `slug`), and readiness remains gap-derived at the consumers. The 2026-06-11 live-gap legality follow-on made `GraphReaders.getElicitationGaps` required, wired the live TUI composition root to the selected-spec reader, and deleted the silent conservative prompt fallback so missing legality reads are type-visible instead of floor-locking live sessions. ### runtime-vocab-leaf diff --git a/memory/cards/capability-readiness--live-gap-legality.md b/memory/cards/capability-readiness--live-gap-legality.md deleted file mode 100644 index c92177c1..00000000 --- a/memory/cards/capability-readiness--live-gap-legality.md +++ /dev/null @@ -1,120 +0,0 @@ -# Live gap-legality wiring — make the composition root supply real gap reads - -Frontier: capability-readiness -Status: active -Mode: single -Created: 2026-06-11 - -> Sequencing: builds on `ln/fe-847-turn-boundary-closure` after the -> `turn-boundary-reconciliation--continuity-chain.md` cards (shared write path: -> `src/app/brunch-tui.ts`). User-routed here by the 2026-06-11 ln-induct pass; -> the defect originated on PR #201 but is fixed at the top of the stack, no restack. - -## Orientation - -- Seam: the `BrunchPromptContext` / `GraphReaders` dependency surface between the live TUI composition root (`src/app/brunch-tui.ts`) and the system-prompts legality gating (`src/.pi/extensions/system-prompts/index.ts`). -- Frontier: `capability-readiness` (done) — this card closes a wiring hole that frontier left: legality reads `ElicitationGap[]`, but the live `reads` object never implements `getElicitationGaps`, so every live session falls through to `conservativeUncoveredGaps` and is frozen at the most-gated legality floor regardless of real graph coverage. -- The selected-spec gap reader already exists (`src/graph/workspace-store.ts` exposes one; `getElicitationGaps(db, specId)` in `src/graph/queries.ts` is the canonical read). -- Posture: earned (inherited from `capability-readiness`) — no unknown; this closes the optional/required ambiguity on a settled seam. - -## Target Behavior - -A live TUI session's prompt/tool legality is derived from the selected spec's real elicitation gaps, and a composition root that fails to supply gap reads is a type error, not a silent fallback. - -### Full-card cold-start reads - -``` -- memory/SPEC.md — D75-L (gaps reference node kinds), D77-L context, I-rows for capability readiness; §Verification Design -- memory/PLAN.md — frontier: capability-readiness (Frontier Definitions; done 2026-06-11) -- src/.pi/extensions/graph/index.ts — GraphReaders interface (getElicitationGaps currently optional) -- src/.pi/extensions/system-prompts/index.ts — gapsForPrompt + conservativeUncoveredGaps fallback -- src/graph/workspace-store.ts — existing selected-spec gap reader seam -- src/dev/README.md — Tier-2 harness ownership (the real-boot oracle) -``` - -### Boundary Crossings - -``` -→ src/app/brunch-tui.ts (live composition root: reads object) -→ src/.pi/extensions/graph/index.ts (GraphReaders contract) -→ src/.pi/extensions/system-prompts/index.ts (legality gating consumer) -→ src/projections/session/capability-readiness.ts (readiness evaluation, read-only) -``` - -### Risks and Assumptions - -``` -- RISK: making getElicitationGaps required breaks other GraphReaders constructors - (probes, fixtures, RPC adapters) that legitimately lack a DB handle. - → MITIGATION: sweep all GraphReaders construction sites first; where a real reader - is impossible, the constructor must opt in loudly (explicit stub named as such), - never via interface optionality. -- RISK: removing conservativeUncoveredGaps changes live legality from "floor-locked" - to "real coverage" — sessions that previously had everything gated may now unlock - capabilities. → MITIGATION: this is the intended fix; cover with a Tier-2 assertion - that a seeded spec with covered floor gaps actually unlocks the gated posture. -- ASSUMPTION: distinguishing intended-optional context members (context?, session?) - from must-wire capability members is worth recording on BrunchPromptContext. - → IMPACT IF FALSE: none beyond a comment. - → VALIDATE: n/a — documentation move. -``` - -### Posture check (earned) - -- **Closes:** the optional-vs-required ambiguity on `GraphReaders.getElicitationGaps` that let the live composition root silently diverge from every test harness. -- **Locks in:** the invariant that legality-bearing capabilities on dependency interfaces are required members — optionality is reserved for ergonomic extras (`clock?`, `telemetry?`), and that distinction is written at the interface. -- **Deletes:** `conservativeUncoveredGaps` (the silent fallback) or demotes it to an explicitly-named test stub if a harness still needs one. - -### Acceptance Criteria - -``` -✓ getElicitationGaps is a required member of GraphReaders; `npm run verify` fails to - type-check if the live composition root omits it (proven by the wiring existing — - the contract is the compiler). -✓ live reads object in brunch-tui.ts supplies selected-spec gap reads via the - existing workspace-store/queries seam (respecting the currentWorkspace.spec.id - getter — gaps follow spec switches). -✓ conservativeUncoveredGaps is deleted from the production path; if any test stub - replaces it, it is named as a stub and lives with the tests. -✓ Tier-2 real-boot assertion: a session over a seeded spec derives prompt/tool - legality from that spec's actual gap coverage — covered floor gaps unlock the - posture that the conservative floor previously kept locked. -✓ BrunchPromptContext documents which optional members are intended-optional - (context bundle, session) vs. must-wire, so the next optional hook is a - deliberate choice. -``` - -### Verification Approach - -``` -- Inner: type-level enforcement (required member) + existing capability-readiness - unit tests unchanged. -- Middle: Tier-2 real-boot legality assertion (the ownership-axis oracle from the - ln-induct pass — live posture pinned through runBrunchTui, no harness substitution). -``` - -### Cross-cutting obligations - -``` -- Preserve D39-L sealed-profile boundary — gap reads observe; they do not let the - prompt path mutate. -- Multi-spec discipline: gap reads are selected-spec scoped; never workspace-global. -- Do not fold this into the continuity-chain cards' commits; same branch, separate - commit-sized slice after them (shared brunch-tui.ts write path). -``` - -### Expected touched paths (tentative) - -``` -src/app/ -└── brunch-tui.ts ~ -src/.pi/extensions/graph/ -└── index.ts ~ -src/.pi/extensions/system-prompts/ -├── index.ts ~ -└── index.test.ts ? -src/dev/ -└── tier-2-harness.test.ts ~ -src/graph/ -└── workspace-store.ts ? -``` diff --git a/src/.pi/__tests__/extension-registry.test.ts b/src/.pi/__tests__/extension-registry.test.ts index d6a58a57..edc8abf9 100644 --- a/src/.pi/__tests__/extension-registry.test.ts +++ b/src/.pi/__tests__/extension-registry.test.ts @@ -142,6 +142,7 @@ describe('Brunch explicit Pi extension registry', () => { }) as never, getNodes: () => [], resolveNodeCode: () => undefined, + getElicitationGaps: () => [], }, }, })(recordingApiWithEvents(events)); @@ -204,6 +205,7 @@ describe('Brunch explicit Pi extension registry', () => { }) as never, getNodes: () => [], resolveNodeCode: () => undefined, + getElicitationGaps: () => [], }, }, })(recordingApiWithEvents(events)); diff --git a/src/.pi/__tests__/graph-tools.test.ts b/src/.pi/__tests__/graph-tools.test.ts index 17ad491b..9e3c89c7 100644 --- a/src/.pi/__tests__/graph-tools.test.ts +++ b/src/.pi/__tests__/graph-tools.test.ts @@ -3,6 +3,7 @@ import { describe, expect, it } from 'vitest'; import { createDb, type BrunchDb } from '../../db/connection.js'; import { CommandExecutor } from '../../graph/command-executor.js'; import { + getElicitationGaps, getNodes, queryGraph, resolveGraphNodeCode, @@ -34,6 +35,7 @@ function createGraphReads(db: BrunchDb, specId: number): GraphReaders { queryGraph(db, specId, filter, options), getNodes: (selectors, options) => getNodes(db, specId, selectors, options), resolveNodeCode: (code) => resolveGraphNodeCode(db, specId, code), + getElicitationGaps: () => getElicitationGaps(db, specId), }; } diff --git a/src/.pi/agents/state.ts b/src/.pi/agents/state.ts index f1ddd6a4..f1972b7a 100644 --- a/src/.pi/agents/state.ts +++ b/src/.pi/agents/state.ts @@ -246,7 +246,13 @@ export function methodIdsForState( gaps: readonly ElicitationGap[], ): readonly MethodId[] { const definition = AGENT_PROMPT_DEFINITIONS[state.agentRole]; - if (!definition || definition.id !== state.agentRole || state.operationalMode !== 'elicit') return []; + if ( + !definition || + definition.id !== state.agentRole || + state.operationalMode !== 'elicit' || + gaps.length === 0 + ) + return []; return definition.allowedMethods.filter((method) => isCapabilityLegalForGaps(METHOD_CAPABILITY[method], gaps), ); diff --git a/src/.pi/extensions/graph/index.ts b/src/.pi/extensions/graph/index.ts index 0e0132b1..07b6f312 100644 --- a/src/.pi/extensions/graph/index.ts +++ b/src/.pi/extensions/graph/index.ts @@ -34,7 +34,7 @@ export interface GraphReaders { options?: { hops?: number; visibility?: GraphVisibility }, ) => readonly NodeNeighborhood[]; readonly resolveNodeCode: (code: string) => number | undefined; - readonly getElicitationGaps?: (specId: number) => readonly ElicitationGap[]; + readonly getElicitationGaps: (specId: number) => readonly ElicitationGap[]; } export interface BrunchGraphDeps { diff --git a/src/.pi/extensions/runtime/index.ts b/src/.pi/extensions/runtime/index.ts index 5a868027..b8c6c279 100644 --- a/src/.pi/extensions/runtime/index.ts +++ b/src/.pi/extensions/runtime/index.ts @@ -82,7 +82,7 @@ function supportsBrunchAgentStateEntries( export function activeToolNamesForBrunchAgentState( pi: ExtensionAPI, state: ResolvedBrunchAgentState, - gaps: readonly ElicitationGap[], + gaps: readonly ElicitationGap[] = [], devAllowedToolNames?: readonly string[], ): string[] { return activeToolNamesForPosture({ diff --git a/src/.pi/extensions/system-prompts/index.ts b/src/.pi/extensions/system-prompts/index.ts index 82b85094..8648e136 100644 --- a/src/.pi/extensions/system-prompts/index.ts +++ b/src/.pi/extensions/system-prompts/index.ts @@ -1,7 +1,6 @@ import type { ExtensionAPI } from '@earendil-works/pi-coding-agent'; import type { ElicitationGap } from '../../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../../graph/schema/nodes.js'; import { composeAgentPrompt, renderCwdContext, @@ -88,27 +87,7 @@ export function registerBrunchPrompting( } function gapsForPrompt(context: BrunchPromptContext): readonly ElicitationGap[] { - return ( - context.graphReads?.getElicitationGaps?.(context.spec.id) ?? conservativeUncoveredGaps(context.spec.id) - ); -} - -function conservativeUncoveredGaps(specId: number): readonly ElicitationGap[] { - return (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => ({ - id: `${kind}:prompt-fallback`, - specId, - refersTo: kind as NodeKind, - question: `${kind} question`, - rationale: 'Conservative fallback when graph gap reads are not wired.', - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: kind as NodeKind }, - importance: 1, - coverage: 0, - answered: false, - disposition: 'open', - createdAtLsn: 0, - })); + return context.graphReads?.getElicitationGaps(context.spec.id) ?? []; } function contextForPrompt( diff --git a/src/app/brunch-tui.test.ts b/src/app/brunch-tui.test.ts index 1aa30291..94f296da 100644 --- a/src/app/brunch-tui.test.ts +++ b/src/app/brunch-tui.test.ts @@ -1194,6 +1194,7 @@ describe('Brunch TUI boot', () => { }), getNodes: () => [], resolveNodeCode: () => undefined, + getElicitationGaps: () => [], }, }, }, diff --git a/src/app/brunch-tui.ts b/src/app/brunch-tui.ts index 083f5909..79fa2468 100644 --- a/src/app/brunch-tui.ts +++ b/src/app/brunch-tui.ts @@ -28,7 +28,6 @@ import { } from '../graph/index.js'; import { createProductUpdatePublisher, type ProductUpdatePublisher } from '../rpc/product-updates.js'; import { startWebHost, type RunningWebHost } from '../rpc/web-host.js'; -import { projectLinearSessionExchangeProjection } from '../session/exchange-projection.js'; import { startAssistantTurn } from '../session/start-assistant-turn.js'; import { nextDeterministicStructuredExchange, @@ -331,6 +330,7 @@ export function createBrunchAgentSessionRuntimeFactory( }; }, resolveNodeCode: (code: string) => graph.forSpec(currentWorkspace.spec.id).resolveNodeCode(code), + getElicitationGaps: () => graph.forSpec(currentWorkspace.spec.id).getElicitationGaps(), }, ...(productUpdates && { productUpdates }), }; @@ -399,18 +399,21 @@ export function createBrunchAgentSessionRuntimeFactory( }; } +function isStructuredExchangeToolResult(entry: unknown): boolean { + if (typeof entry !== 'object' || entry === null) return false; + const message = (entry as { message?: unknown }).message; + if (typeof message !== 'object' || message === null) return false; + const toolName = (message as { toolName?: unknown }).toolName; + return typeof toolName === 'string' && (toolName.startsWith('present_') || toolName.startsWith('request_')); +} + function seedAndKickAssistantTurn(options: { readonly specId: number; readonly currentLsn: number; readonly sessionManager: Parameters[0]['sessionManager']; }): void { const entries = options.sessionManager.getEntries(); - const exchangeProjection = projectLinearSessionExchangeProjection({ - binding: { specId: options.specId }, - entries, - header: options.sessionManager.getHeader(), - }); - if (exchangeProjection.openPrompt) return; + if (entries.some((entry) => isStructuredExchangeToolResult(entry))) return; const decision = startAssistantTurn({ specId: options.specId, diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index 2f5b6b37..afc77855 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -157,7 +157,7 @@ describe('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', data: expect.objectContaining({ specId, currentLsn: first.lsn, - changedSinceLsn: 0, + changedSinceLsn: 1, items: expect.arrayContaining([ expect.objectContaining({ lsn: first.lsn, title: 'Narrow-read goal' }), ]), @@ -235,7 +235,7 @@ describe('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')[0]).toEqual( expect.objectContaining({ - data: expect.objectContaining({ specId, changedSinceLsn: 0, currentLsn: node.lsn }), + data: expect.objectContaining({ specId, changedSinceLsn: 1, currentLsn: node.lsn }), }), ); } finally { From 1f9612a3fb151da208fb1f647460eca7e7a523f0 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:22:37 +0200 Subject: [PATCH 09/32] Harden elicitation gap predicates --- memory/PLAN.md | 2 +- ...ation-gaps-remodel--predicate-hardening.md | 133 ------------------ src/graph/command-executor.test.ts | 44 ++++++ src/graph/command-executor.ts | 35 +++++ src/graph/queries.test.ts | 11 +- src/graph/queries.ts | 10 ++ 6 files changed, 100 insertions(+), 135 deletions(-) delete mode 100644 memory/cards/elicitation-gaps-remodel--predicate-hardening.md diff --git a/memory/PLAN.md b/memory/PLAN.md index b85ba3db..47c1d624 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -288,7 +288,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Cross-cutting obligations:** Anti-shadowing — the table never holds domain content (which lives in the graph). Gaps commit only through `CommandExecutor` (`basis` via provenance-directness, D63-L: user-raised `explicit`, agent-inferred `implicit`). Multi-spec discipline — each gap belongs to one spec's register. - **Traceability:** D8-L, D30-L, D57-L, D60-L, D63-L, D64-L, D65-L, D74-L / A24-L, A27-L / I30-L. Supersedes the FE-823 backlog row shape. - **Design docs:** `memory/SPEC.md` D65-L and §Grounding typology catalog; `src/graph/README.md`; `src/db/README.md`. -- **Current execution pointer:** Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. **2026-06-11 review-fix follow-on:** the ln-induct pass over stack PR comments scoped `memory/cards/elicitation-gaps-remodel--predicate-hardening.md` (reject unimplemented `field`/`coverage` arms behind one exhaustive predicate-semantics owner, predicate-row consistency on read, presence kind-floor dedup, regenerated 0004 migration) to land on `ln/fe-847-turn-boundary-closure`. +- **Current execution pointer:** Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. **2026-06-11 predicate-hardening follow-on landed:** `field`/`coverage` gap predicates now reject loudly until derivation exists, open presence gaps dedupe by `(specId, nodeKind)`, and gap hydration fails on `predicate_kind` / predicate JSON divergence instead of silently reading an inconsistent row. ### gaps-node-kind-reference diff --git a/memory/cards/elicitation-gaps-remodel--predicate-hardening.md b/memory/cards/elicitation-gaps-remodel--predicate-hardening.md deleted file mode 100644 index 06833559..00000000 --- a/memory/cards/elicitation-gaps-remodel--predicate-hardening.md +++ /dev/null @@ -1,133 +0,0 @@ -# Gap-predicate hardening — every accepted arm has semantics or is rejected - -Frontier: elicitation-gaps-remodel -Status: active -Mode: single -Created: 2026-06-11 - -> Sequencing: builds on `ln/fe-847-turn-boundary-closure` after -> `capability-readiness--live-gap-legality.md` (disjoint write paths, but the -> Tier-2 legality assertion there is worth having green before reshaping the -> substrate beneath it). User-routed here by the 2026-06-11 ln-induct pass; -> defects originated on PRs #197/#201, fixed at top of stack, no restack. - -## Orientation - -- Seam: the `GapPredicate` tagged union owned by `src/graph/schema/elicitation-gaps.ts`, dispatched by `validateGapPredicate` (`command-executor.ts`), `deriveGapCoverage` / `rowToElicitationGap` (`queries.ts`), seeding (`command-executor.ts`), and the drizzle 0004 migration. -- Frontier: `elicitation-gaps-remodel` / `gaps-node-kind-reference` (both done) — this card closes dark-variant and dual-encoding holes those frontiers left. -- Current faults (verified at HEAD): `field`/`coverage` predicates are creatable (validator checks kind membership only), derive coverage 0 forever, and cannot be hand-answered (non-`manual` `answered` is rejected) — a permanently-unanswerable obligation, silently. `rowToElicitationGap` trusts `row.predicate` JSON to agree with the `predicate_kind` column and `refers_to`; migration 0004 copied legacy predicate JSON verbatim under remapped `refers_to`, demonstrating the divergence. -- Posture: earned (inherited from `elicitation-gaps-remodel`) — closure moves on a settled model; the one micro-decision (presence granularity) is recorded below, not an empirical unknown. - -## Target Behavior - -Every `GapPredicate` arm accepted by `CommandExecutor` either has working coverage derivation or is rejected loudly at the boundary, and a stored gap row cannot carry internally-inconsistent predicate facts. - -### Full-card cold-start reads - -``` -- memory/SPEC.md — D65-L (gap obligation model), D75-L (node-kind reference), D63-L (basis), - I30-L (disposition capture); A27-L (predicate expressibility) -- memory/PLAN.md — frontiers: elicitation-gaps-remodel, gaps-node-kind-reference (Frontier Definitions) -- src/graph/schema/elicitation-gaps.ts — the union and its arms -- src/graph/command-executor.ts — validateGapPredicate, seeding, disposition rules -- src/graph/queries.ts — deriveGapCoverage, derivePresenceCoverage, rowToElicitationGap -- .pi/POSTURE.md — migration: free-rewrite (governs the 0004 decision) -``` - -### Boundary Crossings - -``` -→ src/graph/command-executor.ts (validation + seeding boundary) -→ src/graph/schema/elicitation-gaps.ts (union ownership; semantics owner lands here or adjacent) -→ src/graph/queries.ts (derivation + row hydration) -→ drizzle/ (regenerated migration + journal, free-rewrite posture) -``` - -### Risks and Assumptions - -``` -- RISK: rejecting field/coverage at the boundary breaks a caller that already - creates them. → MITIGATION: verified at HEAD that seeds and the prompt fallback - construct only presence; sweep remaining createGap callers before landing. -- RISK: regenerating migration 0004 under free-rewrite invalidates teammates' - applied local DBs. → MITIGATION: that is the documented posture (.pi/POSTURE.md - migration: free-rewrite); reseed is the supported recovery. Do NOT add - forward-migration ceremony. -- ASSUMPTION: capture-reflection / elicitation-driver (future frontier) will want - situated same-kind gaps; the granularity decision below must not block them. - → IMPACT IF FALSE: an over-tight uniqueness rule would need loosening — one - validator branch, cheap. - → VALIDATE: decision recorded here keeps `manual` open for situated gaps. - → [→ ln-sync should reconcile the decided contract into SPEC D65-L/D75-L] -``` - -**Recorded micro-decision (presence granularity, from ln-induct lens "coarse -presence aliasing"):** a `presence` predicate is a *kind-floor* obligation — -derivation counts nodes of the kind, so two open presence gaps for the same -`nodeKind` would alias (one node answers both). Therefore: `validateGapPredicate` -rejects creating a presence gap when an open presence gap for the same -`(specId, nodeKind)` already exists. Situated same-kind obligations use `manual` -(today) or `field`/`coverage` (when their derivation exists). Reconcile this -contract into SPEC via the planned ln-sync pass. - -### Posture check (earned) - -- **Closes:** the dark-variant ambiguity — whether `field`/`coverage` are supported (they are not, yet) — and the dual-encoding drift between `predicate_kind`, `predicate` JSON, and `refers_to`. -- **Locks in:** the invariant that predicate semantics have exactly one owner: one exhaustive, `never`-checked dispatch that validation and derivation both ride, so adding a union arm without semantics fails to compile. -- **Deletes / retires:** the in-place-rewritten 0004 migration (regenerated clean under free-rewrite), and the validator's silent acceptance of unimplemented arms. - -### Acceptance Criteria - -``` -✓ One exhaustive switch over GapPredicate['kind'] (with a never check) is the single - owner of per-arm validate + derive semantics; command-executor and queries both - ride it; adding an arm without semantics is a compile error. -✓ createGap with a field or coverage predicate returns a structured diagnostic - ("predicate kind not yet supported"), not a persisted row; presence and manual - are deep-validated (presence: valid nodeKind/band, minimum >= 1; manual: shape). -✓ createGap with a presence predicate duplicating an open presence gap for the same - (specId, nodeKind) returns a structured diagnostic naming the existing gap. -✓ rowToElicitationGap derives predicate_kind from the parsed JSON (single source) or - fails loudly on column/JSON mismatch — a hand-corrupted row cannot hydrate into a - silently-wrong gap; pick the single-source option unless the column is load-bearing - for SQL filtering. -✓ Migration 0004 + seeds are regenerated coherently (refers_to consistent with - predicate.nodeKind in every seeded/migrated row); no forward-migration shim exists. -✓ npm run verify green, including a seeded-spec round-trip proving floor gaps still - derive coverage live from the graph (existing behavior preserved). -``` - -### Verification Approach - -``` -- Inner: exhaustiveness is compiler-enforced; unit tests per arm (reject-unimplemented, - presence dedup, manual disposition path unchanged); row-hydration consistency test - with a deliberately mismatched fixture row. -- Middle: CommandExecutor create/read round-trip over a fresh DB from the regenerated - migration + seeds (the migration itself is the fixture). -``` - -### Cross-cutting obligations - -``` -- Anti-shadowing (D65-L): the gaps table holds obligation/disposition/meta only; - domain content stays in the graph. -- All mutations stay on the CommandExecutor spec-local {specId, lsn}/change_log seam. -- Pre-release free-rewrite posture: regenerate, do not preserve the backlog-era or - inconsistent migrated row shapes. -``` - -### Expected touched paths (tentative) - -``` -src/graph/ -├── command-executor.ts ~ -├── command-executor.test.ts ~ -├── queries.ts ~ -├── queries.test.ts ~ -└── schema/ - └── elicitation-gaps.ts ~ -drizzle/ -├── 0004_gaps_node_kind_reference.sql ~ (regenerated) -└── meta/_journal.json ? -``` diff --git a/src/graph/command-executor.test.ts b/src/graph/command-executor.test.ts index a1b4670b..0adb8c08 100644 --- a/src/graph/command-executor.test.ts +++ b/src/graph/command-executor.test.ts @@ -673,6 +673,50 @@ describe('CommandExecutor', () => { }); }); + it('rejects unsupported field and coverage predicates', () => { + for (const predicate of [ + { kind: 'field', nodeKind: 'goal', field: 'title' }, + { kind: 'coverage', subjectKind: 'goal', relation: 'support' }, + ] as const) { + const result = executor.createElicitationGap({ + specId, + refersTo: 'goal', + question: 'Unsupported?', + rationale: 'This arm has no derivation yet.', + band: 'grounding', + predicate, + }); + expect(result).toMatchObject({ + status: 'structural_illegal', + diagnostics: [{ field: 'predicate.kind', message: 'predicate kind not yet supported' }], + }); + } + }); + + it('rejects duplicate open presence kind-floor gaps', () => { + const first = executor.createElicitationGap({ + specId, + refersTo: 'module', + question: 'Name a module.', + rationale: 'One module floor is enough.', + band: 'grounding', + predicate: { kind: 'presence', nodeKind: 'module', minimum: 1 }, + }); + expect(first.status).toBe('success'); + const duplicate = executor.createElicitationGap({ + specId, + refersTo: 'module', + question: 'Name another module.', + rationale: 'This aliases the same floor.', + band: 'grounding', + predicate: { kind: 'presence', nodeKind: 'module', minimum: 1 }, + }); + expect(duplicate).toMatchObject({ + status: 'structural_illegal', + diagnostics: [expect.objectContaining({ field: 'predicate.nodeKind' })], + }); + }); + it('rejects malformed gaps without writing rows or advancing the clock', () => { const result = executor.createElicitationGap({ specId, diff --git a/src/graph/command-executor.ts b/src/graph/command-executor.ts index 0dcccf8f..0cfc309e 100644 --- a/src/graph/command-executor.ts +++ b/src/graph/command-executor.ts @@ -396,6 +396,11 @@ function validateGapPredicate(predicate: GapPredicate, diagnostics: Diagnostic[] return; } + if (predicate.kind === 'field' || predicate.kind === 'coverage') { + diagnostics.push({ field: 'predicate.kind', message: 'predicate kind not yet supported' }); + return; + } + if (predicate.kind === 'presence') { if (!Number.isInteger(predicate.minimum) || predicate.minimum < 1) { diagnostics.push({ field: 'predicate.minimum', message: 'minimum must be a positive integer' }); @@ -583,6 +588,10 @@ function validateCreateElicitationGap(input: CreateElicitationGapInput): Diagnos validateGapPredicate(input.predicate, diagnostics); + if (input.predicate.kind === 'manual' && !input.predicate.rubric.trim()) { + diagnostics.push({ field: 'predicate.rubric', message: 'manual predicate rubric must be non-empty' }); + } + if (input.planeAffinity !== undefined && !isNodePlane(input.planeAffinity)) { diagnostics.push({ field: 'planeAffinity', @@ -816,6 +825,32 @@ export class CommandExecutor { }; } + if (input.predicate.kind === 'presence' && input.predicate.nodeKind !== undefined) { + const duplicate = tx + .select({ id: schema.elicitationGaps.id }) + .from(schema.elicitationGaps) + .where( + and( + eq(schema.elicitationGaps.spec_id, input.specId), + eq(schema.elicitationGaps.predicate_kind, 'presence'), + eq(schema.elicitationGaps.refers_to, input.predicate.nodeKind), + eq(schema.elicitationGaps.disposition, 'open'), + ), + ) + .get(); + if (duplicate) { + return { + status: 'structural_illegal' as const, + diagnostics: [ + { + field: 'predicate.nodeKind', + message: `open presence gap already exists for ${input.predicate.nodeKind}: ${duplicate.id}`, + }, + ], + }; + } + } + if (input.aroseFromGapId != null) { const parent = tx .select({ id: schema.elicitationGaps.id, specId: schema.elicitationGaps.spec_id }) diff --git a/src/graph/queries.test.ts b/src/graph/queries.test.ts index d68ded1f..3c1a53b2 100644 --- a/src/graph/queries.test.ts +++ b/src/graph/queries.test.ts @@ -1,7 +1,8 @@ +import { eq } from 'drizzle-orm'; import { beforeEach, describe, expect, it } from 'vitest'; import { createDb, type BrunchDb } from '../db/connection.js'; -import { graphClock, specs } from '../db/schema.js'; +import { elicitationGaps, graphClock, specs } from '../db/schema.js'; import { CommandExecutor } from './command-executor.js'; import { getElicitationGaps, getOpenReconciliationNeeds } from './queries.js'; import { NODE_KIND_METADATA, parseGraphNodeCode } from './schema/nodes.js'; @@ -119,4 +120,12 @@ describe('getElicitationGaps', () => { false, ); }); + + it('fails loudly when predicate columns diverge from predicate JSON', () => { + const row = db.select().from(elicitationGaps).where(eq(elicitationGaps.spec_id, specId)).get(); + if (!row) throw new Error('expected seeded elicitation gap'); + db.update(elicitationGaps).set({ predicate_kind: 'manual' }).where(eq(elicitationGaps.id, row.id)).run(); + + expect(() => getElicitationGaps(db, specId)).toThrow(/predicate_kind manual does not match/); + }); }); diff --git a/src/graph/queries.ts b/src/graph/queries.ts index d0c44522..4cd6c41f 100644 --- a/src/graph/queries.ts +++ b/src/graph/queries.ts @@ -372,6 +372,16 @@ function rowToElicitationGap(db: BrunchDb, row: typeof schema.elicitationGaps.$i const storedDisposition = row.disposition as GapDisposition; const predicate = JSON.parse(row.predicate) as GapPredicate; + if (row.predicate_kind !== predicate.kind) { + throw new Error( + `elicitation gap ${row.id} predicate_kind ${row.predicate_kind} does not match predicate JSON kind ${predicate.kind}`, + ); + } + if ('nodeKind' in predicate && predicate.nodeKind !== undefined && row.refers_to !== predicate.nodeKind) { + throw new Error( + `elicitation gap ${row.id} refers_to ${row.refers_to} does not match predicate nodeKind ${predicate.nodeKind}`, + ); + } const coverage = deriveGapCoverage(db, row.spec_id, predicate, storedDisposition); const answered = coverage >= 1; const disposition = answered && storedDisposition === 'open' ? 'answered' : storedDisposition; From 34e171a7247081571f5a029c11dbc3c4a57ecec1 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:27:05 +0200 Subject: [PATCH 10/32] Sweep localized review fixes --- memory/cards/dev--review-fix-sweep.md | 109 ------------------ src/.pi/__tests__/chrome.test.ts | 4 - src/.pi/components/chrome-header.ts | 36 ++---- .../components/runtime-posture/axis-picker.ts | 8 +- .../components/workspace-dialog/component.ts | 2 +- src/.pi/extensions/commands/index.ts | 4 +- src/.pi/extensions/tui-lab/index.ts | 7 +- src/dev/faux-harness.test.ts | 25 +++- src/graph/seed-fixtures.ts | 25 ++-- src/web/components/drawer-card.tsx | 10 +- .../features/graph/structured-list-view.tsx | 4 +- 11 files changed, 58 insertions(+), 176 deletions(-) delete mode 100644 memory/cards/dev--review-fix-sweep.md diff --git a/memory/cards/dev--review-fix-sweep.md b/memory/cards/dev--review-fix-sweep.md deleted file mode 100644 index 772291ec..00000000 --- a/memory/cards/dev--review-fix-sweep.md +++ /dev/null @@ -1,109 +0,0 @@ -# Review-fix sweep — localized bot-flagged defects, fixed at top of stack - -Frontier: n/a (dev hygiene; defects originated on PRs #189/#195/#196/#203/#204) -Status: active -Mode: single -Created: 2026-06-11 - -> Sequencing: LAST of the four review-fix work items on `ln/fe-847-turn-boundary-closure` -> (shares `src/app/brunch-tui.ts` with the continuity-chain and gap-legality cards — -> do not build in parallel with them). Each fix is an independent commit-sized edit -> inside a settled seam; no finding here changes a decision or invariant. - -## Objective - -Retire every remaining localized defect from the 2026-06-11 ln-induct pass over stack PR comments, so the stack merges without known bot-flagged correctness or contract nits. - -### Light-card cold-start reads - -``` -- memory/SPEC.md — None load-bearing (D35-L startup-header drift is explicitly - EXCLUDED from this sweep; it goes through ln-sync) -- memory/PLAN.md — category concern: dev hygiene on the FE-847 closure branch -- docs/praxis/pi-types.md — before the duplicate-Component fix (typing over pi APIs) -- Original bot comment text (only if a fix needs more context than the acceptance - line gives): the unresolved review threads on PRs #189/#195/#196/#203/#204 via gh -``` - -### Acceptance Criteria - -Each line is one independent fix; verify and commit in small groups. - -``` -✓ brunch-tui env scoping: applyBrunchOfflineDefault sets PI_SKIP_VERSION_CHECK ??= '1' - alongside PI_OFFLINE (or, if version-check noise is judged not real, the save/restore - ceremony for it is deleted instead — pick one, no half-state); the unused `dev` - param on runWithScopedBrunchOfflineDefault is removed (check call sites first); - both brunch-tui.test.ts env cases assert the chosen PI_SKIP_VERSION_CHECK behavior. -✓ chrome-header: the expand affordance is either reachable (input/shortcut wired to - setExpanded) or the expanded content + "more" copy is removed — no advertised - unwired behavior; the logo render respects truecolor detection consistent with the - workspace dialog (reuse its detection, do not duplicate it). -✓ commands extension: a runtime posture switch immediately refreshes the footer - (render request / runtime-state publish after the switch); the appendCustomEntry - adapter returns the real entry id (or the helper's contract is changed to void if - no caller needs the id — no silent '' placeholder); /brunch:mode messages echo the - actual current/requested mode and list supported modes from the canonical enum, - no hardcoded 'elicit'/'execute' strings. -✓ runtime-posture/axis-picker.ts and tui-lab/index.ts import the pi-tui Component - type instead of redeclaring local Component interfaces (per docs/praxis/pi-types.md). -✓ seed-fixtures CLI: runSeedFixturesCli honors its Promise contract — semantic - failures (unknown seed, unreadable fixture, executor errors) are caught at the CLI - boundary and return usage/error + nonzero exit, never a stack trace; the - brunch.test.ts seeding call asserts the returned exit code. (Partially addressed - already — verify current behavior before patching.) -✓ web: DrawerCard initializes expanded to false when it cannot toggle (defaultExpanded - only honored when canToggle); structured-list-view uses an imported ReactNode type, - no bare React namespace reference. -``` - -### Verification Approach - -``` -- Inner: npm run verify per commit group; targeted unit tests where the fix is - behavioral (env scoping, CLI exit codes, DrawerCard state init). -- Middle: none required — all seams already carry behavioral coverage. -``` - -### Cross-cutting obligations - -``` -- D35-L startup-header behavior and the stale tooling--runtime-state-commands.md card - are OUT of scope here — they are canonical-doc reconciliation, routed to ln-sync. -- Mode/strategy/lens strings come from the canonical vocabulary modules, not new - literals (runtime-vocab-leaf direction, D73-L). -``` - -### Assumption dependency - -None — every fix sits inside a settled seam with named current rationale. - -### Expected touched paths (tentative) - -``` -src/app/ -├── brunch-tui.ts ~ -├── brunch-tui.test.ts ~ -└── brunch.test.ts ~ -src/.pi/ -├── brunch-pi-settings.ts ~ -├── components/ -│ ├── chrome-header.ts ~ -│ └── runtime-posture/axis-picker.ts ~ -└── extensions/ - ├── commands/index.ts ~ - ├── chrome/index.ts ? - └── tui-lab/index.ts ~ -src/graph/ -├── seed-fixtures.ts ~ -└── seed-fixtures.test.ts ? -src/web/ -├── components/drawer-card.tsx ~ -└── features/graph/structured-list-view.tsx ~ -``` - -### Promotion checklist - -All answers no — stays light. (The only near-trip: the entry-id contract fix touches -a helper used by the continuity seam; resolved by fixing the adapter to honor the -existing contract rather than changing the contract.) diff --git a/src/.pi/__tests__/chrome.test.ts b/src/.pi/__tests__/chrome.test.ts index e7b2cd93..9cc0e426 100644 --- a/src/.pi/__tests__/chrome.test.ts +++ b/src/.pi/__tests__/chrome.test.ts @@ -190,10 +190,7 @@ describe('Brunch chrome projection', () => { expect(collapsedLines.join('\n')).toContain('web-ui: http://127.0.0.1:49152/spec/1'); expect(collapsedLines.join('\n')).not.toContain('Press ctrl+o'); expect(collapsedLines.join('\n')).not.toContain('Spec One — session 1'); - component.setExpanded(true); - expect(component.render(120).join('\n')).toContain('Current session: Spec One — session 1'); expect(component.render(120).join('\n')).toContain('web-ui: http://127.0.0.1:49152/spec/1'); - expect(component.render(120).join('\n')).toContain('Graph capture'); const resumedCalls: FakeUiCall[] = []; renderBrunchChrome(fakeChromeUi(resumedCalls), { @@ -233,7 +230,6 @@ describe('Brunch chrome projection', () => { expect(component.render(36).every((line) => !/[\r\n\t]/.test(line))).toBe(true); expect(component.render(36).every((line) => visibleWidth(line) <= 36)).toBe(true); - component.setExpanded(true); expect(component.render(36).every((line) => !/[\r\n\t]/.test(line))).toBe(true); }); diff --git a/src/.pi/components/chrome-header.ts b/src/.pi/components/chrome-header.ts index 1ada1c80..b063cae6 100644 --- a/src/.pi/components/chrome-header.ts +++ b/src/.pi/components/chrome-header.ts @@ -5,6 +5,7 @@ import type { Theme } from '@earendil-works/pi-coding-agent'; import { type Component, truncateToWidth } from '@earendil-works/pi-tui'; import { formatBrunchProductIdentity, readBrunchAnsiLogo } from './brunch-identity.js'; +import { supportsTruecolor } from './workspace-dialog/component.js'; export interface BrunchStartupHeaderFacts { project: string; @@ -20,23 +21,16 @@ const PACKAGE_JSON_URL = new URL('../../../package.json', import.meta.url); const LOCAL_BUILD_TIME = formatBuildTime(new Date()); export class BrunchStartupHeader implements Component { - private expanded = false; - constructor( private readonly facts: BrunchStartupHeaderFacts, private readonly theme: Pick, ) {} - setExpanded(expanded: boolean): void { - this.expanded = expanded; - } - invalidate(): void {} render(width: number): string[] { const safeWidth = Math.max(MIN_WIDTH, width); - const lines = this.expanded ? this.expandedLines() : this.collapsedLines(); - return lines.map((line) => truncateToWidth(line, safeWidth, '...')); + return this.collapsedLines().map((line) => truncateToWidth(line, safeWidth, '...')); } private collapsedLines(): string[] { @@ -49,30 +43,13 @@ export class BrunchStartupHeader implements Component { ]; } - private expandedLines(): string[] { - return [ - ...this.topPaddingLines(), - ...this.identityLines(), - '', - this.shortcutHelpLine(), - this.webOrExpandHelpLine(), - '', - `Project: ${sanitizeText(this.facts.project)}`, - `Selected spec: ${sanitizeText(this.facts.spec)}`, - `Current session: ${sanitizeText(this.facts.session)}`, - 'Graph capture: mention graph items with #codes; accepted graph truth flows through Brunch commands.', - 'Runtime posture: use Brunch mode/strategy/lens controls; AUTO choices stay within the active manifest.', - 'Help: use /brunch to switch spec/session; use structured prompts or chat to continue elicitation.', - ]; - } - private topPaddingLines(): string[] { return Array.from({ length: HEADER_TOP_PADDING_LINES }, () => ''); } private identityLines(): string[] { return formatBrunchProductIdentity({ - logoLines: readBrunchAnsiLogo({ assetUrl: ASSET_DIR, truecolor: true }), + logoLines: readBrunchAnsiLogo({ assetUrl: ASSET_DIR, truecolor: supportsTruecolor() }), version: brunchVersion(), theme: this.theme, }); @@ -81,7 +58,7 @@ export class BrunchStartupHeader implements Component { private shortcutHelpLine(): string { return this.theme.fg( 'dim', - 'escape interrupt · ctrl+c/ctrl+d clear/exit · /brunch switch · # mention · ! bash · ctrl+o more', + 'escape interrupt · ctrl+c/ctrl+d clear/exit · /brunch switch · # mention · ! bash', ); } @@ -89,7 +66,10 @@ export class BrunchStartupHeader implements Component { if (this.facts.sidecarUrl) { return this.theme.fg('dim', `web-ui: ${sanitizeText(this.facts.sidecarUrl)}`); } - return this.theme.fg('dim', 'Press ctrl+o to show full Brunch startup help and active surfaces.'); + return this.theme.fg( + 'dim', + 'Graph capture flows through Brunch commands; runtime posture follows mode/strategy/lens.', + ); } } diff --git a/src/.pi/components/runtime-posture/axis-picker.ts b/src/.pi/components/runtime-posture/axis-picker.ts index 419d8ae6..d68dcdcc 100644 --- a/src/.pi/components/runtime-posture/axis-picker.ts +++ b/src/.pi/components/runtime-posture/axis-picker.ts @@ -1,3 +1,5 @@ +import { type Component } from '@earendil-works/pi-tui'; + import { AGENT_LENS_IDS, AGENT_STRATEGY_IDS, @@ -13,12 +15,6 @@ import { type TrackSegment, } from '../tui-lab/index.js'; -interface Component { - render(width: number): string[]; - handleInput?(data: string): void; - invalidate(): void; -} - interface RuntimeAxisPickerOptions { readonly title: string; readonly current: TSelection; diff --git a/src/.pi/components/workspace-dialog/component.ts b/src/.pi/components/workspace-dialog/component.ts index de192c3d..ef9806fc 100644 --- a/src/.pi/components/workspace-dialog/component.ts +++ b/src/.pi/components/workspace-dialog/component.ts @@ -279,7 +279,7 @@ function readLogo(): string[] { }); } -function supportsTruecolor(): boolean { +export function supportsTruecolor(): boolean { const colorterm = process.env.COLORTERM?.toLowerCase() ?? ''; const term = process.env.TERM?.toLowerCase() ?? ''; return colorterm === 'truecolor' || colorterm === '24bit' || term.includes('truecolor'); diff --git a/src/.pi/extensions/commands/index.ts b/src/.pi/extensions/commands/index.ts index dbace0ae..aa289bbc 100644 --- a/src/.pi/extensions/commands/index.ts +++ b/src/.pi/extensions/commands/index.ts @@ -111,7 +111,7 @@ function applyRuntimeSwitch(pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: getEntries: () => ctx.sessionManager.getEntries(), appendCustomEntry: (customType, data) => { pi.appendEntry(customType, data); - return ''; + return 'brunch-runtime-switch'; }, }, nextState, @@ -119,7 +119,7 @@ function applyRuntimeSwitch(pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: ); pi.setActiveTools( - activeToolNamesForBrunchAgentState(pi, projectBrunchAgentState(ctx.sessionManager.getEntries())), + activeToolNamesForBrunchAgentState(pi, projectBrunchAgentState(ctx.sessionManager.getEntries()), []), ); ctx.ui.notify(`Brunch ${patch.axis} set to ${patch.value}.`, 'info'); } diff --git a/src/.pi/extensions/tui-lab/index.ts b/src/.pi/extensions/tui-lab/index.ts index ac10ce9d..bd4a85b3 100644 --- a/src/.pi/extensions/tui-lab/index.ts +++ b/src/.pi/extensions/tui-lab/index.ts @@ -1,4 +1,5 @@ import { type ExtensionAPI } from '@earendil-works/pi-coding-agent'; +import { type Component } from '@earendil-works/pi-tui'; import { DEMO_MODEL_SEGMENTS, @@ -16,12 +17,6 @@ export interface BrunchTuiLabOptions { readonly enabled?: boolean; } -interface Component { - render(width: number): string[]; - handleInput?(data: string): void; - invalidate(): void; -} - export function registerBrunchTuiLab(pi: ExtensionAPI, options: BrunchTuiLabOptions = {}): void { if (!options.enabled) return; diff --git a/src/dev/faux-harness.test.ts b/src/dev/faux-harness.test.ts index 9c63d5a9..610f6d0b 100644 --- a/src/dev/faux-harness.test.ts +++ b/src/dev/faux-harness.test.ts @@ -95,8 +95,6 @@ describe('createBrunchFauxHarness', () => { chrome: { cwd, spec: { id: 1, title: 'Tier-1 faux spec' }, - phase: 'elicitation', - chatMode: 'responding-to-elicitation', }, }, {}, @@ -106,9 +104,30 @@ describe('createBrunchFauxHarness', () => { coordinator: {} as never, graphMentionSource: { listMentionCandidates: () => [] }, promptContext: () => ({ - spec: { id: 1, name: 'Tier-1 faux spec', readinessGrade: 'commitments_ready' }, + spec: { id: 1, name: 'Tier-1 faux spec' }, workspace: { cwd }, session: { id: 'session-1', label: 'Tier-1 session' }, + graphReads: { + queryGraph: () => ({ nodes: [], edges: [], lsn: 1 }), + getNodes: () => [], + resolveNodeCode: () => undefined, + getElicitationGaps: () => + (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => ({ + id: `${kind}:test`, + specId: 1, + refersTo: kind, + question: `${kind}?`, + rationale: `${kind} rationale`, + basis: 'explicit' as const, + band: 'grounding' as const, + predicate: { kind: 'presence' as const, nodeKind: kind, minimum: 1 }, + importance: 1, + coverage: 1, + answered: true, + disposition: 'answered' as const, + createdAtLsn: 1, + })), + }, }), introspection: { enabled: true, store }, }, diff --git a/src/graph/seed-fixtures.ts b/src/graph/seed-fixtures.ts index 7937ca46..df2021c7 100644 --- a/src/graph/seed-fixtures.ts +++ b/src/graph/seed-fixtures.ts @@ -258,16 +258,21 @@ export async function runSeedFixturesCli(options: SeedCliOptions = {}): Promise< return 1; } - const destinationDb = join(parsed.workspace, '.brunch', 'data.db'); - const fixture = await readSelectedSeed(parsed.seed.set, parsed.seed.slug); - const executor = await openWorkspaceCommandExecutor(parsed.workspace); - const result = seedFixture(executor, fixture); - stdout( - `seeded ${parsed.seed.ref} → spec ${result.specId} ` + - `(${result.nodeCount} nodes, ${result.edgeCount} edges)\n`, - ); - stdout(`Destination: ${destinationDb}\n`); - return 0; + try { + const destinationDb = join(parsed.workspace, '.brunch', 'data.db'); + const fixture = await readSelectedSeed(parsed.seed.set, parsed.seed.slug); + const executor = await openWorkspaceCommandExecutor(parsed.workspace); + const result = seedFixture(executor, fixture); + stdout( + `seeded ${parsed.seed.ref} → spec ${result.specId} ` + + `(${result.nodeCount} nodes, ${result.edgeCount} edges)\n`, + ); + stdout(`Destination: ${destinationDb}\n`); + return 0; + } catch (error) { + stderr(`${error instanceof Error ? error.message : String(error)}\n`); + return 1; + } } function parseSeedCliArgs(argv: readonly string[], cwd: string): ParsedSeedCliArgs | null { diff --git a/src/web/components/drawer-card.tsx b/src/web/components/drawer-card.tsx index c9615927..660d6e40 100644 --- a/src/web/components/drawer-card.tsx +++ b/src/web/components/drawer-card.tsx @@ -1,4 +1,4 @@ -import { useId, useState } from 'react'; +import { type ReactNode, useId, useState } from 'react'; // ── Drawer card — reusable card-with-collapsible-drawer ───────────── // @@ -20,9 +20,9 @@ export function DrawerCard({ locked = false, compact = false, }: { - header: React.ReactNode; - summary?: React.ReactNode; - children?: React.ReactNode; + header: ReactNode; + summary?: ReactNode; + children?: ReactNode; defaultExpanded?: boolean; /** When true, the header is not clickable and state does not toggle. */ locked?: boolean; @@ -32,7 +32,7 @@ export function DrawerCard({ const hasDrawer = children !== undefined && children !== null; const hasSummary = summary !== undefined && summary !== null; const canToggle = hasDrawer && !locked; - const [expanded, setExpanded] = useState(defaultExpanded); + const [expanded, setExpanded] = useState(canToggle && defaultExpanded); const drawerId = useId(); const showDrawer = expanded ? hasDrawer : hasSummary; diff --git a/src/web/features/graph/structured-list-view.tsx b/src/web/features/graph/structured-list-view.tsx index c5258fa7..2d7d57e5 100644 --- a/src/web/features/graph/structured-list-view.tsx +++ b/src/web/features/graph/structured-list-view.tsx @@ -1,4 +1,4 @@ -import { useRef, useState } from 'react'; +import { type ReactNode, useRef, useState } from 'react'; import type { GraphSlice } from '../../../graph/queries.js'; import { NODE_KIND_METADATA, type NodeKind } from '../../../graph/schema/nodes.js'; @@ -230,7 +230,7 @@ function EmptyState({ }: { title: string; description: string; - action?: React.ReactNode; + action?: ReactNode; }) { return (
From 1d2c91f9f24b11f9283ec574e30b26a6cded7314 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 13:29:28 +0200 Subject: [PATCH 11/32] Handle absent prompt gaps safely --- src/.pi/agents/state.ts | 3 +++ src/projections/session/runtime-policy.ts | 1 + 2 files changed, 4 insertions(+) diff --git a/src/.pi/agents/state.ts b/src/.pi/agents/state.ts index f1972b7a..6645a406 100644 --- a/src/.pi/agents/state.ts +++ b/src/.pi/agents/state.ts @@ -210,6 +210,9 @@ export function manifestsForState( `Agent "${state.agentRole}" is not legal in operational mode "${state.operationalMode}".`, ); } + if (gaps.length === 0) { + return { goals: [], strategies: [], lenses: [], methods: [] }; + } return { goals: selectAxisResources({ diff --git a/src/projections/session/runtime-policy.ts b/src/projections/session/runtime-policy.ts index 70ecfdcb..1d268fd0 100644 --- a/src/projections/session/runtime-policy.ts +++ b/src/projections/session/runtime-policy.ts @@ -139,6 +139,7 @@ export function axisOptionsForRuntimeState( state: ResolvedBrunchAgentState, gaps: readonly ElicitationGap[], ): readonly (AgentGoalId | AgentStrategyId | AgentLensId)[] { + if (gaps.length === 0) return []; if (axis === 'goal') { return state.agentRoleDefinition.allowedGoals.filter((id) => isCapabilityLegalForGaps(GOAL_CAPABILITY[id], gaps), From 70330f06f92de42c74f49ea4026f754d57990b5c Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:03:52 +0200 Subject: [PATCH 12/32] Restore PLAN honesty for FE-847 residual closure The kick-and-context-seeding frontier was marked done while its four I46 resume-origination scaffold rows and two I47 idempotence rows remain it.todo in the Tier-2 suite. Revert it to active with an honest pointer, note the I47 residue on turn-boundary-reconciliation, and file the remediation sequence as memory/REFACTOR.md. Co-Authored-By: Claude Fable 5 --- memory/PLAN.md | 8 +-- memory/REFACTOR.md | 149 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 153 insertions(+), 4 deletions(-) create mode 100644 memory/REFACTOR.md diff --git a/memory/PLAN.md b/memory/PLAN.md index 47c1d624..0d7d223d 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -84,7 +84,7 @@ per ledger row: ### Active -- `turn-boundary-reconciliation` (FE-847) — remaining FE-847 closure on the shared branch: flip the skipped Tier-2 I45/I47 scaffold live, prove submit-time mention resolution and staleness through the real session path, and preserve the latest watermark carrier across compaction/resume. +- `kick-and-context-seeding` (FE-847) — residual closure on the shared branch: the four Tier-2 I46 resume-origination scaffold rows (pre-reconcile-tail kick, `request_*`/system idle against the real exchange result envelope, crash-after-notice re-kick, drains-don't-mask-debt) and the two I47 idempotence rows (boot/resume seed dedupe; dedicated no-redundant-`worldUpdate`-after-seed row) remain `it.todo`; the frontier is not done until they run live. Remediation sequence: `memory/REFACTOR.md`. ### Turn-boundary choreography (Tier-2 layer) @@ -188,7 +188,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The `prepareNextTurn` reconciler and watermark projection land at their final homes (`src/session/` reconciler, `src/projections/session/` watermark) filling the FE-847 topology stubs; submit-time mention resolution at `session.submitMessage`; tool-result watermark stamping at the graph read/mutation adapters. - **Traceability:** D14-L, D15-L, D17-L, D37-L, D43-L, D49-L, D76-L, D77-L; A4-L, A9-L; I1-L, I4-L, I9-L, I45-L, I47-L. - **Design docs:** `memory/SPEC.md` D76-L–D77-L, I9-L, I45-L, I47-L; `src/session/README.md`; `src/projections/README.md`; `src/projections/session/runtime-state.ts`. -- **Current execution pointer:** Done 2026-06-11 on FE-847. The Tier-2 I45 scaffold is live, the live provider guard delegates to `guardBeforeProviderRequest`, submit-time mention facts feed the live reconciler staleness path, side-task/reviewer drains are threaded through the adapter, and the compaction anchor contract preserves the latest watermark carrier family (`brunch.context_seed`, `brunch.graph_overview_snapshot`, `brunch.own_mutation`, `worldUpdate`). +- **Current execution pointer:** Done 2026-06-11 on FE-847. The Tier-2 I45 scaffold is live, the live provider guard delegates to `guardBeforeProviderRequest`, submit-time mention facts feed the live reconciler staleness path, side-task/reviewer drains are threaded through the adapter, and the compaction anchor contract preserves the latest watermark carrier family (`brunch.context_seed`, `brunch.graph_overview_snapshot`, `brunch.own_mutation`, `worldUpdate`). **Residue:** the frontier's S5 share of I47 (the dedicated post-seed `worldUpdate` scaffold row and boot/resume dedupe idempotence) remains `it.todo`, carried to completion with the `kick-and-context-seeding` residual closure (`memory/REFACTOR.md` commit 9); compaction-survival is proven at projection level, not yet through an actual restart. ### kick-and-context-seeding @@ -196,7 +196,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `turn-boundary-reconciliation`). - **Kind:** structural / product mechanics -- **Status:** done 2026-06-11 (turn-boundary choreography; not POC-ship-critical) +- **Status:** active — residual closure (resume-origination + idempotence proofs); not POC-ship-critical - **Certainty:** proving - **Retires:** the R16 origination gap — proof that a structured-strategy session can originate its own offer-first turn honestly (no fabricated user entry) and seed context idempotently across real restart/resume. - **Depends on:** `turn-boundary-reconciliation` (S1 watermark projection + S2 reconciler — the seed must advance the watermark and the kick decision interacts with reconciler-inserted notices) and the `dx-tier-2-harness` chassis. Sequenced last in the FE-847 slice chain. @@ -216,7 +216,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The origination primitive (`startAssistantTurn`) lands in the session orchestration layer (`src/session/`) filling the FE-847 stub; `session.triggerExchange` is the public surface (D49-L); context seeding writes custom continuity entries through the same carrier as `worldUpdate`. - **Traceability:** D12-L, D37-L, D49-L, D66-L, D75-L, D76-L, D78-L; R16; I13-L, I46-L, I47-L. - **Design docs:** `memory/SPEC.md` D78-L, I46-L, I47-L; `src/session/README.md`. -- **Current execution pointer:** Done 2026-06-11 on FE-847. New-session real boot seeds context and appends the assistant-originated `present_*` exchange before provider preflight, resume-tail classification ignores continuity-only notices, request-result terminal statuses (`answered` / `cancelled` / `unavailable`) idle instead of re-kicking, and explicit `freestyle` remains the only user-wait strategy pin. +- **Current execution pointer:** Partially landed 2026-06-11 on FE-847. New-session real boot seed-then-kick is proven live through Tier-2 (seed before first provider call, assistant-originated `present_*`, no fabricated user entry, no redundant `worldUpdate` after seed). **Not yet proven** — the resume side exists only as helper-level unit tests; the four Tier-2 I46 scaffold rows (pre-reconcile-tail kick behind continuity notices, `request_*`/system idle against the real exchange result envelope, crash-after-notice re-kick, drains-don't-mask-debt) and two I47 idempotence rows (boot/resume seed dedupe from transcript projection; the dedicated post-seed `worldUpdate` row) remain `it.todo` in `src/dev/tier-2-harness.test.ts`. Remediation commits 8-9 in `memory/REFACTOR.md` close them; the frontier completes when those rows run live. ### project-graph-review-cycle diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md new file mode 100644 index 00000000..a222880c --- /dev/null +++ b/memory/REFACTOR.md @@ -0,0 +1,149 @@ +# Refactor: review-fix remediation — close the gaps the build pass papered over + +Created: 2026-06-11 · Temporary execution aid; delete when complete or superseded. +Context: post-build audit of commits ac84abb2..bbc4b4e6 against the (now-deleted) +review-fix scope cards. Verified findings, not speculation. + +## Problem Statement + +The build pass delivered the continuity chain and new-session kick well, but left +five classes of debt, two of them dishonest rather than merely incomplete: + +1. **Claimed-done work that is todo.** PLAN marks `kick-and-context-seeding` done, + but the four I46 resume-origination scaffold rows and two I47 idempotence rows + remain `it.todo` in the Tier-2 suite (`src/dev/tier-2-harness.test.ts:345-377`) + — including the behaviors PLAN's pointer explicitly claims proven (request-result + terminal statuses idle; resume-tail classification ignores continuity notices). +2. **The silent-fallback lens rebuilt in new clothes.** The gap-legality fix made + `getElicitationGaps` required on `GraphReaders` but left `graphReads` optional on + the prompt context, falling back to `?? []`; an out-of-card commit then absorbed + the empty case with quiet empty-manifest/empty-options early-returns at two more + layers. Missing-wiring is again invisible — three layers deep now. The Tier-2 + real-boot legality assertion the card required does not exist. +3. **A placeholder swapped for a placeholder.** The runtime-switch append adapter + returns a hardcoded string instead of `''`; the helper's declared contract + (returns the created entry id) is still violated. The footer still has no + re-render trigger after a posture switch, so the stale-footer bug survives. +4. **Half-state env scoping.** `runWithScopedBrunchOfflineDefault` still accepts a + `dev` flag it never reads, and still saves/restores `PI_SKIP_VERSION_CHECK` + without ever setting it. +5. **Silently narrowed acceptance.** The predicate-semantics "one exhaustive + never-checked owner" was not built (if-chains; a new union arm without semantics + still compiles), and migration 0004 + seeds were not regenerated; PLAN's pointer + was rewritten to omit both rather than flag them. + +```pseudo graph (current — gap legality) +brunch-tui reads ──required──▶ GraphReaders.getElicitationGaps ✓ +prompt context ──optional?──▶ gapsForPrompt ──?? []──▶ legality layers + └─ gaps.length===0 → quiet empty posture (×2 layers) +Tier-2 suite ──╳ no real-boot legality assertion +``` + +```pseudo graph (desired — gap legality) +brunch-tui reads ──required──▶ GraphReaders.getElicitationGaps ✓ +prompt context ──required──▶ gapsForPrompt (no fallback) +empty gaps on a seeded spec ──▶ loud invariant error (wiring bug, not a posture) +Tier-2 suite ──✓ real boot: seeded coverage drives manifests/tool legality +``` + +## Solution + +Every claim in PLAN matches the test suite; every wiring absence is a compile error +or a loud runtime error, never a quiet posture; every declared contract is honored +by its adapters; the six remaining scaffold rows run live through the real +boot/resume harness (the resume chassis `resumeTier2Fixture` already exists). + +## Commits + +Ordered by safety: doc honesty → contract/structural alignment → small behavioral → +type-contract tightening → live proofs (riskiest last, since they may reveal the +resume kick path needs product fixes). + +1. **PLAN honesty.** Revert the kick-and-context-seeding frontier to active with a + pointer naming exactly what remains (the six todo rows); amend the + turn-boundary-reconciliation pointer to note the I47 idempotence residue it + shares. Doc-only. +2. **Honest entry-id contract.** Either thread the real entry id from the Pi append + API through the runtime-switch adapter, or — if Pi does not return one — change + the helper signature and its session-manager interface to void and delete the + return-value expectation everywhere. No placeholder values of any kind survive. +3. **Predicate-semantics single owner.** Extract one exhaustive switch over the + predicate kind (never-checked) that both boundary validation and coverage + derivation ride, preserving current behavior exactly (presence implemented; + field/coverage rejected loudly; manual pass-through). Adding a union arm without + semantics becomes a compile error. Pure structure, no behavior change. +4. **Env-scoping pick-one.** Remove the dead dev flag from the scoped-offline + helper (no caller branches on it); make the offline default also set the + version-check skip variable — or, if the version-check noise is judged not real, + delete its save/restore instead. Both env-scope test cases assert the chosen + end state. No half-state. +5. **Footer refresh on posture switch.** After a runtime switch the chrome footer + re-renders from re-projected state, via the existing footer render-request + binding seam. A test pins switch-then-render shows the new strategy/lens. +6. **Loud gap-legality contract.** Make the graph readers required on the prompt + context for the production composition path (harness/test constructors that + genuinely lack a reader use an explicitly named narrowed type, not optionality); + delete the empty-array fallback; replace the two quiet empty-gaps early-returns + with a loud invariant error (a seeded spec always has floor gaps — empty means + wiring bug); document on the context type which optional members are + intended-optional and why. Compiler finds every construction site. +7. **Tier-2 live-legality assertion.** Real-boot test: a session over a seeded spec + derives prompt/tool legality from that spec's actual gap coverage, and covered + floor gaps unlock posture that uncovered gaps keep locked. This is the missing + card acceptance and the durable oracle for commit 6. +8. **Flip the I46 resume rows live.** The four todo rows through the existing + resume-fixture chassis: pre-reconcile user tail still earns a kick behind + continuity notices; request/system leaves stay idle — proven against the real + exchange result envelope as the exchanges extension writes it, settling the + response-status question; crash-after-notice still kicks on unresolved debt; + trailing drains neither manufacture nor mask debt. Fold in whatever product + fixes the tests force (this commit may split if they do). +9. **Flip the I47 idempotence rows live.** Repeated boot does not duplicate seed or + world-update entries (dedupe derived from transcript projection); the dedicated + no-redundant-world-update-after-seed row asserts through real boot; the + sets-and-properties meta-row either becomes a real assertion helper used by the + suite or is retired as a stated suite convention rather than a phantom todo. +10. **Migration coherence — SUSPENDED (2026-06-11).** Another agent is fixing the + 0004 migration on the branch stacked on top of this one. Do not touch drizzle/ + in this refactor; the derive-with-'context'-fallback vs read-side-throw concern + is handed to that branch. Re-check on reintegration that the concern was + actually covered there before deleting this line. + +## Decisions + +- Runtime-switch append contract: real id or void — resolved by what the Pi API + returns; recorded when commit 2 lands. +- Prompt-context reader optionality: production path requires readers; narrowed + harness type is the only sanctioned readerless construction. +- Empty gaps on a seeded spec is an invariant violation (loud), not a legal posture + (quiet). Reverses the out-of-card "handle absent gaps safely" patch. +- Predicate semantics get exactly one exhaustive owner module/function; validate + and derive are its two riders. +- Migration 0004: regenerate vs waive — explicit user call in commit 10. +- Topology READMEs: none expected to change (no files move); if commit 3's + extraction adds a module under the graph schema sub-tree, that directory has no + README to update. + +## Testing Decisions + +- The Tier-2 suite is the oracle of record for resume origination, idempotence, + and live legality — real boot/resume, set/property assertions over + `{specId, lsn}`, never payload-order goldens (suite convention). +- The request-idle proof must use a fixture carrying the exchange result envelope + exactly as the exchanges extension writes it — that fixture IS the test of the + response-status classifier; a hand-built shape would re-prove nothing. +- Commit 3 is behavior-preserving: existing predicate unit tests must pass + unchanged; only their organization may move. +- Prior art: the live I45 rows and the new-session seed-then-kick test show the + established real-boot assertion style to follow. + +## Out of Scope + +- The ln-sync canonical-doc pass: D35-L vs startup-header behavior, the stale + `memory/cards/tooling--runtime-state-commands.md` card, the live-vs-harness + blind-spot row for SPEC, and graduating the two induct lenses into ln-review. +- Any restacking or editing of parent branches (user decision: fix at top of stack). +- Drains live production: no side-task/reviewer drain producer exists yet; the + optional supplier stays, but commit 8's drain row documents that intent where the + classifier consumes it. +- New product behavior beyond what flipping the scaffold rows forces. From 3b60d9429b4a20edad395c7ef5f90ccb0947775a Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:08:53 +0200 Subject: [PATCH 13/32] Make runtime-state append contract honest No caller consumes the appended entry id, and the extension-API write channel (pi.appendEntry) cannot supply one. Change the session-manager seam and appendBrunchAgentRuntimeSwitch to void, make appendBrunchAgentRuntimeInit return an appended/skipped boolean (the only meaningful sentinel it carried), and delete the hardcoded placeholder id in the commands adapter. Co-Authored-By: Claude Fable 5 --- src/.pi/__tests__/operational-mode.test.ts | 6 +++--- src/.pi/extensions/commands/index.ts | 1 - src/session/runtime-state.ts | 13 +++++++------ 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/.pi/__tests__/operational-mode.test.ts b/src/.pi/__tests__/operational-mode.test.ts index d537f7f1..95209251 100644 --- a/src/.pi/__tests__/operational-mode.test.ts +++ b/src/.pi/__tests__/operational-mode.test.ts @@ -203,8 +203,8 @@ describe('Brunch agent runtime-state projection', () => { it('appends init only when the transcript has no valid runtime state', () => { const manager = new FakeRuntimeStateSessionManager(); - expect(appendBrunchAgentRuntimeInit(manager)).toBe('entry-1'); - expect(appendBrunchAgentRuntimeInit(manager)).toBeUndefined(); + expect(appendBrunchAgentRuntimeInit(manager)).toBe(true); + expect(appendBrunchAgentRuntimeInit(manager)).toBe(false); expect(manager.entries).toHaveLength(1); expect(manager.entries[0]?.data).toEqual({ schemaVersion: 1, @@ -225,7 +225,7 @@ describe('Brunch agent runtime-state projection', () => { agentGoal: 'capture-posture', }; - expect(appendBrunchAgentRuntimeSwitch(manager, latestState, 'user')).toBe('entry-2'); + appendBrunchAgentRuntimeSwitch(manager, latestState, 'user'); expect(manager.entries[1]?.data).toEqual({ schemaVersion: 1, diff --git a/src/.pi/extensions/commands/index.ts b/src/.pi/extensions/commands/index.ts index aa289bbc..e5ff4543 100644 --- a/src/.pi/extensions/commands/index.ts +++ b/src/.pi/extensions/commands/index.ts @@ -111,7 +111,6 @@ function applyRuntimeSwitch(pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: getEntries: () => ctx.sessionManager.getEntries(), appendCustomEntry: (customType, data) => { pi.appendEntry(customType, data); - return 'brunch-runtime-switch'; }, }, nextState, diff --git a/src/session/runtime-state.ts b/src/session/runtime-state.ts index 1a5f835e..c4e2395e 100644 --- a/src/session/runtime-state.ts +++ b/src/session/runtime-state.ts @@ -155,7 +155,7 @@ export function latestValidBrunchAgentStateEntryData( export interface BrunchAgentStateEntrySessionManager { getEntries(): readonly CustomEntryLike[]; - appendCustomEntry(customType: string, data: BrunchAgentStateEntryData): string; + appendCustomEntry(customType: string, data: BrunchAgentStateEntryData): void; } function requireValidBrunchAgentState(state: BrunchAgentState): BrunchAgentState { @@ -169,29 +169,30 @@ function requireValidBrunchAgentState(state: BrunchAgentState): BrunchAgentState export function appendBrunchAgentRuntimeInit( sessionManager: BrunchAgentStateEntrySessionManager, source: BrunchAgentStateEntryData['source'] = 'extension', -): string | undefined { +): boolean { if (latestValidBrunchAgentStateEntryData(sessionManager.getEntries())) { - return undefined; + return false; } - return sessionManager.appendCustomEntry(BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, { + sessionManager.appendCustomEntry(BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, { schemaVersion: 1, reason: 'init', state: DEFAULT_BRUNCH_AGENT_STATE, source, }); + return true; } export function appendBrunchAgentRuntimeSwitch( sessionManager: BrunchAgentStateEntrySessionManager, state: BrunchAgentState, source: BrunchAgentStateEntryData['source'] = 'user', -): string { +): void { const validState = requireValidBrunchAgentState(state); const previous = latestValidBrunchAgentStateEntryData(sessionManager.getEntries())?.state ?? DEFAULT_BRUNCH_AGENT_STATE; - return sessionManager.appendCustomEntry(BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, { + sessionManager.appendCustomEntry(BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, { schemaVersion: 1, reason: 'switch', state: validState, From 2a60234566c27db48067fe73b3f9d8fe485e9a89 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:11:23 +0200 Subject: [PATCH 14/32] Extract exhaustive gap-predicate semantics owner gapPredicateSupport in the union's owning schema module classifies every arm (structural / manual / unsupported) behind a never check; boundary validation and coverage derivation both ride it. Adding a GapPredicate arm without deciding its semantics is now a compile error, and a structural arm without a derivation fails loud at read instead of silently deriving 0. Behavior-preserving. Co-Authored-By: Claude Fable 5 --- src/graph/command-executor.ts | 9 +++++++-- src/graph/queries.ts | 11 +++++++++-- src/graph/schema/elicitation-gaps.ts | 25 +++++++++++++++++++++++++ 3 files changed, 41 insertions(+), 4 deletions(-) diff --git a/src/graph/command-executor.ts b/src/graph/command-executor.ts index 0cfc309e..e8bc042a 100644 --- a/src/graph/command-executor.ts +++ b/src/graph/command-executor.ts @@ -34,7 +34,12 @@ import type { } from './command-executor/graph-mutation-types.js'; import { writeGraphMutation } from './command-executor/graph-mutation-writer.js'; import { translateReviewSetPayloadToMutateGraph } from './review-set.js'; -import type { ElicitationGapLensAffinity, GapDisposition, GapPredicate } from './schema/elicitation-gaps.js'; +import { + gapPredicateSupport, + type ElicitationGapLensAffinity, + type GapDisposition, + type GapPredicate, +} from './schema/elicitation-gaps.js'; import { DESIGN_KINDS, INTENT_KINDS, @@ -396,7 +401,7 @@ function validateGapPredicate(predicate: GapPredicate, diagnostics: Diagnostic[] return; } - if (predicate.kind === 'field' || predicate.kind === 'coverage') { + if (gapPredicateSupport(predicate.kind) === 'unsupported') { diagnostics.push({ field: 'predicate.kind', message: 'predicate kind not yet supported' }); return; } diff --git a/src/graph/queries.ts b/src/graph/queries.ts index 4cd6c41f..ee29aad2 100644 --- a/src/graph/queries.ts +++ b/src/graph/queries.ts @@ -13,7 +13,12 @@ import type { BrunchDb } from '../db/connection.js'; import * as schema from '../db/schema.js'; import type { Lsn } from './atoms.js'; import type { EdgeCategory, GraphEdge } from './schema/edges.js'; -import type { ElicitationGap, GapDisposition, GapPredicate } from './schema/elicitation-gaps.js'; +import { + gapPredicateSupport, + type ElicitationGap, + type GapDisposition, + type GapPredicate, +} from './schema/elicitation-gaps.js'; import { NODE_KIND_METADATA, parseGraphNodeCode, @@ -361,8 +366,10 @@ function deriveGapCoverage( ): number { if (disposition === 'not_applicable' || disposition === 'irrelevant' || disposition === 'answered') return 1; + // manual rides disposition only; unsupported arms are boundary-rejected (gapPredicateSupport). + if (gapPredicateSupport(predicate.kind) !== 'structural') return 0; if (predicate.kind === 'presence') return derivePresenceCoverage(db, specId, predicate); - return 0; + throw new Error(`structural gap predicate kind ${predicate.kind} has no derivation implemented`); } function rowToElicitationGap(db: BrunchDb, row: typeof schema.elicitationGaps.$inferSelect): ElicitationGap { diff --git a/src/graph/schema/elicitation-gaps.ts b/src/graph/schema/elicitation-gaps.ts index 59904bbe..a57a802a 100644 --- a/src/graph/schema/elicitation-gaps.ts +++ b/src/graph/schema/elicitation-gaps.ts @@ -18,6 +18,31 @@ export type GapPredicateKind = (typeof GAP_PREDICATE_KINDS)[number]; export type ElicitationGapLensAffinity = (typeof LENS_AFFINITIES)[number]; +/** + * Single owner of per-arm predicate semantics. Boundary validation + * (CommandExecutor rejects `unsupported` arms) and coverage derivation + * (queries derive only `structural` arms; `manual` rides disposition) + * both consume this classifier. The never check makes adding a + * GapPredicate arm without deciding its semantics a compile error. + */ +export type GapPredicateSupport = 'structural' | 'manual' | 'unsupported'; + +export function gapPredicateSupport(kind: GapPredicateKind): GapPredicateSupport { + switch (kind) { + case 'presence': + return 'structural'; + case 'manual': + return 'manual'; + case 'field': + case 'coverage': + return 'unsupported'; + default: { + const unhandled: never = kind; + throw new Error(`Unhandled gap predicate kind: ${String(unhandled)}`); + } + } +} + export type GapPredicate = | { readonly kind: 'presence'; From 828c1cf90581abda977e40142afe595f08da62a9 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:13:24 +0200 Subject: [PATCH 15/32] Finish scoped offline env contract: set skip-version-check, drop dead dev flag MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit applyBrunchOfflineDefault now sets PI_SKIP_VERSION_CHECK alongside PI_OFFLINE (the save/restore scaffolding's intent — offline launches emit no version-check noise), never overriding user-provided values. The dev flag on runWithScopedBrunchOfflineDefault was accepted but never read; removed. Env tests assert set-during-run and restore-after for both variables. Co-Authored-By: Claude Fable 5 --- src/.pi/brunch-pi-settings.ts | 5 ++++- src/app/brunch-tui.test.ts | 25 ++++++------------------- src/app/brunch-tui.ts | 2 -- 3 files changed, 10 insertions(+), 22 deletions(-) diff --git a/src/.pi/brunch-pi-settings.ts b/src/.pi/brunch-pi-settings.ts index 646c5533..525b4fc7 100644 --- a/src/.pi/brunch-pi-settings.ts +++ b/src/.pi/brunch-pi-settings.ts @@ -147,8 +147,11 @@ export function brunchResourceLoaderOptions( }; } -export function applyBrunchOfflineDefault(env: { PI_OFFLINE?: string } = process.env): void { +export function applyBrunchOfflineDefault( + env: { PI_OFFLINE?: string; PI_SKIP_VERSION_CHECK?: string } = process.env, +): void { env.PI_OFFLINE ??= '1'; + env.PI_SKIP_VERSION_CHECK ??= '1'; } export function createBrunchSettingsManager(_cwd: string, _agentDir: string): SettingsManager { diff --git a/src/app/brunch-tui.test.ts b/src/app/brunch-tui.test.ts index 94f296da..452d8343 100644 --- a/src/app/brunch-tui.test.ts +++ b/src/app/brunch-tui.test.ts @@ -409,30 +409,18 @@ describe('Brunch TUI boot', () => { }); it('scopes Pi startup update suppression and restores update-check env in finally', async () => { - const productEnv: { PI_OFFLINE?: string; PI_SKIP_VERSION_CHECK?: string } = {}; + const scopedEnv: { PI_OFFLINE?: string; PI_SKIP_VERSION_CHECK?: string } = {}; await expect( runWithScopedBrunchOfflineDefault({ - dev: false, - env: productEnv, + env: scopedEnv, run: async () => { - expect(productEnv.PI_OFFLINE).toBe('1'); + expect(scopedEnv.PI_OFFLINE).toBe('1'); + expect(scopedEnv.PI_SKIP_VERSION_CHECK).toBe('1'); }, }), ).resolves.toBeUndefined(); - expect(productEnv.PI_OFFLINE).toBeUndefined(); - expect(productEnv.PI_SKIP_VERSION_CHECK).toBeUndefined(); - - const devEnv: { PI_OFFLINE?: string; PI_SKIP_VERSION_CHECK?: string } = {}; - await expect( - runWithScopedBrunchOfflineDefault({ - dev: true, - env: devEnv, - run: async () => { - expect(devEnv.PI_OFFLINE).toBe('1'); - }, - }), - ).resolves.toBeUndefined(); - expect(devEnv.PI_OFFLINE).toBeUndefined(); + expect(scopedEnv.PI_OFFLINE).toBeUndefined(); + expect(scopedEnv.PI_SKIP_VERSION_CHECK).toBeUndefined(); const overriddenEnv: { PI_OFFLINE?: string; PI_SKIP_VERSION_CHECK?: string } = { PI_OFFLINE: 'already-offline', @@ -440,7 +428,6 @@ describe('Brunch TUI boot', () => { }; await expect( runWithScopedBrunchOfflineDefault({ - dev: true, env: overriddenEnv, run: async () => { expect(overriddenEnv.PI_OFFLINE).toBe('already-offline'); diff --git a/src/app/brunch-tui.ts b/src/app/brunch-tui.ts index 79fa2468..398a08d8 100644 --- a/src/app/brunch-tui.ts +++ b/src/app/brunch-tui.ts @@ -470,7 +470,6 @@ async function launchPiInteractive(context: BrunchTuiLaunchContext): Promise { await new InteractiveMode(runtime).run(); @@ -479,7 +478,6 @@ async function launchPiInteractive(context: BrunchTuiLaunchContext): Promise Promise; }): Promise { From 49c23c49e06a89f94d4394a625d61d1dbee584b8 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:18:32 +0200 Subject: [PATCH 16/32] Refresh chrome footer after runtime posture switches The footer already re-projects strategy/lens from the transcript at render time; nothing requested a render after /brunch:strategy or /brunch:lens, so the footer kept showing launch-time values. Wire a chrome-refresh handle at the composition root: chrome binds its footer render-request into it, and a successful runtime switch calls it (not on rejection or picker cancel). Co-Authored-By: Claude Fable 5 --- .../__tests__/runtime-switch-command.test.ts | 23 ++++++++++++-- src/.pi/brunch-pi-extensions.ts | 14 +++++++-- src/.pi/extensions/chrome/index.ts | 7 ++++- src/.pi/extensions/commands/index.ts | 30 +++++++++++++------ 4 files changed, 60 insertions(+), 14 deletions(-) diff --git a/src/.pi/__tests__/runtime-switch-command.test.ts b/src/.pi/__tests__/runtime-switch-command.test.ts index bcf35639..2977a850 100644 --- a/src/.pi/__tests__/runtime-switch-command.test.ts +++ b/src/.pi/__tests__/runtime-switch-command.test.ts @@ -40,6 +40,7 @@ function commandHarness(options: { customResult?: unknown; customAvailable?: boo const commands = new Map(); const activeToolNames: string[][] = []; const customCalls: Array<{ factory: (...args: unknown[]) => unknown; options: unknown }> = []; + const chromeRefreshes: number[] = []; const ctx: FakeCommandContext = { ui: { notify(message, level) { @@ -74,10 +75,15 @@ function commandHarness(options: { customResult?: unknown; customAvailable?: boo activeToolNames.push(names); }, } as never, - { coordinator: {} as never }, + { + coordinator: {} as never, + requestChromeRefresh: () => { + chromeRefreshes.push(chromeRefreshes.length + 1); + }, + }, ); - return { commands, ctx, entries, notifications, activeToolNames, customCalls }; + return { commands, ctx, entries, notifications, activeToolNames, customCalls, chromeRefreshes }; } describe('Brunch runtime switch commands', () => { @@ -195,6 +201,19 @@ describe('Brunch runtime switch commands', () => { ]); }); + it('requests a chrome refresh after a successful runtime switch and not on rejection or cancel', async () => { + const harness = commandHarness({ customResult: undefined }); + + await harness.commands.get(BRUNCH_STRATEGY_COMMAND)?.handler('propose-graph', harness.ctx); + expect(harness.chromeRefreshes).toHaveLength(1); + + await harness.commands.get(BRUNCH_LENS_COMMAND)?.handler('unknown-lens', harness.ctx); + expect(harness.chromeRefreshes).toHaveLength(1); + + await harness.commands.get(BRUNCH_LENS_COMMAND)?.handler('', harness.ctx); + expect(harness.chromeRefreshes).toHaveLength(1); + }); + it('reports mode and accepts explicit elicit as a no-op instead of inventing future modes', async () => { const harness = commandHarness(); diff --git a/src/.pi/brunch-pi-extensions.ts b/src/.pi/brunch-pi-extensions.ts index d6fc0c7b..70f4c001 100644 --- a/src/.pi/brunch-pi-extensions.ts +++ b/src/.pi/brunch-pi-extensions.ts @@ -144,6 +144,7 @@ export function createBrunchPiExtensions( const continuityStep = options.graph ? createPrepareNextTurnContinuityStep(options.graph, options.continuityDrains) : undefined; + const chromeRefresh: { current: (() => void) | null } = { current: null }; const extensions: BrunchProductExtensionRegistrar[] = [ (api) => { registerBrunchSessionBoundary(api, onSessionBoundary, { @@ -151,7 +152,12 @@ export function createBrunchPiExtensions( }); if (options.graph) registerBrunchContinuityGuard(api, options.graph, options.continuityDrains); }, - (api) => registerBrunchChrome(api, chrome), + (api) => + registerBrunchChrome(api, chrome, { + bindChromeRefresh: (refresh) => { + chromeRefresh.current = refresh; + }, + }), registerBrunchBranchPolicyHandlers, (api) => registerBrunchOperationalModePolicy(api, { devAllowedToolNames }), registerBrunchContext, @@ -169,7 +175,11 @@ export function createBrunchPiExtensions( ? { specId: options.graph.specId, commandExecutor: options.graph.commandExecutor } : undefined, }), - (api) => registerBrunchCommands(api, options), + (api) => + registerBrunchCommands(api, { + ...options, + requestChromeRefresh: () => chromeRefresh.current?.(), + }), ...(options.graph ? [(api: ExtensionAPI) => registerBrunchGraph(api, options.graph!)] : []), ...(introspectionOptions?.enabled ? [ diff --git a/src/.pi/extensions/chrome/index.ts b/src/.pi/extensions/chrome/index.ts index 771501cd..59806667 100644 --- a/src/.pi/extensions/chrome/index.ts +++ b/src/.pi/extensions/chrome/index.ts @@ -243,8 +243,13 @@ export function renderBrunchChrome( ui.setTitle(formatChromeTitle(chrome)); } -export function registerBrunchChrome(pi: ExtensionAPI, chrome: BrunchChromeState): void { +export function registerBrunchChrome( + pi: ExtensionAPI, + chrome: BrunchChromeState, + hooks?: { readonly bindChromeRefresh?: (refresh: () => void) => void }, +): void { let requestFooterRender: (() => void) | null = null; + hooks?.bindChromeRefresh?.(() => requestFooterRender?.()); pi.on('session_start', async (_event, ctx) => { renderBrunchChrome(ctx.ui, chrome, { diff --git a/src/.pi/extensions/commands/index.ts b/src/.pi/extensions/commands/index.ts index e5ff4543..66cf4b0c 100644 --- a/src/.pi/extensions/commands/index.ts +++ b/src/.pi/extensions/commands/index.ts @@ -51,7 +51,10 @@ export const BRUNCH_MODE_COMMAND = 'brunch:mode'; export const BRUNCH_SWITCH_SHORTCUT = 'ctrl+shift+b'; -export type BrunchCommandsOptions = BrunchSpecSessionPickerOptions; +export type BrunchCommandsOptions = BrunchSpecSessionPickerOptions & { + /** Called after a runtime posture switch so chrome (footer) re-renders from re-projected state. */ + readonly requestChromeRefresh?: () => void; +}; interface BrunchStubCommand { readonly name: string; @@ -96,7 +99,12 @@ function lensUsage(): string { return `Usage: /${BRUNCH_LENS_COMMAND} `; } -function applyRuntimeSwitch(pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: RuntimeSwitchPatch): void { +function applyRuntimeSwitch( + pi: ExtensionAPI, + ctx: RuntimeSwitchContext, + patch: RuntimeSwitchPatch, + requestChromeRefresh: (() => void) | undefined, +): void { const current = projectBrunchAgentState(ctx.sessionManager.getEntries()); const nextState = { schemaVersion: 1 as const, @@ -120,10 +128,11 @@ function applyRuntimeSwitch(pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: pi.setActiveTools( activeToolNamesForBrunchAgentState(pi, projectBrunchAgentState(ctx.sessionManager.getEntries()), []), ); + requestChromeRefresh?.(); ctx.ui.notify(`Brunch ${patch.axis} set to ${patch.value}.`, 'info'); } -function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { +function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: () => void): void { pi.registerCommand(BRUNCH_LENS_COMMAND, { description: `Change the active Brunch lens (${['auto', ...AGENT_LENS_IDS].join(', ')})`, getArgumentCompletions: (prefix) => @@ -160,7 +169,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { ctx.ui.notify(lensUsage(), 'error'); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: picked }); + applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: picked }, requestChromeRefresh); return; } if (!isLensSelection(selection)) { @@ -170,7 +179,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { ); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: selection }); + applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: selection }, requestChromeRefresh); }, }); @@ -210,7 +219,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { ctx.ui.notify(strategyUsage(), 'error'); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: picked }); + applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: picked }, requestChromeRefresh); return; } if (!isStrategySelection(selection)) { @@ -220,7 +229,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { ); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: selection }); + applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: selection }, requestChromeRefresh); }, }); @@ -250,7 +259,10 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI): void { }); } -export function registerBrunchCommands(pi: ExtensionAPI, { coordinator }: BrunchCommandsOptions): void { +export function registerBrunchCommands( + pi: ExtensionAPI, + { coordinator, requestChromeRefresh }: BrunchCommandsOptions, +): void { pi.registerCommand(BRUNCH_SWITCH_COMMAND, { description: 'Open the Brunch spec/session picker', handler: async (_args, ctx: ExtensionCommandContext) => { @@ -267,7 +279,7 @@ export function registerBrunchCommands(pi: ExtensionAPI, { coordinator }: Brunch }); } - registerRuntimeSwitchCommands(pi); + registerRuntimeSwitchCommands(pi, requestChromeRefresh); pi.registerShortcut?.(BRUNCH_SWITCH_SHORTCUT, { description: 'Open the Brunch spec/session picker', From e433e96f5fa3f874cf0d539430dc175b86a16263 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:19:23 +0200 Subject: [PATCH 17/32] Echo projected mode in /brunch:mode no-op message The already-current branch hardcoded 'elicit' instead of echoing the projected operational mode; behavior-identical today, honest when the mode vocabulary grows. Co-Authored-By: Claude Fable 5 --- src/.pi/extensions/commands/index.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/.pi/extensions/commands/index.ts b/src/.pi/extensions/commands/index.ts index 66cf4b0c..5471bbe0 100644 --- a/src/.pi/extensions/commands/index.ts +++ b/src/.pi/extensions/commands/index.ts @@ -248,7 +248,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: return; } if (selection === current.operationalMode) { - ctx.ui.notify('Brunch mode is already elicit.', 'info'); + ctx.ui.notify(`Brunch mode is already ${current.operationalMode}.`, 'info'); return; } ctx.ui.notify( From 73f2e12f421c8ace792b909dfefbc6a2c8152a42 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:24:46 +0200 Subject: [PATCH 18/32] Require graph reads on the prompt context; fail loud on empty gap register MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reverts the 'Handle absent prompt gaps safely' patch (bbc4b4e6) and removes the ?? [] fallback it was shielding. graphReads is now a required, documented must-wire member of BrunchPromptContext — a composition root that omits it is a type error — while session/context are documented intended-optional. An empty gap register reaching legality derivation now surfaces through the existing missing-register-kind throw (the contract isCapabilityLegalForGaps already documents) instead of quietly returning empty manifests and axis options: every spec is seeded with floor gaps, so empty means wiring bug, not posture. Co-Authored-By: Claude Fable 5 --- src/.pi/agents/state.test.ts | 7 ++++++ src/.pi/agents/state.ts | 3 --- src/.pi/extensions/system-prompts/index.ts | 16 ++++++++---- src/app/brunch-tui.test.ts | 27 +++++++++++++++++++++ src/projections/session/affordances.test.ts | 4 +++ src/projections/session/runtime-policy.ts | 1 - 6 files changed, 49 insertions(+), 9 deletions(-) diff --git a/src/.pi/agents/state.test.ts b/src/.pi/agents/state.test.ts index 17043957..a44196b5 100644 --- a/src/.pi/agents/state.test.ts +++ b/src/.pi/agents/state.test.ts @@ -209,6 +209,13 @@ describe('agent posture policy', () => { ); }); + it('fails loud on an empty gap register instead of returning empty manifests', () => { + // Every spec is seeded with floor gaps at creation; an empty register reaching + // manifest derivation is a wiring bug, never a legal quiet posture. + const state = projectBrunchAgentState([]); + expect(() => manifestsForState(state, [])).toThrow(/no elicitation gap/); + }); + it('keeps state.ts free of grade-gate symbols', () => { const source = readFileSync(fileURLToPath(new URL('./state.ts', import.meta.url)), 'utf8'); expect(source).not.toMatch(/ReadinessGrade|GRADE_RANK|MIN_GRADE|isGradeLegal/); diff --git a/src/.pi/agents/state.ts b/src/.pi/agents/state.ts index 6645a406..f1972b7a 100644 --- a/src/.pi/agents/state.ts +++ b/src/.pi/agents/state.ts @@ -210,9 +210,6 @@ export function manifestsForState( `Agent "${state.agentRole}" is not legal in operational mode "${state.operationalMode}".`, ); } - if (gaps.length === 0) { - return { goals: [], strategies: [], lenses: [], methods: [] }; - } return { goals: selectAxisResources({ diff --git a/src/.pi/extensions/system-prompts/index.ts b/src/.pi/extensions/system-prompts/index.ts index 8648e136..d4888b39 100644 --- a/src/.pi/extensions/system-prompts/index.ts +++ b/src/.pi/extensions/system-prompts/index.ts @@ -30,9 +30,17 @@ interface BeforeAgentStartContextLike { interface BrunchPromptContext { spec: AgentPromptSpecContext; workspace: AgentPromptWorkspaceContext; + /** Intended-optional: display label only; prompts render without a session label. */ session?: AgentPromptSessionContext; + /** Intended-optional: extra caller-supplied handles/contexts merged into the bundle. */ context?: AgentPromptContextBundle; - graphReads?: GraphReaders; + /** + * Must-wire: legality (gaps), tool posture, and graph context all derive from + * these reads. Required so a composition root that forgets them is a type + * error, never a silent fallback posture (the lesson of the FE-844/FE-847 + * review pass: an optional hook here froze live legality at a floor). + */ + graphReads: GraphReaders; } export type BrunchPromptContextProvider = @@ -87,7 +95,7 @@ export function registerBrunchPrompting( } function gapsForPrompt(context: BrunchPromptContext): readonly ElicitationGap[] { - return context.graphReads?.getElicitationGaps(context.spec.id) ?? []; + return context.graphReads.getElicitationGaps(context.spec.id); } function contextForPrompt( @@ -103,9 +111,7 @@ function contextForPrompt( gaps, }), ]; - if (context.graphReads) { - renderedContexts.push(renderGraphContext(context.graphReads.queryGraph(), { lens: state.agentLens })); - } + renderedContexts.push(renderGraphContext(context.graphReads.queryGraph(), { lens: state.agentLens })); return { ...(context.context?.contextHandles ? { contextHandles: context.context.contextHandles } : {}), diff --git a/src/app/brunch-tui.test.ts b/src/app/brunch-tui.test.ts index 452d8343..8caf36fe 100644 --- a/src/app/brunch-tui.test.ts +++ b/src/app/brunch-tui.test.ts @@ -29,6 +29,8 @@ import { } from '../.pi/brunch-pi-extensions.js'; import { createBrunchPiSettings } from '../.pi/brunch-pi-settings.js'; import { openWorkspaceGraphRuntime } from '../graph/index.js'; +import type { ElicitationGap } from '../graph/schema/elicitation-gaps.js'; +import type { NodeKind } from '../graph/schema/nodes.js'; import { userMessage } from '../probes/test-helpers.js'; import { createProductUpdatePublisher } from '../rpc/product-updates.js'; import { @@ -768,6 +770,7 @@ describe('Brunch TUI boot', () => { promptContext: { spec: { id: 1, name: 'Spec One' }, workspace: { cwd }, + graphReads: stubPromptGraphReads(), }, }, )({ @@ -1634,6 +1637,30 @@ function inventoryWithWorkspace(workspace: WorkspaceSessionReadyState): Workspac }; } +function stubPromptGraphReads() { + const gap = (refersTo: NodeKind): ElicitationGap => ({ + id: `${refersTo}:gap`, + specId: 1, + refersTo, + question: `${refersTo} question`, + rationale: `${refersTo} rationale`, + basis: 'implicit', + band: 'grounding', + predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, + importance: 1, + coverage: 1, + answered: true, + disposition: 'answered', + createdAtLsn: 1, + }); + return { + queryGraph: () => ({ lsn: 1, nodes: [], edges: [] }), + getNodes: () => [], + resolveNodeCode: () => undefined, + getElicitationGaps: () => (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => gap(kind)), + }; +} + function noOpWorkspaceCoordinator(cwd: string) { return { inspectWorkspace: async () => emptyInventory(cwd), diff --git a/src/projections/session/affordances.test.ts b/src/projections/session/affordances.test.ts index 1d5b8241..3e5787f1 100644 --- a/src/projections/session/affordances.test.ts +++ b/src/projections/session/affordances.test.ts @@ -122,6 +122,10 @@ describe('runtime affordances derivation', () => { ); }); + it('fails loud on an empty gap register (wiring bug — every spec is seeded with floor gaps)', () => { + expect(() => axisOptionsForRuntimeState('strategy', resolved(), [])).toThrow(/no elicitation gap/); + }); + it('derives per-axis legal options without grade-gate symbols', () => { expect(axisOptionsForRuntimeState('lens', resolved(), groundingGaps({ thesis: 0 }))).toEqual(['intent']); diff --git a/src/projections/session/runtime-policy.ts b/src/projections/session/runtime-policy.ts index 1d268fd0..70ecfdcb 100644 --- a/src/projections/session/runtime-policy.ts +++ b/src/projections/session/runtime-policy.ts @@ -139,7 +139,6 @@ export function axisOptionsForRuntimeState( state: ResolvedBrunchAgentState, gaps: readonly ElicitationGap[], ): readonly (AgentGoalId | AgentStrategyId | AgentLensId)[] { - if (gaps.length === 0) return []; if (axis === 'goal') { return state.agentRoleDefinition.allowedGoals.filter((id) => isCapabilityLegalForGaps(GOAL_CAPABILITY[id], gaps), From 8736b7d4493c92469cd4a41cdd6e2fbe392c68a7 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:29:55 +0200 Subject: [PATCH 19/32] Pin live gap legality through the Tier-2 real-boot oracle MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The missing card acceptance from the live-gap-legality fix: a real runBrunchTui boot over a fresh seeded spec derives turn-boundary tool legality from that spec's actual gap coverage — uncovered floor gaps keep capability-gated tools (mutate_graph) locked, a foreign writer covering the grounding floor unlocks them on the next boundary, and elicit mode never advertises bash either way. Co-Authored-By: Claude Fable 5 --- src/dev/tier-2-harness.test.ts | 52 ++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index afc77855..a6b87c16 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -135,6 +135,58 @@ describe('FE-847 Tier-2 real boot harness', () => { }); }); +describe('FE-844/FE-847 live gap legality through real boot', () => { + it('derives prompt/tool legality from the selected spec real gap coverage, not a fallback floor', async () => { + const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); + try { + const specId = await readSessionContextSpecId(boot.runtime.session); + + // Legality is derived at the turn boundary (before_agent_start); this + // harness does not fire session_start, so drive the boundary directly. + // Fresh spec: grounding floor gaps are uncovered, so capability-gated + // tools stay locked while floor tools remain available (and elicit mode + // never advertises bash/edit/write). + await boot.runtime.session.extensionRunner.emitBeforeAgentStart( + 'Derive legality', + undefined, + '', + {} as never, + ); + const lockedTools = boot.runtime.session.getActiveToolNames(); + expect(lockedTools).toEqual(expect.arrayContaining(['read_graph'])); + expect(lockedTools).not.toEqual(expect.arrayContaining(['mutate_graph'])); + expect(lockedTools).not.toEqual(expect.arrayContaining(['bash'])); + + // Cover the grounding floor in the real graph (foreign writer). + const graph = await openWorkspaceGraphRuntime(boot.cwd); + for (const kind of ['context', 'thesis', 'goal', 'constraint'] as const) { + const created = graph.commandExecutor.createNode({ + specId, + plane: 'intent', + kind, + title: `${kind} floor coverage`, + }); + if (created.status !== 'success') throw new Error(`Failed to create ${kind} coverage node`); + } + + // The next turn boundary re-derives legality from live selected-spec + // gap reads — covered floor gaps unlock the gated posture. + await boot.runtime.session.extensionRunner.emitBeforeAgentStart( + 'Re-derive legality', + undefined, + '', + {} as never, + ); + const unlockedTools = boot.runtime.session.getActiveToolNames(); + expect(unlockedTools).toEqual(expect.arrayContaining(['mutate_graph'])); + expect(unlockedTools).not.toEqual(expect.arrayContaining(['bash'])); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); +}); + describe('FE-847 coverage-first scaffold — I45-L assistant-visible watermark', () => { it('seed and full-overview snapshots advance the watermark while narrow getNodes/queryNodes reads do not', async () => { const boot = await bootTier2RuntimeThroughRunBrunchTui({ dev: false }); From 5a6de3f81c027ec6fe742dfb301a4f79cf80bc48 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:33:14 +0200 Subject: [PATCH 20/32] Derive post-switch tool posture from real selected-spec gaps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit applyRuntimeSwitch recomputed active tools with a hardcoded empty gap register, silently floor-locking capability-gated tools until the next turn boundary corrected it — the same optional-wiring fault family this remediation targets. The commands seam now requires a gap reader; the composition root derives it from the graph deps (selected-spec reads) or, with no graph in the composition, the explicitly named conservativeUncoveredFloorGaps fail-closed posture. Co-Authored-By: Claude Fable 5 --- memory/REFACTOR.md | 8 ++++ .../__tests__/runtime-switch-command.test.ts | 32 +++++++++++++++ src/.pi/brunch-pi-extensions.ts | 18 ++++++++- src/.pi/extensions/commands/index.ts | 39 ++++++++++++------- src/.pi/extensions/runtime/index.ts | 10 ++++- 5 files changed, 90 insertions(+), 17 deletions(-) diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md index a222880c..955955a9 100644 --- a/memory/REFACTOR.md +++ b/memory/REFACTOR.md @@ -103,6 +103,14 @@ resume kick path needs product fixes). no-redundant-world-update-after-seed row asserts through real boot; the sets-and-properties meta-row either becomes a real assertion helper used by the suite or is retired as a stated suite convention rather than a phantom todo. +7b. **(Discovered during commit 7) Runtime-switch tool posture from real gaps.** + `applyRuntimeSwitch` recomputes active tools with a hardcoded empty gap + register, so a posture switch floor-locks capability-gated tools until the + next turn boundary corrects it — the same optional-wiring fault family. + Thread a selected-spec gap reader into the commands extension from the + composition root (mirroring the chrome-refresh handle) and derive the + post-switch tool set from real coverage. + 10. **Migration coherence — SUSPENDED (2026-06-11).** Another agent is fixing the 0004 migration on the branch stacked on top of this one. Do not touch drizzle/ in this refactor; the derive-with-'context'-fallback vs read-side-throw concern diff --git a/src/.pi/__tests__/runtime-switch-command.test.ts b/src/.pi/__tests__/runtime-switch-command.test.ts index 2977a850..1e8a9234 100644 --- a/src/.pi/__tests__/runtime-switch-command.test.ts +++ b/src/.pi/__tests__/runtime-switch-command.test.ts @@ -1,5 +1,7 @@ import { describe, expect, it } from 'vitest'; +import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; +import type { NodeKind } from '../../graph/schema/nodes.js'; import { projectBrunchAgentState } from '../../projections/session/runtime-state.js'; import { BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, @@ -34,6 +36,24 @@ interface FakeCommandContext { }; } +function coveredGroundingGaps(): ElicitationGap[] { + return (['context', 'thesis', 'goal', 'constraint'] as const).map((refersTo: NodeKind) => ({ + id: `${refersTo}:gap`, + specId: 1, + refersTo, + question: `${refersTo} question`, + rationale: `${refersTo} rationale`, + basis: 'implicit', + band: 'grounding', + predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, + importance: 1, + coverage: 1, + answered: true, + disposition: 'answered', + createdAtLsn: 1, + })); +} + function commandHarness(options: { customResult?: unknown; customAvailable?: boolean } = {}) { const entries: RuntimeEntry[] = []; const notifications: Array<{ message: string; level?: 'info' | 'warning' | 'error' }> = []; @@ -80,6 +100,7 @@ function commandHarness(options: { customResult?: unknown; customAvailable?: boo requestChromeRefresh: () => { chromeRefreshes.push(chromeRefreshes.length + 1); }, + getElicitationGaps: () => coveredGroundingGaps(), }, ); @@ -201,6 +222,17 @@ describe('Brunch runtime switch commands', () => { ]); }); + it('derives the post-switch tool posture from the supplied gap reader, not an empty register', async () => { + const harness = commandHarness(); + + await harness.commands.get(BRUNCH_STRATEGY_COMMAND)?.handler('propose-graph', harness.ctx); + + // The harness gap reader reports a covered grounding floor, so the + // recomputed active tools must include the capability-gated mutate_graph + // instead of the floor-locked set an empty register would produce. + expect(harness.activeToolNames.at(-1)).toEqual(expect.arrayContaining(['mutate_graph'])); + }); + it('requests a chrome refresh after a successful runtime switch and not on rejection or cancel', async () => { const harness = commandHarness({ customResult: undefined }); diff --git a/src/.pi/brunch-pi-extensions.ts b/src/.pi/brunch-pi-extensions.ts index 70f4c001..cb0ca858 100644 --- a/src/.pi/brunch-pi-extensions.ts +++ b/src/.pi/brunch-pi-extensions.ts @@ -31,7 +31,10 @@ import { } from './extensions/introspection/index.js'; import { type GraphMentionSource } from './extensions/mentions/index.js'; import { registerBrunchMentionAutocomplete } from './extensions/mentions/index.js'; -import { registerBrunchOperationalModePolicy } from './extensions/runtime/index.js'; +import { + conservativeUncoveredFloorGaps, + registerBrunchOperationalModePolicy, +} from './extensions/runtime/index.js'; import { BRUNCH_SESSION_QUERY_TOOL, registerBrunchSessionQuery } from './extensions/session-query/index.js'; import { registerBrunchSessionBoundary } from './extensions/session/lifecycle.js'; import { @@ -102,7 +105,13 @@ export { registerBrunchIntrospectQuery, } from './extensions/introspect-query/index.js'; -export interface BrunchPiExtensionsOptions extends BrunchCommandsOptions { +export interface BrunchPiExtensionsOptions extends Omit { + /** + * Optional override; when omitted, the composition derives the commands' + * gap reader from `graph` (selected-spec reads) or, with no graph in the + * composition, an explicitly conservative uncovered floor. + */ + getElicitationGaps?: BrunchCommandsOptions['getElicitationGaps']; graphMentionSource?: GraphMentionSource; graph?: BrunchGraphDeps; promptContext?: BrunchPromptContextProvider; @@ -145,6 +154,10 @@ export function createBrunchPiExtensions( ? createPrepareNextTurnContinuityStep(options.graph, options.continuityDrains) : undefined; const chromeRefresh: { current: (() => void) | null } = { current: null }; + const graph = options.graph; + const commandGapReads = + options.getElicitationGaps ?? + (graph ? () => graph.reads.getElicitationGaps(graph.specId) : conservativeUncoveredFloorGaps); // no graph in this composition: explicit fail-closed floor const extensions: BrunchProductExtensionRegistrar[] = [ (api) => { registerBrunchSessionBoundary(api, onSessionBoundary, { @@ -179,6 +192,7 @@ export function createBrunchPiExtensions( registerBrunchCommands(api, { ...options, requestChromeRefresh: () => chromeRefresh.current?.(), + getElicitationGaps: commandGapReads, }), ...(options.graph ? [(api: ExtensionAPI) => registerBrunchGraph(api, options.graph!)] : []), ...(introspectionOptions?.enabled diff --git a/src/.pi/extensions/commands/index.ts b/src/.pi/extensions/commands/index.ts index 5471bbe0..dc90cbe9 100644 --- a/src/.pi/extensions/commands/index.ts +++ b/src/.pi/extensions/commands/index.ts @@ -27,6 +27,7 @@ import type { ExtensionAPI, ExtensionCommandContext } from '@earendil-works/pi-coding-agent'; +import type { ElicitationGap } from '../../../graph/schema/elicitation-gaps.js'; import { AGENT_LENS_IDS, AGENT_STRATEGY_IDS, @@ -54,6 +55,13 @@ export const BRUNCH_SWITCH_SHORTCUT = 'ctrl+shift+b'; export type BrunchCommandsOptions = BrunchSpecSessionPickerOptions & { /** Called after a runtime posture switch so chrome (footer) re-renders from re-projected state. */ readonly requestChromeRefresh?: () => void; + /** + * Must-wire: the post-switch tool posture derives from these gaps. Required + * so a composition root cannot leave runtime switches recomputing legality + * from an empty register (which silently floor-locks gated tools until the + * next turn boundary). + */ + readonly getElicitationGaps: () => readonly ElicitationGap[]; }; interface BrunchStubCommand { @@ -103,7 +111,7 @@ function applyRuntimeSwitch( pi: ExtensionAPI, ctx: RuntimeSwitchContext, patch: RuntimeSwitchPatch, - requestChromeRefresh: (() => void) | undefined, + options: Pick, ): void { const current = projectBrunchAgentState(ctx.sessionManager.getEntries()); const nextState = { @@ -126,13 +134,20 @@ function applyRuntimeSwitch( ); pi.setActiveTools( - activeToolNamesForBrunchAgentState(pi, projectBrunchAgentState(ctx.sessionManager.getEntries()), []), + activeToolNamesForBrunchAgentState( + pi, + projectBrunchAgentState(ctx.sessionManager.getEntries()), + options.getElicitationGaps(), + ), ); - requestChromeRefresh?.(); + options.requestChromeRefresh?.(); ctx.ui.notify(`Brunch ${patch.axis} set to ${patch.value}.`, 'info'); } -function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: () => void): void { +function registerRuntimeSwitchCommands( + pi: ExtensionAPI, + options: Pick, +): void { pi.registerCommand(BRUNCH_LENS_COMMAND, { description: `Change the active Brunch lens (${['auto', ...AGENT_LENS_IDS].join(', ')})`, getArgumentCompletions: (prefix) => @@ -169,7 +184,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: ctx.ui.notify(lensUsage(), 'error'); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: picked }, requestChromeRefresh); + applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: picked }, options); return; } if (!isLensSelection(selection)) { @@ -179,7 +194,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: ); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: selection }, requestChromeRefresh); + applyRuntimeSwitch(pi, ctx, { axis: 'lens', value: selection }, options); }, }); @@ -219,7 +234,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: ctx.ui.notify(strategyUsage(), 'error'); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: picked }, requestChromeRefresh); + applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: picked }, options); return; } if (!isStrategySelection(selection)) { @@ -229,7 +244,7 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: ); return; } - applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: selection }, requestChromeRefresh); + applyRuntimeSwitch(pi, ctx, { axis: 'strategy', value: selection }, options); }, }); @@ -259,10 +274,8 @@ function registerRuntimeSwitchCommands(pi: ExtensionAPI, requestChromeRefresh?: }); } -export function registerBrunchCommands( - pi: ExtensionAPI, - { coordinator, requestChromeRefresh }: BrunchCommandsOptions, -): void { +export function registerBrunchCommands(pi: ExtensionAPI, options: BrunchCommandsOptions): void { + const { coordinator } = options; pi.registerCommand(BRUNCH_SWITCH_COMMAND, { description: 'Open the Brunch spec/session picker', handler: async (_args, ctx: ExtensionCommandContext) => { @@ -279,7 +292,7 @@ export function registerBrunchCommands( }); } - registerRuntimeSwitchCommands(pi, requestChromeRefresh); + registerRuntimeSwitchCommands(pi, options); pi.registerShortcut?.(BRUNCH_SWITCH_SHORTCUT, { description: 'Open the Brunch spec/session picker', diff --git a/src/.pi/extensions/runtime/index.ts b/src/.pi/extensions/runtime/index.ts index b8c6c279..6cedfacb 100644 --- a/src/.pi/extensions/runtime/index.ts +++ b/src/.pi/extensions/runtime/index.ts @@ -99,11 +99,17 @@ function applyBrunchToolPolicy( devAllowedToolNames?: readonly string[], ): void { pi.setActiveTools( - activeToolNamesForBrunchAgentState(pi, state, conservativeUncoveredGaps(), devAllowedToolNames), + activeToolNamesForBrunchAgentState(pi, state, conservativeUncoveredFloorGaps(), devAllowedToolNames), ); } -function conservativeUncoveredGaps(): readonly ElicitationGap[] { +/** + * Explicit fail-closed posture for composition points where selected-spec gap + * reads are not available (registration-time policy before context exists, or + * a composition with no graph). Never a substitute for real gap reads on a + * live selected-spec path. + */ +export function conservativeUncoveredFloorGaps(): readonly ElicitationGap[] { return (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => gap(kind)); } From ea891f8b58fd227213ab82d2f5b0ced9f5d0118c Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:42:21 +0200 Subject: [PATCH 21/32] Flip the I46 resume-origination scaffold rows live through real boot MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds bootTier2RuntimeFromFixture (the resume-side real-boot chassis) and replaces the four I46 it.todo rows with live proofs: a user tail earns the kick behind reconciler-inserted continuity notices — including after earlier completed exchanges; request_* leaves stay idle for all three terminal envelopes plus assistant/system leaves; crash-after-notice reboot still kicks unresolved debt without duplicating the seed; and trailing side-task/reviewer drains neither manufacture nor mask debt. Two product fixes the live rows forced: - seedAndKickAssistantTurn no longer blanket-suppresses the kick when any past exchange result exists (which silently broke post-exchange resume kicks); origin now derives from projected transcript state (no conversational message entries = new session), with re-kick dedupe falling out of the debt classifier itself. - latestTailOwesAssistant reads the real request_* result envelope: outcome is answered/cancelled/unavailable key presence (as projections/exchanges actually writes it), not a status string — settling the PR #202 responseStatus question: the bot was right, an answered request tail would have re-kicked on resume. Co-Authored-By: Claude Fable 5 --- src/app/brunch-tui.ts | 16 +-- src/dev/tier-2-harness.test.ts | 175 ++++++++++++++++++++++- src/dev/tier-2-harness.ts | 73 ++++++++++ src/session/start-assistant-turn.test.ts | 14 +- src/session/start-assistant-turn.ts | 16 ++- 5 files changed, 269 insertions(+), 25 deletions(-) diff --git a/src/app/brunch-tui.ts b/src/app/brunch-tui.ts index 398a08d8..2de6f436 100644 --- a/src/app/brunch-tui.ts +++ b/src/app/brunch-tui.ts @@ -399,12 +399,8 @@ export function createBrunchAgentSessionRuntimeFactory( }; } -function isStructuredExchangeToolResult(entry: unknown): boolean { - if (typeof entry !== 'object' || entry === null) return false; - const message = (entry as { message?: unknown }).message; - if (typeof message !== 'object' || message === null) return false; - const toolName = (message as { toolName?: unknown }).toolName; - return typeof toolName === 'string' && (toolName.startsWith('present_') || toolName.startsWith('request_')); +function isMessageEntry(entry: unknown): boolean { + return typeof entry === 'object' && entry !== null && (entry as { type?: unknown }).type === 'message'; } function seedAndKickAssistantTurn(options: { @@ -413,13 +409,15 @@ function seedAndKickAssistantTurn(options: { readonly sessionManager: Parameters[0]['sessionManager']; }): void { const entries = options.sessionManager.getEntries(); - if (entries.some((entry) => isStructuredExchangeToolResult(entry))) return; - + // Origin is derived from projected transcript state, not counts or flags + // (I46/I47): a transcript with no conversational message entries is a new + // session; anything else takes the resume-debt decision, which itself + // dedupes re-kicks (a prior kick's present_* tail owes nothing). const decision = startAssistantTurn({ specId: options.specId, currentLsn: options.currentLsn, entries, - origin: entries.length <= 3 ? 'new_session' : 'resume_debt', + origin: entries.some(isMessageEntry) ? 'resume_debt' : 'new_session', }); for (const entry of decision.seedEntries) { options.sessionManager.appendCustomEntry(entry.customType, entry.data); diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index a6b87c16..ab900cfc 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -4,10 +4,12 @@ import { describe, expect, it } from 'vitest'; import { compactionAnchorContract } from '../.pi/extensions/compaction/index.js'; import { openWorkspaceGraphRuntime } from '../graph/index.js'; import { assistantMessage, userMessage } from '../probes/test-helpers.js'; +import { projectRequestChoices } from '../projections/exchanges/request-choices.js'; import { projectAssistantVisibleWatermark } from '../projections/session/assistant-visible-watermark.js'; import { projectBrunchAgentState } from '../projections/session/runtime-state.js'; import { BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE } from '../session/runtime-state.js'; import { + bootTier2RuntimeFromFixture, bootTier2RuntimeThroughRunBrunchTui, resumeTier2Fixture, runTier2RealBootFauxTurn, @@ -394,12 +396,128 @@ describe('FE-847 coverage-first scaffold — I46-L honest origination', () => { } }); - it.todo( - 'resume kick uses the pre-reconcile tail so a user tail still earns a kick after continuity notices', - ); - it.todo('request_* and system leaves stay idle on resume'); - it.todo('crash-after-notice-before-provider still kicks when the underlying debt is unanswered'); - it.todo('trailing side-task or reviewer drains are continuity-only and do not manufacture or mask debt'); + it('resume kick uses the pre-reconcile tail so a user tail still earns a kick after continuity notices', async () => { + const boot = await bootTier2RuntimeFromFixture({ + fixtureEntries: (specId) => [ + { type: 'message', message: userMessage('Resume me: what is the next question?') }, + { type: 'custom', customType: 'worldUpdate', data: { specId, currentLsn: 99, items: [] } }, + { + type: 'custom', + customType: 'brunch.mention_staleness_hint', + data: { specId, entityId: 1, seenLsn: 1, currentLsn: 99 }, + }, + ], + }); + try { + const entries = boot.runtime.session.sessionManager.getEntries(); + expect(presentToolResults(entries)).toHaveLength(1); + expect(userMessages(entries)).toHaveLength(1); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + + // A user tail still earns the kick when earlier completed exchanges exist + // in the transcript — past exchange results must not blanket-suppress the + // resume-debt decision. + const postExchange = await bootTier2RuntimeFromFixture({ + fixtureEntries: (specId) => [ + { type: 'message', message: userMessage('First question') }, + { type: 'message', message: requestChoicesResultMessage('answered') }, + { type: 'message', message: userMessage('Follow-up you never answered') }, + { type: 'custom', customType: 'worldUpdate', data: { specId, currentLsn: 99, items: [] } }, + ], + }); + try { + // One present_* result came from the fixture-era kick is absent here; the + // reboot kick appends exactly one beyond the fixture's zero. + expect(presentToolResults(postExchange.runtime.session.sessionManager.getEntries())).toHaveLength(1); + } finally { + await postExchange.runtime.dispose(); + postExchange.restoreEnv(); + } + }); + + it('request_* and system leaves stay idle on resume', async () => { + for (const status of ['answered', 'cancelled', 'unavailable'] as const) { + const boot = await bootTier2RuntimeFromFixture({ + fixtureEntries: () => [ + { type: 'message', message: userMessage('Earlier question') }, + { type: 'message', message: requestChoicesResultMessage(status) }, + ], + }); + try { + expect(presentToolResults(boot.runtime.session.sessionManager.getEntries())).toHaveLength(0); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + } + + const assistantLeaf = await bootTier2RuntimeFromFixture({ + fixtureEntries: () => [ + { type: 'message', message: userMessage('Earlier question') }, + { type: 'message', message: assistantMessage('System-side answer; nothing owed.') }, + ], + }); + try { + expect(presentToolResults(assistantLeaf.runtime.session.sessionManager.getEntries())).toHaveLength(0); + } finally { + await assistantLeaf.runtime.dispose(); + assistantLeaf.restoreEnv(); + } + }); + + it('crash-after-notice-before-provider still kicks when the underlying debt is unanswered', async () => { + // Reconciler-inserted seed/notices landed, then the process died before the + // provider call; reboot must still answer the user's unresolved debt and + // must not duplicate the already-written seed. + const boot = await bootTier2RuntimeFromFixture({ + fixtureEntries: (specId) => [ + { type: 'message', message: userMessage('Crashed before you answered this.') }, + { type: 'custom', customType: 'brunch.context_seed', data: { specId, snapshotLsn: 9999 } }, + { type: 'custom', customType: 'worldUpdate', data: { specId, currentLsn: 9999, items: [] } }, + ], + }); + try { + const entries = boot.runtime.session.sessionManager.getEntries(); + expect(presentToolResults(entries)).toHaveLength(1); + expect(customEntries(entries, 'brunch.context_seed')).toHaveLength(1); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('trailing side-task or reviewer drains are continuity-only and do not manufacture or mask debt', async () => { + const noDebt = await bootTier2RuntimeFromFixture({ + fixtureEntries: (specId) => [ + { type: 'message', message: userMessage('Earlier question') }, + { type: 'message', message: requestChoicesResultMessage('answered') }, + { type: 'custom', customType: 'brunch.side_task_result', data: { specId, taskId: 't1' } }, + { type: 'custom', customType: 'brunch.reviewer_drain', data: { specId, findings: [] } }, + ], + }); + try { + expect(presentToolResults(noDebt.runtime.session.sessionManager.getEntries())).toHaveLength(0); + } finally { + await noDebt.runtime.dispose(); + noDebt.restoreEnv(); + } + + const maskedDebt = await bootTier2RuntimeFromFixture({ + fixtureEntries: (specId) => [ + { type: 'message', message: userMessage('Still waiting on this.') }, + { type: 'custom', customType: 'brunch.side_task_result', data: { specId, taskId: 't1' } }, + ], + }); + try { + expect(presentToolResults(maskedDebt.runtime.session.sessionManager.getEntries())).toHaveLength(1); + } finally { + await maskedDebt.runtime.dispose(); + maskedDebt.restoreEnv(); + } + }); }); describe('FE-847 coverage-first scaffold — I47-L carrier discipline and idempotence', () => { @@ -463,6 +581,51 @@ async function executeReadGraph( } as never); } +function messagesByRole(entries: readonly unknown[], role: string): readonly Record[] { + return entries.flatMap((entry) => { + if (typeof entry !== 'object' || entry === null) return []; + const message = (entry as { message?: unknown }).message; + if (typeof message !== 'object' || message === null) return []; + return (message as { role?: unknown }).role === role ? [message as Record] : []; + }); +} + +function presentToolResults(entries: readonly unknown[]): readonly Record[] { + return messagesByRole(entries, 'toolResult').filter( + (message) => typeof message.toolName === 'string' && message.toolName.startsWith('present_'), + ); +} + +function userMessages(entries: readonly unknown[]): readonly Record[] { + return messagesByRole(entries, 'user'); +} + +/** + * A request_* tool result exactly as the exchanges extension writes it: the + * details envelope comes from the real projection (answered/cancelled/ + * unavailable key presence), not a hand-built status field — this fixture IS + * the test of the resume-debt classifier's envelope reading. + */ +function requestChoicesResultMessage(status: 'answered' | 'cancelled' | 'unavailable') { + const details = projectRequestChoices({ + exchangeId: 'ex-resume-1', + status, + ...(status === 'answered' + ? { choices: [{ id: 'choice-1', label: 'Choice 1', kind: 'listed' as const }] } + : {}), + ...(status === 'unavailable' ? { message: 'request_choices unavailable' } : {}), + }); + return { + role: 'toolResult' as const, + toolCallId: 'ex-resume-1:request_choices', + toolName: 'request_choices', + content: [{ type: 'text' as const, text: `request_choices ${status}` }], + details, + isError: false as const, + timestamp: 0 as const, + }; +} + function customEntries(entries: readonly unknown[], customType: string): ReadonlyArray<{ data: unknown }> { return entries.filter( (entry): entry is { customType: string; data: unknown } => diff --git a/src/dev/tier-2-harness.ts b/src/dev/tier-2-harness.ts index f1873e3c..ca0512cc 100644 --- a/src/dev/tier-2-harness.ts +++ b/src/dev/tier-2-harness.ts @@ -138,6 +138,79 @@ export async function bootTier2RuntimeThroughRunBrunchTui(options: { readonly de return { cwd, runtime, restoreEnv }; } +export type Tier2FixtureEntry = + | { readonly type: 'message'; readonly message: unknown } + | { readonly type: 'custom'; readonly customType: string; readonly data: unknown }; + +/** + * Boot the real runBrunchTui runtime over a pre-seeded fixture transcript — + * the resume-side counterpart of bootTier2RuntimeThroughRunBrunchTui. The + * fixture builder receives the created spec id so continuity entries can + * carry real {specId, lsn} facts. + */ +export async function bootTier2RuntimeFromFixture(options: { + readonly fixtureEntries: (specId: number) => readonly Tier2FixtureEntry[]; + readonly specTitle?: string; +}) { + const cwd = await mkdtemp(join(tmpdir(), 'brunch-tier-2-resume-boot-')); + const agentDir = await mkdtemp(join(tmpdir(), 'brunch-agent-dir-')); + + const previousDev = process.env.BRUNCH_DEV; + const hadPreviousDev = Object.hasOwn(process.env, 'BRUNCH_DEV'); + delete process.env.BRUNCH_DEV; + const restoreEnv = () => { + if (hadPreviousDev && previousDev !== undefined) { + process.env.BRUNCH_DEV = previousDev; + } else { + delete process.env.BRUNCH_DEV; + } + }; + + try { + const coordinator = createWorkspaceSessionCoordinator({ cwd }); + const workspace = await coordinator.createSetupSession({ + specTitle: options.specTitle ?? 'Tier 2 resume fixture spec', + createNewSpec: true, + }); + for (const entry of options.fixtureEntries(workspace.spec.id)) { + if (entry.type === 'custom') { + workspace.session.manager.appendCustomEntry(entry.customType, entry.data); + } else { + workspace.session.manager.appendMessage(entry.message as never); + } + } + flushSessionEntries(workspace.session.manager, workspace.session.file); + + let runtime: Awaited> | undefined; + await runBrunchTui({ + cwd, + autoOpen: false, + coordinator, + runWorkspaceDialogPreflight: async () => ({ + action: 'openSession', + specId: workspace.spec.id, + sessionFile: workspace.session.file, + }), + webSidecarRunner: async () => null, + launchInteractive: async (context) => { + runtime = await createAgentSessionRuntime(createBrunchAgentSessionRuntimeFactory(context), { + cwd, + agentDir, + sessionManager: context.workspace.session.manager, + }); + }, + }); + if (!runtime) { + restoreEnv(); + throw new Error('runBrunchTui did not reach launchInteractive for the fixture resume boot'); + } + return { cwd, specId: workspace.spec.id, runtime, restoreEnv }; + } catch (error) { + restoreEnv(); + throw error; + } +} + export async function resumeTier2Fixture(options: { readonly cwd?: string; readonly fixtureJsonl: string; diff --git a/src/session/start-assistant-turn.test.ts b/src/session/start-assistant-turn.test.ts index ef633fa8..d760aa92 100644 --- a/src/session/start-assistant-turn.test.ts +++ b/src/session/start-assistant-turn.test.ts @@ -50,9 +50,17 @@ describe('startAssistantTurn', () => { }); it('stays idle for request/system leaves and for explicit freestyle while AUTO remains offer-first', () => { - for (const status of ['answered', 'cancelled', 'unavailable'] as const) { - expect(latestTailOwesAssistant([toolResult('request_clarification', { status })])).toBe(false); - } + // Real request_* envelopes carry the outcome as key presence + // (answered/cancelled/unavailable), never a status string field. + expect( + latestTailOwesAssistant([toolResult('request_clarification', { answered: { choices: [] } })]), + ).toBe(false); + expect(latestTailOwesAssistant([toolResult('request_clarification', { cancelled: {} })])).toBe(false); + expect( + latestTailOwesAssistant([toolResult('request_clarification', { unavailable: { message: 'no UI' } })]), + ).toBe(false); + // A status string alone is not a real terminal envelope — still pending. + expect(latestTailOwesAssistant([toolResult('request_clarification', { status: 'answered' })])).toBe(true); expect(latestTailOwesAssistant([toolResult('present_options')])).toBe(false); expect( diff --git a/src/session/start-assistant-turn.ts b/src/session/start-assistant-turn.ts index 0e8c9a65..444e29d2 100644 --- a/src/session/start-assistant-turn.ts +++ b/src/session/start-assistant-turn.ts @@ -66,7 +66,7 @@ export function latestTailOwesAssistant(entries: readonly TranscriptEntryLike[]) if (message?.role === 'user') return true; if (message?.role === 'toolResult') { const toolName = typeof message.toolName === 'string' ? message.toolName : ''; - if (toolName.startsWith('request_')) return !isTerminalRequestStatus(responseStatus(message)); + if (toolName.startsWith('request_')) return !isTerminalRequestResult(message); if (toolName.startsWith('present_')) return false; } return false; @@ -74,17 +74,19 @@ export function latestTailOwesAssistant(entries: readonly TranscriptEntryLike[]) return false; } -function isTerminalRequestStatus(status: string | undefined): boolean { - return status === 'answered' || status === 'cancelled' || status === 'unavailable'; -} - -function responseStatus(message: Record): string | undefined { +/** + * Real request_* result envelopes (projections/exchanges) carry their outcome + * as key presence — `answered` / `cancelled` / `unavailable` — never a status + * string field. A request result with none of those keys is still pending. + */ +function isTerminalRequestResult(message: Record): boolean { const details = isRecord(message.details) ? message.details : isRecord(message.data) ? message.data : undefined; - return typeof details?.status === 'string' ? details.status : undefined; + if (!details) return false; + return 'answered' in details || 'cancelled' in details || 'unavailable' in details; } function messageRecord(entry: TranscriptEntryLike): Record | undefined { From 8487ba3d71072b12b3fbc141d25fef66b9ab2b7c Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:45:31 +0200 Subject: [PATCH 22/32] Flip the I47 idempotence scaffold rows live through real restart Adds rebootTier2Runtime (flushes Pi's deferred JSONL, then re-boots the real runtime over the same session file) and replaces the remaining it.todo rows: the dedicated no-redundant-worldUpdate-after-seed proof runs through real boot + provider preflight; boot/resume dedupe is proven across an actual restart (seed, kick, and worldUpdate all non-duplicated, derived purely from transcript projection); and the sets-and-{specId,lsn} suite convention is enforced mechanically by a source scan banning golden matchers in this suite. The Tier-2 scaffold has no skipped or todo rows left. Co-Authored-By: Claude Fable 5 --- src/dev/tier-2-harness.test.ts | 70 ++++++++++++++++++++++++++++++++-- src/dev/tier-2-harness.ts | 41 +++++++++++++++++++- 2 files changed, 107 insertions(+), 4 deletions(-) diff --git a/src/dev/tier-2-harness.test.ts b/src/dev/tier-2-harness.test.ts index ab900cfc..c1bc4e94 100644 --- a/src/dev/tier-2-harness.test.ts +++ b/src/dev/tier-2-harness.test.ts @@ -1,3 +1,6 @@ +import { readFile } from 'node:fs/promises'; +import { fileURLToPath } from 'node:url'; + import { type ToolDefinition } from '@earendil-works/pi-coding-agent'; import { describe, expect, it } from 'vitest'; @@ -11,6 +14,7 @@ import { BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE } from '../session/runtime-state import { bootTier2RuntimeFromFixture, bootTier2RuntimeThroughRunBrunchTui, + rebootTier2Runtime, resumeTier2Fixture, runTier2RealBootFauxTurn, } from './tier-2-harness.js'; @@ -521,7 +525,21 @@ describe('FE-847 coverage-first scaffold — I46-L honest origination', () => { }); describe('FE-847 coverage-first scaffold — I47-L carrier discipline and idempotence', () => { - it.todo('no redundant worldUpdate is emitted immediately after a seed naming the current snapshot LSN'); + it('no redundant worldUpdate is emitted immediately after a seed naming the current snapshot LSN', async () => { + const boot = await bootTier2RuntimeFromFixture({ fixtureEntries: () => [] }); + try { + const entries = boot.runtime.session.sessionManager.getEntries(); + const seeds = customEntries(entries, 'brunch.context_seed'); + expect(seeds).toHaveLength(1); + expect(seeds[0]?.data).toMatchObject({ specId: boot.specId, snapshotLsn: expect.any(Number) }); + + await boot.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + expect(customEntries(boot.runtime.session.sessionManager.getEntries(), 'worldUpdate')).toHaveLength(0); + } finally { + await boot.runtime.dispose(); + boot.restoreEnv(); + } + }); it('compaction and resume preserve the latest watermark carrier so projection cannot regress', () => { const latestAnchorsByKind = new Map( @@ -543,8 +561,54 @@ describe('FE-847 coverage-first scaffold — I47-L carrier discipline and idempo expect(projectAssistantVisibleWatermark(compactedEntries, { specId })).toEqual({ specId, lsn: 8 }); }); - it.todo('boot/resume seeding derives dedupe from transcript projection rather than hidden flags'); - it.todo('continuity assertions use sets and {specId, lsn} properties rather than payload-order goldens'); + it('boot/resume seeding derives dedupe from transcript projection rather than hidden flags', async () => { + // First real boot seeds and kicks; an actual restart over the same session + // file must not duplicate the seed, the kick, or synthesize a worldUpdate — + // with no state surviving except the transcript itself. + const boot = await bootTier2RuntimeFromFixture({ fixtureEntries: () => [] }); + let rebooted: Awaited> | undefined; + try { + const firstEntries = boot.runtime.session.sessionManager.getEntries(); + expect(customEntries(firstEntries, 'brunch.context_seed')).toHaveLength(1); + expect(presentToolResults(firstEntries)).toHaveLength(1); + + const flushManager = boot.runtime.session.sessionManager; + await boot.runtime.dispose(); + rebooted = await rebootTier2Runtime({ + cwd: boot.cwd, + specId: boot.specId, + sessionFile: boot.sessionFile, + flushManager, + }); + + const rebootedEntries = rebooted.runtime.session.sessionManager.getEntries(); + expect(customEntries(rebootedEntries, 'brunch.context_seed')).toHaveLength(1); + expect(presentToolResults(rebootedEntries)).toHaveLength(1); + await rebooted.runtime.session.extensionRunner.emitBeforeProviderRequest({}); + expect(customEntries(rebooted.runtime.session.sessionManager.getEntries(), 'worldUpdate')).toHaveLength( + 0, + ); + } finally { + await rebooted?.runtime.dispose(); + boot.restoreEnv(); + } + }); + + it('continuity assertions use sets and {specId, lsn} properties rather than payload-order goldens', async () => { + // Suite convention, enforced mechanically: continuity proofs in this file + // assert sets and {specId, lsn} properties; no canonical item sort is + // specified, so payload-order goldens are banned. + const source = await readFile( + fileURLToPath(new URL('./tier-2-harness.test.ts', import.meta.url)), + 'utf8', + ); + const goldenMatchers = ['Snapshot', 'FileSnapshot', 'InlineSnapshot'].map( + (suffix) => `toMatch${suffix}(`, + ); + for (const matcher of goldenMatchers) { + expect(source.includes(matcher)).toBe(false); + } + }); }); async function readSessionContextDetails(session: { diff --git a/src/dev/tier-2-harness.ts b/src/dev/tier-2-harness.ts index ca0512cc..d8df2b9e 100644 --- a/src/dev/tier-2-harness.ts +++ b/src/dev/tier-2-harness.ts @@ -204,13 +204,52 @@ export async function bootTier2RuntimeFromFixture(options: { restoreEnv(); throw new Error('runBrunchTui did not reach launchInteractive for the fixture resume boot'); } - return { cwd, specId: workspace.spec.id, runtime, restoreEnv }; + return { cwd, specId: workspace.spec.id, sessionFile: workspace.session.file, runtime, restoreEnv }; } catch (error) { restoreEnv(); throw error; } } +/** + * Re-boot the real runtime over an existing session — the actual-restart half + * of the I47 idempotence proof. Pi defers JSONL writes until an assistant + * message exists, so the prior runtime's entries are flushed to the session + * file first; the reboot then reads continuity purely from transcript + * projection (no hidden flags survive the restart). + */ +export async function rebootTier2Runtime(options: { + readonly cwd: string; + readonly specId: number; + readonly sessionFile: string; + readonly flushManager?: unknown; +}) { + if (options.flushManager) flushSessionEntries(options.flushManager, options.sessionFile); + const coordinator = createWorkspaceSessionCoordinator({ cwd: options.cwd }); + const agentDir = await mkdtemp(join(tmpdir(), 'brunch-agent-dir-')); + let runtime: Awaited> | undefined; + await runBrunchTui({ + cwd: options.cwd, + autoOpen: false, + coordinator, + runWorkspaceDialogPreflight: async () => ({ + action: 'openSession', + specId: options.specId, + sessionFile: options.sessionFile, + }), + webSidecarRunner: async () => null, + launchInteractive: async (context) => { + runtime = await createAgentSessionRuntime(createBrunchAgentSessionRuntimeFactory(context), { + cwd: options.cwd, + agentDir, + sessionManager: context.workspace.session.manager, + }); + }, + }); + if (!runtime) throw new Error('runBrunchTui did not reach launchInteractive for the reboot'); + return { runtime }; +} + export async function resumeTier2Fixture(options: { readonly cwd?: string; readonly fixtureJsonl: string; From 74eac8d5debb2bec6ea8172cda2c4005870b94e6 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 14:46:39 +0200 Subject: [PATCH 23/32] Reconcile PLAN and REFACTOR state after FE-847 remediation closure Both FE-847 frontiers are now honestly done: every I46/I47 Tier-2 scaffold row runs live, with the resume-side and idempotence proofs through real boot/restart. REFACTOR.md remains only as the carrier for the suspended migration-0004 item handed to the stacked branch. Co-Authored-By: Claude Fable 5 --- memory/PLAN.md | 8 ++++---- memory/REFACTOR.md | 5 +++++ 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/memory/PLAN.md b/memory/PLAN.md index 0d7d223d..fc9fe85b 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -84,7 +84,7 @@ per ledger row: ### Active -- `kick-and-context-seeding` (FE-847) — residual closure on the shared branch: the four Tier-2 I46 resume-origination scaffold rows (pre-reconcile-tail kick, `request_*`/system idle against the real exchange result envelope, crash-after-notice re-kick, drains-don't-mask-debt) and the two I47 idempotence rows (boot/resume seed dedupe; dedicated no-redundant-`worldUpdate`-after-seed row) remain `it.todo`; the frontier is not done until they run live. Remediation sequence: `memory/REFACTOR.md`. +- (none) — the FE-847 turn-boundary closure completed 2026-06-11 (see Turn-boundary choreography below); the review-fix remediation residue is one suspended item in `memory/REFACTOR.md` (migration 0004, handed to the stacked successor branch). ### Turn-boundary choreography (Tier-2 layer) @@ -188,7 +188,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The `prepareNextTurn` reconciler and watermark projection land at their final homes (`src/session/` reconciler, `src/projections/session/` watermark) filling the FE-847 topology stubs; submit-time mention resolution at `session.submitMessage`; tool-result watermark stamping at the graph read/mutation adapters. - **Traceability:** D14-L, D15-L, D17-L, D37-L, D43-L, D49-L, D76-L, D77-L; A4-L, A9-L; I1-L, I4-L, I9-L, I45-L, I47-L. - **Design docs:** `memory/SPEC.md` D76-L–D77-L, I9-L, I45-L, I47-L; `src/session/README.md`; `src/projections/README.md`; `src/projections/session/runtime-state.ts`. -- **Current execution pointer:** Done 2026-06-11 on FE-847. The Tier-2 I45 scaffold is live, the live provider guard delegates to `guardBeforeProviderRequest`, submit-time mention facts feed the live reconciler staleness path, side-task/reviewer drains are threaded through the adapter, and the compaction anchor contract preserves the latest watermark carrier family (`brunch.context_seed`, `brunch.graph_overview_snapshot`, `brunch.own_mutation`, `worldUpdate`). **Residue:** the frontier's S5 share of I47 (the dedicated post-seed `worldUpdate` scaffold row and boot/resume dedupe idempotence) remains `it.todo`, carried to completion with the `kick-and-context-seeding` residual closure (`memory/REFACTOR.md` commit 9); compaction-survival is proven at projection level, not yet through an actual restart. +- **Current execution pointer:** Done 2026-06-11 on FE-847. The Tier-2 I45 scaffold is live, the live provider guard delegates to `guardBeforeProviderRequest`, submit-time mention facts feed the live reconciler staleness path, side-task/reviewer drains are threaded through the adapter, and the compaction anchor contract preserves the latest watermark carrier family (`brunch.context_seed`, `brunch.graph_overview_snapshot`, `brunch.own_mutation`, `worldUpdate`). **Residue closed 2026-06-11:** the S5/I47 rows now run live (dedicated post-seed `worldUpdate` row through real boot; boot/resume dedupe across an actual restart). The remediation pass also moved the live `before_provider_request` hook onto `guardBeforeProviderRequest` retry semantics and threaded transcript-projected mentions (plus the optional drains supplier) through the production adapter. ### kick-and-context-seeding @@ -196,7 +196,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Linear:** FE-847 — built as a slice group under the FE-847 issue; no separate issue. - **Branch:** `ln/fe-847-turn-boundary-closure` (stacked successor FE-847 branch, shared with `turn-boundary-reconciliation`). - **Kind:** structural / product mechanics -- **Status:** active — residual closure (resume-origination + idempotence proofs); not POC-ship-critical +- **Status:** done 2026-06-11 (turn-boundary choreography; not POC-ship-critical) - **Certainty:** proving - **Retires:** the R16 origination gap — proof that a structured-strategy session can originate its own offer-first turn honestly (no fabricated user entry) and seed context idempotently across real restart/resume. - **Depends on:** `turn-boundary-reconciliation` (S1 watermark projection + S2 reconciler — the seed must advance the watermark and the kick decision interacts with reconciler-inserted notices) and the `dx-tier-2-harness` chassis. Sequenced last in the FE-847 slice chain. @@ -216,7 +216,7 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Topology materialization:** The origination primitive (`startAssistantTurn`) lands in the session orchestration layer (`src/session/`) filling the FE-847 stub; `session.triggerExchange` is the public surface (D49-L); context seeding writes custom continuity entries through the same carrier as `worldUpdate`. - **Traceability:** D12-L, D37-L, D49-L, D66-L, D75-L, D76-L, D78-L; R16; I13-L, I46-L, I47-L. - **Design docs:** `memory/SPEC.md` D78-L, I46-L, I47-L; `src/session/README.md`. -- **Current execution pointer:** Partially landed 2026-06-11 on FE-847. New-session real boot seed-then-kick is proven live through Tier-2 (seed before first provider call, assistant-originated `present_*`, no fabricated user entry, no redundant `worldUpdate` after seed). **Not yet proven** — the resume side exists only as helper-level unit tests; the four Tier-2 I46 scaffold rows (pre-reconcile-tail kick behind continuity notices, `request_*`/system idle against the real exchange result envelope, crash-after-notice re-kick, drains-don't-mask-debt) and two I47 idempotence rows (boot/resume seed dedupe from transcript projection; the dedicated post-seed `worldUpdate` row) remain `it.todo` in `src/dev/tier-2-harness.test.ts`. Remediation commits 8-9 in `memory/REFACTOR.md` close them; the frontier completes when those rows run live. +- **Current execution pointer:** Done 2026-06-11 on FE-847 (closure completed by the review-fix remediation pass). All I46/I47 Tier-2 scaffold rows run live with no skips/todos: new-session seed-then-kick through real boot; resume kick on the pre-reconcile user tail (including behind continuity notices and after earlier completed exchanges — the prior blanket exchange-result suppression was a real bug); `request_*` leaves idle against the **real** result envelope (outcome is `answered`/`cancelled`/`unavailable` key presence per `projections/exchanges`, not a status string — the prior classifier read a field that never exists and would have re-kicked answered tails); crash-after-notice reboot kicks without duplicating the seed; drains neither manufacture nor mask debt; boot/resume dedupe proven across an actual restart via `rebootTier2Runtime`. Kick origin now derives from projected transcript state (no message entries = new session), not entry counts. ### project-graph-review-cycle diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md index 955955a9..6ea8b508 100644 --- a/memory/REFACTOR.md +++ b/memory/REFACTOR.md @@ -4,6 +4,11 @@ Created: 2026-06-11 · Temporary execution aid; delete when complete or supersed Context: post-build audit of commits ac84abb2..bbc4b4e6 against the (now-deleted) review-fix scope cards. Verified findings, not speculation. +**STATUS 2026-06-11: commits 1-9 and the discovered 7b are DONE** (0bc9cf24.. +5ad41d58 on ln/fe-847-turn-boundary-closure; verify green, 806 tests, zero +Tier-2 todos). **This file stays alive only for item 10 (suspended)** — delete +it once the stacked-branch migration fix is verified on reintegration. + ## Problem Statement The build pass delivered the continuity chain and new-session kick well, but left From 3e715a3d52b90938df651fbf65fa3cfa1dbc6277 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:12:49 +0200 Subject: [PATCH 24/32] File typing-collapse refactor plan for the exchanges editor seam Replaces the completed review-fix remediation plan in REFACTOR.md with the /expert-typescript-typing findings: one canonical editor envelope schema (the probe-side fallback is drift), a projected outcome union, and one grounding-gap fixture builder shared by production and tests. Carries the suspended migration-0004 item forward. Co-Authored-By: Claude Fable 5 --- memory/REFACTOR.md | 267 ++++++++++++++++++++++----------------------- 1 file changed, 128 insertions(+), 139 deletions(-) diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md index 6ea8b508..a5c14546 100644 --- a/memory/REFACTOR.md +++ b/memory/REFACTOR.md @@ -1,162 +1,151 @@ -# Refactor: review-fix remediation — close the gaps the build pass papered over +# Refactor: source-of-truth typing collapse — structured-exchange editor seam + gap fixtures Created: 2026-06-11 · Temporary execution aid; delete when complete or superseded. -Context: post-build audit of commits ac84abb2..bbc4b4e6 against the (now-deleted) -review-fix scope cards. Verified findings, not speculation. - -**STATUS 2026-06-11: commits 1-9 and the discovered 7b are DONE** (0bc9cf24.. -5ad41d58 on ln/fe-847-turn-boundary-closure; verify green, 806 tests, zero -Tier-2 todos). **This file stays alive only for item 10 (suspended)** — delete -it once the stacked-branch migration fix is verified on reintegration. +Supersedes: the 2026-06-11 review-fix remediation plan (commits 0bc9cf24..d596f266, +all done) — except its suspended migration item, carried forward at the bottom. +Origin: /expert-typescript-typing review of the exchanges editor seam, after the +remediation talkthrough exposed the envelope vocabulary collision that misled both +a bot-comment review and the original kick-classifier author. ## Problem Statement -The build pass delivered the continuity chain and new-session kick well, but left -five classes of debt, two of them dishonest rather than merely incomplete: - -1. **Claimed-done work that is todo.** PLAN marks `kick-and-context-seeding` done, - but the four I46 resume-origination scaffold rows and two I47 idempotence rows - remain `it.todo` in the Tier-2 suite (`src/dev/tier-2-harness.test.ts:345-377`) - — including the behaviors PLAN's pointer explicitly claims proven (request-result - terminal statuses idle; resume-tail classification ignores continuity notices). -2. **The silent-fallback lens rebuilt in new clothes.** The gap-legality fix made - `getElicitationGaps` required on `GraphReaders` but left `graphReads` optional on - the prompt context, falling back to `?? []`; an out-of-card commit then absorbed - the empty case with quiet empty-manifest/empty-options early-returns at two more - layers. Missing-wiring is again invisible — three layers deep now. The Tier-2 - real-boot legality assertion the card required does not exist. -3. **A placeholder swapped for a placeholder.** The runtime-switch append adapter - returns a hardcoded string instead of `''`; the helper's declared contract - (returns the created entry id) is still violated. The footer still has no - re-render trigger after a posture switch, so the stale-footer bug survives. -4. **Half-state env scoping.** `runWithScopedBrunchOfflineDefault` still accepts a - `dev` flag it never reads, and still saves/restores `PI_SKIP_VERSION_CHECK` - without ever setting it. -5. **Silently narrowed acceptance.** The predicate-semantics "one exhaustive - never-checked owner" was not built (if-chains; a new union arm without semantics - still compiles), and migration 0004 + seeds were not regenerated; PLAN's pointer - was rewritten to omit both rather than flag them. - -```pseudo graph (current — gap legality) -brunch-tui reads ──required──▶ GraphReaders.getElicitationGaps ✓ -prompt context ──optional?──▶ gapsForPrompt ──?? []──▶ legality layers - └─ gaps.length===0 → quiet empty posture (×2 layers) -Tier-2 suite ──╳ no real-boot legality assertion +Four type-fork families, all "duplicate the owner's state space closer to where I +happen to be working": + +1. **Two divergent editor wire envelopes for one job.** The editor-prefill pattern + exists for exactly one reason (user-confirmed): `request_choices` is the one + exchange whose response payload cannot ride Pi built-ins, and `ctx.ui.custom` + cannot cross RPC — so a JSON envelope is prefilled into `ctx.ui.editor` for the + client to edit. But two envelopes grew: the product tool's local one + (`...request_choices.editor`: response `{status, choices[], comment}`) and the + probe-only "shared" fallback (`...editor`: response `{status, answers[], note}`, + plus a single-select arm no product code reaches). Both are hand-parsed; no + schema owns either; the result envelope next door uses the same words + (`answered`/`cancelled`) with different grammar (outcome keys, not a status + string) — the trap that has now claimed two reviewers. +2. **The outcome union `'answered' | 'cancelled' | 'unavailable'` is restated** in + the projection input types, the editor envelopes (as a subset), and the session + debt-classifier's terminal-keys check — four files, zero owners, while the + request details schemas already carry these as their branch keys. +3. **The grounding-gap fixture builder is cloned across nine-plus test files**, + each hand-building the same `ElicitationGap` literal with a coverage knob, + while production's `conservativeUncoveredFloorGaps` builds the same shape + privately a tenth time. +4. **Hand-written editor-response interfaces** in both envelope sites, derivable + from the schema that should exist per (1). + +```pseudo graph (current) +schemas/request.ts ──owns──▶ zRequestChoicesDetails (outcome KEYS) +request-choices tool ──hand-writes──▶ local editor envelope + EditorResponse + parser +editor-fallback (probe-only) ──hand-writes──▶ second divergent envelope + parser + single-select arm +projections/exchanges ──restates──▶ 'answered'|'cancelled'|'unavailable' inline +session debt-classifier ──restates──▶ same three literals as key checks +9+ test files ──each hand-build──▶ ElicitationGap grounding fixtures +runtime/index ──privately builds──▶ conservativeUncoveredFloorGaps (same shape, 10th copy) ``` -```pseudo graph (desired — gap legality) -brunch-tui reads ──required──▶ GraphReaders.getElicitationGaps ✓ -prompt context ──required──▶ gapsForPrompt (no fallback) -empty gaps on a seeded spec ──▶ loud invariant error (wiring bug, not a posture) -Tier-2 suite ──✓ real boot: seeded coverage drives manifests/tool legality +```pseudo graph (desired) +schemas/request.ts ──owns──▶ zRequestChoicesDetails + ──owns──▶ zRequestChoicesEditorEnvelope (NEW: the one wire envelope) + ──owns──▶ REQUEST_OUTCOME_KEYS / RequestOutcome (NEW: projected, not declared) +request-choices tool ──derives──▶ prefill template (satisfies) + response (z.infer) + safeParse +RPC probe ──consumes──▶ the same canonical envelope (divergent fallback deleted or converged) +projections/exchanges ──projects──▶ RequestOutcome; re-exports keys for session-side consumers +session debt-classifier ──derives or drift-tests──▶ terminal keys against the schema branches +graph/schema gaps sub-tree ──owns──▶ groundingFloorGaps({coverage}) builder +runtime floor + all test fixtures ──import──▶ that one builder ``` ## Solution -Every claim in PLAN matches the test suite; every wiring absence is a compile error -or a loud runtime error, never a quiet posture; every declared contract is honored -by its adapters; the six remaining scaffold rows run live through the real -boot/resume harness (the resume chassis `resumeTier2Fixture` already exists). +One owner per state space: the editor envelope, the outcome union, and the +grounding-gap fixture each get exactly one declaration site; every other +appearance becomes an import, inference, or projection. Most of the diff is +deletion. ## Commits -Ordered by safety: doc honesty → contract/structural alignment → small behavioral → -type-contract tightening → live proofs (riskiest last, since they may reveal the -resume kick path needs product fixes). - -1. **PLAN honesty.** Revert the kick-and-context-seeding frontier to active with a - pointer naming exactly what remains (the six todo rows); amend the - turn-boundary-reconciliation pointer to note the I47 idempotence residue it - shares. Doc-only. -2. **Honest entry-id contract.** Either thread the real entry id from the Pi append - API through the runtime-switch adapter, or — if Pi does not return one — change - the helper signature and its session-manager interface to void and delete the - return-value expectation everywhere. No placeholder values of any kind survive. -3. **Predicate-semantics single owner.** Extract one exhaustive switch over the - predicate kind (never-checked) that both boundary validation and coverage - derivation ride, preserving current behavior exactly (presence implemented; - field/coverage rejected loudly; manual pass-through). Adding a union arm without - semantics becomes a compile error. Pure structure, no behavior change. -4. **Env-scoping pick-one.** Remove the dead dev flag from the scoped-offline - helper (no caller branches on it); make the offline default also set the - version-check skip variable — or, if the version-check noise is judged not real, - delete its save/restore instead. Both env-scope test cases assert the chosen - end state. No half-state. -5. **Footer refresh on posture switch.** After a runtime switch the chrome footer - re-renders from re-projected state, via the existing footer render-request - binding seam. A test pins switch-then-render shows the new strategy/lens. -6. **Loud gap-legality contract.** Make the graph readers required on the prompt - context for the production composition path (harness/test constructors that - genuinely lack a reader use an explicitly named narrowed type, not optionality); - delete the empty-array fallback; replace the two quiet empty-gaps early-returns - with a loud invariant error (a seeded spec always has floor gaps — empty means - wiring bug); document on the context type which optional members are - intended-optional and why. Compiler finds every construction site. -7. **Tier-2 live-legality assertion.** Real-boot test: a session over a seeded spec - derives prompt/tool legality from that spec's actual gap coverage, and covered - floor gaps unlock posture that uncovered gaps keep locked. This is the missing - card acceptance and the durable oracle for commit 6. -8. **Flip the I46 resume rows live.** The four todo rows through the existing - resume-fixture chassis: pre-reconcile user tail still earns a kick behind - continuity notices; request/system leaves stay idle — proven against the real - exchange result envelope as the exchanges extension writes it, settling the - response-status question; crash-after-notice still kicks on unresolved debt; - trailing drains neither manufacture nor mask debt. Fold in whatever product - fixes the tests force (this commit may split if they do). -9. **Flip the I47 idempotence rows live.** Repeated boot does not duplicate seed or - world-update entries (dedupe derived from transcript projection); the dedicated - no-redundant-world-update-after-seed row asserts through real boot; the - sets-and-properties meta-row either becomes a real assertion helper used by the - suite or is retired as a stated suite convention rather than a phantom todo. -7b. **(Discovered during commit 7) Runtime-switch tool posture from real gaps.** - `applyRuntimeSwitch` recomputes active tools with a hardcoded empty gap - register, so a posture switch floor-locks capability-gated tools until the - next turn boundary corrects it — the same optional-wiring fault family. - Thread a selected-spec gap reader into the commands extension from the - composition root (mirroring the chrome-refresh handle) and derive the - post-switch tool set from real coverage. - -10. **Migration coherence — SUSPENDED (2026-06-11).** Another agent is fixing the - 0004 migration on the branch stacked on top of this one. Do not touch drizzle/ - in this refactor; the derive-with-'context'-fallback vs read-side-throw concern - is handed to that branch. Re-check on reintegration that the concern was - actually covered there before deleting this line. +Ordered extractions-first; every commit leaves verify green. Commits 1→3 are +sequential on one seam; commit 4 is independent (parallel-safe lane for fan-out). + +1. **Extract the canonical editor-envelope schema.** Add the request-choices + editor envelope as a zod schema co-located with the request details schemas + (the product tool's current shape is canonical — it is the live contract). + The tool's prefill template is typed against the schema's input, its response + type is inferred, and its hand-written interface and parser are deleted in + favor of safeParse. Behavior-preserving; existing exchange tests unchanged. + Add one envelope round-trip test (prefill → edited response → parse → + projection into result details) as the seam's lock. +2. **Extract the outcome-union owner.** Export the outcome key list and its type + from the request schemas module (projected from the details-schema branches, + not redeclared); the projection input types and the editor envelope's + answered/cancelled subset become projections of it; re-export through the + exchanges projections layer so session-side consumers can reach it without + importing extension internals. The session debt-classifier's terminal-keys + check derives from the re-export — or, if that coupling is rejected during + build, keeps its literals and gains a drift test pinning them to the schema + branches. Either way the union has one owner. +3. **Converge or delete the probe-side envelope.** Rewrite the RPC + structured-exchange probe onto the canonical envelope and delete the + divergent fallback envelope, its parser, and its hand-written types. DECISION + GATE in-commit: the fallback's single-select arm is probe-only reachable; per + the request_choices-only rationale it should be deleted — but if the probe is + meant to prove a single-select RPC editor path, keep that arm and derive its + types instead. Confirm with the user before deleting; default is delete. +4. **Extract the grounding-gap fixture builder.** One builder with a coverage + knob, owned alongside the gaps schema; production's conservative floor rides + it (production owns the shape, tests import it — never the reverse); the + nine-plus per-test-file clones are deleted. Suite stays green as the proof. ## Decisions -- Runtime-switch append contract: real id or void — resolved by what the Pi API - returns; recorded when commit 2 lands. -- Prompt-context reader optionality: production path requires readers; narrowed - harness type is the only sanctioned readerless construction. -- Empty gaps on a seeded spec is an invariant violation (loud), not a legal posture - (quiet). Reverses the out-of-card "handle absent gaps safely" patch. -- Predicate semantics get exactly one exhaustive owner module/function; validate - and derive are its two riders. -- Migration 0004: regenerate vs waive — explicit user call in commit 10. -- Topology READMEs: none expected to change (no files move); if commit 3's - extraction adds a module under the graph schema sub-tree, that directory has no - README to update. +- The product request-choices envelope is canonical; the probe-side envelope is + drift, not a second contract. +- Zod owns the editor envelope: the edited JSON returns from an agent-as-user + over RPC, which is the repo's LLM-boundary rule — this is doctrinal, not an + exception. +- The outcome union is projected from the details schemas, never redeclared; + its session-side consumption goes through the projections re-export (preferred) + or a drift test (fallback), keeping session free of extension-internal imports. +- Fixture/production convergence direction: production owns the grounding-floor + shape; fixtures import it. +- The single-select editor arm's fate is the one open decision (commit 3 gate). +- Topology READMEs: add the two-envelope rationale (why the editor channel + exists at all: ctx.ui.custom cannot cross RPC; Pi built-ins cover the other + request shapes; multi-choice is the one payload needing it) to the exchanges + directory README in the same commit as the schema extraction — that note is + the trap-prevention payload of this whole refactor. ## Testing Decisions -- The Tier-2 suite is the oracle of record for resume origination, idempotence, - and live legality — real boot/resume, set/property assertions over - `{specId, lsn}`, never payload-order goldens (suite convention). -- The request-idle proof must use a fixture carrying the exchange result envelope - exactly as the exchanges extension writes it — that fixture IS the test of the - response-status classifier; a hand-built shape would re-prove nothing. -- Commit 3 is behavior-preserving: existing predicate unit tests must pass - unchanged; only their organization may move. -- Prior art: the live I45 rows and the new-session seed-then-kick test show the - established real-boot assertion style to follow. +- Behavior-preservation is the rule for commits 1, 2, 4: existing + structured-exchange, schema, and gap tests pass unchanged; only their imports + move. +- The new envelope round-trip test (commit 1) is the only net-new oracle: it + proves prefill, parse, and projection share one schema, which is the property + whose absence caused both review failures. +- If the drift-test fallback is chosen in commit 2, it asserts the classifier's + key literals equal the schema branch keys — same pattern as the existing + observed-shapes drift guards. +- Prior art: the schemas module's existing zod-parse-at-projection idiom + (`zRequestChoicesDetails.parse` in projections) and the + observed-shapes-coverage drift test. ## Out of Scope -- The ln-sync canonical-doc pass: D35-L vs startup-header behavior, the stale - `memory/cards/tooling--runtime-state-commands.md` card, the live-vs-harness - blind-spot row for SPEC, and graduating the two induct lenses into ln-review. -- Any restacking or editing of parent branches (user decision: fix at top of stack). -- Drains live production: no side-task/reviewer drain producer exists yet; the - optional supplier stays, but commit 8's drain row documents that intent where the - classifier consumes it. -- New product behavior beyond what flipping the scaffold rows forces. +- The PI_OFFLINE dev-default question — parked, low stakes: the TUI-branding + concern (Pi's version-check interjection, not suppressed by quietStartup) is + now served unconditionally by the PI_SKIP_VERSION_CHECK default from the + remediation pass, decoupling it from PI_OFFLINE entirely. Decide only when a + dev loop actually wants provider-reachable TUI launches. +- The ln-sync canonical-doc pass (D35-L startup-header alignment, stale + runtime-state-commands card, live-vs-harness blind-spot row, graduating the + two induct lenses into ln-review). +- request_answer's plain-string editor use — not an envelope, nothing to unify. + +## Carried forward — SUSPENDED (from the completed remediation plan) + +- **Migration 0004 coherence:** another agent is fixing the 0004 migration on + the branch stacked on top of this one. Do not touch drizzle/ here. On + reintegration, verify the derive-with-'context'-fallback vs read-side-throw + concern was actually covered there before deleting this note. From ff41e0ac413e2f625888794fb5d7fa85fdf08fc5 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:26:53 +0200 Subject: [PATCH 25/32] Extract the canonical request_choices editor-envelope schema The product editor envelope (schema name brunch.structured_exchange.request_choices.editor) moves from a hand-written interface + parser inside the request_choices tool to a zod schema co-located with the request details schemas. The prefill template now types against the schema input, the response type is inferred, and parsing is schema safeParse. A round-trip test locks prefill -> edited response -> parse -> projection. The new exchanges README documents the two-envelope rationale (editor wire status vs transcript outcome-key presence). Co-Authored-By: Claude Fable 5 --- ...tructured-exchange-editor-envelope.test.ts | 73 +++++++++++++++ src/.pi/extensions/exchanges/README.md | 31 +++++++ .../extensions/exchanges/request-choices.ts | 91 +++++++------------ .../extensions/exchanges/schemas/README.md | 5 +- .../extensions/exchanges/schemas/editor.ts | 61 +++++++++++++ src/.pi/extensions/exchanges/schemas/index.ts | 1 + 6 files changed, 203 insertions(+), 59 deletions(-) create mode 100644 src/.pi/__tests__/structured-exchange-editor-envelope.test.ts create mode 100644 src/.pi/extensions/exchanges/README.md create mode 100644 src/.pi/extensions/exchanges/schemas/editor.ts diff --git a/src/.pi/__tests__/structured-exchange-editor-envelope.test.ts b/src/.pi/__tests__/structured-exchange-editor-envelope.test.ts new file mode 100644 index 00000000..b58d0578 --- /dev/null +++ b/src/.pi/__tests__/structured-exchange-editor-envelope.test.ts @@ -0,0 +1,73 @@ +import { describe, expect, it } from 'vitest'; + +import { projectRequestChoices } from '../../projections/exchanges/request-choices.js'; +import { + buildRequestChoicesEditorPrefill, + parseRequestChoicesEditorResponse, +} from '../extensions/exchanges/request-choices.js'; +import { zRequestChoicesEditorEnvelope } from '../extensions/exchanges/schemas/index.js'; + +describe('request_choices editor envelope', () => { + it('round-trips prefill, edited response, parse, and projection through the one schema', () => { + const prefill = buildRequestChoicesEditorPrefill({ + prompt: 'Select all priorities.', + choices: [ + { id: 'speed', label: 'Move quickly' }, + { id: 'safety', label: 'Keep the transcript safe' }, + ], + allowOther: true, + commentPrompt: 'Optional comment', + }); + + const envelope = zRequestChoicesEditorEnvelope.parse(JSON.parse(prefill)); + expect(envelope).toMatchObject({ + schema: 'brunch.structured_exchange.request_choices.editor', + schemaVersion: 1, + mode: 'multi-choice', + choices: [ + { id: 'speed', label: 'Move quickly' }, + { id: 'safety', label: 'Keep the transcript safe' }, + { id: 'other', label: 'Other' }, + ], + response: { status: 'cancelled', choices: [], comment: '' }, + }); + + const edited = JSON.stringify({ + ...envelope, + response: { + status: 'answered', + choices: [{ id: 'speed' }, { id: 'other', label: 'Other' }], + comment: 'Also keep the proof deterministic.', + }, + }); + + const response = parseRequestChoicesEditorResponse(edited); + if (response?.status !== 'answered') throw new Error('expected an answered editor response'); + + const offeredLabels = new Map(envelope.choices.map((choice) => [choice.id, choice.label])); + const details = projectRequestChoices({ + exchangeId: 'priorities', + status: 'answered', + choices: response.choices.map((choice) => ({ + id: choice.id, + label: choice.label ?? offeredLabels.get(choice.id) ?? choice.id, + kind: choice.id === 'other' ? ('other' as const) : ('listed' as const), + })), + comment: response.comment, + }); + + expect(details).toMatchObject({ + schema: 'brunch.structured_exchange.request', + v: 1, + exchange_id: 'priorities', + tool_meta: { prev: 'present_options', curr: 'request_choices', next: 'capture_choices' }, + answered: { + choices: [ + { id: 'speed', label: 'Move quickly', kind: 'listed' }, + { id: 'other', label: 'Other', kind: 'other' }, + ], + comment: 'Also keep the proof deterministic.', + }, + }); + }); +}); diff --git a/src/.pi/extensions/exchanges/README.md b/src/.pi/extensions/exchanges/README.md new file mode 100644 index 00000000..a4fdb9a5 --- /dev/null +++ b/src/.pi/extensions/exchanges/README.md @@ -0,0 +1,31 @@ +# exchanges/ — structured-exchange Pi tools + +Owns Pi registration and live UI collection for the structured-exchange tool +family (`present_*` / `request_*`). Result details are constructed only through +`projections/exchanges/*` and validated against the Zod schemas in `schemas/` +(see `schemas/README.md` for the details contract). + +## The two envelopes + +There are two distinct envelopes in this seam — do not conflate them: + +- **Editor wire envelope** (`schemas/editor.ts`, + `brunch.structured_exchange.request_choices.editor`). Pi UI built-ins cover + every other `request_*` response shape, but the multi-choice + `request_choices` payload cannot ride them, and Pi's `ctx.ui.custom` cannot + cross RPC. So `request_choices` prefills this JSON envelope into + `ctx.ui.editor` for the client to edit and return. Its `status` string is + wire-level editor state only. +- **Transcript result envelope** (`schemas/request.ts`, + `brunch.structured_exchange.request`). The outcome of a request is carried in + transcript details as key presence — `answered` / `cancelled` / + `unavailable` — never a status string. + +## Dependency rules + +```pseudo +exchanges/* -> schemas/, projections/exchanges/, renderers/exchanges/ +exchanges/schemas/ -> zod only (pi-schema.ts is the lone TSchema adapter) +``` + +`structured-exchange-boundaries.test.ts` enforces these boundaries. diff --git a/src/.pi/extensions/exchanges/request-choices.ts b/src/.pi/extensions/exchanges/request-choices.ts index 335e4961..797ec7f3 100644 --- a/src/.pi/extensions/exchanges/request-choices.ts +++ b/src/.pi/extensions/exchanges/request-choices.ts @@ -4,8 +4,14 @@ import { projectRequestChoices } from '../../../projections/exchanges/request-ch import { formatRequestChoices } from '../../../renderers/exchanges/request-choices.js'; import { piSchema } from './pi-schema.js'; import { + STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_SCHEMA, + STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_VERSION, + zRequestChoicesEditorReply, zRequestChoicesParams, type RequestChoiceParam, + type RequestChoicesEditorChoice, + type RequestChoicesEditorEnvelopeInput, + type RequestChoicesEditorResponse, type RequestChoicesParams, type SelectedChoice, } from './schemas/index.js'; @@ -15,18 +21,7 @@ export const REQUEST_CHOICES_TOOL = 'request_choices' as const; type StructuredExchangeChoice = RequestChoiceParam; -interface EditorChoice { - id: string; - label?: string; -} - -interface EditorResponse { - status: 'answered' | 'cancelled'; - choices: EditorChoice[]; - comment: string; -} - -function buildEditorPrefill(params: { +export function buildRequestChoicesEditorPrefill(params: { prompt: string; choices: readonly StructuredExchangeChoice[]; allowOther?: boolean; @@ -38,56 +33,37 @@ function buildEditorPrefill(params: { ...(params.allowOther ? [{ id: 'other', label: 'Other' }] : []), ...(params.allowNone ? [{ id: 'none', label: 'None' }] : []), ]; - return JSON.stringify( - { - schema: 'brunch.structured_exchange.request_choices.editor', - schemaVersion: 1, - prompt: params.prompt, - mode: 'multi-choice', - choices, - instructions: [ - 'Edit only response.', - 'Set response.status to answered or cancelled.', - 'For each selected choice, include its id in response.choices.', - 'Set response.comment to a string. Other or None requires a nonblank comment.', - ], - commentPrompt: params.commentPrompt ?? 'Optional comment', - response: { status: 'cancelled', choices: [], comment: '' }, - }, - null, - 2, - ); + const envelope = { + schema: STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_SCHEMA, + schemaVersion: STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_VERSION, + prompt: params.prompt, + mode: 'multi-choice', + choices, + instructions: [ + 'Edit only response.', + 'Set response.status to answered or cancelled.', + 'For each selected choice, include its id in response.choices.', + 'Set response.comment to a string. Other or None requires a nonblank comment.', + ], + commentPrompt: params.commentPrompt ?? 'Optional comment', + response: { status: 'cancelled', choices: [], comment: '' }, + } satisfies RequestChoicesEditorEnvelopeInput; + return JSON.stringify(envelope, null, 2); } -function parseEditorResponse(value: string): EditorResponse | null { +export function parseRequestChoicesEditorResponse(value: string): RequestChoicesEditorResponse | null { let parsed: unknown; try { parsed = JSON.parse(value); } catch { return null; } - if (!isRecord(parsed)) return null; - const response = parsed.response; - if (!isRecord(response)) return null; - - if (response.status === 'cancelled') return { status: 'cancelled', choices: [], comment: '' }; - if (response.status !== 'answered') return null; - if (!Array.isArray(response.choices)) return null; - if (typeof response.comment !== 'string') return null; - - const choices = response.choices.map((choice): EditorChoice | null => { - if (!isRecord(choice) || typeof choice.id !== 'string') return null; - return { - id: choice.id, - ...(typeof choice.label === 'string' ? { label: choice.label } : {}), - }; - }); - if (choices.some((choice) => choice === null)) return null; - return { status: 'answered', choices: choices as EditorChoice[], comment: response.comment }; + const reply = zRequestChoicesEditorReply.safeParse(parsed); + return reply.success ? reply.data.response : null; } function matchSelectedChoices( - selected: readonly EditorChoice[], + selected: readonly RequestChoicesEditorChoice[], params: { choices: readonly StructuredExchangeChoice[]; allowOther?: boolean; @@ -139,15 +115,18 @@ export const requestChoicesTool = defineTool({ return terminal('unavailable', 'request_choices requires interactive UI'); } - const editorPrefillParams: Parameters[0] = { prompt: params.prompt, choices }; + const editorPrefillParams: Parameters[0] = { + prompt: params.prompt, + choices, + }; if (params.allowOther !== undefined) editorPrefillParams.allowOther = params.allowOther; if (params.allowNone !== undefined) editorPrefillParams.allowNone = params.allowNone; if (params.commentPrompt !== undefined) editorPrefillParams.commentPrompt = params.commentPrompt; - const edited = await ctx.ui.editor(buildEditorPrefill(editorPrefillParams)); + const edited = await ctx.ui.editor(buildRequestChoicesEditorPrefill(editorPrefillParams)); if (edited === undefined) return terminal('cancelled'); - const response = parseEditorResponse(edited); + const response = parseRequestChoicesEditorResponse(edited); if (!response) return terminal('unavailable', 'request_choices editor fallback returned invalid JSON'); if (response.status === 'cancelled') return terminal('cancelled'); @@ -183,7 +162,3 @@ export const requestChoicesTool = defineTool({ return renderMarkdownResult(result, theme); }, }); - -function isRecord(value: unknown): value is Record { - return typeof value === 'object' && value !== null; -} diff --git a/src/.pi/extensions/exchanges/schemas/README.md b/src/.pi/extensions/exchanges/schemas/README.md index 826522f8..87ee8940 100644 --- a/src/.pi/extensions/exchanges/schemas/README.md +++ b/src/.pi/extensions/exchanges/schemas/README.md @@ -29,10 +29,13 @@ schemas/ request.ts capture.ts params.ts + editor.ts index.ts ``` -The organization is layer-first: shared vocabulary, tool parameter schemas, present details, request details, capture details, and one public export barrel. +The organization is layer-first: shared vocabulary, tool parameter schemas, present details, request details, capture details, the `request_choices` editor wire envelope, and one public export barrel. + +`editor.ts` is not part of the transcript details model: it owns the JSON envelope prefilled into `ctx.ui.editor` for `request_choices` (the one request payload Pi built-ins cannot carry over RPC). Its wire-level `status` string never appears in transcript details, which carry outcomes as key presence. ## Source boundaries diff --git a/src/.pi/extensions/exchanges/schemas/editor.ts b/src/.pi/extensions/exchanges/schemas/editor.ts new file mode 100644 index 00000000..4e677f47 --- /dev/null +++ b/src/.pi/extensions/exchanges/schemas/editor.ts @@ -0,0 +1,61 @@ +import * as z from 'zod'; + +/** + * Editor wire envelope for `request_choices`. + * + * `request_choices` is the one structured-exchange request whose response + * payload cannot ride a Pi UI built-in, and Pi's `ctx.ui.custom` cannot cross + * RPC. The tool therefore prefills this JSON envelope into `ctx.ui.editor` and + * parses the edited document back. + * + * The `status` string here is editor wire state only. Transcript result + * details (`request.ts`) carry their outcome as key presence — + * `answered` / `cancelled` / `unavailable` — never a status string. + */ +export const STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_SCHEMA = + 'brunch.structured_exchange.request_choices.editor' as const; +export const STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_VERSION = 1 as const; + +/** + * A choice reference inside the editor envelope. The prefill lists the offered + * choices with labels; the edited response only owes back ids, so `label` is + * optional on the way in. + */ +export const zRequestChoicesEditorChoice = z.object({ + id: z.string().min(1), + label: z.string().optional(), +}); +export type RequestChoicesEditorChoice = z.infer; + +export const zRequestChoicesEditorResponse = z.discriminatedUnion('status', [ + z.object({ + status: z.literal('cancelled'), + choices: z.array(zRequestChoicesEditorChoice).optional(), + comment: z.string().optional(), + }), + z.object({ + status: z.literal('answered'), + choices: z.array(zRequestChoicesEditorChoice), + comment: z.string(), + }), +]); +export type RequestChoicesEditorResponse = z.infer; + +export const zRequestChoicesEditorEnvelope = z.object({ + schema: z.literal(STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_SCHEMA), + schemaVersion: z.literal(STRUCTURED_EXCHANGE_REQUEST_CHOICES_EDITOR_VERSION), + prompt: z.string(), + mode: z.literal('multi-choice'), + choices: z.array(zRequestChoicesEditorChoice), + instructions: z.array(z.string()), + commentPrompt: z.string(), + response: zRequestChoicesEditorResponse, +}); +export type RequestChoicesEditorEnvelope = z.infer; +export type RequestChoicesEditorEnvelopeInput = z.input; + +/** + * The edited document only owes back a valid `response`; the rest of the + * envelope is instructional scaffolding the client may leave untouched. + */ +export const zRequestChoicesEditorReply = zRequestChoicesEditorEnvelope.pick({ response: true }); diff --git a/src/.pi/extensions/exchanges/schemas/index.ts b/src/.pi/extensions/exchanges/schemas/index.ts index a381857c..b24bb053 100644 --- a/src/.pi/extensions/exchanges/schemas/index.ts +++ b/src/.pi/extensions/exchanges/schemas/index.ts @@ -1,4 +1,5 @@ export * from './capture.js'; +export * from './editor.js'; export * from './present.js'; export * from './params.js'; export * from './request.js'; From 7a10d546e94f274aa012e5745e3a3b0550dbb37d Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:29:15 +0200 Subject: [PATCH 26/32] Extract the request outcome-union owner from the details schemas RequestOutcomeKey is now projected from the request details union branches (KeysOfUnion minus header/tool_meta), with the exported REQUEST_OUTCOME_KEYS list drift-coupled to the schema in both directions via a satisfies Record marker. All four request projection input types consume it, the editor envelope statuses become an Exclude projection, and the session debt classifier derives its terminal-keys check from the projections/exchanges re-export instead of restating literals. Co-Authored-By: Claude Fable 5 --- .../extensions/exchanges/schemas/editor.ts | 22 ++++++++++++++++-- .../extensions/exchanges/schemas/request.ts | 23 +++++++++++++++++++ src/projections/exchanges/request-answer.ts | 7 ++++-- src/projections/exchanges/request-choice.ts | 10 +++++--- src/projections/exchanges/request-choices.ts | 14 ++++++++--- src/projections/exchanges/request-review.ts | 7 ++++-- src/session/start-assistant-turn.ts | 7 +++--- 7 files changed, 75 insertions(+), 15 deletions(-) diff --git a/src/.pi/extensions/exchanges/schemas/editor.ts b/src/.pi/extensions/exchanges/schemas/editor.ts index 4e677f47..e1013674 100644 --- a/src/.pi/extensions/exchanges/schemas/editor.ts +++ b/src/.pi/extensions/exchanges/schemas/editor.ts @@ -1,5 +1,7 @@ import * as z from 'zod'; +import type { RequestOutcomeKey } from './request.js'; + /** * Editor wire envelope for `request_choices`. * @@ -27,14 +29,30 @@ export const zRequestChoicesEditorChoice = z.object({ }); export type RequestChoicesEditorChoice = z.infer; +/** + * Editor wire statuses are an `Exclude<>` projection of the transcript outcome + * union: the client can answer or cancel, while `unavailable` is tool-authored + * only. The marker record drift-couples the set in both directions. + */ +export type RequestChoicesEditorStatus = Exclude; + +const editorStatusMarkers = { + answered: true, + cancelled: true, +} satisfies Record; + +export const REQUEST_CHOICES_EDITOR_STATUSES = Object.keys( + editorStatusMarkers, +) as readonly RequestChoicesEditorStatus[]; + export const zRequestChoicesEditorResponse = z.discriminatedUnion('status', [ z.object({ - status: z.literal('cancelled'), + status: z.literal('cancelled' satisfies RequestChoicesEditorStatus), choices: z.array(zRequestChoicesEditorChoice).optional(), comment: z.string().optional(), }), z.object({ - status: z.literal('answered'), + status: z.literal('answered' satisfies RequestChoicesEditorStatus), choices: z.array(zRequestChoicesEditorChoice), comment: z.string(), }), diff --git a/src/.pi/extensions/exchanges/schemas/request.ts b/src/.pi/extensions/exchanges/schemas/request.ts index 1189f6ca..5c8e513b 100644 --- a/src/.pi/extensions/exchanges/schemas/request.ts +++ b/src/.pi/extensions/exchanges/schemas/request.ts @@ -229,3 +229,26 @@ export type RequestDetails = z.infer; export const RequestDetailsSchema = z.toJSONSchema(zRequestDetails, { unrepresentable: 'throw', }); + +type KeysOfUnion = T extends unknown ? keyof T : never; + +/** + * Request outcome keys, projected from the details-schema union branches. + * Every request details branch extends the shared header + `tool_meta` with + * exactly one of these keys; the transcript carries the outcome as key + * presence, never a status string. + */ +export type RequestOutcomeKey = Exclude< + KeysOfUnion, + KeysOfUnion> | 'tool_meta' +>; + +// `satisfies Record` drift-couples this list to the +// schema branches in both directions: a missing or extra key fails to compile. +const requestOutcomeKeyMarkers = { + answered: true, + cancelled: true, + unavailable: true, +} satisfies Record; + +export const REQUEST_OUTCOME_KEYS = Object.keys(requestOutcomeKeyMarkers) as readonly RequestOutcomeKey[]; diff --git a/src/projections/exchanges/request-answer.ts b/src/projections/exchanges/request-answer.ts index ecc36a75..c31b2e8e 100644 --- a/src/projections/exchanges/request-answer.ts +++ b/src/projections/exchanges/request-answer.ts @@ -1,4 +1,7 @@ -import type { RequestAnswerDetails } from '../../.pi/extensions/exchanges/schemas/index.js'; +import type { + RequestAnswerDetails, + RequestOutcomeKey, +} from '../../.pi/extensions/exchanges/schemas/index.js'; import { STRUCTURED_EXCHANGE_REQUEST_DETAILS_SCHEMA, zRequestAnswerDetails, @@ -7,7 +10,7 @@ import { export type { RequestAnswerDetails }; export function projectRequestAnswer(input: { readonly exchangeId: string; - readonly status: 'answered' | 'cancelled' | 'unavailable'; + readonly status: RequestOutcomeKey; readonly answer?: string | undefined; readonly message?: string | undefined; }): RequestAnswerDetails { diff --git a/src/projections/exchanges/request-choice.ts b/src/projections/exchanges/request-choice.ts index af0e152f..74ec7bed 100644 --- a/src/projections/exchanges/request-choice.ts +++ b/src/projections/exchanges/request-choice.ts @@ -1,16 +1,20 @@ -import type { RequestChoiceDetails, SelectedChoice } from '../../.pi/extensions/exchanges/schemas/index.js'; +import type { + RequestChoiceDetails, + RequestOutcomeKey, + SelectedChoice, +} from '../../.pi/extensions/exchanges/schemas/index.js'; import { STRUCTURED_EXCHANGE_REQUEST_DETAILS_SCHEMA, zRequestChoiceDetails, } from '../../.pi/extensions/exchanges/schemas/index.js'; -export type { RequestChoiceDetails, SelectedChoice }; +export type { RequestChoiceDetails, RequestOutcomeKey, SelectedChoice }; export type RequestChoicePresentTool = 'present_options' | 'present_candidates'; export function projectRequestChoice(input: { readonly exchangeId: string; readonly respondsToPresentTool: RequestChoicePresentTool; - readonly status: 'answered' | 'cancelled' | 'unavailable'; + readonly status: RequestOutcomeKey; readonly choice?: SelectedChoice | undefined; readonly comment?: string | undefined; readonly message?: string | undefined; diff --git a/src/projections/exchanges/request-choices.ts b/src/projections/exchanges/request-choices.ts index b02b8be8..86fbbe45 100644 --- a/src/projections/exchanges/request-choices.ts +++ b/src/projections/exchanges/request-choices.ts @@ -1,13 +1,21 @@ -import type { RequestChoicesDetails, SelectedChoice } from '../../.pi/extensions/exchanges/schemas/index.js'; +import type { + RequestChoicesDetails, + RequestOutcomeKey, + SelectedChoice, +} from '../../.pi/extensions/exchanges/schemas/index.js'; import { + REQUEST_OUTCOME_KEYS, STRUCTURED_EXCHANGE_REQUEST_DETAILS_SCHEMA, zRequestChoicesDetails, } from '../../.pi/extensions/exchanges/schemas/index.js'; -export type { RequestChoicesDetails, SelectedChoice }; +// Re-exported so session-side consumers can reach the outcome union without +// importing extension internals. +export { REQUEST_OUTCOME_KEYS }; +export type { RequestChoicesDetails, RequestOutcomeKey, SelectedChoice }; export function projectRequestChoices(input: { readonly exchangeId: string; - readonly status: 'answered' | 'cancelled' | 'unavailable'; + readonly status: RequestOutcomeKey; readonly choices?: readonly SelectedChoice[] | undefined; readonly comment?: string | undefined; readonly message?: string | undefined; diff --git a/src/projections/exchanges/request-review.ts b/src/projections/exchanges/request-review.ts index beba469a..17f3de48 100644 --- a/src/projections/exchanges/request-review.ts +++ b/src/projections/exchanges/request-review.ts @@ -1,4 +1,7 @@ -import type { RequestReviewDetails } from '../../.pi/extensions/exchanges/schemas/index.js'; +import type { + RequestOutcomeKey, + RequestReviewDetails, +} from '../../.pi/extensions/exchanges/schemas/index.js'; import { STRUCTURED_EXCHANGE_REQUEST_DETAILS_SCHEMA, zRequestReviewDetails, @@ -9,7 +12,7 @@ export type ReviewDecision = 'approve' | 'request_changes' | 'reject'; export function projectRequestReview(input: { readonly exchangeId: string; - readonly status: 'answered' | 'cancelled' | 'unavailable'; + readonly status: RequestOutcomeKey; readonly review?: ReviewDecision | undefined; readonly comment?: string | undefined; readonly message?: string | undefined; diff --git a/src/session/start-assistant-turn.ts b/src/session/start-assistant-turn.ts index 444e29d2..78128231 100644 --- a/src/session/start-assistant-turn.ts +++ b/src/session/start-assistant-turn.ts @@ -1,3 +1,4 @@ +import { REQUEST_OUTCOME_KEYS } from '../projections/exchanges/request-choices.js'; import { projectAssistantVisibleWatermark } from '../projections/session/assistant-visible-watermark.js'; import { isContinuityOnlyNonDebtEntry, @@ -76,8 +77,8 @@ export function latestTailOwesAssistant(entries: readonly TranscriptEntryLike[]) /** * Real request_* result envelopes (projections/exchanges) carry their outcome - * as key presence — `answered` / `cancelled` / `unavailable` — never a status - * string field. A request result with none of those keys is still pending. + * as key presence — `REQUEST_OUTCOME_KEYS` — never a status string field. A + * request result with none of those keys is still pending. */ function isTerminalRequestResult(message: Record): boolean { const details = isRecord(message.details) @@ -86,7 +87,7 @@ function isTerminalRequestResult(message: Record): boolean { ? message.data : undefined; if (!details) return false; - return 'answered' in details || 'cancelled' in details || 'unavailable' in details; + return REQUEST_OUTCOME_KEYS.some((key) => key in details); } function messageRecord(entry: TranscriptEntryLike): Record | undefined { From 22e65d62f2c8c57f0f34558ca1b29bde2b5282da Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:32:04 +0200 Subject: [PATCH 27/32] Converge the RPC proof on the canonical envelope and delete the fallback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The structured-exchange RPC proof now drives the product request_choices editor flow (requestChoicesViaEditor, extracted from the tool and shared by both callers) instead of the divergent probe-only envelope. The shared/editor-fallback.ts module — its envelope, parser, hand-written types, and single-select arm — is deleted along with its index re-exports and helper tests; multi-choice coverage through the one schema replaces the single-select arm. Co-Authored-By: Claude Fable 5 --- src/.pi/__tests__/structured-exchange.test.ts | 89 --------- src/.pi/extensions/exchanges/index.ts | 6 +- .../extensions/exchanges/request-choices.ts | 114 +++++++---- .../exchanges/shared/editor-fallback.ts | 185 ------------------ .../structured-exchange-rpc-proof.test.ts | 22 ++- src/probes/structured-exchange-rpc-proof.ts | 41 ++-- 6 files changed, 100 insertions(+), 357 deletions(-) delete mode 100644 src/.pi/__tests__/structured-exchange.test.ts delete mode 100644 src/.pi/extensions/exchanges/shared/editor-fallback.ts diff --git a/src/.pi/__tests__/structured-exchange.test.ts b/src/.pi/__tests__/structured-exchange.test.ts deleted file mode 100644 index 1a7f423a..00000000 --- a/src/.pi/__tests__/structured-exchange.test.ts +++ /dev/null @@ -1,89 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { - buildStructuredExchangeEditorPrefill, - parseStructuredExchangeEditorResponse, - structuredExchangeResultFromEditor, -} from '../extensions/exchanges/index.js'; - -describe('structured exchange JSON-editor fallback compatibility helpers', () => { - it('builds schema-tagged editor prefill for the raw Pi RPC fallback proof', () => { - const prefill = buildStructuredExchangeEditorPrefill({ - question: 'Pick paths', - context: 'Use the fallback.', - mode: 'multi-select', - options: [ - { label: 'Alpha', value: 'a' }, - { label: 'Beta', value: 'b', description: 'Second' }, - ], - }); - - expect(JSON.parse(prefill)).toMatchObject({ - schema: 'brunch.structured_exchange.editor', - schemaVersion: 1, - question: 'Pick paths', - context: 'Use the fallback.', - mode: 'multi-select', - options: [ - { index: 1, label: 'Alpha', value: 'a' }, - { index: 2, label: 'Beta', value: 'b', description: 'Second' }, - ], - response: { status: 'cancelled', answers: [], note: '' }, - }); - }); - - it('parses answered editor JSON with explicit empty notes', () => { - const parsed = parseStructuredExchangeEditorResponse( - JSON.stringify({ - response: { - status: 'answered', - answers: [{ type: 'option', label: 'Beta', value: 'b', index: 2 }], - note: '', - }, - }), - ); - - expect(parsed).toEqual({ - status: 'answered', - answers: [{ type: 'option', label: 'Beta', value: 'b', index: 2 }], - note: '', - }); - }); - - it('returns canonical request details for the existing RPC proof', () => { - const prefill = JSON.parse( - buildStructuredExchangeEditorPrefill({ - question: 'Pick paths', - exchangeId: 'paths-1', - mode: 'single-select', - options: [{ label: 'Alpha', value: 'a' }], - }), - ); - prefill.response = { - status: 'answered', - answers: [{ type: 'option', label: 'Alpha', value: 'a', index: 1 }], - note: 'Add context', - }; - - const result = structuredExchangeResultFromEditor( - { - question: 'Pick paths', - exchangeId: 'paths-1', - mode: 'single-select', - options: [{ label: 'Alpha', value: 'a' }], - }, - JSON.stringify(prefill), - ); - - expect(result.details).toMatchObject({ - schema: 'brunch.structured_exchange.request', - v: 1, - exchange_id: 'paths-1', - tool_meta: { prev: 'present_options', curr: 'request_choice' }, - answered: { - choice: { id: 'a', label: 'Alpha', kind: 'listed' }, - comment: 'Add context', - }, - }); - }); -}); diff --git a/src/.pi/extensions/exchanges/index.ts b/src/.pi/extensions/exchanges/index.ts index 130fefd4..07d4c224 100644 --- a/src/.pi/extensions/exchanges/index.ts +++ b/src/.pi/extensions/exchanges/index.ts @@ -13,11 +13,7 @@ import { REQUEST_CHOICE_TOOL, requestChoiceTool } from './request-choice.js'; import { REQUEST_CHOICES_TOOL, requestChoicesTool } from './request-choices.js'; import { REQUEST_REVIEW_TOOL, requestReviewTool } from './request-review.js'; -export { - buildStructuredExchangeEditorPrefill, - parseStructuredExchangeEditorResponse, - structuredExchangeResultFromEditor, -} from './shared/editor-fallback.js'; +export { requestChoicesViaEditor, type RequestChoicesEditorFlowParams } from './request-choices.js'; export { findIncompleteStructuredExchangePresents, isStructuredExchangePresentDetails, diff --git a/src/.pi/extensions/exchanges/request-choices.ts b/src/.pi/extensions/exchanges/request-choices.ts index 797ec7f3..7ee2b5fa 100644 --- a/src/.pi/extensions/exchanges/request-choices.ts +++ b/src/.pi/extensions/exchanges/request-choices.ts @@ -89,6 +89,66 @@ function matchSelectedChoices( return matched; } +export interface RequestChoicesEditorFlowParams { + readonly exchangeId: string; + readonly prompt: string; + readonly choices: readonly StructuredExchangeChoice[]; + readonly allowOther?: boolean; + readonly allowNone?: boolean; + readonly commentPrompt?: string; +} + +/** + * The full editor exchange for request_choices: schema-derived prefill, edited + * JSON back, schema parse, choice matching, and projection into canonical + * result details. The tool drives it through `ctx.ui.editor`; the RPC proof + * probe drives it through a raw RPC editor relay. + */ +export async function requestChoicesViaEditor( + params: RequestChoicesEditorFlowParams, + openEditor: (prefill: string) => Promise, +) { + const terminal = (status: 'cancelled' | 'unavailable', message?: string) => { + const details = projectRequestChoices({ exchangeId: params.exchangeId, status, message }); + return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; + }; + + const prefillParams: Parameters[0] = { + prompt: params.prompt, + choices: params.choices, + }; + if (params.allowOther !== undefined) prefillParams.allowOther = params.allowOther; + if (params.allowNone !== undefined) prefillParams.allowNone = params.allowNone; + if (params.commentPrompt !== undefined) prefillParams.commentPrompt = params.commentPrompt; + + const edited = await openEditor(buildRequestChoicesEditorPrefill(prefillParams)); + if (edited === undefined) return terminal('cancelled'); + + const response = parseRequestChoicesEditorResponse(edited); + if (!response) return terminal('unavailable', 'request_choices editor fallback returned invalid JSON'); + if (response.status === 'cancelled') return terminal('cancelled'); + + const matchParams: Parameters[1] = { choices: params.choices }; + if (params.allowOther !== undefined) matchParams.allowOther = params.allowOther; + if (params.allowNone !== undefined) matchParams.allowNone = params.allowNone; + + const matched = matchSelectedChoices(response.choices, matchParams); + if (typeof matched === 'string') return terminal('unavailable', matched); + + const comment = normalizeOptionalText(response.comment); + if (matched.some((choice) => choice.kind === 'other' || choice.kind === 'none') && comment === undefined) { + return terminal('unavailable', 'request_choices requires a comment for Other or None selections'); + } + + const details = projectRequestChoices({ + exchangeId: params.exchangeId, + status: 'answered', + choices: matched, + comment, + }); + return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; +} + export const requestChoicesTool = defineTool({ name: REQUEST_CHOICES_TOOL, label: 'Request choices', @@ -105,53 +165,25 @@ export const requestChoicesTool = defineTool({ async execute(_toolCallId, rawParams, _signal, _onUpdate, ctx) { const params = zRequestChoicesParams.parse(rawParams) satisfies RequestChoicesParams; - const choices = params.choices.map((choice) => ({ id: choice.id, label: choice.label })); - const terminal = (status: 'cancelled' | 'unavailable', message?: string) => { - const details = projectRequestChoices({ exchangeId: params.exchangeId, status, message }); - return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; - }; if (!ctx.hasUI || typeof ctx.ui.editor !== 'function') { - return terminal('unavailable', 'request_choices requires interactive UI'); + const details = projectRequestChoices({ + exchangeId: params.exchangeId, + status: 'unavailable', + message: 'request_choices requires interactive UI', + }); + return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; } - const editorPrefillParams: Parameters[0] = { + const flowParams: RequestChoicesEditorFlowParams = { + exchangeId: params.exchangeId, prompt: params.prompt, - choices, + choices: params.choices, + ...(params.allowOther !== undefined ? { allowOther: params.allowOther } : {}), + ...(params.allowNone !== undefined ? { allowNone: params.allowNone } : {}), + ...(params.commentPrompt !== undefined ? { commentPrompt: params.commentPrompt } : {}), }; - if (params.allowOther !== undefined) editorPrefillParams.allowOther = params.allowOther; - if (params.allowNone !== undefined) editorPrefillParams.allowNone = params.allowNone; - if (params.commentPrompt !== undefined) editorPrefillParams.commentPrompt = params.commentPrompt; - - const edited = await ctx.ui.editor(buildRequestChoicesEditorPrefill(editorPrefillParams)); - if (edited === undefined) return terminal('cancelled'); - - const response = parseRequestChoicesEditorResponse(edited); - if (!response) return terminal('unavailable', 'request_choices editor fallback returned invalid JSON'); - if (response.status === 'cancelled') return terminal('cancelled'); - - const matchParams: Parameters[1] = { choices }; - if (params.allowOther !== undefined) matchParams.allowOther = params.allowOther; - if (params.allowNone !== undefined) matchParams.allowNone = params.allowNone; - - const matched = matchSelectedChoices(response.choices, matchParams); - if (typeof matched === 'string') return terminal('unavailable', matched); - - const comment = normalizeOptionalText(response.comment); - if ( - matched.some((choice) => choice.kind === 'other' || choice.kind === 'none') && - comment === undefined - ) { - return terminal('unavailable', 'request_choices requires a comment for Other or None selections'); - } - - const details = projectRequestChoices({ - exchangeId: params.exchangeId, - status: 'answered', - choices: matched, - comment, - }); - return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; + return requestChoicesViaEditor(flowParams, (prefill) => ctx.ui.editor(prefill)); }, renderCall() { diff --git a/src/.pi/extensions/exchanges/shared/editor-fallback.ts b/src/.pi/extensions/exchanges/shared/editor-fallback.ts deleted file mode 100644 index b72a7945..00000000 --- a/src/.pi/extensions/exchanges/shared/editor-fallback.ts +++ /dev/null @@ -1,185 +0,0 @@ -import { projectRequestChoice } from '../../../../projections/exchanges/request-choice.js'; -import { projectRequestChoices } from '../../../../projections/exchanges/request-choices.js'; -import { formatRequestChoice } from '../../../../renderers/exchanges/request-choice.js'; -import { formatRequestChoices } from '../../../../renderers/exchanges/request-choices.js'; -import type { SelectedChoice } from '../schemas/index.js'; - -type StructuredExchangeMode = 'single-select' | 'multi-select'; - -interface StructuredExchangeOption { - label: string; - value: string; - description?: string; -} - -export type StructuredExchangeAnswer = - | { type: 'option'; label: string; value: string; index: number } - | { type: 'other'; label: string; value: string }; - -export interface StructuredExchangeEditorPrefillParams { - question: string; - context?: string; - exchangeId?: string; - mode: StructuredExchangeMode; - options: StructuredExchangeOption[]; -} - -interface StructuredExchangeEditorResponse { - status: 'answered' | 'cancelled'; - answers: StructuredExchangeAnswer[]; - note: string; -} - -function isRecord(value: unknown): value is Record { - return typeof value === 'object' && value !== null; -} - -function answerSortRank(answer: StructuredExchangeAnswer): number { - return answer.type === 'option' ? answer.index : Number.MAX_SAFE_INTEGER - 1; -} - -function sortAnswers(answers: StructuredExchangeAnswer[]): StructuredExchangeAnswer[] { - return [...answers].sort((a, b) => answerSortRank(a) - answerSortRank(b)); -} - -function parseEditorAnswer(value: unknown): StructuredExchangeAnswer | null { - if (!isRecord(value)) return null; - - if (value.type === 'option') { - if ( - typeof value.label !== 'string' || - typeof value.value !== 'string' || - typeof value.index !== 'number' || - !Number.isInteger(value.index) || - value.index < 1 - ) { - return null; - } - return { type: 'option', label: value.label, value: value.value, index: value.index }; - } - - if (value.type === 'other') { - if (typeof value.label !== 'string' || typeof value.value !== 'string') return null; - return { type: 'other', label: value.label, value: value.value }; - } - - return null; -} - -function selectedChoice(answer: StructuredExchangeAnswer): SelectedChoice { - if (answer.type === 'other') return { id: 'other', label: answer.label, kind: 'other' }; - return { id: answer.value, label: answer.label, kind: 'listed' }; -} - -export function buildStructuredExchangeEditorPrefill(params: StructuredExchangeEditorPrefillParams): string { - const payload: Record = { - schema: 'brunch.structured_exchange.editor', - schemaVersion: 1, - question: params.question, - mode: params.mode, - options: params.options.map((option, index) => ({ - index: index + 1, - label: option.label, - value: option.value, - ...(option.description ? { description: option.description } : {}), - })), - instructions: [ - 'Edit only response.', - 'For a selected listed option, add an answer like {"type":"option","label":"Alpha","value":"alpha","index":1}.', - 'For Other, add an answer like {"type":"other","label":"Custom answer","value":"Custom answer"}.', - 'Set response.note to a string. Use "" when there is no additional note.', - ], - response: { status: 'cancelled', answers: [], note: '' }, - }; - if (params.context !== undefined) payload.context = params.context; - return JSON.stringify(payload, null, 2); -} - -export function parseStructuredExchangeEditorResponse( - value: string, -): StructuredExchangeEditorResponse | null { - let parsed: unknown; - try { - parsed = JSON.parse(value); - } catch { - return null; - } - - if (!isRecord(parsed)) return null; - const response = parsed.response; - if (!isRecord(response)) return null; - if (response.status === 'cancelled') return { status: 'cancelled', answers: [], note: '' }; - if (response.status !== 'answered') return null; - if (!Array.isArray(response.answers) || typeof response.note !== 'string') return null; - - const answers = response.answers.map(parseEditorAnswer); - if (answers.some((answer) => answer === null)) return null; - return { - status: 'answered', - answers: sortAnswers(answers as StructuredExchangeAnswer[]), - note: response.note, - }; -} - -export function structuredExchangeResultFromEditor( - params: StructuredExchangeEditorPrefillParams, - edited: string | undefined, -) { - const response = parseStructuredExchangeEditorResponse(edited ?? ''); - const exchangeId = params.exchangeId ?? `rpc-editor:${params.question}`; - if (edited === undefined || response?.status === 'cancelled') { - if (params.mode === 'multi-select') { - const details = projectRequestChoices({ exchangeId, status: 'cancelled' }); - return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; - } - const details = projectRequestChoice({ - exchangeId, - respondsToPresentTool: 'present_options', - status: 'cancelled', - }); - return { content: [{ type: 'text' as const, text: formatRequestChoice(details) }], details }; - } - - if (!response || response.answers.length === 0) { - if (params.mode === 'multi-select') { - const details = projectRequestChoices({ - exchangeId, - status: 'unavailable', - message: 'Editor response did not include a valid answer', - }); - return { content: [{ type: 'text' as const, text: formatRequestChoices(details) }], details }; - } - const details = projectRequestChoice({ - exchangeId, - respondsToPresentTool: 'present_options', - status: 'unavailable', - message: 'Editor response did not include a valid answer', - }); - return { content: [{ type: 'text' as const, text: formatRequestChoice(details) }], details }; - } - - if (params.mode === 'multi-select') { - const details = projectRequestChoices({ - exchangeId, - status: 'answered', - choices: response.answers.map(selectedChoice), - comment: response.note.trim() || undefined, - }); - return { - content: [{ type: 'text' as const, text: formatRequestChoices(details) }], - details, - }; - } - - const details = projectRequestChoice({ - exchangeId, - respondsToPresentTool: 'present_options', - status: 'answered', - choice: selectedChoice(response.answers[0]!), - comment: response.note.trim() || undefined, - }); - return { - content: [{ type: 'text' as const, text: formatRequestChoice(details) }], - details, - }; -} diff --git a/src/probes/structured-exchange-rpc-proof.test.ts b/src/probes/structured-exchange-rpc-proof.test.ts index c633eb9f..89f6bac4 100644 --- a/src/probes/structured-exchange-rpc-proof.test.ts +++ b/src/probes/structured-exchange-rpc-proof.test.ts @@ -3,12 +3,12 @@ import { describe, expect, it } from 'vitest'; import { runStructuredExchangeRpcProof } from './structured-exchange-rpc-proof.js'; describe('structured-exchange RPC proof', () => { - it('round-trips option answers and notes through Pi RPC editor fallback', async () => { + it('round-trips multi-choice answers and comments through the Pi RPC editor envelope', async () => { const proof = await runStructuredExchangeRpcProof(); expect(proof.scenario).toMatchObject({ - mission: expect.stringContaining('option-based structured exchange'), - evaluationFocus: expect.stringContaining('optional note'), + mission: expect.stringContaining('multi-choice structured exchange'), + evaluationFocus: expect.stringContaining('optional comment'), maxTurns: 1, }); expect(proof.editorRequest).toMatchObject({ @@ -17,19 +17,21 @@ describe('structured-exchange RPC proof', () => { title: 'Answer structured exchange as JSON', }); expect(JSON.parse(proof.editorRequest.prefill ?? '{}')).toMatchObject({ - schema: 'brunch.structured_exchange.editor', + schema: 'brunch.structured_exchange.request_choices.editor', schemaVersion: 1, - question: 'Which implementation path should the evaluator choose?', - mode: 'multi-select', - options: [ - { index: 1, label: 'Ship RPC fallback', value: 'rpc-fallback' }, - { index: 2, label: 'Wait for web relay', value: 'wait-web' }, - { index: 3, label: 'Escalate blocker', value: 'blocker' }, + prompt: 'Which implementation path should the evaluator choose?', + mode: 'multi-choice', + choices: [ + { id: 'rpc-fallback', label: 'Ship RPC fallback' }, + { id: 'wait-web', label: 'Wait for web relay' }, + { id: 'blocker', label: 'Escalate blocker' }, ], + response: { status: 'cancelled', choices: [], comment: '' }, }); expect(proof.terminalDetails).toMatchObject({ schema: 'brunch.structured_exchange.request', v: 1, + exchange_id: 'structured-exchange-rpc-proof', tool_meta: { prev: 'present_options', curr: 'request_choices', next: 'capture_choices' }, answered: { choices: [{ id: 'rpc-fallback', label: 'Ship RPC fallback', kind: 'listed' }], diff --git a/src/probes/structured-exchange-rpc-proof.ts b/src/probes/structured-exchange-rpc-proof.ts index 3ae5600d..f9de85d2 100644 --- a/src/probes/structured-exchange-rpc-proof.ts +++ b/src/probes/structured-exchange-rpc-proof.ts @@ -52,9 +52,9 @@ interface StructuredExchangeRpcProofOptions { const PROOF_CUSTOM_TYPE = 'brunch.structured_exchange_rpc_proof_result'; const scenario = { - mission: 'Complete an option-based structured exchange as an agent-as-user evaluator.', + mission: 'Complete a multi-choice structured exchange as an agent-as-user evaluator.', evaluationFocus: - 'Verify that selected option answers and an optional note survive the Pi RPC editor fallback as structured terminal details.', + 'Verify that selected choices and an optional comment survive the Pi RPC request_choices editor envelope as structured terminal details.', maxTurns: 1, }; @@ -138,31 +138,25 @@ async function writeProofExtension(cwd: string): Promise { const adapterPath = resolve('src/.pi/extensions/exchanges/index.ts'); const content = ` import type { ExtensionAPI } from "@earendil-works/pi-coding-agent" - import { - buildStructuredExchangeEditorPrefill, - structuredExchangeResultFromEditor, - } from ${JSON.stringify(adapterPath)} + import { requestChoicesViaEditor } from ${JSON.stringify(adapterPath)} const params = { - question: "Which implementation path should the evaluator choose?", - context: "Scenario: prove option answers plus notes over Pi RPC.", - mode: "multi-select", - options: [ - { label: "Ship RPC fallback", value: "rpc-fallback" }, - { label: "Wait for web relay", value: "wait-web" }, - { label: "Escalate blocker", value: "blocker" }, + exchangeId: "structured-exchange-rpc-proof", + prompt: "Which implementation path should the evaluator choose?", + choices: [ + { id: "rpc-fallback", label: "Ship RPC fallback" }, + { id: "wait-web", label: "Wait for web relay" }, + { id: "blocker", label: "Escalate blocker" }, ], } as const export default function(pi: ExtensionAPI): void { pi.registerCommand("brunch-structured-exchange-rpc-proof", { - description: "Exercise Brunch structured-exchange RPC editor fallback.", + description: "Exercise the Brunch request_choices editor envelope over Pi RPC.", handler: async (_args, ctx) => { - const edited = await ctx.ui.editor( - "Answer structured exchange as JSON", - buildStructuredExchangeEditorPrefill(params), + const result = await requestChoicesViaEditor(params, (prefill) => + ctx.ui.editor("Answer structured exchange as JSON", prefill), ) - const result = structuredExchangeResultFromEditor(params, edited) const details = { ...result.details, probe: { name: "structured-exchange-rpc-proof", transport: "pi-rpc-editor" }, @@ -200,15 +194,8 @@ function answeredEditorPayload(prefill: string | undefined): string { const payload = JSON.parse(prefill) as { response?: unknown }; payload.response = { status: 'answered', - answers: [ - { - type: 'option', - label: 'Ship RPC fallback', - value: 'rpc-fallback', - index: 1, - }, - ], - note: 'Proceed, but report any relay friction separately.', + choices: [{ id: 'rpc-fallback', label: 'Ship RPC fallback' }], + comment: 'Proceed, but report any relay friction separately.', }; return `${JSON.stringify(payload, null, 2)}\n`; } From 0822d7b62b1b95bde4dc5266afb7f8f2f24aff98 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:31:59 +0200 Subject: [PATCH 28/32] Extract the grounding-gap fixture builder into graph/schema MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit One builder module (src/graph/schema/elicitation-gap-fixtures.ts) now owns the synthetic ElicitationGap shape: presenceGap for single gaps and groundingFloorGaps for the context/thesis/goal/constraint floor with a per-kind coverage knob. The runtime extension's fail-closed conservativeUncoveredFloorGaps rides the builder (keeping its name, export, and doc comment), and the eleven hand-cloned per-test-file gap literals are deleted in favor of importing it. Production owns the shape; tests import it — never the reverse. Co-Authored-By: Claude Fable 5 --- src/.pi/__tests__/prompting.test.ts | 44 +++------- .../__tests__/runtime-switch-command.test.ts | 23 +----- src/.pi/agents/compose.test.ts | 30 +------ src/.pi/agents/contexts/cwd.test.ts | 26 ++---- src/.pi/agents/state.test.ts | 54 ++++--------- .../runtime/authority-matrix.test.ts | 23 +----- src/.pi/extensions/runtime/index.ts | 22 ++--- src/app/brunch-tui.test.ts | 20 +---- src/graph/README.md | 3 + src/graph/schema/elicitation-gap-fixtures.ts | 80 +++++++++++++++++++ src/projections/session/affordances.test.ts | 55 ++++--------- .../session/capability-readiness.test.ts | 55 ++++--------- .../session/readiness-estimate.test.ts | 49 +++--------- .../runtime-affordances-coverage.test.ts | 28 +------ 14 files changed, 175 insertions(+), 337 deletions(-) create mode 100644 src/graph/schema/elicitation-gap-fixtures.ts diff --git a/src/.pi/__tests__/prompting.test.ts b/src/.pi/__tests__/prompting.test.ts index 05781bc9..93d18f16 100644 --- a/src/.pi/__tests__/prompting.test.ts +++ b/src/.pi/__tests__/prompting.test.ts @@ -4,8 +4,8 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; +import { groundingFloorGaps } from '../../graph/schema/elicitation-gap-fixtures.js'; import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; import type { WorkspacePostureState } from '../../session/workspace-session-coordinator.js'; import { composeAgentPrompt } from '../agents/compose.js'; import { createBrunchPiExtensions } from '../brunch-pi-extensions.js'; @@ -53,30 +53,6 @@ class FakeRuntimeStateSessionManager { } } -function gap(refersTo: NodeKind, coverage = 1): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage, - answered: coverage >= 1, - disposition: coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - -function groundingGaps(coverage: Partial> = {}): ElicitationGap[] { - return ['context', 'thesis', 'goal', 'constraint'].map((kind) => - gap(kind as NodeKind, coverage[kind as NodeKind] ?? 1), - ); -} - const promptContext = { spec: { id: 1, name: 'Spec' }, workspace: { @@ -133,7 +109,7 @@ const promptContext = { }), getNodes: () => [], resolveNodeCode: () => undefined, - getElicitationGaps: () => groundingGaps(), + getElicitationGaps: () => groundingFloorGaps(), }, }; @@ -156,7 +132,7 @@ describe('Brunch prompt-pack topology', () => { spec: promptContext.spec, workspace: promptContext.workspace, activeTools: ['read', 'grep', 'present_options'], - gaps: groundingGaps(), + gaps: groundingFloorGaps(), }); expect(result.prompt).toContain('[Brunch agent control]'); @@ -277,7 +253,7 @@ describe('Brunch prompt-pack topology', () => { }), getNodes: () => [], resolveNodeCode: () => undefined, - getElicitationGaps: () => groundingGaps(), + getElicitationGaps: () => groundingFloorGaps(), }, }), }, @@ -550,11 +526,13 @@ describe('Brunch prompt-pack topology', () => { return activeTools.at(-1) ?? []; } - await expect( - activeToolsForGaps(groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 })), - ).resolves.not.toContain('mutate_graph'); - await expect(activeToolsForGaps(groundingGaps({ context: 0.5 }))).resolves.toContain('mutate_graph'); - await expect(activeToolsForGaps(groundingGaps({ context: 0.5 }))).resolves.toContain( + await expect(activeToolsForGaps(groundingFloorGaps({ defaultCoverage: 0 }))).resolves.not.toContain( + 'mutate_graph', + ); + await expect(activeToolsForGaps(groundingFloorGaps({ coverage: { context: 0.5 } }))).resolves.toContain( + 'mutate_graph', + ); + await expect(activeToolsForGaps(groundingFloorGaps({ coverage: { context: 0.5 } }))).resolves.toContain( 'present_review_set', ); }); diff --git a/src/.pi/__tests__/runtime-switch-command.test.ts b/src/.pi/__tests__/runtime-switch-command.test.ts index 1e8a9234..3575946f 100644 --- a/src/.pi/__tests__/runtime-switch-command.test.ts +++ b/src/.pi/__tests__/runtime-switch-command.test.ts @@ -1,7 +1,6 @@ import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../../graph/schema/elicitation-gap-fixtures.js'; import { projectBrunchAgentState } from '../../projections/session/runtime-state.js'; import { BRUNCH_AGENT_RUNTIME_STATE_CUSTOM_TYPE, @@ -36,24 +35,6 @@ interface FakeCommandContext { }; } -function coveredGroundingGaps(): ElicitationGap[] { - return (['context', 'thesis', 'goal', 'constraint'] as const).map((refersTo: NodeKind) => ({ - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage: 1, - answered: true, - disposition: 'answered', - createdAtLsn: 1, - })); -} - function commandHarness(options: { customResult?: unknown; customAvailable?: boolean } = {}) { const entries: RuntimeEntry[] = []; const notifications: Array<{ message: string; level?: 'info' | 'warning' | 'error' }> = []; @@ -100,7 +81,7 @@ function commandHarness(options: { customResult?: unknown; customAvailable?: boo requestChromeRefresh: () => { chromeRefreshes.push(chromeRefreshes.length + 1); }, - getElicitationGaps: () => coveredGroundingGaps(), + getElicitationGaps: () => groundingFloorGaps(), }, ); diff --git a/src/.pi/agents/compose.test.ts b/src/.pi/agents/compose.test.ts index ed50a85c..1cdcb42b 100644 --- a/src/.pi/agents/compose.test.ts +++ b/src/.pi/agents/compose.test.ts @@ -4,8 +4,7 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../../graph/schema/elicitation-gap-fixtures.js'; import { DEFAULT_BRUNCH_AGENT_STATE, projectBrunchAgentState, @@ -42,31 +41,8 @@ function workspacePosture(posture: WorkspacePostureState): WorkspacePostureState return posture; } -function gap(refersTo: NodeKind, coverage = 1): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage, - answered: coverage >= 1, - disposition: coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - -const coveredGaps = ['context', 'thesis', 'goal', 'constraint'].map((kind) => gap(kind as NodeKind)); -const zeroCoverageGaps = coveredGaps.map((record) => ({ - ...record, - coverage: 0, - answered: false, - disposition: 'open' as const, -})); +const coveredGaps = groundingFloorGaps(); +const zeroCoverageGaps = groundingFloorGaps({ defaultCoverage: 0 }); const context = { contextHandles: ['graph-overview: compact selected-spec graph summary available via read tools'], diff --git a/src/.pi/agents/contexts/cwd.test.ts b/src/.pi/agents/contexts/cwd.test.ts index a7049eae..5721d8b3 100644 --- a/src/.pi/agents/contexts/cwd.test.ts +++ b/src/.pi/agents/contexts/cwd.test.ts @@ -1,27 +1,8 @@ import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../../graph/schema/nodes.js'; +import { presenceGap } from '../../../graph/schema/elicitation-gap-fixtures.js'; import { renderCwdContext } from './cwd.js'; -function gap(refersTo: NodeKind, coverage: number, band: ElicitationGap['band']): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 42, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band, - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage, - answered: coverage >= 1, - disposition: coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - describe('renderCwdContext', () => { it('renders selected-spec/session/posture facts without ambient resource discovery', () => { const rendered = renderCwdContext({ @@ -35,7 +16,10 @@ describe('renderCwdContext', () => { }, }, session: { id: 'session-7', label: 'Grounding' }, - gaps: [gap('context', 0.5, 'grounding'), gap('requirement', 1, 'elicitation')], + gaps: [ + presenceGap({ refersTo: 'context', coverage: 0.5, band: 'grounding', specId: 42 }), + presenceGap({ refersTo: 'requirement', coverage: 1, band: 'elicitation', specId: 42 }), + ], }); expect(rendered).toContain('- cwd: /repo/product'); diff --git a/src/.pi/agents/state.test.ts b/src/.pi/agents/state.test.ts index a44196b5..a0fabf34 100644 --- a/src/.pi/agents/state.test.ts +++ b/src/.pi/agents/state.test.ts @@ -3,35 +3,10 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../../graph/schema/elicitation-gap-fixtures.js'; import { projectBrunchAgentState } from '../../projections/session/runtime-state.js'; import { activeToolNamesForPosture, manifestsForState } from './state.js'; -function gap(refersTo: NodeKind, coverage: number): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage, - answered: coverage >= 1, - disposition: coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - -function groundingGaps(coverage: Partial> = {}): ElicitationGap[] { - return ['context', 'thesis', 'goal', 'constraint'].map((kind) => - gap(kind as NodeKind, coverage[kind as NodeKind] ?? 1), - ); -} - const registeredToolNames = [ 'read', 'grep', @@ -55,8 +30,8 @@ const registeredToolNames = [ describe('agent posture policy', () => { it('derives method manifests and active tool names from gap coverage', () => { const state = projectBrunchAgentState([]); - const uncoveredGaps = groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 }); - const coveredGaps = groundingGaps(); + const uncoveredGaps = groundingFloorGaps({ defaultCoverage: 0 }); + const coveredGaps = groundingFloorGaps(); const floorMethods = manifestsForState(state, uncoveredGaps).methods.map((entry) => entry.name); const floorTools = activeToolNamesForPosture({ registeredToolNames, state, gaps: uncoveredGaps }); @@ -88,8 +63,8 @@ describe('agent posture policy', () => { it('moves a gated method and its tools from absent to present when coverage rises', () => { const state = projectBrunchAgentState([]); - const uncovered = groundingGaps({ context: 0 }); - const covered = groundingGaps({ context: 0.5 }); + const uncovered = groundingFloorGaps({ coverage: { context: 0 } }); + const covered = groundingFloorGaps({ coverage: { context: 0.5 } }); expect(manifestsForState(state, uncovered).methods.map((entry) => entry.name)).not.toContain( 'commit-graph', @@ -105,7 +80,7 @@ describe('agent posture policy', () => { it('allows registered dev tool names only through the injected dev allow-list', () => { const state = projectBrunchAgentState([]); - const gaps = groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 }); + const gaps = groundingFloorGaps({ defaultCoverage: 0 }); const productTools = activeToolNamesForPosture({ registeredToolNames: [...registeredToolNames, 'brunch_session_query'], state, @@ -128,7 +103,7 @@ describe('agent posture policy', () => { const tools = activeToolNamesForPosture({ registeredToolNames, state, - gaps: groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 }), + gaps: groundingFloorGaps({ defaultCoverage: 0 }), devAllowedToolNames: ['bash', 'brunch_session_query'], }); @@ -157,29 +132,28 @@ describe('agent posture policy', () => { }, ]); - expect(manifestsForState(autoState, groundingGaps()).strategies.map((entry) => entry.name)).toEqual([ + expect(manifestsForState(autoState, groundingFloorGaps()).strategies.map((entry) => entry.name)).toEqual([ 'step-wise-decision-tree', 'step-wise-disambiguate', 'propose-graph', 'project-graph', ]); expect( - manifestsForState( - pinnedFreestyle, - groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 }), - ).strategies.map((entry) => entry.name), + manifestsForState(pinnedFreestyle, groundingFloorGaps({ defaultCoverage: 0 })).strategies.map( + (entry) => entry.name, + ), ).toEqual(['freestyle']); expect( activeToolNamesForPosture({ registeredToolNames, state: pinnedFreestyle, - gaps: groundingGaps(), + gaps: groundingFloorGaps(), }), ).toEqual( activeToolNamesForPosture({ registeredToolNames, state: autoState, - gaps: groundingGaps(), + gaps: groundingFloorGaps(), }), ); }); @@ -204,7 +178,7 @@ describe('agent posture policy', () => { }, ]); - expect(() => manifestsForState(state, groundingGaps({ thesis: 0 }))).toThrow( + expect(() => manifestsForState(state, groundingFloorGaps({ coverage: { thesis: 0 } }))).toThrow( 'Pinned goal "commit-converge" is not legal for elicitor in elicit; capability-readiness returned negotiate for current elicitation gaps.', ); }); diff --git a/src/.pi/extensions/runtime/authority-matrix.test.ts b/src/.pi/extensions/runtime/authority-matrix.test.ts index 2496a53d..c7d2feb7 100644 --- a/src/.pi/extensions/runtime/authority-matrix.test.ts +++ b/src/.pi/extensions/runtime/authority-matrix.test.ts @@ -2,8 +2,7 @@ import type { ExtensionAPI } from '@earendil-works/pi-coding-agent'; import { describe, expect, it } from 'vitest'; import type { CommandResult } from '../../../graph/command-executor.js'; -import type { ElicitationGap } from '../../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../../../graph/schema/elicitation-gap-fixtures.js'; import { isToolBlockedForRuntimeState, TOOL_POLICY_DEFINITIONS, @@ -23,25 +22,7 @@ const REGISTERED_POC_TOOLS = [ 'mutate_graph', ] as const; -function gap(refersTo: NodeKind): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage: 0, - answered: false, - disposition: 'open', - createdAtLsn: 1, - }; -} - -const uncoveredGaps = ['context', 'thesis', 'goal', 'constraint'].map((kind) => gap(kind as NodeKind)); +const uncoveredGaps = groundingFloorGaps({ defaultCoverage: 0 }); function piWithRegisteredTools(toolNames: readonly string[]): ExtensionAPI { return { diff --git a/src/.pi/extensions/runtime/index.ts b/src/.pi/extensions/runtime/index.ts index 6cedfacb..e3f15af9 100644 --- a/src/.pi/extensions/runtime/index.ts +++ b/src/.pi/extensions/runtime/index.ts @@ -17,8 +17,8 @@ import { } from '@earendil-works/pi-coding-agent'; import { Text } from '@earendil-works/pi-tui'; +import { groundingFloorGaps } from '../../../graph/schema/elicitation-gap-fixtures.js'; import type { ElicitationGap } from '../../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../../graph/schema/nodes.js'; import { isToolBlockedForRuntimeState, toolPolicyForRuntimeState, @@ -110,25 +110,13 @@ function applyBrunchToolPolicy( * live selected-spec path. */ export function conservativeUncoveredFloorGaps(): readonly ElicitationGap[] { - return (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => gap(kind)); -} - -function gap(refersTo: NodeKind): ElicitationGap { - return { - id: `${refersTo}:runtime-policy-fallback`, + return groundingFloorGaps({ + defaultCoverage: 0, specId: 0, - refersTo, - question: `${refersTo} question`, + idSuffix: 'runtime-policy-fallback', rationale: 'Conservative fallback before selected-spec gaps are available.', - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage: 0, - answered: false, - disposition: 'open', createdAtLsn: 0, - }; + }); } interface TextLikeContent { diff --git a/src/app/brunch-tui.test.ts b/src/app/brunch-tui.test.ts index 8caf36fe..e9c856e8 100644 --- a/src/app/brunch-tui.test.ts +++ b/src/app/brunch-tui.test.ts @@ -29,8 +29,7 @@ import { } from '../.pi/brunch-pi-extensions.js'; import { createBrunchPiSettings } from '../.pi/brunch-pi-settings.js'; import { openWorkspaceGraphRuntime } from '../graph/index.js'; -import type { ElicitationGap } from '../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../graph/schema/elicitation-gap-fixtures.js'; import { userMessage } from '../probes/test-helpers.js'; import { createProductUpdatePublisher } from '../rpc/product-updates.js'; import { @@ -1638,26 +1637,11 @@ function inventoryWithWorkspace(workspace: WorkspaceSessionReadyState): Workspac } function stubPromptGraphReads() { - const gap = (refersTo: NodeKind): ElicitationGap => ({ - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage: 1, - answered: true, - disposition: 'answered', - createdAtLsn: 1, - }); return { queryGraph: () => ({ lsn: 1, nodes: [], edges: [] }), getNodes: () => [], resolveNodeCode: () => undefined, - getElicitationGaps: () => (['context', 'thesis', 'goal', 'constraint'] as const).map((kind) => gap(kind)), + getElicitationGaps: () => groundingFloorGaps(), }; } diff --git a/src/graph/README.md b/src/graph/README.md index d7da81b4..0e1c8a0f 100644 --- a/src/graph/README.md +++ b/src/graph/README.md @@ -168,6 +168,9 @@ graph/ kinds.ts zero-import domain enum taxonomy leaf elicitation-gaps.ts + elicitation-gap-fixtures.ts + synthetic gap builders (presenceGap, groundingFloorGaps); production + fail-closed floor + test fixtures ride the same shape nodes.ts edges.ts reconciliation-need.ts diff --git a/src/graph/schema/elicitation-gap-fixtures.ts b/src/graph/schema/elicitation-gap-fixtures.ts new file mode 100644 index 00000000..44147363 --- /dev/null +++ b/src/graph/schema/elicitation-gap-fixtures.ts @@ -0,0 +1,80 @@ +/** + * Synthetic elicitation-gap builders. + * + * Single owner of the synthetic `ElicitationGap` shape used by production + * fail-closed composition points (the runtime extension's + * `conservativeUncoveredFloorGaps` rides `groundingFloorGaps`) and by test + * fixtures across projections, session, agents, and app layers. Production + * owns the shape; tests import it — never the reverse. Not test-only code. + */ + +import type { Lsn } from '../atoms.js'; +import type { ElicitationGap } from './elicitation-gaps.js'; +import type { NodeKind } from './nodes.js'; + +/** Node kinds the conservative grounding floor spans. */ +export const GROUNDING_FLOOR_KINDS = [ + 'context', + 'thesis', + 'goal', + 'constraint', +] as const satisfies readonly NodeKind[]; + +export type ElicitationGapSeed = Partial & Pick; + +/** + * Build one synthetic presence-predicate gap. `coverage` defaults to 1 + * (covered); `answered` and `disposition` derive from coverage unless + * overridden explicitly through the seed. + */ +export function presenceGap(seed: ElicitationGapSeed): ElicitationGap { + const { refersTo } = seed; + const coverage = seed.coverage ?? 1; + return { + id: `${refersTo}:gap`, + specId: 1, + question: `${refersTo} question`, + rationale: `${refersTo} rationale`, + basis: 'implicit', + band: 'grounding', + predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, + importance: 1, + coverage, + answered: coverage >= 1, + disposition: coverage >= 1 ? 'answered' : 'open', + createdAtLsn: 1, + ...seed, + }; +} + +export interface GroundingFloorGapsOptions { + readonly kinds?: readonly NodeKind[]; + /** Per-kind coverage; kinds absent from the map fall back to `defaultCoverage`. */ + readonly coverage?: Readonly>>; + /** Coverage for kinds not named in `coverage`. Defaults to 1 (covered). */ + readonly defaultCoverage?: number; + readonly specId?: number; + /** Gap ids become `${kind}:${idSuffix}`; defaults to `${kind}:gap`. */ + readonly idSuffix?: string; + /** Shared rationale; defaults to `${kind} rationale` per gap. */ + readonly rationale?: string; + readonly createdAtLsn?: Lsn; +} + +/** + * Build one presence gap per grounding-floor node kind (or per `kinds`), + * with a per-kind coverage knob. + */ +export function groundingFloorGaps(options: GroundingFloorGapsOptions = {}): ElicitationGap[] { + const kinds = options.kinds ?? GROUNDING_FLOOR_KINDS; + return kinds.map((kind) => + presenceGap({ + refersTo: kind, + coverage: options.coverage?.[kind] ?? options.defaultCoverage ?? 1, + ...(options.idSuffix === undefined ? {} : { id: `${kind}:${options.idSuffix}` }), + ...(options.specId === undefined ? {} : { specId: options.specId }), + ...(options.rationale === undefined ? {} : { rationale: options.rationale }), + ...(options.createdAtLsn === undefined ? {} : { createdAtLsn: options.createdAtLsn }), + }), + ); +} diff --git a/src/projections/session/affordances.test.ts b/src/projections/session/affordances.test.ts index 3e5787f1..9b797228 100644 --- a/src/projections/session/affordances.test.ts +++ b/src/projections/session/affordances.test.ts @@ -3,8 +3,7 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../../graph/schema/elicitation-gap-fixtures.js'; import { DEFAULT_BRUNCH_AGENT_STATE } from '../../session/runtime-state.js'; import { affordances } from './affordances.js'; import { axisOptionsForRuntimeState } from './runtime-policy.js'; @@ -14,33 +13,9 @@ function resolved(overrides: Partial = {}) { return resolveBrunchAgentState({ ...DEFAULT_BRUNCH_AGENT_STATE, ...overrides }); } -function gap(refersTo: NodeKind, coverage: number): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage, - answered: coverage >= 1, - disposition: coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - -function groundingGaps(coverage: Partial> = {}): ElicitationGap[] { - return ['context', 'thesis', 'goal', 'constraint'].map((kind) => - gap(kind as NodeKind, coverage[kind as NodeKind] ?? 1), - ); -} - describe('runtime affordances derivation', () => { it('reports legal options and default-on-switch values for every posture axis', () => { - expect(affordances(resolved(), groundingGaps())).toEqual({ + expect(affordances(resolved(), groundingFloorGaps())).toEqual({ goal: { selection: 'grounding-advance', legalOptions: ['grounding-advance', 'elicit-expand', 'commit-converge', 'capture-posture'], @@ -60,14 +35,15 @@ describe('runtime affordances derivation', () => { }); it('keeps floor options legal when relevant gaps have zero coverage', () => { - const derived = affordances(resolved(), groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 })); + const derived = affordances(resolved(), groundingFloorGaps({ defaultCoverage: 0 })); expect(derived.goal.legalOptions).toEqual(['grounding-advance', 'capture-posture']); expect(derived.strategy.legalOptions).toEqual(['step-wise-decision-tree', 'step-wise-disambiguate']); expect(derived.lens.legalOptions).toEqual(['intent']); expect( - affordances(resolved({ agentStrategy: 'freestyle' }), groundingGaps({ context: 0 })).strategy, + affordances(resolved({ agentStrategy: 'freestyle' }), groundingFloorGaps({ coverage: { context: 0 } })) + .strategy, ).toEqual({ selection: 'freestyle', legalOptions: ['freestyle', 'step-wise-decision-tree', 'step-wise-disambiguate'], @@ -76,10 +52,7 @@ describe('runtime affordances derivation', () => { }); it('excludes gated options until capability-relevant gaps are covered', () => { - const uncovered = affordances( - resolved(), - groundingGaps({ context: 0, thesis: 0, goal: 0, constraint: 0 }), - ); + const uncovered = affordances(resolved(), groundingFloorGaps({ defaultCoverage: 0 })); expect(uncovered.goal.legalOptions).not.toContain('elicit-expand'); expect(uncovered.goal.legalOptions).not.toContain('commit-converge'); @@ -90,17 +63,19 @@ describe('runtime affordances derivation', () => { }); it('moves gated options from absent to present when gap coverage rises', () => { - const uncovered = affordances(resolved(), groundingGaps({ context: 0 })).strategy.legalOptions; - const covered = affordances(resolved(), groundingGaps({ context: 0.5 })).strategy.legalOptions; + const uncovered = affordances(resolved(), groundingFloorGaps({ coverage: { context: 0 } })).strategy + .legalOptions; + const covered = affordances(resolved(), groundingFloorGaps({ coverage: { context: 0.5 } })).strategy + .legalOptions; expect(uncovered).not.toContain('propose-graph'); expect(covered).toContain('propose-graph'); }); it('excludes freestyle from AUTO strategy affordances but reports a pinned legal strategy', () => { - expect(affordances(resolved(), groundingGaps()).strategy.legalOptions).not.toContain('freestyle'); + expect(affordances(resolved(), groundingFloorGaps()).strategy.legalOptions).not.toContain('freestyle'); - expect(affordances(resolved({ agentStrategy: 'freestyle' }), groundingGaps()).strategy).toEqual({ + expect(affordances(resolved({ agentStrategy: 'freestyle' }), groundingFloorGaps()).strategy).toEqual({ selection: 'freestyle', legalOptions: [ 'freestyle', @@ -116,7 +91,7 @@ describe('runtime affordances derivation', () => { it('fails loud when a gated option requires a kind absent from the register (config bug, not uncovered)', () => { // A capability-relevant kind missing from the gap register is a seeding/config bug; // the affordance projection must surface it, not silently omit the option. - const missingThesis = groundingGaps().filter((g) => g.refersTo !== 'thesis'); + const missingThesis = groundingFloorGaps().filter((g) => g.refersTo !== 'thesis'); expect(() => axisOptionsForRuntimeState('strategy', resolved(), missingThesis)).toThrow( /no elicitation gap for thesis/, ); @@ -127,7 +102,9 @@ describe('runtime affordances derivation', () => { }); it('derives per-axis legal options without grade-gate symbols', () => { - expect(axisOptionsForRuntimeState('lens', resolved(), groundingGaps({ thesis: 0 }))).toEqual(['intent']); + expect( + axisOptionsForRuntimeState('lens', resolved(), groundingFloorGaps({ coverage: { thesis: 0 } })), + ).toEqual(['intent']); for (const fileName of ['affordances.ts', 'runtime-policy.ts']) { const sourcePath = fileURLToPath(new URL(`./${fileName}`, import.meta.url)); diff --git a/src/projections/session/capability-readiness.test.ts b/src/projections/session/capability-readiness.test.ts index 015b38d0..0586fdba 100644 --- a/src/projections/session/capability-readiness.test.ts +++ b/src/projections/session/capability-readiness.test.ts @@ -6,40 +6,13 @@ import { describe, expect, it } from 'vitest'; import { createDb, type BrunchDb } from '../../db/connection.js'; import { CommandExecutor } from '../../graph/command-executor.js'; import { getElicitationGaps } from '../../graph/queries.js'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../../graph/schema/nodes.js'; +import { groundingFloorGaps, presenceGap } from '../../graph/schema/elicitation-gap-fixtures.js'; import { CAPABILITY_RELEVANT_GAPS, evaluateCapabilityReadiness, type CapabilityReadinessOutcome, } from './capability-readiness.js'; -function gap( - overrides: Partial & Pick, -): ElicitationGap { - return { - id: overrides.id ?? `${overrides.refersTo}:${overrides.question ?? 'gap'}`, - specId: overrides.specId ?? 1, - refersTo: overrides.refersTo, - question: overrides.question ?? `${overrides.refersTo} question`, - rationale: overrides.rationale ?? `${overrides.refersTo} rationale`, - basis: overrides.basis ?? 'implicit', - band: overrides.band ?? 'grounding', - predicate: overrides.predicate ?? { kind: 'presence', minimum: 1, nodeKind: overrides.refersTo }, - importance: overrides.importance ?? 1, - coverage: overrides.coverage, - answered: overrides.answered ?? overrides.coverage >= 1, - disposition: overrides.disposition ?? (overrides.coverage >= 1 ? 'answered' : 'open'), - createdAtLsn: overrides.createdAtLsn ?? 1, - }; -} - -function floorGaps(coverage: Partial> = {}): ElicitationGap[] { - return ['context', 'thesis', 'goal', 'constraint'].map((kind) => - gap({ refersTo: kind as NodeKind, coverage: coverage[kind as NodeKind] ?? 1 }), - ); -} - function expectOutcomeStatus( outcome: CapabilityReadinessOutcome, status: CapabilityReadinessOutcome['status'], @@ -62,13 +35,16 @@ describe('capability readiness over elicitation gaps', () => { }); it('proceeds when all relevant gaps are covered', () => { - const outcome = evaluateCapabilityReadiness('propose-graph', floorGaps()); + const outcome = evaluateCapabilityReadiness('propose-graph', groundingFloorGaps()); expect(outcome).toEqual({ status: 'proceed' }); }); it('negotiates with establishment-offer-shaped missing gaps when relevant grounding gaps are uncovered', () => { - const outcome = evaluateCapabilityReadiness('project-graph', floorGaps({ thesis: 0, goal: 0 })); + const outcome = evaluateCapabilityReadiness( + 'project-graph', + groundingFloorGaps({ coverage: { thesis: 0, goal: 0 } }), + ); expect(outcome.status).toBe('negotiate'); if (outcome.status !== 'negotiate') return; @@ -82,13 +58,16 @@ describe('capability readiness over elicitation gaps', () => { }); it('proceeds at low epistemic status when relevant gaps have only partial coverage', () => { - const outcome = evaluateCapabilityReadiness('generative-lens', floorGaps({ thesis: 0.5 })); + const outcome = evaluateCapabilityReadiness( + 'generative-lens', + groundingFloorGaps({ coverage: { thesis: 0.5 } }), + ); expect(outcome).toEqual({ status: 'proceed_low_epistemic', coverage: 0.875 }); }); it('fails loud when a required kind has no referring gap record', () => { - expect(() => evaluateCapabilityReadiness('propose-graph', floorGaps().slice(0, 3))).toThrow( + expect(() => evaluateCapabilityReadiness('propose-graph', groundingFloorGaps().slice(0, 3))).toThrow( /no elicitation gap for constraint/, ); }); @@ -125,14 +104,14 @@ describe('capability readiness over elicitation gaps', () => { it('proves same-kind gaps resolve independently through their own question and satisfier', () => { const outcome = evaluateCapabilityReadiness('propose-graph', [ - ...floorGaps(), - gap({ + ...groundingFloorGaps(), + presenceGap({ id: 'thesis:stakeholder', refersTo: 'thesis', question: 'Who is the primary user?', coverage: 1, }), - gap({ + presenceGap({ id: 'thesis:pain', refersTo: 'thesis', question: 'Why is this painful enough to solve now?', @@ -153,9 +132,9 @@ describe('capability readiness over elicitation gaps', () => { it('never returns a refusal outcome and does not import grade-gate symbols', () => { const outcomes = [ - evaluateCapabilityReadiness('propose-graph', floorGaps({ context: 0 })), - evaluateCapabilityReadiness('propose-graph', floorGaps({ context: 0.25 })), - evaluateCapabilityReadiness('propose-graph', floorGaps()), + evaluateCapabilityReadiness('propose-graph', groundingFloorGaps({ coverage: { context: 0 } })), + evaluateCapabilityReadiness('propose-graph', groundingFloorGaps({ coverage: { context: 0.25 } })), + evaluateCapabilityReadiness('propose-graph', groundingFloorGaps()), ]; expect(outcomes.map((outcome) => outcome.status)).toEqual([ diff --git a/src/projections/session/readiness-estimate.test.ts b/src/projections/session/readiness-estimate.test.ts index 018f3ee8..a3080aed 100644 --- a/src/projections/session/readiness-estimate.test.ts +++ b/src/projections/session/readiness-estimate.test.ts @@ -3,41 +3,16 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../../graph/schema/elicitation-gaps.js'; +import { presenceGap } from '../../graph/schema/elicitation-gap-fixtures.js'; import { READINESS_BANDS } from '../../graph/schema/kinds.js'; -import type { NodeKind, ReadinessBand } from '../../graph/schema/nodes.js'; import { readinessEstimate } from './readiness-estimate.js'; -function gap(overrides: { - readonly id?: string; - readonly band: ReadinessBand; - readonly coverage: number; - readonly importance?: number; - readonly refersTo?: NodeKind; -}): ElicitationGap { - return { - id: overrides.id ?? `${overrides.band}:${overrides.refersTo ?? 'context'}:${overrides.coverage}`, - specId: 1, - refersTo: overrides.refersTo ?? 'context', - question: `${overrides.band} question`, - rationale: `${overrides.band} rationale`, - basis: 'implicit', - band: overrides.band, - predicate: { kind: 'presence', minimum: 1, nodeKind: overrides.refersTo ?? 'context' }, - importance: overrides.importance ?? 1, - coverage: overrides.coverage, - answered: overrides.coverage >= 1, - disposition: overrides.coverage >= 1 ? 'answered' : 'open', - createdAtLsn: 1, - }; -} - describe('readiness estimate projection', () => { it('returns coverage for every readiness band', () => { const estimate = readinessEstimate([ - gap({ band: 'grounding', coverage: 1 }), - gap({ band: 'elicitation', coverage: 0.5 }), - gap({ band: 'commitment', coverage: 0.25 }), + presenceGap({ refersTo: 'context', band: 'grounding', coverage: 1 }), + presenceGap({ refersTo: 'context', band: 'elicitation', coverage: 0.5 }), + presenceGap({ refersTo: 'context', band: 'commitment', coverage: 0.25 }), ]); expect(Object.keys(estimate.coverage)).toEqual([...READINESS_BANDS]); @@ -45,7 +20,9 @@ describe('readiness estimate projection', () => { }); it('reports an empty band as zero coverage', () => { - expect(readinessEstimate([gap({ band: 'grounding', coverage: 0.75 })]).coverage).toEqual({ + expect( + readinessEstimate([presenceGap({ refersTo: 'context', band: 'grounding', coverage: 0.75 })]).coverage, + ).toEqual({ grounding: 0.75, elicitation: 0, commitment: 0, @@ -54,8 +31,8 @@ describe('readiness estimate projection', () => { it('uses an importance-weighted mean per band', () => { const estimate = readinessEstimate([ - gap({ band: 'elicitation', coverage: 1, importance: 3 }), - gap({ band: 'elicitation', coverage: 0, importance: 1 }), + presenceGap({ refersTo: 'context', band: 'elicitation', coverage: 1, importance: 3 }), + presenceGap({ refersTo: 'context', band: 'elicitation', coverage: 0, importance: 1 }), ]); expect(estimate.coverage.elicitation).toBe(0.75); @@ -63,12 +40,12 @@ describe('readiness estimate projection', () => { it('regresses honestly when gap coverage lowers and rises when coverage improves', () => { const lower = readinessEstimate([ - gap({ id: 'same', band: 'commitment', coverage: 0.25 }), - gap({ id: 'other', band: 'commitment', coverage: 0.75 }), + presenceGap({ refersTo: 'context', id: 'same', band: 'commitment', coverage: 0.25 }), + presenceGap({ refersTo: 'context', id: 'other', band: 'commitment', coverage: 0.75 }), ]); const higher = readinessEstimate([ - gap({ id: 'same', band: 'commitment', coverage: 0.75 }), - gap({ id: 'other', band: 'commitment', coverage: 0.75 }), + presenceGap({ refersTo: 'context', id: 'same', band: 'commitment', coverage: 0.75 }), + presenceGap({ refersTo: 'context', id: 'other', band: 'commitment', coverage: 0.75 }), ]); expect(lower.coverage.commitment).toBe(0.5); diff --git a/src/session/runtime-affordances-coverage.test.ts b/src/session/runtime-affordances-coverage.test.ts index c77904cc..10c16027 100644 --- a/src/session/runtime-affordances-coverage.test.ts +++ b/src/session/runtime-affordances-coverage.test.ts @@ -1,30 +1,11 @@ import { describe, expect, it } from 'vitest'; -import type { ElicitationGap } from '../graph/schema/elicitation-gaps.js'; -import type { NodeKind } from '../graph/schema/nodes.js'; +import { groundingFloorGaps } from '../graph/schema/elicitation-gap-fixtures.js'; import { affordances } from '../projections/session/affordances.js'; import { resolveBrunchAgentState } from '../projections/session/runtime-state.js'; import { sessionRpcMethods } from '../rpc/methods/session.js'; import { DEFAULT_BRUNCH_AGENT_STATE } from './runtime-state.js'; -function gap(refersTo: NodeKind): ElicitationGap { - return { - id: `${refersTo}:gap`, - specId: 1, - refersTo, - question: `${refersTo} question`, - rationale: `${refersTo} rationale`, - basis: 'implicit', - band: 'grounding', - predicate: { kind: 'presence', minimum: 1, nodeKind: refersTo }, - importance: 1, - coverage: 1, - answered: true, - disposition: 'answered', - createdAtLsn: 1, - }; -} - const runtimeAffordanceLedger = [ { row: 'goal.options', @@ -142,12 +123,7 @@ describe('runtime affordances coverage ledger', () => { }); it('covers all agent-required rows through the shared affordances derivation', () => { - const derived = affordances(resolveBrunchAgentState(DEFAULT_BRUNCH_AGENT_STATE), [ - gap('context'), - gap('thesis'), - gap('goal'), - gap('constraint'), - ]); + const derived = affordances(resolveBrunchAgentState(DEFAULT_BRUNCH_AGENT_STATE), groundingFloorGaps()); const derivedRows = Object.entries(derived).flatMap(([axis, axisAffordance]) => { const { selection: _selection, ...derivedFields } = axisAffordance; expect(Object.keys(derivedFields).sort()).toEqual(['defaultOnSwitch', 'legalOptions']); From 5a19ed54693faab82da3213c8755f204b02cb8f2 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:35:00 +0200 Subject: [PATCH 29/32] Mark typing-collapse refactor done; suspended migration item remains Co-Authored-By: Claude Fable 5 --- memory/REFACTOR.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md index a5c14546..de06c016 100644 --- a/memory/REFACTOR.md +++ b/memory/REFACTOR.md @@ -7,6 +7,14 @@ Origin: /expert-typescript-typing review of the exchanges editor seam, after the remediation talkthrough exposed the envelope vocabulary collision that misled both a bot-comment review and the original kick-classifier author. +**STATUS 2026-06-11: commits 1-4 are DONE** (two-lane worktree fan-out, merged as +5e76459c, 5eaaa1ae, 1c25357a, 2b8f1c5c; verify green, 804 tests). The single-select +editor arm was deleted per user decision. Outcome-union ownership landed stronger +than planned: `RequestOutcomeKey` is type-projected from the details-schema union +and drift-locked both directions. **This file stays alive only for the carried +suspended item below** — delete it once the stacked-branch migration fix is +verified on reintegration. + ## Problem Statement Four type-fork families, all "duplicate the owner's state space closer to where I From 02f98118dae9ec63fd08b9578837e71908e6624b Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:36:45 +0200 Subject: [PATCH 30/32] Retire REFACTOR.md; carry the migration handoff note into PLAN All refactor steps are done; the one suspended item (migration 0004 coherence, owned by the stacked successor branch) moves to PLAN's Active section so the reintegration re-check survives the file. Co-Authored-By: Claude Fable 5 --- memory/PLAN.md | 2 +- memory/REFACTOR.md | 159 --------------------------------------------- 2 files changed, 1 insertion(+), 160 deletions(-) delete mode 100644 memory/REFACTOR.md diff --git a/memory/PLAN.md b/memory/PLAN.md index fc9fe85b..f59c2c42 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -84,7 +84,7 @@ per ledger row: ### Active -- (none) — the FE-847 turn-boundary closure completed 2026-06-11 (see Turn-boundary choreography below); the review-fix remediation residue is one suspended item in `memory/REFACTOR.md` (migration 0004, handed to the stacked successor branch). +- (none) — the FE-847 turn-boundary closure completed 2026-06-11 (see Turn-boundary choreography below). One handed-off residue: migration `0004_gaps_node_kind_reference` coherence (the in-place journal-tag rewrite + the derive-with-'context'-fallback that the new read-side `predicate_kind` throw would reject) is being fixed by another agent on the stacked successor branch — on reintegration, verify that fix actually covers the concern before considering it closed; do not touch `drizzle/` on this branch meanwhile. ### Turn-boundary choreography (Tier-2 layer) diff --git a/memory/REFACTOR.md b/memory/REFACTOR.md deleted file mode 100644 index de06c016..00000000 --- a/memory/REFACTOR.md +++ /dev/null @@ -1,159 +0,0 @@ -# Refactor: source-of-truth typing collapse — structured-exchange editor seam + gap fixtures - -Created: 2026-06-11 · Temporary execution aid; delete when complete or superseded. -Supersedes: the 2026-06-11 review-fix remediation plan (commits 0bc9cf24..d596f266, -all done) — except its suspended migration item, carried forward at the bottom. -Origin: /expert-typescript-typing review of the exchanges editor seam, after the -remediation talkthrough exposed the envelope vocabulary collision that misled both -a bot-comment review and the original kick-classifier author. - -**STATUS 2026-06-11: commits 1-4 are DONE** (two-lane worktree fan-out, merged as -5e76459c, 5eaaa1ae, 1c25357a, 2b8f1c5c; verify green, 804 tests). The single-select -editor arm was deleted per user decision. Outcome-union ownership landed stronger -than planned: `RequestOutcomeKey` is type-projected from the details-schema union -and drift-locked both directions. **This file stays alive only for the carried -suspended item below** — delete it once the stacked-branch migration fix is -verified on reintegration. - -## Problem Statement - -Four type-fork families, all "duplicate the owner's state space closer to where I -happen to be working": - -1. **Two divergent editor wire envelopes for one job.** The editor-prefill pattern - exists for exactly one reason (user-confirmed): `request_choices` is the one - exchange whose response payload cannot ride Pi built-ins, and `ctx.ui.custom` - cannot cross RPC — so a JSON envelope is prefilled into `ctx.ui.editor` for the - client to edit. But two envelopes grew: the product tool's local one - (`...request_choices.editor`: response `{status, choices[], comment}`) and the - probe-only "shared" fallback (`...editor`: response `{status, answers[], note}`, - plus a single-select arm no product code reaches). Both are hand-parsed; no - schema owns either; the result envelope next door uses the same words - (`answered`/`cancelled`) with different grammar (outcome keys, not a status - string) — the trap that has now claimed two reviewers. -2. **The outcome union `'answered' | 'cancelled' | 'unavailable'` is restated** in - the projection input types, the editor envelopes (as a subset), and the session - debt-classifier's terminal-keys check — four files, zero owners, while the - request details schemas already carry these as their branch keys. -3. **The grounding-gap fixture builder is cloned across nine-plus test files**, - each hand-building the same `ElicitationGap` literal with a coverage knob, - while production's `conservativeUncoveredFloorGaps` builds the same shape - privately a tenth time. -4. **Hand-written editor-response interfaces** in both envelope sites, derivable - from the schema that should exist per (1). - -```pseudo graph (current) -schemas/request.ts ──owns──▶ zRequestChoicesDetails (outcome KEYS) -request-choices tool ──hand-writes──▶ local editor envelope + EditorResponse + parser -editor-fallback (probe-only) ──hand-writes──▶ second divergent envelope + parser + single-select arm -projections/exchanges ──restates──▶ 'answered'|'cancelled'|'unavailable' inline -session debt-classifier ──restates──▶ same three literals as key checks -9+ test files ──each hand-build──▶ ElicitationGap grounding fixtures -runtime/index ──privately builds──▶ conservativeUncoveredFloorGaps (same shape, 10th copy) -``` - -```pseudo graph (desired) -schemas/request.ts ──owns──▶ zRequestChoicesDetails - ──owns──▶ zRequestChoicesEditorEnvelope (NEW: the one wire envelope) - ──owns──▶ REQUEST_OUTCOME_KEYS / RequestOutcome (NEW: projected, not declared) -request-choices tool ──derives──▶ prefill template (satisfies) + response (z.infer) + safeParse -RPC probe ──consumes──▶ the same canonical envelope (divergent fallback deleted or converged) -projections/exchanges ──projects──▶ RequestOutcome; re-exports keys for session-side consumers -session debt-classifier ──derives or drift-tests──▶ terminal keys against the schema branches -graph/schema gaps sub-tree ──owns──▶ groundingFloorGaps({coverage}) builder -runtime floor + all test fixtures ──import──▶ that one builder -``` - -## Solution - -One owner per state space: the editor envelope, the outcome union, and the -grounding-gap fixture each get exactly one declaration site; every other -appearance becomes an import, inference, or projection. Most of the diff is -deletion. - -## Commits - -Ordered extractions-first; every commit leaves verify green. Commits 1→3 are -sequential on one seam; commit 4 is independent (parallel-safe lane for fan-out). - -1. **Extract the canonical editor-envelope schema.** Add the request-choices - editor envelope as a zod schema co-located with the request details schemas - (the product tool's current shape is canonical — it is the live contract). - The tool's prefill template is typed against the schema's input, its response - type is inferred, and its hand-written interface and parser are deleted in - favor of safeParse. Behavior-preserving; existing exchange tests unchanged. - Add one envelope round-trip test (prefill → edited response → parse → - projection into result details) as the seam's lock. -2. **Extract the outcome-union owner.** Export the outcome key list and its type - from the request schemas module (projected from the details-schema branches, - not redeclared); the projection input types and the editor envelope's - answered/cancelled subset become projections of it; re-export through the - exchanges projections layer so session-side consumers can reach it without - importing extension internals. The session debt-classifier's terminal-keys - check derives from the re-export — or, if that coupling is rejected during - build, keeps its literals and gains a drift test pinning them to the schema - branches. Either way the union has one owner. -3. **Converge or delete the probe-side envelope.** Rewrite the RPC - structured-exchange probe onto the canonical envelope and delete the - divergent fallback envelope, its parser, and its hand-written types. DECISION - GATE in-commit: the fallback's single-select arm is probe-only reachable; per - the request_choices-only rationale it should be deleted — but if the probe is - meant to prove a single-select RPC editor path, keep that arm and derive its - types instead. Confirm with the user before deleting; default is delete. -4. **Extract the grounding-gap fixture builder.** One builder with a coverage - knob, owned alongside the gaps schema; production's conservative floor rides - it (production owns the shape, tests import it — never the reverse); the - nine-plus per-test-file clones are deleted. Suite stays green as the proof. - -## Decisions - -- The product request-choices envelope is canonical; the probe-side envelope is - drift, not a second contract. -- Zod owns the editor envelope: the edited JSON returns from an agent-as-user - over RPC, which is the repo's LLM-boundary rule — this is doctrinal, not an - exception. -- The outcome union is projected from the details schemas, never redeclared; - its session-side consumption goes through the projections re-export (preferred) - or a drift test (fallback), keeping session free of extension-internal imports. -- Fixture/production convergence direction: production owns the grounding-floor - shape; fixtures import it. -- The single-select editor arm's fate is the one open decision (commit 3 gate). -- Topology READMEs: add the two-envelope rationale (why the editor channel - exists at all: ctx.ui.custom cannot cross RPC; Pi built-ins cover the other - request shapes; multi-choice is the one payload needing it) to the exchanges - directory README in the same commit as the schema extraction — that note is - the trap-prevention payload of this whole refactor. - -## Testing Decisions - -- Behavior-preservation is the rule for commits 1, 2, 4: existing - structured-exchange, schema, and gap tests pass unchanged; only their imports - move. -- The new envelope round-trip test (commit 1) is the only net-new oracle: it - proves prefill, parse, and projection share one schema, which is the property - whose absence caused both review failures. -- If the drift-test fallback is chosen in commit 2, it asserts the classifier's - key literals equal the schema branch keys — same pattern as the existing - observed-shapes drift guards. -- Prior art: the schemas module's existing zod-parse-at-projection idiom - (`zRequestChoicesDetails.parse` in projections) and the - observed-shapes-coverage drift test. - -## Out of Scope - -- The PI_OFFLINE dev-default question — parked, low stakes: the TUI-branding - concern (Pi's version-check interjection, not suppressed by quietStartup) is - now served unconditionally by the PI_SKIP_VERSION_CHECK default from the - remediation pass, decoupling it from PI_OFFLINE entirely. Decide only when a - dev loop actually wants provider-reachable TUI launches. -- The ln-sync canonical-doc pass (D35-L startup-header alignment, stale - runtime-state-commands card, live-vs-harness blind-spot row, graduating the - two induct lenses into ln-review). -- request_answer's plain-string editor use — not an envelope, nothing to unify. - -## Carried forward — SUSPENDED (from the completed remediation plan) - -- **Migration 0004 coherence:** another agent is fixing the 0004 migration on - the branch stacked on top of this one. Do not touch drizzle/ here. On - reintegration, verify the derive-with-'context'-fallback vs read-side-throw - concern was actually covered there before deleting this note. From 7b01de4d85926ac9a11abe843f5404b7ec98d335 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:49:37 +0200 Subject: [PATCH 31/32] ln-sync: reconcile canonical docs after the FE-847 closure arc SPEC: I45/I46/I47 invariant and verification-design rows flip from planned/coverage-first-scaffold to covered with 2026-06-11 evidence; D35-L reconciled to the shipped, test-locked startup-header behavior (every non-cancel activation headers; resume/open-stay-quiet clause superseded; expand affordance removed until an input path exists); A27-L gains the predicate-hardening evidence (gapPredicateSupport owner, loud field/coverage rejection, presence kind-floor dedup, hydration consistency); new Acknowledged Blind Spots row for live-vs-harness wiring divergence with its mitigations and revisit trigger. PLAN: 12 done frontier definitions archived to PLAN_HISTORY as dated pointer bullets (835 -> 543 lines); completed Sequencing subsections collapsed into a Recently Completed section; stale active-track reference repaired. GC: stale memory/cards/tooling--runtime-state-commands.md deleted (pickers/overlays shipped; the card's non-scope claims were drift). READMEs: src/dev Tier-2 harness ledger gains the resume/reboot chassis entries and the scaffold-fully-live note. Co-Authored-By: Claude Fable 5 --- docs/archive/PLAN_HISTORY.md | 17 + memory/PLAN.md | 305 +----------------- memory/SPEC.md | 17 +- .../cards/tooling--runtime-state-commands.md | 136 -------- src/dev/README.md | 2 +- 5 files changed, 33 insertions(+), 444 deletions(-) delete mode 100644 memory/cards/tooling--runtime-state-commands.md diff --git a/docs/archive/PLAN_HISTORY.md b/docs/archive/PLAN_HISTORY.md index 89837dcc..c48191fd 100644 --- a/docs/archive/PLAN_HISTORY.md +++ b/docs/archive/PLAN_HISTORY.md @@ -149,3 +149,20 @@ Archived from `memory/PLAN.md` so the live plan only carries active, next, horiz - **Traceability:** R4, R8, R11, R12, R16, R17 / D5-L, D10-L, D12-L, D13-L, D19-L, D24-L, D33-L / I19-L, I21-L - **Design docs:** [prd.md §M3, §Frontend Architecture](file:///Users/lunelson/Code/hashintel/brunch-next/docs/architecture/prd.md) - **Current execution pointer:** complete. M3 tied off with shared JSON-RPC protocol helpers/dispatch semantics, `ws`-backed `/rpc` transport, persistent browser RPC client with protocol-failure hardening, canonical built asset serving with traversal-safe asset resolution, stable React runtime, explicit read-only session projection by durable session id through a canonical Brunch session-envelope reader with strict self-description validation, explicit transcript custom-entry classifiers, and read-only browser transcript rendering of assistant/user rows plus transcript-native prompt display rows from typed `{ sessionId, specId }` targets. Automated verification and direct HTTP/WebSocket projection postconditions pass. Accepted outer-loop deferral: qualitative browser-open smoke remains environment-blocked because `agent-browser` cannot create its socket directory under the current macOS sandbox (`Operation not permitted`); this does not block M3 tie-off because static HTML serving, absence of HTTP product reads, explicit `{ sessionId, specId }` WebSocket RPC reads, transcript-display text including custom prompt rows, and exchange projection were rechecked directly against the host. + +## 2026-06-11 Sync archive + +Archived from `memory/PLAN.md` after the FE-847 turn-boundary closure, review-fix remediation, and typing-collapse passes became the live completion window. Full definitions retired; pointers preserved below. + +- `dx-tier-2-harness` (FE-847) — Done 2026-06-10 with 2026-06-11 closure on FE-847. The real `runBrunchTui` boot chassis, faux-turn payload/transcript oracle, fixture resume path, skipped I45-L–I47-L scaffold, and topology stubs are in place; the final follow-on tightened Tier-1 proof so Brunch-configured faux sessions now own the definitive provider-facing prompt/tool payload assertion. +- `project-graph-review-cycle` (FE-809) — Done 2026-06-06. Structured-exchange schema/emission lock and approval wiring are complete, and `.fixtures/runs/project-graph-review-cycle/2026-06-06-project-graph-review-cycle/` proves the real `project-graph` agent path: selected-spec graph read, dry-run-gated `present_review_set`, public-RPC approval through `session.submitExchangeResponse`, one explicit-basis `acceptReviewSet` graph commit, and graph invalidations with `{specId, lsn}`. The probe also fixed a real policy gap: commitment-grade `generate-proposal` now activates `present_review_set` / `request_review` for the Brunch runtime tool posture. +- `elicitation-backlog` (FE-823) — Done 2026-06-08 on FE-823. Materialized `elicitation_backlog` as a flat table plus generated migration, seeded grounding questions at `createSpec`, routed create/close mutations through `CommandExecutor` on the shared spec-local LSN/change-log seam, and added graph-owned per-spec read-back. The remaining prompt-resource body pass stays in `memory/CROSS_CUT_PLAN.md` as temporary coverage completion work; the live per-turn driver remains a follow-on, not frontier completion debt. +- `elicitation-gaps-remodel` (FE-531) — Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. **2026-06-11 predicate-hardening follow-on landed:** `field`/`coverage` gap predicates now reject loudly until derivation exists, open presence gaps dedupe by `(specId, nodeKind)`, and gap hydration fails on `predicate_kind` / predicate JSON divergence instead of silently reading an inconsistent row. +- `gaps-node-kind-reference` — Done 2026-06-10. Replaced gap `name` with `refersTo: NodeKind` + `question` across schema, DB, `CommandExecutor`, reads, and capability-readiness; added migration `0004_gaps_node_kind_reference`; reseeded grounding by node kind (`context`, `thesis`, `goal`, `constraint`, plus `term`/`assumption`); proved live presence coverage still flips, required-kind absence fails loud, and two `thesis` gaps discriminate independently by question+satisfier. Topology READMEs reconciled. +- `minimal-authority-shell` (FE-810) — Done 2026-06-08. Added `src/.pi/extensions/runtime/authority-matrix.test.ts` as the minimal authority guard: it locks the `CommandResult` discriminant vocabulary (including structured `needs_human` representability), proves `elicit-read-only` derives allowed/blocked tool authority from the shared projected runtime policy, and verifies the POC side-effecting tools (`bash`, `edit`, `write`) are not reachable in `elicit`. No standalone authority service was introduced, `src/.pi/agents/state.ts` stayed untouched, and A18-L strict built-in suppression remains named residue rather than closed. +- `graph-observed-shapes` — Done 2026-06-08. `src/graph/README.md` now owns the closed observed-shape ledger: `read_graph` requires the six agent shapes, RPC and web require only `overview` + `neighborhood`, `list_by_kind` / `list_by_band` remain web-eligible deferred, and register reads remain deferred until a per-turn driver/consumer needs them. `src/graph/observed-shapes-coverage.test.ts` guards the tool/RPC/web required subsets; no transport shape shipped in this frontier. +- `runtime-affordances-and-legality` — Done 2026-06-08. `src/projections/session/affordances.ts` now owns the shared `(resolvedState, readinessGrade)` derivation for legal goal/strategy/lens options plus default-on-switch values, reusing the same grade/AUTO legality source consumed by `.pi/agents/state.ts`; `src/session/README.md` owns the closed coverage ledger and `src/session/runtime-affordances-coverage.test.ts` guards required agent/RPC rows while leaving `active-review-set` and `turn-mode` as explicit product-state-gated deferrals. +- `role-safe-graph-mutations` — Done 2026-06-09. `CommandExecutor` now exposes one public authored mutation seam (`mutateGraph` / `dryRunMutateGraph`) over the extracted planner/writer modules; direct tool writes, review-set acceptance, capture, seed loading, and dev curation all converge on that grammar. The dev-only RPC boundary is now `dev.graph.mutateGraph`, using role-named create-edge ops plus projected node-code / selected-spec edge-id resolution before it enters `CommandExecutor`. Follow-up closure on the same date: the product probes now prompt for and parse `mutate_graph`, current docs describe `mutate_graph` as the active tool, the checked-in 2026-06-05 fixture-curation run is labeled historical pre-migration `commit_graph` evidence, and schema coverage guards the authored edge surfaces against endpoint-role drift. +- `dx-feedback-loops` (FE-825) — Done 2026-06-09. The chain landed the latest-pi bump and `PI_SOURCE`-gated runtime alias, the `src/dev/` faux front door and shared faux harness, and the dev-gated read-only introspection extension plus paired run-artifact launcher. Verification: `npm run verify` (608 tests, tsc build, web build). The follow-on frontier `dx-introspection-live` is now also done: the real TUI wiring, `--cwd` launch surface, unified `BRUNCH_DEV` gate, dev query tools, and workspace-local `.brunch/debug/` cache all landed on 2026-06-11. +- `dx-introspection-live` (FE-825) — Done 2026-06-11. Slices 1-2, the dev-query active-tool follow-on, and the workspace debug-cache chain are done: `BRUNCH_DEV` real TUI launches can mirror the latest final system prompt and append explicit Brunch-owned text tool-result content into launch-cwd `.brunch/debug/` while repo-root `.fixtures/scratch/` remains the durable paired-run artifact path. `tool-renders` flattening remains explicitly deferred until a concrete renderer-debugging need appears. +- `web-design-system-port` — (no pointer found) diff --git a/memory/PLAN.md b/memory/PLAN.md index f59c2c42..5115b8b8 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -86,23 +86,16 @@ per ledger row: - (none) — the FE-847 turn-boundary closure completed 2026-06-11 (see Turn-boundary choreography below). One handed-off residue: migration `0004_gaps_node_kind_reference` coherence (the in-place journal-tag rewrite + the derive-with-'context'-fallback that the new read-side `predicate_kind` throw would reject) is being fixed by another agent on the stacked successor branch — on reintegration, verify that fix actually covers the concern before considering it closed; do not touch `drizzle/` on this branch meanwhile. -### Turn-boundary choreography (Tier-2 layer) +### Recently Completed -Core runtime choreography specced/scoped now (Context §Turn-boundary choreography; SPEC D76-L–D78-L, I45-L–I47-L). FE-847 lays the chassis + coverage-first scaffold; the product write-side then fills the scaffold slice by slice. **Branch-mechanics override (user, 2026-06-11): `dx-tier-2-harness` stays on `ln/fe-847-dx-introspection-tier-2`, while the remaining product closures (`turn-boundary-reconciliation` and `kick-and-context-seeding`) continue together on the stacked successor branch `ln/fe-847-turn-boundary-closure`.** This is a stack-management exception only: same FE-847 issue, same sequential closure, no new frontier or Linear split. Each grouping still flips its own scaffold tests live. - -1. `turn-boundary-reconciliation` (M7 product mechanics; slice group on FE-847) — S1 assistant-visible watermark projection (D76-L), S2 the `prepareNextTurn` one-writer reconciler + `worldUpdate` + own-write/full-overview watermark stamping (D77-L), S3 submit-time mention ledger + staleness (I9-L). Carries its share of S5 (carrier discipline / no-redundant-`worldUpdate`-after-seed idempotence, I47-L). -2. `kick-and-context-seeding` (product mechanics; slice group on FE-847) — **sequenced after `turn-boundary-reconciliation` S1/S2** (the seed must advance the watermark and the kick decision interacts with reconciler-inserted notices). S4 honest assistant-origination behind `session.triggerExchange` (`startAssistantTurn({ origin })`) + boot/resume context seeding (D78-L). Carries its share of S5 (boot/resume seed idempotence, pre-reconcile-tail kick policy, I46-L/I47-L). - -### Readiness & elicitation-gaps remodel (recommended ahead of the trio) - -Post-`ln-spec` implications that are **upstream** of the context-pipeline trio's readiness/chrome-touching locks (see Context §Readiness / elicitation-gaps remodel). Land the hard chain before stage 1 freezes `workspace/workspace-state` + `session/runtime-state` shapes, or bracket those fields in the trio. - -1. `gaps-node-kind-reference` — **done 2026-06-10.** Reshaped the gaps substrate onto node kinds per D75-L: `refersTo: NodeKind` + a free-form `question` replaced the typology `name` enum; reseeded grounding by node kind (floor `context`/`thesis`/`goal`/`constraint` plus `term`/`assumption`); `capability → NodeKind[]` replaced `RelevantGapName`. Absorbed the retired refactor plan (folded into D75-L). -2. `capability-readiness` — **done 2026-06-11 (depends on `gaps-node-kind-reference`, done).** Runtime affordances, method/manifest/tool legality, soft derived readiness estimate projection, agent-prompt display, workspace/chrome display, and the stored-grade deletion sweep now read `ElicitationGap[]` / gap coverage rather than a persisted grade. `specs.readiness_grade`, `updateReadinessGrade`, `READINESS_GRADES`, residual grade prompt carriers, and fixture/probe grade setup are retired. +- 2026-06-11 **Turn-boundary choreography (Tier-2 layer) complete** — `turn-boundary-reconciliation` and `kick-and-context-seeding` both done on FE-847 (`ln/fe-847-turn-boundary-closure`); every I45/I46/I47 scaffold row runs live through real boot/restart; full definitions retained below for re-entry. The same branch carried the review-fix remediation (PR-comment defects fixed at top of stack, user-routed) and the typing-collapse refactor (canonical editor envelope, projected outcome union, shared grounding-gap fixture builder). +- 2026-06-11 `capability-readiness` — done after the grade-deletion sweep plus the remediation's live-wiring closure (required `getElicitationGaps`, conservative-fallback deletion, Tier-2 legality oracle); definition retained below. +- 2026-06-10 `gaps-node-kind-reference` — done; gaps reference node kinds per D75-L (definition archived). +- Older completed frontiers: `docs/archive/PLAN_HISTORY.md` (12 definitions archived 2026-06-11). ### Next -The near-term spine has two tracks. The **context-pipeline coverage trio** remains the elevated product-coverage spine, sequenced in strict dependency order (lock upstream shape before downstream output). `role-safe-graph-mutations` is a graph-mutation grammar frontier that can run before or alongside the trio, and must land before relation-bearing generalized capture or semantic fixture curation rely on the new mutation surface. The `dx-feedback-loops` DX substrate and its `dx-introspection-live` follow-on are complete and no longer gate this list; the remaining FE-847 closure work is the active parallel product track. +The near-term spine has two tracks. The **context-pipeline coverage trio** remains the elevated product-coverage spine, sequenced in strict dependency order (lock upstream shape before downstream output). `role-safe-graph-mutations` is a graph-mutation grammar frontier that can run before or alongside the trio, and must land before relation-bearing generalized capture or semantic fixture curation rely on the new mutation surface. The `dx-feedback-loops` DX substrate, its `dx-introspection-live` follow-on, and the FE-847 turn-boundary closure are all complete and no longer gate this list. 1. `projection-shape-coverage` — **PROJECT stage** (`#project`); invariant / no-loss kind. Ledger authored in `src/projections/README.md`. Two sub-steps: (a) **PULL-session prerequisite** — ledger the session read surface (`session/workspace-context`, `workspace-session-coordinator`, `runtime-state`) the session/workspace projections lock against; (b) **earns-its-place audit then lock** — delete/inline the `✗` indirection (`workspace/workspace-context`: single-consumer tag wrapper), resolve the `◐` exchange family (direct-lock vs keep-transitive), and add a shape/no-loss invariant to each `●` survivor (`graph/neighborhood`, `session/transcript-context`, `session/runtime-state`, `workspace/workspace-state`). The graph projection stubs (`overview`, `commit-result`, `reconciliation-needs`) are `export {}` topology stubs, **not** dark implementations — leave them. Upstream of everything else in the trio; do this first so renderer goldens lock against stable shapes. 2. `renderer-golden-coverage` — **RENDER stage** (`#render`); golden + invariant kind. **Depends on `projection-shape-coverage`.** Create the renderer ledger (README claims one that does not exist), extend the preview harness past `graph-neighborhood`, and golden-lock every durable renderer (only `graph/neighborhood` + `session/runtime-frame` are locked; the rest are dark or only transitively covered via the `.pi` adapter). @@ -137,32 +130,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai ## Frontier Definitions -### dx-tier-2-harness - -- **Name:** Tier-2 DX chassis — real-boot + faux-turn + payload/transcript oracle + fixture resume -- **Linear:** FE-847 — DX introspection Tier 2 -- **Branch:** `ln/fe-847-dx-introspection-tier-2` -- **Kind:** structural / dev-substrate -- **Status:** done -- **Certainty:** proving -- **Retires:** part of A25-L — extends the DX-loop proof from faux-provider scripted turns (`dx-feedback-loops`) to a reusable *real-boot* Tier-2 chassis that captures the provider payload and inspects the resulting transcript. -- **Lights up:** A Tier-2 test chassis that did not exist — `runBrunchTui` boots for real, one faux model turn runs, the provider payload is captured, the resulting transcript is inspected, and a session resumes from a fixture transcript — the harness every turn-boundary-choreography product slice asserts its mechanics through. -- **Stabilizes:** The Tier-2 harness seam plus the coverage-first scaffold for I45-L–I47-L (the skipped invariant suite + intentional topology stubs the watermark projection, the `prepareNextTurn` reconciler, and the origination primitive will fill). -- **Objective:** Build the thin Tier-2 chassis (S0) only: (1) a real `runBrunchTui` boot path usable in test, (2) one faux model turn driven end-to-end with no network/keys, (3) provider-payload capture + transcript-inspection oracles, (4) fixture-transcript resume. Then **author the coverage-first scaffold** for the whole turn-boundary-choreography layer: the I45-L–I47-L invariant suite as `it.todo` / `describe.skip` keyed to its enabling slice, plus intentional `export {}` topology stubs (ownership comment per AGENTS.md) for the not-yet-built modules — including **one shared continuity-entry classifier** (`isWatermarkCarrier` / `isContinuityOnlyNonDebtEntry`) so S1/S2 watermark projection and S4 resume-kick classification share one taxonomy of carrier vs. continuity-only-non-debt vs. debt-bearing entries rather than duplicating hardcoded lists. The scaffold's first tests must encode the three SPEC edge cases — seed/full-overview snapshots advance the watermark while narrow reads do not; no redundant `worldUpdate` immediately after a seed naming the current snapshot LSN; the resume kick decision is taken on the pre-reconcile tail (a user tail still earns a kick after the reconciler inserts seed/staleness notices) — and assert `worldUpdate.items` / watermark / kick outcomes as **sets and `{specId, lsn}` properties, not payload-order goldens** (no canonical item sort is specified), so the suite stays deterministic. -- **Why now / unlocks:** The user has elevated the turn-boundary-choreography layer to core mechanics and wants the proving infrastructure laid in while the concept is fresh. The chassis is buildable now and is the harness through which S1–S5 product mechanics are proven; authoring the skipped scaffold now stops the edge cases from being lost before their slices exist. -- **Acceptance:** - - A test can boot the real `runBrunchTui` orchestration, run one faux model turn, capture the exact provider payload, and inspect the resulting transcript entries — with no network, keys, or tokens. - - A session can resume from a fixture transcript through the same chassis. - - The I45-L–I47-L invariant suite exists as skipped (`it.todo` / `describe.skip`) tests keyed to their enabling slices (`turn-boundary-reconciliation`, `kick-and-context-seeding`), and the three SPEC edge cases are each present as a named skipped case. - - Intentional topology stubs exist for the assistant-visible watermark projection, the `prepareNextTurn` reconciler, and the origination primitive — `export {}` + ownership/IO/future-callers comment per AGENTS.md. - - No product mechanics land on this frontier: the watermark/reconciler/kick modules stay stubs; `npm run verify` is green with the scaffold tests skipped (no slice lands green by leaving its own tests skipped — that obligation is on the product frontiers). -- **Verification:** Inner — chassis unit tests (boot, faux turn, payload capture, transcript inspect, fixture resume); a test asserting the scaffold suite is present-but-skipped and the topology stubs compile. The skip ledger is itself the layer's live coverage map (SPEC §Design Notes, coverage-first scaffold). -- **Cross-cutting obligations:** Preserve the D39-L sealed-profile boundary and the `dx-feedback-loops`/`dx-introspection-live` DX conventions — the chassis is a dev/test substrate, observes but does not shape product behavior, and stays distinct from `src/probes/` product-verification runs. Do not fold S1–S5 product mechanics into S0. Topology stubs follow AGENTS.md §intentional topology stubs. -- **Topology materialization:** Chassis/harness lives under `src/dev/` (Tier-2 test front door) reusing the shared faux harness; topology stubs land at their final product homes (assistant-visible watermark projection under `src/projections/session/`, the `prepareNextTurn` reconciler and origination primitive under `src/session/` per their READMEs, and the shared continuity-entry classifier at the boundary both consume — `src/projections/session/` if read-side-owned) so the dependency direction is legible before behavior exists. -- **Traceability:** D37-L, D39-L, D43-L, D68-L, D69-L, D76-L, D77-L, D78-L; A25-L; I45-L, I46-L, I47-L. -- **Design docs:** `memory/SPEC.md` D76-L–D78-L, I45-L–I47-L, §Verification Design (coverage-first scaffold design note); `src/dev/README.md`; `src/session/README.md`; `src/projections/README.md`. -- **Current execution pointer:** Done 2026-06-10 with 2026-06-11 closure on FE-847. The real `runBrunchTui` boot chassis, faux-turn payload/transcript oracle, fixture resume path, skipped I45-L–I47-L scaffold, and topology stubs are in place; the final follow-on tightened Tier-1 proof so Brunch-configured faux sessions now own the definitive provider-facing prompt/tool payload assertion. - ### turn-boundary-reconciliation - **Name:** Turn-boundary reconciliation — assistant-visible watermark, `worldUpdate`, mention staleness @@ -218,103 +185,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Design docs:** `memory/SPEC.md` D78-L, I46-L, I47-L; `src/session/README.md`. - **Current execution pointer:** Done 2026-06-11 on FE-847 (closure completed by the review-fix remediation pass). All I46/I47 Tier-2 scaffold rows run live with no skips/todos: new-session seed-then-kick through real boot; resume kick on the pre-reconcile user tail (including behind continuity notices and after earlier completed exchanges — the prior blanket exchange-result suppression was a real bug); `request_*` leaves idle against the **real** result envelope (outcome is `answered`/`cancelled`/`unavailable` key presence per `projections/exchanges`, not a status string — the prior classifier read a field that never exists and would have re-kicked answered tails); crash-after-notice reboot kicks without duplicating the seed; drains neither manufacture nor mask debt; boot/resume dedupe proven across an actual restart via `rebootTier2Runtime`. Kick origin now derives from projected transcript state (no message entries = new session), not entry counts. -### project-graph-review-cycle - -- **Name:** Project-graph review-set proposal and atomic acceptance -- **Linear:** [FE-809](https://linear.app/hash/issue/FE-809/project-graph-review-set-proposal-and-atomic-acceptance) -- **Branch:** `ln/fe-809-project-graph-review-cycle` -- **Kind:** structural / bounded feature -- **Status:** done -- **Certainty:** proving -- **Stabilizes:** I15-L, I20-L, I34-L, I40-L — exact review approval must become one explicit-basis atomic graph batch, not a path-shaped basis value or partial commit; only structurally valid review payloads may become user-reviewable. -- **Lights up:** `project-graph` proposal → dry-run-valid `present_review_set` → approval → `acceptReviewSet` graph commit. -- **Objective:** Wire the `project-graph` strategy from real agent proposal generation through `present_review_set` / `request_review`, dry-run gating, approve/request-changes/reject response handling, and atomic `acceptReviewSet` commit. -- **Why now / unlocks:** This is the P1 proposal/review story. It is only P0 if the POC demo requires user-reviewed batch graph commitments rather than direct `propose-graph` and capture paths. -- **Acceptance:** - - The agent can generate a review-set payload with required lens, epistemic status, and grounding/support metadata. - - Only dry-run-valid proposals surface as reviewable; invalid generations remain internal to retry/regeneration. - - Approve commits the entire batch through one `CommandExecutor` call, one LSN, one change-log entry, and `basis: explicit`; partial acceptance is not representable. - - Request-changes and reject are transcript-visible outcomes; request-changes can trigger a successor proposal or an explicit deferred path. - - Web/TUI can observe the proposal/decision state enough for the POC; full review UX polish may remain thin. -- **Verification:** Inner — review-set schema tests, dry-run/real-run differential tests, accept atomicity tests. Middle — structured-exchange review-cycle fixture; no-bypass checks. Outer — targeted probe: `project-graph` proposes, user approves, graph updates and web observer sees it. -- **Topology materialization:** Review payload schemas live under `.pi/extensions/exchanges` as the current structured-exchange schema seam; reusable review payload construction/rendering lives under `projections/exchanges/` and `renderers/exchanges/`; proposal validation/translation lives in `graph/` review modules; agent strategy resource lives in `.pi/skills/strategies/project-graph.md`; web observes via RPC projections. -- **Cross-cutting obligations:** Preserve D27-L: review-set proposal is a structured-exchange payload, not a standalone public review-set entity. Reviewer advisory writes remain deferred unless explicitly scoped. Existing-node references and review payloads use projected graph codes at adapter/UI boundaries, not raw DB ids. -- **Traceability:** R21, R23 / D4-L, D20-L, D26-L, D27-L, D51-L, D53-L, D62-L, D63-L / I11-L, I15-L, I20-L, I34-L, I40-L / A14-L, A16-L. -- **Design docs:** `docs/design/REVIEW_SETS.md`; `docs/design/GRAPH_MODEL.md`; `memory/SPEC.md` D27-L. -- **Current execution pointer:** Done 2026-06-06. Structured-exchange schema/emission lock and approval wiring are complete, and `.fixtures/runs/project-graph-review-cycle/2026-06-06-project-graph-review-cycle/` proves the real `project-graph` agent path: selected-spec graph read, dry-run-gated `present_review_set`, public-RPC approval through `session.submitExchangeResponse`, one explicit-basis `acceptReviewSet` graph commit, and graph invalidations with `{specId, lsn}`. The probe also fixed a real policy gap: commitment-grade `generate-proposal` now activates `present_review_set` / `request_review` for the Brunch runtime tool posture. - -### elicitation-backlog - -- **Name:** Elicitation backlog substrate and agenda read-back -- **Linear:** [FE-823](https://linear.app/hash/issue/FE-823/elicitation-backlog-substrate-and-agenda-read-back) -- **Kind:** structural / bounded feature -- **Status:** done -- **Certainty:** proving -- **Retires:** A24-L — test whether a flat prospective register is sufficient before any plane/pointer promotion. -- **Lights up:** `createSpec` seed → `CommandExecutor` backlog mutation → per-spec read-back on the real graph boundary. -- **Stabilizes:** D65-L's missing "what to ask next" substrate and the rule that prospective agenda state shares the spec-local LSN / change-log boundary. -- **Objective:** Materialize D65-L `elicitation_backlog` as a flat table routed through `CommandExecutor`, seed it at spec creation, and provide per-spec read-back so the current elicitor coverage push has a real substrate instead of a homeless driver row. -- **Why now / unlocks:** This is the remaining required elicitor-coverage row that has escaped row-sized work. Promoting it back into `PLAN.md` keeps PLAN authoritative, gives the temporary cross-cut a named completion target, and unlocks later per-turn "what to ask next" behavior without prematurely inventing either a second planning system or a graph plane. -- **Acceptance:** - - The flat table exists with a generated migration and a reconciliation-need-mirroring shape. - - Create/close operations route through `CommandExecutor`, allocate one spec-local LSN + one `change_log` row each, and return structured failures on malformed input. - - `createSpec` seeds the grounding-band starter agenda for the new spec only. - - A graph-owned read path returns open backlog entries per spec with stable fields. -- **Verification:** Inner — schema/migration and `CommandExecutor` tests for create/close/seed/LSN/change-log behavior. Middle — graph query read-back and sibling-spec isolation. Outer — none yet; the per-turn driver remains a follow-on once the substrate proves useful. -- **Cross-cutting obligations:** Preserve D4-L/D20-L command boundary, D16-L/A4-L one `{specId, lsn}` mutation clock, D63-L basis-as-provenance-directness, D52-L graph-owned table + read, and D65-L flat-table-only modeling — no graph node/plane and no unknown→unknown edges. -- **Traceability:** D4-L, D8-L, D16-L, D20-L, D52-L, D63-L, D64-L, D65-L / A24-L. -- **Design docs:** `memory/SPEC.md` D65-L; `docs/design/GRAPH_MODEL.md`. -- **Current execution pointer:** Done 2026-06-08 on FE-823. Materialized `elicitation_backlog` as a flat table plus generated migration, seeded grounding questions at `createSpec`, routed create/close mutations through `CommandExecutor` on the shared spec-local LSN/change-log seam, and added graph-owned per-spec read-back. The remaining prompt-resource body pass stays in `memory/CROSS_CUT_PLAN.md` as temporary coverage completion work; the live per-turn driver remains a follow-on, not frontier completion debt. - -### elicitation-gaps-remodel - -- **Name:** Elicitation-gaps obligation remodel (backlog → typed coverage gaps) -- **Linear:** unassigned — create in FE / brunch when the frontier starts (sibling, not under FE-531). -- **Kind:** structural / bounded feature -- **Status:** done -- **Certainty:** proving -- **Retires:** A24-L (flat-register sufficiency, now under the obligation model rather than the question-instance model) and A27-L (per-band gap-satisfaction predicate expressibility at acceptable LLM cost). -- **Lights up:** the typed coverage-obligation register — each gap carries `name` + `rationale` + `band` + `presence|field|coverage|manual` predicate + `importance` + derived `coverage` + `disposition` — replacing the FE-823 question-instance / `open|closed` backlog. -- **Stabilizes:** D65-L's gap obligation model; I30-L gap-disposition capture; the anti-shadowing line (the table holds obligation/disposition/meta only, never domain content — that lives in the graph). -- **Objective:** Remodel the FE-823 `elicitation_backlog` table/type into `elicitation_gaps`: (a) rename module/type/table (`graph/schema/elicitation-backlog.ts` → `elicitation-gaps.ts`, `ElicitationBacklogEntry` → `ElicitationGap`); (b) replace the literal `question` field with a stable `name` (typology key — machine identity + display label) plus a mandatory meta `rationale`; (c) replace `status` / `ELICITATION_BACKLOG_STATUSES` with a `disposition` enum (`open | answered | not_applicable | irrelevant | reopened`) stored only where non-derivable (scope judgments + `manual` satisficiency); (d) add a `predicate` tagged union (`presence | field | coverage | manual`); (e) split the ambiguous rating into `importance` (pre-answer weight) + derived `coverage` (post-answer strength); (f) seed the grounding band from the collated **grounding typology catalog** (floor `domain` / `protagonist` / `pain_pull` / `constraint`; progressive drivers `value` / `context_of_use` / `success_sketch` / `solution_boundary`) in `command-executor.ts`, replacing the four `*_anchor_question` literals. Pre-release posture: regenerate the migration and seed; do not preserve the backlog row shape. -- **Why now / unlocks:** D65-L reconceived the backlog as typed obligations; both `capability-readiness` and `elicitation-driver` read this remodeled substrate, so its shape must land first. It is also upstream of the context-pipeline trio's readiness/chrome-touching locks (the gaps register surfaces through projections/renderers). -- **Acceptance:** - - The table is `elicitation_gaps` with a regenerated migration; no `question` / `status` / `ELICITATION_BACKLOG_STATUSES` residue remains. - - Each gap carries name + rationale + band + predicate + importance + derived coverage + disposition. - - Structural `answered` is derived **live** from the graph (never hand-set); only scope dispositions (`not_applicable` / `irrelevant`) and `manual` satisficiency are stored. - - `createSpec` seeds the grounding typology catalog (floor + progressive drivers), not literal questions; the four `*_anchor_question` literals are gone. - - Mutations still route through `CommandExecutor` on the shared spec-local `{specId, lsn}` / `change_log` boundary; per-spec read-back returns gaps. -- **Verification:** Inner — gaps schema/disposition tests; seed-set test asserting the grounding typology catalog (floor vs progressive); CommandExecutor create / close-disposition tests; live-derived `answered` test (graph presence flips coverage with no hand-set). Middle — per-band predicate expressibility fixtures (A27-L); capture-reflection spawning an elicitation-band gap. Outer — per-spec read-back probe over a seeded spec. -- **Cross-cutting obligations:** Anti-shadowing — the table never holds domain content (which lives in the graph). Gaps commit only through `CommandExecutor` (`basis` via provenance-directness, D63-L: user-raised `explicit`, agent-inferred `implicit`). Multi-spec discipline — each gap belongs to one spec's register. -- **Traceability:** D8-L, D30-L, D57-L, D60-L, D63-L, D64-L, D65-L, D74-L / A24-L, A27-L / I30-L. Supersedes the FE-823 backlog row shape. -- **Design docs:** `memory/SPEC.md` D65-L and §Grounding typology catalog; `src/graph/README.md`; `src/db/README.md`. -- **Current execution pointer:** Done 2026-06-10. Replaced FE-823 `elicitation_backlog` with the D65-L `elicitation_gaps` obligation register, regenerated the table/migration metadata, seeded the grounding typology catalog, routed create/disposition mutations through `CommandExecutor`, and proved live `presence` coverage/answered derivation at read-back with sibling-spec isolation. `field`/`coverage` predicate derivation and `manual` LLM satisficiency remain named follow-ons for capability-readiness / later predicate slices. **Superseded in part by `gaps-node-kind-reference` (D75-L):** the grounding typology catalog and gap-`name` enum are retired in favor of `refersTo: NodeKind` + a free-form question; the flat-table substrate, predicate union, disposition, and live derivation this frontier established stand. **2026-06-11 predicate-hardening follow-on landed:** `field`/`coverage` gap predicates now reject loudly until derivation exists, open presence gaps dedupe by `(specId, nodeKind)`, and gap hydration fails on `predicate_kind` / predicate JSON divergence instead of silently reading an inconsistent row. - -### gaps-node-kind-reference - -- **Name:** Gaps reference node kinds; retire the grounding-typology vocabulary (D75-L) -- **Linear:** unassigned — create in FE / brunch when the frontier starts. -- **Kind:** structural -- **Status:** done -- **Certainty:** proving -- **Depends on:** `elicitation-gaps-remodel` (done — reshapes its `name`-typology output onto node kinds). -- **Retires:** the `GROUNDING_GAP_TYPOLOGIES` seed catalog (8 typology names), the closed gap-`name` typology enum, and `capability-readiness`'s `RelevantGapName` union (D75-L); absorbs the retired refactor plan, folded into D75-L (do not enshrine the catalog). -- **Lights up:** an `elicitation_gaps` row that names its obligation by `refersTo: NodeKind` + a free-form `question`; capability-relevant gaps expressed as a `capability → NodeKind[]` map (grounding floor = `context` + `thesis` + `goal` + `constraint`). -- **Stabilizes:** D75-L (one ontology — gaps reference the node-kind taxonomy, not a parallel vocabulary) and the anti-shadowing line (the table holds obligation/disposition/meta, never domain content). -- **Objective:** Implement the D75-L substrate reshape. (1) `graph/schema/elicitation-gaps.ts`: replace `name` (typology key) with `refersTo: NodeKind` + a free-form `question`, keeping `rationale` / `band` / `predicate` / `importance` / derived `coverage` / `disposition`; regenerate the table + migration (pre-release free-rewrite, no typology residue). (2) `graph/command-executor.ts`: reseed grounding from node kinds — floor `context` / `thesis` / `goal` / `constraint` plus the now-covered `term` / `assumption` — instead of the 8-entry `SEEDED_ELICITATION_GAPS` catalog; draw seeded question text from the `docs/design/ELICITATION_QUESTIONS.md` priming examples. (3) `projections/session/capability-readiness.ts`: replace `RelevantGapName` + `CAPABILITY_RELEVANT_GAPS` with a `capability → NodeKind[]` map; a referenced kind absent from the register still fails loud (config bug ≠ uncovered). (4) Reconcile the graph / db / projections topology READMEs + the seed-set and capability-readiness tests. -- **Why now / unlocks:** D75-L is canonical but the code still implements the typology catalog; this is the upstream substrate reshape `capability-readiness` builds its gate on, so it lands before that frontier rewires the gate. It is also upstream of the trio's projection-shape lock (the gaps register surfaces through projections). -- **Acceptance:** - - `ElicitationGap` carries `refersTo: NodeKind` + `question`; no typology `name` enum, no `GROUNDING_GAP_TYPOLOGIES`, no `RelevantGapName` remain; table/migration regenerated with no typology residue. - - `createSpec` seeds grounding gaps by node kind (floor + `term` / `assumption`), not the eight literal typologies. - - capability-readiness reads a `capability → NodeKind[]` map; the grounding floor is grounded `context` + `thesis` + `goal` + `constraint`; a referenced kind absent from the register fails loud. - - Live presence-derived coverage/answered still flips from graph truth; two same-kind gaps (e.g. two `thesis` questions) are discriminated by question + `manual` / `coverage` satisfier, not aliased by a blunt presence count. - - graph / db / projections READMEs and the affected tests reconciled. -- **Verification:** Inner — gaps schema test (`refersTo: NodeKind`, no name enum); reseed test asserting the grounding floor by node kind incl. `term` / `assumption`; capability-readiness map test over node kinds incl. loud-fail-on-miss; live presence coverage flip preserved. Middle — the **discrimination probe** (the proving unknown): two `thesis`-referencing gaps resolve independently via question + judgment, not one shared presence count — retiring the presence-aliasing risk the retired refactor plan only deferred. Outer — per-spec seeded read-back probe. -- **Cross-cutting obligations:** anti-shadowing (D65-L/D75-L) — the table never stores domain content; the `NodeKind` union stays owned by the drizzle-free leaf `graph/schema/kinds.ts` (D73-L) — gaps import it, never redefine it; the `CommandExecutor` boundary + shared `{specId, lsn}` / `change_log` clock are unchanged. -- **Traceability:** D54-L, D56-L, D57-L, D60-L, D64-L, D65-L, D73-L, D74-L, D75-L / A24-L, A27-L / I30-L. Supersedes the grounding typology catalog, the gap-`name` typology enum, and `RelevantGapName`; absorbs the retired refactor plan. -- **Design docs:** `memory/SPEC.md` D75-L / D65-L; `docs/design/ELICITATION_QUESTIONS.md`; `src/graph/schema/elicitation-gaps.ts`; `src/graph/command-executor.ts`; `src/projections/session/capability-readiness.ts`; `src/graph/README.md`; `src/db/README.md`; `src/projections/README.md`. -- **Current execution pointer:** Done 2026-06-10. Replaced gap `name` with `refersTo: NodeKind` + `question` across schema, DB, `CommandExecutor`, reads, and capability-readiness; added migration `0004_gaps_node_kind_reference`; reseeded grounding by node kind (`context`, `thesis`, `goal`, `constraint`, plus `term`/`assumption`); proved live presence coverage still flips, required-kind absence fails loud, and two `thesis` gaps discriminate independently by question+satisfier. Topology READMEs reconciled. - ### capability-readiness - **Name:** JIT capability-readiness over gaps; retire the stored readiness grade @@ -382,29 +252,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Traceability:** D16-L, D20-L, D52-L, D63-L, D64-L, D65-L / A24-L. - **Design docs:** `memory/SPEC.md` D65-L; `docs/design/GRAPH_MODEL.md`. -### minimal-authority-shell - -- **Name:** Minimal POC authority shell over graph/session actions -- **Linear:** [FE-810](https://linear.app/hash/issue/FE-810/minimal-poc-authority-shell-over-graphsession-actions) -- **Branch:** to create — `ln/fe-810-minimal-authority-shell` -- **Kind:** hardening -- **Status:** done -- **Certainty:** proving -- **Stabilizes:** D20-L/D40-L command-result and elicit-mode authority seams for the current POC graph/session paths. -- **Objective:** Fill only the authority behavior required for a credible POC: graph writes keep returning structured command results, `elicit` suppresses obvious side-effecting tools, and headless/RPC paths surface structured `needs_human` where the POC actually reaches human-only actions. -- **Why now / unlocks:** Full M6 can remain horizon, but the POC must not look unsafe or mode-specific when graph/capture/review paths are exercised. -- **Acceptance:** - - `CommandExecutor` result discriminants remain the only graph mutation outcome surface for agent, RPC, and capture writes. - - `elicit` operational mode blocks or hides side-effecting Pi tools already identified as unsafe for the POC; remaining strict built-in suppression limits are named as A18-L residue, not ignored. - - Any human-only action encountered by current POC paths returns structured `needs_human` in headless/RPC rather than throwing a TUI-only dialog assumption. - - No new standalone authority service is introduced. -- **Verification:** Inner — policy/result-shape tests for touched actions. Middle — small authority matrix over current POC paths (agent graph tool, capture write, review approve if present, RPC/headless selection). Outer — manual smoke only if a TUI-visible policy path changes. -- **Topology materialization:** Policy lives in `graph/policy` and `.pi/extensions/runtime/` / command-policy adapters as appropriate; no caller-side policy snippets in `web/`, `rpc/`, or agent resources. -- **Cross-cutting obligations:** This is a minimal shell, not full M6. Do not widen into comprehensive RBAC/permissions unless a current POC path needs it. -- **Traceability:** R5, R6, R10 / D20-L, D34-L, D40-L / A18-L, A3-L. -- **Design docs:** `memory/SPEC.md` D20-L/D34-L/D40-L; `docs/reference/pi-extensions.md`. -- **Current execution pointer:** Done 2026-06-08. Added `src/.pi/extensions/runtime/authority-matrix.test.ts` as the minimal authority guard: it locks the `CommandResult` discriminant vocabulary (including structured `needs_human` representability), proves `elicit-read-only` derives allowed/blocked tool authority from the shared projected runtime policy, and verifies the POC side-effecting tools (`bash`, `edit`, `write`) are not reachable in `elicit`. No standalone authority service was introduced, `src/.pi/agents/state.ts` stayed untouched, and A18-L strict built-in suppression remains named residue rather than closed. - ### poc-live-ship-gate - **Name:** POC live ship gate and runbook oracle @@ -431,76 +278,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Design docs:** `docs/architecture/probes-and-transcripts.md`; `docs/architecture/pi-ui-extension-patterns.md`; `memory/SPEC.md` verification stance. - **Current execution pointer:** FE-811 ship-gate hardening landed on `ln/fe-811-ship-gate-residue-and-mentions`: stale graph-snapshot/report residue in the committed fixture-curation and project-graph-review-cycle runs was regenerated to the graph-overview/workspace.state contract, the related-edge formatter now labels non-anchor edges `lateral`, and the live mention autocomplete slice now sources selected-spec graph nodes instead of fixture candidates. The remaining frontier work is the final fresh-cwd runbook gate. -### graph-observed-shapes - -- **Name:** Graph observed-shape inventory by consumer -- **Linear:** unassigned -- **Kind:** structural -- **Status:** done -- **Certainty:** proving -- **Lights up:** One canonical observed-shape matrix across graph readers, RPC methods, and web observer surfaces. -- **Stabilizes:** D60-L read-shape ownership, D33-L web read-only observer scope, and the rule that `src/projections/` exists only for reusable multi-consumer DTOs. -- **Objective:** Decide the canonical graph read-shape set per consumer (agent/tooling, RPC, web) and align `graph/`, `rpc/`, and `web/` to that inventory without forcing every agent-oriented shape onto the web. -- **Why now / unlocks:** The read-shape story is currently fragmented across domain queries, Pi adapter helpers, RPC methods, and web features. This is the strongest follow-on coverage frontier because it keeps `projections/` from becoming an indirection grab bag and makes the observed-shape story legible before more surfaces accrete. -- **Acceptance:** - - A closed enumerated coverage ledger exists with required vs deferred shapes per consumer. - - Each required consumer shape has one canonical owner; adapter-local formatting no longer stands in for a durable read shape. - - Web remains a read-only observer; web adoption is deliberate, not accidental bleed-through from agent/RPC needs. - - Any DTOs that survive in `src/projections/` justify multi-consumer reuse; single-owner reads stay in their owning domains. -- **Verification:** Inner — graph query / RPC / web query tests for adopted shapes. Middle — selected-spec observer/read-path smoke over seeded graph data. Outer — manual spot-check only if the web observer UX changes materially. -- **Cross-cutting obligations:** Do not promote all read shapes everywhere. `list_by_kind` / `list_by_band` are plausible web shapes; `related` / `gaps` may remain agent/RPC-only. Keep graph-owned read logic out of `db/`, and keep `src/renderers/` limited to durable LLM/session text rather than arbitrary observer DTOs. -- **Traceability:** D33-L, D51-L, D52-L, D60-L, D64-L. -- **Design docs:** `src/graph/README.md`; `src/rpc/README.md`; `src/web/README.md`. -- **Current execution pointer:** Done 2026-06-08. `src/graph/README.md` now owns the closed observed-shape ledger: `read_graph` requires the six agent shapes, RPC and web require only `overview` + `neighborhood`, `list_by_kind` / `list_by_band` remain web-eligible deferred, and register reads remain deferred until a per-turn driver/consumer needs them. `src/graph/observed-shapes-coverage.test.ts` guards the tool/RPC/web required subsets; no transport shape shipped in this frontier. - -### runtime-affordances-and-legality - -- **Name:** Runtime affordances and legality surface -- **Linear:** unassigned -- **Kind:** structural -- **Status:** done -- **Certainty:** proving -- **Lights up:** A shared affordance/default-on-switch projection across TUI, web, and RPC if runtime posture controls widen again. -- **Stabilizes:** D40-L's projection-as-truth model and the shared legality/default semantics over goal/strategy/lens. -- **Objective:** Consolidate what runtime posture options are legal, default-on-switch, and visible across transport boundaries without replacing the append-only runtime-state projection model with a state machine. -- **Why now / unlocks:** The shared legality tables already exist, but the next UI/control pass could fork them client-side if this surface stays implicit. Keeping it queued protects the "Brunch-owned shared affordance logic" rule before another posture pass lands piecemeal. -- **Acceptance:** - - The scoped frontier closes the required affordance rows across user/system switch surfaces, resolved-state read-back, and shared legality/default projections. - - No client reimplements availability/legality rules locally. - - Active review-set state or freestyle-vs-structured turn mode only joins when it becomes real product state, not as speculative scaffolding. -- **Verification:** Inner — shared affordance projection and switch-reducer tests. Middle — TUI/RPC/web parity checks if a new surface lands. Outer — manual only when a user-visible posture control changes. -- **Cross-cutting obligations:** Keep truth append-only in `brunch.agent_runtime_state`; affordances are pure derivations over shared tables. Do not add xstate or a persisted machine without new evidence. -- **Traceability:** D25-L, D40-L, D59-L, D66-L. -- **Design docs:** `memory/SPEC.md` D40-L/D59-L; `src/projections/README.md`; `src/session/README.md`. -- **Current execution pointer:** Done 2026-06-08. `src/projections/session/affordances.ts` now owns the shared `(resolvedState, readinessGrade)` derivation for legal goal/strategy/lens options plus default-on-switch values, reusing the same grade/AUTO legality source consumed by `.pi/agents/state.ts`; `src/session/README.md` owns the closed coverage ledger and `src/session/runtime-affordances-coverage.test.ts` guards required agent/RPC rows while leaving `active-review-set` and `turn-mode` as explicit product-state-gated deferrals. - -### role-safe-graph-mutations - -- **Name:** Role-safe `mutateGraph` / `mutate_graph` as the canonical graph mutation grammar -- **Linear:** unassigned -- **Kind:** structural / bounded feature -- **Status:** done -- **Certainty:** proving -- **Folded scopes:** the former role-named edge-surface and semantic graph-mutation curation cards were consumed by this frontier and deleted during sync; `mutateGraph` / `mutate_graph` is now the one authored grammar. -- **Lights up:** one authored graph-mutation grammar across direct agent graph writes, review-set proposal drafts, capture writes, seed-fixture loading, and dev curation RPC. -- **Stabilizes:** D51-L/D53-L/D27-L edge-authoring boundary; agents express edges by category + endpoint roles, while `sourceId`/`targetId` stays internal storage geometry derived from `EDGE_CATEGORY_METADATA`. -- **Objective:** Replace exposed create-only `commitGraph` / `commit_graph` with `mutateGraph` / `mutate_graph` as the canonical authored mutation command/tool. The grammar supports create/patch/delete operations, uses role-named create-edge variants (`oracle/claim`, `dependency/dependent`, `abstract/concrete`, etc.), normalizes those variants through `EDGE_CATEGORY_METADATA`, and preserves one `CommandExecutor` transaction, one spec-local LSN, one change-log row, and the existing stored edge shape. -- **Why now / unlocks:** The edge model was intended to help agents map relations from unstructured material, but `{category, source, target}` leaves the most error-prone directionality burden at the agent boundary. The earlier semantic-mutation curation scope would otherwise mint a richer graph-write path with a different API pattern. Taking the bigger step now prevents two graph mutation dialects, gives generalized capture one safe relation grammar, and gives fixture curation patch/delete without creating a second mutation model. -- **Break-and-repair path:** Change the canonical shape first, then let type/test failures enumerate callers. Add `RoleNamedEdgeDraft` + a drift-tested normalizer over `EDGE_CATEGORY_METADATA`; introduce `CommandExecutor.mutateGraph` / a shared mutation planner; remove/rename exposed `commit_graph` and repair prompt resources, Pi graph tool schemas/adapters, capture, seed loader, review-set translation, dev RPC, probes, and docs to `mutate_graph`. `acceptReviewSet` remains the workflow/audit command but reuses the same mutation planner. Do not keep a compatibility bridge accepting both role-named and generic source/target edge drafts; any temporary create-only helper must be private, delegate to `mutateGraph`, and be removed before frontier completion unless a same-slice caller proves it still earns its place. -- **Acceptance:** - - `mutateGraph` / `mutate_graph` is the one exposed authored graph-mutation grammar; exposed `commitGraph` / `commit_graph` is retired or private-only over the same engine. - - Create-edge ops are an 8-variant role-named union at category/role granularity; no tuple-specific relation catalogue is introduced. - - Role field names are test-pinned to `EDGE_CATEGORY_METADATA`; normalization to private `source`/`target` is table-driven, and generic `{category, source, target}` authored drafts are rejected at graph tool and review-set boundaries. - - Create/patch/delete batches are atomic: one transaction, one selected-spec LSN, one change-log row; invalid ops reject the whole batch without writes or clock advancement. - - Edge identity remains immutable: category, semantic endpoints / stored endpoints, stance, and basis cannot be patched; changing them requires delete+create or supersession. - - Policy gates op kinds by caller/posture, so the unified tool grammar does not silently grant autonomous agents deletion authority. - - Product writers are ported: propose-graph uses create-only ops with `createBasis: implicit`; capture and seed loading use create-only ops with `createBasis: explicit`; review-set proposals use role-named edge drafts and acceptance reuses the shared planner; dev curation RPC exposes projected-code create/patch/delete through the same command. -- **Verification:** Inner — normalizer/drift/schema tests over all eight categories; `CommandExecutor` mutation tests for creation parity, patch/delete legality, rollback, sibling-spec rejection, LSN/change-log behavior, and no-reuse ordinals. Middle — graph tool/review-set/capture/seed/dev-RPC tests repaired to `mutateGraph`; dry-run/accept parity for review sets; grep/source tests quarantine `source`/`target` to internal planner/storage/projection code. Outer — product probes and docs point at `mutate_graph`; any retained pre-migration `commit_graph` artifacts are explicitly historical until regenerated. -- **Cross-cutting obligations:** Preserve D4-L/D20-L command boundary, D16-L/A4-L spec-local mutation clock, D51-L stored edge identity, D62-L projected node codes, D63-L `basis` semantics, and D52-L ownership (`graph/` owns mutation semantics; adapters translate only at boundaries). Do not re-orient persistence to upstream/downstream and do not add a read DTO merely to mirror direction; `projection/direction.ts` remains the read projection. -- **Traceability:** D4-L, D16-L, D20-L, D27-L, D51-L, D52-L, D53-L, D62-L, D63-L / A14-L / I1-L, I11-L, I15-L, I20-L, I34-L, I39-L, I40-L, I41-L. -- **Design docs:** `docs/design/GRAPH_MODEL.md`; `memory/SPEC.md` D27-L/D51-L/D53-L; `src/graph/README.md`; `src/rpc/README.md`; `docs/testing/seeded-dev-rpc.md`. -- **Current execution pointer:** Done 2026-06-09. `CommandExecutor` now exposes one public authored mutation seam (`mutateGraph` / `dryRunMutateGraph`) over the extracted planner/writer modules; direct tool writes, review-set acceptance, capture, seed loading, and dev curation all converge on that grammar. The dev-only RPC boundary is now `dev.graph.mutateGraph`, using role-named create-edge ops plus projected node-code / selected-spec edge-id resolution before it enters `CommandExecutor`. Follow-up closure on the same date: the product probes now prompt for and parse `mutate_graph`, current docs describe `mutate_graph` as the active tool, the checked-in 2026-06-05 fixture-curation run is labeled historical pre-migration `commit_graph` evidence, and schema coverage guards the authored edge surfaces against endpoint-role drift. - ### projection-shape-coverage - **Name:** Close the projections ledger with no-loss / shape invariants (PROJECT stage) @@ -624,55 +401,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Traceability:** D52-L, D39-L, D4-L. - **Design docs:** `src/README.md`; `src/.pi/README.md`; `src/.pi/agents/README.md`; `src/.pi/skills/README.md`; `src/.pi/extensions/README.md`; `src/db/README.md`; `src/graph/README.md`; `src/projections/README.md`; `src/renderers/README.md`; `src/rpc/README.md`; `src/session/README.md`; `src/web/README.md`. -### dx-feedback-loops - -- **Name:** First-class developer feedback loops over the pi harness -- **Linear:** FE-825 — https://linear.app/hash/issue/FE-825/first-class-developer-feedback-loops-over-the-pi-harness -- **Kind:** structural / dev-substrate -- **Status:** done -- **Certainty:** proving -- **Retires:** A25-L — first validation that tracking the latest `pi-coding-agent` line (via dep bump + dev source-alias) lands without sealed-profile regression. -- **Lights up:** A consolidated `src/dev/` front door exposing three named end-to-end loops (faux / real-provider / introspection) that did not exist as a first-class iteration surface, with vite/vitest able to run against pi *source* with no rebuild. -- **Stabilizes:** The DX-loop seam (D68-L) and the read-only introspection capture contract (D69-L) that future contributors aim from. -- **Objective:** Make working over the pi harness fast and observable. (1) Bump `@earendil-works/pi-*` to latest (`0.79.0`) and add a dev source-alias resolving those packages to the sibling `pi-mono` `src/` checkout in `vitest` + `vite`, mirroring pi's own alias list, while published builds keep resolving `dist`; `tsx` source mode remains an explicit future opt-in via a dev tsconfig, not the default path (D67-L). (2) Consolidate three loops behind one `src/dev/` front door owning the launchers plus a shared faux-harness factory; migrate ad hoc faux wiring onto the factory (D68-L). (3) Add one read-only, dev-gated introspection extension wired through `brunch-pi-extensions.ts` that captures exactly what the model receives — mechanical via passive `before_provider_request`/`before_agent_start` tap + on-demand `/introspect` (`ctx.getSystemPromptOptions()`), subjective via launcher `session.prompt` — both writing one `.fixtures/scratch/introspection//` run (D69-L/D70-L). -- **Why now / unlocks:** The only fast iteration path today is ad hoc faux wiring scattered across `src/probes/`; the user has elevated DX loops to first-class. This is a substrate that accelerates every later frontier, and its version-bump+alias slice is a shared unblocker best landed before the trio's pi-facing churn. Not POC-ship-critical. -- **Acceptance:** - - pi deps are at latest and a dev source-alias resolves `@earendil-works/pi-{ai,agent-core,tui,coding-agent}` to the `pi-mono` `src/` checkout in `vitest` and `vite`; the published/`dist` resolution path is unchanged, and `tsx` source mode is deferred to an opt-in dev tsconfig if a later real-provider loop needs it. - - A single `src/dev/` front door owns the faux, real-provider, and introspection launchers plus one shared faux-harness factory; existing ad hoc faux setup (e.g. `src/probes/structured-exchange-ordering-proof.ts`, `src/.pi/brunch-pi-settings.ts`) is migrated onto the factory or explicitly justified in place. - - The faux launcher boots an in-memory `AgentSession` over the pi faux provider and runs a scripted turn end-to-end with no network, keys, or tokens. - - One read-only, dev-gated introspection extension loads only through the explicit `brunch-pi-extensions.ts` bundle, returns every captured payload unchanged, and produces a well-formed paired `.fixtures/scratch/introspection//` run (mechanical payload + subjective answer correlated by turn). - - Product runs are unaffected: outside dev/introspection mode the introspection extension is absent and the D39-L offline default holds. -- **Verification:** Inner — alias-resolution + faux-harness-factory boot unit tests; a test asserting the introspection extension returns payloads unchanged (observation-only); a sealed-profile test that the extension is absent and offline default intact under product mode. Middle — faux launcher scripted-turn smoke; introspection run-artifact shape assertion under `.fixtures/scratch/introspection/`. Outer — manual real-provider introspection session against a live model: ask the model to enumerate and critique tools/skills and eyeball the paired capture (the I38-L discretionary-loading fitness check; tracked, not gated). -- **Cross-cutting obligations:** Preserve the D39-L sealed-profile boundary — introspection loads via the explicit static bundle (never ambient discovery), observes but never mutates payloads, and its offline-lift + extension inclusion are dev-gated, never product defaults. Dev loops are means-of-building and stay distinct from `src/probes/` product-verification probe runs; any durable evidence a dev loop produces lands as a probe run under the `.fixtures/runs/` contract, not a parallel artifact path (D68-L). Pi version bumps are routine adaptation, not deferred migrations; keep the dev alias mirroring pi's own `tsconfig.json` paths list and do not pin back (D67-L). -- **Topology materialization:** `src/dev/` becomes the dev front door (launchers + shared faux-harness factory); the introspection extension lives under `src/.pi/extensions/` per D39-L topology and is wired in `src/.pi/brunch-pi-extensions.ts`; dev source-alias config lives in `vite.config.ts` through the `PI_SOURCE`-gated runtime alias, while base `tsconfig.json` stays paths-free; introspection artifacts are written under `.fixtures/scratch/introspection/`. -- **Traceability:** D39-L, D58-L, D67-L, D68-L, D69-L; A25-L; I38-L. -- **Design docs:** `memory/SPEC.md` §Development Feedback Loops (DX) and D67-L–D69-L; a new `src/dev/README.md`; `pi-mono/packages/coding-agent/docs/development.md` and `vitest.config.ts` for the alias pattern. -- **Current execution pointer:** Done 2026-06-09. The chain landed the latest-pi bump and `PI_SOURCE`-gated runtime alias, the `src/dev/` faux front door and shared faux harness, and the dev-gated read-only introspection extension plus paired run-artifact launcher. Verification: `npm run verify` (608 tests, tsc build, web build). The follow-on frontier `dx-introspection-live` is now also done: the real TUI wiring, `--cwd` launch surface, unified `BRUNCH_DEV` gate, dev query tools, and workspace-local `.brunch/debug/` cache all landed on 2026-06-11. - -### dx-introspection-live - -- **Name:** Live, conversational agent-input introspection in the real dev TUI -- **Linear:** FE-825 — https://linear.app/hash/issue/FE-825/first-class-developer-feedback-loops-over-the-pi-harness -- **Kind:** structural / dev-substrate (capability expansion over `dx-feedback-loops`) -- **Status:** done -- **Certainty:** proving -- **Retires:** A26-L — proof that conversational introspection is buildable as a read-only dev session-query-back tool without weakening D39-L sealing. -- **Lights up:** Running `BRUNCH_DEV=1 npm run dev -- --cwd .fixtures/workbenches/` boots the *real* Brunch TUI against a chosen fixture workspace with the introspection extension live and the model able to query exact prior session-log values back into chat for discussion — a loop that did not exist before this frontier (the extension was built but dormant, and dev runs polluted the operating cwd). -- **Stabilizes:** The four-role `.fixtures/` topology (D70-L), the unified `BRUNCH_DEV` dev gate + `--cwd` launch surface (D71-L), and the conversational session-query contract (A26-L) that future introspection work aims from. -- **Objective:** Make introspection actually *usable live* and *conversational*. Preflight hardening has already formalized scratch artifact routing and moved probe faux wiring out of `src/dev/**`; slice 1 added `--cwd `, unified dev gating under `BRUNCH_DEV`, and wired the introspection extension into the real TUI launch path only when enabled. Slice 2 replaces the earlier fixed self-report schema idea with a general read-only `brunch_session_query` tool over `ctx.sessionManager.getBranch()`: predicate match session entries, project exact values, truncate/spill large output, and let the agent echo/discuss those returned bytes in normal chat. The follow-on live-advertisement/payload-query slice makes registered dev query tools actually active under the D40-L allow-list and adds `brunch_introspect_query` over captured provider payloads plus base prompt options. Live-model compliance remains outer-loop fitness, not a product prompt/resource contract. -- **Why now / unlocks:** When this frontier started, `dx-feedback-loops` had built the introspection machinery but left it dormant — the capability the user actually wanted (interrogate the live in-product agent about how it reads Brunch's tools/skills, and get clarity feedback in chat) was not yet reachable. This frontier closed that gap and hardened the fixtures topology every dev loop and probe shares. Not POC-ship-critical; a DX substrate that accelerates later product frontiers (especially the I38-L discretionary-loading and tool/skill-clarity questions). -- **Acceptance:** - - `runBrunchCli` accepts `--cwd ` (defaulting to `process.cwd()`) so a dev session can target `.fixtures/workbenches/` without `cd`. - - A single `BRUNCH_DEV` switch enables dev RPC, introspection registration, scratch routing, and the offline lift together; `BRUNCH_DEV_RPC` is fully retired (no remaining references in code or docs). - - With `BRUNCH_DEV=1`, the real Brunch TUI registers the introspection extension last in the `before_provider_request` chain and a live model turn produces a paired scratch run; without `BRUNCH_DEV`, the extension never registers and the D39-L offline default holds. - - The agent can call `brunch_session_query` on demand to return verbatim projected value(s) from predicate-matched session entries, including multi-match structured-exchange pairs/triplets; the agent can call `brunch_introspect_query` to return verbatim projected value(s) from captured provider payloads and base prompt options. Both tools are dev instrumentation, never product behavior. -- **Verification:** Inner — `--cwd` parse unit test; scratch-path resolution test (artifact root is repo-`.fixtures/scratch/`, independent of operating cwd); `BRUNCH_DEV` gating test at the `brunch-tui.ts` call site (extension absent when unset, present + last-ordered when set); build-exclusion assertion for `src/dev/**`; offline-lift save/restore test; dev query-tool find/project/truncation and active-tool advertisement tests. Middle — faux-driven introspection scratch-run shape assertion; faux/tool tests where `brunch_session_query` and `brunch_introspect_query` receive verbatim projected values. Outer — manual `BRUNCH_DEV=1 npm run dev -- --cwd .fixtures/workbenches/` session against a live model: ask the agent to pull exact prior/session and provider-payload values through the dev query tools, echo them in fenced blocks, and discuss tool/skill clarity (tracked, not gated). -- **Cross-cutting obligations:** Preserve the D39-L sealed-profile boundary — introspection stays read-only (observes/queries, never mutates payloads or session state), loads only via the explicit `brunch-pi-extensions.ts` bundle (never ambient discovery), and all dev affordances stay behind `BRUNCH_DEV`; the dev query-tool union is injected from the factory into both runtime active-tool policy and prompt composition, then still loses to blocked tools and registered-tool intersection (D40-L/I42-L). The offline lift is save/restore-scoped at the session-construction site, never a naked global `process.env` mutation. Dev scratch output stays distinct from `src/probes/` product-verification runs; durable evidence is reached only by explicit promotion into the tracked `runs/` contract (D70-L), not a parallel artifact path. Conversational query tools are dev instrumentation; they must not leak into product behavior or the sealed profile. -- **Topology materialization:** `.fixtures/scratch/` (gitignored) has joined `seeds/`/`workbenches/`/`runs/`; `--cwd` parsing lands in `src/app/brunch.ts` / `runBrunchCli`; `BRUNCH_DEV` gating and the introspection `{ enabled }` wire-up land in `src/app/brunch-tui.ts`; the provider-payload tap remains in `src/.pi/extensions/introspection/`; conversational query planes live in `src/.pi/extensions/session-query/` and `src/.pi/extensions/introspect-query/`, sharing projection/truncation helpers from `src/.pi/extensions/shared/query-projection.ts`; `.gitignore`, `.fixtures/README.md`, `src/dev/README.md`, and `src/.pi/extensions/README.md` reconcile to the new topology and gate. -- **Traceability:** D39-L, D58-L, D67-L, D68-L, D69-L, D70-L, D71-L; A26-L; I38-L, I42-L. -- **Design docs:** `memory/SPEC.md` §Development Feedback Loops and D69-L–D71-L, A26-L, I42-L; `.fixtures/README.md`; `src/dev/README.md`; `src/.pi/extensions/introspection/README.md`; `src/.pi/extensions/session-query/README.md`; `src/.pi/extensions/introspect-query/README.md`. -- **Current execution pointer:** Done 2026-06-11. Slices 1-2, the dev-query active-tool follow-on, and the workspace debug-cache chain are done: `BRUNCH_DEV` real TUI launches can mirror the latest final system prompt and append explicit Brunch-owned text tool-result content into launch-cwd `.brunch/debug/` while repo-root `.fixtures/scratch/` remains the durable paired-run artifact path. `tool-renders` flattening remains explicitly deferred until a concrete renderer-debugging need appears. - ### dev-seed-fixtures - **Name:** Explicit dev seeding and launchable workbench flow @@ -700,27 +428,6 @@ The near-term spine has two tracks. The **context-pipeline coverage trio** remai - **Traceability:** D16-L, D20-L, D52-L, D61-L, D63-L, D70-L, D71-L, D79-L; I1-L, I11-L, I48-L. - **Design docs:** `.fixtures/README.md`; `.fixtures/workbenches/live-graph-observer/README.md`; `docs/design/GRAPH_MODEL.md`. -### web-design-system-port - -- **Name:** Web client visual design-system port -- **Linear:** unassigned -- **Kind:** bounded feature (web presentation) -- **Certainty:** earned — the target design exists and works in `../brunch/src/client`; the closure is *materialize the port + delete the invented aesthetic*, not retire an unknown. (Project default is `proving`; this frontier overrides because the design is known.) -- **Status:** done (all three cards landed 2026-06-09; exhausted scope files deleted during sync) -- **Objective:** Replace the agent-invented "warm brunch" web aesthetic with the prior trunk's restrained design language (D72-L). Two materializations and one deletion: (a) **tokens** — port the token system into `src/web/styles.css` (Inter + Geist Mono; `ink/sub/hint/rule/wash/tint` ramp + link/plane accents; 11–16px type scale; `--shadow-card` family); (b) **primitives** — copy `DrawerCard`, `KindBadge`, `CountBadge`, `RefBadge` into a new `src/web/components/`, adapted from the old `KnowledgeKind` knowledge-card pattern to this trunk's `NodeKind`/`NodePlane` with a plane-organized accent map; (c) **re-skin** the three existing views (`WorkspaceChrome`, `GraphOverviewPanel`, `SessionPanel`) as a *style + component-pattern port of the views we have* (scope correction, user 2026-06-09) — preserving behavior except invented dead scaffolding — and delete the warm gradients, `backdrop-blur`, oversized radii/shadows, translucent surfaces, and wide-tracked uppercase labels. The non-functional "Focus node" placeholder (never called `graph.nodeNeighborhood`) was removed; the "Edge categories" summary was kept (restyled, user finds it useful). -- **Why now / unlocks:** The current web UI's visual language was invented wholesale by the agent that built it and does not match the product's established look. Realigning now keeps the read-only observer surface presentable for manual/observer testing and stops the invented aesthetic from being copied forward into future web views. Independent of the delivery spine — touches no data, RPC, query, subscription, or routing code. -- **Acceptance:** - - `src/web/styles.css` carries the ported token system; no warm-palette tokens, body gradients, or `backdrop-blur` remain. - - `src/web/components/` holds the ported primitives; the accent map is exhaustive over `NodePlane`, with a compile-time `satisfies Record` guard, while reference-code labels stay canonical via `NODE_KIND_METADATA` + `kindOrdinal` (I43-L). - - The three views render in the ported language: quiet metadata-row chrome, `plane / kind`-grouped node cards with canonical reference codes (`NODE_KIND_METADATA` labels + `kindOrdinal`) and plane-accented `KindBadge`/`CountBadge`, plain session card. The "Edge categories" summary is kept (restyled as `RefBadge` chips); the non-functional "Focus node" placeholder is removed. - - Read-only contract preserved: no change to queries, RPC client, subscriptions, routes, or projection inputs. - - Existing web tests preserved; only the two Focus-node assertions removed; `npm run verify` is green (28 web tests, oxlint type-aware clean, build clean). -- **Verification:** Inner — `npm run verify` (oxlint type-aware + oxfmt + vitest + build); update `src/web/app.test.tsx` and any view tests that assert retired class names / `aria-label`s. Outer — manual browser check of `/` and `/spec/$specId` against a seeded spec (`npm run seed` then launch web mode) to confirm the chrome, kind-grouped graph cards, and session panel match the prior trunk's look. -- **Topology materialization:** Stays inside `src/web` per D52-L (`web/` is a standalone build target; must not read SQLite/Pi RPC/JSONL directly). New `src/web/components/` owns ported primitives; only `src/web` imports from it. Component/style patterns are copied (not shared) from `../brunch`. Exception to `sourcing: strip-or-build`: the webfont packages `@fontsource-variable/inter` + `@fontsource-variable/geist-mono` were added with user approval (2026-06-09) — the fonts are the most visible design token; the "no new packages" line was not a hard rule. -- **Cross-cutting obligations:** Pre-release posture (`migration: free-rewrite`) — discard the invented design freely; do not preserve it for compatibility. Read-only invariant (D33-L one-writer/many-observer): this frontier adds no web write paths. Node reference codes must use the canonical `NODE_KIND_METADATA` projection (D62-L), not a web-local relabeling. -- **Traceability:** D10-L, D52-L, D62-L, D72-L / I43-L, I39-L. -- **Design docs:** `../brunch/src/client/index.css`, `../brunch/src/client/components/drawer-card.tsx`, `../brunch/src/client/components/knowledge-card.tsx` (reference source — separate checkout, not imported). - ## Recently Completed - 2026-06-09 `role-safe-graph-mutations` — Done: retired the remaining public `commitGraph` residue, extracted the shared mutation planner/writer out of `CommandExecutor`, and completed the last boundary migration so dev curation now exposes `dev.graph.mutateGraph` with role-named create-edge ops plus projected node-code / selected-spec edge-id resolution. Follow-up closure on the same frontier: reconciled the remaining product probes and current docs to the canonical `mutateGraph` / `mutate_graph` grammar, explicitly marked the checked-in 2026-06-05 fixture-curation artifact as historical pre-migration `commit_graph` evidence, and added role-named edge schema coverage across the Pi tool and dev RPC boundaries. Verified: `npx vitest run src/rpc/handlers.test.ts src/app/brunch.test.ts src/probes/fixture-curation-loop.test.ts src/probes/propose-graph-commit-proof.test.ts src/graph/mutate-graph-edge-schema.test.ts` and `npm run verify`. diff --git a/memory/SPEC.md b/memory/SPEC.md index 46171ebc..3b56da2b 100644 --- a/memory/SPEC.md +++ b/memory/SPEC.md @@ -121,7 +121,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c | A24-L | A flat `elicitation_gaps` table (prospective memory) is sufficient to drive elicitor questioning, seed grounding, and feed capability-readiness without graph structure — gaps are typed coverage obligations (typologies), not graph nodes; apparent dependency among gaps is mediated by the claims their resolution produces. | medium | validated | D65-L, D74-L, D75-L | 2026-06-08 FE-823 materialized the flat table (built as `elicitation_backlog`) on the real LSN/change-log seam. 2026-06-10 `elicitation-gaps-remodel` replaced that question-instance shape with the typed obligation register, regenerated the table as `elicitation_gaps`, seeded the grounding typology catalog, and proved live presence-derived coverage/answered read-back without stored structural answers; `gaps-node-kind-reference` then retired the catalog/name vocabulary in favor of `refersTo: NodeKind` + free-form `question`. Remaining downstream proof is capture-reflection spawning; if genuine gap→gap dependency or rich traversal emerges, promote the table to a plane (rows→nodes, FK pointers→edges). | | A25-L | Tracking the latest `pi-coding-agent` release continuously (via source-alias in dev + package dependency bumps) keeps Brunch adaptable without routinely destabilizing it, because Brunch's pi product-behavior surface is concentrated in a few sealed integration seams (the `src/.pi/` extension bundle and the session/runtime adapters) behind the D39-L profile — even though pi *types* are imported across ~25 files, those are mostly type-only and pass through that small set of seams. | medium | partially validated | D67-L | 2026-06-09 FE-825 bumped Brunch to pi 0.79, kept type/default resolution on installed `dist`, added a `PI_SOURCE`-gated vite/vitest runtime alias to sibling `pi-mono` source, preserved product default sealed-profile/offline behavior, and passed `npm run verify`. Each later pi bump that lands without product-behavior regressions raises confidence; a bump that silently breaks sealed-profile assumptions falsifies it. | | A26-L | The refined "conversational introspection" goal can be built as a *read-only session-query-back tool*: under `BRUNCH_DEV`, the agent can call `brunch_session_query` over `ctx.sessionManager.getBranch()`, find entries by predicate, project capped dot/`[n]`/`[*]` paths, and surface exact returned values in chat without weakening D39-L sealing or turning self-reporting into product behavior. | medium | validated | D69-L, D71-L | 2026-06-09 `dx-introspection-live` slice 2 replaced the earlier fixed structured self-report/schema idea with `src/.pi/extensions/session-query/`: a dev-gated read-only tool registered only through `createBrunchPiExtensions(..., { introspection: { enabled } })`, covered by find/project/truncation unit tests, default-off/default-on registration tests, and a faux turn that returns verbatim projected session values. Live-model compliance with "call then echo verbatim" remains outer-loop fitness, not a merge gate. | -| A27-L | Gap satisfaction is expressible band-by-band at acceptable LLM cost: **commitment** typologies are structural `presence`/`field`/`coverage` predicates over the graph; **grounding** typologies are a `presence` floor plus `manual` LLM satisficiency (D57-L); **elicitation** typologies are generatively spawned. The explicit `capability → relevant gaps` map (D74-L) carries enough signal to drive proceed / negotiate without a standing grade. | medium | partially validated | D65-L, D74-L, D75-L | 2026-06-10 `elicitation-gaps-remodel` validated the structural `presence` case: a seeded grounding gap's derived coverage/answered state flips from graph truth with no stored structural answer and sibling-spec isolation holds. 2026-06-10 the `capability-readiness` D74-L gate tracer validated the grounding floor: the explicit capability→gap map drives proceed / proceed_low_epistemic / negotiate, live presence coverage flips a generative capability negotiate→proceed, and the gate imports no grade symbols. 2026-06-10 `gaps-node-kind-reference` collapsed that map onto `NodeKind` (`context`/`thesis`/`goal`/`constraint`), proved required-kind absence fails loud, and proved same-kind gaps discriminate by question+satisfier rather than typology name. 2026-06-10 the `capability-readiness` affordance-legality slice validated the affordance-path consumer: the runtime affordance projection (`affordances` / `axisOptionsForRuntimeState`) derives goal/strategy/lens menu legality from `evaluateCapabilityReadiness` over gap coverage with no grade symbols, a coverage flip moves a gated option legal, and a required kind absent from the register fails loud (config bug ≠ uncovered) — retiring the affordance-path uncertainty. 2026-06-10 the method/manifest legality slice validated the turn-boundary consumer: `before_agent_start` reads selected-spec gaps through the graph read seam, prompt manifests and active tool names derive gated methods from gap coverage, floor methods/tools remain available at zero coverage, and the `state.ts` grade tables are gone. 2026-06-10 the agent-prompt display slice validated the display consumer: `compose.ts` and `contexts/cwd.ts` render the selected-spec soft per-band estimate from gaps with stable band order/fixed decimals, and `before_agent_start` threads the same selected-spec gaps into the pushed cwd context. Remaining proof: `field`/`coverage` predicate derivation, `manual` LLM satisficiency, elicitation/commitment fixtures, workspace/chrome display rewiring, and stored-grade deletion. Falsified if grounding readiness cannot decompose into per-typology presence+manual judgments, or if commitment obligations need logic the predicate union can't express. | +| A27-L | Gap satisfaction is expressible band-by-band at acceptable LLM cost: **commitment** typologies are structural `presence`/`field`/`coverage` predicates over the graph; **grounding** typologies are a `presence` floor plus `manual` LLM satisficiency (D57-L); **elicitation** typologies are generatively spawned. The explicit `capability → relevant gaps` map (D74-L) carries enough signal to drive proceed / negotiate without a standing grade. | medium | partially validated | D65-L, D74-L, D75-L | 2026-06-10 `elicitation-gaps-remodel` validated the structural `presence` case: a seeded grounding gap's derived coverage/answered state flips from graph truth with no stored structural answer and sibling-spec isolation holds. 2026-06-10 the `capability-readiness` D74-L gate tracer validated the grounding floor: the explicit capability→gap map drives proceed / proceed_low_epistemic / negotiate, live presence coverage flips a generative capability negotiate→proceed, and the gate imports no grade symbols. 2026-06-10 `gaps-node-kind-reference` collapsed that map onto `NodeKind` (`context`/`thesis`/`goal`/`constraint`), proved required-kind absence fails loud, and proved same-kind gaps discriminate by question+satisfier rather than typology name. 2026-06-10 the `capability-readiness` affordance-legality slice validated the affordance-path consumer: the runtime affordance projection (`affordances` / `axisOptionsForRuntimeState`) derives goal/strategy/lens menu legality from `evaluateCapabilityReadiness` over gap coverage with no grade symbols, a coverage flip moves a gated option legal, and a required kind absent from the register fails loud (config bug ≠ uncovered) — retiring the affordance-path uncertainty. 2026-06-10 the method/manifest legality slice validated the turn-boundary consumer: `before_agent_start` reads selected-spec gaps through the graph read seam, prompt manifests and active tool names derive gated methods from gap coverage, floor methods/tools remain available at zero coverage, and the `state.ts` grade tables are gone. 2026-06-10 the agent-prompt display slice validated the display consumer: `compose.ts` and `contexts/cwd.ts` render the selected-spec soft per-band estimate from gaps with stable band order/fixed decimals, and `before_agent_start` threads the same selected-spec gaps into the pushed cwd context. 2026-06-11 the review-fix remediation hardened the predicate substrate: `gapPredicateSupport` (in the union's owning schema module) is the single never-checked owner of per-arm semantics — `field`/`coverage` now **reject loudly at the CommandExecutor boundary** until derivation exists (a structural arm without derivation also fails loud at read), open presence gaps dedupe by `(specId, nodeKind)` (presence is a kind-floor obligation; situated same-kind gaps use `manual` until `field`/`coverage` land), and gap hydration fails on `predicate_kind`/JSON divergence. Remaining proof: `field`/`coverage` predicate derivation, `manual` LLM satisficiency, elicitation/commitment fixtures. Falsified if grounding readiness cannot decompose into per-typology presence+manual judgments, or if commitment obligations need logic the predicate union can't express. | ### Active Decisions @@ -133,7 +133,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c - Tooling exception: the worktree helper extension now lives outside this repository under the user Pi agent tree (`~/.pi/agent/extensions/worktree/index.ts`) for direct Pi sessions only. It is not a Brunch product extension, is not imported by `src/.pi/brunch-pi-extensions.ts`, and does not weaken the sealed Brunch Pi settings/extensions boundary; Brunch-launched product sessions continue to disable ambient `.pi/` discovery unless deliberately imported. The extension may register direct-Pi `/worktree:switch` / `switch_worktree` and `/worktree:create` / `create_worktree` affordances, but Brunch does not test, package, or document it as a product extension. - **D40-L — Runtime state is transcript-backed Brunch session-agent state, not hidden extension memory.** `src/session/runtime-state.ts` owns the transcript entry facts (`brunch.agent_runtime_state` schema, parser, and init/switch append helpers); `src/projections/session/runtime-state.ts` owns the pure reusable projection, `src/projections/session/runtime-policy.ts` owns operational-mode/role policy plus shared capability-readiness policy, and `src/projections/session/affordances.ts` owns the pure `(resolvedState, gaps) → legal options + default-on-switch` derivation for goal/strategy/lens. The projection reconstructs agent posture from linear `brunch.agent_runtime_state` entries (`reason: "init" | "switch"`), last-writer-wins at turn preparation and over `session.runtimeState`; default/empty slots are explicit when no entry family exists. Runtime-state entries are Pi JSONL state-change facts, not assistant/user chat content: init and switch entries should render, when visible, as dim non-chat state rows analogous to Pi thinking/model-change rows, and must not enter LLM context as ordinary conversation. Its axes are `op_mode` (`elicit`, future `execute`) plus optional, AUTO-able objective axes `strategy`, `lens`, and `goal` (D25-L, D59-L). **Posture switches (durable `reason: "switch"` entries) are a user/system authority: the foreground agent never emits a posture switch.** The agent's only in-axis freedom is `AUTO` (per-turn implicit selection from the D58-L manifest); what it actually chose each turn is legible downstream via per-emission facet stamping (D25-L), not via runtime-state — so runtime-state is the *frame/constraints* while emitted facets carry the agent's per-turn choice. User-mutable axes are `op_mode`, `strategy`, and `lens`; `goal` is internal/readiness-derived and not part of the user posture-change surface for now (D59-L). On a parent switch that invalidates a child axis, the child defaults to `AUTO`. The `source: "agent"` entry value is reserved — no current path emits it; it is parked for a future execute-mode orchestrator that might legitimately steer sub-postures. `session.runtimeState` also exposes shaped mention slots, world-update watermarks (latest graph LSN and optional git head, without raw transcript detail bags), and lifecycle facts when transcript-backed entries make them computable; this is a projection contract, not a mutable state table. The **foreground session agent** (`elicitor` now, future `executor`) is *derived* from `op_mode`, not stored; the other agent roles (`reviewer`, `reconciler`, future `scout`/`researcher`) are async sub-agent/side-chain workers (D29-L, D44-L) invoked out-of-band, never part of the session state machine. `op_mode` gates tool authority, applied by `src/.pi/extensions/runtime/index.ts` (current `elicit` policy denies side-effecting `bash`/`edit`/`write` plus user-shell interception) while `.pi` reuses session-owned entry definitions and projected policy. Prompt composition is a separate concern (D58-L). Depends on: D17-L, D23-L, D25-L, D39-L, D58-L, D59-L. Supersedes: mode-only vocabulary, extension-local mutable state as authority, storing the foreground role as independent session state, the "runtime bundle / role preset" as one knob deriving model/thinking/resources, and binding prompt-resource location to `src/.pi/context/`. - **D34-L — Command containment separates visibility suppression from effect blocking.** Current Pi extension seams can hide unsupported slash suggestions with autocomplete wrapping and can cancel branch/session effects through lifecycle hooks, but they cannot strictly suppress exact interactive built-in commands before `InteractiveMode` dispatches them. Brunch-owned commands must use product-specific names and route writes through Brunch handlers/`CommandExecutor`; extension command collisions are not an override mechanism. Strict built-in command/keybinding policy is a Pi upstream/API ask, while POC safety relies on hiding generic affordances, blocking dangerous effects (`/fork`, `/clone`, `/tree`, raw session replacement), and failing fast on branched transcripts. Brunch's command-policy code should live in `src/.pi/extensions/commands/policy.ts`, merging branch/session-effect blocking with any product command allow/deny behavior instead of preserving a branch-only module. Depends on: D2-L, D24-L, A18-L. Supersedes: treating extension `input` handlers or command-name collisions as built-in command allowlisting. -- **D35-L — Dynamic TUI chrome is a Brunch projection wrapper over Pi UI primitives.** Downstream TUI affordances should call a Brunch-owned renderer (`renderBrunchChrome` or its successor) with one activated product-state value rather than scattering raw `ctx.ui.setHeader`, `setFooter`, `setWidget`, title, or working-indicator calls. The wrapper is stateless projection over canonical workspace/session/graph facts, including the discovered project name, selected spec, real activated session id/label, launch activation kind for new-session startup headers, and app-supplied live sidecar URL when present, while its TUI footer compositor may read Pi footer telemetry (`getGitBranch`, foreign `getExtensionStatuses`) at render time. Brunch chrome and startup dialog are project-first shell surfaces with selected-spec context: the project name labels the cwd container, the spec title labels the selected graph, and the session label distinguishes transcript instances. New `newSpec` / `newSession` launches keep Pi `quietStartup` but install a Brunch-owned expandable header through the chrome wrapper; resume/open launches stay quiet. Brunch chrome does not publish a `brunch.chrome` status key; `ctx.ui.setStatus(key, text)` remains a lateral contribution channel for other extensions and future dynamic Brunch state. RPC clients should rely only on surfaces Pi actually emits for the wrapper (currently sidecar/widget-compatible string arrays and title, plus any future explicit status adapter) because header/footer/working-indicator are TUI-only in current Pi RPC mode. Session display names are product projections over Pi session metadata: every Brunch-created session should immediately receive a neutral workspace-global `Untitled Session N` `session_info` label, and later user/generated names may characterize the transcript without replacing spec identity or graph truth. Depends on: D2-L, D21-L, D34-L, A18-L. Supersedes: treating Pi UI methods as direct downstream affordance APIs, rendering placeholder session state such as `unbound` after a session is activated, consuming the status-key namespace for chrome's own static summary, using spec title as the default session label, or allowing two unchanged Brunch-created default names to collide in one cwd. +- **D35-L — Dynamic TUI chrome is a Brunch projection wrapper over Pi UI primitives.** Downstream TUI affordances should call a Brunch-owned renderer (`renderBrunchChrome` or its successor) with one activated product-state value rather than scattering raw `ctx.ui.setHeader`, `setFooter`, `setWidget`, title, or working-indicator calls. The wrapper is stateless projection over canonical workspace/session/graph facts, including the discovered project name, selected spec, real activated session id/label, launch activation kind for new-session startup headers, and app-supplied live sidecar URL when present, while its TUI footer compositor may read Pi footer telemetry (`getGitBranch`, foreign `getExtensionStatuses`) at render time. Brunch chrome and startup dialog are project-first shell surfaces with selected-spec context: the project name labels the cwd container, the spec title labels the selected graph, and the session label distinguishes transcript instances. Every non-cancel launch activation (`newSpec` / `newSession` / `continue` / `openSession`) keeps Pi `quietStartup` and installs a Brunch-owned startup header through the chrome wrapper, parameterized by the activation decision (test-locked: "requests startup header chrome for every activated launch decision"); the header's expand affordance was removed 2026-06-11 — no advertised unwired behavior — and may return only with a real input path. Brunch chrome does not publish a `brunch.chrome` status key; `ctx.ui.setStatus(key, text)` remains a lateral contribution channel for other extensions and future dynamic Brunch state. RPC clients should rely only on surfaces Pi actually emits for the wrapper (currently sidecar/widget-compatible string arrays and title, plus any future explicit status adapter) because header/footer/working-indicator are TUI-only in current Pi RPC mode. Session display names are product projections over Pi session metadata: every Brunch-created session should immediately receive a neutral workspace-global `Untitled Session N` `session_info` label, and later user/generated names may characterize the transcript without replacing spec identity or graph truth. Depends on: D2-L, D21-L, D34-L, A18-L. Supersedes: treating Pi UI methods as direct downstream affordance APIs, rendering placeholder session state such as `unbound` after a session is activated, consuming the status-key namespace for chrome's own static summary, using spec title as the default session label, or allowing two unchanged Brunch-created default names to collide in one cwd, and the earlier resume/open-launches-stay-quiet clause (superseded 2026-06-11: the shipped, test-locked behavior headers every non-cancel activation). - **D52-L — Source topology targets `src/{app, workspace, scripts, .pi, db, graph, session, projections, renderers, rpc, web}` with directed layer dependencies.** Product entrypoints live under `src/app/`, local executable utility ownership is reserved under `src/scripts/`, package/workspace identity tests live under `src/workspace/`, and reusable projection/rendering modules live under top-level `src/projections/` and `src/renderers/` rather than whichever domain or adapter first needed them. `app/` owns product host entrypoints and wiring. `workspace/` owns cwd/package/workspace identity helpers. `scripts/` owns local executable utilities. `.pi/` is the sealed Pi-harness runtime surface: `agents/` owns runtime prompt assembly, role definitions, legal resource manifests, and agent-context orchestration; `skills/` owns goal/strategy/lens/method markdown resources read on demand; `components/` owns reusable Pi TUI/message components; `extensions/` owns Pi registrars for tools, hooks, commands, chrome, context tools, system-prompt append, exchanges, graph tools, workspace dialogs, runtime policy, and session lifecycle. `graph/` is the domain layer: CommandExecutor, readers, policy, validators, query bucketing, change-log replay, reconciliation-need substrate; it imports from `db/` (Drizzle schema, migrations, connection lifecycle) and no other layer imports `db/` directly. `session/` owns transcript projection, exchange extraction, workspace coordination, session binding, runtime-state transcript entries, and LSN staleness tracking over Pi JSONL. `projections/` owns structured DTOs derived from graph/session/workspace/tool facts; it must not render lossy text and must not import adapters, transports, app entrypoints, or web code. `renderers/` owns lossy text/markdown/toon/tool-content rendering over domain or projection inputs; it may import input types from `graph/`, `session/`, or `projections/` as needed, but must not import adapters, transports, app entrypoints, or web code. `rpc/` owns Brunch JSON-RPC handlers. `web/` owns the React client. Dependency direction: `.pi/`, `rpc/`, and `app/` may import from `graph/`, `session/`, `projections/`, and `renderers/`; `.pi/agents/` may import from `graph/`, `session/`, `projections/`, and `renderers/` to build agent context; `.pi/extensions/` may import from `.pi/agents/` and `.pi/components/`; `projections/` may import from `graph/`, `session/`, and `workspace/`; `renderers/` may import from `projections/`, `graph/`, and `session/`; `graph/` imports from `db/`, and `db/` may import the drizzle-free taxonomy leaf `graph/schema/kinds.ts` — the single sanctioned `db/`→`graph/` edge (D73-L); `web/` is a standalone build target. Depends on: D2-L, D4-L, D39-L, D40-L. Refined by: D73-L. Supersedes: scattering session domain files at `src/` root; treating Pi-only agents as a host-independent top-level `src/.pi/` layer; nesting prompt composition under `src/.pi/context/`; treating reusable `project` / `format` helpers as owned by whichever adapter first needed them. - **D73-L — Domain enum taxonomy is owned by a drizzle-free `src/graph/schema/kinds.ts` leaf; `db/` is a consumer, not the source.** The closed enum `const` arrays that define graph vocabulary — node kinds (`INTENT_KINDS`, `ORACLE_KINDS`, `DESIGN_KINDS`, `PLAN_KINDS`), `NODE_PLANES` (`intent`/`oracle`/`design`/`plan`), `NODE_BASES`, `EDGE_CATEGORIES`, `EDGE_STANCES`, `READINESS_BANDS`, `LENS_AFFINITIES`, `GAP_DISPOSITIONS`, and `GAP_PREDICATE_KINDS` — live in `graph/schema/kinds.ts`, a pure constants leaf that imports nothing (no drizzle, no `graph/atoms`). Both `db/schema.ts` (for `text({ enum })` column constraints, including the previously-inlined `plane` columns) and `graph/` domain modules import the arrays from this leaf; `graph/index.ts` re-exports them from the leaf so non-graph layers still avoid importing `db/` directly (I26-L). Derivations stay where they are read: `NODE_KIND_METADATA`, `formatGraphNodeCode`, `parseGraphNodeCode`, and `intentKindCategory` remain in `graph/schema/nodes.ts` (D62-L). The motivating defect: because `db/schema.ts` eagerly evaluates `sqliteTable(...)` and `verbatimModuleSyntax` emits even type-only imports at runtime, any value-import path from `web/` into the old taxonomy location pulled Drizzle into the browser bundle. Locating taxonomy in a drizzle-free leaf makes the `web/` build target structurally Drizzle-free (I44-L) and corrects the ownership direction so the domain, not the persistence layer, owns its vocabulary. Vocabulary migration status: `READINESS_GRADES` is retired (readiness is no longer a stored grade, D45-L), `ELICITATION_BACKLOG_STATUSES` is replaced by the `elicitation_gaps` disposition + predicate-shape enums (D65-L), and `READINESS_BANDS` stays. Depends on: D16-L, D52-L, D54-L, D62-L, D63-L, D64-L; I26-L. Supersedes: `db/schema.ts` owning the shared enum `const` arrays and the "enum literals flow outward from `db/schema.ts`" posture; the triplicated inline `['intent','oracle','design','plan']` plane literals. @@ -323,9 +323,9 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c | I42-L | Dev-only substrate never affects product/prod behavior: `src/dev/**` is build-excluded from `dist`; the introspection extension registers and advertises its query tools only when `BRUNCH_DEV` opts it in (default product sessions never register or advertise the tap, `/introspect`, `brunch_session_query`, `brunch_introspect_query`, or any `before_provider_request` observer); durable dev-loop artifacts land only under gitignored `.fixtures/scratch/`, never tracked `runs/` or the operating cwd; the only workspace-local dev cache is ephemeral `.brunch/debug/` output derived from the same passive capture / explicit Brunch-owned text `tool_result` events in `BRUNCH_DEV` real TUI launches; and Pi startup update suppression / any offline-default lift is save/restore-scoped through TUI launch, never a leaked global `process.env` mutation. | covered for the current DX substrate (`src/.pi/__tests__/introspection.test.ts` proves default-off registration + last-position ordering when enabled, active-tool advertisement of `brunch_session_query` / `brunch_introspect_query`, debug-cache mirroring from passive final-prompt capture, and Brunch-owned tool-result filtering/append formatting; `src/.pi/agents/state.test.ts` proves the injected dev tool set is unioned only before blocked-tool subtraction and registered-tool intersection; `src/.pi/extensions/session-query/index.test.ts` and `src/.pi/extensions/introspect-query/index.test.ts` cover read-only find/project/truncation behavior; `src/app/brunch-tui.test.ts` proves the real TUI launch path threads `BRUNCH_DEV` into introspection registration with launch-cwd debug-cache options, keeps the registrar last, asserts `tsconfig.build.json` excludes `src/dev`, and proves `PI_OFFLINE` startup update suppression plus prior `PI_OFFLINE` / `PI_SKIP_VERSION_CHECK` values are save/restore-scoped through `finally`; `src/dev/introspection-launcher.test.ts` proves scratch artifact routing is repo-rooted and independent of workspace cwd; `.fixtures/README.md` + `.gitignore` document/guard scratch). | D39-L, D40-L, D68-L, D69-L, D70-L, D71-L | | I43-L | The web client's accent presentation map is exhaustive over `NodePlane` (intent/oracle/design/plan); every plane renders with a defined accent, and node reference-code labels remain canonical via `NODE_KIND_METADATA` + `kindOrdinal` (no fallthrough default that silently swallows an unmapped plane). | met (compile-time `satisfies Record` exhaustiveness check on `PLANE_ACCENT` in `src/web/components/node-card.tsx`; breaks the build when a new `NodePlane` is added without an accent) | D72-L; I36-L | | I44-L | Domain enum taxonomy lives in the drizzle-free leaf `src/graph/schema/kinds.ts` (zero imports), `db/schema.ts` owns no enum `const` array (it imports them from the leaf), and the `web/` build target transitively contains no Drizzle/persistence code. The only sanctioned `db/`→`graph/` import is from `db/schema.ts` to `graph/schema/kinds.ts`. | covered (`src/graph/architecture.test.ts` guards leaf purity, db→graph import confinement, absence of enum const arrays in `db/schema.ts`, and post-`build:web` absence of `drizzle`/`sqliteTable` in the dist-web bundle; `src/db/README.md` and `src/graph/README.md` record the taxonomy leaf topology) | D52-L, D73-L; I26-L | -| I45-L | A session's assistant-visible watermark advances only when a continuity entry naming a strictly higher spec-local LSN is inserted: a boot/context seed or whole-spec overview snapshot, a `worldUpdate` for any write not already assistant-visible through another carrier (naming only items with LSN strictly greater than the pre-update watermark, I4-L), or the session's own graph-mutation `toolResult`. `worldUpdate` covers foreign writes **and** same-session writes that did not ride an own-mutation `toolResult` (e.g. submit-time / freestyle capture); such a same-session capture advances `current_lsn` and is surfaced by the next `worldUpdate`, never silently swallowed. A freshly seeded session whose seed named the current snapshot LSN does not immediately synthesize a redundant `worldUpdate`. Narrow `getNodes`/`queryNodes` reads do not advance the global watermark (they update per-entity read ledgers only). When `current_lsn == watermark` no `worldUpdate` is synthesized, and the session's own already-visible mutations never produce a `worldUpdate`. The watermark is its own projection over the carrier set (distinct from `runtimeState.world.latestLsn`), projected from transcript continuity entries (D43-L), never a stored field. | planned (turn-boundary-reconciliation slice; coverage-first scaffold) | D43-L, D76-L, D77-L; I1-L, I4-L | -| I46-L | Session origination never writes a fabricated user transcript entry. A new session inserts seed continuity entries and then an assistant-originated exchange before idling; a resumed session decides the kick from the **latest unresolved conversational debt**, computed by ignoring trailing continuity-only entries — any reconciler-inserted notice owing no assistant continuation: seed / `worldUpdate` / `brunch.mention*` / `brunch.session_lifecycle` / side-task & reviewer drains — whether inserted this boot or persisted by a prior boot — it originates a turn iff that debt owed assistant continuation (a user message or an incomplete exchange-tuple awaiting the assistant), and otherwise rests at an assistant/system-originated leaf (I13-L). The kick decision is idempotent across crash/reboot: trailing continuity notices neither mask an older unanswered debt nor manufacture a kick over a satisfied leaf. AUTO never originates a `freestyle` turn (D66-L); only an explicit `freestyle` pin yields a wait-for-user idle. | planned (kick+seeding slice; coverage-first scaffold) | D66-L, D78-L; R16; I13-L | -| I47-L | Continuity facts (seed/refresh, `worldUpdate`, `brunch.mention*`, `brunch.session_lifecycle`) persist only as Brunch custom transcript entries — never synthetic `toolCall`s, never prompt-only injection — so the D43-L projection can reconstruct them; boot/resume seeding is idempotent, deriving dedupe from projected transcript state (a seed/world-update already present is not re-emitted) rather than from hidden flags, and survives real restart/resume. The watermark must also survive compaction: the preserved-anchor set retains the latest watermark-carrier entry per spec so the projected global watermark never regresses after compaction+resume (which would otherwise spuriously re-emit `worldUpdate`). | planned (kick+seeding + turn-boundary-reconciliation slices; coverage-first scaffold) | D17-L, D37-L, D43-L, D76-L, D78-L | +| I45-L | A session's assistant-visible watermark advances only when a continuity entry naming a strictly higher spec-local LSN is inserted: a boot/context seed or whole-spec overview snapshot, a `worldUpdate` for any write not already assistant-visible through another carrier (naming only items with LSN strictly greater than the pre-update watermark, I4-L), or the session's own graph-mutation `toolResult`. `worldUpdate` covers foreign writes **and** same-session writes that did not ride an own-mutation `toolResult` (e.g. submit-time / freestyle capture); such a same-session capture advances `current_lsn` and is surfaced by the next `worldUpdate`, never silently swallowed. A freshly seeded session whose seed named the current snapshot LSN does not immediately synthesize a redundant `worldUpdate`. Narrow `getNodes`/`queryNodes` reads do not advance the global watermark (they update per-entity read ledgers only). When `current_lsn == watermark` no `worldUpdate` is synthesized, and the session's own already-visible mutations never produce a `worldUpdate`. The watermark is its own projection over the carrier set (distinct from `runtimeState.world.latestLsn`), projected from transcript continuity entries (D43-L), never a stored field. | covered (2026-06-11: all I45 Tier-2 scaffold rows run live through real `runBrunchTui` boot in `src/dev/tier-2-harness.test.ts`; the live `before_provider_request` guard delegates to `guardBeforeProviderRequest` retry semantics) | D43-L, D76-L, D77-L; I1-L, I4-L | +| I46-L | Session origination never writes a fabricated user transcript entry. A new session inserts seed continuity entries and then an assistant-originated exchange before idling; a resumed session decides the kick from the **latest unresolved conversational debt**, computed by ignoring trailing continuity-only entries — any reconciler-inserted notice owing no assistant continuation: seed / `worldUpdate` / `brunch.mention*` / `brunch.session_lifecycle` / side-task & reviewer drains — whether inserted this boot or persisted by a prior boot — it originates a turn iff that debt owed assistant continuation (a user message or an incomplete exchange-tuple awaiting the assistant), and otherwise rests at an assistant/system-originated leaf (I13-L). The kick decision is idempotent across crash/reboot: trailing continuity notices neither mask an older unanswered debt nor manufacture a kick over a satisfied leaf. AUTO never originates a `freestyle` turn (D66-L); only an explicit `freestyle` pin yields a wait-for-user idle. | covered (2026-06-11: new-session seed-then-kick plus all four resume rows run live through real boot/resume — pre-reconcile user-tail kick including after earlier completed exchanges, `request_*`/system leaves idle against the real result envelope (outcome is `answered`/`cancelled`/`unavailable` **key presence** per `projections/exchanges`, never a status string), crash-after-notice re-kick, drains neither manufacture nor mask debt; kick origin derives from projected transcript state, not entry counts) | D66-L, D78-L; R16; I13-L | +| I47-L | Continuity facts (seed/refresh, `worldUpdate`, `brunch.mention*`, `brunch.session_lifecycle`) persist only as Brunch custom transcript entries — never synthetic `toolCall`s, never prompt-only injection — so the D43-L projection can reconstruct them; boot/resume seeding is idempotent, deriving dedupe from projected transcript state (a seed/world-update already present is not re-emitted) rather than from hidden flags, and survives real restart/resume. The watermark must also survive compaction: the preserved-anchor set retains the latest watermark-carrier entry per spec so the projected global watermark never regresses after compaction+resume (which would otherwise spuriously re-emit `worldUpdate`). | covered (2026-06-11: boot/resume dedupe proven across an actual restart via `rebootTier2Runtime` — seed, kick, and `worldUpdate` non-duplicated, derived purely from transcript projection; compaction-anchor carrier preservation asserted at projection level; the Tier-2 scaffold has zero skipped/todo rows) | D17-L, D37-L, D43-L, D76-L, D78-L | | I48-L | Dev seeding never mutates an unintended workspace and never loads unrelated reusable seeds by ambient default: the seed path is target-workspace-scoped, selected by seed set/slug unless an all-seeds batch is explicitly requested, routes through `CommandExecutor`, and reports the destination `.brunch/data.db`; dev launch (`npm run dev`, with or without `--cwd`) observes existing workspace DB state but does not imply seeding. | partially validated — seed CLI now requires unambiguous `--workspace` + safe `--seed /` input, rejects malformed/unknown/duplicate flags before opening a workspace DB, writes only the named workspace DB through `seedFixture`/`CommandExecutor`, reports destination + selected seed ref mapping, and product RPC `workspace.selectionState` through `--cwd` proves seeded-vs-sibling workspace isolation; explicit all-seeds opt-in and full seed disposition catalog remain `dev-seed-fixtures` follow-up. | D70-L, D71-L, D79-L; I1-L, I11-L | ## Future Direction Register @@ -696,9 +696,9 @@ The first required probe is M0: after manual TUI interaction, a checker proves ` | I39-L | `graph-tool-resilience` CommandExecutor/adapter/context tests: counter rows allocate monotonic per-kind ordinals in multi-node batches, rollback does not persist failed ordinals/counter rows, DB constraints reject duplicate `(spec_id, plane, kind, kind_ordinal)`, projected-code metadata is unique and parses by longest prefix, existing-code refs resolve inside the selected spec, and prompt/tool renderers use codes as primary handles. Remaining proof: deletion/supersession no-reuse. | | I40-L | `graph-tool-resilience` CommandExecutor/adapter tests: `mutateGraph` applies one batch create-basis to all created nodes/edges, single-node `createNode` rejects retired basis values before LSN/counter/node/change-log allocation, `propose-graph` adapter commits use `implicit`, review-set translation uses `explicit`, retired `accepted_review_set` is rejected, and `change_log.operation` remains independent of basis. FE-807 adds direct structured text response capture with `basis: explicit`. FE-809 adds real project-graph review-cycle acceptance proof with explicit-basis readback under `.fixtures/runs/project-graph-review-cycle/2026-06-06-project-graph-review-cycle/`. | | I41-L | `graph-tool-resilience` CommandExecutor tests reject supersession cycles across existing edges, intra-batch edges, and mixed existing+batch edges, including rollback of batch nodes/edges/change_log; existing acyclic supersession paths still commit. | -| I45-L | Middle — watermark-projection property tests (own-write stamping vs foreign `worldUpdate`; strict-greater item set per I4-L; no-`worldUpdate` when `current==watermark`); **seed/full-overview snapshots advance the watermark while narrow `getNodes`/`queryNodes` reads do not**; **no redundant `worldUpdate` immediately after a seed that named the current snapshot LSN**; **same-session submit/capture write bumps `current_lsn` and is surfaced by the next `worldUpdate` (not swallowed)**; **a foreign write that lands between the snapshot read and seed insertion is not masked by the seed**; change-log-range fixtures driving a foreign writer (a second faux session or a direct `CommandExecutor` write) through the real boot. Inner — projection unit tests over synthetic transcript continuity entries. Authored coverage-first (skipped/`todo`) ahead of the `turn-boundary-reconciliation` slice. | -| I46-L | Middle — Tier-2 faux-turn-through-real-boot assertions: new session seeds-then-kicks before the first provider call; resumed-session kick decision classifies **latest unresolved conversational debt** (ignoring trailing continuity-only entries) and still fires when a user tail is followed by reconciler-inserted seed/staleness notices; **crash-after-notice-before-provider reboot still kicks when the underlying debt is an unanswered user/assistant turn** (idempotent re-boot); resumed-session kick stays silent when the latest debt already rests at a `request_*`/system leaf; no fabricated user entry in any path; AUTO never originates `freestyle`. Outer — manual walkthrough of opening-offer quality. Authored coverage-first (skipped/`todo`) ahead of the `kick+seeding` slice. | -| I47-L | Middle — restart/resume idempotence property tests (repeated boot does not duplicate seed/`worldUpdate`; dedupe derived from projection); **compaction+resume preserves the projected watermark and does not spuriously re-emit `worldUpdate`** (preserved-anchor set retains the latest watermark carrier); carrier-discipline source/architecture tests (continuity facts are custom entries, not synthetic `toolCall`s or prompt-only). Authored coverage-first (skipped/`todo`) ahead of the enabling slices. | +| I45-L | Middle — watermark-projection property tests (own-write stamping vs foreign `worldUpdate`; strict-greater item set per I4-L; no-`worldUpdate` when `current==watermark`); **seed/full-overview snapshots advance the watermark while narrow `getNodes`/`queryNodes` reads do not**; **no redundant `worldUpdate` immediately after a seed that named the current snapshot LSN**; **same-session submit/capture write bumps `current_lsn` and is surfaced by the next `worldUpdate` (not swallowed)**; **a foreign write that lands between the snapshot read and seed insertion is not masked by the seed**; change-log-range fixtures driving a foreign writer (a second faux session or a direct `CommandExecutor` write) through the real boot. Inner — projection unit tests over synthetic transcript continuity entries. **Live 2026-06-11** — the coverage-first scaffold is fully flipped; no skipped/`todo` rows remain. | +| I46-L | Middle — Tier-2 faux-turn-through-real-boot assertions: new session seeds-then-kicks before the first provider call; resumed-session kick decision classifies **latest unresolved conversational debt** (ignoring trailing continuity-only entries) and still fires when a user tail is followed by reconciler-inserted seed/staleness notices; **crash-after-notice-before-provider reboot still kicks when the underlying debt is an unanswered user/assistant turn** (idempotent re-boot); resumed-session kick stays silent when the latest debt already rests at a `request_*`/system leaf; no fabricated user entry in any path; AUTO never originates `freestyle`. Outer — manual walkthrough of opening-offer quality. **Live 2026-06-11** via `bootTier2RuntimeFromFixture` (real-boot-over-fixture resume chassis); the `request_*` idle proof uses fixtures built from the real result projections (key-presence envelope), not hand-built shapes. | +| I47-L | Middle — restart/resume idempotence property tests (repeated boot does not duplicate seed/`worldUpdate`; dedupe derived from projection); **compaction+resume preserves the projected watermark and does not spuriously re-emit `worldUpdate`** (preserved-anchor set retains the latest watermark carrier); carrier-discipline source/architecture tests (continuity facts are custom entries, not synthetic `toolCall`s or prompt-only). **Live 2026-06-11** via `rebootTier2Runtime` (actual restart over the same session file, Pi's deferred JSONL flushed first); the suite's sets-and-`{specId, lsn}` convention is enforced mechanically by a source scan banning golden matchers. | | I48-L | Inner — seed CLI contract tests for target workspace resolution, seed set/slug filtering, explicit all-seeds mode, `CommandExecutor`/change-log routing, and destination reporting. Middle — fresh workbench tracer: seed one named fixture into `.fixtures/workbenches//.brunch/data.db`, launch `npm run dev -- --cwd .fixtures/workbenches/` (or print/RPC equivalent), and assert selected workspace state plus graph overview come only from that workbench DB. | ### Design Notes @@ -727,6 +727,7 @@ The first required probe is M0: after manual TUI interaction, a checker proves ` | Reviewer finding precision (false positives/negatives) | Advisory-only reviewer can spam reconciliation needs (false positives) or miss real coherence gaps (false negatives); both erode trust. | Targeted adversarial briefs with known-bad coherence problems; precision/recall surfaced per run as fitness; user can dismiss reviewer findings without consequence. | Users systematically ignore reviewer findings, or coherence gaps slip past reviewer in known-bad fixtures. | | In-flight reviewer-signal UX | Chrome rendering of "reviewer running / has findings" before next-turn delivery is not yet designed; cost may exceed value in POC. | Probe oracle on chrome state after batch-accept; defer in-flight progress affordances unless a frontier explicitly demands them. | Users report confusion about whether reviewer ran or completed; or async job latency makes silence feel like failure. | | Meta-rubric usefulness (D31-L) | Universal evaluative dimensions (complexity, lock-in, etc.) may or may not be productive across lens types; this is an unproven hypothesis. | Comparative outer-loop walkthrough: same proposal scenario with and without meta-rubric framing; user judgment captured in probe metadata. | Meta-rubric framings are consistently ignored by users, or consistently produce better decisions — either signal warrants spec revision. | +| Live-vs-harness wiring divergence | Capabilities declared optional on dependency/context interfaces (with `?.` + fallback) let the production composition root silently omit wiring that every test harness supplies — the POC delivery question ("can the real entrypoints compose without the harness secretly supplying wiring?") inverted as a defect class. Four independent 2026-06-11 findings instantiated it: unwired live gap reads froze legality at a conservative floor; the mention/drain inputs were never threaded; the provider guard bypassed its retry helper; runtime switches recomputed tool posture from an empty register. | Load-bearing capabilities are **required** interface members (the compiler polices the composition root); intended-optional members carry explicit doc comments distinguishing them; Tier-2 real-boot assertions pin each live posture (gap legality, resume kick, guard retry); empty-register states fail loud through the documented config-bug throw rather than quiet fallbacks. | Another optional-hook fallthrough reaches a PR, or a new dependency interface accrues `?.`-consumed capability members without a real-boot oracle. | ### Acceptance Criteria diff --git a/memory/cards/tooling--runtime-state-commands.md b/memory/cards/tooling--runtime-state-commands.md deleted file mode 100644 index 248f45fd..00000000 --- a/memory/cards/tooling--runtime-state-commands.md +++ /dev/null @@ -1,136 +0,0 @@ -# Runtime state command operations - -Frontier: n/a — FE-845 branch concern; uses FE-847 Tier-2 harness substrate -Status: done -Mode: single -Created: 2026-06-11 - -## Orientation - -- Seam: Brunch TUI chrome + Pi extension commands over transcript-backed runtime state (`src/.pi/extensions/{commands,runtime,chrome}` → `src/session/runtime-state.ts` → `src/projections/session/*`). -- Nearby frontier: this is not yet a named `memory/PLAN.md` frontier; it is FE-845 branch work. It should build on the active lower-stack `dx-tier-2-harness` frontier (`ln/fe-847-dx-introspection-tier-2`) for real-boot/faux-turn proof rather than inventing a local fake harness. -- Volatile state: user declared Pi update suppression and new-session header recovery sufficiently done; keyboard shortcut lookup overlay and `src/.pi/components/tui-lab/` posture UI experiments are deferred for later implementation, not part of this card. -- TUI interaction model for this slice: go straight through namespaced slash commands as the first user-facing surface (`/brunch:strategy`, `/brunch:lens`, mode read/no-op). Use notifications/errors for feedback; do not introduce a custom selector/overlay yet. -- Main risk: Pi command invocation and footer render are TUI/extension-context shaped; the slice must prove the product entry path with Tier-2 real boot where feasible, not only by directly calling pure helpers. - -Posture: proving (inherited from FE-845 branch concern and adjacent `dx-tier-2-harness`). - -## Target Behavior - -A user-invoked Brunch slash command changes the session's active transcript-backed runtime posture before the next provider turn. - -## Full-card cold-start reads - -- `memory/SPEC.md` — D35-L, D40-L, D39-L, D58-L, D59-L, I25-L, I38-L, I42-L -- `memory/PLAN.md` — active `dx-tier-2-harness`; FE-847 single-branch Tier-2 context; no PLAN frontier currently names FE-845 chrome-pass work -- `src/.pi/extensions/README.md` — chrome/commands/runtime extension ownership and raw `ctx.ui.*` boundary rules -- `src/session/README.md` — runtime-state ownership and runtime affordance coverage ledger -- `src/dev/README.md` — Tier-2 real boot loop and faux-provider proof surfaces -- `docs/architecture/pi-faux-provider-pattern.md` — when faux provider assertions are the right oracle - -## Boundary Crossings - -```pseudo -/brunch:strategy or /brunch:lens - -> .pi/extensions/commands validates command args against runtime axis vocabulary - -> session/runtime-state appends brunch.agent_runtime_state reason=switch, source=user - -> projections/session/runtime-state resolves last-writer-wins posture - -> .pi/extensions/runtime applies active-tool/prompt posture before provider request - -> .pi/extensions/chrome footer reflects projected mode/strategy/lens - -> Tier-2 faux provider captures the next provider payload/active tools -``` - -`/brunch:mode` is in scope only to stop being a misleading stub: because `elicit` is currently the only legal op mode, it may report the current mode or accept an explicit no-op `elicit`, but it must not invent future execute-mode behavior. - -Custom TUI controls are explicitly out of scope. The experimental `src/.pi/components/tui-lab/` segment/chip components are promising for a follow-on posture picker or overlay, but this slice should not couple runtime authority to that exploratory UI. - -## Risks and Assumptions - -- RISK: The current Tier-2 helper may expose a real runtime but not a convenient command-execution helper. → MITIGATION: extend `src/dev/tier-2-harness.ts` narrowly to invoke a registered extension command through `runtime.session.extensionRunner.getCommand(...).handler(...)` or equivalent real extension-runner surface; do not build a parallel fake command runner. -- RISK: A command can append runtime state but the next provider turn may not observe it if policy/prompt hooks read stale state. → MITIGATION: assertion must include a second faux provider turn after the command and inspect the captured provider context/active tools or prompt manifest for the selected axis. -- RISK: Footer reflection is hard to assert through raw TUI rendering. → MITIGATION: keep a pure footer-line assertion via `projectBrunchChromeFooterLines` with projected `agentState`, and treat an optional TUI smoke as outer/manual rather than a blocking oracle. -- ASSUMPTION: FE-847 Tier-2 chassis is available on the lower stack and usable from this branch. - → IMPACT IF FALSE: this card should first land the minimal missing Tier-2 helper on the FE-847 seam or route back to `ln-scope`; do not regress to ad hoc local fakes. - → VALIDATE: run/extend `src/dev/tier-2-harness.test.ts` around one command-driven real-boot faux turn. - → memory/SPEC.md: D68-L, A25-L, I42-L. - -## Posture check - -This is a proving tracer bullet: - -- Proof of life: a namespaced Brunch TUI command mutates actual transcript-backed runtime state and the next faux provider call sees it. -- Invariants: D40-L stays intact — runtime state is a Pi JSONL fact, not hidden extension memory; foreground agent never emits posture switches; user/system authority appends `reason: "switch"`. -- Uncertainty retirement: validates that FE-847 Tier-2 is sufficient for FE-845 chrome/runtime operation checks without adding another test harness. - -## Acceptance Criteria - -```pseudo tree -runtime-state command path -├── command validation -│ ├── accepts strategy selections: auto + known strategy ids -│ ├── accepts lens selections: auto + known lens ids -│ └── rejects unknown axis values without appending a runtime-state entry -├── transcript authority -│ ├── appends brunch.agent_runtime_state with reason=switch and source=user -│ ├── preserves previous state in the switch entry -│ └── projects the new state as last-writer-wins after reload -├── product reflection -│ ├── next provider turn observes the selected posture through the real Brunch runtime hooks -│ └── footer projection renders selected mode/strategy/lens from projected runtime state -└── explicit non-scope - ├── update suppression/header recovery remain treated as already accomplished - ├── keyboard shortcut lookup overlay is documented as deferred, not implemented here - └── tui-lab posture picker/overlay is deferred to a following slice -``` - -✓ Command tests — `/brunch:strategy propose-graph`, `/brunch:strategy auto`, `/brunch:lens intent`, and `/brunch:lens auto` append valid switch entries; invalid values notify/fail without appending. -✓ Projection/reload test — appended switch entries survive Pi JSONL reload and `projectBrunchAgentState` returns the selected strategy/lens. -✓ Tier-2 faux test — a real `runBrunchTui` boot invokes a runtime switch command, runs a subsequent faux-provider turn, and captures provider/prompt/tool posture consistent with the switch. -✓ Chrome projection test — `projectBrunchChromeFooterLines` renders the projected strategy/lens from `agentState`, not stale launch-time `chrome.runtime` fallback. -✓ No hotkeys implementation — no shortcut lookup overlay is added in this slice; if touched docs need it, note it as deferred only. - -## Verification Approach - -- Inner: unit tests in `src/.pi/__tests__/` and/or `src/app/brunch-tui.test.ts` over command parsing, validation, append behavior, and footer projection. -- Middle: Tier-2 real-boot faux-provider test via `src/dev/tier-2-harness.ts` / `src/dev/tier-2-harness.test.ts` proving command → transcript state → next provider turn. -- Gate: `npm run verify` before commit; during iteration, run focused tests plus `npm run fix` after meaningful edits. - -## Cross-cutting obligations - -- Preserve D39-L sealed profile: no ambient Pi resources, no product behavior gated on dev-only introspection, no hidden state outside the explicit extension bundle. -- Preserve D40-L: runtime-state entries are transcript-backed facts; commands are user/system authority; no agent-emitted posture switches. -- Use existing runtime legality/policy tables; do not create a second command-local vocabulary or future `execute` mode. -- Use FE-847 Tier-2/faux-provider surfaces for product-path proof; do not add a third harness or shape implementation around an injected fake path that product never runs. - -## Expected touched paths (tentative) - -```pseudo tree -src/.pi/extensions/commands/ -├── index.ts ~ -└── runtime-switch-command.test.ts +? -src/.pi/__tests__/ -├── operational-mode.test.ts ~? -├── chrome.test.ts ~ -└── extension-registry.test.ts ~? -src/app/ -└── brunch-tui.test.ts ~? -src/dev/ -├── tier-2-harness.ts ~? -└── tier-2-harness.test.ts ~ -src/session/ -└── runtime-state.ts ~? -src/.pi/extensions/README.md ~? -src/session/README.md ~? -``` - -## Promotion checklist - -- [ ] Does this change a requirement? — No. -- [ ] Does this create, retire, or invalidate an assumption? — No; it validates use of Tier-2 as an oracle for this seam. -- [ ] Does this slice depend on an unvalidated high-impact assumption? — No; Tier-2 already exists enough to try, and the card names the fallback. -- [ ] Does this make or reverse a non-trivial design decision? — No; D40-L already says user/system posture switches append transcript facts. -- [ ] Does this establish a new seam-level invariant? — No; it exercises existing invariants. -- [ ] Does this change a frontier-level cross-cutting obligation or verification architecture layer? — No. -- [x] Does it cross more than two major seams? — Yes, intentionally; kept as a full card. -- [ ] Is this the first touch in an unfamiliar seam from a fresh thread? — No. -- [ ] Can you not name the containing seam or current rationale from the live docs? — No. diff --git a/src/dev/README.md b/src/dev/README.md index 9c7ccc6e..3029c62b 100644 --- a/src/dev/README.md +++ b/src/dev/README.md @@ -31,7 +31,7 @@ Product probes may import `src/probes/faux-provider.ts` when they need determini ## Tier-2 real boot loop (FE-847) -`runTier2RealBootFauxTurn()` is the real-boot harness for runtime choreography tests: it enters through `runBrunchTui`, drives one faux-provider turn, and exposes the captured provider context, active tool names, transcript entries, session file, and rendered transcript. `bootTier2RuntimeThroughRunBrunchTui()` owns real runtime boot proofs such as ready context and `BRUNCH_DEV`-gated query-tool registration. `resumeTier2Fixture()` writes a fixture JSONL transcript, reopens it through the workspace/session coordinator, and reports original vs resumed session-file state so restart/resume assertions do not need local fake boot helpers. +`runTier2RealBootFauxTurn()` is the real-boot harness for runtime choreography tests: it enters through `runBrunchTui`, drives one faux-provider turn, and exposes the captured provider context, active tool names, transcript entries, session file, and rendered transcript. `bootTier2RuntimeThroughRunBrunchTui()` owns real runtime boot proofs such as ready context and `BRUNCH_DEV`-gated query-tool registration. `resumeTier2Fixture()` writes a fixture JSONL transcript, reopens it through the workspace/session coordinator, and reports original vs resumed session-file state so restart/resume assertions do not need local fake boot helpers. `bootTier2RuntimeFromFixture()` is the resume-side real-boot chassis (pre-seed a fixture transcript, then boot the real runtime over it — the I46 resume-origination oracle), and `rebootTier2Runtime()` re-boots the real runtime over the same session file after flushing Pi's deferred JSONL (the I47 actual-restart idempotence oracle). The FE-847 coverage-first scaffold is fully live as of 2026-06-11 — no skipped/todo rows remain in `tier-2-harness.test.ts`. ## Proof ownership ledger From 593915200832f4d3219da5733e44167fac574a54 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Thu, 11 Jun 2026 15:51:51 +0200 Subject: [PATCH 32/32] Graduate two induct lenses into ln-review contract-integrity catalog Per user approval: the optional-hook live-wiring divergence lens (four findings this arc) and the dark-union-variant lens (the gap-predicate family) join the stabilized lens library with their cues, repairs, and graduation evidence. Co-Authored-By: Claude Fable 5 --- .agents/skills/ln-review/SKILL.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.agents/skills/ln-review/SKILL.md b/.agents/skills/ln-review/SKILL.md index 9c3591e3..0fc41aea 100644 --- a/.agents/skills/ln-review/SKILL.md +++ b/.agents/skills/ln-review/SKILL.md @@ -61,6 +61,8 @@ Concrete cues to look for: - Ordering or position encoded by a numeric index/splice rather than by structure. Repair: make the order declarative. - A type alias or name that implies a wider contract than it points at. Repair: point it at the real union, or rename. - A method-shaped read/cache surface added without the matching update path (RPC method, query key, publisher topic, client invalidator, and README ledger entry drifting apart). Repair: thread the method-shaped topic through the whole publish/invalidate path, and lock it with a narrow invalidation/publisher test. +- An **optional capability hook on a dependency/context interface** (`readonly x?: (` consumed via `?.` with a `??` fallback) where test harnesses supply the capability but the production composition root silently omits it → live behavior diverges from everything the suite proves, invisibly (graduated 2026-06-11 from four independent findings: unwired live gap reads froze legality at a conservative floor; mention/drain inputs never threaded; a provider guard bypassed its retry helper; runtime switches recomputed tool posture from an empty register). Repair: make load-bearing capabilities **required** members so the compiler polices the composition root; reserve optionality for documented ergonomics, and say which is which on the interface; pin each live posture with a real-boot oracle. Quiet defaults that "handle" the absence one layer down are the same fault relocated. See SPEC §Acknowledged Blind Spots "Live-vs-harness wiring divergence". +- A **tagged-union arm that is representable and boundary-accepted but has no semantics anywhere downstream** (validation checks kind membership only; derivation hits a default/zero branch) → a "dark variant" persists as permanently-inert data, silently (graduated 2026-06-11: `field`/`coverage` gap predicates were creatable but derived coverage 0 forever and could not be hand-answered). Repair: one exhaustive `never`-checked owner of per-arm semantics that both validation and derivation ride — every accepted arm gets an implementation or loud rejection at the boundary, and adding an arm without deciding its semantics fails to compile. Collect findings as numbered items (category: `contract`). Frame each as: the assumed contract in one sentence, the failure mode when it breaks, and which of the three repairs applies. Most are concrete fixes (`ln-scope`/`ln-build`); clusters across a seam route to `ln-refactor`.