Skip to content

feat(eval): ao eval outcomes ingest — Outcomes score → one verdict record (ag-hdqu0 #ingest-verdict)#603

Merged
boshu2 merged 1 commit into
mainfrom
feat/ag-hdqu0.2-outcomes-ingest
May 29, 2026
Merged

feat(eval): ao eval outcomes ingest — Outcomes score → one verdict record (ag-hdqu0 #ingest-verdict)#603
boshu2 merged 1 commit into
mainfrom
feat/ag-hdqu0.2-outcomes-ingest

Conversation

@boshu2
Copy link
Copy Markdown
Owner

@boshu2 boshu2 commented May 29, 2026

What

Adds ao eval outcomes ingest <score.json> — the third slice of ag-hdqu0. Maps an Outcomes grader score (aggregate + per-criterion) onto the one council verdict record (skills/council/schemas/verdict.json: PASS/WARN/FAIL + satisfaction_score + satisfaction_breakdown), closing the Outcomes → Knowledge Flywheel loop without forking the verdict format. Pairs with ao eval outcomes compile (#601).

Verdict bands: PASS ≥ threshold · FAIL < 70% of threshold · WARN between.

One-shot surface landing (compounding from #601/ag-lkxx)

ingest is a ##### leaf, so heading counts are unchanged — only an allowlist entry was needed. Ran scripts/test-agentops-contract-canaries.sh locally before pushing: cli-command-surface-matrix verdict=pass aggregate=1. No CI canary round-trips.

Tests (TDD)

  • TestIngestOutcomesScore_ProducesVerdictRecord — PASS verdict, satisfaction_score=aggregate, breakdown=criterion scores, schema_version 4, non-nil findings.
  • TestIngestOutcomesScore_VerdictBands — PASS/WARN/FAIL banding.
  • go test ./cmd/ao green, vet/build clean.

Closes-scenario: ag-hdqu0.2#ingest-verdict
Bounded-context: BC1-Corpus
Evidence: cli/cmd/ao/eval_outcomes_ingest_test.go

…cord (ag-hdqu0 #ingest-verdict)

Adds 'ao eval outcomes ingest <score.json>': maps an Outcomes grader score
(aggregate + per-criterion) onto the council verdict.json shape (PASS/WARN/FAIL +
satisfaction_score + satisfaction_breakdown), closing the Outcomes -> Knowledge
Flywheel loop without forking the verdict format. Bands: PASS >= threshold, FAIL
< 70% of threshold, WARN between.

Surface recipe applied PRE-EMPTIVELY (learned from #601/ag-lkxx): new #### #####
leaf needs only an allowlist entry (heading counts unchanged since it's a
5-level command); ran test-agentops-contract-canaries.sh locally —
cli-command-surface-matrix verdict=pass. One-shot, no CI canary dance.

TDD: TestIngestOutcomesScore_ProducesVerdictRecord + _VerdictBands. go test/vet/build green.

Closes-scenario: ag-hdqu0.2#ingest-verdict
Bounded-context: BC1-Corpus
Evidence: cli/cmd/ao/eval_outcomes_ingest_test.go
@github-actions github-actions Bot added the cli label May 29, 2026
@boshu2 boshu2 merged commit 4ad5ea4 into main May 29, 2026
14 checks passed
@boshu2 boshu2 deleted the feat/ag-hdqu0.2-outcomes-ingest branch May 29, 2026 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant