fix(ai): tighten content-block types + add nightly schema-vs-type drift guard#558

Closed
sroussey wants to merge 7 commits into main from claude/beautiful-mayer-6k5pug

@sroussey

Collaborator

Summary

Follow-up hardening for PR #555. Tightens three runtime input types in @workglow/ai to eliminate (or document) drift between hand-written types and the JSON schemas they shadow, and adds a nightly type-drift guard so future schema or ContentBlock changes can't drift silently.

This PR targets main and is independent of PR #557 (which targets a different base).

Per-finding rationale

H-1 — AiChatTaskInput['prompt'] and ToolCallingTaskInput['prompt']

  • AiChatTask.ts previously inlined the four ContentBlock variants as a literal union. Replaced with string | readonly ContentBlock[] so the runtime type reuses ContentBlock directly and a change to ContentBlock flows through automatically. The schema's items are already ContentBlockSchema, so this is the canonical match.
  • ToolCallingTask.ts previously used string | (string | { type: "text" | "image" | "audio"; [x]: unknown })[] — a looser shape mirroring the JSON schema. Replaced with string | readonly (string | ContentBlock)[]. This is intentionally tighter than the schema (which still uses additionalProperties: true for UI flexibility); the drift test pins this divergence so a future schema broadening or ContentBlock change cannot drift silently.
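
A minimal sketch of the H-1 substitution for AiChatTask, to make the shape concrete. The variant names (text, image, tool_use, tool_result) come from this PR; every field inside the variants, plus `AiChatTaskInputSketch` and `promptToText`, is a hypothetical stand-in and not the actual @workglow/ai definition.

```typescript
// Hypothetical ContentBlock: variant names are from the PR, fields are assumed.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; data: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error: boolean | undefined };

// The proposed prompt type reuses ContentBlock instead of repeating the
// four variants inline, so a change to ContentBlock flows through.
interface AiChatTaskInputSketch {
  prompt: string | readonly ContentBlock[];
}

// Hypothetical consumer: collects the text of a prompt.
function promptToText(prompt: AiChatTaskInputSketch["prompt"]): string {
  if (typeof prompt === "string") return prompt;
  const parts: string[] = [];
  for (const block of prompt) {
    // Narrowing on the `type` discriminant gives access to `text`.
    if (block.type === "text") parts.push(block.text);
  }
  return parts.join("\n");
}
```

Note that the consumer narrows with `typeof prompt === "string"`, which works for readonly arrays; narrowing with `Array.isArray()` behaves differently.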

H-2 — ChunkRetrievalTaskInput

The schema's if/then/else (when query: string, model is required) is invisible to json-schema-to-ts / FromSchema, so the hand-written discriminated union is intentionally stricter than what FromSchema would resolve to. Renamed inputSchema to the exported ChunkRetrievalInputSchema (so the drift test can reference it) and added a JSDoc block above the type explaining the intentional divergence. No behavior change.
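
To illustrate the H-2 divergence, here is a sketch under stated assumptions. The schema fragment, the `number[]` alternative for `query`, and the names `FlatInput`, `StrictInput`, and `satisfiesConditional` are all hypothetical; only the rule "when query is a string, model is required" comes from this PR.

```typescript
// Hypothetical schema fragment with an if/then conditional.
const ChunkRetrievalInputSchemaSketch = {
  type: "object",
  properties: {
    query: { anyOf: [{ type: "string" }, { type: "array", items: { type: "number" } }] },
    model: { type: "string" },
  },
  required: ["query"],
  if: { properties: { query: { type: "string" } } },
  then: { required: ["model"] },
} as const;

// Roughly what a derivation that ignores if/then would produce.
type FlatInput = { query: string | number[]; model?: string };

// Hand-written union that encodes the conditional.
type StrictInput =
  | { query: string; model: string }
  | { query: number[]; model?: string };

// Runtime mirror of the if/then rule, usable as a type guard.
function satisfiesConditional(input: FlatInput): input is StrictInput {
  return typeof input.query !== "string" || typeof input.model === "string";
}

// Accepted by the flat type, but violates the conditional;
// annotating it as StrictInput would be a compile-time error.
const missingModel: FlatInput = { query: "hello" };
```

The stricter union rejects at compile time what the flat type only catches at validation time, which is why the two types are expected to stay unequal.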

H-3 — Drift guardrail

  • packages/ai/src/task/__tests__/types.test-d.ts — vitest typecheck-mode tests using built-in expectTypeOf. Three assertions:
    • AiChat: positive assertions that the prompt type accepts every ContentBlock variant (text, image, tool_use, tool_result) and the union itself, plus FromSchema<typeof AiChatInputSchema>['prompt'] resolves to a non-trivial type. This is the right drift signal — FromSchema's resolution differs from the runtime type in non-meaningful ways (mutable array vs readonly, is_error?: vs is_error: boolean | undefined), so structural equality is too brittle.
    • ToolCalling: not.toEqualTypeOf, which pins the intentional divergence. If a schema or ContentBlock change ever brings the two types into equality, the author should re-verify whether the tightening is still wanted.
    • ChunkRetrieval: not.toEqualTypeOf — pins the if/then/else divergence. The moment FromSchema honours if/then/else, this test fails and the runtime type can switch to a FromSchema-derived form.
  • .github/workflows/nightly-typecheck.yml — nightly + workflow_dispatch job that runs the drift test via bunx vitest run --typecheck --typecheck.only. Kept separate from test.yml so per-PR signal stays focused; the nightly catches drift the moment it lands on main.
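
The two assertion styles above can be sketched without any test runner. This is a dependency-free stand-in for vitest's expectTypeOf, not the contents of types.test-d.ts; `Equal`, `Assignable`, and all block shapes below are assumptions made for illustration.

```typescript
// Strict type identity check (a common community idiom).
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false;
// One-way assignability check.
type Assignable<From, To> = [From] extends [To] ? true : false;

// Hypothetical runtime-side types.
type TextBlock = { type: "text"; text: string };
type ToolResultBlock = { type: "tool_result"; content: string; is_error: boolean | undefined };
type RuntimePrompt = string | readonly (TextBlock | ToolResultBlock)[];

// Hypothetical schema-derived type: mutable array, optional is_error.
type SchemaToolResult = { type: "tool_result"; content: string; is_error?: boolean };
type SchemaPrompt = string | (TextBlock | SchemaToolResult)[];

// Positive assertions: each annotation only compiles if the check is `true`.
const acceptsText: Assignable<readonly TextBlock[], RuntimePrompt> = true;
const acceptsToolResult: Assignable<readonly ToolResultBlock[], RuntimePrompt> = true;

// Pinned divergence: this annotation only compiles while the types differ.
const structurallyEqual: Equal<RuntimePrompt, SchemaPrompt> = false;
```

If a later change made the two prompt types identical, the last line would stop compiling, which is the same signal a not.toEqualTypeOf assertion gives.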

A late commit (ci(ai): run drift guard in vitest typecheck mode) corrects the workflow command — .test-d.ts files are picked up by vitest's typecheck include pattern (*.{test,spec}-d.ts), not its default runtime include, so --typecheck --typecheck.only is required.

Test plan

  • bunx vitest run --typecheck --typecheck.only packages/ai/src/task/__tests__/types.test-d.ts — 3/3 tests pass
  • bun scripts/test.ts ai vitest — 36 files / 238 tests pass
  • bun run typecheck:budget reports OK (34 packages within budget); packages/ai instantiation count is flat vs baseline (the substitutions reuse already-resolved types, no extra instantiation)
  • Nightly workflow lands on main after merge; the next scheduled run (or a manual workflow_dispatch) verifies CI-side

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY


Generated by Claude Code

claude added 7 commits June 10, 2026 08:29
Replaces the hand-written `prompt:` literal-union mirror of the four
ContentBlock variants with `string | readonly ContentBlock[]`. The runtime
type now reuses ContentBlock directly, eliminating drift between the schema
items (ContentBlockSchema) and the input type.

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
Intentionally tighter than the JSON schema's prompt-array items, which use a
looser `{ type: "text" | "image" | "audio", additionalProperties: true }`
shape. The runtime type now reuses `ContentBlock` directly so callers get
the same discriminated union the chat tasks see. The nightly schema-vs-type
drift test (added separately) pins this divergence so a future schema
broadening or `ContentBlock` change cannot drift silently.

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
…if/then/else tightening

Renames the unexported `inputSchema` to `ChunkRetrievalInputSchema` and
exports it so the schema-vs-type drift test can reference it. Adds a JSDoc
block above `ChunkRetrievalTaskInput` explaining that the hand-written
discriminated union is intentionally stricter than `FromSchema`'s
resolution — it encodes the schema's `if/then/else` (when `query: string`,
`model` is required) which `json-schema-to-ts` ignores.

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
Adds `packages/ai/src/task/__tests__/types.test-d.ts` — a type-only test
that pins three drift relationships:

- `AiChatTaskInput['prompt']` round-trips with the schema (equality).
- `ToolCallingTaskInput['prompt']` is one-way assignable to the schema's
  prompt type (intentional ContentBlock tightening over the looser
  `additionalProperties: true` items).
- `ChunkRetrievalTaskInput` is one-way assignable to `FromSchema`'s
  resolution (intentional if/then/else tightening that `json-schema-to-ts`
  ignores).

Adds `.github/workflows/nightly-typecheck.yml`, a nightly + workflow_dispatch
job that runs the drift test via `bunx vitest run`. Kept separate from
`test.yml` so per-PR signal stays focused; the nightly catches drift the
moment it lands on main.

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
…rent types

Adjusts the three drift assertions added in 929dadb so they actually pass
under vitest's typecheck mode against the current shape of the runtime
types and FromSchema resolution:

- AiChat: the runtime prompt type uses `readonly ContentBlock[]` while
  FromSchema gives a mutable array with `is_error?:` (optional) vs
  ContentBlock's `is_error: boolean | undefined` (required-but-undefined).
  These differences are real, but the meaningful drift signal is "does
  the runtime type still accept every ContentBlock variant?". The test
  now uses positive assertions on each variant + the union and pins the
  schema's resolution to a non-trivial type.

- ToolCalling: keep the "intentionally divergent" assertion.
- ChunkRetrieval: keep the "intentionally stricter than FromSchema"
  assertion.

Run locally via `bunx vitest run --typecheck --typecheck.only
packages/ai/src/task/__tests__/types.test-d.ts`.

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
`.test-d.ts` files are picked up by vitest's typecheck include pattern,
not its default runtime include. Add `--typecheck --typecheck.only` so
the nightly job actually runs the assertions rather than reporting
"No test files found".

https://claude.ai/code/session_01562Z29a2UQDNBVAcJGyUoY
…-guard test

The H-1 substitutions in PR #558 caused two CI failures:

1. `build`: MessageConversion.ts:52,174 stopped compiling because the
   new `readonly` modifier on `prompt` broke `Array.isArray()` narrowing
   (its guard is declared as `arg is any[]`, which does not narrow a
   readonly array out of a union).

2. `typecheck-budget` — packages/ai went 92,750 → 152,539 instantiations
   (+64%), defeating PR #555's perf goal. ContentBlock plus FromSchema
   references in types.test-d.ts together blew the budget.

PR #555 verified the inline literals are byte-equal to FromSchema's
resolution, so the substitution was gratuitous tightening, not a safety
win. The schema-vs-type drift risk is still real but addressed by the
H-3 nightly drift guard.

Changes:
- Restore original inline `prompt` literal in AiChatTask.ts (ContentBlock
  import stays — used elsewhere).
- Restore original inline `prompt` literal in ToolCallingTask.ts; drop
  unused ContentBlock import.
- Simplify types.test-d.ts to assert
  expectTypeOf<X["prompt"]>().toEqualTypeOf<FromSchema<typeof S>["prompt"]>()
  for AiChat + ToolCalling (now true). Keep .not.toEqualTypeOf for
  ChunkRetrieval (intentional if/then/else divergence).
- Exclude src/**/*.test-d.ts from packages/ai/tsconfig.json so
  typecheck:budget doesn't measure FromSchema cost in the test file.
  Nightly workflow's --typecheck engine is unaffected.

MessageConversion.ts is unchanged — narrowing works again with the
restored mutable arrays.
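
For readers unfamiliar with the `Array.isArray()` issue named in the last commit, a small sketch follows. `Block`, `Prompt`, `isBlockArray`, and `countBlocks` are hypothetical names; this is not the code in MessageConversion.ts, and the custom guard is one possible workaround rather than what the PR did (the PR restored mutable arrays instead).

```typescript
// Hypothetical stand-ins for the prompt types discussed above.
type Block = { type: "text"; text: string };
type Prompt = string | readonly Block[];

// Array.isArray is declared as `isArray(arg: any): arg is any[]`.
// A readonly array is not assignable to any[], so after
// `if (Array.isArray(prompt)) {...} else {...}` the else branch
// still sees `string | readonly Block[]`, not just `string`.
// A guard whose predicate names the readonly type narrows both branches.
function isBlockArray(prompt: Prompt): prompt is readonly Block[] {
  return Array.isArray(prompt);
}

function countBlocks(prompt: Prompt): number {
  if (isBlockArray(prompt)) {
    return prompt.length; // prompt: readonly Block[]
  }
  // prompt: string, so string methods are available here.
  return prompt.trim() === "" ? 0 : 1;
}
```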

Collaborator Author

Closing in favour of PR #565. The README changes from this PR were already committed to main directly (bc0918d). The remaining applicable fixes (export ChunkRetrievalInputSchema, drift-guard test, tsconfig update, nightly CI workflow) are cherry-picked into #565 targeting main.


Generated by Claude Code

@sroussey sroussey closed this Jun 11, 2026
sroussey added a commit that referenced this pull request Jun 11, 2026
…ift guard (#565)

* fix(ai): export ChunkRetrievalInputSchema + add nightly schema-vs-type drift guard

Cherry-picks the applicable parts of PR #558 that target main:

- Export `ChunkRetrievalInputSchema` (renamed from module-private `inputSchema`)
  so the drift test can reference it; add JSDoc documenting the intentional
  if/then/else tightening over `FromSchema`.
- Add `packages/ai/src/task/__tests__/types.test-d.ts`: vitest typecheck-mode
  assertions that AiChat/ToolCalling prompt types match their `FromSchema`
  resolution and that ChunkRetrieval's discriminated union stays stricter.
- Exclude `*.test-d.ts` from `packages/ai/tsconfig.json` so the per-PR
  typecheck:budget gate stays fast.
- Add `.github/workflows/nightly-typecheck.yml`: nightly + workflow_dispatch
  job that runs the drift test via `--typecheck --typecheck.only`.

PRs #557 and #559 are not cherry-picked: they fix the binary-streaming
framework (CacheRef branding, BinaryStreamRouter backpressure) which lives
on a feature branch not yet merged to main.

https://claude.ai/code/session_013v3PWUAdtJBnWKLbzF8nfe

* chore: update bun.lock

https://claude.ai/code/session_013v3PWUAdtJBnWKLbzF8nfe

---------

Co-authored-by: Claude <noreply@anthropic.com>