feat: Add instrumentation hook + perf-regression canary spec #77

Merged: Saeris merged 1 commit into main from perf/regression-instrumentation on Jun 3, 2026

Saeris (Owner) commented Jun 3, 2026

Summary

A catastrophic-regression canary that catches eager-recursion class bugs deterministically — no wall-clock assertions, no CI flake. This is concern #1 from the three-part perf-testing architecture discussion: catch the bugs that should never ship.

The v1.5.0 wrapper-recursion bug went out because nothing in the test suite failed when mock(messageSchema) went from 5.6ms to 26 seconds. Property tests still passed — they just got slow. This PR makes that mode of regression impossible to ship undetected.

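| Workload                  | Healthy | Simulated bug | Spec ceiling | Catches?       |
| ------------------------- | ------: | ------------: | -----------: | -------------- |
| `messageLikeSchema` calls |      84 |       177,252 |      < 2,000 | ✅ ~88× margin |
| `messageLikeSchema` depth |       6 |         4,200 |         < 50 | ✅ ~84× margin |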

What's new

Opt-in instrumentation on Valimock

const m = new Valimock({ instrument: true });
m.mock(schema);
console.log(m.instrumentation); // { mockCalls: 87, maxDepth: 6 }
m.resetInstrumentation();
  • mockCalls — total #mock invocations since the last reset.
  • maxDepth — peak recursion depth.

Off by default, so production users pay zero overhead. When enabled, it adds one branch and two increments per #mock call. The new ValimockInstrumentation type is exported alongside.
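For readers who want a feel for the mechanism, here is a minimal sketch of opt-in call and depth counters on a recursive walker. It is not Valimock's source: `CountingWalker`, `ToyNode`, and `InstrumentationSketch` are hypothetical names, and the only things borrowed from this PR are the `instrument` option, the `instrumentation` accessor, `resetInstrumentation()`, and the two counter names.

```ts
// Hypothetical stand-in for the counters described above; not Valimock's implementation.
interface InstrumentationSketch {
  mockCalls: number;
  maxDepth: number;
}

// A toy recursive structure playing the role of a schema.
interface ToyNode {
  children: ToyNode[];
}

class CountingWalker {
  private readonly instrument: boolean;
  private depth = 0;
  private stats: InstrumentationSketch = { mockCalls: 0, maxDepth: 0 };

  constructor(options: { instrument?: boolean } = {}) {
    this.instrument = options.instrument ?? false;
  }

  // Return a copy so callers cannot mutate the live counters.
  get instrumentation(): InstrumentationSketch {
    return { ...this.stats };
  }

  resetInstrumentation(): void {
    this.stats = { mockCalls: 0, maxDepth: 0 };
  }

  walk(node: ToyNode): void {
    if (!this.instrument) {
      // Uninstrumented path: no counter work at all.
      this.visitChildren(node);
      return;
    }
    this.depth += 1;
    this.stats.mockCalls += 1;
    this.stats.maxDepth = Math.max(this.stats.maxDepth, this.depth);
    try {
      this.visitChildren(node);
    } finally {
      // Always unwind so an exception cannot leave the depth skewed.
      this.depth -= 1;
    }
  }

  private visitChildren(node: ToyNode): void {
    for (const child of node.children) this.walk(child);
  }
}

// root -> [a -> [b], c]: four nodes, deepest path has three levels.
const tree: ToyNode = {
  children: [{ children: [{ children: [] }] }, { children: [] }],
};

const instrumented = new CountingWalker({ instrument: true });
instrumented.walk(tree);
const firstReading = instrumented.instrumentation; // { mockCalls: 4, maxDepth: 3 }

const plain = new CountingWalker();
plain.walk(tree);
const plainReading = plain.instrumentation; // counters stay at zero when off
```

Tracking depth with an increment on entry and a decrement in `finally` keeps the peak reading honest even if a handler throws partway through a walk.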

Realistic-shape fixture library

src/__benchmarks__/fixtures/ — four schemas modeling recurring patterns from discordkit's codebase, the library that drove the original v1.5.0 regression investigation:

  • messageLikeSchema: recursive lazy through nullish (the v1.5.0 bug's worst case; 26s per mock at v1.5.2)
  • channelLikeSchema: intersect([common, variant(...)]) discriminated unions (the v1.5.1 deepMerge discriminator-drop pattern)
  • applicationLikeSchema: wide-flat object with many string fields, no recursion (control case; explosions here would point at the string pipeline, not a wrapper)
  • interactionLikeSchema: multi-v.lazy() references with cross-schema dependencies

Not exported from the public API — internal to the test/bench surface. Designed to be reused by both this PR's regression specs and the future Vitest bench() files (PR #2 of this trilogy).

Regression spec

src/__tests__/perfRegression.spec.ts asserts call-count and depth ceilings on each fixture, averaging over 20 trials to smooth the random-roll variance from wrapper handlers (a single roll can vary by ~3× depending on how many nullish branches happen to pick the wrapped option).

Each fixture has documented baseline measurements and a ceiling at ~4-5× the max baseline. That bandwidth tolerates natural variance while catching catastrophic regressions by orders of magnitude.
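A rough sketch of the trial-aggregation idea, assuming each trial yields one `{ mockCalls, maxDepth }` reading. `summarizeTrials`, `exceedsCeilings`, and `healthyTrial` are hypothetical helpers, not code from the spec file. The 84/6 and 177,252/4,200 readings and the 2,000/50 ceilings come from this PR; the 60/4 and 150/8 readings are invented to imitate roll-to-roll variance.

```ts
interface TrialStats {
  mockCalls: number;
  maxDepth: number;
}

interface TrialSummary {
  meanCalls: number;
  maxCalls: number;
  maxDepth: number;
}

// Run `trials` independent mocks and keep the mean plus the worst case seen.
function summarizeTrials(
  runTrial: (trial: number) => TrialStats,
  trials = 20,
): TrialSummary {
  let totalCalls = 0;
  let maxCalls = 0;
  let maxDepth = 0;
  for (let trial = 0; trial < trials; trial += 1) {
    const stats = runTrial(trial);
    totalCalls += stats.mockCalls;
    maxCalls = Math.max(maxCalls, stats.mockCalls);
    maxDepth = Math.max(maxDepth, stats.maxDepth);
  }
  return { meanCalls: totalCalls / trials, maxCalls, maxDepth };
}

// True when either worst-case reading reaches its ceiling.
function exceedsCeilings(
  summary: TrialSummary,
  ceilings: { calls: number; depth: number },
): boolean {
  return summary.maxCalls >= ceilings.calls || summary.maxDepth >= ceilings.depth;
}

// Deterministic stand-in for random rolls (only 84/6 is a figure from the PR).
const healthyTrial = (trial: number): TrialStats => {
  const roll = trial % 3;
  if (roll === 0) return { mockCalls: 60, maxDepth: 4 };
  if (roll === 1) return { mockCalls: 84, maxDepth: 6 };
  return { mockCalls: 150, maxDepth: 8 };
};

const ceilings = { calls: 2000, depth: 50 };
const healthySummary = summarizeTrials(healthyTrial);
const buggySummary = summarizeTrials(() => ({ mockCalls: 177252, maxDepth: 4200 }));

const healthyFails = exceedsCeilings(healthySummary, ceilings); // false
const buggyFails = exceedsCeilings(buggySummary, ceilings); // true
```

In a real spec, `runTrial` would presumably reset the instrumentation, mock the fixture once, and return the counters; the ceiling check would become an `expect(...).toBeLessThan(...)` assertion.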

Why call counts, not wall-clock

Wall-clock budgets would have caught the v1.5.0 bug but flake on CI runners under load. Call counts are deterministic properties of the algorithm and shift by orders of magnitude when the bug fires. The regression class we're protecting against is catastrophic by nature — the test doesn't need to be precise, it needs to be reliable.

Verification that the canary works

Locally simulated the v1.5.0 wrapper-recursion bug by registering customMocks.nullish with the eager evaluation pattern (arrayElement([this.mock(wrapped), null, undefined]) — wrapped is always evaluated). Results:

== Healthy (lazy nullish) ==
  mockCalls: 84, maxDepth: 6

== Simulated eager-nullish bug ==
  mockCalls: 177252, maxDepth: 4200
  spec would catch this: YES

The spec's expect(stats.maxCalls).toBeLessThan(2000) fails on the simulated bug, which inflates the call count ~2,100× (84 → 177,252) and lands ~88× over the ceiling. That is comfortable detection headroom: the bug that shipped in v1.5.0 would have been caught at the PR stage.
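To illustrate why eager evaluation of the wrapped branch blows up call counts, here is a toy model rather than Valimock's handler. `ToySchema`, `toyMessage`, and `countCalls` are hypothetical; a thunk stands in for v.lazy(), a fixed rule (`depth < 3`) replaces the random roll so the numbers are reproducible, and `depthCap` is an assumed guard that stops the eager variant from recursing forever.

```ts
// Toy schema AST; not Valimock's or valibot's real types.
type ToySchema =
  | { kind: "string" }
  | { kind: "object"; fields: ToySchema[] }
  | { kind: "nullish"; wrapped: () => ToySchema };

// A message-like shape with two optional self-references.
const toyMessage: ToySchema = {
  kind: "object",
  fields: [
    { kind: "string" },
    { kind: "nullish", wrapped: () => toyMessage },
    { kind: "nullish", wrapped: () => toyMessage },
  ],
};

function countCalls(
  schema: ToySchema,
  mode: "lazy" | "eager",
  depthCap = 16,
): number {
  let calls = 0;

  function mock(node: ToySchema, depth: number): unknown {
    calls += 1;
    switch (node.kind) {
      case "string":
        return "text";
      case "object": {
        const values: unknown[] = [];
        for (const field of node.fields) values.push(mock(field, depth + 1));
        return values;
      }
      case "nullish": {
        // Deterministic stand-in for the random roll: wrap only near the root.
        const chooseWrapped = depth < 3;
        if (mode === "eager") {
          // Bug pattern: the wrapped value is built before the choice is made,
          // so every nullish node recurses whether or not its value is used.
          const value = depth < depthCap ? mock(node.wrapped(), depth + 1) : null;
          return chooseWrapped ? value : null;
        }
        // Lazy pattern: only the chosen branch is ever evaluated.
        return chooseWrapped ? mock(node.wrapped(), depth + 1) : null;
      }
    }
  }

  mock(schema, 0);
  return calls;
}

const lazyCalls = countCalls(toyMessage, "lazy"); // 12
const eagerCalls = countCalls(toyMessage, "eager"); // 2044
```

With two self-references per object, the eager variant roughly doubles its work at every recursion level until the cap, while the lazy variant's cost depends only on the branches actually chosen. That gap is what a call-count ceiling detects without timing anything.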

What's deferred (to follow-up PRs)

Two follow-ups from the architecture discussion that share the fixture library:

  1. Vitest bench() files (concern #2: is this change actually faster?). The fixtures land in this PR so the bench files can reuse them directly. Starting work on this immediately after this PR opens.
  2. scripts/profile.mjs + vp run profile convenience wrapper for CPU profiling under node --cpu-prof (concern #3: where should I look to make this faster?). Optional QoL.

Verification

  • vp check --fix clean across 67 files
  • ✅ Wallaby reports 0 failing tests, 80.42% coverage
  • ✅ Canary locally proven to catch the v1.5.0 wrapper-recursion bug
  • ⏳ CI on this branch

Bump

minor — adds public API surface (instrument option, instrumentation accessor, resetInstrumentation() method, ValimockInstrumentation type export). Bumpy file included.

🤖 Generated with Claude Code

A catastrophic-regression canary that catches eager-recursion class bugs
deterministically — no wall-clock assertions, no CI flake.

## What's new

**Opt-in instrumentation on `Valimock`.** Pass `instrument: true` to the
constructor to enable per-call counters exposed via
`valimock.instrumentation`:

```ts
const m = new Valimock({ instrument: true });
m.mock(schema);
console.log(m.instrumentation); // { mockCalls: 87, maxDepth: 6 }
m.resetInstrumentation();
```

- `mockCalls` — total `#mock` invocations since the last reset.
- `maxDepth` — peak recursion depth.

Off by default; production users pay no overhead. Adds one branch + two
increments per `#mock` call when on. The new `ValimockInstrumentation`
type is exported.

**Realistic-shape fixture library** in `src/__benchmarks__/fixtures/`.
Each fixture models a recurring shape from the discordkit codebase, the
library that drove the original v1.5.0 regression investigation:

- `messageLikeSchema` — recursive lazy through nullish (the v1.5.0
  bug's worst case)
- `channelLikeSchema` — intersect-with-variant discriminated unions
- `applicationLikeSchema` — wide-flat object, no recursion (control)
- `interactionLikeSchema` — multi-lazy union with cross-references

Not exported from the public API.

**Regression spec** `src/__tests__/perfRegression.spec.ts` asserts call-
count and depth ceilings on each fixture. Averaged over 20 trials to
smooth random-roll variance.

## Why call counts, not wall-clock

The v1.5.0 wrapper-recursion bug went out because nothing in the test
suite *failed* when `mock(messageSchema)` went from 5.6ms to 26 seconds.
Property tests still passed — they just got slow.

Wall-clock budgets would have caught it but flake on CI runners under
load. Call counts are deterministic properties of the algorithm and
shift by orders of magnitude when the bug fires.

Locally verified the canary catches the bug at the simulation level:

| Workload                  | Healthy | Simulated eager bug |
| ------------------------- | ------: | ------------------: |
| `messageLikeSchema` calls |      84 |            177,252  |
| `messageLikeSchema` depth |       6 |              4,200  |

Spec ceiling `maxCalls < 2000` fails on the simulated bug, which inflates
the call count ~2,100x (84 to 177,252), about 88x over the ceiling.
Comfortable detection headroom.

## What's deferred

Two follow-ups from the architecture discussion:

- **Vitest `bench()` files** for iteration-time perf comparison
  (concern #2: "is this change actually faster?"). The fixture library
  lands now so the bench files can reuse it directly when added.
- **`scripts/profile.mjs` + `vp run profile`** convenience wrapper for
  CPU profiling under `node --cpu-prof` (concern #3: "where should I
  look to make this faster?"). Optional QoL; skip unless wanted.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions (bot, Contributor) commented Jun 3, 2026

bumpy-frog

The changes in this PR will be included in the next version bump.

Minor releases

  • valimock 1.5.4 → 1.6.0

This comment is maintained by bumpy.

Saeris merged commit c362514 into main on Jun 3, 2026 (4 checks passed)
Saeris deleted the perf/regression-instrumentation branch on June 3, 2026 21:17