perf: Cache findFakerForKeyName auto-discovery results #75

Merged: Saeris merged 1 commit into main from perf/find-faker-for-keyname-cache on Jun 3, 2026
@Saeris Saeris commented Jun 3, 2026

Owner

Summary

Closes the largest remaining hotspot from the perf report — findFakerForKeyName, which the report measured as 21.4% of CPU self-time and the single biggest contributor to the 1.4→1.5 string-mock regression (32× slower per the report's comparative profile). Pure-refactor optimization with no observable behavior change.

| Workload (vs published 1.5.3) | This branch | Speedup |
| --- | --- | --- |
| Auto-discovery-heavy schema (20 fields requiring full scan) | 0.198 ms/mock | 1.87× |
| Unknown-keys schema (10 fields, full scan + negative cache benefit) | 0.060 ms/mock | 2.10× |

What was slow

findFakerForKeyName is the last-resort discovery walk: when a string field's keyName doesn't match a direct keyNameGenerators entry, the function walks every section of faker, walks every method in every section, invokes each candidate to type-check its return value, and returns the first match. The report measured 13µs per call; with ~50 unmatched string keys per mock(messageSchema) × up to 16 retry-loop attempts in constrained-string cases, that's roughly 10 ms of pure scan time per top-level mock.
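
To make the cost concrete: 13 µs × 50 keys × 16 attempts = 10,400 µs, or about 10.4 ms. The walk described above could look roughly like the sketch below. This is not the library's actual code: `FakerLike` is a hypothetical stand-in for a faker instance, `discover` takes its name from the excerpt later in this description, and the matching rule (method name equals the lowercased keyName) is an assumption made for illustration.

```typescript
// Hypothetical stand-in for a faker instance: named sections of methods.
type FakerLike = Record<string, Record<string, unknown>>;
type DiscoveredFn = () => string;

// Uncached walk: visit every section and every method, invoke candidates to
// type-check their return value, and return the first usable one.
// Assumption: a "match" is a method whose name equals the lowercased keyName.
function discover(keyName: string, faker: FakerLike): DiscoveredFn | undefined {
  const lower = keyName.toLowerCase();
  for (const [, methods] of Object.entries(faker)) {
    for (const [name, candidate] of Object.entries(methods)) {
      if (typeof candidate !== "function") continue;
      if (name.toLowerCase() !== lower) continue;
      try {
        // The return type is only known by calling the candidate.
        if (typeof candidate() === "string") return candidate as DiscoveredFn;
      } catch {
        // A candidate that throws is unusable; keep scanning.
      }
    }
  }
  // No match: without a cache, the next call repeats the whole walk.
  return undefined;
}
```

Every call pays for both loops, and a keyName with no match pays the most because the walk never exits early.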

No caching meant every call paid the full cost — even for the same keyName, on the same faker instance, in the same Valimock instance. And there was no negative caching, so unknown keynames re-scanned the entire faker tree on every appearance.

What the fix does

A module-level WeakMap<Faker, Map<string, fn | null>> caches the discovery result. Both positive results (the discovered function) and negative results (no match, cached as null) are stored.

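const discoveryCache = new WeakMap<Faker, Map<string, DiscoveredFn | null>>();

// Inside findFakerForKeyName: `cache` is the inner Map for this faker
// instance and `lower` is the lowercased keyName.
if (cache.has(lower)) {
  // Cached hit or cached miss; a stored null is mapped back to undefined.
  return cache.get(lower) ?? undefined;
}
const discovered = discover(lower, faker);
cache.set(lower, discovered ?? null);
return discovered;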
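
For readers who want to try the pattern in isolation, here is a minimal self-contained sketch of the same two-level cache. It is an illustration under assumptions, not the merged implementation: `findCached` is a hypothetical name, the outer key is typed as `object` instead of `Faker`, and the discovery function is passed in as a parameter so the block runs without faker installed.

```typescript
type DiscoveredFn = () => string;

// Module-level cache: the outer WeakMap is keyed by faker instance, the inner
// Map by lowercased keyName. A stored `null` records a cached miss.
const discoveryCache = new WeakMap<object, Map<string, DiscoveredFn | null>>();

// Hypothetical wrapper; `discover` is injected to keep the sketch standalone.
function findCached<F extends object>(
  keyName: string,
  faker: F,
  discover: (lower: string, faker: F) => DiscoveredFn | undefined,
): DiscoveredFn | undefined {
  const lower = keyName.toLowerCase();

  // Fetch or create the inner Map for this faker instance.
  let cache = discoveryCache.get(faker);
  if (!cache) {
    cache = new Map();
    discoveryCache.set(faker, cache);
  }

  // has() separates "cached miss" (null) from "not yet scanned" (key absent).
  if (cache.has(lower)) return cache.get(lower) ?? undefined;

  const discovered = discover(lower, faker);
  cache.set(lower, discovered ?? null);
  return discovered;
}
```

Checking `cache.get(lower) !== undefined` alone would not be enough if misses were stored as `undefined`, which is why the sketch stores `null` and tests with `has()`.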

Cache design decisions

  • WeakMap by Faker instance — different fakers don't share state (correctness). Entries GC when a faker becomes unreachable (no leak).
  • Module-level, not per-Valimock-instance — the common case is callers reusing the default faker import across many Valimock constructions; module-level lets that cache survive.
  • Inner Map by lowercased keyName — lookups are case-insensitive, so firstName and FIRSTNAME hit the same entry.
  • null for negative cache + Map.has() to distinguish "cached miss" from "not yet scanned."
  • mockeryMapper path bypasses the cache — the deprecated extension point emits a per-call deprecation warning that callers should still see, and its output depends on user-supplied state.

Regression test

src/__tests__/findFakerForKeyNameCache.spec.ts uses a Proxy-instrumented faker stub to count Object.keys(faker) walks across repeated calls. Six tests pin:

  1. Repeat hits scan exactly once
  2. Negative caching: repeat misses scan exactly once
  3. Distinct keys each scan once, regardless of order
  4. Different faker instances maintain separate caches
  5. Case-insensitive cache: differently-cased keynames share an entry
  6. mockeryMapper path bypasses cache (deprecation warning fires every call)

Asserts on observed scan counts, not timing — robust against slow CI runners.
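
The counting technique can be sketched as follows. This is an assumed shape, not the contents of the spec file: `instrument` is a hypothetical helper, and it relies only on the standard behavior that `Object.keys()` on a Proxy invokes the `ownKeys` trap once per call.

```typescript
// Wrap a stub object so that every Object.keys() walk over it is counted.
function instrument<T extends object>(target: T): { proxy: T; scans: () => number } {
  let count = 0;
  const proxy = new Proxy(target, {
    // Object.keys(proxy) invokes this trap once per call.
    ownKeys(t) {
      count += 1;
      return Reflect.ownKeys(t);
    },
  });
  return { proxy, scans: () => count };
}

// Usage: two walks over the stub are observed as two scans.
const { proxy: stubFaker, scans } = instrument({ person: {}, internet: {} });
Object.keys(stubFaker);
Object.keys(stubFaker);
console.log(scans()); // 2
```

A test built this way can assert that the count stays at 1 after repeated lookups of the same keyName, independent of how fast the runner is.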

Verification

  • vp check --fix clean across 61 files
  • ✅ Wallaby reports 0 failing tests
  • ✅ Local bench confirms wins on both representative workloads
  • ⏳ CI on this branch

Stack relationship to #74

Both PRs branched off main and touch different concerns (#74 is dispatch + lookup tables; this PR is the auto-discovery cache). They'll merge in any order without conflict.

What's left from the report

This closes the largest remaining hotspot. One semantic item still deferred:

  • Union safeParse short-circuit for leaf-primitive options (~3% per the report). Worth its own focused PR because it's a behavior question, not a pure refactor — starting that work now.

Bump

patch — Bumpy file included.

🤖 Generated with Claude Code

The auto-discovery walk in `findFakerForKeyName` was 21.4% of CPU
self-time per the VALIMOCK_PERF_REPORT — it scanned every section of
faker and invoked each candidate method to type-check the return
value on every call, with no caching. For schemas with many fields
whose `keyName` doesn't match a `keyNameGenerators` entry, this
dominated string-heavy mock cost.

Cache the result (both positive and negative) in a module-level
`WeakMap<Faker, Map<string, DiscoveredFn | null>>`. The keying:

- **WeakMap by Faker** so different faker instances don't share cache
  state, and so entries become GC-eligible when a faker is unreachable.
  Module-level (not per-Valimock-instance) so the common case of
  callers reusing `faker` from `@faker-js/faker` shares the cache
  across `Valimock` construction.

- **Inner Map by lowercased keyName** because lookups are
  case-insensitive — `firstName` and `FIRSTNAME` resolve to the same fn.

- **Two-state values (`fn | null`)** to support negative caching.
  A keyName like `totallyRandomXyzKey` that doesn't resolve must
  not re-scan the entire faker tree on every subsequent string
  field with the same name. `Map.has()` distinguishes the cached
  miss (`null`) from "not yet scanned" (key absent).

The deprecated `mockeryMapper` path bypasses the cache: it emits a
per-call deprecation warning that callers should still see,
and its output depends on user-supplied state rather than a
deterministic property of `faker`.

Tests use a Proxy-instrumented faker stub to count `Object.keys(faker)`
walks across repeated calls, so we assert on cache behavior directly
rather than via flaky timing.

Bench against published 1.5.3 on representative workloads:
- 20-field schema hitting auto-discovery:   1.87x faster
- 10-field schema with unknown keys (full
  scan + negative cache):                   2.10x faster

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions Bot commented Jun 3, 2026

Contributor

bumpy-frog

The changes in this PR will be included in the next version bump.

patch Patch releases

  • valimock 1.5.3 → 1.5.4


@Saeris Saeris merged commit 0bd840d into main Jun 3, 2026
4 checks passed