fix(server,cli): server recomputes content_hash on download to absorb FilterConfig schema drift #351
Open
… FilterConfig schema drift

Closes #350.

`tokf install <hash>` was failing with "filter hash mismatch — the server may have returned tampered content" for filters published before recent `FilterConfig` schema additions (e.g. `inject_path`, added 2026-03-07 in 2fa1e50). Each new field with `#[serde(default)]` silently changes the output of `canonical_hash` for every same-TOML filter that doesn't reference the new field, breaking the hash that was stored at publish time.

Investigation ruled out client-side reconstruction strategies: stripping type-default fields from the JSON, stripping known-since-initial fields, and emitting canonical TOML via `toml::to_string(toml::Value)`. None of these reproduce the URL hash for the two reported filters. The original shape can't be recovered from the current binary.

The server is the trust authority: on `GET /api/filters/<hash>/download`, it now parses the stored TOML and recomputes `canonical_hash` with the current binary, returning it as `content_hash` in the response. The URL hash becomes a stable lookup key; the recomputed `content_hash` is the authoritative identity under the current schema.

The client trusts the server's `content_hash`, but still hashes the wire bytes and asserts they match, preserving wire-tampering detection between server and client. When the user-requested URL hash differs from the recomputed `content_hash`, the client emits a one-line stderr note explaining the schema drift; the install proceeds under the recomputed identity.

Old servers that don't yet emit `content_hash` fall through to the historical URL-hash check (graceful degradation); behaviour matches today against the upgrade matrix.

Long-term: define an explicit, version-tagged canonical-TOML hash so future schema additions don't silently invalidate stored identities. Tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
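To make the drift mechanism concrete, here is a minimal sketch of how a defaulted field can change a hash derived from the full struct shape. Everything in it is an assumption for illustration: the two struct definitions, the field `strip_ansi`, the hand-written canonical strings, and the use of the standard library's `DefaultHasher` in place of whatever `canonical_hash` actually does. It only shows the general failure mode, not tokf's real implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical schema as it looked when the filter was published.
struct FilterConfigOld {
    command: String,
    strip_ansi: bool,
}

// Hypothetical schema after a defaulted field was added.
struct FilterConfigNew {
    command: String,
    strip_ansi: bool,
    inject_path: bool, // stands in for a field marked `#[serde(default)]`
}

// Toy stand-in for a cryptographic hash over a serialized struct.
fn toy_hash(serialized: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    serialized.hash(&mut hasher);
    hasher.finish()
}

// Both functions serialize every field, including ones the TOML never mentioned.
fn hash_old(cfg: &FilterConfigOld) -> u64 {
    toy_hash(&format!("command={};strip_ansi={}", cfg.command, cfg.strip_ansi))
}

fn hash_new(cfg: &FilterConfigNew) -> u64 {
    toy_hash(&format!(
        "command={};inject_path={};strip_ansi={}",
        cfg.command, cfg.inject_path, cfg.strip_ansi
    ))
}

// Same logical filter, parsed under the two schemas.
fn drift_demo() -> (u64, u64) {
    let old = FilterConfigOld { command: "git status".to_string(), strip_ansi: true };
    let new = FilterConfigNew {
        command: "git status".to_string(),
        strip_ansi: true,
        inject_path: false, // default value, absent from the source TOML
    };
    (hash_old(&old), hash_new(&new))
}

fn main() {
    let (published, recomputed) = drift_demo();
    println!("hash at publish time: {published:016x}");
    println!("hash under new schema: {recomputed:016x}");
    println!("identities match: {}", published == recomputed);
}
```

The source TOML is identical in both cases; only the set of serialized fields differs, which is enough to change the derived identity.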
Contributor
Filter Verification Report

Changed Filters
No filter files changed in this PR.

All Filters Summary
✅ 143/143 test cases passed across 51 filters
This was referenced Apr 28, 2026
Closes #350.
Summary
- `GET /api/filters/<hash>/download` now parses the stored TOML and recomputes `canonical_hash` with the current binary, returning it as `content_hash` in the response.
- The client treats `content_hash` as the authoritative identity, while still hashing the wire bytes to detect tampering between server and client. When the URL hash differs from the recomputed `content_hash` (i.e. the filter was published under an older schema), a one-line stderr note explains the drift and the install proceeds under the new identity.
- Old servers that don't emit `content_hash` trigger the historical URL-hash check on the client (graceful degradation; no regression).

Why this approach
The bug-report filters (`0585b874…`, `d2a19dc4…`) cannot be repaired client-only: the URL hash was produced by an older `FilterConfig` schema whose exact field set we can't reconstruct from the current binary. I exhausted client-side reconstruction strategies (strip type-defaults, strip-since-initial, canonical TOML via `toml::to_string` on `toml::Value`, raw-byte hash) and none reproduce the stored hash.

Switching to "server is the trust" works because:
The URL hash effectively becomes a stable lookup key; the recomputed hash is the content identity under the current schema. The architecture stays simple, no DB migration is needed, and old clients keep working unchanged against new servers.
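The client-side decision described above can be sketched roughly as follows. This is not the PR's `verify_and_resolve_hash`; the function name, signature, and both enums are hypothetical, and the hash the client computes from the wire bytes is passed in as a plain string so that the sketch stays independent of any hashing crate.

```rust
// Hypothetical outcome of a successful check.
#[derive(Debug, PartialEq)]
enum Resolution {
    // The server's `content_hash` equals the hash the user asked for.
    Exact(String),
    // The server recomputed a different identity (schema drift).
    Drifted { requested: String, resolved: String },
    // An old server sent no `content_hash`; the URL hash was checked instead.
    LegacyServer(String),
}

// Hypothetical failure: the bytes received do not hash to the expected value.
#[derive(Debug, PartialEq)]
enum VerifyError {
    WireMismatch { expected: String, computed: String },
}

// `url_hash`: hash the user requested.
// `content_hash`: value from the server response, if present.
// `computed`: hash the client derived from the downloaded content.
fn verify_and_resolve_hash_sketch(
    url_hash: &str,
    content_hash: Option<&str>,
    computed: &str,
) -> Result<Resolution, VerifyError> {
    // New servers supply the expected hash; old servers leave only the URL hash.
    let expected = content_hash.unwrap_or(url_hash);
    if computed != expected {
        return Err(VerifyError::WireMismatch {
            expected: expected.to_string(),
            computed: computed.to_string(),
        });
    }
    Ok(match content_hash {
        None => Resolution::LegacyServer(url_hash.to_string()),
        Some(server_hash) if server_hash == url_hash => Resolution::Exact(server_hash.to_string()),
        Some(server_hash) => Resolution::Drifted {
            requested: url_hash.to_string(),
            resolved: server_hash.to_string(),
        },
    })
}

fn main() {
    // Placeholder hashes, not real filter identities.
    match verify_and_resolve_hash_sketch("aaa111", Some("bbb222"), "bbb222") {
        Ok(Resolution::Drifted { requested, resolved }) => {
            eprintln!("note: {requested} was published under an older schema; installing as {resolved}");
        }
        Ok(other) => println!("verified: {other:?}"),
        Err(err) => eprintln!("error: {err:?}"),
    }
}
```

In this shape, tampering on the wire is still caught in both server generations, because the computed hash must match either the server's `content_hash` or, failing that, the URL hash.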
Long-term direction (not in this PR)
`canonical_hash` is fundamentally fragile because it's tied to the in-memory shape of `FilterConfig`. The proper fix is an explicit, version-tagged canonical-TOML hash decoupled from struct evolution (e.g. `v1:<sha256>` over a deterministic TOML emission with sorted keys / stripped comments). Tracking as a follow-up issue.

Test plan
- `cargo fmt -- --check` clean
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- `cargo test --workspace`: 2184 passing (10 new: 3 server unit, 2 client deserialize, 5 client `verify_and_resolve_hash` branches)
- `download_returns_toml_content` extended to assert `content_hash` round-trips

Branch coverage of `verify_and_resolve_hash`

Deployment ordering note
This is a coordinated fix:
The fix lands the moment `tokf.net` is updated to serve `content_hash`. No client republish needed.

🤖 Generated with Claude Code