feat(common): hash module foundation + e1 historical epoch #352
Closed
Lays the groundwork for versioned canonical hashes (architectural fix for #350).

Splits `hash.rs` into a directory module:

- `hash::current`: the existing schema-tied `canonical_hash` (verbatim move; behaviour byte-for-byte unchanged for all 14 callers in `tokf-cli` and `tokf-server`).
- `hash::epochs::e1`: frozen byte-for-byte snapshot of `FilterConfig` at commit 5abfaf8 (when `canonical_hash` was first introduced and `GroupConfig.labels` switched to `BTreeMap`). Reproduces what a pre-`show_history_hint`/`chunk`/`json`/`inject_path` binary would have hashed for the same TOML input. The schema is wrapped in a private `mod schema` so its types don't leak; the public API is `hash::epochs::e1::hash(toml: &str) -> Result<String, HashError>`.
- `hash::HashVersion` / `KNOWN_VERSIONS` / `compute_all` / `matches_any`: public dispatch API. Clients try every known epoch to verify a stored hash; the wiring into `verify_and_resolve_hash` lands in a follow-up once PR #351 merges.
- `hash::HashError`: promoted from a `serde_json::Error` newtype to a 2-variant enum (`Parse(String)`, `Serialize(String)`) so it can also carry TOML parse errors. Source-compatible: every existing call site uses `?` / `Display` / `.map_err`, none destructure.

Frozen-corpus CI test (`tests/hash_corpus.rs`) walks `tests/hash_corpus/<id>/` for every registered version and asserts each `<n>.toml`'s hash matches its `<n>.expected`. Modifying an expected value (or leaving an orphan `.expected`) fails the test. Four `e1` fixtures cover minimal, BTreeMap labels, Lua scripts, and a kitchen-sink filter exercising most e1 fields.

`toml` is promoted from optional (under `validation`) to a regular dep on `tokf-common` because epoch parsers need it; all workspace crates already depend on `toml` directly so no new transitive surface.

Bug-report filter `0585b874...` is NOT reproduced by `e1` (`9977a297...` ≠ `0585b874...`); the filter was published at a later schema epoch.
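The dispatch API listed above can be pictured with a small std-only sketch. It assumes `HashVersion` holds a stable id and a function pointer with the same shape as `hash::epochs::e1::hash`. The field names `id` and `hasher`, the signatures of `compute_all` and `matches_any`, and the `e1_stand_in` function (an FNV-1a hash over the raw bytes, not the real canonical hash) are hypothetical and only illustrate the pattern.

```rust
use std::fmt;

/// Two-variant error with the shape described in the PR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HashError {
    Parse(String),
    Serialize(String),
}

impl fmt::Display for HashError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HashError::Parse(msg) => write!(f, "parse error: {msg}"),
            HashError::Serialize(msg) => write!(f, "serialize error: {msg}"),
        }
    }
}

impl std::error::Error for HashError {}

/// A registered epoch: a stable id plus the function that computes its hash.
/// ASSUMPTION: the field names are illustrative, not taken from the PR.
#[derive(Clone, Copy)]
pub struct HashVersion {
    pub id: &'static str,
    pub hasher: fn(&str) -> Result<String, HashError>,
}

/// Stand-in for `hash::epochs::e1::hash`.
/// ASSUMPTION: the real function parses TOML into the frozen e1 schema and
/// hashes its canonical form; this one is FNV-1a over the raw input bytes.
fn e1_stand_in(toml: &str) -> Result<String, HashError> {
    if toml.trim().is_empty() {
        return Err(HashError::Parse("empty input".to_string()));
    }
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in toml.bytes() {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    Ok(format!("{h:016x}"))
}

/// Every epoch the client knows how to reproduce, oldest first.
pub const KNOWN_VERSIONS: &[HashVersion] = &[HashVersion {
    id: "e1",
    hasher: e1_stand_in,
}];

/// Compute the hash of `toml` under every known epoch.
pub fn compute_all(toml: &str) -> Vec<(&'static str, Result<String, HashError>)> {
    KNOWN_VERSIONS
        .iter()
        .map(|v| (v.id, (v.hasher)(toml)))
        .collect()
}

/// Return the id of the first epoch whose hash equals `stored`, if any.
pub fn matches_any(toml: &str, stored: &str) -> Option<&'static str> {
    KNOWN_VERSIONS
        .iter()
        .find(|v| (v.hasher)(toml).map_or(false, |h| h == stored))
        .map(|v| v.id)
}
```

Under these assumptions a verifier would call `matches_any(toml, stored_hash)` and treat `None` as "no known epoch reproduces this hash". Adding an epoch then amounts to appending one entry to `KNOWN_VERSIONS`.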
`e2..e11` will land in follow-up PRs:

| epoch | commit  | trigger |
|-------|---------|---------|
| e2    | 9eca37c | `+show_history_hint` |
| e3    | 87557f5 | `+chunk`, `+aggregates` in OutputBranch |
| e4    | dd2759b | `+json` |
| e5    | 2fa1e50 | `+inject_path` |
| e6    | 36d43d0 | `+passthrough_args` |
| e7    | 4418619 | `+description, +truncate_lines_at, +on_empty, +tail` |
| e8    | 494e770 | RTK aliases; `MatchOutputRule.contains` -> Option |
| e9    | 3f44787 | `ReplaceRule { +replace_all }` |
| e10   | 322e133 | `+tree` |
| e11   | 19c4d0e | `VariantDetect { +args_pattern }` |

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Filter Verification Report

Changed Filters

No filter files changed in this PR.

All Filters Summary

✅ 143/143 test cases passed across 51 filters
Long-term architectural fix for the hash drift behind #350. Lays the foundation for versioned canonical hashes; ships only `e1` and the dispatch infrastructure so the pattern can be reviewed before the rest of the historical chain (`e2..e11`) is filled in mechanically.

Summary

Splits `tokf-common::hash` into a directory module:

- `hash::current`: verbatim move of the existing `canonical_hash(&FilterConfig)`. Every existing caller (`tokf-cli`'s `install_cmd`, `resolve`, `show_cmd`, `backfill_cmd`, `config/cache`, `publish_shared`; `tokf-server`'s `routes/filters/publish`) is byte-for-byte unaffected.
- `hash::epochs::e1`: frozen byte-for-byte snapshot of `FilterConfig` at commit `5abfaf8` (`canonical_hash` introduction; `GroupConfig.labels` was `BTreeMap`). Reproduces what a binary at that commit would have hashed for the same TOML.
- `hash::HashVersion` / `KNOWN_VERSIONS` / `compute_all` / `matches_any`: public dispatch API. `HashVersion` carries a stable id (e.g. `"e1"`) and a hasher fn; clients can iterate `KNOWN_VERSIONS` to find which epoch reproduces a stored hash.
- `hash::HashError`: upgraded from a `serde_json::Error` newtype to a 2-variant enum (`Parse(String)`, `Serialize(String)`) so it can also carry `toml::de::Error`. Source-compatible: every caller uses `?` / `Display` / `.map_err(|e| ...)`; none destructure.
- `toml` is promoted from optional (gated under `validation`) to a regular dep on `tokf-common`. All workspace crates already pull `toml` directly, so no new transitive surface.

How to add the rest of the historical chain
Each subsequent epoch follows the same template:
The agent archeology that informed this PR identified the full chain (in chronological order):
Test plan
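The frozen-corpus check from the commit message (each `<n>.toml` under `tests/hash_corpus/<id>/` must hash to its `<n>.expected`, and an orphan `.expected` fails) could be sketched roughly as follows. This is a std-only illustration, not the contents of `tests/hash_corpus.rs`. The function name `check_corpus`, the string error type, and the choice to also fail a `.toml` that has no `.expected` are assumptions.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::path::Path;

/// Check one version's corpus directory (hypothetical helper).
/// Returns the number of verified fixtures, or every failure found.
pub fn check_corpus(
    dir: &Path,
    hasher: impl Fn(&str) -> Result<String, String>,
) -> Result<usize, Vec<String>> {
    let mut tomls = BTreeSet::new();
    let mut expecteds = BTreeSet::new();

    // Collect fixture stems by extension; anything else is ignored.
    let entries = fs::read_dir(dir).map_err(|e| vec![format!("{}: {e}", dir.display())])?;
    for entry in entries {
        let path = entry.map_err(|e| vec![e.to_string()])?.path();
        let stem = match path.file_stem().and_then(|s| s.to_str()) {
            Some(s) => s.to_string(),
            None => continue,
        };
        match path.extension().and_then(|e| e.to_str()) {
            Some("toml") => {
                tomls.insert(stem);
            }
            Some("expected") => {
                expecteds.insert(stem);
            }
            _ => {}
        }
    }

    let mut failures = Vec::new();

    // An `.expected` without a matching `.toml` is an orphan.
    for stem in expecteds.difference(&tomls) {
        failures.push(format!("orphan {stem}.expected"));
    }

    // Each fixture must hash to exactly its recorded value.
    // ASSUMPTION: a `.toml` with no `.expected` also counts as a failure
    // (it surfaces here as a read error on the missing file).
    for stem in &tomls {
        let input = fs::read_to_string(dir.join(format!("{stem}.toml")));
        let expected = fs::read_to_string(dir.join(format!("{stem}.expected")));
        match (input, expected) {
            (Ok(input), Ok(expected)) => match hasher(&input) {
                Ok(actual) if actual == expected.trim() => {}
                Ok(actual) => failures.push(format!(
                    "{stem}: expected {}, got {actual}",
                    expected.trim()
                )),
                Err(e) => failures.push(format!("{stem}: hasher failed: {e}")),
            },
            (Err(e), _) | (_, Err(e)) => failures.push(format!("{stem}: {e}")),
        }
    }

    if failures.is_empty() {
        Ok(tomls.len())
    } else {
        Err(failures)
    }
}
```

A test built on this would call `check_corpus` once per registered version id and assert that the result is `Ok`. Collecting every failure before returning means a single run reports all drifted fixtures rather than only the first.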
Closes / refs
Refs #350 (PR #351 already shipped the immediate stopgap). Does not close — `e2..e11` and the eventual schema-independent `v1` canonical-TOML hash remain.
🤖 Generated with Claude Code