Skip to content

POC of trie keys to abstract hashing and prefixes#6457

Draft
Arkenan wants to merge 1 commit intomainfrom
trie-keys
Draft

POC of trie keys to abstract hashing and prefixes#6457
Arkenan wants to merge 1 commit intomainfrom
trie-keys

Conversation

@Arkenan
Copy link
Copy Markdown
Collaborator

@Arkenan Arkenan commented Apr 8, 2026

Trying to get a more centralized place for the trie key format, which is currently scattered across many places. May need iteration.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Kimi Code Review

This is a solid refactoring PR that centralizes trie key construction and eliminates magic numbers. No critical issues found.

Summary of Changes

The PR introduces a TrieKey abstraction to encapsulate nibble-path construction, table routing logic, and length constants (65/131/17) that were previously scattered across the storage layer.

Code Quality & Correctness

crates/storage/trie_key.rs

  • Good use of named constants (SEPARATOR_NIBBLE, ACCOUNT_LEAF_LEN, STORAGE_LEAF_LEN)
  • Clear separation between constructors for different key types (account vs storage)
  • Documentation correctly explains the separator nibble (17) is outside valid range (0–15)

crates/storage/trie.rs

  • Lines 112, 118, 127: Proper migration from manual key construction to TrieKey methods
  • Removal of hardcoded table names from struct fields reduces state surface area
  • Line 164: tx_for_key correctly routes to appropriate locked view based on key type

crates/storage/store.rs

  • Lines 3075–3077: Replaces magic length checks (65/131) with semantic methods is_leaf() and is_account()
  • Line 3247: Replaces literal 131 with STORAGE_LEAF_LEN constant

crates/storage/layering.rs

  • Lines 287, 302: Delegates to new TrieKey constructors while maintaining backward compatibility via apply_prefix

Minor Observations

  1. Constructor consistency: from_nibbles (line 71) is public and allows bypassing typed constructors, but the doc comment appropriately marks it for "internal/intermediate nodes".

  2. Length calculation: The comment on line 18 notes storage leaf length includes "2 (leaf flags)". Verify this matches your trie encoding (typically HP encoding prefix nibbles), though this preserves existing behavior.

  3. Clone in tx_for_key: Line 164 clones the Nibbles to construct TrieKey. Since this is on the read path and Nibbles is likely small (Vec of bytes), this is acceptable, but consider if TrieKey::from_nibbles could take &Nibbles for read-only operations to avoid the allocation.

Security & Consensus

  • No changes to consensus-critical logic (hashing, RLP, gas accounting)
  • Key construction preserves existing behavior (separator nibble 17, same hash inputs)
  • Table routing logic is equivalent to previous implementation

Verdict

Approve. Clean refactoring that improves maintainability without changing behavior.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Lines of code report

Total lines added: 95
Total lines removed: 22
Total lines changed: 117

Detailed view
+-----------------------------------+-------+------+
| File                              | Lines | Diff |
+-----------------------------------+-------+------+
| ethrex/crates/storage/layering.rs | 196   | -5   |
+-----------------------------------+-------+------+
| ethrex/crates/storage/lib.rs      | 17    | +2   |
+-----------------------------------+-------+------+
| ethrex/crates/storage/store.rs    | 2730  | +3   |
+-----------------------------------+-------+------+
| ethrex/crates/storage/trie.rs     | 158   | -17  |
+-----------------------------------+-------+------+
| ethrex/crates/storage/trie_key.rs | 90    | +90  |
+-----------------------------------+-------+------+

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Codex Code Review

Findings

  1. trie_key.rs:89 introduces a real storage-trie routing regression. TrieKey::table() and is_account() now infer account-vs-storage purely from len <= 65, but a storage trie root key after prefixing is also exactly 65 nibbles (keccak(address) + separator from trie_key.rs:53). That means root reads/writes through trie.rs:115 and trie.rs:123 get routed to ACCOUNT_FLATKEYVALUE instead of the storage column family previously implied by BackendTrieDB::new_for_storages / new_for_account_storage. This can break opening non-empty storage tries via store.rs:2769 and store.rs:2837, including snap-sync paths such as snap_sync.rs:869. The fix should preserve “storage vs account DB” as explicit context instead of deriving it only from key length.

  2. store.rs:3075 adds avoidable hot-path overhead in apply_trie_updates: every committed key is cloned and re-parsed into Nibbles just to compute is_leaf / is_account. This loop runs during trie flushes, so the extra allocation and nibble decoding per node is measurable churn. The previous length checks were allocation-free; if the abstraction is needed, add zero-copy helpers over &[u8] instead.

No additional EVM / consensus / gas-accounting issues stood out in the touched code.

I couldn’t run the targeted Rust tests in this environment because cargo/rustup attempted to write under read-only /home/runner/.rustup.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

🤖 Claude Code Review

Now I have everything I need for a complete review.


PR Review: POC of trie keys to abstract hashing and prefixes

Overall assessment: The direction is good — centralizing magic length constants and table routing logic is the right move for maintainability. However, the POC has a meaningful bug in store.rs, inaccurate doc comments on the new constants, and some dead code. These should be addressed before promotion.


Bug: TrieKey constructed but table() not used — store.rs:3075–3093

The PR's primary purpose is to replace scattered key.len() checks with TrieKey methods, but in apply_trie_updates the refactoring is only half done:

let trie_key = crate::trie_key::TrieKey::from_nibbles(
    ethrex_trie::Nibbles::from_hex(key.clone()),
);
let is_leaf = trie_key.is_leaf();
let is_account = trie_key.is_account();
// ...
let table = if is_leaf {           // ← manually reimplements trie_key.table()
    if is_account { &ACCOUNT_FLATKEYVALUE } else { &STORAGE_FLATKEYVALUE }
} else if is_account { &ACCOUNT_TRIE_NODES } else { &STORAGE_TRIE_NODES };

trie_key.table() encapsulates exactly this if/else tree (see trie_key.rs:89–105). Not using it here means the table routing logic now exists in two places and can diverge. Compare with the correct usage in trie.rs:118–119 and trie.rs:125–126. Should be:

let trie_key = crate::trie_key::TrieKey::from_nibbles(
    ethrex_trie::Nibbles::from_hex(key.clone()),
);
let is_leaf = trie_key.is_leaf();
let table = trie_key.table();

Performance: Unnecessary Vec<u8> clone — store.rs:3076

key.clone() allocates for every trie node written during block processing (potentially thousands per block). The clone is only needed because Nibbles::from_hex takes ownership while key is still used at line 3081 for comparison. Options:

  • If Nibbles exposes a borrow-based constructor (or from_hex can accept &[u8]), use that.
  • Alternatively, push last_written comparison into TrieKey itself.

As a POC this is acceptable, but worth flagging before this lands in a hot path.


Documentation errors in the new constants — trie_key.rs:18–20 and trie_key.rs:51–55

STORAGE_LEAF_LEN breakdown is wrong. The comment says:

/// 65 (account prefix with separator) + 64 (keccak(slot)) + 2 (leaf flags) = 131

Nibbles::from_bytes always appends a leaf-flag nibble (value 16) per its own docstring. So the actual layout is:

  • from_bytes(keccak(address)) → 64 nibbles + 1 leaf_flag = 65
  • append_new(17) → +1 separator = 66 total prefix
  • from_bytes(keccak(slot)) → 64 + 1 leaf_flag = 65
  • Total: 66 + 65 = 131

Correct breakdown: "66 (storage_prefix: 64 nibbles + leaf_flag + separator) + 65 (slot: 64 nibbles + leaf_flag) = 131".

storage_prefix doc says "65 nibbles" (trie_key.rs:52), but the function produces 66 (65 from from_bytes + 1 from append_new(17)). The constant value itself is correct; only the explanatory arithmetic is wrong.


Dead code: standalone hash helpers — trie_key.rs:141–154

pub fn hash_address(address: &Address) -> H256 { ... }
pub fn hash_storage_key(key: &H256) -> [u8; 32] { ... }

These are not called by any code in this PR. They also duplicate hash_address_fixed (private in store.rs) and hash_key_fixed (public in store.rs). If the goal is to consolidate hashing, migrate the existing callers from store.rs and delete those. If they're purely aspirational, they should be removed to keep surface area minimal.

Additionally, trie_key::hash_address (returns H256) shadows store::hash_address (returns Vec<u8>) in the same crate namespace. Both exist in pub mod submodules; while only store::hash_address is currently re-exported at the crate root (lib.rs:79), the name collision is a footgun for future re-exports.


Minor: key.clone() in tx_for_keytrie.rs:164

fn tx_for_key(&self, key: &Nibbles) -> &dyn StorageLockedView {
    let trie_key = TrieKey::from_nibbles(key.clone());

from_nibbles takes Nibbles by value, forcing a clone of the path for what is fundamentally a length-only inspection. Consider pub fn is_leaf_len(len: usize) -> bool free functions, or accepting &Nibbles internally, to avoid the allocation.


Minor: removed useful comment — trie.rs (BackendTrieDB)

The comment "Using Arc allows sharing a single read view across multiple BackendTrieDB instances (e.g., state trie + storage trie in a single query)" was deleted as part of the diff. This explains the Arc<dyn StorageReadView> design choice non-obviously — worth keeping.


What's good

  • The centralisation of the separator nibble (17) and leaf lengths (65, 131) into named, doc'd constants is exactly the right call. These were raw magic numbers scattered across 3+ files.
  • Removing nodes_table and fkv_table fields from BackendTrieDB in favour of dynamic routing via TrieKey::table() eliminates a category of bugs where field and path could disagree.
  • The with_prefix constructor cleanly unifies account/storage dispatch, and apply_prefix as a shim preserves the existing call sites cleanly.
  • The incremental/POC scope is appropriate — the store.rs callers of apply_prefix are a safe follow-up.

Automated review by Claude (Anthropic) · sonnet · custom prompt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant