
feat(l1): use pipelined execution for full sync instead of sequential#6493

Draft
azteca1998 wants to merge 1 commit into main from feat/pipelined-execution-fullsync

Conversation

@azteca1998
Contributor

Summary

  • Adds add_blocks_in_pipeline_batches() to Blockchain that processes full sync blocks through the existing pipeline (add_block_pipeline) instead of the sequential execute_block_from_state path
  • Blocks are processed in configurable sub-batches (default 64, tunable via PIPELINE_SUB_BATCH_SIZE env var with sync-test feature)
  • Each block gets the full pipeline: 3 concurrent threads (warmer, executor, merkleizer) with 16 parallel shard workers
  • Replaces both the sequential batch path and the final-batch pipeline path in add_blocks() with a single unified pipeline path
  • Est. +20-40% throughput from overlapping execution with merkleization

Closes #6484

Test plan

  • Run full sync on hoodi with FULL_SYNC_BLOCK_LIMIT=50000 and verify completion
  • Compare Gigagas/s throughput with and without pipelined execution
  • Test different sub-batch sizes: PIPELINE_SUB_BATCH_SIZE=32, 64, 128
  • Verify cancellation works: send SIGTERM during sync and confirm clean shutdown
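The sub-batch boundary math implied by the summary and test plan above can be sketched as follows. This is a hypothetical helper for illustration only; `sub_batch_bounds` is not the PR's API, and the real code iterates `blocks.chunks(sub_batch_size)` over a `Vec<Block>`:

```rust
// Hypothetical sketch of the sub-batch boundary math described in the summary.
// Each entry is (start_index, blocks_done after this sub-batch completes).
// Note: step_by panics if sub_batch_size == 0, just as chunks(0) would.
fn sub_batch_bounds(total_blocks: usize, sub_batch_size: usize) -> Vec<(usize, usize)> {
    (0..total_blocks)
        .step_by(sub_batch_size)
        .map(|start| (start, (start + sub_batch_size).min(total_blocks)))
        .collect()
}

fn main() {
    // 150 blocks with the default sub-batch size of 64:
    // two full sub-batches plus a 22-block tail.
    assert_eq!(
        sub_batch_bounds(150, 64),
        vec![(0, 64), (64, 128), (128, 150)]
    );
    println!("ok");
}
```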

…#6484)

Replace the sequential batch execution path in full sync with pipelined
execution. Instead of running all 1024 blocks sequentially then doing
one giant merkle at the end, blocks are now processed through the
existing block pipeline (add_block_pipeline) which provides:

- Concurrent execution + merkleization per block (3-thread pipeline)
- 16 parallel shard workers for trie computation
- Cache warming / prefetching via warmer thread

Blocks are grouped into configurable sub-batches (default 64 blocks,
configurable via PIPELINE_SUB_BATCH_SIZE env var with sync-test feature)
for progress logging and async runtime yielding.

The trade-off is per-block merkleization instead of one collapsed batch,
but the pipeline overlap (merkle runs concurrently with execution) should
more than compensate, with estimated +20-40% throughput improvement.
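The three-stage shape described above (warmer, executor, merkleizer running concurrently under `std::thread::scope`) can be illustrated with a toy sketch. The stage bodies here are placeholder arithmetic, not the real ethrex logic; only the scoped-thread, channel-connected structure is the point:

```rust
use std::sync::mpsc;
use std::thread;

// Toy sketch of a warmer -> executor -> merkleizer pipeline.
// Stage bodies are placeholders; only the 3-stage overlap is illustrated.
fn run_pipeline(blocks: Vec<u64>) -> Vec<u64> {
    let (warm_tx, warm_rx) = mpsc::channel();
    let (exec_tx, exec_rx) = mpsc::channel();
    let mut roots = Vec::new();
    // scope blocks the calling thread until all spawned threads finish,
    // which is why wrapping this in spawn_blocking matters in async code.
    thread::scope(|s| {
        s.spawn(move || {
            for b in blocks {
                warm_tx.send(b).unwrap(); // warmer: prefetch state (elided)
            }
        });
        s.spawn(move || {
            for b in warm_rx {
                exec_tx.send(b * 2).unwrap(); // executor: run txs (toy: double)
            }
        });
        // merkleizer runs on the scope's own thread for simplicity
        for r in exec_rx {
            roots.push(r + 1); // merkleize (toy: +1)
        }
    });
    roots
}

fn main() {
    assert_eq!(run_pipeline(vec![1, 2, 3]), vec![3, 5, 7]);
    println!("ok");
}
```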
@azteca1998 azteca1998 changed the title feat(sync): use pipelined execution for full sync instead of sequential feat(l1): use pipelined execution for full sync instead of sequential Apr 16, 2026
@github-actions github-actions bot added the L1 Ethereum client label Apr 16, 2026
@github-actions

Lines of code report

Total lines added: 75
Total lines removed: 19
Total lines changed: 94

Detailed view
+------------------------------------------------+-------+------+
| File                                           | Lines | Diff |
+------------------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs         | 2559  | +72  |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/snap/constants.rs | 24    | +1   |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync.rs           | 246   | +2   |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/full.rs      | 229   | -19  |
+------------------------------------------------+-------+------+

@github-actions

🤖 Kimi Code Review

Critical Issue: Blocking Async Runtime

crates/blockchain/blockchain.rs, lines 2350-2360: The method add_blocks_in_pipeline_batches performs CPU-intensive block execution (via add_block_pipeline) directly in an async context without spawn_blocking. This blocks the Tokio runtime thread for the duration of each sub-batch, causing the node to become unresponsive to network events and RPC calls during execution.

Fix: Wrap the sub-batch processing in spawn_blocking:

// Check cancellation before spawning
if cancellation_token.is_cancelled() { ... }

let blockchain = self.clone();
let sub_batch = sub_batch.to_vec(); // Necessary to move into blocking task
tokio::task::spawn_blocking(move || {
    for block in sub_batch {
        // ... existing processing logic ...
        blockchain.add_block_pipeline(block, None).map_err(...)?;
    }
    Ok::<_, (ChainError, Option<BatchBlockProcessingFailure>)>(())
})
.await
.map_err(|e| (ChainError::Custom(e.to_string()), None))??;

Issue: Unused Parameter

crates/networking/p2p/sync/full.rs, line 288: The _final_batch parameter is accepted but never used; the underscore prefix only silences the compiler warning. Remove the parameter and the call-site computation that feeds it.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

🤖 Claude Code Review

Now I have a full picture of the PR. Here is the review:


PR #6493: feat(sync): use pipelined execution for full sync instead of sequential

Status: DRAFT | +113 / -29 lines


Overview

This PR unifies the two execution paths in full sync (add_blocks_in_batch for non-final batches vs. sequential pipeline for the final batch) into a single path that always uses add_block_pipeline. A new add_blocks_in_pipeline_batches method processes blocks in configurable sub-batches (default 64) through the existing 3-thread pipeline (warmer + executor + merkleizer). The claimed benefit is +20–40% throughput from overlapping execution with merkleization.


Critical Issue: Blocking the Async Runtime

blockchain.rs:2337: add_block_pipeline is called directly from an async fn without spawn_blocking:

// NEW (no spawn_blocking):
self.add_block_pipeline(block.clone(), None).map_err(...)?;

Internally, add_block_pipeline_inner / execute_block_pipeline uses std::thread::scope (line 459), which blocks the calling thread until all three pipeline threads complete. Calling this from inside an async fn blocks the Tokio executor thread for the full duration of each block's pipeline.

The old final-batch code correctly used spawn_blocking:

// OLD (correct):
tokio::task::spawn_blocking(move || {
    blockchain.add_block_pipeline(block, None)...
}).await

The yield_now() at sub-batch boundaries (line 2361) only helps between sub-batches, not during the blocking pipeline runs inside them. With sub_batch_size = 64, each sub-batch holds the executor thread for potentially seconds at a time.

Fix: Wrap the inner loop in spawn_blocking, or restructure so blocking work is done off the async runtime. Something like:

let blockchain = Arc::clone(&self);
let sub_batch: Vec<Block> = sub_batch.to_vec(); // avoid clone per-block
tokio::task::spawn_blocking(move || {
    for block in &sub_batch {
        blockchain.add_block_pipeline(block.clone(), None)?;
        ...
    }
    Ok::<_, ChainError>(...)
}).await.map_err(|e| (ChainError::Custom(e.to_string()), None))??;

Dead Parameter: _final_batch

full.rs:277

async fn add_blocks(
    blockchain: Arc<Blockchain>,
    blocks: Vec<Block>,
    _final_batch: bool,   // <-- silenced but never used
    cancel_token: CancellationToken,
) -> ...

The underscore prefix suppresses the warning but leaves a misleading API. Every call site (full.rs:160, full.rs:179, full.rs:215) still computes and passes final_batch, wasting effort. The parameter and its computation at full.rs:122 should be removed entirely now that the two paths are unified.


Stale Doc Comment

full.rs:273–274

/// For the final batch (near sync head), this also stores each block's state individually.

This is a copy of the old behavior that no longer applies. The _final_batch distinction is gone; the comment should be updated or removed.


block.clone() in the Inner Loop

blockchain.rs:2337

for block in sub_batch {          // sub_batch: &[Block]
    let block_hash = block.hash();
    self.add_block_pipeline(block.clone(), None)...

Since blocks.chunks() yields &[Block], iterating gives &Block, making the .clone() unavoidable given add_block_pipeline's block: Block signature. If the spawn_blocking fix above is applied by converting the sub-batch to a Vec<Block> upfront (consuming the slice), the per-block clone can be eliminated by iterating with .into_iter() instead.


Subtle Metrics Safety Dependency

blockchain.rs:2382

METRICS_BLOCKS.set_latest_gas_used(total_gas_used as f64 / blocks_len as f64);

This is safe from divide-by-zero only because of the blocks.last().ok_or_else(...) early return two lines above (line 2364–2366). The two are disconnected visually. Consider computing average gas used only after confirming blocks_len > 0, or using a helper like checked_div, to make the invariant explicit rather than implicit.
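One way to make that invariant explicit is to return an Option from a small helper and let the caller decide what to report for an empty batch. A hedged sketch; the helper name is hypothetical, not code from the PR:

```rust
// Hypothetical helper: the non-empty-batch invariant is visible in the
// signature instead of relying on an early return elsewhere.
fn average_gas_used(total_gas_used: u64, blocks_len: usize) -> Option<f64> {
    (blocks_len > 0).then(|| total_gas_used as f64 / blocks_len as f64)
}

fn main() {
    assert_eq!(average_gas_used(100, 4), Some(25.0));
    assert_eq!(average_gas_used(100, 0), None); // no silent NaN/inf
    println!("ok");
}
```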


Minor: blocks_done Variable Shadowing

blockchain.rs:2351–2352

let blocks_done = (sub_batch_idx + 1) * sub_batch_size;
let blocks_done = blocks_done.min(blocks_len);

This is valid Rust but unusual. A single expression is cleaner:

let blocks_done = ((sub_batch_idx + 1) * sub_batch_size).min(blocks_len);

add_blocks_in_batch (blockchain.rs) Is Now Unreachable from Sync

The public add_blocks_in_batch method in blockchain.rs is no longer called from the sync path. It remains pub and is referenced in the doc comment on line 2308. If it's not used elsewhere it may become dead code; worth a quick audit with grep before this merges to avoid surprising callers.


No Unit Tests for the New Method

add_blocks_in_pipeline_batches has no unit tests. The test plan is integration-only. At minimum, testing the sub-batch boundary math (blocks_done edge cases) and cancellation behavior with a mock CancellationToken would improve confidence.


Summary

Severity Item
Critical add_block_pipeline blocks the Tokio executor thread — needs spawn_blocking
Medium _final_batch is dead code — remove the parameter and its computation
Medium Stale doc comment on add_blocks
Low block.clone() per iteration — avoidable with the spawn_blocking refactor
Low Metrics division safety is implicit — make it explicit
Nit blocks_done shadow variable

The core concept is sound and the pipeline approach for non-final batches is a real improvement. The spawn_blocking issue is the blocker; everything else can be polished before merge.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions

🤖 Codex Code Review

  1. High: the PR removes the special handling for the last full-sync batch, so the final EXECUTE_BATCH_SIZE blocks no longer get per-block state materialized. final_batch is still computed in full.rs, but add_blocks() now ignores it and always calls the new pipelined path (full.rs). In storage, batch mode collapses the whole range into one trie layer from the first parent root to the last block root (store.rs, layering.rs). That means intermediate header.state_roots in the final batch are no longer reachable, which breaks or degrades features that expect recent per-block state: RPC reads/proofs return None/defaults for those blocks (store.rs, account.rs), tracing falls back to re-exec, and forkchoice can reject a reorg link whose ancestor is inside that window because has_state_root() fails (vm.rs, fork_choice.rs). I would keep the old per-block pipeline behavior for final_batch, or otherwise explicitly materialize each state root in that last window.

  2. Medium: PIPELINE_SUB_BATCH_SIZE is unchecked, and blocks.chunks(sub_batch_size) will panic on zero (sync.rs, blockchain.rs). In sync-test builds this is user-controlled via env var, so a bad test configuration can crash the node instead of returning a structured error. Clamp or validate > 0 when parsing, and ideally guard the public API too.
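A minimal sketch of the suggested validation, assuming the default of 64 stated in the PR description; the function name is illustrative, and in the real code this would wrap the env-var read gated behind the sync-test feature:

```rust
const DEFAULT_SUB_BATCH_SIZE: usize = 64;

// Illustrative parser: reject zero and unparseable values instead of
// letting blocks.chunks(0) panic later in the pipeline path.
fn parse_sub_batch_size(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_SUB_BATCH_SIZE)
}

fn main() {
    assert_eq!(parse_sub_batch_size(Some("32")), 32);
    assert_eq!(parse_sub_batch_size(Some("0")), 64);   // zero clamped to default
    assert_eq!(parse_sub_batch_size(Some("abc")), 64); // garbage falls back
    assert_eq!(parse_sub_batch_size(None), 64);        // unset falls back
    println!("ok");
}
```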

I couldn’t run cargo check in this environment because rustup needs writable filesystem state.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt


Labels

L1 Ethereum client


Development

Successfully merging this pull request may close these issues.

Use Pipelined Execution instead of sequential for Full sync
