
feat(l1): use pipelined execution for full sync instead of sequential#6493

Draft
azteca1998 wants to merge 1 commit into main from feat/pipelined-execution-fullsync

Conversation

@azteca1998
Contributor

Summary

  • Adds add_blocks_in_pipeline_batches() to Blockchain that processes full sync blocks through the existing pipeline (add_block_pipeline) instead of the sequential execute_block_from_state path
  • Blocks are processed in configurable sub-batches (default 64, tunable via PIPELINE_SUB_BATCH_SIZE env var with sync-test feature)
  • Each block gets the full pipeline: 3 concurrent threads (warmer, executor, merkleizer) with 16 parallel shard workers
  • Replaces both the sequential batch path and the final-batch pipeline path in add_blocks() with a single unified pipeline path
  • Est. +20-40% throughput from overlapping execution with merkleization

Closes #6484

Test plan

  • Run full sync on hoodi with FULL_SYNC_BLOCK_LIMIT=50000 and verify completion
  • Compare Gigagas/s throughput with and without pipelined execution
  • Test different sub-batch sizes: PIPELINE_SUB_BATCH_SIZE=32, 64, 128
  • Verify cancellation works: send SIGTERM during sync and confirm clean shutdown
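The sub-batch boundary math implied by the summary and test plan above can be sketched as follows. This is a hypothetical helper for illustration only; `sub_batch_bounds` is not the PR's API, and the real code iterates `blocks.chunks(sub_batch_size)` over a `Vec<Block>`:

```rust
// Hypothetical sketch of the sub-batch boundary math described in the summary.
// Each entry is (start_index, blocks_done after this sub-batch completes).
// Note: step_by panics if sub_batch_size == 0, just as chunks(0) would.
fn sub_batch_bounds(total_blocks: usize, sub_batch_size: usize) -> Vec<(usize, usize)> {
    (0..total_blocks)
        .step_by(sub_batch_size)
        .map(|start| (start, (start + sub_batch_size).min(total_blocks)))
        .collect()
}

fn main() {
    // 150 blocks with the default sub-batch size of 64:
    // two full sub-batches plus a 22-block tail.
    assert_eq!(
        sub_batch_bounds(150, 64),
        vec![(0, 64), (64, 128), (128, 150)]
    );
    println!("ok");
}
```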

…#6484)

Replace the sequential batch execution path in full sync with pipelined
execution. Instead of running all 1024 blocks sequentially then doing
one giant merkle at the end, blocks are now processed through the
existing block pipeline (add_block_pipeline) which provides:

- Concurrent execution + merkleization per block (3-thread pipeline)
- 16 parallel shard workers for trie computation
- Cache warming / prefetching via warmer thread

Blocks are grouped into configurable sub-batches (default 64 blocks,
configurable via PIPELINE_SUB_BATCH_SIZE env var with sync-test feature)
for progress logging and async runtime yielding.

The trade-off is per-block merkleization instead of one collapsed batch,
but the pipeline overlap (merkle runs concurrently with execution) should
more than compensate, with estimated +20-40% throughput improvement.
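The three-stage shape described above (warmer, executor, merkleizer running concurrently under `std::thread::scope`) can be illustrated with a toy sketch. The stage bodies here are placeholder arithmetic, not the real ethrex logic; only the scoped-thread, channel-connected structure is the point:

```rust
use std::sync::mpsc;
use std::thread;

// Toy sketch of a warmer -> executor -> merkleizer pipeline.
// Stage bodies are placeholders; only the 3-stage overlap is illustrated.
fn run_pipeline(blocks: Vec<u64>) -> Vec<u64> {
    let (warm_tx, warm_rx) = mpsc::channel();
    let (exec_tx, exec_rx) = mpsc::channel();
    let mut roots = Vec::new();
    // scope blocks the calling thread until all spawned threads finish,
    // which is why wrapping this in spawn_blocking matters in async code.
    thread::scope(|s| {
        s.spawn(move || {
            for b in blocks {
                warm_tx.send(b).unwrap(); // warmer: prefetch state (elided)
            }
        });
        s.spawn(move || {
            for b in warm_rx {
                exec_tx.send(b * 2).unwrap(); // executor: run txs (toy: double)
            }
        });
        // merkleizer runs on the scope's own thread for simplicity
        for r in exec_rx {
            roots.push(r + 1); // merkleize (toy: +1)
        }
    });
    roots
}

fn main() {
    assert_eq!(run_pipeline(vec![1, 2, 3]), vec![3, 5, 7]);
    println!("ok");
}
```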
@azteca1998 azteca1998 changed the title feat(sync): use pipelined execution for full sync instead of sequential feat(l1): use pipelined execution for full sync instead of sequential Apr 16, 2026
@github-actions github-actions bot added the L1 Ethereum client label Apr 16, 2026
@github-actions

Lines of code report

Total lines added: 75
Total lines removed: 19
Total lines changed: 94

Detailed view
+------------------------------------------------+-------+------+
| File                                           | Lines | Diff |
+------------------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs         | 2559  | +72  |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/snap/constants.rs | 24    | +1   |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync.rs           | 246   | +2   |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/full.rs      | 229   | -19  |
+------------------------------------------------+-------+------+

@github-actions

🤖 Kimi Code Review

Critical Issue: Blocking Async Runtime

crates/blockchain/blockchain.rs, lines 2350-2360: The method add_blocks_in_pipeline_batches performs CPU-intensive block execution (via add_block_pipeline) directly in an async context without spawn_blocking. This blocks the Tokio runtime thread for the duration of each sub-batch, causing the node to become unresponsive to network events and RPC calls during execution.

Fix: Wrap the sub-batch processing in spawn_blocking:

// Check cancellation before spawning
if cancellation_token.is_cancelled() { ... }

let blockchain = self.clone();
let sub_batch = sub_batch.to_vec(); // Necessary to move into blocking task
tokio::task::spawn_blocking(move || {
    for block in sub_batch {
        // ... existing processing logic ...
        blockchain.add_block_pipeline(block, None).map_err(...)?;
    }
    Ok::<_, (ChainError, Option<BatchBlockProcessingFailure>)>(())
})
.await
.map_err(|e| (ChainError::Custom(e.to_string()), None))??;

Issue: Unused Parameter

crates/networking/p2p/sync/full.rs, line 288: The _final_batch parameter is accepted but never used; the underscore prefix only silences the compiler warning. Remove the parameter and the call-site computation that feeds it.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

🤖 Claude Code Review

Now I have a full picture of the PR. Here is the review:


PR #6493: feat(sync): use pipelined execution for full sync instead of sequential

Status: DRAFT | +113 / -29 lines


Overview

This PR unifies the two execution paths in full sync (add_blocks_in_batch for non-final batches vs. sequential pipeline for the final batch) into a single path that always uses add_block_pipeline. A new add_blocks_in_pipeline_batches method processes blocks in configurable sub-batches (default 64) through the existing 3-thread pipeline (warmer + executor + merkleizer). The claimed benefit is +20–40% throughput from overlapping execution with merkleization.


Critical Issue: Blocking the Async Runtime

blockchain.rs:2337: add_block_pipeline is called directly from an async fn without spawn_blocking:

// NEW (no spawn_blocking):
self.add_block_pipeline(block.clone(), None).map_err(...)?;

Internally, add_block_pipeline_inner / execute_block_pipeline uses std::thread::scope (line 459), which blocks the calling thread until all three pipeline threads complete. Calling this from inside an async fn blocks the Tokio executor thread for the full duration of each block's pipeline.

The old final-batch code correctly used spawn_blocking:

// OLD (correct):
tokio::task::spawn_blocking(move || {
    blockchain.add_block_pipeline(block, None)...
}).await

The yield_now() at sub-batch boundaries (line 2361) only helps between sub-batches, not during the blocking pipeline runs inside them. With sub_batch_size = 64, each sub-batch holds the executor thread for potentially seconds at a time.

Fix: Wrap the inner loop in spawn_blocking, or restructure so blocking work is done off the async runtime. Something like:

let blockchain = Arc::clone(&self);
let sub_batch: Vec<Block> = sub_batch.to_vec(); // avoid clone per-block
tokio::task::spawn_blocking(move || {
    for block in &sub_batch {
        blockchain.add_block_pipeline(block.clone(), None)?;
        ...
    }
    Ok::<_, ChainError>(...)
}).await.map_err(|e| (ChainError::Custom(e.to_string()), None))??;

Dead Parameter: _final_batch

full.rs:277

async fn add_blocks(
    blockchain: Arc<Blockchain>,
    blocks: Vec<Block>,
    _final_batch: bool,   // <-- silenced but never used
    cancel_token: CancellationToken,
) -> ...

The underscore prefix suppresses the warning but leaves a misleading API. Every call site (full.rs:160, full.rs:179, full.rs:215) still computes and passes final_batch, wasting effort. The parameter and its computation at full.rs:122 should be removed entirely now that the two paths are unified.


Stale Doc Comment

full.rs:273–274

/// For the final batch (near sync head), this also stores each block's state individually.

This is a copy of the old behavior that no longer applies. The _final_batch distinction is gone; the comment should be updated or removed.


block.clone() in the Inner Loop

blockchain.rs:2337

for block in sub_batch {          // sub_batch: &[Block]
    let block_hash = block.hash();
    self.add_block_pipeline(block.clone(), None)...

Since blocks.chunks() yields &[Block], iterating gives &Block, making the .clone() unavoidable given add_block_pipeline's block: Block signature. If the spawn_blocking fix above is applied by converting the sub-batch to a Vec<Block> upfront (consuming the slice), the per-block clone can be eliminated by iterating with .into_iter() instead.


Subtle Metrics Safety Dependency

blockchain.rs:2382

METRICS_BLOCKS.set_latest_gas_used(total_gas_used as f64 / blocks_len as f64);

This is safe from divide-by-zero only because of the blocks.last().ok_or_else(...) early return two lines above (line 2364–2366). The two are disconnected visually. Consider computing average gas used only after confirming blocks_len > 0, or using a helper like checked_div, to make the invariant explicit rather than implicit.
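One way to make that invariant explicit is to return an Option from a small helper and let the caller decide what to report for an empty batch. A hedged sketch; the helper name is hypothetical, not code from the PR:

```rust
// Hypothetical helper: the non-empty-batch invariant is visible in the
// signature instead of relying on an early return elsewhere.
fn average_gas_used(total_gas_used: u64, blocks_len: usize) -> Option<f64> {
    (blocks_len > 0).then(|| total_gas_used as f64 / blocks_len as f64)
}

fn main() {
    assert_eq!(average_gas_used(100, 4), Some(25.0));
    assert_eq!(average_gas_used(100, 0), None); // no silent NaN/inf
    println!("ok");
}
```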


Minor: blocks_done Variable Shadowing

blockchain.rs:2351–2352

let blocks_done = (sub_batch_idx + 1) * sub_batch_size;
let blocks_done = blocks_done.min(blocks_len);

This is valid Rust but unusual. A single expression is cleaner:

let blocks_done = ((sub_batch_idx + 1) * sub_batch_size).min(blocks_len);

add_blocks_in_batch (blockchain.rs) Is Now Unreachable from Sync

The public add_blocks_in_batch method in blockchain.rs is no longer called from the sync path. It remains pub and is referenced in the doc comment on line 2308. If it's not used elsewhere it may become dead code; worth a quick audit with grep before this merges to avoid surprising callers.


No Unit Tests for the New Method

add_blocks_in_pipeline_batches has no unit tests. The test plan is integration-only. At minimum, testing the sub-batch boundary math (blocks_done edge cases) and cancellation behavior with a mock CancellationToken would improve confidence.


Summary

Severity Item
Critical add_block_pipeline blocks the Tokio executor thread — needs spawn_blocking
Medium _final_batch is dead code — remove the parameter and its computation
Medium Stale doc comment on add_blocks
Low block.clone() per iteration — avoidable with the spawn_blocking refactor
Low Metrics division safety is implicit — make it explicit
Nit blocks_done shadow variable

The core concept is sound and the pipeline approach for non-final batches is a real improvement. The spawn_blocking issue is the blocker; everything else can be polished before merge.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions

🤖 Codex Code Review

  1. High: the PR removes the special handling for the last full-sync batch, so the final EXECUTE_BATCH_SIZE blocks no longer get per-block state materialized. final_batch is still computed in full.rs, but add_blocks() now ignores it and always calls the new pipelined path (full.rs). In storage, batch mode collapses the whole range into one trie layer from the first parent root to the last block root (store.rs, layering.rs). That means intermediate header.state_roots in the final batch are no longer reachable, which breaks or degrades features that expect recent per-block state: RPC reads/proofs return None/defaults for those blocks (store.rs, account.rs), tracing falls back to re-exec, and forkchoice can reject a reorg link whose ancestor is inside that window because has_state_root() fails (vm.rs, fork_choice.rs). I would keep the old per-block pipeline behavior for final_batch, or otherwise explicitly materialize each state root in that last window.

  2. Medium: PIPELINE_SUB_BATCH_SIZE is unchecked, and blocks.chunks(sub_batch_size) will panic on zero (sync.rs, blockchain.rs). In sync-test builds this is user-controlled via env var, so a bad test configuration can crash the node instead of returning a structured error. Clamp or validate > 0 when parsing, and ideally guard the public API too.
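A minimal sketch of the suggested validation, assuming the default of 64 stated in the PR description; the function name is illustrative, and in the real code this would wrap the env-var read gated behind the sync-test feature:

```rust
const DEFAULT_SUB_BATCH_SIZE: usize = 64;

// Illustrative parser: reject zero and unparseable values instead of
// letting blocks.chunks(0) panic later in the pipeline path.
fn parse_sub_batch_size(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_SUB_BATCH_SIZE)
}

fn main() {
    assert_eq!(parse_sub_batch_size(Some("32")), 32);
    assert_eq!(parse_sub_batch_size(Some("0")), 64);   // zero clamped to default
    assert_eq!(parse_sub_batch_size(Some("abc")), 64); // garbage falls back
    assert_eq!(parse_sub_batch_size(None), 64);        // unset falls back
    println!("ok");
}
```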

I couldn’t run cargo check in this environment because rustup needs writable filesystem state.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt


Labels

L1 Ethereum client


Development

Successfully merging this pull request may close these issues.

Use Pipelined Execution instead of sequential for Full sync
