
feat(l1): pipeline body download with execution#6491

Draft
azteca1998 wants to merge 3 commits into main from feat/pipeline-body-download

Conversation

@azteca1998
Contributor

Summary

  • Restructures sync_cycle_full to overlap body downloads with block execution
  • Spawns a background task that downloads bodies and sends batches through mpsc::channel(2) (2-batch buffer)
  • Main loop receives pre-fetched batches and executes them — next batch downloads while current batch executes
  • Handles both single-batch (headers in memory) and multi-batch (headers from store) paths
  • Errors from the download task propagate cleanly to the execution loop
  • Est. +12-15% throughput by hiding body download latency behind execution

Closes #6479

Test plan

  • Run full sync on hoodi with FULL_SYNC_BLOCK_LIMIT=50000 and verify completion
  • Compare total sync time with and without pipelining
  • Verify error handling: download task stops when execution errors

Overlap block body downloads with block execution during full sync by
using a background tokio task and an mpsc channel (buffer of 2 batches).

The download task iterates through batches, fetches headers from the
store, downloads bodies from peers, constructs Block objects, and sends
them through the channel. The main loop receives completed batches and
executes them immediately. This hides network I/O latency behind
execution time for an estimated 12-15% throughput improvement.

Closes #6479
@github-actions

github-actions Bot commented Apr 16, 2026

Lines of code report

Total lines added: 66
Total lines removed: 0
Total lines changed: 66

Detailed view
+-------------------------------------------+-------+------+
| File                                      | Lines | Diff |
+-------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/full.rs | 358   | +66  |
+-------------------------------------------+-------+------+

@github-actions

🤖 Claude Code Review



PR Review: feat(sync): pipeline body download with execution (#6491)

Overview

This PR refactors sync_cycle_full to overlap network I/O (body downloads) with CPU work (block execution) using a Tokio task + mpsc::channel(2). The design is sound: the 2-slot buffer limits memory consumption while providing enough pipelining to hide download latency behind execution. The estimated 12-15% throughput gain is plausible.


Correctness

Silent skip of final_batch in empty batches

At full.rs:212-217 (multi-batch path):

if !blocks.is_empty() {
    if body_tx.send(Ok((blocks, final_batch))).await.is_err() {
        return;
    }
}

If the last batch (where final_batch = true) somehow produces zero blocks, the receiver never learns the sync completed. The execution loop drains normally, but add_blocks_in_batch never gets called with final_batch = true, which means the final batch is never executed in sequential (add_block_pipeline) mode. In practice this can't happen (a batch with headers will always have bodies), but it is a latent correctness hazard. The same guard existed in the original code, so this is not a regression — but it's worth noting.

Single-batch path always uses final_batch = true

full.rs:131:

let final_batch = true;

In the original code, even the single-batch path iterated for start in (start_block_number..end_block_number).step_by(*EXECUTE_BATCH_SIZE). If headers.len() > EXECUTE_BATCH_SIZE, the first chunks were executed with final_batch = false (batch mode, faster) and only the final chunk with final_batch = true (sequential, stores per-block state). The new code sends all headers as one message with final_batch = true, meaning all blocks are executed sequentially regardless of count.

In practice single_batch = true implies one P2P header response, which is bounded by the fetch limit (≤1024), and EXECUTE_BATCH_SIZE defaults to 1024 — so they're unlikely to diverge. But if they can diverge (e.g. EXECUTE_BATCH_SIZE is configured lower), this is a silent behavior change: blocks that should have been batch-executed will now be executed one-by-one, which is correct but slower.

Download task not awaited on execution error

full.rs:241:

download_task.await?;

This is only reached if the while let loop exits cleanly. If the loop exits early via result? (execution error), sync_cycle_full returns immediately, dropping the JoinHandle without awaiting. The download task continues running until it detects the closed channel. For a sync function this is tolerable, but worth noting: the task may hold peer connections or locks briefly after the function has declared failure.

A cleaner pattern (not mandatory, but good hygiene):

// on early exit, abort the download task
download_task.abort();

Or use a CancellationToken that's shared with the download task.


Code Quality

Significant code duplication

The inner body-download loop appears verbatim in both the single-batch path (full.rs:134-154) and the multi-batch path (full.rs:190-211). Extract a helper:

async fn download_bodies(
    peers: &mut PeerHandler,
    mut headers: Vec<BlockHeader>,
) -> Result<Vec<Block>, SyncError> {
    let mut blocks = Vec::new();
    while !headers.is_empty() {
        let end = min(MAX_BLOCK_BODIES_TO_REQUEST, headers.len());
        match peers.request_block_bodies(&headers[..end]).await {
            Ok(Some(bodies)) => {
                debug!("Obtained: {} block bodies", bodies.len());
                let batch = headers.drain(..bodies.len())
                    .zip(bodies)
                    .map(|(h, b)| Block { header: h, body: b });
                blocks.extend(batch);
            }
            Ok(None) => return Err(SyncError::BodiesNotFound),
            Err(e) => return Err(e.into()),
        }
    }
    Ok(blocks)
}

This would eliminate ~40 lines of duplication and make the two paths trivial to compare.

Asymmetric send-error handling

Single-batch path (full.rs:157):

let _ = body_tx.send(Ok((blocks, final_batch))).await;

Multi-batch path (full.rs:213):

if body_tx.send(Ok((blocks, final_batch))).await.is_err() {
    return;
}

The single-batch path discards the send result. Since the task returns immediately afterward, checking the result would change nothing functionally, but the inconsistency is confusing to a reader. Either handle the result the same way in both paths or add a comment explaining why it is intentionally ignored.

Unused variable binding

full.rs:131: let final_batch = true; could just be a literal true at the call site, or at minimum named const SINGLE_BATCH_IS_FINAL: bool = true; to make intent explicit. Minor.


Error Propagation

Error handling from the download task is correct:

  • body_tx.send(Err(e)) propagates errors to the receiver via result? in the main loop.
  • Panics in the download task propagate via download_task.await? as SyncError::JoinHandle.
  • Channel closure detection in the multi-batch path (is_err()) stops the download task when the receiver exits early.

The From impls on SyncError handle all the .into() conversions cleanly.


Performance

The channel buffer of 2 is a good tradeoff: the download task can prefetch one batch ahead while the current batch executes. Large buffers would increase memory pressure during sync.

No concerns on the execution-side performance; add_blocks_in_batch structure is unchanged.


Summary

| Category | Severity | Item |
| --- | --- | --- |
| Correctness | Medium | Single-batch always uses final_batch = true, potentially changing execution mode for large single batches |
| Correctness | Low | Empty batch silently drops the final_batch = true signal |
| Design | Low | Download task not aborted/awaited on execution error |
| Style | Medium | Inner body-download loop duplicated in both paths |
| Style | Low | Asymmetric send-error handling in single vs multi-batch paths |

The pipelining design is solid. The main request before merging is to extract the duplicated download loop into a shared helper and verify the single-batch / EXECUTE_BATCH_SIZE interaction. The test plan (hoodi full sync + timing comparison) is the right way to validate the throughput claim.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions

🤖 Codex Code Review

Findings

  1. The new downloader task is only joined on the success path, so failure paths now leave background peer requests running after sync_cycle_full has already returned. The task is spawned at full.rs:128, but it is only awaited at full.rs:241. Any early ? from the receive loop at full.rs:223 or from block execution at full.rs:230 returns immediately and skips that join. Because the task uses a cloned PeerHandler, it can continue issuing request_block_bodies calls until its next channel send fails, which means stale full-sync traffic can overlap with the next sync attempt. I’d treat this as a correctness/resource-lifecycle regression. Suggested fix: keep the spawned task in a guard and abort/await it on every exit path, or pass a cancellation signal into the downloader and always await shutdown before returning.

Notes

I did not see an EVM/gas-accounting or consensus-rule regression in the diff itself; batch ordering is still FIFO. I couldn’t run cargo check in this environment because rustup tried to write under a read-only path, so compile/test verification is still outstanding.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@azteca1998 azteca1998 changed the title feat(sync): pipeline body download with execution feat(l1): pipeline body download with execution Apr 16, 2026
@github-actions github-actions Bot added the L1 Ethereum client label Apr 16, 2026
