
feat: add more invalid BAL test cases; extend invalid case coverage #2653

Merged
marioevz merged 8 commits into ethereum:forks/amsterdam from
fselmo:feat/more-bal-invalid-test-coverage
Apr 15, 2026

Conversation

@fselmo
Contributor

@fselmo fselmo commented Apr 10, 2026

🗒️ Description

Extend invalid BAL test coverage by adding:

  • test_bal_invalid_hash_mismatch: injects a wrong hash via the header_modify option. We weren't testing this before.

  • Add missing coverage / test cases for:

    • missing_storage_change
    • missing_storage_read
    • missing_code_change
    • wrong_code_value
  • Adds invalid coinbase test cases:

    • test_bal_invalid_missing_coinbase — fee recipient removed from BAL (block has tx with tip)
    • test_bal_invalid_coinbase_balance_value — fee recipient balance wrong (999 instead of actual tip)
    • test_bal_invalid_extraneous_coinbase[empty_block] — spurious coinbase injected into block with no txs/withdrawals
    • test_bal_invalid_extraneous_coinbase[withdrawal_only] — spurious coinbase injected into block with only withdrawals (no fees paid)
  • Updates test_cases.md for the tests introduced here, as well as for the tests introduced in PR #2652 (tests(amsterdam): add BAL missing withdrawal account tests), which I forgot to add to test_cases.md. This helps us keep track of it all.

  • Audited the implemented tests and test_cases.md and attempted to sync the md with the current state of things (some renames were not updated and some implementations were not recorded there, but not a whole lot seems out of sync).

  • Consolidates unused, granular BAL exceptions that don't map 1-to-1 for any client - we should just keep these more generic for now. If granularity is added to clients, we can approach newer exceptions imo.

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.


@fselmo fselmo force-pushed the feat/more-bal-invalid-test-coverage branch from 40a762d to f7d0078 Compare April 10, 2026 19:34
@codecov

codecov Bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.25%. Comparing base (7f3ab55) to head (3c5f635).
⚠️ Report is 14 commits behind head on forks/amsterdam.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           forks/amsterdam    #2653      +/-   ##
===================================================
+ Coverage            86.24%   86.25%   +0.01%     
===================================================
  Files                  599      599              
  Lines                36984    37032      +48     
  Branches              3795     3795              
===================================================
+ Hits                 31895    31943      +48     
  Misses                4525     4525              
  Partials               564      564              
Flag Coverage Δ
unittests 86.25% <ø> (+0.01%) ⬆️


@fselmo fselmo force-pushed the feat/more-bal-invalid-test-coverage branch from f4fb9a5 to a9056b4 Compare April 10, 2026 20:37
@fselmo fselmo force-pushed the feat/more-bal-invalid-test-coverage branch from a9056b4 to 1d8d8e7 Compare April 10, 2026 20:37
@fselmo fselmo marked this pull request as ready for review April 10, 2026 20:41
@fselmo fselmo requested a review from raxhvl April 10, 2026 20:42
@fselmo
Contributor Author

fselmo commented Apr 10, 2026

cc: @nerolation

@fselmo fselmo added C-test Category: test A-tests Area: Consensus tests. labels Apr 10, 2026
@fselmo fselmo self-assigned this Apr 10, 2026
@fselmo
Contributor Author

fselmo commented Apr 10, 2026

I haven't checked bal-devnet-3, but after cherry-picking this onto bal-devnet-2 and running it against clients to verify the exception mapper message consolidation changes (they look good), I believe some of the tests found genuine client bugs. Summary from Claude:

  • hash_mismatch (4 clients fail): reth/nethermind report INVALID_RECEIPTS_ROOT, erigon/geth accept as VALID. These clients aren't validating the BAL hash field properly.
  • field_entries (nethermind × 4): nethermind reports INVALID_RECEIPTS_ROOT; it seems to be falling through to receipts root validation instead of catching the BAL content corruption.
  • missing_coinbase + coinbase_balance_value (nethermind): same pattern; nethermind doesn't detect the BAL issue and falls through to receipts root.
  • extraneous_coinbase (geth × 2): geth accepts the extraneous coinbase entry as VALID; it's not validating that the BAL shouldn't include accounts with no actual state changes.

Something to pay extra attention to when reviewing perhaps.

@fselmo fselmo force-pushed the feat/more-bal-invalid-test-coverage branch from e83ca7e to 6fd97fd Compare April 10, 2026 23:57
@raxhvl
Member

raxhvl commented Apr 13, 2026

suggestion: Increase invalid test coverage using the framework

BAL validation can be broken down into three dimensions:

  1. Correctness: each entry has the right value
  2. Exactness: exactly the right entries exist (no more, no less)
  3. Sequence: entries are in canonical order

BAL equivalence = Correctness + Exactness + Sequence.

Our tests should not assume how clients verify equivalence of computed vs provided BAL (hash, item-by-item, or otherwise).

A client that does zero BAL validation (one that simply passes through the provided BAL) will pass every happy path test. For every happy path test, we should have a negative test and ensure the client rejects it:

  • happy path: Alice sends Bob 1 ETH. BAL balance change 1 ETH. Accept block. Test passes.
  • invalid path: Alice sends Bob 1 ETH. BAL balance change 2 ETH. Reject block. Test passes.

However, instead of writing invalid tests by hand, we should automate them using the framework. We can derive these invalid tests from a single valid one:

  • Correctness
    • corrupting an existing entry's value,
  • Exactness
    • adding a bogus entry,
    • removing an entry,
    • duplicating an entry,
  • Sequence
    • (if there are multiple items) swapping entries.

The required modifiers for these mutations already exist. So we could add some kind of pytest hook over existing tests that introspects a valid BAL expectation, enumerates its entries, and produces N invalid variants, one per applicable mutation.

This way our invalid coverage grows organically with valid tests.
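A minimal sketch of the derivation idea, for illustration only. It assumes a deliberately simplified BAL model (a list of `(address, balance_change)` pairs in ascending address order); the `Entry`/`Bal` types and `derive_invalid_variants` are hypothetical names, not the framework's actual types or modifiers.

```python
from copy import deepcopy
from typing import Iterator, List, Tuple

# Hypothetical, simplified BAL model: (address, balance_change) pairs kept in
# ascending address order. The real framework types are richer than this.
Entry = Tuple[str, int]
Bal = List[Entry]


def derive_invalid_variants(valid: Bal) -> Iterator[Tuple[str, Bal]]:
    """Yield (mutation_name, invalid_bal) pairs derived from one valid BAL."""
    if not valid:
        return

    # Correctness: corrupt the value of an existing entry.
    corrupted = deepcopy(valid)
    addr, value = corrupted[0]
    corrupted[0] = (addr, value + 1)
    yield "corrupt_value", corrupted

    # Exactness: add a bogus entry, placed at its sorted position so that
    # only exactness (not ordering) is violated.
    bogus_addr = "0x" + "ff" * 20
    if all(a != bogus_addr for a, _ in valid):
        yield "add_entry", sorted(valid + [(bogus_addr, 1)])

    # Exactness: remove an entry.
    yield "remove_entry", valid[1:]

    # Exactness: duplicate an entry.
    yield "duplicate_entry", [valid[0]] + valid

    # Sequence: swap two entries, only applicable with more than one entry.
    if len(valid) > 1:
        swapped = deepcopy(valid)
        swapped[0], swapped[1] = swapped[1], swapped[0]
        yield "swap_entries", swapped
```

Each yielded variant could then feed one parametrized invalid test case, with the expectation that the client rejects the block regardless of how it compares the computed and provided BAL.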

@fselmo
Contributor Author

fselmo commented Apr 13, 2026

suggestion: Increase invalid test coverage using the framework

I think this is a great idea to attempt to implement as a separate exercise, perhaps even as its own simulator / test run. I can see something like a decorator used for specific tests that can do this sort of mutation; otherwise I suspect we will bloat invalid tests with a ton of overlap on code paths. It's worth exploring as a different test flow, as a kind of fuzzing equivalent for invalid tests.

I would argue we should get these test cases in and review this PR solely on the cases here though. We shouldn't wait for a big framework change like this. I support your idea but I'd like to get coverage here asap as I share your view on the importance of invalid tests at the current moment for devnet progress.

Contributor

@nerolation nerolation left a comment


Awesome thanks!

Reviewed and looks good to me.
Some nits that could be ignored:

  1. test_bal_invalid_field_entries passes alice= to every modifier lambda though none use it
  2. append_account in test_bal_invalid_extraneous_coinbase appends at the end without re-sorting. If the coinbase EOA address sorts below a prior entry, the client may reject for ordering rather than "extra account". Since it is about "INVALID_BLOCK_ACCESS_LIST" it might be fine(?)
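To illustrate point 2, here is a small sketch of inserting the extraneous account at its sorted position instead of appending it. It assumes BAL accounts are ordered by ascending address bytes; `insert_account_sorted` is a hypothetical helper, not the framework's `append_account` modifier.

```python
import bisect
from typing import List


def insert_account_sorted(addresses: List[bytes], extra: bytes) -> List[bytes]:
    """Return a new address list with `extra` placed at its sorted position.

    Assumes `addresses` is already in ascending byte order, so the result
    differs from the input only by the extra account, not by ordering.
    """
    result = list(addresses)  # copy so the valid list stays untouched
    bisect.insort(result, extra)
    return result


# Usage: the injected coinbase sorts between the two existing accounts.
accounts = [bytes.fromhex("11" * 20), bytes.fromhex("cc" * 20)]
coinbase = bytes.fromhex("2a" * 20)
mutated = insert_account_sorted(accounts, coinbase)
```

With the extra entry in order, a rejection can only be attributed to the spurious account rather than to a sequence violation.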

@fselmo
Copy link
Copy Markdown
Contributor Author

fselmo commented Apr 14, 2026

Some nits that could be ignored:

Nope, I think number 2 is a legitimate concern that shouldn't be ignored. Thanks! I addressed both here since I was already going into the code to make sure, especially for a reader of the test, that point 2 is applied. We need to remove any ambiguity about whether it's actually testing what we want to test. In this case, it was already in order, but if it helps with readability and future-proofing the test, then I think it's a good addition. Thanks for the review!

@marioevz marioevz assigned marioevz and unassigned fselmo Apr 14, 2026
Member

@marioevz marioevz left a comment


Just one comment in one of the tests.

Member

@marioevz marioevz left a comment


LGTM, thanks for the new check in Header/FixtureHeader!

@marioevz marioevz merged commit d206d4f into ethereum:forks/amsterdam Apr 15, 2026
21 checks passed
@fselmo fselmo deleted the feat/more-bal-invalid-test-coverage branch April 15, 2026 13:48

Labels

A-tests Area: Consensus tests. C-test Category: test


4 participants