Skip to content

chore(trie): fully reveal sparse tries prior to leaf updates/removals #17643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 37 commits into from
Aug 22, 2025

Conversation

mediocregopher
Copy link
Collaborator

Fixes #17571

Background

The Serial/ParallelSparseTrie's have their nodes revealed using the DecodedMultiProofs generated based on the previous database state and the changed leafset. Once revealed, the leaf updates/removals are applied to the sparse tries and root hashes are calculated.

There are specific edge-cases during leaf adding/removal where a node which is outside the changeset is required to complete the operation. In these cases we were falling back to a singular database lookup to fill in the gap.

By modifying the generation of the DecodedMultiProofs to include those extra required nodes we can simplify the sparse trie implementations significantly, as well as the surrounding engine code which supports these one-off lookups.

Changes

This primarily relies on a change to the ProofRetainer in alloy-trie, made here:

alloy-rs/trie#109

These changes are enabled by a flag which the payload validator enables; other places where proofs are generated remain unnaffected.

There is a further change here to the trie walker to not skip over branch nodes when they might be involved in a leaf removal. A change to support leaf addition is not required, because that case involves children of extension nodes, and we don't ever skip over extension nodes anyway.

The sparse trie code to fallback to the db for missing nodes remains in place for now, but we emit a warning when it happens. This will let us track if there's any edge-cases we're missing.

Benchmarks - 2k blocks on main

No significant change

Timestamp: 2025-07-28 10:37:12 UTC
Baseline: main
Feature:  origin/mediocregopher/17571-leaf-updates-removals

Performance Changes:
  NewPayload Latency: -1.24%
  FCU Latency:        -1.74%
  Total Latency:      -1.26%
  Gas/Second:         +1.27%
  Blocks/Second:      +1.27%

Baseline Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 74.02s
  Avg NewPayload: 35.69ms, Avg FCU: 1.29ms, Avg Total: 36.99ms
  Started: 2025-07-28 10:48:20 UTC, Ended: 2025-07-28 10:56:39 UTC

Feature Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 73.08s
  Avg NewPayload: 35.25ms, Avg FCU: 1.27ms, Avg Total: 36.52ms
  Started: 2025-07-28 10:58:35 UTC, Ended: 2025-07-28 11:06:55 UTC

Fixes #17571

**Background**

The Serial/ParallelSparseTrie's have their nodes revealed using the
DecodedMultiProofs generated based on the previous database state and
the changed leafset. Once revealed, the leaf updates/removals are
applied to the sparse tries and root hashes are calculated.

There are specific edge-cases during leaf adding/removal where a node
which is outside the changeset is required to complete the operation. In
these cases we were falling back to a singular database lookup to fill
in the gap.

By modifying the generation of the DecodedMultiProofs to include those
extra required nodes we can simplify the sparse trie implementations
significantly, as well as the surrounding engine code which supports
these one-off lookups.

**Changes**

This primarily relies on a change to the ProofRetainer in alloy-trie,
made here:

alloy-rs/trie#109

These changes are enabled by a flag which the payload validator enables;
other places where proofs are generated remain unnaffected.

There is a further change here to the trie walker to not skip over
branch nodes when they might be involved in a leaf removal. A change to
support leaf addition is not required, because that case involves
children of extension nodes, and we don't ever skip over extension nodes
anyway.

The sparse trie code to fallback to the db for missing nodes remains in
place for now, but we emit a warning when it happens. This will let us
track if there's any edge-cases we're missing.

**Benchmarks - 2k blocks on main**

No significant change

```
Timestamp: 2025-07-28 10:37:12 UTC
Baseline: main
Feature:  origin/mediocregopher/17571-leaf-updates-removals

Performance Changes:
  NewPayload Latency: -1.24%
  FCU Latency:        -1.74%
  Total Latency:      -1.26%
  Gas/Second:         +1.27%
  Blocks/Second:      +1.27%

Baseline Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 74.02s
  Avg NewPayload: 35.69ms, Avg FCU: 1.29ms, Avg Total: 36.99ms
  Started: 2025-07-28 10:48:20 UTC, Ended: 2025-07-28 10:56:39 UTC

Feature Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 73.08s
  Avg NewPayload: 35.25ms, Avg FCU: 1.27ms, Avg Total: 36.52ms
  Started: 2025-07-28 10:58:35 UTC, Ended: 2025-07-28 11:06:55 UTC
```
@mediocregopher mediocregopher added C-perf A change motivated by improving speed, memory usage or disk footprint A-trie Related to Merkle Patricia Trie implementation labels Jul 28, 2025
@Rjected
Copy link
Member

Rjected commented Jul 30, 2025

The sparse trie code to fallback to the db for missing nodes remains in place for now, but we emit a warning when it happens. This will let us track if there's any edge-cases we're missing.

Assuming there are no warnings from the benches? Are there any changes in the metrics (nodes fetched for ex)

@mediocregopher
Copy link
Collaborator Author

The sparse trie code to fallback to the db for missing nodes remains in place for now, but we emit a warning when it happens. This will let us track if there's any edge-cases we're missing.

Assuming there are no warnings from the benches? Are there any changes in the metrics (nodes fetched for ex)

I swear there were no warnings when I ran this against mainnet on monday, but now I'm seeing two related to leaf removal, so there's a bit more investigation to do here 🤦

Regarding metrics, I just reran the bench (2k blocks on mainnet) and this time it came out 3% faster 🤷 there were indeed more multiproofs fetched and more db txs generally though.

shot-1753882921 shot-1753883001

Summary:
Total multiproof account nodes: +20.4118%
Total multiproof storage nodes: +19.1292%
Opened Read-Write Transactions: +13.9773%
Opened Read-Only Transactions: +12.8246%

Copy link
Member

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comments are very helpful, I mainly have one question about empty account edge cases, and a suggestion wrt the span ID

Comment on lines 40 to 59
if account.is_empty() {
self.account.insert_removed(*hashed_address);
}
continue
}

let storage_removed_keys =
self.storages.entry(*hashed_address).or_insert_with(default_added_removed_keys);

for (key, val) in &storage.storage {
if *val == U256::ZERO {
storage_removed_keys.insert_removed(*key);
} else {
storage_removed_keys.remove_removed(key);
}
}

if !account.is_empty() {
self.account.remove_removed(hashed_address);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, is it possible for these account empty cases to be hit? not suggesting they be removed, just wondering, we might want to just add docs if they can't be

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought it wasn't possible, but they do actually get hit. For example if I comment out these self.account lines I get this error on block 22729256 of mainnet:

WARN Branch node child not revealed in remove_leaf, falling back to db child_path=Nibbles(0xba982e83) leaf_full_path=Nibbles(0xba982e827371ca20eb274c77c02d21b0acb7e04fed4f6b28a73948060688dc6e)

This corresponds to https://etherscan.io/address/0xecb7ca9ec4cb52536b61227176993216cf4f4154, which gets self-destructed in that tx without having ever called a tx itself (and therefore having a zero nonce).

@mediocregopher mediocregopher marked this pull request as ready for review August 19, 2025 10:39
@mediocregopher mediocregopher requested a review from Rjected August 21, 2025 09:35
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I was able to follow along here

all of this made sense, lgtm but also want @Rjected to take a final look at this

@@ -183,7 +184,7 @@ where

/// Generates storage merkle proofs.
#[derive(Debug)]
pub struct StorageProof<T, H> {
pub struct StorageProof<T, H, K = AddedRemovedKeys> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, this way this works for both, proofretainer and addedremovedkeys

makes sense

Copy link
Member

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Aug 21, 2025
@Rjected
Copy link
Member

Rjected commented Aug 21, 2025

needs docs fixes

@mediocregopher mediocregopher added this pull request to the merge queue Aug 22, 2025
Merged via the queue into main with commit 8193fcf Aug 22, 2025
41 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/17571-leaf-updates-removals branch August 22, 2025 09:34
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Aug 22, 2025
mediocregopher added a commit that referenced this pull request Aug 22, 2025
In #17643 we introduced tracking of removed keys within a block in order
to fully reveal sparse tries prior to updating/removing leafs. This lets
us never have to block root calculation to reveal blinded nodes in the
sparse tries.

This was implemented by assuming that all keys are added (as opposed to
being only modified), which results in over-revealing the sparse tries
with all extension node children. This isn't logically incorrect, but is
a bit wasteful.

This change implements proper tracking of added keys alongside tracking
of removed keys. When a key is added we always generate its proof in
`on_state_update`, just like key removal. Because we can now enforce
that added keys have their proofs generated in `on_state_update` we no
longer need to optimistically retain extra proofs in prewarms.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-trie Related to Merkle Patricia Trie implementation C-perf A change motivated by improving speed, memory usage or disk footprint
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Updates and removals of leaves cause sparse trie thread to block
3 participants