feat: retain proofs of non-target nodes in certain edge-cases. #109

mediocregopher · 2025-07-28T12:06:00Z

There are specific edge-cases where it's necessary to retain the proofs for nodes which aren't given in the target set, namely leaf removals which result in the deletion of a branch, and leaf additions which result in the creation of a branch.

Documentation of each case is provided in the code at the point it is handled.

This change will cause more proofs than are strictly necessary to be retained, because the target set we are given does not tell us if paths are added, updated or removed. This extra work is made up for by us not needing to fetch the nodes which would otherwise be missing later down the pipeline, and in benchmarking the overall change comes out very slightly faster.

Partially addresses paradigmxyz/reth#17571

There are specific edge-cases where it's necessary to retain the proofs for nodes which aren't given in the target set, namely leaf removals which result in the deletion of a branch, and leaf additions which result in the creation of a branch. Documentation of each case is provided in the code at the point it is handled. This change will cause more proofs than are strictly necessary to be retained, because the `target` set we are given does not tell us if paths are added, updated or removed. This extra work is made up for by us not needing to fetch the nodes which would otherwise be missing later down the pipeline, and in benchmarking the overall change comes out very slightly faster.

Fixes #17571 **Background** The Serial/ParallelSparseTrie's have their nodes revealed using the DecodedMultiProofs generated based on the previous database state and the changed leafset. Once revealed, the leaf updates/removals are applied to the sparse tries and root hashes are calculated. There are specific edge-cases during leaf adding/removal where a node which is outside the changeset is required to complete the operation. In these cases we were falling back to a singular database lookup to fill in the gap. By modifying the generation of the DecodedMultiProofs to include those extra required nodes we can simplify the sparse trie implementations significantly, as well as the surrounding engine code which supports these one-off lookups. **Changes** This primarily relies on a change to the ProofRetainer in alloy-trie, made here: alloy-rs/trie#109 These changes are enabled by a flag which the payload validator enables; other places where proofs are generated remain unnaffected. There is a further change here to the trie walker to not skip over branch nodes when they might be involved in a leaf removal. A change to support leaf addition is not required, because that case involves children of extension nodes, and we don't ever skip over extension nodes anyway. The sparse trie code to fallback to the db for missing nodes remains in place for now, but we emit a warning when it happens. This will let us track if there's any edge-cases we're missing. **Benchmarks - 2k blocks on main** No significant change ``` Timestamp: 2025-07-28 10:37:12 UTC Baseline: main Feature: origin/mediocregopher/17571-leaf-updates-removals Performance Changes: NewPayload Latency: -1.24% FCU Latency: -1.74% Total Latency: -1.26% Gas/Second: +1.27% Blocks/Second: +1.27% Baseline Summary: Blocks: 2000, Gas: 36118922440, Duration: 74.02s Avg NewPayload: 35.69ms, Avg FCU: 1.29ms, Avg Total: 36.99ms Started: 2025-07-28 10:48:20 UTC, Ended: 2025-07-28 10:56:39 UTC Feature Summary: Blocks: 2000, Gas: 36118922440, Duration: 73.08s Avg NewPayload: 35.25ms, Avg FCU: 1.27ms, Avg Total: 36.52ms Started: 2025-07-28 10:58:35 UTC, Ended: 2025-07-28 11:06:55 UTC ```

codspeed-hq · 2025-07-28T15:09:02Z

CodSpeed Performance Report

Merging #109 will not alter performance

_{Comparing mediocregopher:mediocregopher/17571-leaf-update-removal (6732c80) with main (a54766b)}

Summary

✅ 4 untouched benchmarks

Rjected

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

src/proof/retainer.rs

mattsse

lgtm, pending @Rjected

graphs are very helpful

mattsse · 2025-07-30T10:59:37Z

src/proof/retainer.rs

+            // When a new leaf is being added to a trie, it can happen that an extension node's
+            // path is a prefix of the new leaf's, but the extension child's path is not. In
+            // this case a new branch node is created; the new leaf is one child, the previous
+            // branch node is its other, and the extension node is its parent.


this is helpful

gakonst · 2025-07-30T11:14:56Z

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

wdym?

Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>

mediocregopher · 2025-07-30T12:05:36Z

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

wdym?

The ProofRetainer only knows which leaves changed, but not in what way they changed (added, updated, or removed). If we had that more fine-grained information we could return the extra edge-case proofs only precisely when they would be required. As it is, the ProofRetainer only knows when it might need to keep one of these extra nodes, and so it ends up keeping more than are strictly required.

mediocregopher · 2025-07-30T15:13:46Z

I would have sworn I tested all this on mainnet already, but when I reran it today I discovered a hole in the logic. It can happen that a branch has three children, two of which are removed leaves and the last which isn't in the target set. In this case we still need the proof of the nontarget child. Same with a 4-child branch where 3 leaves are removed, etc...

mediocregopher · 2025-08-06T04:06:08Z

This is now fully tested on mainnet, ran 2k blocks using paradigmxyz/reth#17643 and saw zero requests being made by the sparse tries, indicating that all required trie nodes are being revealed by the multiproof tasks.

A big change has been made here, which is the introduction of the AddedRemovedKeys trait, which is what is used by the ProofRetainer builder to both pass in keys which have been added/removed in the block and to enable the new behavior of retaining extra non-target nodes related to leaf addition/removal.

The trait is designed so we have a fair amount of flexibility in how we keep track of the keys; ultimately the ProofRetainer only cares about whether certain paths are prefixes of those which have been added/removed.

There's still more work which could be done here to narrow down which extra nodes are returned, specifically with regard to leaf removals. If we were to track a stack of masks within the ProofRetainer it might be possible to fully accurately determine which branches will be removed and only retain the extra branch child proof for those cases. Right now we are pessimistically assuming a branch will be removed if all but one of its children are a prefix of a removed leaf, and returning more proofs than are necessary as a result.

src/proof/retainer.rs

This reverts commit 1e4be6f.

src/proof/retainer.rs

src/hash_builder/mod.rs

Rjected

mainly would like to avoid the generic if we can, because it seems like the default type is the only one being used? the logic for retaining based on added / removed keys makes sense to me

Rjected · 2025-08-17T21:10:38Z

src/proof/retainer.rs

+pub struct ProofRetainer<K = AddedRemovedKeys> {
    /// The nibbles of the target trie keys to retain proofs for.
    targets: Vec<Nibbles>,
    /// The map retained trie node keys to RLP serialized trie nodes.
    proof_nodes: ProofNodes,
+    /// Provided by the user to give the necessary context to retain extra proofs.
+    added_removed_keys: Option<K>,


why do we have a generic here? I think we could just have this be AddedRemovedKeys, although if it needs to be shared maybe it should be a reference type?

AddedRemovedKeys needs to also be read by the TrieWalker during proof generation:

https://github.com/paradigmxyz/reth/blob/c5fb4ce77d11b8739bbbc96d06bf6b1bee450a51/crates/trie/parallel/src/proof.rs#L241-L254

So it definitely cannot just be AddedRemovedKeys without cloning it for every proof. We could make it instead Option<&'k AddedRemovedKeys>, but that leaks the 'k onto the HashBuilder and that kind of infects everything because lifetimes can't have a default. The only other option would be to make this Arc<AddedRemovedKeys>, as that is what is actually being used by reth.

I see, yeah I think an arc makes sense here then

Actually I spoke too soon, it's not as simple as passing an Arc<AddedRemovedKeys>.

The problem is that in the reth PR we have MultiAddedRemovedKeys which is cloned and put in an Arc, and then that Arc is passed to all the multiproof chunks.

From there a reference to either account or storage AddedRemovedKeys instances are queried from the Arc<MultiAddedRemovedKeys> (here and here) and passed to the TrieWalker and ProofRetainer/HashBuilder.

So in order for an Arc<AddedRemovedKeys> to be used here it would be necessary to have something like MultiAddedRemovedKeysMut which is held by the MultiProofTask and has AddedRemovedKeys internally, and then a MultiAddedRemovedKeys which can be created from a MultiAddedRemovedKeysMut, and which has Arc<AddedRemovedKeys> instances internally.

Imo at that point the generic is the cleaner solution, and more performant because there's only one Arc (for the MultiAddedRemovedKeys) rather than one for every individual AddedRemovedKeys. I'll defer to you though.

I see, so is the issue that: we have an Arc to the higher level MultiAddedRemovedKeys in reth, and no arc to the AddedRemovedKeys themselves, we can only obtain regular references. So we need the generic to avoid lifetime leaks, and so we don't need to convert a map, to a map of references every time

so we don't need to convert a map, to a map of references every time

to a map of Arcs, but otherwise yes that's right

src/proof/retainer.rs

src/proof/added_removed_keys.rs

Rjected

looks good to me, yeah it seems like the generic is the best option here

mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch 2 times, most recently from c95a020 to bb4e492 Compare July 28, 2025 13:13

mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch from bb4e492 to cfaaac3 Compare July 28, 2025 13:17

mediocregopher mentioned this pull request Jul 28, 2025

chore(trie): fully reveal sparse tries prior to leaf updates/removals paradigmxyz/reth#17643

Merged

mediocregopher marked this pull request as ready for review July 28, 2025 15:13

mediocregopher requested review from DaniPopes, gakonst and rkrasiuk as code owners July 28, 2025 15:13

Rjected reviewed Jul 30, 2025

View reviewed changes

src/proof/retainer.rs Outdated Show resolved Hide resolved

mattsse approved these changes Jul 30, 2025

View reviewed changes

Update src/proof/retainer.rs

d9c56e9

Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>

mediocregopher marked this pull request as draft July 30, 2025 15:12

mediocregopher added 4 commits July 31, 2025 12:50

Fully track branch target children, to properly handle collapse case

d8c2f02

Introduce AddedRemovedKeys, use that for trackinge extra proofs

eee23a5

debug

dc8ec83

Don't track removed leaves as non-targets

e0ef480

mediocregopher requested review from mattsse and Rjected August 6, 2025 04:01

mediocregopher marked this pull request as ready for review August 6, 2025 04:01

DaniPopes reviewed Aug 6, 2025

View reviewed changes

src/proof/retainer.rs Outdated Show resolved Hide resolved

mediocregopher added 3 commits August 6, 2025 21:59

Make AddedRemovedKeys into a concrete type

49e2d80

PR suggestion

6bb1dac

Remove assume_added

1e4be6f

Revert "Remove assume_added"

61f2db1

This reverts commit 1e4be6f.

graphite-app bot reviewed Aug 7, 2025

View reviewed changes

src/proof/retainer.rs Outdated Show resolved Hide resolved

graphite-app bot reviewed Aug 7, 2025

View reviewed changes

src/hash_builder/mod.rs Show resolved Hide resolved

Fix unnecessary boolean being included

263708f

mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch from 07bff5b to 263708f Compare August 8, 2025 00:43

Rjected self-assigned this Aug 13, 2025

Rjected requested changes Aug 17, 2025

View reviewed changes

mediocregopher added 2 commits August 18, 2025 14:29

Add clear_added method

0171755

PR feedback

6732c80

mediocregopher requested a review from Rjected August 18, 2025 17:36

Rjected approved these changes Aug 18, 2025

View reviewed changes

gakonst merged commit 1889c65 into alloy-rs:main Aug 18, 2025
24 checks passed

mediocregopher deleted the mediocregopher/17571-leaf-update-removal branch August 19, 2025 09:26

feat: retain proofs of non-target nodes in certain edge-cases. #109

feat: retain proofs of non-target nodes in certain edge-cases. #109

Uh oh!

Conversation

mediocregopher commented Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codspeed-hq bot commented Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #109 will not alter performance

Summary

Uh oh!

Rjected left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mattsse left a comment

Choose a reason for hiding this comment

Uh oh!

mattsse Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

gakonst commented Jul 30, 2025

Uh oh!

mediocregopher commented Jul 30, 2025

Uh oh!

mediocregopher commented Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mediocregopher commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Rjected left a comment

Choose a reason for hiding this comment

Uh oh!

Rjected Aug 17, 2025

Choose a reason for hiding this comment

Uh oh!

mediocregopher Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

Rjected Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

mediocregopher Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

Rjected Aug 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mediocregopher Aug 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Rjected left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mediocregopher commented Jul 28, 2025 •

edited

Loading

codspeed-hq bot commented Jul 28, 2025 •

edited

Loading

mediocregopher commented Jul 30, 2025 •

edited

Loading

mediocregopher commented Aug 6, 2025 •

edited

Loading

Rjected Aug 18, 2025 •

edited

Loading