Skip to content

feat: retain proofs of non-target nodes in certain edge-cases. #109

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mediocregopher
Copy link
Collaborator

@mediocregopher mediocregopher commented Jul 28, 2025

There are specific edge-cases where it's necessary to retain the proofs for nodes which aren't given in the target set, namely leaf removals which result in the deletion of a branch, and leaf additions which result in the creation of a branch.

Documentation of each case is provided in the code at the point it is handled.

This change will cause more proofs than are strictly necessary to be retained, because the target set we are given does not tell us if paths are added, updated or removed. This extra work is made up for by us not needing to fetch the nodes which would otherwise be missing later down the pipeline, and in benchmarking the overall change comes out very slightly faster.

Partially addresses paradigmxyz/reth#17571

@mediocregopher mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch 2 times, most recently from c95a020 to bb4e492 Compare July 28, 2025 13:13
There are specific edge-cases where it's necessary to retain the proofs
for nodes which aren't given in the target set, namely leaf removals
which result in the deletion of a branch, and leaf additions which
result in the creation of a branch.

Documentation of each case is provided in the code at the point it is
handled.

This change will cause more proofs than are strictly necessary to be
retained, because the `target` set we are given does not tell us if
paths are added, updated or removed. This extra work is made up for by
us not needing to fetch the nodes which would otherwise be missing later
down the pipeline, and in benchmarking the overall change comes out very
slightly faster.
@mediocregopher mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch from bb4e492 to cfaaac3 Compare July 28, 2025 13:17
mediocregopher added a commit to paradigmxyz/reth that referenced this pull request Jul 28, 2025
Fixes #17571

**Background**

The Serial/ParallelSparseTrie's have their nodes revealed using the
DecodedMultiProofs generated based on the previous database state and
the changed leafset. Once revealed, the leaf updates/removals are
applied to the sparse tries and root hashes are calculated.

There are specific edge-cases during leaf adding/removal where a node
which is outside the changeset is required to complete the operation. In
these cases we were falling back to a singular database lookup to fill
in the gap.

By modifying the generation of the DecodedMultiProofs to include those
extra required nodes we can simplify the sparse trie implementations
significantly, as well as the surrounding engine code which supports
these one-off lookups.

**Changes**

This primarily relies on a change to the ProofRetainer in alloy-trie,
made here:

alloy-rs/trie#109

These changes are enabled by a flag which the payload validator enables;
other places where proofs are generated remain unnaffected.

There is a further change here to the trie walker to not skip over
branch nodes when they might be involved in a leaf removal. A change to
support leaf addition is not required, because that case involves
children of extension nodes, and we don't ever skip over extension nodes
anyway.

The sparse trie code to fallback to the db for missing nodes remains in
place for now, but we emit a warning when it happens. This will let us
track if there's any edge-cases we're missing.

**Benchmarks - 2k blocks on main**

No significant change

```
Timestamp: 2025-07-28 10:37:12 UTC
Baseline: main
Feature:  origin/mediocregopher/17571-leaf-updates-removals

Performance Changes:
  NewPayload Latency: -1.24%
  FCU Latency:        -1.74%
  Total Latency:      -1.26%
  Gas/Second:         +1.27%
  Blocks/Second:      +1.27%

Baseline Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 74.02s
  Avg NewPayload: 35.69ms, Avg FCU: 1.29ms, Avg Total: 36.99ms
  Started: 2025-07-28 10:48:20 UTC, Ended: 2025-07-28 10:56:39 UTC

Feature Summary:
  Blocks: 2000, Gas: 36118922440, Duration: 73.08s
  Avg NewPayload: 35.25ms, Avg FCU: 1.27ms, Avg Total: 36.52ms
  Started: 2025-07-28 10:58:35 UTC, Ended: 2025-07-28 11:06:55 UTC
```
Copy link

codspeed-hq bot commented Jul 28, 2025

CodSpeed Performance Report

Merging #109 will not alter performance

Comparing mediocregopher:mediocregopher/17571-leaf-update-removal (6732c80) with main (a54766b)

Summary

✅ 4 untouched benchmarks

@mediocregopher mediocregopher marked this pull request as ready for review July 28, 2025 15:13
Copy link
Contributor

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, pending @Rjected

graphs are very helpful

Comment on lines 146 to 149
// When a new leaf is being added to a trie, it can happen that an extension node's
// path is a prefix of the new leaf's, but the extension child's path is not. In
// this case a new branch node is created; the new leaf is one child, the previous
// branch node is its other, and the extension node is its parent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is helpful

@gakonst
Copy link
Member

gakonst commented Jul 30, 2025

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

wdym?

Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com>
@mediocregopher
Copy link
Collaborator Author

I see, this makes sense conceptually, I do wonder if it would be worth adding more granular prefixsets to the proof retainer so we don't retain more nodes than we need

wdym?

The ProofRetainer only knows which leaves changed, but not in what way they changed (added, updated, or removed). If we had that more fine-grained information we could return the extra edge-case proofs only precisely when they would be required. As it is, the ProofRetainer only knows when it might need to keep one of these extra nodes, and so it ends up keeping more than are strictly required.

@mediocregopher mediocregopher marked this pull request as draft July 30, 2025 15:12
@mediocregopher
Copy link
Collaborator Author

mediocregopher commented Jul 30, 2025

I would have sworn I tested all this on mainnet already, but when I reran it today I discovered a hole in the logic. It can happen that a branch has three children, two of which are removed leaves and the last which isn't in the target set. In this case we still need the proof of the nontarget child. Same with a 4-child branch where 3 leaves are removed, etc...

@mediocregopher mediocregopher marked this pull request as ready for review August 6, 2025 04:01
@mediocregopher
Copy link
Collaborator Author

mediocregopher commented Aug 6, 2025

This is now fully tested on mainnet, ran 2k blocks using paradigmxyz/reth#17643 and saw zero requests being made by the sparse tries, indicating that all required trie nodes are being revealed by the multiproof tasks.

A big change has been made here, which is the introduction of the AddedRemovedKeys trait, which is what is used by the ProofRetainer builder to both pass in keys which have been added/removed in the block and to enable the new behavior of retaining extra non-target nodes related to leaf addition/removal.

The trait is designed so we have a fair amount of flexibility in how we keep track of the keys; ultimately the ProofRetainer only cares about whether certain paths are prefixes of those which have been added/removed.

There's still more work which could be done here to narrow down which extra nodes are returned, specifically with regard to leaf removals. If we were to track a stack of masks within the ProofRetainer it might be possible to fully accurately determine which branches will be removed and only retain the extra branch child proof for those cases. Right now we are pessimistically assuming a branch will be removed if all but one of its children are a prefix of a removed leaf, and returning more proofs than are necessary as a result.

@mediocregopher mediocregopher force-pushed the mediocregopher/17571-leaf-update-removal branch from 07bff5b to 263708f Compare August 8, 2025 00:43
@Rjected Rjected self-assigned this Aug 13, 2025
Copy link
Contributor

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mainly would like to avoid the generic if we can, because it seems like the default type is the only one being used? the logic for retaining based on added / removed keys makes sense to me

Comment on lines +71 to +77
pub struct ProofRetainer<K = AddedRemovedKeys> {
/// The nibbles of the target trie keys to retain proofs for.
targets: Vec<Nibbles>,
/// The map retained trie node keys to RLP serialized trie nodes.
proof_nodes: ProofNodes,
/// Provided by the user to give the necessary context to retain extra proofs.
added_removed_keys: Option<K>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have a generic here? I think we could just have this be AddedRemovedKeys, although if it needs to be shared maybe it should be a reference type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddedRemovedKeys needs to also be read by the TrieWalker during proof generation:

https://github.com/paradigmxyz/reth/blob/c5fb4ce77d11b8739bbbc96d06bf6b1bee450a51/crates/trie/parallel/src/proof.rs#L241-L254

So it definitely cannot just be AddedRemovedKeys without cloning it for every proof. We could make it instead Option<&'k AddedRemovedKeys>, but that leaks the 'k onto the HashBuilder and that kind of infects everything because lifetimes can't have a default. The only other option would be to make this Arc<AddedRemovedKeys>, as that is what is actually being used by reth.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah I think an arc makes sense here then

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I spoke too soon, it's not as simple as passing an Arc<AddedRemovedKeys>.

The problem is that in the reth PR we have MultiAddedRemovedKeys which is cloned and put in an Arc, and then that Arc is passed to all the multiproof chunks.

From there a reference to either account or storage AddedRemovedKeys instances are queried from the Arc<MultiAddedRemovedKeys> (here and here) and passed to the TrieWalker and ProofRetainer/HashBuilder.

So in order for an Arc<AddedRemovedKeys> to be used here it would be necessary to have something like MultiAddedRemovedKeysMut which is held by the MultiProofTask and has AddedRemovedKeys internally, and then a MultiAddedRemovedKeys which can be created from a MultiAddedRemovedKeysMut, and which has Arc<AddedRemovedKeys> instances internally.

Imo at that point the generic is the cleaner solution, and more performant because there's only one Arc (for the MultiAddedRemovedKeys) rather than one for every individual AddedRemovedKeys. I'll defer to you though.

Copy link
Contributor

@Rjected Rjected Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so is the issue that: we have an Arc to the higher level MultiAddedRemovedKeys in reth, and no arc to the AddedRemovedKeys themselves, we can only obtain regular references. So we need the generic to avoid lifetime leaks, and so we don't need to convert a map, to a map of references every time

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we don't need to convert a map, to a map of references every time

to a map of Arcs, but otherwise yes that's right

@mediocregopher mediocregopher requested a review from Rjected August 18, 2025 17:36
Copy link
Contributor

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, yeah it seems like the generic is the best option here

@gakonst gakonst merged commit 1889c65 into alloy-rs:main Aug 18, 2025
24 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/17571-leaf-update-removal branch August 19, 2025 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants