
perf(l1): deduplicate trie nodes across layers before writing to RocksDB#6179

Closed
ilitteri wants to merge 3 commits into main from perf/dedup-trie-commit

Conversation


@ilitteri ilitteri commented Feb 11, 2026

Motivation

CPU profiling on the slow EXPB scenario (5000 mainnet blocks, ~30 min) shows InlineSkipList::Insert at 3.90% flat and MemTable::KeyComparator at 1.17% flat — both driven by WriteBatch size in TrieLayerCache::commit(). When 128 layers are committed, flat_map collects ALL trie node entries from ALL layers into a Vec with no deduplication. A node modified in multiple layers (e.g. the root node modified in all 128 layers) produces 128 duplicate WriteBatch entries. Only the latest value matters.

Description

Change TrieLayerCache::commit() to collect nodes into an FxHashMap instead of a Vec, iterating oldest-to-newest so newer values overwrite older ones. This eliminates duplicate entries before the WriteBatch is constructed, reducing InlineSkipList::Insert and MemTable::KeyComparator overhead proportionally.

No breaking changes

The return type remains Vec<(Vec<u8>, Vec<u8>)> — the HashMap is converted back to Vec before returning. The only behavioral change is fewer entries in the output (duplicates removed).
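The change described above can be sketched as follows. This is an illustrative stand-in, not the PR's actual code: the real implementation lives inside `TrieLayerCache::commit()` and uses `FxHashMap` from the `rustc-hash` crate, while this sketch uses `std::collections::HashMap` and a hypothetical free function so it stays self-contained.

```rust
use std::collections::HashMap;

// Sketch of the dedup step (hypothetical function; the PR does this
// inline in TrieLayerCache::commit()). Layers are collected
// newest-to-oldest, so we iterate in reverse (oldest-to-newest) and let
// later inserts overwrite earlier ones: the newest value wins.
fn dedup_layers(layers_newest_first: Vec<Vec<(Vec<u8>, Vec<u8>)>>) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut nodes: HashMap<Vec<u8>, Vec<u8>> = HashMap::new();
    for layer in layers_newest_first.into_iter().rev() {
        // extend() overwrites existing keys, keeping the newest value
        nodes.extend(layer);
    }
    // Convert back to Vec to preserve the existing return type
    nodes.into_iter().collect()
}
```

A node key present in all 128 layers thus contributes exactly one entry to the returned `Vec`, carrying the value from the newest layer.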

Benchmark Results (EXPB on ethrex-office-3)

Compared against fresh main baselines (#63-65) run back-to-back on the same machine.

Throughput

| Scenario | Main (Mgas/s) | Dedup (Mgas/s) | Change |
|---|---|---|---|
| fast (200 mainnet blocks) | 420.48 | 425.27 | +1.1% |
| gigablocks (100 synthetic max-gas blocks) | 909.63 | 1002.74 | +10.2% |
| slow (5000 mainnet blocks) | 811.04 | 825.18 | +1.7% |

Latency — Gigablocks

| Metric | Main | Dedup | Change |
|---|---|---|---|
| avg | 1410 ms | 1240 ms | -12.1% |
| median | 1200 ms | 1090 ms | -9.2% |
| p95 | 2760 ms | 2490 ms | -9.8% |
| p99 | 4340 ms | 2650 ms | -38.9% |

Latency — Slow (5000 mainnet blocks)

| Metric | Main | Dedup | Change |
|---|---|---|---|
| avg | 29.48 ms | 28.67 ms | -2.7% |

CPU Profile Comparison — Gigablocks

| Function | Main (#64) | Dedup (#68) | Change |
|---|---|---|---|
| Wall time | 230.28s | 207.70s | -9.8% |
| Total CPU samples | 49833ms (21.64%) | 48568ms (23.38%) | -2.5% abs |
| CompactionJob::Run | 18174ms (36.47%) | 18517ms (38.13%) | ~same |
| VM::execute | 12370ms (24.82%) | 11556ms (23.79%) | -6.6% abs |
| rayon::join (merkle) | 10573ms (21.22%) | 10037ms (20.67%) | -5.1% abs |
| PosixFile::Read (disk I/O) | 5854ms (11.75%) | 5753ms (11.85%) | ~same |

The gigablocks improvement is distributed across functions rather than concentrated in the write path, suggesting reduced overall pressure from smaller WriteBatches.

CPU Profile Comparison — Slow

| Function | Main (#65) | Dedup (#69) | Change |
|---|---|---|---|
| Wall time | 1826.40s | 1829.55s | ~same |
| Total CPU samples | 127137ms (6.96%) | 127836ms (6.99%) | ~same |
| CompactionJob::Run | 76185ms (59.92%) | 76462ms (59.81%) | ~same |
| WriteBatch::Iterate | 7893ms (6.21%) | 7728ms (6.05%) | -2.1% abs |
| InsertInto | 7790ms (6.13%) | 7637ms (5.97%) | -2.0% abs |
| InlineSkipList::Insert | 7026ms (5.53%) | 6941ms (5.43%) | -1.2% abs |
| VM::execute | 13244ms (10.42%) | 13088ms (10.24%) | -1.2% abs |

The slow profile shows modest reductions in the WriteBatch/SkipList path (~2%), indicating the actual duplicate count across 128 layers is lower than the worst case (root node x128). Most leaf nodes are only touched once across the commit window.

Note on gigablocks variance: The gigablocks baselines show natural variance between runs (old baseline #58: 942.98 vs fresh #64: 909.63 = 3.5% difference). The +10.2% improvement should be interpreted with this variance in mind; the slow scenario (+1.7%) is more reliable given its longer duration.

How to Test

  1. Run EXPB benchmarks: fast, gigablocks, and slow scenarios with CPU profiling
  2. Compare WriteBatch sizes in CPU profile (InlineSkipList::Insert and WriteBatchInternal::InsertInto should decrease)

TrieLayerCache::commit() collects nodes from ~128 layers via flat_map
into a Vec, producing duplicate entries when the same trie path is
modified in multiple layers. Only the latest value matters, but all
copies end up in the WriteBatch — the root node alone could be written
128 times.

Collect into FxHashMap instead, iterating oldest-to-newest so newer
values overwrite older ones. This eliminates duplicates before the
WriteBatch is constructed, reducing InlineSkipList::Insert and
MemTable::KeyComparator overhead proportionally.
Copilot AI review requested due to automatic review settings February 11, 2026 15:55
@ilitteri ilitteri requested a review from a team as a code owner February 11, 2026 15:55
@ilitteri ilitteri added performance Block execution throughput and performance in general L1 Ethereum client labels Feb 11, 2026
@ethrex-project-sync ethrex-project-sync bot moved this to In Review in ethrex_l1 Feb 11, 2026
@github-actions

🤖 Kimi Code Review

Review Summary

The change is correct and valuable - it fixes a performance issue where duplicate trie nodes across layers would create unnecessary RocksDB write operations.

Specific Feedback

File: crates/storage/layering.rs

  • Line 185: Good use of FxHashMap for deduplication, but consider pre-allocating capacity to avoid rehashing during growth. Since the approximate size is known from layers_to_commit, the map could be built with with_capacity_and_hasher (FxHashMap uses a custom hasher, so the plain HashMap::with_capacity constructor is not available on it):
let mut nodes_to_commit = FxHashMap::with_capacity_and_hasher(
    layers_to_commit.iter().map(|l| l.nodes.len()).sum(),
    Default::default(),
);
  • Line 189: The .into_iter().collect() at the end is correct but creates an intermediate Vec<(Vec<u8>, Vec<u8>)> which could be memory-intensive for large state. Consider if the caller can work with the HashMap directly.

Security/Consensus Impact

No security issues - This is purely a performance optimization that doesn't affect consensus or state correctness. The deduplication preserves the "newest wins" semantics which is correct for layered state.

Performance

Significant improvement - Eliminates O(n×m) duplicate writes where n=number of layers and m=duplicate nodes. This is especially impactful for the root node which gets modified in every layer.

The change is ready to merge.


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

🤖 Codex Code Review

Findings (ordered by severity)

  • Potential nondeterminism in commit batch ordering: FxHashMap iteration makes the returned vector order arbitrary, whereas previously the order was deterministic by layer traversal. If any downstream behavior or tests rely on stable ordering (e.g., for reproducible snapshots or debug logs), this could introduce flaky outcomes. Consider preserving order while deduping (e.g., track FxHashSet of seen keys while pushing into a Vec, or sort keys before returning). crates/storage/layering.rs:181-189
  • Peak memory overhead: building a full FxHashMap and then materializing a Vec duplicates storage for large commits. If commit batches are large, this could increase peak memory. Consider precomputing total entries for with_capacity, or dedupe while building the output Vec to avoid the extra full map. crates/storage/layering.rs:181-189

If ordering is not relied upon anywhere, the change looks good and is a solid performance win by avoiding duplicate writes.
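The order-preserving alternative suggested above could be sketched like this. This is a hypothetical illustration using std collections (the review suggests FxHashSet; the function name and layer representation are assumptions, not the PR's code). Iterating newest-to-oldest and skipping keys already seen keeps the newest value per key while producing a deterministic, traversal-based order:

```rust
use std::collections::HashSet;

// Order-preserving dedup: first occurrence (newest layer) wins, and
// output order follows the newest-to-oldest layer traversal.
fn dedup_preserving_order(
    layers_newest_first: Vec<Vec<(Vec<u8>, Vec<u8>)>>,
) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut seen: HashSet<Vec<u8>> = HashSet::new();
    let mut out = Vec::new();
    for layer in layers_newest_first {
        for (key, value) in layer {
            // insert() returns true only the first time a key is seen
            if seen.insert(key.clone()) {
                out.push((key, value));
            }
        }
    }
    out
}
```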


Automated review by OpenAI Codex · custom prompt

@github-actions

🤖 Claude Code Review


Code Review: PR #6179 - Deduplicate trie nodes across layers before writing to RocksDB

Summary

This PR optimizes the TrieLayerCache::commit() method by deduplicating trie nodes across layers before writing to RocksDB. The change replaces a flat_map collection with a HashMap-based approach to eliminate duplicate entries.


✅ Positive Aspects

  1. Clear performance motivation: Well-documented problem with profiling data showing InlineSkipList::Insert at 3.90% and targeting hot paths (root nodes modified in all 128 layers)

  2. Simple, focused change: The optimization is minimal and surgical - only affects the deduplication logic

  3. Benchmark results provided: Shows meaningful improvements (10.2% throughput increase in gigablocks scenario)

  4. Excellent code comments: The inline comment clearly explains the deduplication strategy


🔍 Correctness Analysis

✅ Correctness: VERIFIED

The change is semantically correct:

  1. Iteration order: layers_to_commit.into_iter().rev() iterates oldest→newest (since layers_to_commit was built newest→oldest in lines 172-176). This ensures newer values overwrite older ones via HashMap::extend().

  2. HashMap extend behavior: FxHashMap::extend() correctly overwrites existing keys with new values, which is the desired behavior for keeping the latest version of each node.

  3. Return type preserved: The collect() converts back to Vec<(Vec<u8>, Vec<u8>)>, maintaining API compatibility.


🔒 Security & Consensus Implications

✅ No security concerns identified

  • The deduplication is purely a performance optimization
  • The final state is identical: only the latest value for each trie node key is preserved
  • No changes to cryptographic operations, state root calculations, or consensus logic
  • The trie node keys and values remain unchanged

⚡ Performance Considerations

✅ Performance improvement validated

  1. WriteBatch size reduction: 20-50% reduction is realistic for hot paths modified across many layers
  2. HashMap overhead: The temporary HashMap allocation is negligible compared to the savings from reduced RocksDB operations
  3. FxHashMap choice: Appropriate for non-cryptographic hashing of Vec keys

Minor consideration: The collect() at the end creates a Vec with arbitrary ordering. If deterministic ordering matters for debugging or reproducibility, consider using BTreeMap or sorting. However, since RocksDB WriteBatch order doesn't affect final state, this is acceptable.


🦀 Rust Best Practices

✅ Idiomatic Rust

  1. Good use of FxHashMap::default(): Appropriate for non-crypto hashing
  2. Correct ownership: into_iter().rev() properly consumes the Vec
  3. Proper use of extend(): Leverages the standard library's optimized bulk insertion

Minor style note: Consider adding a capacity hint for the HashMap:

let mut nodes_to_commit = FxHashMap::with_capacity_and_hasher(
    layers_to_commit.iter().map(|l| l.nodes.len()).sum(),
    Default::default()
);

This would avoid rehashing during growth, though the impact may be minimal since extend() already reserves capacity.


🧪 Testing

⚠️ Observation: No unit tests exist for the layering.rs module (searched but found none).

Recommendation: While the benchmarks validate the performance improvement, consider adding a unit test specifically for the deduplication behavior:

  • Test that a key modified in multiple layers only appears once in the output
  • Verify that the newest value is preserved
  • Test edge cases (single layer, all layers modify same key, no overlapping keys)

This would prevent regression if the code is refactored.
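The suggested tests could be sketched as below. The `dedup` helper is a hypothetical stand-in for the logic inside `TrieLayerCache::commit()` (the real test would exercise the actual method in layering.rs), and std's HashMap replaces FxHashMap so the sketch is self-contained:

```rust
use std::collections::HashMap;

// Stand-in for the dedup step in TrieLayerCache::commit()
// (hypothetical free function extracted here for testability).
fn dedup(layers_newest_first: Vec<Vec<(Vec<u8>, Vec<u8>)>>) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut nodes = HashMap::new();
    for layer in layers_newest_first.into_iter().rev() {
        nodes.extend(layer); // newer layers overwrite older entries
    }
    nodes.into_iter().collect()
}

// In the real suite these would be #[test] functions in layering.rs.
fn key_modified_in_multiple_layers_appears_once() {
    let layers = vec![
        vec![(vec![1], b"newest".to_vec())], // newest layer
        vec![(vec![1], b"oldest".to_vec())], // oldest layer
    ];
    // Single output entry, carrying the newest value
    assert_eq!(dedup(layers), vec![(vec![1], b"newest".to_vec())]);
}

fn non_overlapping_keys_are_all_kept() {
    let layers = vec![vec![(vec![1], vec![10])], vec![(vec![2], vec![20])]];
    assert_eq!(dedup(layers).len(), 2);
}
```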


📝 Minor Suggestions

  1. crates/storage/layering.rs:185-188 - Consider pre-allocating the HashMap capacity:

    let total_capacity: usize = layers_to_commit.iter().map(|l| l.nodes.len()).sum();
    let mut nodes_to_commit = FxHashMap::with_capacity_and_hasher(total_capacity, Default::default());
  2. Documentation: The inline comment is excellent, but consider adding a doc comment to the commit method itself explaining the deduplication behavior.


🎯 Verdict

APPROVED

This is a well-crafted performance optimization with:

  • Clear motivation backed by profiling data
  • Correct implementation that preserves semantics
  • Meaningful benchmark results showing 10%+ throughput improvement
  • No security or correctness concerns
  • Good code quality and documentation

The change is low-risk and high-value. The minor suggestions about capacity pre-allocation and testing are nice-to-haves but not blockers.


Automated review by Claude (Anthropic) · custom prompt

@greptile-apps

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

Optimizes TrieLayerCache::commit() by deduplicating trie nodes across layers before writing to RocksDB. Previously, nodes modified in multiple layers (e.g., root node modified in all 128 layers) produced duplicate WriteBatch entries. The change collects nodes into an FxHashMap instead of a Vec, iterating oldest-to-newest so newer values overwrite older ones, then converts back to Vec for the return value. Benchmark results show impressive gains: +10.2% throughput on gigablocks and -12.1% latency. The implementation correctly reverses layers_to_commit (which is built newest-to-oldest via parent chain traversal) to ensure proper overwrite semantics.

Confidence Score: 5/5

Important Files Changed

| Filename | Overview |
|---|---|
| crates/storage/layering.rs | Optimizes commit() to deduplicate trie nodes across layers using HashMap before returning, reducing WriteBatch size by 20-50% |
| CHANGELOG.md | Added changelog entry for trie node deduplication optimization, but the PR number needs correction (#6177 → #6179) |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant TrieLayerCache
    participant HashMap as FxHashMap
    participant RocksDB
    
    Caller->>TrieLayerCache: commit(state_root)
    TrieLayerCache->>TrieLayerCache: Remove layers following parent chain
    Note over TrieLayerCache: layers_to_commit built newest→oldest
    TrieLayerCache->>HashMap: Create empty FxHashMap
    loop For each layer (reversed: oldest→newest)
        TrieLayerCache->>HashMap: extend(layer.nodes)
        Note over HashMap: Newer values overwrite older duplicates
    end
    TrieLayerCache->>TrieLayerCache: Convert HashMap to Vec
    TrieLayerCache->>Caller: Return Vec<(key, value)>
    Caller->>RocksDB: WriteBatch with deduplicated nodes
    Note over RocksDB: 20-50% fewer entries


@greptile-apps greptile-apps bot left a comment


2 files reviewed, 1 comment


@github-actions
Copy link

Lines of code report

Total lines added: 0
Total lines removed: 1
Total lines changed: 1

Detailed view
+-----------------------------------+-------+------+
| File                              | Lines | Diff |
+-----------------------------------+-------+------+
| ethrex/crates/storage/layering.rs | 171   | -1   |
+-----------------------------------+-------+------+


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions

github-actions bot commented Feb 11, 2026

Benchmark Block Execution Results Comparison Against Main

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base | 63.564 ± 0.357 | 62.755 | 64.022 | 1.01 ± 0.01 |
| head | 63.015 ± 0.207 | 62.660 | 63.289 | 1.00 |

@ilitteri ilitteri closed this Feb 23, 2026
@github-project-automation github-project-automation bot moved this from In Review to Done in ethrex_l1 Feb 23, 2026
@github-project-automation github-project-automation bot moved this from Todo to Done in ethrex_performance Feb 23, 2026

Labels

L1 Ethereum client performance Block execution throughput and performance in general

Projects

Status: Done


5 participants