Skip to content

Conversation

@edg-l
Copy link
Contributor

@edg-l edg-l commented Nov 14, 2025

Motivation

  • In the current impl_encode there are 2 loops to get the position, fixed to a single pass.
  • Added specialized branchless length methods to integers,
  • Added specialized length to Vec
  • Added specialized length to tuples
  • Added specialized length to most types in ethrex-common, only ones missing are in networking/p2p
  • Removed unneeded arrayvec usage
  • Many instances that deref to &[u8] didn't use its length specialized method, fixed to use it

Each missing length impl in the trait impl meant an extra allocation, this should remove them entirely when encoding

Helps with #4956

@github-actions
Copy link

github-actions bot commented Nov 14, 2025

Lines of code report

Total lines added: 222
Total lines removed: 0
Total lines changed: 222

Detailed view
+------------------------------------+-------+------+
| File                               | Lines | Diff |
+------------------------------------+-------+------+
| ethrex/crates/common/rlp/encode.rs | 748   | +222 |
+------------------------------------+-------+------+

@edg-l edg-l marked this pull request as ready for review November 14, 2025 12:41
@edg-l edg-l requested a review from a team as a code owner November 14, 2025 12:41
@edg-l edg-l moved this from Todo to In review in ethrex_performance Nov 14, 2025
@edg-l edg-l moved this to In Review in ethrex_l1 Nov 14, 2025
@github-actions
Copy link

github-actions bot commented Nov 14, 2025

Benchmark Block Execution Results Comparison Against Main

Command Mean [s] Min [s] Max [s] Relative
base 61.121 ± 0.200 60.793 61.509 1.00 ± 0.01
head 60.915 ± 0.261 60.684 61.572 1.00

@github-actions
Copy link

Benchmark for ad3c222

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 31.5±2.44ms 31.8±2.65ms +0.95%
Trie/cita-trie insert 1k 2.9±0.02ms 2.9±0.15ms 0.00%
Trie/ethrex-trie insert 10k 27.1±1.61ms 27.7±1.50ms +2.21%
Trie/ethrex-trie insert 1k 2.2±0.04ms 2.2±0.01ms 0.00%

@github-actions
Copy link

Benchmark for 4ad8835

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 31.3±1.45ms 35.9±2.24ms +14.70%
Trie/cita-trie insert 1k 2.9±0.05ms 2.9±0.12ms 0.00%
Trie/ethrex-trie insert 10k 26.6±1.10ms 28.4±0.85ms +6.77%
Trie/ethrex-trie insert 1k 2.2±0.01ms 2.2±0.01ms 0.00%

@github-actions
Copy link

Benchmark for c43b4ff

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 28.8±1.38ms 29.2±1.69ms +1.39%
Trie/cita-trie insert 1k 2.9±0.05ms 2.9±0.13ms 0.00%
Trie/ethrex-trie insert 10k 25.1±1.26ms 25.2±0.90ms +0.40%
Trie/ethrex-trie insert 1k 2.2±0.01ms 2.2±0.03ms 0.00%

@github-actions
Copy link

Benchmark for 28969a4

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 27.4±0.51ms 28.0±0.67ms +2.19%
Trie/cita-trie insert 1k 2.8±0.01ms 2.9±0.18ms +3.57%
Trie/ethrex-trie insert 10k 25.1±1.04ms 24.7±0.39ms -1.59%
Trie/ethrex-trie insert 1k 2.2±0.01ms 2.2±0.03ms 0.00%

@github-actions
Copy link

Benchmark for 2383de2

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 28.4±1.06ms 28.7±0.72ms +1.06%
Trie/cita-trie insert 1k 2.8±0.01ms 2.9±0.07ms +3.57%
Trie/ethrex-trie insert 10k 25.5±1.17ms 26.1±0.94ms +2.35%
Trie/ethrex-trie insert 1k 2.2±0.05ms 2.2±0.01ms 0.00%

Copilot finished reviewing on behalf of mpaulucci November 19, 2025 13:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves RLP encoding performance by eliminating extra allocations and loops during encoding. The key changes include:

  • Refactored integer encoding to use a single-pass loop instead of multiple passes
  • Added specialized length() methods to all RLP-encodable types to avoid re-encoding when calculating lengths
  • Removed unnecessary ArrayVec usage in favor of direct byte array manipulation
  • Introduced helper functions list_length() and bytes_length() for calculating encoded lengths without allocation

Reviewed Changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/common/rlp/encode.rs Core RLP encoding improvements: refactored impl_encode, added list_length/bytes_length helpers, implemented specialized length() methods for all primitive types, removed ArrayVec usage
crates/common/types/transaction.rs Added length() implementations for all transaction types and helper tx_type_byte() methods; contains critical bugs with incorrect transaction type bytes
crates/common/types/tx_fields.rs Added length() and redundant encode_to_vec() methods to AuthorizationTuple
crates/common/types/receipt.rs Added length() implementations for Receipt, ReceiptWithBloom, and Log
crates/common/types/block.rs Added length() implementations for Block, BlockHeader, BlockBody, and Withdrawal
crates/common/types/account.rs Added length() implementations for AccountInfo and AccountState
crates/common/types/blobs_bundle.rs Added length() implementation for BlobsBundle
crates/common/types/fork_id.rs Added length() implementation for ForkId
crates/common/types/requests.rs Added length() implementation for EncodedRequests
crates/common/trie/rlp.rs Added length() implementations for trie nodes (BranchNode, ExtensionNode, LeafNode, Node)
crates/common/trie/nibbles.rs Added compact_length() and compact_first_byte() helper methods, and length() implementation for Nibbles
tooling/Cargo.lock Added lz4-sys dependency (transitive from librocksdb-sys) and removed unused hex dependency from ethrex-vm
CHANGELOG.md Documented the RLP encoding performance improvements
Comments suppressed due to low confidence (1)

crates/common/types/tx_fields.rs:82

  • This encode_to_vec override is redundant. The default implementation in the RLPEncode trait (lines 65-69 in encode.rs) already does exactly this. Consider removing this method unless there's a specific reason for the override (such as pre-allocating buffer capacity as done in BranchNode).

impl RLPDecode for AuthorizationTuple {
    fn decode_unfinished(rlp: &[u8]) -> Result<(Self, &[u8]), RLPDecodeError> {
        let decoder = Decoder::new(rlp)?;
        let (chain_id, decoder) = decoder.decode_field("chain_id")?;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link

Benchmark for 6d8f5b0

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 30.7±2.17ms 29.9±1.39ms -2.61%
Trie/cita-trie insert 1k 2.8±0.01ms 2.9±0.13ms +3.57%
Trie/ethrex-trie insert 10k 25.2±0.79ms 25.9±1.15ms +2.78%
Trie/ethrex-trie insert 1k 2.2±0.09ms 2.2±0.01ms 0.00%

@github-actions
Copy link

Benchmark for 38d4e9d

Click to view benchmark
Test Base PR %
Trie/cita-trie insert 10k 28.0±0.73ms 27.8±0.53ms -0.71%
Trie/cita-trie insert 1k 2.8±0.01ms 2.9±0.14ms +3.57%
Trie/ethrex-trie insert 10k 24.7±0.37ms 24.8±0.49ms +0.40%
Trie/ethrex-trie insert 1k 2.2±0.01ms 2.2±0.01ms 0.00%

edg-l and others added 5 commits November 20, 2025 11:06
Co-authored-by: Tomás Grüner <47506558+MegaRedHand@users.noreply.github.com>
Co-authored-by: Tomás Grüner <47506558+MegaRedHand@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

L1 Ethereum client performance

Projects

Status: In Review
Status: In review

Development

Successfully merging this pull request may close these issues.

6 participants