perf(l1): disable WAL and improve concurrency in snap sync #6178
Conversation
Add a commit_no_wal() method to StorageWriteBatch trait with a default fallback to commit(). The RocksDB implementation uses WriteOptions with disable_wal(true) to skip the write-ahead log, avoiding double-write I/O for data that can be re-downloaded (snap sync). Wire the flag through BackendTrieDB (with_no_wal() builder) and add _no_wal variants to Store: open_direct_state_trie_no_wal, open_direct_storage_trie_no_wal, write_storage_trie_nodes_batch_no_wal, write_account_code_batch_no_wal, and write_batch_async_no_wal.
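A minimal sketch of the trait-default pattern described above. `StorageWriteBatch`, `commit_no_wal()`, and `commit()` are the names from the PR; the `InMemoryBatch` backend and the `String` error type are illustrative placeholders (the real RocksDB override would call `DB::write_opt` with `WriteOptions::disable_wal(true)` instead):

```rust
// Sketch of the commit_no_wal() trait default described above.
// InMemoryBatch and the String error type are illustrative placeholders.
trait StorageWriteBatch {
    fn commit(self) -> Result<(), String>;

    // Default fallback: backends with no WAL concept just commit normally,
    // so only the RocksDB implementation needs to override this.
    fn commit_no_wal(self) -> Result<(), String>
    where
        Self: Sized,
    {
        self.commit()
    }
}

struct InMemoryBatch {
    entries: Vec<(String, String)>,
}

impl StorageWriteBatch for InMemoryBatch {
    fn commit(self) -> Result<(), String> {
        // Pretend to persist; report how many entries were written.
        println!("committed {} entries", self.entries.len());
        Ok(())
    }
}

fn main() {
    let batch = InMemoryBatch {
        entries: vec![("key".into(), "value".into())],
    };
    // Resolves to the trait default, which delegates to commit().
    assert!(batch.commit_no_wal().is_ok());
}
```

The default-with-override shape is what lets non-RocksDB backends (and any callers that don't care about WAL) keep working unchanged while snap sync opts in per call site.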
Extract dispatch_state_healing_batches() that sends up to MAX_IN_FLIGHT_REQUESTS (77) concurrent trie node requests instead of one at a time. Replace try_recv() busy-polling with tokio::select! on channel recv + 1s timeout. Process all pending healed batches with while-let instead of if-let to drain the queue each iteration. Switch state trie DB writes to open_direct_state_trie_no_wal().
Replace all snap sync DB writes with their _no_wal variants: account trie insertion, storage trie insertion, storage trie node batch writes, bytecode batch writes, and storage root computation. In storage healing, replace try_recv() + yield_now() busy loop with tokio::select! on channel recv + 1s timeout to eliminate CPU waste.
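The PR's actual fix uses `tokio::select!` on an async channel; the same "block with a timeout instead of spinning" idea can be sketched dependency-free with `std::sync::mpsc::recv_timeout` (the channel, message type, and delay here are illustrative):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel::<u32>();

    // Worker standing in for a peer task that responds after a short delay.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        let _ = tx.send(42);
    });

    // Instead of try_recv() + yield_now() in a hot loop (burning CPU), block
    // for up to 1s per iteration; the timeout branch is where periodic
    // bookkeeping (metrics, "no peers" logging) would live.
    loop {
        match rx.recv_timeout(Duration::from_secs(1)) {
            Ok(msg) => {
                println!("got response: {msg}");
                break;
            }
            // e.g. bump an empty/timeout counter here, then retry.
            Err(mpsc::RecvTimeoutError::Timeout) => continue,
            Err(mpsc::RecvTimeoutError::Disconnected) => break,
        }
    }
}
```

The payoff is identical in both versions: the waiting task consumes no CPU until a message arrives or the timeout fires, instead of spinning through `try_recv()` thousands of times per second.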
🤖 Kimi Code Review

Review Summary: This PR introduces WAL (Write-Ahead Log) bypassing for snap sync operations and refactors state healing to use concurrent batch dispatching. The changes are generally well-structured, but there are several issues to address:

Critical Issues
Security & Consensus Issues
Performance & Correctness Issues
Code Quality Issues
Suggested Fixes
```rust
// Replace the busy-wait loop with proper synchronization
while inflight_tasks > 0 || !nodes_to_heal.is_empty() {
    tokio::select! {
        Some((peer_id, response, batch)) = task_receiver.recv() => {
            // Queue the response for processing below
            nodes_to_heal.push((response, batch));
        }
        _ = tokio::time::sleep(Duration::from_secs(1)) => {
            // Timeout: fall through and drain whatever is already queued
        }
    }
    // Process all available batches
    while let Some((response, batch)) = nodes_to_heal.pop() {
        // Process batch
    }
}
```
```rust
// After snap sync completes
pub async fn verify_state_consistency(&self, expected_state_root: H256) -> Result<bool, StoreError> {
    // Verify the computed state root matches expected_state_root
    todo!()
}
```
```rust
const NO_PEERS_LOG_INTERVAL: u32 = 1000;
```

The WAL bypass changes appear safe for the snap sync use case, but should be clearly documented as only safe during initial sync.

Automated review by Kimi (Moonshot AI) · custom prompt
🤖 Codex Code Review

Here are the review findings I'd flag.

Findings
If the no-WAL paths are strictly snap-sync only and you're comfortable with possible partial writes on crash, the rest looks reasonable; I didn't spot EVM/consensus-critical changes.
Automated review by OpenAI Codex · custom prompt
Greptile Overview

Greptile Summary: This PR significantly improves snap sync performance by disabling RocksDB's write-ahead log (WAL) during sync operations and introducing concurrent multi-peer healing requests.

Key Changes:
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| crates/storage/backend/rocksdb.rs | Implemented commit_no_wal() to bypass write-ahead log using WriteOptions::disable_wal(true) for RocksDB |
| crates/storage/trie.rs | Added no_wal field to BackendTrieDB with with_no_wal() builder method, conditionally calls commit_no_wal() in put_batch() |
| crates/storage/store.rs | Added _no_wal variants for batch write operations and trie opening methods, routing through internal helpers with no_wal flag |
| crates/networking/p2p/sync/healing/state.rs | Replaced single-peer busy-polling with concurrent multi-peer dispatch (up to 77 requests) and tokio::select! with 1s timeout; switched to _no_wal DB writes |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Main as heal_state_trie
    participant Dispatch as dispatch_state_healing_batches
    participant Peers as PeerHandler
    participant Worker as tokio::spawn
    participant Channel as task_receiver
    Main->>Dispatch: dispatch up to 77 batches
    loop while inflight < MAX_IN_FLIGHT_REQUESTS
        Dispatch->>Peers: get_best_peer()
        Peers-->>Dispatch: (peer_id, connection)
        Dispatch->>Worker: spawn(request_state_trienodes)
        Worker->>Worker: fetch trie nodes from peer
        Worker->>Channel: send(peer_id, response, batch)
    end
    Main->>Main: tokio::select! on recv + 1s timeout
    Channel-->>Main: (peer_id, response, batch)
    Main->>Main: process response
    Main->>Main: while let Some() drain nodes_to_heal
    Main->>Main: write to DB with no-WAL
```
🤖 Claude Code Review

Code Review: PR #6178 - Disable WAL and Improve Concurrency in Snap Sync

Summary: This PR implements performance optimizations for snap sync by disabling RocksDB's write-ahead log (WAL) during sync operations and introducing concurrent multi-peer state healing. The changes achieve a 21% performance improvement (13:57 → 11:04) on Hoodi testnet.

Critical Issues

1. Unsafe Memory Management in RocksDB
Pull request overview
This PR optimizes snap sync throughput by allowing storage writes to bypass the RocksDB WAL during sync and by increasing concurrency in state healing (plus removing busy-polling loops).
Changes:
- Add a `commit_no_wal()` API to the storage write-batch abstraction and implement it for RocksDB (via `WriteOptions::disable_wal(true)`), wiring it through trie DB and store helpers.
- Switch snap sync write paths (tries, storage trie nodes, bytecodes) to use the new no-WAL variants.
- Improve healing performance by dispatching multiple concurrent state-healing requests and replacing `try_recv()`/busy loops with `tokio::select!` + timeouts.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/storage/api/mod.rs | Extends StorageWriteBatch with commit_no_wal() (defaulting to commit()). |
| crates/storage/backend/rocksdb.rs | Implements commit_no_wal() using RocksDB write_opt with WAL disabled. |
| crates/storage/trie.rs | Adds BackendTrieDB::with_no_wal() and routes trie batch commits through commit_no_wal() when enabled. |
| crates/storage/store.rs | Adds _no_wal variants for key snap-sync write paths and async batch writes; adds no-WAL trie open helpers. |
| crates/networking/p2p/sync/snap_sync.rs | Switches snap sync writes and trie opens to no-WAL variants. |
| crates/networking/p2p/sync/healing/state.rs | Dispatches multiple healing requests concurrently and replaces busy polling with tokio::select!. |
| crates/networking/p2p/sync/healing/storage.rs | Replaces try_recv() + yield_now() polling with tokio::select! on recv + timeout; uses no-WAL writes for healed batches. |
```rust
}
Err(TryRecvError::Disconnected) => {
    state.disconnected_count += 1;
_ = tokio::time::sleep(Duration::from_secs(1)) => {
```
state.empty_count is still reported via METRICS.healing_empty_try_recv, but after switching from try_recv() to tokio::select! it is never incremented anywhere, so this metric/log field will always stay at 0. Consider incrementing empty_count on the timeout branch (or removing the counter/metric if it’s no longer meaningful).
Suggested change:

```rust
_ = tokio::time::sleep(Duration::from_secs(1)) => {
    state.empty_count += 1;
```
```rust
let batch: Vec<RequestMetadata> =
    paths.drain(0..min(paths.len(), NODE_BATCH_SIZE)).collect();
```
paths.drain(0..min(paths.len(), NODE_BATCH_SIZE)) repeatedly drains from the front of a Vec, which is O(n) per drain due to element shifting. With multi-dispatch enabled this can become a noticeable CPU cost. Consider switching paths to a VecDeque (pop from front) or draining from the end (e.g., split_off) if ordering isn’t important.
Suggested change:

```rust
let batch_size = min(paths.len(), NODE_BATCH_SIZE);
let batch: Vec<RequestMetadata> = paths.split_off(paths.len() - batch_size);
```
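A quick dependency-free check of the two batching strategies contrasted above: `drain(0..n)` takes from the front and shifts every remaining element, while `split_off(len - n)` moves only the tail batch — at the cost of handing out the newest paths first, which is why the suggestion only applies if ordering doesn't matter. The `u32` element type and batch size are placeholders for the real `RequestMetadata` / `NODE_BATCH_SIZE`:

```rust
use std::cmp::min;

// Placeholder for the real NODE_BATCH_SIZE constant.
const NODE_BATCH_SIZE: usize = 3;

fn main() {
    // Front-drain: preserves order, but shifts the rest of the Vec each call.
    let mut front: Vec<u32> = (0..8).collect();
    let batch_size = min(front.len(), NODE_BATCH_SIZE);
    let front_batch: Vec<u32> = front.drain(0..batch_size).collect();
    assert_eq!(front_batch, vec![0, 1, 2]);
    assert_eq!(front.len(), 5);

    // Tail-split: O(batch) move, no shifting, but yields the newest items first.
    let mut back: Vec<u32> = (0..8).collect();
    let batch_size = min(back.len(), NODE_BATCH_SIZE);
    let back_batch: Vec<u32> = back.split_off(back.len() - batch_size);
    assert_eq!(back_batch, vec![5, 6, 7]);
    assert_eq!(back.len(), 5);

    println!("both strategies leave {} paths queued", front.len());
}
```

A `VecDeque` with `pop_front()` would keep FIFO ordering and still avoid the shifting, if batch order does matter.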
| )>, | ||
| logged_no_free_peers_count: &mut u32, | ||
| ) { | ||
| while (*inflight_tasks as u32) < MAX_IN_FLIGHT_REQUESTS && !paths.is_empty() { |
(*inflight_tasks as u32) < MAX_IN_FLIGHT_REQUESTS — the as u32 truncates silently if inflight_tasks ever exceeds u32::MAX. Can't happen in practice (max 77), but the cast is unnecessary since MAX_IN_FLIGHT_REQUESTS is a constant: either change inflight_tasks to u32 (it's only used as a counter here), or cast the constant up: *inflight_tasks < MAX_IN_FLIGHT_REQUESTS as u64.
```rust
*inflight_tasks += 1;
let peer_table = peers.peer_table.clone();
tokio::spawn(async move {
```
Fire-and-forget tokio::spawn — if this task panics (e.g., request_state_trienodes panics), inflight_tasks is never decremented and the outer tokio::select! will poll the channel forever (retrying every 1s on timeout). Pre-existing issue, but now amplified since up to 77 tasks can be in flight.
Consider collecting the JoinHandle (or using JoinSet) and checking for panics when draining responses, similar to how db_joinset is handled elsewhere in this file.
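The `JoinSet` suggestion boils down to: keep the handles so a panicked task is observed and accounted for, rather than silently leaking the `inflight_tasks` count. A dependency-free sketch of the same idea using `std::thread` (in the PR this would be `tokio::task::JoinSet` and `join_next()`; the worker closure and counter are illustrative):

```rust
use std::thread;

fn main() {
    let mut handles = Vec::new();

    for i in 0..3 {
        handles.push(thread::spawn(move || {
            if i == 1 {
                // Stands in for request_state_trienodes panicking in one worker.
                panic!("worker {i} failed");
            }
            i
        }));
    }

    let mut inflight = handles.len();
    for handle in handles {
        // join() returns Err if the task panicked, so the counter is always
        // decremented -- unlike a fire-and-forget spawn, where a panic would
        // leave `inflight` stuck above zero and the receive loop spinning.
        match handle.join() {
            Ok(result) => println!("task finished with {result}"),
            Err(_) => println!("task panicked; still accounted for"),
        }
        inflight -= 1;
    }
    assert_eq!(inflight, 0);
}
```

With `JoinSet`, the equivalent guarantee comes from `join_next()` returning a `Result` whose `Err` carries the panic, so the drain loop can decrement the in-flight count on every outcome.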
Motivation
Snap sync performance profiling on Hoodi showed that RocksDB write-ahead log (WAL) overhead and single-peer healing dispatch were limiting throughput. Nethermind's snap sync (which achieves ~20 min on mainnet) disables WAL during sync since all data can be re-downloaded on crash, and uses multi-peer concurrent requests for healing.
Description
Three changes, each in a separate commit:
1. Storage layer: `commit_no_wal()` (1848388)
   - Add `commit_no_wal()` to the `StorageWriteBatch` trait with a default fallback to `commit()`
   - RocksDB implementation uses `WriteOptions` with `disable_wal(true)` to skip the write-ahead log
   - Wire through `BackendTrieDB` via a `with_no_wal()` builder and add `_no_wal` variants to `Store`
2. Multi-peer state healing (7d18550)
   - Extract `dispatch_state_healing_batches()`, which sends up to `MAX_IN_FLIGHT_REQUESTS` (77) concurrent trie node requests instead of one at a time
   - Replace `try_recv()` busy-polling with `tokio::select!` on channel recv + 1s timeout
   - Drain the healed-batch queue each iteration with `while let` instead of `if let`
3. Wire no-WAL through snap sync + fix storage healing polling (5730189)
   - Replace snap sync DB writes with `_no_wal` variants (account trie, storage trie, bytecodes, storage roots)
   - Replace the `try_recv()` + `yield_now()` busy loop in storage healing with `tokio::select!` on recv + 1s timeout

Benchmark results (Hoodi, ethrex-mainnet-4: 12 cores, 62 GB RAM)
The biggest win is the bytecodes phase (62% faster) where WAL-disabled writes dramatically reduce I/O. Phases 1, 2, 4 are network-bound so their deltas are peer variability. Storage healing healed more accounts (19,333 vs 16,555) due to slightly different chain state, explaining the +10s.
How to Test
Run snap sync on Hoodi (or any testnet) and compare total sync time against main:
Look for `Sync cycle finished successfully time_elapsed_s=...` in the logs.