Skip to content

Comments

fix: make observed_peer_subnets fork-aware#820

Merged
mergify[bot] merged 4 commits intosigp:unstablefrom
diegomrsantos:fix/fork-aware-observed-peer-subnets
Feb 11, 2026
Merged

fix: make observed_peer_subnets fork-aware#820
mergify[bot] merged 4 commits intosigp:unstablefrom
diegomrsantos:fix/fork-aware-observed-peer-subnets

Conversation

@diegomrsantos
Copy link
Member

Issue Addressed

Closes #818

Proposed Changes

During fork transitions (e.g., Alan → Boole), peers may be subscribed to the same subnet via topics from different forks simultaneously — for example, ssv.v2.42 (Alan) and /ssv/mainnet/boole/42 (Boole). The peer manager's observed_peer_subnets previously stored a single flat bitmap per peer, so unsubscribing from one fork's topic would incorrectly clear the subnet bit even though the peer was still subscribed via the other fork.

This PR makes the subscription tracking fork-aware by maintaining a separate bitmap per fork per peer. Query methods (used for peer counting, disconnect decisions, and priority dialing) aggregate across forks transparently, so the rest of the peer management logic is unaffected.

Additional Info

  • The gossipsub subscription/unsubscription event handlers now use topic::parse_topic() (which returns fork info) instead of topic::parse_subnet_id() (which discarded it), threading the fork identity through to the connection manager.
  • This also lays groundwork for future fork-aware peer selection, e.g., preferring peers that are on the current fork's topics after a grace period ends.

…scription

During fork transitions, a peer may be subscribed to the same subnet via
topics from different forks (e.g., ssv.v2.42 and /ssv/mainnet/boole/42).
Previously, unsubscribing from one fork's topic would clear the subnet
bit entirely, even though the peer was still subscribed via the other
fork's topic.

Track subscriptions per fork so each fork maintains its own bitmap. Query
methods aggregate across forks using bitwise union when checking subnet
coverage.

Closes sigp#818

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude-code-actions-sigp

This comment was marked as outdated.

Address review feedback: log a warning when a subnet ID exceeds the
bitfield capacity instead of silently ignoring it, and add a test
verifying that duplicate subscribes on the same fork+subnet don't
break state.
@diegomrsantos diegomrsantos marked this pull request as draft February 10, 2026 16:33
@diegomrsantos diegomrsantos marked this pull request as ready for review February 10, 2026 16:34
@claude-code-actions-sigp
Copy link

claude-code-actions-sigp bot commented Feb 10, 2026

Claude finished @diegomrsantos's task in 2m 6s —— View job


Code Review Complete ✅

Summary

This PR successfully addresses issue #818 by making peer subnet tracking fork-aware. The implementation correctly handles fork transitions where peers may be subscribed to the same subnet via different fork-specific topics simultaneously (e.g., ssv.v2.42 and /ssv/mainnet/boole/42). The nested HashMap<PeerId, HashMap<Fork, Bitfield>> structure properly models this scenario.


Strengths

1. Correct Problem Modeling

The core data structure change from HashMap<PeerId, Bitfield> to HashMap<PeerId, HashMap<Fork, Bitfield>> correctly addresses the root cause: during fork transitions, a peer subscribed to both ssv.v2.42 (Alan) and /ssv/mainnet/boole/42 (Boole) should not have subnet 42 cleared when unsubscribing from one fork.

2. Transparent Aggregation

The aggregate_fork_bitmaps() helper (connection.rs:298-306) uses bitwise union to merge per-fork bitmaps, making query methods fork-agnostic. This is a clean separation - the rest of the peer management logic doesn't need to change.

fn aggregate_fork_bitmaps(fork_map: &HashMap<Fork, Bitfield<Fixed<U128>>>) -> Bitfield<Fixed<U128>> {
    let mut result = Bitfield::default();
    for bitfield in fork_map.values() {
        result = result.union(bitfield);
    }
    result
}

3. Proper Memory Management

The cleanup logic (connection.rs:150-157) prevents memory bloat by removing empty entries:

if !subscribed {
    if bitfield.is_zero() {
        fork_map.remove(&fork);
    }
    if fork_map.is_empty() {
        self.observed_peer_subnets.remove(&peer);
    }
}

4. Excellent Test Coverage

The test suite comprehensively validates the fix:

  • Single-fork subscribe/unsubscribe (baseline behavior)
  • Multi-fork retention test (lines 607-624) - regression test for Make observed_peer_subnets Fork aware #818
  • Aggregation across forks
  • Cleanup behavior verification
  • Peer counting with multi-fork scenarios
  • Idempotency test (lines 829-845)

All tests follow AAA pattern with descriptive names.

5. Out-of-Bounds Protection

Commit 19f675c adds proper bounds checking with a warning (connection.rs:142-147):

if idx < bitfield.len() {
    let _ = bitfield.set(idx, subscribed);
} else {
    tracing::warn!(
        subnet = idx,
        max = bitfield.len(),
        "Subnet ID exceeds bitfield capacity"
    );
}

This prevents silent failures when invalid subnet IDs are passed.


Areas for Improvement

1. Performance Consideration (Minor)

The aggregate_fork_bitmaps() function creates a new bitfield and unions all per-fork bitmaps on every query. During steady-state with multiple forks active, this could be called frequently from:

  • count_observed_peers_for_subnets() (line 220)
  • peer_offers_needed_subnets_observed_only() (line 245)
  • qualifies_for_priority_connection() (line 191)

Observation: This is likely not a bottleneck given:

  • Small number of forks (typically 2-3 during transitions)
  • Bitwise OR is fast (128-bit bitmap)
  • Queries happen during connection decisions, not hot path

Recommendation: Monitor in production. If profiling shows this as hot, consider caching the aggregated result and invalidating on subscription changes.

Verdict: No action needed now.

2. Type Safety Consideration (Very Minor)

The Fork enum is a closed set (Alan, Boole), so invalid values aren't possible in safe Rust. The implementation correctly relies on this.


Concurrency & Thread Safety

The ConnectionManager follows standard Rust ownership semantics. All mutable access goes through &mut self, ensuring no data races. The nested HashMap structure is safe for concurrent reads (when not mutating) via standard borrowing rules.

Verdict: Thread-safe ✅


Security Assessment

No security vulnerabilities identified:

  • No sensitive data exposure
  • Proper bounds validation (lines 139-147)
  • Safe Rust only (no unsafe blocks)
  • Memory cleanup prevents leaks

Verdict: Secure ✅


Documentation Quality

  • Clear comments explaining fork-aware tracking (connection.rs:50-53)
  • Issue reference (Make observed_peer_subnets Fork aware #818) in code comments
  • Comprehensive PR description linking fork transition scenario
  • Well-documented test cases

Verdict: Documentation excellent ✅


Edge Cases Covered

  1. ✅ Subscribe/unsubscribe same fork+subnet (idempotency test, lines 829-845)
  2. ✅ Multi-fork retention (regression test for Make observed_peer_subnets Fork aware #818, lines 607-624)
  3. ✅ Full unsubscribe cleanup (lines 678-695)
  4. ✅ Partial unsubscribe retention (lines 698-713)
  5. ✅ Disconnected peers not counted (lines 747-767)
  6. ✅ Out-of-bounds subnet IDs (warning added in commit 19f675c)

Code Quality

  • Readability: Helper functions with clear names
  • Modularity: Clean separation between tracking and aggregation
  • Error Handling: Proper bounds checking with logging
  • Testing: Comprehensive edge case coverage
  • Consistency: Follows existing Anchor patterns

Final Verdict

✅ Approve

This PR correctly solves the fork transition bug with a clean, well-tested implementation. The solution is minimal, focused, and production-ready. The minor performance observation is not a blocker - the implementation is already optimized for the common case (small number of forks, fast bitwise operations).

The addition of the out-of-bounds warning in the second commit demonstrates attention to robustness. The comprehensive test suite ensures the fix works correctly and won't regress.


Completed Tasks

  • Read all changed files
  • Analyze fork-aware bitmap implementation
  • Review security and thread safety
  • Check error handling and edge cases
  • Assess test coverage
  • Verify documentation quality
  • Provide comprehensive feedback

@diegomrsantos diegomrsantos force-pushed the fix/fork-aware-observed-peer-subnets branch from 317a350 to 1d13d0b Compare February 10, 2026 17:33
@diegomrsantos diegomrsantos force-pushed the fix/fork-aware-observed-peer-subnets branch from 1d13d0b to 481f223 Compare February 10, 2026 17:55
@diegomrsantos diegomrsantos self-assigned this Feb 10, 2026
@diegomrsantos diegomrsantos added the v2.0.0 The release shipping the next network upgrade label Feb 10, 2026
Add peer ID to the out-of-bounds subnet warning log and replace the
imperative loop in aggregate_fork_bitmaps with a functional fold.
Copy link
Member

@dknopik dknopik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mergify
Copy link

mergify bot commented Feb 11, 2026

Merge Queue Status

Rule: default


This pull request spent 10 minutes 32 seconds in the queue, including 8 minutes 56 seconds running CI.

Required conditions to merge

mergify bot added a commit that referenced this pull request Feb 11, 2026
@mergify mergify bot merged commit b52a42a into sigp:unstable Feb 11, 2026
22 of 23 checks passed
@mergify mergify bot removed the queued label Feb 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge v2.0.0 The release shipping the next network upgrade

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants