Correct gossipsub mesh and connected peer inconsistencies #6244

Open
wants to merge 6 commits into
base: unstable
Changes from 3 commits
38 changes: 22 additions & 16 deletions beacon_node/lighthouse_network/gossipsub/src/behaviour.rs
@@ -760,7 +760,7 @@ where
}
} else {
tracing::error!(peer_id = %peer_id,
-"Could not PUBLISH, peer doesn't exist in connected peer list");
+"Could not send PUBLISH, peer doesn't exist in connected peer list");
}
}

@@ -1062,7 +1062,7 @@ where
});
} else {
tracing::error!(peer = %peer_id,
-"Could not GRAFT, peer doesn't exist in connected peer list");
+"Could not send GRAFT, peer doesn't exist in connected peer list");
}

// If the peer did not previously exist in any mesh, inform the handler
@@ -1161,7 +1161,7 @@ where
peer.sender.prune(prune);
} else {
tracing::error!(peer = %peer_id,
-"Could not PRUNE, peer doesn't exist in connected peer list");
+"Could not send PRUNE, peer doesn't exist in connected peer list");
}

// If the peer did not previously exist in any mesh, inform the handler
@@ -1340,7 +1340,7 @@ where
}
} else {
tracing::error!(peer = %peer_id,
-"Could not IWANT, peer doesn't exist in connected peer list");
+"Could not send IWANT, peer doesn't exist in connected peer list");
}
}
tracing::trace!(peer=%peer_id, "Completed IHAVE handling for peer");
@@ -1363,7 +1363,7 @@ where

for id in iwant_msgs {
// If we have it and the IHAVE count is not above the threshold,
-// foward the message.
+// forward the message.
if let Some((msg, count)) = self
.mcache
.get_with_iwant_counts(&id, peer_id)
@@ -1403,7 +1403,7 @@ where
}
} else {
tracing::error!(peer = %peer_id,
-"Could not IWANT, peer doesn't exist in connected peer list");
+"Could not send IWANT, peer doesn't exist in connected peer list");
}
}
}
@@ -2043,8 +2043,11 @@ where
}
}

-// remove unsubscribed peers from the mesh if it exists
+// remove unsubscribed peers from the mesh and fanout if they exist there
for (peer_id, topic_hash) in unsubscribed_peers {
+self.fanout
+.get_mut(&topic_hash)
+.map(|peers| peers.remove(&peer_id));
self.remove_peer_from_mesh(&peer_id, &topic_hash, None, false, Churn::Unsub);
}
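
The fanout cleanup added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the PR's code: `PeerId` and `TopicHash` are stand-in type aliases (the real code uses libp2p's types), and `remove_from_fanout` is a hypothetical helper name. It assumes the fanout is a map from topic to a set of peers, which matches how the diff calls `get_mut` followed by `remove`.

```rust
use std::collections::{BTreeSet, HashMap};

// Stand-in types for illustration only; the real code uses libp2p's
// `PeerId` and `TopicHash`.
type PeerId = u64;
type TopicHash = String;

/// Hypothetical helper: removes `peer` from the fanout set of `topic`.
/// Returns `true` only if the topic exists and the peer was in its set.
fn remove_from_fanout(
    fanout: &mut HashMap<TopicHash, BTreeSet<PeerId>>,
    topic: &TopicHash,
    peer: &PeerId,
) -> bool {
    match fanout.get_mut(topic) {
        // `BTreeSet::remove` reports whether the value was present.
        Some(peers) => peers.remove(peer),
        // Unknown topic: nothing to clean up.
        None => false,
    }
}
```

The `match` form is behaviourally the same as the `.get_mut(..).map(..)` chain in the diff; it only makes the "topic missing" case explicit.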

@@ -2068,7 +2071,7 @@ where
}
} else {
tracing::error!(peer = %propagation_source,
-"Could not GRAFT, peer doesn't exist in connected peer list");
+"Could not send GRAFT, peer doesn't exist in connected peer list");
}

// Notify the application of the subscriptions
@@ -2086,9 +2089,12 @@ where
fn apply_iwant_penalties(&mut self) {
if let Some((peer_score, ..)) = &mut self.peer_score {
for (peer, count) in self.gossip_promises.get_broken_promises() {
-peer_score.add_penalty(&peer, count);
-if let Some(metrics) = self.metrics.as_mut() {
-metrics.register_score_penalty(Penalty::BrokenPromise);
+// We do not apply penalties to nodes that have disconnected.
+if self.connected_peers.contains_key(&peer) {
+peer_score.add_penalty(&peer, count);
+if let Some(metrics) = self.metrics.as_mut() {
+metrics.register_score_penalty(Penalty::BrokenPromise);
+}
}
}
}
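
The effect of the new `contains_key` guard can be sketched on its own. This is a simplified illustration under stated assumptions, not the behaviour module itself: `PeerScore` here is a hypothetical stand-in that only counts penalties, `PeerId` is a type alias, and the broken-promise map is passed in directly rather than read from `gossip_promises`.

```rust
use std::collections::HashMap;

// Stand-in for libp2p's `PeerId`, for illustration only.
type PeerId = u64;

/// Hypothetical, simplified score book that only accumulates penalties.
#[derive(Default)]
struct PeerScore {
    penalties: HashMap<PeerId, usize>,
}

impl PeerScore {
    fn add_penalty(&mut self, peer: &PeerId, count: usize) {
        *self.penalties.entry(*peer).or_insert(0) += count;
    }
}

/// Applies broken-promise penalties, skipping peers that have disconnected.
/// Returns how many peers were actually penalised.
fn apply_iwant_penalties(
    score: &mut PeerScore,
    connected_peers: &HashMap<PeerId, ()>,
    broken_promises: HashMap<PeerId, usize>,
) -> usize {
    let mut penalised = 0;
    for (peer, count) in broken_promises {
        // Same guard as in the diff: only connected peers are penalised.
        if connected_peers.contains_key(&peer) {
            score.add_penalty(&peer, count);
            penalised += 1;
        }
    }
    penalised
}
```

With peer 1 connected and peer 2 disconnected, only peer 1 ends up with an entry in the penalty map.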
@@ -2583,7 +2589,7 @@ where
}
} else {
tracing::error!(peer = %peer_id,
-"Could not IHAVE, peer doesn't exist in connected peer list");
+"Could not send IHAVE, peer doesn't exist in connected peer list");
}
}
}
@@ -2669,7 +2675,7 @@ where
peer.sender.prune(prune);
} else {
tracing::error!(peer = %peer_id,
-"Could not PRUNE, peer doesn't exist in connected peer list");
+"Could not send PRUNE, peer doesn't exist in connected peer list");
}

// inform the handler
@@ -2706,8 +2712,8 @@ where

for peer_id in recipient_peers {
let Some(peer) = self.connected_peers.get_mut(peer_id) else {
-tracing::error!(peer = %peer_id,
Member

Instead of removing this error, can we filter the gossip promises above? I.e.:

// Gossip promises are kept for disconnected peers.
let iwant_peers = self
    .gossip_promises
    .peers_for_message(msg_id)
    .iter()
    .filter(|peer_id| self.connected_peers.contains_key(peer_id));

Member Author

I was thinking about this. I don't mind either way.
The logic that went through my head was:
Adding this check adds a hashmap lookup for each peer, but a few lines down the page we do the exact same check.

I figured that since we are already doing the check, just removing the error means no extra hash lookup. The cost is that we don't log an error if a mesh peer isn't in the connected mapping (but we have these errors elsewhere).

It's such a tiny difference, so I'm happy to do whatever you think is best; I'm just explaining why I removed the error.

Member

Thanks for elaborating, Age! Do you know where else we check? It definitely doesn't make sense to double-check this by adding another iteration, as you point out; I just think we shouldn't stop logging this event.

Member Author

So yeah, happy to do what you prefer.

I can't think of a way to keep the error without adding an extra hashmap lookup. It's a trivial lookup, and maybe the error is worth it. So I'm happy to go with the solution you have suggested here if you want the error.
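
For readers following the thread, the two options under discussion can be sketched side by side. This is an illustrative sketch only: `Peer`, `send_skip_silently`, and `send_with_prefilter` are hypothetical names, `PeerId` is a stand-in alias, and the "log" is modelled as a returned list of missing peers rather than a `tracing::error!` call.

```rust
use std::collections::HashMap;

// Stand-in for libp2p's `PeerId`, for illustration only.
type PeerId = u64;

/// Hypothetical connected-peer record that just stores what was sent.
struct Peer {
    sent: Vec<&'static str>,
}

/// Option A (what the diff does): one lookup per recipient; peers that are
/// not connected are skipped without logging.
fn send_skip_silently(connected: &mut HashMap<PeerId, Peer>, recipients: &[PeerId]) -> usize {
    let mut sent = 0;
    for peer_id in recipients {
        let Some(peer) = connected.get_mut(peer_id) else {
            continue;
        };
        peer.sent.push("IDONTWANT");
        sent += 1;
    }
    sent
}

/// Option B (the reviewer's suggestion): filter promise peers up front, at
/// the cost of one extra lookup each, so that any later miss is unexpected
/// and can still be reported.
fn send_with_prefilter(
    connected: &mut HashMap<PeerId, Peer>,
    mesh_peers: &[PeerId],
    promise_peers: &[PeerId],
) -> (usize, Vec<PeerId>) {
    // Promises may refer to disconnected peers, so drop those first.
    let filtered: Vec<PeerId> = promise_peers
        .iter()
        .copied()
        .filter(|p| connected.contains_key(p))
        .collect();

    let mut sent = 0;
    let mut missing = Vec::new();
    for peer_id in mesh_peers.iter().chain(filtered.iter()) {
        let Some(peer) = connected.get_mut(peer_id) else {
            // The real code would log an error here instead.
            missing.push(*peer_id);
            continue;
        };
        peer.sent.push("IDONTWANT");
        sent += 1;
    }
    (sent, missing)
}
```

Option B keeps the error signal for mesh peers that are unexpectedly absent from the connected map, while Option A avoids the second lookup; as noted in the thread, the cost difference is small.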

-"Could not IDONTWANT, peer doesn't exist in connected peer list");
+// It can be the case that promises to disconnected peers appear here. In this case
+// we simply ignore the peer-id.
continue;
};

@@ -2972,7 +2978,7 @@ where
}
} else {
tracing::error!(peer = %peer_id,
-"Could not SUBSCRIBE, peer doesn't exist in connected peer list");
+"Could not send SUBSCRIBE, peer doesn't exist in connected peer list");
}
}
