-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve peer performance for NAT'd nodes #5345
Conversation
The downside is that after pruning, this logic won't be triggered again until a peer disconnects or the peer count falls below the target. Open to suggestions if people want to invoke discovery more often, but i'm conscious of having it over-used, it's very noisy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the root cause of our issues, but should help non-port forwarded peers nevertheless.
@@ -61,6 +61,8 @@ pub const MIN_OUTBOUND_ONLY_FACTOR: f32 = 0.2; | |||
/// limit is 55, and we are at 55 peers, the following parameter provisions a few more slots of | |||
/// dialing priority peers we need for validator duties. | |||
pub const PRIORITY_PEER_EXCESS: f32 = 0.2; | |||
/// The number of inbound peers that are connected that indicate this node is not behind a NAT. | |||
pub const INBOUND_PEERS_NAT: usize = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 5? How's the node getting any inbound peers at all if it does not have a reachable address?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the node may still be reached by peers that are public, i.e. the node dials a public peer so the router adds that public address to it's table and therefore it may be dialed from them in the future right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic seems sound, just have the same question as Pawan
@@ -61,6 +61,8 @@ pub const MIN_OUTBOUND_ONLY_FACTOR: f32 = 0.2; | |||
/// limit is 55, and we are at 55 peers, the following parameter provisions a few more slots of | |||
/// dialing priority peers we need for validator duties. | |||
pub const PRIORITY_PEER_EXCESS: f32 = 0.2; | |||
/// The number of inbound peers that are connected that indicate this node is not behind a NAT. | |||
pub const INBOUND_PEERS_NAT: usize = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the node may still be reached by peers that are public, i.e. the node dials a public peer so the router adds that public address to it's table and therefore it may be dialed from them in the future right?
Yeah I came here to change it. The original number was just something I made up. It was some threshold. I think the better logic now is to just always discover up to max_peers, regardless of inbound peers. We have been seeing cases of non-NAT'd peers not reaching the excess, so might as well always dial up to the max and let the peer manager prune down. |
Ok, diff is really small now, but impact could be substantial |
@Mergifyio queue |
🛑 Command
|
🟠 The pull request is the 2nd in the queue to be merged#5345 is queued for merge. Required conditions of queue
Required conditions to stay in the queue:
Visit the Mergify Dashboard to check the state of the queue |
@Mergifyio dequeue |
✅ The pull request has been removed from the queue
|
@Mergifyio requeue |
✅ This pull request will be re-embarked automaticallyThe followup |
✅ The pull request has been merged automaticallyThe pull request has been merged automatically at de91c77 |
The best way for lighthouse to manage peers is via pruning excess peers in order to get the best possible steady-state selection of peers long-term.
This process has always been reserved for nodes that have correctly port-forwarded their lighthouse node. Nodes that are behind NAT's never get excess peers and suffer from not having this peer selection logic.
Often these nodes will see
InsufficientPeers
warning occasionally because their peer set has not been optimized.This PR adds logic to allow discovery to search for more peers than necessary if the inbound peer count is small (which indicates the lighthouse node is unreachable globally). By searching for and subsequently dialing excess peers, the pruning logic will now also apply for these lighthouse nodes giving them a better set of peers.