-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure inject_dial_failure is called #1549
Conversation
I've reverted a change in the last commit because it makes the libp2p-kad tests fail. To me the change is correct: if we emit a I'm testing whether this fixes paritytech/substrate#5684. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
After investigating a bit, I was mistaken. The problem is that my change would call |
Co-Authored-By: Toralf Wittner <tw@dtex.org>
Why would that not be correct? I was actually aware of the current behaviour described in this PR w.r.t. |
swarm/src/lib.rs
Outdated
}, | ||
Peer::Connected(_) => {}, | ||
Peer::Disconnected(..) | Peer::Local => { | ||
this.behaviour.inject_dial_failure(&peer_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not sure about calling inject_dial_failure
when the dialing condition was not met. I would only expect a call to inject_dial_failure
if my behaviour emitted DialPeer
and the dialing condition was actually met, since by specifying the condition I'm saying under which conditions I actually want to dial and wouldn't expect to be informed about a dial failure if I didn't actually intend to dial as per my condition. Admittedly, as I wrote in #1506, if a behaviour actually wants to know if the dialing condition is not met because it would otherwise still keep some state, another callback may be needed then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would be ok if we clearly document that inject_dial_failure
is called also if the dialing condition is not met, i.e. such that every DialPeer
is guaranteed to be paired with inject_connection_established
or inject_dial_failure
, but then there should be no exception for when the peer is already connected, again because the behaviour may otherwise keep some state around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can revert that.
I don't know exactly when this got changed, but in libp2p 0.16.2, when
For the local peer ID, this has indeed never been the case, but it makes sense to me for it to fail as well. |
This reverts commit 80c0d3d.
I see, thanks for digging that out. This was a regression then, my apologies. |
No worries! Is that PR good for you? |
Context: paritytech/substrate#5684
There is an assumption right now in Substrate's code (and probably other) that, after we call
dial
or emit aDialPeer
, theinject_dial_failure
method is called if we know that we won't be able to reach the given node.This is not the case right now in the situation where
addresses_of_peer
returns an empty list. In that situation, we ask theSwarm
to connect to a node, and don't get any event whatsoever about that node ever again.This PR modifies
dial
to emit the event if necessary, and simplifies the handling ofDialPeer
to rely more on the behaviour ofdial
.In generally, I wish it was precisely documented and explained what happens in which situation, and that it is possible in any situation to track the state of each connection attempt, but that's a later change.
While I think it's be reasonable to ask @romanb to handle and review that, he's unfortunately on vacation right now.