This repository has been archived by the owner on Nov 15, 2023. It is now read-only.
Dialing a node for which we have no address doesn't notify the PSM #5684
Closed
Description
I have some sub-libp2p logs that show a buggy situation:
2020-04-17 11:42:03.441 tokio-runtime-worker DEBUG sub-libp2p PSM => Connect(PeerId("QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx")): Starting to connect
2020-04-17 11:42:03.441 tokio-runtime-worker DEBUG sub-libp2p Libp2p <= Dial PeerId("QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx")
2020-04-17 11:42:03.441 tokio-runtime-worker TRACE sub-libp2p Addresses of PeerId("QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx") are []
2020-04-17 11:42:03.441 tokio-runtime-worker DEBUG sub-libp2p Requested dialing to PeerId("QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx") (peer not in k-buckets), and no address was found
...and then, importantly, no further logs related to QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx
(note that this is before #5679, which slightly changed the logs format here)
What happens is:
- The NetworkBehaviour::poll emits a DialPeer event.
- The libp2p Swarm then calls addresses_of_peer, but the method returns an empty list of addresses.
- In libp2p 0.17, this would then call inject_dial_failure, which would then print additional logs and report to the peerset that the dialing attempt has failed. In libp2p 0.18, however, this doesn't seem to happen anymore.
The consequence is that the peerset and the NetworkBehaviour think that we're connecting to QmWabGpJf7JyezL2bpsykobTMecrEDeT21d4W6ntcv39Zx, while in fact we are not.
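To make the stuck state concrete, here is a toy model of the two behaviours described above. It is not libp2p or Substrate code: `Peerset`, `dial`, `stuck_connecting`, and the `notify_on_no_address` flag are hypothetical names invented for this sketch, and the real state machines are considerably more involved. The flag set to `true` stands in for what I believe 0.17 did (report a failure when no address is known), and `false` for what the logs suggest 0.18 does.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a real peer identifier.
type PeerId = &'static str;

/// Toy peerset: remembers which peers it believes are being dialed.
#[derive(Default)]
struct Peerset {
    connecting: HashSet<PeerId>,
}

impl Peerset {
    /// Models the "PSM => Connect" step from the logs.
    fn connect(&mut self, peer: PeerId) {
        self.connecting.insert(peer);
    }

    /// Models a dial failure being reported back to the peerset.
    fn dial_failed(&mut self, peer: PeerId) {
        self.connecting.remove(&peer);
    }
}

/// Toy dial step: the peerset asks to connect, then the address lookup runs.
/// `notify_on_no_address` is an assumption of this sketch, not a real option.
fn dial(
    peerset: &mut Peerset,
    peer: PeerId,
    addresses: &[&str],
    notify_on_no_address: bool,
) {
    peerset.connect(peer);
    if addresses.is_empty() {
        if notify_on_no_address {
            // The failure is reported, so the peerset can reset its state.
            peerset.dial_failed(peer);
        }
        // Without the notification, nothing else ever happens for this peer.
        return;
    }
    // With at least one address a real dial would proceed; not modeled here.
}

/// Returns true if the peerset still believes it is connecting to a peer
/// for which no address was known.
fn stuck_connecting(notify_on_no_address: bool) -> bool {
    let mut peerset = Peerset::default();
    dial(&mut peerset, "peer-a", &[], notify_on_no_address);
    peerset.connecting.contains("peer-a")
}
```

In this model, `stuck_connecting(false)` leaves the peer in the connecting set forever, which matches the symptom in the logs, while `stuck_connecting(true)` clears it.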
cc @romanb Could you look into this? I'm no longer familiar enough with the state machine to know where this should be fixed.
EDIT: Roman is on vacation