-
Notifications
You must be signed in to change notification settings - Fork 995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix relay #827
Fix relay #827
Conversation
Pushed some updates: (1) Sometimes a HostInfo that is not the main HostInfo sees traffic, which prevents it from being deleted by ConnectionManager. Since no more outgoing traffic is sent over this non-main HostInfo, it never cycles back through ConnectionManager. PR updates ConnectionManager to place a HostInfo back onto the watch list if traffic is seen as received during the deletion phase. |
@@ -88,17 +88,14 @@ func AddRelay(l *logrus.Logger, relayHostInfo *HostInfo, hm *HostMap, vpnIp iput | |||
|
|||
// EstablishRelay updates a Requested Relay to become an Established Relay, which can pass traffic. | |||
func (rm *relayManager) EstablishRelay(relayHostInfo *HostInfo, m *NebulaControl) (*Relay, error) { | |||
relay, ok := relayHostInfo.relayState.QueryRelayForByIdx(m.InitiatorRelayIndex) | |||
relay, ok := relayHostInfo.relayState.CompleteRelayByIdx(m.InitiatorRelayIndex, m.ResponderRelayIndex) | |||
if !ok { | |||
rm.l.WithFields(logrus.Fields{"relayHostInfo": relayHostInfo.vpnIp, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think relayHostInfo
-> relayVpnIp
.
Probably would be good to do a separate pass to make sure all logs are using the same field names.
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Relay creation may race, meaning a host that has sent a CreateRelayRequest may receive a CreateRelayRequest message for the exact same tunnel, before receiving a CreateRelayResponse. When this happens, a relay's locally stored remote index will be 0. When a corresponding CreateRelayRequest arrives, it'll have a non-zero value for that remote index. The current code on Master detects this mismatch as a broken relay, and tears down the relay state. Destroying this relay state breaks all possible connectivity for the tunnel.
To work in all cases, once a relay's index is created and advertised (in a CreateRelayRequest), it should never be torn down until the associated HostInfo is destroyed. This PR updates the handling of CreateRelayRequests to avoid tearing down relay state in these detected mismatch cases.
I also added some synchronization around the Relay struct, using an atomic pointer.
I edited an existing test that could reproduce errors on Master, but which runs correctly with these updates. (Nate made the test.)