zebra: Ebgp neighbor stuck in active state if multihop cli given befo… #3387
…re the update-source FRRouting#3264

When BGP sets up a session between two neighbours, it opens two sockets per peer: one to send a connect request to the neighbour, and a listening socket that waits for a connect request from the neighbour. In this bug, the connect socket is effectively closed for the peer, while the listening path still works. If no new connection request arrives at either neighbour, the session between the two peers never comes up. Each peer waits for the other to initiate the connection, which is a deadlock.

The cause is that Zebra was not responding to re-registration messages from BGP, so the next hop (which had been reset in BGP) was never resolved. BGP therefore did not proceed with this next hop. If the same thing happens on the neighbouring FRR router, neither neighbour tries to connect to the other; each depends on the other to initiate the connection, which never happens when both have this bug.

To avoid this deadlock, Zebra now responds to a client registration request even if the next hop is already registered. The change is needed because BGP's neighbour state machine for the affected neighbour otherwise goes into a deadlocked state and never sends a connect request to that neighbour.
💚 Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger
Continuous Integration Result: SUCCESSFUL
Congratulations, this patch passed basic tests
Tested-by: NetDEF / OpenSourceRouting.org CI System
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-6013/
This is a comment from an EXPERIMENTAL automated CI system.
Warnings Generated during build: Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base
💚 Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger
Continuous Integration Result: SUCCESSFUL
Congratulations, this patch passed basic tests
Tested-by: NetDEF / OpenSourceRouting.org CI System
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-6023/
This is a comment from an EXPERIMENTAL automated CI system.
Warnings Generated during build: Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base
Can you fix the clang warnings above? I see a few trailing whitespace errors.
@riw777 Trailing whitespace removed from the comments.
💚 Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger
Continuous Integration Result: SUCCESSFUL
Congratulations, this patch passed basic tests
Tested-by: NetDEF / OpenSourceRouting.org CI System
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-6033/
This is a comment from an EXPERIMENTAL automated CI system.
Warnings Generated during build: Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base
Continuous Integration Result: SUCCESSFUL
Congratulations, this patch passed basic tests
Tested-by: NetDEF / OpenSourceRouting.org CI System
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-6034/
This is a comment from an EXPERIMENTAL automated CI system.
Warnings Generated during build: Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base
This has been fixed already, closing.
…re the update-source #3264
Summary
[When BGP sets up a session between two neighbours, it opens two sockets per peer: one to send a connect request to the neighbour, and a listening socket that waits for a connect request from the neighbour. In this bug, the connect socket is effectively closed for the peer, while the listening path still works.]
Related Issue
[Zebra was not responding to re-registration messages from BGP, so the next hop (which had been reset in BGP) was never resolved. BGP therefore did not proceed with this next hop. If the same thing happens on the neighbouring FRR router, neither neighbour tries to connect to the other; each depends on the other to initiate the connection, which never happens when both have this bug. The result is a deadlock.
To avoid this deadlock, Zebra now responds to a client registration request even if the next hop is already registered. The change is needed because BGP's neighbour state machine for the affected neighbour otherwise goes into a deadlocked state and never sends a connect request to that neighbour.
]
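The registration behaviour described above can be sketched as follows. This is an illustrative model under stated assumptions, not zebra's implementation: `NexthopRegistry`, the `reply_when_already_registered` flag, and `bgp_reset_and_reregister` are hypothetical names, and the address is a documentation example.

```python
class NexthopRegistry:
    """Toy stand-in for zebra's next-hop registration (hypothetical)."""

    def __init__(self, resolvable, reply_when_already_registered):
        self.resolvable = set(resolvable)
        self.reply_when_already_registered = reply_when_already_registered
        self.registered = set()  # (client, nexthop) pairs seen so far

    def register(self, client, nexthop):
        """Return resolution status, or None if no reply is sent."""
        key = (client, nexthop)
        already_registered = key in self.registered
        self.registered.add(key)
        if already_registered and not self.reply_when_already_registered:
            return None  # behaviour described as the bug: stay silent
        return nexthop in self.resolvable


def bgp_reset_and_reregister(registry, nexthop):
    """Model BGP resetting its next-hop state and asking again."""
    reply = registry.register("bgp", nexthop)
    # Without a positive reply the peer never leaves the waiting state.
    return "connect" if reply else "waiting"


old = NexthopRegistry({"192.0.2.1"}, reply_when_already_registered=False)
old.register("bgp", "192.0.2.1")                       # initial registration
old_state = bgp_reset_and_reregister(old, "192.0.2.1")

new = NexthopRegistry({"192.0.2.1"}, reply_when_already_registered=True)
new.register("bgp", "192.0.2.1")                       # initial registration
new_state = bgp_reset_and_reregister(new, "192.0.2.1")
```

With the silent behaviour, the re-registration yields no answer and the modelled peer stays in `"waiting"`; with the reply enabled, the same re-registration lets it move on to `"connect"`.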
Components
[zebra]
Signed-off-by: Biswajit Sadhu <bsadhu@vmware.com>