Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bfdd: fix BFD_GETCBIT #15738

Merged
merged 2 commits into from
Apr 21, 2024
Merged

bfdd: fix BFD_GETCBIT #15738

merged 2 commits into from
Apr 21, 2024

Conversation

idryzhov
Copy link
Contributor

The macro should check C bit, not F bit.

The macro should check C bit, not F bit.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering how it worked stuff before this 😄

But, seems the tests are failing now.

@idryzhov
Copy link
Contributor Author

Apparently it didn't work :)
Unfortunately I don't have time right now to look into the test failures. Maybe someone else could take a look...

@rwestphal
Copy link
Member

My analysis is that your fix exposed a problem in the topotest logic.

This is the test that is failing:

def test_bfd_loss_intermediate():
    """
    Assert that BFD notices the bfd link down failure.
    but BGP entries should still be present
    """

The problem is that the premise of this test is wrong.

Since the BGP peers are configured with the bfd check-control-plane-failure option, we're declaring that their corresponding BFD session doesn't share fate with the control plane. In practice, this means the BFD packets are sent with the "Control Plane Independent" bit (aka C-bit) set.

When a BFD down event is detected and C-bit is set, that represents an actual forwarding failure that can not be ignored. As such, bgpd is correct in bringing the session down when that happens.

Prior to this PR, the topotest was passing because of this:

BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 2001:db8:1::1/128 -> 2001:db8:4::1/128 VRF default(0) (CPI bit no): Down
BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 2001:db8:4::1 vrf default(0) bfd state Up -> Down 
BGP: [JG709-75BG6] 2001:db8:4::1 BFD DOWN message ignored in the process of graceful restart when C bit is cleared

bgpd was ignoring the BFD down event because it thought the C-bit was clear, so the failure was being dismissed as a side effect of the restart taking place. This is the correct behavior as specified here.

The topotest test descriptions and reference data need to be fixed accordingly.

@donaldsharp
Copy link
Member

@rwestphal -> Do we need to worry about backwards comparability issues with these changes? And can we get some help fixing up the tests?

@frrbot frrbot bot added the tests Topotests, make check, etc label Apr 17, 2024
@github-actions github-actions bot added size/M and removed size/XS labels Apr 17, 2024
@rwestphal
Copy link
Member

@donaldsharp I think this PR will merely fix protocol behavior in the face of BFD link down events. I can't think of any backward compatibility issues.

Also, I believe very few people should be using the neighbor A.B.C.D bfd check-control-plane-failure command, as it only makes sense when BFD is implemented in hardware.

I've pushed a topotest update to Igor's branch.

When a BFD down notification is received and the C-bit is set in both
directions, any ongoing graceful restart should be aborted and stale
routes removed from the RIB.

This commit updates the `bfd_bgp_cbit_topo3` topotest accordingly to
fix the expected outcomes in the `test_bfd_loss_intermediate` test.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
@github-actions github-actions bot added the rebase PR needs rebase label Apr 18, 2024
@donaldsharp donaldsharp merged commit 7ba58e9 into FRRouting:master Apr 21, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bfd bugfix master rebase PR needs rebase size/M tests Topotests, make check, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants