Frr 8.5.1 backport wcmp #15835
Closed
donaldsharp wants to merge 202 commits into FRRouting:master from donaldsharp:frr-8.5.1_backport_wcmp
```
anlan(config-router-af)# vni 33
anlan(config-router-af-vni)# route-target both 44:55
anlan(config-router-af-vni)# no route-target both 44:55
vtysh: error reading from bgpd: Resource temporarily unavailable (11)
Warning: closing connection to bgpd because of an I/O error!
```

When `bgp_evpn_vni_rt_cmd` deals with the "both" type, it wrongly creates only one node (there should be two) and links it into both `vpn->import_rtl` and `vpn->export_rtl`. At this point the two lists are already wrong.

On `no route-target both RT`, the single node is freed from both `vpn->import_rtl` and `vpn->export_rtl`. Once it has been freed from `vpn->import_rtl`, freeing it again from `vpn->export_rtl` is a use-after-free. This sometimes causes a crash, or other unexpected behaviour.

This issue was introduced by commit `3b7e8d`, which adjusted both `bgp_evpn_vni_rt_cmd` and `bgp_evpn_vrf_rt_cmd`. Since `bgp_evpn_vrf_rt_cmd/no_bgp_evpn_vrf_rt_cmd` unintentionally works again with commit `7022da`, only `bgp_evpn_vni_rt_cmd` needs to be modified: add two nodes for the "both" type, plus some explicit comments for this special case.

Signed-off-by: anlan_cs <vic.lan@pica8.com> (cherry picked from commit 432ff4b)
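The failure mode described in this commit message can be modeled in a few lines. What follows is a minimal Python sketch, not FRR's C code: `RtNode`, `free_node`, `add_both`, and `del_both` are hypothetical names standing in for the route-target nodes and the handling of `vpn->import_rtl` and `vpn->export_rtl`. It illustrates why a single node linked into both lists ends up freed twice, while two separate nodes are each freed exactly once.

```python
class RtNode:
    """Hypothetical stand-in for one heap-allocated route-target node."""

    def __init__(self, rt):
        self.rt = rt
        self.freed = False


def free_node(node):
    # Model free(): touching an already freed node is the use-after-free.
    if node.freed:
        raise RuntimeError("use-after-free: node %s already freed" % node.rt)
    node.freed = True


def add_both(import_rtl, export_rtl, rt, share_node):
    """Model 'route-target both RT'."""
    if share_node:
        # Buggy variant: a single node is linked into both lists.
        node = RtNode(rt)
        import_rtl.append(node)
        export_rtl.append(node)
    else:
        # Fixed variant: each list owns its own node.
        import_rtl.append(RtNode(rt))
        export_rtl.append(RtNode(rt))


def del_both(import_rtl, export_rtl, rt):
    """Model 'no route-target both RT': free the matching node in each list."""
    for rtl in (import_rtl, export_rtl):
        for node in [n for n in rtl if n.rt == rt]:
            free_node(node)
            rtl.remove(node)
```

With `share_node=True`, `del_both` raises on the second list because the node was already freed while walking the first; with `share_node=False`, both lists are emptied cleanly.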
…pr-12761 bgpd: fix use-after-free crash for evpn (backport FRRouting#12761)
Signed-off-by: Donald Sharp <sharpd@nvidia.com> (cherry picked from commit 7386031)
…pr-12769 tools: Fix missing pbrd in rsyslog.d 45-frr.conf file (backport FRRouting#12769)
…pr-12768 tests: do not use exclude grep (backport FRRouting#12768)
Ticket: 2699411 Signed-off-by: Wesley Coakley <wcoakley@nvidia.com> (cherry picked from commit 8dc2001)
…pr-12773 pbrd: fix large tableids displayed as negative (backport FRRouting#12773)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org> (cherry picked from commit b9941b3)
…loaded Seems just a missed one because at `goto error` we release dnode. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org> (cherry picked from commit ecf82aa)
When we receive a default route from a peer and we also originate a default route using `neighbor default-originate`, we do not keep track of the struct attr we use. When we then do `no neighbor default-originate`, we withdraw our generated default route and announce the default route from the peer instead.

After we do this, we unintern the aspath (which was used for default-originate), BUT it was also used for the peer's default route that we received.

And here we have a use-after-free crash, because bgp_process_main_one() reaps old paths that are marked as BGP_PATH_REMOVED with aspath->refcnt > 0, but here it's 0.

```
0 0x55c24bbcd022 in aspath_key_make bgpd/bgp_aspath.c:2070
1 0x55c24b8f1140 in attrhash_key_make bgpd/bgp_attr.c:777
2 0x7f52322e66c9 in hash_release lib/hash.c:220
3 0x55c24b8f6017 in bgp_attr_unintern bgpd/bgp_attr.c:1271
4 0x55c24ba0acaa in bgp_path_info_free_with_caller bgpd/bgp_route.c:283
5 0x55c24ba0a7de in bgp_path_info_unlock bgpd/bgp_route.c:309
6 0x55c24ba0af6d in bgp_path_info_reap bgpd/bgp_route.c:426
7 0x55c24ba17b9a in bgp_process_main_one bgpd/bgp_route.c:3333
8 0x55c24ba18a1d in bgp_process_wq bgpd/bgp_route.c:3425
9 0x7f52323c2cd5 in work_queue_run lib/workqueue.c:282
10 0x7f52323aab92 in thread_call lib/thread.c:2006
11 0x7f5232300dc7 in frr_run lib/libfrr.c:1198
12 0x55c24b8ea792 in main bgpd/bgp_main.c:520
13 0x7f5231c3a082 in __libc_start_main ../csu/libc-start.c:308
14 0x55c24b8ef0bd in _start (/usr/lib/frr/bgpd+0x2c90bd)
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org> (cherry picked from commit e9340ff)
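The reference-counting problem in this commit message can be illustrated with a toy intern table. This is a simplified Python model, not bgpd's attribute hash: `AttrPool` and `withdraw_default_originate` are hypothetical names, and the only behaviour assumed is that each holder of an attribute should take its own reference and that the entry is freed when the count reaches zero.

```python
class AttrPool:
    """Toy intern table: maps an attribute key to its reference count."""

    def __init__(self):
        self.refcnt = {}

    def intern(self, key):
        # Every holder of the attribute takes its own reference.
        self.refcnt[key] = self.refcnt.get(key, 0) + 1
        return key

    def unintern(self, key):
        # Dropping a reference nobody holds models the use-after-free.
        if self.refcnt.get(key, 0) == 0:
            raise RuntimeError("unintern of freed attribute %r" % (key,))
        self.refcnt[key] -= 1
        if self.refcnt[key] == 0:
            del self.refcnt[key]  # last reference gone: free the entry


def withdraw_default_originate(pool, aspath, originate_interned):
    """Return True if the peer's path can still be released safely afterwards."""
    pool.intern(aspath)        # reference held by the peer's received default route
    if originate_interned:
        pool.intern(aspath)    # reference held by our default-originate route
    pool.unintern(aspath)      # 'no neighbor default-originate'
    try:
        pool.unintern(aspath)  # later: the peer's old path is reaped
        return True
    except RuntimeError:
        return False
```

In this model, `withdraw_default_originate(AttrPool(), "65001", False)` fails because the originate path drops a reference it never took, while the interned variant succeeds.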
And also do not crash when we do `clear ip bgp ...`. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org> (cherry picked from commit 35b7ce0)
…pr-12782 bgpd: Intern default-originate attributes to avoid use-after-free (backport FRRouting#12782)
…pr-12781 lib: Release memory of YANG translation module on error (backport FRRouting#12781)
Currently the "show ipv6 mld join json" output is:

    frr# show ipv6 mld joins json
    {
      "default":{
        "ens192":{
          "ff02:2":{
            "::":{
              "state":"JOIN",
              "created":"00:01:50.595",
              "lastSeen":"00:00:38.403",
            }
          }
        }
      }
    }

Here, I modified the output as below for better understanding:

    frr# show ipv6 mld joins json
    {
      "default":{
        "vrf":"default",
        "ens192":{
          "ff02::2":{
            "*":{
              "state":"JOIN",
              "created":"00:00:42.766",
              "lastSeen":"00:00:05.266"
            }
          }
        }
      }
    }

Issue: FRRouting#12755
Signed-off-by: Sarita Patra <saritap@vmware.com> (cherry picked from commit 58971e1)
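As a rough illustration of how a consumer might read the modified output, here is a small Python sketch. The `list_joins` helper is hypothetical and not part of FRR; it assumes the layout shown in this commit message, where a VRF object holds a scalar `"vrf"` field next to per-interface objects. Note that the old output ended the innermost object with a trailing comma, which strict JSON parsers such as Python's `json` module reject.

```python
import json

NEW_OUTPUT = """
{
  "default":{
    "vrf":"default",
    "ens192":{
      "ff02::2":{
        "*":{
          "state":"JOIN",
          "created":"00:00:42.766",
          "lastSeen":"00:00:05.266"
        }
      }
    }
  }
}
"""


def list_joins(text):
    """Return (vrf, interface, group, source, state) tuples from the JSON text."""
    joins = []
    for vrf_name, vrf in json.loads(text).items():
        for ifname, groups in vrf.items():
            if not isinstance(groups, dict):
                continue  # skip scalar fields such as "vrf"
            for group, sources in groups.items():
                for source, info in sources.items():
                    joins.append((vrf_name, ifname, group, source, info["state"]))
    return joins
```

Calling `list_joins(NEW_OUTPUT)` yields one tuple for the `ff02::2` group on `ens192`, with `*` as the source.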
…pr-12776 pim6d: Modify "show ipv6 mld join json" o/p (backport FRRouting#12776)
This is still causing crashes somehow. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com> (cherry picked from commit ed33162)
…pr-12790 vrrpd: give null when using null ifp to lookup vr (backport FRRouting#12790)
When performing deterministic MED processing, ensure that the peer status is not checked when we encounter a stale path. Otherwise, this path will be skipped from the DMED consideration leading to it potentially not being installed. Test scenario: Consider a prefix with 2 (multi)paths. The peer that announces the path with the winning DMED undergoes a graceful-restart. Before it comes back up, the other path goes away. Prior to the fix, a third router that receives both these paths would have ended up not having any path installed to the prefix after the above events. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com> For internal use: Ticket: CM-32032 Testing done: Multiple manual testing (cherry picked from commit de692a4)
Ensure that a multipath set is fully comprised of EVPN paths (i.e., paths imported into the VRF from EVPN address-family) or non-EVPN paths. This is actually a condition that existed already in the code but was not properly enforced. This change, as a side effect, eliminates the known trigger condition for bad or missing RMAC programming in an EVPN deployment, described in tickets CM-29043 and CM-31222. Routes (actually, paths) in a VRF routing table that require VXLAN tunneling to the next hop currently need some special handling in zebra to deal with the nexthop (neigh) and RMAC programming, and this is implemented for the entire route (prefix), not per-path. This can lead to the bad or missing RMAC situation, which is now eliminated by ensuring all paths in the route are 'similar'. The longer-term solution in CL 5.x will be to deal with the special programming by means of explicit communication between bgpd and zebra. This is already implemented for EVPN-MH via CM-31398. These changes will be extended to non-MH also and the special code in zebra removed or refined. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com> Acked-by: Trey Aspelund <taspelund@nvidia.com> Acked-by: Anuradha Karuppiah <anuradhak@nvidia.com> Acked-by: Chirag Shah <chirag@nvidia.com> Ticket: CM-29043 Testing Done: 1. Manual testing 2. precommit on both MLX and BCM platforms 3. evpn-smoke - BCM and VX Results described in the ticket (cherry picked from commit d2d71b0)
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com> Ticket: #2609944 (cherry picked from commit f88889b)
…pr-12734 lib: Fix non-use of option (backport FRRouting#12734)
…pr-12817 some missed upstream commits (backport FRRouting#12817)
Because mp_nexthop_len attribute value stands for the length to encode in the stream, simplify the way the nexthop is forged. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> (cherry picked from commit 558e8f5)
The handling of a route-map that updates the link-local ipv6 address needs to be clarified. When a VPN SAFI is used, the nexthop length must be changed to 48 bytes. In other cases, the length will be 32 bytes. Fixes: 9795e9f ("bgpd: fix when route-map changes the link local nexthop for vpnv6") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> (cherry picked from commit 5bbcc78)
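The two lengths mentioned in this commit message can be derived with simple arithmetic. The sketch below is illustrative Python, not bgpd code; the function name is hypothetical, and the breakdown of 48 bytes as two addresses each prefixed by an 8-byte route distinguisher is an assumption about VPN nexthop encoding rather than something stated in the commit.

```python
IPV6_ADDR_LEN = 16  # bytes in one IPv6 address
RD_LEN = 8          # assumed: route distinguisher bytes prepended per nexthop for VPN SAFIs


def ipv6_global_and_ll_nexthop_len(is_vpn_safi):
    """Length of a nexthop field carrying a global and a link-local address."""
    per_address = IPV6_ADDR_LEN + (RD_LEN if is_vpn_safi else 0)
    return 2 * per_address
```

Under these assumptions the function returns 48 for a VPN SAFI and 32 otherwise, matching the values in the commit message.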
In `rib_link`, if is_zebra_import_table_enabled returns true, `rib_queue_add` will not be called, so route nodes from other tables are never processed. This should not depend on whether the route is imported. In `rib_delnode`, if is_zebra_import_table_enabled returns true, it will use `rib_unlink` instead of enqueuing the route node for processing. There is no reason that imported route nodes should not be reprocessed. Long ago, the behaviour depended on whether the route_entry came from a table other than main. Signed-off-by: zyxwvu Shi <i@shiyc.cn> (cherry picked from commit 207207c)
…pr-12818 zebra: Fix other table inactive when ip import-table is on (backport FRRouting#12818)
…pr-12795 Vpnv6 nexthop encoding (backport FRRouting#12795)
Signed-off-by: Christian Hopps <chopps@labn.net> (cherry picked from commit 1794afe)
….5/pr-13312 zebra: EVPN MH ES-peer Sync MAC installed as inactive (backport FRRouting#13312)
1. Consider an established L2VPN EVPN BGP peer with soft-reconfiguration inbound configured.
2. When the interface of this directly connected BGP peer is shut down, bgp_soft_reconfig_table_update() is called, which memsets the evpn buffer and calls bgp_update() with the received attributes stored in the ain table (ain->attr). In bgp_update(), the evpn_overlay attribute in ain->attr (which is an interned attr) was modified by doing a memcpy.
3. The above action causes 2 attributes in the attrhash (which were previously different) to match!
4. Later, during the fsm change event of the peer, bgp_adj_in_remove() is called to clean up the ain->attr. But because 2 attrs in the attrhash match, BGP asserts in bgp_attr_unintern().

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com> (cherry picked from commit 6e076ba)
….5/pr-13329 bgpd: Fix for ain->attr corruption during path update (backport FRRouting#13329)
If you have a very large number of large communities whose string length happens to be greater than BUFSIZ, FRR's bgpd would crash. This is because bgpd would write beyond the end of the string. Originally the code auto-calculated the string size appropriately, but commit ed0e57e changed the string length to a hard-coded BUFSIZ. When a route-map like this is added:

    route-map LARGE-OUT permit 10
     set large-community 4635:0:0 4635:1:906 4635:1:2906 4635:1:4515 4635:1:4594 4635:1:4641 4635:1:4760 4635:1:7979 4635:1:9253 4635:1:9293 4635:1:9304 4635:1:9908 4635:1:13335 4635:1:16265 4635:1:17924 4635:1:18013 4635:1:20940 4635:1:22822 4635:1:24429 4635:1:24482 4635:1:32590 4635:1:32934 4635:1:36692 4635:1:38008 4635:1:38819 4635:1:41378 4635:1:45753 4635:1:46489 4635:1:49544 4635:1:51847 4635:1:54574 4635:1:54994 4635:1:55720 4635:1:56059 4635:1:57724 4635:1:65021 4635:1:134823 4635:1:136907 4635:1:146961 24115:0:24115 24115:1:906 24115:1:2906 24115:1:4515 24115:1:4594 24115:1:4641 24115:1:4760 24115:1:7979 24115:1:9253 24115:1:9293 24115:1:9304 24115:1:9908 24115:1:13335 24115:1:16265 24115:1:17924 24115:1:18013 24115:1:20940 24115:1:22822 24115:1:24429 24115:1:24482 24115:1:32590 24115:1:32934 24115:1:36692 24115:1:38008 24115:1:38819 24115:1:41378 24115:1:45753 24115:1:46489 24115:1:49544 24115:1:51847 24115:1:54574 24115:1:54994 24115:1:55720 24115:1:56059 24115:1:57724 24115:1:65021 24115:1:134823 24115:1:136907 24115:1:100000 24115:1:100001 24115:1:100002
    exit

BGP would have issues and crash. Modify the code to correctly determine the string length of the communities, and also to double check whether the string has an alias and ensure that the buffer is still large enough. If not, auto-size it again.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
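The sizing idea described in this commit message, computing the buffer from what will actually be written rather than from a fixed constant, can be sketched as follows. This is illustrative Python and not the bgpd implementation: `lcom_str_size`, the `aliases` mapping, and the `BUFSIZ` value of 8192 are assumptions made for the example.

```python
BUFSIZ = 8192  # assumed value, used here only for comparison


def lcom_str_size(communities, aliases=None):
    """Bytes needed for the space-separated string plus a terminating NUL."""
    aliases = aliases or {}
    # An alias replaces the numeric form and may be longer or shorter than it.
    rendered = [aliases.get(c, c) for c in communities]
    separators = max(len(rendered) - 1, 0)
    return sum(len(tok) for tok in rendered) + separators + 1
```

For example, two communities `4635:0:0` and `4635:1:906` need 20 bytes, but if one of them maps to a very long alias the required size can exceed `BUFSIZ`, which is the case a hard-coded buffer does not cover.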
…t_8cb4892c0669d916557d693b878420faec8e6e2a bgpd: [8.5] Fix lcom->str string length to correctly cover aliases
* Bug Fixes
bgpd
  - Fix crash due to community aliases size
  - Aggregate-address memory leak fix
  - Bmp fix peer-up ports byte order
  - Check 7 bytes for long-lived graceful-restart capability
  - Copy the password from the previous peer on peer_xfer_config()
  - Do not allow a `no router bgp xxx` when autoimport is happening
  - Do not allow l3vni changes when shutting down
  - Do not announce routes immediately on filter updates
  - Do not call bgp_soft_reconfig_in() twice in a row on policy change
  - Evpn-mh esi not active suppress ead-es route
  - Fix crash for `show bgp ... neighbor received-routes detail|prefix`
  - Fix debug output for route-map names when using an unsuppress-map
  - Fix ecommunity parsing for as4
  - Fix for ain->attr corruption during path update
  - Increase buffer size used for dumping bgp to mrt files
  - Limit flowspec to no attribute means an implicit withdrawal
  - Prevent null pointer deref when outputting data
lib
  - Adjust only `any` flag for prefix-list entries if destroying
  - Destroy `any` flag when creating a prefix-list entry with prefix
  - Fix clear route-map cmd using defpy
  - Fix link state memory leak
  - Include clippy generated commands for routemap.c
  - On bfd peer shutdown actually stop event
ospfd
  - Cleanup some memory leaks on shutdown in ospf_apiserver.c
  - Fix for virtual-link crash in signal handler
  - Fix ospf_lsa memory leak
  - Fix ospf_ti_lfa drop of an entire table
  - Fixing summary origination after range configuration
  - Free up q_space in early return path
  - Log adjacency changes with neighbor ip in addition to neighbor id
pbrd
  - Fix mismatching in match src-dst
pim6d
  - Fixing mroutes not created after disabling and enabling pimv6
pimd
  - Fix use after free issue for ifp's moving vrfs
  - In_multicast needs host order
  - Process no-forward bsm packet
ripd
  - Fix malformed route-map
  - Fix memory leak for ripd's route-map
staticd
  - Tell bfd that we are shutting down
tools
  - Fix missing remote-as configuration when reload
  - Frr-reload fix list value not present
  - Make check flag really work for reload
  - Set correct directory of vtysh for frr-reload.py
zebrad
  - Add link_nsid to zebra interface
  - Cleanup ctx leak on shutdown and turn off event
  - Evpn mh sync mac install as inactive
  - Fix for heap-use-after-free in evpn
  - Fix race during shutdown
  - Install directly connected route after interface flap

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Currently underlying asics get into a bit of trouble when the nexthop weights passed down vary wildly from one another. Let's normalize the weight values to fall between 1 and 255. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
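One way such a normalization could look is sketched below in Python. This is not the code from the commit, whose exact scaling rule is not shown on this page: `normalize_weights` is a hypothetical helper that scales each weight relative to the largest one and clamps the result into the 1 to 255 range, so that widely varying inputs map onto a small, bounded set of values.

```python
def normalize_weights(weights, max_weight=255):
    """Scale nexthop weights into 1..max_weight relative to the largest weight."""
    if not weights:
        return []
    largest = max(weights)
    if largest <= 0:
        # Degenerate input: fall back to equal weights.
        return [1] * len(weights)
    # Integer scaling; clamp so that tiny weights never drop to zero.
    return [max(1, min(max_weight, w * max_weight // largest)) for w in weights]
```

For instance, weights of 1000000, 500000 and 1 become 255, 127 and 1, and equal weights all map to 255.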
It might be useful to have the weight scale value change its behavior at a later time, or be controlled by something depending on how FRR is compiled or run. Let's start that process. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of scaling the bandwidth to something between 1 and 100, just send down the bandwidth available for the link. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Old behavior was metric values between 1-100, now we have metric values between 1-255. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
No description provided.