zebra: Fix for heap-use-after-free in EVPN #13062

Merged 1 commit on Mar 21, 2023

Pdoijode
Contributor

Issue:
When a netns is deleted, zebra does not receive interface down/delete notifications from the kernel, so it deletes the interfaces itself. It does so without removing the association between the zebra_l3vni and the interface being deleted (i.e. it deletes the interface without setting “zl3vni->vxlan_if” to NULL).

Later in the netns deletion, when zl3vni_rmac_uninstall() is called to uninstall the remote RMAC from the kernel, zebra dereferences the stale “zl3vni->vxlan_if” pointer, which now points to freed memory. This caused the heap use-after-free.

Fix:
When zebra receives a netns delete notification, it now calls the appropriate functions to remove the association between the EVPN structs and the interface, setting “zl3vni->vxlan_if” to NULL, before it starts deleting the interfaces. This ensures that when zl3vni_rmac_uninstall() is called during netns deletion, it bails out because “zl3vni->vxlan_if” is NULL.
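
To illustrate the ordering described above, here is a minimal sketch in C. It is not the code from this change: the `sketch_*` structs and functions are hypothetical stand-ins for zebra's interface and L3VNI handling, and the only behaviour taken from this description is that the uninstall path returns -1 when the `vxlan_if` back-pointer is NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for zebra's interface and L3VNI structures. */
struct sketch_if {
    char name[16];
};

struct sketch_l3vni {
    unsigned int vni;
    struct sketch_if *vxlan_if; /* back-pointer that must never dangle */
};

/* Drop the L3VNI -> interface association if it refers to ifp. */
static void sketch_l3vni_unlink_if(struct sketch_l3vni *zl3vni,
                                   const struct sketch_if *ifp)
{
    if (zl3vni->vxlan_if == ifp)
        zl3vni->vxlan_if = NULL;
}

/* Unlink first, then free: the order the fix is meant to enforce. */
static void sketch_if_delete(struct sketch_l3vni *zl3vni,
                             struct sketch_if *ifp)
{
    assert(ifp != NULL);
    sketch_l3vni_unlink_if(zl3vni, ifp);
    free(ifp);
}

/* Returns 0 on success, -1 when there is no interface to operate on. */
static int sketch_rmac_uninstall(const struct sketch_l3vni *zl3vni)
{
    if (zl3vni->vxlan_if == NULL) {
        fprintf(stderr, "L3-VNI %u: RMAC not uninstalled - no vxlan_if\n",
                zl3vni->vni);
        return -1;
    }
    /* A real implementation would send the kernel request here. */
    return 0;
}
```

With this ordering, a later `sketch_rmac_uninstall()` call sees a NULL back-pointer and returns -1 instead of reading freed memory. Freeing the interface without the unlink step would leave `vxlan_if` dangling, which is the pattern AddressSanitizer reported.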

During interface deletion, netlink_link_change() calls the appropriate clean-up functions before calling if_delete_update() to delete the interface. Similar changes have been made in zebra_ns_delete() so that it handles interface deletion the same way.
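
The idea that both deletion paths run the same clean-up before the delete can be sketched as follows. This is an illustrative model only: the shared `sketch_if_teardown()` helper, the trace structure, and the step names are assumptions made for the example, and the actual change may structure the calls differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_STEPS 16

/* Records the order of teardown steps so the sequence can be inspected. */
struct sketch_trace {
    const char *steps[SKETCH_MAX_STEPS];
    size_t count;
};

static void sketch_trace_add(struct sketch_trace *t, const char *step)
{
    assert(t->count < SKETCH_MAX_STEPS);
    t->steps[t->count++] = step;
}

/* Hypothetical stand-in for the EVPN clean-up done before deletion. */
static void sketch_evpn_cleanup(struct sketch_trace *t)
{
    sketch_trace_add(t, "vni_down");
    sketch_trace_add(t, "del_vni");
}

/* Hypothetical stand-in for the step that actually frees the interface. */
static void sketch_if_delete_update(struct sketch_trace *t)
{
    sketch_trace_add(t, "if_delete_update");
}

/* Shared ordering: clean up EVPN state first, delete the interface last. */
static void sketch_if_teardown(struct sketch_trace *t)
{
    sketch_evpn_cleanup(t);
    sketch_if_delete_update(t);
}

/* Path 1: the kernel reported a link deletion for one interface. */
static void sketch_link_delete_event(struct sketch_trace *t)
{
    sketch_if_teardown(t);
}

/* Path 2: the netns went away without per-interface notifications,
 * so every interface in it is torn down by hand. */
static void sketch_ns_delete(struct sketch_trace *t, size_t n_ifs)
{
    for (size_t i = 0; i < n_ifs; i++)
        sketch_if_teardown(t);
}
```

Routing both paths through one ordering means the netns path cannot skip the clean-up that the netlink path already performs.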

Address-sanitizer output before fix on R1:

=================================================================
==7821==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff8b15d888 at pc 0xaaaabca5b50c bp 0xffffe40f1580 sp 0xffffe40f15a0
READ of size 8 at 0xffff8b15d888 thread T0
    #0 0xaaaabca5b508 in zl3vni_rmac_uninstall zebra/zebra_vxlan.c:1330
    #1 0xaaaabca5b608 in zl3vni_rmac_uninstall zebra/zebra_vxlan.c:1318
    #2 0xaaaabca5b608 in zl3vni_del_rmac_hash_entry zebra/zebra_vxlan.c:2455
    #3 0xffff915d0ab0 in hash_iterate lib/hash.c:252
    #4 0xaaaabca6c0ec in zebra_vxlan_vrf_disable zebra/zebra_vxlan.c:5214
    #5 0xaaaabca3d17c in zebra_vrf_disable zebra/zebra_vrf.c:171
    #6 0xffff916848b4 in vrf_delete lib/vrf.c:230
    #7 0xaaaabc9f7c40 in zebra_ns_delete zebra/zebra_netns_notify.c:177
    #8 0xaaaabc9f7c40 in zebra_ns_notify_read zebra/zebra_netns_notify.c:342
    #9 0xffff9167dfdc in thread_call lib/thread.c:1991
    #10 0xffff915e7dd0 in frr_run lib/libfrr.c:1185
    #11 0xaaaabc929c08 in main zebra/main.c:465
    #12 0xffff91271e0c in __libc_start_main ../csu/libc-start.c:308
    #13 0xaaaabc92c4c0  (/usr/lib/frr/zebra+0x19c4c0)

0xffff8b15d888 is located 200 bytes inside of 272-byte region [0xffff8b15d7c0,0xffff8b15d8d0)
freed by thread T0 here:
    #0 0xffff919ae174 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122
    #1 0xffff915d8858 in if_delete lib/if.c:280
    #2 0xaaaabc947b74 in if_delete_update zebra/interface.c:809
    #3 0xaaaabc9f7b94 in zebra_ns_delete zebra/zebra_netns_notify.c:169
    #4 0xaaaabc9f7b94 in zebra_ns_notify_read zebra/zebra_netns_notify.c:342
    #5 0xffff9167dfdc in thread_call lib/thread.c:1991
    #6 0xffff915e7dd0 in frr_run lib/libfrr.c:1185
    #7 0xaaaabc929c08 in main zebra/main.c:465
    #8 0xffff91271e0c in __libc_start_main ../csu/libc-start.c:308
    #9 0xaaaabc92c4c0  (/usr/lib/frr/zebra+0x19c4c0)

previously allocated by thread T0 here:
    #0 0xffff919ae724 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0xffff91601c94 in qcalloc lib/memory.c:105
    #2 0xffff915d5bcc in if_new lib/if.c:161
    #3 0xffff915d5bcc in if_create_name lib/if.c:215
    #4 0xffff915d5bcc in if_get_by_name lib/if.c:614
    #5 0xaaaabc9377e0 in netlink_interface zebra/if_netlink.c:1163
    #6 0xaaaabc955320 in netlink_parse_info zebra/kernel_netlink.c:1183
    #7 0xaaaabc93cd14 in interface_lookup_netlink zebra/if_netlink.c:1273
    #8 0xaaaabc93ceac in interface_list zebra/if_netlink.c:2419
    #9 0xaaaabca064fc in zebra_ns_enable zebra/zebra_ns.c:113
    #10 0xaaaabca06640 in zebra_ns_enabled zebra/zebra_ns.c:90
    #11 0xffff9160b0f4 in ns_enable_internal lib/netns_linux.c:227
    #12 0xffff9160b0f4 in ns_enable lib/netns_linux.c:348
    #13 0xaaaabca3ea20 in zebra_vrf_netns_handler_create zebra/zebra_vrf.c:610
    #14 0xaaaabc9f845c in zebra_ns_notify_create_context_from_entry_name zebra/zebra_netns_notify.c:112
    #15 0xaaaabc9f8adc in zebra_ns_notify_parse zebra/zebra_netns_notify.c:397
    #16 0xaaaabca069c8 in zebra_ns_init zebra/zebra_ns.c:212
    #17 0xaaaabc929b18 in main zebra/main.c:397
    #18 0xffff91271e0c in __libc_start_main ../csu/libc-start.c:308
    #19 0xaaaabc92c4c0  (/usr/lib/frr/zebra+0x19c4c0)

With Fix:
Zebra logs on r1 for topotests/bgp_evpn_rt5/test_bgp_evpn.py:

pdoijode@upstream2:~/Documents/frr/tests/topotests/bgp_evpn_rt5$ sudo -E pytest test_bgp_evpn.py


2023/03/20 09:04:09.472 ZEBRA: [M252K-PDDRC] Intf vxlan-101(5) L3-VNI 101 is DOWN --> zebra_vxlan_if_vni_down()
2023/03/20 09:04:09.472 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 101 VRF r1-vrf-101 to bgp  --> zl3vni_send_del_to_client()
2023/03/20 09:04:09.472 ZEBRA: [SBFM4-2P25V] MESSAGE: ZEBRA_INTERFACE_DOWN vxlan-101 vrf r1-vrf-101(1)
2023/03/20 09:04:09.472 ZEBRA: [XN0NB-2NSYE] MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::d470:6ff:fee6:c896/64 on vxlan-101 vrf r1-vrf-101(1)
2023/03/20 09:04:09.472 ZEBRA: [WVRMN-YEC5Q] Del L3-VNI 101 intf vxlan-101(5) --> zebra_vxlan_if_del_vni()
2023/03/20 09:04:09.472 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 101 VRF r1-vrf-101 to bgp
2023/03/20 09:04:09.472 ZEBRA: [WEEJX-M4HA0] interface vxlan-101 vrf r1-vrf-101(1) index 5 is now inactive. --> if_delete_update()
2023/03/20 09:04:09.472 ZEBRA: [XN0NB-2NSYE] MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::d470:6ff:fee6:c896/64 on vxlan-101 vrf r1-vrf-101(1)
2023/03/20 09:04:09.472 ZEBRA: [NXAHW-290AC] MESSAGE: ZEBRA_INTERFACE_DELETE vxlan-101 vrf r1-vrf-101(1)
2023/03/20 09:04:09.472 ZEBRA: [Y6R2N-EF2N4] interface vxlan-101 is being deleted from the system.  --> if_delete_update()
2023/03/20 09:04:09.472 ZEBRA: [P0CZ5-RF5FH] VRF r1-vrf-101 id 1 is now inactive
2023/03/20 09:04:09.472 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 101 VRF r1-vrf-101 to bgp
2023/03/20 09:04:09.472 ZEBRA: [W0Q6Q-6EMPH] RMAC c6:86:f2:52:e9:8f on L3-VNI 101 hash 0xffff9e725390 couldn't be uninstalled - no vxlan_if --> zl3vni_rmac_uninstall(). With the fix, zebra returns -1 from zl3vni_rmac_uninstall() since "zl3vni->vxlan_if" is NULL.
2023/03/20 09:04:09.472 ZEBRA: [RTA3T-W4WDC] rtadv_event(r1-vrf-101) with event: 1 and val: 0
2023/03/20 09:04:09.472 ZEBRA: [T65SJ-FY79X] adv_if_clean: r1-vrf-101:1 count: 0 -> 0
2023/03/20 09:04:09.472 ZEBRA: [T65SJ-FY79X] adv_msec_if_clean: r1-vrf-101:1 count: 0 -> 0
2023/03/20 09:04:09.472 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE r1-vrf-101
2023/03/20 09:04:09.472 ZEBRA: [W1RCA-GPZTZ] ZNS /run/netns/r1-vrf-101 with id 1 (disabled)
2023/03/20 09:04:09.472 ZEBRA: [GCH7Z-8R4AV] ZNS /run/netns/r1-vrf-101 with id 1 (deleted)
2023/03/20 09:04:09.472 ZEBRA: [ZWXAA-QY4KQ] NS notify : deleted VRF r1-vrf-101
2023/03/20 09:04:09.487 ZEBRA: [XVBTQ-5QTVQ] Terminating on signal
2023/03/20 09:04:09.489 ZEBRA: [VFCDB-S5FKG] connection closed socket [23]

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
@ton31337
Member

@Mergifyio backport stable/8.4 stable/8.5

@mergify

mergify bot commented Mar 20, 2023

backport stable/8.4 stable/8.5

✅ Backports have been created

@NetDEF-CI
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-10274/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@donaldsharp
Member

verified that this is fixed

@ton31337 ton31337 merged commit 0c462d6 into FRRouting:master Mar 21, 2023
ton31337 added a commit that referenced this pull request Mar 21, 2023
zebra: Fix for heap-use-after-free in EVPN (backport #13062)
ton31337 added a commit that referenced this pull request Mar 21, 2023
zebra: Fix for heap-use-after-free in EVPN (backport #13062)