-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bgp use after free #12415
Bgp use after free #12415
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
b1da62d
to
6ee6370
Compare
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-8642/ This is a comment from an automated CI system. |
6ee6370
to
c68dd75
Compare
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-8679/ This is a comment from an automated CI system. |
bgpd/bgp_fsm.c
Outdated
@@ -2641,7 +2647,7 @@ int bgp_event_update(struct peer *peer, enum bgp_fsm_events event) | |||
* we need to indicate that the peer was stopped in the return | |||
* code. | |||
*/ | |||
if (!dyn_nbr && !passive_conn && peer->bgp) { | |||
if (!dyn_nbr && !passive_conn && peer->bgp && ret != -2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest giving a name for -2
(aka #define
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to an enum in a later commit. Since the whole thing needed to be updated
At some point in the past the peer creation was not properly setting the su and the code had the release and re-add when setting the su. Since peer_create got a bit of code to handle the su properly the need to release then add it back in is negated so remove the code. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Three fixes: a) When calling bgp_fsm_change_status with `Deleted` do not add a new event to the peer's t_event because we are already in the process of deleting everything b) When bgp_stop decides to delete a peer return a notification that it is happening to bgp_event_update so that it does not set the peer state back to idle or do other processing. c) bgp_event_update can cause a peer deletion, because the peer can be deleted in the fsm function but the peer data structure is still pointed to and used after words. So lock the peer before entering and prevent a use after free. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a peer is a peer-group based peer, and the config is inherited from the peer group, let's ensure that the CONFIG_NODE flag stays no matter what. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When changing the peers sockunion structure the bgp->peer list was not being updated properly. Since the peer's su is being used for a sorted insert then the change of it requires that the value be pulled out of the bgp->peer list and then put back into as well. Additionally ensure that the hash is always released on peer deletion. Lead to this from this decode in a address sanitizer run. ================================================================= ==30778==ERROR: AddressSanitizer: heap-use-after-free on address 0x62a0000d8440 at pc 0x7f48c9c5c547 bp 0x7ffcba272cb0 sp 0x7ffcba272ca8 READ of size 2 at 0x62a0000d8440 thread T0 #0 0x7f48c9c5c546 in sockunion_same lib/sockunion.c:425 #1 0x55cfefe3000f in peer_hash_same bgpd/bgpd.c:890 #2 0x7f48c9bde039 in hash_release lib/hash.c:209 #3 0x55cfefe3373f in bgp_peer_conf_if_to_su_update bgpd/bgpd.c:1541 #4 0x55cfefd0be7a in bgp_stop bgpd/bgp_fsm.c:1631 #5 0x55cfefe4028f in peer_delete bgpd/bgpd.c:2362 #6 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267 #7 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949 #8 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009 #9 0x7f48c9ba1573 in cmd_execute lib/command.c:1162 #10 0x7f48c9c87402 in vty_command lib/vty.c:526 #11 0x7f48c9c87832 in vty_execute lib/vty.c:1291 #12 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130 #13 0x7f48c9c7a66d in thread_call lib/thread.c:1585 #14 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123 #15 0x55cfefc75a15 in main bgpd/bgp_main.c:540 #16 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) #17 0x55cfefc787f9 in _start (/usr/lib/frr/bgpd+0xe27f9) 0x62a0000d8440 is located 576 bytes inside of 23376-byte region [0x62a0000d8200,0x62a0000ddd50) freed by thread T0 here: #0 0x7f48c9eb9fb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0) #1 0x55cfefe3fe42 in peer_free bgpd/bgpd.c:1113 #2 0x55cfefe3fe42 in peer_unlock_with_caller bgpd/bgpd.c:1144 #3 0x55cfefe4092e in peer_delete bgpd/bgpd.c:2457 #4 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267 #5 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949 #6 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009 #7 0x7f48c9ba1573 in cmd_execute lib/command.c:1162 #8 0x7f48c9c87402 in vty_command lib/vty.c:526 #9 0x7f48c9c87832 in vty_execute lib/vty.c:1291 #10 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130 #11 0x7f48c9c7a66d in thread_call lib/thread.c:1585 #12 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123 #13 0x55cfefc75a15 in main bgpd/bgp_main.c:540 #14 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the decision has been made to copy a peer configuration from a peer to another peer because one is taking over. Do not automatically set the CONFIG_NODE flag. Instead we need to handle that appropriately when the final decision is made to transfer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently bgp does not stop any events that are on the thread system for execution on peer deletion. This is not good. Stop those events and prevent use after free's. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The bgp->peerhash is made up of the sockunion and the CONFIG_NODE flag. If the CONFIG_NODE flag is moved around or changed then we get into a situation where both the doppelganger and the peer actually hash to the exact same thing. Leading to wrongful deletion and pointers being used after freed. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When actually creating a peer in BGP, tell the creation if it is a config node or not. There were cases where the CONFIG_NODE was being set *after* being placed into the bgp->peerhash, thus causing collisions between the doppelganger and the peer and eventually use after free's. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The BGP fsm uses return codes to pass event success/fail as well as some extra data to the bgp_event_update function. Convert this to use a enum instead of an int to track the changes. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
c68dd75
to
4da144f
Compare
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 arm8 part 8: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 8: No useful log foundSuccessful on other platforms/tests
|
ci:rerun pim failures |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests debian 10 amd64 part 9: Failed (click for details)Topotests debian 10 amd64 part 9: Unknown Log URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-8703/artifact/TOPO9DEB10AMD64/ErrorLog/ Topotests debian 10 amd64 part 9: No useful log foundSuccessful on other platforms/tests
|
clang-format rearraged the fsm state machine and it disagrees with the frrbot. Frankly I prefer clang-format here |
ci:rerun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, waiting on ci
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-8713/ This is a comment from an automated CI system. |
found 2 use after frees in bgp code. See individual commits for explanations.