Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix use after free ebpf_object_t code #2311

Merged
merged 8 commits into from
Apr 14, 2023

Conversation

Alan-Jowett
Copy link
Member

Resolves: #2276

Description

Release reference in the ebpf_object code decrements the ref-count outside of the lock, which can result in entries being in the object tracking list with ref-count of 0. This can result in object's ref-counts going from 1 -> 0 -> 1 and the object being deleted.

Testing

CI/CD + KM stress.

Documentation

No.

@Alan-Jowett
Copy link
Member Author

This PR triggers a deadlock due to latent bug in _ebpf_object_release_reference_under_lock.

@Alan-Jowett
Copy link
Member Author

This PR triggers a deadlock due to latent bug in _ebpf_object_release_reference_under_lock.

Fixed in latest version.

Signed-off-by: Alan Jowett <alanjo@microsoft.com>
libs/platform/ebpf_object.c Outdated Show resolved Hide resolved
libs/platform/ebpf_object.c Show resolved Hide resolved
Signed-off-by: Alan Jowett <alanjo@microsoft.com>
Signed-off-by: Alan Jowett <alanjo@microsoft.com>
Signed-off-by: Alan Jowett <alanjo@microsoft.com>
dthaler
dthaler previously approved these changes Apr 13, 2023
libs/platform/ebpf_object.h Outdated Show resolved Hide resolved
libs/platform/ebpf_object.h Outdated Show resolved Hide resolved
@dthaler dthaler added the bug Something isn't working label Apr 13, 2023
Co-authored-by: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Alan Jowett <alanjo@microsoft.com>
@dthaler dthaler enabled auto-merge April 13, 2023 18:29
uint32_t marker;
volatile int32_t reference_count;
uint32_t marker; // Contains the 32bit value 'eobj' when the object is valid and is inverted when the object is
// freed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The marker can have other values also:

static const uint32_t _ebpf_native_marker = 'entv';

@@ -263,8 +263,10 @@ ebpf_native_release_reference(_In_opt_ _Post_invalid_ ebpf_native_module_t* modu

ebpf_assert(module->base.marker == _ebpf_native_marker);

new_ref_count = ebpf_interlocked_decrement_int32(&module->base.reference_count);
ebpf_assert(new_ref_count != -1);
new_ref_count = ebpf_interlocked_decrement_int64(&module->base.reference_count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar problem then exists here also. I think we will need to acquire hash table lock here before calling ebpf_interlocked_decrement_int64()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2326 is tracking the ref-count issue in the native module code. I need to better understand the ref-counting model used by the native module code before I can fix it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my info -- how did we hit these issues? any manual testing or new tests in CICD?

@dthaler dthaler added this pull request to the merge queue Apr 14, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 14, 2023
@Alan-Jowett Alan-Jowett added this pull request to the merge queue Apr 14, 2023
Merged via the queue into microsoft:main with commit 16c8d21 Apr 14, 2023
@Alan-Jowett Alan-Jowett deleted the issue2276 branch April 14, 2023 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assert in ebpfcore while running MT Stress test
5 participants