-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Fix race condition when createdump tries to read /proc/pid/mem before prctl(PR_SET_PTRACER, childpid) is set #120000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race condition when createdump tries to read /proc/pid/mem before prctl(PR_SET_PTRACER, childpid) is set #120000
Conversation
…ad /proc/pid/mem before prctl is set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a race condition in PROCCreateCrashDump where the child process (createdump) could attempt to read /proc/pid/mem before the parent process grants PTRACE permissions via prctl(PR_SET_PTRACER, childpid). The fix ensures proper synchronization between parent and child processes to prevent intermittent permission failures.
|
Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag |
|
I will address copilot's suggestions after a team member review if the fix makes sense 😅 |
|
BTW, the same pattern is also present on nativeaot: runtime/src/coreclr/nativeaot/Runtime/unix/PalCreateDump.cpp Lines 230 to 272 in 2ccc38b
|
noahfalk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments inline and I agree with those AI review comments that we should error check the new read/write calls you added.
Thanks for finding this and making the PR!
Interested in making the fix for NativeAOT in the same PR? |
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
noahfalk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
|
@hoyosjs - if this looks good to you as well go ahead and merge it |
|
/backport to release/10.0 |
|
Started backporting to release/10.0: https://github.com/dotnet/runtime/actions/runs/17990123963 |
ProcDump-for-Linux is a tool that supports .NET as one of its targets to take dumps. For .NET, it works pretty much as
dotnet-dumptool, triggeringcreatedumputility.In ProcDump's integration pipelines, we've been seeing some intermittent errors taking .NET dumps. Sometimes we see:
We even saw that for the same process, a first dump succeeded, a second one failed with the message above.
After studying the code from createdump's launcher in the CLR, we can see there is a fork: the child execs and becomes
createdump, while the parent process usesprctlsyscall to allowcreatedumpto read its/proc/pid/mem:runtime/src/coreclr/pal/src/thread/process.cpp
Lines 2448 to 2505 in 6ef2af1
We noticed there can be a potential race condition on the code above when the child/
createdumptries to read/proc/pid/membefore the parent grants permissions. I assume it is rare in multi-core systems, but it happens quite frequently in our pipelines that have a limited number of cores.To force the race condition to happen locally, I added a
sleepon the parent branch right before it callsprctl- it always fails with the error message above!This PR proposes a fix to allow the parent to signal the child using pipes when
prctl(PR_SET_PTRACER, childpid)is done so it can proceed.