Skip to content

Block thread from receiving profile signal with stackwalk lock #57089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 18, 2025

Conversation

ararslan
Copy link
Member

This is Jameson's proposed amendment to the changes made in #57035 that introduced a deadlock on FreeBSD, amusingly in service of fixing a deadlock on Linux.

On Linux and (non-macOS) BSD, instead of acquiring and releasing a lock on the profiler in jl_with_stackwalk_lock, we're just blocking the thread from receiving the profiler's SIGUSR2 signal at all.

This should fix #57058; I don't get the deadlock locally on FreeBSD with this change, but it's AArch64 instead of x86-64 like on CI, so let's see if this also makes CI happy. If so, closes #57059.

This is Jameson's proposed amendment to the changes made in 57035 that
introduced a deadlock on FreeBSD, amusingly in service of fixing a
deadlock on Linux.

On Linux and (non-macOS) BSD, instead of acquiring and releasing a lock
on the profiler in `jl_with_stackwalk_lock`, we're just blocking the
thread from receiving the profiler's `SIGUSR2` signal at all.

Should fix 57058. If so, closes 57059.

Co-Authored-By: Jameson Nash <vtjnash@gmail.com>
@ararslan
Copy link
Member Author

Success! 🎉

@ararslan ararslan merged commit 1740287 into master Jan 18, 2025
8 checks passed
@ararslan ararslan deleted the aa/block-profile-signal branch January 18, 2025 07:33
@StevenWhitaker StevenWhitaker mentioned this pull request Jan 20, 2025
@ararslan ararslan added the backport 1.11 Change should be backported to release-1.11 label Jan 31, 2025
@ararslan
Copy link
Member Author

Added the backport label, but should only be backported in conjunction with #57035.

vtjnash added a commit that referenced this pull request Feb 3, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing
this change for all platforms, I forgot that means non-FreeBSD platforms
become vulnerable again to the very deadlock problems that #57035 was
required to prevent. That fix seems to not be viable on FreeBSD due to
known libc implementation problems on that platform. However, upon
closer inspection of the questionable design implementation decisions
they seem to have made here, the platform is likely not currently
vulnerable to this libunwind bug in the first place:
https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
vtjnash added a commit that referenced this pull request Feb 3, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing
this change for all platforms, I forgot that means non-FreeBSD platforms
become vulnerable again to the very deadlock problems that #57035 was
required to prevent. That fix seems to not be viable on FreeBSD due to
known libc implementation problems on that platform. However, upon
closer inspection of the questionable design implementation decisions
they seem to have made here, the platform is likely not currently
vulnerable to this libunwind bug in the first place:
https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
vtjnash added a commit that referenced this pull request Feb 4, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing
this change for all platforms, I forgot that means non-FreeBSD platforms
become vulnerable again to the very deadlock problems that #57035 was
required to prevent. That fix seems to not be viable on FreeBSD due to
known libc implementation problems on that platform. However, upon
closer inspection of the questionable design implementation decisions
they seem to have made here, the platform is likely not currently
vulnerable to this libunwind bug in the first place:
https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120
KristofferC pushed a commit that referenced this pull request Feb 6, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing
this change for all platforms, I forgot that means non-FreeBSD platforms
become vulnerable again to the very deadlock problems that #57035 was
required to prevent. That fix seems to not be viable on FreeBSD due to
known libc implementation problems on that platform. However, upon
closer inspection of the questionable design implementation decisions
they seem to have made here, the platform is likely not currently
vulnerable to this libunwind bug in the first place:
https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120

(cherry picked from commit 2f0a523)
KristofferC pushed a commit that referenced this pull request Feb 6, 2025
Restores #57035, undo #57089 for non-FreeBSD. While I suggested doing
this change for all platforms, I forgot that means non-FreeBSD platforms
become vulnerable again to the very deadlock problems that #57035 was
required to prevent. That fix seems to not be viable on FreeBSD due to
known libc implementation problems on that platform. However, upon
closer inspection of the questionable design implementation decisions
they seem to have made here, the platform is likely not currently
vulnerable to this libunwind bug in the first place:
https://github.com/lattera/freebsd/blob/master/libexec/rtld-elf/rtld_lock.c#L120

(cherry picked from commit 2f0a523)
@KristofferC KristofferC mentioned this pull request Mar 11, 2025
71 tasks
@KristofferC KristofferC mentioned this pull request Apr 25, 2025
71 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.11 Change should be backported to release-1.11 bugfix This change fixes an existing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock in FreeBSD tests, causing CI jobs to timeout
2 participants