-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Don't suspend for debugger while holding the slot backpatching lock #40060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f496527
to
1336c15
Compare
Was the previous fix incomplete which led to the issue re-occurring due to timing? |
The underlying issue was introduced when we started using cooperative GC mode for the slot backpatching info data structure. It was not obvious that such an issue would arise. The workaround used in the previous fix was a partial fix to the most problematic occurrences of the issue, we did not come up with a complete fix at the time after a fair bit of discussion, and the workaround was never intended to be a complete fix for the issue. The issue appears to be more problematic in 5.0 due to differences in the ways in which the lock is used. This change should be a complete fix for the issue. |
- Mostly reverted the previous workaround for the issue (commit fc06054) - Added a forbid-suspend-for-debugger region in preemptive GC mode - Added a crst holder type that acquires a lock and enters the forbid region - Where the slot backpatching lock would be taken where cooperative GC mode may be entered inside that lock, the new crst holder is used - When a suspend for debugger is requested, a thread in preemptive GC mode that is in the forbid region is considered not yet suspended and is synched later similarly to threads in cooperative GC mode Fixes #37278
cc: @gregg-miskelly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kount! This looks good to me. Given the subtleties in the area I don't have high confidence that I would spot an issue if there was one, but I am OK accepting that risk.
Long term this still feels a bit "princess and the pea" to me. The pea is that we designed a low-level data structure that requires co-op mode to access it, and then we've layered on a bunch of complex locking/suspend logic to resolve the undesirable effects (the mattresses in the analogy). As I recall the original purpose of the co-op mode datastructure was to simplify some lifetime management for state attached to collectible assemblies. My suspicion is that the amount of complexity and performance overhead we have added working around co-op mode outweighs the value we are getting from it. Of course I make no claim that I have carefully evaluated the alternatives, so it is only my hunch at this point.
Thanks @noahfalk! Longer-term, avoiding coop mode would avoid some complication, though there are other complications it may not avoid. I haven't seen a startup perf hit from using the coop mode data structure initially or from this change, so there may not be any improvement from using a preemptive mode data structure. The alternative data structure may have to be a separate data structure and along with other changes that would go along with using it, it would be a bit too large/risky for .NET 5 at the moment. In the interest of simplifying some things it would be good longer-term to see if we can switch to a mostly preemptive mode solution. |
…otnet#40060) - Mostly reverted the previous workaround for the issue (commit fc06054) - Added a forbid-suspend-for-debugger region in preemptive GC mode - Added a crst holder type that acquires a lock and enters the forbid region - Where the slot backpatching lock would be taken where cooperative GC mode may be entered inside that lock, the new crst holder is used - When a suspend for debugger is requested, a thread in preemptive GC mode that is in the forbid region is considered not yet suspended and is synched later similarly to threads in cooperative GC mode Fixes dotnet#37278
…bid-suspend-for-debugger region (#42587) - Followup to #40060 - In short timing windows if a thread is placed in a pending-suspend-for-debugger state while in a forbid-suspend-for-debugger region, from the above PR it would not suspend for the debugger but it was missed that it can also get stuck in a perpetual spin-wait while entering cooperative GC mode. This causes the thread to not suspend for the debugger (VS times out after waiting) and the thread can only progress by unmarking it for debugger suspension. - Fixed to break the spin-wait by checking whether the thread is in the forbid region Fix for #42375 in master
…s in a forbid-suspend-for-debugger region (#42630) - Port of #42587 to 5.0 RC2 - Followup to #40060 - In short timing windows if a thread is placed in a pending-suspend-for-debugger state while in a forbid-suspend-for-debugger region, from the above PR it would not suspend for the debugger but it was missed that it can also get stuck in a perpetual spin-wait while entering cooperative GC mode. This causes the thread to not suspend for the debugger (VS times out after waiting) and the thread can only progress by unmarking it for debugger suspension. - Fixed to break the spin-wait by checking whether the thread is in the forbid region Fix for #42375
Fixes #37278