Fix debugger hangs due to runtime deadlock by using patch DJI#123651
Fix debugger hangs due to runtime deadlock by using patch DJI#123651
Conversation
|
@copilot - it looks like you had an error earlier and no files have changed. Can you please try again? |
Modified BindPatch to accept an optional DebuggerJitInfo parameter. When DJI is provided by the caller, use it instead of calling GetJitInfo, which can trigger a deadlock through HashMap lookups with GC mode transitions. Updated two call sites that already have the DJI available: - AddBindAndActivatePatchForMethodDesc - MapPatchToDJI This is a targeted fix to avoid the deadlock described in issue where DebuggerController lock is held while calling GetJitInfo, which can try to acquire locks in the wrong order. Co-authored-by: noahfalk <6243776+noahfalk@users.noreply.github.com>
|
@copilot Instead of passing in a new |
Per code review feedback, simplified the fix to use the DJI already stored on the patch rather than passing it as a new parameter. This achieves the same goal of avoiding the deadlock-causing GetJitInfo call while being cleaner and more aligned with existing code patterns. Co-authored-by: AaronRobinsonMSFT <30635565+AaronRobinsonMSFT@users.noreply.github.com>
Per code review feedback, use a ternary operator instead of if-else block for more concise and idiomatic C++ code. Co-authored-by: AaronRobinsonMSFT <30635565+AaronRobinsonMSFT@users.noreply.github.com>
Updated to use patch->GetDJI() instead of passing DJI as a parameter. The implementation now checks patch->HasDJI() and uses patch->GetDJI() when available. Commits: 581a7bb, 6a6a4d6 |
There was a problem hiding this comment.
Pull request overview
This PR fixes a debugger deadlock issue where the runtime hangs when DebuggerController::BindPatch calls GetJitInfo while holding the DebuggerController lock. The GetJitInfo method can trigger HashMap lookups that perform GC mode transitions, leading to lock ordering violations and deadlocks.
Changes:
- Modified
BindPatchto check if the patch already has aDebuggerJitInfo(DJI) stored viapatch->HasDJI()and usepatch->GetDJI()instead of calling the deadlock-proneGetJitInfomethod - Added explanatory comments documenting the deadlock issue and why the optimization is necessary
- Maintained backward compatibility by falling back to
GetJitInfowhen no DJI is available on the patch
max-charlamb
left a comment
There was a problem hiding this comment.
Tested locally and looks good to me
|
/backport to release/10.0 |
|
Started backporting to |
…tch DJI (#124022) Backport of #123651 to release/10.0 /cc @steveisok @Copilot Backport of #123651 to release/10.0 ## Customer Impact - [ ] Customer reported - [X] Found internally Without this fix, customers can experience debugger hangs when debugging applications with ReadyToRun methods under specific timing conditions. This occurs when: 1. A breakpoint is set on a ReadyToRun method that hasn't executed yet 2. One thread is stepping through code while another thread first executes the ReadyToRun method with the breakpoint The hang requires the debugger and IDE to be force-closed, resulting in loss of debugging session state and developer productivity. ## Regression - [ ] Yes - [X] No The root cause has existed for a while, but timing changes in the code may have impacted the likelihood of hitting the race condition. ## Testing Manually verified. ## Risk Low The changes are minimal and highly targeted: - No API changes - the BindPatch signature remains unchanged - The logic flow remains unchanged - we're just avoiding an unnecessary lookup when the DJI is already available on the patch - The fix addresses the specific deadlock scenario without affecting other debugger functionality **IMPORTANT**: If this backport is for a servicing release, please verify that: - For .NET 8 and .NET 9: The PR target branch is `release/X.0-staging`, not `release/X.0`. - For .NET 10+: The PR target branch is `release/X.0` (no `-staging` suffix). ## Package authoring no longer needed in .NET 9 **IMPORTANT**: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version. Keep in mind that we still need package authoring in .NET 8 and older versions. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: noahfalk <6243776+noahfalk@users.noreply.github.com> Co-authored-by: AaronRobinsonMSFT <30635565+AaronRobinsonMSFT@users.noreply.github.com>
Description
This PR implements a targeted fix for the debugger deadlock issue where the debugger hangs while waiting for a deadlocked debuggee process. The deadlock occurs when
DebuggerController::BindPatchcallsGetJitInfowhile holding the DebuggerController lock.GetJitInfocan trigger HashMap lookups that perform GC mode transitions, leading to lock ordering issues.The fix modifies
BindPatchto check if the patch already has aDebuggerJitInfo(DJI) stored on it viapatch->HasDJI(). When the DJI is available, it usespatch->GetDJI()instead of calling the problematicGetJitInfomethod. This approach is cleaner than adding a new parameter since the patch already stores the DJI.Changes Made
BindPatchimplementation to checkpatch->HasDJI()before callingGetJitInfopatch->GetDJI()to avoid the deadlock-proneGetJitInfocallGetJitInfoas beforeThe implementation uses a ternary operator for concise, idiomatic C++ code:
Customer Impact
Without this fix, customers can experience debugger hangs when debugging applications with ReadyToRun methods under specific timing conditions. This occurs when:
The hang requires the debugger and IDE to be force-closed, resulting in loss of debugging session state and developer productivity.
Regression
This is not a regression from the most recent release. The root cause has existed for a while, but timing changes in the code may have impacted the likelihood of hitting the race condition.
Testing
The changes are minimal (1 file changed, 3 insertions, 1 deletion) and surgical to reduce regression risk for potential backporting to prior .NET releases.
Risk
Low Risk. The changes are minimal and highly targeted:
BindPatchsignature remains unchangedAddPatchForMethodDef, so we're using existing dataGetJitInfoas beforeOriginal prompt
This section details on the original issue you should resolve
<issue_title>Debugger hangs due to runtime deadlock. HashMap takes ThreadStore out of order</issue_title>
<issue_description>### Description
The debugger is blocked in mscordbi code at:
mscordbi is waiting on the debuggee to respond which it won't do because it is deadlocked.
Thread 0x9314 in the debuggee (trying to acquire DebuggerController lock):
[Inline Frame] coreclr.dll!CLREventBase::Wait(unsigned long) Line 412 C++
coreclr.dll!Thread::WaitSuspendEventsHelper() Line 4458 C++
coreclr.dll!Thread::RareDisablePreemptiveGC() Line 2185 C++
[Inline Frame] coreclr.dll!Thread::DisablePreemptiveGC() Line 1288 C++
[Inline Frame] coreclr.dll!GCHolderBase::EnterInternalCoop_HackNoThread(bool) Line 4619 C++
[Inline Frame] coreclr.dll!GCCoopHackNoThread::{ctor}(bool) Line 4834 C++
[Inline Frame] coreclr.dll!HashMap::LookupValue(unsigned __int64) Line 552 C++
[Inline Frame] coreclr.dll!PtrHashMap::LookupValue(unsigned __int64 key, void *) Line 607 C++
[Inline Frame] coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage(unsigned __int64) Line 376 C++
[Inline Frame] coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPoint(unsigned __int64 entryPoint) Line 104 C++
coreclr.dll!ReadyToRunJitManager::JitCodeToMethodInfo(RangeSection * pRangeSection, unsigned __int64 currentPC, MethodDesc * * ppMethodDesc, EECodeInfo * pCodeInfo) Line 6451 C++
coreclr.dll!EECodeInfo::Init(unsigned __int64 codeAddress, ExecutionManager::ScanFlag scanFlag) Line 15025 C++
[Inline Frame] coreclr.dll!EECodeInfo::{ctor}(unsigned __int64) Line 2877 C++
[Inline Frame] coreclr.dll!ExecutionManager::GetCodeStartAddress(unsigned __int64) Line 4980 C++
coreclr.dll!EEDbgInterfaceImpl::GetNativeCodeStartAddress(unsigned __int64 address) Line 416 C++
coreclr.dll!Debugger::GetJitInfoWorker(MethodDesc * fd, const unsigned char * pbAddr, DebuggerMethodInfo * * pMethInfo) Line 2711 C++
[Inline Frame] coreclr.dll!Debugger::GetJitInfo(MethodDesc *) Line 2650 C++
coreclr.dll!DebuggerController::BindPatch(DebuggerControllerPatch * patch, MethodDesc * pMD, const unsigned char *) Line 1432 C++
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.