Don't crash when walking callstack roots in the debugger#124402
Don't crash when walking callstack roots in the debugger#124402hoyosjs merged 2 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @agocke |
There was a problem hiding this comment.
Pull request overview
This pull request fixes a critical crash that occurs when debuggers walk callstack roots on Linux amd64 coredumps. The root cause is that context pointers in DAC (Data Access Component) builds resolve through a cache that can evict entries before the pointers are consumed, leading to invalid memory access.
Changes:
- Fixed AMD64 FaultingExceptionFrame to use local context pointers in DAC builds instead of frame member references
- Extended SoftwareExceptionFrame context pointer fix from X86-only to all architectures
- Added defensive checks in GC info decoder to use captured register values in DAC scenarios across all architectures
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/vm/amd64/cgenamd64.cpp | Added DACCESS_COMPILE guards in FaultingExceptionFrame::UpdateRegDisplay_Impl to point context pointers at local pCurrentContext copy instead of m_ctx members |
| src/coreclr/vm/excep.cpp | Expanded DACCESS_COMPILE context pointer fix from TARGET_X86 to all architectures in SoftwareExceptionFrame::UpdateRegDisplay_Impl |
| src/coreclr/vm/gcinfodecoder.cpp | Added DACCESS_COMPILE && TARGET_UNIX guards to use GetCapturedRegister instead of GetRegisterSlot in ReportRegisterToGC and GetStackSlot across all architectures |
1cc2ca8 to
b26de30
Compare
|
Looking at the FaultingExceptionFrame::UpdateRegDisplay_Impl variants, it seems the other architectures would suffer from the same issue. |
|
I'll take a look and keep working on it when I'm back in office Tuesday morning. |
|
I've updated the other platforms with the fix as well. |
|
Before merging, I want JanV to have a chance to weigh in here when he's back on Monday. |
Changeset 1fa1745 introduced a stack walking regression where we will crash trying to unwind/report registers. The real fix here is in cgenamd64.cpp/excep.cpp, which will point the context to a local copy before we try to use it.
I've also added some changes in gcinfodecoder.cpp that aren't strictly necessary but would have avoided the problem to begin with.
Fixes #124401.