Description
CoreCLR crashes in a signal handler because code that is not async-signal-safe is called from the inject_activation_handler signal handler.
This starts to happen with glibc >= 2.40 (at least), which changed the logic for thread locals: GetThread/GetThreadNULLOk called from a signal handler can now lead to a realloc and a crash. Sometimes this leads to a deadlock instead; #121345 is related. The issue happens both with and without ASan.
Example of crash backtrace (also from #121345):
0xf51a9340 is located 0 bytes inside of 320-byte region [0xf51a9340,0xf51a9480)
freed by thread T11 here:
#0 0xf75fa26e in realloc.part.0 (/usr/lib/libasan.so+0x9e26e) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#1 0xf7b11ae4 in _dl_resize_dtv (/lib/ld-linux.so.3+0x41010ae4) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#2 0xf7b12500 in _dl_update_slotinfo (/lib/ld-linux.so.3+0x41011500) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#3 0xf7b1265c in update_get_addr (/lib/ld-linux.so.3+0x4101165c) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#4 0xf75dd5ea in __tls_get_addr (/usr/lib/libasan.so+0x815ea) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#5 0xf24062da in CheckActivationSafePoint(unsigned int) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0x1792da) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#6 0xf2571cc8 in inject_activation_handler(int, siginfo_t*, void*) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0x2e4cc8) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#7 0xf71ace0c (/lib/libc.so.6+0x41242e0c) (BuildId: 4d66a597c3674cb64087a6587522a00c688b8037)
#8 0xf7605700 in __sanitizer::BufferedStackTrace::UnwindImpl(unsigned int, unsigned int, void*, bool, unsigned int) (/usr/lib/libasan.so+0xa9700) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#9 0xf75fa298 in realloc.part.0 (/usr/lib/libasan.so+0x9e298) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#10 0xf7b11ae4 in _dl_resize_dtv (/lib/ld-linux.so.3+0x41010ae4) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#11 0xf7b12500 in _dl_update_slotinfo (/lib/ld-linux.so.3+0x41011500) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#12 0xf7b1265c in update_get_addr (/lib/ld-linux.so.3+0x4101165c) (BuildId: 09e97ca6a7629ff5a7bcdd346a4f7a5508203c59)
#13 0xf75dd5ea in __tls_get_addr (/usr/lib/libasan.so+0x815ea) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#14 0xf235d62a in ManagedThreadBase_DispatchMiddle(ManagedThreadCallState*)::Cleanup::~Cleanup() (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0xd062a) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#15 0xf235c926 in ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0xcf926) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#16 0xf235cb60 in ManagedThreadBase::KickOff(void (*)(void*), void*) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0xcfb60) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#17 0xf238a274 in ThreadNative::KickOffThread(void*) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0xfd274) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#18 0xf25912d8 in CorUnix::CPalThread::ThreadEntry(void*) (/usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.11/libcoreclr.so+0x3042d8) (BuildId: 78a8db7ede0a62b8ff150e5a58e4c5dad06019e3)
#19 0xf75a1eda in asan_thread_start(void*) (/usr/lib/libasan.so+0x45eda) (BuildId: 9bba7c7c1d333d26085dc332318addcdddefc51d)
#20 0xf71f0cb0 in start_thread (/lib/libc.so.6+0x41286cb0) (BuildId: 4d66a597c3674cb64087a6587522a00c688b8037)
Should hijacking be disabled on Linux for now as a quick fix (e.g. by disabling FEATURE_THREAD_ACTIVATION) until the work described in #121345 (comment) is completed? This seems to affect all Linux platforms with glibc >= 2.40 (e.g. Ubuntu 25.04 and higher).
cc @dotnet/samsung
Reproduction Steps
Some reproduction cases are mentioned in #121345
Expected behavior
No crash/deadlock
Actual behavior
Crash/deadlock
Regression?
Seems to be present in all .NET versions (at least starting from .NET Core 3.1)
Known Workarounds
Disabling FEATURE_THREAD_ACTIVATION?
Configuration
The crash backtrace above is from .NET 8.0.11 on arm32 Tizen, but the bug is independent of architecture and .NET version.
Other information
No response