[.NET 9.0] Random deadlock when terminating thread #114333

@herve-dev1

Description

Hello

I am facing a deadlock during thread termination.

From what I understand of the info provided by WinDBG for Thread #23 (Id=9224), the deadlock occurs because the thread is waiting for a Garbage Collection event (i.e. it is in the Thread::CooperativeCleanup method) while it is holding the Loader Lock.

I am on Windows and my version of dotnet is shown below:

dotnet --version
9.0.201

Call-stack for the thread is shown below:
# Child-SP RetAddr Call Site
00 000000bbf3afebf8 00007ff97245ce4f ntdll!NtWaitForSingleObject+0x14
01 000000bbf3afec00 00007ff843c69263 KERNELBASE!WaitForSingleObjectEx+0xaf
02 (Inline Function) ---------------- coreclr!CLREventWaitHelper2+0x6 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 372]
03 000000bbf3afeca0 00007ff843de777b coreclr!CLREventWaitHelper+0xf [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 397]
04 (Inline Function) ---------------- coreclr!CLREventBase::WaitEx+0x11 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 466]
05 (Inline Function) ---------------- coreclr!CLREventBase::Wait+0x11 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 412]
06 000000bbf3afecf0 00007ff843ceead6 coreclr!Thread::WaitSuspendEventsHelper+0x9f [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 4485]
07 (Inline Function) ---------------- coreclr!Thread::WaitSuspendEvents+0x8 [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 4517]
08 000000bbf3afed80 00007ff843bebff4 coreclr!Thread::RareDisablePreemptiveGC+0x137ace [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 2178]
09 (Inline Function) ---------------- coreclr!Thread::DisablePreemptiveGC+0x1f [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 1297]
0a (Inline Function) ---------------- coreclr!GCHolderBase::EnterInternalCoop+0x37 [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 4712]
0b 000000bbf3afee10 00007ff843c0da58 coreclr!GCCoop::GCCoop+0x54 [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 4832]
0c 000000bbf3afee40 00007ff843c0d98a coreclr!Thread::CooperativeCleanup+0x24 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 2737]
0d 000000bbf3afee90 00007ff843c0d8b6 coreclr!Thread::DetachThread+0x9a [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 936]
0e 000000bbf3afeec0 00007ff843ca5043 coreclr!TlsDestructionMonitor::~TlsDestructionMonitor+0x62 [D:\a\_work\1\s\src\coreclr\vm\ceemain.cpp @ 1744]
0f 000000bbf3afef00 00007ff974a72073 coreclr!__dyn_tls_dtor+0x63 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\tls\tlsdtor.cpp @ 119]
10 000000bbf3afef30 00007ff974a78030 ntdll!LdrpCallInitRoutine+0xa3
11 000000bbf3aff210 00007ff974b1c73a ntdll!LdrpCallTlsInitializers+0x210
12 000000bbf3aff2d0 00007ff974b1bff6 ntdll!LdrShutdownThread+0x3ba
13 000000bbf3aff3f0 00007ff974ac6985 ntdll!RtlExitUserThread+0x46
14 000000bbf3aff430 00007ff9744ae8d7 ntdll!TppWorkerThread+0xfd5
15 000000bbf3aff790 00007ff974b1bf6c kernel32!BaseThreadInitThunk+0x17
16 000000bbf3aff7c0 0000000000000000 ntdll!RtlUserThreadStart+0x2c

The extracts below show that it is holding the Loader Lock:
0:023> !peb
PEB at 000000bbf16f2000

0:023> dt ntdll!_PEB 000000bbf16f2000
	...
   +0x110 LoaderLock       : 0x00007ff9`74c2a810 _RTL_CRITICAL_SECTION
	...

dx -r1 ((ntdll!_RTL_CRITICAL_SECTION *)0x7ff974c2a810)
((ntdll!_RTL_CRITICAL_SECTION *)0x7ff974c2a810)                 : 0x7ff974c2a810 [Type: _RTL_CRITICAL_SECTION *]
	[+0x000] DebugInfo        : 0x7ff974c2a890 [Type: _RTL_CRITICAL_SECTION_DEBUG *]
	[+0x008] LockCount        : -2 [Type: long]
	[+0x00c] RecursionCount   : 1 [Type: long]
	[+0x010] OwningThread     : 0x9224 [Type: void *]
	[+0x018] LockSemaphore    : 0x0 [Type: void *]
	[+0x020] SpinCount        : 0x4000000 [Type: unsigned __int64]
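
As a reading aid for the LockCount value above, here is a small sketch that decodes the field using the bit layout Microsoft's debugging documentation describes for Windows Server 2003 SP1 and later. The struct and function names (`LockCountInfo`, `DecodeLockCount`) are hypothetical helpers for illustration, not part of any Windows or WinDbg API.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of RTL_CRITICAL_SECTION::LockCount.
// Hypothetical helper type, introduced only for this illustration.
struct LockCountInfo {
    bool locked;            // bit 0 clear => the critical section is locked
    bool waiter_woken;      // bit 1 clear => a thread has been woken for the lock
    std::uint32_t waiters;  // remaining bits: ones-complement of the waiter count
};

// Splits the raw LockCount value into its three documented parts.
constexpr LockCountInfo DecodeLockCount(std::int32_t lock_count) {
    const auto raw = static_cast<std::uint32_t>(lock_count);
    return LockCountInfo{
        (raw & 0x1u) == 0,
        (raw & 0x2u) == 0,
        (~raw) >> 2,
    };
}

// The value from the dump above: -2 is 0xFFFFFFFE.
static_assert(DecodeLockCount(-2).locked, "LockCount -2 means the lock is held");
static_assert(DecodeLockCount(-2).waiters == 0, "LockCount -2 means no waiters");
```

Under that layout, a LockCount of -2 reads as locked, no thread woken, and zero threads waiting on the critical section itself, which matches the RecursionCount of 1 and the OwningThread of 0x9224 shown in the dump.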

The DllMain documentation (https://learn.microsoft.com/en-gb/windows/win32/dlls/dllmain) states that:

  • When the system starts or terminates a process or thread, it calls the entry-point function for each loaded DLL using the first thread of the process
    => it is normal that the thread holds the loader lock
  • Because DLL notifications are serialized, entry-point functions should not attempt to communicate with other threads or processes. Deadlocks may occur as a result.
    => it is not normal that there is a wait
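
To make the suspected pattern concrete, here is a minimal, portable sketch of it. This is a model under stated assumptions, not the CLR's code: a `std::timed_mutex` stands in for the loader lock, a condition variable stands in for the event the exiting thread waits on, and timeouts replace the infinite waits so that the example terminates. The function name `ExitingThreadGetsSignal` and the assumption that the signalling thread needs the loader lock are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true if the "exiting" thread's wait is satisfied, and false if it
// times out (the timeout stands in for the hang seen in the debugger).
bool ExitingThreadGetsSignal(bool signaller_needs_loader_lock) {
    using namespace std::chrono_literals;

    std::timed_mutex loader_lock;  // hypothetical stand-in for the loader lock
    std::mutex event_mutex;
    std::condition_variable event_cv;
    bool event_set = false;        // stand-in for the event being waited on

    // The exiting thread (this one) holds the "loader lock" for the whole wait.
    loader_lock.lock();

    std::thread signaller([&] {
        if (signaller_needs_loader_lock) {
            // Assumption: the thread that would set the event first needs
            // the lock that the waiting thread already owns.
            if (!loader_lock.try_lock_for(200ms)) {
                return;  // blocked: the event is never set
            }
            loader_lock.unlock();
        }
        {
            std::lock_guard<std::mutex> guard(event_mutex);
            event_set = true;
        }
        event_cv.notify_one();
    });

    bool signalled = false;
    {
        std::unique_lock<std::mutex> lk(event_mutex);
        signalled = event_cv.wait_for(lk, 500ms, [&] { return event_set; });
    }

    loader_lock.unlock();
    signaller.join();
    return signalled;
}
```

With `signaller_needs_loader_lock` set to `false` the wait completes; with `true` both sides time out, which is the shape of the deadlock described above if some other thread does need the loader lock before it can signal.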

Is my analysis correct? Does any workaround exist, or is there something that could be done to prevent the wait?

Note: at present, I seem to hit the issue mostly when I run the program in Visual Studio (2022, version 17.13.5), but I cannot see how that would be related.

Unfortunately I don't have a minimal reproduction scenario.

Thanks for your help.
Kind regards,

Hervé

Reproduction Steps

Unfortunately I don't have any.

Expected behavior

No deadlock

Actual behavior

deadlock

Regression?

I don't know.

Known Workarounds

None.

Configuration

.NET 9.0
Windows
X64
I have the issue most often when running the program in Visual Studio.

Other information

No response
