[.NET 9.0] Random deadlock when terminating thread #114333

Open
@herve-dev1

Description

Hello

I am facing a deadlock during a thread termination.

From what I understand of the information WinDbg provides for thread #23 (Id=9224), the deadlock occurs because the thread is waiting for a garbage collection event (i.e. it is inside Thread::CooperativeCleanup) while it holds the Loader Lock.

I am on Windows and my version of dotnet is shown below:

dotnet --version
9.0.201

The call stack for the thread is shown below:
# Child-SP RetAddr Call Site
00 000000bbf3afebf8 00007ff97245ce4f ntdll!NtWaitForSingleObject+0x14
01 000000bbf3afec00 00007ff843c69263 KERNELBASE!WaitForSingleObjectEx+0xaf
02 (Inline Function) ---------------- coreclr!CLREventWaitHelper2+0x6 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 372]
03 000000bbf3afeca0 00007ff843de777b coreclr!CLREventWaitHelper+0xf [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 397]
04 (Inline Function) ---------------- coreclr!CLREventBase::WaitEx+0x11 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 466]
05 (Inline Function) ---------------- coreclr!CLREventBase::Wait+0x11 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 412]
06 000000bbf3afecf0 00007ff843ceead6 coreclr!Thread::WaitSuspendEventsHelper+0x9f [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 4485]
07 (Inline Function) ---------------- coreclr!Thread::WaitSuspendEvents+0x8 [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 4517]
08 000000bbf3afed80 00007ff843bebff4 coreclr!Thread::RareDisablePreemptiveGC+0x137ace [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 2178]
09 (Inline Function) ---------------- coreclr!Thread::DisablePreemptiveGC+0x1f [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 1297]
0a (Inline Function) ---------------- coreclr!GCHolderBase::EnterInternalCoop+0x37 [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 4712]
0b 000000bbf3afee10 00007ff843c0da58 coreclr!GCCoop::GCCoop+0x54 [D:\a\_work\1\s\src\coreclr\vm\threads.h @ 4832]
0c 000000bbf3afee40 00007ff843c0d98a coreclr!Thread::CooperativeCleanup+0x24 [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 2737]
0d 000000bbf3afee90 00007ff843c0d8b6 coreclr!Thread::DetachThread+0x9a [D:\a\_work\1\s\src\coreclr\vm\threads.cpp @ 936]
0e 000000bbf3afeec0 00007ff843ca5043 coreclr!TlsDestructionMonitor::~TlsDestructionMonitor+0x62 [D:\a\_work\1\s\src\coreclr\vm\ceemain.cpp @ 1744]
0f 000000bbf3afef00 00007ff974a72073 coreclr!__dyn_tls_dtor+0x63 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\tls\tlsdtor.cpp @ 119]
10 000000bbf3afef30 00007ff974a78030 ntdll!LdrpCallInitRoutine+0xa3
11 000000bbf3aff210 00007ff974b1c73a ntdll!LdrpCallTlsInitializers+0x210
12 000000bbf3aff2d0 00007ff974b1bff6 ntdll!LdrShutdownThread+0x3ba
13 000000bbf3aff3f0 00007ff974ac6985 ntdll!RtlExitUserThread+0x46
14 000000bbf3aff430 00007ff9744ae8d7 ntdll!TppWorkerThread+0xfd5
15 000000bbf3aff790 00007ff974b1bf6c kernel32!BaseThreadInitThunk+0x17
16 000000bbf3aff7c0 0000000000000000 ntdll!RtlUserThreadStart+0x2c

The extracts below show that it is holding the Loader Lock:
0:023> !peb
PEB at 000000bbf16f2000

0:023> dt ntdll!_PEB 000000bbf16f2000
	...
   +0x110 LoaderLock       : 0x00007ff9`74c2a810 _RTL_CRITICAL_SECTION
	...

dx -r1 ((ntdll!_RTL_CRITICAL_SECTION *)0x7ff974c2a810)
((ntdll!_RTL_CRITICAL_SECTION *)0x7ff974c2a810)                 : 0x7ff974c2a810 [Type: _RTL_CRITICAL_SECTION *]
	[+0x000] DebugInfo        : 0x7ff974c2a890 [Type: _RTL_CRITICAL_SECTION_DEBUG *]
	[+0x008] LockCount        : -2 [Type: long]
	[+0x00c] RecursionCount   : 1 [Type: long]
	[+0x010] OwningThread     : 0x9224 [Type: void *]
	[+0x018] LockSemaphore    : 0x0 [Type: void *]
	[+0x020] SpinCount        : 0x4000000 [Type: unsigned __int64]
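
For readers who want to check the LockCount value by hand, here is a small sketch of how that field is usually decoded. It assumes the bit layout described in the WinDbg documentation for current Windows versions: bit 0 clear means the section is locked, bit 1 clear means a waiter has been woken, and the remaining bits hold the ones-complement of the number of waiting threads. The function name `decode_lock_count` is made up for this illustration.

```python
def decode_lock_count(lock_count):
    """Decode the LockCount field of an _RTL_CRITICAL_SECTION.

    Assumes the layout documented for current Windows versions:
    bit 0 == 0 -> locked, bit 1 == 0 -> a waiter has been woken,
    remaining bits -> ones-complement of the number of waiters.
    """
    raw = lock_count & 0xFFFFFFFF  # view the signed 32-bit long as unsigned
    return {
        "locked": (raw & 0x1) == 0,
        "thread_woken": (raw & 0x2) == 0,
        "waiters": ((~raw) & 0xFFFFFFFF) >> 2,
    }


# LockCount value taken from the dx output above
state = decode_lock_count(-2)
print(state)
```

Under that assumption, a LockCount of -2 decodes as locked with no threads waiting on the critical section itself at the time of the dump. OwningThread 0x9224 matches the Id=9224 quoted for thread #23, assuming that Id is hexadecimal, as WinDbg normally prints thread IDs.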

In DllMain's documentation (https://learn.microsoft.com/en-gb/windows/win32/dlls/dllmain) it is written that:

  • When the system starts or terminates a process or thread, it calls the entry-point function for each loaded DLL using the first thread of the process
    => it is normal that the thread holds the loader lock
  • Because DLL notifications are serialized, entry-point functions should not attempt to communicate with other threads or processes. Deadlocks may occur as a result.
    => it is not normal that there is a wait
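
The hazard described in the second bullet can be modelled with a small, purely illustrative sketch. It is not the coreclr code: `loader_lock` and `resume_event` are stand-ins, and the "other thread" that needs the lock before it can signal is a hypothetical participant, since the dump above does not show which thread (if any) is responsible for signalling. Timeouts and a non-blocking acquire are used only so the sketch terminates instead of hanging.

```python
import threading


def simulate_wait_under_lock(timeout=0.05):
    """Model a thread that waits for an event while holding a lock
    that the signalling thread would need first (hypothetical setup)."""
    loader_lock = threading.Lock()        # stand-in for the loader lock
    resume_event = threading.Event()      # stand-in for the awaited event
    holder_has_lock = threading.Event()
    attempt_done = threading.Event()
    outcome = {}

    def exiting_thread():
        with loader_lock:
            holder_has_lock.set()
            # Waits for a signal while still holding the lock.
            outcome["exiting_thread_resumed"] = resume_event.wait(timeout)
            # Keep the lock until the other thread's attempt is recorded,
            # so the result does not depend on scheduling.
            attempt_done.wait()

    def other_thread():
        holder_has_lock.wait()
        # This thread can only signal after taking the same lock.
        got = loader_lock.acquire(blocking=False)
        outcome["other_thread_got_lock"] = got
        if got:
            resume_event.set()
            loader_lock.release()
        attempt_done.set()

    threads = [threading.Thread(target=f) for f in (exiting_thread, other_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


print(simulate_wait_under_lock())
```

In this model neither side makes progress: the second thread cannot take the lock, so the event is never set and the first thread's wait only ends because of the timeout. Without the timeout, this is the shape of deadlock the DllMain documentation warns about.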

Is my analysis correct? Does any workaround exist?
Or is there something that could be done to prevent the wait? Note: at present, I seem to hit the issue mostly when I run the program in Visual Studio (2022, version 17.13.5), but I cannot see how that would be related.

Unfortunately I don't have a minimal reproduction scenario.

Thanks for your help.
Kind regards,

Hervé

Reproduction Steps

Unfortunately I don't have any.

Expected behavior

No deadlock

Actual behavior

deadlock

Regression?

I don't know.

Known Workarounds

None.

Configuration

.NET 9.0
Windows
X64
I have the issue most often when running the program in Visual Studio.

Other information

No response

Metadata

Assignees

No one assigned

    Labels

    area-VM-coreclr, untriaged (New issue has not been triaged by the area owner)
