Description
We've seen a deadlock in msbuild when performing a source-build of rc2 natively on s390x (using the Mono runtime). In a nutshell, the deadlock occurs due to a lock-order inversion between the Mono JIT's mono_loader_lock and the managed Microsoft.Build.BackEnd.Logging.LoggingService._lockObject.
Specifically, we have two involved threads, the main thread and a thread pool thread.
The main thread runs
- NuGet.Build.Tasks.Console.Program:Main
- which calls NuGet.Build.Tasks.Console.MSBuildStaticGraphRestore:LoadProjects
- which calls Microsoft.Build.Evaluation.ProjectCollection:.ctor
- which calls Microsoft.Build.BackEnd.Logging.LoggingService:RegisterLogger
- which TAKES Microsoft.Build.BackEnd.Logging.LoggingService._lockObject
- and calls Microsoft.Build.Evaluation.ProjectCollection/ReusableLogger:RegisterForEvents
- At this point, the JIT compiler for Microsoft.Build.BackEnd.Logging.EventSourceSink:add_BuildStarted gets invoked
- This calls at some point mono_class_create_from_typedef
- which TAKES mono_loader_lock
- And deadlocks here
The thread pool thread runs
- NuGet.Build.Tasks.Console.ConsoleLoggingQueue:Process
- which calls NuGet.Build.Tasks.ConsoleOutLogMessage:ToJson
- which calls System.Collections.Concurrent.ConcurrentDictionary'2<TKey_REF, TValue_REF>:GetOrAdd
- At this point, the JIT compiler for Newtonsoft.Json.Serialization.DefaultContractResolver:CreateContract gets invoked
- This calls at some point mono_class_create_from_typedef
- which TAKES mono_loader_lock
- And calls mono_metadata_interfaces_from_typedef_full, which loads System.Linq.Expressions.dll for the first time
- This invokes the assembly load hook chain, in particular calling Microsoft.Build.BackEnd.Components.RequestBuilder.AssemblyLoadsTracker:CurrentDomainOnAssemblyLoad
- which calls Microsoft.Build.BackEnd.Logging.LoggingService:ProcessLoggingEvent
- which TAKES Microsoft.Build.BackEnd.Logging.LoggingService._lockObject
- And deadlocks here
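To make the inversion concrete, here is a minimal sketch of the same pattern in Python. It is not Mono or MSBuild code: the two plain `threading.Lock` objects only stand in for mono_loader_lock and LoggingService._lockObject, and the names `run` and `demo_inversion` are made up for illustration. A timeout on the second acquire replaces the real hang so the demo can report and exit.

```python
import threading

# Stand-ins for the two locks named in the report (plain Python locks, not the real ones).
loader_lock = threading.Lock()    # plays the role of mono_loader_lock
logging_lock = threading.Lock()   # plays the role of LoggingService._lockObject

# Both threads wait here until each holds its first lock, so the inversion is deterministic.
both_hold_first = threading.Barrier(2)
# Both threads wait here before releasing, so neither second acquire can succeed by luck.
both_tried_second = threading.Barrier(2)


def run(first, second, name, results):
    """Take `first`, then try to take `second` while the other thread does the opposite."""
    with first:
        both_hold_first.wait()
        # A real deadlock would block forever; the timeout lets the demo observe it instead.
        acquired = second.acquire(timeout=0.5)
        results[name] = acquired
        if acquired:
            second.release()
        both_tried_second.wait()


def demo_inversion():
    results = {}
    # "main" takes the logging lock first, then needs the loader lock (JIT path).
    main = threading.Thread(target=run, args=(logging_lock, loader_lock, "main", results))
    # "pool" takes the loader lock first, then needs the logging lock (load hook path).
    pool = threading.Thread(target=run, args=(loader_lock, logging_lock, "pool", results))
    main.start()
    pool.start()
    main.join()
    pool.join()
    return results


if __name__ == "__main__":
    print(demo_inversion())
```

Both entries in the returned dictionary are False: each thread holds the lock the other one needs, which is the state the two stacks above are stuck in.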
It seems to me the root cause of the deadlock is that mono_class_create_from_typedef holds mono_loader_lock across function calls that may trigger invoking managed code (the assembly load hooks), which might do anything. This doesn't seem like a good idea.
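For illustration only, one common way to avoid this class of problem is to snapshot the callbacks while holding the lock and invoke them after releasing it. The sketch below shows that general pattern in Python. It is an assumption on my part, not a proposed Mono patch: `load_hooks`, `load_assembly` and `logging_hook` are hypothetical names, and whether the runtime can actually drop the loader lock at that point is not something I've verified.

```python
import threading

loader_lock = threading.Lock()    # stand-in for mono_loader_lock
logging_lock = threading.Lock()   # stand-in for LoggingService._lockObject

load_hooks = []   # hypothetical registry of assembly-load callbacks
loaded = []       # hypothetical loader state protected by loader_lock
events = []       # hypothetical log protected by logging_lock


def load_assembly(name):
    # Update loader state and snapshot the hook list while holding the loader lock...
    with loader_lock:
        loaded.append(name)
        pending = list(load_hooks)
    # ...and only run arbitrary callbacks once the loader lock has been released.
    for hook in pending:
        hook(name)


def logging_hook(name):
    # Like the load hook in the report, this callback needs the logging lock.
    with logging_lock:
        events.append((name, loader_lock.locked()))


if __name__ == "__main__":
    load_hooks.append(logging_hook)
    load_assembly("System.Linq.Expressions.dll")
    print(events)
```

With this shape, a hook that takes the logging lock never runs while the loader lock is held, so the two locks are never nested in that order.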
CC - @directhex @lambdageek @vargaz @akoeplinger
FYI - @giritrivedi @alhad-deshpande @janani66 @omajid @tmds