Description
When running a highly bursty load, i.e. one that creates lots of thread pool workers that aren't used persistently, the thread pool itself uses a lot of CPU time waiting for jobs. This could interfere with background processes outside the application.
The codebase uses async/await extensively, primarily for low-level networking operations.
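To make the load shape concrete, here is a minimal sketch of the kind of bursty pattern I mean. It is not the actual application: `BurstRepro`, `MeasureCpu`, and the parameter values are hypothetical and chosen only for illustration. Raising the minimum worker count with `ThreadPool.SetMinThreads` is my assumption for getting the workers created promptly in a small test.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class BurstRepro
{
    // Runs `bursts` rounds of `workers` tiny jobs separated by `gap`, and
    // returns the CPU time the whole process consumed over the run.
    public static TimeSpan MeasureCpu(int workers, int bursts, TimeSpan gap)
    {
        // Assumption: raise the worker minimum so each burst can fan out
        // across `workers` threads without waiting for thread injection.
        ThreadPool.GetMinThreads(out _, out int ioMin);
        ThreadPool.SetMinThreads(workers, ioMin);

        using var process = Process.GetCurrentProcess();
        TimeSpan before = process.TotalProcessorTime;

        for (int b = 0; b < bursts; b++)
        {
            var jobs = new Task[workers];
            for (int i = 0; i < workers; i++)
            {
                // Each job does almost nothing; the workers are idle
                // for most of the run.
                jobs[i] = Task.Run(() => Thread.Sleep(1));
            }
            Task.WaitAll(jobs);
            Thread.Sleep(gap); // quiet period between bursts
        }

        process.Refresh();
        return process.TotalProcessorTime - before;
    }

    public static void Main()
    {
        TimeSpan cpu = MeasureCpu(workers: 17, bursts: 200, gap: TimeSpan.FromMilliseconds(20));
        Console.WriteLine($"Pool threads: {ThreadPool.ThreadCount}, CPU time: {cpu.TotalMilliseconds:F0} ms");
    }
}
```

Comparing the reported CPU time against the wall-clock duration of the run gives a rough idea of how much time goes to waiting rather than to the jobs themselves.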
Configuration
.NET 6.0 (LTS)
Linux (Ubuntu 22.04)
AWS EC2 t2.medium host
Data
Measured using a combination of htop and JetBrains dotTrace: when 17 thread pool workers are created, the system CPU usage is around 75% +/- 15% (averaged across both cores).
dotTrace (sampling mode) reports that 35.4% of the CPU time is spent in the following stack trace:
System.Threading.LowLevelLifoSemaphore.WaitNative(SafeWaitHandle, Int32)
System.Threading.LowLevelLifoSemaphore.WaitForSignal(Int32)
System.Threading.LowLevelLifoSemaphore.Wait(Int32, Boolean)
System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
System.Threading.Thread.StartCallback()
Another 45% or so is spread among the following:
10.5%
System.Threading.Monitor.Wait(Object, Int32)
9.8%
System.IO.FileSystemWatcher+RunningInstance.TryReadEvent(out NotifyEvent)
8.7%
System.Threading.WaitHandle.WaitOneNoCheck(Int32)
System.Threading.PortableThreadPool+GateThread.GateThreadStart()
System.Threading.Thread.StartCallback()
5.3%
System.Threading.Thread.Sleep(Int32)
4.9%
Interop+Sys.WaitForSocketEvents(IntPtr, SocketEvent*, Int32*)
These all appear to be symptoms of the same underlying behavior, even if they differ in specifics.
Analysis
Across a range of areas (see the stack traces above), the runtime appears to be spin-waiting for jobs rather than blocking until it is signaled. Blocking may be the better choice in this scenario, where some latency in picking up a job is acceptable (for example, I'm fine waiting 5 ms for a job to start).
This would let the CPU idle more, which lowers the environmental impact (both in the real world and on other processes within the OS instance), makes better use of things such as AWS CPU credits, and gives system-level analytics a more accurate picture of the real system load.
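To illustrate the distinction I'm drawing between spinning and blocking, here is a small sketch that waits for the same signal both ways and compares the CPU time consumed. This is not the runtime's implementation: `WaitStyles`, `SpinCpu`, `BlockCpu`, and the 200 ms delay are hypothetical, and `ManualResetEvent` simply stands in for any wait that parks the thread in the kernel.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class WaitStyles
{
    private static volatile bool s_ready;

    // Runs `wait` on the current thread, fires `signal` from a timer after
    // `delay`, and returns the process CPU time consumed in the meantime.
    private static TimeSpan CpuWhile(Action wait, Action signal, TimeSpan delay)
    {
        using var process = Process.GetCurrentProcess();
        TimeSpan before = process.TotalProcessorTime;
        using var timer = new Timer(_ => signal(), null, delay, Timeout.InfiniteTimeSpan);
        wait();
        process.Refresh();
        return process.TotalProcessorTime - before;
    }

    // Busy-waits on a flag: the waiting thread stays runnable the whole time.
    public static TimeSpan SpinCpu(TimeSpan delay)
    {
        s_ready = false;
        return CpuWhile(
            () => { while (!s_ready) Thread.SpinWait(20); },
            () => s_ready = true,
            delay);
    }

    // Blocks on a kernel event: the waiting thread sleeps until signaled.
    public static TimeSpan BlockCpu(TimeSpan delay)
    {
        using var evt = new ManualResetEvent(false);
        return CpuWhile(() => evt.WaitOne(), () => evt.Set(), delay);
    }

    public static void Main()
    {
        TimeSpan delay = TimeSpan.FromMilliseconds(200);
        Console.WriteLine($"spin:  {SpinCpu(delay).TotalMilliseconds:F0} ms CPU");
        Console.WriteLine($"block: {BlockCpu(delay).TotalMilliseconds:F0} ms CPU");
    }
}
```

The blocking variant may pick up the signal slightly later, which is the latency trade-off I'd be happy to accept here.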