Description
After helping a customer investigate a thread pool starvation case on Linux on .NET Core 3.1, I ended up here. After some research and some discussion on Twitter, it turns out that getaddrinfo_a uses an internal thread pool and blocks on getaddrinfo; it isn't doing any async I/O. This change is an improvement over what we had before because our thread pool doesn't grow, but I'm not sure it is a net positive in the long run. The thread pool limits are controlled by compile-time constants in glibc (essentially, another library is doing async over sync for us on a less controllable thread pool...).
I wonder if we're better off controlling this blocking code ourselves, and maybe it should be possible to turn the getaddrinfo_a path off with a configuration switch.
The other improvement I was thinking about was allowing only one pending request per host name at a time. That would help in situations where DNS is slow and new blocking calls for the same host name keep being issued on thread pool threads (which is the case the customer ran into).
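To make the "one pending request per host name" idea concrete, here is a rough, language-neutral sketch in Python rather than the actual .NET implementation. The `CoalescingResolver` class and its `resolve` hook are hypothetical names invented for illustration. The assumption is that the blocking lookup still runs on a worker thread, but concurrent callers for the same host share a single in-flight call instead of each tying up a thread.

```python
import asyncio
import socket


class CoalescingResolver:
    """Shares one in-flight lookup per host name among concurrent callers.

    Hypothetical sketch: `resolve` is any blocking callable taking a host
    name; by default it calls socket.getaddrinfo.
    """

    def __init__(self, resolve=None):
        self._resolve = resolve or (lambda host: socket.getaddrinfo(host, None))
        self._pending = {}  # host name -> future of the in-flight lookup

    def _forget(self, host, fut):
        # Drop the entry only if it still refers to the lookup that finished.
        if self._pending.get(host) is fut:
            del self._pending[host]

    async def lookup(self, host):
        fut = self._pending.get(host)
        if fut is None:
            # First caller for this host: start the blocking call on a
            # worker thread and record it so later callers can join it.
            loop = asyncio.get_running_loop()
            fut = loop.run_in_executor(None, self._resolve, host)
            self._pending[host] = fut
            fut.add_done_callback(lambda f, h=host: self._forget(h, f))
        # shield() keeps one caller's cancellation from cancelling the
        # shared lookup that other callers are still waiting on.
        return await asyncio.shield(fut)
```

With this shape, a burst of lookups for the same slow host name occupies one worker thread instead of one per caller. Results are not cached: once the lookup completes, the next request starts a fresh one.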