Dispatch connection execution #12265
Conversation
- Seeing if this helps performance of "Connection: close"-based connections
src/Servers/Kestrel/Core/src/Internal/Infrastructure/KestrelConnection.cs (review thread, outdated/resolved)
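For context, the change is about where the connection delegate starts executing. Below is a minimal sketch of the general "dispatch to the thread pool" idea, not the actual KestrelConnection.cs diff; the `ConnectionDispatcher` type and its member names are illustrative assumptions:

```csharp
// Sketch only: hand each accepted connection to the thread pool instead of
// running the connection delegate inline on the transport/accept thread.
using System;
using System.Threading;
using System.Threading.Tasks;

public class ConnectionDispatcher<TConnection> where TConnection : class
{
    private readonly Func<TConnection, Task> _connectionDelegate;

    public ConnectionDispatcher(Func<TConnection, Task> connectionDelegate)
    {
        _connectionDelegate = connectionDelegate;
    }

    // Called from the accept loop for every new connection.
    public void OnConnectionAccepted(TConnection connection)
    {
        // Queue the per-connection work so the accept loop can immediately go
        // back to accepting. This is what helps connections-per-second for
        // "Connection: close" workloads, where connection setup dominates.
        ThreadPool.UnsafeQueueUserWorkItem(
            state => _ = state.self.ExecuteAsync(state.connection),
            (self: this, connection),
            preferLocal: false);
    }

    private async Task ExecuteAsync(TConnection connection)
    {
        try
        {
            await _connectionDelegate(connection);
        }
        catch (Exception ex)
        {
            // The real server tracks and logs per-connection failures; the
            // sketch just observes the exception so it isn't unhandled.
            Console.Error.WriteLine(ex);
        }
    }
}
```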
/azp run AspNetCore-ci
Azure Pipelines successfully started running 1 pipeline(s).
Ran plaintext on Linux to make sure there was no regression as well:
After:
First request looks worse though.
@davidfowl I feel like something else may have caused the first request regression. I'm running on master right now (without your change) and I see first request times around 86 ms.
I wouldn't be shocked if the first request regression was real. That said, getting the connection throughput back is probably worth it. It would be interesting to compare the first-request latency before the bedrock refactor and after this dispatch change. I also wouldn't be surprised if we had higher latency prior to the refactor, but didn't really notice, since we don't pay as much attention to first request latency compared to RPS.
Yep I was thinking the same thing. I want to make sure the other kestrel changes happen before this one so the regression doesn't end up in the same run as the other changes.
I spent last night looking at this and the bedrock refactor: the startup time during the bedrock refactor went from 19 ms to 29 ms.
34% CPU utilization seems to leave a lot on the table for connections per second. 17k, which is where it was before the regression, is already pretty far below what would be expected. Even nginx claims to do 34k CPS on a single core or 250k+ on 16 cores. Erlang and Go easily do 100k+ as well. /cc @tmds @benaadams
@davidfowl I don't think the regression in startup corresponds with your Bedrock refactor. What is interesting is that before and after the regression, there were no changes to either the ASP.NET Core runtime or the .NET runtime (aspnetcore was 3.0.0-preview6-19253-01 and dotnetcore was 3.0.0-preview6-27714-15). This suggests a machine issue for the time to first response. We don't have a similar regression on Linux; startup time has consistently dropped there. @NinoFloris I agree that our connection close scenario is an area to improve. Our Linux numbers are slightly better (around 30k CPS), but I believe those machines have more than one core.
Didn't it bounce back after that single 29 ms run as well?
That's on physical hardware with a bazillion cores, right? CPU utilization on Linux for 30k CPS looks to be ~10%.
From what I remember, there was a driver issue on the Windows perf machines that hasn't been addressed. @sebastienros for more info?
This might be relevant for the CPS and non-pipelined investigations too: https://github.com/eloraiby/fs-pacer/blob/master/fs-pacer.md
Thanks for the input @NinoFloris. I hadn't seen the nginx CPS benchmarks before, but it seems that we both decided to measure CPS performance using wrk with a "Connection: close" request header. There are some slight differences, like we respond with "Hello World" and nginx responds with an empty body, but I don't think that should make much of a difference. The more significant difference is that the nginx CPS benchmarks only use 50 concurrent connections vs our 256. 256 connections might not be optimal, since that was mostly tuned for the TechEmpower Pipelined Plaintext scenario. Or maybe nginx is effectively handicapping itself and still winning. Either way, it looks like nginx is using better hardware for both their benchmarking servers and clients than we do (e.g. 2x Xeon E5-2699 vs 1x Xeon E5-1650, 2x Intel XL710 40GbE vs 1x Intel X540 10GbE, etc.). Though as you mention, nginx outperforms even on a single core. We're looking at getting 40GbE NICs soon. All that said, it's always nice to know what's possible, and I'm sure we're leaving per-connection performance on the table. So far, we've definitely focused more of our optimization efforts on improving RPS once a connection is already established.
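As a point of reference, a connections-per-second run of the kind described above is typically driven with wrk by forcing the client to close after every request; the thread/connection counts and URL below are illustrative, not the actual benchmark configuration:

```
wrk -t 16 -c 256 -d 30s -H "Connection: close" http://server:5000/plaintext
```

With that header, every request pays the full connection setup and teardown cost, so the measurement stresses per-connection overhead rather than steady-state RPS.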
Fixes #10956
Before:
After: