-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/rsa: TLS requests are 2-3x slower in go1.20 with race detection enabled #56980
Comments
Thank you for the details! If I understand the benchmark correctly, both the server and the client side of the handshake contribute to the benchmarked time, so this is observing an RSA-2048 sign+verify cycle. In that case +26% on amd64 and +35% on arm64 is more or less in line with the benchmarks we had. I had not benchmarked with the race detector enabled, and it's interesting the performance profile is so different. I can look into it, but it would help to understand if race detector-enabled performance is a goal. The Go 1.20 changes are a tradeoff that knowingly sacrifice a bit of performance for a lot of security by cutting a large chunk of attack surface (math/big). In the meantime, if you just want your tests to be faster (assuming that's where the race detector matters), you might consider using a smaller (insecure) RSA key, or checking if ECDSA keys perform better. (Please use https://go.dev/cl/ links, these links don't work outside Google. There's an internal Chrome extension you can use to easily copy a shortened version of the link to the current page. Thank you!) |
It wasn't clear to me that the benchmarks in http://go.dev/cl/326012 represented known multi-millisecond slowdowns for requests (where both the client and server performance is impacted in serial). The CL also alluded to possible optimizations ("this is without any assembly at all, and that further improvements are likely possible"), and I was surprised to see it merge without attempting to preserve status quo for performance. Are any of those improvements planned before go1.20 freezes?
It's not specifically a goal, but the change turned several of our tests that were previously solid green (taking ~30ms to make a request with a 100ms timeout) into flaking or solidly failing (taking 100-200ms+ to make the same request) on some architectures. As we are working to be able to update Kubernetes release branches to new go minor versions with minimal code changes, knowing this is in the pipeline and anticipating chasing down new test flakes and backporting test changes to release branches as a result of a go version bump is a red flag. |
Go 1.20 already froze. We landed https://go.dev/cl/445019, https://go.dev/cl/445020, https://go.dev/cl/445021, and https://go.dev/cl/452095 to reduce the performance gap. https://go.dev/cl/452095 has the final benchmarks on amd64 compared to Go 1.19, which amount to 500µs for sign+verify on the Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz I ran them on, which is more or less in line with the 840µs you're seeing. We do have extra optimization work planned for Go 1.21, and the new backend can in theory become faster than the old one with the same amount of assembly. When you say multi-millisecond slowdowns, I assume you mean with the race detector enabled. That's interesting and worth looking into (it might be an issue in the race detector or in the interaction, maybe?) but the question is whether race detector-enabled performance is a blocker worth reverting the security improvement over. |
I suspect that the greater slowdowns with the race detector just mirror the fact that the new implementation is more Go and less assembly. The race detector instrumentation only gets inserted in Go code. I don't think race detector performance should be a release blocker. |
No, without race detection, a dial+handshake+request took at least a millisecond longer on the machines I tested, and in some architectures, multiple milliseconds longer. That reflects real runtime impact, which I was surprised by.
I tend to agree (pain of backporting deflaking test changes notwithstanding), but I was surprised by the degree of the performance drop, and wanted to surface what we saw in case it indicated a bigger problem. |
@liggitt
Have run your Minimal reproducer in description.
|
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
No, introduced in go1.20 dev branch in https://go-review.googlesource.com/c/go/+/326012
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Ran project unit tests making TLS requests with race detection enabled (xref kubernetes/kubernetes#114145 (comment))
Minimal reproducer:
run with
What did you expect to see?
TLS request times are roughly equivalent to go1.19 regardless of race detection.
What did you see instead?
to make the same request as with go1.19, go1.20 takes ~2x longer on linux/amd64, ~3x longer on darwin/arm64, and even longer on linux/ppc64le (I'm working on getting numbers with the minimal reproducer above)
linux/amd64:
darwin/arm64
This is with a simple request that only has the client verifying the server cert. When more is involved (mutual TLS or an RSA JWT verification), the slowdown is even more pronounced (3x - 7x slowdowns observed in kubernetes/kubernetes#114145).
Without race detection, the slowdown due to https://go-review.googlesource.com/c/go/+/326012 was not as drastic, but was still milliseconds longer per request:
linux/amd64:
darwin/arm64
cc @FiloSottile
The text was updated successfully, but these errors were encountered: