Hide OpenSSL sha256 initialisation in a function-scoped thread_local
#7254
| PICOBENCH_SUITE("digest sha256"); | ||
| namespace SHA256_bench | ||
| { | ||
| const std::vector<int> sha256_shifts = { | ||
| 2 << 4, 2 << 6, 2 << 8, 2 << 10, 2 << 12, 2 << 14, 2 << 16}; | ||
|
|
||
| auto openssl_sha256_preinit = sha256_bench; | ||
| PICOBENCH(openssl_sha256_preinit).iterations(sha256_shifts).baseline(); | ||
|
|
||
| DEFINE_SHA256_BENCH(6) | ||
| DEFINE_SHA256_BENCH(8) | ||
| DEFINE_SHA256_BENCH(10) | ||
| DEFINE_SHA256_BENCH(12) | ||
| DEFINE_SHA256_BENCH(14) | ||
| DEFINE_SHA256_BENCH(16) | ||
| auto openssl_sha256_tl_init = sha256_bench_; | ||
| PICOBENCH(openssl_sha256_tl_init).iterations(sha256_shifts); | ||
|
|
||
| auto openssl_sha256_nocache = sha256_noopt_bench; | ||
| PICOBENCH(openssl_sha256_nocache).iterations(sha256_shifts); | ||
| } |
This refactor is worth doing in a separate PR, even if we don't use this initialisation. It moves all these benchmarks into a single suite, and uses .iterations (the Dim column in the output) for the actual digest size, so that the ns/op and Ops/second columns report per-byte costs, rather than some abstract multiple.
Before:
## digest sha256 (2 << 6):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_6_no * | 10 | 0.004 | 442 | - | 2259887.0
openssl_sha256_6 | 10 | 0.003 | 279 | 0.633 | 3572704.5
## digest sha256 (2 << 8):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_8_no * | 10 | 0.007 | 708 | - | 1412429.4
openssl_sha256_8 | 10 | 0.005 | 498 | 0.704 | 2005615.7
## digest sha256 (2 << 10):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_10_no * | 10 | 0.017 | 1657 | - | 603281.9
openssl_sha256_10 | 10 | 0.015 | 1512 | 0.913 | 661025.9
## digest sha256 (2 << 12):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_12_no * | 10 | 0.056 | 5587 | - | 178980.5
openssl_sha256_12 | 10 | 0.054 | 5418 | 0.970 | 184559.7
## digest sha256 (2 << 14):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_14_no * | 10 | 0.214 | 21420 | - | 46684.7
openssl_sha256_14 | 10 | 0.212 | 21208 | 0.990 | 47151.1
## digest sha256 (2 << 16):
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_16_no * | 10 | 0.846 | 84614 | - | 11818.3
openssl_sha256_16 | 10 | 0.843 | 84281 | 0.996 | 11865.1
After:
## digest sha256:
Name (* = baseline) | Dim | Total ms | ns/op |Baseline| Ops/second
--------------------------|--------:|----------:|--------:|-------:|----------:
openssl_sha256_noopt * | 32 | 0.000 | 12 | - | 80200501.3
openssl_sha256_opt | 32 | 0.000 | 5 | 0.439 |182857142.9
openssl_sha256_noopt * | 128 | 0.000 | 3 | - |272921108.7
openssl_sha256_opt | 128 | 0.000 | 2 | 0.557 |490421455.9
openssl_sha256_noopt * | 512 | 0.001 | 1 | - |665799739.9
openssl_sha256_opt | 512 | 0.001 | 1 | 0.700 |951672862.5
openssl_sha256_noopt * | 2048 | 0.002 | 0 | - |1137146030.0
openssl_sha256_opt | 2048 | 0.002 | 0 | 0.890 |1277604491.6
openssl_sha256_noopt * | 8192 | 0.006 | 0 | - |1362608117.1
openssl_sha256_opt | 8192 | 0.006 | 0 | 0.934 |1458170167.3
openssl_sha256_noopt * | 32768 | 0.022 | 0 | - |1481977296.4
openssl_sha256_opt | 32768 | 0.022 | 0 | 1.005 |1475105789.1
openssl_sha256_noopt * | 131072 | 0.088 | 0 | - |1490792870.9
openssl_sha256_opt | 131072 | 0.090 | 0 | 1.028 |1450537289.3
The evidence that these are "the same numbers" is that the ratio in the Baseline column is (roughly, within noise) the same for each size. The difference is that the other columns are now more readable, the Total ms is for one run rather than 10, and as mentioned above the "op" unit is "byte".
(I've also added 2 << 4, because I was interested)
(Bonus, you can now run this directly with ./crypto_bench --run-suite="digest sha256", because it's a single suite)
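The unit change described above can be checked by hand: in the old output one op was a whole digest, so dividing the old ns/op by the buffer size should land near the new per-byte figure. A minimal sketch of that arithmetic follows; ns_per_byte and bytes_per_second are illustrative names, not functions from the benchmark code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: converts a per-digest timing (old output, where one
// "op" hashed the whole buffer) into a per-byte cost (new output, where the
// Dim column is the buffer size and one "op" is one byte).
double ns_per_byte(double ns_per_digest, std::size_t digest_bytes)
{
  assert(digest_bytes > 0);
  return ns_per_digest / static_cast<double>(digest_bytes);
}

// Throughput in bytes per second implied by a per-byte cost in nanoseconds.
double bytes_per_second(double ns_per_byte_cost)
{
  assert(ns_per_byte_cost > 0.0);
  return 1e9 / ns_per_byte_cost;
}
```

For the 2 << 6 row (a 128-byte buffer), ns_per_byte(442.0, 2 << 6) comes out at roughly 3.45 ns, which is in the same range as the Dim 128 baseline row in the new table, allowing for run-to-run noise.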
Is this actually faster?
If we only run a single digest, then the benchmarks say no: this is faster than not caching the contexts, but slower than the explicit initialisation.
But that's slightly artificial. What about the minor "warm cache" (/branch predictor) win we'd expect if we digest multiple things in quick succession (simulated by "calling the function 10 times")? Then this approach is near-identical.
I think that's close enough that it's worth comparing e2e numbers, so I'll try that.
Thanks for benchmarking, picking that up in #7251
See discussion in this thread:
https://github.com/microsoft/CCF/pull/7251/files#r2322012456
I think this is much nicer, but it does seem to be slightly slower (checking a bool for thread_local initialisation on every call), so this is debatable.
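For readers unfamiliar with the pattern, here is a minimal sketch of a function-scoped thread_local cache. It is not the CCF code: FakeDigestContext, thread_context and hash_with_cached_context are hypothetical names, and a toy FNV-1a-style hash stands in for the OpenSSL call so the example has no dependencies.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts how many contexts have been constructed, across all threads.
std::atomic<int> contexts_created{0};

// Hypothetical stand-in for an expensive-to-create hashing context.
struct FakeDigestContext
{
  FakeDigestContext() { ++contexts_created; }

  // Toy FNV-1a-style hash, used here instead of a real SHA-256 call.
  std::uint64_t digest(const unsigned char* data, std::size_t len) const
  {
    std::uint64_t h = 14695981039346656037ull;
    for (std::size_t i = 0; i < len; ++i)
    {
      h ^= data[i];
      h *= 1099511628211ull;
    }
    return h;
  }
};

// The context is constructed on the first call made by each thread and then
// reused. Each later call typically pays only for an "already initialised?"
// check, which is the per-call cost mentioned above.
FakeDigestContext& thread_context()
{
  thread_local FakeDigestContext ctx;
  return ctx;
}

std::uint64_t hash_with_cached_context(
  const unsigned char* data, std::size_t len)
{
  assert(data != nullptr || len == 0);
  return thread_context().digest(data, len);
}
```

Callers never see the context, so no explicit initialisation step is needed before the first digest. The trade-off is the guard check on every call, which an explicitly pre-initialised context avoids.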