Commit 1a86580
x/crypto/internal/poly1305: improve sum_ppc64le.s
This contains a few minor improvements to sum_ppc64le.s
which result in up to 10% performance improvement for
some of the benchmarks in this directory.
- ADDZE followed by ADD can be combined into ADDE
- PCALIGN added to the loop
- Eliminate a few unnecessary register moves
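The first item above can be illustrated outside of assembly. This is a minimal Go sketch, not the code from sum_ppc64le.s: the helper names addzeThenAdd and adde are hypothetical, and math/bits.Add64 stands in for the PPC64 carry semantics. It only compares the 64-bit result register, on the assumption that the carry-out of the intermediate step is not consumed afterward.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addzeThenAdd models the two-instruction form (hypothetical helper):
// ADDZE folds the incoming carry into a, then ADD adds b.
func addzeThenAdd(a, b, carry uint64) uint64 {
	t, _ := bits.Add64(a, 0, carry) // ADDZE: t = a + CA
	return t + b                    // ADD:   t = t + b (wraps modulo 2^64)
}

// adde models the single-instruction form (hypothetical helper):
// ADDE computes a + b + CA in one step.
func adde(a, b, carry uint64) uint64 {
	sum, _ := bits.Add64(a, b, carry) // ADDE: sum = a + b + CA
	return sum
}

func main() {
	a, b, carry := uint64(1)<<63, uint64(12345), uint64(1)
	fmt.Println("results match:", addzeThenAdd(a, b, carry) == adde(a, b, carry))
}
```

Because addition modulo 2^64 is associative, both forms yield the same result register, so the pair can collapse into one instruction wherever only that result is needed.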
goos: linux
goarch: ppc64le
pkg: golang.org/x/crypto/internal/poly1305
cpu: POWER10
                  │ poly.orig.out │              poly.out               │
                  │    sec/op     │   sec/op     vs base                │
64                    40.34n ± 0%   38.13n ± 0%   -5.47% (p=0.002 n=6)
1K                    482.2n ± 0%   444.6n ± 0%   -7.81% (p=0.002 n=6)
2M                    978.4µ ± 0%   879.3µ ± 0%  -10.12% (p=0.002 n=6)
64Unaligned           40.35n ± 0%   38.16n ± 0%   -5.42% (p=0.002 n=6)
1KUnaligned           482.0n ± 0%   444.2n ± 0%   -7.84% (p=0.002 n=6)
2MUnaligned           978.4µ ± 0%   879.4µ ± 0%  -10.12% (p=0.002 n=6)
Write64               32.69n ± 0%   30.71n ± 0%   -6.04% (p=0.002 n=6)
Write1K               472.4n ± 0%   436.5n ± 0%   -7.60% (p=0.002 n=6)
Write2M               978.3µ ± 0%   879.4µ ± 0%  -10.11% (p=0.002 n=6)
Write64Unaligned      32.67n ± 0%   30.71n ± 0%   -6.00% (p=0.002 n=6)
Write1KUnaligned      472.6n ± 0%   436.4n ± 0%   -7.66% (p=0.002 n=6)
Write2MUnaligned      978.5µ ± 0%   879.6µ ± 0%  -10.10% (p=0.002 n=6)
geomean               2.569µ        2.367µ        -7.87%
Change-Id: I63314e7252ef10fb2d157f623c4bc2e31a63ae32
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/558775
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
1 parent 1c981e6 commit 1a86580
1 file changed, 6 insertions(+), 8 deletions(-)