Commit 12dd70c
internal/chacha20: implement the cipher.Stream interface and optimize

SIMD implementations of ChaCha20 (such as CL 35842) interleave block
computations in order to achieve high performance. This means that
they produce more than 64 bytes of output at a time. Unfortunately,
when encrypting small amounts of data (such as Poly1305 keys), the
current interface to ChaCha20 does not maintain any state, so the
additional blocks of key stream must be discarded and recomputed
later. This overhead slows down the encryption of small amounts of
data when using such optimized code.
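
A rough way to see this cost is to count the blocks each approach computes. The sketch below is illustrative arithmetic only. The batch width of 4 blocks and the two request sizes (a 32-byte Poly1305 key followed by a 1350-byte payload) are assumptions chosen for the example, not figures from this CL, and blocksStateless and blocksStateful are hypothetical helpers. It also treats the requests as one continuous key stream, which simplifies how an AEAD actually assigns counter values.

```go
package main

import "fmt"

const blockSize = 64 // ChaCha20 block size in bytes

// ceilDiv returns a/b rounded up.
func ceilDiv(a, b int) int { return (a + b - 1) / b }

// blocksStateless counts the blocks computed when every request starts a
// fresh batch and any unused output is thrown away (hypothetical model).
func blocksStateless(requests []int, batch int) int {
	total := 0
	for _, n := range requests {
		total += ceilDiv(ceilDiv(n, blockSize), batch) * batch
	}
	return total
}

// blocksStateful counts the blocks computed when leftover output is buffered
// and consumed by the next request (hypothetical model).
func blocksStateful(requests []int, batch int) int {
	sum := 0
	for _, n := range requests {
		sum += n
	}
	return ceilDiv(ceilDiv(sum, blockSize), batch) * batch
}

func main() {
	requests := []int{32, 1350} // assumed request sizes in bytes
	const batch = 4             // assumed SIMD width in blocks
	fmt.Println("stateless:", blocksStateless(requests, batch))
	fmt.Println("stateful: ", blocksStateful(requests, batch))
}
```

Under these assumptions the stateless path computes 28 blocks and the stateful path 24. The gap is widest when requests are small relative to the batch size.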

This CL makes the generic ChaCha20 implementation stateful, caching
key, nonce and counter values and buffering any unused key stream bytes.
ChaCha20 now also implements the high-level cipher.Stream interface,
which makes the API more consistent with other stream ciphers in the
standard library's crypto package. This will make it easier to add
high-performance SIMD implementations in the future.
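
The idea of a stateful cipher that buffers unused key stream can be sketched as follows. This is a minimal illustration, not the code added by this CL: the type streamSketch and its constructor are hypothetical names, the state layout (32-bit counter starting at 0, 96-bit nonce) is assumed from RFC 8439, and the byte-at-a-time XOR loop is written for clarity rather than speed.

```go
package main

import (
	"crypto/cipher"
	"encoding/binary"
	"fmt"
	"math/bits"
)

const blockSize = 64

// streamSketch is a hypothetical stateful cipher, not the type from this CL.
type streamSketch struct {
	state [16]uint32      // constants, key, block counter, nonce
	buf   [blockSize]byte // most recently generated key stream block
	used  int             // bytes of buf already consumed
}

// Compile-time check that the sketch satisfies cipher.Stream.
var _ cipher.Stream = (*streamSketch)(nil)

func newStreamSketch(key *[32]byte, nonce *[12]byte) *streamSketch {
	s := &streamSketch{used: blockSize} // buffer starts out empty
	s.state[0], s.state[1], s.state[2], s.state[3] =
		0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
	for i := 0; i < 8; i++ {
		s.state[4+i] = binary.LittleEndian.Uint32(key[4*i:])
	}
	// s.state[12] is the block counter and starts at zero.
	for i := 0; i < 3; i++ {
		s.state[13+i] = binary.LittleEndian.Uint32(nonce[4*i:])
	}
	return s
}

func quarterRound(a, b, c, d uint32) (uint32, uint32, uint32, uint32) {
	a += b
	d = bits.RotateLeft32(d^a, 16)
	c += d
	b = bits.RotateLeft32(b^c, 12)
	a += b
	d = bits.RotateLeft32(d^a, 8)
	c += d
	b = bits.RotateLeft32(b^c, 7)
	return a, b, c, d
}

// refill computes the next 64-byte block into buf and advances the counter.
func (s *streamSketch) refill() {
	x := s.state
	for i := 0; i < 10; i++ { // 10 double rounds = 20 rounds
		x[0], x[4], x[8], x[12] = quarterRound(x[0], x[4], x[8], x[12])
		x[1], x[5], x[9], x[13] = quarterRound(x[1], x[5], x[9], x[13])
		x[2], x[6], x[10], x[14] = quarterRound(x[2], x[6], x[10], x[14])
		x[3], x[7], x[11], x[15] = quarterRound(x[3], x[7], x[11], x[15])
		x[0], x[5], x[10], x[15] = quarterRound(x[0], x[5], x[10], x[15])
		x[1], x[6], x[11], x[12] = quarterRound(x[1], x[6], x[11], x[12])
		x[2], x[7], x[8], x[13] = quarterRound(x[2], x[7], x[8], x[13])
		x[3], x[4], x[9], x[14] = quarterRound(x[3], x[4], x[9], x[14])
	}
	for i := range x {
		binary.LittleEndian.PutUint32(s.buf[4*i:], x[i]+s.state[i])
	}
	s.state[12]++ // counter overflow handling is omitted in this sketch
	s.used = 0
}

// XORKeyStream consumes buffered key stream first and only computes a new
// block when the buffer is exhausted, so no output is wasted between calls.
func (s *streamSketch) XORKeyStream(dst, src []byte) {
	if len(dst) < len(src) {
		panic("streamSketch: output smaller than input")
	}
	for i := range src {
		if s.used == blockSize {
			s.refill()
		}
		dst[i] = src[i] ^ s.buf[s.used]
		s.used++
	}
}

func main() {
	var key [32]byte
	var nonce [12]byte
	msg := []byte("a short message split across two calls to XORKeyStream")

	whole := make([]byte, len(msg))
	newStreamSketch(&key, &nonce).XORKeyStream(whole, msg)

	s := newStreamSketch(&key, &nonce)
	parts := make([]byte, len(msg))
	s.XORKeyStream(parts[:10], msg[:10])
	s.XORKeyStream(parts[10:], msg[10:])

	fmt.Println("split calls match one call:", string(whole) == string(parts))
}
```

Because the leftover bytes of each block stay in buf, a caller can encrypt a 32-byte value and then continue with a longer message without recomputing the block that was already generated. A SIMD implementation could presumably enlarge buf to hold several blocks with the same interface.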

In addition to modifying the API, I have also added some optimizations
to improve the performance of the generic implementation. Note that
the performance will improve further on amd64 with Go 1.11 due to
CL 95475 (binary.LittleEndian.PutUint32 optimization). These benchmarks
are based on Go 1.10.1.

name            old speed     new speed     delta
ChaCha20/32     174MB/s ± 2%  174MB/s ± 1%     ~     (p=0.796 n=10+10)
ChaCha20/63     309MB/s ± 1%  337MB/s ± 2%   +9.32%  (p=0.000 n=10+9)
ChaCha20/64     299MB/s ± 2%  350MB/s ± 1%  +17.12%  (p=0.000 n=9+8)
ChaCha20/256    297MB/s ± 2%  390MB/s ± 1%  +31.40%  (p=0.000 n=10+10)
ChaCha20/1024   300MB/s ± 0%  400MB/s ± 3%  +33.38%  (p=0.000 n=7+10)
ChaCha20/1350   290MB/s ± 1%  386MB/s ± 2%  +33.10%  (p=0.000 n=9+10)
ChaCha20/65536  301MB/s ± 1%  416MB/s ± 2%  +38.25%  (p=0.000 n=9+10)

ChaCha20-Poly1305 (AEAD optimizations manually disabled):

name                       old speed     new speed     delta
Chacha20Poly1305Open_64    122MB/s ± 7%  131MB/s ± 2%   +7.23%  (p=0.000 n=18+18)
Chacha20Poly1305Seal_64    125MB/s ± 4%  137MB/s ± 2%   +9.88%  (p=0.000 n=20+19)
Chacha20Poly1305Open_1350  244MB/s ± 4%  305MB/s ± 3%  +25.04%  (p=0.000 n=20+19)
Chacha20Poly1305Seal_1350  242MB/s ± 3%  309MB/s ± 2%  +27.56%  (p=0.000 n=20+19)
Chacha20Poly1305Open_8K    260MB/s ± 7%  338MB/s ± 3%  +29.96%  (p=0.000 n=20+19)
Chacha20Poly1305Seal_8K    262MB/s ± 5%  335MB/s ± 4%  +27.80%  (p=0.000 n=20+19)

No change in allocations for either set of benchmarks.

Change-Id: I28ca7947904e9d79debe2d5aac6623526fe5e595
Reviewed-on: https://go-review.googlesource.com/104856
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

1 parent dccd99e, commit 12dd70c
4 files changed, +983 -182 lines changed
- internal/chacha20