XSalsa20, ChaCha, Faster + Reduced Round Salsa20 #20
This is a series of evolutions of the Salsa20 engine family, presented in the order I developed them.
If you want to cherry-pick or rework any of them (e.g. so they are not based on extending the Salsa20 engine), feel free.
XSalsa20 implementation, based on the existing Salsa20 engine with a couple of tweaks to allow the key setup and nonce size to vary.
XSalsa20 is a variant of the Salsa20 stream cipher with an extended nonce (192 bits instead of 64 bits).
Test vectors are copied from the cryptopp implementation, where they were generated using the NaCl XSalsa20. There don't appear to be any official test vectors.
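For readers unfamiliar with how the extended nonce is consumed, here is a minimal sketch of the XSalsa20 key setup as I understand it from the published construction. It is not the code in this PR. The class and method names (`XSalsa20KeySetupSketch`, `deriveSubKey`, `remainingNonce`) are hypothetical. The first 16 nonce bytes and the 256-bit key go through 20 Salsa20 rounds without the final feed-forward addition (HSalsa20) to produce a sub-key, and the last 8 nonce bytes become the nonce of an ordinary Salsa20 instance keyed with that sub-key.

```java
import java.util.Arrays;

final class XSalsa20KeySetupSketch {
    // "expand 32-byte k" as four little-endian words.
    static final int[] SIGMA = {0x61707865, 0x3320646e, 0x79622d32, 0x6b206574};

    static int le32(byte[] b, int off) {
        return (b[off] & 0xff) | ((b[off + 1] & 0xff) << 8)
             | ((b[off + 2] & 0xff) << 16) | ((b[off + 3] & 0xff) << 24);
    }

    // Salsa20 quarter round on four state indices.
    static void quarterRound(int[] x, int a, int b, int c, int d) {
        x[b] ^= Integer.rotateLeft(x[a] + x[d], 7);
        x[c] ^= Integer.rotateLeft(x[b] + x[a], 9);
        x[d] ^= Integer.rotateLeft(x[c] + x[b], 13);
        x[a] ^= Integer.rotateLeft(x[d] + x[c], 18);
    }

    // HSalsa20: derive a 32-byte sub-key from the key and nonce bytes 0..15.
    static byte[] deriveSubKey(byte[] key, byte[] nonce24) {
        if (key.length != 32 || nonce24.length != 24) {
            throw new IllegalArgumentException("need a 32 byte key and a 24 byte nonce");
        }
        int[] x = new int[16];
        x[0] = SIGMA[0]; x[5] = SIGMA[1]; x[10] = SIGMA[2]; x[15] = SIGMA[3];
        for (int i = 0; i < 4; i++) {
            x[1 + i] = le32(key, 4 * i);         // first half of the key
            x[11 + i] = le32(key, 16 + 4 * i);   // second half of the key
            x[6 + i] = le32(nonce24, 4 * i);     // nonce bytes 0..15 fill the nonce and counter slots
        }
        for (int i = 20; i > 0; i -= 2) {
            quarterRound(x, 0, 4, 8, 12);   quarterRound(x, 5, 9, 13, 1);
            quarterRound(x, 10, 14, 2, 6);  quarterRound(x, 15, 3, 7, 11);
            quarterRound(x, 0, 1, 2, 3);    quarterRound(x, 5, 6, 7, 4);
            quarterRound(x, 10, 11, 8, 9);  quarterRound(x, 15, 12, 13, 14);
        }
        // No feed-forward addition: the sub-key is taken straight from the permuted state.
        int[] words = {x[0], x[5], x[10], x[15], x[6], x[7], x[8], x[9]};
        byte[] subKey = new byte[32];
        for (int i = 0; i < 8; i++) {
            subKey[4 * i] = (byte) words[i];
            subKey[4 * i + 1] = (byte) (words[i] >>> 8);
            subKey[4 * i + 2] = (byte) (words[i] >>> 16);
            subKey[4 * i + 3] = (byte) (words[i] >>> 24);
        }
        return subKey;
    }

    // Nonce bytes 16..23 are used as the 64-bit nonce of the inner Salsa20.
    static byte[] remainingNonce(byte[] nonce24) {
        return Arrays.copyOfRange(nonce24, 16, 24);
    }
}
```

This split is why the engine only needs its key setup and nonce length to vary: once the sub-key is derived, keystream generation is plain Salsa20.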
ChaCha implementation, based on the existing Salsa20 engine with the key setup, block permutation and block counter increment overridden.
This is basically an implementation of the 'regs' reference implementation found in the eStream benchmark suite and at http://cr.yp.to/chacha.html.
Speed is slightly faster (~10%) than the Salsa20 engine, due to the registerization.
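To illustrate the two pieces that differ most from Salsa20, here is a rough array-based sketch of the ChaCha block permutation and block counter increment. It is not the registerized code in this PR. The class and method names are hypothetical, and the state layout assumed is the original ChaCha one (constants in words 0..3, key in 4..11, 64-bit block counter in 12..13, nonce in 14..15).

```java
final class ChaChaSketch {
    // ChaCha quarter round: add, xor, rotate by 16, 12, 8, 7.
    static void quarterRound(int[] x, int a, int b, int c, int d) {
        x[a] += x[b]; x[d] = Integer.rotateLeft(x[d] ^ x[a], 16);
        x[c] += x[d]; x[b] = Integer.rotateLeft(x[b] ^ x[c], 12);
        x[a] += x[b]; x[d] = Integer.rotateLeft(x[d] ^ x[a], 8);
        x[c] += x[d]; x[b] = Integer.rotateLeft(x[b] ^ x[c], 7);
    }

    // One double round: four column quarter rounds, then four diagonal ones.
    static void doubleRound(int[] x) {
        quarterRound(x, 0, 4, 8, 12);
        quarterRound(x, 1, 5, 9, 13);
        quarterRound(x, 2, 6, 10, 14);
        quarterRound(x, 3, 7, 11, 15);
        quarterRound(x, 0, 5, 10, 15);
        quarterRound(x, 1, 6, 11, 12);
        quarterRound(x, 2, 7, 8, 13);
        quarterRound(x, 3, 4, 9, 14);
    }

    // Block permutation with feed-forward; rounds must be a positive even number.
    static int[] block(int rounds, int[] state) {
        if (state.length != 16 || rounds <= 0 || (rounds & 1) != 0) {
            throw new IllegalArgumentException("need 16 words and a positive even round count");
        }
        int[] x = state.clone();
        for (int i = rounds; i > 0; i -= 2) {
            doubleRound(x);
        }
        for (int i = 0; i < 16; i++) {
            x[i] += state[i];
        }
        return x;
    }

    // 64-bit block counter held in words 12 (low) and 13 (high).
    static void advanceCounter(int[] state) {
        if (++state[12] == 0) {
            ++state[13];
        }
    }
}
```

In Salsa20 the counter sits in words 8..9 and the quarter round uses rotations of 7, 9, 13 and 18, which is why these are the parts that need overriding.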
Reduced round Salsa20
Parameterisation of Salsa20Engine to allow arbitrary rounds. Test vectors from estreambench-20080905.
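As a rough idea of what the parameterisation amounts to, the sketch below runs the Salsa20 core for a caller-supplied number of rounds (e.g. 8, 12 or 20). It is an illustration under my own assumptions rather than the actual `Salsa20Engine` change: the class name, method names and the validation rule (positive, even round count) are hypothetical.

```java
final class ReducedRoundSalsa20Sketch {
    // Salsa20 quarter round on four state indices.
    static void quarterRound(int[] x, int a, int b, int c, int d) {
        x[b] ^= Integer.rotateLeft(x[a] + x[d], 7);
        x[c] ^= Integer.rotateLeft(x[b] + x[a], 9);
        x[d] ^= Integer.rotateLeft(x[c] + x[b], 13);
        x[a] ^= Integer.rotateLeft(x[d] + x[c], 18);
    }

    // Salsa20/rounds core: rounds/2 double rounds followed by the feed-forward addition.
    static int[] core(int rounds, int[] input) {
        if (input.length != 16) {
            throw new IllegalArgumentException("state must be 16 words");
        }
        if (rounds <= 0 || (rounds & 1) != 0) {
            throw new IllegalArgumentException("rounds must be a positive even number");
        }
        int[] x = input.clone();
        for (int i = rounds; i > 0; i -= 2) {
            // Column round.
            quarterRound(x, 0, 4, 8, 12);
            quarterRound(x, 5, 9, 13, 1);
            quarterRound(x, 10, 14, 2, 6);
            quarterRound(x, 15, 3, 7, 11);
            // Row round.
            quarterRound(x, 0, 1, 2, 3);
            quarterRound(x, 5, 6, 7, 4);
            quarterRound(x, 10, 11, 8, 9);
            quarterRound(x, 15, 12, 13, 14);
        }
        for (int i = 0; i < 16; i++) {
            x[i] += input[i];
        }
        return x;
    }
}
```

With this shape, Salsa20/8 and Salsa20/12 differ from the full cipher only in the loop bound, so `core(8, state)`, `core(12, state)` and `core(20, state)` share all other code.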
Registerization of Salsa20Engine
Registerize the state variables in salsa20Core to allow HotSpot etc. to optimise the loads/stores (as much as can be done with 16 variables and no SIMD).
Boosts performance by about 10% on common x86 hardware, possibly more on setups with more registers. Should have no effect on systems with small numbers of registers.
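The idea, sketched below with hypothetical names, is to copy the 16 state words into local variables before the round loop so the JIT can keep them in registers instead of re-reading and re-writing array slots. The array-based `coreArray` is included only as a reference to show that both forms compute the same result; this is an illustration, not the exact code in the PR, and it makes no performance claim of its own.

```java
final class RegisterizedSalsa20Sketch {
    static int rotl(int v, int n) { return Integer.rotateLeft(v, n); }

    static void quarterRound(int[] x, int a, int b, int c, int d) {
        x[b] ^= rotl(x[a] + x[d], 7);
        x[c] ^= rotl(x[b] + x[a], 9);
        x[d] ^= rotl(x[c] + x[b], 13);
        x[a] ^= rotl(x[d] + x[c], 18);
    }

    // Reference form: every operation loads from and stores to the array.
    static int[] coreArray(int[] in) {
        int[] x = in.clone();
        for (int i = 20; i > 0; i -= 2) {
            quarterRound(x, 0, 4, 8, 12);   quarterRound(x, 5, 9, 13, 1);
            quarterRound(x, 10, 14, 2, 6);  quarterRound(x, 15, 3, 7, 11);
            quarterRound(x, 0, 1, 2, 3);    quarterRound(x, 5, 6, 7, 4);
            quarterRound(x, 10, 11, 8, 9);  quarterRound(x, 15, 12, 13, 14);
        }
        for (int i = 0; i < 16; i++) { x[i] += in[i]; }
        return x;
    }

    // Registerized form: the state lives in 16 locals for the whole loop.
    static int[] coreRegisters(int[] in) {
        int x00 = in[0], x01 = in[1], x02 = in[2], x03 = in[3];
        int x04 = in[4], x05 = in[5], x06 = in[6], x07 = in[7];
        int x08 = in[8], x09 = in[9], x10 = in[10], x11 = in[11];
        int x12 = in[12], x13 = in[13], x14 = in[14], x15 = in[15];
        for (int i = 20; i > 0; i -= 2) {
            // Column round.
            x04 ^= rotl(x00 + x12, 7);  x08 ^= rotl(x04 + x00, 9);
            x12 ^= rotl(x08 + x04, 13); x00 ^= rotl(x12 + x08, 18);
            x09 ^= rotl(x05 + x01, 7);  x13 ^= rotl(x09 + x05, 9);
            x01 ^= rotl(x13 + x09, 13); x05 ^= rotl(x01 + x13, 18);
            x14 ^= rotl(x10 + x06, 7);  x02 ^= rotl(x14 + x10, 9);
            x06 ^= rotl(x02 + x14, 13); x10 ^= rotl(x06 + x02, 18);
            x03 ^= rotl(x15 + x11, 7);  x07 ^= rotl(x03 + x15, 9);
            x11 ^= rotl(x07 + x03, 13); x15 ^= rotl(x11 + x07, 18);
            // Row round.
            x01 ^= rotl(x00 + x03, 7);  x02 ^= rotl(x01 + x00, 9);
            x03 ^= rotl(x02 + x01, 13); x00 ^= rotl(x03 + x02, 18);
            x06 ^= rotl(x05 + x04, 7);  x07 ^= rotl(x06 + x05, 9);
            x04 ^= rotl(x07 + x06, 13); x05 ^= rotl(x04 + x07, 18);
            x11 ^= rotl(x10 + x09, 7);  x08 ^= rotl(x11 + x10, 9);
            x09 ^= rotl(x08 + x11, 13); x10 ^= rotl(x09 + x08, 18);
            x12 ^= rotl(x15 + x14, 7);  x13 ^= rotl(x12 + x15, 9);
            x14 ^= rotl(x13 + x12, 13); x15 ^= rotl(x14 + x13, 18);
        }
        // Single write-back, with the feed-forward addition of the input.
        return new int[] {
            x00 + in[0], x01 + in[1], x02 + in[2], x03 + in[3],
            x04 + in[4], x05 + in[5], x06 + in[6], x07 + in[7],
            x08 + in[8], x09 + in[9], x10 + in[10], x11 + in[11],
            x12 + in[12], x13 + in[13], x14 + in[14], x15 + in[15]
        };
    }
}
```

Whether the locals actually end up in registers depends on the JIT and the target architecture, which is consistent with the gain being larger where more registers are available.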