32-bit hash; the core loop is built from invertible operations, and the multiplications are limited to 16x16->16. For better speed on short keys, the NMHASH32X variant loosens this limit to 32x32->32 multiplication when hashing short keys or avalanching the final result of the core loop.
Besides common operations, some compound ones are used:
x ^= x << a | x << b; // a != b
x ^= x >> a | x >> b; // a != b
x ^= x << a | x >> b; // a != b && a + b != 32 && a % b != 0 && b % a != 0
// dot_prod_16
x = ((x >> 16) * M1) << 16 | (x * M2 & 0xFFFF); // M1, M2 are odd
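To illustrate why these steps are invertible, here is a minimal sketch in C, not code taken from the hash itself. It covers the two same-direction shift steps and dot_prod_16; the mixed left/right shift step is not covered. The function names, the shift amounts (5, 9, 7, 12) and the odd multipliers (0xF0D9, 0x5C6B) are hypothetical values chosen for the demonstration, not the constants NMHASH32 uses.

```c
#include <assert.h>
#include <stdint.h>

/* x ^= x << a | x << b, with 0 < a, b < 32 and a != b */
static uint32_t xor_shl2(uint32_t x, unsigned a, unsigned b) {
    return x ^ ((x << a) | (x << b));
}

/* Each output bit depends only on lower input bits, so iterating
 * x = y ^ f(x) fixes at least one more low bit per pass. */
static uint32_t xor_shl2_inv(uint32_t y, unsigned a, unsigned b) {
    uint32_t x = y;
    for (int i = 0; i < 32; i++)
        x = y ^ ((x << a) | (x << b));
    return x;
}

/* x ^= x >> a | x >> b: the mirror image, recovered from the high bits down */
static uint32_t xor_shr2(uint32_t x, unsigned a, unsigned b) {
    return x ^ ((x >> a) | (x >> b));
}

static uint32_t xor_shr2_inv(uint32_t y, unsigned a, unsigned b) {
    uint32_t x = y;
    for (int i = 0; i < 32; i++)
        x = y ^ ((x >> a) | (x >> b));
    return x;
}

/* dot_prod_16: two independent 16x16->16 multiplications, one per half */
static uint32_t dot_prod_16(uint32_t x, uint32_t m1, uint32_t m2) {
    return (((x >> 16) * m1) << 16) | ((x * m2) & 0xFFFFu);
}

/* Inverse of an odd m modulo 2^16, via Newton iteration.
 * m * m == 1 (mod 8) for odd m, and each pass doubles the correct bits. */
static uint32_t inv16(uint32_t m) {
    uint32_t inv = m;
    for (int i = 0; i < 4; i++)
        inv *= 2u - m * inv;
    return inv & 0xFFFFu;
}

/* Round-trip a few inputs through each step and its inverse. */
static void self_check(void) {
    const uint32_t m1 = 0xF0D9u, m2 = 0x5C6Bu; /* hypothetical odd constants */
    const uint32_t xs[] = { 0u, 1u, 0x12345678u, 0x80000000u, 0xFFFFFFFFu };
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++) {
        uint32_t x = xs[i];
        assert(xor_shl2_inv(xor_shl2(x, 5, 9), 5, 9) == x);
        assert(xor_shr2_inv(xor_shr2(x, 7, 12), 7, 12) == x);
        uint32_t y = dot_prod_16(x, m1, m2);
        assert(dot_prod_16(y, inv16(m1), inv16(m2)) == x);
    }
}
```

Because dot_prod_16 multiplies each 16-bit half by an odd constant modulo 2^16, applying the same function with the modular inverses of M1 and M2 undoes it, which is why the multipliers are required to be odd.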
Both hashes are of the same high quality and pass the following tests:
- rurban/smhasher, including LongNeighbors and BadSeeds
- demerphq/smhasher
- massive collision tester, 1G, len=8,16,256
For large keys, the hash is optimized for SSE2/AVX2/AVX512 and achieves about 60% of the speed of XXH3 (in Gb/s). For short keys, NMHASH32X is a little faster than xxhash32.