Pure Python reimplementation of CPython's tuplehash function with exact overflow behavior, designed for hashing user-defined immutable sequences.

Currently, if you want a custom immutable sequence to hash like a tuple, you have to write:

```python
from typing import TypeVar, Sequence

T = TypeVar('T', covariant=True)

class MySequence(Sequence[T]):
    def __hash__(self):
        # Builds a full temporary tuple just to hash it
        return hash(tuple(self))
```

This copies every element into a temporary tuple on each call, which creates memory overhead for large collections.

Until such stdlib functionality exists, tuplehash provides:

```python
# Explicitly pass a non-negative int as `length` if `len(iterable)` doesn't work
def tuplehash(iterable, length=None): ...
```

So you can do this:

```python
from typing import TypeVar, Sequence
from tuplehash import tuplehash

T = TypeVar('T', covariant=True)

class MySequence(Sequence[T]):
    def __hash__(self):
        return tuplehash(self)
```

This produces the same hash as the equivalent tuple today, without the intermediate copy.

- Strict overflow behavior using custom fixed-width integer types
- Version-specific implementations matching:
  - Python 2.7/3.0-3.7: Original multiplicative hash
  - Python 3.8+: Simplified xxHash algorithm

```shell
pip install tuplehash
```

```python
from tuplehash import tuplehash

# Basic usage
assert tuplehash((1, 2, 3)) == hash((1, 2, 3))

# Works with any collection
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
assert tuplehash(Point(3, 4)) == hash(Point(3, 4))
```

| Python Version | Algorithm | Key Characteristics |
|---|---|---|
| <3.8 | Multiplicative hash | Initial value 0x345678, multiplier 1000003, length-dependent addend |
| ≥3.8 | Simplified xxHash | Single accumulator, 31-bit (64-bit builds) or 13-bit (32-bit builds) rotation, prime multiplications |
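
To make the ≥3.8 row concrete, here is a minimal sketch of that algorithm in plain Python. It is an illustration, not this package's actual code: the name `xx_tuplehash` and its `bits` parameter are hypothetical, fixed-width overflow is emulated by masking rather than with the Signed/Unsigned types, and item hashes come from the built-in `hash()`, so results only line up with `hash(tuple)` when `bits` matches the interpreter's own hash width.

```python
import sys

# xxHash primes (PRIME_1, PRIME_2, PRIME_5) for each hash width.
_PRIMES = {
    64: (11400714785074694791, 14029467366897019727, 2870177450012600261),
    32: (2654435761, 2246822519, 374761393),
}
_ROTATE = {64: 31, 32: 13}  # left-rotation amount per width


def xx_tuplehash(items, bits=sys.hash_info.width):
    """Sketch of the CPython 3.8+ tuple hash (hypothetical helper)."""
    prime1, prime2, prime5 = _PRIMES[bits]
    rot = _ROTATE[bits]
    mask = (1 << bits) - 1

    acc = prime5
    length = 0
    for item in items:
        lane = hash(item) & mask  # signed hash reinterpreted as unsigned
        acc = (acc + lane * prime2) & mask
        acc = ((acc << rot) | (acc >> (bits - rot))) & mask  # rotate left
        acc = (acc * prime1) & mask
        length += 1

    # Mix in the length; the constant 3527539 comes from the older algorithm.
    acc = (acc + (length ^ (prime5 ^ 3527539))) & mask
    if acc == mask:  # unsigned form of -1, which is reserved for errors
        return 1546275796
    # Reinterpret the unsigned accumulator as a signed hash value.
    return acc - (1 << bits) if acc >= (1 << (bits - 1)) else acc


if __name__ == "__main__":
    # Expected to print True on CPython 3.8 or later.
    print(xx_tuplehash((1, 2, 3)) == hash((1, 2, 3)))
```

Because the length is only mixed in after the loop, this sketch can count items while iterating and never needs to copy them.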

Uses custom Signed/Unsigned types to exactly replicate:

- 32-bit overflow on 32-bit platforms
- 64-bit overflow on 64-bit platforms
- All intermediate casting behaviors

Known limitation: performance overhead compared with the native implementation.
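
The fixed-width overflow and casting behavior listed above can be illustrated with two small helpers and the pre-3.8 multiplicative hash. This is a sketch under stated assumptions: `to_unsigned`, `to_signed`, and `legacy_tuplehash` are hypothetical names and are not the package's Signed/Unsigned types, whose API is not shown here. Item hashes come from the built-in `hash()`, so the result only reflects a given platform when `bits` matches the width that produced those item hashes.

```python
def to_unsigned(value, bits):
    """Wrap an int into [0, 2**bits), like a cast to an unsigned C integer."""
    return value & ((1 << bits) - 1)


def to_signed(value, bits):
    """Reinterpret the low `bits` bits as a two's-complement signed value."""
    value = to_unsigned(value, bits)
    return value - (1 << bits) if value >> (bits - 1) else value


def legacy_tuplehash(items, bits=64):
    """Sketch of the pre-3.8 multiplicative tuple hash (hypothetical helper)."""
    items = list(items)  # simplification: this algorithm needs the length up front
    remaining = len(items)
    x = 0x345678   # initial value
    mult = 1000003  # initial multiplier
    for item in items:
        remaining -= 1
        y = hash(item)  # signed item hash
        # XOR in the item hash as unsigned, then multiply with wraparound.
        x = to_unsigned((x ^ to_unsigned(y, bits)) * mult, bits)
        # The multiplier grows by a length-dependent addend each step.
        mult = to_unsigned(mult + 82520 + 2 * remaining, bits)
    x = to_unsigned(x + 97531, bits)
    if x == to_unsigned(-1, bits):  # -1 is reserved for errors
        x = to_unsigned(-2, bits)
    return to_signed(x, bits)


if __name__ == "__main__":
    print(legacy_tuplehash(()))  # 0x345678 + 97531 = 3527539
```

Every intermediate result is wrapped immediately, which mirrors how unsigned arithmetic overflows in C; only the final value is converted back to a signed integer.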

Contributions are welcome! Please submit pull requests or open issues on the GitHub repository.

This project is licensed under the MIT License.