jifengwu2k/tuplehash

tuplehash

Pure Python reimplementation of CPython's tuplehash function with exact overflow behavior, designed for hashing user-defined immutable sequences.

The Current Pain Point

Currently in Python, making a custom immutable sequence hash like a tuple means converting it to a tuple first:

from typing import TypeVar, Sequence

T = TypeVar('T', covariant=True)


class MySequence(Sequence[T]):
    def __hash__(self):
        return hash(tuple(self))

This builds a temporary tuple holding every element just to compute one hash, which costs O(n) extra memory for large collections.
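To make that cost concrete, the sketch below compares a lazy `range` with the temporary tuple that `hash(tuple(self))` would have to build. Here `range` is only a stand-in for a custom sequence, and the printed sizes depend on the platform.

```python
import struct
import sys

n = 1000000
pointer_size = struct.calcsize("P")  # bytes per object pointer on this build

lazy = range(n)          # stores only start/stop/step
temporary = tuple(lazy)  # stores one pointer per element

# The tuple's own footprint grows linearly with n; the range's does not.
print(sys.getsizeof(lazy))
print(sys.getsizeof(temporary))
```

`sys.getsizeof` counts only the container itself, not the elements it refers to, so the real peak memory of the temporary tuple can be higher still.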

The standard library has no function that computes a tuple-compatible hash directly from an iterable. Until it does, tuplehash provides:

# Explicitly pass a non-negative int as `length` if `len(iterable)` doesn't work
def tuplehash(iterable, length=None): ...

So you can do this:

from typing import TypeVar, Sequence

from tuplehash import tuplehash

T = TypeVar('T', covariant=True)


class MySequence(Sequence[T]):
    def __hash__(self):
        return tuplehash(self)

This returns the same value as hash(tuple(self)) without building the intermediate tuple.

Features

Installation

pip install tuplehash

Usage

from tuplehash import tuplehash

# Basic usage
assert tuplehash((1, 2, 3)) == hash((1, 2, 3))

# Works with any sized iterable, for example a namedtuple
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
assert tuplehash(Point(3, 4)) == hash(Point(3, 4))
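The `length` argument can be illustrated with an iterable that has no `len()`, such as a generator. This sketch assumes only the signature given earlier, `tuplehash(iterable, length=None)`. The `ImportError` fallback is a hypothetical stand-in, not part of the package, so that the snippet still runs where the package is not installed.

```python
try:
    from tuplehash import tuplehash
except ImportError:
    # Hypothetical stand-in, NOT the package's implementation: it builds the
    # tuple, which is exactly what the real function is meant to avoid.
    def tuplehash(iterable, length=None):
        return hash(tuple(iterable))


def squares(n):
    """Lazily yield 0, 1, 4, ... without storing the values."""
    for i in range(n):
        yield i * i


n = 5
# A generator has no len(), so the element count is passed explicitly.
lazy_hash = tuplehash(squares(n), length=n)
eager_hash = hash(tuple(squares(n)))
print(lazy_hash == eager_hash)
```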

Implementation Details

Version-Specific Algorithms

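| Python Version | Algorithm | Key Characteristics |
| --- | --- | --- |
| <3.8 | Multiplicative hash | Initial value 0x345678, multiplier 1000003 (incremented by a length-dependent amount after each element), constant addend 97531 at the end |
| ≥3.8 | Simplified xxHash | Single accumulator, rotation by 31 bits (64-bit builds) or 13 bits (32-bit builds), prime multiplications |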
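As a rough illustration of the ≥3.8 entry, here is a minimal sketch of the xxHash-style accumulator. It is written from CPython's tuple hashing scheme rather than taken from this package. The name `xx_tuplehash_sketch` and the use of integer masking are assumptions for illustration; the package itself uses its own Signed/Unsigned types.

```python
import sys


def xx_tuplehash_sketch(items):
    """Illustrative xxHash-style tuple hash for a sized iterable (Python >= 3.8)."""
    width = sys.hash_info.width  # 64 or 32 bits, depending on the interpreter build
    mask = (1 << width) - 1
    if width == 64:
        prime1, prime2, prime5, rot = (
            11400714785074694791, 14029467366897019727, 2870177450012600261, 31)
    else:
        prime1, prime2, prime5, rot = 2654435761, 2246822519, 374761393, 13

    acc = prime5
    for item in items:
        lane = hash(item) & mask                              # element hash as unsigned
        acc = (acc + lane * prime2) & mask                    # mix the lane in
        acc = ((acc << rot) | (acc >> (width - rot))) & mask  # rotate left by rot bits
        acc = (acc * prime1) & mask
    acc = (acc + (len(items) ^ (prime5 ^ 3527539))) & mask    # fold in the length
    if acc == mask:                                           # -1 is reserved for errors
        return 1546275796
    # Reinterpret the unsigned accumulator as a signed hash value.
    return acc - (1 << width) if acc >> (width - 1) else acc


sample = (1, "two", (3.0, None))
print(xx_tuplehash_sketch(sample) == hash(sample))
```

The comparison against the built-in `hash` is only meaningful on Python 3.8 or later, where tuples use this scheme.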

Overflow Handling

Uses custom Signed/Unsigned types to exactly replicate:

  • 32-bit overflow on 32-bit platforms
  • 64-bit overflow on 64-bit platforms
  • All intermediate casting behaviors
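One way to picture this overflow handling is plain integer masking: wrap every intermediate result to the platform width, then reinterpret the final value as signed. The sketch below applies that idea to the pre-3.8 multiplicative algorithm. The helper names `wrap_unsigned`, `to_signed`, and `legacy_tuplehash_sketch` are hypothetical and are not the package's Signed/Unsigned types.

```python
def wrap_unsigned(value, width):
    """Truncate to `width` bits, like arithmetic on a C unsigned integer."""
    return value & ((1 << width) - 1)


def to_signed(value, width):
    """Reinterpret an unsigned `width`-bit value as two's-complement signed."""
    return value - (1 << width) if value >> (width - 1) else value


def legacy_tuplehash_sketch(items, width=64):
    """Pre-3.8 multiplicative tuple hash for a sized sequence (illustrative).

    Element hashes come from the running interpreter, so a `width` other than
    the native one is only faithful for elements such as small ints whose
    hash does not depend on the platform.
    """
    remaining = len(items)
    x = 0x345678
    mult = 1000003
    for item in items:
        remaining -= 1
        y = wrap_unsigned(hash(item), width)
        x = wrap_unsigned((x ^ y) * mult, width)
        # The multiplier grows by an amount that depends on the remaining length.
        mult = wrap_unsigned(mult + 82520 + 2 * remaining, width)
    x = wrap_unsigned(x + 97531, width)
    if x == wrap_unsigned(-1, width):  # -1 is reserved for errors
        return -2
    return to_signed(x, width)


print(legacy_tuplehash_sketch((1, 2, 3), width=64))  # 2528502973977326415
print(legacy_tuplehash_sketch((1, 2, 3), width=32))  # -378539185
```

The two printed values differ only because of the wrap width, which is why the overflow behavior has to be reproduced exactly for each platform.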

Limitations

  • Slower than the native C implementation, because every arithmetic step runs in pure Python

Contributing

Contributions are welcome! Please submit pull requests or open issues on the GitHub repository.

License

This project is licensed under the MIT License.
