Skip to content

Structural hash collisions #57284

Open
Open
@bbrehm

Description

@bbrehm

There are some structural hash collisions in julia:

julia> x,y,z = rand(3); h0=rand(UInt); hash((x,y,(z,)), h0) == hash((x,(y,z,)),h0)
true

julia> x,y,z = rand(3); h0=rand(UInt); hash(x=>(y=>z), h0) == hash((x=>y)=>z,h0)
true

Companion discourse thread: https://discourse.julialang.org/t/structural-hash-collisions/125613

If #57235 goes forward, we should consider addressing this as well.

I'm not sure whether this is a big problem, but it is an interesting peculiarity of the design.

Note that the general structure is such that many hash_uint64 invocations can run in parallel on the superscalar CPU, because there are no data dependencies for many types. Before we decide whether to address that issue, @StefanKarpinski should probably elaborate on how intentional that hidden parallelism is.

It might be reasonable to keep this issue open as WONTFIX indefinitely, for better discoverability if somebody gets bitten by similar structural collisions that happen in real life.

Metadata

Metadata

Assignees

No one assigned

    Labels

    collectionsData structures holding multiple items, e.g. setshashing

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions