Order of dict can change after serialization (make Dict ordered?) #34265

Open
@oxinabox

@bkamins pointed out on slack:

julia> using Serialization;
julia> d = Dict{Symbol, Vector{Int}}(Symbol.('a':'z') .=> Ref([1]));
julia> serialize("test.bin", d);
julia> d2 = deserialize("test.bin");
julia> hcat(collect.(keys.([d, d2]))...)
26×2 Array{Symbol,2}:
 :o  :j
 :b  :x
 :p  :d
 :n  :k
 :j  :g
 :e  :u
 :c  :r
 :h  :a
 :l  :m
 :w  :y
 :x  :i
 :d  :o
 :k  :b
 :s  :p
 :v  :n
 :g  :e
 :u  :c
 :q  :h
 :r  :l
 :z  :w
 :a  :s
 :f  :v
 :m  :z
 :y  :q
 :i  :f
 :t  :t
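
The same check can be run without touching the disk. The following is a minimal sketch, assuming an in-memory `IOBuffer` behaves like the file round trip above; the names `same_contents` and `same_order` are introduced here for illustration only. No expected output is given for the order comparison, since the result may depend on the Julia version.

```julia
using Serialization

# Same dictionary as in the report: 26 Symbol keys, all mapped to [1].
d = Dict{Symbol, Vector{Int}}(Symbol.('a':'z') .=> Ref([1]))

# Round-trip through an in-memory buffer instead of a file on disk.
io = IOBuffer()
serialize(io, d)
seekstart(io)
d2 = deserialize(io)

# Dict equality ignores iteration order, so the contents should match...
same_contents = d == d2
# ...while the key order may or may not match, depending on the Julia version.
same_order = collect(keys(d)) == collect(keys(d2))

println("same contents:  ", same_contents)
println("same key order: ", same_order)
```

Comparing `d == d2` separately from the key order makes it clear that only the iteration order is at stake here, not the stored data.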

But it doesn't have to be this way.

If we redefine serialization so that it records how many slots the Dict had,
then the deserialized Dict comes out in the same order as it went in.


using Serialization

# Number of slots to reserve on deserialization.
# For a Dict, the keys array has one entry per slot.
hintsize(dict::AbstractDict) = length(dict)
hintsize(dict::Dict) = length(dict.keys)

function Serialization.deserialize_dict(s::AbstractSerializer, T::Type{<:AbstractDict})
    n = read(s.io, Int32)   # number of entries
    sz = read(s.io, Int32)  # size hint written by serialize_dict_data
    t = T()
    sizehint!(t, sz)
    Serialization.deserialize_cycle(s, t)
    for i = 1:n
        k = deserialize(s)
        v = deserialize(s)
        t[k] = v
    end
    return t
end

function Serialization.serialize_dict_data(s::AbstractSerializer, d::AbstractDict)
    write(s.io, Int32(length(d)))
    # Use hintsize rather than d.slots, which only exists for Dict.
    write(s.io, Int32(hintsize(d)))
    for (k,v) in d
        serialize(s, k)
        serialize(s, v)
    end
end

But this is annoying because it changes the serialization format.
I would rather change sizehint!(::Dict) or how we call it.
The problem is that sizehint!(Dict(), 26) gives it 32 slots,
but the original d had 64 slots.
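
The slot mismatch can be inspected directly. This is a rough sketch that reads the internal `slots` field of `Dict`, which is not public API; the variable names are illustrative, and the printed counts depend on the Julia version, so no expected values are shown. When the two counts differ, the same hashes land in different slot positions, which is why the iteration order changes.

```julia
# NOTE: `slots` is an internal field of `Dict`, not public API. Its type and
# the counts printed below may differ between Julia versions.
d = Dict{Symbol, Vector{Int}}(Symbol.('a':'z') .=> Ref([1]))

# An empty Dict that is only given a size hint, as deserialization does.
hinted = Dict{Symbol, Vector{Int}}()
sizehint!(hinted, length(d))

slots_original = length(d.slots)
slots_hinted = length(hinted.slots)

println("slots in d:                  ", slots_original)
println("slots after sizehint!(_, 26): ", slots_hinted)
println("slot counts match:            ", slots_original == slots_hinted)
```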


In Python this was one of the things that really caught me out,
because Python salts its hashes with a random salt selected each time it starts.
But Julia doesn't.

Labels: collections (data structures holding multiple items, e.g. sets), needs decision (a decision on this change is needed)
