Reduce size_of<HashMap> ? #69

Open
@SimonSapin

With the switch to Hashbrown, std::mem::size_of::<std::collections::HashMap<(), ()>>() on 64-bit platforms grew from 40 bytes to 56 bytes (from 24 to 40 bytes with BuildHasherDefault).
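The sizes quoted above can be checked on any toolchain with a small sketch like the following. The helper function names are made up for illustration, and the numbers they return depend on the Rust version and target, so none are hard-coded here.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use std::mem::size_of;

/// Size of a HashMap that uses the default `RandomState` hasher builder.
fn default_map_size() -> usize {
    size_of::<HashMap<(), ()>>()
}

/// Size of a HashMap whose hasher builder is the zero-sized `BuildHasherDefault`.
fn zst_hasher_map_size() -> usize {
    size_of::<HashMap<(), (), BuildHasherDefault<DefaultHasher>>>()
}

/// Size of the per-map random seed state stored by the default hasher builder.
fn hasher_state_size() -> usize {
    size_of::<RandomState>()
}
```

Printing `default_map_size()` and `zst_hasher_map_size()` side by side shows how much of the map is table bookkeeping and how much is hasher state; the difference between the two should match `hasher_state_size()`.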

In Servo’s DOM implementation we have types whose size_of is several hundred bytes. Because some not-so-unusual pages can have very many DOM nodes, this size can add up to significant memory usage. We have unit tests for size_of to ensure it does not accidentally grow. These tests fail in today’s Rust Nightly because several types contain a HashMap and therefore grew by 16 bytes.

Hashbrown’s HashMap contains a RawTable which has five pointer-sized fields:

hashbrown/src/raw/mod.rs

Lines 328 to 348 in 7e79b0c

/// A raw hash table with an unsafe API.
pub struct RawTable<T> {
    // Mask to get an index from a hash value. The value is one less than the
    // number of buckets in the table.
    bucket_mask: usize,
    // Pointer to the array of control bytes
    ctrl: NonNull<u8>,
    // Pointer to the array of buckets
    data: NonNull<T>,
    // Number of elements that can be inserted before we need to grow the table
    growth_left: usize,
    // Number of elements in the table, only really used by len()
    items: usize,
    // Tell dropck that we own instances of T.
    marker: PhantomData<T>,
}

Some of them seem potentially redundant, but I’m not sure. For example there are two pointers that seem to be in the same allocation. How expensive would it be to re-compute the second pointer every time it is needed?
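To give a rough idea of what re-computing the second pointer would involve, here is a minimal sketch. It assumes a single allocation that stores the control bytes first and the buckets after them, padded to the alignment of T. That ordering, the `GROUP_WIDTH` constant, and the `data_offset` helper are all assumptions for illustration, not hashbrown's actual code.

```rust
use std::alloc::Layout;

/// Hypothetical number of extra trailing control bytes (assumption for illustration).
const GROUP_WIDTH: usize = 16;

/// Returns the layout of the combined allocation and the byte offset at which
/// the bucket array would start, assuming control bytes come first.
/// Returns `None` if the sizes overflow.
fn data_offset<T>(bucket_mask: usize) -> Option<(Layout, usize)> {
    let buckets = bucket_mask.checked_add(1)?;
    // Control bytes: one per bucket plus the assumed trailing group.
    let ctrl = Layout::array::<u8>(buckets.checked_add(GROUP_WIDTH)?).ok()?;
    // Bucket array: `buckets` values of type T.
    let data = Layout::array::<T>(buckets).ok()?;
    // `extend` pads the control bytes up to the alignment of T and reports
    // the offset at which the bucket array begins.
    ctrl.extend(data).ok()
}
```

Under these assumptions the data pointer is the control pointer plus an offset that costs an addition and a round-up to an alignment boundary. Whether that is cheap enough for lookup-heavy code paths is something only a benchmark could settle.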

Are bucket_mask, growth_left, and items (the value returned by len()) related such that one of them could be computed from the others?
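One way to reason about this question is a toy model of the counters. The sketch below is not hashbrown's code: it assumes a maximum load factor of 7/8 for tables of eight or more buckets, and it assumes that removed entries can leave tombstones that still occupy a slot until the next rehash. The `Counters` type and its `tombstones` field are hypothetical.

```rust
/// Toy model of the table counters (hypothetical, for illustration only).
struct Counters {
    /// One less than the number of buckets.
    bucket_mask: usize,
    /// Number of live elements.
    items: usize,
    /// Assumed count of deleted slots not yet reclaimed.
    tombstones: usize,
}

impl Counters {
    /// Assumed capacity rule: small tables keep one slot free,
    /// larger tables fill up to 7/8 of their buckets.
    fn capacity(&self) -> usize {
        if self.bucket_mask < 8 {
            self.bucket_mask
        } else {
            (self.bucket_mask + 1) / 8 * 7
        }
    }

    /// Insertions still possible before the table must grow or rehash.
    fn growth_left(&self) -> usize {
        self.capacity()
            .saturating_sub(self.items)
            .saturating_sub(self.tombstones)
    }
}
```

In this model growth_left follows from bucket_mask and items only while there are no tombstones. Once tombstones exist, some third counter is still needed, so dropping one field would likely mean either tracking tombstones instead or changing how removal works.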
