Skip to content

Revisit HashMap memory layout #36660

Closed
Closed
@arthurprs

Description

@arthurprs

Right now the internal HashMap representation is 3 unziped arrays hhhkkkvvv, I'm experimenting with changing it to hhhkvkvkv (in further iterations kvkvkvhhh may allow inplace grow). A previous attempt is #21973

Code here https://github.com/arthurprs/hashmap2/tree/layout

The obvious drawbacks are: padding can be wasted between the key and value, keys() and values() iterations that can consume more cache. Strangely I can't reproduce the iterator regression in the benchmarks.

I fixed all problems I could find in the benchmarks making sure value is not () and that it's actually accessed by the lookup benchmarks. But hey, they're still microbenchmarks...

 name                             hashmap:: ns/iter  hhkvkv:: ns/iter  diff ns/iter   diff % 
 grow_10_000                      867,518            844,286                -23,232   -2.68% 
 grow_fnv_10_000                  442,023            412,458                -29,565   -6.69% 
 insert_100                       2,788              2,330                     -458  -16.43% 
 insert_1000                      23,146             22,025                  -1,121   -4.84% 
 insert_100_000                   4,686,760          3,767,347             -919,413  -19.62% 
 insert_10_000                    315,139            284,226                -30,913   -9.81% 
 insert_int_bigvalue_10_000       741,045            442,034               -299,011  -40.35% 
 insert_str_10_000                334,245            330,182                 -4,063   -1.22% 
 insert_string_10_000             790,355            788,964                 -1,391   -0.18% 
 iter_keys_10_000                 61,643             53,448                  -8,195  -13.29% 
 iter_values_10_000               58,991             52,606                  -6,385  -10.82% 
 iterate_10_000                   63,588             55,054                  -8,534  -13.42% 
 lookup_100_000_multi             189,238            185,422                 -3,816   -2.02% 
 lookup_100_000_multi_bigvalue    189,688            193,036                  3,348    1.77% 
 lookup_10_000_exist              145,310            130,813                -14,497   -9.98% 
 lookup_10_000_multi              152,443            141,698                -10,745   -7.05% 
 lookup_10_000_multi_bigvalue     158,438            147,839                -10,599   -6.69% 
 lookup_10_000_noexist            148,015            151,075                  3,060    2.07% 
 lookup_1_000_000_multi           137,682            133,091                 -4,591   -3.33% 
 lookup_1_000_000_multi_bigvalue  142,378            134,866                 -7,512   -5.28% 
 merge_shuffle                    1,422,998          1,331,370              -91,628   -6.44% 
 merge_simple                     40,915,514         39,468,332          -1,447,182   -3.54% 
 new                              6                  5                           -1  -16.67% 
 with_capacity_10e5               3,219              3,266                       47    1.46%

@bluss I realized I was hijacking the other thread #36567. I'm sorry.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions