
Replace pair_list with hash table #1128

Draft · wants to merge 29 commits into master from ht
Conversation

@asvetlov (Member) commented Apr 5, 2025

No description provided.

codspeed-hq bot commented Apr 7, 2025

CodSpeed Performance Report

Merging #1128 will degrade performance by 31.97%

Comparing ht (aa786a7) with master (1c5d240)

Summary

⚡ 48 improvements
❌ 12 regressions
✅ 184 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| test_cimultidict_add_istr[c-extension-module] | 2.6 ms | 3.6 ms | -26.59% |
| test_cimultidict_delitem_istr[c-extension-module] | 86.1 µs | 66.2 µs | +30.17% |
| test_cimultidict_extend_istr[c-extension-module] | 2.4 ms | 3.3 ms | -27.01% |
| test_cimultidict_extend_istr_with_kwargs[c-extension-module] | 6.5 ms | 8.1 ms | -20.14% |
| test_cimultidict_fetch_istr[c-extension-module] | 57.9 µs | 46.9 µs | +23.3% |
| test_cimultidict_get_istr_hit[c-extension-module] | 69.6 µs | 58.4 µs | +19.11% |
| test_cimultidict_get_istr_hit_with_default[c-extension-module] | 71.6 µs | 60.4 µs | +18.47% |
| test_cimultidict_get_istr_miss[c-extension-module] | 73.2 µs | 48.2 µs | +51.94% |
| test_cimultidict_get_istr_with_default_miss[c-extension-module] | 75.3 µs | 50.2 µs | +50.01% |
| test_cimultidict_insert_istr[c-extension-module] | 65.1 µs | 50.3 µs | +29.57% |
| test_cimultidict_update_istr[c-extension-module] | 128.5 µs | 49 µs | ×2.6 |
| test_cimultidict_update_istr_with_kwargs[c-extension-module] | 292 µs | 152.7 µs | +91.25% |
| test_create_cimultidict_with_dict_istr[c-extension-module] | 40.2 µs | 45.9 µs | -12.43% |
| test_create_cimultidict_with_items_istr[c-extension-module] | 51.5 µs | 46.3 µs | +11.15% |
| test_create_multidict_with_dict[case-sensitive-c-extension-module] | 36.4 µs | 41.9 µs | -13.17% |
| test_create_multidict_with_items[case-sensitive-c-extension-module] | 48 µs | 42.6 µs | +12.54% |
| test_multidict_add_str[case-insensitive-c-extension-module] | 5.5 ms | 6.5 ms | -15.21% |
| test_multidict_add_str[case-sensitive-c-extension-module] | 2 ms | 3 ms | -31.97% |
| test_multidict_delitem_str[case-insensitive-c-extension-module] | 111 µs | 91.2 µs | +21.71% |
| test_multidict_delitem_str[case-sensitive-c-extension-module] | 77.2 µs | 57.4 µs | +34.43% |
| ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@asvetlov (Member, Author) commented Apr 7, 2025

Heh. Appending new values to the multidict is more expensive; all other ops are faster.

We have a tradeoff, as usual. I think that lookup is a much more frequent operation than filling the multidict.
Also, please keep in mind that item replacement/deletion also requires a lookup.

@asvetlov (Member, Author) commented Apr 7, 2025

The PR is more or less done, but I'd like to do some polishing and a self-review later.
Please don't merge it yet: I'll be only partially available next month, so I cannot commit to having enough time to fix issues when the new version is released.

Careful testing is appreciated!

The new multidict is close to Python's dict, except for supporting multiple equal keys, of course.

It starts from an empty hashtable, which grows by powers of 2 starting from 8: 8, 16, 32, 64, 128, ...
The number of usable items is 2/3 of the hashtable size (the remaining 1/3 of the entry table is never allocated).

The table is resized when needed, and bulk updates (extend(), update(), and constructor calls) pre-allocate space for many items at once, reducing the number of potential hashtable resizes.
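As a rough pure-Python sketch of the sizing rule above (hypothetical helper names, not the actual C code):

```python
def usable(table_size: int) -> int:
    # At most 2/3 of the slots are ever used; the entry table is sized
    # to this fraction, so the last 1/3 is never allocated.
    return (table_size * 2) // 3

def table_size_for(n_items: int) -> int:
    # Smallest power-of-two table (>= 8) that can hold n_items.  A bulk
    # update can call this once up front instead of resizing step by step.
    size = 8
    while usable(size) < n_items:
        size *= 2
    return size
```

For example, a table of size 8 holds up to 5 items, so extending with 6 items would pre-allocate a size-16 table in one step.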

Item deletion puts the special DKIX_DUMMY index into the hashtable. Unlike the standard dict, DKIX_DUMMY is never replaced with the index of a new entry, except during a rebuild of the hashtable indices. This keeps the insertion order for multiple equal keys.
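A minimal pure-Python sketch of that tombstone behavior (hypothetical names and a simplified probe; the real implementation is a C extension): deleted slots become DUMMY and are never reclaimed by add(), so duplicate keys always appear in insertion order.

```python
EMPTY, DUMMY = -1, -2

class TinyMultiTable:
    def __init__(self, size=8):          # size must be a power of two
        self.indices = [EMPTY] * size    # index table over the entries
        self.entries = []                # (hash, key, value), append-only

    def _probe(self, h):
        # Full-cycle probe over a power-of-two table (no perturb term).
        mask = len(self.indices) - 1
        i = h & mask
        while True:
            yield i
            i = (i * 5 + 1) & mask

    def add(self, key, value):
        h = hash(key)
        for i in self._probe(h):
            if self.indices[i] == EMPTY:     # DUMMY slots are never reused
                self.indices[i] = len(self.entries)
                self.entries.append((h, key, value))
                return

    def getall(self, key):
        h = hash(key)
        out = []
        for i in self._probe(h):
            ix = self.indices[i]
            if ix == EMPTY:                  # end of the probe chain
                return out
            if ix != DUMMY and self.entries[ix][:2] == (h, key):
                out.append(self.entries[ix][2])

    def remove_first(self, key):
        h = hash(key)
        for i in self._probe(h):
            ix = self.indices[i]
            if ix == EMPTY:
                raise KeyError(key)
            if ix != DUMMY and self.entries[ix][:2] == (h, key):
                self.indices[i] = DUMMY      # tombstone, never reclaimed
                self.entries[ix] = None
                return
```

Because add() only claims EMPTY slots, a value added after a deletion lands later in the probe chain than the surviving duplicates, preserving their relative order.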

Iteration for operations like getall() is a little tricky: the next-index calculation could return an already visited index before reaching the end. To eliminate duplicates, the code marks already visited entries by setting entry->hash = -1; -1 is an invalid hash value, so it can serve as a marker. After the iteration finishes, all marked entries are restored.
The double iteration over the indices still has O(1) amortized time per step, which is fine.
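The mark-and-restore trick can be sketched like this (a simplified stand-in: probe_indices plays the role of the real next-index iteration and may yield the same index more than once):

```python
def getall_dedup(entries, probe_indices, h, key):
    # entries: list of (hash, key, value) tuples or None for deleted slots.
    found, marked = [], []
    for ix in probe_indices:             # may revisit an index
        entry = entries[ix]
        if entry is None or entry[0] == -1:
            continue                     # deleted slot, or already visited
        eh, ek, ev = entry
        if eh == h and ek == key:
            found.append(ev)
            entries[ix] = (-1, ek, ev)   # mark as visited (-1 = invalid hash)
            marked.append((ix, entry))
    for ix, entry in marked:             # second pass: restore real hashes
        entries[ix] = entry
    return found
```

Revisited entries are skipped by the hash == -1 check, and the restore pass leaves the table exactly as it was before the call.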

.add(), val = md[key], md[key] = val, and md.setdefault() are all O(1).
.getall() / .popall() are O(N), where N is the number of returned items.
.update() / .extend() are O(N+M), where N and M are the numbers of items in the left and right arguments (we had quadratic time here before).
popitem() is slightly slower because the function has to calculate the index of the last (deleted) entry. Since the method is really rare, I don't care too much.

.copy() is super fast; constructing a multidict from another multidict could be optimized to reuse the .copy() approach.

The performance is okay: multidict creation is slightly slower because the hashtable has to be recalculated and the indices table rebuilt, but all other operations are faster.

The multidict is still slightly slower than the regular dict because I don't want to use the private API for accessing internal Python structures, the most notable being the string's cached hash.

TODO:

  1. Optimize MultiDict(md) construction.
  2. GC-untrack the multidict if all keys/values are untracked. In aiohttp, values are usually untracked str instances.

Open question: should we use HT_PERTURB_SHIFT in the index calculation? It is crucial for storing integers, where hash(num) == num, but I doubt it decreases the number of collisions for str keys. While working on the PR, I saw many cases where idx = next(idx) didn't change the current idx, or the sequence looked like 1, 2, 1, 3 with a relatively high chance of duplication.
It may not be worth changing; I have no idea how to prove this except by calculating collision statistics on a large enough number of keys.
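For reference, a CPython-style perturbed probe looks roughly like this (PERTURB_SHIFT = 5, as in CPython's dict; whether the perturb term actually helps for str keys here is exactly the open question):

```python
PERTURB_SHIFT = 5

def probe_with_perturb(h, mask):
    # The perturb term gradually mixes the high bits of the hash into the
    # index, so two hashes that share low bits diverge after a few steps.
    perturb = h
    i = h & mask
    while True:
        yield i
        perturb >>= PERTURB_SHIFT
        i = (i * 5 + perturb + 1) & mask
```

For example, 0x10 and 0x1010 collide on the first slot of a size-8 table (both have low bits 0), but their probe sequences diverge once the high bits are shifted in; without the perturb term they would probe identical sequences.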

If anybody wants to play with the code, two things could help debugging.

  1. Uncomment the # CFLAGS = ["-O0", "-g3", "-UNDEBUG"] line in setup.py to make a debugging build. It enables asserts and the ASSERT_CONSISTENT(...) self-consistency check.
  2. The _ht_dump(...) function prints the current hashtable in a form useful for analyzing the internal structure.

Please feel free to experiment with the code and ask any questions.
