-
-
Notifications
You must be signed in to change notification settings - Fork 32.8k
bpo-45609: Specialize STORE_SUBSCR (alternate) #29242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…nto store_subscr
Microbenchmarks are good, pyperformance has a couple of the expected speedups, especially nobody (1.21x faster!), some slowdowns that I'm guessing are just my unstable benchmarking. https://gist.github.com/sweeneyde/a3bef64bf07560c90dd4bdad820b4a07 |
🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 3245f21 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Updated numbers: https://gist.github.com/sweeneyde/e9e348414fdda6e582c814f89a266372 |
Nice speedups for I note that you have added a lot of code just for specializing |
I just got these microbenchmarks for the difference between dict[object] versus dict[str]+dict[general]:
Benchmark hidden because not significant (2): bytearray[int]=..., list[int]=... |
What about for the standard benchmark set? |
Most of pyperformance doesn't really make use of dicts, other than django_template, where 99% of dict lookups there are looking up strings that already have their hash computed (see faster-cpython/ideas#105). Running just that one benchmark, I got
|
Pyperformance under-utilizes dicts IMO, but I think dict[str] is common enough in real code that it might make sense to specialize for that extra 12% microbenchmark. But if your suggestion is to keep the complexity at bay and only have one dict[object] specialization opcode, I can do that as well. |
I would rather have just the one specialization for I realize this is just generalizing and I don't have data to back it up. Hopefully we'll start gathering hardware metrics along with timings in the future which will help us make better informed decisions on such things. |
Thanks, looks good. I'll benchmark it later. One thing from your earlier version that is probably worth doing in a future PR is the reference stealing. Something like 5ccb398#diff-b08a47ddc5bc20b2e99ac2e5aa199ca24a56b994e7bc64e918513356088c20aeR1633
and It shouldn't be that much more code if we make |
Ah I missed "in a future PR", but I don't think this is a huge change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you've added _PyDict_SetItem[_Take[s_2]]
you might as well use it elsewhere.
MAP_ADD
is the obvious candidate.
Objects/dictobject.c
Outdated
int | ||
PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value) | ||
_PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make the first argument PyDictObject *mp
. It has to be a dict, so we might as well tell the C compiler that.
Include/internal/pycore_dict.h
Outdated
@@ -22,6 +22,8 @@ typedef struct { | |||
*/ | |||
Py_ssize_t _Py_dict_lookup(PyDictObject *mp, PyObject *key, Py_hash_t hash, PyObject **value_addr); | |||
|
|||
/* Steals key and value */ | |||
int _PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that I suggested _PyDict_SetItem
but can we make it clearer that it consumes references to key
and values
.
BTW I don't like the term "steal". That suggests that the caller is unaware that the references are taken, but it has to know that. I prefer "consume", or "take". Rust uses "take", IIRC.
So how about _PyDict_SetItem_Take
, or even _PyDict_SetItem_Takes2
to show that two references are taken.
51d89f1
to
9c6860d
Compare
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 9c6860d 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
The one failing "buildbot/AMD64 Arch Linux Asan Debug PR" is on test_gdb: Click to exapnd
|
The buildbot failure should have been fixed by #29515 |
🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 9f9ef35 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
These were the latest failures... they seem unrelated to me.
|
I'll be happy to merge this, once the conflicts are fixed. |
* Specialize STORE_SUBSCR for list[int], and dict[object] * Adds _PyDict_SetItem_Take2 which consumes references to the key and values.
https://bugs.python.org/issue45609