bpo-45829: Specialize BINARY_SUBSCR for __getitem__ implemented in Python.#29592
Conversation
brandtbucher
left a comment
There was a problem hiding this comment.
This is really cool! A few notes:
| @@ -0,0 +1,2 @@ | |||
| Specialize BINARY_SUBSCR for classes with a ``__getitem__`` method | |||
There was a problem hiding this comment.
| Specialize BINARY_SUBSCR for classes with a ``__getitem__`` method | |
| Specialize :opcode:`BINARY_SUBSCR` for classes with a ``__getitem__`` method |
Python/ceval.c
Outdated
| assert(cache->adaptive.original_oparg == 0); | ||
| oparg = 0; |
There was a problem hiding this comment.
Is this just to clarify that the oparg is still unused (even though we use cache entries)?
Python/specialize.c
Outdated
| int flags = code->co_flags; | ||
| if (flags & (CO_GENERATOR | CO_COROUTINE | CO_ASYNC_GENERATOR)) { | ||
| return SPEC_FAIL_GENERATOR; | ||
| return -1; |
There was a problem hiding this comment.
| return -1; |
Python/specialize.c
Outdated
| if (code->co_nfreevars) { | ||
| return SPEC_FAIL_FREE_VARS; | ||
| } | ||
| return 0; |
There was a problem hiding this comment.
I don't really like this, since 0 is a valid failure code. Perhaps return -1 on success and check for that instead (it's sort of a weird interface, but whatever).
There was a problem hiding this comment.
I'll add a success code, for clarity.
In many other C programs, yes 0 is a failure.
However, in CPython, 0 is usually a success, as -1 (or any negative number in some cases) is a failure.
E.g. PyObject_RichCompareBool return 0 for False, and -1 for an error.
compile.c is an unfortunate counter example, where some functions return 0 for a failure and others return 0 for a success.
| _Py_IDENTIFIER(__getitem__); | ||
|
|
||
| static int | ||
| function_kind(PyCodeObject *code) { |
There was a problem hiding this comment.
| function_kind(PyCodeObject *code) { | |
| function_spec_fail_kind(PyCodeObject *code) { |
| MISS_WITH_CACHE(CALL_FUNCTION) | ||
| MISS_WITH_CACHE(BINARY_OP) | ||
| MISS_WITH_OPARG_COUNTER(BINARY_SUBSCR) | ||
| MISS_WITH_CACHE(BINARY_SUBSCR) |
There was a problem hiding this comment.
I'm guessing it's still worth keeping the logic for oparg counters, just in case we implement a new family that can uses it in the future?
There was a problem hiding this comment.
Might as well delete it. It's in the git history.
Fidget-Spinner
left a comment
There was a problem hiding this comment.
I have the same questions as Brandt, but otherwise this LGTM.
While not in pyperformance, I'm excited for code using typing, since that module heavily uses __getitem__ (and __class_getitem__). Awesome!
Type hints are run-once code (at least they should be) so won't be affected. |
Agreed that it's true for the vast majority. But recently I noticed variable annotations sometimes appear in hot code paths and they might be affected. |
Specializes
BINARY_SUBSCRfor classes with a__getitem__method implemented in Python by making the call directly in the instruction using the same mechanism asCALL_FUNCTION_PY_SIMPLE.A respectable 1% speedup including a 20% speedup for one benchmark that makes heavy use of
__getitem__.https://bugs.python.org/issue45829