Description
Currently all references to objects in frame objects use _PyStackRef instead of PyObject *.
This is necessary for the free-threaded build to support deferred references.
For the default build _PyStackRef is just an alias for PyObject *.
We should change _PyStackRef to use proper tagged pointers in the default build for two important reasons:
- It would reduce the maintenance burden of tagged pointers if they worked the same way in both builds.
- It offers a lot of optimization potential. The overhead of reference counting operations is large, and tagged pointers will allow us to reduce that overhead considerably.
My initial implementation is 0.8% slower, although I'd like to get that closer to 0 before merging anything. There is some speedup in the GC due to streamlined immortality checks, and some slowdown due to increased overhead of turning new PyObject * references into _PyStackRefs.
This small slowdown should enable a large speedup (maybe more than 5%), because it lets us do the following:
- Reduce the overhead of refcount operations by using tagged references for the majority of LOAD_ instructions in the interpreter.
- Completely eliminate many decref operations by tracking which references are tagged in the JIT.
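As a rough illustration of the first point, the sketch below contrasts a load that takes a counted reference with a load that only sets the low tag bit. All names here (lobj_sketch, lref_sketch, sketch_load_counted, sketch_load_tagged) are hypothetical and are not CPython APIs. The sketch also assumes the local variable keeps the object alive for as long as the tagged copy is on the stack, a condition the real interpreter would have to guarantee.

```c
#include <stdint.h>

/* Hypothetical stand-ins for PyObject and _PyStackRef. */
typedef struct { long refcnt; } lobj_sketch;
typedef struct { uintptr_t bits; } lref_sketch;

/* Conventional load: the stack gets its own counted reference,
 * which costs a memory write unless the source is already tagged. */
static inline lref_sketch sketch_load_counted(lref_sketch local)
{
    if (!(local.bits & 1)) {
        lobj_sketch *obj = (lobj_sketch *)(local.bits & ~(uintptr_t)3);
        obj->refcnt++;
    }
    return local;
}

/* Tagged load (assumption): set the low bit so that later dup/close
 * operations skip the refcount. The object itself is never touched. */
static inline lref_sketch sketch_load_tagged(lref_sketch local)
{
    return (lref_sketch){ local.bits | 1 };
}
```

Setting the low bit maps tag 00 to 01 and leaves 01 and 11 unchanged, so the result always passes the single-bit check.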
The tagging scheme:
Tag | Meaning |
---|---|
00 | Normal pointers |
01 | Pointers with embedded reference count |
10 | Unused |
11 | Pointer to immortal object [1] (including NULL) |
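
The table can be read as a two-bit field in the low bits of the pointer. The following is a minimal sketch of that encoding, assuming objects are aligned to at least 4 bytes so the two low bits are free. The type and function names (stackref_sketch, sketch_tag, sketch_pointer, sketch_make) are illustrative only and are not the names used in CPython.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for _PyStackRef: a pointer-sized bit pattern. */
typedef struct { uintptr_t bits; } stackref_sketch;

enum {
    TAG_NORMAL   = 0x0,  /* 00: normal pointer */
    TAG_EMBEDDED = 0x1,  /* 01: pointer with embedded reference count */
    TAG_UNUSED   = 0x2,  /* 10: unused */
    TAG_IMMORTAL = 0x3,  /* 11: immortal object, including NULL */
    TAG_MASK     = 0x3
};

/* Extract the two tag bits. */
static inline unsigned sketch_tag(stackref_sketch ref)
{
    return (unsigned)(ref.bits & TAG_MASK);
}

/* Strip the tag bits to recover the object pointer. */
static inline void *sketch_pointer(stackref_sketch ref)
{
    return (void *)(ref.bits & ~(uintptr_t)TAG_MASK);
}

/* Combine a pointer and a tag; the pointer's low bits must be clear. */
static inline stackref_sketch sketch_make(void *ptr, unsigned tag)
{
    assert(((uintptr_t)ptr & TAG_MASK) == 0);
    assert(tag <= TAG_MASK);
    return (stackref_sketch){ (uintptr_t)ptr | tag };
}

/* NULL is represented as an immortal reference: all pointer bits zero, tag 11. */
static const stackref_sketch SKETCH_NULL = { TAG_IMMORTAL };
```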
This tagging scheme is chosen as it provides the best performance for the most common operations:
- PyStackRef_DUP: Can check to see if the object's reference count needs updating with a single check and no memory read: ptr & 1
- PyStackRef_CLOSE: As for PyStackRef_DUP, only a single bit check is needed
- PyStackRef_XCLOSE: Since NULL is treated as immortal and tagged, this is the same as PyStackRef_CLOSE.
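
The three operations above can be sketched as follows. This is not the CPython implementation. It assumes that a set low bit (tags 01 and 11) means the object's reference count field is left alone, as the ptr & 1 check suggests. The names obj_sketch, ref_sketch, sketch_dup, sketch_close and sketch_xclose are hypothetical, and deallocation is reduced to setting a flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PyObject and _PyStackRef. */
typedef struct { long refcnt; int dead; } obj_sketch;
typedef struct { uintptr_t bits; } ref_sketch;

#define SKETCH_TAG_MASK ((uintptr_t)3)

static inline obj_sketch *sketch_obj(ref_sketch r)
{
    return (obj_sketch *)(r.bits & ~SKETCH_TAG_MASK);
}

/* Duplicate a reference: one bit test on the reference itself decides
 * whether the object needs to be touched at all. */
static inline ref_sketch sketch_dup(ref_sketch r)
{
    if (!(r.bits & 1)) {
        sketch_obj(r)->refcnt++;
    }
    return r;
}

/* Release a reference: the same single bit test. */
static inline void sketch_close(ref_sketch r)
{
    if (!(r.bits & 1)) {
        obj_sketch *o = sketch_obj(r);
        if (--o->refcnt == 0) {
            o->dead = 1;  /* a real implementation would deallocate here */
        }
    }
}

/* NULL carries tag 11, so the low bit is set and no NULL check is needed. */
static inline void sketch_xclose(ref_sketch r)
{
    sketch_close(r);
}
```

Because NULL is encoded with the low bit set, sketch_xclose never dereferences it, which is why the separate NULL test of a conventional XDECREF disappears.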
Maintaining the invariant that tag 11 is used for all immortal objects is a bit expensive, but can be mitigated by pushing the conversion from PyObject * to _PyStackRef down to a point where it is known whether an object is newly created or not.
For newly created objects PyStackRef_FromPyObjectStealMortal can be used, which performs no immortality check.
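
The difference between the two conversions can be sketched like this. The sketch assumes a simplified immortality test (a sentinel refcount value); CPython's actual check differs. The names pyobj_sketch, sref_sketch, sketch_from_obj_steal and sketch_from_obj_steal_mortal are hypothetical stand-ins, not the real functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PyObject and _PyStackRef. */
typedef struct { uint32_t refcnt; } pyobj_sketch;
typedef struct { uintptr_t bits; } sref_sketch;

/* Assumption: immortality is marked by a sentinel refcount value. */
#define SKETCH_IMMORTAL_REFCNT UINT32_MAX
#define SKETCH_TAG_IMMORTAL ((uintptr_t)3)

static inline bool sketch_is_immortal(const pyobj_sketch *o)
{
    return o->refcnt == SKETCH_IMMORTAL_REFCNT;
}

/* General conversion: has to read the object to pick the tag, which is
 * the extra cost of maintaining the "immortal means tag 11" invariant. */
static inline sref_sketch sketch_from_obj_steal(pyobj_sketch *o)
{
    uintptr_t tag = sketch_is_immortal(o) ? SKETCH_TAG_IMMORTAL : 0;
    return (sref_sketch){ (uintptr_t)o | tag };
}

/* Fast path: the caller knows the object was just created and is
 * therefore mortal, so no memory read is needed in release builds. */
static inline sref_sketch sketch_from_obj_steal_mortal(pyobj_sketch *o)
{
    assert(!sketch_is_immortal(o));
    return (sref_sketch){ (uintptr_t)o };
}
```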
[1] More precisely, "immortal object" here means any object that was immortal when the reference was created. References to objects that are made immortal after the reference is created would have the low bits set to 00 or 01. This is OK as immortal refcounts have a huge margin of error and the number of possible references to one of these immortal objects is very small.
Linked PRs
- GH-127705: Use _PyStackRefs in the default build. #127875
- GH-127705: Add debug mode for _PyStackRefs inspired by HPy debug mode #128121
- GH-127705: better double free message. #130785
- GH-127705: Check for immortality in refcount accounting #131072
- GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal #131140
- GH-127705: Handle trace refs in specialized decref #131198
- GH-127705: Adds the missing bits from #131198 #131365
- GH-127705: Revert "Move mortal decrefs to internal header and make sure _PyReftracerTrack is called" #131500
- GH-127705: Don't call _Py_ForgetReference before _Py_Dealloc #131508