Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEP 683: Add a PEP for immortal objects #2320
PEP 683: Add a PEP for immortal objects #2320
Changes from 17 commits
b396af6
19991dc
49a78f0
4241d1b
56e7174
9589df7
0cbd50b
413f465
4099282
02260bf
9126cc6
850bd8d
57b536c
7437848
016f895
904868d
a048b57
7d5401c
5340a17
63caa44
ab6943e
743c753
92fc5d4
4e1f5fc
c94ef99
4667e82
d6a88f3
6816e8f
db4a703
77bd7ee
f8cf975
bb17abb
6eb4b36
aa5f14e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the PEP with a link to the actual thread once you create it, so it is easy for others to find without a lot of guessing and Googling.
Also, there are a smattering of relatively non-critical editing issues (aside from the title containing a comma, which is a bit irregular), but as this was merged just as I got to it, I'm a bit late to the party now so if needed, I can make a separate PR fixing those (though they aren't terribly critical).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful have a citation/link for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really have any. However, it's fairly common knowledge on the core team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think we've made a public facing post about this, but I added this to Instagram and can confirm that we are still using this 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not needed to include here but these are four different strategies that I'll be pursuing:
Also the following but it probably won't impact current benchmarks:
Mental note to expand the benchmark suite to include something that overflows the last GC generation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I've replaced all core uses of
_Py_IDENTIFIER()
with direct access of statically allocated/initialized objects under_PyRuntimeState.global_objects
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this work? Some list of pointers that you can iterate through to free at runtime finalization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have to check with Eddie, but I expect the plan was to walk through the GC objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brettcannon at a high level, there are three different kinds of cases that we need to care about when immortalizing objects to make sure that we correctly clean them up:
Runtime globals:
Py_None
,Py_True
,Py_False
, Small Ints, and types defined by:PyTypeObject Foo =
. None of these instances are cleaned up today, thus, the behavior will remain the same even after they are made immortal.Immortalizing non-containers during runtime: Examples of these include interned strings. All of these objects have to be found during runtime shutdown, so they must be tracked somewhere. Fortunately, the biggest use case of this would be interned strings which are already tracked in a python dictionary. We should just refer to those at runtime shutdown to properly clean them up. But yes, the generic strategy is to track and deallocate at shutdown.
Immortalizing containers during runtime: Example of these include immortalizing the heap after initialization. To correctly track these instances, we'll leverage the GC's permanent generation by pushing all immortalized containers there. During runtime shutdown, the strategy will be to first let the runtime try to do its best effort of deallocating these instances normally. Most of the module deallocation will now be handled by
pylifecycle.c:finalize_modules
which cleans up the remaining modules as best as we can. It will change which modules are available during__del__
but that's already defined as undefined behavior by the docs. Optionally, we could do some topological disorder to guarantee that user modules will be deallocated first before the stdlib modules. Finally, anything leftover (if any) can be found through the permanent generation gc list which we can clear afterfinalize_modules
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to continue to do that, or should we change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static objects should not be freed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth tweaking this to always return
sys.maxsize()
for immortal objects so people can test if an object is immortal?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I'd rather folks avoid even thinking about immortal objects and keep this as internal as possible. We can deal with it later if it looks like it might actually be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, the value of
sys.maxsize
is slightly different from the immortal value but I think this refers to a general way to testing if an instance is immortal. We could in theory expose an API for this specifically (which should be different frommaxsize
). But for now I agree with Eric, let's keep this it as an implementation detail.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't change! Or rather, it will change but not in the expected way.
This is something that I really looked into to make sure that the refleak tooling wouldn't break from this change.
gettotalrefcount
relies on_Py_RefTotal
which is updated independently from the object's refcount field. Thus, an object's immortality will not make the value ofgettotalrefcount
be something really large.Currently, I've placed the
_Py_RefTotal
update after the immortalization check. Thus, this value now reflects all the increfs/decrefs that did take place. If we move the check to before the immortalization check, it will now reflect all the attempts to incref and decrefs that took place (but they could have been a no-op). I chose the former since it's a bit unintuitive to have a returned value that's larger than the actual effective operations. This is enough to make the refcount leak tooling to work as is. However, if we choose the latter, it will work even in cases where we can upgrade a once mortal object to be immortal (which is beyond of the scope of any of the current changes in the PR). In this case,_Py_RefTotal
will always be updated regardless of the the immortalization status of the instance, but the returned value ofgettotalrefcount
might make less sense as a whole.The final consideration is that we need to make sure that
_PySet_Dummy
is not set to be immortal, otherwise, it will affect the implementation of_Py_GetRefTotal
. Perhaps we should just remove the use of_PySet_Dummy
and fix the gdb plugin in another way?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mentioned that
sys.gettotalrefcount()
is unaffected but didn't include any of your explanation. If you think any of it would be helpful to PEP readers then we can add it in a follow-up.