-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure to properly untrack gc objects before freeing them #4461
Conversation
CI failure looks unrelated to my change right? |
The flake8 failure is certainly unrelated, I don't know how that slipped in. The clang-tidy failure is real, but just use the suggested fix. |
Could that be reduced to a unit test here (IIUC it's a pytorch unit test)? |
The crash I observed were only with debug build and for a class that was bound manually (not with pybind). Do you actually have a CI config with 3.11 and pydebug enabled? |
This one? https://pypi.org/project/pydebug/ No, I don't think so. How much trouble would it be to port the pytorch pydebug testing here? (for just one platform maybe) Might be good to have in general. Maybe we discover more issues? |
Ho sorry I should have explained. I never managed to get my hand on a debug cpython easily for the pytorch CI either. But I'm running with that locally because it is very helpful as it contains many extra assert (pre conditions, refcount, etc). |
AFAIK that's true ... and too much trouble, at least for this PR. I'll run this PR through Google's global testing, to make sure there aren't any weird surprises. (That'll take half a day or so.) |
The new flake8 error is in test code I added ... sigh. I'll fix it in a separate PR. Good to ignore here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Global testing looks good (i.e. no surprises; internal test ID OCL:502758013:BASE:502854252:1674050307767:9412450b).
Thanks for the fix!
auto *type = Py_TYPE(self); | ||
|
||
// If this is a GC tracked object, untrack it first | ||
if (PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add your sentence from the PR description as a comment here?
// Note that the track call is implicitly done by the default tp_alloc, which we never override.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good now!
* Make sure to properly untrack gc objects before freeing them * style: pre-commit fixes * Fix lint * Add comment about where the original track comes from Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
In particular, I would like to incorporate pybind/pybind11#4461, which means that pybind11 objects are properly untracked from the Python GC before destruction, absent which we see warnings when running in a debug build of Python 3.11. PiperOrigin-RevId: 539148388
Description
There is no issue for this as the problem is hard to describe without showing code.
The problem is that this dealloc function is sometimes associated with GC objects but it doesn't properly call PyObject_GC_UnTrack as required by https://docs.python.org/3/c-api/gcsupport.html (note that the track call is implicitly done by the default tp_alloc which we never override).
This was detected because the latest 3.11 release now raises a warning for this when pydebug is enabled.
This warning is being raised in particular for this class in PyTorch: https://github.com/pytorch/pytorch/blob/764f79f6804a57a75d9d4da97824b4f57b638ef5/torch/csrc/jit/python/script_init.cpp#L1454
Since similar warnings for some of our custom python objects do lead to hard crash, I think it is important to fix it (details pytorch/pytorch#91161 )
There is no test in this PR as I don't know how to test this. The warning is only raised in python debug builds (and is a weird warning that cannot be turned into an error). If you have an idea on how to test this, I am happy to do it!
Suggested changelog entry: