-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Make non-embedded module clear their locals before deallocation #3798
base: master
Are you sure you want to change the base?
Conversation
include/pybind11/pybind11.h
Outdated
``def`` should point to a statically allocated module_def. | ||
\endrst */ | ||
static module_ | ||
create_embedded_extension_module(const char *name, const char *doc, module_def *def) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this would actually be better to subclass or just add it as an optional argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false
? (same idea as Aaron's?)
My second concern is testing: what would it take to add a test that fails without the change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just feels like bad code smell, the MACRO we are using is already a bit hacky. Is there a way we can improve create_embedded_extension_module to automatically detect when it's being called from that macro? It seems like a rather specific and limited use case that the end user should never call directly so this method should really be hidden somewhere as an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless there is a reason to allow embedded and non-embedded modules in the same executable / library, maybe detecting whether embed.h
has been included, or having the user indicate if they want the modules to be embedded with a #define PYBIND11_EMBED
are options to consider ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rwgk About the testing: I added a line in external_module.cpp
which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type). I could have added another test but I didn't see the point since it would have done exactly the same thing and I needed to add that line in the external module in any case.
In retrospect, maybe it would have been better to dedicate another external module to this test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, here is my proposal @StarQTius:
- Factor out the implementation of the static module function into to a function in a pybind11
detail
namespace. You can have it use the is_embedded to denote the different behavior. Have the static function call the new detail one. - Have the macro call the new detail function with the proper bool from the macro.
That way we don't create a new static method API that users can accidentally use or that we have to mantain and the API is the same as before with minimal code duplication. (The detail namespace will indicate this is not a new part of the API).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bout the testing: I added a line in
external_module.cpp
which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type).
I see, thanks! I saw that line but didn't make the connection. Could you please add a comment, maybe one or two lines to explain in a nutshell, plus reference to this PR? So that it will be quick for anyone looking at the test code in the future, with no specific context, to make the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you asked, I moved the extension module creation implementation detail into a non-API function. I had to add forward declarations to make it works though. I also added comments to clarify how local internals are tested, and a reference to the PR.
include/pybind11/pybind11.h
Outdated
``def`` should point to a statically allocated module_def. | ||
\endrst */ | ||
static module_ | ||
create_embedded_extension_module(const char *name, const char *doc, module_def *def) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false
? (same idea as Aaron's?)
My second concern is testing: what would it take to add a test that fails without the change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome now, thanks!
@@ -510,11 +510,19 @@ struct local_internals { | |||
}; | |||
|
|||
/// Works like `get_internals`, but for things which are locally registered. | |||
/// There is one variable storing the local internals for each module, but every embedded module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little unclear if I understand this correctly. I'm wondering is
but all embedded modules share the same locals.
correct? If not I'm confused, if yes that would be my suggested wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry if it does not sound clear, English is not my native language. What I meant is that since the static local variable of get_local_internals
does not get dynamically linked when a non-embedded module is loaded by the interpreter, you end up with one definition of the locals for the interpreter, and one definition for each non-embedded modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in that case my suggested wording is clearer. But I'm not a native speaker, too!
Let's see what @Skylion007 thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm having serious 2nd thoughts.
E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.
Others may use RTLD_GLOBAL
. Probably rare, but maybe in certain environment that's what's needed?
What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear
functionality plays out across all environments?
Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def
pointers are associated with a given local_internals
pointer, and if more than one, we cannot safely clear_local_internals
until the last one disappears.
Or we could have static std::unordered_map<module_def *, local_internals>
with all the required logic to register & clear types given a module_def
pointer, but search for types over all keys of the map.
It seems pretty tricky to get the m_clear
functionality correct in all environments.
Given those uncertainties, and that pybind11 existed to this point without using m_clear
, maybe this should be an opt-in, e.g. via some #define
, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry after more thought I became too uncertain how this plays out. I someone else feels certain, please speak up.
@@ -510,11 +510,19 @@ struct local_internals { | |||
}; | |||
|
|||
/// Works like `get_internals`, but for things which are locally registered. | |||
/// There is one variable storing the local internals for each module, but every embedded module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm having serious 2nd thoughts.
E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.
Others may use RTLD_GLOBAL
. Probably rare, but maybe in certain environment that's what's needed?
What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear
functionality plays out across all environments?
Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def
pointers are associated with a given local_internals
pointer, and if more than one, we cannot safely clear_local_internals
until the last one disappears.
Or we could have static std::unordered_map<module_def *, local_internals>
with all the required logic to register & clear types given a module_def
pointer, but search for types over all keys of the map.
It seems pretty tricky to get the m_clear
functionality correct in all environments.
Given those uncertainties, and that pybind11 existed to this point without using m_clear
, maybe this should be an opt-in, e.g. via some #define
, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.
Tagging @jbms: is there a chance that you could look at this PR (and my uncertainties under #3798 (comment))? |
I wasn't too clear on exactly when the First let's consider just the single interpreter case: In the most common case, you import some pybind11 modules, and don't do anything funny. In that case, the However, it appears that the mymodule.cc: #include <pybind11/pybind11.h>
#include <stdio.h>
namespace py = ::pybind11;
struct MyClass {
std::string get_name() { return "myclass"; }
};
PYBIND11_MODULE(mymodule, m) {
m.doc() = "pybind11 example plugin"; // optional module docstring
py::class_<MyClass>(m, "MyClass", py::module_local())
.def(py::init([] { return MyClass{}; }))
.def("get_name", &MyClass::get_name);
fprintf(stderr, "Loading mymodule\n");
fflush(stderr);
} test.py:
I also modified the The result with Python 3.9.9 is:
The crash happens in:
Presumably due to attempting to reference a type record that has been already deallocated. This may be a bug in Python. I don't really have any experience with starting and finalizing the Python interpreter multiple times in a single process, but from what I have seen when looking closely at the Python finalization process it is not really properly supported by Python --- some memory is sure to be leaked, especially if threads are used. Therefore I wouldn't recommend doing that in production code, and even doing it in tests seems kind of questionable since why test something that doesn't match what happens in production? It appears there is support for multi-phase initialization intended to better support module reloading, and also sub-interpreters: We might consider changing over to using that (which would probably require API changes in pybind11) , I think the objective of this PR, fixing module_local registration when restarting the interpreter, could be solved by instead arranging to clear the local internals only when the interpreter is finalized, or after the interpreter is restarted, rather than using m_free. I'm not sure what the best way to do that would be --- maybe adding an More generally, I think it would be best if module_local went away, and instead casters and exception translators could be overridden via explicitly passed-around context objects or something, so that there is no hidden semi-global state dependent on symbol visibility. |
Good thing Python has an auditing event just for this: https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx |
Upon looking at it, atexit is a viable alternative, but I do dislike having to manage yet another global static state as would be needed with either solution. |
The reason of writing tests in C++ instead of python is because of debugging, gtest + Visual Studio or Xcode is a lot better and faster than debugging with pytest + pdb. |
You can run the Python executable itself in a debugger --- obviously works better if you have the debugging symbols, though. What I mean, though, is that running Py_Initialize more than once is not something well-tested or well-supported by CPython, and there will be a lot of behavior differences compared to running Python normally. For example, some state may stick around from a previous interpreter run, and therefore a test may pass even if the functionality does not actually work in normal Python usage, or a test may fail even though the functionality does work in normal Python usage. Of course some tradeoffs have to be made to make testing more efficient, though. |
It seems problematic if there are a lot of behavior differences between Python and CPython. I think most users (at least me) would implicitly assume that CPython is a C version of Python, therefore their behavior should ideally be identical. If the consensus is they shouldn't be identical, is there a README somewhere that documents the behavior differences between Python and CPython that I can read? |
CPython is the main implementation of Python --- PyPy is another less widely used implementation. I mentioned CPython because I have some familiarity with its implementation, while I have essentially no familiarity with PyPy. |
General comment: we (mostly CPython) is sending people into traps by giving the impression that restarting the interpreter is safe, well-defined behavior. It simply isn't. It probably never will be. It would be best if people gave up trying. It's just a trap, ultimately, that's burning human time, like here, for no good reason. If it was just me, I'd put a https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx
A couple years ago I did some really simple experiments: Leaks! |
This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem
restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)
This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem
restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)
Description
This fix aims to solve the issue #3776. The problem was that every non-embedded module had their own locals, which could only be accessed through the
get_local_internals
function of the shared library they were in. So when these modules were deallocated, their locals were not cleared, and if the modules were initialized again, C++ types could not be registered in the locals.I used the PyModuleDef::m_free field to register a cleanup callback which clears the locals on non-embedded module deallocation. I added another function for creating embedded module, so that embedded modules do not clear the locals on deallocation.
I also registered a local class in the external_module used for testing. This new line was making the "Restart the interpreter" test case fail.