Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Make non-embedded module clear their locals before deallocation #3798

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

StarQTius
Copy link
Contributor

Description

This fix aims to solve the issue #3776. The problem was that every non-embedded module had their own locals, which could only be accessed through the get_local_internals function of the shared library they were in. So when these modules were deallocated, their locals were not cleared, and if the modules were initialized again, C++ types could not be registered in the locals.
I used the PyModuleDef::m_free field to register a cleanup callback which clears the locals on non-embedded module deallocation. I added another function for creating embedded module, so that embedded modules do not clear the locals on deallocation.
I also registered a local class in the external_module used for testing. This new line was making the "Restart the interpreter" test case fail.

``def`` should point to a statically allocated module_def.
\endrst */
static module_
create_embedded_extension_module(const char *name, const char *doc, module_def *def) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this a lot of code duplication for changing the default value of one argument and the generic pybind11 fail, isn't it? If you think from API/ABI design they should remain seperate, maybe they can both call a common private method. Thoughts @henryiii @rwgk ?

Copy link
Collaborator

@Skylion007 Skylion007 Mar 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this would actually be better to subclass or just add it as an optional argument.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false? (same idea as Aaron's?)

My second concern is testing: what would it take to add a test that fails without the change here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just feels like bad code smell, the MACRO we are using is already a bit hacky. Is there a way we can improve create_embedded_extension_module to automatically detect when it's being called from that macro? It seems like a rather specific and limited use case that the end user should never call directly so this method should really be hidden somewhere as an implementation detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there is a reason to allow embedded and non-embedded modules in the same executable / library, maybe detecting whether embed.h has been included, or having the user indicate if they want the modules to be embedded with a #define PYBIND11_EMBED are options to consider ?

Copy link
Contributor Author

@StarQTius StarQTius Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk About the testing: I added a line in external_module.cpp which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type). I could have added another test but I didn't see the point since it would have done exactly the same thing and I needed to add that line in the external module in any case.
In retrospect, maybe it would have been better to dedicate another external module to this test case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, here is my proposal @StarQTius:

  • Factor out the implementation of the static module function into to a function in a pybind11 detail namespace. You can have it use the is_embedded to denote the different behavior. Have the static function call the new detail one.
  • Have the macro call the new detail function with the proper bool from the macro.

That way we don't create a new static method API that users can accidentally use or that we have to mantain and the API is the same as before with minimal code duplication. (The detail namespace will indicate this is not a new part of the API).

Copy link
Collaborator

@rwgk rwgk Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bout the testing: I added a line in external_module.cpp which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type).

I see, thanks! I saw that line but didn't make the connection. Could you please add a comment, maybe one or two lines to explain in a nutshell, plus reference to this PR? So that it will be quick for anyone looking at the test code in the future, with no specific context, to make the connection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you asked, I moved the extension module creation implementation detail into a non-API function. I had to add forward declarations to make it works though. I also added comments to clarify how local internals are tested, and a reference to the PR.

``def`` should point to a statically allocated module_def.
\endrst */
static module_
create_embedded_extension_module(const char *name, const char *doc, module_def *def) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false? (same idea as Aaron's?)

My second concern is testing: what would it take to add a test that fails without the change here?

@StarQTius StarQTius marked this pull request as ready for review March 17, 2022 10:51
Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome now, thanks!

@@ -510,11 +510,19 @@ struct local_internals {
};

/// Works like `get_internals`, but for things which are locally registered.
/// There is one variable storing the local internals for each module, but every embedded module
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little unclear if I understand this correctly. I'm wondering is

but all embedded modules share the same locals.

correct? If not I'm confused, if yes that would be my suggested wording.

Copy link
Contributor Author

@StarQTius StarQTius Mar 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry if it does not sound clear, English is not my native language. What I meant is that since the static local variable of get_local_internals does not get dynamically linked when a non-embedded module is loaded by the interpreter, you end up with one definition of the locals for the interpreter, and one definition for each non-embedded modules.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in that case my suggested wording is clearer. But I'm not a native speaker, too!
Let's see what @Skylion007 thinks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm having serious 2nd thoughts.

E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.

Others may use RTLD_GLOBAL. Probably rare, but maybe in certain environment that's what's needed?

What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear functionality plays out across all environments?

Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def pointers are associated with a given local_internals pointer, and if more than one, we cannot safely clear_local_internals until the last one disappears.

Or we could have static std::unordered_map<module_def *, local_internals> with all the required logic to register & clear types given a module_def pointer, but search for types over all keys of the map.

It seems pretty tricky to get the m_clear functionality correct in all environments.

Given those uncertainties, and that pybind11 existed to this point without using m_clear, maybe this should be an opt-in, e.g. via some #define, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry after more thought I became too uncertain how this plays out. I someone else feels certain, please speak up.

@@ -510,11 +510,19 @@ struct local_internals {
};

/// Works like `get_internals`, but for things which are locally registered.
/// There is one variable storing the local internals for each module, but every embedded module
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm having serious 2nd thoughts.

E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.

Others may use RTLD_GLOBAL. Probably rare, but maybe in certain environment that's what's needed?

What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear functionality plays out across all environments?

Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def pointers are associated with a given local_internals pointer, and if more than one, we cannot safely clear_local_internals until the last one disappears.

Or we could have static std::unordered_map<module_def *, local_internals> with all the required logic to register & clear types given a module_def pointer, but search for types over all keys of the map.

It seems pretty tricky to get the m_clear functionality correct in all environments.

Given those uncertainties, and that pybind11 existed to this point without using m_clear, maybe this should be an opt-in, e.g. via some #define, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.

@rwgk
Copy link
Collaborator

rwgk commented Mar 18, 2022

Tagging @jbms: is there a chance that you could look at this PR (and my uncertainties under #3798 (comment))?

@jbms
Copy link
Contributor

jbms commented Mar 18, 2022

I wasn't too clear on exactly when the PyModuleDef.m_free function is called, so I did a bit of experimentation. My conclusion is that this change is NOT safe.

First let's consider just the single interpreter case:

In the most common case, you import some pybind11 modules, and don't do anything funny. In that case, the PyModuleDef.m_free functions of all the modules will be called during interpreter finalization. In that case it is probably fairly safe --- even if the locals are shared by multiple modules, clearing the locals multiple times doesn't really hurt anything, as long as none of the module free functions/destructors invoked by pybind11 attempt to use any pybind11 apis.

However, it appears that the PyModuleDef.m_free function can be called before finalization, if you import the extension, remove it from sys.modules, then reload it. However, note that even though m_free is called, the PyInit_modulename function is not called again. Here is a test case I put together:

mymodule.cc:

#include <pybind11/pybind11.h>
#include <stdio.h>

namespace py = ::pybind11;

struct MyClass {
  std::string get_name() { return "myclass"; }
};

PYBIND11_MODULE(mymodule, m) {
  m.doc() = "pybind11 example plugin";  // optional module docstring

  py::class_<MyClass>(m, "MyClass", py::module_local())
      .def(py::init([] { return MyClass{}; }))
      .def("get_name", &MyClass::get_name);

  fprintf(stderr, "Loading mymodule\n");
  fflush(stderr);
}

test.py:

import sys

import mymodule
cls = mymodule.MyClass()
assert cls.get_name() == "myclass"

del sys.modules['mymodule']
del mymodule
import mymodule
print('Reloaded module')
cls = mymodule.MyClass()
assert cls.get_name() == "myclass"

I also modified the m_free function to print a message.

The result with Python 3.9.9 is:

Loading mymodule
m_free called
Reloaded module
zsh: segmentation fault  python3 test.py

The crash happens in:

pybind11::class_<MyClass>::init_instance(pybind11::detail::instance*, void const*)

Presumably due to attempting to reference a type record that has been already deallocated.

This may be a bug in Python.

I don't really have any experience with starting and finalizing the Python interpreter multiple times in a single process, but from what I have seen when looking closely at the Python finalization process it is not really properly supported by Python --- some memory is sure to be leaked, especially if threads are used. Therefore I wouldn't recommend doing that in production code, and even doing it in tests seems kind of questionable since why test something that doesn't match what happens in production?

It appears there is support for multi-phase initialization intended to better support module reloading, and also sub-interpreters:
https://docs.python.org/3/c-api/module.html#multi-phase-initialization

We might consider changing over to using that (which would probably require API changes in pybind11) ,
though my understanding is that there are still some Python APIs where the module state cannot easily be obtained, which would make it challenging to use this module state mechanism. Additionally, I think it would require major API changes in pybind11 to support that.

I think the objective of this PR, fixing module_local registration when restarting the interpreter, could be solved by instead arranging to clear the local internals only when the interpreter is finalized, or after the interpreter is restarted, rather than using m_free. I'm not sure what the best way to do that would be --- maybe adding an atexit handler would work, although that would add some overhead. Alternatively we could attempt to detect if the interpreter has been restarted, and clear the locals only in that case. I'm not sure how to do that exactly.

More generally, I think it would be best if module_local went away, and instead casters and exception translators could be overridden via explicitly passed-around context objects or something, so that there is no hidden semi-global state dependent on symbol visibility.

@Skylion007
Copy link
Collaborator

I think the objective of this PR, fixing module_local registration when restarting the interpreter, could be solved by instead arranging to clear the local internals only when the interpreter is finalized, or after the interpreter is restarted, rather than using m_free. I'm not sure what the best way to do that would be --- maybe adding an atexit handler would work, although that would add some overhead. Alternatively we could attempt to detect if the interpreter has been restarted, and clear the locals only in that case. I'm not sure how to do that exactly.

Good thing Python has an auditing event just for this: https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx

@Skylion007
Copy link
Collaborator

Skylion007 commented Mar 18, 2022

Upon looking at it, atexit is a viable alternative, but I do dislike having to manage yet another global static state as would be needed with either solution.

@jasjuang
Copy link

Therefore I wouldn't recommend doing that in production code, and even doing it in tests seems kind of questionable since why test something that doesn't match what happens in production?

The reason of writing tests in C++ instead of python is because of debugging, gtest + Visual Studio or Xcode is a lot better and faster than debugging with pytest + pdb.

@jbms
Copy link
Contributor

jbms commented Mar 18, 2022

You can run the Python executable itself in a debugger --- obviously works better if you have the debugging symbols, though.

What I mean, though, is that running Py_Initialize more than once is not something well-tested or well-supported by CPython, and there will be a lot of behavior differences compared to running Python normally. For example, some state may stick around from a previous interpreter run, and therefore a test may pass even if the functionality does not actually work in normal Python usage, or a test may fail even though the functionality does work in normal Python usage.

Of course some tradeoffs have to be made to make testing more efficient, though.

@jasjuang
Copy link

It seems problematic if there are a lot of behavior differences between Python and CPython. I think most users (at least me) would implicitly assume that CPython is a C version of Python, therefore their behavior should ideally be identical. If the consensus is they shouldn't be identical, is there a README somewhere that documents the behavior differences between Python and CPython that I can read?

@jbms
Copy link
Contributor

jbms commented Mar 18, 2022

CPython is the main implementation of Python --- PyPy is another less widely used implementation. I mentioned CPython because I have some familiarity with its implementation, while I have essentially no familiarity with PyPy.

@rwgk
Copy link
Collaborator

rwgk commented Mar 18, 2022

General comment: we (mostly CPython) is sending people into traps by giving the impression that restarting the interpreter is safe, well-defined behavior. It simply isn't. It probably never will be. It would be best if people gave up trying. It's just a trap, ultimately, that's burning human time, like here, for no good reason.

If it was just me, I'd put a static bool into pybind11, that made it impossible to initialize the interpreter a second time, terminating the process with a message pointing here:

https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx

Bugs and caveats: The destruction of modules and objects in modules is done in random order; this may cause destructors (__del__() methods) to fail when they depend on other objects (even functions) or modules. Dynamically loaded extension modules loaded by Python are not unloaded. Small amounts of memory allocated by the Python interpreter may not be freed (if you find a leak, please report it). Memory tied up in circular references between objects is not freed. Some memory allocated by extension modules may not be freed. Some extensions may not work properly if their initialization routine is called more than once; this can happen if an application calls Py_Initialize() and Py_FinalizeEx() more than once.

A couple years ago I did some really simple experiments: Leaks!

https://github.com/rwgk/stuff/blob/71f5300061aaecb0f499aab77343abd4ca360c6c/noddy/embedded_noddy_main_run.c

aumuell added a commit to vistle/vistle that referenced this pull request Dec 12, 2022
aumuell added a commit to vistle/vistle that referenced this pull request Dec 12, 2022
restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)
MDjur pushed a commit to MDjur/vistle that referenced this pull request Feb 17, 2023
MDjur pushed a commit to MDjur/vistle that referenced this pull request Feb 17, 2023
restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants