fix: Make non-embedded module clear their locals before deallocation #3798

StarQTius · 2022-03-12T19:04:58Z

Description

This fix aims to solve the issue #3776. The problem was that every non-embedded module had their own locals, which could only be accessed through the get_local_internals function of the shared library they were in. So when these modules were deallocated, their locals were not cleared, and if the modules were initialized again, C++ types could not be registered in the locals.
I used the PyModuleDef::m_free field to register a cleanup callback which clears the locals on non-embedded module deallocation. I added another function for creating embedded module, so that embedded modules do not clear the locals on deallocation.
I also registered a local class in the external_module used for testing. This new line was making the "Restart the interpreter" test case fail.

Skylion007 · 2022-03-12T19:28:51Z

include/pybind11/pybind11.h

+        ``def`` should point to a statically allocated module_def.
+    \endrst */
+    static module_
+    create_embedded_extension_module(const char *name, const char *doc, module_def *def) {


Hmm, this a lot of code duplication for changing the default value of one argument and the generic pybind11 fail, isn't it? If you think from API/ABI design they should remain seperate, maybe they can both call a common private method. Thoughts @henryiii @rwgk ?

Maybe this would actually be better to subclass or just add it as an optional argument.

I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false? (same idea as Aaron's?)

My second concern is testing: what would it take to add a test that fails without the change here?

This just feels like bad code smell, the MACRO we are using is already a bit hacky. Is there a way we can improve create_embedded_extension_module to automatically detect when it's being called from that macro? It seems like a rather specific and limited use case that the end user should never call directly so this method should really be hidden somewhere as an implementation detail.

Unless there is a reason to allow embedded and non-embedded modules in the same executable / library, maybe detecting whether embed.h has been included, or having the user indicate if they want the modules to be embedded with a #define PYBIND11_EMBED are options to consider ?

@rwgk About the testing: I added a line in external_module.cpp which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type). I could have added another test but I didn't see the point since it would have done exactly the same thing and I needed to add that line in the external module in any case.
In retrospect, maybe it would have been better to dedicate another external module to this test case.

Okay, here is my proposal @StarQTius:

Factor out the implementation of the static module function into to a function in a pybind11 detail namespace. You can have it use the is_embedded to denote the different behavior. Have the static function call the new detail one.

Have the macro call the new detail function with the proper bool from the macro.

That way we don't create a new static method API that users can accidentally use or that we have to mantain and the API is the same as before with minimal code duplication. (The detail namespace will indicate this is not a new part of the API).

bout the testing: I added a line in external_module.cpp which was making the "Restart an interpreter" test case fail in a way that was consistent with the original issue (ie. having the library complain about an already registered type).

I see, thanks! I saw that line but didn't make the connection. Could you please add a comment, maybe one or two lines to explain in a nutshell, plus reference to this PR? So that it will be quick for anyone looking at the test code in the future, with no specific context, to make the connection.

As you asked, I moved the extension module creation implementation detail into a non-API function. I had to add forward declarations to make it works though. I also added comments to clarify how local internals are tested, and a reference to the PR.

rwgk · 2022-03-15T09:32:58Z

include/pybind11/pybind11.h

+        ``def`` should point to a statically allocated module_def.
+    \endrst */
+    static module_
+    create_embedded_extension_module(const char *name, const char *doc, module_def *def) {


I agree with @Skylion007, thinking it will be easy to avoid the code duplication, maybe as simple as bool is_embedded = false? (same idea as Aaron's?)

My second concern is testing: what would it take to add a test that fails without the change here?

rwgk

Looks awesome now, thanks!

rwgk · 2022-03-17T16:09:33Z

include/pybind11/detail/internals.h

@@ -510,11 +510,19 @@ struct local_internals {
 };

 /// Works like `get_internals`, but for things which are locally registered.
+/// There is one variable storing the local internals for each module, but every embedded module


I'm a little unclear if I understand this correctly. I'm wondering is

but all embedded modules share the same locals.

correct? If not I'm confused, if yes that would be my suggested wording.

I'm sorry if it does not sound clear, English is not my native language. What I meant is that since the static local variable of get_local_internals does not get dynamically linked when a non-embedded module is loaded by the interpreter, you end up with one definition of the locals for the interpreter, and one definition for each non-embedded modules.

I think in that case my suggested wording is clearer. But I'm not a native speaker, too!
Let's see what @Skylion007 thinks.

Sorry I'm having serious 2nd thoughts.

E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.

Others may use RTLD_GLOBAL. Probably rare, but maybe in certain environment that's what's needed?

What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear functionality plays out across all environments?

Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def pointers are associated with a given local_internals pointer, and if more than one, we cannot safely clear_local_internals until the last one disappears.

Or we could have static std::unordered_map<module_def *, local_internals> with all the required logic to register & clear types given a module_def pointer, but search for types over all keys of the map.

It seems pretty tricky to get the m_clear functionality correct in all environments.

Given those uncertainties, and that pybind11 existed to this point without using m_clear, maybe this should be an opt-in, e.g. via some #define, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.

rwgk

Sorry after more thought I became too uncertain how this plays out. I someone else feels certain, please speak up.

rwgk · 2022-03-18T07:04:20Z

include/pybind11/detail/internals.h

@@ -510,11 +510,19 @@ struct local_internals {
 };

 /// Works like `get_internals`, but for things which are locally registered.
+/// There is one variable storing the local internals for each module, but every embedded module


Sorry I'm having serious 2nd thoughts.

E.g. at Google, Python and all extension modules for a given binary (or test) are linked together statically. Each binary or test is hermetic.

Others may use RTLD_GLOBAL. Probably rare, but maybe in certain environment that's what's needed?

What does this PR mean in those situations? I'm not sure. Does someone else have a clear understanding of how the m_clear functionality plays out across all environments?

Guessing a lot: I think to be completely safe, we'd need some kind of lookup table, to keep track of which module_def pointers are associated with a given local_internals pointer, and if more than one, we cannot safely clear_local_internals until the last one disappears.

Or we could have static std::unordered_map<module_def *, local_internals> with all the required logic to register & clear types given a module_def pointer, but search for types over all keys of the map.

It seems pretty tricky to get the m_clear functionality correct in all environments.

Given those uncertainties, and that pybind11 existed to this point without using m_clear, maybe this should be an opt-in, e.g. via some #define, or runtime option? That would be a safer and less sudden way to find out what's possible and what not.

rwgk · 2022-03-18T07:49:13Z

Tagging @jbms: is there a chance that you could look at this PR (and my uncertainties under #3798 (comment))?

jbms · 2022-03-18T15:48:29Z

I wasn't too clear on exactly when the PyModuleDef.m_free function is called, so I did a bit of experimentation. My conclusion is that this change is NOT safe.

First let's consider just the single interpreter case:

In the most common case, you import some pybind11 modules, and don't do anything funny. In that case, the PyModuleDef.m_free functions of all the modules will be called during interpreter finalization. In that case it is probably fairly safe --- even if the locals are shared by multiple modules, clearing the locals multiple times doesn't really hurt anything, as long as none of the module free functions/destructors invoked by pybind11 attempt to use any pybind11 apis.

However, it appears that the PyModuleDef.m_free function can be called before finalization, if you import the extension, remove it from sys.modules, then reload it. However, note that even though m_free is called, the PyInit_modulename function is not called again. Here is a test case I put together:

mymodule.cc:

#include <pybind11/pybind11.h>
#include <stdio.h>

namespace py = ::pybind11;

struct MyClass {
  std::string get_name() { return "myclass"; }
};

PYBIND11_MODULE(mymodule, m) {
  m.doc() = "pybind11 example plugin";  // optional module docstring

  py::class_<MyClass>(m, "MyClass", py::module_local())
      .def(py::init([] { return MyClass{}; }))
      .def("get_name", &MyClass::get_name);

  fprintf(stderr, "Loading mymodule\n");
  fflush(stderr);
}

test.py:

import sys

import mymodule
cls = mymodule.MyClass()
assert cls.get_name() == "myclass"

del sys.modules['mymodule']
del mymodule
import mymodule
print('Reloaded module')
cls = mymodule.MyClass()
assert cls.get_name() == "myclass"

I also modified the m_free function to print a message.

The result with Python 3.9.9 is:

Loading mymodule
m_free called
Reloaded module
zsh: segmentation fault  python3 test.py

The crash happens in:

pybind11::class_<MyClass>::init_instance(pybind11::detail::instance*, void const*)

Presumably due to attempting to reference a type record that has been already deallocated.

This may be a bug in Python.

I don't really have any experience with starting and finalizing the Python interpreter multiple times in a single process, but from what I have seen when looking closely at the Python finalization process it is not really properly supported by Python --- some memory is sure to be leaked, especially if threads are used. Therefore I wouldn't recommend doing that in production code, and even doing it in tests seems kind of questionable since why test something that doesn't match what happens in production?

It appears there is support for multi-phase initialization intended to better support module reloading, and also sub-interpreters:
https://docs.python.org/3/c-api/module.html#multi-phase-initialization

We might consider changing over to using that (which would probably require API changes in pybind11) ,
though my understanding is that there are still some Python APIs where the module state cannot easily be obtained, which would make it challenging to use this module state mechanism. Additionally, I think it would require major API changes in pybind11 to support that.

I think the objective of this PR, fixing module_local registration when restarting the interpreter, could be solved by instead arranging to clear the local internals only when the interpreter is finalized, or after the interpreter is restarted, rather than using m_free. I'm not sure what the best way to do that would be --- maybe adding an atexit handler would work, although that would add some overhead. Alternatively we could attempt to detect if the interpreter has been restarted, and clear the locals only in that case. I'm not sure how to do that exactly.

More generally, I think it would be best if module_local went away, and instead casters and exception translators could be overridden via explicitly passed-around context objects or something, so that there is no hidden semi-global state dependent on symbol visibility.

Skylion007 · 2022-03-18T16:02:42Z

I think the objective of this PR, fixing module_local registration when restarting the interpreter, could be solved by instead arranging to clear the local internals only when the interpreter is finalized, or after the interpreter is restarted, rather than using m_free. I'm not sure what the best way to do that would be --- maybe adding an atexit handler would work, although that would add some overhead. Alternatively we could attempt to detect if the interpreter has been restarted, and clear the locals only in that case. I'm not sure how to do that exactly.

Good thing Python has an auditing event just for this: https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx

Skylion007 · 2022-03-18T16:20:40Z

Upon looking at it, atexit is a viable alternative, but I do dislike having to manage yet another global static state as would be needed with either solution.

jasjuang · 2022-03-18T17:31:38Z

Therefore I wouldn't recommend doing that in production code, and even doing it in tests seems kind of questionable since why test something that doesn't match what happens in production?

The reason of writing tests in C++ instead of python is because of debugging, gtest + Visual Studio or Xcode is a lot better and faster than debugging with pytest + pdb.

jbms · 2022-03-18T17:58:15Z

You can run the Python executable itself in a debugger --- obviously works better if you have the debugging symbols, though.

What I mean, though, is that running Py_Initialize more than once is not something well-tested or well-supported by CPython, and there will be a lot of behavior differences compared to running Python normally. For example, some state may stick around from a previous interpreter run, and therefore a test may pass even if the functionality does not actually work in normal Python usage, or a test may fail even though the functionality does work in normal Python usage.

Of course some tradeoffs have to be made to make testing more efficient, though.

jasjuang · 2022-03-18T18:12:47Z

It seems problematic if there are a lot of behavior differences between Python and CPython. I think most users (at least me) would implicitly assume that CPython is a C version of Python, therefore their behavior should ideally be identical. If the consensus is they shouldn't be identical, is there a README somewhere that documents the behavior differences between Python and CPython that I can read?

jbms · 2022-03-18T18:37:31Z

CPython is the main implementation of Python --- PyPy is another less widely used implementation. I mentioned CPython because I have some familiarity with its implementation, while I have essentially no familiarity with PyPy.

rwgk · 2022-03-18T18:41:59Z

General comment: we (mostly CPython) is sending people into traps by giving the impression that restarting the interpreter is safe, well-defined behavior. It simply isn't. It probably never will be. It would be best if people gave up trying. It's just a trap, ultimately, that's burning human time, like here, for no good reason.

If it was just me, I'd put a static bool into pybind11, that made it impossible to initialize the interpreter a second time, terminating the process with a message pointing here:

https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx

Bugs and caveats: The destruction of modules and objects in modules is done in random order; this may cause destructors (__del__() methods) to fail when they depend on other objects (even functions) or modules. Dynamically loaded extension modules loaded by Python are not unloaded. Small amounts of memory allocated by the Python interpreter may not be freed (if you find a leak, please report it). Memory tied up in circular references between objects is not freed. Some memory allocated by extension modules may not be freed. Some extensions may not work properly if their initialization routine is called more than once; this can happen if an application calls Py_Initialize() and Py_FinalizeEx() more than once.

A couple years ago I did some really simple experiments: Leaks!

https://github.com/rwgk/stuff/blob/71f5300061aaecb0f499aab77343abd4ca360c6c/noddy/embedded_noddy_main_run.c

This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem

restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)

This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem

restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)

Make non-embedded module clear their locals before deallocation

e09ea59

Skylion007 reviewed Mar 12, 2022

View reviewed changes

rwgk requested changes Mar 15, 2022

View reviewed changes

Add create_extension_module_impl() & clarify testing purpose

85a89fe

StarQTius marked this pull request as ready for review March 17, 2022 10:51

rwgk approved these changes Mar 17, 2022

View reviewed changes

Skylion007 approved these changes Mar 17, 2022

View reviewed changes

rwgk requested changes Mar 18, 2022

View reviewed changes

aumuell added a commit to vistle/vistle that referenced this pull request Dec 12, 2022

Revert "Revert "Replaced deprecated "load_module" with "exec_module"""

bd4d18c

This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem

aumuell added a commit to vistle/vistle that referenced this pull request Dec 12, 2022

keep python interpreter around

72ac688

restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)

rwgk mentioned this pull request Feb 10, 2023

FWD pybind11 google/pybind11clif#3798

Closed

MDjur pushed a commit to MDjur/vistle that referenced this pull request Feb 17, 2023

Revert "Revert "Replaced deprecated "load_module" with "exec_module"""

6e626c0

This reverts commit 59c5010. pybind/pybind11#3798 (comment) gives the cause for the real problem

MDjur pushed a commit to MDjur/vistle that referenced this pull request Feb 17, 2023

keep python interpreter around

f45964e

restarting python interpreter does not seem to be well supported, cf. pybind/pybind11#3798 (comment)

fix: Make non-embedded module clear their locals before deallocation #3798

Are you sure you want to change the base?

fix: Make non-embedded module clear their locals before deallocation #3798

Uh oh!

Conversation

StarQTius commented Mar 12, 2022

Description

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Skylion007 Mar 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

StarQTius Mar 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk Mar 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

StarQTius Mar 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk commented Mar 18, 2022

Uh oh!

jbms commented Mar 18, 2022

Uh oh!

Skylion007 commented Mar 18, 2022

Uh oh!

Skylion007 commented Mar 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jasjuang commented Mar 18, 2022

Uh oh!

jbms commented Mar 18, 2022

Uh oh!

jasjuang commented Mar 18, 2022

Uh oh!

jbms commented Mar 18, 2022

Uh oh!

rwgk commented Mar 18, 2022

Uh oh!

Uh oh!

Skylion007 Mar 12, 2022 •

edited

Loading

StarQTius Mar 16, 2022 •

edited

Loading

rwgk Mar 16, 2022 •

edited

Loading

StarQTius Mar 17, 2022 •

edited

Loading

Skylion007 commented Mar 18, 2022 •

edited

Loading