Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attach python lifetime to shared_ptr passed to C++ #2839

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

virtuald
Copy link
Contributor

@virtuald virtuald commented Feb 1, 2021

Description

Modifies the base type caster to ensure that python instances don't die when passed to C++.

The shared_ptr is always given to C++ code, so we construct a new shared_ptr that is given a custom deleter. The custom deleter increments the python reference count to bind the python instance lifetime with the lifetime of the shared_ptr. This behavior is only applied to instances that have an associated trampoline class

This enables things like passing the last python reference of a subclass to a C++ function without the python reference dying -- which means custom virtual function overrides will still work.

Reference cycles will cause a leak, but this is a limitation of shared_ptr

Thanks to @YannickJadoul for the inspiration putting me on this path. Fixes #1333 and #1120, #1145, #1389, #1552, #1546, #1566, #1774, #1902, #1937

Other inspiration from Boost::Python. Quoting from @rwgk at #2646:

When passing to shared_ptr arguments, Boost.Python always constructs a new shared_ptr, using a properly cast raw pointer, and a custom deleter keeping the PyObject owning the raw pointer alive:
https://github.com/boostorg/python/blob/5e4f6278e01f018c32cebc4ffc3b75a743fb1042/include/boost/python/converter/shared_ptr_from_python.hpp#L52
Note that this code uses the shared_ptr aliasing constructor.
In this way shared_ptr upcasts and downcasts work correctly, although the argument shared_ptr::use_count results tend to surprise users. Note that this mechanism is independent of the HeldType, in other words, no matter what the HeldType, it is possible to pass to shared_ptr arguments.

Downsides:

Potential improvements:

  • Right now this does this for all shared_ptr, might make sense to only do this for things that are python instances of python classes? Now only does this for things that have trampolines
  • Should we document how to do this for other holders? Not sure if the value_and_holder class is something we want to expose...

Suggested changelog entry:

* Objects that are wrapped with a shared_ptr holder and a trampoline class (or alias class) are now kept alive until all shared_ptr references are deleted

@YannickJadoul
Copy link
Collaborator

YannickJadoul commented Feb 1, 2021

Hi, @virtuald. Thanks for trying this out in more detail than just the slightly crazy "hack" suggested on Gitter! :-)

I'm not sure I meant for this hack to be the default way shared_ptr is always handled, though. In it's current state, I can see a few reasons why this solution could wreak (massive?) havoc:

  • Decreasing the reference count needs to hold the GIL, as it can call back into the Python interpreter (e.g. when deallocating). An straightforward fix would be to acquire the GIL in that deleter, of course.
    Doing so might still be tricky, as it's completely hiding a call into Python and could cause hard-to-fix deadlocks, while it seems like it's a pure C++ construct. On the other hand, arguably PYBIND11_OVERRIDE already does this as well?
    There's also only a point in doing this if there's a trampoline class, right? Checking that might be another part of making this safer.
    We could add a test to demonstrate this issue?
  • This will only work for shared_ptr. That's already better than it was of course, but I wonder how much confusion and bug reports this might cause ("This (unrelated) thing works with shared_ptr but not with unique_ptr.").

Starting from these two concerns and thinking further, I see a number of alternative approaches we could think about:

  1. Document this hack/workaround, and maybe provide some kind of utility function.
  2. Add some kind of a tag (similar to e.g. py::keep_alive) which applies this to selected arguments.
  3. Come up with a small wrapper type (e.g. py::shared_python_object<T> or something like that), which uses this caster and has some kind of std::shared_ptr<T> get() method. This would make things explicit in the signature, and still make things easy by wrapping with a lambda.
  4. Who knows what else there is. But given the issues from above, I'd think we need to be careful about doing things implicitly.

Reference cycles are possible as a result, but shared_ptr is already susceptible to this in C++. #2757 discusses this topic a bit

This PR doesn't really suffer from the specific reference cycles I mentioned to you on Gitter. The reference cycles are an issue when the Python object keeps a C++ object alive (which it does) while the C++ would keep the Python object alive. In this case, that's not really an issue, because there's no reference from the C++ object to the Python object; the shared_ptr is basically serving as an alias to a py::object, so we should be good.

I also know that @bstaletic was interested in pushing pybind11 towards using full-on garbage-collected heap types (as CPython is apparently pushing people away from non-GC heap types). A radically different solution, which would also work more generally, would be to do something similar to @EricCousineau-TRI's wrapper type, where the C++ trampoline class keeps a reference to the Python object and make this cycle visible to Python's GC (with tp_traverse).

I'll also tag @EricCousineau-TRI, since he will be the one who has been thinking about this by far the longest of all of us.

@YannickJadoul
Copy link
Collaborator

Also, I had forgotten about this, but I think I found where I got the original idea. This issue is a very interesting read: #1049 (comment)

@virtuald
Copy link
Contributor Author

virtuald commented Feb 1, 2021

This will only work for shared_ptr. That's already better than it was of course, but I wonder how much confusion and bug reports this might cause ("This (unrelated) thing works with shared_ptr but not with unique_ptr.").

I believe you get an error if you pass something with a unique_ptr holder to something with a shared_ptr argument? Or at least, I tried it and I got an error, will have to check again tonight. If there was an automatic unique -> shared conversion going on somewhere, we could add this there as well.

I agree that it would be ideal if this could only be done for instances of trampoline classes. Any idea how to determine that?

@YannickJadoul
Copy link
Collaborator

I believe you get an error if you pass something with a unique_ptr holder to something with a shared_ptr argument? Or at least, I tried it and I got an error, will have to check again tonight. If there was an automatic unique -> shared conversion going on somewhere, we could add this there as well.

Yep, you're right, that's true; currently, you can't take away ownership from Python (though there's some work happening on that).

I agree that it would be ideal if this could only be done for instances of trampoline classes. Any idea how to determine that?

Good question. I don't have time for a thorough search, currently, but this might be harder than I thought. It might be that all this information is lost after declaring a class and its __init__ :-/

@rwgk
Copy link
Collaborator

rwgk commented Feb 1, 2021

Currently I don't have the free bandwidth to fully page in here, but some quick remarks:

I believe you get an error if you pass something with a unique_ptr holder to something with a shared_ptr argument? Or at least, I tried it and I got an error, will have to check again tonight. If there was an automatic unique -> shared conversion going on somewhere, we could add this there as well.

#2672 (comment)

pass_shared_pointee  RuntimeError: Unable to load a custom holder type from a default-holder instance

("This (unrelated) thing works with shared_ptr but not with unique_ptr.").

  • unique_ptr with custom deleter: possible, if the deleter owns the py::object. This has only limited use because the unique_ptr deleter is part of the type. Additionally, the optimization compared to working with shared_ptr is probably negligible: for the extra trouble working with the unique_ptr custom deleter you only get a very small benefit in return.

  • unique_ptr with default deleter: strongly on my do-not-even-try list. It's breaking the "you fully own the resources" contract that comes with unique_ptr: the new owner has no way of taking ownership of the py::object. For that to be safe, you'd need access to the std::default_delete internals, which is practically impossible as far as I know.

@YannickJadoul
Copy link
Collaborator

  • unique_ptr with default deleter: strongly on my do-not-even-try list. It's breaking the "you fully own the resources" contract that comes with unique_ptr: the new owner has no way of taking ownership of the py::object. For that to be safe, you'd need access to the std::default_delete internals, which is practically impossible as far as I know.

Nonono, let's not go that way. Completely agreed!

The GC-type solution (similar to @EricCousineau-TRI's current solution) would solve this independently of holders, though (so also for non-shared_ptr copyable holders).
But I understand if @virtuald doesn't want to dive in that deeply :-) (It's also not a guaranteed success, and comes with its own caveats on non-deterministic GC.) So that's why my other suggestion is making things more explicit, here.

@virtuald
Copy link
Contributor Author

virtuald commented Feb 2, 2021

Unfortunately, gil_scoped_acquire is in pybind11.h, so I can't use it here. There's also a note there about how the GIL shouldn't be acquired if the interpreter is shutting down, but it's not clear to me how to avoid that problem here.

@virtuald
Copy link
Contributor Author

virtuald commented Feb 2, 2021

Regarding grabbing the GIL in the deleter, it appears that the std::function wrapper does this already so it's not without precedent.

@virtuald
Copy link
Contributor Author

virtuald commented Feb 2, 2021

I think this is worth doing by default, and I'm -1 on requiring a user to enable it via a tag or other hack. The current behavior of potentially losing the python half of an object when passing it to a C++ function is (while rare) clearly incorrect, unexpected, and very difficult to root cause without digging into the C++ code.

Evidence that users agree with me: #1120, #1145, #1333, #1389, #1552, #1546 (which has a similar implementation as this as a workaround!), #1566, #1774, #1902, #1937, and probably others.

I've looked for a bit, and as far as I can tell there is not currently a mechanism to determine whether a class was constructed by python or not. To record that information:

  • Add a bool has_alias to type_record, set that from class_::has_alias
  • Add a bool has_alias to type_info
    • Requires bumping PYBIND11_INTERNALS_VERSION
  • Set type_info.has_alias in make_new_python_type

Even then, that just tells you there's an alias class associated with the python type, not whether the constructed python type was created from an alias. You might be able to add a check for that in pybind11_meta_call and set a flag in the instance. Not sure it's worth the hassle.

@virtuald
Copy link
Contributor Author

virtuald commented Feb 2, 2021

This will only work for shared_ptr. That's already better than it was of course, but I wonder how much confusion and bug reports this might cause ("This (unrelated) thing works with shared_ptr but not with unique_ptr.").

Looking at this more, you could enable upgrading a unique holder to shared_ptr by providing an implementation of holder_retriever<std::unique_ptr> which does exactly the same thing, and removing the check_holder_compat.

Edit: ah, this is what rwgk was referring to, I think

@virtuald
Copy link
Contributor Author

virtuald commented Feb 3, 2021

Added the gil release, and a test inspired by #2844

@virtuald
Copy link
Contributor Author

virtuald commented Feb 3, 2021

This has been updated to only use the special deleter if an alias class is used.

I found a place to store the alias indicator that isn't currently covered by the PYBIND11_INTERNALS_VERSION guarantee -- in py::instance! ... which upon reflection, probably should be covered by it? I'm guessing that if objects from two different versions of pybind11 that did and didn't support this feature, you'd sometimes get the broken behavior and sometimes wouldn't. But that would probably be the case anyways unless we bumped PYBIND11_INTERNALS_VERSION?

Anyways, the last commit here changes init_instance to vary based on whether an alias class was used. If an alias class is used, a version of init_instance is used which sets a flag on the py::instance. This flag is used to determine whether the special deleter that keeps the python portion alive should be used or not.

@cdyson37
Copy link
Contributor

cdyson37 commented Feb 7, 2021

Sorry I'm a bit late to the party and haven't read the above fully, but I think this does a similar thing to my (now quite stale) PR #1566 . I never worked out how to detect if it was an alias though, and in my version I did the shared_ptr dance a bit differently, haven't had a close look so don't know which is better, but I suspect yours. Would be great to get this problem fixed once and for all!

@virtuald
Copy link
Contributor Author

virtuald commented Feb 7, 2021

@cdyson37 yep, this solves the same problem yours does, but takes care of only doing this for aliased shared_ptrs.

@rwgk
Copy link
Collaborator

rwgk commented Feb 22, 2021

I found a place to store the alias indicator that isn't currently covered by the PYBIND11_INTERNALS_VERSION guarantee -- in py::instance! ... which upon reflection, probably should be covered by it? I'm guessing that if objects from two different versions of pybind11 that did and didn't support this feature, you'd sometimes get the broken behavior and sometimes wouldn't. But that would probably be the case anyways unless we bumped PYBIND11_INTERNALS_VERSION?

This is an awesome PR! My only significant concern is exactly this question.
@wjakob what's the right thing to do?

I tested this PR in the Google environment with all sanitizers (ASAN, MSAN, TSAN, UBSAN) and didn't find any issues (except a couple known TSAN, UBSAN failures that already exist on master).

I think the suggested changelog is a bit unclear. It doesn't mention alias classes. Is the following true? "For classes wrapped with (A) shared_ptr as the holder, and (B) having a trampoline (aka alias class), shared_ptrs passed from Python to C++ keep the Python object alive."

@virtuald
Copy link
Contributor Author

I think the suggested changelog is a bit unclear. It doesn't mention alias classes.

Thanks for catching that, I'll update the changelog entry. The alias class thing was added later.

@rwgk
Copy link
Collaborator

rwgk commented Feb 23, 2021

Hi Dustin, I will merge #2841 in a minute, which will generate a merge conflict here I think. Sorry, but it's probably easy to resolve.

If it wasn't for the pesky PYBIND11_INTERNALS_VERSION question I think this PR could get merged very soon, but we needs Wenzel's input about it, and depending on what he recommends, we may need to wait. I'm not sure.

In the meantime, what do you think about splitting out a separate PR, just for factoring out gil.h? That would make this PR easier to review (for Wenzel) and rebase.

@virtuald
Copy link
Contributor Author

Already did that :) #2845

@rwgk
Copy link
Collaborator

rwgk commented Feb 25, 2023

More immediately practical maybe

We could cherry-pick most of the smart_holder branch into master:

  • Leaving behind type_caster_odr_guard and the two added return_value_policy enum members (easy).
  • Also leaving behind the "Progressive" mode completely: testing & #ifdefs (a little hairy). — The original purpose was to be sure all corner cases are exercised with smart_holder as the holder. Maybe that's not so important anymore, given where we are with the PyCLIF-pybind11 integration work: that gives us a ton of in-the-wild testing. I'd be willing to give up the "Progressive mode", to not double the testing load on master.
  • After smart_holder "Conservative" is on master: retire the smart_holder branch and continue only with master and pywrapcc.

I don't think this is very difficult or time-consuming. What I would need to get started are strong signals of support by other maintainers, and of course @wjakob, and a committed reviewer or two (fast turnaround, because having this change hanging in the air for more than a week or two would seriously hamper my primary project).

Tagging some potentially interested people for comment: @ax3l @eacousineau @henryiii @lalaland @Skylion007 @wangxf123456

@EthanSteinberg
Copy link
Collaborator

EthanSteinberg commented Feb 27, 2023

@rwgk I'd be happy to help review your smart_holder branch (especially because it would fix so many long standing bugs / issues). As long as it's ABI and API compatible it seems like a good idea to integrate within pybind11.

I think it would actually be good to have both progressive and conservative mode available, with an eventual goal of switching everything to progressive by default (which can be done at the next Python version release when everyone is forced to recompile the world anyways).

@rwgk
Copy link
Collaborator

rwgk commented Feb 28, 2023

@rwgk I'd be happy to help review your smart_holder branch (especially because it would fix so many long standing bugs / issues). As long as it's ABI and API compatible it seems like a good idea to integrate within pybind11.

A quick big thank you, that's awesome!

I think it would actually be good to have both progressive and conservative mode available, with an eventual goal of switching everything to progressive by default (which can be done at the next Python version release when everyone is forced to recompile the world anyways).

I'm turning this over in my mind still. Will get back soon.

@rwgk
Copy link
Collaborator

rwgk commented Mar 1, 2023

I think it would actually be good to have both progressive and conservative mode available

I'd really like to remove the Progressive mode. The main reason: I think if we have to support it, it could be more difficult to move to smart_holder as the default, in case we need subtle behavior tweaks. I don't know if that's actually the case, but I would be more surprised than not if there aren't any.

But, is there anyone already relying on the Progressive mode? — I don't know about any use case. @virtuald @rhaschke?

(I know of one or two people that tried but wound up using the Conservative mode.)

How could I find out more systematically?

@rhaschke
Copy link
Contributor

rhaschke commented Mar 1, 2023

But, is there anyone already relying on the Progressive mode?

I guess "Progressive mode" refers to PYBIND11_USE_SMART_HOLDER_AS_DEFAULT, right? If so, I confirm that we don't use that mode anymore.

@petersteneteg
Copy link

Hi, we compile with PYBIND11_USE_SMART_HOLDER_AS_DEFAULT and it works great! But I have no problem dropping that for classic mode, if it means getting it in master!

@virtuald
Copy link
Contributor Author

virtuald commented Mar 1, 2023

I also use PYBIND11_USE_SMART_HOLDER_AS_DEFAULT. I don't see why one would bother with the other holders (except for custom holder support) since they're broken.

But either way, I would like to also see smart_holder be merged to master, so I don't care how it gets there. I use my own fork of pybind11, so I can tweak my generator code as necessary.

@rwgk
Copy link
Collaborator

rwgk commented Mar 1, 2023

But, is there anyone already relying on the Progressive mode?

I guess "Progressive mode" refers to PYBIND11_USE_SMART_HOLDER_AS_DEFAULT, right?

Yes.

If so, I confirm that we don't use that mode anymore.

Thanks!

mthrok added a commit to pytorch/audio that referenced this pull request Apr 3, 2023
Currently, creating CTCDecoder object by passing a language model to
`lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11, (
pybind/pybind11#4013 and
pybind/pybind11#2839
) this is due to Python object garbage-collected by the time
it's used by code implemented in C++. It attempts to call
methods defined in Python, which overrides the base pure virtual
function, but the object which provides this override gets
deleted by garbage collrector, as the original object is not
reference counted.

This commit fixes this by simply assiging the given `lm` object
as an attribute of CTCDecoder class.

Address #3218
mthrok added a commit to pytorch/audio that referenced this pull request Apr 3, 2023
Currently, creating CTCDecoder object by passing a language model to
`lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11, (
pybind/pybind11#4013 and
pybind/pybind11#2839
) this is due to Python object garbage-collected by the time
it's used by code implemented in C++. It attempts to call
methods defined in Python, which overrides the base pure virtual
function, but the object which provides this override gets
deleted by garbage collrector, as the original object is not
reference counted.

This commit fixes this by simply assiging the given `lm` object
as an attribute of CTCDecoder class.

Address #3218
facebook-github-bot pushed a commit to pytorch/audio that referenced this pull request Apr 3, 2023
Summary:
Currently, creating CTCDecoder object by passing a language model to
`lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11, (
pybind/pybind11#4013 and
pybind/pybind11#2839
) this is due to Python object garbage-collected by the time
it's used by code implemented in C++. It attempts to call
methods defined in Python, which overrides the base pure virtual
function, but the object which provides this override gets
deleted by garbage collrector, as the original object is not
reference counted.

This commit fixes this by simply assiging the given `lm` object
as an attribute of CTCDecoder class.

Address #3218

Pull Request resolved: #3230

Reviewed By: hwangjeff

Differential Revision: D44642989

Pulled By: mthrok

fbshipit-source-id: a90af828c7c576bc0eb505164327365ebaadc471
mthrok added a commit to pytorch/audio that referenced this pull request Apr 4, 2023
Summary:
Currently, creating CTCDecoder object by passing a language model to
`lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11, (
pybind/pybind11#4013 and
pybind/pybind11#2839
) this is due to Python object garbage-collected by the time
it's used by code implemented in C++. It attempts to call
methods defined in Python, which overrides the base pure virtual
function, but the object which provides this override gets
deleted by garbage collrector, as the original object is not
reference counted.

This commit fixes this by simply assiging the given `lm` object
as an attribute of CTCDecoder class.

Address #3218

Pull Request resolved: #3230

Reviewed By: hwangjeff

Differential Revision: D44642989

Pulled By: mthrok

fbshipit-source-id: a90af828c7c576bc0eb505164327365ebaadc471
mthrok added a commit to pytorch/audio that referenced this pull request Apr 5, 2023
)

Summary:
Currently, creating CTCDecoder object by passing a language model to
`lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11, (
pybind/pybind11#4013 and
pybind/pybind11#2839
) this is due to Python object garbage-collected by the time
it's used by code implemented in C++. It attempts to call
methods defined in Python, which overrides the base pure virtual
function, but the object which provides this override gets
deleted by garbage collrector, as the original object is not
reference counted.

This commit fixes this by simply assiging the given `lm` object
as an attribute of CTCDecoder class.

Address #3218

Pull Request resolved: #3230

Reviewed By: hwangjeff

Differential Revision: D44642989

Pulled By: mthrok

fbshipit-source-id: a90af828c7c576bc0eb505164327365ebaadc471
@wjakob
Copy link
Member

wjakob commented Apr 20, 2023

Regarding the merging smart_holder into the master branch: for a balanced discussion, I think it is useful to also mention potential downsides:

With the latest version of both branches, I am observing a 1.31x compilation time increase and a 1.48x binary size increase in a release build of a class binding benchmark (this compares master to smart_holder in progressive mode). See https://nanobind.readthedocs.io/en/latest/benchmark.html. That is a rather significant overhead on top of baseline numbers that are arguably already concerning.

@rwgk
Copy link
Collaborator

rwgk commented Apr 20, 2023

Thanks for sharing those results. The measurements are larger than I would have guessed, especially the binary size increase is really surprising.

For my particular situation:

  1. I don't have a choice, I absolutely need those features.
  2. In a typical real-world situation, hopefully the size of the bindings compared to the size of what's being bound is relatively small.

People have a choice, that's good.

chetnieter added a commit to chetnieter/fletch that referenced this pull request May 27, 2023
Patching pybind11 to fix premature deletion of python objects
created and wrapped in c++. This patch is based off an open
PR for pybind11.

pybind/pybind11#2839
chetnieter added a commit to chetnieter/fletch that referenced this pull request May 30, 2023
Patching pybind11 to fix premature deletion of python objects
created and wrapped in c++. This patch is based off an open
PR for pybind11.

pybind/pybind11#2839
@kliegeois
Copy link

@rwgk do you have an update regarding the potential merging of smart_holder ? Is it still considered and if yes do you know when it might happen?

Thanks!

@rwgk
Copy link
Collaborator

rwgk commented Oct 22, 2023

@rwgk do you have an update regarding the potential merging of smart_holder ? Is it still considered and if yes do you know when it might happen?

Sorry not sure still. — The project I wrote the smart_holder for is at just a little over 1 failure in 100,000 tests, but I still cannot know for sure when we're all the way through, and that's the highest priority.

I'll continue merging master into smart_holder with just a few days delay at most.

If someone wants to help that would be great:

chetnieter added a commit to chetnieter/fletch that referenced this pull request Jan 3, 2024
Patching pybind11 to fix premature deletion of python objects
created and wrapped in c++. This patch is based off an open
PR for pybind11.

pybind/pybind11#2839
chetnieter added a commit to chetnieter/fletch that referenced this pull request Feb 7, 2024
Patching pybind11 to fix premature deletion of python objects
created and wrapped in c++. This patch is based off an open
PR for pybind11.

pybind/pybind11#2839
@galenbwill
Copy link

I have also run into this problem. Is there any ETA yet for merging in the smart_holder branch?

@rwgk
Copy link
Collaborator

rwgk commented Aug 31, 2024

I have also run into this problem. Is there any ETA yet for merging in the smart_holder branch?

See #5339.

I requested feedback five days ago (2024-08-26) but haven't gotten any. I want to encourage everyone to chime in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Problem when creating derived Python objects from C++ (inheritance slicing)