Issue description
pybind11 allows releasing the GIL for pretty much any bound function, including constructors, but not for destructors. Besides a missed opportunity for optimizing GIL usage, this can easily cause deadlocks in certain situations. Whenever a destructor waits for another thread, and this thread tries to lock the GIL (because it needs to run Python code, or otherwise wants to work with Python objects), a deadlock occurs.
The sample program at the bottom demonstrates this problem. Destroying the dictionary triggers the destructor of the Worker, causing a deadlock more often than not. Obviously, ~Worker() does not have to keep the GIL locked, and explicitly releasing it before calling join() will resolve the deadlock. However, this is not always a desirable solution, because it means inserting Python calls invasively into a codebase (basically into any destructor that may block).
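To make the "release before join()" fix concrete without depending on Python, here is a minimal sketch that models the GIL as a plain std::mutex. The names toy_gil, ToyWorker, stop and destroy_while_holding_lock are hypothetical and exist only for this illustration. In real bindings, the unlock/lock pair would correspond to a pybind11::gil_scoped_release scope around the join() call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Toy stand-in for the GIL: a plain mutex, NOT the CPython GIL.
static std::mutex toy_gil;

struct ToyWorker {
    ToyWorker() {
        thread = std::thread([this] {
            while (keepRunning) {
                {
                    std::lock_guard<std::mutex> gil(toy_gil);  // "needs the GIL"
                    ++iterations;
                }
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Precondition: the calling thread holds toy_gil through `held`.
    void stop(std::unique_lock<std::mutex>& held) {
        assert(held.owns_lock());
        keepRunning = false;
        held.unlock();  // release first; otherwise join() may wait forever
        if (thread.joinable()) {
            thread.join();
        }
        held.lock();    // hand the lock back to the caller
    }

    ~ToyWorker() {
        // stop() is expected to have run already. This fallback join is
        // only safe when the caller does not hold toy_gil.
        keepRunning = false;
        if (thread.joinable()) {
            thread.join();
        }
    }

    std::thread thread;
    std::atomic<bool> keepRunning{true};
    std::atomic<int> iterations{0};
};

// Shuts a worker down while the caller holds the toy lock and returns
// how many iterations the worker completed.
int destroy_while_holding_lock() {
    std::unique_lock<std::mutex> gil(toy_gil);
    ToyWorker worker;
    gil.unlock();
    while (worker.iterations == 0) {  // let the worker run at least once
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    gil.lock();
    worker.stop(gil);  // holding the lock here, as in the deadlock scenario
    return worker.iterations.load();
}
```

Removing the held.unlock() line reproduces the hang described above: the worker blocks on toy_gil while the caller blocks in join().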
Are there any agreed-upon strategies to deal with this problem?
Possible solutions
If there isn't a common solution to this deadlock, I would like to propose a couple of options.
delete_without_gil
Add a new option to the class_ template, delete_without_gil. While deallocating objects of such classes, pybind11 will release the GIL.
[EDIT 2024-01-18: This was implemented under https://github.com/google/pybind11clif/pull/30088]
This is a straightforward, but not a complete solution. The "blocking" property of destructors is transitive through the class's members. When pybind11 destroys an object of type A, but this object has a member of type B whose destructor blocks, A also has to be marked delete_without_gil. What's worse, if ~B() originally starts out as non-blocking, but is later changed to be blocking, all classes that have a B member need to retroactively be marked delete_without_gil. Not to mention the case where B is polymorphic, and someone unwittingly implements a new subclass with a blocking destructor.
In short, bindings for complex codebases may need to always specify delete_without_gil to be on the safe side.
[EDIT 2024-01-18: This is exactly how PyCLIF works. The new PyCLIF-pybind11 version will have the same behavior.]
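The transitivity argument can be modeled with a small sketch. This is not pybind11 code: toy_gil is a std::mutex standing in for the GIL, and DeleteWithoutLock is a hypothetical deleter that mimics what a delete_without_gil option would do. Blocking plays the role of B and Outer the role of A.

```cpp
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>

// Toy stand-in for the GIL: a plain mutex, NOT the CPython GIL.
static std::mutex toy_gil;

// "B": its destructor blocks on a thread that needs the lock.
struct Blocking {
    Blocking()
        : thread([] {
              std::this_thread::sleep_for(std::chrono::milliseconds(5));
              std::lock_guard<std::mutex> gil(toy_gil);
          }) {}
    ~Blocking() { thread.join(); }
    std::thread thread;
};

// "A": looks harmless, but inherits the blocking destructor via its member.
struct Outer {
    Blocking member;
};

// Hypothetical model of delete_without_gil: drop the lock around `delete`.
// Precondition: the calling thread currently holds toy_gil.
template <typename T>
struct DeleteWithoutLock {
    void operator()(T* p) const {
        toy_gil.unlock();
        delete p;
        toy_gil.lock();
    }
};

// Destroys an Outer while the caller holds the toy lock.
bool destroy_outer_while_holding_lock() {
    std::unique_lock<std::mutex> gil(toy_gil);
    std::unique_ptr<Outer, DeleteWithoutLock<Outer>> holder(new Outer());
    holder.reset();  // with std::default_delete this call would deadlock
    return gil.owns_lock();
}
```

Note that it is Outer, not Blocking, that needs the special deleter here, which is exactly the maintenance burden described above.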
Always release the GIL during deallocation
This would prevent the deadlock pretty decisively, but objects holding Python objects (e.g. pybind11::dict) as members will have to take care to reacquire the GIL before destroying them. Furthermore, the GIL may thrash during destruction of a complex object hierarchy, introducing a performance penalty.
It may be prudent to allow toggling this option through a preprocessor flag. Bindings that require it and can live with the additional GIL overhead can enable it, while simpler modules can leave it as is.
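The cost of this option can be illustrated with another toy model, again using a std::mutex in place of the GIL. ToyPyHandle is a hypothetical stand-in for a member such as pybind11::dict whose destructor must hold the lock; in real code the reacquisition would be a pybind11::gil_scoped_acquire. The sketch counts how often the lock is taken while one composite object is destroyed.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Toy stand-in for the GIL: a plain mutex, NOT the CPython GIL.
static std::mutex toy_gil;
static int acquisitions = 0;  // only modified while toy_gil is held

// Stand-in for a member that wraps a Python object: its destructor
// has to reacquire the lock because deallocation runs without it.
struct ToyPyHandle {
    ~ToyPyHandle() {
        std::lock_guard<std::mutex> gil(toy_gil);
        ++acquisitions;
    }
};

struct Composite {
    explicit Composite(std::size_t n) : handles(n) {}
    std::vector<ToyPyHandle> handles;
};

// Models "always release the GIL during deallocation" and returns the
// number of lock acquisitions caused by destroying one Composite.
int deallocate_without_lock(std::size_t n) {
    std::unique_lock<std::mutex> gil(toy_gil);
    Composite* obj = new Composite(n);
    acquisitions = 0;
    gil.unlock();  // deallocation always runs without the lock
    delete obj;    // every handle reacquires it individually
    gil.lock();
    return acquisitions;
}
```

One acquisition per member is the thrashing mentioned above; a real implementation might batch such members under a single reacquisition, but that is outside this sketch.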
Reproducible example code
This sample will start a worker executing some Python code (simple print statements) in a separate thread, which it needs the GIL for. Upon destruction of the worker, the thread is joined. If, as is the case here, the worker is destroyed while the GIL is locked, a deadlock occurs.
#include <pybind11/pybind11.h>
#include <pybind11/embed.h>
#include <atomic>
#include <chrono>
#include <thread>

using namespace std::chrono_literals;

// A worker that runs some Python code in a separate thread
struct Worker {
    Worker() {
        thread = std::thread([this] {
            while (keepRunning) {
                // The worker needs the GIL to call into Python
                pybind11::gil_scoped_acquire gil;
                pybind11::print("Working");
                std::this_thread::sleep_for(10ms);
            }
        });
    }

    ~Worker() {
        keepRunning = false;
        if (thread.joinable()) {
            // Blocks while the caller may still hold the GIL
            thread.join();
        }
    }

    std::thread thread;
    // Initialized explicitly; a default-constructed atomic holds an
    // indeterminate value before C++20
    std::atomic<bool> keepRunning{true};
};

PYBIND11_EMBEDDED_MODULE(deadlock, mod) {
    pybind11::class_<Worker>(mod, "Worker");
}

int main() {
    pybind11::scoped_interpreter interpreter;
    pybind11::module::import("deadlock");
    {
        pybind11::dict dict;
        dict["worker"] = pybind11::cast(new Worker(), pybind11::return_value_policy::take_ownership);
        {
            // Let the worker run for a while
            pybind11::gil_scoped_release release;
            std::this_thread::sleep_for(100ms);
        }
    }
    // This line will rarely be reached due to a deadlock when destroying dict
    pybind11::print("No deadlock");
}