-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add memoryview_scoped_release to manage short-lived memoryviews #2307
base: master
Are you sure you want to change the base?
Conversation
ef6b1a6
to
4e92ffd
Compare
4e92ffd
to
5671fa6
Compare
5671fa6
to
74ab0d8
Compare
74ab0d8
to
5814f9a
Compare
5814f9a
to
3cf0b08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't approve this, because I don't think I know enough about memory views at this point.
Do we want to make this release
thing a part of py::memoryview
's destructor?
class memoryview_scoped_release { | ||
public: | ||
explicit memoryview_scoped_release(memoryview view) : m_view(view) {} | ||
~memoryview_scoped_release() { m_view.attr("release")(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, CPython implementation is in 256 different files. To save everyone the trouble, attr("release")
is defined here: https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Objects/memoryobject.c#L1068
I don't think you want to make release part of the destructor. Release indicates that the buffer is no longer valid -- so for example, a memoryview around a static buffer wouldn't need release to be called. Also would break existing users of py::memoryview. This mechanism is intended to be used when you know when the memoryview is no longer pointing to a valid piece of memory and shouldn't be used by python anymore (and in particular -- python could still have references to the memoryview lying around). |
Okay, that makes a lot of sense. I did say I'm not completely familiar with memory views. |
3cf0b08
to
1dfb360
Compare
I'm sad this didn't make it into 2.7.0, but still interested in getting this merged! |
300e623
to
3c1946f
Compare
3c1946f
to
8276c0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me and it's be nice to see this feature merged.
We could simply add a member function like below (tested). diff --git a/include/pybind11/pytypes.h b/include/pybind11/pytypes.h
index c7b2501f..567c9d3d 100644
--- a/include/pybind11/pytypes.h
+++ b/include/pybind11/pytypes.h
@@ -1601,6 +1601,8 @@ public:
return memoryview::from_memory(const_cast<void*>(mem), size, true);
}
#endif
+
+ void release() { attr("release")(); }
};
#ifndef DOXYGEN_SHOULD_SKIP_THIS
diff --git a/tests/test_pytypes.cpp b/tests/test_pytypes.cpp
index d70536d3..1df0d264 100644
--- a/tests/test_pytypes.cpp
+++ b/tests/test_pytypes.cpp
@@ -410,6 +410,15 @@ TEST_SUBMODULE(pytypes, m) {
});
#endif
+#if PY_VERSION_HEX >= 0x03020000
+ m.def("test_memoryview_release", [](const py::function &f) {
+ const char* buf = "\x42";
+ auto view = py::memoryview::from_memory(buf, 1);
+ f(view);
+ view.release();
+ });
+#endif
+
// test_builtin_functions
m.def("get_len", [](py::handle h) { return py::len(h); });
diff --git a/tests/test_pytypes.py b/tests/test_pytypes.py
index 66d6d30a..63c5ec92 100644
--- a/tests/test_pytypes.py
+++ b/tests/test_pytypes.py
@@ -460,6 +460,20 @@ def test_memoryview(method, args, fmt, expected_view):
assert view_as_list == list(expected_view)
+@pytest.mark.skipif(sys.version_info < (3, 2), reason="API not available")
+def test_memoryview_release():
+ class C:
+ def fn(self, view):
+ self.view = view
+ assert bytes(view) == b"\x42"
+
+ c = C()
+ m.test_memoryview_release(c.fn)
+ assert hasattr(c, "view")
+ with pytest.raises(ValueError):
+ bytes(c.view)
+
+
@pytest.mark.xfail("env.PYPY", reason="getrefcount is not available")
@pytest.mark.parametrize(
"method", |
I think it would make sense to add the The primary reason I created this was for instances where I had a short-lived buffer passed to a python callback that I didn't want users in python code to accidentally retain. If they did accidentally retain the buffer, diagnosing a python exception is generally much simpler than diagnosing a segfault. Having an RAII object is useful for all the reasons that RAII is generally useful: ensures that buffer has correct behavior in case of exceptions, multiple returns, etc etc. Any usage of py::memoryview wrapping a short lived buffer that doesn't use the RAII object is a nasty bug waiting to happen, in my opinion. |
There is no But there is the extra class in the pybind11 code. It's not easy to see what it's for, it takes a small effort to understand and to realize it's only rarely useful. In my judgement that's a "con": more often distracting than useful. Therefore I feel very reluctant adding the class. If other maintainers feel differently, I'll be fine with it. |
Fine, there's no acquisition here, so it's not RAII. But it's a scope management object, which is still a really useful thing. This is the easiest correct way to release the memoryview in situations where that is required. Given that memoryviews are fairly esoteric, I don't see "this isn't used that often" as a particularly convincing argument. There's not enough data to say. If this was a golang library, I could see "don't include in the library something trivially implementable" as an acceptable argument. This is python, and python libraries tend to include all of the batteries. ;) The memoryview documentation says:
Perhaps the better way to do this would be some general context management object that emulates the with statement using an RAII object? Then you could do something like.... auto [scoped, as] = py::with(py::memoryview(buffer, size)); Or this, which seems worse. auto scope = py::with(py::memoryview(buffer, size));
auto m = scope.as<memoryview>();
There are probably other places where calling If a py::with existed, I would still put a note in the memoryview documentation about using that for short-lived buffers. |
py::with just sounds like contextlib.closing? Why not just use that, but within C++. |
If you can find an elegant approach to full RAII, that doesn't duplicate code too much, I'd find this more convincing. Caveat: I'm still having strong "of all the problems in the world ..." doubts, but I'll look again if you succeed with a full RAII implementation. |
I don't think pybind11 supports creating python generators? Maybe with an eval or something. You would need to implement a C++ version of contextlib.closing that does the same thing -- I guess it could be called
What about: py::scoped_memoryview m(buffer, sz);
It could probably inherit from py::memoryview, but disable copies/moves to ensure the scope behavior is retained? Haven't tried it yet. |
This is not "scoped". This just forcibly completes the Python Buffer Protocol for the session associated with the memoryview instance. It is within the scope of standard memoryview usage to force the buffer to be released before finalization, but that can be accomplished with a single line calling the Python object's The usage that is implied here, though, is unsafe. It doesn't prevent a seg fault in the C++ code if the C++ code is not also managing the object that provided the buffer in the first place. Once Only the developer (for a specific use case of this proposed patch) can know whether it is safe to use a pointer obtained from a memoryview after release has been called. There isn't a general way to make it safe.
Pybind already has this. If you write a function that accepts a If the |
Maybe what you are asking for is some sort of update to the move semantics of |
I think you're looking at this from the wrong angle. I don't care a whole lot about whether C++ is doing something to cause a segfault, but rather the goal is to prevent python from causing a segfault unexpectedly. python users are typically not going to be very good at debugging segfaults, but they know how to deal with exceptions.
Yes. That is exactly what I want. The specific use case that this is useful for is a callback thing like this: {
char buffer[size];
// do something with buffer to fill it
auto view = py::memoryview::from_memory(buffer, size);
py::memoryview_scoped_release release(view);
some_python_function(view);
} In that case, I created the python buffer object in the scope. The buffer is not valid after that scope exists. If the python code retains a reference to the memoryview (which it could for some reason), then it will segfault when python tries to access the buffer. The goal of this object is to ensure that release is called, so that when python tries to access the memoryview it gets an exception instead of a segfault.
Yes, one could just call memoryview release at the end of that scope. However, that wouldn't be exception safe, so you'll need to either create some kind of object whose destructor calls the release function (like this PR does), or do something more complicated. |
Ah, I see, now. My inclination is to expose the buffer protocol and provide stern warnings not to keep buffers open longer than needed (and make sure that the C++ code also keeps the owning object alive per the buffer protocol), but this certainly isn't always practical. In similar cases, I've created new types so that I could clearly document the somewhat unusual behavior of a Python object that becomes unusable when the scope changes, and used a module-custom "ScopeError" exception to draw attention to the constraints on correct usage of the exported object. But the semantics are the same as using memoryview in this way. If It might be clearer if written to clearly indicate that the new object is definitely tied to the lifetime (or period of validity) of the exposed object. The snippet above looks like a lock or guard, which allows a state to be toggled, but doesn't permanently transform another object. What about the following? {
char buffer[size];
// do something with buffer to fill it
auto handle = scoped_memoryview::from_memory(buffer, size);
some_python_function(handle.view);
} Maybe that's too big a change. I'm just thinking about how one commonly uses a name like It seems like the description and implementation of this might be better as documentation additions than as additions to the pybind utilities themselves, though. The discussion here includes some good points that might help readers of the section on |
More generically, a C++ |
memoryviews seemed useful except for the potential to access an invalid buffer. Turns out that there's builtin support to avoid that sort of thing -- though oddly enough there doesn't seem to be an explicit CAPI function for it?
Example of an error encountered after the buffer is released.. better than a segfault?