Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable implicit conversion of 0 to pybind11::handle. #4008

Merged
merged 19 commits into from
Jul 14, 2022

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Jun 15, 2022

Description

Resolves a complaint under #1309:

For additional confusion, 0 is implicitly convertible to a handle (an invalid one: the null pointer).

This PR does NOT attempt to address other requests under #1309.

Suggested changelog entry:

Implicit conversion of the literal ``0`` to ``pybind11::handle`` is now disabled.

@rwgk rwgk force-pushed the implicit_conversion_from_0_to_handle branch from 4480b95 to ca8d9fd Compare June 15, 2022 07:45
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 15, 2022

Maybe I'm having too much fun here? At second thought, I think I almost implemented backdoor inheritance / duck typing: if it looks like a PyObject * ... why not go all the way and m_ptr(reinterpret_cast<PyObject *>(ptr))? That could get rid of a lot of casts elsewhere.

Or maybe I should go the other way, try the much simpler std::is_same<T, PyObject *>?

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 15, 2022

The simpler version works everywhere, incl. GCC 8.3.0. My only worry: this seems too simple, why hasn't anybody done this before in the 4+ years that #1309 has been around?

I'll try global testing.

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 15, 2022

pytorch is very unhappy (triggers hundreds of global testing failures, below).

This is what we're up against: https://github.com/pytorch/pytorch/blob/279634f384662b7c3a9f8bf7ccc3a6afd2f05657/torch/csrc/utils/object_ptr.h#L30

third_party/py/torch/csrc/autograd/init.cpp:25:12: error: no matching conversion for functional-style cast from 'THPObjectPtr' (aka 'THPPointer<_object>') to 'py::handle'
  auto m = py::handle(autograd_module).cast<py::module>();
           ^~~~~~~~~~~~~~~~~~~~~~~~~~
./third_party/pybind11/include/pybind11/detail/../pytypes.h:223:5: note: candidate constructor not viable: no known conversion from 'THPObjectPtr' (aka 'THPPointer<_object>') to 'const pybind11::handle' for 1st argument
    handle(const handle &) = default;
    ^
./third_party/pybind11/include/pybind11/detail/../pytypes.h:224:5: note: candidate constructor not viable: no known conversion from 'THPObjectPtr' (aka 'THPPointer<_object>') to 'pybind11::handle' for 1st argument
    handle(handle &&) = default;
    ^
./third_party/pybind11/include/pybind11/detail/../pytypes.h:221:5: note: candidate template ignored: requirement 'std::is_same<THPPointer<_object>, std::nullptr_t>::value || std::is_same<THPPointer<_object>, _object *>::value' was not satisfied [with T = THPPointer<_object>]
    handle(/* PyObject* */ T ptr) : m_ptr(ptr) {} // Allow implicit conversion from PyObject*
    ^
./third_party/pybind11/include/pybind11/detail/../pytypes.h:213:5: note: candidate constructor not viable: requires 0 arguments, but 1 was provided
    handle() = default;
    ^

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 15, 2022

I should add: I didn't find any actual problems, the pytorch issue is the only type of breakage (the total number of tests is many millions).

While the implicit conversion from 0 to handle is definitely a stumbling stone during development, it never seems to make it as an actual bug to production.

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 15, 2022

The valgrind failure in the CI here is probably an easy fix, I think I'm just leaking the Python float in the new test.

But, pytorch is still acting up, at this minute no idea why:

third_party/py/torch/csrc/jit/attributes.h:19:42: warning: expression does not compute the number of elements in this array; element type is 'const char *', not 'torch::jit::AttributeKind' [-Wsizeof-array-div]
  JIT_ASSERT(size_t(kind) < sizeof(names)/sizeof(AttributeKind));
                                   ~~~~~ ^
third_party/py/torch/csrc/assertions.h:50:20: note: expanded from macro 'JIT_ASSERT'
#define JIT_ASSERT TORCH_ASSERT
                   ^
third_party/py/torch/csrc/assertions.h:27:22: note: expanded from macro 'TORCH_ASSERT'
  if (TORCH_EXPECT(!(cond), 0)) { \
                     ^~~~
third_party/py/torch/csrc/assertions.h:21:47: note: expanded from macro 'TORCH_EXPECT'
#define TORCH_EXPECT(x, y) (__builtin_expect((x), (y)))
                                              ^
third_party/py/torch/csrc/jit/attributes.h:18:22: note: array 'names' declared here
  static const char* names[] = {"f","fs","i","is","s","ss","t","ts","g","gs"};
                     ^
third_party/py/torch/csrc/jit/attributes.h:19:42: note: place parentheses around the 'sizeof(torch::jit::AttributeKind)' expression to silence this warning
  JIT_ASSERT(size_t(kind) < sizeof(names)/sizeof(AttributeKind));
                                         ^
third_party/py/torch/csrc/assertions.h:50:20: note: expanded from macro 'JIT_ASSERT'
#define JIT_ASSERT TORCH_ASSERT
                   ^
third_party/py/torch/csrc/autograd/init.cpp:25:23: error: call to implicitly-deleted copy constructor of 'THPPointer<_object>'
  auto m = py::handle(autograd_module).cast<py::module>();
                      ^~~~~~~~~~~~~~~
third_party/py/torch/csrc/utils/object_ptr.h:10:3: note: copy constructor is implicitly deleted because 'THPPointer<_object>' has a user-declared move constructor
  THPPointer(THPPointer &&p) { free(); ptr = p.ptr; p.ptr = nullptr; };
  ^
./third_party/pybind11/include/pybind11/detail/../pytypes.h:223:30: note: passing argument to parameter 'ptr' here
    handle(/* PyObject* */ T ptr) : m_ptr(ptr) {} // Allow implicit conversion from PyObject*

Ralf W. Grosse-Kunstleve added 4 commits June 16, 2022 01:18
…lidated against pytorch proper).

The first version of the reduced pytorch code was critically missing the move ctor. The first version of the accompanying test was meaningless.

Note: It turns out the `!std::is_arithmetic<T>` condition is not needed: `int` is not in general implicitly convertible to `PyObject *`, only the literal `0` is.
…from the wild (rather than changing the code).
Comment on lines 226 to 230
detail::enable_if_t<
detail::all_of<detail::negation<detail::any_of<std::is_base_of<handle, T>,
std::is_same<T, PyObject *>,
std::is_same<T, PyObject *const>,
std::is_same<T, std::nullptr_t>>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
detail::enable_if_t<
detail::all_of<detail::negation<detail::any_of<std::is_base_of<handle, T>,
std::is_same<T, PyObject *>,
std::is_same<T, PyObject *const>,
std::is_same<T, std::nullptr_t>>>,
detail::enable_if_t<
detail::all_of<detail::none_of<std::is_base_of<handle, T>,
std::is_same<T, PyObject *>,
std::is_same<T, PyObject *const>,
std::is_same<T, std::nullptr_t>>

This probably still isn't the best way to do this, will this match volatile, mutable etc pointers. I think we may need to use detail::remove_cv<T>::type or something similar here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there is detail::satisifies_none_of for the std::is_same block, but i am not sure its that much cleaner. Might be a bit when using remove_cv_ref or other behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was still thinking about the cv cases.
The most important thing is to come up with corresponding unit tests first. Without we (me at least) are certain to get lost in the incomprehensible maze of C++ rules and special cases.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk I feel as though some template library has to have this implemented already with all the edge cases we need and we can just use that.

Copy link
Collaborator Author

@rwgk rwgk Jun 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"feel" and "probably" comments like this are massively unhelpful.
You gave up really quickly (#4006).
I'm presenting a solution that is already globally tested.
If you think this can be done better, please show how.
With tests.
I'm always happy to accept something that is provably better, and will be happy to globally test that for you.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Apologies, I didn't intend to come off rude or dismissive of your efforts. I was just making a generic note for future discussions.

Thanks for providing this really helpful PR so far and testing against all the edge case in PyTorch. I just meant to comment on how surprised I was to find that disabling this troublesome implicit conversion in C++ is so difficult. Thanks for all the effort.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Skylion007 I added all the tests that I could think of (5340256) to validate that the only behavior change is to disable the implicit conversion from 0 to handle. I left a temporary #ifdef in case you have ideas for additional tests, or want to validate yourself that the commented-out code fails to compile before and after this PR.

Between

  • the existing unit tests
  • the added operator PyObject *() test reduced from the wild (pytorch)
  • the added operator PyObject *() const test
  • the added pure_compile_tests_for_handle_from_PyObject_pointers() tests
  • and global testing

I think we are extremely well covered. Maybe the SFINAE constructs could be polished, but unless we have more additional tests that expose a hole, I'd say this is as good as it gets.

I want to rerun the global testing though, although I don't think the change to none_of would break anything.

@rwgk rwgk marked this pull request as ready for review June 18, 2022 16:55
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 18, 2022

@Skylion007 I took this out of Draft mode now. After factoring out the is_pyobj_ptr_or_nullptr_t subexpression it almost looks simple again. I don't think it can be made any simpler, there is a clear reason for each of the sub-conditions.

BTW I did try google/stackoverflow at the beginning, but didn't find anything in a reasonable amount of time, then decided to experiment with it myself. I think there is no general/library-kind solution because "it depends", in our case we have to deal with the subclasses of handle and we need to support implicit conversions via operator PyObject *() , which is maybe something that one wouldn't do if there were not existing users. Also, most people probably don't care enough about the very special 0 literal trap.

@rwgk rwgk changed the title Disable implicit conversion from 0 to pybind11::handle. Disable implicit conversion of 0 to pybind11::handle. Jun 18, 2022
Copy link
Collaborator

@Skylion007 Skylion007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me if it didn't cause any breakages in Google Test Suite. Thoughts @henryiii ?

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this just turns a mistake into a compiler error, I'd say it's a good step in the right direction. Can't look into it in too much detail (at SciPy).

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 14, 2022

Thanks for the approvals!

If this just turns a mistake into a compiler error,

Yes, that's exactly what it is, and only exactly that (validated with the extra tests and global testing).

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 14, 2022

Retested here with the CI, but skipping global retesting. I believe it is very unlikely that something new slipped in.

@rwgk rwgk merged commit 1d81191 into pybind:master Jul 14, 2022
@rwgk rwgk deleted the implicit_conversion_from_0_to_handle branch July 14, 2022 16:53
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Jul 14, 2022
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Oct 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants