Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to create object matrices #1152

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

m-chaturvedi
Copy link

@m-chaturvedi m-chaturvedi commented Oct 20, 2017

Please excuse the lack of tests. This PR is towards issue #647 that @mwoehlke-kitware had created. I'm trying to get feedback of the developers on the PR. Please let me know of the mistakes I've made and the changes you'd like.

Basically I'm explicit copying the Eigen::Matrix to the array type and vice-versa in cast and load respectively. For the load for Eigen::Ref as well, I make an explicit copy into a Eigen::Ref I created. I register the type using PYBIND11_NUMPY_OBJECT_DTYPE.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some lightweight comments down below (just as a commentator, nothing authoritative for pybind11).

@@ -10,6 +10,7 @@
#pragma once

#include "numpy.h"
#include <numpy/ndarraytypes.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW minor formatting nit This uses a different include style <> than above "".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

a = array({ src.rows(), src.cols() }, { elem_size * src.rowStride(), elem_size * src.colStride() },
src.data(), base);
using Scalar = typename props::Type::Scalar;
if (props::vector) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW minor formatting nit Could you (a) use is_pyobject and (b) "transpose" and reverse this nested if statement, just to keep the original code close together?
e.g.

if (!is_pyobject) {
   if (props::vector) { ... }
   else { ... }
} else {
   ...  // new code
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


int result = detail::npy_api::get().PyArray_CopyInto_(ref.ptr(), buf.ptr());
if (npy_format_descriptor<Scalar>::value == npy_api::NPY_OBJECT_) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW For consistency, could you place this value into is_py_object? And same as above, could this be transposed / reversed to keep the original code more visible?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -453,15 +543,55 @@ struct type_caster<
// We need to copy: If we need a mutable reference, or we're not supposed to convert
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW Can this if statement be joined to the one above via an else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this can be done, need_copy is being changed inside if(!need_copy)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, missed that!

@EricCousineau-TRI
Copy link
Collaborator

EricCousineau-TRI commented Oct 26, 2017

Can I ask if there's a possibility of incorporating this into the unittests that are present (in tests/test_eigen.py)?
Looking at the Ref<> logic, I'd also like to see what the constraints are - can we not pass mutable Ref<ObjectType> matrices?
(If not, is there a way to work around this? If they are just object pointers, could we use a reference RVP for the scalars?)

Also, naive question, but even if not using Ref<>, would NumPy assignment semantics drop the existing object? (If it's just an array of pointers, does it just assign over the reference rather than use operator=(...)?)
Is that one of the main challenge that causes Ref<> to be copy-only?
If so, is there possibly a way around this? (e.g. overriding NumPy's __setitem__ when in Python, if possible?)
https://docs.scipy.org/doc/numpy-1.12.0/user/basics.subclassing.html

e.g.

class Test {
 public:
  Test() { x_ << Variable("a"), Variable("b"); }
  VectorX<Variable>& mutable_x() { return x_; }  // bind with return_value_policy::reference
 private:
  VectorX<Variable> x_(2);
}

and in Python:

>>> t = Test()
>>> x = Test.mutable_x()
>>> x[1] = Variable('z')
>>> print(x)
[a, z]
>>> print(Test.mutable_x())
# Will this print "[a, b]" or "[a, z]"?

EDIT: Per @jagerman's earlier comment:

Assuming T is a pybind-registered type, it should be entirely possible to cast the pointer using return_value_policy::reference to get a Python object that wraps (but doesn't copy) the pointed-at object. [...]
Going the other way doesn't seem possible, as far as I can see.

It seems like this could be possible if there is some sort of "hot swap" with the incoming Python ndarray - e.g. when pybind11 receives a request to convert ndarray to something like Ref<T> or T& / T*, then internally, we could convert that to a dense-ish Eigen matrix, stash a copy somewhere (helpful if ndarray allows __dict__ additions), and re-map the existing objects in the ndarray to reference the dense-ish Eigen matrix values.
The caveat to this is that it could potentialy "lock in" the behavior of the incoming ndarray matrix, which would be weird; however, there could be some sort of "release" back to a simple ndarray matrix when there is some sort of scope change (e.g. when the C++ function call is complete, if the reference is not stored).

(That being said, I wouldn't expect this kind of feature to come in at this point. Possibly just for future consideration.)

EDIT 2: May move this to the original issue conversation at some point.

@m-chaturvedi
Copy link
Author

Can I ask if there's a possibility of incorporating this into the unittests that are present (in tests/test_eigen.py)?

I apologize, again, for the lack of tests, I'm working on them. In the first pass I was wondering if there are any egregious mistakes in the approach, and the replacements for PyArray_SETITEM and PyArray_GETITEM, since numpy/ndarraytypes.h can't be invoked on CI.


Looking at the Ref<> logic, I'd also like to see what the constraints are - can we not pass mutable Ref matrices?

I'm not sure if I'm answering your question. But my understanding is that Ref which when supported for PODTypes didn't allow non-writable matrices because it was transparently making changes numpy.ndarry. Since I am making an explicit copy, I let go of this restriction and don't return false even if the matrix is not writable if the array is of non-POD type. Quoting from their docs:

When a bound function parameter is instead Eigen::Ref (note the lack of const), pybind11 will only allow the function to be called if it can be mapped and if the numpy array is writeable (that is a.flags.writeable is true). Any access (including modification) made to the passed variable will be transparently carried out directly on the numpy.ndarray.

So, if I'm passing a numpy matrix of AutoDiffScalars with writeable = False from python and I have Ref<Matrix<AutoDiffScalar>> in the corresponding C++, with my change it will work. Because of this. However, if a numpy matrix of doubles with writeable=False is passed to Ref<MatrixXd> it won't work because of the documented behavior. My rationale for not caring about is that, I'm not heeding the presence of Ref for non-PODTypes, but it may be stupid.


When your code is run with s/Test/t in the right places, it would do [a,b]. This won't be true if a matrix of doubles is returned using internal_reference. I need to see why is that happening, I seem to creating a copy even after chaing the policy = reference_internal in the line here.

However, the behviour with doubles which I outlined wasn't there before, probably because we were not passing base which is used to decide whether to move or not now?

@m-chaturvedi m-chaturvedi force-pushed the numpy_obj_matrix branch 8 times, most recently from e1657ea to c62f5c8 Compare November 14, 2017 16:00
@m-chaturvedi m-chaturvedi changed the title WIP: Add ability to create object matrices Add ability to create object matrices Nov 14, 2017
@EricCousineau-TRI
Copy link
Collaborator

For a follow-up to this PR, there may be a way to generally avoid the need for object copies / using NPY_OBJECT and instead define a Custom Data type (> NPY_USERDEF), such that data is actually the same as what the Eigen matrix contains (a dense array of T) rather than an array of Python objects (with the added overhead of conversion at construction).

https://docs.scipy.org/doc/numpy-1.13.0/user/c-info.beyond-basics.html#user-defined-data-types
https://docs.scipy.org/doc/numpy-1.13.0/reference/c-api.dtype.html#c.NPY_USERDEF
https://docs.scipy.org/doc/numpy-1.13.0/reference/c-api.types-and-structures.html#c.PyArray_ArrFuncs

@EricCousineau-TRI
Copy link
Collaborator

EricCousineau-TRI commented Mar 22, 2020

FTR, my above comments about avoiding copies are out of scope for this PR, and should be excluded from consideration for now.

We have some modifications in our fork relevant to this PR; I would be happy to help bring them in.

@gabor-varga
Copy link

gabor-varga commented Mar 23, 2020

Hello @EricCousineau-TRI @wjakob, @jagerman

I work on the new flight dynamics software infrastructure for the European Space Agency. This software will be used to fly missions such as SOLO, BepiColombo, JUICE and more (see https://www.esa.int/ESA/Our_Missions). One major selling point is the ease-of-use using the python interface. We plan to make the software open source and available to European industry and academia.

I am very interested in this PR, as it would add the possibility to handle automatic differentiation much easier on the C++/python interface level with Eigen/numpy. Due to strict third-party policy, pybind11 is already a library I have to push to management (so forks are hard to sell).

@wjakob @jagerman Do you have a schedule on the next release and/or a discussion on which PR-s to include?

Thank you for your time.

@gabor-varga
Copy link

Hi there,

I would like to refresh the ping on this PR. In the meantime I have used https://github.com/RobotLocomotion/pybind11 to implement my use-case of object matrices, specifically using custom autodiff scalar types together with Eigen/numpy interface.

In the end, I would prefer using the primary pybind11 repository. This PR might need to be updated with the latest changes/optimisations/bugfixes from the fork @EricCousineau-TRI @m-chaturvedi are you still maintaining that?

I see @henryiii has been doing some nice work on the repo, do you see a chance this feature could be added? Let me know @wjakob @jagerman if you have any thoughts. Thank you in advance!

@benbovy
Copy link

benbovy commented Dec 7, 2021

I'd also be interested in creating and processing Numpy arrays of dtype=np.object (where objects are custom, non-POD C++ classes exposed in Python) from C++ with pybind11. Ideally conversion should be fast (pass-by-reference).

It is not super clear to me if this PR would (partially) enable that, though, as it also involves Eigen stuff (which I'm not using). I've also looked at the https://github.com/RobotLocomotion/pybind11 fork and the various issues linked to #2259 but it was hard for me (due to my lack of experience) to find out which information or code addition is relevant for Numpy object arrays.

More specifically, I've tried reusing https://github.com/RobotLocomotion/pybind11/blob/drake/include/pybind11/numpy.h#L1282-L1298 and https://github.com/eacousineau/repro/blob/43407e3/python/pybind11/custom_tests/pybind11_numpy_scalar.h but didn't succeed in creating or processing Numpy objects array from C++. Is there something I'm missing?

I've also noticed the ongoing efforts to allow the creation of custom ufuncs (e.g., in #1325), which is something I'd really want to be able to do for Numpy object arrays. Is there a workaround for this implemented in RobotLocomotion's pybind11 fork or in pydrake?

@EricCousineau-TRI, if you have any pointer to relevant code parts and/or examples that would be super helpful!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants