Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate the strange issues associated with implicit STL conversion #1244

Closed
3 tasks
ProfFan opened this issue Jul 14, 2022 · 12 comments
Closed
3 tasks

Investigate the strange issues associated with implicit STL conversion #1244

ProfFan opened this issue Jul 14, 2022 · 12 comments

Comments

@ProfFan
Copy link
Collaborator

ProfFan commented Jul 14, 2022

#1239 #1203 both are hit by this strange problem.

Either we are doing something that should not be done, or there is a bug (probably UB) in pybind11.

For #1203 it's looks very similar to pybind/pybind11#1035

Investigation TODO List:

  • Run Python wrapper with ASAN (all unit tests)
  • Run Python wrapper with UBSAN (all unit tests)
  • Include stl.h and rerun ASAN/UBSAN (all unit tests)
@ProfFan
Copy link
Collaborator Author

ProfFan commented Jul 14, 2022

Pinging @gchenfc @varunagrawal, have you ever encountered such behavior before?

@gchenfc
Copy link
Member

gchenfc commented Jul 14, 2022

I couldn't see what was actually the issue with 1203, but for #1239, could it be just that this conversion is clashing with KeyVector defined in specializations/gtsam.h and preamble/gtsam.h ?

@varunagrawal
Copy link
Collaborator

I think the issue is due to the opaque behavior of the JacobianVector aka vector<Matrix>. My guess is that since we don't declare vector<Matrix> as an opaque type, the updates on the python side are not being pushed back to the C++ side (which explains the error messages).

More info.

@ProfFan
Copy link
Collaborator Author

ProfFan commented Jul 15, 2022

@varunagrawal But this is the commit the fixed the crash, and JacobianVector has already been marked opaque...

@gchenfc That is also a good point, there is no documentation on what will happen is the same type gets bound by both stl.h and stl_bind.h in different compilation units! I can see the devil of ODR violation smiling at me now...

@varunagrawal
Copy link
Collaborator

JacobianVector is marked opaque but then why is pybind passing in a list or np arrays? Shouldn't it pass in a JacobianVector object?

This is also a good time to revisit my PR on using stl.h exclusively. It's on my gtsam repo.

@varunagrawal
Copy link
Collaborator

My previous comments come from looking at the values. I ran print-debugging on both the python and C++ side, and where the A matrix should be filled with Jacobians after call the to error_function_, it still has matrices of size 0x0 which seems to indicate that the mechanism for updating H is not working properly or pybind is not updating things back on the C++ side.

@ProfFan
Copy link
Collaborator Author

ProfFan commented Jul 17, 2022

@varunagrawal JacobianVector is marked opaque and then bound with bind_vector, so it appears as an array of Eigen matrices to pybind, and Eigen arrays are treated as numpy arrays by Pybind11. (this is what is supposed to happen)
However with stl.h added it seems the stl.h conversion takes precedence and the array is copied (per the stl.h limitations), which exactly matches what you are seeing.

@varunagrawal
Copy link
Collaborator

Okay that makes sense. I get the feeling that we should then not use bind_vector since pybind11/stl.h supports vectors right out of the box. Whether the copy semantics are as expected by us, that is something we can test relatively quickly (ensure that we still use the opaque functionality).

@ProfFan
Copy link
Collaborator Author

ProfFan commented Jul 17, 2022

Problem with auto binding is that nothing will be mutable, for example the KeyVector

@varunagrawal
Copy link
Collaborator

That's fine because KeyVector will be removed. Instead we'll just use regular lists.

That being said, I updated the stl.h PR and I still see this issue, so my guess is that it's something deeper.

@varunagrawal
Copy link
Collaborator

@gchenfc is on the money with his diagnosis. Part of it is the clash with KeyVector.

@varunagrawal
Copy link
Collaborator

I think this issue is superseded by the fact that we support full STL binding in the python wrapper. Closing as a result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants