Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unifying 2D and 3D messages #2

Merged
merged 5 commits into from
May 23, 2017
Merged

Conversation

procopiostein
Copy link
Contributor

IMO I think we gain in simplicity by reducing the number of messages by merging, when possible, 2D and 3D msgs.
A pose can be interpreted as 2D or 3D according to the value on the z field.
For example, in many cases, wheeled robots in ROS publish their pose as a geometry_msgs/Pose, (inherently 3D) although they are always at the ground level.

Copy link

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks like a good suggestion. We generally recommend using generic 3D datatypes even in mostly 2D processing pipelines to keep the pipelines compatible. 2D processors should either have appropriate methods to either collapse or ignore the Z elements.

The above is on the caveat that a 2D bounding box in pixel space is different as it's a completely different unit and if this is expected to represent bounding boxes in pixel space that should be a different datatype.

geometry_msgs/Point size
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this to a Point?

This should be documented how to resolve this. What is it relative to? The pose of the pose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted to the original Vector type, and added a better description.

@@ -1,7 +1,7 @@
# Note: This message plans to move to geometry_msgs in the final implementation.

# position and rotation of the bounding box
# position and orientation of the bounding box center
geometry_msgs/Pose pose
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR but related to the other comment. This should probably be more considered an 'origin' or something instead of echoing the datatype.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I improved the description.

@Kukanani
Copy link
Collaborator

+1 on combining ObjectHypothesis. The DetectionXD messages will also have to be updated to point to the new message type.

Thanks for the PR!

@procopiostein
Copy link
Contributor Author

Agreed @tfoote , I have not thought about that. I reverted the related commit and tried to improved the description and fields. Also changed the datatype for the size of a 2D BB to uint32, as it should be in pixels.

@Kukanani I updated the DetectionXD to use the generic ObjectHypothesisWithPose

# The x/y position and (optional) rotation of the bounding box center.
geometry_msgs/Pose2D pose
# The 2D position (in pixels) and orientation of the bounding box origin.
geometry_msgs/Pose2D origin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honest question: Is it common to have rotated bounding boxes in 2D classification? I've only seen axis-aligned bounding boxes so far. If so, we should replace the Pose2D with a "Point2D" (which doesn't exist yet), preferably with uint32 coordinates instead of float. If (almost) nobody uses the rotation, many people won't implement it correctly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rotated 2D bounding boxes often is used for grasp generators and human annotations. Since the bounding boxes will probably move to geometry_msgs, I think that a generic rotated version should be best.

# to the pose of its origin
uint32 size_x
uint32 size_y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my previous comment: If we allow the bounding box origin to be given as floats with rotation, we should also allow floats for the size. The unit should still be pixels. However, I would prefer to remove the rotation and use uint32 everywhere (bounding box origin + size).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with both your comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sub-pixel bounding boxes are possible when using machine learning-based object detectors, but it's worth asking if that information is worth storing.

I'll also note that without floats for origin, it becomes difficult to accurately position an even-sized bounding box (if the BB is 2x2, where is its center?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that by origin, @procopiostein meant something like the top-left corner, not the center. This is why a 2x2 BB is not a problem: it's the box between (origin.x, origin.y) and (origin.x + size_x, origin.y + size_y).

@Kukanani Kukanani merged commit 1929d27 into ros-perception:master May 23, 2017
@mintar mintar mentioned this pull request May 24, 2017
@mintar
Copy link
Contributor

mintar commented May 24, 2017

Let's continue the discussion in #5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants