-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unifying 2D and 3D messages #2
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks like a good suggestion. We generally recommend using generic 3D datatypes even in mostly 2D processing pipelines to keep the pipelines compatible. 2D processors should either have appropriate methods to either collapse or ignore the Z elements.
The above is on the caveat that a 2D bounding box in pixel space is different as it's a completely different unit and if this is expected to represent bounding boxes in pixel space that should be a different datatype.
msg/BoundingBox.msg
Outdated
geometry_msgs/Point size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this to a Point?
This should be documented how to resolve this. What is it relative to? The pose of the pose
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted to the original Vector type, and added a better description.
msg/BoundingBox.msg
Outdated
@@ -1,7 +1,7 @@ | |||
# Note: This message plans to move to geometry_msgs in the final implementation. | |||
|
|||
# position and rotation of the bounding box | |||
# position and orientation of the bounding box center | |||
geometry_msgs/Pose pose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in this PR but related to the other comment. This should probably be more considered an 'origin' or something instead of echoing the datatype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I improved the description.
+1 on combining ObjectHypothesis. The Thanks for the PR! |
This reverts commit 12f0451.
also changed datatype to uint32 as we are in the pixel space for 2D BB
# The x/y position and (optional) rotation of the bounding box center. | ||
geometry_msgs/Pose2D pose | ||
# The 2D position (in pixels) and orientation of the bounding box origin. | ||
geometry_msgs/Pose2D origin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honest question: Is it common to have rotated bounding boxes in 2D classification? I've only seen axis-aligned bounding boxes so far. If so, we should replace the Pose2D with a "Point2D" (which doesn't exist yet), preferably with uint32 coordinates instead of float. If (almost) nobody uses the rotation, many people won't implement it correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rotated 2D bounding boxes often is used for grasp generators and human annotations. Since the bounding boxes will probably move to geometry_msgs, I think that a generic rotated version should be best.
# to the pose of its origin | ||
uint32 size_x | ||
uint32 size_y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to my previous comment: If we allow the bounding box origin to be given as floats with rotation, we should also allow floats for the size. The unit should still be pixels. However, I would prefer to remove the rotation and use uint32 everywhere (bounding box origin + size).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sub-pixel bounding boxes are possible when using machine learning-based object detectors, but it's worth asking if that information is worth storing.
I'll also note that without floats for origin
, it becomes difficult to accurately position an even-sized bounding box (if the BB is 2x2, where is its center?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that by origin
, @procopiostein meant something like the top-left corner, not the center. This is why a 2x2 BB is not a problem: it's the box between (origin.x, origin.y) and (origin.x + size_x, origin.y + size_y).
Let's continue the discussion in #5. |
IMO I think we gain in simplicity by reducing the number of messages by merging, when possible, 2D and 3D msgs.
A pose can be interpreted as 2D or 3D according to the value on the z field.
For example, in many cases, wheeled robots in ROS publish their pose as a geometry_msgs/Pose, (inherently 3D) although they are always at the ground level.