Commit

Reorganized and cleaned up
Kukanani committed May 9, 2017
1 parent 08b2370 commit bfc8269
Showing 8 changed files with 92 additions and 16 deletions.
17 changes: 11 additions & 6 deletions CMakeLists.txt
@@ -45,12 +45,17 @@ find_package(catkin REQUIRED COMPONENTS
## * uncomment the generate_messages entry below
## * add every package in MSG_DEP_SET to generate_messages(DEPENDENCIES ...)

## Generate messages in the 'msg' folder
# add_message_files(
# FILES
# Message1.msg
# Message2.msg
# )
# Generate messages in the 'msg' folder
add_message_files(
FILES
Classification2D.msg
Classification3D.msg
ClassifierInfo.msg
Detection2D.msg
Detection2DArray.msg
Detection3D.msg
Detection3DArray.msg
)

## Generate services in the 'srv' folder
# add_service_files(
68 changes: 68 additions & 0 deletions README
@@ -0,0 +1,68 @@
# ROS Vision Messages Proposal

## Introduction

This package is a proposed set of messages to unify computer
vision and object detection efforts in ROS. Please feel free
to provide feedback using GitHub's issues, or suggest changes
or add functionality with a pull request.

## Overview

The messages in this package define a common outward-facing interface
for vision-based classifiers. They are meant to support two
primary types of classifiers:

1. **"Pure" Classifiers**, which identify class probabilities given a single
sensor input
2. **Detectors**, which identify class probabilities as well as the poses of
those classes given a sensor input

Message types exist separately for 2D (using `sensor_msgs/Image`) and 3D (using
`sensor_msgs/PointCloud2`). The metadata that is stored for each object is
application-specific, and so this package places very few constraints on the
metadata. Each possible detection result must have a unique numerical ID so
that it can be unambiguously and efficiently identified in the results messages.
Object metadata such as name, mesh, etc. can then be looked up from a database.

The only other requirement is that the metadata database can be stored in a
ROS parameter. We expect a classifier to load the database to the parameter
server in a manner similar to how URDFs are loaded and stored there (see [6]),
most likely defined in an XML format. This expectation may be further refined
in the future using a ROS Enhancement Proposal, or REP [7].
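
As a rough illustration of the lookup described above, the sketch below parses
an XML metadata database and indexes it by numerical ID. The XML layout
(`object_database`, `object`, and the `id`, `name`, `mesh` attributes) and the
function name `parse_metadata_database` are assumptions made for this example
only; the proposal does not fix a schema. In a ROS node the XML string would
typically be read from the parameter server (for example with
`rospy.get_param`), but a literal is used here so the sketch runs without ROS.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: the proposal leaves the exact schema to the user.
EXAMPLE_DB_XML = """
<object_database>
  <object id="0" name="mug" mesh="package://example_pkg/meshes/mug.dae"/>
  <object id="1" name="stapler" mesh="package://example_pkg/meshes/stapler.dae"/>
</object_database>
"""


def parse_metadata_database(xml_text):
    """Return a dict mapping each numerical object ID to its metadata."""
    root = ET.fromstring(xml_text.strip())
    database = {}
    for element in root.findall("object"):
        attributes = dict(element.attrib)
        object_id = int(attributes.pop("id"))
        # IDs must be unique so results messages can refer to them unambiguously.
        if object_id in database:
            raise ValueError("duplicate object ID: %d" % object_id)
        database[object_id] = attributes
    return database


# Stand-in for the string a node would fetch from the parameter server.
database = parse_metadata_database(EXAMPLE_DB_XML)
print(database[1]["name"])  # prints "stapler"
```

A results message then only needs to carry the integer ID; the name, mesh, and
any other application-specific fields are resolved from the dictionary.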

We would also like classifiers to have a way to signal when the database has
been updated, so that listeners can respond accordingly. The database might be
updated in the case of online learning. To solve this problem, each classifier
can publish a message to a topic signaling that the database has been updated,
and can increment a database version that is continually published with the
classifier information.
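
One way a listener might react to the version number is sketched below: it
caches the database and reloads it only when the published version changes.
The class `DatabaseCache`, its method names, and the idea of passing the
version as a plain integer are hypothetical; the actual field carrying the
version is not specified here.

```python
class DatabaseCache:
    """Keeps a local copy of the metadata database, reloading on version change."""

    def __init__(self, load_database):
        # load_database: callable that returns the current database, e.g. a
        # function that reads the ROS parameter and parses its contents.
        self._load_database = load_database
        self._version = None
        self.database = None
        self.reload_count = 0

    def on_classifier_info(self, database_version):
        """Handle the version from a classifier info message.

        Returns True if the database was (re)loaded, False otherwise.
        """
        if database_version == self._version:
            return False
        self.database = self._load_database()
        self._version = database_version
        self.reload_count += 1
        return True


# Simulated classifier-side storage standing in for the parameter server.
store = {0: "mug"}
cache = DatabaseCache(lambda: dict(store))

cache.on_classifier_info(1)  # first message: database is loaded
cache.on_classifier_info(1)  # same version: nothing to do
store[1] = "stapler"         # classifier learns a new object online...
cache.on_classifier_info(2)  # ...and bumps the version: database is reloaded
print(cache.reload_count, sorted(cache.database))  # prints "2 [0, 1]"
```

Comparing versions rather than reloading on every message keeps the listener
cheap when the classifier information is published continually.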

## Messages

* Classification2D and Classification3D: pure classification without pose
* Detection2D and Detection3D: classification + pose
* ClassifierInfo: Information about a classifier, such as its name and where
to find its metadata database.

By using a very general message definition, we hope to cover as many of the
various computer vision use cases as possible. Some examples of use cases that
can be fully represented are:

* Bounding box multi-object detectors with tight bounding box predictions,
such as YOLO [1]
* Class-predicting full-image detectors, such as TensorFlow examples trained
on the MNIST dataset [2]
* Full 6D-pose recognition pipelines, such as LINEMOD [3] and those included
in the Object Recognition Kitchen [4]
* Custom detectors that use various point-cloud based features to predict
object attributes (one example is [5])

## References
* [1] [YOLO](https://pjreddie.com/darknet/yolo/)
* [2] [TensorFlow MNIST](https://www.tensorflow.org/get_started/mnist/beginners)
* [3] [LINEMOD]()
* [4] [Object Recognition Kitchen](https://wg-perception.github.io/ork_tutorials/tutorial03/tutorial.html)
* [5] [Attribute Detector](http://campar.in.tum.de/pub/hinterstoisser2011linemod/hinterstoisser2011linemod.pdf)
* [6] [URDFs on the parameter server](http://wiki.ros.org/urdf/Tutorials/Using%20urdf%20with%20robot_state_publisher#Launch_File)
* [7] [ROS Enhancement Proposals](http://www.ros.org/reps/rep-0000.html)
2 changes: 1 addition & 1 deletion msg/ClassifierInfo.msg
@@ -11,7 +11,7 @@
Header header

# Name of the classifier
String method
string method

# ROS parameter name where the metadata database is stored in XML format.
# The exact information stored in the database is left up to the user.
6 changes: 3 additions & 3 deletions msg/Detection2D.msg
@@ -5,11 +5,11 @@
# to be located in the larger image.

# non-geometric results of detection.
Classification2D classification
vision_msgs/Classification2D classification

# The x/y position and (optional) rotation of the bounding box.
geometry_msgs/Pose2D pose

# The size of the bounding box surrounding the object. The center of the
# bounding box is the position of the detection point.
# (Optional) The size of the bounding box surrounding the object. The center of
# the bounding box is the position of the detection point.
geometry_msgs/Vector2 bbox_size
4 changes: 2 additions & 2 deletions msg/Detection3DList.msg → msg/Detection2DArray.msg
@@ -1,5 +1,5 @@
Header header
# A list of 2D detections, for a multi-object 2D detector.

# list of the detected proposals. For a multi-proposal detector, this list could
# have many objects.
Detection3D[] results
vision_msgs/Detection2D[] detections
2 changes: 1 addition & 1 deletion msg/Detection3D.msg
@@ -5,7 +5,7 @@
# to be located in the larger image.

# non-geometric class probability.
Classification3D classification
vision_msgs/Classification3D classification

# The element's pose. This pose should be
# defined as the pose of some fixed reference point on the object, such as the
4 changes: 2 additions & 2 deletions msg/Detection2DList.msg → msg/Detection3DArray.msg
@@ -1,5 +1,5 @@
Header header
# A list of 3D detections, for a multi-object 3D detector.

# list of the detected proposals. For a multi-proposal detector, this list could
# have many objects.
Detection2D[] results
vision_msgs/Detection3D[] results
5 changes: 4 additions & 1 deletion package.xml
@@ -2,7 +2,10 @@
<package format="2">
<name>vision_msgs</name>
<version>0.0.0</version>
<description>Classifier-agnostic computer vision classification and detection messages.</description>
<description>
PROPOSED DRAFT of: Classifier-agnostic computer vision classification and
detection messages.
</description>

<!-- One maintainer tag required, multiple allowed, one person per tag -->
<!-- Example: -->