Reorganized and cleaned up

RonaldEnsing · May 9, 2017 · bfc8269
1 parent 08b2370

Showing 8 changed files with 92 additions and 16 deletions.
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -45,12 +45,17 @@ find_package(catkin REQUIRED COMPONENTS
 ##   * uncomment the generate_messages entry below
 ##   * add every package in MSG_DEP_SET to generate_messages(DEPENDENCIES ...)
 
-## Generate messages in the 'msg' folder
-# add_message_files(
-#   FILES
-#   Message1.msg
-#   Message2.msg
-# )
+# Generate messages in the 'msg' folder
+add_message_files(
+  FILES
+  Classification2D.msg
+  Classification3D.msg
+  ClassifierInfo.msg
+  Detection2D.msg
+  Detection2DArray.msg
+  Detection3D.msg
+  Detection3DArray.msg
+)
 
 ## Generate services in the 'srv' folder
 # add_service_files(

diff --git a/README b/README
@@ -0,0 +1,68 @@
+# ROS Vision Messages Proposal
+
+## Introduction
+
+This package is a proposed set of messages to unify computer
+vision and object detection efforts in ROS. Please feel free
+to provide feedback using GitHub's issues, or suggest changes
+or add functionality with a pull request.
+
+## Overview
+
+The messages in this package define a common outward-facing interface
+for vision-based classifiers. They are meant to support two primary
+types of classifiers:
+
+  1. **"Pure" Classifiers**, which identify class probabilities given a single
+  sensor input
+  2. **Detectors**, which identify class probabilities as well as the poses of
+  those classes given a sensor input
+
+Message types exist separately for 2D (using `sensor_msgs/Image`) and 3D (using
+`sensor_msgs/PointCloud2`). The metadata that is stored for each object is
+application-specific, and so this package places very few constraints on the
+metadata. Each possible detection result must have a unique numerical ID so
+that it can be unambiguously and efficiently identified in the results messages.
+Object metadata such as name, mesh, etc. can then be looked up from a database.
+
+The only other requirement is that the metadata database can be stored in a
+ROS parameter. We expect a classifier to load the database to the parameter
+server in a manner similar to how URDFs are loaded and stored there (see [6]),
+most likely defined in an XML format. This expectation may be further refined
+in the future using a ROS Enhancement Proposal, or REP [7].
+
+We would also like classifiers to have a way to signal when the database has
+been updated, so that listeners can respond accordingly. The database might be
+updated in the case of online learning. To solve this problem, each classifier
+can publish a message to a topic signaling that the database has been updated,
+and increment a database version that is continually published with the
+classifier information.
+
+## Messages
+
+  * Classification2D and Classification3D: pure classification without pose
+  * Detection2D and Detection3D: classification + pose
+  * ClassifierInfo: Information about a classifier, such as its name and where
+  to find its metadata database.
+
+By using a very general message definition, we hope to cover as many of the
+various computer vision use cases as possible. Some examples of use cases that
+can be fully represented are:
+
+  * Bounding box multi-object detectors with tight bounding box predictions,
+  such as YOLO [1]
+  * Class-predicting full-image detectors, such as TensorFlow examples trained
+  on the MNIST dataset [2]
+  * Full 6D-pose recognition pipelines, such as LINEMOD [3] and those included
+  in the Object Recognition Kitchen [4]
+  * Custom detectors that use various point-cloud based features to predict
+  object attributes (one example is [5])
+
+## References
+  * [1] [YOLO](https://pjreddie.com/darknet/yolo/)
+  * [2] [TensorFlow MNIST](https://www.tensorflow.org/get_started/mnist/beginners)
+  * [3] [LINEMOD](http://campar.in.tum.de/pub/hinterstoisser2011linemod/hinterstoisser2011linemod.pdf)
+  * [4] [Object Recognition Kitchen](https://wg-perception.github.io/ork_tutorials/tutorial03/tutorial.html)
+  * [5] [Attribute Detector]()
+  * [6] [URDFs on the parameter server](http://wiki.ros.org/urdf/Tutorials/Using%20urdf%20with%20robot_state_publisher#Launch_File)
+  * [7] [ROS Enhancement Proposals](http://www.ros.org/reps/rep-0000.html)
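
Note on the README's database-versioning paragraph: the idea that listeners react to an incremented database version can be sketched in plain Python, without ROS. This is only an illustration under stated assumptions. `ClassifierInfoStub`, `MetadataCache`, and the field names `database_location` and `database_version` are hypothetical and are not taken from the message definitions in this commit; only `method` appears in the diff.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ClassifierInfoStub:
    """Hypothetical stand-in for a ClassifierInfo message (field names assumed)."""
    method: str              # name of the classifier
    database_location: str   # assumed: ROS parameter name holding the metadata database
    database_version: int    # assumed: incremented by the classifier on every update


@dataclass
class MetadataCache:
    """Reloads object metadata only when the published database version changes."""
    load_database: Callable[[str], Dict[int, str]]
    version: Optional[int] = None
    names: Dict[int, str] = field(default_factory=dict)
    reloads: int = 0

    def on_classifier_info(self, info: ClassifierInfoStub) -> None:
        # A changed version means the classifier updated its database
        # (for example after online learning), so fetch it again.
        if info.database_version != self.version:
            self.names = self.load_database(info.database_location)
            self.version = info.database_version
            self.reloads += 1

    def lookup(self, object_id: int) -> str:
        # Results carry only numerical IDs; names come from the database.
        return self.names.get(object_id, "unknown")
```

In a real node, `load_database` would presumably read and parse the ROS parameter; here it is left as a caller-supplied function so the sketch stays independent of any ROS API.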
diff --git a/msg/ClassifierInfo.msg b/msg/ClassifierInfo.msg
@@ -11,7 +11,7 @@
 Header header
 
 # Name of the classifier
-String method
+string method
 
 # ROS parameter name where the metadata database is stored in XML format.
 # The exact information stored in the database is left up to the user.

diff --git a/msg/Detection2D.msg b/msg/Detection2D.msg
@@ -5,11 +5,11 @@
 # to be located in the larger image.
 
 # non-geometric results of detection.
-Classification2D classification
+vision_msgs/Classification2D classification
 
 # The x/y position and (optional) rotation of the bounding box.
 geometry_msgs/Pose2D pose
 
-# The size of the bounding box surrounding the object. The center of the 
-# bounding box is the position of the detection point.
+# (Optional) The size of the bounding box surrounding the object. The center of
+# the bounding box is the position of the detection point.
 geometry_msgs/Vector2 bbox_size
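
Note on the Detection2D comments: since the bounding box is described by its center (`pose`) and its size (`bbox_size`), a consumer that needs corner points has to derive them. A minimal sketch of that arithmetic follows. It assumes the pose supplies `x`, `y`, and a rotation `theta` in radians, and that `bbox_size` has `x` and `y` components; the function name `bbox_corners` is hypothetical and not part of this package.

```python
import math
from typing import List, Tuple


def bbox_corners(cx: float, cy: float, theta: float,
                 size_x: float, size_y: float) -> List[Tuple[float, float]]:
    """Return the four corners of a box centered at (cx, cy).

    The box has extents size_x by size_y and is rotated by theta
    (assumed radians) about its center.
    """
    hx, hy = size_x / 2.0, size_y / 2.0
    c, s = math.cos(theta), math.sin(theta)
    corners = []
    # Walk the corners of the unrotated box, then rotate each offset
    # by theta and translate it to the box center.
    for dx, dy in ((-hx, -hy), (hx, -hy), (hx, hy), (-hx, hy)):
        corners.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return corners
```

With `theta` equal to zero this reduces to the usual axis-aligned box: the center minus and plus half the size along each axis.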
diff --git a/msg/Detection3DList.msg b/msg/Detection2DArray.msg
@@ -1,5 +1,5 @@
-Header header
+# A list of 2D detections, for a multi-object 2D detector.
 
 # list of the detected proposals. For a multi-proposal detector, this list could
 # have many objects.
-Detection3D[] results
+vision_msgs/Detection2D[] detections
diff --git a/msg/Detection3D.msg b/msg/Detection3D.msg
@@ -5,7 +5,7 @@
 # to be located in the larger image.
 
 # non-geometric class probability.
-Classification3D classification
+vision_msgs/Classification3D classification
 
 # The element's pose. This pose should be
 # defined as the pose of some fixed reference point on the object, such as the

diff --git a/msg/Detection2DList.msg b/msg/Detection3DArray.msg
@@ -1,5 +1,5 @@
-Header header
+# A list of 3D detections, for a multi-object 3D detector.
 
 # list of the detected proposals. For a multi-proposal detector, this list could
 # have many objects.
-Detection2D[] results
+vision_msgs/Detection3D[] results
diff --git a/package.xml b/package.xml
@@ -2,7 +2,10 @@
 <package format="2">
   <name>vision_msgs</name>
   <version>0.0.0</version>
-  <description>Classifier-agnostic computer vision classification and detection messages.</description>
+  <description>
+    PROPOSED DRAFT of: Classifier-agnostic computer vision classification and
+    detection messages.
+  </description>
 
   <!-- One maintainer tag required, multiple allowed, one person per tag --> 
   <!-- Example:  -->