ROS Vision Messages Proposal

Introduction

This package is a proposed set of messages to unify computer vision and object detection efforts in ROS. Please feel free to suggest specific changes or add functionality with a pull request, and also visit our Discourse topic for discussion.

Overview

The messages in this package define a common outward-facing interface for vision-based classifiers. They are meant to support two primary types of classifiers:

"Pure" Classifiers, which identify class probabilities given a single sensor input
Detectors, which identify class probabilities as well as the poses of those classes given a sensor input

The class probabilities are stored with a CategoryDistribution message, which is essentially a map from integer IDs to floats.
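As a rough illustration of the "map from integer IDs to floats" idea, the sketch below uses a plain Python dictionary in place of the actual message. The names `CategoryScores`, `most_likely`, and `normalize` are hypothetical helpers invented for this example and are not part of the package.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a CategoryDistribution: integer class ID -> score.
CategoryScores = Dict[int, float]


def most_likely(scores: CategoryScores) -> Tuple[int, float]:
    """Return the (class_id, score) pair with the highest score."""
    if not scores:
        raise ValueError("empty distribution")
    best_id = max(scores, key=scores.get)
    return best_id, scores[best_id]


def normalize(scores: CategoryScores) -> CategoryScores:
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {class_id: score / total for class_id, score in scores.items()}


# Example: three candidate classes, identified only by their numerical IDs.
scores = {3: 0.7, 17: 0.2, 42: 0.1}
best_id, best_score = most_likely(scores)
```

The consumer only ever sees numerical IDs here; turning ID 3 into a name or mesh is left to the metadata database lookup.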

Message types exist separately for 2D (using sensor_msgs/Image) and 3D (using sensor_msgs/PointCloud2). The metadata stored for each object is application-specific, so this package places very few constraints on it. Each possible detection result must have a unique numerical ID so that it can be unambiguously and efficiently identified in the results messages. Object metadata such as name, mesh, etc. can then be looked up from a database.

The only other requirement is that the metadata database can be stored in a ROS parameter. We expect a classifier to load the database to the parameter server in a manner similar to how URDFs are loaded and stored there (see [6]), most likely defined in an XML format. This expectation may be further refined in the future using a ROS Enhancement Proposal, or REP [7].
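To make the lookup concrete, here is a minimal sketch of parsing an XML metadata database held in a single string, which is how it would arrive from a parameter. The XML layout (`object_database`, `object`, and the `id`, `name`, `mesh` attributes) is an assumption made up for this example; the proposal does not define a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the proposal does not define a schema.
EXAMPLE_DB_XML = """
<object_database>
  <object id="3" name="mug" mesh="meshes/mug.stl"/>
  <object id="17" name="stapler" mesh="meshes/stapler.stl"/>
</object_database>
""".strip()


def parse_database(xml_text):
    """Parse the XML string into {object_id: metadata}, keyed by integer ID."""
    root = ET.fromstring(xml_text)
    database = {}
    for element in root.findall("object"):
        object_id = int(element.get("id"))
        # IDs must be unique so results can be identified unambiguously.
        if object_id in database:
            raise ValueError(f"duplicate object ID {object_id}")
        database[object_id] = {
            key: value for key, value in element.attrib.items() if key != "id"
        }
    return database


database = parse_database(EXAMPLE_DB_XML)
print(database[17]["name"])  # prints "stapler"
```

In a ROS 1 node the string would presumably be read with `rospy.get_param` from whatever parameter name the classifier advertises; that name is not specified here.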

We would also like classifiers to have a way to signal when the database has been updated, for example during online learning, so that listeners can respond accordingly. To solve this problem, each classifier can publish a message to a topic signaling that the database has been updated, and increment a database version that is continually published with the classifier information.
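On the listener side, the version number makes it cheap to decide whether a reload is needed. The sketch below is one possible approach, written without any ROS dependency; `DatabaseCache`, `on_info`, and the `load_database` callable are hypothetical names for this example only.

```python
class DatabaseCache:
    """Caches object metadata and reloads it when the published version changes."""

    def __init__(self, load_database):
        # load_database: callable returning the current metadata dictionary,
        # e.g. by re-reading and re-parsing the parameter.
        self._load_database = load_database
        self._version = None
        self.database = {}
        self.reload_count = 0

    def on_info(self, database_version):
        """Handle one classifier-info message; return True if a reload happened."""
        if database_version == self._version:
            return False
        self.database = self._load_database()
        self._version = database_version
        self.reload_count += 1
        return True


# Simulated classifier-side store that changes during online learning.
store = {3: "mug"}
cache = DatabaseCache(lambda: dict(store))

cache.on_info(1)       # first message: load the database
store[17] = "stapler"  # classifier learns a new object...
cache.on_info(1)       # ...but the version is unchanged, so no reload
cache.on_info(2)       # version incremented: reload picks up the new object
```

Comparing versions for inequality rather than "greater than" also covers a classifier that restarts and begins counting again.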

Messages

Classification2D and Classification3D: pure classification without pose
Detection2D and Detection3D: classification + pose
VisionInfo: Information about a classifier, such as its name and where to find its metadata database.
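The relationship between these message types can be pictured with a few Python dataclasses. This is only an illustrative sketch: the class and field names below are assumptions chosen for the example and do not reproduce the actual .msg definitions in the msg folder.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Classification:
    """Pure classification: class scores for one sensor input, no pose."""
    scores: Dict[int, float] = field(default_factory=dict)


@dataclass
class Detection(Classification):
    """Classification plus pose of the detected object."""
    # Hypothetical 2D pose (x, y, theta); a 3D variant would carry a 6D pose.
    pose: Tuple[float, float, float] = (0.0, 0.0, 0.0)


@dataclass
class VisionInfo:
    """Information about a classifier and where to find its metadata database."""
    name: str
    database_location: str  # e.g. the name of the ROS parameter holding the database
    database_version: int = 0


info = VisionInfo(name="example_detector", database_location="/example/object_db")
detection = Detection(scores={3: 0.9, 17: 0.1}, pose=(120.0, 64.0, 0.0))
```

A detection is modeled here as a classification with extra pose data, which mirrors the "classification + pose" description above.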

By using a very general message definition, we hope to cover as many of the various computer vision use cases as possible. Some examples of use cases that can be fully represented are:

Bounding box multi-object detectors with tight bounding box predictions, such as YOLO [1]
Class-predicting full-image detectors, such as TensorFlow examples trained on the MNIST dataset [2]
Full 6D-pose recognition pipelines, such as LINEMOD [3] and those included in the Object Recognition Kitchen [4]
Custom detectors that use various point-cloud based features to predict object attributes (one example is [5])
