autowarefoundation · tkimura4 · Dec 23, 2021 · Dec 21, 2021 · Dec 22, 2021 · Dec 23, 2021
diff --git a/perception/lidar_centerpoint/README.md b/perception/lidar_centerpoint/README.md
@@ -1,61 +1,112 @@
-# CenterPoint TensorRT
+# lidar_centerpoint
 
-This is a 3D object detection implementation of CenterPoint supporting TensorRT inference.
+## Purpose
 
-The object.existence_probability is stored the value of classification confidence of DNN, not probability.
+lidar_centerpoint is a package for detecting dynamic 3D objects.
+
+## Inner-workings / Algorithms
+
+In this implementation, CenterPoint [1] uses a PointPillars-based [2] network to inference with TensorRT.
+
+We trained the models using <https://github.com/open-mmlab/mmdetection3d>.
+
+## Inputs / Outputs
+
+### Input
+
+| Name                 | Type                            | Description      |
+| -------------------- | ------------------------------- | ---------------- |
+| `~/input/pointcloud` | `sensor_msgs::msg::PointCloud2` | input pointcloud |
+
+### Output
+
+| Name                               | Type                                                  | Description              |
+| ---------------------------------- | ----------------------------------------------------- | ------------------------ |
+| `~/output/objects`                 | `autoware_auto_perception_msgs::msg::DetectedObjects` | detected objects         |
+| `~/debug/pointcloud_densification` | `sensor_msgs::msg::PointCloud2`                       | densification pointcloud |
 
 ## Parameters
 
-### Input Topics
+### Core Parameters
+
+| Name                            | Type   | Default Value | Description                                                 |
+| ------------------------------- | ------ | ------------- | ----------------------------------------------------------- |
+| `score_threshold`               | float  | `0.4`         | detected objects with score less than threshold are ignored |
+| `densification_world_frame_id`  | string | `map`         | the world frame id to fuse multi-frame pointcloud           |
+| `densification_num_past_frames` | int    | `1`           | the number of past frames to fuse with the current frame    |
+| `use_encoder_trt`               | bool   | `false`       | use TensorRT VoxelFeatureEncoder                            |
+| `use_head_trt`                  | bool   | `true`        | use TensorRT DetectionHead                                  |
+| `trt_precision`                 | string | `fp16`        | TensorRT inference precision: `fp32` or `fp16`              |
+| `encoder_onnx_path`             | string | `""`          | path to VoxelFeatureEncoder ONNX file                       |
+| `encoder_engine_path`           | string | `""`          | path to VoxelFeatureEncoder TensorRT Engine file            |
+| `encoder_pt_path`               | string | `""`          | path to VoxelFeatureEncoder TorchScript file                |
+| `head_onnx_path`                | string | `""`          | path to DetectionHead ONNX file                             |
+| `head_engine_path`              | string | `""`          | path to DetectionHead TensorRT Engine file                  |
+| `head_pt_path`                  | string | `""`          | path to DetectionHead TorchScript file                      |
+
+## Assumptions / Known limits
+
+- The `object.existence_probability` is stored the value of classification confidence of a DNN, not probability.
+
+- If you have an error like `'GOMP_4.5' not found`, replace the OpenMP library in libtorch.
+
+  ```bash
+  sudo apt install libgomp1 -y
+  sudo rm /usr/local/libtorch/lib/libgomp-75eea7e8.so.1
+  sudo ln -s /usr/lib/x86_64-linux-gnu/libgomp.so.1 /usr/local/libtorch/lib/libgomp-75eea7e8.so.1
+  ```
+
+- if `use_encoder_trt` is set `use_encoder_trt`, more GPU memory is allocated.
+
+## (Optional) Error detection and handling
+
+<!-- Write how to detect errors and how to recover from them.
+
+Example:
+  This package can handle up to 20 obstacles. If more obstacles found, this node will give up and raise diagnostic errors.
+-->
+
+## (Optional) Performance characterization
+
+<!-- Write performance information like complexity. If it wouldn't be the bottleneck, not necessary.
+
+Example:
+  ### Complexity
+
+  This algorithm is O(N).
+
+  ### Processing time
+
+  ...
+-->
+
+## References/External links
+
+[1] Yin, Tianwei, Xingyi Zhou, and Philipp Krähenbühl. "Center-based 3d object detection and tracking." arXiv preprint arXiv:2006.11275 (2020).
 
-| Name             | Type        | Description                          |
-| ---------------- | ----------- | ------------------------------------ |
-| input/pointcloud | PointCloud2 | Point Clouds (x, y, z and intensity) |
+[2] Lang, Alex H., et al. "Pointpillars: Fast encoders for object detection from point clouds." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
 
-### Output Topics
+[3] <https://github.com/tianweiy/CenterPoint>
 
-| Name                           | Type                          | Description            |
-| ------------------------------ | ----------------------------- | ---------------------- |
-| output/objects                 | DynamicObjectWithFeatureArray | 3D Bounding Box        |
-| debug/pointcloud_densification | PointCloud2                   | multi-frame pointcloud |
+[4] <https://github.com/open-mmlab/mmdetection3d>
 
-### ROS Parameters
+[5] <https://github.com/open-mmlab/OpenPCDet>
 
-| Name                      | Type   | Description                                                 | Default |
-| ------------------------- | ------ | ----------------------------------------------------------- | ------- |
-| score_threshold           | float  | detected objects with score less than threshold are ignored | `0.4`   |
-| densification_base_frame  | string | the base frame id to fuse multi-frame pointcloud            | `map`   |
-| densification_past_frames | int    | the number of past frames to fuse with the current frame    | `1`     |
-| use_encoder_trt           | bool   | use TensorRT VoxelFeatureEncoder                            | `false` |
-| use_head_trt              | bool   | use TensorRT DetectionHead                                  | `true`  |
-| trt_precision             | string | TensorRT inference precision: `fp32` or `fp16`              | `fp16`  |
-| encoder_onnx_path         | string | path to VoxelFeatureEncoder ONNX file                       |         |
-| encoder_engine_path       | string | path to VoxelFeatureEncoder TensorRT Engine file            |         |
-| encoder_pt_path           | string | path to VoxelFeatureEncoder TorchScript file                |         |
-| head_onnx_path            | string | path to DetectionHead ONNX file                             |         |
-| head_engine_path          | string | path to DetectionHead TensorRT Engine file                  |         |
-| head_pt_path              | string | path to DetectionHead TorchScript file                      |         |
+[6] <https://github.com/poodarchu/Det3D>
 
-## For Developers
+[7] <https://github.com/xingyizhou/CenterNet>
 
-If you have an error like `'GOMP_4.5' not found`, replace the OpenMP library in libtorch.
+[8] <https://github.com/lzccccc/SMOKE>
 
-```bash
-sudo apt install libgomp1 -y
-rm /path/to/libtorch/lib/libgomp-75eea7e8.so.1
-ln -s /usr/lib/x86_64-linux-gnu/libgomp.so.1 /path/to/libtorch/lib/libgomp-75eea7e8.so.1
-```
+[9] <https://github.com/yukkysaito/autoware_perception>
 
-## Reference
+[10] <https://github.com/pytorch/pytorch>
 
-Yin, Tianwei, Xingyi Zhou, and Philipp Krähenbühl. "Center-based 3d object detection and tracking." arXiv preprint arXiv:2006.11275 (2020).
+## (Optional) Future extensions / Unimplemented parts
 
-## Reference Repositories
+<!-- Write future extensions of this package.
 
-- <https://github.com/tianweiy/CenterPoint>
-- <https://github.com/open-mmlab/OpenPCDet>
-- <https://github.com/poodarchu/Det3D>
-- <https://github.com/xingyizhou/CenterNet>
-- <https://github.com/lzccccc/SMOKE>
-- <https://github.com/yukkysaito/autoware_perception>
-- <https://github.com/pytorch/pytorch>
+Example:
+  Currently, this package can't handle the chattering obstacles well. We plan to add some probabilistic filters in the perception layer to improve it.
+  Also, there are some parameters that should be global(e.g. vehicle size, max steering, etc.). These will be refactored and defined as global parameters so that we can share the same parameters between different nodes.
+-->
diff --git a/perception/lidar_centerpoint/include/lidar_centerpoint/node.hpp b/perception/lidar_centerpoint/include/lidar_centerpoint/node.hpp
@@ -17,7 +17,6 @@
 
 #include <centerpoint_trt.hpp>
 #include <config.hpp>
-#include <pointcloud_densification.hpp>
 #include <rclcpp/rclcpp.hpp>
 
 #include <autoware_auto_perception_msgs/msg/detected_object_kinematics.hpp>
@@ -45,13 +44,14 @@ class LidarCenterPointNode : public rclcpp::Node
   static uint8_t getSemanticType(const std::string & class_name);
   static bool isCarLikeVehicleLabel(const uint8_t label);
 
+  tf2_ros::Buffer tf_buffer_;
+  tf2_ros::TransformListener tf_listener_{tf_buffer_};
+
   rclcpp::Subscription<sensor_msgs::msg::PointCloud2>::SharedPtr pointcloud_sub_;
   rclcpp::Publisher<autoware_auto_perception_msgs::msg::DetectedObjects>::SharedPtr objects_pub_;
   rclcpp::Publisher<sensor_msgs::msg::PointCloud2>::SharedPtr pointcloud_pub_;
 
   float score_threshold_{0.0};
-  std::string densification_base_frame_;
-  int densification_past_frames_{0};
   bool use_encoder_trt_{false};
   bool use_head_trt_{false};
   std::string trt_precision_;
@@ -66,7 +66,6 @@ class LidarCenterPointNode : public rclcpp::Node
   std::vector<std::string> class_names_;
   bool rename_car_to_truck_and_bus_{false};
 
-  std::unique_ptr<PointCloudDensification> densification_ptr_{nullptr};
   std::unique_ptr<CenterPointTRT> detector_ptr_{nullptr};
 };
 

diff --git a/perception/lidar_centerpoint/launch/lidar_centerpoint.launch.xml b/perception/lidar_centerpoint/launch/lidar_centerpoint.launch.xml
@@ -11,8 +11,8 @@
       <remap from="~/input/pointcloud" to="$(var input/pointcloud)"/>
       <remap from="~/output/objects" to="$(var output/objects)"/>
       <param name="score_threshold" value="0.45"/>
-      <param name="densification_base_frame" value="map"/>
-      <param name="densification_past_frames" value="1"/>
+      <param name="densification_world_frame_id" value="map"/>
+      <param name="densification_num_past_frames" value="1"/>
       <param name="use_head_trt" value="true"/>
       <param name="trt_precision" value="fp16"/>
       <param name="encoder_pt_path" value="$(var model_path)/pts_voxel_encoder_$(var model_name).pt"/>

diff --git a/perception/lidar_centerpoint/lib/include/centerpoint_trt.hpp b/perception/lidar_centerpoint/lib/include/centerpoint_trt.hpp
@@ -66,11 +66,13 @@ class CenterPointTRT
 {
 public:
   explicit CenterPointTRT(
-    const NetworkParam & encoder_param, const NetworkParam & head_param, bool verbose);
+    const int num_class, const NetworkParam & encoder_param, const NetworkParam & head_param,
+    const DensificationParam & densification_param);
 
   ~CenterPointTRT();
 
-  std::vector<float> detect(const sensor_msgs::msg::PointCloud2 & input_pointcloud_msg);
+  std::vector<float> detect(
+    const sensor_msgs::msg::PointCloud2 &, const tf2_ros::Buffer & tf_buffer);
 
 private:
   bool initPtr(bool use_encoder_trt, bool use_head_trt);
@@ -93,6 +95,7 @@ class CenterPointTRT
   c10::Device device_ = torch::kCUDA;
   cudaStream_t stream_ = nullptr;
 
+  int num_class_{0};
   at::Tensor voxels_t_;
   at::Tensor coordinates_t_;
   at::Tensor num_points_per_voxel_t_;

diff --git a/perception/lidar_centerpoint/lib/include/config.hpp b/perception/lidar_centerpoint/lib/include/config.hpp
@@ -21,7 +21,6 @@ class Config
 {
 public:
   // input params
-  constexpr static int num_class = 3;           // car, bicycle and pedestrian
   constexpr static int num_point_dims = 3;      // x, y and z
   constexpr static int num_point_features = 4;  // x, y, z and timelag
   constexpr static int max_num_points_per_voxel = 32;
@@ -47,9 +46,10 @@ class Config
 
   // output params
   constexpr static int num_box_features = 11;  // score, class, x, y, z, w, l, h, yaw, vel_x, vel_y
-  constexpr static int max_num_output_objects = 200;
+  constexpr static int max_num_output_objects = 500;
 
   // network params
+  constexpr static int batch_size = 1;
   constexpr static int downsample_factor = 2;
   constexpr static int num_encoder_input_features = 8;
   constexpr static int num_encoder_output_features = 32;

diff --git a/perception/lidar_centerpoint/lib/include/network_trt.hpp b/perception/lidar_centerpoint/lib/include/network_trt.hpp
@@ -35,10 +35,14 @@ class HeadTRT : public TensorRTWrapper
 public:
   using TensorRTWrapper::TensorRTWrapper;
 
+  HeadTRT(const int num_class, const bool verbose);
+
 protected:
   bool setProfile(
     nvinfer1::IBuilder & builder, nvinfer1::INetworkDefinition & network,
     nvinfer1::IBuilderConfig & config) override;
+
+  int num_class_{0};
 };
 
 }  // namespace centerpoint

diff --git a/perception/lidar_centerpoint/lib/include/pointcloud_densification.hpp b/perception/lidar_centerpoint/lib/include/pointcloud_densification.hpp
@@ -21,30 +21,60 @@
 
 #include <list>
 #include <string>
+#include <utility>
 
 namespace centerpoint
 {
-class PointCloudDensification
+class DensificationParam
 {
 public:
-  PointCloudDensification(
-    std::string base_frame_id, unsigned int pointcloud_cache_size, rclcpp::Clock::SharedPtr clock);
+  DensificationParam(const std::string & world_frame_id, const unsigned int num_past_frames)
+  : world_frame_id_(std::move(world_frame_id)),
+    pointcloud_cache_size_(num_past_frames + /*current frame*/ 1)
+  {
+  }
 
-  sensor_msgs::msg::PointCloud2 stackPointCloud(
-    const sensor_msgs::msg::PointCloud2 & input_pointcloud_msg);
+  std::string world_frame_id() const { return world_frame_id_; }
+  unsigned int pointcloud_cache_size() const { return pointcloud_cache_size_; }
 
 private:
-  geometry_msgs::msg::TransformStamped getTransformStamped(
-    const std::string & target_frame, const std::string & source_frame);
+  std::string world_frame_id_;
+  unsigned int pointcloud_cache_size_{1};
+};
+
+struct PointCloudWithTransform
+{
+  sensor_msgs::msg::PointCloud2 pointcloud_msg;
+  Eigen::Affine3f affine_past2world;
+};
+
+class PointCloudDensification
+{
+public:
+  explicit PointCloudDensification(const DensificationParam & param);
 
-  static void setTimeLag(sensor_msgs::msg::PointCloud2 & pointcloud_msg, float diff_timestamp);
+  void enqueuePointCloud(
+    const sensor_msgs::msg::PointCloud2 & input_pointcloud_msg, const tf2_ros::Buffer & tf_buffer);
+
+  double getCurrentTimestamp() const { return current_timestamp_; }
+  Eigen::Affine3f getAffineWorldToCurrent() const { return affine_world2current_; }
+  std::list<PointCloudWithTransform>::iterator getPointCloudCacheIter()
+  {
+    return pointcloud_cache_.begin();
+  }
+  bool isCacheEnd(std::list<PointCloudWithTransform>::iterator iter)
+  {
+    return iter == pointcloud_cache_.end();
+  }
+
+private:
+  void enqueue(const sensor_msgs::msg::PointCloud2 & msg, const Eigen::Affine3f & affine);
+  void dequeue();
 
-  std::string base_frame_id_;
-  unsigned int pointcloud_cache_size_ = 0;
-  tf2_ros::Buffer tf_buffer_;
-  tf2_ros::TransformListener tf_listener_{tf_buffer_};
-  std::list<sensor_msgs::msg::PointCloud2> pointcloud_cache_;
-  std::list<Eigen::Matrix4f> matrix_past2base_cache_;
+  DensificationParam param_;
+  double current_timestamp_{0.0};
+  Eigen::Affine3f affine_world2current_;
+  std::list<PointCloudWithTransform> pointcloud_cache_;
 };
 
 }  // namespace centerpoint

diff --git a/perception/lidar_centerpoint/lib/include/tensorrt_wrapper.hpp b/perception/lidar_centerpoint/lib/include/tensorrt_wrapper.hpp
@@ -68,6 +68,8 @@ class TensorRTWrapper
     nvinfer1::IBuilder & builder, nvinfer1::INetworkDefinition & network,
     nvinfer1::IBuilderConfig & config) = 0;
 
+  Logger logger_;
+
 private:
   bool parseONNX(
     const std::string & onnx_path, const std::string & engine_path, const std::string & precision,
@@ -79,7 +81,6 @@ class TensorRTWrapper
 
   bool createContext();
 
-  Logger logger_;
   unique_ptr<nvinfer1::IRuntime> runtime_ = nullptr;
   unique_ptr<nvinfer1::ICudaEngine> engine_ = nullptr;
 };