Merge pull request open-mmlab#1072 from jihanyang/master

Support custom dataset template and tutorial
thri5ha · Aug 22, 2022 · b71c6a1
2 parents a41e331 + d1e6cf0
commit b71c6a1

Showing 14 changed files with 937 additions and 145 deletions.
diff --git a/README.md b/README.md
@@ -21,7 +21,9 @@ It is also the official code release of [`[PointRCNN]`](https://arxiv.org/abs/18
 
 
 ## Changelog
-[2022-07-05] Added support for the 3D object detection backbone network [Focals Conv](https://openaccess.thecvf.com/content/CVPR2022/papers/Chen_Focal_Sparse_Convolutional_Networks_for_3D_Object_Detection_CVPR_2022_paper.pdf).
+[2022-08-22] Added a [custom dataset tutorial and template](docs/CUSTOM_DATASET_TUTORIAL.md).
+
+[2022-07-05] Added support for the 3D object detection backbone network [`Focals Conv`](https://openaccess.thecvf.com/content/CVPR2022/papers/Chen_Focal_Sparse_Convolutional_Networks_for_3D_Object_Detection_CVPR_2022_paper.pdf).
 
 [2022-02-12] Added support for using docker. Please refer to the guidance in [./docker](./docker).
 

diff --git a/docs/CUSTOM_DATASET_TUTORIAL.md b/docs/CUSTOM_DATASET_TUTORIAL.md
@@ -0,0 +1,108 @@
+# Custom Dataset Tutorial
+The custom dataset template covers only the basic scenario: raw point clouds and
+their corresponding annotations. Point clouds are expected to be stored in `.npy` format.
+
+## Label format
+The label template covers only the most basic information: category and bounding box.
+Annotations are stored in `.txt` files. Each line represents one box in a given scene, as shown below:
+```
+[x y z dx dy dz heading_angle category_name]
+1.50 1.46 0.10 5.12 1.85 4.13 1.56 Vehicle
+5.54 0.57 0.41 1.08 0.74 1.95 1.57 Pedestrian
+```
+Boxes should follow the unified 3D box definition (see [README](../README.md)).
+
+## File structure
+Files should be organized in the following folder structure:
+```
+OpenPCDet
+├── data
+│   ├── custom
+│   │   │── ImageSets
+│   │   │   │── train.txt
+│   │   │   │── val.txt
+│   │   │── points
+│   │   │   │── 000000.npy
+│   │   │   │── 999999.npy
+│   │   │── labels
+│   │   │   │── 000000.txt
+│   │   │   │── 999999.txt
+├── pcdet
+├── tools
+```
+Dataset splits need to be pre-defined and placed in `ImageSets` (`train.txt` and `val.txt`).
+
+## Hyper-parameter Configurations
+
+### Point cloud features
+Modify the following configurations in `custom_dataset.yaml` to
+suit your own point clouds.
+```yaml
+POINT_FEATURE_ENCODING: {
+    encoding_type: absolute_coordinates_encoding,
+    used_feature_list: ['x', 'y', 'z', 'intensity'],
+    src_feature_list: ['x', 'y', 'z', 'intensity'],
+}
+...
+# In gt_sampling data augmentation
+NUM_POINT_FEATURES: 4
+
+```
+
+### Point cloud range and voxel sizes
+For voxel-based detectors such as SECOND, PV-RCNN and CenterPoint, the point cloud range and voxel size should satisfy:
+1. The point cloud range along the z-axis divided by the voxel size is 40.
+2. The point cloud range along the x- and y-axes divided by the voxel size is a multiple of 16.
+
+Note that the second rule also applies to pillar-based detectors such as PointPillar and CenterPoint-Pillar.
+
+### Category names and anchor sizes
+Category names and anchor sizes need to be adapted to the custom dataset.
+ ```yaml
+CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']  
+...
+MAP_CLASS_TO_KITTI: {
+    'Vehicle': 'Car',
+    'Pedestrian': 'Pedestrian',
+    'Cyclist': 'Cyclist',
+}
+...
+'anchor_sizes': [[3.9, 1.6, 1.56]],
+...
+# In gt sampling data augmentation
+PREPARE: {
+    filter_by_min_points: ['Vehicle:5', 'Pedestrian:5', 'Cyclist:5'],
+    filter_by_difficulty: [-1],
+}
+SAMPLE_GROUPS: ['Vehicle:20','Pedestrian:15', 'Cyclist:15']
+...
+ ```
+In addition, modify the default category names used for creating infos in `custom_dataset.py`:
+```
+create_custom_infos(
+    dataset_cfg=dataset_cfg,
+    class_names=['Vehicle', 'Pedestrian', 'Cyclist'],
+    data_path=ROOT_DIR / 'data' / 'custom',
+    save_path=ROOT_DIR / 'data' / 'custom',
+)
+```
+
+
+## Create data info
+Generate the data infos by running the following command:
+```shell
+python -m pcdet.datasets.custom.custom_dataset create_custom_infos tools/cfgs/dataset_configs/custom_dataset.yaml
+```
+
+
+## Evaluation
+Only a KITTI-style evaluation is implemented here.
+The category mapping between the custom dataset and KITTI needs to be defined
+in `custom_dataset.yaml`:
+```yaml
+MAP_CLASS_TO_KITTI: {
+    'Vehicle': 'Car',
+    'Pedestrian': 'Pedestrian',
+    'Cyclist': 'Cyclist',
+}
+```
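
The label line format and the two voxel-grid rules from the tutorial above can be checked before generating infos. The following is a minimal sketch and is not part of this commit: `LabelBox`, `parse_label_line` and `check_voxel_grid` are hypothetical helper names, category names are assumed to contain no whitespace, and the point cloud range is assumed to be ordered `[x_min, y_min, z_min, x_max, y_max, z_max]`.

```python
from typing import List, NamedTuple, Sequence


class LabelBox(NamedTuple):
    """One annotation line: x y z dx dy dz heading_angle category_name."""
    x: float
    y: float
    z: float
    dx: float
    dy: float
    dz: float
    heading: float
    category: str


def parse_label_line(line: str) -> LabelBox:
    """Parse one whitespace-separated label line into a LabelBox."""
    parts = line.split()
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}: {line!r}")
    *numbers, category = parts
    return LabelBox(*(float(v) for v in numbers), category)


def check_voxel_grid(
    pc_range: Sequence[float],
    voxel_size: Sequence[float],
    tol: float = 1e-6,
) -> List[str]:
    """Return a list of rule violations (empty if both rules hold).

    Assumed layout (hypothetical, not taken from the diff):
    pc_range = [x_min, y_min, z_min, x_max, y_max, z_max],
    voxel_size = [vx, vy, vz].
    """
    problems = []
    # Number of voxels along each axis: range extent / voxel size.
    cells = [(pc_range[i + 3] - pc_range[i]) / voxel_size[i] for i in range(3)]

    # Rule 2: the x and y cell counts must be whole multiples of 16.
    for axis, n in zip("xy", cells[:2]):
        if abs(n - round(n)) > tol or round(n) % 16 != 0:
            problems.append(f"{axis}-axis: {n:g} cells is not a multiple of 16")

    # Rule 1: the z cell count must be 40.
    if abs(cells[2] - 40) > tol:
        problems.append(f"z-axis: {cells[2]:g} cells, expected 40")
    return problems
```

With the illustrative values `check_voxel_grid([0, -40, -3, 70.4, 40, 1], [0.05, 0.05, 0.1])`, the result is an empty list; changing the z voxel size to `0.2` reports a single z-axis violation. The tolerance guards against floating-point noise in the division.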
diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md
@@ -5,7 +5,7 @@ and the model configs are located within [tools/cfgs](../tools/cfgs) for differe
 
 ## Dataset Preparation
 
-Currently we provide the dataloader of KITTI dataset and NuScenes dataset, and the supporting of more datasets are on the way.  
+Currently we provide dataloaders for KITTI, NuScenes, Waymo, Lyft and Pandaset. If you want to use a custom dataset, please refer to our [custom dataset template](CUSTOM_DATASET_TUTORIAL.md).
 
 ### KITTI Dataset
 * Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes could be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing), which are optional for data augmentation in the training):

diff --git a/pcdet/datasets/__init__.py b/pcdet/datasets/__init__.py
@@ -11,14 +11,16 @@
 from .waymo.waymo_dataset import WaymoDataset
 from .pandaset.pandaset_dataset import PandasetDataset
 from .lyft.lyft_dataset import LyftDataset
+from .custom.custom_dataset import CustomDataset
 
 __all__ = {
     'DatasetTemplate': DatasetTemplate,
     'KittiDataset': KittiDataset,
     'NuScenesDataset': NuScenesDataset,
     'WaymoDataset': WaymoDataset,
     'PandasetDataset': PandasetDataset,
-    'LyftDataset': LyftDataset
+    'LyftDataset': LyftDataset,
+    'CustomDataset': CustomDataset
 }
 
 

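The `__all__` dictionary above maps dataset names to classes, so adding the `CustomDataset` entry makes the class selectable by name. A minimal sketch of such a name-based lookup, using stand-in classes rather than the real `pcdet` ones; the `DATASET` config key, the `DATASET_REGISTRY` name and the `build_dataset` helper are assumptions for illustration and do not appear in this diff:

```python
class DatasetTemplate:
    """Stand-in base class; the real one lives in pcdet.datasets."""

    def __init__(self, dataset_cfg: dict):
        self.dataset_cfg = dataset_cfg


class CustomDataset(DatasetTemplate):
    """Stand-in for the dataset class registered by this commit."""


# Mirrors the shape of the `__all__` dict in the diff: name -> class.
DATASET_REGISTRY = {
    'DatasetTemplate': DatasetTemplate,
    'CustomDataset': CustomDataset,
}


def build_dataset(dataset_cfg: dict) -> DatasetTemplate:
    """Instantiate the class named by the (assumed) 'DATASET' config key."""
    name = dataset_cfg['DATASET']
    if name not in DATASET_REGISTRY:
        known = ', '.join(sorted(DATASET_REGISTRY))
        raise KeyError(f"unknown dataset {name!r}; registered: {known}")
    return DATASET_REGISTRY[name](dataset_cfg)
```

A dataset that is imported but missing from the dictionary would fail this lookup, which is why the diff adds both the import and the dictionary entry.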
diff --git a/pcdet/datasets/custom/__init__.py b/pcdet/datasets/custom/__init__.py