Getting Started

Overview

0. Data Download
1. Installation
2. Data Preprocessing
3. Model Training
4. Model Evaluation

0. Data Download

First of all, please download the official View-of-Delft (VoD) dataset and keep the format in which the dataset is provided. Note that the VoD dataset is made freely available for non-commercial research purposes only. You need to request access to the VoD dataset first.

The labels in the original release do not include track ids. Please download the version with tracking ids and overwrite the original labels, following the official instructions. In the end, the dataset should be organized like this:

View-of-Delft-Dataset (root)
    ├── radar (kitti dataset where velodyne contains the radar point clouds)
    │   │── ImageSets
    │   │── training
    │   │   ├──calib & velodyne & image_2 & label_2
    │   │── testing
    │       ├──calib & velodyne & image_2
    ├── lidar (kitti dataset where velodyne contains the LiDAR point clouds)
    ├── radar_3_scans (kitti dataset where velodyne contains the accumulated radar point clouds of 3 scans)
    └── radar_5_scans (kitti dataset where velodyne contains the accumulated radar point clouds of 5 scans)
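
Before moving on, it can help to confirm that the layout above is in place. Below is a minimal sketch (not part of the repository) that assumes all four subsets follow the same KITTI-style layout shown for radar; the root path is a placeholder you should replace:

import os

root = "/path/to/View-of-Delft-Dataset"  # placeholder: your dataset root

# The four KITTI-style subsets described above.
for subset in ["radar", "lidar", "radar_3_scans", "radar_5_scans"]:
    for split, sub_dirs in [("training", ["calib", "velodyne", "image_2", "label_2"]),
                            ("testing", ["calib", "velodyne", "image_2"])]:
        for d in sub_dirs:
            path = os.path.join(root, subset, split, d)
            if not os.path.isdir(path):
                print("Missing:", path)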

1. Installation

Note: our code has been tested on Ubuntu 16.04/18.04 with Python 3.7, CUDA 11.1/11.0, and PyTorch 1.7. It may work for other setups, but these have not been tested.

Before you run our code, please follow the steps below to build up your environment.

a. Clone the repository to local

git clone https://github.com/Toytiny/CMFlow

b. Set up a new environment (Python 3.7) with Anaconda

conda create -n $ENV_NAME$ python=3.7
source activate $ENV_NAME$

c. Install common dependencies and PyTorch

pip install -r requirements.txt
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
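
If you want to verify the environment before continuing, a quick check like the one below (not part of the repository) should report the pinned versions and a visible GPU:

import torch
import torchvision

print("torch:", torch.__version__)              # expect 1.7.0
print("torchvision:", torchvision.__version__)  # expect 0.8.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))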

d. Install the PointNet++ library for basic point cloud operations

cd lib
python setup.py install
cd ..

2. Data Preprocessing

To run our experiments on the VoD dataset, you need to preprocess the original dataset into our scene flow format. All related folders and files are under preprocess/, which includes:

clips: The clip information for all frames (which frame belongs to which clip).
scene_flow_clips_info.yaml: The split information for our scene flow experiments (which clip belongs to which split).
label_track_pre: The prediction labels generated by running the multi-object tracking algorithm AB3DMOT on the LiDAR point clouds from our training split. These labels are used to provide cross-modal training supervision signals in our experiments.
process_vod.py: The Python file used to process the original dataset into our scene flow samples.

To generate our scene flow samples for experiments, please follow the next steps:

a. Copy the official labels (with tracking ids) to this repository as preprocess/label_track_gt.

b. Download the official RAFT pretrained model from Google Drive.

We use the raft-small model in our work to estimate optical flow from training images. This optical flow estimation can be used to provide cross-modal supervision signals from the image modality. Please put the downloaded model at preprocess/utils/RAFT/raft-small.pth.
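
A quick way to confirm the checkpoint is in the expected place (a small sketch, assuming the downloaded file is a standard PyTorch checkpoint):

import os
import torch

ckpt_path = "preprocess/utils/RAFT/raft-small.pth"
assert os.path.isfile(ckpt_path), f"RAFT checkpoint not found at {ckpt_path}"

# Load on CPU just to verify the file is a readable PyTorch checkpoint.
state = torch.load(ckpt_path, map_location="cpu")
print("Loaded RAFT checkpoint with", len(state), "entries")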

c. Run the preprocessing code:

python preprocess/preprocess_vod.py --root_dir $ROOT_DIR$ --save_dir $SAVE_DIR$

where $ROOT_DIR$ is the path to the VoD dataset and $SAVE_DIR$ is where the output will be saved. The final scene flow samples will be saved under $SAVE_DIR$/flow_smp/. Preprocessing might be slow because the optical flow has to be inferred with the RAFT model for the training samples. Each scene flow sample is a dictionary that includes:

Key         Dimension    Description
---------------------------------------------------------------------------------------------------
pc1         N×5          Source radar point cloud (x, y, z, RCS, Doppler velocity).
pc2         M×5          Target radar point cloud (x, y, z, RCS, Doppler velocity).
trans       4×4          The coordinate frame transformation between the two frames.
opt_info    -            The estimated optical flow information for the radar points in pc1.
gt_mask     N            The ground truth motion segmentation labels, only valid for val and test.
gt_labels   N×3          The ground truth scene flow labels, only valid for val and test.
pse_mask    N            The pseudo foreground segmentation labels (from LiDAR), only valid for train.
pse_labels  N×3          The pseudo scene flow labels (from LiDAR), only valid for train.
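
To get a feel for the sample format, the sketch below loads one sample and prints the entries listed above. It assumes the samples under $SAVE_DIR$/flow_smp/ are pickled Python dictionaries; adjust the loading call (e.g., to np.load) if your build serializes them differently:

import os
import pickle

import numpy as np

save_dir = "/path/to/save_dir"  # placeholder: your $SAVE_DIR$
sample_dir = os.path.join(save_dir, "flow_smp")
sample_file = sorted(os.listdir(sample_dir))[0]  # first preprocessed sample

with open(os.path.join(sample_dir, sample_file), "rb") as f:
    sample = pickle.load(f)

# Print each key together with its shape or type.
for key, value in sample.items():
    if isinstance(value, np.ndarray):
        print(f"{key:12s} {value.shape}")
    else:
        print(f"{key:12s} {type(value).__name__}")

# pc1 is an N x 5 array: (x, y, z, RCS, Doppler velocity).
pc1 = np.asarray(sample["pc1"])
print("source radar points:", pc1.shape[0])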

Possible errors during preprocessing:

i. Error when FrameDataLoader reads frame pose information.

File ".../preprocess/utils/vod/frame/transformations.py", line 277, in get_world_transform
t_odom_camera = np.array(jsons[0]["odomToCamera"], dtype=np.float32).reshape(4, 4)
IndexError: list index out of range

This happens because pose information is missing at the beginning of some sequences. We recommend copying the pose information from the closest later frame that has it to the frames lacking it, as sketched below.
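
Below is a minimal sketch of this workaround. It assumes per-frame pose files are stored as <frame>.json in a pose/ directory and that a missing pose shows up as an empty JSON list; check your dataset copy for the actual location and naming before using it:

import json
import os
import shutil

pose_dir = "/path/to/View-of-Delft-Dataset/radar/training/pose"  # placeholder

def has_pose(path):
    # A frame lacks pose information if its file is absent or parses to an empty list.
    if not os.path.isfile(path):
        return False
    with open(path) as f:
        try:
            return len(json.load(f)) > 0
        except ValueError:
            return False

frames = sorted(f for f in os.listdir(pose_dir) if f.endswith(".json"))
for i, frame in enumerate(frames):
    dst = os.path.join(pose_dir, frame)
    if has_pose(dst):
        continue
    # Copy pose information from the closest later frame that has it.
    for later in frames[i + 1:]:
        src = os.path.join(pose_dir, later)
        if has_pose(src):
            shutil.copyfile(src, dst)
            print(f"Filled {frame} with pose from {later}")
            break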

ii. Error when FrameDataLoader loads labels.

ERROR:root:02759.txt does not exist at location: /mnt/data/fangqiang/view_of_delft/lidar/training/label_2!

Please ignore these errors; they have no impact on our preprocessing. These frames (e.g., 2532-3276) are testing frames in the original dataset, so their labels are withheld for benchmarking. Moreover, our preprocessing code does not use the labels loaded by FrameDataLoader; instead, we use preprocess/label_track_gt and preprocess/label_track_pre.

3. Model Training

Make sure you have successfully completed all of the above steps before you start model training. With our code, you can train three types of models, i.e., CMFlow, CMFlow (T), and RaFlow, using the scene flow samples generated in the previous section.

To train our CMFlow (or RaFlow) models where temporal information is not used, please run:

python main.py --dataset_path $DATA_PATH$ --exp_name $EXP_NAME$ --model cmflow (or raflow) 

To train our CMFlow (T) model where samples are organized as mini-clips, please run:

python main.py --dataset_path $DATA_PATH$ --exp_name $EXP_NAME$  --model cmflow_t --dataset vodClipDataset 

Here, $DATA_PATH$ is the path where you saved your preprocessed scene flow samples, and $EXP_NAME$ is a name for the current experiment that you define yourself. Training logs and results will be saved under checkpoints/$EXP_NAME$/. You can also modify training arguments, such as the batch size, learning rate, and number of epochs, by editing the configuration file configs.yaml.
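
If you want to queue several of these runs back to back, a small driver like the one below (not part of the repository) simply replays the commands above; the data path and experiment names are placeholders:

import subprocess

data_path = "/path/to/preprocessed/samples"  # placeholder: your $DATA_PATH$

# The training configurations described above.
runs = [
    ("cmflow_exp",   ["--model", "cmflow"]),
    ("raflow_exp",   ["--model", "raflow"]),
    ("cmflow_t_exp", ["--model", "cmflow_t", "--dataset", "vodClipDataset"]),
]

for exp_name, extra_args in runs:
    cmd = ["python", "main.py", "--dataset_path", data_path,
           "--exp_name", exp_name] + extra_args
    print("Launching:", " ".join(cmd))
    subprocess.run(cmd, check=True)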

4. Model Evaluation

We provide our trained models under three folders in checkpoints/ whose names carry the suffix cvpr. You can evaluate our trained models or models trained by yourself.

To evaluate the trained CMFlow (or RaFlow) models on the test set, please run:

python main.py --eval --dataset_path $DATA_PATH$ --exp_name cmflow_cvpr --model cmflow (raflow)

To evaluate the trained CMFlow (T) model on the test set, please run:

python main.py --eval --dataset_path $DATA_PATH$ --exp_name cmflow_t_cvpr --model cmflow_t --dataset vodClipDataset

To evaluate the CMFlow (T) model trained with extra unannotated data on the test set, please run:

python main.py --eval --dataset_path $DATA_PATH$ --exp_name cmflow_t_ed_cvpr --model cmflow_t --dataset vodClipDataset

Once the evaluation is completed, the results on different metrics will be printed. If you want to save the model outputs, please add --save_res to the command; the results will be saved under checkpoints/$EXP_NAME$/results/. To enable visualization of the estimated scene flow and motion segmentation in BEV, please add --vis to the command. The visualization figures will be saved under checkpoints/$EXP_NAME$/test_vis_flow and .../test_vis_seg. For scene flow visualization, the corresponding color wheel is checkpoints/flow_encoding.png. In the motion segmentation visualization figures, orange indicates moving points while blue indicates static points.