Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
This is the official repository of CMFlow, a cross-modal supervised approach for estimating 4D radar scene flow. For technical details, please refer to our CVPR 2023 paper:
Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
Fangqiang Ding, Andras Palffy, Dariu M. Gavrila, Chris Xiaoxuan Lu
[arXiv] [demo] [page] [supp] [video]
- [2023-02-28] Our paper is accepted by CVPR 2023 🎉.
- [2023-03-03] Our paper is available here 👉 arXiv. The supplementary material can be found here. Our project page is online here.
- [2023-03-15] Our code has been released. Please see 👉 GETTING_STARTED for the guidelines.
- [2023-03-21] Our paper is selected as a highlight 🎉 at CVPR 2023 (top 10% of the accepted papers).
- [2023-05-08] Our CVPR 2023 poster is uploaded. Please download it 👉 here
- [2023-05-25] Our CVPR 2023 presentation video is uploaded. Please watch it 👉 here
- [2023-07-18] We release our model trained with extra unlabeled data provided by the VoD dataset. Please follow 👉 MODEL_EVALUATION to try it.
If you find our work useful in your research, please consider citing:
@InProceedings{Ding_2023_CVPR,
    author    = {Ding, Fangqiang and Palffy, Andras and Gavrila, Dariu M. and Lu, Chris Xiaoxuan},
    title     = {Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {9340-9349}
}
To find out how to run our scene flow experiments, please see our instructions in GETTING_STARTED. If you run into any issues when running our code, please open an issue in this repository.
This work proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning. Our approach is motivated by the co-located sensing redundancy in modern autonomous vehicles. Such redundancy implicitly provides various forms of supervision cues to the radar scene flow estimation. Specifically, we introduce a multi-task model architecture for the identified cross-modal learning problem and propose loss functions to opportunistically engage scene flow estimation using multiple cross-modal constraints for effective model training. Extensive experiments show the state-of-the-art performance of our method and demonstrate the effectiveness of cross-modal supervised learning to infer more accurate 4D radar scene flow. We also show its usefulness on two subtasks: motion segmentation and ego-motion estimation.
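To make the multi-task, multi-constraint training objective described above concrete, here is a minimal sketch of how several cross-modal supervision terms could be combined into a single loss. This is an illustration only, not the actual CMFlow implementation: all dictionary keys, loss choices, and weights are placeholder assumptions, and the real loss formulation and pseudo-label generation are given in the paper and in this repository's training code.

```python
import torch.nn.functional as F

def cross_modal_training_loss(outputs, cues, weights=(1.0, 1.0, 1.0)):
    """Toy illustration: weighted sum of per-cue supervision terms.

    `outputs` holds the multi-task heads' predictions (scene flow,
    moving/static logits, ego-motion); `cues` holds pseudo-targets
    derived from co-located sensors (e.g. odometry, LiDAR, camera).
    All keys and loss choices are placeholders, not the CMFlow API.
    """
    # Scene flow head vs. a cross-modal pseudo scene flow target
    l_flow = F.smooth_l1_loss(outputs["flow"], cues["pseudo_flow"])
    # Motion segmentation head vs. a pseudo moving/static mask
    l_seg = F.binary_cross_entropy_with_logits(
        outputs["moving_logits"], cues["pseudo_moving_mask"])
    # Ego-motion head vs. an odometry-derived target
    l_ego = F.mse_loss(outputs["ego_motion"], cues["odometry_target"])
    w_flow, w_seg, w_ego = weights
    return w_flow * l_flow + w_seg * l_seg + w_ego * l_ego
```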
Here are some GIFs showing our qualitative results on scene flow estimation and the two subtasks, motion segmentation and ego-motion estimation. For more qualitative results, please refer to our demo video or the supplementary material.