StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation
This repo introduces StreamingFlow (CVPR 2024 poster, highlight).
Occupancy forecasting on nuScenes dataset
demo_video_panopticseg_nusc_first_half.00_00_00-00_01_01.mp4
Occupancy forecasting on Lyft dataset
demo_video_panopticseg_lyft_firsthalf.00_00_00-00_01_00.mp4
Streaming forecasting: foreseeing the future to 8s
demo_longpred_8s_nusc.00_00_00-00_01_01.mp4
Streaming forecasting: predicting at given interval 0.05s/0.10s/0.25s
demo_pred_interval_005_nusc.00_00_00-00_02_00.mp4
demo_pred_interval_010_nusc.00_00_00-00_01_00.mp4
demo_pred_interval_025_nusc.00_00_00-00_01_01.mp4
We implement StreamingFlow on the Vidar codebase to generate streaming predictions for the self-supervised 4D occupancy forecasting task, using future point clouds as a proxy. This work is still at an early stage. We provide demo videos of the current progress.
Streaming forecasting with interval 0.5s:
viz_pcd_interval_0.5s.00_00_00-00_01_31.mp4
Streaming forecasting with interval 0.05s:
viz_pcd_streaming.00_00_00-00_01_30.mp4
StreamingFlow is a streaming occupancy forecasting framework that takes multi-modal asynchronous data streams (possibly at different frequencies) as input and outputs future instance predictions in a continuous manner.
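To make "asynchronous inputs, continuous outputs" concrete, here is a toy sketch. It is not the StreamingFlow implementation: it assumes a scalar hidden state, a hand-written `dynamics` function standing in for a learned derivative network, and a simple blend for absorbing observations. All names and constants are hypothetical. The state is integrated with variable-size Euler steps, so it can be read out at any requested timestamp regardless of when observations arrive.

```python
def dynamics(h, t):
    # Stand-in for a learned derivative network (assumption): exponential decay.
    return -0.5 * h


def integrate(h, t0, t1, max_step=0.05):
    """Advance state h from time t0 to t1 with Euler steps no larger than max_step."""
    t = t0
    while t1 - t > 1e-9:
        dt = min(max_step, t1 - t)  # the last step shrinks to land exactly on t1
        h = h + dt * dynamics(h, t)
        t += dt
    return h


def stream(observations, query_times, h0=0.0, gain=0.5):
    """Merge irregular observations with prediction queries on one timeline.

    observations: list of (timestamp, value) from any sensor at any rate.
    query_times:  timestamps at which a prediction is requested.
    Returns a list of (timestamp, predicted_state).
    """
    # Kind 0 = observation, kind 1 = query; observations are applied first on ties.
    events = [(t, 0, v) for t, v in observations]
    events += [(t, 1, None) for t in query_times]
    events.sort(key=lambda e: (e[0], e[1]))

    h = h0
    t_prev = events[0][0] if events else 0.0
    predictions = []
    for t, kind, value in events:
        h = integrate(h, t_prev, t)  # evolve the state up to this event
        if kind == 0:
            h = (1 - gain) * h + gain * value  # toy update from a new measurement
        else:
            predictions.append((t, h))
        t_prev = t
    return predictions
```

For example, `stream([(0.0, 1.0), (0.13, 2.0)], [0.5, 1.0])` absorbs two observations that arrive at irregular times and returns predictions at 0.5 s and 1.0 s.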
We follow the ST-P3 and BEVFusion setups for the environment. For data setup, simply organize the nuScenes and Lyft datasets under ./data/nuscenes and ./data/lyft.
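A quick way to confirm the data layout before training is a small check like the one below. This is a minimal sketch, not part of the repo: the helper name `missing_dataset_dirs` is hypothetical, and it only verifies that the two top-level directories named above exist, not their contents.

```python
from pathlib import Path


def missing_dataset_dirs(root="."):
    """Return the expected dataset directories that are absent under root."""
    expected = [
        Path(root) / "data" / "nuscenes",
        Path(root) / "data" / "lyft",
    ]
    return [str(p) for p in expected if not p.is_dir()]
```

Calling `missing_dataset_dirs()` from the project root returns an empty list when both directories are in place.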
Settings | Image | LiDAR | ODE Step | IoU | VPQ | config | checkpoint |
---|---|---|---|---|---|---|---|
past_1s, future_2s | Effi-B4-224x480-2Hz | Spconv8x-0050-5Hz | variable | 53.7 | 50.7 | config | ckpt |
Train command:
python train.py --config /path/to/config
Test command:
python evaluate.py --checkpoint /path/to/checkpoint
We use the StreamingFlow config and checkpoint with a variable ODE step to conduct the following experiments. Each cell reports IoU / VPQ.
Settings | 1s | 2s | 3s | 4s | 5s | 6s | 8s |
---|---|---|---|---|---|---|---|
Variable | 56.5/54.4 | 53.7/50.7 | 50.4/47.2 | 47.2/44.1 | 44.1/41.1 | 40.7/38.0 | 34.4/32.6 |
Test command:
python evaluate.py --checkpoint /path/to/checkpoint --future-frames N
Here, N is the number of future frames, so the prediction horizon is N * 0.5s.
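The mapping from a horizon in seconds to N can be sketched as follows. The helper names are hypothetical; only `evaluate.py`, `--checkpoint`, and `--future-frames` come from the command above, and the 0.5 s frame period is taken from the note on N.

```python
def future_frames_for(horizon_s, frame_period_s=0.5):
    """Convert a prediction horizon in seconds to the frame count N."""
    n = round(horizon_s / frame_period_s)
    if n < 1 or abs(n * frame_period_s - horizon_s) > 1e-9:
        raise ValueError(
            f"horizon {horizon_s}s is not a positive multiple of {frame_period_s}s"
        )
    return n


def evaluate_command(checkpoint, horizon_s):
    """Build the argument list for one long-horizon evaluation run."""
    return [
        "python", "evaluate.py",
        "--checkpoint", checkpoint,
        "--future-frames", str(future_frames_for(horizon_s)),
    ]
```

Under this mapping the 2s column corresponds to N = 4 and the 8s column to N = 16.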
Settings | 0.05s | 0.1s | 0.25s | 0.5s | 0.6s |
---|---|---|---|---|---|
Variable | 48.2/45.2 | 49.5/46.4 | 51.5/48.5 | 53.6/49.6 | 53.4/49.8 |
Test command:
export PYTHONPATH=/project_root_dir/nuscenes-devkit/python-sdk:$PYTHONPATH
python evaluate_streaming.py --checkpoint /path/to/checkpoint --eval-interval N
Here, N sets the evaluation interval to N * 0.05s.
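As an illustration of what an interval of N * 0.05s implies, the sketch below lists the timestamps at which predictions would fall inside a given horizon. The function name and the 2 s default horizon are assumptions for illustration; how `evaluate_streaming.py` itself handles the horizon is not shown here.

```python
def query_times(n, horizon_s=2.0, tick_s=0.05):
    """List prediction timestamps spaced n * tick_s apart, up to horizon_s."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    horizon_ticks = round(horizon_s / tick_s)  # horizon expressed in 0.05 s ticks
    count = horizon_ticks // n                 # whole intervals that fit
    return [round(k * n * tick_s, 2) for k in range(1, count + 1)]
```

With N = 12 (the 0.6s column), three predictions fit inside a 2 s horizon, at 0.6 s, 1.2 s, and 1.8 s. With N = 1 there are 40.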
Settings | 0.15s | 0.2s | 0.25s | 0.4s | 0.5s |
---|---|---|---|---|---|
Variable | 53.1/50.0 | 53.7/50.7 | 53.2/50.3 | 50.6/47.4 | 47.6/44.5 |
Test command:
python evaluate_datastream.py --checkpoint /path/to/checkpoint --frame-skip N
Here, N sets the LiDAR input stream to 20/N Hz, i.e. an input interval of N * 0.05s.
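The effect of the frame skip can be sketched as below. It assumes a 20 Hz base sweep rate and that skipping means keeping every N-th sweep. Both function names are hypothetical, and the actual logic in `evaluate_datastream.py` may differ.

```python
def subsample_sweeps(sweeps, frame_skip):
    """Keep every frame_skip-th element of a sequence of LiDAR sweeps."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be a positive integer")
    return sweeps[::frame_skip]


def effective_rate_hz(frame_skip, base_rate_hz=20.0):
    """Input frequency after skipping, assuming a 20 Hz base sweep rate."""
    return base_rate_hz / frame_skip
```

For instance, with N = 4 one second of 20 Hz sweeps (indices 0 to 19) reduces to indices 0, 4, 8, 12, 16, an effective rate of 5 Hz or one sweep every 0.2 s.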
All assets and code are under the Apache 2.0 license unless specified otherwise.
If this project helps your research, please consider citing our paper with the following BibTeX:
@inproceedings{shi2024streamingflow,
title={StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation},
author={Shi, Yining and Jiang, Kun and Wang, Ke and Li, Jiusi and Wang, Yunlong and Yang, Mengmeng and Yang, Diange},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={14833--14842},
year={2024}
}
Thanks to prior excellent open source projects: