A cross-view video diffusion model that uses a spatial-temporal reconstruction VAE to generate long-term, multi-view videos with 4D reconstruction capabilities under various control inputs.

SenseTime-FVG/CVD-STORM

CVD-STORM

The official repository for

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving
Tianrui Zhang¹˒², Yichen Liu¹, Zilin Guo¹˒², Yuxin Guo¹, Jingcheng Ni¹, Chenjing Ding¹, Dan Xu², Lewei Lu¹, Zehuan Wu¹
¹ SenseTime Research, ² The Hong Kong University of Science and Technology

News

  • We will open-source the code, text data, and weight files as soon as possible.

Cite Us

@article{zhang2025cvd,
  title={CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving},
  author={Zhang, Tianrui and Liu, Yichen and Guo, Zilin and Guo, Yuxin and Ni, Jingcheng and Ding, Chenjing and Xu, Dan and Lu, Lewei and Wu, Zehuan},
  journal={arXiv preprint arXiv:2510.07944},
  year={2025}
}
