FastVideo is a lightweight framework for accelerating large video diffusion models.
| Documentation | 🤗 FastHunyuan | 🤗 FastMochi | 🟣💬 Slack |
(Demo video: repo-demo.mp4)
FastVideo currently offers (with more to come):
- [NEW!] V1 inference API available. Full announcement coming soon!
- Sliding Tile Attention.
- FastHunyuan and FastMochi: consistency distilled video diffusion models for 8x inference speedup.
- First open distillation recipes for video DiT, based on PCM.
- Support for distilling, finetuning, and running inference with state-of-the-art open video DiTs: Mochi and Hunyuan.
- Scalable training with FSDP, sequence parallelism, and selective activation checkpointing, with near-linear scaling to 64 GPUs.
- Memory-efficient finetuning with LoRA, precomputed latents, and precomputed text embeddings.
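The actual Sliding Tile Attention kernel operates on 3-D tiles of video latents with a fused GPU implementation; as a rough intuition for the locality idea behind it, here is a minimal 1-D sliding-window attention sketch in NumPy. The function name and structure are ours for illustration, not part of FastVideo's API.

```python
import numpy as np

def sliding_window_attention(q, k, v, window: int):
    """Each query attends only to keys within `window` positions on either
    side: a 1-D, unfused simplification of tiled local attention."""
    seq, dim = q.shape
    out = np.empty_like(v)
    for i in range(seq):
        # Restrict attention to the local window around position i.
        lo, hi = max(0, i - window), min(seq, i + window + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(dim)
        # Numerically stable softmax over the window.
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:hi]
    return out
```

When the window covers the whole sequence, this reduces to full softmax attention; shrinking the window is what cuts the quadratic cost.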
Development is in progress and highly experimental.
- 2025/02/20: FastVideo now supports STA on StepVideo with a 3.4x speedup!
- 2025/02/18: Released the inference code and kernel for Sliding Tile Attention.
- 2025/01/13: Added LoRA finetuning support for HunyuanVideo.
- 2024/12/25: Enabled single-4090 inference for FastHunyuan; please rerun the installation steps to update the environment.
- 2024/12/17: FastVideo v0.0.1 is released.
- Quick Start
- V1 Inference API Guide (Coming soon!)
- More model support
- Add StepVideo to V1
- Optimization features
- Teacache in V1
- SageAttention in V1
- Code updates
- V1 Configuration API
- Support Training in V1
We welcome all contributions. Please check out our contributing guide here.
We learned and reused code from the following projects:
We thank MBZUAI and Anyscale for their support throughout this project.
If you use FastVideo for your research, please cite our papers:
@misc{zhang2025fastvideogenerationsliding,
      title={Fast Video Generation with Sliding Tile Attention},
      author={Peiyuan Zhang and Yongqi Chen and Runlong Su and Hangliang Ding and Ion Stoica and Zhenghong Liu and Hao Zhang},
      year={2025},
      eprint={2502.04507},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.04507},
}
@misc{ding2025efficientvditefficientvideodiffusion,
      title={Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile},
      author={Hangliang Ding and Dacheng Li and Runlong Su and Peiyuan Zhang and Zhijie Deng and Ion Stoica and Hao Zhang},
      year={2025},
      eprint={2502.06155},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.06155},
}