SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.

SoulX-FlashTalk: Real-Time Infinite Streaming of Audio-Driven Avatars via Self-Correcting Bidirectional Distillation

Le Shen*, Qian Qiao*, Tan Yu*, Ke Zhou, Tianhang Yu, Yu Zhan, Zhenjie Wang, Dingcheng Zhen, Ming Tao, Shunshun Yin, Siyuan Liu ✉

*Equal contribution ✉ Corresponding author

HF Space

🔥 News

🤫 Coming soon

A 4-GPU real-time version of SoulX-FlashTalk.

📑 To-Do List

  • Technical report
  • Project Page
  • Inference code
  • Checkpoint release
  • Online demo

📢 Live Streaming & Video Podcast

Live.Streaming.mp4

🎬 Online Demos

online.demo01.mp4
online.demo02.mp4

🌰 Examples

Girl.mp4
Seal.mp4
Rap.mp4

📖 Quickstart

🔧 Installation

1. Create a Conda environment

conda create -n flashtalk python=3.10
conda activate flashtalk

2. Install PyTorch with CUDA support

pip install torch==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl/cu128

3. Install other dependencies

pip install -r requirements.txt

4. Install flash-attention

pip install ninja
pip install flash_attn==2.8.0.post2 --no-build-isolation

5. Install FFmpeg

# Ubuntu / Debian
apt-get install ffmpeg
# CentOS / RHEL
yum install ffmpeg ffmpeg-devel

or

# Conda (no root required) 
conda install -c conda-forge ffmpeg==7
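After completing the steps above, a quick sanity check can confirm that the key dependencies are present before moving on. This snippet is a minimal sketch, not part of this repository:

```python
# Illustrative environment check: verify that the dependencies installed in
# the steps above are importable, and that ffmpeg is on PATH.
import importlib.util
import shutil


def check_environment():
    """Return a dict mapping each requirement to whether it was found."""
    status = {
        name: importlib.util.find_spec(name) is not None
        for name in ("torch", "torchvision", "flash_attn")
    }
    # FFmpeg is a command-line tool, so look it up on PATH instead.
    status["ffmpeg"] = shutil.which("ffmpeg") is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If any entry prints `MISSING`, revisit the corresponding installation step before running inference.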

🤗 Model download

| Model Component | Description | Link |
| --- | --- | --- |
| SoulX-FlashTalk-14B | Our 14B model | 🤗 Huggingface |
| chinese-wav2vec2-base | chinese-wav2vec2-base | 🤗 Huggingface |

# If you are in mainland China, run this first: export HF_ENDPOINT=https://hf-mirror.com
pip install "huggingface_hub[cli]"
huggingface-cli download Soul-AILab/SoulX-FlashTalk-14B --local-dir ./models/SoulX-FlashTalk-14B
huggingface-cli download TencentGameMate/chinese-wav2vec2-base --local-dir ./models/chinese-wav2vec2-base
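The same downloads can be driven from Python with `huggingface_hub.snapshot_download`. This sketch assumes the `./models` layout used above; the `MODELS` mapping and `download_all` helper are illustrative, not part of the repo:

```python
# Illustrative Python equivalent of the CLI downloads above.
# Repo IDs and local directories mirror the huggingface-cli commands.
MODELS = {
    "Soul-AILab/SoulX-FlashTalk-14B": "./models/SoulX-FlashTalk-14B",
    "TencentGameMate/chinese-wav2vec2-base": "./models/chinese-wav2vec2-base",
}


def download_all(models=MODELS):
    """Download each checkpoint to its local directory."""
    # Deferred import: requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in models.items():
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

As with the CLI, set `HF_ENDPOINT=https://hf-mirror.com` in the environment first if you need the mainland-China mirror.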

🚀 Inference

# Infer on a single GPU
# Requires more than 64 GB of VRAM. Use --cpu_offload to reduce VRAM usage to 40 GB.
bash inference_script_single_gpu.sh

# Infer on multiple GPUs
# Real-time inference speed requires an 8xH800 node or faster GPUs
bash inference_script_multi_gpu.sh
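As a rough guide to the hardware notes above, the following sketch maps available hardware to a launch command. The helper is hypothetical, not part of the repo, and passing `--cpu_offload` through the shell script is an assumption; the 64 GB / 40 GB thresholds come from the comments above:

```python
def choose_inference_command(num_gpus: int, vram_gb_per_gpu: float) -> str:
    """Illustrative: pick a launch command from the hardware notes above."""
    if num_gpus >= 8:
        # Real-time speed requires an 8xH800 node or faster GPUs.
        return "bash inference_script_multi_gpu.sh"
    if vram_gb_per_gpu > 64:
        return "bash inference_script_single_gpu.sh"
    if vram_gb_per_gpu >= 40:
        # --cpu_offload reduces single-GPU VRAM usage to ~40 GB
        # (assumed to be forwarded by the script).
        return "bash inference_script_single_gpu.sh --cpu_offload"
    raise ValueError("At least ~40 GB of VRAM per GPU is required")
```

For example, a single 48 GB GPU would fall into the `--cpu_offload` branch, while an 8-GPU H800 node runs the multi-GPU script at real-time speed.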

👋 Online Demo

Coming Soon!

📧 Contact Us

If you have questions or feedback about our work, feel free to email le.shen@mail.dhu.edu.cn, qiaoqian@soulapp.cn, yutan@soulapp.cn, zhouke@soulapp.cn, or liusiyuan@soulapp.cn.

Since Group 1 has reached its capacity, we have opened a new WeChat group. As the SoulApp team, we also warmly welcome everyone to download the app and join our Soul group for further technical discussions and updates!

WeChat Group QR Code
Join WeChat Group
Soul App Group QR Code
Download SoulApp & Join Group

📚 Citation

If you find our work useful in your research, please consider citing:

@misc{shen2025soulxflashtalktechnicalreport,
      title={SoulX-FlashTalk: Real-Time Infinite Streaming of Audio-Driven Avatars via Self-Correcting Bidirectional Distillation}, 
      author={Le Shen and Qian Qiao and Tan Yu and Ke Zhou and Tianhang Yu and Yu Zhan and Zhenjie Wang and Ming Tao and Shunshun Yin and Siyuan Liu},
      year={2025},
      eprint={2512.23379},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.23379}, 
}

🙇 Acknowledgement

Tip

If you find our work useful, please also consider starring the original repositories of these foundational methods.

💡 Star History

Star History Chart
