Stochastic Talking Face Generation Using Latent Distribution Matching

This is the official [PyTorch] implementation of our Interspeech'20 paper "Stochastic Talking Face Generation Using Latent Distribution Matching".

It shows two different predicted video for same reference image and audio input.

Training step

CUDA_VISIBLE_DEVICES=0 python3 train.py --batch_size 5 --lr 0.0002 --beta 0.000001 --z_dim 24 --model vgg --image_width 128 --data_threads 4

To restore training: CUDA_VISIBLE_DEVICES=0 python3 train.py --model_dir <path_to_model.pth>

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
models		models
README.md		README.md
data_loader.py		data_loader.py
path.py		path.py
train.py		train.py
utils.py		utils.py