wav2lip-576x576 Introduction

This project generates talking-face videos. The model is trained on 576x576 face images and can be used to produce 2K, 4K, 6K, and 8K digital human videos.

We have made improvements in the following areas:

Audio is processed with HuBERT, which gives a significant improvement over wav2lip-96 and wav2lip-288.
Dataset processing is optimized, so videos no longer need to be manually cut into short, seconds-long clips.
The network structure is optimized to extract features better. Rather than training the discriminator separately, we train the generator directly.
The base model was trained on a high-definition dataset of hundreds of people. Its generalization ability is limited, but results are very good after fine-tuning on one or more speakers.
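One way to avoid cutting videos by hand is to index training windows directly inside a long clip and look up the matching audio features for each window. The sketch below is only an illustration of that idea, not this project's code: the 25 fps frame rate, the 5-frame window, and the names `Window`, `frame_to_feature`, and `list_windows` are assumptions. The one external fact it relies on is that HuBERT produces roughly 50 feature vectors per second (one per 20 ms of 16 kHz audio).

```python
from dataclasses import dataclass

# Illustrative assumptions, not values taken from this project.
VIDEO_FPS = 25       # assumed video frame rate
HUBERT_RATE = 50     # HuBERT yields about one feature vector per 20 ms of audio
WINDOW_FRAMES = 5    # assumed number of consecutive video frames per sample


@dataclass(frozen=True)
class Window:
    frame_start: int  # first video frame, inclusive
    frame_end: int    # last video frame, exclusive
    feat_start: int   # first audio feature index, inclusive
    feat_end: int     # last audio feature index, exclusive


def frame_to_feature(frame_idx, fps=VIDEO_FPS, rate=HUBERT_RATE):
    """Index of the audio feature that starts at or just before a video frame."""
    return frame_idx * rate // fps


def list_windows(num_frames, num_feats, window=WINDOW_FRAMES, stride=1):
    """Enumerate aligned (video, audio) windows inside one uncut clip."""
    windows = []
    for start in range(0, num_frames - window + 1, stride):
        end = start + window
        feat_start = frame_to_feature(start)
        feat_end = frame_to_feature(end)
        if feat_end > num_feats:
            # The audio track is shorter than the video; later windows
            # would need even more features, so stop here.
            break
        windows.append(Window(start, end, feat_start, feat_end))
    return windows


if __name__ == "__main__":
    for w in list_windows(num_frames=10, num_feats=20):
        print(w)
```

With these assumed rates, a 10-frame clip with 20 audio features yields 6 windows, the first covering frames 0 to 4 and features 0 to 9. A data loader could sample from such an index at random instead of relying on pre-cut files.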

wav2lip-576x576 Project Status

Video | Project Page | Code

wav2lip-576x576 Code Release Plan

This project is not yet mature. We will release the code gradually: first the data processing code, then the inference code, and, when the time is right, the training code.

Acknowledgements

The code borrows mainly from wav2lip, wav2lip-288, wav2lip-384, ER-NeRF, and others. We thank the authors for their excellent work.

Author

This project was created in 2024 by Lu Rui of Langzizhixin Technology in Chengdu, China.

Code Contribution

The code for video preprocessing, face cropping, and HuBERT audio processing is complete. Contributions related to the network structure, training, and inference are welcome.
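As a rough illustration of the face cropping step, the sketch below turns a detected face bounding box into a padded square crop that stays inside the frame, so it can later be resized to 576x576 without distortion. It is written under assumptions and is not the repository's implementation: the function name `square_crop_box`, the `pad_ratio` parameter, and the `(x0, y0, x1, y1)` box layout are all hypothetical, and the face detector that would supply the box is left out.

```python
def square_crop_box(box, frame_w, frame_h, pad_ratio=0.25):
    """Return (x0, y0, x1, y1) of a square crop around a face box.

    box is an assumed (x0, y0, x1, y1) face bounding box in pixels.
    pad_ratio is the extra margin added on each side, as a fraction
    of the longer box edge.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2

    # Use the longer edge so the whole face fits, then add the margin.
    side = round(max(x1 - x0, y1 - y0) * (1 + 2 * pad_ratio))
    # The square can never be larger than the frame itself.
    side = min(side, frame_w, frame_h)

    # Centre the square on the face, then shift it back inside the frame.
    left = round(cx - side / 2)
    top = round(cy - side / 2)
    left = max(0, min(left, frame_w - side))
    top = max(0, min(top, frame_h - side))
    return left, top, left + side, top + side


if __name__ == "__main__":
    print(square_crop_box((100, 100, 300, 340), frame_w=1920, frame_h=1080))
```

For the example box above, the crop is `(20, 40, 380, 400)`, a 360x360 square. Keeping the crop square before resizing avoids stretching the face; an image library such as OpenCV (`cv2.resize`) could then scale it to the training resolution.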

