Wav2Lip Video Generation Pipeline

This project enables you to create realistic talking-head videos from a single input image and audio generated from a given text prompt. The pipeline converts the text to speech (TTS), then uses the Wav2Lip model to animate the image with lip movements synchronized to the audio.


🔥 Features

  • Convert any text into audio using TTS (Text-to-Speech)
  • Use Wav2Lip to generate a lip-synced video from a static face image and audio
  • Automatic audio extraction and video frame preparation
  • Resize factor customization to fit different GPU memory requirements
  • CUDA support for fast inference

🛠️ Setup

1. Clone the Repository

git clone https://github.com/bchachar/lip_sync_video_generator.git
cd lip_sync_video_generator

2. Set Up Python Virtual Environment (Recommended)

python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

Make sure ffmpeg is installed and accessible from the command line.
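
If you are unsure whether ffmpeg is on your PATH, a quick check such as the following (a minimal sketch, not part of the repository) catches the problem before the pipeline fails mid-run:

import shutil
import subprocess

# Verify that ffmpeg is discoverable on PATH before running the pipeline.
if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found -- install it and make sure it is on your PATH")

# Print the installed version as a sanity check.
subprocess.run(["ffmpeg", "-version"], check=True)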

4. Download Required Models

a. Clone Wav2Lip Repository

Clone the official Wav2Lip repository into the project root directory:

git clone https://github.com/Rudrabha/Wav2Lip.git

b. Wav2Lip Checkpoint (.pth file)

Download the wav2lip.pth file from this Hugging Face link and place it in the ./Wav2Lip/checkpoints/ directory.
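
A quick way to confirm the checkpoint ended up where the pipeline expects it (a small sketch, not part of the repository):

from pathlib import Path

# The inference step looks for the Wav2Lip weights at this path.
ckpt = Path("Wav2Lip/checkpoints/wav2lip.pth")
if not ckpt.is_file():
    raise SystemExit(f"Checkpoint not found at {ckpt} -- download wav2lip.pth and place it there")
print(f"Found checkpoint ({ckpt.stat().st_size / 1e6:.1f} MB)")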


⚙️ How It Works

  1. Input text is converted into audio using a TTS system.
  2. The audio is saved to ./audio/output.wav.
  3. The image and audio are passed to the Wav2Lip model.
  4. A video is generated in which the lips in the image move in sync with the spoken audio.
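
The repository wires these steps together in its own script, but the flow can be sketched roughly as below. This is an illustrative outline under assumptions, not the project's actual code: it uses gTTS for the TTS step (the README does not name a specific TTS library) and invokes Wav2Lip's standard inference.py entry point.

import subprocess
from gtts import gTTS  # assumption: the README does not name a TTS engine; any one that produces audio works

TEXT = "Hello! This is a lip-synced demo video."
AUDIO_PATH = "audio/output.wav"
IMAGE_PATH = "images/sample.jpg"
OUTPUT_PATH = "video/result_voice.mp4"

# Steps 1-2: synthesize speech from the input text, then convert it to WAV
# at the path the pipeline expects (gTTS emits MP3; ffmpeg handles the conversion).
gTTS(TEXT).save("audio/output.mp3")
subprocess.run(["ffmpeg", "-y", "-i", "audio/output.mp3", AUDIO_PATH], check=True)

# Steps 3-4: run Wav2Lip inference to animate the still image in sync with the audio.
subprocess.run(
    [
        "python", "Wav2Lip/inference.py",
        "--checkpoint_path", "Wav2Lip/checkpoints/wav2lip.pth",
        "--face", IMAGE_PATH,
        "--audio", AUDIO_PATH,
        "--outfile", OUTPUT_PATH,
        "--resize_factor", "1",  # increase to 2 or 4 if face detection runs out of GPU memory
    ],
    check=True,
)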

🚀 Usage

Run the full pipeline:

python main.py

🐞 Common Issues & Fixes

1. CUDA error: no kernel image is available for execution on the device

  • Your GPU (e.g., RTX 4090) may not be supported by the current PyTorch installation.
  • Fix: Reinstall PyTorch with support for compute capability 8.9:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
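
To confirm the reinstall actually picked up your GPU, a generic PyTorch check (not part of the repository) is enough:

import torch

# Reports whether the installed PyTorch build can see the GPU and which compute capability it has.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))  # an RTX 4090 reports (8, 9)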

2. invalid load key, '<' or TorchScript Errors

  • Occurs when the loaded file is not a valid PyTorch checkpoint, for example an HTML page saved in place of the weights (the leading '<' is the giveaway) or a TorchScript .pt file used instead of the .pth checkpoint.
  • Fix: Use the .pth file from the correct source (e.g., the Hugging Face link above).

3. Image too big to run face detection on GPU

  • Fix: Pass the --resize_factor argument (e.g., 2 or 4) to downscale the input before face detection; the pipeline sketch in the How It Works section shows where the flag goes.

4. Output video not found

  • Fix: Ensure ffmpeg is installed and accessible from the command line.

📁 Project Structure

.
├── Wav2Lip
│   ├── checkpoints
│   │   └── wav2lip.pth
├── audio
│   └── output.wav
├── images
│   └── sample.jpg
├── video
│   └── result_voice.mp4
├── generate_lipsync_video.py
└── requirements.txt

📜 License

MIT License. See LICENSE file for more information.


🙏 Acknowledgements

This project builds on the official Wav2Lip implementation (Rudrabha/Wav2Lip) for lip synchronization.