A powerful subtitle generation tool using WhisperX for accurate speech-to-text transcription with precise timestamp alignment.
- 🎯 Accurate speech recognition using WhisperX
- ⚡ GPU acceleration support (CUDA)
- 🎵 Handles both video and audio files
- 📁 Batch processing support
- ⏱️ Performance timing and logging
- 🔧 Multiple language model options
- 🎛️ Configurable compute device selection
- 🌐 Multi-language support with auto-detection
- 📑 Text file input support for batch processing
- 🔄 Automatic model size selection
- Python 3.10 or later
- NVIDIA GPU with CUDA 12 support (optional, for GPU acceleration)
- Latest NVIDIA driver (very important when using the `cuda` flag)
- FFmpeg
- Git
- Clone the repository:
```sh
git clone https://github.com/protik09/subgen-whisperx.git --depth=1
cd subgen-whisperx
```
- Create and activate a virtual environment:
In PowerShell:

```powershell
.\uv_init.ps1
```

Or in bash:

```sh
./uv_init.sh
```
Generate subtitles for a single video file:

```sh
python subgen_whisperx.py -f path/to/video.mp4
```

Process all media files in a directory:

```sh
python subgen_whisperx.py -l en -d path/to/directory
```

Specify the compute device and model size:

```sh
python subgen_whisperx.py -f video.mp4 -c cuda -m medium
```

Set the logging level:

```sh
python subgen_whisperx.py -f video.mp4 -log DEBUG
```
| Argument | Description | Default |
|---|---|---|
| `-f`, `--file` | Path to input media file | None |
| `-d`, `--directory` | Path to directory containing media files | None |
| `-c`, `--compute_device` | Device for computation (`cuda` or `cpu`) | Auto |
| `-m`, `--model_size` | WhisperX model size | Auto |
| `-l`, `--language` | Subtitle language | None |
| `-log`, `--log-level` | Logging level | ERROR |
| `-t`, `--txt` | Text file with file/folder paths | None |
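The `-t`/`--txt` option accepts a text file listing files or folders to process. The exact file format isn't documented here, so the sketch below assumes one path per line with blank lines and `#` comments skipped; `parse_path_list` is a hypothetical helper, not part of the tool:

```python
from pathlib import Path

def parse_path_list(text: str) -> list[Path]:
    """Parse the contents of a -t/--txt paths file: one path per line.

    Assumption: blank lines and lines starting with '#' are ignored;
    the real script's format may differ.
    """
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(Path(line))
    return paths

# Usage sketch:
# media = parse_path_list(Path("batch.txt").read_text())
```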
The script generates SRT subtitle files in the same directory as the input media:
- Format: `filename.{language}-ai.srt`
- Example: `Meetings-0822.en-ai.srt` for a video called `Meetings-0822.mp4`
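The naming rule above can be sketched as a small helper (the function name is hypothetical; the project's internal implementation may differ):

```python
from pathlib import Path

def srt_output_path(media: Path, language: str) -> Path:
    """Place the subtitle next to the input, named filename.{language}-ai.srt."""
    return media.with_name(f"{media.stem}.{language}-ai.srt")

# srt_output_path(Path("Meetings-0822.mp4"), "en") yields Path("Meetings-0822.en-ai.srt")
```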
If automatic model selection triggers a CUDA out-of-memory (OOM) error, manually select the next smaller model with the `-m` flag. For example, if `-m medium` causes a CUDA OOM, use `-m small.en` or `-m small`.
- GPU acceleration provides significantly faster processing
- Processing automatically falls back to the CPU if GPU access fails
- Progress indicators show real-time status
- Performance timing information is displayed after completion
This project is licensed under the MIT License.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request