A powerful CLI tool to transcribe and translate audio/video content into subtitles (.srt, .vtt, .txt) using OpenAI Whisper and Facebook M2M100.
- 🎤 Voice Recording from mic, auto-transcribed
- 📼 Transcribe local files (.mp4, .aac, .wav, etc.)
- 📺 Download & transcribe YouTube or Facebook videos
- 🌍 Translate to other languages (e.g. zh → en) using Facebook M2M100
- 📁 Batch processing for local/YT lists
- 📄 Outputs: .srt, .vtt, .txt
- 💬 Full CLI interface with rich options
- 🇭🇰 Logger speaks fluent Cantonese 😎
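As a rough illustration of the `.srt` output listed above, the sketch below turns timed segments into SubRip text. It assumes segments shaped like the ones Whisper's Python `transcribe()` call returns (dicts with `start`, `end`, and `text`); the function names `srt_timestamp` and `segments_to_srt` are hypothetical and are not taken from this project's `subtitle_generator.py`.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with "start", "end", "text") as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each cue: running number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Hand-written sample segments, standing in for real transcription output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 5.0, "text": " Welcome back."},
]
print(segments_to_srt(sample))
```

A `.vtt` writer would look almost the same: WebVTT starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.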
git clone https://github.com/yourname/openai-transcribe-python.git
cd openai-transcribe-python
python3 -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
pip install -q git+https://github.com/openai/whisper.git
pip install -q pytube transformers sentencepiece tqdm
pip3 install -U openai-whisper
PyTorch on macOS:
pip3 install torch torchvision torchaudio
PyTorch on Windows with CUDA (pick the line that matches your CUDA version, 11.8 or 12.1):
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Feature reference: https://github.com/AssemblyAI-Examples/whisper-multilingual/blob/main/main.py
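After installing PyTorch, it can help to check which backend it actually sees. The snippet below is a small sketch for that purpose only; `pick_device` is a hypothetical helper, not part of this project, and it falls back to `"cpu"` when PyTorch is not installed at all.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no accelerator is usable
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU with a CUDA-enabled build
    # The MPS backend (Apple Silicon) only exists in newer PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed wheel is probably a CPU-only build and one of the CUDA install commands is worth re-running.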
stt --help
🎙️ Record from Mic
stt record --source zh --target en --model medium --txt --srt
Speak, then type "end" to stop. The recording is transcribed and translated.
stt transcribe local \
--file sample/canton_long.mp3 \
--source zh --target en \
--model medium \
--srt --txt
stt transcribe youtube \
--url https://youtu.be/abc123 \
--source en --target zh \
--model medium \
--srt --txt
stt transcribe facebook \
--url https://www.facebook.com/watch?v=xyz \
--source zh --target en \
--model medium \
--txt --srt
stt transcribe-many \
--file temp/mixed-files.txt \
--source zh \
--target en \
--model medium \
--txt --srt
pytest tests/
subtitle_generator/
├── cli_main.py             # CLI entry point (Click-based)
├── processor.py            # Transcription logic
├── recorder.py             # Microphone recording
├── video_processor.py
├── youtube_processor.py
├── facebook_processor.py
├── subtitle_generator.py
├── text_translator.py
├── logger.py               # Cantonese logger ❤️
- Audio: .aac, .m4a, .wav, .mp3
- Video: .mp4, .mkv, .mov
- URL: YouTube, Facebook
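The supported inputs above could be told apart roughly as follows. This is a sketch under assumptions, not the project's actual dispatch logic: `classify_input` is a hypothetical name, and the host lists (including `fb.watch` and `youtu.be`) are illustrative guesses about which URLs should count as Facebook or YouTube.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions taken from the supported-format list; host sets are assumptions.
AUDIO_EXTS = {".aac", ".m4a", ".wav", ".mp3"}
VIDEO_EXTS = {".mp4", ".mkv", ".mov"}
YOUTUBE_HOSTS = {"youtube.com", "youtu.be"}
FACEBOOK_HOSTS = {"facebook.com", "fb.watch"}


def classify_input(entry: str) -> str:
    """Return "youtube", "facebook", "audio", "video", or "unsupported"."""
    entry = entry.strip()
    parsed = urlparse(entry)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        # Drop a common subdomain prefix so www.facebook.com matches.
        for prefix in ("www.", "m."):
            if host.startswith(prefix):
                host = host[len(prefix):]
                break
        if host in YOUTUBE_HOSTS:
            return "youtube"
        if host in FACEBOOK_HOSTS:
            return "facebook"
        return "unsupported"
    # Anything that is not an http(s) URL is treated as a local path.
    suffix = Path(entry).suffix.lower()
    if suffix in AUDIO_EXTS:
        return "audio"
    if suffix in VIDEO_EXTS:
        return "video"
    return "unsupported"


for item in ("https://youtu.be/abc123", "sample/canton_long.mp3", "notes.pdf"):
    print(item, "->", classify_input(item))
```

The same check could be applied line by line to a batch list such as `temp/mixed-files.txt`, assuming that file holds one path or URL per line.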
- Whisper works best with Python 3.10
- For FB/YT, yt-dlp and ffmpeg are required
- Translation requires internet (for Facebook M2M100)
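The notes above can be turned into a quick preflight check. The sketch below only looks for the `ffmpeg` and `yt-dlp` executables on `PATH` and compares the running interpreter against Python 3.10; `missing_tools` and `python_is_recommended` are hypothetical helper names, and the tool is not known to perform this check itself.

```python
import shutil
import sys


def missing_tools(tools=("ffmpeg", "yt-dlp")):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_recommended(version_info=sys.version_info) -> bool:
    """True when the interpreter is Python 3.10, the version noted above."""
    return tuple(version_info[:2]) == (3, 10)


missing = missing_tools()
if missing:
    print("Missing on PATH:", ", ".join(missing))
if not python_is_recommended():
    print("Note: running Python", sys.version.split()[0], "rather than 3.10")
```

Run it inside the activated virtual environment, since a pip-installed `yt-dlp` script is only on `PATH` there.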
Built with Whisper, Transformers, and love.