Subtitles

A crash-safe Rust pipeline for generating multi-language subtitles from your video library — fast, resumable, and designed for batch jobs.

Features

🎬 Extract audio from MP4/MKV video files
🗣️ Transcribe speech to text using Whisper
🌍 Translate subtitles to multiple languages (English, Spanish, German)
📄 Output in SRT and VTT formats
♻️ Resumable checkpoints (stage-level + translation segment-level) to avoid redoing work
🧹 Built-in cleanup (`subtitles clean`) to reclaim checkpoint storage
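
To make the SRT and VTT outputs concrete, here is a minimal sketch of how a list of timed segments could be rendered into both formats. The `Segment` struct and the `to_srt` / `to_vtt` functions are hypothetical names chosen for illustration; the project's real types may look different.

```rust
/// One subtitle cue: start/end offsets in milliseconds plus its text.
/// (Hypothetical type for illustration only.)
pub struct Segment {
    pub start_ms: u64,
    pub end_ms: u64,
    pub text: String,
}

/// Format milliseconds as HH:MM:SS<sep>mmm.
/// SRT separates the fraction with ',' while WebVTT uses '.'.
fn timestamp(ms: u64, sep: char) -> String {
    let h = ms / 3_600_000;
    let m = ms / 60_000 % 60;
    let s = ms / 1_000 % 60;
    let frac = ms % 1_000;
    format!("{h:02}:{m:02}:{s:02}{sep}{frac:03}")
}

/// Render segments as SRT: numbered cues separated by blank lines.
pub fn to_srt(segments: &[Segment]) -> String {
    let mut out = String::new();
    for (i, seg) in segments.iter().enumerate() {
        // SRT cue numbers start at 1.
        out.push_str(&format!(
            "{}\n{} --> {}\n{}\n\n",
            i + 1,
            timestamp(seg.start_ms, ','),
            timestamp(seg.end_ms, ','),
            seg.text
        ));
    }
    out
}

/// Render segments as WebVTT: a "WEBVTT" header followed by unnumbered cues.
pub fn to_vtt(segments: &[Segment]) -> String {
    let mut out = String::from("WEBVTT\n\n");
    for seg in segments {
        out.push_str(&format!(
            "{} --> {}\n{}\n\n",
            timestamp(seg.start_ms, '.'),
            timestamp(seg.end_ms, '.'),
            seg.text
        ));
    }
    out
}
```

The two formats differ mainly in the header line, the cue numbering, and the decimal separator in timestamps, so one shared `timestamp` helper covers both.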

Why It's Different ✨

💾 Never redo hours of work: crash-safe checkpoints resume from where you left off by default.
🏠 Local-first by design: translate offline with Ollama or plug in OpenAI.
📚 Built for libraries: batch processing with configurable concurrency.
✅ Predictable outputs: clear naming conventions and standards-compliant SRT/VTT.
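
One common way to get crash-safe checkpoints is to write to a temporary file and then rename it over the real one. The sketch below shows that general technique using only the standard library. It is not taken from this project's source; `write_checkpoint`, `read_checkpoint`, and the `.tmp` suffix are assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `path` so that a crash leaves either the previous
/// checkpoint or the new one on disk, not a half-written file.
pub fn write_checkpoint(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Stage the data in a sibling temp file. Keeping it in the same
    // directory keeps the rename below on a single filesystem.
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(contents)?;
    // Ask the OS to flush the data to disk before the checkpoint
    // becomes visible under its final name.
    file.sync_all()?;
    drop(file);
    // Replace the old checkpoint (if any) with the fully written one.
    fs::rename(&tmp, path)
}

/// Read a checkpoint back, treating a missing file as "nothing saved yet".
pub fn read_checkpoint(path: &Path) -> io::Result<Option<Vec<u8>>> {
    match fs::read(path) {
        Ok(bytes) => Ok(Some(bytes)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

A pipeline built this way can call `read_checkpoint` at startup and skip whatever work the saved state already covers.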

Quick Start

cargo build --release
./target/release/subtitles generate movie.mp4 --languages en,es,de

Resume is enabled by default. To force a clean run, pass `--no-resume`.
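
As a rough illustration of stage-level resume, the sketch below decides which stages still need to run given the checkpoints found on disk. The `Stage` names follow the feature list, but the enum, `STAGES`, and `stages_to_run` are hypothetical. The rule of rerunning everything after the first missing checkpoint is also an assumption, not documented behavior.

```rust
/// Pipeline stages in execution order (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    ExtractAudio,
    Transcribe,
    Translate,
    WriteSubtitles,
}

pub const STAGES: [Stage; 4] = [
    Stage::ExtractAudio,
    Stage::Transcribe,
    Stage::Translate,
    Stage::WriteSubtitles,
];

/// Decide which stages still need to run.
///
/// `completed` lists the stages that already have a checkpoint;
/// `no_resume` mirrors the `--no-resume` flag and ignores all checkpoints.
pub fn stages_to_run(completed: &[Stage], no_resume: bool) -> Vec<Stage> {
    if no_resume {
        return STAGES.to_vec();
    }
    // Resume at the first stage without a checkpoint. Later stages are
    // rerun as well, since their inputs may have changed.
    let first_missing = STAGES
        .iter()
        .position(|stage| !completed.contains(stage))
        .unwrap_or(STAGES.len());
    STAGES[first_missing..].to_vec()
}
```

With checkpoints for `ExtractAudio` and `Transcribe` present, this returns only `Translate` and `WriteSubtitles`; with `no_resume` set, it returns all four stages.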

Maintenance 🧹

Reclaim the disk space used by stored checkpoints:

subtitles clean

Documentation

📋 Requirements (requirements.md): User stories and acceptance criteria
🏗️ Design (design.md): Technical architecture and implementation details

Glossary

| Term | Definition |
| --- | --- |
| SRT | SubRip Subtitle format. A simple text-based subtitle format with sequential numbering, timestamps, and text. Widely supported by media players. |
| VTT | WebVTT (Web Video Text Tracks). A subtitle format designed for HTML5 video, supporting styling and positioning. Used by web browsers and streaming platforms. |
| STT | Speech-to-Text. The process of converting spoken audio into written text. Also called automatic speech recognition (ASR). |
| TTS | Text-to-Speech. The inverse of STT: converting written text into spoken audio. Not used in this project but often confused with STT. |
| Whisper | An open-source speech recognition model by OpenAI. Supports multiple languages and produces timestamped transcriptions. |
| Ollama | A tool for running large language models locally. Used here as a translation backend that doesn't require internet access or API keys. |
| FFmpeg | A multimedia framework for handling video, audio, and other multimedia files. Used here to extract audio tracks from video containers. |
| Segment | A single unit of subtitle text with a start time, end time, and content. Multiple segments make up a complete subtitle file. |
| Plex | A media server platform for organizing and streaming personal media collections. This tool generates subtitles compatible with Plex's subtitle discovery conventions. |
| ISO 639 | An international standard for language codes. We use ISO 639-1 two-letter codes (e.g., `en` for English, `es` for Spanish, `de` for German) throughout this project. |
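
To show how language codes and output naming might fit together, here is a small sketch that builds a sidecar subtitle path next to a video file. It assumes the `<video stem>.<lang>.<ext>` layout that media servers such as Plex commonly pick up. The exact naming this tool uses is not stated here, and `Format` and `subtitle_path` are hypothetical names.

```rust
use std::path::{Path, PathBuf};

/// Subtitle formats the tool emits.
pub enum Format {
    Srt,
    Vtt,
}

/// Build a sidecar path such as `movie.es.srt` next to `movie.mp4`.
///
/// Returns `None` when `lang` is not two lowercase ASCII letters or when
/// the video path has no usable file stem. This only checks the *shape*
/// of an ISO 639-1 code; it does not consult the registry of valid codes.
pub fn subtitle_path(video: &Path, lang: &str, format: Format) -> Option<PathBuf> {
    let looks_like_iso639_1 =
        lang.len() == 2 && lang.bytes().all(|b| b.is_ascii_lowercase());
    if !looks_like_iso639_1 {
        return None;
    }
    let stem = video.file_stem()?.to_str()?;
    let ext = match format {
        Format::Srt => "srt",
        Format::Vtt => "vtt",
    };
    // Keep the subtitle in the same directory as the video.
    Some(video.with_file_name(format!("{stem}.{lang}.{ext}")))
}
```

For example, `subtitle_path(Path::new("library/movie.mp4"), "es", Format::Srt)` yields `library/movie.es.srt`, while a three-letter code such as `eng` is rejected.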

License

MIT
