srt-maker

CLI tool that generates SRT subtitles from a video's audio track using speech recognition.

It also includes a dedicated command for burning an existing .srt file into an existing video.

Features

Automatic speech recognition using OpenAI Whisper (local, offline)
Language detection support
Progress indicators for transcription
Configurable output parameters
Timestamp precision control
Minimum subtitle display duration for better readability

Requirements

Python 3.9+
ffmpeg (installed on system)

Installation

Install System Dependencies

Ubuntu/Debian:

sudo apt-get install ffmpeg

macOS:

brew install ffmpeg

Windows: Download from https://ffmpeg.org/download.html

Install Python Package

pip install -e .

For development:

pip install -e ".[dev]"

Usage

Basic Usage

srt-maker video.mp4

This generates video.srt in the same directory.

Options

srt-maker video.mp4 [OPTIONS]

Options:
  video_file                  Path to the input video file (required)
  -o, --output OUTPUT         Output SRT file path (default: <video_name>.srt)
  -m, --model MODEL           Whisper model size: tiny, base, small, medium, large, large-v1, large-v2, large-v3 (default: base)
  -l, --language LANG         Language code (e.g., en, es, fr). Auto-detect if not specified
  -p, --precision N           Timestamp precision in milliseconds (default: 0)
  -d, --device DEVICE         Device to run the model on: cpu, cuda, auto (default: auto)
  --min-display-duration N    Minimum display duration for subtitles in seconds (default: 0.0 - use actual speech duration)
  --no-speech-threshold N     Filter segments with no_speech_prob above this value (default: 0.6)
  --logprob-threshold N       Filter segments with avg_logprob below this value (default: -1.0)
  --temperature N             Whisper decoding temperature (default: 0.0)
  --compression-ratio-threshold N
                              Filter segments with overly repetitive output during decoding (default: 2.4)
  --min-duration N            Minimum segment duration in seconds (default: 0.1)
  --max-repetitions N         Max consecutive repetitions of same text (default: 2)
  --offset N                  Time offset in seconds to add to all timestamps (default: 0.0)
  -v, --verbose               Enable verbose logging
  --help                      Show help message
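
The filtering options above can be read as a chain of per-segment checks. The following is a minimal sketch of that idea, not the tool's actual implementation: the `Segment` dataclass and the `filter_segments` function are hypothetical, the field names `no_speech_prob` and `avg_logprob` are taken from the option descriptions, and the case-insensitive comparison of repeated text is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one transcribed segment."""
    start: float
    end: float
    text: str
    no_speech_prob: float
    avg_logprob: float


def filter_segments(segments, no_speech_threshold=0.6, logprob_threshold=-1.0,
                    min_duration=0.1, max_repetitions=2):
    """Keep only segments that pass every documented threshold."""
    kept = []
    last_text = None
    run_length = 0
    for seg in segments:
        # --no-speech-threshold: drop segments that are probably silence or noise.
        if seg.no_speech_prob > no_speech_threshold:
            continue
        # --logprob-threshold: drop low-confidence decodes.
        if seg.avg_logprob < logprob_threshold:
            continue
        # --min-duration: drop segments that are too short to be real speech.
        if seg.end - seg.start < min_duration:
            continue
        # --max-repetitions: allow at most N consecutive copies of the same text.
        text = seg.text.strip().lower()
        run_length = run_length + 1 if text == last_text else 1
        last_text = text
        if run_length > max_repetitions:
            continue
        kept.append(seg)
    return kept
```

With the defaults, a third consecutive copy of the same line is dropped, as is any segment whose `no_speech_prob` exceeds 0.6 or whose `avg_logprob` falls below -1.0.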

Examples

Generate subtitles with custom output path:

srt-maker video.mp4 -o subtitles.srt

Use tiny model for faster transcription (less accurate):

srt-maker video.mp4 -m tiny

Use large model for better accuracy (slower):

srt-maker video.mp4 -m large

Specify language for better accuracy:

srt-maker video.mp4 -l en

Force CPU usage:

srt-maker video.mp4 -d cpu

Extend short subtitles for better readability (2 second minimum display duration):

srt-maker video.mp4 --min-display-duration 2.0
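
One way to picture what a minimum display duration does is sketched below. This is an illustration only: `apply_min_display_duration` is a hypothetical helper, cues are modeled as `(start, end, text)` tuples in seconds, and capping the extension at the next cue's start time is an assumption rather than documented behavior.

```python
def apply_min_display_duration(cues, min_duration):
    """Extend short cues to min_duration seconds without overlapping the next cue.

    cues: list of (start, end, text) tuples sorted by start time.
    """
    result = []
    for i, (start, end, text) in enumerate(cues):
        if end - start < min_duration:
            new_end = start + min_duration
            # Assumption: an extended cue must not run into the following cue.
            if i + 1 < len(cues):
                new_end = min(new_end, cues[i + 1][0])
            # Never shorten a cue below its original end time.
            end = max(end, new_end)
        result.append((start, end, text))
    return result
```

With a 2.0 second minimum, a cue that spans 1.0 to 1.2 seconds would be extended to end at 3.0 seconds, provided the next cue starts later than that.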

Reproduce a tuned German small-model run with stricter hallucination filtering:

srt-maker input.mp4 -l de -m small -d cuda \
  -o output.srt \
  --no-speech-threshold 0.85 \
  --logprob-threshold -1.45 \
  --temperature 0.0 \
  --compression-ratio-threshold 2.0 \
  --min-duration 0.0 \
  --max-repetitions 1 \
  --similarity-threshold 0.72 \
  --repetition-window 20

Burned Subtitle Rendering

Use srt-burn when you already have a video file and an external subtitle file and want a new video with the subtitles burned into the image.

Basic Usage

srt-burn video.mp4 subtitles.srt

This generates video_subtitled.mp4 in the same directory.
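
The default naming rules for both commands can be expressed with `pathlib`. This is a small sketch of the documented defaults; the function names `default_srt_output` and `default_burn_output` are made up for illustration and may not match the package's internals.

```python
from pathlib import Path


def default_srt_output(video_file):
    """srt-maker default: <video_name>.srt next to the input video."""
    return Path(video_file).with_suffix(".srt")


def default_burn_output(video_file):
    """srt-burn default: <video_name>_subtitled.<ext>, keeping the container extension."""
    video = Path(video_file)
    return video.with_name(f"{video.stem}_subtitled{video.suffix}")
```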

Options

srt-burn video.mp4 subtitles.srt [OPTIONS]

Options:
  video_file                    Path to the input video file (required)
  srt_file                      Path to the input SRT file (required)
  -o, --output OUTPUT           Output video path
                                (default: <video_name>_subtitled.<ext>)
  --font-size N                 Burned subtitle font size
  --bottom-margin N             Bottom margin for burned subtitles
  --primary-color COLOR         Primary subtitle color in #RRGGBB
                                or ASS &H... format
  --use-gpu                     Force NVIDIA NVENC for video encoding
  --no-gpu                      Disable GPU detection and use CPU encoding
  -v, --verbose                 Enable verbose logging
  --help                        Show help message

Examples

Burn subtitles into a video while keeping the same container type:

srt-burn input.mkv input.srt

Write to a specific output path:

srt-burn input.mp4 input.srt -o output_with_subs.mp4

Apply basic subtitle styling:

srt-burn input.mp4 input.srt \
  --font-size 26 \
  --bottom-margin 32 \
  --primary-color "#FFFFFF"
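
For reference, ASS colors are written as `&HAABBGGRR`, with the channels in blue, green, red order and `00` meaning fully opaque. The sketch below shows how the styling flags could be turned into a `force_style` string for ffmpeg's `subtitles` filter. Whether srt-burn does exactly this is an assumption, and `to_ass_color` and `build_force_style` are hypothetical names.

```python
import re


def to_ass_color(color):
    """Convert '#RRGGBB' to ASS '&H00BBGGRR'; pass '&H...' values through unchanged."""
    if color.upper().startswith("&H"):
        return color
    match = re.fullmatch(r"#([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})", color)
    if match is None:
        raise ValueError(f"unsupported color: {color!r}")
    red, green, blue = (part.upper() for part in match.groups())
    # ASS stores the channels in reverse order, with a leading alpha byte.
    return f"&H00{blue}{green}{red}"


def build_force_style(font_size=None, bottom_margin=None, primary_color=None):
    """Build an ASS style override string from the styling options that were given."""
    parts = []
    if font_size is not None:
        parts.append(f"FontSize={font_size}")
    if bottom_margin is not None:
        parts.append(f"MarginV={bottom_margin}")
    if primary_color is not None:
        parts.append(f"PrimaryColour={to_ass_color(primary_color)}")
    return ",".join(parts)
```

Under these assumptions, the styling example above would correspond to the override string `FontSize=26,MarginV=32,PrimaryColour=&H00FFFFFF`.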

Force GPU encoding with NVIDIA NVENC:

srt-burn input.mp4 input.srt --use-gpu

Rendering Notes

UHD and 4K sources use an explicit bottom-centered default subtitle style even when no styling flags are passed.
The video stream is re-encoded because subtitle burning requires a video filter.
H.264 outputs automatically use h264_nvenc when an NVIDIA GPU is available and the installed ffmpeg build supports NVENC; otherwise the burner falls back to libx264.
--use-gpu explicitly requires NVENC and fails fast with a clear error if the current ffmpeg/NVIDIA setup cannot use it.
--no-gpu disables detection and always uses CPU encoding.
Audio streams are copied when possible.
Existing subtitle streams are removed from the output to avoid duplicate subtitles.
Metadata and other non-subtitle streams are preserved where practical.
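
The encoder selection rules in these notes can be summarized as a small decision function. This is a sketch of the described behavior rather than the burner's real code: `nvenc_listed` and `choose_video_encoder` are hypothetical names, detection is reduced to checking the text printed by `ffmpeg -encoders`, and treating `--use-gpu` together with `--no-gpu` as an error is an assumption.

```python
def nvenc_listed(encoders_output):
    """Return True if the text printed by `ffmpeg -encoders` mentions h264_nvenc."""
    return "h264_nvenc" in encoders_output


def choose_video_encoder(nvenc_available, use_gpu=False, no_gpu=False):
    """Pick the H.264 encoder following the documented flag semantics."""
    if use_gpu and no_gpu:
        # Assumption: the two flags are mutually exclusive.
        raise ValueError("--use-gpu and --no-gpu cannot be combined")
    if no_gpu:
        # --no-gpu skips detection and always uses the CPU encoder.
        return "libx264"
    if nvenc_available:
        return "h264_nvenc"
    if use_gpu:
        # --use-gpu requires NVENC, so fail fast instead of falling back.
        raise RuntimeError("NVENC was requested but is not usable with this ffmpeg/NVIDIA setup")
    return "libx264"
```

A real check would also need to confirm that an NVIDIA GPU is present, since an ffmpeg build can list `h264_nvenc` on a machine without one.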

Development

Run Tests

Run all tests:

pytest

Run with coverage:

pytest --cov=srt_maker

Run specific test file:

pytest tests/test_audio_extractor.py

Watch Mode

Run tests in watch mode for continuous feedback during development:

./test_runner.sh --watch

Linting

Run linting checks:

pyflakes srt_maker/

Running the Test Runner

The test_runner.sh script provides automated testing with continuous feedback:

# Run all tests with coverage
./test_runner.sh

# Skip slow tests
./test_runner.sh --skip-slow

# Watch mode (re-runs on file changes)
./test_runner.sh --watch

Project Structure

srt-maker/
├── srt_maker/
│   ├── __init__.py
│   ├── audio_extractor.py   # Audio extraction from video
│   ├── transcriber.py       # Whisper speech recognition
│   ├── srt_generator.py     # SRT file formatting
│   └── cli.py               # CLI interface
├── tests/
│   ├── conftest.py          # Test fixtures
│   ├── test_audio_extractor.py
│   ├── test_transcriber.py
│   ├── test_srt_generator.py
│   └── test_cli.py
├── test_runner.sh           # Automated test runner
└── pyproject.toml

How It Works

Audio Extraction: Extract audio track from video using ffmpeg
Speech Recognition: Use OpenAI Whisper to transcribe audio segments
Language Detection: Automatically detect language (or use specified)
SRT Generation: Format segments into SRT subtitle format
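
The final step of the pipeline above, turning timed segments into SRT text, can be sketched as follows. The SRT layout itself (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, and a blank line) is standard. The function names and the `(start, end, text)` tuple shape are assumptions and may differ from what `srt_generator.py` actually uses.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def format_srt(cues):
    """Render (start, end, text) tuples as SRT blocks separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(format_srt([(0.0, 1.5, "Hello"), (2.0, 3.25, "World")]))
```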

License

MIT
