
whisper-llm

Turn spoken thought into polished prose—entirely offline.

Press a hotkey, speak naturally, and watch your words transform into clean, punctuated text—ready to send. No cloud. No subscription. Just your voice and your machine.

Features

  • Hotkey-activated - Press Scroll Lock to start/stop recording
  • Voice Activity Detection - Automatically detects when you stop speaking
  • GPU-accelerated transcription - Uses faster-whisper via Docker
  • LLM text cleanup - Fixes grammar, punctuation, removes filler words ("um", "uh")
  • Voice commands - Say "new paragraph", "send", "delete last"
  • System tray - Runs quietly in the background
  • 100% local - Your audio never leaves your machine

How It Works

[Hotkey] → Microphone → Voice Detection → Whisper → LLM Cleanup → Paste
                              ↓                         ↓
                        (Silero VAD)              (Ollama, local)

All processing happens on your machine. Audio goes to a local Docker container running faster-whisper, then optionally through a local Ollama LLM for text cleanup.
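The pipeline above can be sketched end-to-end in a few lines. This is purely illustrative — the function names are hypothetical stand-ins for the real stages in `src/pipeline.py`, with placeholder bodies instead of actual audio, Wyoming, or Ollama calls:

```python
# Illustrative sketch of the processing pipeline described above.
# Function names and bodies are placeholders, not the app's real code.

def record_until_silence() -> bytes:
    """Capture microphone audio until the VAD (Silero) detects a pause."""
    return b"\x00" * 16000  # placeholder: ~1 s of silence at 16 kHz

def transcribe(audio: bytes) -> str:
    """Send audio to the local faster-whisper container (Wyoming, port 10300)."""
    return "um so this is uh a test"  # placeholder transcript

def llm_cleanup(text: str) -> str:
    """Optionally pass the transcript through a local Ollama model."""
    fillers = {"um", "uh"}
    return " ".join(word for word in text.split() if word not in fillers)

def run_once() -> str:
    """One hotkey press: record, transcribe, clean, return text to paste."""
    audio = record_until_silence()
    return llm_cleanup(transcribe(audio))
```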

Requirements

  • Windows 10/11
  • Python 3.11+
  • Docker with:
    • faster-whisper container (Wyoming protocol, port 10300)
    • Ollama (port 11434) - optional, for LLM text cleanup
  • GPU recommended for fast transcription (CPU works but slower)

Quick Start

1. Start Docker containers

# Faster-whisper (Wyoming protocol)
docker run -d --name faster-whisper \
  --gpus all \
  -p 10300:10300 \
  rhasspy/wyoming-whisper:latest \
  --model large-v3 --language en

# Ollama (optional, for text cleanup)
docker run -d --name ollama \
  --gpus all \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  ollama/ollama

# Pull an LLM model
docker exec ollama ollama pull qwen3:14b

2. Install whisper-llm

git clone https://github.com/cj-elevate/whisper-llm.git
cd whisper-llm
pip install -r requirements.txt

3. Configure

cp config.example.yaml config.yaml
# Edit config.yaml to customize settings

4. Run

python src/main.py

Press Scroll Lock to start/stop recording. Speak naturally, and text appears in your active window.
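If nothing happens, a quick way to confirm both backends are reachable is a TCP check against the default ports from the Quick Start (adjust if you mapped different ones). This helper is not part of the project, just a standalone sanity check:

```python
# Standalone connectivity check for the two local services the app talks to.
# Ports are the Quick Start defaults: 10300 (faster-whisper), 11434 (Ollama).
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print("faster-whisper (10300):", port_open("localhost", 10300))
    print("ollama         (11434):", port_open("localhost", 11434))
```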

Configuration

Edit config.yaml to customize:

| Setting | Default | Description |
| --- | --- | --- |
| `hotkey` | `scroll lock` | Key to toggle recording |
| `audio.silence_duration_ms` | `1000` | Pause before transcribing (ms) |
| `llm.enabled` | `true` | Enable LLM text cleanup |
| `llm.model` | `qwen3:14b` | Ollama model for cleanup |
| `output.method` | `auto` | How to insert text (clipboard/sendinput/auto) |
| `output.clipboard_restore_delay` | `0.15` | Seconds after paste before restoring clipboard |
| `corrections.enabled` | `true` | Enable post-transcription word corrections |
| `commands.auto_enter_slash` | `true` | Auto-press Enter for lone slash commands |

See config.example.yaml for all options with descriptions.
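Put together, a minimal `config.yaml` using the defaults from the table above might look like this (key nesting is inferred from the dotted setting names; `config.example.yaml` is the authoritative reference):

```yaml
hotkey: scroll lock
audio:
  silence_duration_ms: 1000
llm:
  enabled: true
  model: qwen3:14b
output:
  method: auto
  clipboard_restore_delay: 0.15
corrections:
  enabled: true
commands:
  auto_enter_slash: true
```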

Voice Commands

| Command | Effect |
| --- | --- |
| "send" | Paste text and press Enter |
| "new paragraph" | Insert blank line |
| "new line" | Insert line break |
| "period" / "comma" | Insert punctuation |
| "slash [command]" | Insert slash command (e.g., "slash team" → "/team") |
| "delete last" | Undo last output (Ctrl+Z) |

Auto-Enter for Slash Commands

When you say a slash command by itself (e.g., "slash team"), the command is typed out and Enter is pressed automatically. This lets you trigger CLI commands and tool shortcuts hands-free.

Examples:

  • "slash team" → Outputs /team + presses Enter
  • "slash end" → Outputs /end + presses Enter
  • "use slash team" → Outputs use /team (no Enter - not a lone command)

This works with any slash command you've added to corrections.words in config.yaml.
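The "lone command" rule can be sketched as a single check — the text, after corrections, must be exactly one slash command with nothing around it. This function is illustrative, not the app's actual implementation:

```python
def is_lone_slash_command(text: str) -> bool:
    """True when the (already-corrected) text is exactly one slash command
    on its own, which should trigger an automatic Enter press.

    Illustrative sketch -- the real check lives in the app's pipeline.
    """
    stripped = text.strip()
    return stripped.startswith("/") and " " not in stripped
```

By this rule, `"/team"` qualifies, while `"use /team"` does not, because the command is embedded in a longer phrase.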

Word Corrections

Whisper sometimes misrecognizes domain-specific words. Add corrections in config.yaml:

corrections:
  enabled: true
  words:
    cloud: Claude
    cloud code: Claude Code
    # Slash commands - add your CLI tools here
    slash team: /team
    slash start: /start

Corrections are case-insensitive, whole-word only ("cloudy" won't be changed), and longer phrases are matched first.
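Those three properties — case-insensitive, whole-word, longest-phrase-first — can be expressed with word-boundary regexes, sorting patterns by length so "cloud code" wins over "cloud". A minimal sketch (not the project's actual matcher):

```python
import re

def apply_corrections(text: str, words: dict[str, str]) -> str:
    """Apply whole-word, case-insensitive corrections, longest phrase first.

    Illustrative sketch of the behavior described above; \\b word boundaries
    keep "cloudy" from matching the "cloud" entry.
    """
    for wrong, right in sorted(words.items(), key=lambda kv: -len(kv[0])):
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, right, text, flags=re.IGNORECASE)
    return text
```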

LLM Modes

| Mode | Description |
| --- | --- |
| `raw` | No processing, direct transcription |
| `clean` | Fix grammar, punctuation, remove fillers (default) |

Switch modes via the system tray menu.
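In `clean` mode the transcript goes to Ollama's standard `/api/generate` endpoint. The sketch below shows the request shape; the prompt wording and function names are assumptions for illustration, not the app's actual prompt (see `src/llm.py`):

```python
# Illustrative "clean" mode request to the local Ollama HTTP API.
# The prompt text here is an assumption, not the app's real prompt.
import json
import urllib.request

CLEAN_PROMPT = (
    "Fix grammar and punctuation in the transcript below and remove "
    "filler words. Return only the cleaned text.\n\nTranscript: {text}"
)

def build_request(text: str, model: str = "qwen3:14b") -> dict:
    """Build the JSON body for a non-streaming Ollama generate call."""
    return {
        "model": model,
        "prompt": CLEAN_PROMPT.format(text=text),
        "stream": False,
    }

def clean(text: str) -> str:
    """POST to the local Ollama container and return the cleaned text."""
    body = json.dumps(build_request(text)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"].strip()
```

In `raw` mode this step is skipped entirely and the Whisper transcript is pasted as-is.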

Troubleshooting

See TROUBLESHOOTING.md for common issues:

  • Wyoming server connection issues
  • LLM timeout on cold start
  • Audio capture problems

Project Structure

src/
  main.py        # Entry point
  app.py         # System tray, hotkey, lifecycle
  pipeline.py    # Async audio processing
  audio.py       # Microphone capture, VAD
  transcriber.py # Wyoming protocol client
  llm.py         # Ollama integration
  output.py      # Text injection (clipboard/SendInput)
  config.py      # Configuration loading

Privacy

  • No cloud services - All processing is local
  • No telemetry - No data collection
  • No network calls - Only connects to localhost Docker containers
  • Audio stays local - Never transmitted anywhere

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

License

MIT License - Use it however you like.
