whisper-toggle

Press once to record. Press again to transcribe and paste.

Single-keybinding speech-to-text for Linux desktops.
Pure bash. Local inference. No cloud. No network round-trips.

Super+`  -->  🎙️ Recording...  -->  Super+`  -->  📋 Pasted!

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  1st press   │────>│  Recording   │────>│ Transcribing │────>│   Pasted    │
│  Super + `   │     │  (sox/rec)   │     │ (whisper.cpp)│     │ (clipboard) │
└─────────────┘     └──────┬──────┘     └─────────────┘     └─────────────┘
                           │
                    ┌──────┴──────┐
                    │  2nd press   │  (or silence auto-stops)
                    │  Super + `   │
                    └─────────────┘

The server backend loads the model while you speak, so by the time you stop talking only inference remains and the transcript arrives almost immediately.
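
The single-key toggle can be pictured as a small state check: if a recorder is already running, stop it; otherwise start one. The sketch below is only an illustration of that idea, not the script's actual code. The PID file path, the `RECORD_CMD` array, and the use of `sleep 60` as a stand-in recorder are all assumptions.

```shell
#!/usr/bin/env bash
# Toggle sketch. PIDFILE and RECORD_CMD are illustrative stand-ins, not the
# script's real names; `sleep 60` plays the role of the recorder.
PIDFILE="${PIDFILE:-${TMPDIR:-/tmp}/whisper-toggle-demo.pid}"
RECORD_CMD=(sleep 60)

toggle() {
    local pid
    if [[ -f "$PIDFILE" ]] && pid=$(<"$PIDFILE") && kill -0 "$pid" 2>/dev/null; then
        # Second press: stop the recorder and clear the state file.
        kill "$pid"
        rm -f "$PIDFILE"
        echo "stopped"
    else
        # First press (or stale PID file): start recording in the background.
        "${RECORD_CMD[@]}" >/dev/null 2>&1 &
        echo "$!" > "$PIDFILE"
        echo "recording"
    fi
}
```

Calling `toggle` twice in a row prints `recording` and then `stopped`. A stale PID file left behind by a crash falls through to the start branch because `kill -0` fails for a dead process.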

Quick Start

# Install from AUR
yay -S whisper-toggle

# Interactive setup: GPU, model, backend, keybinding
whisper-toggle-setup

whisper.cpp is required. Install it from the AUR (for example, yay -S whisper.cpp-cuda for NVIDIA GPUs) or build it from source. The setup wizard will guide you.

Features

Single keybinding — toggle recording on/off, transcription auto-pastes
Dual backend — on-demand whisper-server (recommended) or direct whisper-cli
GPU accelerated — CUDA, ROCm, Vulkan, or CPU fallback
X11 + Wayland — auto-detects session, uses the right clipboard/paste tools
Interactive setup — detects GPU, downloads models, configures your WM
XDG compliant — config in ~/.config/, models in ~/.local/share/, temp in /dev/shm/
No daemon — server starts and stops per-transcription, zero background footprint
Smart silence detection — recording stops automatically when you stop speaking
Post-processing — strips non-speech markers, trims whitespace, capitalizes
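
As a rough idea of what the post-processing step could look like, the sketch below removes bracketed or parenthesized markers such as `[BLANK_AUDIO]` or `(music)`, collapses whitespace, and capitalizes the first letter. The function name and the exact patterns are assumptions; the real script may treat markers differently. It needs bash 4 or newer for `${text^}`.

```shell
#!/usr/bin/env bash
# Post-processing sketch: reads raw transcript text on stdin, prints the
# cleaned text. `postprocess` is a hypothetical name.
postprocess() {
    local text
    text=$(
        tr '\n' ' ' |
        sed -E \
            -e 's/\[[^]]*\]|\([^)]*\)//g' \
            -e 's/[[:space:]]+/ /g' \
            -e 's/^ //' \
            -e 's/ $//'
    )
    # Capitalize the first character (bash 4+).
    printf '%s\n' "${text^}"
}
```

For example, `printf ' [BLANK_AUDIO] hello   world (music)\n' | postprocess` prints `Hello world`. Note that this simple pattern would also drop any genuine parenthesized text.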

Dependencies

| Category | Packages |
|----------|----------|
| Required | `bash`, `sox`, `curl`, `jq`, `libnotify`, `libpulse` |
| X11 | `xsel`, `xdotool` |
| Wayland | `wl-clipboard`, `ydotool` |
| Optional | `pciutils` (GPU detection in setup wizard) |
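
Session auto-detection typically comes down to a few environment variables. The following sketch shows one way to pick the tool set listed above and report anything missing from `PATH`. The function names are hypothetical, and checking `WAYLAND_DISPLAY`, `XDG_SESSION_TYPE`, and `DISPLAY` is an assumption about how detection might work, not a description of the script.

```shell
#!/usr/bin/env bash
# Session detection sketch. All function names here are illustrative.
detect_session() {
    if [[ -n "${WAYLAND_DISPLAY:-}" || "${XDG_SESSION_TYPE:-}" == "wayland" ]]; then
        echo wayland
    elif [[ -n "${DISPLAY:-}" || "${XDG_SESSION_TYPE:-}" == "x11" ]]; then
        echo x11
    else
        echo unknown
    fi
}

# Print the clipboard and paste tools for a session type, one per line.
# wl-copy is the binary shipped by the wl-clipboard package.
tools_for_session() {
    case "$1" in
        wayland) printf '%s\n' wl-copy ydotool ;;
        x11)     printf '%s\n' xsel xdotool ;;
        *)       return 1 ;;
    esac
}

# Print every required tool that is not on PATH.
missing_tools() {
    local tool
    for tool in $(tools_for_session "$1"); do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools "$(detect_session)"` prints nothing when the session's tools are all installed.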

Keybindings

The setup wizard auto-detects your WM and offers to configure this for you.

i3

bindsym $mod+grave exec --no-startup-id whisper-toggle

sway

bindsym $mod+grave exec whisper-toggle

Hyprland

bind = $mainMod, grave, exec, whisper-toggle

GNOME

Configured via whisper-toggle-setup using gsettings, or manually:
Settings > Keyboard > Custom Shortcuts

KDE

System Settings > Shortcuts > Custom Shortcuts > Add > Command/URL

Configuration

~/.config/whisper-toggle/whisper-toggle.conf

BACKEND="server"              # "server" or "cli"
WHISPER_SERVER=""              # Path to whisper-server (auto-detected)
WHISPER_CLI=""                 # Path to whisper-cli (auto-detected)
WHISPER_MODEL="~/.local/share/whisper-toggle/models/ggml-small.en.bin"
WHISPER_PORT=58080             # Server backend port
WHISPER_DEVICE=0               # GPU index (-1 for CPU-only)
WHISPER_THREADS=4              # CPU threads for inference
WHISPER_LANGUAGE="en"          # Language code or "auto"
AUTOPASTE=1                    # Auto-paste after transcription
SILENCE_DURATION=3.0           # Seconds of silence before auto-stop
SILENCE_THRESHOLD=3            # Silence sensitivity (%)
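
To show how `SILENCE_DURATION` and `SILENCE_THRESHOLD` could map onto sox, here is a sketch that builds a `rec` command line using sox's `silence` effect. The helper name `rec_command` and the 0.1 second start-of-speech window are assumptions; the script's actual arguments may differ.

```shell
#!/usr/bin/env bash
# Sketch: turn the two silence settings into a sox `rec` command line.
# `rec_command` is a hypothetical helper; it only prints the command.
rec_command() {
    local out=$1
    local dur=${SILENCE_DURATION:-3.0}
    local thr=${SILENCE_THRESHOLD:-3}
    local cmd=(
        rec -q -t wav "$out" rate 16k
        # Start once audio rises above the threshold for 0.1 s, then stop
        # after $dur seconds below the threshold.
        silence 1 0.1 "${thr}%" 1 "$dur" "${thr}%"
    )
    printf '%s\n' "${cmd[*]}"
}
```

With the defaults above, `rec_command /dev/shm/clip.wav` prints `rec -q -t wav /dev/shm/clip.wav rate 16k silence 1 0.1 3% 1 3.0 3%`. Raising the threshold makes the recorder treat louder background noise as silence.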

Models

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| `tiny.en` | 75 MB | ⚡⚡⚡⚡ | ★★ | Quick notes, low-end hardware |
| `base.en` | 142 MB | ⚡⚡⚡ | ★★★ | Everyday use |
| `small.en` | 466 MB | ⚡⚡ | ★★★★ | Recommended |
| `medium.en` | 1.5 GB | ⚡ | ★★★★★ | High accuracy needs |
| `large-v3-turbo` | 1.6 GB | ⚡⚡ | ★★★★★ | Best speed/accuracy ratio |
| `large-v3` | 3.1 GB | ⚡ | ★★★★★ | Maximum accuracy |

Models are downloaded by the setup wizard to ~/.local/share/whisper-toggle/models/.
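
If you want to fetch a model by hand, something like the sketch below should work. The helper names are hypothetical, and the download URL is an assumption based on where whisper.cpp's ggml models are commonly hosted; the wizard may use a different source.

```shell
#!/usr/bin/env bash
# Manual model download sketch. Helper names and the URL are assumptions.
model_path() {
    printf '%s/whisper-toggle/models/ggml-%s.bin\n' \
        "${XDG_DATA_HOME:-$HOME/.local/share}" "$1"
}

model_url() {
    # Assumed hosting location for whisper.cpp ggml models.
    printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-%s.bin\n' "$1"
}

download_model() {
    local dest
    dest=$(model_path "$1")
    [[ -f "$dest" ]] && return 0          # already present
    mkdir -p "${dest%/*}"
    # Download to a temporary name so an interrupted transfer never leaves
    # a truncated file at the final path.
    curl -L --fail -o "$dest.part" "$(model_url "$1")" && mv "$dest.part" "$dest"
}
```

For example, `download_model small.en` would place the file at the path used by the default `WHISPER_MODEL` setting.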

Backends

`server` (recommended)

[key press] ──> whisper-server starts ──> model loads ──> ┐
                recording starts ──────> audio captured ──> inference ──> paste
                                                           server killed

The model loads in parallel with your speech. No persistent daemon — the server is started and killed per use.
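
The flow above can be sketched roughly as follows. This is an illustration under assumptions, not the script itself: `record_audio` is a hypothetical placeholder for the recording step, the function names are invented, and the sketch assumes whisper-server only accepts connections once it is ready to serve. The `/inference` endpoint and the `-m`, `--host`, and `--port` flags come from whisper.cpp's server example.

```shell
#!/usr/bin/env bash
# Per-use server flow sketch. Function names are hypothetical.

# Poll until host:port accepts TCP connections; give up after $3 tries
# (about 0.1 s apart). Uses bash's /dev/tcp redirection.
wait_for_port() {
    local host=$1 port=$2 tries=${3:-100}
    while (( tries-- > 0 )); do
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        sleep 0.1
    done
    return 1
}

# Start the server, record while the model loads, post the audio, print the
# text, and always stop the server afterwards.
transcribe_with_server() {
    local model=$1 wav=$2 port=${3:-58080} server_pid status=0
    whisper-server -m "$model" --host 127.0.0.1 --port "$port" >/dev/null 2>&1 &
    server_pid=$!
    record_audio "$wav"    # hypothetical: blocks until recording stops
    if wait_for_port 127.0.0.1 "$port"; then
        curl -s "http://127.0.0.1:$port/inference" \
            -F file=@"$wav" -F response_format=json | jq -r '.text'
    else
        status=1
    fi
    kill "$server_pid" 2>/dev/null
    return "$status"
}
```

Because the server starts before recording does, the wait in `wait_for_port` is usually already satisfied by the time the audio file is complete.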

`cli`

[key press] ──> recording starts ──> audio captured ──> whisper-cli runs ──> paste

Simpler, but slower — the model loads after you stop speaking.
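
For comparison, the `cli` backend boils down to a single whisper-cli invocation after recording finishes. The sketch below only prints the command it would run. The helper name is hypothetical, and the choice of `-nt` (no timestamps) and `-np` (no extra prints) is an assumption about what a paste-oriented script would want.

```shell
#!/usr/bin/env bash
# CLI backend sketch: build the whisper-cli command from the config values.
# `cli_command` is a hypothetical helper; it prints instead of executing.
cli_command() {
    local wav=$1
    local cmd=(
        "${WHISPER_CLI:-whisper-cli}"
        -m "${WHISPER_MODEL:-$HOME/.local/share/whisper-toggle/models/ggml-small.en.bin}"
        -f "$wav"
        -t "${WHISPER_THREADS:-4}"
        -l "${WHISPER_LANGUAGE:-en}"
        -nt -np
    )
    printf '%s\n' "${cmd[*]}"
}
```

The model is read from disk each time this command runs, which is where the extra delay compared with the server backend comes from.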

GPU Setup

The setup wizard detects your GPU(s) via lspci and recommends the right device.

| GPU | whisper.cpp Package | Build Flag |
|-----|---------------------|------------|
| NVIDIA | `whisper.cpp-cuda` | `-DGGML_CUDA=ON` |
| AMD | `whisper.cpp-hip` | `-DGGML_HIP=ON` |
| Any | `whisper.cpp-vulkan` | `-DGGML_VULKAN=ON` |
| None | `whisper.cpp` | (default) |
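
As a sketch of how the table above could be applied to `lspci` output, the function below reads the listing on stdin and prints a package name. The function name and the matching rules are assumptions; the wizard's real logic may be more thorough, for instance on machines with more than one GPU.

```shell
#!/usr/bin/env bash
# GPU-to-package sketch. Reads `lspci` output on stdin.
# `recommend_package` is a hypothetical name.
recommend_package() {
    local gpus
    # Keep only graphics-class devices.
    gpus=$(grep -Ei '(VGA compatible|3D|Display) controller' || true)
    if grep -qi 'nvidia' <<<"$gpus"; then
        echo whisper.cpp-cuda
    elif grep -qiE 'advanced micro devices|amd/ati|radeon' <<<"$gpus"; then
        echo whisper.cpp-hip
    elif [[ -n "$gpus" ]]; then
        echo whisper.cpp-vulkan    # some other GPU: Vulkan is the generic option
    else
        echo whisper.cpp           # no GPU found: CPU build
    fi
}
```

Typical use would be `lspci | recommend_package`.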

Troubleshooting

No sound recorded

Check that PulseAudio/PipeWire is running and rec can access your mic:

rec -q -t wav /tmp/test.wav rate 16k

Server failed to start

Check the log for whisper-server errors:

cat /tmp/whisper-toggle.log

Common causes: wrong GPU device index, missing CUDA/Vulkan drivers, model file not found.

Nothing pastes

X11: Install xsel and xdotool
Wayland: Install wl-clipboard and ydotool, ensure ydotoold is running

Double triggers

The 500 ms debounce should prevent this. If your WM still sends multiple key events, use exec --no-startup-id (i3) or increase the debounce in the script.
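
A debounce of this kind can be done with a timestamp file, as in the sketch below. The variable and function names are assumptions rather than the script's own, and `date +%s%3N` requires GNU date.

```shell
#!/usr/bin/env bash
# Debounce sketch: ignore a key press that arrives within DEBOUNCE_MS of the
# previously accepted one. Names here are illustrative.
STAMP_FILE="${STAMP_FILE:-${TMPDIR:-/tmp}/whisper-toggle-demo.stamp}"
DEBOUNCE_MS=500

accept_press() {
    local now last=0
    now=$(date +%s%3N)                      # milliseconds since the epoch (GNU date)
    [[ -f "$STAMP_FILE" ]] && last=$(<"$STAMP_FILE")
    last=${last:-0}
    if (( now - last < DEBOUNCE_MS )); then
        return 1                            # too soon: treat as a bounce
    fi
    printf '%s\n' "$now" > "$STAMP_FILE"
    return 0
}
```

A handler would then start with `accept_press || exit 0`. Raising `DEBOUNCE_MS` widens the window in which repeated events are dropped.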

whisper-server / whisper-cli not found

The script searches ~/whisper.cpp/build/bin/, ~/.local/bin/, /usr/local/bin/, and /usr/bin/. You can also set WHISPER_SERVER / WHISPER_CLI explicitly in the config.
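
The lookup described above might be implemented along these lines. This is a sketch with hypothetical names: an explicit path from the config wins, otherwise the directories are tried in order.

```shell
#!/usr/bin/env bash
# Binary lookup sketch. SEARCH_DIRS and find_binary are illustrative names.
SEARCH_DIRS=(
    "$HOME/whisper.cpp/build/bin"
    "$HOME/.local/bin"
    /usr/local/bin
    /usr/bin
)

# Usage: find_binary NAME [EXPLICIT_PATH]
find_binary() {
    local name=$1 override=${2:-} dir
    if [[ -n "$override" ]]; then
        # An explicit setting is used as-is or rejected; no fallback search.
        [[ -x "$override" ]] && { printf '%s\n' "$override"; return 0; }
        return 1
    fi
    for dir in "${SEARCH_DIRS[@]}"; do
        if [[ -x "$dir/$name" ]]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

For example, `find_binary whisper-server "$WHISPER_SERVER"` prints the resolved path or returns non-zero if nothing executable is found.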

MIT License. Built with whisper.cpp.
