Releases: goodroot/hyprwhspr
v1.29.3
Bug Fixes
- Tray: Fixed error:model_missing appearing for faster-whisper users. The model_exists() check was falling through to the pywhispercpp ggml file lookup for any unrecognised backend. It now uses an import-based check, matching the existing onnx-asr pattern.
- ydotool version detection: Added an rpm fallback alongside the existing dpkg and pacman checks. Without it, ydotool 1.0+ was falsely reported as 0.1.0 on openSUSE, Fedora, and RHEL-based systems. (PR #170, @Legend28469)
v1.29.0
What's New
post_transcription_hook — pipe transcriptions through any shell command
You can now define a post_transcription_hook in your config to transform transcribed text before it's pasted. The hook
receives text on stdin; non-empty stdout replaces it. An empty stdout leaves the text unchanged, so observers like loggers work
without side effects. A broken or slow hook (5 s timeout) never silently drops a dictation — the original text is always
preserved as a fallback.
"post_transcription_hook": "sed 's|.*|<dictation>&</dictation>|'"
Env vars HYPRWHSPR_MODEL and HYPRWHSPR_BACKEND are available to the hook for context-aware transforms.
See docs/CONFIGURATION.md for full details and examples.
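To illustrate the contract described above, here is a minimal Python sketch that pipes text through a shell command and falls back to the original text on failure. The function name apply_hook and its parameters are hypothetical and not taken from hyprwhspr's source. Treating a non-zero exit status as a failure and stripping trailing newlines from the hook output are assumptions.

```python
import os
import subprocess


def apply_hook(text, command, model="", backend="", timeout=5.0):
    """Pipe text through a shell command; return the original text on any failure.

    Hypothetical helper: the name, signature, and exit-code handling are assumptions.
    """
    # Expose context to the hook using the documented env var names.
    env = dict(os.environ, HYPRWHSPR_MODEL=model, HYPRWHSPR_BACKEND=backend)
    try:
        result = subprocess.run(
            command,
            shell=True,
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout,
            env=env,
        )
    except (subprocess.TimeoutExpired, OSError):
        return text  # slow or unlaunchable hook: keep the dictation
    output = result.stdout.rstrip("\n")  # assumption: drop trailing newlines
    if result.returncode != 0 or not output:
        return text  # broken hook or empty stdout: keep the dictation
    return output
```

With this shape, an observer such as a logger can consume stdin and print nothing, and the dictation passes through unchanged.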
Bug fix: mic-osd daemon uses active Python interpreter
The mic-osd overlay daemon previously hardcoded /usr/bin/python3, which broke on systems using a venv or a non-standard Python
layout. It now uses sys.executable — the same interpreter running the service — so package visibility is always consistent.
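The idea can be sketched in a few lines of Python. The helper name spawn_with_same_interpreter is hypothetical, and the sketch does not reproduce how hyprwhspr actually launches the mic-osd daemon.

```python
import subprocess
import sys


def spawn_with_same_interpreter(code):
    """Run a Python snippet in a child process using the current interpreter."""
    # sys.executable is the path of the running interpreter, so a parent
    # started from a venv gets a child that sees the same packages.
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()
```

A hardcoded /usr/bin/python3 in place of sys.executable would instead pick up the system interpreter, regardless of which environment the service runs in.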
Contributors
Thanks to @mmacpherson for the post_transcription_hook feature (#167) and @Stark-X for the Python fix (#168).
Full Changelog: v1.28.0...v1.29.0
v1.28.0
Keyboard hotplug detection
Keyboards plugged in after the service starts — USB hubs, docking stations, Bluetooth reconnects — are now detected and grabbed automatically without a service restart.
Hotplug is opt-in via the new keyboard_device_names allowlist. This is intentional: the same broad capability filter that makes hotplug useful can also grab mice and media controllers that report keyboard-like keys (Logitech MX series, macro pads, etc.). Listing your keyboards by name is the explicit signal that auto-grab is safe.
"keyboard_device_names": ["Your Keyboard Name", "Another Keyboard"]
When the allowlist is unset, behaviour is unchanged from previous releases.
Keyboard allowlist and improved keyboard list output
hyprwhspr keyboard list now shows:
- [ALLOWED] — devices on your allowlist
- [VIRTUAL] — virtual/UInput nodes (ydotoold, hyprwhspr's own virtual keyboard) so you don't accidentally allowlist them
- A ready-to-paste snippet suggesting the allowlist when no selection is configured, including a note that setting it enables hotplug
Devices the running service has already grabbed are no longer hidden from keyboard list output.
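The allowlist and labelling rules above can be sketched as follows. The Device class and both function names are hypothetical, and matching on the exact device name is an assumption; hyprwhspr's real matching rules may differ.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    is_virtual: bool = False  # e.g. a UInput node such as ydotoold's


def label(device, allowlist):
    """Return the tag shown next to a device in the list output."""
    if device.is_virtual:
        return "[VIRTUAL]"
    if allowlist and device.name in allowlist:
        return "[ALLOWED]"
    return ""


def should_grab_on_hotplug(device, allowlist):
    """Decide whether a newly plugged device is grabbed automatically."""
    # Hotplug is opt-in: with no allowlist configured, nothing is auto-grabbed.
    if not allowlist or device.is_virtual:
        return False
    return device.name in allowlist  # assumption: exact name match
```

Under these rules a mouse that reports keyboard-like keys is never grabbed on hotplug unless its name has been listed explicitly.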
Fixes
- Devices that fail to grab no longer block subsequent hotplug events
- Self-grab prevention: hyprwhspr's own UInput virtual keyboard is skipped during discovery and hotplug to prevent input feedback loops
Big thanks to @mmacpherson for the PR.
PRs
- feat: opt-in keyboard hotplug via keyboard_device_names allowlist by @mmacpherson in #165
- chore(deps): bump astro from 6.1.2 to 6.1.6 in /website by @dependabot[bot] in #166
Full Changelog: v1.27.0...v1.28.0
v1.27.0
Features
- Google Gemini Live API: Added Gemini Live as a WebSocket (realtime-ws) provider. Configure via provider: gemini with your API key. Supports streaming transcription with the same interface as the OpenAI Realtime backend.
- Capture current transcription: New hyprwhspr record capture command that retrieves the in-progress transcription from a live session without stopping it. Great for scripting off stdout.
- Environment variable expansion in credentials: Values in credentials.json now support $ENV_VAR / ${ENV_VAR} substitution, so API keys can be sourced from the environment rather than stored as plain text.
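As a rough illustration of the credentials expansion, the following Python sketch expands $VAR and ${VAR} references in string values using the standard library. The function name load_credentials and the flat key/value layout are assumptions, not hyprwhspr's actual implementation. os.path.expandvars leaves references to unset variables unchanged; how hyprwhspr treats unset variables is not stated in these notes.

```python
import json
import os


def load_credentials(raw_json):
    """Parse credentials JSON and expand $VAR / ${VAR} in string values."""
    data = json.loads(raw_json)
    return {
        # Only strings are expanded; other JSON types pass through untouched.
        key: os.path.expandvars(value) if isinstance(value, str) else value
        for key, value in data.items()
    }
```

The file itself can then hold a reference such as ${MY_API_KEY} (a hypothetical variable name) while the secret lives only in the environment.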
Fixes
- Fixed language switching in the Gemini realtime client
- Fixed realtime-ws backend support in the meeting recorder utility (utils/meeting-recorder.py)
- Minor CLI copy/flavour text corrections
Internal
- Lock safety improvements in the main service and CLI commands
Pull Requests
- feat(realtime): add google gemini live api as websocket provider by @ali205412 in #161
- feat: capture the current transcription by @cenk1cenk2 in #163
- feat(config): extend environment variables in the credentials file by @cenk1cenk2 in #164
New Contributors
- @cenk1cenk2 made their first contribution in #163
Full Changelog: v1.26.1...v1.27.0
v1.26.1
What's changed
New: continuous recording mode
A new "continuous" recording mode that listens indefinitely and auto-transcribes whenever a pause in speech is detected.
Transcriptions are injected in real time as you speak, without pressing any key.
Enable via config or hyprwhspr setup:
{
"recording_mode": "continuous"
}
Experimental. See the configuration docs for silence threshold and floor-finding options.
Fix: language auto-detection on pywhispercpp backend (#160)
When "language" was null (the documented default for auto-detect), the pywhispercpp backend was silently falling back to whisper.cpp's compiled-in default of "en", causing non-English speech to be transcribed as garbled English. The backend now calls auto_detect_language() explicitly when no language is set.
Users who pinned a language code (e.g. "language": "fr") were unaffected.
Feat: beam search decoding on pywhispercpp and faster-whisper backends (#159)
Both local backends now default to beam search with beam_size: 5, matching whisper-cli behavior. Previously, pywhispercpp used greedy decoding with no way to change it. The difference is most noticeable on non-English audio and noisier input.
Two new config keys:
{
"sampling_strategy": "beam_search",
"beam_size": 5
}
sampling_strategy accepts "beam_search" (default) or "greedy". faster-whisper's previously hardcoded beam_size: 5 now also respects the beam_size config key.
Note: Changing sampling_strategy requires a service restart on the pywhispercpp backend.
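One way to read the two decoding keys with their documented defaults is sketched below. The function resolve_decoding is hypothetical, and mapping greedy decoding to a beam of 1 is an assumption about how a backend might interpret it, not a description of hyprwhspr's code.

```python
def resolve_decoding(config):
    """Return (strategy, beam_size) from a config dict, applying the documented defaults."""
    strategy = config.get("sampling_strategy", "beam_search")
    if strategy not in ("beam_search", "greedy"):
        raise ValueError(f"unknown sampling_strategy: {strategy!r}")
    beam_size = int(config.get("beam_size", 5))
    if beam_size < 1:
        raise ValueError("beam_size must be at least 1")
    # Assumption: greedy decoding ignores beam_size and behaves like a beam of 1.
    return (strategy, 1 if strategy == "greedy" else beam_size)
```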
Fix: audio ducking timing
Audio ducking now activates after the stream is confirmed open rather than before, preventing a race where the duck fired before audio capture was ready.
Fix: audio stream startup reliability
Improved retry logic on stream open to handle transient PortAudio errors (-9987 paTimedOut, -9999 PulseAudio "No such entity") that could occur during service startup before the audio server had fully settled.
Full Changelog: v1.26.0...v1.26.1
v1.25.2
What's Changed
Fix: Microphone indicator no longer shows while idle
On some desktops (GNOME, Ubuntu, and others), hyprwhspr appeared to be always listening — the system microphone indicator would appear as soon as the service started, even when not recording.
This was caused by a recently implemented keepalive audio stream that held the microphone open between recordings to prevent cold-start timeouts on certain ALSA hardware. It worked, but had the unintended side effect of looking like hyprwhspr was constantly recording. It's not!
The keepalive stream is now opt-in and disabled by default. If you previously relied on it to avoid paTimedOut errors on your first recording after the mic has been idle, you can re-enable it:
{
"keepalive_stream": true
}
See the configuration docs for more info.
Website!
Also adds https://hyprwhspr.com - check it out!
Full Changelog: v1.25.1...v1.25.2
v1.25.1
Bug Fixes
- Fix crash on startup with non-Cohere backends (#154)
  get_credential was re-imported as a local variable inside initialize() and _reinitialize_cohere_transcribe() in whisper_manager.py. Python's scoping rules marked it local for the entire method at compile time, causing an UnboundLocalError whenever a non-Cohere backend (rest-api, realtime-ws, pywhispercpp) was configured. Redundant local imports removed; the module-level import is used throughout.
- Fix USB microphone failures after idle on ALSA systems (#153)
  USB microphones managed via raw ALSA can power down after ~30 seconds of silence. When hyprwhspr tried to open the device for the next recording, PortAudio's PaUnixThread_New timed out (paTimedOut / error -9987). This release introduces a silent keepalive stream that holds the ALSA device open between recordings, preventing the kernel from suspending it. The keepalive is only started when a multiplexed audio server (PipeWire or PulseAudio) is detected; on raw ALSA it is skipped to avoid holding an exclusive lock that would block other applications.
  Additional hardening: stream start now retries up to 3 times with a brief pause on timeout, handling cold-start races on service startup before the audio node has fully warmed up.
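The scoping behaviour behind the first fix can be reproduced in isolation. This sketch uses json.dumps as a stand-in for get_credential; the functions broken and fixed are illustrative and do not mirror whisper_manager.py.

```python
from json import dumps  # module-level import, visible to every function


def broken(backend):
    if backend == "cohere-transcribe":
        # Redundant local import: it makes `dumps` a local name for the
        # whole function, even on code paths where this line never runs.
        from json import dumps
    return dumps({"backend": backend})


def fixed(backend):
    # No local import, so the module-level name is used on every path.
    return dumps({"backend": backend})
```

Calling broken("pywhispercpp") raises UnboundLocalError because the local name is never bound on that path, while broken("cohere-transcribe") and every call to fixed() succeed.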
Configuration
- New setting stream_start_retry_delay (default: 1.5 seconds) controls how long to wait between stream start retries on timeout. Increase this value if your USB microphone frequently fails on the first recording after idle.
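A minimal sketch of retry-with-delay on stream start is shown below. StreamTimeout and open_with_retry are hypothetical names standing in for a PortAudio paTimedOut error and the service's stream-open routine. Reading "up to 3 times" as three attempts in total is an assumption.

```python
import time


class StreamTimeout(Exception):
    """Hypothetical stand-in for a PortAudio paTimedOut error."""


def open_with_retry(open_stream, attempts=3, retry_delay=1.5, sleep=time.sleep):
    """Call open_stream(), retrying on timeout with a pause between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return open_stream()
        except StreamTimeout:
            if attempt == attempts:
                raise  # out of attempts: surface the timeout to the caller
            sleep(retry_delay)  # give the audio node time to warm up
```

The sleep parameter exists only so the pause can be replaced in tests; retry_delay plays the role of stream_start_retry_delay.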
Full Changelog: v1.25.0...v1.25.1
v1.25.0
This is a big one!
You can now select Cohere Transcribe, which is #1 on the Hugging Face Open ASR Leaderboard.
It's a gated model and requires a Hugging Face API key, but it's well worth it for the speed and performance.
Enjoy!
Features
New: Cohere Transcribe backend
A new local transcription backend powered by CohereLabs/cohere-transcribe-03-2026 via HuggingFace Transformers. 🇨🇦
- #1 on the Open ASR Leaderboard: 5.42 average WER, beating Whisper large-v3 (7.44 WER) at 3× the throughput
- Supports 14 languages: English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Arabic, Vietnamese, Chinese, Japanese, Korean
- Runs on GPU (bfloat16, ~4–5 GB VRAM) or CPU (float32, ~8 GB RAM)
- Optional torch.compile support for faster throughput after first-call warmup
- Select during setup: hyprwhspr setup → [8] Cohere Transcribe
The model is gated on HuggingFace. You must accept the license at huggingface.co/CohereLabs/cohere-transcribe-03-2026 and provide a read token during setup.
New: Cohere REST API provider
Cohere's cloud transcription API (https://api.cohere.com/v2/audio/transcriptions) is now available as a REST API provider —
Canadian-hosted, same Apache 2.0 model, no local GPU required.
Cohere's API requires a language parameter. Add "language": "en" (or your language code) to your config.
Background model loading
Heavier backends (currently cohere-transcribe) now load their model in a background thread on startup. Keyboard shortcuts and the
recording FIFO are active immediately — recording is blocked with a desktop notification until the model is ready.
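The background-loading pattern can be sketched with a worker thread and an event flag. The class BackgroundModel and its methods are hypothetical, the notify callback stands in for a desktop notification, and error handling for a failed load is omitted.

```python
import threading


class BackgroundModel:
    """Load a model off the main thread and gate recording on readiness."""

    def __init__(self, loader):
        self._ready = threading.Event()
        self.model = None
        threading.Thread(target=self._load, args=(loader,), daemon=True).start()

    def _load(self, loader):
        self.model = loader()  # potentially slow: runs in the worker thread
        self._ready.set()

    def wait_ready(self, timeout=None):
        return self._ready.wait(timeout)

    def try_start_recording(self, notify=print):
        if not self._ready.is_set():
            notify("Model is still loading")  # stand-in for a desktop notification
            return False
        return True
```

Because the constructor returns immediately, shortcut handling can start while the model loads; only the recording request itself is refused until the event is set.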
AT-SPI window detection fallback
text_injector now falls back to AT-SPI (gi.repository.Atspi) for active window detection on native Wayland compositors
(GNOME, KDE, etc.) where hyprctl is unavailable. The probe runs once with a 0.5 s timeout and caches the result; AT-SPI is
accessed under a lock as it is not thread-safe.
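A probe that runs once, is bounded by a timeout, and is cached behind a lock might look like the sketch below. CachedProbe is a hypothetical class and the probe callable stands in for the real AT-SPI check; no AT-SPI calls are shown because their exact usage in text_injector is not given here.

```python
import threading


class CachedProbe:
    """Run a possibly slow availability check once and cache the answer."""

    def __init__(self, probe, timeout=0.5):
        self._probe = probe
        self._timeout = timeout
        self._lock = threading.Lock()  # serialises access to the probe and cache
        self._result = None  # None means "not probed yet"

    def available(self):
        with self._lock:
            if self._result is None:
                self._result = self._run_once()
            return self._result

    def _run_once(self):
        outcome = []

        def target():
            try:
                outcome.append(bool(self._probe()))
            except Exception:
                outcome.append(False)  # a failing probe counts as unavailable

        worker = threading.Thread(target=target, daemon=True)
        worker.start()
        worker.join(self._timeout)  # give up waiting after the timeout
        return bool(outcome and outcome[0])
```

A probe that has not answered within the timeout is recorded as unavailable, and later calls return the cached value without probing again.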
Bug fixes
- Ptyxis terminal detection: added org.gnome.ptyxis app ID alongside the legacy io.gitlab.ptyxis.ptyxis identifier so type-aware paste works correctly in newer Ptyxis releases.
- faster-whisper backend verification: _verify_backend_installation now correctly checks faster-whisper installations (was previously skipped).
Full Changelog: v1.24.0...v1.25.0
v1.24.0
Improved
Paste compatibility on non-Hyprland systems
Window detection now falls back to xdotool + xprop when hyprctl is unavailable, covering GNOME, KDE, and other compositors running under XWayland. Ptyxis (the default Fedora 43 terminal) is now recognized as a terminal for automatic Ctrl+Shift+V paste selection.
Users on pure Wayland compositors without xdotool will see a note at the end of setup pointing to the paste_mode config option if terminal paste won't auto-detect.
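The fallback order described in this section can be sketched as a simple tool lookup. The function pick_window_detector and its return values are hypothetical, and deciding availability by searching PATH is an assumption about how the check is done.

```python
import shutil


def pick_window_detector(which=shutil.which):
    """Choose a window-detection method based on which tools are installed."""
    if which("hyprctl"):
        return "hyprctl"
    if which("xdotool") and which("xprop"):
        return "xdotool+xprop"
    return None  # no auto-detection: the user may need to set paste_mode
```

The which parameter defaults to shutil.which and is injectable so the decision logic can be exercised without the tools installed.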
Setup wizard fixes
- Backend installation skip now correctly gates model selection and download — previously the model prompt and download would still run even if installation was skipped
- "Service is running manually" warning moved to after the systemd setup step, where it's contextually relevant
- Setup summary now shows the normalized backend name (e.g. vulkan instead of amd for AMD GPU users)
- onnx-asr default model now shown in the setup summary, consistent with other backends
- Parakeet → onnx-asr migration prompt expanded with a clear description of what changes and why
Backend selection clarity
- Whisper NVIDIA and Whisper Vulkan options now explicitly noted as fastest local transcription for their respective GPU families
- Parakeet TDT V3 description updated to "Leading accuracy, no GPU required" to better communicate its value proposition
Full Changelog: v1.23.1...v1.24.0