Releases: goodroot/hyprwhspr

v1.29.3

04 May 16:06

Bug Fixes

  • Tray: Fixed error:model_missing appearing for faster-whisper users. The model_exists() check was falling through to the pywhispercpp ggml file lookup for any unrecognised backend. Now uses an import-based check, matching the existing onnx-asr
    pattern.

  • ydotool version detection: Added rpm fallback alongside the existing dpkg and pacman checks. Without it, ydotool 1.0+ was falsely reported as 0.1.0 on openSUSE, Fedora, and RHEL-based systems. (PR #170, @Legend28469)
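
The package-manager fallback described in the ydotool fix can be sketched as follows. This is a minimal illustration, not hyprwhspr's actual code: the function names, the package name "ydotool", and the exact query flags are assumptions.

```python
import shutil
import subprocess

# One query per package manager, tried in order. The package name "ydotool"
# and these flags are assumptions made for this sketch.
QUERIES = [
    ["dpkg-query", "-W", "-f=${Version}", "ydotool"],
    ["pacman", "-Q", "ydotool"],
    ["rpm", "-q", "--queryformat", "%{VERSION}", "ydotool"],
]


def run_query(cmd):
    """Run one query; return its trimmed stdout, or None if it is unusable."""
    if shutil.which(cmd[0]) is None:
        return None  # this package manager is not installed
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    out = result.stdout.strip()
    return out if result.returncode == 0 and out else None


def detect_version(run=run_query):
    """Return the first version string any package manager reports, else None."""
    for cmd in QUERIES:
        out = run(cmd)
        if out:
            # pacman prints "name version"; the other queries print the version only.
            return out.split()[-1]
    return None
```

Passing `run` as a parameter keeps the fallback order testable on machines where none of the three package managers is present.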

v1.29.0

29 Apr 14:41
82834c5

What's New

post_transcription_hook — pipe transcriptions through any shell command

You can now define a post_transcription_hook in your config to transform transcribed text before it's pasted. The hook
receives text on stdin; non-empty stdout replaces it. An empty stdout leaves the text unchanged, so observers like loggers work
without side effects. A broken or slow hook (5 s timeout) never silently drops a dictation — the original text is always
preserved as a fallback.

"post_transcription_hook": "sed 's|.*|<dictation>&</dictation>|'"

Env vars HYPRWHSPR_MODEL and HYPRWHSPR_BACKEND are available to the hook for context-aware transforms.

See docs/CONFIGURATION.md for full details and examples.
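
The hook contract described above (text on stdin, non-empty stdout replaces it, original text kept on failure or timeout) can be sketched in Python. This is an illustration under stated assumptions, not the project's implementation: the function name apply_hook, treating a non-zero exit status as "broken", and trimming trailing newlines are all assumptions.

```python
import os
import subprocess


def apply_hook(text, hook_cmd, timeout=5.0, model="", backend=""):
    """Return the hook's output, or the original text if the hook fails or prints nothing."""
    # The two environment variables named in the release notes.
    env = dict(os.environ, HYPRWHSPR_MODEL=model, HYPRWHSPR_BACKEND=backend)
    try:
        result = subprocess.run(
            hook_cmd,
            shell=True,
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout,
            env=env,
        )
    except (OSError, subprocess.TimeoutExpired):
        return text  # broken or slow hook: keep the dictation
    if result.returncode != 0:
        return text  # assumption: a failing exit status counts as "broken"
    # Assumption: trailing newlines added by the command are trimmed.
    output = result.stdout.rstrip("\n")
    return output if output else text
```

With this sketch, a logger-style hook such as `cat > /dev/null` leaves the text untouched, while a transform such as `tr a-z A-Z` replaces it.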


Bug fix: mic-osd daemon uses active Python interpreter

The mic-osd overlay daemon previously hardcoded /usr/bin/python3, which broke on systems using a venv or a non-standard Python
layout. It now uses sys.executable — the same interpreter running the service — so package visibility is always consistent.
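
A small sketch of why sys.executable helps: a child process launched through it runs under the same interpreter, and therefore sees the same installed packages, as the parent. The function name is hypothetical and the inline child script stands in for the real daemon.

```python
import subprocess
import sys


def child_interpreter():
    """Spawn a child with the current interpreter and report which binary it runs under."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.executable)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Inside a venv this returns the venv's Python, whereas a hardcoded /usr/bin/python3 would bypass the venv and its packages.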

Contributors

Thanks to @mmacpherson for the post_transcription_hook feature (#167) and @Stark-X for the Python fix (#168).

Full Changelog: v1.28.0...v1.29.0

v1.28.0

22 Apr 15:42

Keyboard hotplug detection

Keyboards plugged in after the service starts — USB hubs, docking stations, Bluetooth reconnects — are now detected and grabbed automatically without a service restart.

Hotplug is opt-in via the new keyboard_device_names allowlist. This is intentional: the same broad capability filter that makes hotplug useful can also grab mice and media controllers that report keyboard-like keys (Logitech MX series, macro pads, etc.). Listing your keyboards by name is the explicit signal that auto-grab is safe.

"keyboard_device_names": ["Your Keyboard Name", "Another Keyboard"]

When the allowlist is unset, behaviour is unchanged from previous releases.
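
The opt-in rule can be sketched as a small predicate. This is a hypothetical illustration: the Device type, the function name, and exact-name matching are assumptions, and the real service may match device names differently.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    is_virtual: bool = False  # e.g. ydotoold or hyprwhspr's own UInput keyboard


def should_grab_on_hotplug(device, allowlist):
    """Decide whether a newly plugged device may be grabbed automatically."""
    if not allowlist:
        return False  # allowlist unset: hotplug stays off, as in previous releases
    if device.is_virtual:
        return False  # never grab virtual keyboards, to avoid feedback loops
    # Assumption: names are compared exactly.
    return device.name in allowlist
```

Requiring an explicit name keeps mice and macro pads that report keyboard-like keys out of the grab set.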

Keyboard allowlist and improved keyboard list output

hyprwhspr keyboard list now shows:

  • [ALLOWED] — devices on your allowlist
  • [VIRTUAL] — virtual/UInput nodes (ydotoold, hyprwhspr's own virtual keyboard) so you don't accidentally allowlist them
  • A ready-to-paste snippet suggesting the allowlist when no selection is configured, including a note that setting it enables hotplug

Devices the running service has already grabbed are no longer hidden from keyboard list output.

Fixes

  • Devices that fail to grab no longer block subsequent hotplug events
  • Self-grab prevention: hyprwhspr's own UInput virtual keyboard is skipped during discovery and hotplug to prevent input
    feedback loops

Big thanks to @mmacpherson for the PR.

PRs

  • feat: opt-in keyboard hotplug via keyboard_device_names allowlist by @mmacpherson in #165
  • chore(deps): bump astro from 6.1.2 to 6.1.6 in /website by @dependabot[bot] in #166

Full Changelog: v1.27.0...v1.28.0

v1.27.0

19 Apr 03:37
7a8f438

Features

  • Google Gemini Live API — Added Gemini Live as a WebSocket (realtime-ws) provider. Configure via provider: gemini with your API key. Supports streaming transcription with the same interface as the OpenAI Realtime backend.

  • Capture current transcription — New hyprwhspr record capture command that retrieves the in-progress transcription from a live session without stopping it. Great for scripting off stdout.

  • Environment variable expansion in credentials — Values in credentials.json now support $ENV_VAR / ${ENV_VAR} substitution, so API keys can be sourced from the environment rather than stored as plain text.
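
The credential expansion above can be approximated with the standard library. This sketch assumes a flat mapping of provider name to value, which may not match the real layout of credentials.json, and it leaves undefined variables unexpanded, which may differ from hyprwhspr's behaviour.

```python
import os


def expand_credentials(raw):
    """Expand $VAR and ${VAR} in string values; leave other values untouched."""
    return {
        key: os.path.expandvars(value) if isinstance(value, str) else value
        for key, value in raw.items()
    }
```

With DEMO_API_KEY exported in the environment, a stored value of "$DEMO_API_KEY" or "${DEMO_API_KEY}" resolves to the key at load time, so the file itself never holds the secret.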

Fixes

  • Fixed language switching in the Gemini realtime client
  • Fixed realtime-ws backend support in the meeting recorder utility (utils/meeting-recorder.py)
  • Minor CLI copy/flavour text corrections

Internal

  • Lock safety improvements in the main service and CLI commands

Pull Requests

  • feat(realtime): add google gemini live api as websocket provider by @ali205412 in #161
  • feat: capture the current transcription by @cenk1cenk2 in #163
  • feat(config): extend environment variables in the credentials file by @cenk1cenk2 in #164

Full Changelog: v1.26.1...v1.27.0

v1.26.1

11 Apr 16:17

What's changed

New: continuous recording mode

A new "continuous" recording mode that listens indefinitely and auto-transcribes whenever a pause in speech is detected.

Transcriptions are injected in real time as you speak, without pressing any key.

Enable via config or hyprwhspr setup:

{
    "recording_mode": "continuous"
}

Experimental. See the configuration docs for silence threshold and floor-finding options.

Fix: language auto-detection on pywhispercpp backend (#160)

When "language" was null (the documented default for auto-detect), the pywhispercpp backend was silently falling back to whisper.cpp's compiled-in default of "en", causing non-English speech to be transcribed as garbled English. The backend now calls auto_detect_language() explicitly when no language is set.

Users who pinned a language code (e.g. "language": "fr") were unaffected.

Feat: beam search decoding on pywhispercpp and faster-whisper backends (#159)

Both local backends now default to beam search with beam_size: 5, matching whisper-cli behavior. Previously, pywhispercpp used greedy decoding with no way to change it. The difference is most noticeable on non-English audio and noisier input.

Two new config keys. sampling_strategy accepts "beam_search" (the default) or "greedy":

{
    "sampling_strategy": "beam_search",
    "beam_size": 5
}

faster-whisper's previously hardcoded beam_size: 5 also now respects the beam_size config key.

Note: Changing sampling_strategy requires a service restart on the pywhispercpp backend.
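
One way to read the two keys together is as a mapping onto a single beam width, where greedy decoding corresponds to a beam of 1. The helper below is a hypothetical sketch of that reading, not the project's code; how each backend actually consumes the keys is not shown in these notes.

```python
def decoding_options(config):
    """Translate the two config keys into a beam width for a decoder."""
    strategy = config.get("sampling_strategy", "beam_search")
    if strategy == "greedy":
        return {"beam_size": 1}  # a beam of one keeps only the best token each step
    if strategy == "beam_search":
        return {"beam_size": int(config.get("beam_size", 5))}
    raise ValueError(f"unknown sampling_strategy: {strategy!r}")
```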

Fix: audio ducking timing

Audio ducking now activates after the stream is confirmed open rather than before, preventing a race where the duck fired before audio capture was ready.

Fix: audio stream startup reliability

Improved retry logic on stream open to handle transient PortAudio errors (-9987 paTimedOut, -9999 PulseAudio "No such entity") that could occur during service startup before the audio server had fully settled.

Full Changelog: v1.26.0...v1.26.1

v1.25.2

02 Apr 03:02

What's Changed

Fix: Microphone indicator no longer shows while idle

On some desktops (GNOME, Ubuntu, and others), hyprwhspr appeared to be always listening — the system microphone indicator would appear as soon as the service started, even when not recording.

This was caused by a recently implemented keepalive audio stream that held the microphone open between recordings to prevent cold-start timeouts on certain ALSA hardware. It worked, but had the unintended side effect of looking like hyprwhspr was constantly recording. It's not!

The keepalive stream is now opt-in and disabled by default. If you previously relied on it to avoid paTimedOut errors on your first recording after the mic has been idle, you can re-enable it:

{
  "keepalive_stream": true
}

See the configuration docs for more info.

Website!

Also adds https://hyprwhspr.com - check it out!

Full Changelog: v1.25.1...v1.25.2

v1.25.1

31 Mar 16:55

Bug Fixes

  • Fix crash on startup with non-Cohere backends (#154)

    get_credential was re-imported as a local variable inside initialize() and _reinitialize_cohere_transcribe() in
    whisper_manager.py. Python's scoping rules marked it local for the entire method at compile time, causing an
    UnboundLocalError whenever a non-Cohere backend (rest-api, realtime-ws, pywhispercpp) was configured. Redundant local
    imports removed; the module-level import is used throughout.

  • Fix USB microphone failures after idle on ALSA systems (#153)

    USB microphones managed via raw ALSA can power down after ~30 seconds of silence. When hyprwhspr tried to open the device for
    the next recording, PortAudio's PaUnixThread_New timed out (paTimedOut / error -9987). This release introduces a silent
    keepalive stream that holds the ALSA device open between recordings, preventing the kernel from suspending it. The keepalive is
    only started when a multiplexed audio server (PipeWire or PulseAudio) is detected — on raw ALSA it is skipped to avoid holding
    an exclusive lock that would block other applications.

    Additional hardening: stream start now retries up to 3 times with a brief pause on timeout, handling cold-start races on
    service startup before the audio node has fully warmed up.
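
The scoping crash from #154 can be reproduced in a few lines. The sketch below uses json.dumps as a stand-in for get_credential and shows one way such a bug can arise (a local import on only one branch); the function names are hypothetical.

```python
from json import dumps  # module-level import, standing in for get_credential


def initialize_buggy(backend):
    if backend == "cohere":
        # This import binds `dumps` as a local name, and Python treats the
        # name as local for the whole function, not only for this branch.
        from json import dumps
    # On any other backend the local name was never assigned.
    return dumps({"backend": backend})


def initialize_fixed(backend):
    # No local import: the module-level name is used throughout.
    return dumps({"backend": backend})
```

Calling initialize_buggy("rest-api") raises UnboundLocalError, while initialize_buggy("cohere") and initialize_fixed("rest-api") both succeed.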

Configuration

  • New setting stream_start_retry_delay (default: 1.5 seconds) controls how long to wait between stream start retries on
    timeout. Increase this value if your USB microphone frequently fails on the first recording after idle.
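
A retry loop of the kind described might look like the sketch below. The function name, the use of TimeoutError as the failure signal, and reading "up to 3 times" as three attempts in total are assumptions; retry_delay plays the role of stream_start_retry_delay.

```python
import time


def start_stream_with_retry(open_stream, retries=3, retry_delay=1.5, sleep=time.sleep):
    """Call open_stream until it succeeds, pausing between timed-out attempts."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return open_stream()
        except TimeoutError as exc:
            last_error = exc
            if attempt < retries:
                sleep(retry_delay)  # give the audio device time to wake up
    raise last_error
```

A larger retry_delay trades a slower first recording for a better chance that a suspended USB microphone has resumed by the next attempt.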

Full Changelog: v1.25.0...v1.25.1

v1.25.0

27 Mar 21:35

This is a big one!

You can now select Cohere Transcribe, which is #1 on the Hugging Face Open ASR Leaderboard.

It's a gated model and requires a Hugging Face API key, but it's well worth it for the speed and performance.

Enjoy!

Features

New: Cohere Transcribe backend

A new local transcription backend powered by CohereLabs/cohere-transcribe-03-2026 via Hugging Face Transformers. 🇨🇦

  • #1 on the Open ASR Leaderboard — 5.42 average WER, beating Whisper large-v3 (7.44 WER) at 3× the throughput
  • Supports 14 languages: English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Arabic, Vietnamese,
    Chinese, Japanese, Korean
  • Runs on GPU (bfloat16, ~4–5 GB VRAM) or CPU (float32, ~8 GB RAM)
  • Optional torch.compile support for faster throughput after first-call warmup
  • Select during setup: hyprwhspr setup → [8] Cohere Transcribe

The model is gated on Hugging Face. You must accept the license at huggingface.co/CohereLabs/cohere-transcribe-03-2026 and provide a read token during setup.

New: Cohere REST API provider

Cohere's cloud transcription API (https://api.cohere.com/v2/audio/transcriptions) is now available as a REST API provider —
Canadian-hosted, same Apache 2.0 model, no local GPU required.

Cohere's API requires a language parameter. Add "language": "en" (or your language code) to your config.

Background model loading

Heavier backends (currently cohere-transcribe) now load their model in a background thread on startup. Keyboard shortcuts and the
recording FIFO are active immediately — recording is blocked with a desktop notification until the model is ready.

AT-SPI window detection fallback

text_injector now falls back to AT-SPI (gi.repository.Atspi) for active window detection on native Wayland compositors
(GNOME, KDE, etc.) where hyprctl is unavailable. The probe runs once with a 0.5 s timeout and caches the result; AT-SPI is
accessed under a lock as it is not thread-safe.

Bug fixes

  • Ptyxis terminal detection: added org.gnome.ptyxis app ID alongside the legacy io.gitlab.ptyxis.ptyxis identifier so
    type-aware paste works correctly in newer Ptyxis releases.
  • faster-whisper backend verification: _verify_backend_installation now correctly checks faster-whisper installations
    (was previously skipped).

Full Changelog: v1.24.0...v1.25.0

v1.24.0

25 Mar 18:31

Improved

Paste compatibility on non-Hyprland systems

Window detection now falls back to xdotool + xprop when hyprctl is unavailable, covering GNOME, KDE, and other compositors running under XWayland. Ptyxis (the default Fedora 43 terminal) is now recognized as a terminal for automatic Ctrl+Shift+V paste selection.

Users on pure Wayland compositors without xdotool will see a note at the end of setup pointing to the paste_mode config option if terminal paste won't auto-detect.

Setup wizard fixes

  • Backend installation skip now correctly gates model selection and download — previously the model prompt and download would still run even if installation was skipped
  • "Service is running manually" warning moved to after the systemd setup step, where it's contextually relevant
  • Setup summary now shows the normalized backend name (e.g. vulkan instead of amd for AMD GPU users)
  • onnx-asr default model now shown in the setup summary, consistent with other backends
  • Parakeet → onnx-asr migration prompt expanded with a clear description of what changes and why

Backend selection clarity

  • Whisper NVIDIA and Whisper Vulkan options now explicitly noted as fastest local transcription for their respective GPU families
  • Parakeet TDT V3 description updated to "Leading accuracy, no GPU required" to better communicate its value proposition

Full Changelog: v1.23.1...v1.24.0

v1.23.1

22 Mar 20:06

  • A set of minor bug fixes and improvements.