Uhura

Local, push-to-talk speech-to-text for macOS. Hold a hotkey, speak, release — the transcript pastes into whatever text field is focused. Everything runs on-device via Apple Neural Engine. Audio never leaves your Mac.

Apple Silicon + macOS 14 (Sonoma) or later required.

Installation

Download the latest Uhura.dmg from Releases, open it, and drag Uhura to Applications.

Because Uhura is ad-hoc signed and not notarized, macOS may say the app is damaged on first launch. If so, run once in Terminal:

xattr -cr /Applications/Uhura.app

First launch: Uhura downloads the Whisper small.en model (~460 MB) from Hugging Face on first run. A progress indicator appears in the menu bar. Subsequent launches load in 1–3 s.

Setup

Launch Uhura — the onboarding window opens automatically.
Grant Microphone access (system prompt).
Grant Accessibility access — click the button, toggle Uhura on in System Settings, and return. The app detects the grant within 1 s; no restart needed.
Record your push-to-talk hotkey (any key or chord).
Click Continue.

Use

Focus any text field in any app.
Hold your hotkey and speak.
A floating HUD shows a live waveform and partial transcript while you hold.
Release — the final transcript pastes into the focused field via Cmd+V.

Settings

Menu bar icon → Settings…

Setting	Default	Notes
Input device	System default	USB and Bluetooth mics appear when connected
Push-to-talk hotkey	(unset)	Set in onboarding or here
Whisper model	`small.en` (~460 MB)	Switching requires a restart
Start/stop sounds	On	Subtle Tink/Pop chimes
Use Unicode keystrokes	Off	Slower fallback that avoids the clipboard; use if Cmd+V paste fails in a specific app

Building from source

Requirements: Xcode 16+, XcodeGen

brew install xcodegen
git clone https://github.com/jcleigh/uhura
cd uhura
xcodegen generate
open Uhura.xcodeproj

The first build resolves SPM packages (WhisperKit + KeyboardShortcuts) — allow a couple of minutes.

To run tests: ⌘U in Xcode, or:

xcodebuild test -project Uhura.xcodeproj -scheme Uhura -destination 'platform=macOS'

Releasing

Push a version tag and GitHub Actions builds and publishes the DMG automatically:

git tag v1.0.0
git push origin v1.0.0

The workflow installs XcodeGen, generates the project, resolves packages, builds a Release binary, ad-hoc signs it, packages Uhura.dmg and Uhura.zip, and creates a GitHub Release. See .github/workflows/build-release.yml.

Architecture

Audio pipeline — WhisperKit's AudioProcessor owns AVAudioEngine capture, 48 kHz → 16 kHz Float32 resampling, sample storage, and relativeEnergy: [Float] for the waveform. The app calls startRecordingLive and stopRecording directly rather than going through AudioStreamTranscriber, which is a Swift actor whose executor has no run loop and causes AVAudioEngine.start() to fail with -10877 (kAudioHardwareBadStreamError). All AVAudioEngine calls are made on @MainActor (main thread), which has a run loop.

Dual-pass transcription — while recording, a detached Task calls whisperKit.transcribe(audioArray:) every 800 ms (after ≥ 1 s of new audio accumulates) to drive the HUD partial text. On key-up, stopAndFinalize() runs a single final transcribe pass over the full captured buffer for the cleanest paste result. Streaming = liveness; single-shot = accuracy.

Paste injection — NSPasteboard snapshot → set transcript → 20 ms settle → synthesized Cmd+V via CGEvent on .cghidEventTap → 150 ms → restore original clipboard. .cghidEventTap (not .cgSessionEventTap) is required for reliable delivery into Electron/Chromium apps.

Permissions — Accessibility is detected by polling AXIsProcessTrusted() at 1 Hz. macOS 13+ applies the grant without an app restart. Microphone permission uses AVAudioApplication.requestRecordPermission() on macOS 14+ (not AVCaptureDevice), which both grants TCC and primes the coreaudiod XPC session needed by AVAudioEngine.

Settings persistence — @Observable only tracks stored properties. Settings.swift uses stored properties initialized from UserDefaults with didSet persistence rather than computed properties, ensuring SwiftUI observation picks up every change.

Project layout

Uhura/
├── App/
│   ├── UhuraApp.swift              @main, MenuBarExtra + Settings scene
│   ├── AppDelegate.swift           NSApplicationDelegate, owns AppCoordinator
│   └── AppCoordinator.swift        wires hotkey → engine → HUD → paste
├── Capture/
│   ├── DictationEngine.swift       owns WhisperKit, audio capture, transcription
│   └── Settings.swift              UserDefaults-backed preferences (@Observable)
├── Hotkey/
│   └── HotkeyController.swift      KeyboardShortcuts onKeyDown/onKeyUp
├── Injection/
│   ├── PasteInjector.swift         clipboard save → set → Cmd+V → restore
│   └── AccessibilityPermission.swift  AXIsProcessTrusted + 1 Hz polling
├── HUD/
│   ├── HUDWindowController.swift   NSPanel, .floating, .canJoinAllSpaces
│   ├── HUDView.swift               SwiftUI waveform + partial text overlay
│   └── WaveformView.swift          bar viz of AudioProcessor.relativeEnergy
├── MenuBar/
│   ├── MenuBarIcon.swift           custom Uhura badge template image + state badges
│   └── SettingsScene.swift         SwiftUI Settings + Onboarding views
├── Audio/
│   └── Pings.swift                 NSSound start/stop chimes
└── Resources/
    └── Assets.xcassets/            AppIcon (all sizes), MenuBarIcon template

Known limits

Not sandboxed — cannot ship to the Mac App Store. Developer ID + notarization is the supported distribution path.
Memory: small.en runs ~800 MB–1.2 GB resident. Use tiny.en or base.en for lower-memory machines (configurable in Settings).
Single-modifier hotkeys (e.g. Right Option alone) require macOS 15.2+ and KeyboardShortcuts v2.1.0+. Use a chord on older systems.
Apple secure text fields and password manager master password prompts intentionally block synthesized paste — clipboard is still restored cleanly.
Sub-250 ms keypresses are silently dropped — Whisper hallucinates on near-empty audio.

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.github/workflows		.github/workflows
Uhura		Uhura
UhuraTests		UhuraTests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
project.yml		project.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Uhura

Installation

Setup

Use

Settings

Building from source

Releasing

Architecture

Project layout

Known limits

License

About

Uh oh!

Releases 1

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Uhura

Installation

Setup

Use

Settings

Building from source

Releasing

Architecture

Project layout

Known limits

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Contributors

Uh oh!

Languages