Add on-device dictation with Whisper #68
Wire whisper capture, downloads, and Tauri commands.
Expose dictation state to UI and composer.
Add model/language controls and progress UI.
Add cmake checks for dev and CI.
Review notes (from deep review):
Refs:
You are absolutely right! :) I addressed all three points. Fixes on the way!
Prevent Enter submit during dictation.
Stop capture and emit idle state on mic errors.
Limit stored samples to avoid unbounded growth.
Allow configurable modifier hold to start/stop dictation.
Another two edge-case fixes on the way!
Avoid error when no transcript is produced.
Let cancel stop processing and suppress transcript.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2b5c694098
onChange={(event) =>
  void onUpdateAppSettings({
    ...appSettings,
    dictationHoldKey: event.target.value || null,
  })
Avoid sending null for dictationHoldKey
Selecting “Off” sets dictationHoldKey to null, but the Tauri settings struct expects a non‑optional String (dictation_hold_key in src-tauri/src/types.rs). When the user saves settings with Off selected, update_app_settings will try to deserialize null into a String and fail, so settings updates will error or revert. This only reproduces when Off is chosen; keeping the field a string sentinel (e.g., "") or making the backend field Option<String> would avoid the mismatch.
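One way to apply the string-sentinel option on the frontend is sketched below. Only `dictationHoldKey` comes from the review comment; the `AppSettings` shape, the `HOLD_KEY_OFF` constant, and the helper names are assumptions made for illustration, not the code in this PR.

```typescript
// Hypothetical settings shape: only dictationHoldKey is taken from the review comment.
type AppSettings = {
  dictationHoldKey: string;
};

// Assumed sentinel meaning "hold-to-dictate is off". It stays a string so that
// a non-optional String field on the backend can always deserialize it.
const HOLD_KEY_OFF = "";

// Map the raw <select> value to a string, never null or undefined.
function normalizeHoldKey(raw: string | null | undefined): string {
  const value = (raw ?? "").trim();
  return value === "" ? HOLD_KEY_OFF : value;
}

// Return a new settings object with the normalized hold key.
function withHoldKey(
  settings: AppSettings,
  raw: string | null | undefined,
): AppSettings {
  return { ...settings, dictationHoldKey: normalizeHoldKey(raw) };
}

// Readers of the setting check the sentinel instead of checking for null.
function isHoldKeyEnabled(settings: AppSettings): boolean {
  return settings.dictationHoldKey !== HOLD_KEY_OFF;
}
```

With a helper like this, the `onChange` handler would pass `withHoldKey(appSettings, event.target.value)` to the settings update, so choosing "Off" saves `""` instead of `null`. Making the backend field `Option<String>` is the alternative the comment mentions and would need no sentinel.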
src/App.tsx
Outdated
holdDictationActive.current = false;
if (dictationState === "listening") {
  void stopDictation();
}
Stop hold-to-dictate even if state update lags
The keyup handler only calls stopDictation() when dictationState === "listening". If the user taps and releases the hold key quickly, the backend may still report idle at keyup time, so no stop is sent; when the state flips to listening a moment later, dictation continues even though the key is no longer held. This can leave the mic running unexpectedly; consider tracking the key‑up intent and stopping once listening begins, or canceling immediately on keyup regardless of current state.
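The first suggestion, tracking the key-up intent and stopping once listening begins, could look roughly like the sketch below. The `HoldToDictate` class, its method names, and the `"processing"` state are hypothetical; `start` and `stop` stand in for whatever actually starts and stops dictation in the app.

```typescript
// Assumed set of states; the review comment only names "idle" and "listening".
type DictationState = "idle" | "listening" | "processing";

// Hypothetical controller that remembers a key release that arrived too early.
class HoldToDictate {
  private held = false;
  private stopPending = false;
  private readonly start: () => void;
  private readonly stop: () => void;

  constructor(start: () => void, stop: () => void) {
    this.start = start;
    this.stop = stop;
  }

  // Call from the keydown handler when the hold key matches.
  keyDown(): void {
    if (this.held) return; // ignore auto-repeat keydown events
    this.held = true;
    this.stopPending = false;
    this.start();
  }

  // Call from the keyup handler with the state known at that moment.
  keyUp(state: DictationState): void {
    if (!this.held) return;
    this.held = false;
    if (state === "listening") {
      this.stop();
    } else {
      // The backend has not reported "listening" yet: remember the intent.
      this.stopPending = true;
    }
  }

  // Call whenever the backend reports a new dictation state.
  stateChanged(state: DictationState): void {
    if (state === "listening" && this.stopPending && !this.held) {
      this.stopPending = false;
      this.stop();
    }
  }
}
```

A quick tap then produces start, followed by stop as soon as the state flips to listening, so the mic does not stay open after the key is released. A real implementation would likely also clear the pending flag when starting fails, so a stale intent cannot stop a later, unrelated session.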
Avoid null values in app settings and make hold-to-dictate stop on keyup even if state lags.
Move keyboard handling into a dedicated hook and shared key matcher.
What’s in here
Why these decisions
if best_prob > 0.0 && (best_prob - pref_prob) <= 0.30
Implementation details
.partial, timeouts, SHA256 verification, atomic rename.
Demo
moehaje-codexmonitor-dictation.mp4
Notes / observations
How to test
Escape while listening → dictation cancels immediately.