-
Notifications
You must be signed in to change notification settings - Fork 415
implement microphone selector #1207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThe changes introduce a refactored microphone selection UI in the desktop app by replacing a generic audio control button with dedicated Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant UI (Desktop)
participant MicrophoneSelector
participant DeviceQuery
User->>UI (Desktop): Opens recording controls
UI (Desktop)->>MicrophoneSelector: Render
MicrophoneSelector->>DeviceQuery: Fetch microphone devices
DeviceQuery-->>MicrophoneSelector: Return device list
User->>MicrophoneSelector: Clicks dropdown / selects device
MicrophoneSelector-->>UI (Desktop): Updates selected device
sequenceDiagram
participant AudioInput
participant VAD (chunker)
participant Whisper (transcriber)
participant Local STT Plugin
AudioInput->>VAD (chunker): Audio stream
VAD (chunker)->>Whisper (transcriber): Chunks (with new redemption_time)
Whisper (transcriber)->>Local STT Plugin: Segments (filtered by confidence/content)
Local STT Plugin->>Client: All transcription chunks (no confidence filtering)
sequenceDiagram
participant ListenerExt
participant FSM (Session)
participant AudioInput
ListenerExt->>FSM (Session): set_microphone_device(device_name)
FSM (Session)->>FSM (Session): Update mic_device_name
FSM (Session)->>AudioInput: Reinitialize with new device
AudioInput-->>FSM (Session): New audio stream ready
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40-50 minutes Possibly related issues
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx(3 hunks)crates/chunker/src/lib.rs(1 hunks)crates/whisper-local/src/model.rs(1 hunks)plugins/local-stt/src/server.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit Configuration File
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
plugins/local-stt/src/server.rscrates/chunker/src/lib.rscrates/whisper-local/src/model.rsapps/desktop/src/components/editor-area/note-header/listen-button.tsx
🔇 Additional comments (3)
crates/chunker/src/lib.rs (1)
73-73: VAD configuration update looks goodThe change from
post_speech_padtoredemption_timewith 600ms value aligns with the voice activity detection improvements mentioned in the PR objectives.crates/whisper-local/src/model.rs (1)
176-189: Verify the filtering logic for common wordsThe filter removes segments containing common words like "you" and "thank you". This might inadvertently filter out legitimate transcription segments that happen to consist only of these words. Consider if this aggressive filtering aligns with the intended behavior.
Additionally, the confidence threshold of 0.005 is extremely low - verify if this is the intended value.
plugins/local-stt/src/server.rs (1)
210-210: Confidence filtering removal aligns with architecture changesRemoving the confidence-based filtering here is appropriate since the filtering logic has been moved to the model layer in
crates/whisper-local/src/model.rs. This ensures all chunks are streamed and filtering decisions are centralized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (4)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx (1)
388-499: Device selection properly implemented with good UX.The past review comment about persistence has been addressed - line 414 correctly calls
setMicrophoneDevice.Consider these optional improvements:
- Only poll for devices when the dropdown is open to reduce unnecessary API calls
- Add error handling for the
setMicrophoneDevicecallconst { data: devices = [], isLoading } = useQuery({ queryKey: ["microphone-devices"], queryFn: () => listenerCommands.listMicrophoneDevices(), refetchOnWindowFocus: false, - refetchInterval: 1000, + refetchInterval: isOpen ? 1000 : false, }); const { data: selectedDevice } = useQuery({ queryKey: ["current-microphone-device"], queryFn: () => listenerCommands.getCurrentMicrophoneDevice(), refetchOnWindowFocus: false, - refetchInterval: 1000, + refetchInterval: isOpen ? 1000 : false, }); const handleSelectDevice = (device: string) => { - listenerCommands.setMicrophoneDevice(device); + listenerCommands.setMicrophoneDevice(device).catch((error) => { + console.error("Failed to set microphone device:", error); + // Optionally show a toast notification + }); };crates/audio/src/mic.rs (3)
13-13: Remove unnecessary#[allow(dead_code)]attribute.The
hostfield is used in the constructor and should not be marked as dead code. This attribute appears to be incorrect.- #[allow(dead_code)] host: cpal::Host,
81-81: Optimize audio processing by avoiding channel selection.The code uses
.step_by(channels)which only takes the first channel from multi-channel audio. This might not be the intended behavior for stereo or multi-channel microphones.Consider if you want to mix channels or explicitly document this mono-only behavior:
+ // Convert multi-channel to mono by taking only the first channel data.iter() .step_by(channels) .map(|&x| x.to_sample()) .collect(),Or mix channels:
- let _ = tx.start_send( - data.iter() - .step_by(channels) - .map(|&x| x.to_sample()) - .collect(), - ); + let mono_data: Vec<f32> = data + .chunks_exact(channels) + .map(|chunk| { + chunk.iter().map(|&x| x.to_sample::<f32>()).sum::<f32>() / channels as f32 + }) + .collect(); + let _ = tx.start_send(mono_data);
69-132: Consider using async/await instead of blocking thread spawn.The current implementation spawns a blocking thread, but the audio stream could be handled more efficiently with async primitives, especially since the rest of the API is async.
This is a significant architectural consideration - while the current approach works, using
tokio::task::spawn_blockingor restructuring to be fully async might be more consistent with the async nature of the API.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx(3 hunks)apps/desktop/src/locales/en/messages.po(6 hunks)apps/desktop/src/locales/ko/messages.po(6 hunks)apps/desktop/src/utils/broadcast.ts(0 hunks)crates/audio/Cargo.toml(1 hunks)crates/audio/src/lib.rs(4 hunks)crates/audio/src/mic.rs(1 hunks)plugins/listener/build.rs(1 hunks)plugins/listener/js/bindings.gen.ts(1 hunks)plugins/listener/permissions/autogenerated/commands/get_current_microphone_device.toml(1 hunks)plugins/listener/permissions/autogenerated/commands/set_microphone_device.toml(1 hunks)plugins/listener/permissions/autogenerated/reference.md(3 hunks)plugins/listener/permissions/default.toml(1 hunks)plugins/listener/permissions/schemas/schema.json(3 hunks)plugins/listener/src/commands.rs(1 hunks)plugins/listener/src/ext.rs(2 hunks)plugins/listener/src/fsm.rs(7 hunks)plugins/listener/src/lib.rs(1 hunks)
💤 Files with no reviewable changes (1)
- apps/desktop/src/utils/broadcast.ts
✅ Files skipped from review due to trivial changes (6)
- plugins/listener/permissions/autogenerated/commands/get_current_microphone_device.toml
- apps/desktop/src/locales/en/messages.po
- plugins/listener/permissions/autogenerated/commands/set_microphone_device.toml
- crates/audio/Cargo.toml
- plugins/listener/permissions/autogenerated/reference.md
- apps/desktop/src/locales/ko/messages.po
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit Configuration File
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
plugins/listener/build.rsplugins/listener/src/lib.rscrates/audio/src/lib.rsplugins/listener/src/ext.rsapps/desktop/src/components/editor-area/note-header/listen-button.tsxplugins/listener/src/commands.rsplugins/listener/js/bindings.gen.tsplugins/listener/src/fsm.rscrates/audio/src/mic.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-latest)
- GitHub Check: ci
🔇 Additional comments (21)
plugins/listener/build.rs (1)
3-4: LGTM! New microphone device commands added correctly.The new commands follow the existing naming convention and are properly integrated into the COMMANDS array for Tauri plugin registration.
plugins/listener/permissions/default.toml (1)
5-6: LGTM! Permissions correctly added for new microphone device commands.The new permissions follow the established naming convention and are logically placed after the existing microphone device permissions.
plugins/listener/src/lib.rs (1)
32-33: LGTM! Commands properly registered in the plugin.The new microphone device commands are correctly added to the command collection with proper typing and logical placement.
plugins/listener/js/bindings.gen.ts (1)
13-18: LGTM! Auto-generated bindings are correct.The new TypeScript bindings properly expose the microphone device commands with appropriate type signatures and correct Tauri command invocations.
plugins/listener/src/commands.rs (1)
13-32: LGTM! New command implementations follow established patterns.Both functions properly implement the Tauri command pattern with:
- Correct annotations for code generation
- Consistent error handling via
.map_err(|e| e.to_string())- Appropriate return types (
Option<String>for get,()for set)- Proper delegation to the extension trait methods
plugins/listener/src/ext.rs (3)
13-19: LGTM! Well-designed trait methods.The new methods follow the established pattern with appropriate async signatures and error handling.
43-44: LGTM! Simplified implementation.The direct delegation to
AudioInput::list_mic_devices()is clean and appropriate.
46-67: LGTM! Thread-safe state management.The implementations correctly use scoped locks to minimize contention and properly interact with the FSM through events.
crates/audio/src/lib.rs (2)
14-14: LGTM! Necessary import for device enumeration.
94-101: LGTM! Clean implementation of device-specific constructor.The method properly delegates to
MicInput::from_devicewith the specified device name.plugins/listener/permissions/schemas/schema.json (1)
321-332: LGTM! Permission schema correctly updated.The new permissions follow the established pattern and are properly included in the default permission set.
Also applies to: 477-488, 526-526, 529-529
plugins/listener/src/fsm.rs (3)
172-172: LGTM! Proper state initialization.The
mic_device_namefield is correctly initialized toNone.Also applies to: 187-187
244-256: LGTM! Clean device selection logic.The code properly creates the mic input based on whether a specific device is selected, with appropriate logging.
536-542: LGTM! Clean accessor methods.The methods provide appropriate access to device information.
apps/desktop/src/components/editor-area/note-header/listen-button.tsx (3)
3-13: LGTM! Necessary imports for new UI components.
332-341: LGTM! Clean component separation.The layout properly separates microphone and speaker controls with appropriate spacing.
501-527: LGTM! Clean speaker control implementation.The component properly handles mute state and visual feedback.
crates/audio/src/mic.rs (4)
42-58: Add proper error handling for device selection.Similar to the default constructor, this method uses multiple
expect()calls that will panic on failure. Device enumeration and selection should handle errors gracefully.Apply this diff to improve error handling:
- pub fn from_device(device_name: impl AsRef<str>) -> Self { + pub fn from_device(device_name: impl AsRef<str>) -> Result<Self, Box<dyn std::error::Error>> { let host = cpal::default_host(); let device = host .input_devices() - .expect("Failed to get input devices") - .find(|d| d.name().expect("Failed to get input device name") == device_name.as_ref()) - .expect("Failed to get input device"); + .map_err(|e| format!("Failed to get input devices: {}", e))? + .find(|d| d.name().map_or(false, |name| name == device_name.as_ref())) + .ok_or_else(|| format!("Device '{}' not found", device_name.as_ref()))?; let config = device .default_input_config() - .expect("Failed to get default input config"); + .map_err(|e| format!("Failed to get default input config: {}", e))?; - Self { + Ok(Self { host, device, config, - } + }) }Likely an incorrect or invalid review comment.
38-40: Add proper error handling for device name retrieval.The
device_name()method usesexpect()which can panic if the device name cannot be retrieved.Apply this diff:
- pub fn device_name(&self) -> String { - self.device.name().expect("Failed to get input device name") + pub fn device_name(&self) -> Result<String, cpal::DeviceNameError> { + self.device.name() }Likely an incorrect or invalid review comment.
22-24: Add proper error handling for device initialization.The code uses
expect()which will panic if no default input device is available. This violates the coding guidelines that specify "No error handling" but in the context of initialization, graceful error handling would be more appropriate.Consider returning a
Resulttype instead:-impl Default for MicInput { - fn default() -> Self { +impl MicInput { + pub fn new() -> Result<Self, Box<dyn std::error::Error>> { let host = cpal::default_host(); - let device = host - .default_input_device() - .expect("Failed to get default input device"); + let device = host + .default_input_device() + .ok_or("No default input device available")?; - let config = device - .default_input_config() - .expect("Failed to get default input config"); + let config = device + .default_input_config() + .map_err(|e| format!("Failed to get default input config: {}", e))?; - Self { + Ok(Self { host, device, config, - } + }) } }Likely an incorrect or invalid review comment.
128-128: Add proper error handling for channel receive.The code uses
unwrap()on channel receive which will panic if the sender is dropped unexpectedly.- drop_rx.recv().unwrap(); + let _ = drop_rx.recv();Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx (1)
388-502: Well-implemented microphone selector with proper device persistenceThe component successfully addresses the previous review comment about device persistence and provides a clean UI for microphone selection with appropriate loading and empty states.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx(3 hunks)crates/audio/src/lib.rs(4 hunks)crates/chunker/src/lib.rs(1 hunks)crates/whisper-local/src/model.rs(1 hunks)plugins/listener/src/fsm.rs(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- crates/chunker/src/lib.rs
- crates/audio/src/lib.rs
- plugins/listener/src/fsm.rs
- crates/whisper-local/src/model.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit Configuration File
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
apps/desktop/src/components/editor-area/note-header/listen-button.tsx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: ci
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-latest)
🔇 Additional comments (3)
apps/desktop/src/components/editor-area/note-header/listen-button.tsx (3)
3-13: Import changes look goodAll imported icons are properly utilized in the new
MicrophoneSelectorandSpeakerButtoncomponents.
332-341: Clean refactoring of audio controlsThe separation of microphone and speaker controls into distinct components improves modularity and user experience.
504-531: Clean and focused speaker button implementationThe simplified component maintains all necessary functionality while improving code clarity.
| useEffect(() => { | ||
| console.log("currentDeviceQuery.data", currentDeviceQuery.data); | ||
| console.log("allDevicesQuery.data", allDevicesQuery.data); | ||
| }, [currentDeviceQuery.data, allDevicesQuery.data]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debugging console.log statements
These appear to be debugging statements that should not be in production code.
- useEffect(() => {
- console.log("currentDeviceQuery.data", currentDeviceQuery.data);
- console.log("allDevicesQuery.data", allDevicesQuery.data);
- }, [currentDeviceQuery.data, allDevicesQuery.data]);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| useEffect(() => { | |
| console.log("currentDeviceQuery.data", currentDeviceQuery.data); | |
| console.log("allDevicesQuery.data", allDevicesQuery.data); | |
| }, [currentDeviceQuery.data, allDevicesQuery.data]); |
🤖 Prompt for AI Agents
In apps/desktop/src/components/editor-area/note-header/listen-button.tsx around
lines 415 to 418, remove the console.log statements used for debugging
currentDeviceQuery.data and allDevicesQuery.data to clean up the production
code.
No description provided.