feat(chat): add interactive terminal UI with live transcription #185
Open
basnijholt wants to merge 7 commits into main
Enhance the chat command with an interactive terminal UI that supports:
- Live transcription mode: text appears as you speak, editable before sending
- Pause/resume: Escape key to mute mic for side conversations
- Slash commands: /tts, /mode, /tools, /clear, /help
- Tool toggling: enable/disable specific tools at runtime
- Two input modes: "live" (default, VAD-based) and "direct" (Ctrl+C to end)

New files:
- agent_cli/core/voice_input.py: shared VAD recording loop
- agent_cli/core/chat_state.py: session state & slash command handling

Added prompt_toolkit dependency for async editable input with key bindings.
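To make the session state and slash command handling easier to picture, here is a minimal sketch. It is not the code in agent_cli/core/chat_state.py: `ChatState`, `handle_slash_command`, the field names, the reply strings, and the assumption that a bare /tts toggles speech output are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

HELP_TEXT = "Commands: /tts, /mode, /tools, /clear, /help"


@dataclass
class ChatState:
    """Hypothetical session state; the field names are illustrative."""

    tts_enabled: bool = True
    input_mode: str = "live"  # "live" (VAD-based) or "direct" (Ctrl+C to end)
    disabled_tools: Set[str] = field(default_factory=set)
    history: List[str] = field(default_factory=list)


def handle_slash_command(state: ChatState, line: str) -> Optional[str]:
    """Apply a slash command to `state`; return a reply, or None for normal input."""
    if not line.startswith("/"):
        return None
    parts = line[1:].split()
    if not parts:
        return HELP_TEXT
    name, args = parts[0], parts[1:]
    if name == "help":
        return HELP_TEXT
    if name == "tts":
        # Assumption: a bare /tts toggles speech output.
        state.tts_enabled = not state.tts_enabled
        return f"TTS {'on' if state.tts_enabled else 'off'}"
    if name == "mode":
        if len(args) == 1 and args[0] in ("live", "direct"):
            state.input_mode = args[0]
            return f"Input mode: {state.input_mode}"
        return "Usage: /mode live|direct"
    if name == "tools":
        if len(args) == 2 and args[0] in ("enable", "disable"):
            action, tool = args
            if action == "disable":
                state.disabled_tools.add(tool)
            else:
                state.disabled_tools.discard(tool)
            return f"Tool {tool} {action}d"
        # Any other form of /tools just lists what is currently disabled.
        disabled = ", ".join(sorted(state.disabled_tools)) or "none"
        return f"Disabled tools: {disabled}"
    if name == "clear":
        state.history.clear()
        return "History cleared"
    return f"Unknown command: /{name}"
```

Returning None for non-command input lets the main loop fall through and treat the line as a normal chat message.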
- Add explicit Ctrl+C key binding to properly exit live input mode
- Track accumulated text length to append new transcriptions instead of replacing entire buffer, preserving cursor position when editing
- Remove conflicting console.print status updates that caused flickering with prompt_toolkit's display management
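The length-tracking idea in this commit can be sketched as follows. This is an illustration only: it assumes the transcriber reports the cumulative transcript on each update, which the commit message does not state, and `TranscriptAppender` is a hypothetical name.

```python
class TranscriptAppender:
    """Append only the not-yet-seen tail of a cumulative transcript."""

    def __init__(self) -> None:
        # Number of transcript characters already copied into the buffer.
        self.seen_len = 0

    def merge(self, buffer_text: str, transcript: str) -> str:
        # Take only the part of the transcript that arrived since the last call,
        # so edits the user made to earlier text are left untouched.
        new_part = transcript[self.seen_len:]
        self.seen_len = len(transcript)
        return buffer_text + new_part
```

Because the existing buffer text is never rewritten, a user who has capitalized or corrected an earlier word keeps that edit when more speech arrives.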
- Return None from _get_live_input on Ctrl+C to signal exit, main loop now breaks on None instead of continuing with "No input received"
- Add bottom_toolbar to PromptSession showing live status (🎤 Listening, 🔴 Recording, ⏳ Processing, ⏸️ Paused, ✓ Ready)
- Insert transcribed text at cursor position instead of always appending at end, allowing users to position cursor before speaking
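Inserting at the cursor rather than appending comes down to a small string operation. The helper below is a hypothetical, library-free sketch of that step, not the PR's code; in prompt_toolkit itself, `Buffer.insert_text` provides comparable behavior on a live buffer.

```python
from typing import Tuple


def insert_at_cursor(text: str, cursor: int, new_text: str) -> Tuple[str, int]:
    """Insert `new_text` at `cursor` and return the updated text and cursor."""
    # Clamp the cursor so an out-of-range position cannot corrupt the text.
    cursor = max(0, min(cursor, len(text)))
    updated = text[:cursor] + new_text + text[cursor:]
    # Move the cursor to the end of the inserted text so further speech
    # continues from where the last transcription ended.
    return updated, cursor + len(new_text)
```

For example, with the cursor placed after "hello" in "hello world", a transcription of " big" yields "hello big world".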
- Replace emoji status icons with ASCII text to avoid rendering issues
- Use call_soon_threadsafe for all UI updates from background voice task
- Use mutable list holder for status to avoid race conditions
- Schedule buffer text updates on event loop for thread safety
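The pattern described in this commit, a mutable list holder updated through `call_soon_threadsafe`, can be sketched with the standard library alone. The function names and status strings are hypothetical, and the sketch assumes the voice work runs on a separate thread; when callbacks already run on the event loop (as a later commit in this PR concludes), the wrapper is unnecessary.

```python
import asyncio
import threading


async def _main() -> str:
    loop = asyncio.get_running_loop()
    status = ["listening"]  # one-element list: a mutable holder the callback can update
    done = asyncio.Event()

    def set_status(value: str) -> None:
        # Runs on the event loop thread, so it can touch loop-owned state safely.
        status[0] = value
        if value == "ready":
            done.set()

    def voice_worker() -> None:
        # Runs on a background thread; it must not touch loop state directly,
        # so every update is handed to the loop via call_soon_threadsafe.
        for value in ("recording", "processing", "ready"):
            loop.call_soon_threadsafe(set_status, value)

    thread = threading.Thread(target=voice_worker)
    thread.start()
    await done.wait()
    thread.join()
    return status[0]


def run_demo() -> str:
    """Run the sketch and return the final status."""
    return asyncio.run(_main())
```

Callbacks scheduled with `call_soon_threadsafe` run in the order they were submitted, so the holder ends on the last status the worker sent.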
…wrappers
- Remove STATUS_ICONS and status toolbar (was causing display issues)
- Remove unused _create_input_panel function
- Remove call_soon_threadsafe wrappers (callbacks run on same event loop)
- Simplify on_text_update to just set buffer text directly
- Remove unused imports (Panel, Text, VoiceInputStatus)
Summary
- `chat` command with live transcription mode
- Slash commands: `/tts`, `/mode`, `/tools`, `/clear`, `/help`
- Tool toggling: `/tools disable|enable <name>`

New Files
- `agent_cli/core/voice_input.py`: Shared VAD recording loop extracted for reuse
- `agent_cli/core/chat_state.py`: Session state management & slash command handling

Dependencies
- `prompt_toolkit>=3.0.0` for async editable input with key bindings

CLI Options
- `--vad-threshold` (0.0-1.0): VAD speech detection threshold
- `--silence-threshold`: Seconds of silence to end a speech segment

Test plan
- Run `agent-cli chat` and verify live transcription works
- Slash commands (`/help`, `/tts`, `/mode`, `/tools`, `/clear`)
- `/mode direct` switches to original Ctrl+C behavior
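As a rough illustration of how the two CLI options (`--vad-threshold` and `--silence-threshold`) could interact, the sketch below segments a sequence of per-frame speech probabilities. It is not the loop in `agent_cli/core/voice_input.py`: the function name, the per-frame probability input, and the `frame_seconds` parameter are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def segment_speech(
    probs: Sequence[float],
    *,
    frame_seconds: float,
    vad_threshold: float,
    silence_threshold: float,
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs for speech segments, end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None        # index of the first speech frame in the open segment
    last_speech = None  # index of the most recent speech frame
    silence = 0.0       # seconds of silence since the last speech frame
    for i, p in enumerate(probs):
        if p >= vad_threshold:
            if start is None:
                start = i
            last_speech = i
            silence = 0.0
        elif start is not None:
            silence += frame_seconds
            if silence >= silence_threshold:
                # Enough trailing silence: close the segment at the last speech frame.
                segments.append((start, last_speech + 1))
                start = None
                silence = 0.0
    if start is not None:
        # Input ended mid-segment; close it at the last speech frame.
        segments.append((start, last_speech + 1))
    return segments
```

Under these assumptions, a higher VAD threshold makes the detector stricter about what counts as speech, and a longer silence threshold tolerates longer pauses before a segment is closed.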