feat: add text input mode and fix TTS thread synchronization #1
Open
aopstudio wants to merge 1 commit into tc-mb:master from
- Add --text parameter for direct text input testing without audio
- Add eval_text_string helper function and test_case_text for text mode
- Fix TTS/T2W thread synchronization: wait for threads to finish before stopping
- Update README.md with --text option documentation and usage example

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
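- Add `--text` parameter for direct text input testing without requiring audio input
- Add `eval_text_string` helper function and `test_case_text` for text input mode
- Fix TTS/T2W thread synchronization: wait for threads to finish before stopping
- Update README.md with `--text` option documentation and usage example

Changes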
New Feature: Text Input Mode
Users can now test the model with text input directly, without needing to prepare audio files:
    ./build/bin/llama-omni-cli \
        -m /path/to/MiniCPM-o-4_5-gguf/MiniCPM-o-4_5-Q4_K_M.gguf \
        --text "Hello, please introduce yourself"

Bug Fix: TTS Thread Synchronization
Previously, `omni_stop_threads` was called immediately after `test_case` returned, so the TTS/T2W threads were stopped before they finished processing. This resulted in incomplete or missing WAV output files.

The fix adds proper wait logic:
- `speek_done` flag to indicate TTS completion

Test Plan
- Run with `--text "请介绍一下Python语言"` (in English: "Please introduce the Python language")
- Check WAV output in `tools/omni/output/round_000/tts_wav/`

🤖 Generated with Claude Code
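
For reviewers, the wait-before-stop idea described under "Bug Fix: TTS Thread Synchronization" can be illustrated with a small standalone sketch. This is not the code from this PR: `TtsState`, `tts_worker`, `wait_for_tts`, and `run_and_stop` are hypothetical names, the worker only counts chunks instead of writing WAV files, and the 5-second timeout is an arbitrary assumption. Only the flag name `speek_done` is taken from the description above.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the TTS worker state (not the project's real struct).
struct TtsState {
    std::mutex mtx;
    std::condition_variable cv;
    bool speek_done = false;          // set by the worker once all output is written
    std::atomic<bool> stop{false};    // set by the caller to request shutdown
    int chunks_written = 0;           // stands in for WAV chunks on disk
};

// Worker: produces n_chunks pieces of output unless asked to stop early.
void tts_worker(TtsState & st, int n_chunks) {
    for (int i = 0; i < n_chunks; ++i) {
        if (st.stop.load()) {
            return;  // stopped early: output stays incomplete
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
        std::lock_guard<std::mutex> lock(st.mtx);
        ++st.chunks_written;
    }
    {
        std::lock_guard<std::mutex> lock(st.mtx);
        st.speek_done = true;
    }
    st.cv.notify_all();
}

// Caller side: block until the worker reports completion or the timeout expires.
bool wait_for_tts(TtsState & st, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(st.mtx);
    return st.cv.wait_for(lock, timeout, [&st] { return st.speek_done; });
}

// Start the worker, wait for completion, and only then request stop and join.
// Returns the number of chunks written, or -1 if the wait timed out.
int run_and_stop(int n_chunks) {
    TtsState st;
    std::thread worker(tts_worker, std::ref(st), n_chunks);
    bool finished = wait_for_tts(st, std::chrono::seconds(5));
    st.stop.store(true);  // the stop request now comes after the wait
    worker.join();
    return finished ? st.chunks_written : -1;
}
```

In this sketch, setting `stop` before calling `wait_for_tts` would reproduce the truncated-output symptom, since the worker exits its loop as soon as it sees the flag.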