This recipe includes scripts for training a Qwen-Audio-style model on multiple datasets.
./RESULTS.md contains the latest results.
The following table lists the folders for different tasks.
Folder | Speech Encoder | LLM | Comment |
---|---|---|---|
whisper_llm_zh | Whisper | Qwen2 | Using multiple Chinese datasets |