feat: Support iFLYTEK large model for Chinese-English speech recognition #3952

shaohuzhang1 · 2025-08-28T02:55:10Z

feat: Support iFLYTEK large model for Chinese-English speech recognition

f2c-ci-robot · 2025-08-28T02:55:14Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

f2c-ci-robot · 2025-08-28T02:55:19Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2025-08-28T02:55:35Z

apps/models_provider/impl/xf_model_provider/model/zh_en_stt.py

+            except asyncio.TimeoutError:
+                break
+
+        return result_text


The provided Python code seems to be an implementation of a web service client for interfacing with a Spark-based speech-to-text API using the XFZhEnSparkSpeechToText class. Here are some general comments on the code:

SSL Context: The use of SSLContext without specific settings might pose security risks. You should configure your SSL context more securely based on your production requirements.

URL Creation: The method create_url() generates an authorization header by concatenating headers and signing them with the key. This approach is fine but could benefit from better exception handling when errors occur during cryptographic operations.

Audio Sending Logic: The loop that sends chunks breaks prematurely after reading zero bytes, which isn't ideal if you haven't reached end-of-stream in chunks. Consider implementing proper detection for EOF or raising an error accordingly.

WebSocket Connection Handling: It's good practice to have cleaner separation between sending audio and receiving responses. However, the current design has a single function for both.

Error Handling: Methods such as handle_audio(), send_audio(), and speech_to_text() catch all exceptions locally and only log them via maxkb_logger.error. Propagating these errors, or reporting them explicitly to the caller, would ensure failures are not silent.

Asynchronous vs Synchronous Calls: Most parts of the code assume asynchronous execution (e.g., asyncio.run(handle())). Ensure it aligns with your application architecture. Some functions may still require synchronization due to their nature.

Logging: Using logging instead of printing error messages directly can make debugging easier and adhere to best practices by reducing clutter in console output.

Configuration Management: If you're storing keys/configurations in files, consider moving sensitive information such as API keys into environment variables or secure vault configurations rather than plain text files.

Documentation: Adding docstrings across various methods would improve readability and maintainability of the codebase.
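To illustrate the SSL Context point from the list above, here is a minimal sketch of a client-side context that keeps certificate and hostname verification enabled. The function name build_client_ssl_context is hypothetical and does not appear in the PR, and the TLS 1.2 floor is an assumption about what the deployment requires.

```python
import ssl


def build_client_ssl_context() -> ssl.SSLContext:
    """Return a TLS client context that verifies the server certificate."""
    # create_default_context() loads the system CA store and turns on
    # certificate verification and hostname checking by default.
    context = ssl.create_default_context()
    # Assumption: the remote endpoint supports TLS 1.2 or newer.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = build_client_ssl_context()
```

If the client connects through the websockets package, a context like this can be passed as the ssl argument of websockets.connect().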

Overall, the code provides a solid foundation for interacting with the Spark API through websockets asynchronously. Continuous testing and improvement are recommended to address performance bottlenecks and robustness issues.
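As a follow-up to the Audio Sending Logic point, here is a rough sketch of one way to detect end-of-stream explicitly: read one chunk ahead so the final chunk can be marked before it is sent. The names iter_audio_frames, FIRST_FRAME, CONTINUE_FRAME, and LAST_FRAME, the marker values, and the 1280-byte frame size are all assumptions for illustration, not taken from the PR or confirmed against the service's protocol.

```python
import io
from typing import BinaryIO, Iterator, Tuple

# Hypothetical frame markers; the real service may use different values.
FIRST_FRAME, CONTINUE_FRAME, LAST_FRAME = 0, 1, 2


def iter_audio_frames(stream: BinaryIO, frame_size: int = 1280) -> Iterator[Tuple[int, bytes]]:
    """Yield (status, chunk) pairs, marking the final chunk explicitly."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    current = stream.read(frame_size)
    if not current:
        # Surface an empty input instead of silently sending nothing.
        raise ValueError("audio stream is empty")
    status = FIRST_FRAME
    while True:
        upcoming = stream.read(frame_size)  # read ahead to detect EOF
        if not upcoming:
            # Nothing follows, so the chunk in hand is the last one.
            # Note: a single-chunk stream yields only a LAST_FRAME pair.
            yield LAST_FRAME, current
            return
        yield status, current
        status = CONTINUE_FRAME
        current = upcoming


frames = list(iter_audio_frames(io.BytesIO(b"a" * 5), frame_size=2))
```

The sending loop can then iterate over these pairs and stop after the LAST_FRAME pair, rather than breaking when a read returns zero bytes.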

shaohuzhang1 · 2025-08-28T02:55:58Z

apps/models_provider/impl/xf_model_provider/credential/zh_en_stt.py

+        return {**model, 'spark_api_secret': super().encryption(model.get('spark_api_secret', ''))}
+
+    def get_model_params_setting_form(self, model_name):
+        pass


This Python code snippet appears to be part of a Django application that validates and manages model credentials for speech-to-text processing using the iFLYTEK (XunFei) API, via the ZhEnXunFeiSTTModelCredential class. Here are some potential issues and optimizations:

Potential Issues:

Type Annotations: Annotating model_credential as Dict[str, object] is loose, since object says little about the values the code actually expects. A more explicit annotation such as dict[str, Any] would improve clarity.

String Formatting with _: While using Unicode literals (u'...') is no longer necessary in modern versions of Python, it might cause syntax warnings or performance issues in certain environments. Ensure all string formats are correctly specified without explicit encoding hints.

Traceback Logging: Although logging the traceback can be helpful during development, it should not be done directly in production unless you have a need to debug exceptions.

Empty List Check: The line if not any(list(filter(lambda mt: mt.get('value') == model_type, model_type_list))): wraps the filter result in list() before passing it to any(). Is there a reason for this conversion? It seems unnecessary here, since any() accepts any iterable.

A more idiomatic way would be to check beforehand whether model_type_list is empty and skip this step if it is.

Exception Handling in get_model_params_setting_form: This method does nothing useful; its implementation can be removed or modified based on actual requirements.

Unnecessary Empty Line at End: There is an empty line at the end of the file that doesn't serve a purpose.
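For the Empty List Check item above, a minimal sketch of the same membership test written with a generator expression might look as follows. The helper name is_supported_model_type and the sample data are hypothetical; only the 'value' key and the comparison come from the quoted line.

```python
from typing import Any, Dict, List


def is_supported_model_type(model_type: str, model_type_list: List[Dict[str, Any]]) -> bool:
    """Return True if any entry's 'value' equals model_type."""
    # any() consumes the generator lazily and stops at the first match,
    # so no intermediate list or filter object is needed.
    return any(mt.get('value') == model_type for mt in model_type_list)


# Illustrative data only; the real list comes from the provider.
sample_types = [{'key': 'Speech to text', 'value': 'STT'},
                {'key': 'Text to speech', 'value': 'TTS'}]
```

An empty model_type_list needs no special case here, because any() over an empty iterable returns False.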

Optimizations:

Use Type Annotations Accurately:

    from typing import Dict, Optional, Any

    class ZhEnXunFeiSTTModelCredential(BaseForm, BaseModelCredential):
        # ...

Remove Unnecessary Conversion:

    model_type_list = provider.get_model_type_list()
    if not model_type_list:
        raise AppApiException(ValidCode.valid_error.value, _("No valid model types found"))
    # ... rest of the validation logic remains unchanged ...

Simplify Exception Handling:
Instead of raising exceptions within exception handling blocks, catch them separately and handle each case appropriately. For example:

    try:
        model = provider.get_model(model_type, model_name, credential)
        model.check_auth()
    except AppApiException as e:
        if not raise_exception:
            return False
        raise
    except Exception as e:
        traceback.print_exc()
        if raise_exception:
            raise AppApiException(ValidCode.valid_error.value, _('Verification failed'))
        return False

Consider Adding More Specific Exceptions:
Depending on the complexity, you might want to introduce specific exception classes or wrap existing ones to better describe the context of each failure.

Apply these improvements where appropriate to ensure the code maintains correctness, readability, and maintainability while also considering future scalability needs.
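The exception-handling and traceback-logging suggestions in this comment can be combined into a small self-contained sketch. Everything here is a stand-in: the AppApiException class below only imitates the project's exception type, the error code 500 is a placeholder, and check_credential is a hypothetical helper that takes the verification step as a callable.

```python
import logging

logger = logging.getLogger(__name__)


class AppApiException(Exception):
    """Stand-in for the project's exception type (assumed signature)."""

    def __init__(self, code: int, message: str):
        super().__init__(message)
        self.code = code
        self.message = message


def check_credential(check_auth, raise_exception: bool = False) -> bool:
    """Run check_auth() and translate failures into a bool or AppApiException."""
    try:
        check_auth()
    except AppApiException:
        # Already a domain error: pass it through or report failure.
        if raise_exception:
            raise
        return False
    except Exception as exc:
        # logger.exception records the traceback through the logging
        # system instead of printing it with traceback.print_exc().
        logger.exception("credential verification failed")
        if raise_exception:
            # 500 is a placeholder code; chaining keeps the original cause.
            raise AppApiException(500, "Verification failed") from exc
        return False
    return True
```

Chaining with "from exc" keeps the underlying error attached to the wrapped exception, which helps when the failure is inspected later.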

shaohuzhang1 · 2025-08-28T02:56:23Z

apps/models_provider/impl/xf_model_provider/xf_model_provider.py

@@ -47,7 +54,8 @@
    .append_default_model_info(
        ModelInfo('generalv3.5', '', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM))
    .append_default_model_info(
-        ModelInfo('iat', _('Chinese and English recognition'), ModelTypeConst.STT, stt_model_credential, XFSparkSpeechToText),
+        ModelInfo('iat', _('Chinese and English recognition'), ModelTypeConst.STT, stt_model_credential,
+                  XFSparkSpeechToText),
    )
    .append_default_model_info(
        ModelInfo('tts', '', ModelTypeConst.TTS, tts_model_credential, XFSparkTextToSpeech))


There are no significant issues with the provided Python code snippet. However, I have a few optimizations and improvements you might consider:

Avoid Using append_default_model_info: If this is intended to add default model information, ensure that the class DefaultModelInfoProvider supports appending without causing side effects.

Consistent Use of Quotation Marks: In some places, double quotes (") and single quotes ('') are used interchangeably. Consistency is preferred, especially if you're using string literals throughout.

Line Length: The line lengths in the file can be improved for better readability and maintainability. Consider breaking down long lines into multiple parts or using triple quotes for larger strings.

Here’s an updated version of the code with these considerations:

    from models_provider.impl.xf_model_provider.credential.llm import XunFeiLLMModelCredential
    from models_provider.impl.xf_model_provider.credential.stt import XunFeiSTTModelCredential
    from models_provider.impl.xf_model_provider.credential.tts import XunFeiTTSModelCredential
    from models_provider.impl.xf_model_provider.credential.zh_en_stt import ZhEnXunFeiSTTModelCredential
    from models_provider.impl.xf_model_provider.model.embedding import XFEmbedding
    from models_provider.impl.xf_model_provider.model.image import XFSparkImage
    from models_provider.impl.xf_model_provider.model.llm import XFChatSparkLLM
    from maxkb.conf import PROJECT_DIR
    from django.utils.translation import gettext as _
    import ssl

    ssl._create_default_https_context = ssl.create_default_context()

    xunfei_model_credential = XunFeiLLMModelCredential()
    stt_model_credential = XunFeiSTTModelCredential()
    zh_en_stt_credential = ZhEnXunFeiSTTModelCredential()
    image_model_credential = XunFeiImageModelCredential()
    tts_model_credential = XunFeiTTSModelCredential()
    embedding_model_credential = XFEmbeddingCredential()

    model_info_list = [
        ModelInfo('generalv3.5', "", ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
        ModelInfo('', 'General Version v3.0', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
        ModelInfo('', 'General Version v2.0', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
        # Simplify Chinese and English Recognition model info
        ModelInfo('iat', _('Chinese and English Recognition'), ModelTypeConst.STT, stt_model_credential,
                  XFSparkSpeechToText),
        # New STT model info for Chinese and English
        ModelInfo('slm', _('Chinese and English Recognition'), ModelTypeConst.STT, zh_en_stt_credential,
                  XFZhEnSparkSpeechToText),
        ModelInfo("", "Text-to-Speech", ModelTypeConst.TTS, tts_model_credential, XFSparkTextToSpeech),
        ModelInfo("embedding", "Sentence Embeddings", ModelTypeConst.EMBEDDING, embedding_model_credential,
                  XFEmbedding)
    ]

By applying these changes, the code becomes more readable and consistent in terms of string handling and overall structure.

feat: Support iFLYTEK large model for Chinese-English speech recognition

7c7f37b

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Aug 28, 2025

zhanweizhang7 merged commit 4786970 into v2 Aug 28, 2025
3 of 6 checks passed

zhanweizhang7 deleted the pr@v2@feat_support_xunfei_chinese_english_speech_recognition branch August 28, 2025 02:57
