feat: Support iFLYTEK large model for Chinese-English speech recognition #3952
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
        except asyncio.TimeoutError:
            break

    return result_text
The provided Python code seems to be an implementation of a web service client for interfacing with a Spark-based speech-to-text API using the XFZhEnSparkSpeechToText class. Here are some general comments on the code:
- SSL Context: Using `SSLContext` without specific settings might pose security risks. Configure the SSL context more strictly based on your production requirements.
- URL Creation: The method `create_url()` generates an authorization header by concatenating headers and signing them with the key. This approach is fine but would benefit from better exception handling when cryptographic operations fail.
- Audio Sending Logic: The loop that sends chunks breaks as soon as a read returns zero bytes, which is not reliable if the end of the stream has not actually been reached. Consider detecting EOF explicitly or raising an error.
- WebSocket Connection Handling: It is good practice to separate sending audio from receiving responses. The current design uses a single function for both.
- Error Handling: Methods such as `handle_audio()`, `send_audio()`, and `speech_to_text()` catch exceptions locally and only log them through `maxkb_logger.error`. Surface or re-raise these errors so that failures are not silently swallowed.
- Asynchronous vs Synchronous Calls: Most of the code assumes asynchronous execution (e.g., `asyncio.run(handle())`). Ensure this aligns with your application architecture; some functions may still need to be called synchronously.
- Logging: Using logging instead of printing error messages directly makes debugging easier and reduces clutter in console output.
- Configuration Management: If you are storing keys or configuration in files, consider moving sensitive information such as API keys into environment variables or a secure vault rather than plain text files.
- Documentation: Adding docstrings to the methods would improve readability and maintainability of the codebase.
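For the SSL Context point, a stricter configuration could look like the sketch below. It assumes the client can rely on the system trust store; the helper name `build_ssl_context` is hypothetical and does not appear in the PR.

```python
import ssl


def build_ssl_context() -> ssl.SSLContext:
    """Return a client-side context with certificate and hostname checks enabled.

    Hypothetical helper for illustration; not part of the PR.
    """
    # create_default_context() loads the system CA certificates and, for
    # client use, enables CERT_REQUIRED and hostname checking.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = build_ssl_context()
```

If the client is built on the `websockets` library (an assumption, since this excerpt does not show the connection call), the context can be passed through that library's `ssl` argument.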
Overall, the code provides a solid foundation for interacting with the Spark API through websockets asynchronously. Continuous testing and improvement are recommended to address performance bottlenecks and robustness issues.
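The Audio Sending Logic point can be illustrated with a read-ahead iterator that tells the caller which chunk is the last one, so the final frame can be flagged explicitly rather than inferred from an empty read. This is only a sketch: the function name `iter_audio_chunks` and the default chunk size are assumptions, and it presumes a blocking file-like stream where an empty read means end of stream.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 1280) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last) pairs from a blocking binary stream.

    Hypothetical helper; the name and the default chunk size are assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    current = stream.read(chunk_size)
    while current:
        # Read one chunk ahead: an empty result means `current` is the final chunk.
        upcoming = stream.read(chunk_size)
        yield current, not upcoming
        current = upcoming


# Example with an in-memory stream standing in for an audio file.
frames = list(iter_audio_chunks(io.BytesIO(b"abcdefgh"), chunk_size=3))
```

A sender loop can then send each chunk with a "continue" status and switch to the "last frame" status when `is_last` is true.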
        return {**model, 'spark_api_secret': super().encryption(model.get('spark_api_secret', ''))}

    def get_model_params_setting_form(self, model_name):
        pass
\ No newline at end of file
This Python code snippet appears to be part of a Django application that validates and manages model credentials for speech-to-text processing with the iFLYTEK (XunFei) API, through the ZhEnXunFeiSTTModelCredential class. Here are some potential issues and optimizations:
Potential Issues:
- Type Annotations: Annotating `model_credential` as `Dict[str, object]` is loose, because the values could be of any type and callers get little help from type checkers. A more specific annotation such as `dict[str, Any]` would state the intent more clearly.
- String Formatting with `_`: Unicode literals (`u'...'`) are no longer necessary in Python 3. They are accepted but redundant, so plain string literals are enough.
- Traceback Logging: Printing the traceback is helpful during development, but in production it should go through the logger rather than being printed directly, unless you specifically need it to debug exceptions.
- Empty List Check: The line `if not any(list(filter(lambda mt: mt.get('value') == model_type, model_type_list))):`
  - Is there a reason for wrapping the `filter(...)` result in `list(...)` before passing it to `any()`? It seems unnecessary here.
  - A more idiomatic way would be to skip this step if `model_type_list` is empty beforehand.
- Exception Handling in `get_model_params_setting_form`: This method does nothing useful (its body is `pass`); its implementation can be removed or filled in based on actual requirements.
- Missing Newline at End of File: The diff reports "No newline at end of file"; add a trailing newline.
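For the Empty List Check item, the same test can be written with a generator expression, which avoids both the `lambda` and the intermediate list. The sketch below is illustrative: the helper name `has_model_type` and the sample entries are hypothetical; only the `'value'` key comes from the line quoted above.

```python
from typing import Any, Dict, List


def has_model_type(model_type_list: List[Dict[str, Any]], model_type: str) -> bool:
    """Return True if any entry's 'value' equals model_type (hypothetical helper)."""
    # any() stops at the first match, and the generator builds no temporary list.
    return any(mt.get("value") == model_type for mt in model_type_list)


# Hypothetical sample data shaped like the entries the quoted line iterates over.
sample = [{"key": "Speech to text", "value": "STT"}, {"key": "LLM", "value": "LLM"}]
```

An empty `model_type_list` needs no special case here, since `any()` over an empty iterable is `False`.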
Optimizations:
- Use Type Annotations Accurately:

      from typing import Dict, Optional, Any

      class ZhEnXunFeiSTTModelCredential(BaseForm, BaseModelCredential):
          # ...

- Remove Unnecessary Conversion:

      model_type_list = provider.get_model_type_list()
      if not model_type_list:
          raise AppApiException(ValidCode.valid_error.value, _("No valid model types found"))
      # ... rest of the validation logic remains unchanged ...

- Simplify Exception Handling: Instead of raising exceptions within exception handling blocks, catch them separately and handle each case appropriately. For example:

      try:
          model = provider.get_model(model_type, model_name, credential)
          model.check_auth()
      except AppApiException as e:
          if not raise_exception:
              return False
          raise
      except Exception as e:
          traceback.print_exc()
          if raise_exception:
              raise AppApiException(ValidCode.valid_error.value, _('Verification failed'))
          return False

- Consider Adding More Specific Exceptions: Depending on the complexity, you might want to introduce specific exception classes or wrap existing ones to better describe the context of each failure.
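One way to read the suggestion about more specific exceptions is a small hierarchy for credential validation failures. Everything in this sketch is hypothetical, including the class names and the `require_fields` helper; only the `spark_api_secret` field name is taken from the diff, and `spark_api_key` is an assumed second field.

```python
from typing import Any, Dict, Iterable


class CredentialValidationError(Exception):
    """Base class for credential validation failures (hypothetical)."""


class MissingCredentialFieldError(CredentialValidationError):
    """Raised when a required credential field is absent or empty (hypothetical)."""

    def __init__(self, field: str) -> None:
        super().__init__(f"{field} is required")
        self.field = field


def require_fields(credential: Dict[str, Any], fields: Iterable[str]) -> None:
    """Raise MissingCredentialFieldError for the first missing or empty field."""
    for field in fields:
        if not credential.get(field):
            raise MissingCredentialFieldError(field)


# 'spark_api_key' is an assumed field name used only for this example.
try:
    require_fields({"spark_api_key": "k"}, ["spark_api_key", "spark_api_secret"])
    missing = None
except MissingCredentialFieldError as exc:
    missing = exc.field
```

Callers can then catch `CredentialValidationError` for any validation failure, or the subclass when they need to know which field was missing.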
Apply these improvements where appropriate to ensure the code maintains correctness, readability, and maintainability while also considering future scalability needs.
@@ -47,7 +54,8 @@
         .append_default_model_info(
             ModelInfo('generalv3.5', '', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM))
         .append_default_model_info(
-            ModelInfo('iat', _('Chinese and English recognition'), ModelTypeConst.STT, stt_model_credential, XFSparkSpeechToText),
+            ModelInfo('iat', _('Chinese and English recognition'), ModelTypeConst.STT, stt_model_credential,
+                      XFSparkSpeechToText),
         )
         .append_default_model_info(
             ModelInfo('tts', '', ModelTypeConst.TTS, tts_model_credential, XFSparkTextToSpeech))
There are no significant issues with the provided Python code snippet. However, I have a few optimizations and improvements you might consider:
- Use of `append_default_model_info`: If this is intended to add default model information, ensure that the class `DefaultModelInfoProvider` supports appending without causing side effects.
- Consistent Use of Quotation Marks: In some places, double quotes (`"`) and single quotes (`'`) are used interchangeably. Consistency is preferred, especially for string literals used throughout the file.
- Line Length: Some lines are long enough to hurt readability and maintainability. Consider breaking long lines into multiple parts or using triple quotes for larger strings.
Here’s an updated version of the code with these considerations:
from models_provider.impl.xf_model_provider.credential.llm import XunFeiLLMModelCredential
from models_provider.impl.xf_model_provider.credential.stt import XunFeiSTTModelCredential
from models_provider.impl.xf_model_provider.credential.tts import XunFeiTTSModelCredential
from models_provider.impl.xf_model_provider.credential.zh_en_stt import ZhEnXunFeiSTTModelCredential
from models_provider.impl.xf_model_provider.model.embedding import XFEmbedding
from models_provider.impl.xf_model_provider.model.image import XFSparkImage
from models_provider.impl.xf_model_provider.model.llm import XFChatSparkLLM
from maxkb.conf import PROJECT_DIR
from django.utils.translation import gettext as _
import ssl
# Assign the factory function itself: the standard library calls this attribute to build a context.
ssl._create_default_https_context = ssl.create_default_context
xunfei_model_credential = XunFeiLLMModelCredential()
stt_model_credential = XunFeiSTTModelCredential()
zh_en_stt_credential = ZhEnXunFeiSTTModelCredential()
image_model_credential = XunFeiImageModelCredential()
tts_model_credential = XunFeiTTSModelCredential()
embedding_model_credential = XFEmbeddingCredential()
model_info_list = [
    ModelInfo('generalv3.5', "", ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
    ModelInfo('', 'General Version v3.0', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
    ModelInfo('', 'General Version v2.0', ModelTypeConst.LLM, xunfei_model_credential, XFChatSparkLLM),
    # Simplify Chinese and English Recognition model info
    ModelInfo('iat', _('Chinese and English Recognition'), ModelTypeConst.STT,
              stt_model_credential, XFSparkSpeechToText),
    # New STT model info for Chinese and English
    ModelInfo('slm', _('Chinese and English Recognition'), ModelTypeConst.STT,
              zh_en_stt_credential, XFZhEnSparkSpeechToText),
    ModelInfo("", "Text-to-Speech", ModelTypeConst.TTS, tts_model_credential, XFSparkTextToSpeech),
    ModelInfo("embedding", "Sentence Embeddings", ModelTypeConst.EMBEDDING,
              embedding_model_credential, XFEmbedding)
]
By applying these changes, the code becomes more readable and consistent in terms of string handling and overall structure.