Speech Language AutoDetection does not work in Javascript SDK 1.42.0 #2733

Open
@reliccare

Description

IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:

  • Speech SDK log taken from a run that exhibits the reported issue.

Logs are available here.

  • A stripped down, simplified version of your source code that exhibits the issue. Or, preferably, try to reproduce the problem with one of the public samples in this repository (or a minimally modified version of it), and share the code.

I am using this sample with just one change: the language is auto-detected instead of being taken from speechConfig. The changes are shown below:

// speechConfig.speechRecognitionLanguage = settings.language;
var autoDetectSourceLanguageConfig = sdk.AutoDetectSourceLanguageConfig.fromLanguages([("zh-CN", "hi-IN", "en-US")]);
// var reco = new sdk.SpeechRecognizer(speechConfig, audioConfig);
var reco = sdk.SpeechRecognizer.FromConfig(speechConfig, autoDetectSourceLanguageConfig, audioConfig);

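
One thing worth checking in the snippet above: in JavaScript, a parenthesized, comma-separated list is a single comma expression that evaluates to its last operand, so the array passed to fromLanguages contains one element rather than three. The sketch below uses plain JavaScript only (no SDK calls); the variable names are hypothetical. Whether this accounts for the romanized output reported in this issue is an assumption, not something confirmed here.

```javascript
// As written in the snippet: the inner parentheses form one comma expression,
// which evaluates to its last operand ("en-US").
const asWritten = [("zh-CN", "hi-IN", "en-US")];

// Without the inner parentheses, each language is a separate array element.
const asIntended = ["zh-CN", "hi-IN", "en-US"];

console.log(asWritten.length, asWritten);
console.log(asIntended.length, asIntended);
```

If the single-element array is the cause, passing the array without the inner parentheses to fromLanguages would be the first thing to try.
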
  • If relevant, a WAV file of your input audio.

I am using the zhcn_short_dummy_sample.wav file provided with the samples. It contains a Chinese utterance.

  • Additional information as shown below

Describe the bug

When using automatic language detection, I get the following output:

Now recognizing speech from: zhcn_short_dummy_sample.wav
(sessionStarted) SessionId: B651DFE8EECD489EA8DDF223EFF8FBF6
(speechStartDetected) SessionId: B651DFE8EECD489EA8DDF223EFF8FBF6
(recognizing) Reason: RecognizingSpeech Text: jin tian tian qi
(recognizing) Reason: RecognizingSpeech Text: jin tian tian qi zhe me yang
(speechEndDetected) SessionId: B651DFE8EECD489EA8DDF223EFF8FBF6

(recognized)  Reason: RecognizedSpeech Text: Jin Tian Tian Qi Zhe me Yang.
(sessionStopped) SessionId: B651DFE8EECD489EA8DDF223EFF8FBF6
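
To see which language the service actually selected for a run like the one above, the recognized handler could also log the detected language. This is a minimal sketch, assuming the SDK's AutoDetectSourceLanguageResult.fromResult call and its language property; the helper name describeRecognized is hypothetical, and the SDK module is passed in as a parameter so the helper itself has no dependencies.

```javascript
// Hypothetical helper: builds one log line for a final recognition result.
// `sdk` is the imported microsoft-cognitiveservices-speech-sdk module and
// `result` is the e.result object delivered to the recognized handler.
function describeRecognized(sdk, result) {
  // Assumption: fromResult wraps the result and exposes the detected language.
  const detected = sdk.AutoDetectSourceLanguageResult.fromResult(result);
  return "(recognized) Language: " + detected.language + " Text: " + result.text;
}

// Possible usage inside the sample (not executed here):
// reco.recognized = function (s, e) {
//   console.log(describeRecognized(sdk, e.result));
// };
```
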

Expected behavior

If I do not use automatic language detection and instead specify zh-CN as the speech recognition language, I get the correct output:

Now recognizing speech from: zhcn_short_dummy_sample.wav
(sessionStarted) SessionId: E0B87990473F4E0B9EF1E2F51B7A6BDD
(speechStartDetected) SessionId: E0B87990473F4E0B9EF1E2F51B7A6BDD
(recognizing) Reason: RecognizingSpeech Text: 今天天气
(recognizing) Reason: RecognizingSpeech Text: 今天天气怎么样
(speechEndDetected) SessionId: E0B87990473F4E0B9EF1E2F51B7A6BDD

(recognized)  Reason: RecognizedSpeech Text: 今天天气怎么样?
(sessionStopped) SessionId: E0B87990473F4E0B9EF1E2F51B7A6BDD

Version of the Cognitive Services Speech SDK

Here are my package.json dependencies:

  "dependencies": {
    "difflib": "^0.2.4",
    "https-proxy-agent": "^3.0.0",
    "lodash": "^4.17.21",
    "lodash.foreach": "^4.5.0",
    "lodash.sum": "^4.0.2",
    "mic-to-speech": "^1.0.1",
    "microsoft-cognitiveservices-speech-sdk": "^1.42.0",
    "readline": "^1.3.0",
    "segment": "^0.1.3",
    "wav": "^1.0.2"
  }

Platform, Operating System, and Programming Language

  • OS: MacBook Pro
  • Hardware: Apple M1 chip
  • Programming language: JavaScript
  • Browser: not applicable (Node.js)
