Support Flux multilingual model (flux-general-multi) and language_hint parameter in WorkersAIFluxSTT #1441

@diroverflow

Summary

WorkersAIFluxSTT in @cloudflare/voice currently uses only @cf/deepgram/flux, which appears to map to Deepgram's English-only model (flux-general-en). There is no way to select the multilingual model (flux-general-multi) or to pass language_hint parameters, so it cannot be used to build voice agents for non-English users.

Problem

Deepgram has released Flux Multilingual (flux-general-multi), a single model supporting 10 languages (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch) with auto-detection and language_hint biasing. However:

  1. Workers AI's @cf/deepgram/flux appears to only serve flux-general-en. There is no parameter to switch to flux-general-multi.
  2. WorkersAIFluxSTT has no language or languageHint option in its constructor/config. Looking at the source, the WebSocket input sends only encoding, sample_rate, eot_threshold, and similar fields, with no language-related fields.
  3. The underlying ai.run("@cf/deepgram/flux", input, { websocket: true }) call provides no way to specify the Deepgram model variant or language_hint.

For context, WorkersAINova3STT already supports a language option — it would be great if WorkersAIFluxSTT followed a similar pattern.

Proposed Solution

  1. Workers AI level: Add a model or language parameter to @cf/deepgram/flux that allows selecting flux-general-multi and passing language_hint values through to Deepgram.
  2. @cloudflare/voice level: Add a languageHints option (and optionally language) to the WorkersAIFluxSTT constructor, similar to how WorkersAINova3STT accepts language. When provided, pass these values through to the ai.run() call.
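
To make the proposal concrete, here is a minimal sketch of how the constructor options could be mapped onto the WebSocket input. Everything in it is an assumption for illustration: the `FluxSTTOptions` and `FluxInput` shapes, the `model` and `language_hint` input fields, the `linear16` encoding value, and the 16000 default are hypothetical and are not taken from the current @cloudflare/voice source or the Workers AI schema.

```typescript
// Hypothetical option shape for WorkersAIFluxSTT (not the real type).
interface FluxSTTOptions {
  sampleRate?: number;
  eotThreshold?: number;
  languageHints?: string[]; // proposed in this issue
}

// Hypothetical input payload; `model` and `language_hint` do not exist today.
interface FluxInput {
  encoding: string;
  sample_rate: number;
  eot_threshold?: number;
  model?: string;
  language_hint?: string[];
}

function buildFluxInput(options: FluxSTTOptions = {}): FluxInput {
  const input: FluxInput = {
    encoding: "linear16", // assumed default
    sample_rate: options.sampleRate ?? 16000, // assumed default
  };
  if (options.eotThreshold !== undefined) {
    input.eot_threshold = options.eotThreshold;
  }
  // Omitting languageHints keeps today's English-only behaviour.
  // Passing an array (even an empty one) opts into the multilingual model.
  if (options.languageHints !== undefined) {
    input.model = "flux-general-multi";
    if (options.languageHints.length > 0) {
      input.language_hint = [...options.languageHints];
    }
  }
  return input;
}
```

With this mapping, an empty `languageHints` array selects the multilingual model with auto-detection, while leaving the option out entirely preserves the existing behaviour, so the change would be backwards compatible.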

Example API:

class MyAgent extends VoiceAgent<Env> {
  // Pick one option; a class can declare `transcriber` only once.

  // Option A: single language hint
  transcriber = new WorkersAIFluxSTT(this.env.AI, { languageHints: ["es"] });

  // Option B: multiple language hints for bilingual users
  // transcriber = new WorkersAIFluxSTT(this.env.AI, { languageHints: ["en", "es"] });

  // Option C: no hints, auto-detect
  // transcriber = new WorkersAIFluxSTT(this.env.AI, { languageHints: [] });
}
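
Since the multilingual model covers a fixed set of ten languages, the wrapper could also validate hints before opening the WebSocket. The helper below is a rough sketch under that assumption; `normalizeLanguageHints` is a hypothetical name, and the two-letter ISO 639-1 codes are my guess at how the ten languages listed above would be identified, not a confirmed Deepgram format.

```typescript
// The ten languages named in this issue, as assumed ISO 639-1 codes.
const FLUX_MULTI_LANGUAGES = [
  "en", "es", "fr", "de", "hi", "ru", "pt", "ja", "it", "nl",
] as const;

type FluxLanguage = (typeof FLUX_MULTI_LANGUAGES)[number];

// Hypothetical helper: trims, lowercases, and de-duplicates hints,
// and throws on any code outside the supported set.
function normalizeLanguageHints(hints: readonly string[]): FluxLanguage[] {
  const supported = new Set<string>(FLUX_MULTI_LANGUAGES);
  const seen = new Set<string>();
  const result: FluxLanguage[] = [];
  for (const raw of hints) {
    const code = raw.trim().toLowerCase();
    if (!supported.has(code)) {
      throw new RangeError(`Unsupported language hint: "${raw}"`);
    }
    if (!seen.has(code)) {
      seen.add(code);
      result.push(code as FluxLanguage);
    }
  }
  return result;
}
```

Failing early on an unsupported code would give callers a clear error at construction time instead of a silent fallback to auto-detection.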

Use Case

We're building an AI companion platform where characters can speak multiple languages. Currently, our voice call feature only works for English-speaking users. Supporting multilingual STT would allow us to serve our global user base without switching to a different STT provider.

Labels: enhancement
