
[Feature request] Add support for Massively Multilingual Speech (MMS) model #209

Closed
@bil-ash


Please consider adding support for the ASR and TTS features of the MMS model to transformers.js.

The model docs are available here. The PyTorch ASR model fine-tuned for 102 languages is available [here](https://huggingface.co/facebook/mms-1b-fl102/blob/main/pytorch_model.bin). At 3.86 GB it is not much larger than whisper-medium (3.06 GB), but it works much better on Indian languages such as Hindi, Bengali, and Assamese. (This is not benchmarked: I know these languages and judged the results obtained for a few audio clips in each.)

Also, the TTS model could be a good contender for the TTS pipeline for transformers, which is currently being worked on. Apparently, it is the TTS model in transformers that supports the largest number of languages.

Labels: enhancement (New feature or request)