A pipeline to make ASR datasets better
Updated Jul 25, 2024 - Python
GlotLID: Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
Working on LLM research
Geographically-informed language identification
Spoken Language Identification on Common Voice and AudioSet using Deep Learning
A dataset with English, German, and Spanish speech samples.
Neural net language identification for many languages on short texts plus construction-based dialectometry
Identify a spoken language using artificial intelligence (LID).