ekstep-language-identification

This repository is part of Vakyansh's recipes for building a state-of-the-art speech recognition model.

The language identification repository classifies audio utterances into different classes. It can work with two or more classes, depending on the requirement.

Preparing the Data

Keep separate audio folders for each class, with separate train and valid sets for each. The audio files should be in .wav format. To prepare the data, edit the data paths in data/create_manifest.py.

To run the file:

python create_manifest.py

This creates the train and valid csv files in the data/ directory.
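As a rough illustration of what a manifest is, the sketch below walks one folder per class and writes a csv with one row per .wav file. It is not the code in create_manifest.py: the function name `build_manifest` and the column names `file_path` and `label` are assumptions made for this example, and the real script may use a different layout.

```python
import csv
from pathlib import Path


def build_manifest(class_dirs, out_csv):
    """Write one row per .wav file: (path, integer label).

    class_dirs maps an integer label to the folder holding that class's audio.
    The column names are illustrative, not taken from the repository.
    """
    rows = []
    for label, folder in sorted(class_dirs.items()):
        # Only .wav files are picked up; anything else in the folder is ignored.
        for wav in sorted(Path(folder).rglob("*.wav")):
            rows.append((str(wav), label))
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file_path", "label"])
        writer.writerows(rows)
    return len(rows)
```

A call such as `build_manifest({0: "audio/hi/train", 1: "audio/en/train"}, "train.csv")` would write the train manifest; the folder names here are hypothetical.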

Training the Model

Edit train_config.yml to set the training parameters. Give the file paths of the train and valid csv files created while preparing the data.

To start training, run:
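Before pointing the config at the manifests, it may be worth checking that every audio path they list actually exists. The helper below is a minimal sketch and not part of the repository; it assumes the csv has a header row and a path column named `file_path`, which may differ from what create_manifest.py writes.

```python
import csv
import os


def find_missing_audio(manifest_csv, path_column="file_path"):
    """Return the audio paths listed in a manifest that do not exist on disk.

    path_column is an assumed column name; adjust it to match the real csv.
    """
    with open(manifest_csv, newline="") as f:
        reader = csv.DictReader(f)
        return [
            row[path_column]
            for row in reader
            if not os.path.isfile(row[path_column])
        ]
```

An empty list from both the train and valid manifests suggests the paths in the config are safe to use.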

python train.py

Inference

Edit language_map.yml to map the labels (0, 1, etc.) to the language names or codes ('hi', 'en', etc.).

To run inference, edit inference.py and provide the best_checkpoint path and the audio file path.
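The role of the label map can be sketched as follows. This is an illustration only: the dictionary stands in for the contents of language_map.yml, the assignment of 0 to 'hi' and 1 to 'en' is an assumed example, and `label_to_language` is a hypothetical helper that assumes the model produces one score per class.

```python
# Hypothetical mapping; the real one is read from language_map.yml.
LANGUAGE_MAP = {0: "hi", 1: "en"}


def label_to_language(scores, language_map=LANGUAGE_MAP):
    """Pick the highest-scoring class index and return its language code."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return language_map[best]
```

With this mapping, `label_to_language([0.2, 0.8])` returns `"en"`. The labels in the map need to match the labels used when the manifests were created.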

Parameters:

model_path : Path to best_checkpoint.pt

audio_path : Audio file path

Run the file:

python inference.py

This runs on a single audio file.
