Joint training embeddings

This is the official implementation of the submitted Interspeech 2022 paper [Enhancing Embeddings for Speech Classification in Noisy Conditions], (Mohamed Nabih Ali, Daniele Falavigna, Alessio Brutti).

Description

In this paper, we investigate how enhancement can be applied in neural speech classification architectures employing pre-trained speech embeddings. We investigate two approaches: one applies time-domain enhancement prior to extracting the embeddings; the other employs a convolutional neural network to map the noisy embeddings to the corresponding clean ones. All the experiments are conducted based on Fluent Speech commands, Google Speech commands v 0.01 and generated noisy versions of these datasets.

Architectures

Dataset

You need to create .txt files for training, validation, and testing datasets with the following structure:

<noisy_1_path><space><clean_1_path> 
<noisy_2_path><space><clean_2_path> 
<noisy_n_path><space><clean_n_path>

e.g. In the case of Wave-Enh strategy

/train/noisy/a.wav /train/clean/a.wav
/train/noisy/b.wav /train/clean/b.wav

In the case of the Embed-Enha strategy, it should be like 

/train/noisy/a.pt /train/clean/a.pt
/train/noisy/b.pt /train/clean/b.pt

For the Wave-Enh the test.txt file should be

<noisy_1_path>
<noisy_2_path>
<noisy_n_path>
e.g.
/test/noisy/a.wav
/test/noisy/b.wav

Steps of Wave-Enh Strategy

You need to download the wav2vec model from (https://github.com/pytorch/fairseq/tree/main/examples/wav2vec), and modify its path in util/utils.py file.

Training

Use train.py to jointly train both the speech enhancement and the speech classifier modules. It receives two command line parameters:

-C, --config, the path of your configuration file for the training process.
-R, --resume, resume training from the last saved checkpoint.

Syntax python train.py -C config/train/train.json or python train.py config/train/train.json -R

Evaluation

Use enhancement.py to evaluate both models.

-O specify the folder where to save the enhanced signals.
-D use the signals using the GPU.
-M path to save the best front-end model.
-m path to save the best back-end model.

Syntax: python enhancement.py -C config/enhancement/unet_basic.json -D 0 -O <path to save the enhanced signals> -M <path to the best speech enhancement model> -m <path to the best back end speech classifier>

Steps of Embeds-Enh Strategy

Embeddings Extraction

You need to download the wav2vec model from (https://github.com/pytorch/fairseq/tree/main/examples/wav2vec), and modify its path in emb.py file to extract the embeddings from the dataset.

Syntax: python emb.py

Training

Use train.py to jointly train both the speech enhancement and the speech classifier modules. It receives six main commands line parameters:

-m path to save the best back-end model.
-M path to save the best front-end model.
-b number of residual blocks in the back-end speech classifier.
-r number of repeats of the residual blocks.
-lr learning rate
e number of epochs

Syntax: python train.py -m ./best_bkmodel.pkl -M ./best_frmodel.pkl -b 5 -r 2 -lr 0.001 -e 100

Evaluation

Use evaluation.py to evaluate both models based on the test dataset. It receives two main commands line parameters:

-m path to save the best back-end model.
-M path to save the best front-end model.

Syntax: python evaluation.py -m ./best_bkmodel.pkl -M ./best_frmodel.pkl

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
Embed-Enh		Embed-Enh
Wave-Enh		Wave-Enh
assets		assets
Ne.sh		Ne.sh
README.md		README.md
emb.py		emb.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Joint training embeddings

Description

Architectures

Dataset

For the Wave-Enh the test.txt file should be

Steps of Wave-Enh Strategy

Training

Evaluation

Steps of Embeds-Enh Strategy

Embeddings Extraction

Training

Evaluation

About

Releases

Packages

Languages

mnabihali/Joint-training-embeddings

Folders and files

Latest commit

History

Repository files navigation

Joint training embeddings

Description

Architectures

Dataset

For the Wave-Enh the test.txt file should be

Steps of Wave-Enh Strategy

Training

Evaluation

Steps of Embeds-Enh Strategy

Embeddings Extraction

Training

Evaluation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages