Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences (2023)

This is the repository for [Big model only for hard audios]. It contains all the files needed to reproduce the training and evaluation of our decider, as well as our best model.

Given an audio clip, our method first runs the Whisper Small encoder to extract representations. The decider module then chooses whether to continue the inference with Whisper Small or, if the audio is simple enough, to restart it with Whisper Tiny.
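
The routing logic described above can be sketched in a few lines of Python. This is only an illustration, not the code shipped in this repository: the function name, the callables passed in, the convention that the decider returns a "hardness" score in [0, 1], and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Any, Callable, Tuple


def route_transcription(
    audio: Any,
    encode_small: Callable[[Any], Any],     # hypothetical: Whisper Small encoder
    decider: Callable[[Any], float],        # hypothetical: returns a hardness score
    decode_small: Callable[[Any], str],     # hypothetical: Whisper Small decoder
    transcribe_tiny: Callable[[Any], str],  # hypothetical: full Whisper Tiny pass
    threshold: float = 0.5,                 # assumed decision threshold
) -> Tuple[str, str]:
    """Return (model_used, transcript) for one audio sample."""
    # The Small encoder always runs: the decider needs its representations.
    features = encode_small(audio)
    hard_score = decider(features)
    if hard_score >= threshold:
        # Hard audio: keep going with Small and reuse the encoder output.
        return "small", decode_small(features)
    # Easy audio: restart from the raw audio with the cheaper Tiny model.
    return "tiny", transcribe_tiny(audio)
```

Note that in this sketch the Small encoder output is reused only on the "hard" branch; on the "easy" branch it is discarded and Whisper Tiny processes the audio from scratch.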

Installation

git clone https://github.com/hugomalard/Big-model-only-for-hard-audios.git 
cd Big-model-only-for-hard-audios

# creating a conda environment
conda create -n BMOHA python=3.8
conda activate BMOHA

pip install -r requirements.txt

Extract WERs of Whisper Small and Whisper Tiny

First, set the required paths (to the Whisper models and the CommonVoice datasets), then run:

inferences_whisper.sh
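
For reference, the word error rate (WER) is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The snippet below is a minimal, dependency-free sketch of that standard definition. It is not the implementation used by the script above, which may apply text normalization or rely on a library; the function name is chosen for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein distance, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Comparing the per-sample WER of Whisper Small and Whisper Tiny is what tells you which audios are "simple enough" for the smaller model.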

Train the decider module

Optionally, adjust the hyperparameters of the model by editing the file 'BMOHA/hparams/cnn/train_cnn_ponderate.yaml' before launching the training:

python BMOHA/train_cnn_latent_space.py BMOHA/hparams/cnn/train_cnn_ponderate.yaml

Inference using the decider module

The following command runs inference on a given dataset while measuring the computational cost of the model (in MACs) and its performance (in WER).

python BMOHA/inference_decider_whisper.py BMOHA/hparams/cnn/inference_whisper_decider.yaml
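
To reason about the cost side of this trade-off, here is a rough sketch of how the average MACs per audio could be estimated from the routing scheme described at the top of this README: every sample pays for the Whisper Small encoder (and the decider), hard samples additionally pay for the Small decoder, and easy samples pay for a full Whisper Tiny pass. The function and parameter names are hypothetical, and the accounting used by the actual inference script may differ.

```python
def average_macs(
    frac_small: float,   # fraction of audios routed to Whisper Small
    enc_small: float,    # MACs of the Whisper Small encoder (always paid)
    dec_small: float,    # MACs of the Whisper Small decoder (hard audios only)
    tiny_full: float,    # MACs of a full Whisper Tiny pass (easy audios only)
    decider: float = 0.0,  # MACs of the decider module (always paid)
) -> float:
    """Expected MACs per audio under the assumed routing cost model."""
    if not 0.0 <= frac_small <= 1.0:
        raise ValueError("frac_small must be between 0 and 1")
    shared = enc_small + decider
    return shared + frac_small * dec_small + (1.0 - frac_small) * tiny_full
```

Under this cost model, routing an audio to Whisper Tiny only saves compute when a full Tiny pass is cheaper than the Small decoder alone, since the Small encoder cost is paid either way.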
