This repo contains tools for covariate shift detection on speech audio, with an example application on the VOiCES dataset. Our approach uses pretrained speech featurizers and aggregation methods to embed waveforms into fixed-length vectors.
We then follow the approach of Failing Loudly to detect distribution shift: untrained autoencoders perform dimensionality reduction on the embeddings, followed by a two-sample non-parametric hypothesis test on samples of source and target data.
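As a rough end-to-end sketch (not the code in this repo): below, librosa MFCCs stand in for the pretrained featurizers, mean-pooling over time is the aggregation, a fixed random projection plays the role of the untrained-autoencoder reduction, and alibi_detect's `MMDDrift` runs the two-sample test. The sample rate, dimensions, and synthetic waveforms are placeholders.

```python
import numpy as np
import librosa
from alibi_detect.cd import MMDDrift

SR = 16000  # assumed sample rate

def embed(waveform: np.ndarray, sr: int = SR) -> np.ndarray:
    """Featurize then aggregate a waveform into a fixed-length vector."""
    feats = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=40)  # (40, n_frames)
    return feats.mean(axis=1)                                   # mean-pool over time -> (40,)

# Synthetic placeholder waveforms standing in for source and (noisier) target audio.
rng = np.random.default_rng(0)
t = np.arange(SR) / SR
source = [np.sin(2 * np.pi * f * t).astype(np.float32)
          for f in rng.uniform(200, 400, 100)]
target = [(np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(SR)).astype(np.float32)
          for f in rng.uniform(200, 400, 100)]

x_ref = np.stack([embed(w) for w in source]).astype(np.float32)
x_test = np.stack([embed(w) for w in target]).astype(np.float32)

# Untrained dimensionality reduction: a fixed random projection in place of the
# untrained-autoencoder encoder used in Failing Loudly.
proj = rng.standard_normal((x_ref.shape[1], 8)).astype(np.float32)

def reduce_dim(x: np.ndarray) -> np.ndarray:
    return x @ proj

# MMD two-sample test on the reduced embeddings.
detector = MMDDrift(x_ref, backend='pytorch', p_val=0.05, preprocess_fn=reduce_dim)
result = detector.predict(x_test)
print(result['data']['is_drift'], result['data']['p_val'])
```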
- `embeddors.py`: Classes wrapping models that convert waveforms to sequences of feature vectors.
- `aggregators.py`: Classes for aggregating sequences of feature vectors into single, fixed-length vector embeddings.
- `data_utils.py`: Tools for combining featurizers and aggregators into the embedding pipeline and applying it to lists of waveforms or .wav files.
- `detection_utils.py`: Functional wrappers around the alibi_detect implementation of MMD two-sample testing for single and repeated tests (see the sketch after this list).
- `hypothesis_test.ipynb`: A notebook that walks through loading the VOiCES dataset, preprocessing the waveforms into embeddings, and evaluating the performance of distribution shift detection.
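For a sense of what the repeated testing looks like, here is a hypothetical helper (not the actual API of `detection_utils.py`): resample source and target embeddings many times, run the MMD test on each pair, and report the fraction of rejections as an estimate of detection power at a given sample size.

```python
import numpy as np
from alibi_detect.cd import MMDDrift

def detection_rate(x_source: np.ndarray, x_target: np.ndarray,
                   sample_size: int = 50, n_runs: int = 20,
                   p_val: float = 0.05, seed: int = 0) -> float:
    """Fraction of repeated MMD two-sample tests that flag drift.

    Illustrative only; the repo's detection_utils.py may differ in
    naming and behavior.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_runs):
        # Draw fresh source/target subsamples for each repetition.
        ref = x_source[rng.choice(len(x_source), sample_size, replace=False)]
        test = x_target[rng.choice(len(x_target), sample_size, replace=False)]
        detector = MMDDrift(ref, backend='pytorch', p_val=p_val)
        hits += detector.predict(test)['data']['is_drift']
    return hits / n_runs
```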
To follow the experiments in `hypothesis_test.ipynb`, you will need to download the VOiCES dataset; instructions for that can be found here. We recommend using the devkit subset for expediency.
The following packages are required:

- alibi-detect
- torch
- fairseq
- tensorflow
- tensorflow_hub
- librosa
- pandas
- seaborn
- wget
All can be installed using pip:

`pip install -r requirements.txt`