VITKLIA

Visualization and Interpretation Tool Kit

This tool lets you create a small dataset from Voxceleb data. You can define how many speakers and utterances you want in the train set and in the test sets. You can also choose the number of test sets.

The main goal of the tool is to visualize vectors, for example those extracted with STKLIA. It also finds the prototypes (a notion from Interpretable Machine Learning: the utterances that best represent the data) and the criticisms (the utterances that are either underrepresented or overrepresented). It reduces the vectors to the number of dimensions you want using UMAP, LDA, or UMAP combined with LDA. The final plot shows one color per speaker; clicking on an utterance either plays that utterance or opens a new plot with the utterances of that speaker. Many things can be configured.

Two main parts

A tool to make a smaller dataset

Go to src/smallDatasetCreator and take a look at the README.md.
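
The selection this tool performs (a number of speakers, a number of utterances per speaker, one train set and several test sets) can be sketched as follows. This is only an illustration, not the code in src/smallDatasetCreator: the function name `make_small_dataset`, its parameters, and the choice to reuse the same speakers in the train and test sets with disjoint utterances are all assumptions.

```python
import random


def make_small_dataset(utts_by_speaker, n_speakers, n_train_utts,
                       n_test_utts, n_tests=1, seed=0):
    """Pick speakers, then split their utterances into 1 train set and n_tests test sets.

    Hypothetical helper: assumes train and test sets share speakers
    but never share an utterance.
    """
    rng = random.Random(seed)
    needed = n_train_utts + n_tests * n_test_utts
    # Keep only speakers that have enough utterances for every split.
    eligible = sorted(s for s, u in utts_by_speaker.items() if len(u) >= needed)
    if len(eligible) < n_speakers:
        raise ValueError("not enough speakers with enough utterances")
    speakers = rng.sample(eligible, n_speakers)

    train = {}
    tests = [{} for _ in range(n_tests)]
    for spk in speakers:
        utts = rng.sample(sorted(utts_by_speaker[spk]), needed)
        train[spk] = utts[:n_train_utts]
        for i in range(n_tests):
            start = n_train_utts + i * n_test_utts
            tests[i][spk] = utts[start:start + n_test_utts]
    return train, tests


# Made-up speaker and utterance ids, for illustration only.
data = {f"id{i}": [f"id{i}-utt{j}" for j in range(10)] for i in range(5)}
train, tests = make_small_dataset(data, n_speakers=3, n_train_utts=4,
                                  n_test_utts=2, n_tests=2, seed=1)
```

Fixing the seed makes the selection reproducible, which helps when the same small dataset has to be rebuilt later.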

A tool to transform x-vectors, find prototypes (and criticisms) and visualize them

Go to src/ and take a look at the README.md.
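
To give an idea of what prototypes and criticisms are, here is a minimal sketch in the spirit of MMD-critic, the method described in Interpretable Machine Learning. It is an assumption that this tool follows the same idea, and the actual implementation in src/ may differ. The function names and the toy 2-D points are hypothetical, and the criticism step omits the diversity regularizer of the full method.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel between two vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_matrix(points, gamma=1.0):
    return [[rbf(p, q, gamma) for q in points] for p in points]


def select_prototypes(K, m):
    """Greedily pick m points whose distribution is closest to the data (low MMD)."""
    n = len(K)
    mean_sim = [sum(row) / n for row in K]  # mean similarity of each point to the data
    chosen = []
    for _ in range(m):
        best, best_score = None, -math.inf
        for c in range(n):
            if c in chosen:
                continue
            cand = chosen + [c]
            k = len(cand)
            # Negative squared MMD, without the constant data-data term.
            score = (2.0 / k) * sum(mean_sim[j] for j in cand) \
                - sum(K[a][b] for a in cand for b in cand) / (k * k)
            if score > best_score:
                best, best_score = c, score
        chosen.append(best)
    return chosen


def select_criticisms(K, prototypes, c):
    """Pick the c points where the prototypes represent the data worst."""
    n = len(K)
    # Witness function: positive where data is underrepresented by the
    # prototypes, negative where it is overrepresented.
    witness = [sum(K[i]) / n - sum(K[i][j] for j in prototypes) / len(prototypes)
               for i in range(n)]
    ranked = sorted((i for i in range(n) if i not in prototypes),
                    key=lambda i: abs(witness[i]), reverse=True)
    return ranked[:c]


# Toy data: two tight clusters (think of two speakers) and one outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9),
          (2.5, 10.0)]
K = kernel_matrix(points)
prototypes = select_prototypes(K, 2)
criticisms = select_criticisms(K, prototypes, 1)
```

On this toy data the two prototypes are the cluster centers and the criticism is the isolated point, which matches the intuition of "best represented" versus "badly represented" utterances.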

How to install

Just follow these steps:

pip install matplotlib
pip install PyAudio
pip install umap-learn
pip install tqdm
pip install pyaml

Or just run:

pip install -r requirements.txt

You may also need STKLIA and, with it, Kaldi, PyTorch and Voxceleb.

Examples

Complete overview

Launch one of the two shell scripts in src/smallDatasetCreator to create a small dataset. For example:

bash speakersSelector.sh

You can check some information about the new train set or about a test set (replace /Train with /TestX, where X is the number of the test) by running:

python3 datasetInfo ../../toy_dataset/NewSet/Train/feats.scp
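
As an illustration of the kind of information that can be read from a feats.scp file, the sketch below counts speakers and utterances. It is not the datasetInfo script itself. It assumes the usual Kaldi layout of one `utterance-id location` pair per line, and it assumes that the speaker id is the part of the utterance id before the first `-`, as in Kaldi's Voxceleb recipes. The function name `summarize_scp` and the sample lines are made up.

```python
from collections import Counter


def summarize_scp(lines):
    """Count utterances per speaker from the lines of a Kaldi-style .scp file."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        utt_id = line.split(maxsplit=1)[0]
        speaker = utt_id.split("-", 1)[0]  # assumption: "<speaker>-<rest>"
        counts[speaker] += 1
    return {
        "speakers": len(counts),
        "utterances": sum(counts.values()),
        "per_speaker": dict(counts),
    }


# Made-up lines in feats.scp style, for illustration only.
sample = [
    "id10001-1zcIwhmdeo4-00001 /some/dir/raw_mfcc.1.ark:42",
    "id10001-1zcIwhmdeo4-00002 /some/dir/raw_mfcc.1.ark:9001",
    "id10002-6WO410QOeuo-00001 /some/dir/raw_mfcc.2.ark:42",
    "",
]
info = summarize_scp(sample)
```

With a real file, the lines can be passed directly from an open file object, for example `summarize_scp(open(path))`.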

Train STKLIA on the train set of this small dataset (an example config file is given in /toy_dataset/Exemple_speaker.cfg). Extract the vectors of the train set, then the vectors of the test sets. The result should look like the contents of "pretrained_models".

Now you can configure /configs/config.yaml to use the vectors that you have extracted. Finally, launch from src/:

python3 run.py --conf ../configs/config.yaml --mode reduction

If you want to compare two tests, or one test and the train set, configure /configs/compare.yaml and then run:

python3 run.py --conf ../configs/compare.yaml

For this, you need to have already saved the prototypes, criticisms and utterances with run.py.
