Utterance-level Aggregation For Speaker Recognition In The Wild

ajilim/VGG-Speaker-Recognition

This repo contains Keras code used to train models for the task of speaker verification, as outlined in the paper below from ICASSP 2019:

[Utterance-level Aggregation For Speaker Recognition In The Wild (Xie et al., ICASSP 2019)].

Dependencies

Data

The datasets used to train these models are the VoxCeleb datasets, which can be found at the link below.

Training the model

To train the model on the VoxCeleb2 dataset, please run

- python src/main.py --net resnet34s --batch_size 160 --gpu 2,3 --lr 0.001 --optimizer adam --epochs 48 --multiprocess 8 --loss softmax --data_path ../path_to_voxceleb2
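When launching several training runs with different hyperparameters, it can help to assemble the command above programmatically. The following is a minimal sketch, not part of this repo: the helper name `build_train_command` is hypothetical, and the default values are simply copied from the command above. It only builds the argument list and does not validate flags against `src/main.py`.

```python
import shlex


def build_train_command(data_path, **overrides):
    """Build the argument list for the training command shown in this README.

    Hypothetical helper: defaults are copied from the README command, and
    only those flags can be overridden.
    """
    args = {
        "net": "resnet34s",
        "batch_size": 160,
        "gpu": "2,3",
        "lr": 0.001,
        "optimizer": "adam",
        "epochs": 48,
        "multiprocess": 8,
        "loss": "softmax",
    }
    unknown = set(overrides) - set(args)
    if unknown:
        raise ValueError(f"flags not shown in the README command: {sorted(unknown)}")
    args.update(overrides)
    args["data_path"] = data_path  # kept last, as in the README command

    cmd = ["python", "src/main.py"]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) to actually start training.
    print(shlex.join(build_train_command("../path_to_voxceleb2")))
```

Calling `build_train_command("../path_to_voxceleb2", lr=0.0001)` would change only the learning rate and leave the other flags as listed above.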

Model

Testing the model

To test a specific model on the VoxCeleb1 dataset, e.g. the ResNet34 model trained using Adam with a softmax loss and feature dimension 512, please run

- python src/predict.py --gpu 1 --net resnet34s --ghost_cluster 2 --vlad_cluster 8 --loss softmax --ohem 2 --resume ../model/gvlad_softmax/2019-01-11_resnet34_bs142_adam_lr0.001_vlad8_ghost2_bdim512_ohemlevel2/weights-47-0.866.h5 
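This README does not describe how verification trials are scored. Speaker verification is commonly evaluated by comparing two utterance embeddings with cosine similarity and reporting the equal error rate (EER) over a list of trials. The sketch below illustrates that general idea with NumPy; the function names are hypothetical, it is not taken from `src/predict.py`, and it ignores tied scores for simplicity.

```python
import numpy as np


def cosine_score(emb_a, emb_b):
    """Cosine similarity between two embedding vectors."""
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    return float(np.dot(emb_a, emb_b))


def equal_error_rate(scores, labels):
    """Approximate EER: the point where false accepts and false rejects balance.

    scores: similarity per trial (higher means "same speaker").
    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores)          # sort trials by descending score
    labels = labels[order]
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos

    # Accepting the top-k trials for k = 0..N gives one operating point each.
    true_accepts = np.cumsum(labels)
    false_accepts = np.cumsum(1 - labels)
    fnr = np.concatenate([[1.0], 1.0 - true_accepts / n_pos])
    fpr = np.concatenate([[0.0], false_accepts / n_neg])

    idx = np.argmin(np.abs(fnr - fpr))   # closest point to FNR == FPR
    return float((fnr[idx] + fpr[idx]) / 2.0)


if __name__ == "__main__":
    # Toy trial list with made-up scores, for illustration only.
    toy_scores = [0.9, 0.6, 0.4, 0.1]
    toy_labels = [1, 0, 1, 0]
    print(equal_error_rate(toy_scores, toy_labels))
```

With a real trial list, each score would come from `cosine_score` applied to the two embeddings of a trial pair.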

Citation

@InProceedings{Xie19,
  author       = "W. Xie and A. Nagrani and J. S. Chung and A. Zisserman",
  title        = "Utterance-level Aggregation For Speaker Recognition In The Wild.",
  booktitle    = "ICASSP, 2019",
  year         = "2019",
}

@InProceedings{Chung18,
  author       = "J. S. Chung* and A. Nagrani* and A. Zisserman",
  title        = "VoxCeleb2: Deep Speaker Recognition.",
  booktitle    = "INTERSPEECH, 2018",
  year         = "2018",
}

@InProceedings{Nagrani17,
  author       = "A. Nagrani* and J. S. Chung* and A. Zisserman",
  title        = "VoxCeleb: A Large-scale Speaker Identification Dataset.",
  booktitle    = "INTERSPEECH, 2017",
  year         = "2017",
}
