This repo contains the Keras code used to train speaker verification models, as described in the following ICASSP 2019 paper:
[Utterance-level Aggregation For Speaker Recognition In The Wild (Xie et al., ICASSP 2019)].
The datasets used to train these models are the VoxCeleb datasets, which can be found at the link below.
To train the model on the VoxCeleb2 dataset, run:
- python src/ --net resnet34s --batch_size 160 --gpu 2,3 --lr 0.001 --optimizer adam --epochs 48 --multiprocess 8 --loss softmax --data_path ../path_to_voxceleb2
- All models are available at the following Google Drive link:
- Download the models and place them in the model/ folder.
To test a specific model on the VoxCeleb1 dataset, e.g., the ResNet34 model trained using Adam with a softmax loss and a feature dimension of 512, run:
- python src/ --gpu 1 --net resnet34s --ghost_cluster 2 --vlad_cluster 8 --loss softmax --ohem 2 --resume ../model/gvlad_softmax/2019-01-11_resnet34_bs142_adam_lr0.001_vlad8_ghost2_bdim512_ohemlevel2/weights-47-0.866.h5
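
Verification results on VoxCeleb1 are commonly reported as an equal error rate (EER) over trial pairs. The sketch below is not taken from this repo's scripts; it only illustrates, under stated assumptions, how a pair of utterance embeddings could be scored with cosine similarity and how an approximate EER could be computed from those scores. The function names `cosine_scores` and `equal_error_rate` are hypothetical, and the inputs are assumed to be NumPy arrays containing at least one same-speaker and one different-speaker pair.

```python
import numpy as np


def cosine_scores(emb_a, emb_b):
    """Row-wise cosine similarity between two (n_pairs, dim) arrays.

    Hypothetical helper: assumes row i of emb_a and row i of emb_b
    form one verification trial.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)


def equal_error_rate(scores, labels):
    """Approximate EER by sweeping thresholds over the observed scores.

    labels: 1/True for same-speaker pairs, 0/False otherwise.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        accept = scores >= t
        far = np.mean(accept[~labels])   # different speakers accepted
        frr = np.mean(~accept[labels])   # same speakers rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

With embeddings extracted for each side of a trial list, `equal_error_rate(cosine_scores(emb_a, emb_b), labels)` would give a single summary number. The threshold sweep is a simple approximation; an interpolated ROC-based EER may differ slightly on small trial lists.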
