
ajilim · Feb 9, 2019 · f56875e
commit f56875e
Showing 34 changed files with 2,478,757 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -0,0 +1,58 @@
+# README #
+
+This repo will contain the code for the ICASSP 2019 speaker identification paper.
+
+This repo contains a Keras implementation of the paper,     
+[Utterance-level Aggregation For Speaker Recognition In The Wild (Xie et al., ICASSP 2019)].
+
+
+### Dependencies
+- [Python 2.7.15](https://www.continuum.io/downloads)
+- [Keras 2.2.4](https://keras.io/)
+- [Tensorflow 1.8.0](https://www.tensorflow.org/)
+
+### Data
+The datasets used for the experiments are
+
+- [Voxceleb1, Voxceleb2](http://www.robots.ox.ac.uk/~vgg/data/voxceleb/)
+
+### Training the model
+To train the model on the Voxceleb2 dataset, you can run
+
+- python src/main.py --net resnet34s --batch_size 160 --gpu 2,3 --lr 0.001 --optimizer adam --epochs 48 --multiprocess 8 --loss softmax --data_path ../path_to_voxceleb2
+
+### Model 
+- All models will be kept updated on Google Drive: https://drive.google.com/open?id=1M_SXoW1ceKm3LghItY2ENKKUn3cWYfZm
+- Download the models and put them in the folder model/
+
+### Testing the model
+To test a specific model on the VoxCeleb1 dataset,
+for example, the ResNet34s model trained with Adam and softmax loss, with feature dimension 512, you can run
+
+- python src/predict.py --gpu 1 --net resnet34s --ghost_cluster 2 --vlad_cluster 8 --loss softmax --ohem 2 --resume ../model/gvlad_softmax/2019-01-11_resnet34_bs142_adam_lr0.001_vlad8_ghost2_bdim512_ohemlevel2/weights-47-0.866.h5 
+
+### Citation
+```
+@InProceedings{Xie19,
+  author       = "W. Xie, A. Nagrani, J. S. Chung, A. Zisserman ",
+  title        = "Utterance-level Aggregation For Speaker Recognition In The Wild.",
+  booktitle    = "ICASSP, 2019",
+  year         = "2019",
+}
+
+@InProceedings{Chung18,
+  author       = "J. S. Chung*, A. Nagrani*, A. Zisserman ",
+  title        = "VoxCeleb2: Deep Speaker Recognition.",
+  booktitle    = "INTERSPEECH, 2018",
+  year         = "2018",
+}
+
+@InProceedings{Nagrani17,
+  author       = "A. Nagrani*, J. S. Chung*, A. Zisserman ",
+  title        = "VoxCeleb: A Large-scale Speaker Identification Dataset.",
+  booktitle    = "INTERSPEECH, 2017",
+  year         = "2017",
+}
+```
+
+
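The checkpoint path in the README's test command appears to encode the training configuration (batch size, optimizer, learning rate, VLAD and ghost cluster counts, bottleneck dimension, OHEM level), plus an epoch number and a score in the weights file name. The sketch below, assuming that naming pattern also holds for other checkpoints, recovers those fields so that matching `--vlad_cluster`, `--ghost_cluster`, and `--ohem` values can be passed to `src/predict.py`. The pattern is inferred from the single path shown in the README and is not checked against the code in `src/`. `parse_checkpoint_path` and `predict_flags` are hypothetical helper names, not part of the repo, and the README does not say what the score in the file name measures.

```python
import posixpath
import re

# ASSUMPTION: run-directory layout inferred from the one example path in the README.
RUN_DIR_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<net>[A-Za-z0-9]+)"
    r"_bs(?P<batch_size>\d+)"
    r"_(?P<optimizer>[a-z]+)"
    r"_lr(?P<lr>\d+(?:\.\d+)?)"
    r"_vlad(?P<vlad_cluster>\d+)"
    r"_ghost(?P<ghost_cluster>\d+)"
    r"_bdim(?P<bottleneck_dim>\d+)"
    r"_ohemlevel(?P<ohem_level>\d+)$"
)

# ASSUMPTION: weights files are named weights-<epoch>-<score>.h5.
WEIGHTS_PATTERN = re.compile(r"^weights-(?P<epoch>\d+)-(?P<score>\d+\.\d+)\.h5$")

INT_FIELDS = ("batch_size", "vlad_cluster", "ghost_cluster",
              "bottleneck_dim", "ohem_level", "epoch")
FLOAT_FIELDS = ("lr", "score")


def parse_checkpoint_path(path):
    """Split a checkpoint path into the settings encoded in its names."""
    path = path.strip()
    run_dir = posixpath.basename(posixpath.dirname(path))
    weights_file = posixpath.basename(path)

    run = RUN_DIR_PATTERN.match(run_dir)
    weights = WEIGHTS_PATTERN.match(weights_file)
    if run is None or weights is None:
        raise ValueError(
            "path does not follow the assumed naming pattern: {}".format(path))

    info = run.groupdict()
    info.update(weights.groupdict())
    # Convert numeric fields from strings to numbers.
    for key in INT_FIELDS:
        info[key] = int(info[key])
    for key in FLOAT_FIELDS:
        info[key] = float(info[key])
    return info


def predict_flags(info):
    """Build the predict.py flags that mirror the parsed checkpoint settings."""
    return [
        "--vlad_cluster", str(info["vlad_cluster"]),
        "--ghost_cluster", str(info["ghost_cluster"]),
        "--ohem", str(info["ohem_level"]),
    ]


if __name__ == "__main__":
    example = ("../model/gvlad_softmax/"
               "2019-01-11_resnet34_bs142_adam_lr0.001_vlad8_ghost2_bdim512_ohemlevel2/"
               "weights-47-0.866.h5")
    parsed = parse_checkpoint_path(example)
    print(parsed)
    print(" ".join(predict_flags(parsed)))
```

The helper deliberately leaves `--net` alone: the example directory name says `resnet34` while the command passes `resnet34s`, so the network flag still has to be chosen by hand.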
diff --git a/meta/.DS_Store b/meta/.DS_Store
diff --git a/meta/voxceleb1_veri_test.txt b/meta/voxceleb1_veri_test.txt
diff --git a/meta/voxceleb1_veri_test_extended.txt b/meta/voxceleb1_veri_test_extended.txt