Triplet similarity embedding for face verification
Swami Sankaranarayanan, Azadeh Alavi, Rama Chellappa
arXiv
None
The training time of a previous method (FaceNet, reportedly around 1000 hours) is too long. FaceNet uses a triplet distance loss function and was trained on a large private dataset. This work aims to be more general.
- propose a deep network architecture and a training scheme that ensures **faster** training time.
- **formulate** a triplet similarity embedding learning method
- performance is evaluated on the IJB-A dataset
- Mirroring the AlexNet architecture (fewer parameters in the fc layers, and PReLU instead of ReLU)
- Using AlexNet weights to initialize the network (see thought 2)
- Using the AlexNet-like network for feature extraction.
- Using the extracted features (512 dims) as inputs to learn the triplet similarity embedding (128 dims).
- Updating a matrix $W$ that transforms the intermediate 512-dim feature vector into 128 dims (see the sketch after this list).
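A minimal NumPy sketch of what such a similarity-based triplet update on $W$ could look like, working on pre-extracted 512-dim features; the hinge margin, learning rate, and plain-SGD update are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def tse_sgd_step(W, anchor, positive, negative, alpha=0.1, lr=0.01):
    """One SGD step for a triplet-similarity-style objective.

    W        : (128, 512) projection matrix being learned
    anchor, positive, negative : (512,) CNN feature vectors
    alpha    : margin (illustrative value, not from the paper)
    lr       : learning rate (illustrative value)
    """
    a, p, n = W @ anchor, W @ positive, W @ negative
    # Hinge loss on similarities: want sim(a, p) to exceed sim(a, n) by alpha
    loss = max(0.0, alpha + a @ n - a @ p)
    if loss > 0:
        # d/dW of (Wx)^T (Wy) = W (x y^T + y x^T)
        grad = W @ (np.outer(anchor, negative) + np.outer(negative, anchor)
                    - np.outer(anchor, positive) - np.outer(positive, anchor))
        W = W - lr * grad
    return W, loss

# Usage: project features to 128 dims, then score pairs by inner product.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(128, 512))
a, p, n = rng.normal(size=(3, 512))
W, loss = tse_sgd_step(W, a, p, n)
```

Verifying a pair then reduces to an inner product (or a distance) between the two 128-dim projections.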
The authors call the method used in FaceNet TDE (Triplet Distance Embedding) and their own method TSE (Triplet Similarity Embedding). In their experiments, TSE + L2 performs better than L2 alone and than TDE + L2.
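A hedged sketch of how the two objectives differ over a triplet $(a, p, n)$ of CNN features; the hinge form and margin $\alpha$ are assumptions for illustration, not necessarily the paper's exact notation:

$$\mathcal{L}_{\mathrm{TDE}} = \sum_{(a,p,n)} \max\left(0,\ \|Wa - Wp\|_2^2 - \|Wa - Wn\|_2^2 + \alpha\right)$$

$$\mathcal{L}_{\mathrm{TSE}} = \sum_{(a,p,n)} \max\left(0,\ \alpha + (Wa)^\top Wn - (Wa)^\top Wp\right)$$

TDE compares Euclidean distances of the embedded features, while TSE compares inner-product similarities, which is what the two names refer to.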
- Comparing training time with FaceNet is unfair, since FaceNet was trained on a much larger private dataset.
- Using VGG/ResNet/DenseNet/Inception instead of AlexNet?
- Pay attention to the distinction between “distance” (TDE) and “similarity” (TSE); see the identity below.
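One way to make the last point concrete: for L2-normalized embeddings, squared Euclidean distance is a monotone function of the inner-product similarity, so the two views differ mainly in the training objective rather than in how a pair is ultimately scored:

$$\|Wa - Wb\|_2^2 = \|Wa\|_2^2 + \|Wb\|_2^2 - 2\,(Wa)^\top Wb$$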