Matching Networks for One Shot Learning (NeurIPS'2016)

Abstract

Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
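As a rough illustration of the matching rule described in the abstract, the predicted label distribution for a query is an attention-weighted sum of the support labels, where the attention is a softmax over cosine similarities between the embedded query and each embedded support example. The NumPy sketch below shows only this basic rule (without full context embeddings) and is not the implementation used in this repo; the function name, shapes, and arguments are illustrative assumptions.

```python
import numpy as np

def matching_net_predict(query_emb, support_embs, support_onehot):
    """Attention-based label prediction of a Matching Network (sketch, no FCE).

    query_emb:      (d,) embedded query example
    support_embs:   (n, d) embedded support set
    support_onehot: (n, num_classes) one-hot support labels
    """
    # Cosine similarity between the query and every support embedding.
    q = query_emb / (np.linalg.norm(query_emb) + 1e-8)
    s = support_embs / (np.linalg.norm(support_embs, axis=1, keepdims=True) + 1e-8)
    sims = s @ q                                   # (n,)
    # Softmax attention a(x_hat, x_i) over the support set.
    attention = np.exp(sims - sims.max())
    attention /= attention.sum()
    # Predicted label distribution: attention-weighted sum of support labels.
    return attention @ support_onehot              # (num_classes,)
```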

Citation

@inproceedings{vinyals2016matching,
    title={Matching networks for one shot learning},
    author={Vinyals, Oriol and Blundell, Charles and Lillicrap, Tim and Wierstra, Daan and others},
    booktitle={Advances in Neural Information Processing Systems},
    pages={3630--3638},
    year={2016}
}

How to Reproduce MatchingNet

The reproduction consists of two steps:

  • Step 1: Base training

    • Use all the images of the base classes to train a base model.
    • Conduct meta testing on the validation set to select the best model.
  • Step 2: Meta testing

    • Use the best model from Step 1; by default, it is saved to ${WORK_DIR}/${CONFIG}/best_accuracy_mean.pth.

An example on the CUB dataset with Conv4:

# base training
python ./tools/classification/train.py \
  configs/classification/matching_net/cub/matching-net_conv4_1xb105_cub_5way-1shot.py

# meta testing
python ./tools/classification/test.py \
  configs/classification/matching_net/cub/matching-net_conv4_1xb105_cub_5way-1shot.py \
  work_dir/matching-net_conv4_1xb105_cub_5way-1shot/best_accuracy_mean.pth

Note:

  • All the results are trained with a single GPU.
  • The 1-shot and 5-shot configs use the same training setting but different meta-test settings on the validation and test sets.
  • Currently, we use the model selected by 1-shot validation (100 episodes) to evaluate both the 1-shot and 5-shot settings on the test set.
  • The hyper-parameters in the configs are roughly set and probably not optimal, so feel free to tune them and try different configurations, for example different learning rates or numbers of validation episodes for each setting. We will continue to improve them.
  • The training batch size is calculated as num_support_way * (num_support_shots + num_query_shots); see the worked example after this list.
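For example, a training batch size of 105, as in the 1xb105 config names above, is consistent with a 5-way episode with 1 support shot and 20 query shots per way. The per-episode counts below are illustrative assumptions, not values read from the shipped configs:

```python
# Illustrative check of the batch-size formula; shot/query counts are assumptions.
num_support_way = 5
num_support_shots = 1
num_query_shots = 20
batch_size = num_support_way * (num_support_shots + num_query_shots)
print(batch_size)  # 105, matching the 1xb105 in the config names above
```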

Results on CUB dataset with 2000 episodes

| Arch | Input Size | Batch Size | way | shot | mean Acc | std | ckpt | log |
| :------: | :--------: | :--------: | :-: | :--: | :------: | :--: | :--: | :-: |
| conv4 | 84x84 | 105 | 5 | 1 | 63.65 | 0.5 | ckpt | log |
| conv4 | 84x84 | 105 | 5 | 5 | 76.88 | 0.39 | | |
| resnet12 | 84x84 | 105 | 5 | 1 | 78.33 | 0.45 | ckpt | log |
| resnet12 | 84x84 | 105 | 5 | 5 | 88.98 | 0.26 | | |

Results on Mini-ImageNet dataset with 2000 episodes

| Arch | Input Size | Batch Size | way | shot | mean Acc | std | ckpt | log |
| :------: | :--------: | :--------: | :-: | :--: | :------: | :--: | :--: | :-: |
| conv4 | 84x84 | 105 | 5 | 1 | 53.35 | 0.44 | ckpt | log |
| conv4 | 84x84 | 105 | 5 | 5 | 66.3 | 0.38 | | |
| resnet12 | 84x84 | 105 | 5 | 1 | 59.3 | 0.45 | ckpt | log |
| resnet12 | 84x84 | 105 | 5 | 5 | 72.63 | 0.36 | | |

Results on Tiered-ImageNet dataset with 2000 episodes

| Arch | Input Size | Batch Size | way | shot | mean Acc | std | ckpt | log |
| :------: | :--------: | :--------: | :-: | :--: | :------: | :--: | :--: | :-: |
| conv4 | 84x84 | 105 | 5 | 1 | 48.20 | 0.48 | ckpt | log |
| conv4 | 84x84 | 105 | 5 | 5 | 61.19 | 0.43 | | |
| resnet12 | 84x84 | 105 | 5 | 1 | 58.97 | 0.52 | ckpt | log |
| resnet12 | 84x84 | 105 | 5 | 5 | 72.1 | 0.45 | | |