Update info about NTS-Net model for CUB

Swapnil2095 · May 28, 2019 · 0d21eeb · 0d21eeb
1 parent 0bf343c
commit 0d21eeb
Show file tree

Hide file tree

Showing 10 changed files with 57 additions and 29 deletions.
diff --git a/README.md b/README.md
@@ -23,11 +23,11 @@ For each supported framework, there is a PIP-package containing pure models with
 
 Currently, models are mostly implemented on Gluon and then ported to other frameworks. Some models are pretrained on
 [ImageNet-1K](http://www.image-net.org), [CIFAR-10/100](https://www.cs.toronto.edu/~kriz/cifar.html),
-[SVHN](http://ufldl.stanford.edu/housenumbers), [Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012),
-[ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K), [Cityscapes](https://www.cityscapes-dataset.com),
-and [COCO](http://cocodataset.org) datasets. All pretrained weights are loaded automatically during use.
-See examples of such automatic loading of weights in the corresponding sections of the documentation dedicated to a
-particular package:
+[SVHN](http://ufldl.stanford.edu/housenumbers), [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html),
+[Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012), [ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K),
+[Cityscapes](https://www.cityscapes-dataset.com), and [COCO](http://cocodataset.org) datasets. All pretrained weights
+are loaded automatically during use. See examples of such automatic loading of weights in the corresponding sections of
+the documentation dedicated to a particular package:
 - [Gluon models](gluon/README.md),
 - [PyTorch models](pytorch/README.md),
 - [Chainer models](chainer_/README.md),
@@ -46,8 +46,8 @@ pip install -r requirements.txt
 
 Some remarks:
 - `Repo` is an author repository, if it exists.
-- `A`, `B`, `C`, and `D` means the implementation of a model for ImageNet-1K, CIFAR-10, CIFAR-100, and SVHN, respectively.
-- `A+`, `B+`, `C+`, and `D+` means having a pre-trained model for corresponding datasets.
+- `A`, `B`, `C`, `D`, and `E` means the implementation of a model for ImageNet-1K, CIFAR-10, CIFAR-100, SVHN, and CUB-200-2011, respectively.
+- `A+`, `B+`, `C+`, `D+`, and `E+` means having a pre-trained model for corresponding datasets.
 
 | Model | [Gluon](gluon/README.md) | [PyTorch](pytorch/README.md) | [Chainer](chainer_/README.md) | [Keras](keras_/README.md) | [TensorFlow](tensorflow_/README.md) | Paper | Repo | Year |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
@@ -129,6 +129,7 @@ Some remarks:
 | Shake-Shake-ResNet | B+C+D+ | B+C+D+ | B+C+D+ | - | - | [link](https://arxiv.org/abs/1705.07485) | [link](https://github.com/xgastaldi/shake-shake) | 2017 |
 | ShakeDrop-ResNet | BCD | BCD | BCD | - | - | [link](https://arxiv.org/abs/1802.02375) | - | 2018 |
 | FractalNet | BC | BC | - | - | - | [link](https://arxiv.org/abs/1605.07648) | [link](https://github.com/gustavla/fractalnet) | 2016 |
+| NTS-Net | E+ | E+ | E+ | - | - | [link](https://arxiv.org/abs/1809.00287) | [link](https://github.com/yangze0930/NTS-Net) | 2018 |
 
 ## Table of implemented segmentation models
 

diff --git a/chainer_/README.md b/chainer_/README.md
@@ -5,11 +5,11 @@
 
 This is a collection of image classification and segmentation models. Many of them are pretrained on
 [ImageNet-1K](http://www.image-net.org), [CIFAR-10/100](https://www.cs.toronto.edu/~kriz/cifar.html),
-[SVHN](http://ufldl.stanford.edu/housenumbers), [Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012),
-[ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K), [Cityscapes](https://www.cityscapes-dataset.com),
-and [COCO](http://cocodataset.org) datasets and loaded automatically during use. All pretrained models
-require the same ordinary normalization. Scripts for training/evaluating/converting models are in the
-[`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
+[SVHN](http://ufldl.stanford.edu/housenumbers), [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html),
+[Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012), [ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K),
+[Cityscapes](https://www.cityscapes-dataset.com), and [COCO](http://cocodataset.org) datasets and loaded automatically
+during use. All pretrained models require the same ordinary normalization. Scripts for training/evaluating/converting
+models are in the [`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
 
 ## List of implemented models
 
@@ -71,6 +71,7 @@ require the same ordinary normalization. Scripts for training/evaluating/convert
 - ResDrop-ResNet (['Deep Networks with Stochastic Depth'](https://arxiv.org/abs/1603.09382))
 - Shake-Shake-ResNet (['Shake-Shake regularization'](https://arxiv.org/abs/1705.07485))
 - ShakeDrop-ResNet (['ShakeDrop Regularization for Deep Residual Learning'](https://arxiv.org/abs/1802.02375))
+- NTS-Net (['Learning to Navigate for Fine-grained Classification'](https://arxiv.org/abs/1809.00287))
 - PSPNet (['Pyramid Scene Parsing Network'](https://arxiv.org/abs/1612.01105))
 - DeepLabv3 (['Rethinking Atrous Convolution for Semantic Image Segmentation'](https://arxiv.org/abs/1706.05587))
 - FCN-8s (['Fully Convolutional Networks for Semantic Segmentation'](https://arxiv.org/abs/1411.4038))
@@ -415,6 +416,12 @@ Some remarks:
 | Shake-Shake-ResNet-20-2x16d | 3.17 | 541,082 | 81.78M | Converted from GL model ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet20_2x16d_svhn-0317-261fd59f.npz.log)) |
 | Shake-Shake-ResNet-26-2x32d | 2.62 | 2,923,162 | 428.89M | Converted from GL model ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet26_2x32d_svhn-0262-844e1f6d.npz.log)) |
 
+### CUB-200-2011
+
+| Model | Error, % | Params | FLOPs/2 | Remarks |
+| --- | ---: | ---: | ---: | --- |
+| NTS-Net | 12.86 | 28,623,333 | 33,361.79M | From [yangze0930/NTS-Net] ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.334/ntsnet_cub-1286-4d759524.npz.log)) |
+
 ### Pascal VOC20102
 
 | Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
@@ -471,4 +478,5 @@ Some remarks:
 [sacmehta/ESPNetv2]: https://github.com/sacmehta/ESPNetv2
 [jhjacobsen/pytorch-i-revnet]: https://github.com/jhjacobsen/pytorch-i-revnet
 [wielandbrendel/bag...models]: https://github.com/wielandbrendel/bag-of-local-features-models
-[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[yangze0930/NTS-Net]: https://github.com/yangze0930/NTS-Net
diff --git a/chainer_/chainercv2/model_provider.py b/chainer_/chainercv2/model_provider.py
@@ -569,7 +569,7 @@
     'octresnet10_ad2': octresnet10_ad2,
     'octresnet50b_ad2': octresnet50b_ad2,
 
-    'ntsnet': ntsnet_cub,
+    'ntsnet_cub': ntsnet_cub,
 }
 
 

diff --git a/chainer_/chainercv2/models/model_store.py b/chainer_/chainercv2/models/model_store.py
@@ -302,6 +302,7 @@
     ('shakeshakeresnet26_2x32d_cifar10', '0317', '5422fce187dff99fa8f4678274a8dd1519e23e27', 'v0.0.217'),
     ('shakeshakeresnet26_2x32d_cifar100', '1880', '750a574e738cf53079b6965410e07fb3abef82fd', 'v0.0.222'),
     ('shakeshakeresnet26_2x32d_svhn', '0262', '844e1f6d067b830087b9456617159a77137138f7', 'v0.0.295'),
+    ('ntsnet_cub', '1286', '4d7595248f0fb042ef06c657d73bd0a2f3fc4f0d', 'v0.0.334'),
     ('pspnet_resnetd101b_voc', '7626', 'f90c0db9892ec6892623a774ba21000f7cc3995f', 'v0.0.297'),
     ('pspnet_resnetd50b_ade20k', '2746', '7b7ce5680fdfab567222ced11a2430cf1a452116', 'v0.0.297'),
     ('pspnet_resnetd101b_ade20k', '3286', 'c5e619c41740751865f662b539abbad5dd9be42b', 'v0.0.297'),

diff --git a/gluon/README.md b/gluon/README.md
@@ -5,11 +5,11 @@
 
 This is a collection of image classification and segmentation models. Many of them are pretrained on
 [ImageNet-1K](http://www.image-net.org), [CIFAR-10/100](https://www.cs.toronto.edu/~kriz/cifar.html),
-[SVHN](http://ufldl.stanford.edu/housenumbers), [Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012),
-[ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K), [Cityscapes](https://www.cityscapes-dataset.com),
-and [COCO](http://cocodataset.org) datasets and loaded automatically during use. All pretrained models
-require the same ordinary normalization. Scripts for training/evaluating/converting models are in the
-[`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
+[SVHN](http://ufldl.stanford.edu/housenumbers), [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html),
+[Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012), [ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K),
+[Cityscapes](https://www.cityscapes-dataset.com), and [COCO](http://cocodataset.org) datasets and loaded automatically
+during use. All pretrained models require the same ordinary normalization. Scripts for training/evaluating/converting
+models are in the [`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
 
 ## List of implemented models
 
@@ -77,9 +77,10 @@ require the same ordinary normalization. Scripts for training/evaluating/convert
 - Shake-Shake-ResNet (['Shake-Shake regularization'](https://arxiv.org/abs/1705.07485))
 - ShakeDrop-ResNet (['ShakeDrop Regularization for Deep Residual Learning'](https://arxiv.org/abs/1802.02375))
 - FractalNet (['FractalNet: Ultra-Deep Neural Networks without Residuals'](https://arxiv.org/abs/1605.07648))
+- NTS-Net (['Learning to Navigate for Fine-grained Classification'](https://arxiv.org/abs/1809.00287))
+- FCN-8s (['Fully Convolutional Networks for Semantic Segmentation'](https://arxiv.org/abs/1411.4038))
 - PSPNet (['Pyramid Scene Parsing Network'](https://arxiv.org/abs/1612.01105))
 - DeepLabv3 (['Rethinking Atrous Convolution for Semantic Image Segmentation'](https://arxiv.org/abs/1706.05587))
-- FCN-8s (['Fully Convolutional Networks for Semantic Segmentation'](https://arxiv.org/abs/1411.4038))
 
 ## Installation
 
@@ -443,6 +444,12 @@ Some remarks:
 | Shake-Shake-ResNet-20-2x16d | 3.17 | 541,082 | 81.78M | Training ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet20_2x16d_svhn-0317-7a48fde5.params.log)) |
 | Shake-Shake-ResNet-26-2x32d | 2.62 | 2,923,162 | 428.89M | Training ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet26_2x32d_svhn-0262-f1dbb8ef.params.log)) |
 
+### CUB-200-2011
+
+| Model | Error, % | Params | FLOPs/2 | Remarks |
+| --- | ---: | ---: | ---: | --- |
+| NTS-Net | 13.26 | 28,623,333 | 33,361.39M | From [yangze0930/NTS-Net] ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.334/ntsnet_cub-1326-75ae8cdc.params.log)) |
+
 ### Pascal VOC20102
 
 | Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
@@ -501,4 +508,5 @@ Some remarks:
 [sacmehta/ESPNetv2]: https://github.com/sacmehta/ESPNetv2
 [jhjacobsen/pytorch-i-revnet]: https://github.com/jhjacobsen/pytorch-i-revnet
 [wielandbrendel/bag...models]: https://github.com/wielandbrendel/bag-of-local-features-models
-[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[yangze0930/NTS-Net]: https://github.com/yangze0930/NTS-Net
diff --git a/gluon/gluoncv2/model_provider.py b/gluon/gluoncv2/model_provider.py
@@ -628,7 +628,7 @@
     'res2net50_w14_s8': res2net50_w14_s8,
     'res2net50_w26_s8': res2net50_w26_s8,
 
-    'ntsnet': ntsnet_cub,
+    'ntsnet_cub': ntsnet_cub,
 }
 
 

diff --git a/gluon/gluoncv2/models/model_store.py b/gluon/gluoncv2/models/model_store.py
@@ -309,6 +309,7 @@
     ('shakeshakeresnet26_2x32d_cifar10', '0317', '21e60e626765001aaaf4eb26f7cb8f4a69ea3dc1', 'v0.0.217'),
     ('shakeshakeresnet26_2x32d_cifar100', '1880', 'bd46a7418374e3b3c844b33e12b09b6a98eb4e6e', 'v0.0.222'),
     ('shakeshakeresnet26_2x32d_svhn', '0262', 'f1dbb8ef162d9ec56478e2579272f85ed78ad896', 'v0.0.295'),
+    ('ntsnet_cub', '1326', '75ae8cdcf4beb1ab60c1a983c9f143baaebbdea0', 'v0.0.334'),
     ('pspnet_resnetd101b_voc', '8144', 'e15319bf5428637e7fc00dcd426dd458ac937b08', 'v0.0.297'),
     ('pspnet_resnetd50b_ade20k', '3687', 'f0dcdf734f8f32a879dec3c4e7fe61d629244030', 'v0.0.297'),
     ('pspnet_resnetd101b_ade20k', '3797', 'c1280aeab8daa31c0893f7551d70130c2b68214a', 'v0.0.297'),

diff --git a/pytorch/README.md b/pytorch/README.md
@@ -5,11 +5,11 @@
 
 This is a collection of image classification and segmentation models. Many of them are pretrained on
 [ImageNet-1K](http://www.image-net.org), [CIFAR-10/100](https://www.cs.toronto.edu/~kriz/cifar.html),
-[SVHN](http://ufldl.stanford.edu/housenumbers), [Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012),
-[ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K), [Cityscapes](https://www.cityscapes-dataset.com),
-and [COCO](http://cocodataset.org) datasets and loaded automatically during use. All pretrained models
-require the same ordinary normalization. Scripts for training/evaluating/converting models are in the
-[`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
+[SVHN](http://ufldl.stanford.edu/housenumbers), [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html),
+[Pascal VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012), [ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K),
+[Cityscapes](https://www.cityscapes-dataset.com), and [COCO](http://cocodataset.org) datasets and loaded automatically
+during use. All pretrained models require the same ordinary normalization. Scripts for training/evaluating/converting
+models are in the [`imgclsmob`](https://github.com/osmr/imgclsmob) repo.
 
 ## List of implemented models
 
@@ -76,6 +76,7 @@ require the same ordinary normalization. Scripts for training/evaluating/convert
 - Shake-Shake-ResNet (['Shake-Shake regularization'](https://arxiv.org/abs/1705.07485))
 - ShakeDrop-ResNet (['ShakeDrop Regularization for Deep Residual Learning'](https://arxiv.org/abs/1802.02375))
 - FractalNet (['FractalNet: Ultra-Deep Neural Networks without Residuals'](https://arxiv.org/abs/1605.07648))
+- NTS-Net (['Learning to Navigate for Fine-grained Classification'](https://arxiv.org/abs/1809.00287))
 - PSPNet (['Pyramid Scene Parsing Network'](https://arxiv.org/abs/1612.01105))
 - DeepLabv3 (['Rethinking Atrous Convolution for Semantic Image Segmentation'](https://arxiv.org/abs/1706.05587))
 - FCN-8s (['Fully Convolutional Networks for Semantic Segmentation'](https://arxiv.org/abs/1411.4038))
@@ -430,6 +431,12 @@ OpenCV `Resize` transformation instead of PIL one quality evaluation results wil
 | Shake-Shake-ResNet-20-2x16d | 3.17 | 541,082 | 81.78M | Converted from GL model ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet20_2x16d_svhn-0317-a693ec24.pth.log)) |
 | Shake-Shake-ResNet-26-2x32d | 2.62 | 2,923,162 | 428.89M | Converted from GL model ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.295/shakeshakeresnet26_2x32d_svhn-0262-c1b8099e.pth.log)) |
 
+### CUB-200-2011
+
+| Model | Error, % | Params | FLOPs/2 | Remarks |
+| --- | ---: | ---: | ---: | --- |
+| NTS-Net | 12.77 | 28,623,333 | 33,361.79M | From [yangze0930/NTS-Net] ([log](https://github.com/osmr/imgclsmob/releases/download/v0.0.334/ntsnet_cub-1277-f6f330ab.pth.log)) |
+
 ### Pascal VOC20102
 
 | Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
@@ -488,4 +495,5 @@ OpenCV `Resize` transformation instead of PIL one quality evaluation results wil
 [sacmehta/ESPNetv2]: https://github.com/sacmehta/ESPNetv2
 [jhjacobsen/pytorch-i-revnet]: https://github.com/jhjacobsen/pytorch-i-revnet
 [wielandbrendel/bag...models]: https://github.com/wielandbrendel/bag-of-local-features-models
-[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[MIT-HAN-LAB/ProxylessNAS]: https://github.com/MIT-HAN-LAB/ProxylessNAS
+[yangze0930/NTS-Net]: https://github.com/yangze0930/NTS-Net
diff --git a/pytorch/pytorchcv/model_provider.py b/pytorch/pytorchcv/model_provider.py
@@ -616,7 +616,7 @@
     'octresnet50b_ad2': octresnet50b_ad2,
 
     'oth_ntsnet': oth_ntsnet,
-    'ntsnet': ntsnet_cub,
+    'ntsnet_cub': ntsnet_cub,
 }
 
 

diff --git a/pytorch/pytorchcv/models/model_store.py b/pytorch/pytorchcv/models/model_store.py
@@ -308,6 +308,7 @@
     ('shakeshakeresnet26_2x32d_cifar10', '0317', 'ecd1f8337cc90b5378b4217fb2591f2ed0f02bdf', 'v0.0.217'),
     ('shakeshakeresnet26_2x32d_cifar100', '1880', 'b47e371f60c9fed9eaac960568783fb6f83a362f', 'v0.0.222'),
     ('shakeshakeresnet26_2x32d_svhn', '0262', 'c1b8099ece97e17ce85213e4ecc6e50a064050cf', 'v0.0.295'),
+    ('ntsnet_cub', '1277', 'f6f330abfabcc2ea17a8d4b8977a6ea322ddf532', 'v0.0.334'),
     ('pspnet_resnetd101b_voc', '8144', 'c22f021948461a7b7ab1ef1265a7948762770c83', 'v0.0.297'),
     ('pspnet_resnetd50b_ade20k', '3687', '13f22137d7dd06c6de2ffc47e6ed33403d3dd2cf', 'v0.0.297'),
     ('pspnet_resnetd101b_ade20k', '3797', '115d62bf66477221b83337208aefe0f2f0266da2', 'v0.0.297'),