This is a collection of image classification and segmentation models. Many of them are pretrained on the ImageNet-1K, CIFAR-10/100, SVHN, Pascal VOC2012, ADE20K, Cityscapes, and COCO datasets, and the pretrained weights are loaded automatically when a model is used. All pretrained models require the same ordinary normalization. Scripts for training, evaluating, and converting the models are in the imgclsmob repo.
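As a minimal sketch of what "ordinary normalization" typically means for ImageNet-pretrained models: pixel values are scaled to [0, 1], then a per-channel mean is subtracted and the result is divided by a per-channel standard deviation. The mean and standard deviation values below are the commonly used ImageNet statistics and are an assumption here, as are the helper names; the imgclsmob scripts remain the reference for the exact preprocessing.

```python
# Commonly used ImageNet channel statistics (assumption, not taken from this README).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Normalize one 8-bit RGB pixel: scale to [0, 1], then standardize per channel."""
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, mean, std))


def normalize_image(image_hwc):
    """Normalize an image given as rows of (R, G, B) pixels and return it channels-first.

    The result is indexed as [channel][row][column].
    """
    normalized = [[normalize_pixel(px) for px in row] for row in image_hwc]
    return [[[px[c] for px in row] for row in normalized] for c in range(3)]


if __name__ == "__main__":
    # A hypothetical 2x2 image, used only to show the shapes involved.
    tiny = [[(0, 0, 0), (255, 255, 255)],
            [(124, 116, 104), (64, 128, 192)]]
    chw = normalize_image(tiny)
    print(len(chw), len(chw[0]), len(chw[0][0]))  # channels, height, width
```

In practice the same arithmetic would be applied with array operations rather than nested lists; the pure-Python form is used here only to keep the steps visible.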
- AlexNet ('One weird trick for parallelizing convolutional neural networks')
- ZFNet ('Visualizing and Understanding Convolutional Networks')
- VGG/BN-VGG ('Very Deep Convolutional Networks for Large-Scale Image Recognition')
- BN-Inception ('Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift')
- ResNet ('Deep Residual Learning for Image Recognition')
- PreResNet ('Identity Mappings in Deep Residual Networks')
- ResNeXt ('Aggregated Residual Transformations for Deep Neural Networks')
- SENet/SE-ResNet/SE-PreResNet/SE-ResNeXt ('Squeeze-and-Excitation Networks')
- IBN-ResNet/IBN-ResNeXt/IBN-DenseNet ('Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net')
- AirNet/AirNeXt ('Attention Inspiring Receptive-Fields Network for Learning Invariant Representations')
- BAM-ResNet ('BAM: Bottleneck Attention Module')
- CBAM-ResNet ('CBAM: Convolutional Block Attention Module')
- ResAttNet ('Residual Attention Network for Image Classification')
- SKNet ('Selective Kernel Networks')
- PyramidNet ('Deep Pyramidal Residual Networks')
- DiracNetV2 ('DiracNets: Training Very Deep Neural Networks Without Skip-Connections')
- ShaResNet ('ShaResNet: reducing residual network parameter number by sharing weights')
- CRU-Net ('Sharing Residual Units Through Collective Tensor Factorization To Improve Deep Neural Networks')
- DenseNet ('Densely Connected Convolutional Networks')
- CondenseNet ('CondenseNet: An Efficient DenseNet using Learned Group Convolutions')
- SparseNet ('Sparsely Aggregated Convolutional Networks')
- PeleeNet ('Pelee: A Real-Time Object Detection System on Mobile Devices')
- Oct-ResNet ('Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution')
- Res2Net ('Res2Net: A New Multi-scale Backbone Architecture')
- WRN ('Wide Residual Networks')
- WRN-1bit ('Training wide residual networks for deployment using a single bit for each weight')
- DRN-C/DRN-D ('Dilated Residual Networks')
- DPN ('Dual Path Networks')
- DarkNet Ref/Tiny/19 ('Darknet: Open source neural networks in c')
- DarkNet-53 ('YOLOv3: An Incremental Improvement')
- ChannelNet ('ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions')
- iSQRT-COV-ResNet ('Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization')
- i-RevNet ('i-RevNet: Deep Invertible Networks')
- BagNet ('Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet')
- DLA ('Deep Layer Aggregation')
- MSDNet ('Multi-Scale Dense Networks for Resource Efficient Image Classification')
- FishNet ('FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction')
- ESPNetv2 ('ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network')
- X-DenseNet ('Deep Expander Networks: Efficient Deep Networks from Graph Theory')
- SqueezeNet/SqueezeResNet ('SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size')
- SqueezeNext ('SqueezeNext: Hardware-Aware Neural Network Design')
- ShuffleNet ('ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices')
- ShuffleNetV2 ('ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design')
- MENet ('Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications')
- MobileNet ('MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications')
- FD-MobileNet ('FD-MobileNet: Improved MobileNet with A Fast Downsampling Strategy')
- MobileNetV2 ('MobileNetV2: Inverted Residuals and Linear Bottlenecks')
- IGCV3 ('IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks')
- MnasNet ('MnasNet: Platform-Aware Neural Architecture Search for Mobile')
- DARTS ('DARTS: Differentiable Architecture Search')
- ProxylessNAS ('ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware')
- Xception ('Xception: Deep Learning with Depthwise Separable Convolutions')
- InceptionV3 ('Rethinking the Inception Architecture for Computer Vision')
- InceptionV4/InceptionResNetV2 ('Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning')
- PolyNet ('PolyNet: A Pursuit of Structural Diversity in Very Deep Networks')
- NASNet ('Learning Transferable Architectures for Scalable Image Recognition')
- PNASNet ('Progressive Neural Architecture Search')
- NIN ('Network In Network')
- RoR-3 ('Residual Networks of Residual Networks: Multilevel Residual Networks')
- RiR ('Resnet in Resnet: Generalizing Residual Architectures')
- ResDrop-ResNet ('Deep Networks with Stochastic Depth')
- Shake-Shake-ResNet ('Shake-Shake regularization')
- ShakeDrop-ResNet ('ShakeDrop Regularization for Deep Residual Learning')
- FractalNet ('FractalNet: Ultra-Deep Neural Networks without Residuals')
- PSPNet ('Pyramid Scene Parsing Network')
- DeepLabv3 ('Rethinking Atrous Convolution for Semantic Image Segmentation')
- FCN-8s ('Fully Convolutional Networks for Semantic Segmentation')
To use the models in your project, install the gluoncv2 package together with mxnet:
pip install gluoncv2 "mxnet>=1.2.1"
To use other hardware such as GPUs, install the matching MXNet variant instead. For example, to install MXNet built with CUDA 9.2 support:
pip install gluoncv2 "mxnet-cu92>=1.2.1"
Example of using a pretrained ResNet-18 model:
from gluoncv2.model_provider import get_model as glcv2_get_model
import mxnet as mx
net = glcv2_get_model("resnet18", pretrained=True)
x = mx.nd.zeros((1, 3, 224, 224), ctx=mx.cpu())
y = net(x)
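The output `y` in the example above holds one row of raw class scores per input image. As a rough sketch of how such a row might be turned into ranked predictions, the hypothetical helpers below apply a softmax and pick the k highest-scoring class indices. They operate on a plain Python list; how the scores are extracted from `y` (for example with `y[0].asnumpy().tolist()`) is an assumption and is not part of the original example.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=5):
    """Return (class_index, probability) pairs for the k highest-scoring classes."""
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Toy scores standing in for one row of the network output.
    example_scores = [0.1, 2.0, -1.0, 0.5]
    for index, prob in top_k(example_scores, k=2):
        print(f"class {index}: {prob:.3f}")
```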
Some remarks:
- Top1/Top5 are the standard 1-crop Top-1/Top-5 errors (in percent) on the validation subset of the ImageNet-1K dataset.
- FLOPs/2 is the number of FLOPs divided by two to be similar to the number of MACs.
- ResNet/PreResNet with the b-suffix is a version of the network with the stride in the second convolution of the bottleneck block. Correspondingly, a network without the b-suffix has the stride in the first convolution.
- ResNet/PreResNet models do not use biases in convolutions at all.
- CondenseNet models are only so-called converted versions.
- ShuffleNetV2 and ShuffleNetV2b are different implementations of the same architecture.
- ResNet(D) is a dilated ResNet intended for use as a feature extractor in some segmentation networks.
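To make the Top1/Top5 columns concrete, here is a small sketch of how a top-k error is usually computed: a sample counts as correct if its true label is among the k highest-scoring classes, and the error is the percentage of samples for which it is not. The function name and the toy data are illustrative only; this is not the evaluation code from imgclsmob.

```python
def top_k_error(score_rows, labels, k=1):
    """Return the top-k error in percent.

    score_rows: one list of class scores per sample.
    labels: the true class index of each sample.
    """
    if not labels or len(score_rows) != len(labels):
        raise ValueError("score_rows and labels must be non-empty and equally long")
    misses = 0
    for scores, label in zip(score_rows, labels):
        # Indices of the k classes with the highest scores.
        best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if label not in best:
            misses += 1
    return 100.0 * misses / len(labels)


if __name__ == "__main__":
    scores = [
        [0.9, 0.05, 0.05],  # predicted class 0
        [0.2, 0.3, 0.5],    # predicted class 2, class 1 is second
        [0.1, 0.6, 0.3],    # predicted class 1
        [0.4, 0.35, 0.25],  # predicted class 0, class 1 is second
    ]
    labels = [0, 1, 1, 2]
    print(top_k_error(scores, labels, k=1))  # 2 of 4 samples missed
    print(top_k_error(scores, labels, k=2))  # 1 of 4 samples missed
```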
Model | Top1 | Top5 | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|
AlexNet | 44.12 | 21.26 | 61,100,840 | 714.83M | From dmlc/gluon-cv (log) |
VGG-11 | 31.91 | 11.76 | 132,863,336 | 7,615.87M | From dmlc/gluon-cv (log) |
VGG-13 | 31.06 | 11.12 | 133,047,848 | 11,317.65M | From dmlc/gluon-cv (log) |
VGG-16 | 26.78 | 8.69 | 138,357,544 | 15,480.10M | From dmlc/gluon-cv (log) |
VGG-19 | 25.88 | 8.23 | 143,667,240 | 19,642.55M | From dmlc/gluon-cv (log) |
BN-VGG-11b | 30.34 | 10.57 | 132,868,840 | 7,630.72M | From dmlc/gluon-cv (log) |
BN-VGG-13b | 29.48 | 10.16 | 133,053,736 | 11,342.14M | From dmlc/gluon-cv (log) |
BN-VGG-16b | 26.89 | 8.65 | 138,365,992 | 15,507.20M | From dmlc/gluon-cv (log) |
BN-VGG-19b | 25.66 | 8.15 | 143,678,248 | 19,672.26M | From dmlc/gluon-cv (log) |
BN-Inception | 25.09 | 7.76 | 11,295,240 | 2,048.06M | From Cadene/pretrained...pytorch (log) |
ResNet-10 | 34.61 | 13.85 | 5,418,792 | 894.04M | Training (log) |
ResNet-12 | 33.42 | 13.03 | 5,492,776 | 1,126.25M | Training (log) |
ResNet-14 | 32.18 | 12.20 | 5,788,200 | 1,357.94M | Training (log) |
ResNet-BC-14b | 30.26 | 11.16 | 10,064,936 | 1,479.12M | Training (log) |
ResNet-16 | 30.24 | 10.88 | 6,968,872 | 1,589.34M | Training (log) |
ResNet-18 x0.25 | 39.31 | 17.40 | 3,937,400 | 270.94M | Training (log) |
ResNet-18 x0.5 | 33.41 | 12.84 | 5,804,296 | 608.70M | Training (log) |
ResNet-18 x0.75 | 30.00 | 10.66 | 8,476,056 | 1,129.45M | Training (log) |
ResNet-18 | 28.09 | 9.51 | 11,689,512 | 1,820.41M | Training (log) |
ResNet-26 | 26.14 | 8.37 | 17,960,232 | 2,746.79M | Training (log) |
ResNet-BC-26b | 24.86 | 7.58 | 15,995,176 | 2,356.67M | Training (log) |
ResNet-34 | 24.53 | 7.43 | 21,797,672 | 3,672.68M | Training (log) |
ResNet-BC-38b | 23.50 | 6.72 | 21,925,416 | 3,234.21M | Training (log) |
ResNet-50 | 22.15 | 6.04 | 25,557,032 | 3,877.95M | Training (log) |
ResNet-50b | 22.06 | 6.11 | 25,557,032 | 4,110.48M | Training (log) |
ResNet-101 | 21.66 | 5.99 | 44,549,160 | 7,597.95M | From dmlc/gluon-cv (log) |
ResNet-101b | 20.79 | 5.39 | 44,549,160 | 7,830.48M | From dmlc/gluon-cv (log) |
ResNet-152 | 20.76 | 5.35 | 60,192,808 | 11,321.85M | From dmlc/gluon-cv (log) |
ResNet-152b | 20.31 | 5.25 | 60,192,808 | 11,554.38M | From dmlc/gluon-cv (log) |
PreResNet-10 | 34.65 | 14.01 | 5,417,128 | 894.19M | Training (log) |
PreResNet-12 | 33.57 | 13.21 | 5,491,112 | 1,126.40M | Training (log) |
PreResNet-14 | 32.29 | 12.18 | 5,786,536 | 1,358.09M | Training (log) |
PreResNet-BC-14b | 30.67 | 11.51 | 10,057,384 | 1,476.62M | Training (log) |
PreResNet-16 | 30.21 | 10.81 | 6,967,208 | 1,589.49M | Training (log) |
PreResNet-18 x0.25 | 39.62 | 17.78 | 3,935,960 | 270.93M | Training (log) |
PreResNet-18 x0.5 | 33.67 | 13.19 | 5,802,440 | 608.73M | Training (log) |
PreResNet-18 x0.75 | 29.96 | 10.68 | 8,473,784 | 1,129.51M | Training (log) |
PreResNet-18 | 28.16 | 9.51 | 11,687,848 | 1,820.56M | Training (log) |
PreResNet-26 | 26.03 | 8.34 | 17,958,568 | 2,746.94M | Training (log) |
PreResNet-BC-26b | 25.21 | 7.86 | 15,987,624 | 2,354.16M | Training (log) |
PreResNet-34 | 24.55 | 7.51 | 21,796,008 | 3,672.83M | Training (log) |
PreResNet-50 | 22.27 | 6.20 | 25,549,480 | 3,875.44M | Training (log) |
PreResNet-50b | 22.36 | 6.32 | 25,549,480 | 4,107.97M | Training (log) |
PreResNet-101 | 21.45 | 5.75 | 44,541,608 | 7,595.44M | From dmlc/gluon-cv (log) |
PreResNet-101b | 21.73 | 5.88 | 44,541,608 | 7,827.97M | From dmlc/gluon-cv (log) |
PreResNet-152 | 20.70 | 5.32 | 60,185,256 | 11,319.34M | From dmlc/gluon-cv (log) |
PreResNet-152b | 21.00 | 5.75 | 60,185,256 | 11,551.87M | From dmlc/gluon-cv (log) |
PreResNet-200b | 21.10 | 5.64 | 64,666,280 | 15,068.63M | From tornadomeet/ResNet (log) |
PreResNet-269b | 20.71 | 5.56 | 102,065,832 | 20,101.11M | From soeaver/mxnet-model (log) |
ResNeXt-14 (32x4d) | 29.95 | 11.10 | 9,411,880 | 1,603.46M | Training (log) |
ResNeXt-26 (32x4d) | 23.93 | 7.21 | 15,389,480 | 2,488.07M | Training (log) |
ResNeXt-101 (32x4d) | 21.32 | 5.79 | 44,177,704 | 8,003.45M | From Cadene/pretrained...pytorch (log) |
ResNeXt-101 (64x4d) | 20.60 | 5.41 | 83,455,272 | 15,500.27M | From Cadene/pretrained...pytorch (log) |
SE-ResNet-50 | 22.51 | 6.44 | 28,088,024 | 3,880.49M | From Cadene/pretrained...pytorch (log) |
SE-ResNet-101 | 21.92 | 5.89 | 49,326,872 | 7,602.76M | From Cadene/pretrained...pytorch (log) |
SE-ResNet-152 | 21.48 | 5.77 | 66,821,848 | 11,328.52M | From Cadene/pretrained...pytorch (log) |
SE-ResNeXt-50 (32x4d) | 21.06 | 5.58 | 27,559,896 | 4,258.40M | From Cadene/pretrained...pytorch (log) |
SE-ResNeXt-101 (32x4d) | 19.99 | 5.00 | 48,955,416 | 8,008.26M | From Cadene/pretrained...pytorch (log) |
SENet-154 | 18.84 | 4.65 | 115,088,984 | 20,745.78M | From Cadene/pretrained...pytorch (log) |
IBN-ResNet-50 | 23.56 | 6.68 | 25,557,032 | 4,110.48M | From XingangPan/IBN-Net (log) |
IBN-ResNet-101 | 21.89 | 5.87 | 44,549,160 | 7,830.48M | From XingangPan/IBN-Net (log) |
IBN(b)-ResNet-50 | 23.91 | 6.97 | 25,558,568 | 4,112.89M | From XingangPan/IBN-Net (log) |
IBN-ResNeXt-101 (32x4d) | 21.43 | 5.62 | 44,177,704 | 8,003.45M | From XingangPan/IBN-Net (log) |
IBN-DenseNet-121 | 24.98 | 7.47 | 7,978,856 | 2,872.13M | From XingangPan/IBN-Net (log) |
IBN-DenseNet-169 | 23.78 | 6.82 | 14,149,480 | 3,403.89M | From XingangPan/IBN-Net (log) |
AirNet50-1x64d (r=2) | 22.48 | 6.21 | 27,425,864 | 4,772.11M | From soeaver/AirNet-PyTorch (log) |
AirNet50-1x64d (r=16) | 22.91 | 6.46 | 25,714,952 | 4,399.97M | From soeaver/AirNet-PyTorch (log) |
AirNeXt50-32x4d (r=2) | 21.51 | 5.75 | 27,604,296 | 5,339.58M | From soeaver/AirNet-PyTorch (log) |
BAM-ResNet-50 | 23.68 | 6.96 | 25,915,099 | 4,196.09M | From Jongchan/attention-module (log) |
CBAM-ResNet-50 | 23.02 | 6.38 | 28,089,624 | 4,116.97M | From Jongchan/attention-module (log) |
PyramidNet-101 (a=360) | 22.72 | 6.52 | 42,455,070 | 8,743.54M | From dyhan0920/Pyramid...PyTorch (log) |
DiracNetV2-18 | 30.61 | 11.17 | 11,511,784 | 1,796.62M | From szagoruyko/diracnets (log) |
DiracNetV2-34 | 27.93 | 9.46 | 21,616,232 | 3,646.93M | From szagoruyko/diracnets (log) |
CRU-Net-56 | 25.72 | 8.25 | 25,609,384 | 5,660.66M | From cypw/CRU-Net (log) |
DenseNet-121 | 23.25 | 6.85 | 7,978,856 | 2,872.13M | Training (log) |
DenseNet-161 | 22.40 | 6.18 | 28,681,000 | 7,793.16M | From dmlc/gluon-cv (log) |
DenseNet-169 | 23.89 | 6.89 | 14,149,480 | 3,403.89M | From dmlc/gluon-cv (log) |
DenseNet-201 | 22.71 | 6.36 | 20,013,928 | 4,347.15M | From dmlc/gluon-cv (log) |
CondenseNet-74 (C=G=4) | 26.82 | 8.64 | 4,773,944 | 546.06M | From ShichenLiu/CondenseNet (log) |
CondenseNet-74 (C=G=8) | 29.76 | 10.49 | 2,935,416 | 291.52M | From ShichenLiu/CondenseNet (log) |
PeleeNet | 31.71 | 11.25 | 2,802,248 | 514.87M | Training (log) |
WRN-50-2 | 22.15 | 6.12 | 68,849,128 | 11,405.42M | From szagoruyko/functional-zoo (log) |
DRN-C-26 | 25.68 | 7.89 | 21,126,584 | 16,993.90M | From fyu/drn (log) |
DRN-C-42 | 23.80 | 6.92 | 31,234,744 | 25,093.75M | From fyu/drn (log) |
DRN-C-58 | 22.35 | 6.27 | 40,542,008 | 32,489.94M | From fyu/drn (log) |
DRN-D-22 | 26.67 | 8.52 | 16,393,752 | 13,051.33M | From fyu/drn (log) |
DRN-D-38 | 24.51 | 7.36 | 26,501,912 | 21,151.19M | From fyu/drn (log) |
DRN-D-54 | 22.05 | 6.27 | 35,809,176 | 28,547.38M | From fyu/drn (log) |
DRN-D-105 | 21.31 | 5.81 | 54,801,304 | 43,442.43M | From fyu/drn (log) |
DPN-68 | 22.87 | 6.58 | 12,611,602 | 2,351.84M | Training (log) |
DPN-98 | 20.23 | 5.28 | 61,570,728 | 11,716.51M | From Cadene/pretrained...pytorch (log) |
DPN-131 | 20.03 | 5.22 | 79,254,504 | 16,076.15M | From Cadene/pretrained...pytorch (log) |
DarkNet Tiny | 40.31 | 17.46 | 1,042,104 | 500.85M | Training (log) |
DarkNet Ref | 38.00 | 16.68 | 7,319,416 | 367.59M | Training (log) |
DarkNet-53 | 21.44 | 5.56 | 41,609,928 | 7,133.86M | From dmlc/gluon-cv (log) |
i-RevNet-301 | 26.97 | 8.97 | 125,120,356 | 14,453.87M | From jhjacobsen/pytorch-i-revnet (log) |
BagNet-9 | 59.57 | 35.44 | 15,688,744 | 16,049.19M | From wielandbrendel/bag...models (log) |
BagNet-17 | 44.76 | 21.52 | 16,213,032 | 15,768.77M | From wielandbrendel/bag...models (log) |
BagNet-33 | 36.43 | 14.95 | 18,310,184 | 16,371.52M | From wielandbrendel/bag...models (log) |
DLA-34 | 26.14 | 8.21 | 15,742,104 | 3,071.37M | From ucbdrive/dla (log) |
DLA-46-C | 33.84 | 12.86 | 1,301,400 | 585.45M | Training (log) |
DLA-X-46-C | 32.96 | 12.25 | 1,068,440 | 546.72M | Training (log) |
DLA-60 | 23.84 | 7.08 | 22,036,632 | 4,255.49M | From ucbdrive/dla (log) |
DLA-X-60 | 22.48 | 6.21 | 17,352,344 | 3,543.68M | From ucbdrive/dla (log) |
DLA-X-60-C | 30.67 | 10.74 | 1,319,832 | 596.06M | Training (log) |
DLA-102 | 22.87 | 6.44 | 33,268,888 | 7,190.95M | From ucbdrive/dla (log) |
DLA-X-102 | 21.97 | 6.02 | 26,309,272 | 5,884.94M | From ucbdrive/dla (log) |
DLA-X2-102 | 21.12 | 5.53 | 41,282,200 | 9,340.61M | From ucbdrive/dla (log) |
DLA-169 | 21.95 | 5.87 | 53,389,720 | 11,593.20M | From ucbdrive/dla (log) |
FishNet-150 | 22.85 | 6.38 | 24,959,400 | 6,435.02M | From kevin-ssy/FishNet (log) |
ESPNetv2 x0.5 | 43.61 | 21.07 | 1,241,332 | 35.36M | From sacmehta/ESPNetv2 (log) |
ESPNetv2 x1.0 | 35.33 | 14.27 | 1,670,072 | 98.09M | From sacmehta/ESPNetv2 (log) |
ESPNetv2 x1.25 | 33.14 | 12.73 | 1,965,440 | 138.18M | From sacmehta/ESPNetv2 (log) |
ESPNetv2 x1.5 | 32.04 | 11.94 | 2,314,856 | 185.77M | From sacmehta/ESPNetv2 (log) |
ESPNetv2 x2.0 | 28.91 | 9.94 | 3,498,136 | 306.93M | From sacmehta/ESPNetv2 (log) |
SqueezeNet v1.0 | 38.73 | 17.34 | 1,248,424 | 823.67M | Training (log) |
SqueezeNet v1.1 | 39.09 | 17.39 | 1,235,496 | 352.02M | Training (log) |
SqueezeResNet v1.0 | 39.32 | 17.67 | 1,248,424 | 823.67M | Training (log) |
SqueezeResNet v1.1 | 39.83 | 17.84 | 1,235,496 | 352.02M | Training (log) |
1.0-SqNxt-23 | 42.25 | 18.66 | 724,056 | 287.28M | Training (log) |
1.0-SqNxt-23v5 | 40.43 | 17.43 | 921,816 | 285.82M | Training (log) |
1.5-SqNxt-23 | 34.46 | 13.21 | 1,511,824 | 552.39M | Training (log) |
1.5-SqNxt-23v5 | 33.48 | 12.68 | 1,953,616 | 550.97M | Training (log) |
2.0-SqNxt-23 | 30.24 | 10.63 | 2,583,752 | 898.48M | Training (log) |
2.0-SqNxt-23v5 | 29.27 | 10.24 | 3,366,344 | 897.60M | Training (log) |
ShuffleNet x0.25 (g=1) | 62.00 | 36.77 | 209,746 | 12.35M | Training (log) |
ShuffleNet x0.25 (g=3) | 61.34 | 36.17 | 305,902 | 13.09M | Training (log) |
ShuffleNet x0.5 (g=1) | 46.22 | 22.38 | 534,484 | 41.16M | Training (log) |
ShuffleNet x0.5 (g=3) | 43.83 | 20.60 | 718,324 | 41.70M | Training (log) |
ShuffleNet x0.75 (g=1) | 39.25 | 16.75 | 975,214 | 86.42M | Training (log) |
ShuffleNet x0.75 (g=3) | 37.81 | 16.09 | 1,238,266 | 85.82M | Training (log) |
ShuffleNet x1.0 (g=1) | 34.41 | 13.50 | 1,531,936 | 148.13M | Training (log) |
ShuffleNet x1.0 (g=2) | 33.98 | 13.32 | 1,733,848 | 147.60M | Training (log) |
ShuffleNet x1.0 (g=3) | 33.96 | 13.29 | 1,865,728 | 145.46M | Training (log) |
ShuffleNet x1.0 (g=4) | 33.84 | 13.10 | 1,968,344 | 143.33M | Training (log) |
ShuffleNet x1.0 (g=8) | 33.65 | 13.19 | 2,434,768 | 150.76M | Training (log) |
ShuffleNetV2 x0.5 | 40.61 | 18.30 | 1,366,792 | 43.31M | Training (log) |
ShuffleNetV2 x1.0 | 30.94 | 11.23 | 2,278,604 | 149.72M | Training (log) |
ShuffleNetV2 x1.5 | 27.17 | 9.13 | 4,406,098 | 320.77M | Training (log) |
ShuffleNetV2 x2.0 | 25.80 | 8.23 | 7,601,686 | 595.84M | Training (log) |
ShuffleNetV2b x0.5 | 39.81 | 17.82 | 1,366,792 | 43.31M | Training (log) |
ShuffleNetV2b x1.0 | 30.39 | 11.01 | 2,279,760 | 150.62M | Training (log) |
ShuffleNetV2b x1.5 | 26.90 | 8.79 | 4,410,194 | 323.98M | Training (log) |
ShuffleNetV2b x2.0 | 25.20 | 8.10 | 7,611,290 | 603.37M | Training (log) |
108-MENet-8x1 (g=3) | 43.62 | 20.30 | 654,516 | 42.68M | Training (log) |
128-MENet-8x1 (g=4) | 42.10 | 19.13 | 750,796 | 45.98M | Training (log) |
160-MENet-8x1 (g=8) | 43.47 | 20.28 | 850,120 | 45.63M | Training (log) |
256-MENet-12x1 (g=4) | 32.23 | 12.16 | 1,888,240 | 150.65M | Training (log) |
348-MENet-12x1 (g=3) | 27.85 | 9.36 | 3,368,128 | 312.00M | Training (log) |
352-MENet-12x1 (g=8) | 31.30 | 11.67 | 2,272,872 | 157.35M | Training (log) |
456-MENet-24x1 (g=3) | 25.02 | 7.80 | 5,304,784 | 567.90M | Training (log) |
MobileNet x0.25 | 45.78 | 22.18 | 470,072 | 44.09M | Training (log) |
MobileNet x0.5 | 33.94 | 13.30 | 1,331,592 | 155.42M | Training (log) |
MobileNet x0.75 | 29.85 | 10.51 | 2,585,560 | 333.99M | Training (log) |
MobileNet x1.0 | 26.43 | 8.65 | 4,231,976 | 579.80M | Training (log) |
FD-MobileNet x0.25 | 55.44 | 30.53 | 383,160 | 12.95M | Training (log) |
FD-MobileNet x0.5 | 42.62 | 19.69 | 993,928 | 41.84M | Training (log) |
FD-MobileNet x0.75 | 37.91 | 16.01 | 1,833,304 | 86.68M | Training (log) |
FD-MobileNet x1.0 | 33.80 | 13.12 | 2,901,288 | 147.46M | Training (log) |
MobileNetV2 x0.25 | 48.08 | 24.12 | 1,516,392 | 34.24M | Training (log) |
MobileNetV2 x0.5 | 35.63 | 14.42 | 1,964,736 | 100.13M | Training (log) |
MobileNetV2 x0.75 | 29.78 | 10.44 | 2,627,592 | 198.50M | Training (log) |
MobileNetV2 x1.0 | 26.77 | 8.64 | 3,504,960 | 329.36M | Training (log) |
IGCV3 x0.25 | 53.43 | 28.30 | 1,534,020 | 41.29M | Training (log) |
IGCV3 x0.5 | 39.41 | 17.03 | 1,985,528 | 111.12M | Training (log) |
IGCV3 x0.75 | 30.71 | 10.96 | 2,638,084 | 210.95M | Training (log) |
IGCV3 x1.0 | 27.73 | 9.00 | 3,491,688 | 340.79M | Training (log) |
MnasNet | 31.32 | 11.44 | 4,308,816 | 317.67M | From zeusees/Mnasnet...Model (log) |
DARTS | 27.23 | 8.97 | 4,718,752 | 539.86M | From quark0/darts (log) |
ProxylessNAS CPU | 24.78 | 7.50 | 4,361,648 | 459.96M | Training (log) |
ProxylessNAS GPU | 24.67 | 7.24 | 7,119,848 | 476.08M | Training (log) |
ProxylessNAS Mobile | 25.31 | 7.80 | 4,080,512 | 332.46M | Training (log) |
ProxylessNAS Mob-14 | 22.96 | 6.51 | 6,857,568 | 597.10M | Training (log) |
Xception | 20.99 | 5.56 | 22,855,952 | 8,403.63M | From Cadene/pretrained...pytorch (log) |
InceptionV3 | 21.22 | 5.59 | 23,834,568 | 5,743.06M | From dmlc/gluon-cv (log) |
InceptionV4 | 20.60 | 5.25 | 42,679,816 | 12,304.93M | From Cadene/pretrained...pytorch (log) |
InceptionResNetV2 | 19.96 | 4.94 | 55,843,464 | 13,188.64M | From Cadene/pretrained...pytorch (log) |
PolyNet | 19.09 | 4.53 | 95,366,600 | 34,821.34M | From Cadene/pretrained...pytorch (log) |
NASNet-A 4@1056 | 25.37 | 7.95 | 5,289,978 | 584.90M | From Cadene/pretrained...pytorch (log) |
NASNet-A 6@4032 | 18.17 | 4.24 | 88,753,150 | 23,976.44M | From Cadene/pretrained...pytorch (log) |
PNASNet-5-Large | 17.90 | 4.28 | 86,057,668 | 25,140.77M | From Cadene/pretrained...pytorch (log) |
ResNet(D)-50b | 20.79 | 5.49 | 25,680,808 | 20,496.80M | From dmlc/gluon-cv (log) |
ResNet(D)-101b | 19.49 | 4.61 | 44,672,936 | 35,391.85M | From dmlc/gluon-cv (log) |
ResNet(D)-152b | 19.39 | 4.67 | 60,316,584 | 47,661.38M | From dmlc/gluon-cv (log) |
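The Params and FLOPs/2 columns can be sanity-checked by hand for a single layer. The sketch below counts weights and multiply-accumulate operations (MACs) for one 2D convolution; each MAC is one multiply plus one add, which is why the FLOP count divided by two approximates the MAC count. The layer shape in the example (a bias-free 7x7 convolution from 3 to 64 channels producing a 112x112 output) is an assumption chosen to resemble a typical ResNet stem, not a value read from the table, and the function names are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel_size, groups=1, bias=False):
    """Number of learnable parameters in a 2D convolution."""
    weights = (in_channels // groups) * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def conv2d_macs(in_channels, out_channels, kernel_size, out_height, out_width, groups=1):
    """Multiply-accumulate operations for one forward pass over one image."""
    macs_per_output = (in_channels // groups) * kernel_size * kernel_size
    return macs_per_output * out_channels * out_height * out_width


if __name__ == "__main__":
    params = conv2d_params(3, 64, 7)
    macs = conv2d_macs(3, 64, 7, 112, 112)
    print(f"params: {params:,}")              # params: 9,408
    print(f"MACs:   {macs / 1e6:.2f}M")       # MACs:   118.01M
    print(f"FLOPs:  {2 * macs / 1e6:.2f}M")   # FLOPs:  236.03M
```

Summing these per-layer counts over a whole network gives totals of the kind reported in the table, although the exact counting rules used for the published numbers may differ in details such as how normalization and activation layers are treated.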
Some remarks:
- The testing subset is used for validation.
- Features is the output size of the feature extractor.
Model | Error, % | Features | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|
NIN | 7.43 | 192 | 966,986 | 222.97M | Training (log) |
ResNet-20 | 5.97 | 64 | 272,474 | 41.29M | Training (log) |
ResNet-56 | 4.52 | 64 | 855,770 | 127.06M | Training (log) |
ResNet-110 | 3.69 | 64 | 1,730,714 | 255.70M | Training (log) |
ResNet-164(BN) | 3.68 | 256 | 1,704,154 | 255.31M | Training (log) |
ResNet-1001 | 3.28 | 256 | 10,328,602 | 1,536.40M | Training (log) |
ResNet-1202 | 3.53 | 64 | 19,424,026 | 2,857.17M | Training (log) |
PreResNet-20 | 6.51 | 64 | 272,282 | 41.27M | Training (log) |
PreResNet-56 | 4.49 | 64 | 855,578 | 127.03M | Training (log) |
PreResNet-110 | 3.86 | 64 | 1,730,522 | 255.68M | Training (log) |
PreResNet-164(BN) | 3.64 | 256 | 1,703,258 | 255.08M | Training (log) |
PreResNet-1001 | 2.65 | 256 | 10,327,706 | 1,536.18M | Training (log) |
PreResNet-1202 | 3.39 | 64 | 19,423,834 | 2,857.14M | Training (log) |
ResNeXt-29 (32x4d) | 3.15 | 1024 | 4,775,754 | 780.55M | Training (log) |
ResNeXt-29 (16x64d) | 2.41 | 1024 | 68,155,210 | 10,709.34M | Training (log) |
PyramidNet-110 (a=48) | 3.72 | 64 | 1,772,706 | 408.37M | Training (log) |
PyramidNet-110 (a=84) | 2.98 | 100 | 3,904,446 | 778.15M | Training (log) |
PyramidNet-110 (a=270) | 2.51 | 286 | 28,485,477 | 4,730.60M | Training (log) |
PyramidNet-164 (a=270, BN) | 2.42 | 1144 | 27,216,021 | 4,608.81M | Training (log) |
PyramidNet-200 (a=240, BN) | 2.44 | 1024 | 26,752,702 | 4,563.40M | Training (log) |
PyramidNet-236 (a=220, BN) | 2.47 | 944 | 26,969,046 | 4,631.32M | Training (log) |
PyramidNet-272 (a=200, BN) | 2.39 | 864 | 26,210,842 | 4,541.36M | Training (log) |
DenseNet-40 (k=12) | 5.61 | 258 | 599,050 | 210.80M | Training (log) |
DenseNet-BC-40 (k=12) | 6.43 | 132 | 176,122 | 74.89M | Training (log) |
DenseNet-BC-40 (k=24) | 4.52 | 264 | 690,346 | 293.09M | Training (log) |
DenseNet-BC-40 (k=36) | 4.04 | 396 | 1,542,682 | 654.60M | Training (log) |
DenseNet-100 (k=12) | 3.66 | 678 | 4,068,490 | 1,353.55M | Training (log) |
DenseNet-100 (k=24) | 3.13 | 1356 | 16,114,138 | 5,354.19M | Training (log) |
DenseNet-BC-100 (k=12) | 4.16 | 342 | 769,162 | 298.45M | Training (log) |
DenseNet-BC-190 (k=40) | 2.52 | 2190 | 25,624,430 | 9,400.45M | Training (log) |
DenseNet-BC-250 (k=24) | 2.67 | 1734 | 15,324,406 | 5,519.54M | Training (log) |
X-DenseNet-BC-40-2 (k=24) | 5.31 | 264 | 690,346 | 293.09M | Training (log) |
X-DenseNet-BC-40-2 (k=36) | 4.37 | 396 | 1,542,682 | 654.60M | Training (log) |
WRN-16-10 | 2.93 | 640 | 17,116,634 | 2,414.04M | Training (log) |
WRN-28-10 | 2.39 | 640 | 36,479,194 | 5,246.98M | Training (log) |
WRN-40-8 | 2.37 | 512 | 35,748,314 | 5,176.90M | Training (log) |
WRN-20-10-1bit | 3.26 | 640 | 26,737,140 | 4,019.14M | Training (log) |
WRN-20-10-32bit | 3.14 | 640 | 26,737,140 | 4,019.14M | Training (log) |
RoR-3-56 | 5.43 | 64 | 762,746 | 113.43M | Training (log) |
RoR-3-110 | 4.35 | 64 | 1,637,690 | 242.07M | Training (log) |
RoR-3-164 | 3.93 | 64 | 2,512,634 | 370.72M | Training (log) |
RiR | 3.28 | 384 | 9,492,980 | 1,281.08M | Training (log) |
Shake-Shake-ResNet-20-2x16d | 5.15 | 64 | 541,082 | 81.78M | Training (log) |
Shake-Shake-ResNet-26-2x32d | 3.17 | 64 | 2,923,162 | 428.89M | Training (log) |
Some remarks:
- The testing subset is used for validation.
Model | Error, % | Params | FLOPs/2 | Remarks |
---|---|---|---|---|
NIN | 28.39 | 984,356 | 224.08M | Training (log) |
ResNet-20 | 29.64 | 278,324 | 41.30M | Training (log) |
ResNet-56 | 24.88 | 861,620 | 127.06M | Training (log) |
ResNet-110 | 22.80 | 1,736,564 | 255.71M | Training (log) |
ResNet-164(BN) | 20.44 | 1,727,284 | 255.33M | Training (log) |
ResNet-1001 | 19.79 | 10,351,732 | 1,536.43M | Training (log) |
PreResNet-20 | 30.22 | 278,132 | 41.28M | Training (log) |
PreResNet-56 | 25.05 | 861,428 | 127.04M | Training (log) |
PreResNet-110 | 22.67 | 1,736,372 | 255.68M | Training (log) |
PreResNet-164(BN) | 20.18 | 1,726,388 | 255.10M | Training (log) |
PreResNet-1001 | 18.41 | 10,350,836 | 1,536.20M | Training (log) |
ResNeXt-29 (32x4d) | 19.50 | 4,868,004 | 780.64M | Training (log) |
ResNeXt-29 (16x64d) | 16.93 | 68,247,460 | 10,709.43M | Training (log) |
PyramidNet-110 (a=48) | 20.95 | 1,778,556 | 408.38M | Training (log) |
PyramidNet-110 (a=84) | 18.87 | 3,913,536 | 778.16M | Training (log) |
PyramidNet-110 (a=270) | 17.10 | 28,511,307 | 4,730.62M | Training (log) |
PyramidNet-164 (a=270, BN) | 16.70 | 27,319,071 | 4,608.91M | Training (log) |
PyramidNet-200 (a=240, BN) | 16.09 | 26,844,952 | 4,563.49M | Training (log) |
PyramidNet-236 (a=220, BN) | 16.34 | 27,054,096 | 4,631.41M | Training (log) |
PyramidNet-272 (a=200, BN) | 16.19 | 26,288,692 | 4,541.43M | Training (log) |
DenseNet-40 (k=12) | 24.90 | 622,360 | 210.82M | Training (log) |
DenseNet-BC-40 (k=12) | 28.41 | 188,092 | 74.90M | Training (log) |
DenseNet-BC-40 (k=24) | 22.67 | 714,196 | 293.11M | Training (log) |
DenseNet-BC-40 (k=36) | 20.50 | 1,578,412 | 654.64M | Training (log) |
DenseNet-100 (k=12) | 19.64 | 4,129,600 | 1,353.62M | Training (log) |
DenseNet-100 (k=24) | 18.08 | 16,236,268 | 5,354.32M | Training (log) |
DenseNet-BC-100 (k=12) | 21.19 | 800,032 | 298.48M | Training (log) |
DenseNet-BC-250 (k=24) | 17.39 | 15,480,556 | 5,519.69M | Training (log) |
X-DenseNet-BC-40-2 (k=24) | 23.96 | 714,196 | 293.11M | Training (log) |
X-DenseNet-BC-40-2 (k=36) | 21.65 | 1,578,412 | 654.64M | Training (log) |
WRN-16-10 | 18.95 | 17,174,324 | 2,414.09M | Training (log) |
WRN-28-10 | 17.88 | 36,536,884 | 5,247.04M | Training (log) |
WRN-40-8 | 18.03 | 35,794,484 | 5,176.95M | Training (log) |
WRN-20-10-1bit | 19.04 | 26,794,920 | 4,022.81M | Training (log) |
WRN-20-10-32bit | 18.12 | 26,794,920 | 4,022.81M | Training (log) |
RoR-3-56 | 25.49 | 768,596 | 113.43M | Training (log) |
RoR-3-110 | 23.64 | 1,643,540 | 242.08M | Training (log) |
RoR-3-164 | 22.34 | 2,518,484 | 370.72M | Training (log) |
RiR | 19.23 | 9,527,720 | 1,283.29M | Training (log) |
Shake-Shake-ResNet-20-2x16d | 29.22 | 546,932 | 81.79M | Training (log) |
Shake-Shake-ResNet-26-2x32d | 18.80 | 2,934,772 | 428.90M | Training (log) |
Model | Error, % | Params | FLOPs/2 | Remarks |
---|---|---|---|---|
NIN | 3.76 | 966,986 | 222.97M | Training (log) |
ResNet-20 | 3.43 | 272,474 | 41.29M | Training (log) |
ResNet-56 | 2.75 | 855,770 | 127.06M | Training (log) |
ResNet-110 | 2.45 | 1,730,714 | 255.70M | Training (log) |
ResNet-164(BN) | 2.42 | 1,704,154 | 255.31M | Training (log) |
PreResNet-20 | 3.22 | 272,282 | 41.27M | Training (log) |
PreResNet-56 | 2.80 | 855,578 | 127.03M | Training (log) |
PreResNet-110 | 2.79 | 1,730,522 | 255.68M | Training (log) |
PreResNet-164(BN) | 2.58 | 1,703,258 | 255.08M | Training (log) |
ResNeXt-29 (32x4d) | 2.80 | 4,775,754 | 780.55M | Training (log) |
PyramidNet-110 (a=48) | 2.47 | 1,772,706 | 408.37M | Training (log) |
DenseNet-40 (k=12) | 3.05 | 599,050 | 210.80M | Training (log) |
DenseNet-BC-40 (k=12) | 3.20 | 176,122 | 74.89M | Training (log) |
DenseNet-BC-40 (k=24) | 2.90 | 690,346 | 293.09M | Training (log) |
DenseNet-BC-40 (k=36) | 2.60 | 1,542,682 | 654.60M | Training (log) |
DenseNet-100 (k=12) | 2.60 | 4,068,490 | 1,353.55M | Training (log) |
X-DenseNet-BC-40-2 (k=24) | 2.87 | 690,346 | 293.09M | Training (log) |
X-DenseNet-BC-40-2 (k=36) | 2.74 | 1,542,682 | 654.60M | Training (log) |
WRN-16-10 | 2.78 | 17,116,634 | 2,414.04M | Training (log) |
WRN-28-10 | 2.71 | 36,479,194 | 5,246.98M | Training (log) |
WRN-40-8 | 2.54 | 35,748,314 | 5,176.90M | Training (log) |
WRN-20-10-1bit | 2.73 | 26,737,140 | 4,019.14M | Training (log) |
WRN-20-10-32bit | 2.59 | 26,737,140 | 4,019.14M | Training (log) |
RoR-3-56 | 2.69 | 762,746 | 113.43M | Training (log) |
RoR-3-110 | 2.57 | 1,637,690 | 242.07M | Training (log) |
RoR-3-164 | 2.73 | 2,512,634 | 370.72M | Training (log) |
RiR | 2.68 | 9,492,980 | 1,281.08M | Training (log) |
Shake-Shake-ResNet-20-2x16d | 3.17 | 541,082 | 81.78M | Training (log) |
Shake-Shake-ResNet-26-2x32d | 2.62 | 2,923,162 | 428.89M | Training (log) |
Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|---|
PSPNet | ResNet(D)-101b | 98.09 | 81.44 | 65,708,501 | 230,586.69M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-101b | 97.95 | 80.24 | 58,754,773 | 47,624.54M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-152b | 98.11 | 81.20 | 74,398,421 | 59,894.06M | From dmlc/gluon-cv (log) |
FCN-8s(d) | ResNet(D)-101b | 97.80 | 80.40 | 52,072,917 | 196,562.96M | From dmlc/gluon-cv (log) |
Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|---|
PSPNet | ResNet(D)-50b | 79.37 | 36.87 | 46,782,550 | 162,410.82M | From dmlc/gluon-cv (log) |
PSPNet | ResNet(D)-101b | 79.93 | 37.97 | 65,774,678 | 230,824.47M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-50b | 79.72 | 37.13 | 39,795,798 | 32,755.38M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-101b | 80.21 | 37.84 | 58,787,926 | 47,650.43M | From dmlc/gluon-cv (log) |
FCN-8s(d) | ResNet(D)-50b | 76.92 | 33.39 | 33,146,966 | 128,387.08M | From dmlc/gluon-cv (log) |
FCN-8s(d) | ResNet(D)-101b | 79.01 | 35.88 | 52,139,094 | 196,800.73M | From dmlc/gluon-cv (log) |
Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|---|
PSPNet | ResNet(D)-101b | 96.17 | 71.72 | 65,707,475 | 230,583.01M | From dmlc/gluon-cv (log) |
Model | Extractor | Pix.Acc.,% | mIoU,% | Params | FLOPs/2 | Remarks |
---|---|---|---|---|---|---|
PSPNet | ResNet(D)-101b | 92.05 | 67.41 | 65,708,501 | 230,586.69M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-101b | 92.19 | 67.73 | 58,754,773 | 47,624.54M | From dmlc/gluon-cv (log) |
DeepLabv3 | ResNet(D)-152b | 92.24 | 68.99 | 74,398,421 | 275,084.22M | From dmlc/gluon-cv (log) |
FCN-8s(d) | ResNet(D)-101b | 91.44 | 60.11 | 52,072,917 | 196,562.96M | From dmlc/gluon-cv (log) |
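For reference, the Pix.Acc. and mIoU columns in the segmentation tables are usually derived from a confusion matrix over all evaluated pixels. The following sketch shows the standard definitions on a toy two-class example; the function names are illustrative, and details such as ignored labels or classes absent from the ground truth may be handled differently in the actual evaluation scripts.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count pixels by (true class, predicted class)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Percentage of pixels whose predicted class equals the true class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def mean_iou(matrix):
    """Mean over classes of intersection / union, in percent.

    Classes that appear in neither the labels nor the predictions are skipped.
    """
    ious = []
    for c in range(len(matrix)):
        intersection = matrix[c][c]
        true_count = sum(matrix[c])
        pred_count = sum(row[c] for row in matrix)
        union = true_count + pred_count - intersection
        if union > 0:
            ious.append(intersection / union)
    return 100.0 * sum(ious) / len(ious)


if __name__ == "__main__":
    # Flattened toy masks with two classes (0 = background, 1 = object).
    truth = [0, 0, 0, 0, 1, 1, 1, 1]
    pred = [0, 0, 0, 1, 1, 1, 1, 0]
    cm = confusion_matrix(truth, pred, num_classes=2)
    print(pixel_accuracy(cm))  # 6 of 8 pixels are correct
    print(mean_iou(cm))        # each class has IoU 3/5
```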