Deep Residual Learning for Image Recognition
link: https://github.com/pytorch/vision
tag: v0.9.0
link: https://github.com/rwightman/pytorch-image-models
tag: v0.6.5
link: https://github.com/open-mmlab/mmclassification
tag: v0.23.1
link: https://github.com/PaddlePaddle/PaddleClas
tag: v2.4.0
link: https://github.com/keras-team/keras
tag: 2.3.1
link: https://github.com/Oneflow-Inc/vision
tag: v0.2.1
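Preprocessing for the ResNet family proceeds as follows: resize the image to 256, use the CenterCrop operator to crop out a 224 image, then scale the pixel values to [0, 1] and normalize them by subtracting the mean and dividing by the standard deviation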
[
torchvision.transforms.Resize(256),
torchvision.transforms.CenterCrop(224),
torchvision.transforms.ToTensor(),
torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225],),
]
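The arithmetic behind the crop and normalization steps can be sketched in plain NumPy. This is a minimal illustration, not the torchvision implementation: the helper names `center_crop` and `to_normalized_chw` are hypothetical, the input is a random array standing in for an image whose short side has already been resized to 256, and the crop offsets use floor division, which may differ from torchvision by one pixel on odd margins.

```python
import numpy as np

# ImageNet channel statistics, as used in the transform list above
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def center_crop(img, size):
    """Crop the central size x size window from an HWC array (hypothetical helper)."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

def to_normalized_chw(img_uint8):
    """Scale to [0, 1], standardize per channel, then convert HWC -> CHW."""
    x = img_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return np.transpose(x, (2, 0, 1))

# Random stand-in for an image already resized so that its short side is 256
resized = np.random.randint(0, 256, size=(256, 341, 3), dtype=np.uint8)
tensor = to_normalized_chw(center_crop(resized, 224))
print(tensor.shape)  # (3, 224, 224)
```

A fully white pixel maps to (1 - mean) / std in each channel, which is a quick sanity check when porting the preprocessing to another runtime.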
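Postprocessing for the ResNet family applies softmax to the network output to obtain a predicted score for each class, sorts the scores, and takes the top-k entries as the predicted scores and classes for the input image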
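The softmax and top-k selection can be sketched as follows. This is an illustrative NumPy version under the assumption that the network output is a 1-D vector of logits; the function names `softmax` and `topk` are chosen for this example only.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def topk(probs, k=5):
    """Return the k highest-scoring class indices and their scores, best first."""
    idx = np.argsort(probs)[::-1][:k]
    return idx, probs[idx]

# Toy logits for a 4-class problem
logits = np.array([1.0, 3.0, 0.5, 2.0])
class_ids, scores = topk(softmax(logits), k=2)
print(class_ids)  # [1 3]
```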
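The backbone of the ResNet family is built from BasicBlock or Bottleneck blocks. The number after the ResNet name is the number of layers with parameters (convolution and fully connected layers) in the whole network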
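The naming rule can be checked with a short calculation. The sketch below uses the per-stage block counts of the standard ImageNet ResNets and the usual counting convention: one stem convolution, two convolutions per BasicBlock or three per Bottleneck, and one fully connected layer. Shortcut projection convolutions and batch-norm layers are not counted. The names `CONFIGS` and `weighted_layers` are illustrative only.

```python
# (block type, number of blocks in each of the four stages)
CONFIGS = {
    "resnet18": ("BasicBlock", [2, 2, 2, 2]),
    "resnet34": ("BasicBlock", [3, 4, 6, 3]),
    "resnet50": ("Bottleneck", [3, 4, 6, 3]),
    "resnet101": ("Bottleneck", [3, 4, 23, 3]),
    "resnet152": ("Bottleneck", [3, 8, 36, 3]),
}

# Convolutions on the main path of each block type
CONVS_PER_BLOCK = {"BasicBlock": 2, "Bottleneck": 3}

def weighted_layers(name):
    """Stem conv + convs in all residual blocks + final fully connected layer."""
    block, stages = CONFIGS[name]
    return 1 + CONVS_PER_BLOCK[block] * sum(stages) + 1

for name in CONFIGS:
    print(name, weighted_layers(name))
```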
The head of the ResNet family consists of a global-average-pooling layer followed by a single fully connected layer
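As a rough illustration of the head, the NumPy sketch below averages each channel over its spatial positions and applies one linear layer. The shapes assume a ResNet-18/34 style feature map (512 channels, 7x7 at a 224 input) and 1000 ImageNet classes; the weights are random placeholders, not trained parameters.

```python
import numpy as np

def head(features, weight, bias):
    """Global average pooling over H and W, then a fully connected layer."""
    pooled = features.mean(axis=(2, 3))   # (N, C, H, W) -> (N, C)
    return pooled @ weight.T + bias       # (N, C) -> (N, num_classes)

rng = np.random.default_rng(0)
features = rng.standard_normal((2, 512, 7, 7)).astype(np.float32)  # placeholder backbone output
weight = rng.standard_normal((1000, 512)).astype(np.float32)       # placeholder FC weights
bias = np.zeros(1000, dtype=np.float32)

logits = head(features, weight, bias)
print(logits.shape)  # (2, 1000)
```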
- residual layer
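The idea of a residual layer, y = ReLU(F(x) + x), can be sketched in a few lines. This is a deliberately simplified stand-in: F is modeled with two dense matrices instead of the convolutions and batch normalization used in the real blocks, and the name `residual_block` is hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is two linear maps with a ReLU in between."""
    out = relu(x @ w1)
    out = out @ w2
    return relu(out + x)  # the shortcut adds the input back before the final ReLU

x = np.array([[1.0, -2.0, 3.0]])
zeros = np.zeros((3, 3))
# With F(x) = 0 the block reduces to relu(x), so non-negative inputs pass through unchanged
print(residual_block(x, zeros, zeros))  # [[1. 0. 3.]]
```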
Model | Source | top1 | top5 | FLOPs(G) | params(M) | input size |
---|---|---|---|---|---|---|
resnet18 | timm | 69.744 | 89.082 | 3.648 | 11.690 | 224 |
resnet26 | timm | 75.300 | 92.578 | 4.744 | 15.995 | 224 |
resnet34 | timm | 75.114 | 92.284 | 7.358 | 21.798 | 224 |
resnet50 | timm | 80.376 | 94.616 | 8.268 | 25.557 | 224 |
resnet101 | timm | 81.932 | 95.770 | 15.732 | 44.549 | 224 |
resnet152 | timm | 82.820 | 96.130 | 23.208 | 60.193 | 224 |
gluon_resnet18_v1b | timm | 70.834 | 89.762 | 4.053 | 11.690 | 224 |
gluon_resnet34_v1b | timm | 74.588 | 91.988 | 8.175 | 21.798 | 224 |
gluon_resnet50_v1b | timm | 77.580 | 93.722 | 9.186 | 25.557 | 224 |
gluon_resnet50_v1c | timm | 78.012 | 93.990 | 9.726 | 26.576 | 224 |
gluon_resnet50_v1d | timm | 79.076 | 94.472 | 9.727 | 25.576 | 224 |
gluon_resnet50_v1s | timm | 78.712 | 94.240 | 12.219 | 25.681 | 224 |
gluon_resnet101_v1b | timm | 79.302 | 94.520 | 17.481 | 44.549 | 224 |
gluon_resnet101_v1c | timm | 79.534 | 94.580 | 18.021 | 44.568 | 224 |
gluon_resnet101_v1d | timm | 80.420 | 95.016 | 18.021 | 44.568 | 224 |
gluon_resnet101_v1s | timm | 80.298 | 95.164 | 20.514 | 44.673 | 224 |
gluon_resnet152_v1b | timm | 79.680 | 94.738 | 25.787 | 60.193 | 224 |
gluon_resnet152_v1c | timm | 79.908 | 94.848 | 26.326 | 60.212 | 224 |
gluon_resnet152_v1d | timm | 80.476 | 95.204 | 26.327 | 60.212 | 224 |
gluon_resnet152_v1s | timm | 81.016 | 95.412 | 28.819 | 60.317 | 224 |
resnet18 | torchvision | 69.758 | 89.078 | 3.648 | 11.690 | 224 |
resnet34 | torchvision | 73.314 | 91.42 | 7.358 | 21.798 | 224 |
resnet50 | torchvision | 76.130 | 92.862 | 8.268 | 25.557 | 224 |
resnet101 | torchvision | 77.374 | 93.546 | 15.732 | 44.549 | 224 |
resnet152 | torchvision | 78.312 | 94.046 | 23.208 | 60.193 | 224 |
resnet18 | mmcls | 69.90 | 89.43 | 3.64 | 11.69 | 224 |
resnet34 | mmcls | 73.62 | 91.59 | 7.36 | 21.8 | 224 |
resnet50 | mmcls | 76.55 | 93.06 | 8.24 | 25.56 | 224 |
resnet101 | mmcls | 77.97 | 94.06 | 15.7 | 44.55 | 224 |
resnet152 | mmcls | 78.48 | 94.13 | 23.16 | 60.19 | 224 |
resnet18 | ppcls | 71.0 | 89.9 | 3.66 | 11.69 | 224 |
resnet18_vd | ppcls | 72.3 | 90.8 | 4.14 | 11.71 | 224 |
resnet34 | ppcls | 74.6 | 92.1 | 7.36 | 21.8 | 224 |
resnet34_vd | ppcls | 76.0 | 93.0 | 7.39 | 21.82 | 224 |
resnet34_vd_ssld | ppcls | 79.7 | 94.9 | 7.39 | 21.82 | 224 |
resnet50 | ppcls | 76.5 | 93.0 | 8.19 | 25.56 | 224 |
resnet50_vc | ppcls | 78.4 | 94.0 | 8.67 | 25.58 | 224 |
resnet50_vd | ppcls | 79.1 | 94.4 | 8.67 | 25.58 | 224 |
resnet50_vd_ssld | ppcls | 83.0 | 96.4 | 8.67 | 25.58 | 224 |
resnet101 | ppcls | 77.6 | 93.6 | 15.52 | 44.55 | 224 |
resnet101_vd | ppcls | 80.2 | 95.0 | 16.1 | 44.57 | 224 |
resnet101_vd_ssld | ppcls | 83.7 | 96.7 | 16.1 | 44.57 | 224 |
resnet152 | ppcls | 78.3 | 94.0 | 23.05 | 60.19 | 224 |
resnet152_vd | ppcls | 80.6 | 95.3 | 23.53 | 60.21 | 224 |
resnet200_vd | ppcls | 80.9 | 95.3 | 30.53 | 74.74 | 224 |
resnet50 | keras | 74.86 | 92.038 | 7.76 | 25.6 | 224 |
resnet101 | keras | 76.418 | 92.792 | 15.2 | 44.7 | 224 |
resnet152 | keras | 76.598 | 93.124 | 22.6 | 60.4 | 224 |
resnet50v2 | keras | 69.404 | 89.736 | 13.1 | 25.7 | 299 |
resnet101v2 | keras | 70.658 | 90.742 | 26.8 | 44.7 | 299 |
resnet152v2 | keras | 71.502 | 91.124 | 40.5 | 60.4 | 299 |
resnet18 | oneflow | 69.760 | 89.082 | 1.8 | 11.7 | 224 |
resnet34 | oneflow | 73.302 | 91.420 | 3.7 | 21.8 | 224 |
resnet50 | oneflow | 76.146 | 92.872 | 4.2 | 25.6 | 224 |
resnet101 | oneflow | 77.366 | 93.5628 | 7.9 | 44.6 | 224 |
resnet152 | oneflow | 78.314 | 94.060 | 11.6 | 60.2 | 224 |
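ImageNet is a visual recognition research project and currently the world's largest image recognition database. It was built by computer scientists at Stanford University in the United States, modeled on the human recognition system, so that objects can be recognized from images. It is a promising research project; applied to robots in the future, it could let them recognize objects and people directly. More than 14 million image URLs have been hand-annotated by ImageNet to indicate the objects pictured, and bounding boxes are also provided for at least one million images. ImageNet contains more than 20,000 categories; a typical category, such as "balloon" or "strawberry", contains several hundred images.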
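ImageNet is a very well-known dataset in the CV field, and the ILSVRC competition uses a lightweight subset of it. ILSVRC2012 is widely used in CV papers to evaluate models; in this project, the evaluation dataset for classification algorithms is the ILSVRC2012 validation set. Some papers call this dataset ImageNet 1K and others ILSVRC2012; the two are the same. "1K" stands for 1,000 classes.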
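- top1 accuracy: the number of test images whose highest-scoring label matches the ground-truth class, divided by the total number of samples
- top5 accuracy: the number of test images whose ground-truth label is among the five highest-probability classes, divided by the total number of samples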
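Both metrics follow the same pattern, which the sketch below illustrates with NumPy. It assumes a score matrix of shape (num_samples, num_classes) and integer ground-truth labels; the toy data uses 3 classes, so top-2 stands in for top-5.

```python
import numpy as np

def topk_accuracy(scores, labels, k):
    """Fraction of samples whose ground-truth label is among the k highest scores."""
    topk_idx = np.argsort(scores, axis=1)[:, ::-1][:, :k]
    hits = (topk_idx == labels[:, None]).any(axis=1)
    return hits.mean()

scores = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.2, 0.3],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])
labels = np.array([1, 2, 2, 1])

print(topk_accuracy(scores, labels, 1))  # 0.5
print(topk_accuracy(scores, labels, 2))  # 1.0
```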
- timm
pip install timm==0.6.5
python ../common/utils/export_timm_torchvision_model.py --model_library timm --model_name cspresnet50 --save_dir ./onnx --size 256 --pretrained_weights xxx.pth
- mmclassification
For the mmcls framework, refer to mmclassification; use pytorch2onnx.py or pytorch2torchscript.py at the location below to convert to the corresponding model format
cd mmclassification
python tools/deployment/pytorch2onnx.py \
    --config configs/resnet/resnet50_b32x8_imagenet.py \
    --checkpoint weights/resnet50.pth \
    --output-file output/resnet50.onnx
- ppcls
pip install PaddlePaddle==2.3.2 Paddle2ONNX==1.0.0
paddle2onnx --model_dir /path/to/cspnet_paddle_model/ \
    --model_filename model.pdmodel \
    --params_filename model.pdiparams \
    --save_file model.onnx \
    --enable_dev_version False \
    --opset_version 10
- oneflow
git clone https://github.com/Oneflow-Inc/vision.git
mv source_code/oneflow2onnx.py vision
cd vision
python oneflow2onnx.py
- torchvision
python ../common/utils/export_timm_torchvision_model.py --model_library torchvision --model_name densenet121 --save_dir ./onnx --size 224 --pretrained_weights xxx.pth
- keras
⚠️ The Keras h5 format is directly supported.
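- This model is evaluated on the 50,000-image ILSVRC2012 validation set from the official ImageNet site. For int8 calibration, pick any 1,000 images from this dataset; to preserve quantization accuracy, make sure every class is represented. Users need to obtain the dataset themselves. The expected layout is:
ILSVRC2012
├── ImageNet
│   ├── val
│   │   ├── ILSVRC2012_val_00000001.JPEG
│   │   ├── ILSVRC2012_val_00000002.JPEG
│   │   ├── ......
│   ├── val_label.txt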
sh ./data_prep_sh_files/valprep.sh
# label.txt
tench, Tinca tinca
goldfish, Carassius auratus
...
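One way to pick a calibration subset that covers every class is sketched below. It is only an illustration: the function `pick_calibration` is hypothetical, and the input is assumed to be a list of (filename, class id) pairs that the user has already parsed from their own label file, whose exact format is not specified here. Synthetic pairs are used so the snippet runs on its own.

```python
import random
from collections import defaultdict

def pick_calibration(pairs, total=1000, seed=0):
    """Pick `total` (filename, class) pairs, taking one image per class first."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for filename, cls in pairs:
        by_class[cls].append(filename)

    chosen, leftover = [], []
    for cls in sorted(by_class):
        files = by_class[cls][:]
        rng.shuffle(files)
        chosen.append((files[0], cls))                 # guarantee class coverage
        leftover.extend((f, cls) for f in files[1:])

    rng.shuffle(leftover)
    chosen.extend(leftover[:max(0, total - len(chosen))])  # fill the remaining slots at random
    return chosen

# Synthetic stand-in: 200 images spread over 10 classes
pairs = [(f"img_{i:05d}.JPEG", i % 10) for i in range(200)]
subset = pick_calibration(pairs, total=25)
print(len(subset), len({c for _, c in subset}))  # 25 10
```

With the 1,000 classes of ILSVRC2012 and a budget of 1,000 images, this strategy selects exactly one image per class.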
- Use the model conversion tool vamc, and adjust the conversion configuration file for the specific model
- Run the conversion from the command line
vamc build ./vacc_code/build/xxx.yaml
- Generate the inference data in npz format and the corresponding datalist.txt
python ../common/utils/image2npz.py --dataset_path /path/to/ILSVRC2012_img_val --target_path /path/to/input_npz --text_path npz_datalist.txt
- Performance test
./vamp -m deploy_weights/resnet50-int8-percentile-3_224_224-vacc/resnet50 --vdsp_params vacc_code/vdsp_params/timm-resnet18-vdsp_params.json -i 2 p 2 -b 1
- Obtain accuracy information
./vamp -m deploy_weights/resnet50-int8-percentile-3_224_224-vacc/resnet50 --vdsp_params vacc_code/vdsp_params/timm-resnet18-vdsp_params.json -i 2 p 2 -b 1 --datalist npz_datalist.txt --path_output output
- Parse the results and evaluate accuracy
python ../common/eval/vamp_eval.py --result_path output --datalist npz_datalist.txt --label data/label/imagenet.txt