# Convolutional Networks for Image Classification in PyTorch

In this repository you will find implementations of various image classification models.

## Table Of Contents

* [Models](#models)
* [Validation accuracy results](#validation-accuracy-results)
* [Training performance results](#training-performance-results)
    * [Training performance: NVIDIA DGX-1 (8x V100 16G)](#training-performance-nvidia-dgx-1-8x-v100-16g)
    * [Training performance: NVIDIA DGX-2 (16x V100 32G)](#training-performance-nvidia-dgx-2-16x-v100-32g)
* [Model comparison](#model-comparison)
    * [Accuracy vs FLOPS](#accuracy-vs-flops)
    * [Latency vs Throughput on different batch sizes](#latency-vs-throughput-on-different-batch-sizes)

## Models

The following table provides links to where you can find additional information on each model:

| **Model** | **Link** |
|:-:|:-:|
| resnet50 | [README](./resnet50v1.5/README.md) |
| resnext101-32x4d | [README](./resnext101-32x4d/README.md) |
| se-resnext101-32x4d | [README](./se-resnext101-32x4d/README.md) |

## Validation accuracy results

Our results were obtained by running the applicable
training scripts in the [framework-container-name] NGC container
on NVIDIA DGX-1 with (8x V100 16G) GPUs.
The specific training script that was run is documented
in the corresponding model's README.

The following table shows the validation accuracy results of the
three classification models side-by-side.

| **arch** | **AMP Top1** | **AMP Top5** | **FP32 Top1** | **FP32 Top5** |
|:-:|:-:|:-:|:-:|:-:|
| resnet50 | 78.46 | 94.15 | 78.50 | 94.11 |
| resnext101-32x4d | 80.08 | 94.89 | 80.14 | 95.02 |
| se-resnext101-32x4d | 81.01 | 95.52 | 81.12 | 95.54 |
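Top-1 accuracy is the fraction of validation images whose true class is the model's single highest-scoring prediction; Top-5 counts the true class anywhere among the five highest-scoring predictions. A minimal dependency-free sketch of the metric (the function name `topk_accuracy` and the toy data are illustrative, not taken from the repo's scripts):

```python
def topk_accuracy(logits, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    logits: one list of per-class scores per sample; labels: true class indices.
    Illustrative pure-Python version of the Top-1/Top-5 metric in the table.
    """
    correct = 0
    for scores, label in zip(logits, labels):
        # indices of the k largest scores
        topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        correct += label in topk
    return correct / len(labels)

# Toy example: 3 samples, 4 classes (k=2 stands in for Top-5 at this tiny scale)
logits = [[0.1, 0.7, 0.1, 0.1],   # argmax is class 1
          [0.5, 0.2, 0.3, 0.1],   # argmax is class 0, runner-up is class 2
          [0.3, 0.3, 0.2, 0.2]]   # argmax is class 0
labels = [1, 2, 0]
top1 = topk_accuracy(logits, labels, k=1)  # 2/3: only the middle sample misses
top2 = topk_accuracy(logits, labels, k=2)  # 1.0: class 2 is in the middle sample's top 2
```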

## Training performance results

### Training performance: NVIDIA DGX-1 (8x V100 16G)

Our results were obtained by running the applicable
training scripts in the pytorch-19.10 NGC container
on NVIDIA DGX-1 with (8x V100 16G) GPUs.
Performance numbers (in images per second)
were averaged over an entire training epoch.
The specific training script that was run is documented
in the corresponding model's README.

The following table shows the training performance results of the
three classification models side-by-side.

| **arch** | **Mixed Precision** | **FP32** | **Mixed Precision speedup** |
|:-:|:-:|:-:|:-:|
| resnet50 | 6888.75 img/s | 2945.37 img/s | 2.34x |
| resnext101-32x4d | 2384.85 img/s | 1116.58 img/s | 2.14x |
| se-resnext101-32x4d | 2031.17 img/s | 977.45 img/s | 2.08x |
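The speedup column is simply the ratio of mixed-precision throughput to FP32 throughput on the same hardware. A trivial sketch of the arithmetic, using the resnet50 row above (the helper name is illustrative):

```python
def mixed_precision_speedup(amp_img_per_s, fp32_img_per_s):
    """Ratio of AMP (mixed-precision) throughput to FP32 throughput."""
    return amp_img_per_s / fp32_img_per_s

# resnet50 row of the DGX-1 table: 6888.75 img/s (AMP) vs 2945.37 img/s (FP32)
print(round(mixed_precision_speedup(6888.75, 2945.37), 2))  # → 2.34
```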
### Training performance: NVIDIA DGX-2 (16x V100 32G)

Our results were obtained by running the applicable
training scripts in the pytorch-19.10 NGC container
on NVIDIA DGX-2 with (16x V100 32G) GPUs.
Performance numbers (in images per second)
were averaged over an entire training epoch.
The specific training script that was run is documented
in the corresponding model's README.

The following table shows the training performance results of the
three classification models side-by-side.

| **arch** | **Mixed Precision** | **FP32** | **Mixed Precision speedup** |
|:-:|:-:|:-:|:-:|
| resnet50 | 13443.82 img/s | 6263.41 img/s | 2.15x |
| resnext101-32x4d | 4473.37 img/s | 2261.97 img/s | 1.98x |

## Model comparison
### Accuracy vs FLOPS

![ACCvsFLOPS](./img/ACCvsFLOPS.png)
The plot shows the relationship between validation accuracy and the number
of floating point operations needed to compute a forward pass on a
224px x 224px image for each of the implemented models.
Dot size indicates the number of trainable parameters.
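The FLOPS on the x-axis are dominated by the convolution layers. A common back-of-the-envelope estimate for a single conv layer is sketched below (an illustrative formula, not necessarily the exact counting method used for the plot):

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out):
    """Approximate FLOPs of one 2D convolution: every output element needs
    kernel*kernel*c_in multiply-accumulates, counted here as 2 FLOPs each."""
    return 2 * kernel * kernel * c_in * c_out * h_out * w_out

# Stem of a ResNet-50: 7x7 conv, 3 -> 64 channels, 112x112 output on a 224px input
print(conv2d_flops(c_in=3, c_out=64, kernel=7, h_out=112, w_out=112))  # → 236027904 (~0.24 GFLOPs)
```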
### Latency vs Throughput on different batch sizes

![LATvsTHR](./img/LATvsTHR.png)

The plot shows the relationship between inference latency,
throughput, and batch size for the implemented models.
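The trade-off visible in the plot follows from throughput being (roughly) batch size divided by per-batch latency: larger batches amortize fixed overheads and raise throughput, but each batch takes longer to return. A sketch with hypothetical numbers (not measurements from this repository):

```python
def throughput_img_per_s(batch_size, batch_latency_s):
    """Inference throughput given the batch size and the latency of one batch."""
    return batch_size / batch_latency_s

# Hypothetical latencies: a 32x larger batch is only 8x slower per batch here,
# so throughput rises 4x while per-request latency grows.
print(throughput_img_per_s(1, 0.005))   # → 200.0 img/s
print(throughput_img_per_s(32, 0.040))  # → 800.0 img/s
```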