[New model] Support MobileNetV3 (open-mmlab#268)

* delete markdownlint
* Support MobileNetV3
* fix import
* add mobilenetv3 head and configs
* Modify MobileNetV3 to semantic segmentation version
* modify mobilenetv3 configs
* add std configs
* fix Conv2dAdaptivePadding bug
* add configs
* add unitest and fix bugs
* fix lraspp unitest bugs
* restore
* fix unitest
* add MobileNetV3 docstring
* add mmcv
* add mmcv
* fix syntax bug
* fix unitest bug
* fix unitest bug
* fix unitest bugs
* fix docstring
* add configs
* restore
* delete unnecessary assert
* modify unitest
* delete benchmark
SwinTransformer · Dec 26, 2020 · 7fdb400
1 parent 5dacca3
commit 7fdb400

Showing 21 changed files with 919 additions and 16 deletions.
diff --git a/README.md b/README.md
@@ -60,6 +60,7 @@ Supported backbones:
 - [x] [HRNet](configs/hrnet/README.md)
 - [x] [ResNeSt](configs/resnest/README.md)
 - [x] [MobileNetV2](configs/mobilenet_v2/README.md)
+- [x] [MobileNetV3](configs/mobilenet_v3/README.md)
 
 Supported methods:
 

diff --git a/configs/_base_/models/lraspp_m-v3-d8.py b/configs/_base_/models/lraspp_m-v3-d8.py
@@ -0,0 +1,25 @@
+# model settings
+norm_cfg = dict(type='SyncBN', eps=0.001, requires_grad=True)
+model = dict(
+    type='EncoderDecoder',
+    backbone=dict(
+        type='MobileNetV3',
+        arch='large',
+        out_indices=(1, 3, 16),
+        norm_cfg=norm_cfg),
+    decode_head=dict(
+        type='LRASPPHead',
+        in_channels=(16, 24, 960),
+        in_index=(0, 1, 2),
+        channels=128,
+        input_transform='multiple_select',
+        dropout_ratio=0.1,
+        num_classes=19,
+        norm_cfg=norm_cfg,
+        act_cfg=dict(type='ReLU'),
+        align_corners=False,
+        loss_decode=dict(
+            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))
+# model training and testing settings
+train_cfg = dict()
+test_cfg = dict(mode='whole')
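
The base config above couples the backbone and the head by position: `out_indices` selects three backbone stages, and the head's `in_index` and `in_channels` must line up with them. The sketch below checks that coupling in plain Python. The values are copied from the config; `check_head_inputs` is a hypothetical helper, not part of mmsegmentation, and it assumes non-negative `in_index` values.

```python
# Values copied from lraspp_m-v3-d8.py.
backbone = dict(type='MobileNetV3', arch='large', out_indices=(1, 3, 16))
decode_head = dict(
    type='LRASPPHead',
    in_channels=(16, 24, 960),
    in_index=(0, 1, 2),
    input_transform='multiple_select')


def check_head_inputs(backbone_cfg, head_cfg):
    """Hypothetical helper: return a list of problems, empty if consistent."""
    problems = []
    num_outputs = len(backbone_cfg['out_indices'])
    # Each selected feature map needs exactly one channel count.
    if len(head_cfg['in_channels']) != len(head_cfg['in_index']):
        problems.append('in_channels and in_index differ in length')
    # Each in_index must point at one of the backbone's outputs.
    for idx in head_cfg['in_index']:
        if not 0 <= idx < num_outputs:
            problems.append(f'in_index {idx} has no matching backbone output')
    return problems


print(check_head_inputs(backbone, decode_head))
```

A check like this only validates the shape of the config; whether stage 16 really emits 960 channels is a property of the backbone implementation.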
diff --git a/configs/mobilenet_v3/README.md b/configs/mobilenet_v3/README.md
@@ -0,0 +1,26 @@
+# Searching for MobileNetV3
+
+## Introduction
+
+```latex
+@inproceedings{Howard_2019_ICCV,
+  title={Searching for MobileNetV3},
+  author={Howard, Andrew and Sandler, Mark and Chu, Grace and Chen, Liang-Chieh and Chen, Bo and Tan, Mingxing and Wang, Weijun and Zhu, Yukun and Pang, Ruoming and Vasudevan, Vijay and Le, Quoc V. and Adam, Hartwig},
+  booktitle={The IEEE International Conference on Computer Vision (ICCV)},
+  pages={1314-1324},
+  month={October},
+  year={2019},
+  doi={10.1109/ICCV.2019.00140}
+}
+```
+
+## Results and models
+
+### Cityscapes
+
+| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
+|--------|----------|-----------|--------:|---------:|----------------|-----:|---------------|----------|
+| LRASPP     | M-V3-D8            | 512x1024 | 320000 | 8.9 |   15.22   | 69.54 | 70.89 | [model](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3-d8_512x1024_320k_cityscapes/lraspp_m-v3-d8_512x1024_320k_cityscapes_20201224_220337-cfe8fb07.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3-d8_512x1024_320k_cityscapes/lraspp_m-v3-d8_512x1024_320k_cityscapes-20201224_220337.log.json)|
+| LRASPP     | M-V3-D8 (scratch)  | 512x1024 | 320000 | 8.9 |   14.77   | 67.87 | 69.78 | [model](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes_20201224_220337-9f29cd72.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes-20201224_220337.log.json)|
+| LRASPP     | M-V3s-D8           | 512x1024 | 320000 | 5.3 |   23.64   | 64.11 | 66.42 | [model](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3s-d8_512x1024_320k_cityscapes/lraspp_m-v3s-d8_512x1024_320k_cityscapes_20201224_223935-61565b34.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3s-d8_512x1024_320k_cityscapes/lraspp_m-v3s-d8_512x1024_320k_cityscapes-20201224_223935.log.json)|
+| LRASPP     | M-V3s-D8 (scratch) | 512x1024 | 320000 | 5.3 |   24.50   | 62.74 | 65.01 | [model](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes_20201224_223935-03daeabb.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/mobilenet_v3/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes-20201224_223935.log.json)|
diff --git a/configs/mobilenet_v3/lraspp_m-v3-d8_512x1024_320k_cityscapes.py b/configs/mobilenet_v3/lraspp_m-v3-d8_512x1024_320k_cityscapes.py
@@ -0,0 +1,11 @@
+_base_ = [
+    '../_base_/models/lraspp_m-v3-d8.py', '../_base_/datasets/cityscapes.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
+]
+
+model = dict(pretrained='open-mmlab://contrib/mobilenet_v3_large')
+
+# Override the data loader settings (samples and workers per GPU).
+data = dict(samples_per_gpu=4, workers_per_gpu=4)
+
+runner = dict(type='IterBasedRunner', max_iters=320000)
diff --git a/configs/mobilenet_v3/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes.py b/configs/mobilenet_v3/lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes.py
@@ -0,0 +1,9 @@
+_base_ = [
+    '../_base_/models/lraspp_m-v3-d8.py', '../_base_/datasets/cityscapes.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
+]
+
+# Override the data loader settings (samples and workers per GPU).
+data = dict(samples_per_gpu=4, workers_per_gpu=4)
+
+runner = dict(type='IterBasedRunner', max_iters=320000)
diff --git a/configs/mobilenet_v3/lraspp_m-v3s-d8_512x1024_320k_cityscapes.py b/configs/mobilenet_v3/lraspp_m-v3s-d8_512x1024_320k_cityscapes.py
@@ -0,0 +1,23 @@
+_base_ = './lraspp_m-v3-d8_512x1024_320k_cityscapes.py'
+norm_cfg = dict(type='SyncBN', eps=0.001, requires_grad=True)
+model = dict(
+    type='EncoderDecoder',
+    pretrained='open-mmlab://contrib/mobilenet_v3_small',
+    backbone=dict(
+        type='MobileNetV3',
+        arch='small',
+        out_indices=(0, 1, 12),
+        norm_cfg=norm_cfg),
+    decode_head=dict(
+        type='LRASPPHead',
+        in_channels=(16, 16, 576),
+        in_index=(0, 1, 2),
+        channels=128,
+        input_transform='multiple_select',
+        dropout_ratio=0.1,
+        num_classes=19,
+        norm_cfg=norm_cfg,
+        act_cfg=dict(type='ReLU'),
+        align_corners=False,
+        loss_decode=dict(
+            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))
diff --git a/configs/mobilenet_v3/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes.py b/configs/mobilenet_v3/lraspp_m-v3s-d8_scratch_512x1024_320k_cityscapes.py
@@ -0,0 +1,22 @@
+_base_ = './lraspp_m-v3-d8_scratch_512x1024_320k_cityscapes.py'
+norm_cfg = dict(type='SyncBN', eps=0.001, requires_grad=True)
+model = dict(
+    type='EncoderDecoder',
+    backbone=dict(
+        type='MobileNetV3',
+        arch='small',
+        out_indices=(0, 1, 12),
+        norm_cfg=norm_cfg),
+    decode_head=dict(
+        type='LRASPPHead',
+        in_channels=(16, 16, 576),
+        in_index=(0, 1, 2),
+        channels=128,
+        input_transform='multiple_select',
+        dropout_ratio=0.1,
+        num_classes=19,
+        norm_cfg=norm_cfg,
+        act_cfg=dict(type='ReLU'),
+        align_corners=False,
+        loss_decode=dict(
+            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))
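
The small-variant configs inherit from the large ones through `_base_` and then restate the whole `model` dict. The sketch below illustrates the recursive dict merge that this kind of inheritance relies on. `merge_cfg` is a simplified, hypothetical stand-in written for this note; it is not mmcv's implementation and ignores features such as key deletion. The dict values are copied from the configs in this commit.

```python
import copy


def merge_cfg(base, override):
    """Simplified stand-in: recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged key by key rather than replaced.
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Subset of the large base model (lraspp_m-v3-d8.py).
base_model = dict(
    type='EncoderDecoder',
    backbone=dict(type='MobileNetV3', arch='large', out_indices=(1, 3, 16)),
    decode_head=dict(
        type='LRASPPHead', in_channels=(16, 24, 960), channels=128))

# Only the keys that differ for the small variant.
small_override = dict(
    backbone=dict(arch='small', out_indices=(0, 1, 12)),
    decode_head=dict(in_channels=(16, 16, 576)))

small_model = merge_cfg(base_model, small_override)
print(small_model['backbone'])
print(small_model['decode_head'])
```

Under this merge model, the small configs would need to state only the keys that change; restating the full dict, as the commit does, is redundant but harmless and keeps each file readable on its own.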
diff --git a/docs/model_zoo.md b/docs/model_zoo.md
@@ -111,6 +111,10 @@ Please refer to [PointRend](https://github.com/open-mmlab/mmsegmentation/blob/ma
 
 Please refer to [MobileNetV2](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mobilenet_v2) for details.
 
+### MobileNetV3
+
+Please refer to [MobileNetV3](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mobilenet_v3) for details.
+
 ### EMANet
 
 Please refer to [EMANet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/emanet) for details.

diff --git a/mmseg/models/backbones/__init__.py b/mmseg/models/backbones/__init__.py
@@ -2,12 +2,13 @@
 from .fast_scnn import FastSCNN
 from .hrnet import HRNet
 from .mobilenet_v2 import MobileNetV2
+from .mobilenet_v3 import MobileNetV3
 from .resnest import ResNeSt
 from .resnet import ResNet, ResNetV1c, ResNetV1d
 from .resnext import ResNeXt
 from .unet import UNet
 
 __all__ = [
     'ResNet', 'ResNetV1c', 'ResNetV1d', 'ResNeXt', 'HRNet', 'FastSCNN',
-    'ResNeSt', 'MobileNetV2', 'UNet', 'CGNet'
+    'ResNeSt', 'MobileNetV2', 'UNet', 'CGNet', 'MobileNetV3'
 ]
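
The import added to `mmseg/models/backbones/__init__.py` is what lets a config refer to the backbone by the string `type='MobileNetV3'`. The sketch below shows the general registry pattern under the assumption that the backbone class registers itself when its module is imported, as the string-based `type` keys in the configs suggest. The `Registry` class here is a minimal, hypothetical stand-in written for this note, not the mmcv class, and the `MobileNetV3` stub only stores its arguments.

```python
class Registry:
    """Minimal stand-in: maps class names to classes and builds from dicts."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            # Runs at import time, which is why the module must be imported.
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is not mutated
        cls_name = args.pop('type')
        if cls_name not in self._modules:
            raise KeyError(f'{cls_name} is not in the {self.name} registry')
        return self._modules[cls_name](**args)


BACKBONES = Registry('backbone')


@BACKBONES.register_module()
class MobileNetV3:
    """Stub that records its arguments; the real backbone builds layers."""

    def __init__(self, arch='small', out_indices=(0, 1, 12)):
        self.arch = arch
        self.out_indices = out_indices


backbone = BACKBONES.build(
    dict(type='MobileNetV3', arch='large', out_indices=(1, 3, 16)))
print(backbone.arch, backbone.out_indices)
```

If the module were never imported, the decorator would never run and the lookup by name would fail, which is the failure mode the `__init__.py` change guards against.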