[Feature] Add models for mobile and edge devices (#2152)
juncaipeng committed May 30, 2022
1 parent fce4a69 commit 53d5932
Showing 16 changed files with 2,621 additions and 292 deletions.
29 changes: 29 additions & 0 deletions configs/mobileseg/README.md
@@ -0,0 +1,29 @@
# MobileSeg

These semantic segmentation models are designed for mobile and edge devices.

MobileSeg models adopt an encoder-decoder architecture and use lightweight backbones as the encoder.

## Reference

> Sandler, Mark, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. "Mobilenetv2: Inverted residuals and linear bottlenecks." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4510-4520. 2018.

> Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. "Searching for mobilenetv3." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314-1324. 2019.

> Ma, Ningning, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. "Shufflenet v2: Practical guidelines for efficient cnn architecture design." In Proceedings of the European conference on computer vision (ECCV), pp. 116-131. 2018.

> Yu, Changqian, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, and Jingdong Wang. "Lite-hrnet: A lightweight high-resolution network." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10440-10450. 2021.

> Han, Kai, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, and Chang Xu. "Ghostnet: More features from cheap operations." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580-1589. 2020.

## Performance

### Cityscapes

| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|MobileSeg|MobileNetV2|1024x512|80000|73.94%|74.32%|75.33%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv2_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv2_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=f210c79b6fd52f5135cf2f238e9d678d)|
|MobileSeg|MobileNetV3_large_x1_0|1024x512|80000|73.47%|73.72%|74.70%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv3_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv3_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=28c57d0e666337ea98a1046160ef95d2)|
|MobileSeg|Lite_HRNet_18|1024x512|80000|70.75%|71.62%|72.40%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_litehrnet18_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_litehrnet18_cityscapes_1024x512_80k/train.log) \| [vdl](https://www.paddlepaddle.org.cn/paddle/visualdl/service/app/scalar?id=02706145c7c463f3c76a0cb9d54728b8)|
|MobileSeg|ShuffleNetV2_x1_0|1024x512|80000|69.46%|70.00%|70.90%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_shufflenetv2_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_shufflenetv2_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=3d83c00cf9b90f2446959e8c97a4fb7a)|
|MobileSeg|GhostNet_x1_0|1024x512|80000|71.88%|72.22%|73.11%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_ghostnet_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_ghostnet_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=73a6b325c0ae941a40746d53911c03bc)|
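The mIoU (flip) and mIoU (ms+flip) columns report test-time augmentation: the model is also run on a horizontally flipped (and, for ms+flip, rescaled) copy of the image, and the per-pixel class scores are averaged before taking the argmax. A minimal sketch of the flip-averaging idea in plain Python, with a toy one-dimensional `predict` standing in for the segmentation model (both names are illustrative, not PaddleSeg APIs):

```python
# Sketch of flip test-time augmentation: average the scores of the
# original and the horizontally flipped input, then take the argmax.

def predict(pixels):
    # Toy "model": the score for class 1 grows with pixel intensity.
    return [[1.0 - p, p] for p in pixels]

def flip_tta(pixels):
    """Average scores of the original and the flipped input."""
    fwd = predict(pixels)
    # Predict on the flipped input, then flip the scores back into place.
    bwd = predict(pixels[::-1])[::-1]
    return [[(a + b) / 2 for a, b in zip(f, g)] for f, g in zip(fwd, bwd)]

image = [0.1, 0.4, 0.9]
scores = flip_tta(image)
labels = [max(range(2), key=lambda c: s[c]) for s in scores]
print(labels)  # per-pixel class labels after flip averaging
```

Multi-scale evaluation extends the same averaging over several input resolutions before the argmax.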
48 changes: 48 additions & 0 deletions configs/mobileseg/mobileseg_ghostnet_cityscapes_1024x512_80k.yml
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4  # batch size per GPU; 4 GPUs are used by default
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: GhostNet_x1_0  # out channels: [24, 40, 112, 160]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/ghostnet_x1_0.zip
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
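The `lr_scheduler` section above specifies linear warmup from `warmup_start_lr` to `learning_rate` over the first 1000 iterations, followed by decay over the remaining iterations. A small sketch of the resulting schedule, assuming the polynomial decay with power 0.9 that the `_base_` Cityscapes config conventionally supplies (an assumption; the power is not set in this file):

```python
# Sketch of the implied learning-rate schedule: linear warmup, then
# polynomial decay. POWER = 0.9 is an assumed base-config default.

BASE_LR = 0.005
WARMUP_START_LR = 1.0e-5
WARMUP_ITERS = 1000
TOTAL_ITERS = 80000
POWER = 0.9  # assumed; inherited from the _base_ config, not this file

def lr_at(it):
    if it < WARMUP_ITERS:
        # Linear ramp from warmup_start_lr up to the base learning rate.
        frac = it / WARMUP_ITERS
        return WARMUP_START_LR + (BASE_LR - WARMUP_START_LR) * frac
    # Polynomial decay down to zero over the remaining iterations.
    decay_frac = 1 - (it - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    return BASE_LR * decay_frac ** POWER

print(lr_at(0), lr_at(1000), lr_at(80000))
```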
@@ -0,0 +1,50 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: Lite_HRNet_18
    use_head: True  # out channels: [40, 40, 80, 160] when True, [40, 80, 160, 320] when False
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/lite_hrnet_18.tar.gz
  backbone_indices: [0, 1, 2]
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
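This config is the only one that sets `backbone_indices`: of the four feature maps the backbone produces, only the listed ones are fed to the decoder, and they must line up one-to-one with `arm_out_chs`. A tiny sketch of that selection, using the channel counts from the comment above:

```python
# Sketch of what backbone_indices does: pick a subset of the backbone's
# stage outputs for the decoder. Channel counts are from the config
# comment (use_head: True for Lite-HRNet-18).

stage_channels = [40, 40, 80, 160]   # use_head: True
backbone_indices = [0, 1, 2]
arm_out_chs = [32, 64, 128]

selected = [stage_channels[i] for i in backbone_indices]
# One fusion/ARM module per selected feature map.
assert len(selected) == len(arm_out_chs)
print(selected)
```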
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: MobileNetV2_x1_0  # out channels: [24, 32, 96, 320]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/mobilenetv2_x1_0_ssld.tar.gz
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
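Each `OhemCrossEntropyLoss` above mines hard examples: it computes the loss only over hard pixels, keeping at least `min_kept` of them (130000, roughly a quarter of the 1024x512 crop). A plain-Python illustration of the selection idea, not the PaddleSeg implementation (the 0.7 threshold here is an assumed illustrative default, not taken from this config):

```python
# Sketch of OHEM cross-entropy: average the loss over "hard" pixels only.
# A pixel is hard if the predicted probability of its true class is low;
# if too few qualify, fall back to the min_kept lowest-probability pixels.
import math

def ohem_ce(true_class_probs, min_kept, thresh=0.7):
    hard = [p for p in true_class_probs if p < thresh]
    if len(hard) < min_kept:
        hard = sorted(true_class_probs)[:min_kept]
    return sum(-math.log(p) for p in hard) / len(hard)

probs = [0.99, 0.95, 0.6, 0.3, 0.1]   # predicted prob of the correct class
print(round(ohem_ce(probs, min_kept=2), 4))
```

Easy pixels (here 0.99 and 0.95) contribute nothing, so training focuses on the pixels the model still gets wrong.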
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: MobileNetV3_large_x1_0  # out channels: [24, 40, 112, 160]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/mobilenetv3_large_x1_0_ssld.tar.gz
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
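The `ResizeStepScaling` transform above draws a random scale factor between `min_scale_factor` and `max_scale_factor` in increments of `scale_step_size`, so only a fixed, discrete set of scales is ever used. A small illustration of that sampling (not the PaddleSeg code):

```python
# Sketch of ResizeStepScaling as configured: scales 0.5 .. 2.0 in 0.25 steps.
import random

MIN_S, MAX_S, STEP = 0.5, 2.0, 0.25
num_steps = int(round((MAX_S - MIN_S) / STEP)) + 1
candidate_scales = [MIN_S + i * STEP for i in range(num_steps)]

def sample_size(h, w):
    """Pick a random candidate scale and return the resized (h, w)."""
    s = random.choice(candidate_scales)
    return int(h * s), int(w * s)

print(candidate_scales)
```

The subsequent `RandomPaddingCrop` then pads or crops the rescaled image back to a fixed 1024x512 training crop.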
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: ShuffleNetV2_x1_0  # out channels: [24, 116, 232, 464]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/shufflenetv2_x1_0.zip
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
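`cm_bin_sizes: [1, 2, 4]` configures the context module to pool the deepest backbone feature to 1x1, 2x2, and 4x4 grids (adaptive average pooling at several scales) before fusing the results into `cm_out_ch` channels. A plain-Python sketch of adaptive average pooling over a 2-D feature map, shown for each configured bin size:

```python
# Sketch of adaptive average pooling, the operation behind cm_bin_sizes:
# each output cell averages its own rectangular region of the input.

def adaptive_avg_pool(feat, out_size):
    """Pool a 2-D feature map (list of lists) to out_size x out_size."""
    h, w = len(feat), len(feat[0])
    pooled = []
    for i in range(out_size):
        row = []
        r0, r1 = i * h // out_size, (i + 1) * h // out_size
        for j in range(out_size):
            c0, c1 = j * w // out_size, (j + 1) * w // out_size
            vals = [feat[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(vals) / len(vals))
        pooled.append(row)
    return pooled

feat = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
for bins in [1, 2, 4]:
    print(bins, adaptive_avg_pool(feat, bins))
```

The 1x1 bin captures global context, while the 2x2 and 4x4 bins keep coarse spatial layout, a pyramid-pooling-style design.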
1 change: 1 addition & 0 deletions paddleseg/models/__init__.py
@@ -59,3 +59,4 @@
from .glore import GloRe
from .ddrnet import DDRNet_23
from .ccnet import CCNet
from .mobileseg import MobileSeg
3 changes: 3 additions & 0 deletions paddleseg/models/backbones/__init__.py
@@ -21,3 +21,6 @@
from .mobilenetv2 import *
from .mix_transformer import *
from .stdcnet import *
from .lite_hrnet import *
from .shufflenetv2 import *
from .ghostnet import *