[Feature] Add models for mobile and edge devices (#2152)
1 parent fce4a69, commit 53d5932
Showing 16 changed files with 2,621 additions and 292 deletions.
@@ -0,0 +1,29 @@
# MobileSeg

These semantic segmentation models are designed for mobile and edge devices.

MobileSeg models adopt an encoder-decoder architecture and use lightweight networks as the encoder.

## Reference

> Sandler, Mark, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. "MobileNetV2: Inverted residuals and linear bottlenecks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510-4520. 2018.

> Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. "Searching for MobileNetV3." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314-1324. 2019.

> Ma, Ningning, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. "ShuffleNet V2: Practical guidelines for efficient CNN architecture design." In Proceedings of the European Conference on Computer Vision (ECCV), pp. 116-131. 2018.

> Yu, Changqian, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, and Jingdong Wang. "Lite-HRNet: A lightweight high-resolution network." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10440-10450. 2021.

> Han, Kai, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, and Chang Xu. "GhostNet: More features from cheap operations." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580-1589. 2020.

## Performance

### Cityscapes

| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|MobileSeg|MobileNetV2|1024x512|80000|73.94%|74.32%|75.33%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv2_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv2_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=f210c79b6fd52f5135cf2f238e9d678d)|
|MobileSeg|MobileNetV3_large_x1_0|1024x512|80000|73.47%|73.72%|74.70%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv3_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_mobilenetv3_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=28c57d0e666337ea98a1046160ef95d2)|
|MobileSeg|Lite_HRNet_18|1024x512|80000|70.75%|71.62%|72.40%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_litehrnet18_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_litehrnet18_cityscapes_1024x512_80k/train.log) \| [vdl](https://www.paddlepaddle.org.cn/paddle/visualdl/service/app/scalar?id=02706145c7c463f3c76a0cb9d54728b8)|
|MobileSeg|ShuffleNetV2_x1_0|1024x512|80000|69.46%|70.00%|70.90%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_shufflenetv2_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_shufflenetv2_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=3d83c00cf9b90f2446959e8c97a4fb7a)|
|MobileSeg|GhostNet_x1_0|1024x512|80000|71.88%|72.22%|73.11%|[model](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_ghostnet_cityscapes_1024x512_80k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/cityscapes/mobileseg_ghostnet_cityscapes_1024x512_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=73a6b325c0ae941a40746d53911c03bc)|
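
To make the Cityscapes table easier to compare, the sketch below recomputes how many mIoU points flip and multi-scale plus flip evaluation add over the plain mIoU for each backbone. The numbers are copied from the table; the dictionary layout and the `tta_gains` helper are illustrative names and not part of PaddleSeg.

```python
# mIoU values (%) copied from the Cityscapes table:
# (plain, flip, multi-scale + flip)
RESULTS = {
    "MobileNetV2": (73.94, 74.32, 75.33),
    "MobileNetV3_large_x1_0": (73.47, 73.72, 74.70),
    "Lite_HRNet_18": (70.75, 71.62, 72.40),
    "ShuffleNetV2_x1_0": (69.46, 70.00, 70.90),
    "GhostNet_x1_0": (71.88, 72.22, 73.11),
}


def tta_gains(results):
    """Return {backbone: (flip_gain, ms_flip_gain)} in mIoU points."""
    return {
        name: (round(flip - base, 2), round(ms_flip - base, 2))
        for name, (base, flip, ms_flip) in results.items()
    }


if __name__ == "__main__":
    gains = tta_gains(RESULTS)
    # List backbones by plain mIoU, best first.
    for name in sorted(RESULTS, key=lambda n: RESULTS[n][0], reverse=True):
        flip_gain, ms_gain = gains[name]
        print(f"{name:<24} base={RESULTS[name][0]:.2f} "
              f"flip=+{flip_gain:.2f} ms+flip=+{ms_gain:.2f}")
```

With these numbers, MobileNetV2 has the highest plain mIoU, and multi-scale plus flip evaluation adds between roughly 1.2 and 1.7 points across the five backbones.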
configs/mobileseg/mobileseg_ghostnet_cityscapes_1024x512_80k.yml
48 changes: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4  # uses 4 GPUs by default
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: GhostNet_x1_0  # out channels: [24, 40, 112, 160]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/ghostnet_x1_0.zip
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
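
Each config starts with a `_base_` entry, so the file lists only the keys it overrides. One plausible way to resolve such an override is a recursive dictionary merge, sketched below. The `merge_config` helper and the base values are hypothetical (the contents of `../_base_/cityscapes.yml` are not shown in this diff), and PaddleSeg's own config loader may behave differently in detail.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the value inherited from `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base values, for illustration only.
base = {
    "batch_size": 2,
    "optimizer": {"type": "sgd", "momentum": 0.9, "weight_decay": 4.0e-5},
}
# Keys and values taken from the MobileSeg config.
override = {
    "batch_size": 4,
    "iters": 80000,
    "optimizer": {"weight_decay": 5.0e-4},
}

merged = merge_config(base, override)
print(merged)
```

In this sketch the overriding file changes `weight_decay` while the inherited `type` and `momentum` stay untouched, which is why the MobileSeg configs can stay short.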
configs/mobileseg/mobileseg_litehrnet18_cityscapes_1024x512_80k.yml
50 changes: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: Lite_HRNet_18
    use_head: True  # False : [40, 80, 160, 320] True: [40, 40, 80, 160]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/lite_hrnet_18.tar.gz
  backbone_indices: [0, 1, 2]
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
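
The `use_head` comment lists four backbone output widths, and `backbone_indices: [0, 1, 2]` selects three of them, the same count as `arm_out_chs`. The sketch below only illustrates that index selection with plain lists. The `select_stages` helper is a hypothetical name, and the one-to-one pairing of selected stages with `arm_out_chs` is an assumption, not something this diff confirms.

```python
def select_stages(out_channels, indices):
    """Pick the backbone stage widths addressed by `indices`."""
    return [out_channels[i] for i in indices]


# Channel lists copied from the `use_head` comment in the config.
LITE_HRNET_18_CHANNELS = {
    True: [40, 40, 80, 160],
    False: [40, 80, 160, 320],
}

backbone_indices = [0, 1, 2]
arm_out_chs = [32, 64, 128]

selected = select_stages(LITE_HRNET_18_CHANNELS[True], backbone_indices)

# Assumed pairing: one fusion module per selected stage.
for in_ch, out_ch in zip(selected, arm_out_chs):
    print(f"stage with {in_ch} channels -> fused to {out_ch} channels")
```

With `use_head: True` and indices `[0, 1, 2]`, the selected widths are 40, 40 and 80, so the widest stage (160 channels) is left unused in this configuration.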
configs/mobileseg/mobileseg_mobilenetv2_cityscapes_1024x512_80k.yml
48 changes: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: MobileNetV2_x1_0  # out channels: [24, 32, 96, 320]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/mobilenetv2_x1_0_ssld.tar.gz
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
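
The `loss` section uses `OhemCrossEntropyLoss` with `min_kept: 130000`. The idea behind online hard example mining is to average the loss over difficult pixels only while guaranteeing a minimum number of kept pixels. The pure-Python sketch below shows that selection rule on a flat list of probabilities. It is a simplified illustration, not PaddleSeg's implementation: the `thresh=0.7` default is an assumption (the config sets only `min_kept`), and the tiny `min_kept` in the example stands in for 130000.

```python
import math


def ohem_cross_entropy(true_class_probs, thresh=0.7, min_kept=3):
    """Mean cross-entropy over the hardest pixels only.

    `true_class_probs` holds, per pixel, the predicted probability of the
    ground-truth class. Pixels at or below `thresh` count as hard. If fewer
    than `min_kept` pixels qualify, the threshold is raised so that the
    `min_kept` lowest-probability pixels are kept.
    """
    probs = sorted(true_class_probs)
    if not probs:
        return 0.0
    if min_kept >= len(probs):
        kept = probs  # too few pixels to mine, keep everything
    else:
        threshold = thresh
        if min_kept > 0:
            threshold = max(thresh, probs[min_kept - 1])
        kept = [p for p in probs if p <= threshold]
    if not kept:
        return 0.0
    # Clamp to avoid log(0) for degenerate predictions.
    return sum(-math.log(max(p, 1e-12)) for p in kept) / len(kept)


if __name__ == "__main__":
    probs = [0.95, 0.9, 0.2, 0.6, 0.99, 0.8]
    # Only 0.2 and 0.6 fall below 0.7, so the threshold rises to 0.8
    # and the three hardest pixels (0.2, 0.6, 0.8) are averaged.
    print(ohem_cross_entropy(probs, thresh=0.7, min_kept=3))
```

For scale, a single 1024x512 crop contains 524,288 pixels. The three loss entries with `coef: [1, 1, 1]` suggest three equally weighted outputs, though the diff itself does not say which outputs they are.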
configs/mobileseg/mobileseg_mobilenetv3_cityscapes_1024x512_80k.yml
48 changes: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: MobileNetV3_large_x1_0  # out channels: [24, 40, 112, 160]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/mobilenetv3_large_x1_0_ssld.tar.gz
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
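
The `ResizeStepScaling` settings (`min_scale_factor: 0.5`, `max_scale_factor: 2.0`, `scale_step_size: 0.25`) describe a discrete set of scale factors from which one is drawn per training sample. The sketch below enumerates that set and applies a random factor to an image size. Both helper names are hypothetical, the rounding of the scaled size is an assumption, and only a positive step size is handled.

```python
import random


def scale_candidates(min_scale=0.5, max_scale=2.0, step=0.25):
    """List the scale factors from min_scale to max_scale in `step` increments."""
    if step <= 0:
        raise ValueError("this sketch only covers a positive step size")
    count = int(round((max_scale - min_scale) / step)) + 1
    return [min_scale + i * step for i in range(count)]


def random_scaled_size(width, height, rng=random):
    """Draw one candidate factor and return (new_width, new_height, factor)."""
    factor = rng.choice(scale_candidates())
    return int(round(width * factor)), int(round(height * factor)), factor


if __name__ == "__main__":
    print(scale_candidates())  # [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
    # Cityscapes frames are 2048x1024 pixels.
    print(random_scaled_size(2048, 1024))
```

At the smallest factor a 2048x1024 frame shrinks to exactly 1024x512, the `crop_size` used by `RandomPaddingCrop`, so at that scale the crop covers the whole image.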
configs/mobileseg/mobileseg_shufflenetv2_cityscapes_1024x512_80k.yml
48 changes: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
_base_: '../_base_/cityscapes.yml'

batch_size: 4
iters: 80000

optimizer:
  weight_decay: 5.0e-4

lr_scheduler:
  warmup_iters: 1000
  warmup_start_lr: 1.0e-5
  learning_rate: 0.005

loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
    - type: OhemCrossEntropyLoss
      min_kept: 130000
  coef: [1, 1, 1]

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [1024, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
  mode: train

model:
  type: MobileSeg
  backbone:
    type: ShuffleNetV2_x1_0  # out channels: [24, 116, 232, 464]
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/shufflenetv2_x1_0.zip
  cm_bin_sizes: [1, 2, 4]
  cm_out_ch: 128
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 32, 32]
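
All five configs share the same `lr_scheduler` values: 1000 warmup iterations starting at `1.0e-5` and a peak `learning_rate` of 0.005 over 80000 iterations. The sketch below shows what such a schedule could look like. The linear warmup uses the values from the config; the polynomial decay that follows, with `power=0.9` and `end_lr=0.0`, is an assumption, because the scheduler type is inherited from the base config, which this diff does not show.

```python
def learning_rate(step, base_lr=0.005, warmup_iters=1000,
                  warmup_start_lr=1.0e-5, total_iters=80000,
                  power=0.9, end_lr=0.0):
    """Learning rate at a 0-based training `step`.

    Linear warmup from `warmup_start_lr` to `base_lr`, followed by
    polynomial decay over the remaining iterations (assumed shape).
    """
    if step < warmup_iters:
        frac = step / warmup_iters
        return warmup_start_lr + (base_lr - warmup_start_lr) * frac
    decay_steps = total_iters - warmup_iters
    progress = min(step - warmup_iters, decay_steps) / decay_steps
    return (base_lr - end_lr) * (1.0 - progress) ** power + end_lr


if __name__ == "__main__":
    for step in (0, 500, 1000, 40000, 80000):
        print(step, learning_rate(step))
```

Under these assumptions the rate climbs from `1.0e-5` to 0.005 during the first 1000 iterations and then decays toward zero by iteration 80000.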