[WIP][Feature] Add ViT-Adapter Model #2762

Merged
merged 17 commits into from
Mar 17, 2023
add vit-adapter and align backbone forward
juncaipeng committed Nov 10, 2022
commit f73f5529a3e81852e18c0f79eea7518809442a45
15 changes: 15 additions & 0 deletions configs/vit_adapter/README.md
@@ -0,0 +1,15 @@
# Semantic Flow for Fast and Accurate Scene Parsing

## Reference

> Xiangtai Li, Ansheng You, Zhen Zhu, Houlong Zhao, Maoke Yang, Kuiyuan Yang, Shaohua Tan, Yunhai Tong:
Semantic Flow for Fast and Accurate Scene Parsing. ECCV (1) 2020: 775-793 .

## Performance

### Cityscapes

| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|-|-|-|-|-|-|-|-|
|SFNet|ResNet18_OS8|1024x1024|80000|78.72%|79.11%|79.28%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/sfnet_resnet18_os8_cityscapes_1024x1024_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/sfnet_resnet18_os8_cityscapes_1024x1024_80k/train.log) \| [vdl](https://www.paddlepaddle.org.cn/paddle/visualdl/service/app/scalar?id=0d790ad96282048b136342fcebb08d14)|
|SFNet|ResNet50_OS8|1024x1024|80000|81.49%|81.63%|81.85%|[model](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/sfnet_resnet50_os8_cityscapes_1024x1024_80k/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/cityscapes/sfnet_resnet50_os8_cityscapes_1024x1024_80k/train.log) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=d458349ec63ea8ccd6fae84afa8ea981)|
77 changes: 77 additions & 0 deletions configs/vit_adapter/upernet_deit_adapter_tiny_512_160k_ade20k.yml
@@ -0,0 +1,77 @@
_base_: '../_base_/ade20k.yml'

batch_size: 4  # total batch size is 16
iters: 160000

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

val_dataset:
  transforms:
    - type: Resize
      target_size: [2048, 512]
      keep_ratio: True
      size_divisor: 32
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

export:
  transforms:
    - type: Resize
      target_size: [2048, 512]
      keep_ratio: True
      size_divisor: 32
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

optimizer:
  _inherited_: False
  type: AdamW
  weight_decay: 0.01

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.0012
  end_lr: 0
  power: 1.0
  warmup_iters: 1500
  warmup_start_lr: 1.0e-6

loss:
  types:
    - type: CrossEntropyLoss
  coef: [1]

model:
  type: TopFormer
  backbone:
    type: ViTAdapter
    num_heads: 3
    patch_size: 16
    embed_dim: 192
    depth: 12
    mlp_ratio: 4
    drop_path_rate: 0.1
    conv_inplane: 64
    n_points: 4
    deform_num_heads: 6
    cffn_ratio: 0.25
    deform_ratio: 1.0
    interaction_indexes: [[0, 2], [3, 5], [6, 8], [9, 11]]
  pretrained: pretrained_model/upernet_deit_adapter_tiny_512_160_ade20k_from_torch.pdparams
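
Note on the `lr_scheduler` settings in this config: the values describe a warmup followed by a polynomial decay. The sketch below is a plain-Python approximation of that schedule, not PaddleSeg code. It assumes the warmup is linear from `warmup_start_lr` to `learning_rate` over `warmup_iters`, and that the decay then runs over the remaining `iters - warmup_iters` steps; both points are assumptions about how the trainer combines the two phases, and the function name `lr_at` is hypothetical.

```python
def lr_at(step,
          base_lr=0.0012,          # lr_scheduler.learning_rate
          end_lr=0.0,              # lr_scheduler.end_lr
          power=1.0,               # lr_scheduler.power
          total_iters=160000,      # iters
          warmup_iters=1500,       # lr_scheduler.warmup_iters
          warmup_start_lr=1.0e-6): # lr_scheduler.warmup_start_lr
    """Approximate learning rate at a given iteration (assumed schedule)."""
    if step < warmup_iters:
        # Assumed: linear ramp from warmup_start_lr up to base_lr.
        frac = step / warmup_iters
        return warmup_start_lr + (base_lr - warmup_start_lr) * frac

    # Assumed: polynomial decay spans the iterations left after warmup.
    decay_steps = max(total_iters - warmup_iters, 1)
    t = min(step - warmup_iters, decay_steps)
    return (base_lr - end_lr) * (1.0 - t / decay_steps) ** power + end_lr


if __name__ == "__main__":
    for s in (0, 750, 1500, 80750, 160000):
        print(s, lr_at(s))
```

With `power: 1.0` the decay phase is a straight line from 0.0012 down to 0, so under these assumptions the rate is halved at the midpoint of the post-warmup phase.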
1 change: 1 addition & 0 deletions paddleseg/models/backbones/__init__.py
@@ -26,3 +26,4 @@
from .ghostnet import *
from .top_transformer import *
from .uhrnet import *
from .vit_adapter import *
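
Note on why this one-line import matters: the config refers to the backbone by name (`type: ViTAdapter`), so the class has to be registered before the config is parsed, and importing the module is what triggers that registration. The sketch below illustrates the general registry pattern with a toy stand-in; `ComponentRegistry`, `build_backbone`, and the stub `ViTAdapter` class are hypothetical and are not PaddleSeg's actual implementation.

```python
class ComponentRegistry:
    """Toy name-to-class registry (hypothetical stand-in)."""

    def __init__(self, name):
        self.name = name
        self._components = {}

    def add_component(self, cls):
        # Used as a decorator: runs when the defining module is imported.
        self._components[cls.__name__] = cls
        return cls

    def __getitem__(self, key):
        if key not in self._components:
            raise KeyError(f"{key} is not registered in {self.name}")
        return self._components[key]


BACKBONES = ComponentRegistry("backbones")


@BACKBONES.add_component
class ViTAdapter:
    """Stub class; only stores a few of the config values."""

    def __init__(self, embed_dim=192, depth=12, num_heads=3):
        self.embed_dim = embed_dim
        self.depth = depth
        self.num_heads = num_heads


def build_backbone(cfg):
    """Look up cfg['type'] in the registry and pass the rest as kwargs."""
    cfg = dict(cfg)  # copy so the caller's dict is not mutated
    cls = BACKBONES[cfg.pop("type")]
    return cls(**cfg)


backbone = build_backbone(
    {"type": "ViTAdapter", "embed_dim": 192, "depth": 12, "num_heads": 3})
```

If the module defining the class is never imported, the decorator never runs and the lookup by name fails, which is the failure mode this `__init__.py` change avoids.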