[WIP][Feature] Add ViT-Adapter Model (#2762)
juncaipeng authored Mar 17, 2023
1 parent 4aae37a commit 574be6f
Showing 9 changed files with 1,410 additions and 4 deletions.
16 changes: 16 additions & 0 deletions configs/vit_adapter/README.md
@@ -0,0 +1,16 @@
# Vision Transformer Adapter for Dense Predictions

## Reference

> Chen, Zhe, Yuchen Duan, Wenhai Wang, Junjun He, Tong Lu, Jifeng Dai, and Yu Qiao. "Vision Transformer Adapter for Dense Predictions." arXiv preprint arXiv:2205.08534 (2022).

## Prerequisites

Download [ms_deform_attn.zip](https://paddleseg.bj.bcebos.com/dygraph/customized_ops/ms_deform_attn.zip), then follow its README to install the `ms_deform_attn` library.

## Performance

### ADE20K

| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|-|-|-|-|-|-|-|-|
|UPerNetViTAdapter|ViT-Adapter-Tiny|512x512|160000|41.90%|-|-|[model](https://paddleseg.bj.bcebos.com/dygraph/ade20k/upernet_vit_adapter_tiny_ade20k_512x512_160k/model.pdparams) \| [log](https://paddleseg.bj.bcebos.com/dygraph/ade20k/upernet_vit_adapter_tiny_ade20k_512x512_160k/train_log.txt) \| [vdl](https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=88173046bd09f61da5f48db66baddd7d)|
@@ -0,0 +1,66 @@
_base_: '../_base_/ade20k.yml'

batch_size: 4 # total batch size is 16
iters: 160000

train_dataset:
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

val_dataset:
  transforms:
    - type: Resize
      target_size: [2048, 512]
      keep_ratio: True
      size_divisor: 32
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

test_config:
  is_slide: True
  crop_size: [512, 512]
  stride: [341, 341]

optimizer:
  _inherited_: False
  type: AdamW
  weight_decay: 0.01

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 6.0e-5
  end_lr: 0
  power: 1.0
  warmup_iters: 1500
  warmup_start_lr: 1.0e-6

loss:
  types:
    - type: CrossEntropyLoss
      avg_non_ignore: False
  coef: [1, 0.4]

model:
  type: UPerNetViTAdapter
  backbone:
    type: ViTAdapter_Tiny
    pretrained: https://paddleseg.bj.bcebos.com/dygraph/backbone/deit_tiny_patch16_224.zip
  backbone_indices: [0, 1, 2, 3]
  channels: 512
  pool_scales: [1, 2, 3, 6]
  dropout_ratio: 0.1
  aux_loss: True
  aux_channels: 256
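
Two values in the config above are easier to read with numbers attached: the `lr_scheduler` settings and the `test_config` sliding window. The following is a minimal plain-Python sketch, not the scheduler or inference code shipped in the repository. It assumes a linear warmup from `warmup_start_lr` to `learning_rate` followed by polynomial decay over the remaining iterations, and it assumes the number of windows along one axis is `ceil((length - crop) / stride) + 1`. The function names `learning_rate` and `num_windows` are hypothetical.

```python
import math

# Values copied from the config; everything else here is an illustrative assumption.
BASE_LR = 6.0e-5
END_LR = 0.0
POWER = 1.0
TOTAL_ITERS = 160000
WARMUP_ITERS = 1500
WARMUP_START_LR = 1.0e-6


def learning_rate(it: int) -> float:
    """Assumed schedule: linear warmup, then polynomial decay over the remaining iterations."""
    if it < WARMUP_ITERS:
        frac = it / WARMUP_ITERS
        return WARMUP_START_LR + (BASE_LR - WARMUP_START_LR) * frac
    decay_steps = TOTAL_ITERS - WARMUP_ITERS
    step = min(it - WARMUP_ITERS, decay_steps)
    return (BASE_LR - END_LR) * (1 - step / decay_steps) ** POWER + END_LR


def num_windows(length: int, crop: int = 512, stride: int = 341) -> int:
    """Assumed number of sliding windows along one image axis."""
    return max(math.ceil((length - crop) / stride), 0) + 1


if __name__ == "__main__":
    for it in (0, 750, 1500, 80000, 160000):
        print(f"iter {it:>6}: lr = {learning_rate(it):.3e}")
    for side in (512, 683, 2048):
        print(f"side {side:>4}: {num_windows(side)} window(s)")
```

With `power: 1.0` the decay phase is a straight line from the base rate down to `end_lr`. A stride of 341 against a crop of 512 means neighbouring windows overlap by roughly one third of the crop.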
1 change: 1 addition & 0 deletions paddleseg/models/__init__.py
@@ -67,6 +67,7 @@
from .mscale_ocrnet import MscaleOCRNet
from .topformer import TopFormer
from .rtformer import RTFormer
from .upernet_vit_adapter import UPerNetViTAdapter
from .lpsnet import LPSNet
from .maskformer import MaskFormer
from .segnext import SegNeXt
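
The new import matters because the YAML config refers to the model by name (`type: UPerNetViTAdapter`). Assuming the usual registry pattern, in which a class registers itself through a decorator when its module is imported, the import in `__init__.py` is what makes that name resolvable. The sketch below is a self-contained toy version of that pattern: `ComponentRegistry`, `build_from_config`, and the stub class body are hypothetical stand-ins, not the library's actual implementation.

```python
class ComponentRegistry:
    """Hypothetical name -> class registry used only for illustration."""

    def __init__(self, name):
        self.name = name
        self._components = {}

    def add_component(self, cls):
        # Used as a decorator: runs once, when the defining module is imported.
        self._components[cls.__name__] = cls
        return cls

    def __getitem__(self, key):
        if key not in self._components:
            raise KeyError(f"{key} is not registered in {self.name}")
        return self._components[key]


MODELS = ComponentRegistry("models")


@MODELS.add_component
class UPerNetViTAdapter:
    """Stub standing in for the real model class."""

    def __init__(self, channels=512):
        self.channels = channels


def build_from_config(cfg):
    """Look up the class named by 'type' and pass the other keys as kwargs."""
    cfg = dict(cfg)  # copy so the caller's dict is left untouched
    cls = MODELS[cfg.pop("type")]
    return cls(**cfg)


model = build_from_config({"type": "UPerNetViTAdapter", "channels": 512})
print(type(model).__name__, model.channels)
```

If the module defining the class is never imported, the decorator never runs and the lookup fails with a `KeyError`, which is the situation the added import line avoids.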
3 changes: 2 additions & 1 deletion paddleseg/models/backbones/__init__.py
@@ -27,5 +27,6 @@
from .cae import *
from .top_transformer import *
from .uhrnet import *
from .vit_adapter import *
from .hrformer import *
from .mscan import *