[Feature] Support MaskFormer (NeurIPS'2021) in MMSeg 1.x #2215

Merged: 23 commits, Dec 1, 2022
9 changes: 6 additions & 3 deletions .circleci/test.yml
@@ -61,8 +61,9 @@ jobs:
command: |
pip install git+https://github.com/open-mmlab/mmengine.git@main
pip install -U openmim
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
pip install -r requirements/tests.txt -r requirements/optional.txt
- run:
name: Build and install
@@ -96,18 +97,20 @@ jobs:
command: |
git clone -b main --depth 1 https://github.com/open-mmlab/mmengine.git /home/circleci/mmengine
git clone -b dev-1.x --depth 1 https://github.com/open-mmlab/mmclassification.git /home/circleci/mmclassification
git clone -b dev-3.x --depth 1 https://github.com/open-mmlab/mmdetection.git /home/circleci/mmdetection
- run:
name: Build Docker image
command: |
docker build .circleci/docker -t mmseg:gpu --build-arg PYTORCH=<< parameters.torch >> --build-arg CUDA=<< parameters.cuda >> --build-arg CUDNN=<< parameters.cudnn >>
docker run --gpus all -t -d -v /home/circleci/project:/mmseg -v /home/circleci/mmengine:/mmengine -v /home/circleci/mmclassification:/mmclassification -w /mmseg --name mmseg mmseg:gpu
docker run --gpus all -t -d -v /home/circleci/project:/mmseg -v /home/circleci/mmengine:/mmengine -v /home/circleci/mmclassification:/mmclassification -v /home/circleci/mmdetection:/mmdetection -w /mmseg --name mmseg mmseg:gpu
- run:
name: Install mmseg dependencies
command: |
docker exec mmseg pip install -e /mmengine
docker exec mmseg pip install -U openmim
docker exec mmseg mim install 'mmcv>=2.0.0rc1'
docker exec mmseg mim install 'mmcv>=2.0.0rc3'
docker exec mmseg pip install -e /mmclassification
docker exec mmseg pip install -e /mmdetection
docker exec mmseg pip install -r requirements/tests.txt -r requirements/optional.txt
- run:
name: Build and install
12 changes: 8 additions & 4 deletions .github/workflows/merge_stage_test.yml
@@ -44,8 +44,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
@@ -92,8 +93,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
@@ -155,8 +157,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
@@ -187,8 +190,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
9 changes: 6 additions & 3 deletions .github/workflows/pr_stage_test.yml
@@ -40,8 +40,9 @@ jobs:
run: |
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
@@ -92,8 +93,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
@@ -124,8 +126,9 @@ jobs:
python -V
pip install -U openmim
pip install git+https://github.com/open-mmlab/mmengine.git
mim install 'mmcv>=2.0.0rc1'
mim install 'mmcv>=2.0.0rc3'
pip install git+https://github.com/open-mmlab/mmclassification.git@dev-1.x
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install unittest dependencies
run: pip install -r requirements/tests.txt -r requirements/optional.txt
- name: Build and install
1 change: 1 addition & 0 deletions README.md
@@ -139,6 +139,7 @@ Supported methods:
- [x] [Segmenter (ICCV'2021)](configs/segmenter)
- [x] [SegFormer (NeurIPS'2021)](configs/segformer)
- [x] [K-Net (NeurIPS'2021)](configs/knet)
- [x] [MaskFormer (NeurIPS'2021)](configs/maskformer)

Supported datasets:

1 change: 1 addition & 0 deletions README_zh-CN.md
@@ -134,6 +134,7 @@ MMSegmentation 是一个基于 PyTorch 的语义分割开源工具箱。它是 O
- [x] [Segmenter (ICCV'2021)](configs/segmenter)
- [x] [SegFormer (NeurIPS'2021)](configs/segformer)
- [x] [K-Net (NeurIPS'2021)](configs/knet)
- [x] [MaskFormer (NeurIPS'2021)](configs/maskformer)

已支持的数据集:

60 changes: 60 additions & 0 deletions configs/maskformer/README.md
@@ -0,0 +1,60 @@
# MaskFormer

[MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation](https://arxiv.org/abs/2107.06278)

## Introduction

<!-- [ALGORITHM] -->

<a href="https://github.com/facebookresearch/MaskFormer/">Official Repo</a>

<a href="https://github.com/open-mmlab/mmdetection/blob/dev-3.x/mmdet/models/dense_heads/maskformer_head.py#L21">Code Snippet</a>

## Abstract

<!-- [ABSTRACT] -->

Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.

<!-- [IMAGE] -->

<div align=center>
<img src="https://user-images.githubusercontent.com/24582831/199215459-ea507126-aafe-4823-8eb1-ae6487509d5c.png" width="90%"/>
</div>

```bibtex
@article{cheng2021per,
title={Per-pixel classification is not all you need for semantic segmentation},
author={Cheng, Bowen and Schwing, Alex and Kirillov, Alexander},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={17864--17875},
year={2021}
}
```
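
A quick way to see what "mask classification" means in practice: MaskFormer predicts N query masks and, for each query, a distribution over K classes plus a "no object" class, and semantic segmentation is obtained by combining the two. The sketch below is purely illustrative and follows the paper's semantic inference rule; the tensor names and shapes are assumptions, not MMSegmentation internals.

```python
import torch

# Toy shapes: 100 queries, 150 ADE20K classes, a 128x128 output grid.
num_queries, num_classes, H, W = 100, 150, 128, 128
cls_logits = torch.randn(num_queries, num_classes + 1)  # last column = "no object"
mask_logits = torch.randn(num_queries, H, W)            # one binary mask per query

cls_probs = cls_logits.softmax(dim=-1)[:, :-1]  # drop the "no object" column
mask_probs = mask_logits.sigmoid()

# Per-pixel class scores: each query's mask is weighted by its class probabilities.
seg_logits = torch.einsum('qc,qhw->chw', cls_probs, mask_probs)
semantic_map = seg_logits.argmax(dim=0)         # (H, W) predicted label map
```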

### Usage

- The MaskFormer model requires [MMDetection](https://github.com/open-mmlab/mmdetection) to be installed first (a minimal inference sketch follows the install command below).

```shell
pip install "mmdet>=3.0.0rc4"
```
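
Once the dependencies above are installed, inference can be run with the high-level MMSegmentation 1.x API. This is a minimal sketch, assuming the `init_model`/`inference_model` helpers from `mmseg.apis`, and it uses the R-50-D32 config and checkpoint listed in the table below; the demo image path is only a placeholder.

```python
from mmseg.apis import inference_model, init_model

config = 'configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py'
checkpoint = ('https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/'
              'maskformer_r50-d32_8xb2-160k_ade20k-512x512/'
              'maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth')

model = init_model(config, checkpoint, device='cuda:0')
result = inference_model(model, 'demo/demo.png')  # any RGB image path
print(result.pred_sem_seg.data.shape)             # per-pixel predicted labels
```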

## Results and models

### ADE20K

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ---------- | --------- | --------- | ------- | -------- | -------------- | ----- | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| MaskFormer | R-50-D32 | 512x512 | 160000 | 3.29 | 42.20 | 44.29 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724.json) |
| MaskFormer | R-101-D32 | 512x512 | 160000 | 4.12 | 34.90 | 45.11 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053-c8e0931d.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053.json) |
| MaskFormer | Swin-T | 512x512 | 160000 | 3.73 | 40.53 | 46.69 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813-03550716.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813.json) |
| MaskFormer | Swin-S | 512x512 | 160000 | 5.33 | 26.98 | 49.36 | - | [config](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710-5ab67e58.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710.json) |

Note:

- All MaskFormer experiments were run on 8 V100 (32 GB) GPUs with 2 samples per GPU.
- The results of MaskFormer are relatively unstable: the accuracy (mIoU) of the `R-101-D32` model ranges from 44.7 to 46.0 across runs, and that of the `Swin-S` model from 49.0 to 49.8.
- The ResNet backbones used in the MaskFormer models are standard `ResNet` rather than `ResNetV1c`.
- Test-time augmentation is not yet supported in MMSegmentation 1.x; we will add the "ms+flip" results as soon as possible.
101 changes: 101 additions & 0 deletions configs/maskformer/maskformer.yml
@@ -0,0 +1,101 @@
Collections:
- Name: MaskFormer
Metadata:
Training Data:
- ADE20K
Paper:
URL: https://arxiv.org/abs/2107.06278
Title: 'MaskFormer: Per-Pixel Classification is Not All You Need for Semantic
Segmentation'
README: configs/maskformer/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/dev-3.x/mmdet/models/dense_heads/maskformer_head.py#L21
Version: dev-3.x
Converted From:
Code: https://github.com/facebookresearch/MaskFormer/
Models:
- Name: maskformer_r50-d32_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: R-50-D32
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 23.7
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 3.29
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 44.29
Config: configs/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r50-d32_8xb2-160k_ade20k-512x512/maskformer_r50-d32_8xb2-160k_ade20k-512x512_20221030_182724-cbd39cc1.pth
- Name: maskformer_r101-d32_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: R-101-D32
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 28.65
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 4.12
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 45.11
Config: configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512/maskformer_r101-d32_8xb2-160k_ade20k-512x512_20221031_223053-c8e0931d.pth
- Name: maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: Swin-T
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 24.67
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 3.73
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 46.69
Config: configs/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-t_upernet_8xb2-160k_ade20k-512x512_20221114_232813-03550716.pth
- Name: maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512
In Collection: MaskFormer
Metadata:
backbone: Swin-S
crop size: (512,512)
lr schd: 160000
inference time (ms/im):
- value: 37.06
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (512,512)
Training Memory (GB): 5.33
Results:
- Task: Semantic Segmentation
Dataset: ADE20K
Metrics:
mIoU: 49.36
Config: configs/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512.py
Weights: https://download.openmmlab.com/mmsegmentation/v0.5/maskformer/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512/maskformer_swin-s_upernet_8xb2-160k_ade20k-512x512_20221115_114710-5ab67e58.pth
7 changes: 7 additions & 0 deletions configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py
@@ -0,0 +1,7 @@
_base_ = './maskformer_r50-d32_8xb2-160k_ade20k-512x512.py'

model = dict(
    backbone=dict(
        depth=101,
        init_cfg=dict(type='Pretrained',
                      checkpoint='torchvision://resnet101')))
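
As a sanity check of the `_base_` inheritance above, the config can be loaded with MMEngine and inspected. A small sketch, assuming it is run from the mmsegmentation repository root so the relative `_base_` path resolves:

```python
from mmengine.config import Config

cfg = Config.fromfile(
    'configs/maskformer/maskformer_r101-d32_8xb2-160k_ade20k-512x512.py')
print(cfg.model.backbone.depth)                # 101, overridden in this file
print(cfg.model.backbone.init_cfg.checkpoint)  # 'torchvision://resnet101'
```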