[Feature] Add multi-label semantic segmentation support (#3479)
* Add a conversion script for the UWMGI dataset

* Adapt the Dataset and Compose ops to read multi-label data

* Add visualization support for inference results in multi-label mode

* Add support for semantic segmentation evaluation metrics in multi-label mode

* Add support for passing the `--use_multilabel` flag in multi-label mode

* Add an example config file and documentation for multi-label semantic segmentation on the UWMGI dataset

* Add auxiliary transform ops for multi-label semantic segmentation

* Update the data augmentation strategy to speed up convergence

* Add config files that use the auxiliary transform ops

* Update the script to convert `UWMGI` and mainstream COCO-style annotations into the format supported by the ppseg dataset API

* Update the images and the commands related to the conversion script
MINGtoMING authored Sep 22, 2023
1 parent 1b9574e commit 63f95e6
Showing 18 changed files with 893 additions and 50 deletions.
54 changes: 54 additions & 0 deletions configs/_base_/uwmgi.yml
batch_size: 8
iters: 160000

train_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: RandomHorizontalFlip
    - type: RandomVerticalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  train_path: data/UWMGI/train.txt
  mode: train

val_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  val_path: data/UWMGI/val.txt
  mode: val

optimizer:
  type: SGD
  momentum: 0.9
  weight_decay: 4.0e-5

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.001
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MixedLoss
      losses:
        - type: BCELoss
        - type: LovaszHingeLoss
      coef: [0.5, 0.5]
  coef: [1]
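For intuition, the `MixedLoss` entry above is a coefficient-weighted sum of its component losses, applied per class channel with an independent sigmoid (multi-label heads do not use a softmax across classes). Below is a rough NumPy sketch under these assumptions, with a plain hinge loss standing in for `LovaszHingeLoss`; this is not PaddleSeg's actual implementation.

```python
import numpy as np

def bce(logits, labels, eps=1e-7):
    # Binary cross-entropy on sigmoid probabilities, averaged over pixels.
    probs = 1.0 / (1.0 + np.exp(-logits))
    return -np.mean(labels * np.log(probs + eps)
                    + (1 - labels) * np.log(1 - probs + eps))

def hinge(logits, labels):
    # Plain hinge loss; a simplified stand-in for the Lovasz hinge.
    signs = 2.0 * labels - 1.0
    return np.mean(np.maximum(0.0, 1.0 - signs * logits))

# Toy logits/labels for a single class channel.
logits = np.array([[2.0, -1.0], [0.5, -2.0]])
labels = np.array([[1.0, 0.0], [1.0, 0.0]])

# MixedLoss with coef [0.5, 0.5]: equal-weighted sum of the two terms.
mixed = 0.5 * bce(logits, labels) + 0.5 * hinge(logits, labels)
```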
139 changes: 139 additions & 0 deletions configs/multilabelseg/README.md
English | [简体中文](README_cn.md)

# Multi-label semantic segmentation based on PaddleSeg

## 1. Introduction

Multi-label semantic segmentation is an image segmentation task that aims to assign each pixel in an image to multiple categories rather than just one. This better expresses complex information in the image, such as overlap, occlusion, and boundaries between different objects. Multi-label semantic segmentation has many application scenarios, such as medical image analysis, remote sensing image interpretation, and autonomous driving.

<p align="center">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/ea6bb360-75de-4e06-9910-44c7d2fdbe6c">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/e2781865-db7e-4f46-98b2-3ef731e8bef1">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/9e587935-fd6f-459e-b798-0164eb98f44d">
</p>

+ *The images above show inference results from a model trained on images from the [UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/) dataset.*

## 2. Supported models and loss functions

| Model | Loss |
|:-------------------------------------------------------------------------------------------:|:------------------------:|
| DeepLabV3, DeepLabV3P, MobileSeg, <br/>PP-LiteSeg, PP-MobileSeg, UNet, <br/>Unet++, Unet+++ | BCELoss, LovaszHingeLoss |

+ *The above are the confirmed supported models and loss functions; the actual supported range is larger.*

## 3. Sample Tutorial

The following will take the **[UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/)** multi-label semantic segmentation dataset and the **[PP-MobileSeg](../pp_mobileseg/README.md)** model as examples.

### 3.1 Data Preparation
In the single-label semantic segmentation task, the annotated grayscale image has shape **(img_h, img_w)**, with grayscale values representing class indices.

In the multi-label semantic segmentation task, the annotated grayscale image has shape **(img_h, num_classes x img_w)**, i.e., the binary mask of each class is concatenated sequentially along the horizontal axis.
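As an illustration (a minimal NumPy sketch with made-up sizes, not part of PaddleSeg), such a horizontally concatenated annotation can be split back into one binary mask per class:

```python
import numpy as np

# Made-up sizes for illustration.
img_h, img_w, num_classes = 4, 5, 3

# A multi-label annotation of shape (img_h, num_classes * img_w): the binary
# masks of the classes are concatenated along the horizontal axis.
label = np.zeros((img_h, num_classes * img_w), dtype=np.uint8)
label[:, img_w:2 * img_w] = 1  # pretend class 1 is present everywhere

# Split along the width and stack into a (num_classes, img_h, img_w) array.
masks = np.stack(np.split(label, num_classes, axis=1), axis=0)
print(masks.shape)  # (3, 4, 5)
```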

Download the raw data archive of the UWMGI dataset and convert it to a format supported by PaddleSeg's [Dataset](../../paddleseg/datasets/dataset.py) API using the provided script.
```shell
wget -O uw-madison-gi-tract-image-segmentation.zip 'https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/27923/3495119/bundle/archive.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1693533809&Signature=ThCLjIYxSXfk85lCbZ5Cz2Ta4g8AjwJv0%2FgRpqpchlZLLYxk3XRnrZqappboha0moC7FuqllpwlLfCambQMbKoUjCLylVQqF0mEsn0IaJdYwprWYY%2F4FJDT2lG0HdQfAxJxlUPonXeZyZ4pZjOrrVEMprxuiIcM2kpGk35h7ry5ajkmdQbYmNQHFAJK2iO%2F4a8%2F543zhZRWsZZVbQJHid%2BjfO6ilLWiAGnMFpx4Sh2B01TUde9hBCwpxgJv55Gs0a4Z1KNsBRly6uqwgZFYfUBAejySx4RxFB7KEuRowDYuoaRT8NhSkzT2i7qqdZjgHxkFZJpRMUlDcf1RSJVkvEA%3D%3D&response-content-disposition=attachment%3B+filename%3Duw-madison-gi-tract-image-segmentation.zip'
python tools/data/convert_multilabel.py \
--dataset_type uwmgi \
--zip_input ./uw-madison-gi-tract-image-segmentation.zip \
--output ./data/UWMGI/ \
--train_proportion 0.8 \
--val_proportion 0.2
# optional
rm ./uw-madison-gi-tract-image-segmentation.zip
```

The structure of the UWMGI dataset after conversion is as follows:
```
UWMGI
|
|--images
| |--train
| | |--*.jpg
| | |--...
| |
| |--val
| | |--*.jpg
| | |--...
|
|--annotations
| |--train
| | |--*.jpg
| | |--...
| |
| |--val
| | |--*.jpg
| | |--...
|
|--train.txt
|
|--val.txt
```
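Each line of `train.txt` / `val.txt` pairs an image with its annotation. Below is a hedged sketch of how such a file list could be generated for the layout above; the space-separated `image annotation` line format is an assumption based on PaddleSeg's `Dataset` API, and the helper itself is hypothetical.

```python
from pathlib import Path

def write_file_list(root, split):
    """Write '<image> <annotation>' pairs, one per line, to <root>/<split>.txt."""
    root = Path(root)
    images = sorted((root / "images" / split).glob("*.jpg"))
    lines = [f"images/{split}/{p.name} annotations/{split}/{p.name}"
             for p in images]
    (root / f"{split}.txt").write_text("\n".join(lines) + "\n")
    return lines
```

For example, `write_file_list("data/UWMGI", "train")` would regenerate `train.txt` from the converted directory tree.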

The divided training dataset and evaluation dataset can be configured as follows:
```yaml
train_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: RandomHorizontalFlip
    - type: RandomVerticalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  train_path: data/UWMGI/train.txt
  mode: train

val_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  val_path: data/UWMGI/val.txt
  mode: val
```
### 3.2 Training
```shell
python tools/train.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--save_dir output/pp_mobileseg_tiny_uwmgi_256x256_160k \
--num_workers 8 \
--do_eval \
--use_vdl \
--save_interval 2000 \
--use_multilabel
```
+ *When `--do_eval` is used, the `--use_multilabel` flag must also be added to adapt the evaluation to multi-label mode.*

### 3.3 Evaluation
```shell
python tools/val.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
--use_multilabel
```
+ *The `--use_multilabel` flag must be added when evaluating the model to adapt the evaluation to multi-label mode.*

### 3.4 Inference
```shell
python tools/predict.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
--image_path data/UWMGI/images/val/case122_day18_slice_0089.jpg \
--use_multilabel
```
+ *The `--use_multilabel` flag must be added when running prediction to adapt the visualization to multi-label mode.*
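Conceptually, visualizing a multi-label prediction means blending one binary mask per class onto the image, so overlapping classes can both remain visible. A minimal NumPy sketch follows; the colors, alpha, and helper are assumptions for illustration, not PaddleSeg's actual visualization code.

```python
import numpy as np

def overlay_multilabel(image, masks, colors, alpha=0.5):
    """Alpha-blend one RGB color per class mask onto an RGB uint8 image."""
    out = image.astype(np.float32)
    for mask, color in zip(masks, colors):
        m = mask.astype(bool)
        out[m] = (1 - alpha) * out[m] + alpha * np.asarray(color, np.float32)
    return out.astype(np.uint8)

# Toy 4x4 gray image with two overlapping class masks.
image = np.full((4, 4, 3), 128, dtype=np.uint8)
masks = np.zeros((2, 4, 4), dtype=np.uint8)
masks[0, :2, :] = 1  # class 0 covers the top half
masks[1, :, :2] = 1  # class 1 covers the left half (overlaps class 0)
vis = overlay_multilabel(image, masks, [(255, 0, 0), (0, 255, 0)])
```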
139 changes: 139 additions & 0 deletions configs/multilabelseg/README_cn.md
[English](README.md) | 简体中文

# Multi-label semantic segmentation based on PaddleSeg

## 1. Introduction

Multi-label semantic segmentation is an image segmentation task that aims to assign each pixel in an image to multiple categories rather than just one. This better expresses complex information in the image, such as overlap, occlusion, and boundaries between different objects. Multi-label semantic segmentation has many application scenarios, such as medical image analysis, remote sensing image interpretation, and autonomous driving.

<p align="center">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/ea6bb360-75de-4e06-9910-44c7d2fdbe6c">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/e2781865-db7e-4f46-98b2-3ef731e8bef1">
<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/9e587935-fd6f-459e-b798-0164eb98f44d">
</p>

+ *The images above show inference results from a model trained on images from the [UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/) dataset.*

## 2. Supported models and loss functions

| Model | Loss |
|:-------------------------------------------------------------------------------------------:|:------------------------:|
| DeepLabV3, DeepLabV3P, MobileSeg, <br/>PP-LiteSeg, PP-MobileSeg, UNet, <br/>Unet++, Unet+++ | BCELoss, LovaszHingeLoss |

+ *The above are the confirmed supported models and loss functions; the actual supported range is larger.*

## 3. Sample Tutorial

The following takes the **[UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/)** multi-label semantic segmentation dataset and the **[PP-MobileSeg](../pp_mobileseg/README.md)** model as an example.

### 3.1 Data Preparation
In the single-label multi-class semantic segmentation task, the annotated grayscale image has shape **(img_h, img_w)**, with grayscale values representing class indices.

In the multi-label semantic segmentation task, the annotated grayscale image has shape **(img_h, num_classes x img_w)**, i.e., the binary mask of each class is concatenated sequentially along the horizontal axis.

Download the raw data archive of the UWMGI dataset and convert it to the format supported by PaddleSeg's [Dataset](../../paddleseg/datasets/dataset.py) API using the provided script.
```shell
wget -O uw-madison-gi-tract-image-segmentation.zip 'https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/27923/3495119/bundle/archive.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1693533809&Signature=ThCLjIYxSXfk85lCbZ5Cz2Ta4g8AjwJv0%2FgRpqpchlZLLYxk3XRnrZqappboha0moC7FuqllpwlLfCambQMbKoUjCLylVQqF0mEsn0IaJdYwprWYY%2F4FJDT2lG0HdQfAxJxlUPonXeZyZ4pZjOrrVEMprxuiIcM2kpGk35h7ry5ajkmdQbYmNQHFAJK2iO%2F4a8%2F543zhZRWsZZVbQJHid%2BjfO6ilLWiAGnMFpx4Sh2B01TUde9hBCwpxgJv55Gs0a4Z1KNsBRly6uqwgZFYfUBAejySx4RxFB7KEuRowDYuoaRT8NhSkzT2i7qqdZjgHxkFZJpRMUlDcf1RSJVkvEA%3D%3D&response-content-disposition=attachment%3B+filename%3Duw-madison-gi-tract-image-segmentation.zip'
python tools/data/convert_multilabel.py \
--dataset_type uwmgi \
--zip_input ./uw-madison-gi-tract-image-segmentation.zip \
--output ./data/UWMGI/ \
--train_proportion 0.8 \
--val_proportion 0.2
# optional
rm ./uw-madison-gi-tract-image-segmentation.zip
```

The structure of the UWMGI dataset after conversion is as follows:
```
UWMGI
|
|--images
| |--train
| | |--*.jpg
| | |--...
| |
| |--val
| | |--*.jpg
| | |--...
|
|--annotations
| |--train
| | |--*.jpg
| | |--...
| |
| |--val
| | |--*.jpg
| | |--...
|
|--train.txt
|
|--val.txt
```

The split training and validation datasets can be configured as follows:
```yaml
train_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: RandomHorizontalFlip
    - type: RandomVerticalFlip
    - type: RandomDistort
      brightness_range: 0.4
      contrast_range: 0.4
      saturation_range: 0.4
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  train_path: data/UWMGI/train.txt
  mode: train

val_dataset:
  type: Dataset
  dataset_root: data/UWMGI
  transforms:
    - type: Resize
      target_size: [256, 256]
    - type: Normalize
      mean: [0.0, 0.0, 0.0]
      std: [1.0, 1.0, 1.0]
  num_classes: 3
  val_path: data/UWMGI/val.txt
  mode: val
```
### 3.2 Training
```shell
python tools/train.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--save_dir output/pp_mobileseg_tiny_uwmgi_256x256_160k \
--num_workers 8 \
--do_eval \
--use_vdl \
--save_interval 2000 \
--use_multilabel
```
+ *When `--do_eval` is used, the `--use_multilabel` flag must also be added to adapt the evaluation to multi-label mode.*

### 3.3 Evaluation
```shell
python tools/val.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
--use_multilabel
```
+ *The `--use_multilabel` flag must be added when evaluating the model to adapt the evaluation to multi-label mode.*

### 3.4 Inference
```shell
python tools/predict.py \
--config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
--model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
--image_path data/UWMGI/images/val/case122_day18_slice_0089.jpg \
--use_multilabel
```
+ *The `--use_multilabel` flag must be added when running prediction to adapt the visualization to multi-label mode.*
_base_: '../_base_/uwmgi.yml'

batch_size: 8
iters: 160000

model:
  type: DeepLabV3
  num_classes: 3
  backbone:
    type: ResNet50_vd
    output_stride: 8
    multi_grid: [1, 2, 4]
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
  backbone_indices: [3]
  aspp_ratios: [1, 12, 24, 36]
  aspp_out_channels: 256
  align_corners: False
  pretrained: null