[Doc] Update FAQ doc about binary segmentation and ReduceZeroLabel #2206

Merged
merged 11 commits into from
Oct 28, 2022
Changes from 5 commits
102 changes: 102 additions & 0 deletions docs/en/faq.md
Expand Up @@ -66,3 +66,105 @@ In the test script, we provide `show-dir` argument to control whether output the
```shell
python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1
```

## How to handle binary segmentation tasks?

MMSegmentation uses `num_classes` and `out_channels` to control the output of the last layer, `self.conv_seg` (more details can be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py)):

```python
def __init__(self,
             ...,
             ):
    ...
    if out_channels is None:
        if num_classes == 2:
            warnings.warn('For binary segmentation, we suggest using'
                          '`out_channels = 1` to define the output'
                          'channels of segmentor, and use `threshold`'
                          'to convert seg_logist into a prediction'
                          'applying a threshold')
        out_channels = num_classes

    if out_channels != num_classes and out_channels != 1:
        raise ValueError(
            'out_channels should be equal to num_classes,'
            'except binary segmentation set out_channels == 1 and'
            f'num_classes == 2, but got out_channels={out_channels}'
            f'and num_classes={num_classes}')

    if out_channels == 1 and threshold is None:
        threshold = 0.3
        warnings.warn('threshold is not defined for binary, and defaults'
                      'to 0.3')
    self.num_classes = num_classes
    self.out_channels = out_channels
    self.threshold = threshold
    ...
    self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1)
```
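
To see how these checks resolve for different settings, the same logic can be sketched as a standalone function. This is only an illustration: `resolve_head_settings` is a hypothetical helper, not part of MMSegmentation, and it returns the resolved values instead of setting attributes on a head.

```python
import warnings


def resolve_head_settings(num_classes, out_channels=None, threshold=None):
    """Hypothetical helper mirroring the checks quoted above.

    Returns the resolved (out_channels, threshold) pair.
    """
    if out_channels is None:
        if num_classes == 2:
            warnings.warn(
                'binary task: consider out_channels=1 with a threshold')
        out_channels = num_classes

    if out_channels != num_classes and out_channels != 1:
        raise ValueError(
            f'got out_channels={out_channels} and num_classes={num_classes}')

    if out_channels == 1 and threshold is None:
        threshold = 0.3
        warnings.warn(
            'threshold is not defined for binary, and defaults to 0.3')
    return out_channels, threshold


print(resolve_head_settings(19))                 # (19, None)
print(resolve_head_settings(2))                  # (2, None), with a warning
print(resolve_head_settings(2, out_channels=1))  # (1, 0.3), with a warning
```

Passing an inconsistent pair such as `num_classes=3, out_channels=2` raises `ValueError`, as in the quoted code.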

There are two ways to compute binary segmentation predictions:

```python
...
if self.out_channels == 1:
    seg_logit = F.sigmoid(seg_logit)
else:
    seg_logit = F.softmax(seg_logit, dim=1)

...

if self.out_channels == 1:
    seg_pred = (seg_logit >
                self.decode_head.threshold).to(seg_logit).squeeze(1)
else:
    seg_pred = seg_logit.argmax(dim=1)
```

- When `out_channels=2`, training uses Cross Entropy Loss, and inference applies `F.softmax()` followed by `argmax()` to get the prediction for each pixel.

- When `out_channels=1`, training uses Binary Cross Entropy Loss, and inference applies `F.sigmoid()` followed by `threshold` to get the prediction for each pixel. The `threshold` parameter (which defaults to 0.3 when left unset) was added in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016).
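
The two inference paths can be illustrated on a single pixel. The sketch below uses plain Python instead of PyTorch tensors so that it runs without extra dependencies; `predict_pixel` is a hypothetical helper, and it returns an integer class index, whereas the quoted code keeps the dtype of `seg_logit`.

```python
import math


def predict_pixel(logits, threshold=0.3):
    """Hypothetical per-pixel version of the two inference paths.

    `logits` holds one value per output channel for a single pixel.
    """
    if len(logits) == 1:
        # out_channels == 1: sigmoid, then compare against the threshold
        prob = 1.0 / (1.0 + math.exp(-logits[0]))
        return int(prob > threshold)
    # out_channels == num_classes: softmax, then argmax over channels
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    probs = [e / sum(exps) for e in exps]
    return probs.index(max(probs))


print(predict_pixel([-0.5]))                 # 1: sigmoid(-0.5) ~ 0.378 > 0.3
print(predict_pixel([-0.5], threshold=0.5))  # 0: same logit, higher threshold
print(predict_pixel([0.2, 1.0]))             # 1: channel 1 has the larger logit
```

Note that a threshold below 0.5 labels a pixel as foreground even when its sigmoid probability is under one half.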

More details about how the segmentation prediction is computed can be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py).

In summary, to implement binary segmentation, users should set the following parameters in the `decode_head` and `auxiliary_head` configs in one of two ways:

- (1) `num_classes=2`, `out_channels=2`, and `use_sigmoid=False` in `CrossEntropyLoss`.

- (2) `num_classes=2`, `out_channels=1`, and `use_sigmoid=True` in `CrossEntropyLoss`.

For solution (2), the following example shows the modified heads of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):

```python
decode_head=dict(
    type='PSPHead',
    in_channels=64,
    in_index=4,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
auxiliary_head=dict(
    type='FCNHead',
    in_channels=128,
    in_index=3,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)),
```
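
For comparison, solution (1) changes only `out_channels` and `use_sigmoid` in the same two heads. The fragment below is a sketch derived from the example above, not a complete config: it wraps the heads in `model = dict(...)` so that it is valid Python on its own, and the remaining model fields are assumed to come from the base config.

```python
# Sketch of solution (1): two output channels with softmax-based cross entropy.
# Only the head settings discussed here are shown; other fields (backbone,
# train_cfg, ...) are assumed to be inherited from the base config.
model = dict(
    decode_head=dict(
        type='PSPHead',
        in_channels=64,
        in_index=4,
        num_classes=2,
        out_channels=2,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=128,
        in_index=3,
        num_classes=2,
        out_channels=2,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)))
```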

## What is `reduce_zero_label` for?

When [loading annotations](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) in MMSegmentation, `reduce_zero_label (bool)` determines whether all label values are reduced by 1:

```python
if self.reduce_zero_label:
    # avoid using underflow conversion
    gt_semantic_seg[gt_semantic_seg == 0] = 255
    gt_semantic_seg = gt_semantic_seg - 1
    gt_semantic_seg[gt_semantic_seg == 254] = 255
```

`reduce_zero_label` is usually used for datasets where 0 is the background label. If `reduce_zero_label=True`, pixels labeled 0 are remapped to 255 (the default `ignore_index`) and are therefore not involved in the loss calculation.
Note that in binary segmentation tasks it is unnecessary to use `reduce_zero_label=True`; please use one of the solutions described above instead.
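
The effect of the remapping is easiest to see on a few label values. The sketch below applies the same three steps to a plain Python list instead of a NumPy array; `reduce_zero_label` here is a hypothetical standalone function, and 255 is assumed to be the ignore index, which is the MMSegmentation default.

```python
def reduce_zero_label(labels):
    """Hypothetical list-based version of the remapping quoted above."""
    labels = [255 if v == 0 else v for v in labels]   # 0 -> 255
    labels = [v - 1 for v in labels]                  # shift every label down
    return [255 if v == 254 else v for v in labels]   # 254 -> 255 again


print(reduce_zero_label([0, 1, 2, 150, 255]))  # [255, 0, 1, 149, 255]
print(reduce_zero_label([0, 1, 1, 0]))         # [255, 0, 0, 255]
```

In the second call, a binary mask ends up with a single valid class after the remapping, which is presumably why the option is not recommended for binary segmentation.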
101 changes: 101 additions & 0 deletions docs/zh_cn/faq.md
Expand Up @@ -66,3 +66,104 @@
```shell
python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1
```

## How to handle binary segmentation tasks?

MMSegmentation uses `num_classes` and `out_channels` to control the output of the model's last layer, `self.conv_seg` (more details can be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py)):

```python
def __init__(self,
             ...,
             ):
    ...
    if out_channels is None:
        if num_classes == 2:
            warnings.warn('For binary segmentation, we suggest using'
                          '`out_channels = 1` to define the output'
                          'channels of segmentor, and use `threshold`'
                          'to convert seg_logist into a prediction'
                          'applying a threshold')
        out_channels = num_classes

    if out_channels != num_classes and out_channels != 1:
        raise ValueError(
            'out_channels should be equal to num_classes,'
            'except binary segmentation set out_channels == 1 and'
            f'num_classes == 2, but got out_channels={out_channels}'
            f'and num_classes={num_classes}')

    if out_channels == 1 and threshold is None:
        threshold = 0.3
        warnings.warn('threshold is not defined for binary, and defaults'
                      'to 0.3')
    self.num_classes = num_classes
    self.out_channels = out_channels
    self.threshold = threshold
    ...
    self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1)
```

There are two ways to compute binary segmentation predictions:

- When `out_channels=2`, Cross Entropy Loss is used as the loss function in training; in inference, `F.softmax()` normalizes the logits and `argmax()` then gives the prediction for each pixel.

- When `out_channels=1`, we provide a threshold parameter `threshold` (default: 0.3), added in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016); Binary Cross Entropy Loss is used as the loss function in training, and `F.sigmoid()` together with `threshold` gives the prediction in inference.

```python
...
if self.out_channels == 1:
    seg_logit = F.sigmoid(seg_logit)
else:
    seg_logit = F.softmax(seg_logit, dim=1)

...

if self.out_channels == 1:
    seg_pred = (seg_logit >
                self.decode_head.threshold).to(seg_logit).squeeze(1)
else:
    seg_pred = seg_logit.argmax(dim=1)
```

More details about how the segmentation prediction is computed can be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py).

To implement the two binary segmentation methods above, modify the following in the `decode_head` and `auxiliary_head` configs:

- (1) `num_classes=2`, `out_channels=2`, and set `use_sigmoid=False` in `CrossEntropyLoss`.

- (2) `num_classes=2`, `out_channels=1`, and set `use_sigmoid=True` in `CrossEntropyLoss`.

If you adopt solution (2), below is the corresponding modification of the example [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):

```python
decode_head=dict(
    type='PSPHead',
    in_channels=64,
    in_index=4,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
auxiliary_head=dict(
    type='FCNHead',
    in_channels=128,
    in_index=3,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)),
```

## What is `reduce_zero_label` for?

In MMSegmentation, when [loading annotations](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91), `reduce_zero_label (bool)` determines whether all label values are reduced by 1:

```python
if self.reduce_zero_label:
    # avoid using underflow conversion
    gt_semantic_seg[gt_semantic_seg == 0] = 255
    gt_semantic_seg = gt_semantic_seg - 1
    gt_semantic_seg[gt_semantic_seg == 254] = 255
```

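`reduce_zero_label` is often used for datasets where label 0 is the background. If `reduce_zero_label=True`, pixels whose label is 0 do not take part in the loss calculation. Note that in binary segmentation tasks there is no need to set `reduce_zero_label=True`; please use the solutions described above.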
2 changes: 1 addition & 1 deletion mmseg/models/decode_heads/decode_head.py
Expand Up @@ -21,7 +21,7 @@ class BaseDecodeHead(BaseModule, metaclass=ABCMeta):
num_classes (int): Number of classes.
out_channels (int): Output channels of conv_seg.
threshold (float): Threshold for binary segmentation in the case of
- `num_classes==1`. Default: None.
+ `out_channels==1`. Default: None.
dropout_ratio (float): Ratio of dropout layer. Default: 0.1.
conv_cfg (dict|None): Config of conv layers. Default: None.
norm_cfg (dict|None): Config of norm layers. Default: None.