
[Feature] Support multiple losses during training #818

Merged
merged 15 commits into open-mmlab:master from multi-loss on Sep 24, 2021

Conversation

@MengzhangLI (Contributor) commented Aug 24, 2021

Because this has been asked about frequently by the community, I directly cloned the related PR implementing multiple losses and opened a new PR.

Related PR: #244

Related Issues: #779, #727, #486 and so on.

Here are my results on UNet, with the UNet-S5-D16 backbone and the FCN decode head:

Note:
(1) CE means the loss function is cross entropy; DC means dice loss.
(2) CE (cross entropy) is the default loss function in MMSegmentation configs; I reproduced the training to verify the real difference between loss settings.
(3) loss_weight also matters. For instance, (0.5 : 1) below means the weights of the cross entropy loss (CE) and the dice loss (DC) are 0.5 and 1, respectively.
(4) I used --seed 0 but still observed some variance across training runs with the same config.

| Datasets | CE (from repo config) | CE (on my own) | CE + DC (1:1) | CE + DC (0.5:1) | CE + DC (1:0.5) | CE + DC (1:3) |
| --- | --- | --- | --- | --- | --- | --- |
| DRIVE | 78.67 | 78.42 | 79.18 | 78.94 | 79.58 | 79.51 |
| STARE | 81.02 | 81.29 | 82.08 | 81.44 | 82.01 | 82.39 |
| CHASE_DB1 | 80.24 | 80.25 | 80.53 | 80.58 | 80.46 | 80.26 |
| HRF | 79.45 | 79.24 | 80.66 | 80.72 | 80.60 | 80.79 |
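
For reference, the kind of config this PR enables looks roughly like the following (a sketch of the CE + DC (1:3) setting above; per the commit list at the bottom, each loss_name must carry the 'loss_' prefix):

    decode_head=dict(
        type='FCNHead',
        # ... other head settings unchanged ...
        loss_decode=[
            dict(type='CrossEntropyLoss', loss_name='loss_ce', loss_weight=1.0),
            dict(type='DiceLoss', loss_name='loss_dice', loss_weight=3.0)
        ])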

@codecov (bot) commented Aug 24, 2021

Codecov Report

Merging #818 (01bff41) into master (e235c1a) will increase coverage by 1.44%.
The diff coverage is 96.05%.

❗ Current head 01bff41 differs from pull request most recent head 0b3a22b. Consider uploading reports for the commit 0b3a22b to get more accurate results.

@@            Coverage Diff             @@
##           master     #818      +/-   ##
==========================================
+ Coverage   87.64%   89.09%   +1.44%     
==========================================
  Files         108      112       +4     
  Lines        5886     6081     +195     
  Branches      958      977      +19     
==========================================
+ Hits         5159     5418     +259     
+ Misses        535      468      -67     
- Partials      192      195       +3     
Flag Coverage Δ
unittests 89.09% <96.05%> (+1.46%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
mmseg/core/evaluation/metrics.py 90.42% <ø> (-0.20%) ⬇️
mmseg/datasets/pipelines/formating.py 63.82% <ø> (ø)
mmseg/models/backbones/cgnet.py 94.63% <ø> (ø)
mmseg/models/backbones/fast_scnn.py 97.08% <ø> (ø)
mmseg/models/backbones/mit.py 91.53% <ø> (ø)
mmseg/models/backbones/mobilenet_v2.py 71.08% <ø> (ø)
mmseg/models/backbones/resnest.py 83.72% <ø> (ø)
mmseg/models/backbones/resnet.py 99.28% <ø> (ø)
mmseg/models/backbones/resnext.py 100.00% <ø> (ø)
mmseg/models/backbones/unet.py 94.91% <ø> (ø)
... and 37 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e235c1a...0b3a22b.

@Junjun2016 (Collaborator)

Should add more unittests to improve the coverage.

(Seven review threads on mmseg/models/decode_heads/decode_head.py; all outdated/resolved.)
@caodroid commented Sep 4, 2021

Hi, support for multiple losses during training is much needed, and this is nice work. I also have a suggestion: here the multiple losses apply only to the final prediction. However, if the decode head produces more than one prediction through deep supervision, each with a different loss weight (like da_head.py, or the network in the figure, image not shown), this may not work. For this problem, I think the following code may work:

# Assumed context: these are methods of a BaseDecodeHead subclass.
# resize, accuracy and add_prefix come from mmseg (mmseg.ops.resize,
# mmseg.models.losses.accuracy, mmseg.core.add_prefix); force_fp32 comes
# from mmcv.runner. self.loss_names is a proposed attribute naming each
# entry of self.loss_decode.

@force_fp32(apply_to=('seg_logit', ))
def _losses(self, seg_logit, seg_label, loss_decode):
    """Compute the loss of a single prediction with a single loss module."""
    loss = dict()
    seg_logit = resize(
        input=seg_logit,
        size=seg_label.shape[2:],
        mode='bilinear',
        align_corners=self.align_corners)
    if self.sampler is not None:
        seg_weight = self.sampler.sample(seg_logit, seg_label)
    else:
        seg_weight = None
    seg_label = seg_label.squeeze(1)
    loss['loss_seg'] = loss_decode(
        seg_logit,
        seg_label,
        weight=seg_weight,
        ignore_index=self.ignore_index)
    loss['acc_seg'] = accuracy(seg_logit, seg_label)
    return loss

def losses(self, seg_logit, seg_label):
    """Compute segmentation losses for one or several predictions."""
    loss = dict()
    if isinstance(seg_logit, torch.Tensor):  # a single seg_logit
        for loss_name, loss_decode in zip(self.loss_names, self.loss_decode):
            loss.update(
                add_prefix(self._losses(seg_logit, seg_label, loss_decode),
                           loss_name))
    elif isinstance(seg_logit, (list, tuple)):  # multiple seg_logits
        for logit, loss_name, loss_decode in zip(seg_logit, self.loss_names,
                                                 self.loss_decode):
            loss.update(
                add_prefix(self._losses(logit, seg_label, loss_decode),
                           loss_name))
    return loss

@MengzhangLI added the 'WIP: Work in process' label on Sep 10, 2021
@openmmlab-bot (Collaborator)

Task linked: CU-k5tuzw mix loss

@Junjun2016 (Collaborator)

Should add more unittests to improve the code coverage.

@MengzhangLI (Contributor, Author)

> (quoting @caodroid's suggestion above)

Thanks for your advice. Deep supervision can be implemented with an auxiliary head.
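
For illustration, a minimal sketch of that pattern in an MMSegmentation config (in_channels, in_index, and the other numbers are placeholders):

    # Deep supervision via an auxiliary head: it supervises an intermediate
    # feature map with its own, typically down-weighted, loss.
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=128,  # channels of the supervised feature map (placeholder)
        in_index=2,       # which backbone output to supervise (placeholder)
        channels=64,
        num_classes=19,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4))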

@caodroid commented Sep 23, 2021

> (quoting the exchange above)

Thanks for your reply. It's true that deep supervision can be implemented with an auxiliary head, but the auxiliary head can only supervise the backbone. My suggestion is to implement deep supervision on multiple predictions. For example, there are three output predictions in da_head; if I want to change the loss weight on one of the auxiliary output predictions, that is not achievable in the current version. This capability would be very useful for heads with multiple predictions.

def forward(self, inputs):
    x = self._transform_inputs(inputs)
    pam_feat = self.pam_in_conv(x)
    pam_feat = self.pam(pam_feat)
    pam_feat = self.pam_out_conv(pam_feat)
    pam_out = self.pam_cls_seg(pam_feat)

    cam_feat = self.cam_in_conv(x)
    cam_feat = self.cam(cam_feat)
    cam_feat = self.cam_out_conv(cam_feat)
    cam_out = self.cam_cls_seg(cam_feat)

    feat_sum = pam_feat + cam_feat
    pam_cam_out = self.cls_seg(feat_sum)

    return pam_cam_out, pam_out, cam_out
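
As a hedged sketch of what per-prediction weighting could look like (hypothetical, not part of this PR; WeightedDAHead and logit_loss_weights are invented names):

    from mmseg.core import add_prefix
    from mmseg.models.decode_heads.da_head import DAHead
    from mmseg.models.decode_heads.decode_head import BaseDecodeHead

    class WeightedDAHead(DAHead):
        """Hypothetical DAHead variant with a separate loss weight per output."""

        def __init__(self, logit_loss_weights=(1.0, 0.4, 0.4), **kwargs):
            super().__init__(**kwargs)
            self.logit_loss_weights = logit_loss_weights  # invented attribute

        def losses(self, seg_logit, seg_label):
            """seg_logit is (pam_cam_out, pam_out, cam_out) from forward()."""
            loss = dict()
            for name, logit, weight in zip(('pam_cam', 'pam', 'cam'), seg_logit,
                                           self.logit_loss_weights):
                # Reuse the single-logit loss computation from BaseDecodeHead,
                # then scale only loss terms; metrics like acc_seg stay unscaled.
                sub_loss = BaseDecodeHead.losses(self, logit, seg_label)
                sub_loss = {k: (v * weight if k.startswith('loss') else v)
                            for k, v in sub_loss.items()}
                loss.update(add_prefix(sub_loss, name))
            return loss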

@MengzhangLI (Contributor, Author)

> (quoting the exchange above)

Hi, thanks for your nice proposal; I will think about making this implementation more flexible in the future, just as you suggested above.

Right now there are already two ways to support it: (1) as in our UNet implementation, treat the whole encoder-decoder as the backbone, except for the final fully connected layer; (2) use a NECK between the BACKBONE and the HEAD to handle those feature maps (see the sketch below).
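
A tiny sketch of option (2), assuming the FPN neck that ships with MMSegmentation (channel numbers are placeholders):

    # Inside the model config, between backbone and decode_head:
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],  # backbone stage channels (placeholder)
        out_channels=256,
        num_outs=4),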

Thanks again for your warmhearted proposal.

Best,

@xvjiarui (Collaborator) left a comment

LGTM except for the missing unittest

(Review threads on mmseg/models/losses/dice_loss.py, mmseg/models/decode_heads/decode_head.py, and mmseg/models/losses/lovasz_loss.py; all resolved.)
@Junjun2016 (Collaborator)

Please fix the lint error.

@MengzhangLI removed the 'WIP: Work in process' label on Sep 24, 2021
@Junjun2016 merged commit 186a1fc into open-mmlab:master on Sep 24, 2021
@MengzhangLI deleted the multi-loss branch on February 1, 2022
bowenroom pushed a commit to bowenroom/mmsegmentation that referenced this pull request Feb 25, 2022
* multiple losses

* fix lint error

* fix typos

* fix typos

* Adding Attribute

* Fixing loss_ prefix

* Fixing loss_ prefix

* Fixing loss_ prefix

* Add Same

* loss_name must has 'loss_' prefix

* Fix unittest

* Fix unittest

* Fix unittest

* Update mmseg/models/decode_heads/decode_head.py

Co-authored-by: Junjun2016 <hejunjun@sjtu.edu.cn>
wjkim81 pushed a commit to wjkim81/mmsegmentation that referenced this pull request Dec 3, 2023
Labels: enhancement (New feature or request)