
the code question in semantic_seg #20

Open
Ianresearch opened this issue Dec 9, 2021 · 8 comments

@Ianresearch

Hi, I have a question about the logit_scale and logit_bias in semantic_seg. The shape of these parameters is (1, num_classes, 1, 1); why isn't it (1, num_classes, 512, 512), which would match the input image size for semantic segmentation?

@tonysy
Member

tonysy commented Dec 9, 2021

Hi, logit_scale and logit_bias are the class-wise magnitude and margin (a scalar for each class), respectively, as described in the original paper.

@Ianresearch
Author

Thank you for the reply. If I use the method in semantic segmentation and the input image size is (512, 512), should the parameter shape be (1, num_classes, 512, 512)? I think the shape (1, num_classes, 1, 1) in your code is meant for image classification. Is that correct?

@tonysy
Member

tonysy commented Dec 9, 2021

The code should work for this scenario, as broadcasting is applied automatically. You can try this example:

>>> import torch
>>> data = torch.randn(1,10, 512,512)
>>> a = torch.ones(1,10,1,1)
>>> b = torch.zeros(1,10,1,1)
>>> out = a * data + b
>>> out.shape
torch.Size([1, 10, 512, 512])

@Ianresearch
Author

Ianresearch commented Dec 13, 2021

Hi tonysy, I used this paper's method in my U-Net semantic segmentation, but the result did not improve. Is there something wrong in my implementation? The U-Net output has 4 classes, the image size is 512*512, and the last layer is: result = conv_bn_relu(inputchannel, outputchannel=4). In the first stage, I train this U-Net with lr=0.02. In the second stage, I freeze all parameters in the net except the last layer, and add this code:

# add the long-tail DisAlign adjustment
confidence = self.confidence_layer(result)  # self.confidence_layer = conv_bn_relu(inchannel=4, outchannel=1)
confidence = torch.sigmoid(confidence)
# only adjust the foreground classification scores
scores_tmp = confidence * (result * self.logit_scale + self.logit_bias)
result = scores_tmp + (1 - confidence) * result

At the same time, in the second stage I re-weight the cross-entropy loss using the method in Sec. 3.2.2 (ρ=0.3), and train the above net again with lr=0.02. Looking forward to your answer.
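For reference, a self-contained sketch of the adjustment head described above; the module name is hypothetical and the confidence layer is simplified to a plain 1x1 convolution rather than conv_bn_relu:

import torch
import torch.nn as nn

class DisAlignHead(nn.Module):
    # Sketch of the second-stage adjustment described in this comment.
    # logit_scale and logit_bias are per-class scalars, shape (1, C, 1, 1),
    # and broadcast over the (H, W) spatial dimensions.
    def __init__(self, num_classes=4):
        super().__init__()
        self.logit_scale = nn.Parameter(torch.ones(1, num_classes, 1, 1))
        self.logit_bias = nn.Parameter(torch.zeros(1, num_classes, 1, 1))
        self.confidence_layer = nn.Conv2d(num_classes, 1, kernel_size=1)

    def forward(self, result):
        # result: (N, num_classes, H, W) logits from the frozen U-Net
        confidence = torch.sigmoid(self.confidence_layer(result))
        # only adjust the foreground classification scores
        scores_tmp = confidence * (result * self.logit_scale + self.logit_bias)
        return scores_tmp + (1 - confidence) * result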

@tonysy
Member

tonysy commented Dec 13, 2021

First, in DisAlign, all layers learned in stage one are frozen during the second stage.
Second, the proposed method mainly targets long-tail distributions, which typically involve many tail classes.

Thus, you can freeze all the stage-one layers, remove the confidence layer, and only use the GRW for your case (confidence estimation gave only a minor improvement on the segmentation task in our recent experiments).

Such as:

result = result * self.logit_scale + self.logit_bias

Then learn only the logit_scale and logit_bias with the GRWCrossEntropyLoss:

class GRWCrossEntropyLoss(nn.Module):
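A minimal sketch of what such a generalized re-weighting (GRW) loss could look like, assuming the per-class weight is (1 / freq_j) ** rho, normalized to sum to the number of classes, following Sec. 3.2.2 of the paper; the actual implementation in this repo may differ:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GRWCrossEntropyLoss(nn.Module):
    # Sketch only: class_freq is the list of per-class pixel frequencies,
    # rho is the re-weighting exponent from Sec. 3.2.2 (e.g. 0.3).
    def __init__(self, class_freq, rho=0.3, ignore_index=255):
        super().__init__()
        freq = torch.as_tensor(class_freq, dtype=torch.float)
        weight = (1.0 / freq) ** rho
        weight = weight * len(freq) / weight.sum()  # normalize to sum to C
        self.register_buffer("weight", weight)
        self.ignore_index = ignore_index

    def forward(self, logits, target):
        # logits: (N, C, H, W); target: (N, H, W) for segmentation
        return F.cross_entropy(logits, target, weight=self.weight,
                               ignore_index=self.ignore_index)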

@Ianresearch
Author

I will try again. The dataset has a long-tail distribution; the largest class ratio is 74% and the smallest is only 0.2% (74%, 11%, 14.8%, 0.2%). Thank you very much.
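For that distribution, the GRW weights with ρ=0.3 work out as follows (again assuming the weighting form (1 / freq_j) ** rho, normalized to sum to the number of classes, from Sec. 3.2.2):

import torch

freq = torch.tensor([0.74, 0.11, 0.148, 0.002])
rho = 0.3
weight = (1.0 / freq) ** rho
weight = weight * len(freq) / weight.sum()
print(weight)  # approximately tensor([0.39, 0.69, 0.63, 2.29])

The 0.2% class gets roughly six times the weight of the 74% class, so the re-weighting is fairly mild at ρ=0.3 for this distribution.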

@tonysy
Copy link
Member

tonysy commented Dec 13, 2021

Hi, the case described is imbalanced classification, not long-tail. Long-tail means there exist many tail classes (typically hundreds or thousands of classes in total).

@Fly-dream12

In this project, where is the code for imbalanced image classification, and which script should be used? @tonysy
