How to convert semantic segmentation results into required format #17

Open
neoyang0620 opened this issue Nov 19, 2020 · 5 comments

@neoyang0620

Hi,
Thanks for your amazing work.

I ran into a question when applying this model to unseen data. The model requires two extra inputs: sem_labels and sem_scores. I checked your paper and couldn't find specific instructions on how to convert the original semantic segmentation results into these two inputs.

The semantic segmentation model is this. It predicts a W x H x L score matrix. Could you briefly explain the operations that follow?

Best,
Neo

alexlopezcifuentes self-assigned this on Nov 19, 2020
alexlopezcifuentes added the "question" (Further information is requested) label on Nov 19, 2020
@alexlopezcifuentes
Member

Hi!

Thanks for your question!

If your semantic segmentation model outputs a score tensor Y of size L x W x H (with normalized probabilities, i.e. between 0 and 1), you need to sort Y along the label dimension and select the 3 labels with the highest probability for each pixel. That gives you the Top@3 labels with their respective Top@3 scores.
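As a minimal sketch of the selection described above, assuming the probabilities are available as a NumPy array of shape (L, W, H): the function name `top3_labels_and_scores` and the toy data are illustrative and do not come from the repository. In PyTorch, `torch.topk(Y, k=3, dim=0)` returns the same values and indices in one call.

```python
import numpy as np

def top3_labels_and_scores(probs):
    """Return the Top@3 labels and scores for every pixel.

    probs: float array of shape (L, W, H) with per-pixel class probabilities.
    Returns (labels, scores), each of shape (3, W, H), ordered from the
    most probable class to the third most probable one.
    """
    # Sort class indices by descending probability along the label axis.
    order = np.argsort(-probs, axis=0, kind="stable")
    labels = order[:3]
    # Gather the probabilities that belong to the selected labels.
    scores = np.take_along_axis(probs, labels, axis=0)
    return labels, scores

# Toy example (hypothetical data): 4 classes on a 1 x 2 image.
probs = np.array([
    [[0.10, 0.70]],
    [[0.60, 0.10]],
    [[0.25, 0.15]],
    [[0.05, 0.05]],
])
labels, scores = top3_labels_and_scores(probs)
print(labels[:, 0, 0], scores[:, 0, 0])
```

For the first pixel of the toy input, the three most probable classes are 1, 2 and 0, in that order.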

You can do that on the fly in the training and evaluation loop. However, to save time, we decided to save those labels and scores to images so that the prediction is only done once. To save 3 semantic labels per pixel, we decided to encode them into the three channels of an RGB image.
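The encoding could look roughly like the sketch below. The thread only states that the three labels go into the three channels of an RGB image; the helper names and, in particular, the scaling of scores by 255 are assumptions made for illustration, not the repository's confirmed implementation.

```python
import numpy as np

def encode_labels_rgb(labels):
    """Pack Top@3 labels of shape (3, W, H) into a (W, H, 3) uint8 image."""
    if labels.min() < 0 or labels.max() > 255:
        raise ValueError("labels must fit in 8 bits to be stored per channel")
    return np.moveaxis(labels, 0, -1).astype(np.uint8)

def encode_scores_rgb(scores):
    """Pack Top@3 scores in [0, 1] into a (W, H, 3) uint8 image.

    ASSUMPTION: scores are scaled by 255 and rounded to the nearest integer.
    """
    return np.moveaxis(np.rint(scores * 255.0), 0, -1).astype(np.uint8)

def decode_scores_rgb(image):
    """Invert encode_scores_rgb up to 8-bit rounding error."""
    return np.moveaxis(image.astype(np.float32) / 255.0, -1, 0)

# Hypothetical Top@3 data for a 1 x 2 image.
labels = np.array([[[1, 0]], [[2, 2]], [[0, 1]]])
scores = np.array([[[0.60, 0.70]], [[0.25, 0.15]], [[0.10, 0.10]]])

label_image = encode_labels_rgb(labels)
score_image = encode_scores_rgb(scores)
restored = decode_scores_rgb(score_image)
```

With 150 classes every label fits into a single 8-bit channel, so the label image is lossless, while the score image is only accurate to about 1/510 under the scaling assumed here.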

@neoyang0620
Author

Thanks for your explanation. I have one further question. According to your paper, the semantic segmentation score is set to "0" if it is not one of the top-3 predictions. However, there are many zero regions in the precomputed segmentation data for SUN397. How is that possible?

Best,
Neo

@alexlopezcifuentes
Member

Because 0 might be a semantic label. In the case of the ADE20K dataset, it corresponds to the "Wall" class.

@neoyang0620
Author

For example, the sem_score file './Data/Datasets/SUN397/noisy_scores_RGB/val/airplane_cabin/sun_akzqlgepekqslhbn.png' contains many zero values. I don't think there should be any zero values.

@neoyang0620
Author

neoyang0620 commented Nov 19, 2020

If I understand correctly, the original output of the semantic segmentation network is a Scores (150xWxH) matrix.

  1. We normalize the distribution at each pixel so that torch.sum(Scores_norm[:, i, j]) == 1. The output of this stage is Scores_norm (150xWxH).
  2. We generate sem_scores (3xWxH) by extracting the top-3 scores at each pixel. sem_scores[0,i,j] is the top-1 score at pixel (i,j).
  3. We retrieve the indices of the top-3 scores as sem_labels (3xWxH). The matrix is divided by 255.

Following this procedure, the values in sem_scores cannot be zero. Am I missing a step?
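The reasoning in the steps above can be checked numerically. The sketch below uses a made-up logit vector for a single pixel and a softmax normalization, both of which are assumptions. It shows that the floating-point top-3 scores are strictly positive. It also illustrates one possible source of zeros that this thread does not confirm: if scores are stored as 8-bit image channels by scaling by 255 (an assumption), any score below roughly 1/510 rounds to 0.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Hypothetical logits for one very confident pixel over 150 classes.
logits = np.zeros((150, 1, 1))
logits[0] = 12.0
logits[1] = 2.0
logits[2] = 1.0

# Step 1: per-pixel normalization, so the 150 scores sum to 1.
scores_norm = softmax(logits, axis=0)

# Step 2: top-3 scores per pixel, highest first.
sem_scores = -np.sort(-scores_norm, axis=0)[:3]

# ASSUMPTION: scores are written to a uint8 image by scaling by 255.
quantized = np.rint(sem_scores * 255.0).astype(np.uint8)

print(sem_scores[:, 0, 0])  # strictly positive floats
print(quantized[:, 0, 0])   # second and third entries round to 0
```

Under these assumptions the stored channels for this pixel are 255, 0 and 0, even though none of the underlying probabilities is exactly zero.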
