How does the loss function work, and how is it actually implemented in Torch? #4
Okay, after reading up on https://arxiv.org/abs/1506.02106 and watershed transforms, I think I understand the losses. How exactly are the blobs computed? (How are they stored and used for calculating the split-level loss?)
Blobs are computed in two steps (as shown in lines 33-46 of the linked code). As a result, this gives you a binary matrix for each category. Note that the background is also a category.
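A minimal sketch of such a two-step blob computation, assuming the network output is a `(batch, n_classes, H, W)` score array and using skimage's connected components (the repo's exact code may differ):

```python
import numpy as np
from skimage.measure import label

# Fake network output: (batch, n_classes, H, W) per-pixel class scores.
scores = np.random.rand(1, 3, 100, 100)

# Step 1: per-pixel argmax gives each pixel's predicted category.
pred_mask = scores.argmax(axis=1).squeeze(0)   # (H, W), values in {0, 1, 2}

# Step 2: connected components within each non-background category.
# (pred_mask == category) is the binary matrix for that category; label()
# then assigns a unique integer id to each connected blob inside it.
n_classes = scores.shape[1]
blobs = np.zeros((n_classes - 1, *pred_mask.shape), dtype=np.int64)
for category in range(1, n_classes):           # category 0 = background
    blobs[category - 1] = label(pred_mask == category)
```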
What does it mean when blobs[None] is returned? I've never seen this anywhere before. (Also counts[None].)
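For reference, indexing with `None` is standard NumPy (and PyTorch) behavior: `None` is `np.newaxis`, which inserts a new axis of size 1, typically a leading batch dimension. A quick demonstration:

```python
import numpy as np

blobs = np.zeros((100, 100))
print(blobs.shape)         # (100, 100)
print(blobs[None].shape)   # (1, 100, 100): None adds a leading batch axis

counts = np.array([3])
print(counts[None].shape)  # (1, 1)
```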
I went through all the code in model.py and I don't understand much of it. In particular, in class FCN8 I follow the computation up to fc7, but I get lost in the semantic segmentation part. What exactly is happening there? Also, how many outputs does the model have? It's supposed to output the blobs, and it's also supposed to output the locations of the detected objects, right? I'm getting confused about what the output of the model is and how exactly the blobs are handled. (Apologies if I'm coming back to the same thing again; I'm having a tough time wrapping my head around this.)
The segmentation part you showed is the upsampling path, which combines different features from VGG16 to output a score map with one activation per pixel per class. At this point, there are no blobs, just activations for each pixel for each class. Hope this helps.
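In shape terms, that answer amounts to something like the following (shapes and names assumed for illustration, not taken from model.py): the model returns a single dense score map, and everything else, blobs included, is derived from it afterwards.

```python
import torch

n_classes, H, W = 2, 224, 224
logits = torch.randn(1, n_classes, H, W)  # the model's single output: per-pixel class scores

# Per-pixel class predictions come from an argmax over the class dimension;
# blobs and object locations are computed from this mask in a later step.
pred_mask = logits.argmax(dim=1)          # (1, H, W)
```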
No, I mean more specifically: what does that upsampling code do? And how are blobs computed from this again?
Hey, I'd been looking at your code for a while now and have zeroed in on the parts I don't understand (there are other parts I don't understand either, but they require me to understand these first).
I don't understand how you get the target values: when I tried to simulate that piece of code (with respect to the TRANCOS dataset, which has only two classes, background and foreground), ones and BgFgCounts don't have the same number of dimensions. Also, is there any reason you flatten Target twice?
I'm not entirely sure what's happening in the second snippet either. I understand a lot of these questions might be very trivial 😅, but you have no idea how much your help is appreciated. Thanks again 😃.
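One possible reason for the reshaping (hedged, since the snippet itself wasn't reproduced here): `F.nll_loss` accepts either dense `(N, C, H, W)` inputs with `(N, H, W)` targets, or flattened `(N·H·W, C)` inputs with `(N·H·W,)` targets, so a dense score map often gets permuted and flattened before the loss. A self-contained illustration of the flattened form:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 2, 4, 4)         # (N, C, H, W) scores, 2 classes
target = torch.randint(0, 2, (1, 4, 4))  # (N, H, W) per-pixel labels

log_probs = F.log_softmax(logits, dim=1)
flat_input = log_probs.permute(0, 2, 3, 1).reshape(-1, 2)  # (N*H*W, C)
flat_target = target.reshape(-1)                           # (N*H*W,)
loss = F.nll_loss(flat_input, flat_target)
```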
Hope this helps! PS: In case it's helpful, you can set a breakpoint with ipdb to inspect things as the code runs.
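For example (this is standard ipdb usage, not a repo-specific helper):

```python
# Drop this line anywhere in the code; execution pauses there and you can
# inspect variables, step through lines, and print shapes interactively.
import ipdb; ipdb.set_trace()
```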
Thanks a lot! I didn't know about ipdb; I'll make use of it.
Hey, just to clarify: in models.py, in ResFCN, you're first reducing the size using interpolate and then, at the end, increasing the size with the last interpolate? Because I'm guessing the logits_16s spatial dim will be smaller than that of 32s, and the 8s spatial dimensions would be smaller than those of 16s?
The 8s/16s/32s suffixes refer to the output stride relative to the input, so it's the other way around: logits_32s has the smallest spatial size and logits_8s the largest. Each interpolate call upsamples the coarser maps to match the finer ones, and the final interpolate upsamples the fused map back to the input resolution.
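A self-contained sketch of that ordering, with assumed shapes for a 256×256 input (illustrative, not the repo's exact code):

```python
import torch
import torch.nn.functional as F

n_classes = 2
logits_8s  = torch.randn(1, n_classes, 32, 32)  # stride 8  -> largest map
logits_16s = torch.randn(1, n_classes, 16, 16)  # stride 16
logits_32s = torch.randn(1, n_classes,  8,  8)  # stride 32 -> smallest map

# Upsample the coarser maps to the finer ones and fuse by addition.
logits_16s = logits_16s + F.interpolate(logits_32s, size=(16, 16),
                                        mode="bilinear", align_corners=False)
logits_8s = logits_8s + F.interpolate(logits_16s, size=(32, 32),
                                      mode="bilinear", align_corners=False)

# Final interpolate: back up to the input resolution.
out = F.interpolate(logits_8s, size=(256, 256),
                    mode="bilinear", align_corners=False)
print(out.shape)  # torch.Size([1, 2, 256, 256])
```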
Ah, yes, I got it. I got a bit confused by the variable names themselves because I initialized them wrong. Thanks a lot!
I've run into a few more questions that unfortunately ipdb couldn't answer (thanks a lot for that, by the way; it was very helpful).
What I understand from split_level_loss is that you're getting the boundaries between objects (inside a blob), setting those boundary pixels to background (0), and then computing nll_loss against the output of the network. Please correct me if I am wrong.
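If that reading is right, a toy sketch of such a watershed split might look like this (assuming skimage's watershed seeded at the annotation points and a distance transform; the repo's actual implementation may differ):

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed, find_boundaries

# Toy setup: one predicted blob that contains two annotated points.
blob_mask = np.zeros((20, 40), dtype=bool)
blob_mask[5:15, 5:35] = True
markers = np.zeros(blob_mask.shape, dtype=int)
markers[10, 10] = 1   # point annotation 1
markers[10, 30] = 2   # point annotation 2

# Watershed from the two points, restricted to the blob, finds a dividing line.
distance = ndimage.distance_transform_edt(blob_mask)
labels = watershed(-distance, markers, mask=blob_mask)
boundary = find_boundaries(labels)  # pixels on the split between the regions

# A split-level loss would then use these boundary pixels as background
# targets in nll_loss, pushing the network to break the blob in two.
```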
As per your suggestion, I've opened a new issue so that others might also benefit from this.