
Implementation of Focal loss #1489

Merged · 14 commits into FluxML:master, Feb 5, 2021
Conversation

@shikhargoswami (Contributor) commented Jan 30, 2021

Focal loss was introduced in the RetinaNet paper (https://arxiv.org/pdf/1708.02002.pdf).

Focal loss is useful for classification when you have highly imbalanced classes. It down-weights well-classified examples and focuses on hard examples: the loss value is much higher for a sample the classifier misclassifies than for a well-classified one.

It is used in single-shot object detection, where the imbalance between the background class and the other classes is extremely high.

Here's its TensorFlow implementation (https://github.com/tensorflow/addons/blob/v0.12.0/tensorflow_addons/losses/focal_loss.py#L26-L81).
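For reference, here is a minimal Julia sketch of the binary case, illustrating the paper's formula FL(pₜ) = -(1 - pₜ)^γ log(pₜ). This is not the code proposed in this PR; the function name and keyword defaults are placeholders.

```julia
using Statistics: mean

# Minimal sketch of binary focal loss, assuming ŷ are probabilities in (0, 1)
# and y are 0/1 labels. γ = 2 is the paper's default; ϵ guards against log(0).
function binary_focal_sketch(ŷ, y; γ=2, ϵ=eps(eltype(ŷ)))
    ŷ   = clamp.(ŷ, ϵ, 1 - ϵ)
    p_t = y .* ŷ .+ (1 .- y) .* (1 .- ŷ)   # probability assigned to the true class
    mean(@. -(1 - p_t)^γ * log(p_t))       # modulating factor down-weights easy examples
end
```

Setting γ = 0 recovers plain binary cross-entropy, which is why the loss is usually described as a generalization of it.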

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable
  • Final review from @dhairyagandhi96 (for API changes).

@darsnack (Member) left a comment

Thanks for this contribution! Left some small comments, but the approach looks good to me.

(4 inline review comments on src/losses/functions.jl, resolved)
@DhairyaLGandhi (Member)

Rather than loop iteration, it might be faster to use array operations, especially when you actually have to materialize the result with bigger arrays.

@CarloLucibello (Member)

That paper is cited enough that it may be worth having this loss in Flux, although PyTorch doesn't have it and TensorFlow has it only as an addon.

@darsnack (Member)

Yeah, having never heard of this loss before, I checked the paper. It's well-cited, so that's good enough for me to include it in Flux.

@shikhargoswami (Contributor, Author)

@darsnack @CarloLucibello I need help converting the list comprehension to array operations. Any useful leads?

@darsnack (Member) commented Jan 30, 2021

@shikhargoswami You might take a look at how binarycrossentropy is written to see how to address @DhairyaLGandhi's comment. I believe the xlogy utility is what you need.

I didn't see a performance difference when testing, but I didn't use large array sizes. Even if there isn't a gap in performance, it would be good to use the same style as the other loss functions.
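For context, the xlogy pattern looks roughly like this (paraphrased from memory rather than copied from the Flux source; binarycrossentropy_sketch is a hypothetical name):

```julia
using Statistics: mean

# x * log(y), but returning zero when x == 0 so 0 * log(0) doesn't yield NaN.
function xlogy(x, y)
    result = x * log(y)
    ifelse(iszero(x), zero(result), result)
end

# Whole-array binary cross-entropy in the same loop-free, broadcast style.
binarycrossentropy_sketch(ŷ, y; ϵ=eps(eltype(ŷ))) =
    mean(@. -xlogy(y, ŷ + ϵ) - xlogy(1 - y, 1 - ŷ + ϵ))
```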

@CarloLucibello (Member)

Assuming 0/1 targets, you can write

p_t = y .* ŷ  + (1 .- y) .* (1 .- ŷ)  

So is this used only for binary classification? This should be mentioned.
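To spell out the identity: when y = 1 this gives p_t = ŷ, and when y = 0 it gives p_t = 1 - ŷ. A quick check with made-up numbers:

```julia
y   = [1, 0, 1]
ŷ   = [0.9, 0.2, 0.4]
p_t = y .* ŷ .+ (1 .- y) .* (1 .- ŷ)   # == [0.9, 0.8, 0.4]
```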

(inline review comment on test/losses.jl, resolved)
@shikhargoswami (Contributor, Author)

I guess I had only implemented binary classification. Here are the changes I made:

  • List comprehension -> Array operation
  • Implemented categorical_focal_loss and added its test
  • Minor changes in docstring

Please check whether it needs any other changes.

@darsnack (Member) left a comment

Personally, I would prefer the names focal_loss/binary_focal_loss to match crossentropy/binarycrossentropy.

(inline review comments on src/losses/functions.jl and src/losses/Losses.jl, resolved)
@darsnack (Member) left a comment

Some suggestions on the docstrings.

(7 inline review comments on src/losses/functions.jl, resolved)
@darsnack (Member) commented Jan 31, 2021

I would commit the changes that address the comments above. I think this is close to being ready for approval, but we'll need to come to a consensus on the numerical stability vs. performance issue. That will require input from other maintainers, so we probably won't be able to merge today.

@DhairyaLGandhi (Member)

I'll also need to review the API before we make a final call.

(inline review comments on src/losses/functions.jl and test/losses.jl, resolved)
@darsnack (Member) left a comment

This looks good to go for me, but @DhairyaLGandhi will need to approve the final API.

(inline review comment on src/losses/functions.jl, resolved)
@DhairyaLGandhi (Member) left a comment

Thanks, I've added a few last thoughts, but this is looking good. We might want to review how we write our losses, but that would be a more general shift, so keeping it consistent with the current style seems sensible to me. Thanks again for the good work @shikhargoswami!

(inline review comments on src/losses/functions.jl and test/losses.jl, resolved)
@CarloLucibello (Member)

I think it would be better to assume logits as inputs instead of probabilities. In model-zoo we use logitcrossentropy everywhere instead of crossentropy for numerical stability.

@darsnack (Member) commented Feb 4, 2021

This goes back to the numerical stability question. We would need to rework the implementation to make the logits version stable. If we do switch to logits, maybe the naming should be logitfocalloss and logitbinaryfocalloss. It's tough on the eyes, but consistent with the cross-entropy convention.
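One way such a rework could look (a purely hypothetical sketch, not what was merged; logit_binary_focal_sketch is a made-up name) is to compute log(pₜ) directly from the raw scores via logσ, so the log never sees a clamped probability:

```julia
using Statistics: mean
using NNlib: logσ

# Hypothetical logit-space binary focal loss: x are raw scores, y are 0/1 labels.
function logit_binary_focal_sketch(x, y; γ=2)
    logp_t = @. y * logσ(x) + (1 - y) * logσ(-x)   # log pₜ, numerically stable
    mean(@. -(1 - exp(logp_t))^γ * logp_t)          # (1 - pₜ)^γ · (-log pₜ)
end
```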

@CarloLucibello (Member)

The fact that we have both crossentropy and logitcrossentropy is quite unfortunate. When I revisited the loss functions, I thought about having a single definition, function crossentropy(yhat, y; logits=true), but couldn't figure out a nice deprecation path. Working with unnormalized log-probabilities is what one should want in most cases for numerical stability. Should we experiment with the keyword-arg approach here? Or we can just leave this as it is; it's not worth too much overthinking.
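For concreteness, the floated single-definition idea might look like this (a hypothetical sketch only; it was never adopted, and crossentropy_kw is a placeholder name):

```julia
using Statistics: mean
using NNlib: logsoftmax

# One crossentropy accepting either probabilities or unnormalized logits.
function crossentropy_kw(ŷ, y; logits=false, dims=1)
    lp = logits ? logsoftmax(ŷ; dims=dims) : log.(ŷ)
    mean(-sum(y .* lp; dims=dims))
end
```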

@darsnack (Member) commented Feb 4, 2021

I like the possibility of making "logit" a kwarg, but I think it's better done in a separate PR. For here, let's just decide which version (logit or not) we want, and keep it consistent with what we already have.

@DhairyaLGandhi (Member)

I think the less magical the function, the better, and this makes certain contracts very implicit, like which softmax function we use. It goes against the self-documenting code we expect from Flux. I'd need to see some very compelling reasons to add opaque-looking keywords in Flux.

It's one reason I feel like we should remove agg; it doesn't justify being in every loss function when it clearly doesn't add much to most of them.

@CarloLucibello (Member)

Let's keep it as it is then; it's already done, and logitbinaryfocalloss is quite horrifying 😄

@DhairyaLGandhi (Member)

I agree, the names... need work.

@darsnack (Member) commented Feb 5, 2021

@shikhargoswami that should fix it. Can you address the remaining comments?

Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
@darsnack (Member) left a comment

Let's wait for CI to pass then I think this can merge.

@shikhargoswami (Contributor, Author)

@darsnack Thanks a lot for the help!

@darsnack (Member) commented Feb 5, 2021

No problem! Thank you for your contribution!

@darsnack (Member) commented Feb 5, 2021

bors r+

bors bot added a commit that referenced this pull request Feb 5, 2021
1489: Implementation of Focal loss r=darsnack a=shikhargoswami

Co-authored-by: Shikhar Goswami <shikhargoswami2308@gmail.com>
Co-authored-by: Shikhar Goswami <44720861+shikhargoswami@users.noreply.github.com>
@darsnack (Member) commented Feb 5, 2021

bors r-


@bors (bot) commented Feb 5, 2021

Canceled.

@darsnack (Member) left a comment

@shikhargoswami sorry for the delay, but I realized you didn't add the focal loss docstring to the actual docs in docs/models/losses.md.

@darsnack (Member) left a comment

Thanks!

@darsnack (Member) commented Feb 5, 2021

bors r+

@bors (bot) commented Feb 5, 2021

Build succeeded:

bors bot merged commit d341500 into FluxML:master, Feb 5, 2021
@DhairyaLGandhi (Member) commented Feb 6, 2021

Kyle, for the future: this is an API-related change, and the process does require a final approval from me.

And good job, @shikhargoswami!

@darsnack (Member) commented Feb 6, 2021

Sorry, my mistake; I thought you had approved the API. Next time I will leave the final merge to you on API changes.
