Fix AlphaDropout implementation and add tests #1781
Conversation
Behaviour and outputs are adapted from the PyTorch and TF implementations.
Unexpected bonus: this is now 100% GPU compatible and leaves us with one less broken test :)
looks good!
Looks good, I think this deserves an entry in NEWS.md. Especially the GPU bit.
bors r+
return x
p = a.p
iszero(p) && return x
isone(p) && return sign.(x) .* T(0)
Is this correct? Are we intentionally creating -0.0 values here?
Yes, see point 2 in #1781 (comment).
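For illustration, the signed-zero behaviour being discussed can be checked directly (a standalone sketch, not code from the PR):

```julia
x = [-1.0, 0.0, 2.0]
y = sign.(x) .* 0.0
# isequal distinguishes -0.0 from 0.0, while == treats them as equal
@assert isequal(y, [-0.0, 0.0, 0.0])
@assert y == [0.0, 0.0, 0.0]
```

So `sign.(x) .* T(0)` really does emit `-0.0` for negative inputs, which is what the question above is probing.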
iszero(p) && return x
isone(p) && return sign.(x) .* T(0)
α′ = T(-1.7580993408473766) # selu(-Inf) == -λα
We can use oftype here.
oftype requires a value of type T, but the only way to get that in this function would be to extract an element of x (which would fail if x is empty and might also trigger the getindex adjoint unnecessarily). Since this function should be specializing on eltype anyhow, it makes sense to take advantage of that for casting.
Build succeeded.
AFAICT, the original implementation never behaved as expected even pre-Zygote. This was likely not caught because the original PR didn't come with tests, so this PR should remedy that.
Behaviour and outputs are adapted from the PyTorch and TF implementations. Some points of note:

1. p = 0 is short-circuited to avoid propagating NaNs when calculating A and B.
2. p = 1. TF just returns the input, but I think the PyTorch approach of returning all zeros (+/- depending on the input sign) is more in line with Dropout.
3. ifelse is used instead of something like https://github.com/keras-team/keras/blob/v2.7.0/keras/layers/noise.py#L200. I think it better reflects the conditional nature of the operation and it was also faster in local benchmarking.

PR Checklist