Preserve the type in differentiation #149

Merged · 1 commit · Jan 24, 2020

Conversation

matsueushi (Contributor)

The current implementations of leakyrelu, elu and selu return Float64 gradients for Float32 inputs.

julia> using NNlib, Zygote

julia> leakyrelu'(1f0)
1.0

julia> leakyrelu'(-1f0)
0.009999999776482582

julia> elu'(1f0)
1.0

julia> elu'(-1f0)
0.3678794503211975

julia> selu'(1f0)
1.0507010221481323

julia> selu'(-1f0)
0.6467686295509338

cf. FluxML/Flux.jl#963. This PR makes differentiation preserve the input's floating-point precision.
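
As a rough illustration of the general pattern (a hedged sketch, not necessarily the exact diff merged here), one way to keep types stable is to derive every constant from the input, for example oftype(x, 0.01) instead of a bare 0.01 and one(x) instead of the literal 1, so that neither the value nor the Zygote gradient is promoted to Float64. The function myleakyrelu below is a hypothetical stand-in, not NNlib's definition:

using Zygote

# Hypothetical stand-in, not NNlib's actual code: every constant is derived
# from the input, so both the value and the Zygote gradient keep the
# input's precision.
myleakyrelu(x) = max(oftype(x, 0.01) * x, x / one(x))

myleakyrelu'(-1f0)  # 0.01f0 (Float32)
myleakyrelu'(-1.0)  # 0.01   (Float64)

The script below checks that the derivative type matches the input type for every activation function: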

using NNlib, Zygote, Test

ACTIVATION_FUNCTIONS = [σ, relu, leakyrelu, elu, gelu, swish, selu, softplus, softsign, logcosh];
function test_deliv_float_precision_preserving(a)
    @testset "$(a): " begin
        for T in [Float32, Float64]
            for val in [-10, -1, 0, 1, 10]
                val = @inferred a'(T(val))
                @test typeof(val) == T
            end
        end
    end
end

@testset "Float derivative inference" begin
    test_deliv_float_precision_preserving.(ACTIVATION_FUNCTIONS)
end

Before

Test Summary:              | Pass  Fail  Total
Float derivative inference |   85    15    100
  σ:                       |   10           10
  relu:                    |   10           10
  leakyrelu:               |    5     5     10
  elu:                     |    5     5     10
  gelu:                    |   10           10
  swish:                   |   10           10
  selu:                    |    5     5     10
  softplus:                |   10           10
  softsign:                |   10           10
  logcosh:                 |   10           10
ERROR: Some tests did not pass: 85 passed, 15 failed, 0 errored, 0 broken.

After

Test Summary:              | Pass  Total
Float derivative inference |  100    100

@codecov-io commented Dec 18, 2019

Codecov Report

Merging #149 into master will not change coverage.
The diff coverage is 100%.


@@           Coverage Diff           @@
##           master     #149   +/-   ##
=======================================
  Coverage   74.86%   74.86%           
=======================================
  Files          24       24           
  Lines         768      768           
=======================================
  Hits          575      575           
  Misses        193      193
Impacted Files Coverage Δ
src/activation.jl 94.11% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 136aa82...82e1fde.

@staticfloat (Contributor)

Thanks @matsueushi!

@matsueushi deleted the diff_precision branch on January 25, 2020 00:19
@DhairyaLGandhi (Member)

Should we also add this testset to our regular tests?

@matsueushi (Contributor, Author)

Should I add the test to test/activation.jl? NNlib.jl currently doesn't have a Zygote.jl dependency. Or would it be better to update Zygote.jl's tests instead?

@staticfloat (Contributor)

I would add a type-stability test to NNlib (just ensure that the activation functions don't change the type unnecessarily).

@matsueushi (Contributor, Author) commented Jan 26, 2020

If you mean the values of the activation functions, such a test is already defined in test/activation.jl, and the previous definitions passed it. I modified it to test gradients.

function test_value_float_precision_preserving(a)
    @testset "$(a): " begin
        for T in [Float32, Float64]
            for val in [-10, -1, 0, 1, 10]
                val = @inferred a(T(val))
                @test typeof(val) == T
            end
        end
    end
end

@staticfloat (Contributor)

Right, so what I mean is that we should have that same test for the gradients of the activation functions. :)

@matsueushi (Contributor, Author)

Thanks, I see what you mean.
