Derivatives of constant functions return nothing. #329

Open
MasonProtter opened this issue Sep 13, 2019 · 4 comments

Comments

@MasonProtter
Contributor

I've tested this on Zygote#master and on the latest release:

julia> using Zygote

julia> (x -> 1)'(1) == nothing
true

The derivative should probably be false, or something similar, rather than nothing.

@MikeInnes
Member

This is currently the expected behaviour; we use nothing as a "generalised zero". We could potentially replace this with a Zero type, or something, instead. false is a bit problematic because we can't dispatch on it, and this needs to work not just on numeric types but also on things like structs and symbols.

@MasonProtter
Contributor Author

MasonProtter commented Dec 3, 2019

> we use nothing as a "generalised zero"

This is problematic because nothing doesn't actually have any of the properties of zero.

julia> g(f, x) = f'(x) + x
g (generic function with 1 method)

julia> g(x -> 2, 1)
ERROR: MethodError: no method matching +(::Nothing, ::Int64)

Yeah, I think some sort of Zero type would be preferable and then we just define some basic relations:

struct Zero end
Base.:(+)(z::Zero, x) = x
Base.:(+)(x, z::Zero) = x

Base.:(-)(z::Zero, x) = -x
Base.:(-)(x, z::Zero) =  x

Base.:(*)(z::Zero, x) = z
Base.:(*)(x, z::Zero) = z

Base.:(/)(z::Zero, x) = z

Base.:(^)(z::Zero, x) = z
Base.:(^)(x, z::Zero) = one(x)

etc.
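As written, the definitions above leave Zero() + Zero() and Zero() * Zero() ambiguous, because both the (Zero, x) and the (x, Zero) method apply equally well. A minimal sketch that adds tie-breaker methods and replays the g example follows. The Zero type is the hypothetical one proposed here, not something Zygote or Base provides, and constant_derivative is an assumed stand-in for f' so the snippet runs without Zygote.

```julia
# Hypothetical sketch of the proposed Zero type; not part of Zygote or Base.
struct Zero end

# Adding Zero returns the other operand unchanged.
Base.:(+)(::Zero, x) = x
Base.:(+)(x, ::Zero) = x
# Tie-breaker: without this method, Zero() + Zero() is a method ambiguity.
Base.:(+)(z::Zero, ::Zero) = z

# Multiplying by Zero absorbs the other operand.
Base.:(*)(z::Zero, x) = z
Base.:(*)(x, z::Zero) = z
# Tie-breaker for Zero() * Zero().
Base.:(*)(z::Zero, ::Zero) = z

# Stand-in for f' on a constant function (assumption: it returns Zero()
# where Zygote currently returns `nothing`).
constant_derivative(x) = Zero()

# Same shape as g in the example above, but taking the derivative directly.
g(df, x) = df(x) + x

g(constant_derivative, 1)  # returns 1 instead of throwing a MethodError
```

The same tie-breaker pattern would be needed for -, / and ^ if those methods were added as well.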

This could potentially go into the LinearAlgebra standard library as the zero analogue of UniformScaling.
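To make the UniformScaling analogy concrete, here is a small sketch of how a zero-valued UniformScaling already behaves against matrices. It only uses the LinearAlgebra standard library; the variable names A and Z are illustrative.

```julia
using LinearAlgebra

A = [1 2; 3 4]
Z = 0I   # a UniformScaling whose scaling factor is 0

# Z behaves as an additive identity for square matrices.
Z + A == A
A - Z == A

# Multiplying by Z produces a zero matrix of the same size.
Z * A == zeros(Int, 2, 2)
```

Note that UniformScaling arithmetic is defined against matrices, so a general Zero would presumably need to cover scalars and other gradient types too.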

@MikeInnes
Member

It's an option, but note that this (or false) can easily end up giving you the reverse issue: + will throw an error if the gradient is actually non-zero (e.g. because it's a gradient of a struct).

The current recommended way to deal with this is to just use something(f'(x), 0); you can pretty easily wrap the gradient function to do this automatically if you want as well.
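A minimal sketch of such a wrapper, assuming the derivative takes a single numeric argument. zero_default is a hypothetical helper name, not a Zygote API, and d_constant and d_square are stand-ins for f' so the snippet runs without Zygote loaded.

```julia
# Hypothetical helper (not a Zygote API): wrap any derivative-like function
# so that a `nothing` result is replaced by a zero of the input's type.
zero_default(df) = x -> something(df(x), zero(x))

# Stand-ins for f'; with Zygote loaded one could pass f' itself.
d_constant(x) = nothing   # mimics the derivative of x -> 2
d_square(x) = 2x          # derivative of x -> x^2

# Same shape as the earlier g example, but safe against `nothing`.
g(df, x) = zero_default(df)(x) + x

g(d_constant, 1)   # 0 + 1 == 1
g(d_square, 3.0)   # 6.0 + 3.0 == 9.0
```

Using zero(x) rather than a literal 0 keeps the result type consistent with the input, though it still assumes x is something zero is defined for.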

@baggepinnen
Contributor

For reference, ChainRules.jl makes use of a Zero type, though I believe that choice has been questioned at some point:
https://github.com/JuliaDiff/ChainRules.jl/search?utf8=%E2%9C%93&q=Zero&type=
