gradient of array splat should be an array #599

Open
CarloLucibello opened this issue Apr 14, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@CarloLucibello
Member

The last example behaves differently from the first two:

julia> gradient(w -> sum(w), [1,1])  # ok, gradient is array
([1, 1],)   

julia> gradient(w -> sum([w[1], w[2]]), [1,1]) # ok, gradient is array
([1, 1],)

julia> gradient(w -> sum([w...]), [1,1]) # NOT OK, gradient is tuple
((1, 1),)

One problem with returning a tuple is that it breaks Flux's update! function, which expects an array.

@AzamatB
Contributor

AzamatB commented Apr 14, 2020

See #489, where @MikeInnes described the approach for fixing this.

@mcabbott
Member

BTW, this is not fixed, although applying ProjectTo to the final result hides the problem in the example above. If the gradient needs to be passed to another pullback that expects an array, you will get errors.

To bypass the projection and see the problem, call the pullback directly:

julia> pullback(w -> sum([w...]), [1,1])[2](1.0)
((1.0, 1.0),)

julia> pullback(w -> sum([w...]), [1 2; 3 4])[2](1.0)
((1.0, 1.0, 1.0, 1.0),)
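
Until this is fixed, one possible stopgap is to rebuild an array from the tuple by hand. The following is only a sketch in plain Julia: `tuple_to_array` is a hypothetical helper, not part of Zygote or Flux, and it assumes the tuple has one entry per element of `w`, in the same column-major order that `w...` iterates in. It only helps with the final result; it does nothing for a tuple that reaches another pullback in the middle of a computation.

```julia
# Hypothetical helper (not a Zygote or Flux function): turn the tuple
# gradient produced by splatting back into an array shaped like the input.
function tuple_to_array(g::Tuple, w::AbstractArray)
    # collect(g) builds a Vector from the tuple; reshape restores the
    # dimensions of w. Both use column-major order, matching `w...`.
    return reshape(collect(g), size(w))
end

w = [1 2; 3 4]
g = (1.0, 1.0, 1.0, 1.0)      # the tuple returned by the pullback above
dw = tuple_to_array(g, w)     # a 2×2 Matrix{Float64}
```

The result has the same size as `w`, so it can be handed to code that expects an array-shaped gradient.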
