Unable to get gradients of "Dense" models when sparse arrays are involved #965

Closed
simonmandlik opened this issue Dec 13, 2019 · 1 comment


@simonmandlik
Contributor

using Flux, SparseArrays

md = Dense(2,2)
ms = Dense(sparse(randn(2,2)), sparse(randn(2)))
xd = randn(2, 2)
xs = sparse(randn(2, 2))

gradient(() -> sum(md(xd)), Flux.params(md))
gradient(() -> sum(ms(xs)), Flux.params(ms))
gradient(() -> sum(ms(xd)), Flux.params(ms))
gradient(() -> sum(md(xs)), Flux.params(md))

The last call (the dense model md applied to the sparse input xs) fails on 0.10.0:

ERROR: MethodError: no method matching zero(::Type{Any})
Closest candidates are:
  zero(::Type{Union{Missing, T}}) where T at missing.jl:105
  zero(::Type{Missing}) at missing.jl:103
  zero(::Type{LibGit2.GitHash}) at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/LibGit2/src/oid.jl:220
  ...
Stacktrace:
 [1] zero(::Type{Any}) at ./missing.jl:105
 [2] _zeros_eltypes at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/higherorderfns.jl:203 [inlined]
 [3] _noshapecheck_map(::Zygote.var"#1457#1464", ::SparseMatrixCSC{Any,Int64}) at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/higherorderfns.jl:159
 [4] map(::Zygote.var"#1457#1464", ::SparseMatrixCSC{Any,Int64}) at /Users/sabae/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.3/SparseArrays/src/higherorderfns.jl:143
 [5] adjoint at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/lib/broadcast.jl:126 [inlined]
 [6] _pullback at /Users/simon.mandlik/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
 [7] adjoint at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/lib/lib.jl:139 [inlined]
 [8] _pullback at /Users/simon.mandlik/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
 [9] broadcasted at ./broadcast.jl:1231 [inlined]
 [10] Dense at /Users/simon.mandlik/.julia/packages/Flux/oX9Pi/src/layers/basic.jl:116 [inlined]
 [11] _pullback(::Zygote.Context, ::Dense{typeof(identity),Array{Float32,2},Array{Float32,1}}, ::SparseMatrixCSC{Float64,Int64}) at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
 [12] #103 at ./REPL[107]:1 [inlined]
 [13] _pullback(::Zygote.Context, ::var"#103#104") at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/compiler/interface2.jl:0
 [14] pullback(::Function, ::Params) at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/compiler/interface.jl:96
 [15] gradient(::Function, ::Params) at /Users/simon.mandlik/.julia/packages/Zygote/8dVxG/src/compiler/interface.jl:46
 [16] top-level scope at REPL[107]:1
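
Frames [1] to [4] of the trace suggest that the error surfaces when map over a SparseMatrixCSC{Any,Int64} asks for zero(Any). The following is a minimal sketch using only Base, with no Flux or Zygote involved, so it illustrates the missing method rather than reproducing the bug itself:

```julia
# zero is defined for concrete numeric element types ...
z = zero(Float64)

# ... but there is no method for the abstract type Any, which is the
# element type of the sparse matrix shown in the stack trace.
err = try
    zero(Any)
    nothing
catch e
    e
end

println(z)            # the additive identity for Float64
println(typeof(err))  # the type of the exception that was caught
```

Any has no additive identity, so any sparse operation that needs to know the value of f(0) for an Any-typed matrix cannot proceed.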

On the other hand, computing gradients of only the matrix multiplication works for all possible combinations:

gradient(x -> sum(x*xd), xd)
gradient(x -> sum(x*xs), xs)
gradient(x -> sum(x*xs), xd)
gradient(x -> sum(x*xd), xs)
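
Since the calls with the dense input xd work, one possible workaround on affected versions is to convert the sparse input to a dense matrix before passing it to the dense model. This is an assumption, not something proposed in the thread. The sketch below uses only SparseArrays; W and b are hypothetical stand-ins for the layer's weight and bias, not Flux objects, and the affine map is written out by hand:

```julia
using SparseArrays

# Hypothetical stand-ins for the weight and bias of a 2x2 dense layer.
W = randn(2, 2)
b = randn(2)

xs = sparse(randn(2, 2))  # sparse input, as in the report
xdense = Matrix(xs)       # dense copy holding the same values

# Affine map W*x .+ b, written out by hand for both input types.
ys = W * xs .+ b
yd = W * xdense .+ b

println(typeof(xdense))
println(isapprox(ys, yd))
```

The dense copy gives up the memory savings of the sparse format, so this only makes sense for inputs small enough to materialize.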
@simonmandlik
Contributor Author

This no longer seems to be a problem on the latest release.
