
Fixed issue #542. #596

Merged (4 commits) on Feb 8, 2019
Conversation

IvanYashchuk (Contributor)

Added tracking of LinearAlgebra.det and its grad method.

The gradient of the determinant is taken from
http://www.cs.cmu.edu/~zkolter/course/15-884/linalg-review.pdf (Section 4.5)

It is also mentioned there that $\det(A)\, A^{-T}$ can be used; this form also appears in the Matrix Cookbook, Sec. 2.1.2:
https://www.ics.uci.edu/~welling/teaching/KernelsICS273B/MatrixCookBook.pdf

I don't know which variant is more appropriate.
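
As a quick sanity check, the identity ∂ det(A)/∂A = det(A) A^{-T} can be compared against a central finite difference (h and fd are just illustrative names, not part of the patch):

using LinearAlgebra

A = rand(3, 3)
G = det(A) * transpose(inv(A))     # claimed gradient, det(A) * A^{-T}

h = 1e-6                           # central finite difference, entry by entry
fd = similar(A)
for i in eachindex(A)
    Ap = copy(A); Ap[i] += h
    Am = copy(A); Am[i] -= h
    fd[i] = (det(Ap) - det(Am)) / (2h)
end

maximum(abs.(fd - G))              # should be tiny (roughly 1e-9 or below)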

@KristofferC (Contributor)

Regarding your first implementation, note that adj here means the adjugate matrix and not the adjoint (and in fact adj(A) = det(A) * inv(A)).
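
For reference, the adjugate satisfies A * adj(A) = det(A) * I, so adj(A) = det(A) * inv(A) for invertible A; a small illustrative check:

using LinearAlgebra

A = rand(3, 3)
adjA = det(A) * inv(A)                  # adjugate of an invertible A
A * adjA ≈ det(A) * Matrix(I, 3, 3)     # true: defining property of the adjugate
adjA ≈ A'                               # false in general: adjoint is the conjugate transpose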

@IvanYashchuk (Contributor, Author) commented Feb 5, 2019

Yes, when I thought about it again, I noticed it.
I've changed it already.

@IvanYashchuk (Contributor, Author)

Can someone guide me on how this should be tested?

The result matches ForwardDiff.

julia> using Flux, ForwardDiff, LinearAlgebra
julia> f(x) = det(x)
julia> df(x) = Flux.Tracker.gradient(f, x)[1]
julia> a = rand(3, 3)
3×3 Array{Float64,2}:
 0.92519   0.494903   0.380887
 0.688972  0.0461206  0.226083
 0.961183  0.358504   0.60728
julia> ForwardDiff.gradient(f, a) -  df(a)
Tracked 3×3 Array{Float64,2}:
  0.0          -5.55112e-17   8.32667e-17
 -2.77556e-17   2.77556e-17   0.0
 -4.16334e-17   2.77556e-17  -5.55112e-17
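
If it helps, that comparison can be wrapped directly as a test assertion (assuming the same REPL setup; data unwraps the tracked result):

using Test

@test ForwardDiff.gradient(f, a) ≈ Flux.Tracker.data(df(a))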

@mcabbott (Member) commented Feb 5, 2019

For tests I think you just need to add this to test/tracker.jl:

@test gradtest(det, (4,4))

May I suggest also adding logdet and logabsdet at the same time? (Just Δ * transpose(inv(xs)); see the sketch below.)

Is it possible to cache the forward result? I was going to suggest the following, but it breaks second derivatives:

@grad function det(xs) 
    d = det(data(xs))
    d, Δ -> (Δ * d * transpose(inv(xs)),)
end

Note that both det and inv first compute lu(x), so even more re-use might be possible, but that would add a lot of complexity.
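
A minimal sketch of what a logdet rule could look like, following the Δ * transpose(inv(xs)) suggestion above and mirroring the @grad/data style of the det example (the TrackedMatrix/track hookup is an assumption here, not quoted from the PR):

using LinearAlgebra
using Flux.Tracker: TrackedMatrix, track, data, @grad

# assumed hookup: route logdet of a tracked matrix through Tracker
LinearAlgebra.logdet(xs::TrackedMatrix) = track(logdet, xs)

# reverse rule: d(logdet(x))/dx = transpose(inv(x))
@grad logdet(xs) = logdet(data(xs)), Δ -> (Δ * transpose(inv(xs)),)

# logabsdet returns (log|det(x)|, sign); only the first output varies smoothly
# with x, and its gradient is again transpose(inv(x)).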

@MikeInnes (Member)

This is great. Thanks a lot for the patch!
