Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a check kwarg to cholesky adjoint and bug fix #539

Merged
merged 5 commits into from
Mar 13, 2020

Conversation

mohamed82008
Copy link
Contributor

This PR adds a check kwarg to the cholesky adjoint.

@mohamed82008
Copy link
Contributor Author

CC: @willtebbutt

@DhairyaLGandhi
Copy link
Member

Lgtm

@willtebbutt
Copy link
Member

Unclear to me if this is the correct behaviour. For example, consider

using LinearAlgebra
A = randn(7, 2)
B = A * A'
julia> cholesky(B; check=false).U
7×7 UpperTriangular{Float64,Array{Float64,2}}:
 0.531198  0.880491   0.129829     -1.10028   -0.126263   1.35857    1.08306
          1.63233    1.06744       0.280939   0.998416   0.972167  -1.82391
                   -2.22045e-16   0.157038   1.04936    1.21411   -1.8063
                                 1.28954    0.419418  -1.22169   -1.70408
                                           1.01278    0.79909   -1.95777
                                                     2.79083   -0.30172
                                                               4.49967

julia> cholesky(B + 1e-9I)
Cholesky{Float64,Array{Float64,2}}
U factor:
7×7 UpperTriangular{Float64,Array{Float64,2}}:
 0.531198  0.880491  0.129829    -1.10028     -0.126263     1.35857      1.08306
          1.63233   1.06744      0.280939     0.998416     0.972167    -1.82391
                   4.61784e-5   4.52804e-5   3.14148e-5  -2.01157e-5  -8.6562e-5
                               6.73263e-5   2.42426e-5  -3.99157e-5  -8.08339e-5
                                           3.69577e-5  -4.0951e-8   -2.36548e-5
                                                       4.2695e-5    1.1151e-5
                                                                   5.17436e-5

Looks to me like the top-left triangle, up to the point where the factorisation of B breaks down, really is a function of B, so the gradient shouldn't be zero.

@mohamed82008 would you open to the idea of throwing an error instead? It's unclear to me what the utility of the reverse-pass succeeding when the forwards-pass fails is.

@mohamed82008
Copy link
Contributor Author

@willtebbutt I think throwing in the backward pass should be fine. I will update the PR.

@mohamed82008
Copy link
Contributor Author

mohamed82008 commented Mar 11, 2020

The last commit is a bug fix for:

using LinearAlgebra, Zygote

B = rand(10, 10); B = B * B' + Matrix{Float64}(I, 10, 10);
Zygote.pullback(cholesky, B)[2]((factors = LowerTriangular(rand(10, 10)),))

@mohamed82008 mohamed82008 changed the title Add a check kwarg to cholesky adjoint Add a check kwarg to cholesky adjoint and bug fix Mar 11, 2020
@willtebbutt
Copy link
Member

@mohamed82008 nice work. Could you please add a @test_throws to prevent accidental regression?

@willtebbutt
Copy link
Member

Also for whatever the above bug is. It's not quite clear what's going wrong from your example, but it's clearly getting through the tests at the minute.

U, Ū = C.U, Δ.factors
Σ̄ = Ū * U'
Σ̄ = similar(U.data)
Copy link
Member

@CarloLucibello CarloLucibello Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe better to not reassign the function argument Σ here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same symbol, it's a different one.

return C, function(Δ::NamedTuple)
issuccess(C) || throw(PosDefException(C.info))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want this also if check=false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this will be called if check = false, no?

@CarloLucibello
Copy link
Member

can you add a test for the bugfix?

@mohamed82008
Copy link
Contributor Author

I will add some tests.

@mohamed82008
Copy link
Contributor Author

Also for whatever the above bug is. It's not quite clear what's going wrong from your example, but it's clearly getting through the tests at the minute.

The problem with LowerTriangular factors is that the product will be a LowerTriangular matrix so copytri! fails. I added a test.

@CarloLucibello
Copy link
Member

thanks!

bors r+

@bors
Copy link
Contributor

bors bot commented Mar 13, 2020

Build succeeded

@bors bors bot merged commit 6e56675 into FluxML:master Mar 13, 2020
@mohamed82008
Copy link
Contributor Author

Would be nice to release this please, thank you :)

@mohamed82008
Copy link
Contributor Author

@CarloLucibello can we get a new release?

@mohamed82008
Copy link
Contributor Author

I think a tag is remaining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants