Accumulate derivative into Adjoint's original elements #184
Codecov Report
@@ Coverage Diff @@
## master #184 +/- ##
==========================================
- Coverage 84.36% 84.24% -0.13%
==========================================
Files 18 18
Lines 1721 1726 +5
==========================================
+ Hits 1452 1454 +2
- Misses 269 272 +3
The adjoint rule is causing other issues: Another solution is that we don't deal with for ... in DiffRules.diffrules()
... to something like: rules = ...DiffRules except [adjoint, conj]
for ... in rules
... to fix the problem. What do you think, @yebai @mohamed82008 @devmotion?
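A rough sketch of the filtering idea, for illustration only. The rule list here is a hand-written stand-in for the (module, function, arity) keys that DiffRules.diffrules() yields, and the names all_rules and excluded are hypothetical, not taken from the PR:

```julia
# Stand-in for the (module, function, arity) keys from DiffRules.diffrules();
# hard-coded here so the sketch runs without the package (assumption).
all_rules = [(:Base, :sin, 1), (:Base, :adjoint, 1), (:Base, :conj, 1), (:Base, :exp, 1)]

# Hypothetical exclusion list: rules that should not be turned into tracked methods.
excluded = (:adjoint, :conj)

# Keep only the rules whose function name is not excluded.
rules = filter(r -> r[2] ∉ excluded, all_rules)

for (M, f, arity) in rules
    println(M, ".", f, " with arity ", arity)
end
```

With this approach adjoint and conj would simply never get a generated method, so any existing handling of Adjoint would stay as it is.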
I don't think these rules should be excluded: #183 (comment)
We have two ways to fix this:
Because there is no concise way to handle all the broadcasting cases, I am trying the first one.
@@ -212,6 +212,8 @@ end
# JacobianTape #
################

LinearAlgebra.lu(x::Adjoint, args...) = LinearAlgebra.lu(Array(x), args...)
This is quite terrible type piracy 😥
Would it be possible to define a lu(x::TrackedArray{Adjoint,..}, ...) method here instead, to avoid or reduce the type piracy?
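To illustrate the difference, here is a minimal sketch under stated assumptions. Wrapped is a hypothetical stand-in for a package-owned array type; it is not ReverseDiff's TrackedArray, whose type parameters differ. Defining lu on a type the package owns is not piracy, whereas defining it on Adjoint alone changes behaviour for every user of LinearAlgebra:

```julia
using LinearAlgebra

# Hypothetical package-owned wrapper (stand-in only, not ReverseDiff's TrackedArray).
struct Wrapped{A<:AbstractMatrix}
    value::A
end

# Not piracy: the method dispatches on Wrapped, a type defined in this code.
# The lazy Adjoint is materialized into a plain Array before factorizing.
LinearAlgebra.lu(x::Wrapped{<:Adjoint}, args...) = lu(Array(x.value), args...)

A = [4.0 3.0; 6.0 3.0]
F = lu(Wrapped(A'))

# The factorization satisfies L * U == (A')[p, :] for the row permutation p.
M = Array(A')
println(F.L * F.U ≈ M[F.p, :])
```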
I'm not sure. Since this is so unrelated to the other changes, it seems a bit weird to me in general. Why do we have to "fix" lu if the problem is the accumulation of the derivatives? I understand (or assume) it is necessary to fix some test errors, but I think this points to another, deeper problem, or at least requires a more general solution.
Yeah, the introduction of adjoint in DiffRules also causes the failure of another test case, which is unrelated to broadcasting on Adjoint:
https://github.com/JuliaDiff/ReverseDiff.jl/blob/master/test/api/JacobianTests.jl#L293
My intention was indeed to fix it, but I am not familiar with linear algebra and Jacobian matrices; I just went through the error log and found that it needs such a method. Sorry about that. I hope somebody will dig deeper into this. Thanks.
No worries 🙂 I just found #183 (comment), so there's definitely a lot of not-so-nice, non-general code regarding Adjoint already in ReverseDiff 🙈 It would be even nicer to fix these problems more generally.
Since this is functionally correct, shall we 1) open a new issue and 2) leave further improvements to a separate PR?
#183 (also for |
Try to fix #183.
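For context on the PR title, a small sketch in plain Julia (no ReverseDiff involved, variable names are illustrative only) of why accumulating into an Adjoint reaches the original elements: the adjoint of a real matrix is a lazy wrapper that shares memory with its parent.

```julia
using LinearAlgebra

A = zeros(2, 3)   # the "original" array
At = A'           # lazy Adjoint wrapper, no copy is made

# Accumulating into the wrapper writes through to the parent array:
# At[3, 1] refers to the same element as A[1, 3].
At[3, 1] += 1.0

println(A[1, 3])
println(parent(At) === A)
```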