-
Notifications
You must be signed in to change notification settings - Fork 5
Pullback rules for truncation methods #53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
|
As @kshyatt is working on package extensions for supporting Enzyme.jl and Mooncake.jl, it might be good to revisit the interface we have currently chosen to write the pullbacks within the
While this has worked well so far in combination with ChainRules.jl & Zygote.jl, it is apparently not the best fit for the other AD packages. As far as I understood, these other AD packages encode the absence/zeroness of certain adjoints differently, and this conflicts with the fact that the different incoming adjoints (those associated with the output variables) in our interface are assumed to be packed in a tuple. Hence, I want to put this interface back on the table before continuing with this PR. Input from @kshyatt will be essential. |
|
One other problem while implementing the alternative methods for the truncated decompositions (that do not require the full decomposition) is that they do need the input matrix |
lkdvos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely happy to rethink the interface for the pullbacks. I don't think we specifically had many things in mind when we designed this, so restrategizing now that Mooncake and Enzyme are on the horizon seems like the right thing to do.
0a49429 to
bf91b1d
Compare
c839270 to
ab08956
Compare
[skip ci]
| UΔUp = view(UΔU, :, indU) | ||
| mul!(UΔUp, Ur', ΔU) | ||
| ΔU -= Ur * UΔUp | ||
| mul!(ΔU, Ur, UΔUp, -1, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether modifying the incoming adjoints in place is allowed by all the AD backends. That is why I had refrained from doing so. Maybe @kshyatt can comment on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are probably right, and I definitely don't want to risk that for saving a single allocation. I just hadn't thought about that.
A first attempt at generalizing the
eig,eighandsvdpullback rules so that they can also be used in the case of truncated methods withTruncatedAlgorithm, i.e. where first the full decomposition is computed. This was already supported forsvd, but with the implicit assumption that the singular values/vectors kept were always the first ones.This clearly needs additional tests, especially for the
eigandeighmethods where such a sorted truncation order is definitely not guaranteed.Alternatively, I can also write an implementation that does not use the full decomposition, and solves the problem in terms of Sylvester equations with the original matrix. That might actually more generic. In particular, I am also thinking about randomized SVD, where one would not have the full decomposition available (and for which the current rules would thus be wrong or failing).