Reusable and NonReusable capability #592
base: main
Codecov Report
Base: 93.16% // Head: 93.16% // No change to project coverage 👍
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #592   +/-   ##
=======================================
  Coverage   93.16%   93.16%
=======================================
  Files          15       15
  Lines         907      907
=======================================
  Hits          845      845
  Misses         62       62
Before merging, I want to see a PR on Zygote where this is hooked into the code for gradient vs. jacobian.
Still working on that. I tried to hook that into Zygote with the idea of injecting the information into […]. P.S. If anyone has time and knows how to make it work, feel free to take over.
Happy to help workshop the Zygote PR. I can't think of many places where we'd be able to use it without adding new rules, however: the rules that would use it are likely to rely on mutation, and there aren't many second-order rules for those.
We should be able to use it to add new optimized rules. Ideally the change shouldn't affect any of the old rules, and we could decide case by case whether to add an optimized rule with mutation. But if we add those optimized rules, we also need to know whether the pullback is executed inside an AD context, to avoid breaking higher-order AD. Do we have an interface for second-order rules? It seems Zygote uses jacobian together with gradient to compute second-order derivatives, and we can't specify a rule for second order directly?
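To make the concern concrete, here is a minimal sketch of why a mutating pullback is only safe when it is called once. It does not use ChainRulesCore or Zygote: the marker types `Reusable` and `NonReusable` borrow their names from this PR's title, but their definitions, the `rule` function, and the helpers `gradient_of_sum` and `jacobian_via_pullback` are all hypothetical and exist only for illustration. The sketch assumes a gradient-style caller invokes the pullback once, while a jacobian-style caller invokes it once per output element.

```julia
# Hypothetical marker types: names taken from the PR title, definitions assumed.
abstract type PullbackReuse end
struct Reusable <: PullbackReuse end      # pullback may be called many times
struct NonReusable <: PullbackReuse end   # pullback will be called at most once

f(x) = 2 .* exp.(x)

# Safe rule: the pullback allocates a fresh result on every call.
function rule(::Reusable, ::typeof(f), x)
    e = exp.(x)
    pullback(dy) = 2 .* dy .* e
    return 2 .* e, pullback
end

# Optimized rule: the pullback overwrites the captured buffer `e` in place,
# saving an allocation but destroying the state a second call would need.
function rule(::NonReusable, ::typeof(f), x)
    e = exp.(x)
    function pullback(dy)
        e .*= 2 .* dy
        return e
    end
    return 2 .* e, pullback
end

# Gradient-style use: the pullback is called exactly once.
function gradient_of_sum(mode, x)
    y, back = rule(mode, f, x)
    return back(ones(length(y)))
end

# Jacobian-style use: the same pullback is called once per output element.
function jacobian_via_pullback(mode, x)
    y, back = rule(mode, f, x)
    n = length(y)
    J = zeros(n, length(x))
    for i in 1:n
        seed = zeros(n)
        seed[i] = 1.0
        J[i, :] = back(seed)   # copies the returned values into row i
    end
    return J
end
```

With `x = [0.0, 1.0]`, both modes agree for `gradient_of_sum`, but `jacobian_via_pullback(NonReusable(), x)` returns a wrong second row because the first pullback call already overwrote the buffer. The same failure would apply to higher-order AD, where the pullback itself gets differentiated or re-run, which is why the caller has to tell the rule which situation it is in.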
I don't think so. This is something I'd really like to see in CRC, but it's not clear to me what a higher-order rules interface would look like. Diffractor appears to have something for this (usage example here?), but I don't quite understand how it works.
AIUI Zygote only needs […]. On possible ideas, I've been reading through https://github.com/pytorch/pytorch/blob/master/tools/autograd/derivatives.yaml and their model looks the most similar to FluxML/NNlib.jl#434. As noted on that PR, though, such a model might not work for all ChainRules-compatible ADs. Another way that works today would be to manually split out an […]
addressed in #591
cc: @oxinabox, @mzgubic