-
-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
port rule definitions to ChainRulesCore #242
port rule definitions to ChainRulesCore #242
Conversation
It makes sense to keep NNlib fairly orthogonal and not have it depend on specific AD tooling. That's just the first thought in my head for this. Maybe we could have the backwards functions defined here and have zygote have the "glue" adjoint/ rrule defs? The release work should be fine either way. |
The idea behind ChainRulesCore.jl is basically that it's orthogonal to any specific AD implementation, it just allows packages to define rules in a generic way that a specific AD tool can use, so I would argue that |
I would go with the easier solution, and just tag breaking releases, I'm sure no one will come after us :) |
Ok, as @mcabbott pointed out in FluxML/Zygote.jl#824 (comment), I think it would make sense to use |
Ok, this should now be ready from my side. (Probably better to merge the Zygote PR first though, so we can rerun CI here) |
824: move NNlib rules out of Zygote r=CarloLucibello a=simeonschaub Partner to FluxML/NNlib.jl#242. I talked a bit about how to go about releasing this there, would appreciate any feedback and suggestions. Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
824: move NNlib rules out of Zygote r=CarloLucibello a=simeonschaub Partner to FluxML/NNlib.jl#242. I talked a bit about how to go about releasing this there, would appreciate any feedback and suggestions. Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
824: move NNlib rules out of Zygote r=CarloLucibello a=simeonschaub Partner to FluxML/NNlib.jl#242. I talked a bit about how to go about releasing this there, would appreciate any feedback and suggestions. Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
how do we solve Compat issues? should we bump NNlib version later? |
My thinking was that we first tag Zygote 0.6 and then we merge and release this. Packages like Flux that depend on both Zygote and NNlib should then bump their compat for both simultaneously. |
Zygote 0.6 is being tagged |
Definitely shouldn't tag without the due diligence, I think If ci is a consideration, we can use a manifest file, not necessarily tag |
Weird, tests in the multi-threaded environment are not passing |
Yes, I just checked, this reproduces for me locally as well when using two threads. |
It looks like there is a race condition somewhere in the pooling code. Who would be best to take a look at that? I am not really familiar with what it is doing. |
6a53068
to
5df14d2
Compare
a205b17
to
5878efa
Compare
Ok, doesn't really seem related to threading, I think I misdiagnosed that. I don't think the differences here can be explained by numerical instabilities though, as they are often quite large, but am a bit unsure how to further debug this, |
Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
9dbfded
to
c9eeb25
Compare
Ok, there's something really sketchy going on with |
Maybe @staticfloat or @thebhatman can help with maxpool? |
I added the rules for the new softmax interface here #250 . I put the rules close to the method definition instead of using a common |
@simeonschaub since this |
Not really AFAIK. Most packages probably added them as an afterthought, so it's usually in a different file, but I see no reason not to put them in the same file if that makes more sense.
Ok, I can try to mark these as broken for now, but it would be good to open a separate issue, since they look serious. |
not even worth marking them as broken, since they pass on some platforms and fail on others |
@simeonschaub merge whenever you are ready |
I don't think I have commit rights here. If you merge, it would probably be good to squash merge, since I generally tend to add commits just for better reviewability with the assumption that the PR is going to be squashed anyways. I can also squash myself though, if that makes things easier. |
yes, please squash then I'll do the merge |
actually, I didn't now how easy it easy to squash-merge, I'll do that |
The current solution of defining these in Zygote is quite unsatisfying and makes working on these unnecessarily complicated and it's also a bit weird that Zygote depends on NNlib at all. Since rules should now be defined using ChainRulesCore anyways, making use of that and moving the definitions into NNlib itself makes the most sense to me. This will require careful coordination with a new release of Zygote removing these rules, because tests will now fail because of circular dependencies. The best solution might be to add upper bounds on NNlib for all Zygote versions in General retroactively, release a new Zygote version with no dependency on NNlib and then release this. Alternatively, we could tag a breaking release for NNlib, but that seems a bit wrong to me, since these changes shouldn't actually be breaking for anything except Zygote.jl.
Rule definitions are ported from https://github.com/FluxML/Zygote.jl/blob/master/src/lib/nnlib.jl. The tests are mostly just copied from https://github.com/FluxML/Zygote.jl/blob/master/test/gradcheck.jl, eventually we probably want to port these to ChainRulesTestUtils.jl, which should be quite a bit more accurate and thorough.