Tests are slow: use vjvp ? #204
Comments
A big part of it is that FiniteDifferences.jl is slower, but more accurate, than what Zygote's [...] We could try tinkering with that. Getting it faster that way might be possible, and might not break too many tests, though we might also need to relax some of the [...]
Another thing to consider is that we're currently generating entire [...] The thing that we really need to test for reverse-mode is that ⟨J' ȳ, ẋ⟩ ≈ ⟨ȳ, J ẋ⟩. rrules are good at computing [...] The only extra thing we should need to make this work is the ability to compute inner products between the output of [...] I've been doing this for a while in TemporalGPs.jl and it seems to work really well. I wrote loads of code to hack around ChainRulesTestUtils not being up to what I needed prior to ADIA. I didn't want to contribute it back at the time because I wasn't completely sure whether this was the right way to go about things, but I'm now convinced that it is.
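To make the identity ⟨J' ȳ, ẋ⟩ ≈ ⟨ȳ, J ẋ⟩ concrete, here is a minimal, language-agnostic sketch in plain Python. It is not the ChainRulesTestUtils or FiniteDifferences.jl API: the toy function `f`, its hand-written `pullback`, and the helpers `jvp_fd`, `dot`, and `adjoint_test` are all hypothetical names made up for illustration. The point it tries to show is that one random direction ẋ and one random cotangent ȳ need only two extra evaluations of `f`, with no Jacobian ever being formed.

```python
import math
import random


def f(x):
    """Toy function R^3 -> R^2 standing in for the primal (hypothetical)."""
    return [x[0] * x[1], math.sin(x[2]) + x[0]]


def pullback(x, ybar):
    """Hand-written reverse rule for f: returns J(x)' * ybar."""
    return [
        ybar[0] * x[1] + ybar[1],
        ybar[0] * x[0],
        ybar[1] * math.cos(x[2]),
    ]


def jvp_fd(func, x, xdot, h=1e-6):
    """Central-difference estimate of J(x) * xdot using two evaluations of func."""
    plus = func([xi + h * di for xi, di in zip(x, xdot)])
    minus = func([xi - h * di for xi, di in zip(x, xdot)])
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]


def dot(a, b):
    """Euclidean inner product of two flat lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def adjoint_test(func, pb, x, xdot, ybar, rtol=1e-6, atol=1e-7):
    """Check <J' ybar, xdot> ~ <ybar, J xdot> without building J."""
    lhs = dot(pb(x, ybar), xdot)             # reverse rule under test
    rhs = dot(ybar, jvp_fd(func, x, xdot))   # finite-difference reference
    return math.isclose(lhs, rhs, rel_tol=rtol, abs_tol=atol)


rng = random.Random(0)
x = [0.5, -1.2, 0.3]
xdot = [rng.gauss(0, 1) for _ in x]      # random tangent
ybar = [rng.gauss(0, 1) for _ in f(x)]   # random cotangent
print("rule passes:", adjoint_test(f, pullback, x, xdot, ybar))
```

The tolerances and step size here are arbitrary choices for the toy problem, and a real test would presumably repeat the check over a few random draws.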
Yeah, and [...] A problem, maybe, is that if it fails, it won't tell you where it failed, will it?
Indeed.
Yeah, this is a problem. It does make life a bit trickier when it comes to debugging. I've typically found that you want to retain the ability to (slowly) compute the vjp for debugging purposes. Fortunately, you don't really need things like the ability to check for equality lying around to do this, since you'll always be doing it by eye. One of the real benefits of doing things this way is that you can test AD at scale. For example, whereas the current way of doing things really requires small problem sizes, the inner product approach can handle any problem size for which you're happy to make a small handful of function evaluations. The advantage is less that it's better to test on big problems, and more that it's convenient to be able to test any old problem you have lying around, regardless of its size.
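The slow debugging fallback described above could look something like the following sketch, again in plain Python rather than any real ChainRulesTestUtils API. The names `f`, `pullback`, and `vjp_fd` are hypothetical. It estimates J' ȳ one input coordinate at a time, so it costs two evaluations of `f` per input (versus two in total for the inner product check), and it prints the two results side by side for inspection by eye.

```python
import math


def f(x):
    """Toy function R^3 -> R^2 standing in for the primal (hypothetical)."""
    return [x[0] * x[1], math.sin(x[2]) + x[0]]


def pullback(x, ybar):
    """Hand-written reverse rule for f: returns J(x)' * ybar."""
    return [
        ybar[0] * x[1] + ybar[1],
        ybar[0] * x[0],
        ybar[1] * math.cos(x[2]),
    ]


def vjp_fd(func, x, ybar, h=1e-6):
    """Slow reference J(x)' * ybar: one central difference per input coordinate."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        # i-th column of J, estimated by perturbing only input i
        col = [(p - m) / (2 * h) for p, m in zip(func(xp), func(xm))]
        out.append(sum(c * yb for c, yb in zip(col, ybar)))
    return out


x = [0.5, -1.2, 0.3]
ybar = [0.7, -0.4]
for i, (a, b) in enumerate(zip(pullback(x, ybar), vjp_fd(f, x, ybar))):
    print(f"input {i}: rule = {a:+.6f}   finite differences = {b:+.6f}")
```

Because the output is per coordinate, a mismatch points at the input whose derivative is wrong, which is the information the inner product check alone does not give.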
Had a quick stab at doing this in #208 (I'll not be pushing this further myself in the immediate future, just wanted to see what it might look like). On the surface of it, it doesn't look like we have any substantial practical impediments to doing this, but I've not dug into the details.
Compare Zygote's `gradtest` and CRTU's `test_rrule`: the latter is nearly 100x slower. Do we understand why? I didn't have time to look into it, so I'm just making an issue.
ChainRules tests take a pretty long time to run so this might be worth improving.