Fix ADADelta calculations and broken tests not catching the problems #1299
Conversation
29832ac broke ADADelta. This PR simply reverses that change and adds comments warning not to move the epsilons around. Fixes #1158
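For reference, here is a minimal sketch of the ADADelta rule as given in Zeiler (2012), not Flux's exact code; `adadelta!` and its argument names are illustrative. In the paper's formulation, each ϵ sits inside its own square root, paired with its accumulator, and moving it changes the optimiser's behaviour:

```julia
# Textbook ADADelta step (Zeiler 2012); illustrative, not Flux's implementation.
function adadelta!(x, g, acc_g, acc_Δ; ρ = 0.9, ϵ = 1e-8)
    @. acc_g = ρ * acc_g + (1 - ρ) * g^2        # running average of squared gradients
    Δ = @. √(acc_Δ + ϵ) / √(acc_g + ϵ) * g      # ϵ stays inside both square roots
    @. acc_Δ = ρ * acc_Δ + (1 - ρ) * Δ^2        # running average of squared updates
    @. x -= Δ
    return x
end
```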
|
thanks! can we fix the tests so that we would have caught this?
|
I kind of found why the test isn't working, but I have no idea how it is happening. Lines 6 to 13 in 1dc9023:
If you add an … If you remove the …
|
Alright, figured it out. The `@testset` macro resets the global RNG's state after every body execution, which means that the true parameters `w` and `w′` were always getting the same values. There was nothing to optimise, so the tests would always pass. Fixed that by explicitly setting different seeds.
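The trap is easy to reproduce outside Flux. A minimal sketch, assuming a plain script where nothing else has touched the global RNG:

```julia
using Test, Random

w = randn(2)             # draws from the freshly seeded global RNG

@testset "trap" begin
    # A @testset re-seeds the global RNG with its original seed on entry
    # (and restores the state on exit), so this draw replays the one above.
    w′ = randn(2)
    @test w == w′        # passes: the "random" parameters are identical
end

@testset "fixed" begin
    # The fix applied here: seed explicitly so the parameters genuinely differ.
    Random.seed!(84)
    w = randn(2)
    Random.seed!(42)
    w′ = randn(2)
    @test w != w′
end
```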
|
Did you verify/cross-check locally whether there are other optimisers affected by this?
|
Are you talking about the tests not doing anything, or about other optimisers being affected by epsilon-like changes? The tests should be fixed, I think; all of the optimisers are being tested in the same `@testset`. I haven't checked the other optimisers for algorithmic correctness. That would be next.
|
Looks like OADAM might be broken; the test failed for only that one.
|
The calculations of OADAM look fine to me. It does seem to have reduced the loss to a certain extent, just not as much as the others. Perhaps changing the test to just check that the loss has decreased would be a good idea (as opposed to checking it has dropped below a hardcoded value of 0.01). Please provide other suggestions too. Also, should this be a separate PR? It isn't specific to fixing ADADelta anymore; that is already fixed.
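A hedged sketch of what the relaxed check could look like, with illustrative names and plain gradient descent standing in for the optimiser (this is not Flux's actual test code):

```julia
using Test, Random

Random.seed!(0)
w̄ = randn(10, 10)                     # target parameters
w = randn(10, 10)                      # trainable parameters
x = randn(10)
loss(w) = sum(abs2, w * x .- w̄ * x)

initial = loss(w)
for _ in 1:1000
    g = 2 .* (w * x .- w̄ * x) * x'    # analytic gradient of the loss w.r.t. w
    w .-= 1e-4 .* g                    # in-place descent step
end
@test loss(w) < initial                # require improvement, not `loss < 0.01`
```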
|
It might mean that the default parameters of the optimiser aren't ideal.
|
That is a good possibility. The paper itself does not suggest any default parameters; it only states which ones were used in its experiments. On top of that, neither PyTorch nor TensorFlow has implemented OADAM, so that avenue for finding established defaults is closed too.
|
Increasing the learning rate of OADAM by a factor of 10 is enough to make it pass the test. Should we change the default parameter value, or only increase it in the test?
|
The default one would be best, thanks! |
To make it pass the Optimise testset.
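The change then amounts to raising the default η in the constructor. A sketch in the style of Flux's optimiser definitions; the field names, the β defaults, and the exact old and new values here are assumptions, not Flux's verbatim code:

```julia
# Illustrative OADAM definition with the default learning rate raised 10x
# (e.g. 1e-3 → 1e-2) so it passes the Optimise testset out of the box.
mutable struct OADAM
    eta::Float64
    beta::Tuple{Float64,Float64}
    state::IdDict
end
OADAM(η = 1e-2, β = (0.5, 0.9)) = OADAM(η, β, IdDict())
```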
|
Ready for review
|
Thanks! LGTM. bors r+
|
Build succeeded: |
The testset probably needs more work beyond this specific case, so I'm not adding that here.
PR Checklist
- Tests added/fixed
- @dhairyagandhi96 (for API changes)