Add SGD with Momentum Optimizer, addresses #298 (#369)

Merged: 8 commits into mratsim:master on Jul 23, 2019

Conversation

dylanagreen (Contributor)

This PR adds momentum via a new SGDMomentum object. The default values for SGDMomentum are the same as for SGD, and when left at their defaults the SGDMomentum optimizer behaves identically to the base SGD optimizer. SGDMomentum optimizers must be declared with `var`, similarly to Adam and unlike SGD. Additionally, an SGDMomentum object has a parameter that lets it use Nesterov momentum rather than regular momentum. I've included a link to the paper that proposes Nesterov momentum in the documentation.
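For illustration, here is a minimal declaration sketch. The constructor and parameter names are my assumptions (modelled on the existing optimizerSGD convention), and `model` stands for a network defined elsewhere; see the PR diff for the actual API.

```nim
import arraymancer

# Hedged sketch: optimizerSGDMomentum and its parameter names are
# assumptions, and `model` is a network defined with the usual macros.
var optim = model.optimizerSGDMomentum(
  learning_rate = 0.01'f32, # with momentum left at 0, this matches plain SGD
  momentum = 0.9'f32,
  nesterov = true           # false (the default) gives classic momentum
)

# `var` is required, as with Adam: update() mutates the stored moments.
optim.update()
```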

This implementation of SGD with momentum agrees with the PyTorch implementation to within 3-4 decimal places, in both the Nesterov and base momentum cases. I have included a PyTorch output for three SGD test cases (without momentum, with momentum, and with Nesterov momentum) as well as one for Adam. Tests for SGDMomentum are located in tests/nn/test_optimizers.nim. The included PyTorch implementation is a modification of the following legacy PyTorch test: https://github.com/pytorch/pytorch/blob/master/test/optim/test.py.

It may be possible to automate these tests; right now I've checked them manually by eye. If the output of the PyTorch implementation is saved to a file, the Arraymancer test could load the results and check that at each optimizer step the new values of the (x, y) model points are "close enough" to the PyTorch output (see the sketch below).
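A hedged sketch of what that automated check could look like, assuming a hypothetical reference file with one "x y" pair per optimizer step (the path, format, and tolerance here are illustrative, not part of this PR):

```nim
import std/[strutils, unittest]

proc closeEnough(a, b: float; tol = 1e-3): bool =
  ## Tolerance picked to match the observed 3-4 decimal place agreement.
  abs(a - b) <= tol

test "SGDMomentum tracks the saved PyTorch trajectory":
  # Hypothetical file produced by the modified PyTorch script.
  for line in lines("tests/nn/pytorch_sgd_momentum.txt"):
    let cols = line.splitWhitespace()
    let (expX, expY) = (parseFloat(cols[0]), parseFloat(cols[1]))
    # Advance the Arraymancer model one optimizer step here, then:
    # check closeEnough(model.x, expX)
    # check closeEnough(model.y, expY)
    discard (expX, expY)
```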

- I've retained an older non-momentum version of SGD for backwards compatibility. Storing moments requires
a variable SGD object, and most code written for Arraymancer prior to this more than likely declares
its optimizers with `let`, since that is how it is done in the examples.
- This reordering doesn't change the behaviour of update(), but it does make it easier
to implement Nesterov momentum (sketched after this list).
- This preserves backwards compatibility with old `let optim` declared SGD optimizers.
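
For reference, my reading of the update rule involved (a reconstruction mirroring the PyTorch formulation the tests compare against, not the actual diff): computing the velocity first means the Nesterov case reduces to a one-line branch.

```nim
# Reconstruction of the momentum step (PyTorch-style convention):
#   v <- mu * v + g
#   classic:  p <- p - lr * v
#   Nesterov: p <- p - lr * (g + mu * v)
proc momentumStep(p, v: var float; g, lr, mu: float; nesterov: bool) =
  v = mu * v + g                 # velocity is updated first...
  if nesterov:
    p -= lr * (g + mu * v)       # ...so the look-ahead term is one line
  else:
    p -= lr * v

# With mu = 0 both branches collapse to plain SGD: p <- p - lr * g.
```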
@dylanagreen changed the title from "Add SGD with Momentum Optimizer, partially closes #298" to "Add SGD with Momentum Optimizer, addresses #298" on Jul 22, 2019
@mratsim merged commit 2e7b193 into mratsim:master on Jul 23, 2019
mratsim (Owner) commented on Jul 23, 2019:

Perfect, thank you very much
