
add lookahead optimiser #969


Closed · wants to merge 3 commits

Conversation

chengchingwen (Member)

This adds an implementation of the Lookahead optimiser from this paper, based on the PyTorch implementation.

state::IdDict
end

const MomentumOptim = Union{Momentum, RMSProp, Nesterov, ADAM, RADAM, AdaMax, ADAGrad, ADADelta, AMSGrad, NADAM}
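For context, here is a minimal sketch of how a Lookahead wrapper can sit on top of Flux's old mutating apply!(opt, x, Δ) optimiser API. This is an illustration only: the field names, the defaults, and the trick of rewriting Δ at a sync step are assumptions for the sketch, not the PR's actual code.

import Flux
import Flux.Optimise: apply!

# Sketch only: `inner`, `alpha`, `k` and `state` are illustrative names.
mutable struct Lookahead
  inner              # fast (inner) optimiser, e.g. Flux.ADAM()
  alpha::Float64     # slow-weight interpolation step α
  k::Int             # synchronise every k inner steps
  state::IdDict      # per-parameter (slow-weight copy, step counter)
end

Lookahead(inner = Flux.ADAM(); alpha = 0.5, k = 5) = Lookahead(inner, alpha, k, IdDict())

function apply!(o::Lookahead, x, Δ)
  slow, t = get!(() -> (copy(x), 0), o.state, x)
  Δ = apply!(o.inner, x, Δ)                     # step proposed by the inner optimiser
  t += 1
  if t % o.k == 0
    fast = x .- Δ                               # fast weights after this inner step
    slow .= slow .+ o.alpha .* (fast .- slow)   # slow update: φ ← φ + α(θ − φ)
    Δ .= x .- slow                              # so update! lands exactly on the slow weights
  end
  o.state[x] = (slow, t)
  return Δ
end

Since Flux's update! only goes through apply!, such a wrapper would drop into the usual Flux.train!(loss, ps, data, Lookahead(Flux.ADAM())) loop, and the inner optimiser's own state (e.g. ADAM's moment estimates) is simply maintained across sync steps.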
Member

Perhaps an AbstractOptimiser type would help with some of the repetition.

Member Author

But some optimisers don't have a momentum term. Maybe we also need an AbstractMomentumOptimiser?

Member

I'm wondering if we could use traits here?

chengchingwen (Member Author) · Dec 16, 2019

https://docs.julialang.org/en/v1/manual/methods/#Trait-based-dispatch-1
If you are talking about this, I guess it's possible.

Member Author

abstract type OptimiserStyle end
struct Momentumlike <: OptimiserStyle end
struct NonMomentumlike <: OptimiserStyle end

has_momentum(o::Type{Descent}) = NonMomentumlike()
has_momentum(o::Type{Momentum}) = Momentumlike()
has_momentum(o::Type{ADAM}) = Momentumlike()

like this?
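(Purely for illustration, and using hypothetical names not present in this PR, such a trait would typically be consumed via dispatch along these lines:)

handle_state!(o) = handle_state!(has_momentum(typeof(o)), o)
handle_state!(::Momentumlike, o)    = nothing   # momentum-carrying optimisers: adjust buffers here
handle_state!(::NonMomentumlike, o) = nothing   # stateless optimisers: nothing to do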

Member

Something like that, yeah.


chengchingwen (Member Author)

@dhairyagandhi96 I defined an AbstractOptimiser and the GradientStyle traits.

CarloLucibello (Member)

I would step back to the original problem and just add the Lookahead optimizer.

chengchingwen (Member Author)

@CarloLucibello Sure! Should I revert back to commit 510cfaa or 14956c2?

CarloLucibello (Member)

Yes please. If we want to introduce a type hierarchy for the optimizers, let's first discuss it in a separate issue/PR.

chengchingwen (Member Author)

@CarloLucibello Reverted.

CarloLucibello (Member) commented Jul 4, 2020

In the paper, I read:
"""
We have the choice of maintaining, interpolating, or resetting the internal state (e.g. momentum) of
the inner optimizer. We evaluate this tradeoff on the CIFAR dataset (where every choice improves
convergence) in Appendix D.1 and maintain internal state for the other experiments.
....
D.1 Inner Optimizer State
Throughout our paper, we maintain the state of our inner optimizer for simplicity. For SGD with
heavy-ball momentum, this corresponds to preserving the momentum. Here, we present a sensitivity
study by comparing the convergence of Lookahead when maintaining the momentum, interpolating
the momentum, and resetting the momentum. All three improve convergence versus SGD.
"""
So let's keep this simple and just maintain the state.

This also needs a rebase.
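(For reference, and assuming the Lookahead sketch above plus an inner optimiser that keeps its per-parameter buffers in a state::IdDict field as Flux's ADAM does, "maintain" vs. "reset" would amount to something like the hypothetical helper below; none of this is in the PR:)

# Maintaining the inner state = not touching `o.inner.state` at a sync step (the minimal choice).
# Resetting it would simply clear the buffers so momentum restarts from zero:
reset_inner_state!(o::Lookahead) = empty!(o.inner.state)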

CarloLucibello (Member)

So let's keep this simple and just maintain the state.

I'm saying this because, for the specific needs of this PR, we would probably need a state setter/getter API for the optimizers, so we should think about it more carefully in a separate issue/PR; then we can revisit Lookahead once we have that in place.
So I think it's better to just merge the minimal Lookahead implementation here, with no state reset.

chengchingwen (Member Author)

@DhairyaLGandhi Rebased and updated.

CarloLucibello (Member)

Sorry, we dropped the ball here. Closing since optimisers are now in Optimisers.jl.
