Auto increase gamma #86
@aplavin I think this makes sense to include as an option pretty much anywhere the backtracking routine is adopted. Backtracking is used to make sure gamma is small enough with respect to the local geometry of the objective, so as to guarantee a sufficient-decrease condition and hence convergence: as long as it runs at every iteration, having it start from the previous gamma or from something larger makes no difference for convergence (rates). Do you want to open a PR? It would have an associated coefficient in the algorithm (your 1.02) to allow configuration: just like the backtracking coefficient, I'm not sure there is an a priori optimal setting for it. Too large an increase will trigger backtracking more often, which has a cost.

For a proximal gradient iteration with a fully adaptive stepsize and no backtracking logic, you can check out this paper https://arxiv.org/abs/2301.04431 with associated code https://github.com/pylat/adaptive-proximal-algorithms. We found it to often outperform Nesterov extrapolation, and it requires no tuning, although it doesn't come with the same convergence rate guarantees. (I haven't found the time to include this in ProximalAlgorithms yet.)
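To make the idea concrete, here is a minimal sketch (not the package's implementation) of one adaptive proximal-gradient step where gamma is first enlarged by a regret factor and then shrunk by backtracking until the usual sufficient-decrease condition holds. `f`, `grad_f`, and `prox_g` are hypothetical user-supplied callables.

```julia
using LinearAlgebra

# Minimal sketch (not the package's implementation): one adaptive
# proximal-gradient step where gamma is first enlarged by a "regret" factor,
# then shrunk by backtracking until a sufficient-decrease condition holds.
function adaptive_pg_step(f, grad_f, prox_g, x, gamma; regret = 1.02, beta = 0.5)
    gamma *= regret                      # optimistically enlarge the stepsize
    fx, gx = f(x), grad_f(x)
    while true
        z = prox_g(x - gamma * gx, gamma)
        # quadratic upper-bound test: accept gamma if it holds at the candidate z
        if f(z) <= fx + dot(gx, z - x) + norm(z - x)^2 / (2 * gamma)
            return z, gamma
        end
        gamma *= beta                    # otherwise shrink gamma and retry
    end
end
```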
Thanks for confirming that the modification makes sense! I'm not closely familiar with proximal algorithms, I just saw this trick used somewhere.

Also, I noticed that some algorithms (e.g. PANOC) do a state reset when gamma changes. Would this modification play well with those?

However, I didn't find any of the other algorithms to perform better than the accelerated gradient (FastForwardBackward) in cursory testing on my problems.

Definitely looking forward to trying the new algorithm you linked!
This PR adds the possibility to specify a regret factor that increases the stepsize gamma at every iteration, if adaptive, before backtracking. Recent results provide convergence guarantees even without a maximum value on gamma [Theorem 14, [arXiv:2208.00779v2](https://arxiv.org/abs/2208.00799)]. The feature is implemented for ForwardBackward only. Although it does not seem a good fit for accelerated methods like ZeroFPR, PANOC, and PANOCplus, at least when based on quasi-Newton directions, it is unclear whether the FastForwardBackward solver could benefit from it. See the discussion in #86. Practical performance seems to improve with values of regret_gamma close to 1. Tests and references have been updated accordingly.
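For illustration, a hypothetical usage sketch, assuming the PR exposes `regret_gamma` as a keyword of the `ForwardBackward` constructor; the exact keyword names and call signature may differ from the merged API.

```julia
using ProximalOperators, ProximalAlgorithms

# Small LASSO-type problem: smooth least-squares term plus an L1 penalty.
A, b = randn(50, 100), randn(50)
f = LeastSquares(A, b)     # smooth term with gradient
g = NormL1(1.0)            # nonsmooth term with proximal mapping

# `regret_gamma = 1.02` would let the adaptive stepsize grow by 2% before each
# backtracking search (keyword name taken from this PR; assumed here).
solver = ProximalAlgorithms.ForwardBackward(adaptive = true, regret_gamma = 1.02)
x, iterations = solver(x0 = zeros(100), f = f, g = g)
```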
Currently, the stepsize parameter `gamma` can only decrease, e.g. at ProximalAlgorithms.jl/src/algorithms/fast_forward_backward.jl, line 92 (commit dc0342c).
However, larger step sizes can sometimes be taken at some point during the iterations.

I tried doing `state.gamma *= 1.02` in the iteration loop, and I seem to get ~2x faster convergence on some problems. The gamma-vs-iterations plot looks like this:

[plot: gamma vs. iterations]
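A rough, self-contained illustration of this tweak (plain proximal gradient on a random LASSO-type problem, not the package's iterator API): gamma is multiplied by 1.02 at the start of every iteration and then backtracked when needed.

```julia
using LinearAlgebra

# Demo: proximal gradient with backtracking on 0.5*||A*x - b||^2 + lambda*||x||_1,
# where gamma is allowed to grow by 2% at the start of every iteration.
function increasing_gamma_demo(; maxit = 500, lambda = 0.1)
    A, b = randn(30, 60), randn(30)
    f(x) = 0.5 * norm(A * x - b)^2
    grad_f(x) = A' * (A * x - b)
    soft_threshold(y, t) = sign.(y) .* max.(abs.(y) .- t, 0)   # prox of t*||.||_1
    x, gamma = zeros(60), 1.0
    for _ in 1:maxit
        gamma *= 1.02                     # the tweak: let gamma grow again
        gx = grad_f(x)
        z = soft_threshold(x - gamma * gx, gamma * lambda)
        # backtrack until the usual quadratic upper bound holds
        while f(z) > f(x) + dot(gx, z - x) + norm(z - x)^2 / (2 * gamma)
            gamma /= 2
            z = soft_threshold(x - gamma * gx, gamma * lambda)
        end
        x = z
    end
    return x, gamma
end
```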
Does this modification make sense in general? For some algorithms?
Should it be included in the package implementation itself?