Add proper accept/reject logic for LM optimizer #364
Conversation
Force-pushed from a0cd143 to 40c4e86
Looks good! Do the current unit tests cover the new step logic?
Thanks for the comments! I didn't think of adding new tests, since running the optimizer already exercises all the new step logic. I guess it's possible to add tests specifically for accept/reject, but that seems a bit cumbersome, since I think I'd need to mock a lot of things to be able to test it.
“…at also return optional reject indices.” (commit message, truncated)
Force-pushed from 990ac2c to 16a0097
LGTM. There are some comments that seem difficult to address in this PR.
```diff
 den = (delta * (damping * delta + linearization.Atb.squeeze(2))).sum(dim=1)
-rho = (last_err - new_err) / den
+rho = (previous_err - err) / den
 good_idx = rho > damping_accept
```
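For context, `rho` here is the standard Levenberg-Marquardt gain ratio: the actual cost reduction divided by the reduction predicted by the damped linear model. A scalar, list-based sketch of the same computation (the function itself is hypothetical; names mirror the diff):

```python
def gain_ratio(previous_err, err, delta, damping, Atb):
    # The predicted reduction of the damped linear model for step delta
    # is proportional to delta^T (damping * delta + A^T b); the actual
    # reduction is previous_err - err. Their ratio is rho.
    den = sum(d * (damping * d + g) for d, g in zip(delta, Atb))
    return (previous_err - err) / den
```

A step is then accepted when `rho` exceeds a threshold (`damping_accept` in the diff above).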
Wondering if we should rename `previous_err` and `err` to `previous_cost` and `cost`. In our convention, we assume `err` is a vector.
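The distinction behind the rename suggestion: in this convention `err` denotes the residual vector, while the scalar objective (the "cost") is its squared norm. A minimal sketch of that relationship, with a hypothetical helper name:

```python
def cost_from_err(err):
    # err: residual vector (one value per residual dimension).
    # The scalar cost is the squared norm of the residuals; a 0.5
    # factor is also common, but it cancels inside the gain ratio.
    return sum(e * e for e in err)
```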
```python
self._damping * up_damping_ratio,
self._damping / down_damping_ratio,
```
Maybe we should change the `_damping` update policy. In Ceres, `up_damping_ratio`/`down_damping_ratio` are also dynamically updated.
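For reference, one common dynamic policy (Nielsen's strategy, close in spirit to Ceres' Levenberg-Marquardt trust-region strategy) replaces fixed up/down ratios with a growth factor `nu` that itself doubles on consecutive rejections. A sketch of that policy, not Ceres' actual code:

```python
def update_damping(damping, nu, rho):
    # Nielsen-style update: on acceptance, shrink damping by a factor
    # that depends on how good the step was and reset nu; on rejection,
    # grow damping by nu and double nu so repeated failures escalate.
    if rho > 0.0:
        damping *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
        nu = 2.0
    else:
        damping *= nu
        nu *= 2.0
    return damping, nu
```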
In the LM method, `self._damping` might be unchanged if `rho` is neither good nor bad. Therefore, there should be `good_indices`, `bad_indices`, and `accepted_indices`, where `self._damping` is reduced, increased, and unchanged, respectively.
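The three sets described above partition the batch by gain ratio; a sketch with hypothetical threshold values (not the PR's):

```python
def classify_batch(rho, good_thresh=0.75, bad_thresh=0.25):
    # good: rho high enough that damping is reduced.
    # bad: rho low enough that damping is increased.
    # unchanged: everything in between; the step is kept
    # but the damping is left as-is.
    good_indices = [i for i, r in enumerate(rho) if r > good_thresh]
    bad_indices = [i for i, r in enumerate(rho) if r < bad_thresh]
    unchanged_indices = [
        i for i, r in enumerate(rho) if bad_thresh <= r <= good_thresh
    ]
    return good_indices, bad_indices, unchanged_indices
```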
Created #384
* Refactored the optimizer retract step so that retract and update happen separately.
* Changed `NonlinearOptimizer.step()` so that it returns the error.
* Renamed `retract_optim_vars` to a more descriptive name.
* Changed `Objective.update()` so that it can also be given an ignore mask.
* Replaced `update_optimizer_state` with a method called `_complete_step`, which also returns optional reject indices.
* Made `_step` private and added some comments.
The previous version would adjust the damping depending on whether the step should be accepted or not, but it would nevertheless always take the step. This could result in divergence if the down damping ratio was too aggressive.
In the new version, if the step should be rejected, it is actually rejected (and this is computed independently for each batch element).
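The per-batch-element behavior can be sketched as a mask-and-rollback: keep the retracted values only where the step is accepted, and restore the previous values elsewhere. Plain lists stand in for tensors here, and the names are hypothetical:

```python
def apply_accept_reject(prev_vars, new_vars, rho, accept_thresh=0.0):
    # One entry per batch element. Rejected elements (rho <= threshold)
    # roll back to their previous values instead of taking the step.
    return [
        new if r > accept_thresh else prev
        for prev, new, r in zip(prev_vars, new_vars, rho)
    ]
```

In the batched tensor version, the same selection would typically be done with a boolean mask rather than a Python loop.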