| Problem: | Preconditioned methods with weight decay regularization |
| --- | --- |
| Work type: | M1P |
| Author: | Статкевич Екатерина Игоревна |
| Scientific advisor: | Безносиков Александр |
This work studies weight decay regularization for preconditioned algorithms such as Adam and OASIS, which perform first-order gradient-based optimization of stochastic objective functions using adaptive estimates of lower-order moments. The main difference from plain gradient descent is that Adam and OASIS use information about previous gradients to update the model parameters.
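To make the idea concrete, below is a minimal sketch of a preconditioned update with decoupled weight decay in the spirit of AdamW (Adam combined with weight decay regularization). The function name, hyperparameter values, and the toy problem are illustrative assumptions and do not reproduce the exact method developed in this work.

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW-style step: adaptive (diagonal) preconditioning via
    first/second moment estimates plus decoupled weight decay.
    Illustrative sketch, not the exact algorithm of this work."""
    # Exponential moving averages of the gradient and its elementwise square;
    # this is the "information about previous gradients" mentioned above.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the moment estimates (t is the step counter, t >= 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Preconditioned gradient step plus decoupled weight decay term.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v

# Toy usage (assumed example): minimize f(w) = ||w||^2 from noisy gradients.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 201):
    grad = 2 * w + 0.01 * rng.normal(size=5)  # stochastic gradient
    w, m, v = adamw_step(w, grad, m, v, t)
```

Here the per-coordinate scaling by `1 / (sqrt(v_hat) + eps)` plays the role of the preconditioner, while the `weight_decay * w` term is applied directly to the parameters rather than folded into the gradient.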