
Paper-Implementation-Overview-Gradient-Descent-Optimization-Algorithms


arXiv paper: "An Overview of Gradient Descent Optimization Algorithms" by Sebastian Ruder

Python 2.7

Links to the original paper on arXiv (arXiv.org > cs > arXiv:1609.04747): [1], [2]

Link to a blog post explaining the paper: [3]

Implemented the following gradient descent optimization algorithms from scratch (a minimal sketch of a few of these update rules follows the list):

  1. Vanilla Batch/Stochastic Gradient Descent [4]

  2. Momentum [5]

  3. NAG : Nesterov Accelerated Gradient [6]

  4. AdaGrad : Adaptive Gradient Algorithm [7]

  5. AdaDelta : Adaptive Learning Rate Method [8]

  6. RMSProp [9]

  7. Adam : Adaptive Moment Estimation [10] [11]

  8. AdaMax : Infinity-Norm-Based Adaptive Moment Estimation [12]

  9. Nadam : Nesterov-Accelerated Adaptive Moment Estimation [13]

  10. AMSGrad [14]
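
For a flavour of what the from-scratch implementations look like, here is a minimal sketch of three of the update rules above (vanilla gradient descent, Momentum, and Adam) on a toy quadratic objective. The objective, hyperparameter values, and function names are illustrative only and are not the repository's actual code or API; the sketch is written for Python 3 with NumPy, whereas the repo itself targets Python 2.7.

```python
import numpy as np

# Toy objective f(x) = 0.5 * x^T A x with gradient A x
# (illustrative only; not the repository's test function)
A = np.diag([1.0, 10.0])

def grad(x):
    return A.dot(x)

def sgd(x, lr=0.05, steps=100):
    # Vanilla gradient descent: x <- x - lr * grad(x)
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

def momentum(x, lr=0.05, gamma=0.9, steps=100):
    # Momentum: accumulate a velocity term v, then step along it
    v = np.zeros_like(x)
    for _ in range(steps):
        v = gamma * v + lr * grad(x)
        x = x - v
    return x

def adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=100):
    # Adam: bias-corrected first (m) and second (v) moment
    # estimates of the gradient drive a per-coordinate step size
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)   # bias correction
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

x0 = np.array([3.0, 2.0])
print(sgd(x0.copy()))       # all three should approach the minimum at [0, 0]
print(momentum(x0.copy()))
print(adam(x0.copy()))
```

The other listed optimizers (NAG, AdaGrad, AdaDelta, RMSProp, AdaMax, Nadam, AMSGrad) follow the same pattern, differing only in how the gradient history is accumulated and how the step size is scaled.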