Nesterov Accelerated Gradient from Scratch in Python

Momentum is great, but it would be even better if the gradient descent steps could slow down as they reach the bottom of a minimum. That, in a nutshell, is Nesterov Accelerated Gradient; check it out!

Credit
The music is taken from YouTube Music!

Table of Contents
- Introduction
- Theory
- Python Implementation
- Conclusion

Here is an explanation of Nesterov Accelerated Gradient from the very cool blog post mentioned in the credit section (check it out!):

"Nesterov accelerated gradient (NAG) [see reference] is a way to give our momentum term this kind of prescience. We know that we will use our momentum term γv_{t−1} to move the parameters θ. Computing θ − γv_{t−1} thus gives us an approximation of the next position of the parameters (the gradient is missing for the full update), a rough idea where our parameters are going to be. We can now effectively look ahead by calculating the gradient not w.r.t. our current parameters θ but w.r.t. the approximate future position of our parameters."

Reference
Nesterov, Y. (1983). A method for unconstrained convex minimization problem with the rate of convergence O(1/k²). Doklady AN SSSR (translated as Soviet Math. Dokl.), vol. 269, pp. 543–547.
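In update-rule form, the look-ahead described above is usually written as v_t = γ v_{t−1} + η ∇_θ J(θ − γ v_{t−1}) followed by θ = θ − v_t, where η is the learning rate. Below is a minimal Python sketch of that rule on a toy one-dimensional objective f(θ) = θ²; the objective, hyperparameter values, and function names are illustrative choices of mine, not taken from the post quoted above.

```python
def grad_f(theta):
    # Gradient of the toy objective f(theta) = theta**2.
    return 2 * theta

def nesterov_gd(theta0, lr=0.1, gamma=0.9, n_steps=50):
    """Nesterov accelerated gradient descent on a 1-D toy objective.

    The gradient is evaluated at the look-ahead point theta - gamma * v
    (the approximate future position of the parameters), not at theta itself.
    """
    theta = theta0
    v = 0.0  # momentum (velocity) term
    for _ in range(n_steps):
        lookahead = theta - gamma * v            # approximate next position
        v = gamma * v + lr * grad_f(lookahead)   # velocity from the look-ahead gradient
        theta = theta - v                        # move the parameters
    return theta

# Starting far from the minimum at 0, the iterate ends up close to it.
print(nesterov_gd(theta0=10.0))
```

Because the velocity is corrected with a gradient taken at the anticipated position, the steps shrink as the iterate approaches the bottom of the minimum, which is exactly the behaviour described in the introduction.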
