On Convergence Rates of Stochastic Gradient Descent

Project Topics

Convex Optimization, Python, Machine Learning

Sources

  • Web Link

Acknowledgements


About the Project

In 2020 I took a course on Optimization Theory taught by Professor Bahman Gharesifard. The course gave a rigorous foundation in the field of optimization theory, and one of the topics we covered was convergence rate proofs for learning algorithms. I became very interested in the topic and decided to do my project on convergence rates for stochastic gradient descent (SGD). My project centered on reproducing a proof from the paper Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization by Rakhlin, Shamir, and Sridharan. Specifically, I reproduce a proof showing how to recover the optimal O(1/T) convergence rate for SGD on strongly convex problems. I tried to use concepts, theorems, and propositions learned in Optimization Theory to motivate and reproduce the proof. Additionally, I wrote a simple neural network to demonstrate the practical advantage of SGD over slower methods like full-batch gradient descent (GD).
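To make the setting concrete, below is a minimal sketch of the scheme analyzed in the paper: SGD with step size 1/(λt) on a λ-strongly convex objective, where averaging the last α-fraction of the iterates (α-suffix averaging) is what recovers the O(1/T) rate. The problem here is a synthetic regularized least-squares objective that I chose for illustration; the data, constants, and function names are my own assumptions, not code from the paper or the project.

```python
import numpy as np

# Illustrative setup (my assumption, not from the paper): minimize the
# lam-strongly convex objective
#   F(w) = (1/2n) * sum_i (x_i.w - y_i)^2 + (lam/2) * ||w||^2
# with SGD using the step size eta_t = 1/(lam*t) analyzed by
# Rakhlin, Shamir, and Sridharan (2012).

rng = np.random.default_rng(0)
n, d, lam = 1000, 5, 0.5
X = rng.normal(size=(n, d)) / np.sqrt(d)   # roughly unit-norm samples
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def stochastic_grad(w, i):
    """Unbiased estimate of grad F(w) from a single random sample i."""
    return (X[i] @ w - y[i]) * X[i] + lam * w

def sgd(T, alpha=0.5):
    """Run T steps of SGD with eta_t = 1/(lam*t).

    Returns the last iterate and the alpha-suffix average (mean of the
    final alpha*T iterates), the modification that yields O(1/T)."""
    w = np.zeros(d)
    suffix_start = int((1 - alpha) * T)
    suffix_sum = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)
        w = w - (1.0 / (lam * t)) * stochastic_grad(w, i)
        if t > suffix_start:
            suffix_sum += w
    return w, suffix_sum / (T - suffix_start)

def objective(w):
    return 0.5 * np.mean((X @ w - y) ** 2) + 0.5 * lam * (w @ w)

# Closed-form minimizer of the regularized least-squares objective,
# used only to measure suboptimality F(w) - F(w*).
w_star = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

for T in (100, 1000, 10000):
    w_last, w_suffix = sgd(T)
    print(T, objective(w_last) - objective(w_star),
          objective(w_suffix) - objective(w_star))
```

The suffix average is the key design choice: the paper shows that with the 1/(λt) schedule the last iterate alone can be Θ(log T / T) suboptimal, while averaging any constant fraction of the final iterates brings the expected suboptimality down to O(1/T).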

Project Paper:

This is the direct link to the PDF of the paper.