Elizabeth Collins-Woodfin: High-dimensional dynamics of SGD for generalized linear models

Date: 2024-09-18

Time: 10:00 - 11:00

Speaker
Elizabeth Collins-Woodfin, McGill University

Abstract
Stochastic gradient descent (SGD) is an efficient and heavily used tool for high-dimensional optimization problems such as those that arise in machine learning. We analyze the dynamics of streaming (online) SGD in the high-dimensional limit when applied to generalized linear models with general data covariance. We show that, when the number of parameters grows proportionally to the number of data points, SGD converges to a deterministic equivalent, characterized by a system of ordinary differential equations. In addition to the deterministic equivalent, we introduce a stochastic differential equation that allows us to analyze the dynamics of general statistics of SGD iterates. I will discuss how we leverage techniques from high-dimensional probability, matrix theory, and stochastic calculus to obtain these results. I will also discuss some applications, including learning rate thresholds for stability and analyzing algorithms with adaptive learning rates (e.g., AdaGrad-Norm).
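To make the setup concrete, below is a minimal sketch of streaming (online) SGD for one generalized linear model, logistic regression, with an AdaGrad-Norm step size. It is illustrative only: the dimension, base learning rate, identity covariance, and logistic link are assumptions for the example, not the talk's actual setting, which allows a general data covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 500            # parameter dimension (illustrative choice)
n_steps = 4 * d    # number of streaming steps, proportional to d
gamma = 0.5        # base learning rate (hypothetical value)

# Ground-truth parameter; covariates drawn with identity covariance here,
# though the talk's results cover general data covariance.
theta_star = rng.normal(size=d) / np.sqrt(d)

def sample():
    """Draw one fresh (streaming) observation from a logistic GLM."""
    x = rng.normal(size=d)                     # covariates, covariance = I
    p = 1.0 / (1.0 + np.exp(-x @ theta_star))  # logistic link
    y = rng.binomial(1, p)
    return x, y

def grad(theta, x, y):
    """Single-sample gradient of the logistic negative log-likelihood."""
    return (1.0 / (1.0 + np.exp(-x @ theta)) - y) * x

# Streaming SGD: each data point is used exactly once.
theta = np.zeros(d)
b2 = 1e-8  # AdaGrad-Norm accumulator of squared gradient norms
for _ in range(n_steps):
    x, y = sample()
    g = grad(theta, x, y)
    b2 += g @ g
    # AdaGrad-Norm: one adaptive scalar step size gamma / sqrt(b2);
    # replacing it with a constant step gives plain online SGD.
    theta -= (gamma / np.sqrt(b2)) * g

# Low-dimensional statistics of the kind whose limiting dynamics the
# ODE/SDE descriptions track, e.g. alignment with the signal and norm.
print("alignment:", theta @ theta_star, " |theta|^2:", theta @ theta)
```

In the high-dimensional limit studied in the talk, scalar statistics like the printed alignment and squared norm concentrate around deterministic trajectories, which is what the ODE system characterizes.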