Speaker
Lucas Benigni, Université de Montréal
Abstract
Despite their surplus of parameters, modern deep learning models often generalize well, a phenomenon exemplified by the “double descent” curve. While this behavior is theoretically well understood for problems such as ridge regression under a linear scaling of dimensions, intriguing phenomena emerge under a quadratic scaling, where the sample size equals the parameter count. In this presentation, we study the eigenvalues of the Neural Tangent Kernel, a matrix model relevant to wide neural networks trained via gradient descent, within this quadratic regime.
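For context, and not part of the abstract itself, the following is the standard definition of the empirical Neural Tangent Kernel; the talk's precise setting may differ. For a network f(·; θ) with parameters θ ∈ R^p, evaluated at data points x_1, …, x_n, the NTK is the n × n Gram matrix of parameter gradients:

\[
  K_{ij} \;=\; \big\langle \nabla_\theta f(x_i;\theta),\; \nabla_\theta f(x_j;\theta) \big\rangle, \qquad 1 \le i, j \le n.
\]

The quadratic scaling in the title refers to the regime in which the sample size n is tied to the network size as described above, and the object of study is the limiting eigenvalue distribution of K in that regime.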
Workshop: Eigenvalue distribution of the Neural Tangent Kernel under a quadratic scaling
Date: 2024-10-16
Time: 13:30 - 14:30