Seminar: Implicit Regularization of SGD in High-dimensional Linear Regression

BRAIn Lab: research seminars
Speaker: Cong Fang, Researcher at Peking University

What will the talk cover?

Stochastic Gradient Descent (SGD) is one of the most widely used algorithms in modern machine learning. In high-dimensional learning problems, the number of SGD iterations is often smaller than the number of model parameters, and the implicit regularization induced by the algorithm plays a key role in ensuring strong generalization performance. In this seminar, we will:

🔵 Analyze the generalization behavior of SGD across different learning scenarios;
🔵 Compare learning efficiency across regimes, depending on data size and dimensionality;
🔵 Discuss the effects of covariate shift;
🔵 Present theoretical insights that inspire memory-efficient training algorithms for large language models (e.g., GPT-2).
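A minimal sketch of the phenomenon the talk studies (not the speaker's method, just a standard illustration): in an overparameterized linear regression with fewer samples than parameters, SGD on the squared loss initialized at zero keeps its iterates in the row space of the data, so it converges to the minimum-norm interpolating solution rather than an arbitrary one. The dimensions, learning rate, and step count below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100  # fewer samples than parameters: the high-dimensional regime
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# SGD on the squared loss, one random example per step, starting from zero.
w = np.zeros(d)
lr = 0.01
for step in range(20000):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i^T w - y_i)^2
    w -= lr * grad

# The minimum-norm interpolant is the pseudoinverse solution X^+ y.
w_min_norm = np.linalg.pinv(X) @ y

# SGD's implicit regularization: the iterate lands (numerically) on the
# minimum-norm solution, even though infinitely many interpolants exist.
print(np.linalg.norm(w - w_min_norm))
```

Among the infinitely many weight vectors that fit the data exactly, the algorithm itself selects a particular low-norm one; this selection effect, with no explicit penalty term in the loss, is what "implicit regularization" refers to in the abstract above.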
