[CMU AI Seminar] Nov 9 at 12pm (Zoom) -- Quanquan Gu (UCLA) -- Stochastic Gradient Descent: Benign Overfitting and Implicit Regularization -- AI Seminar sponsored by Morgan Stanley
Shaojie Bai
shaojieb at cs.cmu.edu
Fri Nov 5 12:07:13 EDT 2021
Dear all,
We look forward to seeing you *next Tuesday (11/9)* from *12:00-1:00 PM
(U.S. Eastern time)* for the next talk of our *CMU AI Seminar*, sponsored
by Morgan Stanley <https://www.morganstanley.com/about-us/technology/>.
To learn more about the seminar series or see the future schedule, please
visit the seminar website <http://www.cs.cmu.edu/~aiseminar/>.
On 11/9, *Quanquan Gu* (UCLA) will be giving a talk on "*Stochastic
Gradient Descent: Benign Overfitting and Implicit Regularization*" and his
group's latest research progress on DL theory.
*Title:* Stochastic Gradient Descent: Benign Overfitting and Implicit
Regularization
*Talk Abstract:* There is an increasing realization that algorithmic
inductive biases are central in preventing overfitting; empirically, we
often see a benign overfitting phenomenon in overparameterized settings for
natural learning algorithms, such as stochastic gradient descent (SGD),
where little to no explicit regularization has been employed. In the first
part of this talk, I will discuss benign overfitting of constant-stepsize
SGD in arguably the most basic setting: linear regression in the
overparameterized regime. Our main results provide a sharp excess risk
bound, stated in terms of the full eigenspectrum of the data covariance
matrix, that reveals a bias-variance decomposition characterizing when
generalization is possible. In the second part of this talk, I will
introduce sharp instance-based comparisons of the implicit regularization
of SGD with the explicit regularization of ridge regression, which are
conducted in a sample-inflation manner. I will show that, given a sample
size larger by at most a polylogarithmic factor, the generalization
performance of SGD is always no worse than that of ridge regression for a
broad class of least-squares problem instances, and could be much better
for some problem instances. This suggests the benefits of implicit
regularization in SGD
compared with the explicit regularization of ridge regression. This is
joint work with Difan Zou, Jingfeng Wu, Vladimir Braverman, Dean P. Foster
and Sham M. Kakade.
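For readers who want a concrete picture of the setting in the abstract, here is a minimal sketch, not the speaker's code, of one-pass constant-stepsize SGD and ridge regression on an overparameterized least-squares problem. Everything specific in it is an assumption made for illustration: the 1/i^2 covariance spectrum, the noise level, the all-ones target, the use of iterate averaging, the 1/2-scaled excess risk, and the helper names make_problem, sgd_average, ridge and excess_risk.

```python
import numpy as np


def make_problem(n, d, rng, noise=0.1):
    """Hypothetical instance: Gaussian features whose diagonal covariance
    has eigenvalues 1/i^2, and a linear target with additive noise."""
    eigs = 1.0 / np.arange(1, d + 1) ** 2
    w_star = np.ones(d)
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)  # scale each column
    y = X @ w_star + noise * rng.standard_normal(n)
    return X, y, w_star, eigs


def sgd_average(X, y, stepsize):
    """One pass of constant-stepsize SGD on the squared loss, started at
    zero; returns the average of the iterates (an assumed choice)."""
    n, d = X.shape
    w = np.zeros(d)
    w_sum = np.zeros(d)
    for x_i, y_i in zip(X, y):
        w_sum += w  # accumulate the iterate before the update
        w -= stepsize * (x_i @ w - y_i) * x_i
    return w_sum / n


def ridge(X, y, lam):
    """Ridge solution in dual form, X^T (X X^T + lam I)^{-1} y,
    which only needs an n-by-n solve when d > n."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)


def excess_risk(w, w_star, eigs):
    """Excess risk 0.5 * (w - w*)^T H (w - w*) for diagonal covariance H."""
    diff = w - w_star
    return 0.5 * np.sum(eigs * diff ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y, w_star, eigs = make_problem(n=200, d=500, rng=rng)  # d > n
    print("SGD  :", excess_risk(sgd_average(X, y, stepsize=0.1), w_star, eigs))
    print("ridge:", excess_risk(ridge(X, y, lam=1.0), w_star, eigs))
```

The stepsize and ridge parameter here are untuned placeholders, so the two printed numbers are not a fair comparison on their own; the instance-wise comparison described in the abstract concerns suitably chosen hyperparameters.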
*Speaker Bio:* Quanquan Gu is an Assistant Professor of Computer Science at
UCLA. His research is in the area of artificial intelligence and machine
learning, with a focus on developing and analyzing nonconvex optimization
algorithms for machine learning to understand large-scale, dynamic,
complex, and heterogeneous data, and on building the theoretical foundations
of deep learning and reinforcement learning. He received his Ph.D. degree in
Computer Science from the University of Illinois at Urbana-Champaign in
2014. He is a recipient of the NSF CAREER Award and the Simons Berkeley
Research Fellowship, as well as industrial research awards.
*Zoom Link:*
https://cmu.zoom.us/j/97788824898?pwd=alM4T1EvK1VHdEZ6aWdOa0lWOHdrZz09