[CMU AI Seminar] Nov 9 at 12pm (Zoom) -- Quanquan Gu (UCLA) -- Stochastic Gradient Descent: Benign Overfitting and Implicit Regularization -- AI Seminar sponsored by Morgan Stanley

Shaojie Bai shaojieb at cs.cmu.edu
Mon Nov 8 14:43:56 EST 2021


Dear all,

Just a reminder that the CMU AI Seminar is tomorrow from 12pm to 1pm (ET):
https://cmu.zoom.us/j/97788824898?pwd=alM4T1EvK1VHdEZ6aWdOa0lWOHdrZz09

*Professor Quanquan Gu (UCLA)* will be giving a talk on some surprising
findings about SGD, such as its implicit regularization effect.

Thanks,
Asher

On Fri, Nov 5, 2021 at 12:07 PM Shaojie Bai <shaojieb at cs.cmu.edu> wrote:

> Dear all,
>
> We look forward to seeing you *next Tuesday (11/9)* from *12:00-1:00 PM
> (U.S. Eastern time)* for the next talk of our *CMU AI Seminar*, sponsored
> by Morgan Stanley <https://www.morganstanley.com/about-us/technology/>.
>
> To learn more about the seminar series or see the future schedule, please
> visit the seminar website <http://www.cs.cmu.edu/~aiseminar/>.
>
> On 11/9, *Quanquan Gu* (UCLA) will be giving a talk on "*Stochastic
> Gradient Descent: Benign Overfitting and Implicit Regularization*" and
> his group's latest research progress on deep learning theory.
>
> *Title:* Stochastic Gradient Descent: Benign Overfitting and Implicit
> Regularization
>
> *Talk Abstract:* There is an increasing realization that algorithmic
> inductive biases are central in preventing overfitting; empirically, we
> often see a benign overfitting phenomenon in overparameterized settings for
> natural learning algorithms, such as stochastic gradient descent (SGD),
> where little to no explicit regularization has been employed. In the first
> part of this talk, I will discuss benign overfitting of constant-stepsize
> SGD in arguably the most basic setting: linear regression in the
> overparameterized regime. Our main results provide a sharp excess risk
> bound, stated in terms of the full eigenspectrum of the data covariance
> matrix, that reveals a bias-variance decomposition characterizing when
> generalization is possible. In the second part of this talk, I will
> introduce sharp instance-based comparisons of the implicit regularization
> of SGD with the explicit regularization of ridge regression, which are
> conducted in a sample-inflation manner. I will show that, given up to
> polylogarithmically more samples, the generalization performance of SGD is
> always no worse than that of ridge regression for a broad class of
> least-squares problem instances, and can be much better for some instances.
> This suggests the benefits of implicit regularization in SGD
> compared with the explicit regularization of ridge regression. This is
> joint work with Difan Zou, Jingfeng Wu, Vladimir Braverman, Dean P. Foster
> and Sham M. Kakade.
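>
> For intuition, here is a minimal, self-contained Python sketch (an
> illustrative construction of ours, not the exact setup from the talk)
> comparing one-pass, constant-stepsize SGD with explicitly regularized ridge
> regression on a synthetic overparameterized linear regression problem; the
> covariance spectrum, step size, and ridge parameter below are arbitrary
> assumptions:
>
>   # Sketch: constant-stepsize SGD vs. ridge regression, with d >> n.
>   import numpy as np
>
>   rng = np.random.default_rng(0)
>   n, d = 200, 1000                             # n samples, d > n parameters
>   eigs = 1.0 / np.arange(1, d + 1)             # assumed decaying covariance spectrum
>   w_star = rng.normal(size=d) * np.sqrt(eigs)  # ground-truth weights
>   X = rng.normal(size=(n, d)) * np.sqrt(eigs)  # features with covariance diag(eigs)
>   y = X @ w_star + 0.1 * rng.normal(size=n)    # noisy labels
>
>   def sgd_average(X, y, step=0.05):
>       # One pass of SGD with a constant step size; return the iterate
>       # average (iterate averaging is standard in this setting).
>       w = np.zeros(X.shape[1])
>       avg = np.zeros_like(w)
>       for i in range(len(y)):
>           w -= step * (X[i] @ w - y[i]) * X[i]  # stochastic gradient step
>           avg += w
>       return avg / len(y)
>
>   def ridge(X, y, lam=1.0):
>       # Closed-form ridge solution: explicit L2 regularization.
>       return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
>
>   def excess_risk(w):
>       # Population excess risk under the diagonal covariance:
>       # (w - w*)' Sigma (w - w*).
>       return float(np.sum(eigs * (w - w_star) ** 2))
>
>   print("SGD excess risk:  ", excess_risk(sgd_average(X, y)))
>   print("ridge excess risk:", excess_risk(ridge(X, y)))
>
> Which estimator generalizes better depends on the spectrum, step size, and
> regularization strength; the talk's instance-based results make this
> comparison precise.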
>
> *Speaker Bio: *Quanquan Gu is an Assistant Professor of Computer Science
> at UCLA. His research is in the area of artificial intelligence and machine
> learning, with a focus on developing and analyzing nonconvex optimization
> algorithms for machine learning to understand large-scale, dynamic,
> complex, and heterogeneous data and building the theoretical foundations of
> deep learning and reinforcement learning. He received his Ph.D. degree in
> Computer Science from the University of Illinois at Urbana-Champaign in
> 2014. He is a recipient of the NSF CAREER Award and the Simons-Berkeley
> Research Fellowship, as well as several industrial research awards.
>
> *Zoom Link: *
> https://cmu.zoom.us/j/97788824898?pwd=alM4T1EvK1VHdEZ6aWdOa0lWOHdrZz09
>

