[CMU AI Seminar] December 5 at 12pm (NSH 3305 & Zoom) -- Elan Rosenfeld (CMU) -- Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization -- AI Seminar sponsored by SambaNova Systems

Asher Trockman ashert at cs.cmu.edu
Tue Dec 5 11:57:43 EST 2023


NOTE: In NSH 3305.

On Tue, Dec 5, 2023 at 11:49 AM Asher Trockman <ashert at andrew.cmu.edu>
wrote:

> Reminder: this is happening soon!
>
> On Dec 3, 2023, at 5:40 PM, Asher Trockman <ashert at cs.cmu.edu> wrote:
>
> 
> Dear all,
>
> We look forward to seeing you *this Tuesday (12/5)* from *12:00-1:00 PM
> (U.S. Eastern time)* for the next talk of this semester's *CMU AI Seminar*,
> sponsored by SambaNova Systems <https://sambanova.ai/>. The seminar will
> be held in NSH 3305 *with pizza provided* and will be streamed on Zoom.
>
> To learn more about the seminar series or to see the future schedule,
> please visit the seminar website <http://www.cs.cmu.edu/~aiseminar/>.
>
> On this Tuesday (12/5), *Elan Rosenfeld* (CMU) will be giving a talk
> titled *"Outliers with Opposing Signals Have an Outsized Effect on
> Neural Network Optimization"*.
>
> *Title*: Outliers with Opposing Signals Have an Outsized Effect on Neural
> Network Optimization
>
> *Talk Abstract*: There is a growing list of intriguing properties of
> neural network optimization, including specific patterns in their training
> dynamics (e.g. simplicity bias, edge of stability, grokking) and the
> unexplained effectiveness of various tools (e.g. batch normalization, SAM,
> Adam). Extensive study of these properties has so far yielded only a
> partial understanding of their origins—and their relation to one another is
> even less clear. What is it about gradient descent on neural networks that
> gives rise to these phenomena?
>
> In this talk, I will present our recent experiments which offer a new
> perspective on many of these findings and suggest that they may have a
> shared underlying cause. Our investigation identifies and explores the
> significant influence of paired groups of outliers with what we call
> Opposing Signals: large magnitude features that dominate the network’s
> output throughout most of training and cause large gradients pointing in
> opposite directions.
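>
> For intuition, here is a minimal NumPy sketch of this effect (not code
> from the talk; the data, feature index, and squared loss are invented
> for illustration): two small outlier groups share one large-magnitude
> feature but carry opposite labels, so at initialization their gradients
> on that feature's weight are large and point in opposite directions.
>
>     # Toy "opposing signals" setup (hypothetical, for illustration only).
>     import numpy as np
>
>     rng = np.random.default_rng(0)
>     n, d = 100, 5
>     X = rng.normal(size=(n, d))   # bulk of the data: modest features
>     y = np.sign(X[:, 0])          # labels driven by a weak signal
>
>     # Two outlier groups sharing one dominant feature (index 4),
>     # paired with opposite labels.
>     X[:2, 4] = 10.0;  y[:2] = 1.0     # group A: big feature, label +1
>     X[2:4, 4] = 10.0; y[2:4] = -1.0   # group B: big feature, label -1
>
>     w = np.zeros(d)  # linear model at initialization
>
>     def grad(Xb, yb, w):
>         # Gradient of the mean squared error (1/2m) * ||Xb @ w - yb||^2.
>         return Xb.T @ (Xb @ w - yb) / len(yb)
>
>     gA = grad(X[:2], y[:2], w)
>     gB = grad(X[2:4], y[2:4], w)
>     print(gA[4], gB[4])  # -10.0 and +10.0: large, opposing gradients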
>
> Though our experiments shed some light on these outliers’ influence, we
> lack a complete understanding of their precise effect on network training
> dynamics. Instead, I’ll share our working hypothesis via a high-level
> explanation, and I’ll describe initial experiments which verify some of its
> qualitative predictions. We hope a deeper understanding of this phenomenon
> will enable future principled improvements to neural network optimization.
>
> *Speaker Bio:* Elan Rosenfeld <https://www.cs.cmu.edu/~elan/> is a
> final-year PhD student in CMU MLD, advised by Profs. Andrej Risteski and
> Pradeep Ravikumar. His research focuses on principled approaches to understanding
> and improving robustness, representation learning, and generalization in
> deep learning.
>
> *In person:* NSH 3305
> *Zoom Link*:
> https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09
>
> Thanks,
> Asher Trockman
>
>