[CMU AI Seminar] April 20 (Today!) at 12pm (GHC 6115 & Zoom) -- Bingbin Liu (CMU) -- Thinking Fast with Transformers: Algorithmic Reasoning via Shortcuts -- AI Seminar sponsored by SambaNova Systems

Thu Apr 20 11:50:59 EDT 2023

Reminder: this is happening in ~10 minutes! (Sorry for the late notice.)

On Thu, Apr 20, 2023 at 10:41 AM Asher Trockman <ashert at cs.cmu.edu> wrote:

> Dear all,
>
> We look forward to seeing you *today, this Thursday (4/20)*, from *12:00-1:00
> PM (U.S. Eastern time)* for the next talk of this semester's
> *CMU AI Seminar*, sponsored by SambaNova Systems <https://sambanova.ai/>.
> The seminar will be held in GHC 6115 *with pizza provided* and will be
> streamed on Zoom.
>
> To learn more about the seminar series or to see the future schedule,
> please visit the seminar website <http://www.cs.cmu.edu/~aiseminar/>.
>
> Today (4/20), *Bingbin Liu* (CMU) will be giving a talk titled *"Thinking
> Fast with Transformers: Algorithmic Reasoning via Shortcuts"*.
>
> *Title*: Thinking Fast with Transformers: Algorithmic Reasoning via
> Shortcuts
>
> *Talk Abstract*: Algorithmic reasoning requires capabilities which are
> most naturally understood through recurrent models of computation, like the
> Turing machine. However, Transformer models, while lacking recurrence, are
> able to perform such reasoning using far fewer layers than the number of
> reasoning steps. This raises the question: what solutions are these shallow
> and non-recurrent models finding? In this talk, we will formalize reasoning
> in the setting of automata, and show that the computation of an automaton
> on an input sequence of length T can be replicated exactly by Transformers
> with o(T) layers, which we call "shortcuts". We provide two constructions,
> with O(log T) layers for all automata and O(1) layers for solvable
> automata. Empirically, our results from synthetic experiments show that
> shallow solutions can also be found in practice.
>
> *Speaker Bio:* Bingbin Liu is a fourth-year PhD student at the Machine
> Learning Department of Carnegie Mellon University, co-advised by Pradeep
> Ravikumar and Andrej Risteski. Her research focuses on the theoretical
> understanding of self-supervised learning and unsupervised learning, often
> motivated by findings in vision and language.
>
> *In person*: GHC 6115
> *Zoom Link*:
> https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09
>
> Thanks,
> Asher Trockman
>