Feb 25 at 12pm (GHC 6115) -- Samy Bengio (Apple) -- How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
Victor Akinwande
vakinwan at andrew.cmu.edu
Thu Feb 20 11:29:21 EST 2025
Dear all,
We look forward to seeing you next *Tuesday (02/25) from 12:00-1:00 PM (ET)* for the next talk of the CMU AI Seminar, sponsored by SambaNova Systems (https://sambanova.ai/). The seminar will be held in *GHC 6115* with pizza provided and will be streamed on Zoom.
To learn more about the seminar series or to see the future schedule,
please visit the seminar website (http://www.cs.cmu.edu/~aiseminar/).
Next Tuesday (02/25), Samy Bengio (Apple) will give a talk titled "How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad".
*Abstract:*
Can Transformers predict new syllogisms by composing established ones? More generally, what types of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity, but this does not address learnability. This presentation puts forward the notion of 'globality degree' to capture when weak learning is efficiently achievable by regular Transformers: the globality degree of a target measures the least number of tokens required, in addition to the token histogram, to correlate nontrivially with that target. As shown experimentally, and theoretically under additional assumptions, distributions with high globality cannot be learned efficiently; in particular, syllogisms cannot be composed on long chains. Furthermore, we show that (i) an agnostic scratchpad cannot break the globality barrier, (ii) an educated scratchpad can help if it breaks the globality barrier at each step, and (iii) a notion of 'inductive scratchpad' can both break the globality barrier and improve out-of-distribution generalization, e.g., generalizing to almost double the input size for some arithmetic tasks.
*Speaker Bio:*
Samy Bengio (PhD in computer science, University of Montreal, 1993) has been a senior director of machine learning research at Apple since 2021 and an adjunct professor at EPFL since 2024. Before that, he was a distinguished scientist at Google Research, which he joined in 2007 and where he headed part of the Google Brain team, and earlier, at IDIAP in the early 2000s, he co-wrote the well-known open-source Torch machine learning library.
His research interests span many areas of machine learning, such as deep architectures, representation learning, vision and language processing, and, more recently, reasoning.
He is an action editor of the Journal of Machine Learning Research and serves on the board of the NeurIPS Foundation. He was on the editorial board of the Machine Learning Journal; has been program chair (2017) and general chair (2018) of NeurIPS, program chair of ICLR (2015, 2016), and general chair of BayLearn (2012-2015), MLMI (2004-2006), and NNSP (2002); and has served on the program committees of several international conferences, including NeurIPS, ICML, ICLR, ECML, and IJCAI.
More details can be found at http://bengio.abracadoudou.com.
*In person: GHC 6115*
Zoom Link: https://cmu.zoom.us/j/93599036899?pwd=oV45EL19Bp3I0PCRoM8afhKuQK7HHN.1