[CMU AI Seminar] Fwd: AI Seminar sponsored by Fortive -- Feb 16 (Zoom) -- Will Grathwohl (U of Toronto) -- Using and Abusing Gradients for Discrete Sampling and Energy-Based Models

Shaojie Bai shaojieb at andrew.cmu.edu
Mon Feb 15 11:55:44 EST 2021


Hi all,

Just a reminder that the CMU AI Seminar <http://www.cs.cmu.edu/~aiseminar/> is
tomorrow, 12:00-1:00 PM (ET):
https://cmu.zoom.us/j/91853143684?pwd=UDNLNWpRcEs2WUx4S21UZ3d2RHV2dz09.

Will Grathwohl (University of Toronto) will be talking about discrete
sampling and deep energy-based models (see below).

Thanks,
Shaojie

---------- Forwarded message ---------
From: Shaojie Bai <shaojieb at andrew.cmu.edu>
Date: Tue, Feb 9, 2021 at 1:37 PM
Subject: AI Seminar sponsored by Fortive -- Feb 16 (Zoom) -- Will Grathwohl
(U of Toronto) -- Using and Abusing Gradients for Discrete Sampling and
Energy-Based Models
To: <ai-seminar-announce at cs.cmu.edu>


Dear all,

We look forward to seeing you *next Tuesday (2/16)* from 12:00-1:00 PM
(U.S. Eastern time) for the next talk of our *CMU AI seminar*, sponsored by
Fortive <https://careers.fortive.com/>.

To learn more about the seminar series or see the future schedule, please
visit the seminar website <http://www.cs.cmu.edu/~aiseminar/>.

On 2/16, *Will Grathwohl <http://www.cs.toronto.edu/~wgrathwohl/>* (University
of Toronto) will be giving a talk on "*Using and Abusing Gradients for
Discrete Sampling and Energy-Based Models*."

*Title*: Using and Abusing Gradients for Discrete Sampling and Energy-Based
Models

*Talk Abstract*: Deep energy-based models have quickly become a popular and
successful approach to generative modeling in high dimensions. Their success
can mainly be attributed to improvements in MCMC sampling, such as Langevin
dynamics, and to training with large persistent Markov chains. Because of
this reliance on gradient-based sampling, these models have been successful
at modeling continuous data; as it stands, the same solution cannot be
applied to discrete data. In this work, we propose a general and automatic
approximate sampling strategy for probabilistic models with discrete
variables. Our approach uses gradients of the likelihood function with
respect to discrete assignments to propose Metropolis-Hastings updates,
which can be incorporated into larger Markov chain Monte Carlo or learning
schemes. We show theoretically and empirically that this simple approach
outperforms generic samplers in a number of difficult settings, including
Ising models, Potts models, restricted Boltzmann machines, and factorial
hidden Markov models -- even outperforming some samplers that exploit known
structure in these distributions. We also show that our improved sampler
enables the training of deep energy-based models on high-dimensional
discrete data that outperform variational autoencoders and previous
energy-based models.
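For the curious, here is a minimal Python sketch of the idea in the abstract:
use the gradient of the log-likelihood, evaluated at a discrete configuration,
to score single-bit flips and drive a Metropolis-Hastings proposal. This is
not the speaker's implementation; the toy Ising-style model, its sizes, and
the temperature-2 softmax proposal are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    D = 16                                  # number of binary variables (illustrative)
    W = rng.normal(scale=0.1, size=(D, D))
    W = (W + W.T) / 2.0                     # symmetric couplings
    np.fill_diagonal(W, 0.0)                # no self-interactions
    b = rng.normal(scale=0.1, size=D)

    def log_prob(x):
        # Unnormalized log-probability of a toy Ising-style model, x in {0,1}^D.
        return 0.5 * x @ W @ x + b @ x

    def grad_log_prob(x):
        # Gradient of log_prob, treating x as a point in continuous space.
        return W @ x + b

    def logsumexp(v):
        m = v.max()
        return m + np.log(np.exp(v - m).sum())

    def proposal_logits(x):
        # First-order estimate of log p(flip_i(x)) - log p(x) for every bit i,
        # divided by 2 (an assumed temperature for the flip proposal).
        return (1.0 - 2.0 * x) * grad_log_prob(x) / 2.0

    def mh_step(x):
        # Propose flipping one bit, chosen by a softmax over the gradient-based
        # scores, then apply the usual Metropolis-Hastings correction.
        logits = proposal_logits(x)
        log_z = logsumexp(logits)
        i = rng.choice(D, p=np.exp(logits - log_z))
        x_new = x.copy()
        x_new[i] = 1.0 - x_new[i]
        logits_rev = proposal_logits(x_new)
        log_q_fwd = logits[i] - log_z
        log_q_rev = logits_rev[i] - logsumexp(logits_rev)
        log_accept = log_prob(x_new) - log_prob(x) + log_q_rev - log_q_fwd
        return x_new if np.log(rng.uniform()) < log_accept else x

    x = rng.integers(0, 2, size=D).astype(float)
    for _ in range(1000):
        x = mh_step(x)
    print("final sample:", x.astype(int))

The same scheme applies whenever the likelihood is differentiable in a
continuous relaxation of the discrete variables; for energies without a
closed-form gradient, an autodiff framework could supply grad_log_prob.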

*Speaker Bio*: Will Grathwohl is a PhD student at the University of Toronto,
supervised by Richard Zemel and David Duvenaud. His work mainly focuses on
generative models and their applications to downstream discriminative
tasks. He has worked on variational inference and normalizing flows, and
now focuses mainly on energy-based models. Will is currently a student
researcher on the Google Brain team in Toronto. Prior to graduate school,
he worked on machine learning applications in Silicon Valley and earned his
undergraduate degree in mathematics at the Massachusetts Institute of
Technology.

*Zoom Link*:
https://cmu.zoom.us/j/91853143684?pwd=UDNLNWpRcEs2WUx4S21UZ3d2RHV2dz09


Thanks,
Shaojie Bai (MLD)

