Fwd: Re: RI Ph.D. Thesis Defense: Viraj Mehta
Jeff Schneider
jeff4 at andrew.cmu.edu
Thu Dec 7 14:04:21 EST 2023
Autonlab:
Adding one more reminder.
Please come and see Viraj's PhD defense starting in 26 minutes!
Jeff.
-------- Forwarded Message --------
Subject: Re: RI Ph.D. Thesis Defense: Viraj Mehta
Date: Thu, 7 Dec 2023 11:30:00 -0500
From: Viraj Mehta <virajm at cs.cmu.edu>
To: Suzanne Muth <lyonsmuth at cmu.edu>
CC: RI People <ri-people at andrew.cmu.edu>
Hi everyone,
Just wanted to send a reminder that this will be at 2:30pm today in NSH
4305. Hope to see you there!
Viraj
On Mon, Nov 27, 2023 at 8:32 AM Suzanne Muth <lyonsmuth at cmu.edu> wrote:
Date: 07 December 2023
Time: 2:30 p.m. (ET)
Location: NSH 4305
Zoom Link:
https://cmu.zoom.us/j/92549072618?pwd=MndqNzMzS2Q3ZU4rMTVIVVlPTUh5dz09
Type: Ph.D. Thesis Defense
Who: Viraj Mehta
Title: Sample-Efficient Reinforcement Learning with applications in
Nuclear Fusion
Abstract:
In many practical applications of reinforcement learning (RL), it is
expensive to observe state transitions from the environment. In the
problem of plasma control for nuclear fusion, the motivating example
of this thesis, determining the next state for a given state-action
pair requires querying an expensive transition function which can
lead to many hours of computer simulation or dollars of scientific
research. Such expensive data collection prohibits application of
standard RL algorithms which usually require a large number of
observations to learn. In this thesis, I address the problem of
efficiently learning a policy from a relatively modest number of
observations, motivated by the application of automated decision
making and control to nuclear fusion. The first section presents
four approaches developed to evaluate the prospective value of data
in learning a good policy and discusses their performance,
guarantees, and application. These approaches address the problem
through the lenses of information theory, decision theory, the
optimistic value gap, and learning from comparative feedback. We
apply this last method to reinforcement learning from human feedback
for the alignment of large language models. The second section presents
work that uses physical prior knowledge about the dynamics to learn an
accurate model more quickly. Finally, I give an introduction to
the problem setting of nuclear fusion, present recent work
optimizing the design of plasma current rampdowns at the DIII-D
tokamak, and discuss future applications of AI in fusion.
Thesis Committee Members:
Jeff Schneider, Chair
Deepak Pathak
David Held
Stefano Ermon, Stanford University
Mark D. Boyer, Commonwealth Fusion Systems
A draft of the thesis document is available at:
virajm.com/assets/pdf/thesis.pdf