Fwd: Re: RI Ph.D. Thesis Defense: Viraj Mehta

Thu Dec 7 14:04:21 EST 2023

Autonlab:

Adding one more reminder.

Please come and see Viraj's PhD defense starting in 26 minutes!

Jeff.

-------- Forwarded Message --------
Subject: 	Re: RI Ph.D. Thesis Defense: Viraj Mehta
Date: 	Thu, 7 Dec 2023 11:30:00 -0500
From: 	Viraj Mehta <virajm at cs.cmu.edu>
To: 	Suzanne Muth <lyonsmuth at cmu.edu>
CC: 	RI People <ri-people at andrew.cmu.edu>

Hi everyone,

Just wanted to send a reminder that this will be at 2:30pm today in NSH 
4305. Hope to see you there!

Viraj

On Mon, Nov 27, 2023 at 8:32 AM Suzanne Muth <lyonsmuth at cmu.edu> wrote:

     Date: 07 December 2023
     Time: 2:30 p.m. (ET)
     Location: NSH 4305
     Zoom Link:
     https://cmu.zoom.us/j/92549072618?pwd=MndqNzMzS2Q3ZU4rMTVIVVlPTUh5dz09

     Type: Ph.D. Thesis Defense
     Who: Viraj Mehta
     Title: Sample-Efficient Reinforcement Learning with applications in
     Nuclear Fusion

     Abstract:
     In many practical applications of reinforcement learning (RL), it is
     expensive to observe state transitions from the environment. In the
     problem of plasma control for nuclear fusion, the motivating example
     of this thesis, determining the next state for a given state-action
     pair requires querying an expensive transition function which can
     lead to many hours of computer simulation or dollars of scientific
     research. Such expensive data collection prohibits application of
     standard RL algorithms which usually require a large number of
     observations to learn. In this thesis, I address the problem of
     efficiently learning a policy from a relatively modest number of
     observations, motivated by the application of automated decision
     making and control to nuclear fusion. The first section presents
     four approaches developed to evaluate the prospective value of data
     in learning a good policy and discusses their performance,
     guarantees, and application. These approaches address the problem
     through the lenses of information theory, decision theory, the
     optimistic value gap, and learning from comparative feedback. We
     apply this last method to reinforcement learning from human feedback
     for the alignment of large language models. The second section
     presents work that uses physical prior knowledge about the dynamics to
     learn an accurate model more quickly. Finally, I give an introduction to
     the problem setting of nuclear fusion, present recent work
     optimizing the design of plasma current rampdowns at the DIII-D
     tokamak, and discuss future applications of AI in fusion.

     Thesis Committee Members:
     Jeff Schneider, Chair
     Deepak Pathak
     David Held
     Stefano Ermon, Stanford University
     Mark D. Boyer, Commonwealth Fusion Systems

     A draft of the thesis document is available at:
     virajm.com/assets/pdf/thesis.pdf