Please come to Ian Char's PhD thesis defense starting at 10am in GHC 6115!
Jeff Schneider
jeff4 at andrew.cmu.edu
Thu Apr 11 09:19:05 EDT 2024
-------- Forwarded Message --------
Subject: Thesis Defense - April 11, 2024 - Ian Char - Advancing
Model-Based Reinforcement Learning with Applications in Nuclear Fusion
Date: Thu, 28 Mar 2024 14:16:52 -0400
From: Diane Stidle <stidle at andrew.cmu.edu>
Reply-To: stidle at andrew.cmu.edu
To: ml-seminar at cs.cmu.edu <ML-SEMINAR at CS.CMU.EDU>,
riedmiller at google.com, ekolemen at pppl.gov
Thesis Defense
Date: April 11, 2024
Time: 10:00am (EDT)
Place: GHC 6115 & Remote
PhD Candidate: Ian Char
Title: Advancing Model-Based Reinforcement Learning with Applications
in Nuclear Fusion
Abstract:
Reinforcement learning (RL) may be the key to overcoming previously
insurmountable obstacles, leading to technological and scientific
innovations. One example where RL could have a sizable impact is
tokamak control. Tokamaks are among the most promising devices for
making nuclear fusion a viable energy source. They operate by
magnetically confining a plasma; however, sustaining the plasma for long
periods of time and at high pressures remains a challenge for the
tokamak control community. RL may be able to learn how to sustain the
plasma, but, as with many exciting applications of RL, it is infeasible
to collect enough data on the real device to learn a policy.
In this thesis, we explore learning policies using surrogate models of
the environment, especially surrogate models learned from an offline
data source. To start, in Part I we investigate the scenario in which
one has access to a simulator that can be used to generate data, but
the simulator is too computationally taxing to use with data-hungry
deep RL algorithms. We instead suggest a Bayesian optimization
algorithm to learn a policy in this setting. Following this, we pivot
to the setting in which surrogate models of the environment can be
learned from offline data. While these models are computationally much
cheaper, their predictions inevitably contain errors. As
such, both robust policy learning procedures and good uncertainty
quantification of model errors are crucial for success. To address the
former, in Part II we propose a trajectory stitching algorithm that
accounts for these modeling errors and a policy network architecture
that is adaptive, yet robust. Part III shifts focus to uncertainty
quantification, where we propose a more intelligent uncertainty sampling
procedure and a neural process architecture for learning uncertainties
efficiently. In the final part, we detail how we learned models to
predict plasma evolution, how we used these models to train a neutral
beam controller, and the results of deploying this controller on the
DIII-D tokamak.
Thesis Committee:
Jeff Schneider, Chair
Ruslan Salakhutdinov
Zico Kolter
Martin Riedmiller (DeepMind)
Egemen Kolemen (Princeton)
Link to Draft Document:
https://drive.google.com/file/d/1VQAZDuvRA1GfEfZkGS6EfzFd-zovetU1/view?usp=sharing
Link to Zoom meeting:
https://www.google.com/url?q=https://cmu.zoom.us/j/94461753500?pwd%3DN1FmTktDWWU5cDkwM0szWWxvSXNndz09&sa=D&source=calendar&ust=1712067446243633&usg=AOvVaw0pAS1H8u4VyGICh2A69iS2
--
Diane Stidle
PhD Program Manager
Machine Learning Department
Carnegie Mellon University
stidle at andrew.cmu.edu