Fwd: MSR Thesis Talk: Distributed Reinforcement Learning for Autonomous Driving

Jeff Schneider jeff4 at andrew.cmu.edu
Wed Apr 27 13:11:13 EDT 2022


Hi Everyone,

Please come and hear Zhe talk about his Master's work on parallelizing RL 
tomorrow!

Jeff.



-------- Forwarded Message --------
Subject: 	MSR Thesis Talk: Distributed Reinforcement Learning for 
Autonomous Driving
Date: 	Wed, 20 Apr 2022 17:06:42 -0400
From: 	Zhe Huang <zhehuang at cmu.edu>
To: 	ri-people at cs.cmu.edu, Jeff Schneider <jeff4 at andrew.cmu.edu>, David 
Held <dheld at andrew.cmu.edu>, Adam Villaflor <avillafl at andrew.cmu.edu>, 
Barbara (B.J.) Fecich <barbarajean at cmu.edu>



Hello all,

I will be giving my MSR thesis talk on Thursday, April 28th, 2022 at 
1:30 pm EDT. Everyone is invited!

*Date*: Thursday, April 28th, 2022
*Time*: 1:30pm - 2pm
*Location*: NSH 4305, or
*Zoom Link*: https://cmu.zoom.us/j/8838971548

*Title*: Distributed Reinforcement Learning for Autonomous Driving

*Abstract*:

Due to the complex and safety-critical nature of autonomous driving, 
recent works typically test their ideas on simulators designed for the 
very purpose of advancing self-driving research. Despite the convenience 
of modeling autonomous driving as a trajectory optimization problem, few 
of these methods resort to online reinforcement learning (RL) to address 
challenging driving scenarios. This is mainly because classic online RL 
algorithms were originally designed for toy problems such as Atari games, 
which are solvable within hours. In contrast, it may take weeks or 
months to obtain satisfactory results on self-driving tasks with these 
online RL methods, owing both to time-consuming simulation and to the 
difficulty of the problem itself. Thus, a promising online RL pipeline 
for autonomous driving should be efficiency-driven.

In this thesis, we investigate the inefficiency of directly applying 
generic online RL algorithms to self-driving pipelines. We propose two 
distributed multi-agent RL algorithms, Multi-Parallel SAC (off-policy) 
and Multi-Parallel PPO (on-policy), both of which scale well by running 
asynchronously. Our methods accelerate online RL training on the CARLA 
simulator by establishing both inter-process and intra-process 
parallelization. We demonstrate that our multi-agent methods achieve 
state-of-the-art performance on various CARLA self-driving tasks in 
significantly less time.
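To give a rough feel for the inter-process parallelization the abstract describes, here is a minimal, hypothetical sketch of an asynchronous actor/learner data-collection loop. All names, the toy environment, and the step counts are illustrative placeholders and are not taken from the thesis code or CARLA.

```python
# Hedged sketch: several actor processes each run their own environment
# copy and push transitions into one shared queue, without blocking on a
# central learner. This mirrors the general idea of asynchronous
# inter-process data collection, not the thesis implementation itself.
import multiprocessing as mp
import random


def toy_env_step(state, action):
    """Placeholder environment: trivial transition with a random reward."""
    return state + 1, random.random()


def actor(actor_id, steps, queue):
    """One actor process: act with a stand-in policy, enqueue transitions."""
    state = 0
    for _ in range(steps):
        action = random.choice([0, 1])  # stand-in for a learned policy
        next_state, reward = toy_env_step(state, action)
        queue.put((actor_id, state, action, reward, next_state))
        state = next_state


def collect(num_actors=4, steps_per_actor=25):
    """Launch actors in parallel and drain their transitions into a buffer."""
    queue = mp.Queue()
    procs = [mp.Process(target=actor, args=(i, steps_per_actor, queue))
             for i in range(num_actors)]
    for p in procs:
        p.start()
    # The learner side would consume from the queue as data arrives;
    # here we simply drain the expected number of transitions.
    buffer = [queue.get() for _ in range(num_actors * steps_per_actor)]
    for p in procs:
        p.join()
    return buffer


if __name__ == "__main__":
    replay = collect()
    print(f"collected {len(replay)} transitions asynchronously")
```

In a real pipeline, the learner would train on the buffered transitions concurrently (off-policy, as in SAC) or synchronize policy snapshots back to the actors between updates (on-policy, as in PPO); this sketch only shows the data-gathering half.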

*Committee*:
Prof. Jeff Schneider (advisor)
Prof. David Held
Adam Villaflor

--

Zhe Huang
MSR Student, Robotics Institute, CMU


