<div dir="ltr"><div style="font-size:12.8px"><span style="font-size:12.8px">Dear faculty and students,</span><br></div><div style="font-size:12.8px"><br></div><div style="font-size:12.8px">We look forward to seeing you next Tuesday, March 21, at noon in <b>NSH 1507</b> for AI lunch. To learn more about the seminar and lunch, please visit <span style="font-size:12.8px">the </span><a href="http://www.cs.cmu.edu/~aiseminar/" target="_blank" style="font-size:12.8px">AI Lunch webpage</a><span style="font-size:12.8px">.</span></div><div style="font-size:12.8px"><br></div><div style="font-size:12.8px">On Tuesday, <a href="http://www.cs.cmu.edu/~wensun/">Wen Sun</a> will give a talk 
titled *Differentiable Imitation Learning and Sequential Prediction*.</div><div><br></div><div><div>*Abstract*: Recently, researchers have demonstrated state-of-the-art performance on sequential decision-making problems (e.g., robotics control, sequential prediction) with deep neural networks and Reinforcement Learning (RL). However, for some of these problems, oracles that can demonstrate good performance are available during training. In this work, we propose AggreVaTeD, a policy gradient extension of the Imitation Learning (IL) approach of Ross & Bagnell (2014) that can leverage oracles to achieve faster and more accurate solutions with less training data than less-informed RL approaches. Specifically, we provide a comprehensive theoretical study of IL that demonstrates we can expect up to exponentially lower sample complexity for learning with AggreVaTeD than with RL algorithms. Finally, we present two stochastic gradient procedures that learn neural network policies for several problems, including a sequential prediction task as well as various high-dimensional robotics control problems. Our results and theory indicate that the proposed approach can achieve superior performance with respect to the oracle when the demonstrator is sub-optimal.</div><div><br></div><div>This is joint work with Arun Venkatraman, Geoff Gordon, Byron Boots, and Drew Bagnell.</div></div></div>