[AI Seminar] AI Seminar sponsored by Apple -- Qizhe Xie -- Oct. 9th

Han Zhao han.zhao at cs.cmu.edu
Mon Oct 8 15:19:03 EDT 2018


A gentle reminder that the following talk will happen at noon in GHC 6115
tomorrow.

On Sat, Oct 6, 2018 at 4:19 PM, Han Zhao <han.zhao at cs.cmu.edu> wrote:

> Dear faculty and students:
>
> We look forward to seeing you next Tuesday, Oct. 9th, at noon in GHC 6115
> for the AI Seminar sponsored by Apple. To learn more about the seminar series,
> please visit the website.
> On Tuesday, Qizhe Xie will give the following talk:
>
> Title: From Credit Assignment to Entropy Regularization: Two New
> Algorithms for Neural Sequence Prediction
>
> Background:
> Modeling and predicting discrete sequences is central to many
> natural language processing tasks. Although evaluation metrics differ
> across tasks, the standard training algorithm for language
> generation has been maximum likelihood estimation (MLE). However, MLE
> has two well-known weaknesses: (1) MLE training ignores the
> task-specific metric; (2) MLE can suffer from exposure bias, the
> phenomenon that the model is never
> exposed to its own failures during training. The recently proposed reward
> augmented maximum likelihood (RAML) tackles these problems by constructing
> a task-metric-dependent target distribution and training the model to
> match this task-specific target instead of the empirical data distribution.
>
> Abstract:
> In this talk, we study the credit assignment problem in reward augmented
> maximum likelihood (RAML) and establish a theoretical equivalence between
> the token-level counterpart of RAML and entropy-regularized
> reinforcement learning. Inspired by this connection, we propose two sequence
> prediction algorithms, one extending RAML with fine-grained credit
> assignment and the other improving Actor-Critic with systematic entropy
> regularization. On two benchmark datasets, we show that the proposed
> algorithms outperform RAML and Actor-Critic, respectively.
>
> --
>
> Han Zhao
> Machine Learning Department
> School of Computer Science
> Carnegie Mellon University
> Mobile: +1-412-652-4404
>

