[CL+NLP Lunch] CL+NLP Lunch, Kevin Gimpel, Friday August 15 @ 2:00pm

Dallas Card dcard at andrew.cmu.edu
Fri Aug 8 12:14:49 EDT 2014


Please join us for the next CL+NLP lunch at 2pm on August 15th, where
Kevin Gimpel will be speaking about weakly-supervised NLP with
cost-augmented contrastive estimation. Lunch will be provided!

If you would like to set up a meeting with Kevin on August 15th, please
email Dallas Card <dcard at cs.cmu.edu>.

CL+NLP lunch <http://www.cs.cmu.edu/~nlp-lunch/>
Friday, August 15th at 2:00pm
GHC 8102

Kevin Gimpel, Toyota Technological Institute at Chicago

*Weakly-Supervised NLP with Cost-Augmented Contrastive Estimation*

Abstract:
Unsupervised NLP aims to discover meaningful structure in unannotated
text, such as parts-of-speech, morphological segmentation, or syntactic
structure.  Unsupervised systems improve when researchers incorporate
knowledge that biases learning toward the characteristics of the desired
structure.  Contrastive estimation (CE; Smith and Eisner, 2005) is
a general approach to unsupervised learning with a particular way of
incorporating knowledge.  CE increases the likelihood of the observations
at the expense of those in a particular neighborhood of each observation. 
The neighborhood typically contains corrupted versions of the
observations.
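
As a rough aid for readers unfamiliar with CE: in the log-linear case,
the Smith and Eisner (2005) objective can be sketched as below, where
x_i are the observed sentences, y ranges over hidden structures such as
tag sequences, f is a feature function, and N(x_i) is the neighborhood
of x_i; the notation is chosen here for illustration, not taken from the
talk.

% Sketch of the contrastive estimation objective (Smith and Eisner, 2005),
% written in illustrative notation rather than the talk's own.
\max_{\theta} \; \sum_i \log
  \frac{\sum_{y} \exp\{\theta^{\top} f(x_i, y)\}}
       {\sum_{x' \in N(x_i)} \sum_{y} \exp\{\theta^{\top} f(x', y)\}}

Each observation is made more probable only relative to its neighborhood,
so the choice of neighborhood is where the modeler's knowledge enters.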

In this talk, we generalize CE in two ways that allow us to add more
knowledge to unsupervised learning (thereby adding "weak" supervision). 
In particular, we augment CE with two types of cost functions, one on
observations and one on output structures.  The first allows the modeler
to specify not only the set of corrupted inputs for each observation, but
also how bad each one is.  The second lets us specify preferences on
desired output structures, regardless of the input sentence.  We evaluate
our approach, which we call cost-augmented contrastive estimation (CCE),
on unsupervised part-of-speech tagging of five languages from the PASCAL
2012 shared task.  We find that CCE improves over both standard CE and
strong benchmarks from the shared task.
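
As an illustration only, and not necessarily the formulation presented in
the talk, one way such costs could enter the CE objective is in the
denominator: cost_in(x_i, x') weighting each corrupted input by how bad
it is, and cost_out(y) penalizing dispreferred output structures.

% Speculative sketch: the placement and form of the two costs are
% assumptions made for illustration, not taken from the talk or paper.
\max_{\theta} \; \sum_i \log
  \frac{\sum_{y} \exp\{\theta^{\top} f(x_i, y)\}}
       {\sum_{x' \in N(x_i)} \sum_{y}
          \exp\{\theta^{\top} f(x', y)
                + \mathrm{cost}_{\mathrm{in}}(x_i, x')
                + \mathrm{cost}_{\mathrm{out}}(y)\}}

Under this reading, the model is pushed hardest away from the worst
corruptions and from dispreferred structures.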

This is joint work with Mohit Bansal.

Bio:
Kevin Gimpel is a research assistant professor at the Toyota Technological
Institute at Chicago, a philanthropically-endowed academic computer
science institute located on the campus of the University of Chicago.  In
2012, he received his PhD from the Language Technologies Institute at
Carnegie Mellon University, where he was advised by Noah Smith.  His
recent research focuses on machine translation, unsupervised learning,
and structured prediction.


-- 
Dallas Card
Machine Learning Department
Carnegie Mellon University


