[CL+NLP Lunch] CL+NLP Lunch, Ankur Parikh, Tuesday February 17th @ 12:00pm
Dallas Card
dcard at andrew.cmu.edu
Fri Feb 13 11:25:38 EST 2015
Please join us for the next CL+NLP lunch at noon on Tuesday, February 17th,
where Ankur Parikh will be speaking about language modeling with power low
rank ensembles. Lunch will be provided!
---
CL+NLP lunch <http://www.cs.cmu.edu/~nlp-lunch/>
Tuesday, February 17th at 12:00pm
GHC 4405
Speaker: Ankur Parikh, Machine Learning Department
TITLE: Language Modeling with Power Low Rank Ensembles
ABSTRACT:
Language modeling, the task of estimating the probability of sequences of
words, is an important component in many applications such as speech
recognition and machine translation. While the task is seemingly simple,
the large vocabulary space and power-law nature of language lead to a
severe data sparsity problem, making parameter estimation challenging. The
predominant approach to language modeling is the n-gram model, where
n-grams of various
orders are interpolated via different smoothing techniques to produce
robust estimates of rare sequences.
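As a rough illustration of the interpolation idea described above (not the
speaker's method), here is a minimal Python sketch of a bigram model smoothed
with absolute discounting and interpolated with a unigram model. The function
name, the discount value of 0.75, and the toy sentence are assumptions chosen
for the example.

```python
from collections import Counter

def train_bigram_absolute_discounting(tokens, discount=0.75):
    """Return a function p(word | prev) for an interpolated bigram model.

    Illustrative sketch only: a fixed discount is subtracted from every seen
    bigram count, and the freed probability mass is given to the unigram model.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # How often each word occurs as a context (the last token has no successor).
    context_counts = Counter(tokens[:-1])
    # Number of distinct word types observed after each context.
    followers = Counter(prev for prev, _ in bigrams)
    total = len(tokens)

    def prob(word, prev):
        p_unigram = unigrams[word] / total
        c_prev = context_counts[prev]
        if c_prev == 0:
            # Unseen context: fall back entirely to the unigram estimate.
            return p_unigram
        discounted = max(bigrams[(prev, word)] - discount, 0.0) / c_prev
        # Mass removed by discounting, redistributed via the unigram model.
        backoff_weight = discount * followers[prev] / c_prev
        return discounted + backoff_weight * p_unigram

    return prob

# Toy usage: "ran" never follows "the", yet it still gets nonzero probability.
tokens = "the cat sat on the mat the cat ran".split()
prob = train_bigram_absolute_discounting(tokens)
print(prob("cat", "the"), prob("ran", "the"))
```

For any seen context, the discounted bigram terms and the redistributed
unigram mass sum to one over the vocabulary, which is what lets rare or
unseen sequences receive a usable estimate.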
In this work, I present power low rank ensembles (PLRE), a framework for
language modeling that consists of collections of low rank matrices and
tensors. Our method can be understood as a generalization of n-gram
modeling to non-integer n, and includes standard techniques such as
absolute discounting, deleted interpolation, and Kneser-Ney smoothing as
special cases. Our approach consistently outperforms state-of-the-art
modified Kneser-Ney baselines while preserving the computational
advantages of n-gram models. In particular, unlike other recent advances
such as neural language models, our method does not have any partition
functions, thus enabling fast evaluation at test time.
This work received the best paper runner-up award at EMNLP 2014 and is
joint work with Avneesh Saluja, Chris Dyer, and Eric Xing.
Bio:
Ankur Parikh is a PhD candidate at Carnegie Mellon University advised by
Professor Eric Xing. He is passionate about research in machine learning,
natural language processing (NLP), and computational biology. In
particular, his thesis explores probabilistic modeling from the
perspective of linear algebra and applications of these insights to design
effective solutions for NLP tasks. Ankur has received a best paper
runner-up award at EMNLP 2014, a best paper award in translational
bioinformatics at
ISMB 2011, and an NSF Graduate Fellowship in 2011. He previously graduated
from Princeton University in 2009 with highest honors.
http://www.cs.cmu.edu/~apparikh/