[IR-Series] Talk this Friday: Jaime Arguello

Tue Jul 13 14:25:04 EDT 2010

Join us for another IR series talk this Friday.  **LUNCH** will be
provided by Yahoo!.

Time: Noon
Location: GHC 6501

Speaker: Jaime Arguello (http://www.cs.cmu.edu/~jaime/)

Title: Vertical Selection in the Presence of Unlabeled Verticals.

Vertical aggregation is the task of incorporating results from specialized
search engines or verticals (e.g., images, video, news) into Web search
results.  Vertical selection is the subtask of deciding, given a query,
which verticals, if any, are relevant.  State-of-the-art approaches use
machine-learned models to predict which verticals are relevant to a query.
When trained using a large set of labeled data, a machine-learned
vertical selection model outperforms baselines that require no training
data.  Unfortunately, whenever a new vertical is introduced, a costly new
set of editorial data must be gathered.  In this paper, we propose methods
for reusing training data from a set of existing (source) verticals to
learn a predictive model for a new (target) vertical.  We study methods
for learning robust, portable, and adaptive cross-vertical models.
Experiments show the need to focus on different types of features when
maximizing portability (the ability for a single model to make accurate
predictions across multiple verticals) than when maximizing adaptability
(the ability for a single model to make accurate predictions for a
specific vertical).  We demonstrate the efficacy of our methods through
extensive experimentation for 11 verticals.

This is joint work with Fernando Diaz and Jean-Francois Paiement from
Yahoo! Labs and will be presented at SIGIR 2010.