[IR Series] - Jaime Arguello & Pinar Donmez - Thursday July 16th, 2009, 11:00 AM - Wean Hall 7220

Tue Jul 14 15:34:47 EDT 2009

UPDATE: We will have two talks at this IR series. Both Jaime and Pinar
will be presenting their work in preparation for SIGIR next week.

Speaker: Pinar Donmez
Time & Date: see below
Title: On the Local Optimality of LambdaRank

A machine learning approach to rank learning trains a model to optimize
a target evaluation measure with respect to training data. Currently,
existing information retrieval measures are impossible to optimize
directly except for models with a trivial number of parameters. The IR
community thus faces a major challenge: how to optimize IR measures of
interest directly. In this paper, we present a solution. Specifically,
we show that LambdaRank [1], which smoothly approximates the gradient
of the target measure, can be adapted to work with three popular IR
target evaluation measures using the same underlying gradient
construction. It is likely, therefore, that this construction is
extendable to other evaluation measures. We empirically show that
LambdaRank finds a locally optimal solution for NDCG, MAP and MRR with
a 99% confidence rate. We also show that the amount of effective
training data varies with IR measure and that with a sufficiently large
training set size, matching the training optimization measure to the
target evaluation measure yields the best accuracy.

This work was conducted jointly with Krysta Svore and Chris Burges during
an internship at MSR Redmond. It will be presented at SIGIR '09.

On Jul 13, 2009, at 5:09 PM, Jonathan Elsas wrote:

> Hello -- Please join us for an IR series talk this Thursday.
> NOTE the different time & location.
>
> Speaker: Jaime Arguello (LTI, CMU)
> Time & Date: Thursday July 16th, 2009, 11:00 AM
> Place: Wean Hall 7220
>
> Lunch will be provided by Yahoo!
>
> Title: Sources of Evidence for Vertical Selection
>
> Web search providers often include search services for
> domain-specific subcollections, called verticals, such as news, images,
> videos, job postings, company summaries, and artist profiles. We  
> address the problem of vertical selection, predicting relevant  
> verticals (if any) for queries issued to a search engine's main web  
> search page. In contrast to prior collection selection tasks,  
> vertical selection is associated with unique resources that can  
> inform the classification decision. We focus on three sources of
> evidence: (1) the query string, from which features are derived  
> independent of external resources, (2) logs of queries previously  
> issued to the vertical directly by users, and (3) corpora  
> representative of vertical content. These sources of evidence are  
> integrated as features in a classification-based approach. We make  
> use of and compare against prior work in federated search and  
> retrieval effectiveness prediction. Our evaluation focuses on 18  
> different verticals, which differ in terms of semantics, media type,  
> size, and level of query traffic. An in-depth error analysis reveals  
> unique challenges across different verticals and provides insight  
> into vertical selection for future work.
>
> Based on work conducted at Yahoo! Labs Montreal to be presented at  
> SIGIR 2009.
>
>
> Thanks,
>
> Jon, Jaime & Grace
>
>