From jelsas+ at cs.cmu.edu  Mon Jul 13 17:09:03 2009
From: jelsas+ at cs.cmu.edu (Jonathan Elsas)
Date: Mon, 13 Jul 2009 17:09:03 -0400
Subject: [IR Series] - Jaime Arguello - Thursday July 16th, 2009,
	11:00 AM - Wean Hall 7220
In-Reply-To: <4A1C0E1D.2090309@cs.cmu.edu>
References: <4A1C0E1D.2090309@cs.cmu.edu>
Message-ID: <D8CB84AB-2CFD-4C27-B0A8-38C96795A424@cs.cmu.edu>

Hello -- Please join us for an IR series talk this Thursday.  NOTE  
the different time & location.

Speaker: Jaime Arguello (LTI, CMU)
Time & Date: Thursday July 16th, 2009, 11:00 AM
Place: Wean Hall 7220

Lunch will be provided by Yahoo!

Title: Sources of Evidence for Vertical Selection

Web search providers often include search services for domain-specific  
subcollections, called verticals, such as news, images, videos, job  
postings, company summaries, and artist profiles. We address the  
problem of vertical selection, predicting relevant verticals (if any)  
for queries issued to a search engine's main web search page. In  
contrast to prior collection selection tasks, vertical selection is  
associated with unique resources that can inform the  
classification decision. We focus on three sources of evidence: (1) the  
query string, from which features are derived independent of external  
resources, (2) logs of queries previously issued to the vertical  
directly by users, and (3) corpora representative of vertical content.  
These sources of evidence are integrated as features in a  
classification-based approach. We make use of and compare against  
prior work in federated search and retrieval effectiveness prediction.  
Our evaluation focuses on 18 different verticals, which differ in  
terms of semantics, media type, size, and level of query traffic. An  
in-depth error analysis reveals unique challenges across different  
verticals and provides insight into vertical selection for future work.

Based on work conducted at Yahoo! Labs Montreal to be presented at  
SIGIR 2009.


Thanks,

Jon, Jaime & Grace


From jelsas+ at cs.cmu.edu  Tue Jul 14 15:34:47 2009
From: jelsas+ at cs.cmu.edu (Jonathan Elsas)
Date: Tue, 14 Jul 2009 15:34:47 -0400
Subject: [IR Series] - Jaime Arguello & Pinar Donmez - Thursday July 16th,
	2009, 11:00 AM - Wean Hall 7220
In-Reply-To: <D8CB84AB-2CFD-4C27-B0A8-38C96795A424@cs.cmu.edu>
References: <4A1C0E1D.2090309@cs.cmu.edu>
	<D8CB84AB-2CFD-4C27-B0A8-38C96795A424@cs.cmu.edu>
Message-ID: <DC717269-D1B4-473F-9C1C-5419BCCA3B55@cs.cmu.edu>

UPDATE:  We will have 2 talks at this IR series. Both Jaime and Pinar  
will be presenting their work in preparation for SIGIR next week.

Speaker: Pinar Donmez
Time & Date: see below
Title: On the Local Optimality of LambdaRank

A machine learning approach to rank learning trains a model
to optimize a target evaluation measure with respect to training
data. Currently, existing information retrieval measures
are impossible to optimize directly except for models with a
trivial number of parameters. The IR community thus faces
a major challenge: how to optimize IR measures of interest
directly. In this paper, we present a solution. Specifically,
we show that LambdaRank [1], which smoothly approximates
the gradient of the target measure, can be adapted
to work with three popular IR target evaluation measures
using the same underlying gradient construction. It is likely,
therefore, that this construction is extendable to other
evaluation measures. We empirically show that LambdaRank
finds a locally optimal solution for NDCG, MAP and MRR
with a 99% confidence rate. We also show that the amount
of effective training data varies with IR measure and that
with a sufficiently large training set size, matching the
training optimization measure to the target evaluation measure
yields the best accuracy.

This work was conducted jointly with Krysta Svore and Chris Burges while
interning at MSR Redmond. It will be presented at SIGIR '09.


On Jul 13, 2009, at 5:09 PM, Jonathan Elsas wrote:

> Hello -- Please join us for an IR series talk this Thursday.   
> NOTE the different time & location.
>
> Speaker: Jaime Arguello (LTI, CMU)
> Time & Date: Thursday July 16th, 2009, 11:00 AM
> Place: Wean Hall 7220
>
> Lunch will be provided by Yahoo!
>
> Title: Sources of Evidence for Vertical Selection
>
> Web search providers often include search services for domain- 
> specific subcollections, called verticals, such as news, images,  
> videos, job postings, company summaries, and artist profiles. We  
> address the problem of vertical selection, predicting relevant  
> verticals (if any) for queries issued to a search engine's main web  
> search page. In contrast to prior collection selection tasks,  
> vertical selection is associated with unique resources that can  
> inform the classification decision. We focus on three sources of  
> evidence: (1) the query string, from which features are derived  
> independent of external resources, (2) logs of queries previously  
> issued to the vertical directly by users, and (3) corpora  
> representative of vertical content. These sources of evidence are  
> integrated as features in a classification-based approach. We make  
> use of and compare against prior work in federated search and  
> retrieval effectiveness prediction. Our evaluation focuses on 18  
> different verticals, which differ in terms of semantics, media type,  
> size, and level of query traffic. An in-depth error analysis reveals  
> unique challenges across different verticals and provides insight  
> into vertical selection for future work.
>
> Based on work conducted at Yahoo! Labs Montreal to be presented at  
> SIGIR 2009.
>
>
> Thanks,
>
> Jon, Jaime & Grace
>
>


From jelsas+ at cs.cmu.edu  Thu Jul 16 11:05:03 2009
From: jelsas+ at cs.cmu.edu (Jonathan Elsas)
Date: Thu, 16 Jul 2009 11:05:03 -0400
Subject: [IR Series] - Jaime Arguello & Pinar Donmez - Thursday July 16th,
	2009, 11:00 AM - Wean Hall 7220
In-Reply-To: <DC717269-D1B4-473F-9C1C-5419BCCA3B55@cs.cmu.edu>
References: <4A1C0E1D.2090309@cs.cmu.edu>
	<D8CB84AB-2CFD-4C27-B0A8-38C96795A424@cs.cmu.edu>
	<DC717269-D1B4-473F-9C1C-5419BCCA3B55@cs.cmu.edu>
Message-ID: <A23B8081-D04C-4C84-943B-31792E5BE75E@cs.cmu.edu>

REMINDER: IR Series talk NOW.  WEH 7220

