Reminder: [IR Discussion Series] Today 2pm in Wean 7220.

Grace Hui Yang huiyang at cs.cmu.edu
Thu Oct 2 11:59:29 EDT 2008


Dear all,
>>
>>    Le Zhao will give our first IR talk of the semester. A reception
>> will be provided by Yahoo!.  Here is the talk information:
>>
>>    Date: Thursday 2nd Oct 2008
>>    Time: 2pm
>>    Place: Wean Hall 7220
>>
>>    Speaker: Le Zhao
>>    Title: A Generative Retrieval Model for Structured Documents
>>
>> Abstract
>> Structured documents contain elements defined by the author(s) and
>> annotations assigned by other people or processes.  Structured documents
>> pose challenges for probabilistic retrieval models when there are
>> mismatches between the structured query and the actual structure in a
>> relevant document or erroneous structure introduced by an annotator.
>> This paper makes three contributions.  First, a new generative retrieval
>> model is proposed to deal with the mismatch problem.  This new model
>> extends the basic keyword language model by treating structure as a
>> hidden variable during the generation process.  Second, variations of
>> the model are compared.  Third, term-level and structure-level smoothing
>> strategies are studied.  Evaluation was conducted with INEX XML retrieval
>> and question-answering retrieval tasks.  Experimental results indicate
>> that the optimal structured retrieval model is task dependent, that
>> two-level Dirichlet smoothing significantly outperforms two-level
>> Jelinek-Mercer smoothing, and that, with accurate structured queries, the
>> proposed structured retrieval model significantly outperforms keyword
>> retrieval on both the QA and INEX datasets.
>>
>> Based on work accepted at CIKM'08.
>>
>>
>> See you then!
>>
>> Grace, Jaime, Jon


