[IR Discussion Series] Oct 2nd 2pm in Wean 7220.
Grace Hui Yang
huiyang at cs.cmu.edu
Mon Sep 29 20:44:07 EDT 2008
Dear all,
>
> Le Zhao will give our first IR talk of this semester. A reception
> will be provided by Yahoo!. Here is the talk information:
>
> Date: Thursday 2nd Oct 2008
> Time: 2pm
> Place: Wean Hall 7220
>
> Speaker: Le Zhao
> Title: A Generative Retrieval Model for Structured Documents
>
> Abstract
> Structured documents contain elements defined by the author(s) and
> annotations assigned by other people or processes. Structured documents
> pose challenges for probabilistic retrieval models when there are
> mismatches between the structured query and the actual structure in a
> relevant document or erroneous structure introduced by an annotator. This
> paper makes three contributions. First, a new generative retrieval model
> is proposed to deal with the mismatch problem. This new model extends
> the basic keyword language model by treating structure as a hidden variable
> during the generation process. Second, variations of the model are
> compared. Third, term-level and structure-level smoothing strategies are
> studied. Evaluation was conducted with INEX XML retrieval and
> question-answering retrieval tasks. Experimental results indicate that
> the optimal structured retrieval model is task dependent, two-level
> Dirichlet smoothing significantly outperforms two-level Jelinek-Mercer
> smoothing, and with accurate structured queries, the proposed structured
> retrieval model outperforms keyword retrieval significantly, on both QA
> and INEX datasets.
>
> Based on work accepted at CIKM'08.
>
>
> See you then!
>
> Grace, Jaime, Jon