<html><body><div style="font-family: times new roman, new york, times, serif; font-size: 12pt; color: #000000"><DIV><BR></DIV>

<DIV><SPAN style="FONT-SIZE: medium" data-mce-style="font-size: medium;"><B>Master 2 position in the Multispeech team, LORIA (Nancy, France)</B></SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: medium" data-mce-style="font-size: medium;"><SPAN lang=en-US><I><B>Automatic speech recognition: contextualization of a neural network language model by dynamic adjustment</B></I></SPAN></SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B></B></SPAN> </DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B>Framework of ANR project ContNomina </B></SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">The technologies involved in information retrieval in large audio/video databases are often based on the analysis of large, but closed, corpora, and on machine learning techniques and statistical modeling of written and spoken language. The effectiveness of these approaches is now widely acknowledged, but they nevertheless have major flaws, particularly concerning proper names, which are crucial for interpreting the content.</SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"></SPAN> </DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">In the context of diachronic data (data that change over time), new proper names appear constantly, requiring dynamic updates of the lexicons and language models used by the speech recognition system.</SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US></SPAN></SPAN> </DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US>As a result, the ANR project </SPAN></SPAN><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US><I>ContNomina</I></SPAN></SPAN><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US> (2013-2017) focuses on the problem of proper names in automatic audio processing systems by exploiting the context of the processed documents as efficiently as possible. To do this, the student will address </SPAN></SPAN><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US>the contextualization of the recognition module through the dynamic adjustment of the language model in order to make it more accurate.</SPAN></SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B></B></SPAN></SPAN> </DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B>Subject</B></SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">Current systems for automatic speech recognition are based on statistical approaches. They require three components: an acoustic model, a lexicon and a language model. This internship will focus on the language model. The language model of our recognition system is based on a neural network trained on a large text corpus. The problem is to re-estimate the language model parameters for a new proper name from its context and a small amount of adaptation data. Several directions can be explored: adapting the language model, using a class-based model, or studying the notion of analogy.</SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"></SPAN></SPAN> </DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">Our team has developed a fully automatic speech recognition system that transcribes a radio broadcast from the corresponding audio file. The student will develop a new module that integrates new proper names into the language model.</SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B></B></SPAN></SPAN> </DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B>Required skills</B></SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #333333" data-mce-style="color: #333333;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">Background in statistics and object-oriented programming.</SPAN></SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B></B></SPAN> </DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><B>Location and contacts</B></SPAN></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US>Loria laboratory, </SPAN></SPAN><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US><I>Multi</I></SPAN></SPAN><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US><I>speech team</I></SPAN></SPAN><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN lang=en-US>, Nancy, France</SPAN></SPAN></DIV>

<DIV><SPAN style="COLOR: #0000ff" data-mce-style="color: #0000ff;"><SPAN lang=en-US><SPAN style="TEXT-DECORATION: underline" data-mce-style="text-decoration: underline;"><SPAN id=OBJ_PREFIX_DWT478_com_zimbra_email class=Object><SPAN id=OBJ_PREFIX_DWT480_com_zimbra_email class=Object><A href="mailto:Irina.illina@loria.fr" target=_blank data-mce-href="mailto:Irina.illina@loria.fr">Irina.illina@loria.fr</A></SPAN></SPAN></SPAN></SPAN></SPAN><SPAN lang=en-US> </SPAN><SPAN style="COLOR: #0000ff" data-mce-style="color: #0000ff;"><SPAN lang=en-US><SPAN style="TEXT-DECORATION: underline" data-mce-style="text-decoration: underline;"><SPAN id=OBJ_PREFIX_DWT479_com_zimbra_email class=Object><SPAN id=OBJ_PREFIX_DWT481_com_zimbra_email class=Object><A href="mailto:dominique.fohr@loria.fr" target=_blank data-mce-href="mailto:dominique.fohr@loria.fr">dominique.fohr@loria.fr</A></SPAN></SPAN></SPAN></SPAN></SPAN></DIV>

<DIV><BR></DIV>

<DIV><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;"><SPAN style="FONT-SIZE: small" data-mce-style="font-size: small;">Candidates should email a detailed CV and a copy of their diploma.</SPAN></SPAN></SPAN><BR></DIV>

<DIV><BR></DIV>

<DIV>-- <BR></DIV>

<DIV><SPAN name="x"></SPAN>Associate Professor <BR>Lorraine University<BR>LORIA-INRIA<BR>office C147 <BR>Building C <BR>615 rue du Jardin Botanique<BR>54600 Villers-les-Nancy Cedex<BR>Tel:+ 33 3 54 95 84 90<SPAN name="x"></SPAN><BR></DIV></div></body></html>