Connectionists: Research engineer or post-doc position in Natural Language Processing: Introduction of semantic information in a speech recognition system

Irina Illina irina.illina at loria.fr
Sun Feb 24 16:10:47 EST 2019


Research engineer or post-doc position in Natural Language Processing: 
Introduction of semantic information in a speech recognition system 


Supervisors: Irina Illina (MdC), Dominique Fohr (CR CNRS)

Team: Multispeech, LORIA-INRIA (https://team.inria.fr/multispeech/) 

Contact: illina at loria.fr, dominique.fohr at loria.fr 

Duration: 12-15 months 

Deadline to apply: March 15th, 2019

Required skills: Strong background in mathematics, machine learning (DNN),
statistics, and natural language processing, plus programming skills (Perl,
Python).

Candidates with either of the following profiles are welcome:

· Strong background in signal processing 
or 
· Strong experience with natural language processing 

Excellent English writing and speaking skills are required in any case. 

Candidates should email a detailed CV and a copy of their diploma to the
contacts listed above.

LORIA is the French acronym for the “Lorraine Research Laboratory in Computer
Science and its Applications” and is a joint research unit (UMR 7503) of
CNRS, the University of Lorraine and INRIA. This unit was officially created in
1997. LORIA’s missions mainly deal with fundamental and applied research in
computer science.

MULTISPEECH is a joint research team between the University of Lorraine, INRIA,
and CNRS. Its research focuses on speech processing, with particular emphasis
on multisource (source separation, robust speech recognition), multilingual
(computer-assisted language learning), and multimodal (audiovisual
synthesis) aspects.

Context and objectives 

Under noisy conditions, audio acquisition is one of the toughest challenges for
automatic speech recognition (ASR). Much of the success of ASR relies on the
ability to attenuate ambient noise in the signal and to take it into account in
the acoustic model used by the ASR system. Our DNN (Deep Neural Network)
denoising system and our approach to exploiting uncertainties have shown their
combined effectiveness on noisy speech.

The ASR stage will be supplemented by a semantic analysis. Predictive
representations using continuous vectors have been shown to capture the
semantic characteristics of words and their context, and to outperform
representations based on word counts. Semantic analysis will be performed by
combining predictive representations using continuous vectors with the
uncertainty on denoising. This combination will be done by the rescoring
component. All our models will be based on DNNs.

The performance of the various modules will be evaluated on artificially noisy
speech signals and on real noisy data. At the end of the project, a
demonstrator integrating all the modules will be set up.

Main activities 

• study and implement a noisy speech enhancement module and an
uncertainty propagation module;
• design a semantic analysis module;
• design a module that takes into account the semantic and uncertainty
information.

References 

[Nathwani et al., 2018] Nathwani, K., Vincent, E., and Illina, I. DNN 
uncertainty propagation using GMM-derived uncertainty features for noise robust 
ASR, IEEE Signal Processing Letters, 2018. 

[Nathwani et al., 2017] Nathwani, K., Vincent, E., and Illina, I. Consistent DNN 
uncertainty training and decoding for robust ASR, in Proc. IEEE Automatic 
Speech Recognition and Understanding Workshop, 2017. 

[Nugraha et al., 2016] Nugraha, A., Liutkus, A., and Vincent, E. Multichannel
audio source separation with deep neural networks, IEEE/ACM Transactions on
Audio, Speech, and Language Processing, 2016.

[Sheikh, 2016] Sheikh, I. Exploitation du contexte sémantique pour améliorer la
reconnaissance des noms propres dans les documents audio diachroniques, Thèse
de doctorat en Informatique, Université de Lorraine, 2016.

[Sheikh et al., 2016] Sheikh, I., Illina, I., Fohr, D., and Linares, G. Learning
word importance with the neural bag-of-words model, in Proc. ACL Representation
Learning for NLP (Repl4NLP) Workshop, Aug 2016.

[Mikolov et al., 2013a] Mikolov, T., Chen, K., Corrado, G., and Dean, J.
Efficient estimation of word representations in vector space, CoRR, vol. 
abs/1301.3781, 2013. 


Associate Professor 
Lorraine University 
LORIA-INRIA 
office C147 
Building C 
615 rue du Jardin Botanique 
54600 Villers-les-Nancy Cedex 
Tel: +33 3 54 95 84 90