Dear all,

Please, could you post it on your lists? Thank you.
Best regards,
Irina Illina


Automatic speech recognition for non-native speakers in a noisy environment

Post-doctoral and engineer positions
Starting date: July-September 2023
Duration: 24 months for the post-doc position and 12 months for the engineer position
Supervisors: Irina Illina, Associate Professor, HDR, Lorraine University, LORIA-INRIA Multispeech Team, illina@loria.fr
Emmanuel Vincent, Senior Research Scientist & Head of Science, INRIA Multispeech Team, emmanuel.vincent@inria.fr, http://members.loria.fr/evincent/
Constraints: the application must meet the requirements of the French Directorate General of Armament (Direction générale de l'armement, DGA).

Context
When a person has their hands busy performing a task such as driving a car or piloting an airplane, voice is a fast and efficient way to interact. In aeronautical communications, the use of English is most often compulsory.
Unfortunately, many pilots are not native English speakers; they speak with an accent that depends on their native language and is shaped by its pronunciation mechanisms. Inside an aircraft cockpit, the non-native voices of the pilots and the surrounding noise are the most difficult challenges to overcome in order to achieve efficient automatic speech recognition (ASR). The problems of non-native speech are numerous: incorrect or approximate pronunciations, errors of agreement in gender and number, use of non-existent words, missing articles, grammatically incorrect sentences, etc. The acoustic environment adds a disturbing component to the speech signal. Much of the success of speech recognition relies on the ability to take different accents and ambient noise into account in the models used by ASR.

Automatic speech recognition has made great progress thanks to the spectacular development of deep learning. In recent years, end-to-end automatic speech recognition, which directly optimizes the probability of the output character sequence given the input acoustic features, has made great progress [Chan et al., 2016; Baevski et al., 2020; Gulati et al., 2020].

Objectives
The recruited person will develop methodologies and tools to obtain high-performance non-native automatic speech recognition in the aeronautical context, and more specifically in a (noisy) aircraft cockpit.
This project will build on an end-to-end automatic speech recognition system [Shi et al., 2021] using wav2vec 2.0 [Baevski et al., 2020], one of the most efficient models in the current state of the art. wav2vec 2.0 enables self-supervised learning of speech representations from raw audio data (without transcriptions).
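For interested candidates, here is a minimal sketch of what end-to-end inference with a pre-trained wav2vec 2.0 model looks like, assuming the HuggingFace transformers and torchaudio libraries and the public English checkpoint "facebook/wav2vec2-base-960h"; the audio file name is hypothetical, and the project itself may use a different model or toolkit:

import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Public English checkpoint, used here only for illustration.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Hypothetical cockpit recording; wav2vec 2.0 expects 16 kHz mono audio.
waveform, sample_rate = torchaudio.load("cockpit_utterance.wav")
waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

inputs = processor(waveform.squeeze(0), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits  # (batch, frames, vocab)

# Greedy CTC decoding: pick the most likely token per frame, then
# collapse repeats and remove blanks (handled inside batch_decode).
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])

In the project, such a pre-trained model would typically be adapted to accented and noisy cockpit speech rather than used frozen as above.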
How to apply: Interested candidates are encouraged to contact Irina Illina (illina@loria.fr) with the required documents (CV, transcripts, motivation letter, and recommendation letters).

Requirements & skills:
- Ph.D. degree in speech/audio processing, computer vision, machine learning, or a related field,
- ability to work independently as well as in a team,
- solid programming skills (Python, PyTorch) and deep learning knowledge,
- good level of written and spoken English.

References
[Baevski et al., 2020] A. Baevski, H. Zhou, A. Mohamed, and M. Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. 34th Conference on Neural Information Processing Systems (NeurIPS), 2020.
[Chan et al., 2016] W. Chan, N. Jaitly, Q. Le, and O. Vinyals. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4960-4964, 2016.
[Chorowski et al., 2017] J. Chorowski and N. Jaitly. Towards better decoding and language model integration in sequence to sequence models. Interspeech, 2017.
[Houlsby et al., 2019] N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. De Laroussilhe, A. Gesmundo, M. Attariyan, and S. Gelly. Parameter-efficient transfer learning for NLP. International Conference on Machine Learning (ICML), PMLR, pp. 2790-2799, 2019.
[Gulati et al., 2020] A. Gulati, J. Qin, C.-C. Chiu, N. Parmar, Y. Zhang, J. Yu, W. Han, S. Wang, Z. Zhang, Y. Wu, and R. Pang. Conformer: Convolution-augmented transformer for speech recognition. Interspeech, 2020.
[Shi et al., 2021] X. Shi, F. Yu, Y. Lu, Y. Liang, Q. Feng, D. Wang, Y. Qian, and L. Xie. The Accented English Speech Recognition Challenge 2020: open datasets, tracks, baselines, results and methods. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6918-6922, 2021.

--
Best regards,
Irina Illina

Associate Professor, HDR
Lorraine University
LORIA-INRIA
Multispeech Team
Office C147
Building C
615 rue du Jardin Botanique
54600 Villers-les-Nancy Cedex
Tel: +33 3 54 95 84 90