<div dir="ltr"><div class="gmail_quote"><div dir="ltr"><span id="gmail-m_7854121387515265607gmail-docs-internal-guid-ccc74f27-7fff-3bb4-8773-cb3363d09ae1"><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><span style="font-size:16.5pt;font-family:Roboto,sans-serif;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">MT4All Unsupervised MT Shared Task </span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><span style="font-size:16.5pt;font-family:Roboto,sans-serif;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">at SIGUL 2022 </span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><span style="font-size:16.5pt;font-family:Roboto,sans-serif;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">(24-25 June, Marseille)</span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><span id="gmail-m_7854121387515265607gmail-docs-internal-guid-a5c82ca1-7fff-f05b-3193-6159bc855ecc"><br></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(32,33,36);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">FIRST CALL FOR PARTICIPATION</span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><br></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We invite you to participate in the first edition of the MT4All Unsupervised 
Machine Translation Shared Task, hosted by the ELRA/ISCA Special Interest Group on Under-Resourced Languages Workshop (SIGUL 2022). Papers on the task will be published as part of the Proceedings.</span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><br></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Invitation to Participate – </span><a href="https://docs.google.com/forms/d/1tllq0jWhcKwMHgPtRCA4aLkgLDuN8JlZG7Vp4TqcNQ0" target="_blank" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">Expression of Interest</span></a><span style="font-size:11pt;font-family:Arial;color:rgb(32,33,36);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">.</span></p><p dir="ltr" style="line-height:1.38;text-align:center;margin-top:0pt;margin-bottom:0pt"><br></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">TASK DESCRIPTION</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:12pt;font-family:Arial;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For this Shared task we will 
leverage the resources generated by the recently completed CEF project MT4All, with the aim of exploring unsupervised MT techniques based only on monolingual corpora. In the course of the project, the following novel datasets were created: 18 monolingual corpora for specific languages and domains, 12 bilingual dictionaries and translation models, and 10 annotated datasets for evaluation. Most of them will be used in the present Shared task.</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The task is divided into three separate subtasks, each one covering a specific domain and set of languages.</span></p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Subtask 1: Unsupervised translation from English to Ukrainian, Georgian, and Kazakh in the Legal domain.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Subtask 2: Unsupervised translation from English to Finnish, Latvian, and Norwegian 
Bokmål in the Financial domain.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Subtask 3: Unsupervised translation from English to German, Norwegian Bokmål, and Spanish in the Customer support domain.</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">In this Shared task, we are interested in how the in-domain monolingual data that we will provide can be leveraged by creating a purely unsupervised machine translation model, either by </span></p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">training an unsupervised model from scratch, or</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">adding value to an existing pre-trained model, on the condition that</span></p></li><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:circle;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">it has been trained on monolingual datasets</span></p></li><li dir="ltr" style="list-style-type:circle;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">it has not been fine-tuned with any parallel data</span></p></li><li dir="ltr" style="list-style-type:circle;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">it is publicly accessible from the HuggingFace repository</span></p></li></ul></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Although fine-tuning the models on existing parallel data is not allowed, participants may use the bilingual resources that MT4All created with purely unsupervised techniques.</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The only additional monolingual data allowed are the monolingual OSCAR datasets.</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">IMPORTANT DATES</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Training data release</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span 
style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">10.03.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Test sets release</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">25.04.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Results deadline</span><span 
style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">02.05.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Paper submission deadline</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">16.05.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Acceptance notice</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">30.05.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Camera ready</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">13.06.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" 
style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" role="presentation" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">Workshop starts</span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline"> </span><span style="font-size:11pt;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline">24.06.2022</span></p></li></ul><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:12pt;font-family:Arial;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Please visit the website for more details: </span><a href="https://sigul-2022.ilc.cnr.it/mt4all-shared-task/" target="_blank" style="text-decoration-line:none"><span style="font-size:11pt;font-family:Arial;font-variant-numeric:normal;font-variant-east-asian:normal;text-decoration-line:underline;vertical-align:baseline;white-space:pre-wrap">https://sigul-2022.ilc.cnr.it/mt4all-shared-task/</span></a></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"> </p><p dir="ltr" 
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">If you have any comments and/or questions, do not hesitate to contact ksenia.kharitonova at <a href="http://bsc.es/" target="_blank">bsc.es</a>.</span></p></span></div></div></div>