<div dir="ltr"><span id="gmail-docs-internal-guid-1272a147-b4da-c5f0-8c33-799104553151"><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">                       Call for Shared Task Participation</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">                             SemEval 2017 Task 1</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">                       Semantic Textual Similarity (STS)</span></p><br><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Semantic Textual Similarity (STS) measures the degree of equivalence in the underlying semantics of paired snippets of text. While making such an assessment is trivial for humans, constructing algorithms and computational models that mimic human level performance represents a difficult and deep natural language understanding problem.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">STS evaluations have seen significant progress in methods targeted at a specific language such as English or Spanish. For the 2017 shared task, the emphasis is on building multilingual textual similarity models that are capable of assessing both same language and cross-lingual sentence pairs. 
The primary evaluation for the shared task assesses methods over a combination of same language pairs in Arabic, English and Spanish as well as cross-lingual Arabic-English and Spanish-English pairs. </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">To encourage the development of methods that can be readily applied or adapted to new languages, we also provide an optional evaluation track with a surprise language that will only be announced at the beginning of the evaluation period. This optional track provides an opportunity to explore STS models capable of zero-shot learning via mechanisms such as multilingual embeddings.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">In addition to the multilingual primary evaluation and the surprise language track, a number of language and language pair specific tracks are also provided. 
We hope that these tracks will provide participants with particular linguistic expertise a chance to excel as well as provide an opportunity to compare performance differences between multilingual and language specific methods.</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Task Definition</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">===============</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Given two sentences, participants are asked to produce a continuous valued similarity score on a scale from 0 to 5, with 0 indicating that the semantics of the sentences are completely independent and 5 signifying semantic equivalence. 
Performance is assessed by computing the Pearson correlation between machine assigned semantic similarity scores and human judgments.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Following the emphasis on building multilingual and cross-lingual models, the </span><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">2017 shared task is organized into the following seven multilingual and cross-lingual tracks:</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 0 - Primary:         Combined evaluation of all announced monolingual </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">                             and cross-lingual language pairings explored by </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">                             the 2017 task: ar-ar, ar-en, en-en, es-en, and  </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">                             es-es. 
The primary track will not include the </span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">                             surprise language evaluation data.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 1 - Arabic-Arabic:   Evaluation only on ar-ar pairs.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 2 - Arabic-English:  Evaluation only on ar-en pairs.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 3 - Spanish-Spanish: Evaluation only on es-es pairs.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 4 - Spanish-English: Evaluation only on es-en pairs.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 5 - English-English: Evaluation only on en-en pairs.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier 
new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">  Track 6 - Surprise language track (announced during the evaluation period)</span><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap"><br class="gmail-kix-line-break"><br class="gmail-kix-line-break"></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">For all language pairings, participants will be provided with two sentence length snippets of text, s1 and s2. The two snippets will then be used to compute and return a continuous valued semantic similarity score.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">The cross-lingual language pairings (ar-en, es-en) only differ from the monolingual language pairings (ar-ar, en-en, es-es) in that the two text snippets in each pair are written in different languages. The inclusion of cross-lingual STS pairs follows a successful pilot in 2016 that paired English and Spanish sentences. Depending on the approach used to compute the similarity scores, adapting the underlying model to handle the cross-lingual pairs may present different degrees of difficulty. </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(0,0,0);background-color:transparent;vertical-align:baseline;white-space:pre-wrap">Participants are encouraged to review the successful approaches to monolingual and cross-lingual STS from prior years of the STS shared task (Agirre et al. 
2016; Agirre et al. 2015; Agirre et al. 2014; Agirre et al. 2013; Agirre et al. 2012).</span></p><br><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">2017 Data</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">=========</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">This year's shared task includes one evaluation set for each of the seven tracks described above. Each evaluation set consists of between 200 and 250 sentence pairs. Within each evaluation set, we will attempt to approximately balance the distribution of STS scores.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">For training data, participants are encouraged to make use of all existing English, Spanish and cross-lingual English-Spanish data sets from prior STS evaluations. This includes all previously released trial, training and evaluation data. </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Since this is the first year that we will include Arabic as part of an STS evaluation, we will release training data for both monolingual Arabic and cross-lingual Arabic-English. Each training set will consist of approximately 14,000 pairs sourced from prior English STS evaluations. 
</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">As with the 2016 evaluation, participants are allowed and very much encouraged to train purely unsupervised models and model components on arbitrary data (e.g., unsupervised word embeddings).</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Participation</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">=============</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">[Register]</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">To register, please complete the following form:</span><a href="https://docs.google.com/forms/d/1HTRtP7B94gqdW5YuRfRh5pEBhukuRIh5hXR1nOEib90/viewform?usp=send_form" style="text-decoration:none"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(34,34,34);vertical-align:baseline;white-space:pre-wrap"> </span><span style="font-size:13.3333px;font-family:"courier new";text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">https://docs.google.com/forms/d/e/1FAIpQLScXnt7qeioCPyxu6dv9wrSDYaF04bRgVBFCUbahxsAG6F43Sg/viewform</span></a></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">[Website and trial data]</span></p><br><p dir="ltr" 
style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">For more details, including trial data, see the STS SemEval 2017 Task 1 webpage at:</span><a href="http://alt.qcri.org/semeval2017/task1/" style="text-decoration:none"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(34,34,34);vertical-align:baseline;white-space:pre-wrap"> http://alt.qcri.org/semeval2017/task1/</span></a></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">[Mailing List]</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Join the mailing list for task updates and discussion at:</span><a href="http://groups.google.com/group/STS-semeval" style="text-decoration:none"><span style="font-size:13.3333px;font-family:"courier new";color:rgb(34,34,34);vertical-align:baseline;white-space:pre-wrap"> http://groups.google.com/group/STS-semeval</span></a><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">.</span></p><br><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Important dates</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">===============</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Trial data ready:             Wed 21 Sep 2016</span></p><p 
dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Training data ready:          Mon 24 Oct 2016</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Evaluation start:             Mon 09 Jan 2017</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Evaluation end:               Mon 30 Jan 2017</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Results posted:               Mon 06 Feb 2017</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Paper submissions due:        Mon 27 Feb 2017</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Author notifications:         Mon 03 Apr 2017</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Camera ready submissions due: Mon 17 Apr 2017</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  SemEval workshop:             Summer 2017</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Organizers (alpha. order)</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">==========</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">  Eneko Agirre, Daniel Cer, Mona Diab, Lucia Specia</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">References</span></p><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">==========</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, Janyce Wiebe. SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation. Proceedings of SemEval 2016.</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Inigo Lopez-Gazpio, Montse Maritxalar, Rada Mihalcea, German Rigau, Larraitz Uria and Janyce Wiebe. SemEval-2015 Task 2: Semantic Textual Similarity, English, Spanish and Pilot on Interpretability. 
Proceedings of SemEval 2015.</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Rada Mihalcea, German Rigau and Janyce Wiebe. SemEval-2014 Task 10: Multilingual Semantic Textual Similarity. Proceedings of SemEval 2014.</span></p><br><p dir="ltr" style="line-height:1.656;margin-top:0pt;margin-bottom:0pt"><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Eneko Agirre, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre and WeiWei Guo. *SEM 2013 shared task: Semantic Textual Similarity. Proceedings of *SEM 2013.</span></p><br><span style="font-size:13.3333px;font-family:"courier new";vertical-align:baseline;white-space:pre-wrap">Eneko Agirre, Daniel Cer, Mona Diab and Aitor Gonzalez-Agirre. SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity. Proceedings of SemEval 2012.</span></span><br></div>