<div dir="ltr"><div style="margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:12pt;line-height:inherit;font-family:Calibri,Arial,Helvetica,sans-serif;vertical-align:baseline;color:black"><a href="https://sigtyp.github.io/st2020.html">https://sigtyp.github.io/st2020.html</a><br></div><div style="margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:15px;line-height:inherit;vertical-align:baseline;color:rgb(32,31,30)"><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><div style="margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:12pt;line-height:inherit;font-family:Calibri,Arial,Helvetica,sans-serif;vertical-align:baseline;color:black"><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">The SIGTYP workshop, co-located with the EMNLP 2020 conference in Punta Cana (Dominican Republic), is offering a shared task on the prediction of typological features. The shared task encompasses nearly 2,000 languages, with typological features taken from the World Atlas of Language Structures (WALS; Dryer and Haspelmath 2013).<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">To participate in the shared task, you will build a system that can predict typological properties of languages, given a handful of observed features. Training examples and development examples have already been provided (see link below). 
All submitted systems will be compared on a held-out test set.<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Moreover, you will be invited to describe your system in a system paper for the SIGTYP workshop proceedings. The task organisers will write an overview paper that describes the task and summarises the different approaches taken, and their results.<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b>Important Links</b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b><br></b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Download Train and Dev data: <a href="https://github.com/sigtyp/ST2020/tree/master/data">https://github.com/sigtyp/ST2020/tree/master/data</a><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Register for the Task! 
<a href="https://sigtyp.github.io/st2020-reg.html">https://sigtyp.github.io/st2020-reg.html</a><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b>Important Dates</b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Training data Release: 26 March 2020<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Test data Release: 20 June 2020<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Submissions Due: 1 July 2020<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- Writeup Due: 1 August 2020<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b>Description</b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">The typological features in WALS represent one approach to categorizing the languages of the world according to their linguistic properties, e.g. in terms of their syntax, morphology, and phonology. One example of such a typological feature is basic word order. 
For instance, English is best described as a subject-verb-object (SVO) language whereas Japanese is best described as a subject-object-verb (SOV) language.<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">One major issue with WALS, however, is that it is both sparse and skewed in terms of language-feature annotations. It is sparse in the sense that most languages only have annotations for a handful of features, and skewed in the sense that a few features have much wider coverage than others. Fortunately, features often correlate with one another, which makes it possible to predict missing features from observed ones. For instance, languages where the verb precedes the object tend to have prepositions, e.g. Norwegian, whereas languages where the object precedes the verb tend to have postpositions, e.g. Japanese.<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Although there is a significant amount of previous work dealing with versions of this task (<i>Daumé III and Campbell 2007; Bjerva et al. 2019; Ponti et al. 2019</i>), important design choices have frequently been ignored. 
Some papers controlled for genetic relationships between training and evaluation languages, but little to no work has controlled for geographical proximity.<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">The shared task will consist of two settings (subtasks):<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><ol><li><i>Constrained</i>: only the provided training data may be used.</li><li><i>Unconstrained</i>: the training data may be extended with any external source of information (e.g. pre-trained embeddings or raw texts).</li></ol></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b>Organizers</b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Johannes Bjerva<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Isabelle Augenstein<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Aditi Chaudhary<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Ekaterina Vylomova</div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Edoardo M. 
Ponti<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Giuseppe Celano<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Liz Salesky<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Ryan Cotterell<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Michael Regan<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">Sabrina J. Mielke<br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><br></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b>Contact</b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit"><b><br></b></div><div style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- email: sigtyp AT gmail DOT com<br></div><span style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:inherit">- website: <a href="https://sigtyp.github.io/st2020.html" target="_blank" rel="noopener noreferrer" id="gmail-LPNoLP782020" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline">https://sigtyp.github.io/st2020.html</a></span></div></div></div></div>