Connectionists: CFP SYMPTEMIST (BioCreative VIII @ AMIA 2023): Named entity recognition & linking of symptoms (incl. multilingual dataset)

Martin Krallinger krallinger.martin at gmail.com
Fri Aug 18 11:29:52 EDT 2023


(Apologies for cross-posting)

CFP: SYMPTEMIST Shared Task (BioCreative VIII run with AMIA 2023)

Named entity recognition and linking of symptoms, signs & findings (incl.
multilingual dataset)

https://temu.bsc.es/SYMPTEMIST/



The SYMPTEMIST track focuses on the automatic detection of mentions of
clinical symptoms in clinical case reports in Spanish (NER) and on mapping
those mentions to concept identifiers (entity linking). A multilingual
version of the dataset will also be released, including versions in
English, French, Italian, Dutch, Portuguese, Romanian and Swedish.

Key information:

   - Web: https://temu.bsc.es/symptemist
   - Data: https://zenodo.org/record/8223654
   - Annotation guidelines: https://zenodo.org/record/8246440
   - BioCreative web: https://biocreative.bioinformatics.udel.edu
   - Registration form (Track 2 - SYMPTEMIST):
     https://docs.google.com/forms/d/e/1FAIpQLScoSNulOoxRju3c8v9Q-CSv-w5jJcXu93G7A7v343AWfonpPw/viewform


Motivation
Systems able to detect and normalize clinical symptom mentions in medical
texts are crucial for almost any healthcare data mining, AI, medical
analytics or predictive application. As opposed to other clinical
information types, such as diagnoses (diseases/procedures), lab test
results or even medications, clinical symptoms can only be recovered
directly from written clinical narratives. Due to the high complexity,
variability and difficulty of generating annotated corpora for clinical
symptoms, only a few large manually annotated data collections have been
constructed so far, with certain underlying limitations in terms of a)
entity linking / normalization of the symptom mentions to controlled
vocabularies, b) a lack of attempts to promote the development of
multilingual solutions and c) a lack of detailed annotation criteria and
guidelines. To address these issues, we have proposed the SYMPTEMIST track
for the upcoming BioCreative VIII initiative, which will be run in the
context of the prestigious AMIA 2023 conference, which received over 1400
submissions this year.

Automatic detection of symptom mentions is key for a range of clinical
use cases and real-world applications, such as:

   - Predictive modeling of diseases
   - Differential diagnosis of complex diseases
   - Rare disease characterization & analysis
   - Selection of appropriate treatment & therapy
   - Study of disease-symptom associations
   - Early detection of disease outbreaks & epidemiological surveillance
   - Extraction of phenotypes
   - Drug repurposing & off-label indications


The SYMPTEMIST organizers will also release multilingual resources to
foster the development of multilingual tools and generate systems not only
for Spanish but also for content in English and Romance languages (French,
Portuguese, Italian, Romanian and Catalan) as well as versions in Dutch,
Swedish and Czech.

Inspired by previous initiatives (e.g. n2c2, CLEF or TREC) and shared tasks
(CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the SYMPTEMIST
shared task as part of the BioCreative 2023 evaluation initiative, with the
following three sub-tracks:

   - SYMPTEMIST-entities: automatic detection of mentions of symptoms.
   - SYMPTEMIST-linking: finding mentions of symptoms and normalizing them to
     their SNOMED CT concept identifiers.
   - SYMPTEMIST-multilingual: automatic detection of mentions of symptoms in
     versions of the corpus generated in English, French, Italian, Portuguese,
     Romanian, Catalan, Dutch, Swedish and Czech.
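
To make the difference between the first two sub-tracks concrete, here is a
minimal Python sketch of what "detecting" and "linking" a mention could look
like. Everything in it is an assumption made for illustration: the toy
lexicon, the placeholder identifiers (they are not real SNOMED CT codes), the
example sentence and the output layout. It is not the official SYMPTEMIST
data format and not a baseline system.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    start: int  # character offset of the first character of the mention
    end: int    # character offset one past the last character
    text: str   # surface form as it appears in the document
    code: str   # concept identifier attached by the linking step


# Hypothetical toy lexicon: surface form -> placeholder identifier.
# These identifiers are NOT real SNOMED CT codes.
LEXICON = {
    "fiebre": "PLACEHOLDER-001",
    "dolor abdominal": "PLACEHOLDER-002",
    "tos": "PLACEHOLDER-003",
}


def find_mentions(text: str) -> List[Mention]:
    """Find lexicon entries in `text` (detection) and attach their code (linking)."""
    mentions = []
    for term, code in LEXICON.items():
        # Whole-word, case-insensitive match of the lexicon entry.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            mentions.append(Mention(m.start(), m.end(), m.group(0), code))
    # NamedTuples sort field by field, so this orders mentions by start offset.
    return sorted(mentions)


report = "Paciente con fiebre y dolor abdominal de tres días de evolución."
for mention in find_mentions(report):
    print(mention)
```

In this sketch, the entities sub-track would correspond to producing only the
offsets and text of each mention, while the linking sub-track would also
require the identifier. A dictionary lookup like this is only a stand-in for
whatever model a participating system would actually use.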



Tentative schedule

   - Annotation Guidelines: August 8th 2023
   - Train Set Subtask 1 (NER): August 8th 2023
   - Train Set Subtask 2 (Linking): September 10th 2023
   - Train Set Subtask 3 (Multilingual): September 10th 2023
   - SympTEMIST Test Set: September 30th 2023
   - Participants Test Predictions Deadline: October 5th 2023
   - Participants Evaluation Results Release: October 10th 2023
   - Submission of Participant Papers Deadline: October 22nd 2023
   - Notification of Acceptance of Participant Papers: October 30th 2023
   - Submission of Camera-ready Participant Papers Deadline: November 1st 2023
   - BioCreative VIII workshop @ AMIA 2023: November 11-15, 2023, in New
     Orleans, LA.



BioCreative proceedings and AMIA workshop


Teams participating in SYMPTEMIST will be invited to contribute a system
description paper to the BioCreative 2023 Working Notes proceedings and to
give a flash presentation of their approach at the BioCreative 2023 session.
The BioCreative VIII workshop will run with AMIA 2023, November 11-15, 2023,
in New Orleans, LA. See:
https://amia.org/education-events/amia-2023-annual-symposium


Workshop Proceedings and Special Issue:

The BioCreative VIII Proceedings will host all the submissions from
participating teams and will be freely available by the time of the
workshop. In addition, we are happy to announce that the journal Database
will host the BioCreative VIII special issue for work that has passed its
peer-review process. Invitations to submit will be sent after the workshop.



All BioCreative VIII tracks

Track 1: BioRED (Biomedical Relation Extraction Dataset)

Track 2: SYMPTEMIST (Symptom TExt Mining Shared Task)

Track 3: Genetic Phenotype Extraction and Normalization from Dysmorphology
Physical Examination Entries

Track 4: Clinical Annotation Tool Track


Main Organizers

   - Martin Krallinger, Barcelona Supercomputing Center, Spain
   - Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
   - Luis Gascó, Barcelona Supercomputing Center, Spain
   - Salvador Lima, Barcelona Supercomputing Center, Spain
   - Jan Rodriguez, Barcelona Supercomputing Center, Spain




=======================================
Martin Krallinger, Dr.
Head of NLP for Biomedical Information Analysis Unit
Barcelona Supercomputing Center (BSC-CNS)
https://www.linkedin.com/in/martin-krallinger-85495920/
 =======================================