Connectionists: Final CFP: Normalization of Genetic Phenotypes from Dysmorphology Physical Examinations @ BioCreative VIII - track 3

Davy Weissenbacher davy.weissenbacher at gmail.com
Thu Sep 7 14:05:07 EDT 2023


*Task:* We call for automated systems to extract and normalize the findings
of dysmorphology physical examinations. The dataset consists of 3136
de-identified observations with dysmorphic findings manually annotated and
normalized with their corresponding Human Phenotype Ontology
<https://hpo.jax.org/app/> (HPO) terms.


*Motivation:* Dysmorphology physical examinations catalog minor
morphological differences of patients’ bodies and may also identify general
medical signs such as neurologic dysfunction. These findings enable
correlations of patients with known rare genetic diseases and allow
researchers to delineate undescribed genetic conditions. These medical
findings are nearly always captured as unstructured free text within the
electronic health record, making them unavailable for downstream
computational analysis. Advanced natural language processing methods are
therefore required to retrieve the information from the records.



*Challenge:* Both extraction and normalization are challenging. The
extraction is challenging due to the descriptive style of the examinations
which, for conciseness, report findings with disjoint and overlapping
mentions. The normalization is challenging due to the large scale of the
HPO, which requires a normalizer to learn the task without supervision,
since our training set does not provide examples of all terms in the HPO.
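
As an illustration of what "disjoint and overlapping mentions" can look
like, here is a minimal Python sketch. It is not the track's annotation
format: the example sentence, the Mention class, and the helper names are
all hypothetical, and the HPO identifiers are shown for illustration only
and should be checked against the current HPO release.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass(frozen=True)
class Mention:
    """Hypothetical container: one finding made of one or more text spans."""
    spans: Tuple[Span, ...]
    hpo_id: str  # illustrative identifier, to be verified against the HPO

    def text(self, doc: str) -> str:
        # Join the pieces of a (possibly disjoint) mention with a space.
        return " ".join(doc[s:e] for s, e in self.spans)

    def is_disjoint(self) -> bool:
        return len(self.spans) > 1


def span_of(doc: str, phrase: str) -> Span:
    # Locate the first occurrence of phrase; raises ValueError if absent.
    start = doc.index(phrase)
    return (start, start + len(phrase))


def overlaps(a: Mention, b: Mention) -> bool:
    # Two mentions overlap if any pair of their spans shares a character.
    return any(s1 < e2 and s2 < e1
               for s1, e1 in a.spans
               for s2, e2 in b.spans)


# Invented sentence in the concise style of an examination note.
doc = "Ears are low-set and posteriorly rotated."

ears = span_of(doc, "Ears")
low_set = Mention((ears, span_of(doc, "low-set")), "HP:0000369")
rotated = Mention((ears, span_of(doc, "posteriorly rotated")), "HP:0000358")

for m in (low_set, rotated):
    print(m.hpo_id, "|", m.text(doc), "| disjoint:", m.is_disjoint())
print("overlapping:", overlaps(low_set, rotated))
```

Both findings reuse the span "Ears", so the two mentions overlap, and
each one skips over intervening words, so each is disjoint. A tagger
that assumes one contiguous, non-overlapping span per entity cannot
represent either finding.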

See
https://biocreative.bioinformatics.udel.edu/tasks/biocreative-viii/track-3/ for
details. In short:

   - 3136 de-identified observations with dysmorphic and normal findings
   manually annotated and normalized with their corresponding Human
   Phenotype Ontology <https://hpo.jax.org/app/> terms


   - Baseline systems available (e.g. doc2HPO
   <https://doi.org/10.1093/nar/gkz386>, NeuralCR
   <https://doi.org/10.2196/12596>, PhenoTagger
   <https://doi.org/10.1093/bioinformatics/btab019>, PhenoBERT
   <https://doi.org/10.1109/TCBB.2022.3170301>, and txt2HPO
   <https://github.com/GeneDx/txt2hpo>)


   - Codalab opened at https://codalab.lisn.upsaclay.fr/competitions/11351


   - Evaluation period: Sept. 15, 9:00 UTC - Sept. 18, 23:59 UTC



[Apologies for cross-posting]

Best regards,
Davy