<div dir="ltr"><div><div id="gmail-:6ho" class="gmail-a3s gmail-aiL"><div dir="ltr"><div><p class="MsoNormal" align="center" style="margin-right:0in;margin-bottom:12pt;margin-left:0in;text-align:center">
<span style="font-size:26pt;font-family:"Arial",sans-serif;color:black"><span>BioCreative</span> IX Challenge and Workshop CFP</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" align="center" style="text-align:center"><span style="font-size:15pt;font-family:"Arial",sans-serif;color:black">Large Language Models for Clinical and Biomedical NLP at IJCAI</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:6pt;margin-left:0in">
<span style="font-size:16pt;font-family:"Arial",sans-serif;color:black">Where, When:</span><b><span style="font-size:18pt"><u></u><u></u></span></b></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">The
</span><span style="font-size:12pt"><a href="https://www.ncbi.nlm.nih.gov/research/bionlp/biocreative9" target="_blank"><span style="font-size:11pt;font-family:"Arial",sans-serif;color:rgb(17,85,204)"><span>BioCreative</span> IX workshop</span></a></span><span style="font-family:"Arial",sans-serif;color:black">
will run with </span><span style="font-size:12pt"><a href="https://2025.ijcai.org/" target="_blank"><span style="font-size:11pt;font-family:"Arial",sans-serif;color:rgb(17,85,204)">IJCAI 2025</span></a></span><span style="font-family:"Arial",sans-serif;color:black">, August
16-22, 2025, in Montreal, Canada. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:16pt;font-family:"Arial",sans-serif;color:black"><span>BioCreative</span> IX:</span><b><span style="font-size:18pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">The 9</span><sup><span style="font-size:6.5pt;font-family:"Arial",sans-serif;color:black">th</span></sup><span style="font-family:"Arial",sans-serif;color:black"> <span>BioCreative</span>
workshop seeks to attract
researchers interested in developing and evaluating automatic methods
for extracting medically relevant information from clinical data, and aims
to bring together the medical NLP community and healthcare
researchers and practitioners. The challenge has three tracks:
MedHopQA, a dataset for benchmarking LLM-based reasoning
systems on disease-centered question answering; ToxHabits, a task
on extracting information related to substance use and abuse
from Spanish clinical content; and sentence segmentation
of real clinical notes from MIMIC-III. We will also
feature paper submissions on relevant topics and poster/tool
demonstrations. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:16pt;font-family:"Arial",sans-serif;color:black">Important Dates</span><b><span style="font-size:18pt"><u></u><u></u></span></b></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">March - April: Team Registration</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">May 12, 2025: Testing predictions, Evaluation results</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">May 19, 2025: Deadline for participant paper submissions</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">Jun 06, 2025: Notification of paper acceptance</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">Aug 16 - Aug 22, 2025: IJCAI 2025</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:12pt"><u></u> <u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:16pt;font-family:"Arial",sans-serif;color:black">Workshop Proceedings and Special Issue:</span><b><span style="font-size:18pt"><u></u><u></u></span></b></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">The <span>BioCreative</span>
IX Proceedings will host all the submissions from participating teams,
and they will be freely available by the time of the workshop.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">In addition, selected papers will be invited to a <span>BioCreative</span>
IX journal special issue, subject to the journal's peer-review process. More
details and submission information will be posted in June. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:16pt;font-family:"Arial",sans-serif;color:black">Participation:</span><b><span style="font-size:18pt"><u></u><u></u></span></b></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">Teams
can participate in one or more of these tracks. Team registration will
continue until April 30th, when final commitment is requested.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Arial",sans-serif;color:black">To register a team, go to the
</span><span style="font-size:12pt"><a href="https://forms.gle/xbQp158cn5pgJ1oj9" target="_blank"><span style="font-size:11pt;font-family:"Arial",sans-serif;color:rgb(17,85,204)">Registration Form</span></a></span><span style="font-family:"Arial",sans-serif;color:black">. If
you cannot access Google Forms, please send an e-mail to <a href="mailto:BiocreativeChallenge@gmail.com" target="_blank">BiocreativeChallenge@gmail.com</a>.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:12pt"><u></u> <u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:14pt;font-family:"Arial",sans-serif;color:rgb(67,67,67)">Call for Papers</span><b><span style="font-size:13.5pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">We welcome submissions describing research on topics similar to the three challenge tracks, as well as:</span><span style="font-size:12pt"><u></u><u></u></span></p>
<ul style="margin-top:0in" type="disc"><li class="MsoNormal" style="color:black;margin-top:12pt;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Development of benchmarking datasets for clinical NLP<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Creating and evaluating synthetic data using LLMs, and assessing its impact on downstream tasks<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif"><span>Creative</span> use of data augmentation for increasing tool accuracy and trustworthiness<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Use of LLMs to streamline annotation tasks <u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">NLP systems capable of identifying entities in multilingual corpora<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">NLP systems capable of semantic interoperability across different terminologies/ontologies for efficient data curation <u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Integrating ontologies and knowledge bases for factual LLM production<u></u><u></u></span></li><li class="MsoNormal" style="color:black;margin-bottom:12pt;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Annotated corpora and other resources for health care and biomedical data modelling <u></u><u></u></span></li></ul>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">All submissions will be considered for poster presentations and tool demonstrations at the workshop.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:12pt"><u></u> <u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:20pt;font-family:"Arial",sans-serif;color:black"><span>BioCreative</span> IX Tracks:</span><b><span style="font-size:24pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:14pt;font-family:"Arial",sans-serif;color:rgb(67,67,67)">Track 1: MedHopQA</span><b><span style="font-size:13.5pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">Large
language models (LLMs) are commonly evaluated on their ability to
answer questions in various domains, and it has become clear that robust
QA datasets are critical to ensure proper evaluation
of LLMs prior to their deployment in real-world biomedical or
healthcare-related applications. This track aims to advance the
development of LLM-based systems that are capable of answering questions
that involve multi-step reasoning. We have created a resource
consisting of 1,000 question-answer pairs – focusing on diseases, genes
and chemicals, mostly pertaining to rare diseases – based on public
information in Wikipedia. Participants are encouraged to use any
training data they wish to design and develop
NLP systems or agents that understand asserted information on genes,
diseases, chemicals, etc., and are able to answer multi-step reasoning
questions involving such information. This track builds on previous
successes in biomedical QA benchmarking (e.g., PubMedQA,
BioASQ, and MedQA) but differs from them in that MedHopQA
requires a multi-step reasoning process to find the
correct answer. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:14pt;font-family:"Arial",sans-serif;color:rgb(67,67,67)">Track 2: Sentence segmentation of real-life clinical notes</span><b><span style="font-size:13.5pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">Sentence
segmentation is a fundamental linguistic task and is widely used as a
pre-processing step in many NLP tasks. Although the development of LLMs
and sparse attention mechanisms in transformer
networks has reduced the need for sentence-level inputs in some
NLP tasks, many models are designed and tested only for shorter
sequences. The need for sentence segmentation is particularly pronounced
in clinical notes, as most clinical NLP tasks depend
on this information for annotation and model training. In this shared
task, we challenge participants to detect sentence boundaries (spans)
for MIMIC-III clinical notes, where fragmented and incomplete sentences,
complex graphemic devices (e.g., abbreviations
and acronyms), and markup are common. To encourage generalizability to
multi-domain texts, participants will receive annotated texts from
newswire articles and biomedical literature, in addition to clinical
notes, for model development and evaluation.</span></p><p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-size:14pt;font-family:"Arial",sans-serif;color:rgb(67,67,67)">Track 3: ToxHabits</span><b><span style="font-size:13.5pt"><u></u><u></u></span></b></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in">
<span style="font-family:"Arial",sans-serif;color:black">There
is a pressing need to extract information related to substance use and
abuse from clinical content more systematically, covering not only smoking and alcohol abuse
but also other harmful drugs and substances.
These toxic habits have a considerable health impact on a
variety of medical conditions and also affect the action of prescribed
medications. To make such information actionable, it is critical to not
only detect instances of consumption, but also
to characterize certain aspects related to it, such as duration or mode
of administration. Some initial efforts have been made to automatically
detect social determinants of health, including smoking status, for
content in English, but very limited efforts
have been made for content in other languages. Therefore, we propose
the ToxHabits track to address the automatic extraction of substance use
and abuse information from clinical cases in Spanish. This task will
consist of three subtasks: (a) toxic habit mention
recognition, (b) detection of relevant clinical modifiers related to
substance abuse, and (c) a toxic habit condition QA challenge.</span></p><p class="MsoNormal" style="margin-right:0in;margin-bottom:12pt;margin-left:0in"><span style="font-size:20pt;font-family:"Arial",sans-serif;color:black">Organizing Committee</span><b><span style="font-size:24pt"></span></b></p>
<ul style="margin-top:0in" type="disc"><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Dr. Rezarta Islamaj, National Library of Medicine<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Dr. Graciela Gonzalez-Hernandez, Cedars-Sinai Medical Center<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Dr. Martin Krallinger, Barcelona Supercomputing Center<u></u><u></u></span></li><li class="MsoNormal" style="color:black;vertical-align:baseline">
<span style="font-family:"Arial",sans-serif">Dr. Zhiyong Lu, National Library of Medicine<font color="#888888"><u></u></font></span></li></ul></div></div></div><br clear="all"></div><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><font size="2"><span style="font-family:arial,sans-serif"><font color="#888888"><span style="color:rgb(0,0,0)">Salvador Lima Lopez<br>RESEARCH ENGINEER<br>Life Sciences - NLP for Biomedical Information Analysis, BSC-CNS<br>Barcelona, Spain</span></font></span></font></div></div></div>