<div dir="ltr">Team,<div><br></div><div>Please come and witness Rachel giving her excellent thesis proposal presentation on Monday next week!</div><div><br></div><div>Cheers,</div><div>Artur<br><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">---------- Forwarded message ---------<br>From: <strong class="gmail_sendername" dir="auto">RI PhD Program Manager</strong> <span dir="auto">&lt;<a href="mailto:ri-phd-manager@andrew.cmu.edu">ri-phd-manager@andrew.cmu.edu</a>&gt;</span><br>Date: Tue, Apr 7, 2026 at 2:45 PM<br>Subject: RI PhD Thesis Proposal - Xinyu (Rachel) Li<br>To: RI People &lt;<a href="mailto:ri-people@andrew.cmu.edu">ri-people@andrew.cmu.edu</a>&gt;<br></div><br><br><div dir="ltr"><b><a href="https://www.ri.cmu.edu/event/ri-phd-thesis-proposal-xinyu-rachel-li/" target="_blank">RI CALENDAR EVENT</a></b><div><b><br></b></div><div><div><b>Date: April 13, 2026<br>Time: 03:15 PM (ET) <br>Location: GHC 6121<br><a href="https://cmu.zoom.us/j/96249846505?pwd=aZalDPGdL4JsUbJR0M0YQOPta8pFZX.1" target="_blank">Zoom Link</a></b></div><div><p style="margin:0in"><b><span style="font-family:arial,sans-serif;color:rgb(0,0,0)">Type: Ph.D. </span><span style="font-family:arial,sans-serif;color:rgb(0,0,0)">Thesis Proposal</span></b></p></div><div><b><span style="font-family:arial,sans-serif;color:rgb(0,0,0)">Who: Xinyu (Rachel) Li</span></b></div><div><b>Title: Towards Accessible AI Agents</b></div><div><b><br></b></div><div><b>Abstract:</b></div><div>Empowered by large language models (LLMs), AI agents have shown strong potential across tasks such as general-purpose assistance, software coding, and scientific research. 
However, their practical utility in applications involving consequential decisions, such as healthcare, remains constrained by three major challenges.<br><br><b>Evaluation.</b> Existing agent evaluations often focus on well-structured tasks and final outcomes, failing to fully capture the complexity of real-world workflows. We propose evaluation frameworks grounded in realistic machine learning engineering workflows, providing skill-based, multi-artifact, and holistic assessments that systematically evaluate the practical utility of AI agents.<br><br><b>Learning.</b> Improving LLMs for agentic use typically relies on reinforcement learning with large amounts of high-quality labeled data, which are costly and difficult to obtain in expert domains including healthcare. To address this limitation, we aim to develop learning frameworks that require minimal external supervision, improving the scalability and efficiency of agent learning.<br><br><b>Specialization.</b> AI agents typically follow a one-size-fits-all paradigm at the time of deployment, lacking mechanisms to account for task-specific or user-specific requirements. We propose methods that enable agent specialization for downstream tasks and users, expanding their applicability across heterogeneous deployment settings.<br><br>This thesis aims to make AI agents more broadly accessible and impactful in important real-world applications by enhancing their practical utility, making them more measurable, more capable, and better tailored to the needs of their users and applications.</div><div><br></div><div><a href="https://drive.google.com/drive/folders/1mPlZXQ3WLa42e1LFKyMkjuHy_diDc6yy?usp=sharing" target="_blank"><b>Link to thesis</b></a></div><div><br></div><div><b>Thesis committee members:</b></div><div>Artur Dubrawski (Chair)<br>Andrea Bajcsy<br>Barnabás Póczos<br>Daniel McDuff (Google)<br></div><div><br></div></div></div>

</div></div></div>