<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"></div><div dir="ltr">Reminder that this is happening today!</div><div dir="ltr"><br><blockquote type="cite">On Oct 14, 2023, at 1:51 PM, Asher Trockman &lt;ashert@cs.cmu.edu&gt; wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr">Dear all,<div><br></div><div><div>We look forward to seeing you <b>this Tuesday (10/17)</b> from <font color="#ff0000"><b>12:00-1:00 PM (U.S. Eastern time)</b></font> for the next talk of this semester's <b>CMU AI Seminar</b>, sponsored by <a href="https://sambanova.ai/" target="_blank">SambaNova Systems</a>. The seminar will be held in GHC 6115 <b>with pizza provided</b> and will be streamed on Zoom.</div><div><br></div><div>To learn more about the seminar series or to see the future schedule, please visit the <a href="http://www.cs.cmu.edu/~aiseminar/" target="_blank">seminar website</a>.</div><div><br></div><font color="#0b5394"><span style="background-color:rgb(255,255,0)">On this Tuesday (10/17),<span class="gmail-Apple-converted-space"> </span><u>Nicholas Roberts</u> </span><span style="background-color:rgb(255,255,0)">(UW Madison) will be giving a talk titled </span><b style="background-color:rgb(255,255,0)">"</b><b style="background-color:rgb(255,255,0)">Geometry-Aware Adaptation for Pretrained Models</b></font><b style="color:rgb(11,83,148);background-color:rgb(255,255,0)">"</b><font color="#0b5394" style="background-color:rgb(255,255,0)">.</font></div><div><font color="#0b5394"><span style="background-color:rgb(255,255,0)"><br></span><b>Title</b>: Geometry-Aware Adaptation for Pretrained Models<br><br></font><div><font color="#0b5394"><b>Talk Abstract</b>: Machine learning models—including prominent zero-shot models—are often trained on datasets whose labels are only a small proportion of a larger label space. 
Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes—or, in the case of zero-shot prediction, to improve its performance—without any additional training. Our technique is a drop-in replacement for the standard prediction rule, swapping arg max with the Fréchet mean. We provide a comprehensive theoretical analysis for this approach, studying (i) learning-theoretic results trading off label space diameter, sample complexity, and model dimension, (ii) characterizations of the full range of scenarios in which it is possible to predict any unobserved class, and (iii) an optimal active learning-like next class selection procedure to obtain optimal training classes for when it is not possible to predict the entire range of unobserved classes. Empirically, using easily available external metrics, our proposed approach, LOKI, gains up to 29.7% relative improvement over SimCLR on ImageNet and scales to hundreds of thousands of classes. When no such metric is available, LOKI can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pretrained zero-shot models such as CLIP.</font></div><div><div><div><font color="#0b5394"> </font><font color="#0b5394"><br></font></div><div><font color="#0b5394"><b>Speaker Bio:</b> <a href="https://nick11roberts.science">Nicholas Roberts</a> is a third-year Ph.D. student at the University of Wisconsin-Madison advised by Frederic Sala. This past summer, he completed an internship with the Physics of AGI research group at Microsoft Research led by Sebastien Bubeck, working on large language models. Previously, he completed his M.S. in the Machine Learning Department at CMU, working with Ameet Talwalkar and Zack Lipton. 
Nicholas’ research is motivated by the need to democratize machine learning and foundation models to handle the long tail of emerging ML tasks in the sciences and beyond. In order to use these models to solve high-impact problems in the sciences, his work aims to solve two main challenges: (1) determine what additional data to provide them and understand how it interacts with pretraining data, and (2) automate the process of adapting them to new problems. To address these challenges, he is focused on the intersection of data-centric ML (which aims to solve 1) and automated machine learning (AutoML) (which aims to solve 2), or more concisely data-centric AutoML. As a result of these motivating challenges, his work on developing the foundations of data-centric AutoML has a focus on diverse ML tasks that are far afield from standard ML domains. These often include problems related to solving PDEs, protein folding, climate modeling, and beyond.</font></div><div><font color="#0b5394"><br></font></div><div><font color="#0b5394"><b>In person: </b>GHC 6115</font></div><div><font color="#0b5394"><b>Zoom Link</b>: <a href="https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09" target="_blank">https://cmu.zoom.us/j/99510233317?pwd=ZGx4aExNZ1FNaGY4SHI3Qlh0YjNWUT09</a></font></div></div></div></div><div><br></div><div>Thanks,</div><div>Asher Trockman</div></div>
</div></blockquote></body></html>