FYI, since most of you aren't on this list. :)

---------- Forwarded message ---------
From: Debbie Cavlovich <deb@cs.cmu.edu>
Date: Wed, Sep 7, 2016, 9:11 AM
Subject: Thesis Oral - Dougal J. Sutherland - September 14, 2016
To: <cs-friends@cs.cmu.edu>, <cs-students@cs.cmu.edu>, <cs-visitors@cs.cmu.edu>, <cs-advisors-ext@cs.cmu.edu>, Catherine Copetas <copetas+@cs.cmu.edu>

CALENDAR OF EVENTS

Dougal J. Sutherland

September 14, 2016

2:00 PM - GHC 6501

Thesis Oral

Title: Scalable, Flexible, and Active Learning on Distributions

Abstract:

A wide range of machine learning problems, including astronomical
inference about galaxy clusters, natural image scene classification,
parametric statistical inference, and detection of potentially harmful
sources of radioactivity, can be well-modeled as learning a function on
(samples from) distributions. This thesis explores problems in learning
such functions via kernel methods, and applies the framework to yield
state-of-the-art results in several novel settings.

One major challenge with this approach is computational efficiency when
learning from large numbers of distributions: typical methods scale
between quadratically and cubically in the number of distributions, and
so are not amenable to large datasets. As a solution, we investigate
approximate embeddings into Euclidean spaces such that inner products in
the embedding space approximate kernel values between the source
distributions. We provide a greater understanding of the standard
existing tool for doing so on Euclidean inputs, random Fourier features.
We also present a new embedding for a class of information-theoretic
distribution distances, and evaluate it and existing embeddings on
several real-world applications.

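[Forwarder's aside, for those who haven't seen random Fourier features:
a minimal sketch of the standard Rahimi-Recht construction for the
Gaussian kernel, combined with the empirical mean map over a sample
set. This is generic background with made-up names and parameters, not
code from the thesis.]

    import numpy as np

    rng = np.random.default_rng(0)

    def make_rff(d, n_features=500, sigma=1.0):
        # Random Fourier features for the Gaussian kernel
        # k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
        # The same frequency draw must be reused for every sample set
        # so that feature-space inner products estimate kernel values.
        W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return lambda X: np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    # Embed each sample set by averaging its point features (the
    # empirical mean map); a dot product of two embeddings then
    # approximates a kernel between the underlying distributions.
    z = make_rff(d=3)
    X = rng.normal(size=(200, 3))           # sample from distribution P
    Y = rng.normal(loc=0.5, size=(150, 3))  # sample from distribution Q
    mu_X, mu_Y = z(X).mean(axis=0), z(Y).mean(axis=0)
    approx_kernel = mu_X @ mu_Y  # ~ E[k(x, y)], x ~ P, y ~ Q
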
The next challenge is that the choice of distance is important for good
practical performance, but how to choose a good distance for a given
problem is not obvious. We study this problem in the setting of
two-sample testing, where we attempt to distinguish two distributions
via the maximum mean discrepancy, and provide a new technique for
kernel choice in these settings, including the use of kernels defined
by deep-learning-type models.

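[Again as generic background rather than the thesis's code: the
standard unbiased estimator of the squared maximum mean discrepancy
with a fixed Gaussian kernel looks like the sketch below; the
kernel-choice work above concerns picking sigma, or a learned deep
kernel, well rather than fixing it.]

    import numpy as np

    def rbf_kernel(A, B, sigma=1.0):
        # Gaussian kernel matrix between the rows of A and B.
        sq = ((A**2).sum(1)[:, None] + (B**2).sum(1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-sq / (2.0 * sigma**2))

    def mmd2_unbiased(X, Y, sigma=1.0):
        # Unbiased estimate of MMD^2 between the distributions that
        # generated samples X (m, d) and Y (n, d); large values are
        # evidence that the two distributions differ.
        m, n = len(X), len(Y)
        Kxx = rbf_kernel(X, X, sigma)
        Kyy = rbf_kernel(Y, Y, sigma)
        np.fill_diagonal(Kxx, 0.0)  # drop the i == j terms
        np.fill_diagonal(Kyy, 0.0)
        return (Kxx.sum() / (m * (m - 1))
                + Kyy.sum() / (n * (n - 1))
                - 2.0 * rbf_kernel(X, Y, sigma).mean())
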
In a related problem setting, common to physical observations,
autonomous sensing, and electoral polling, we have the following
challenge: when observing samples is expensive, but we can choose where
we would like to do so, how do we pick where to observe? We give a
method for a closely related problem, where we search for instances of
patterns by making point observations.

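[The thesis presents its own method; purely to fix ideas, a common
generic baseline for "where do we observe next?" is Gaussian-process
uncertainty sampling, sketched below with hypothetical names. This is
an illustration of the problem setting, not the thesis's algorithm.]

    import numpy as np
    from scipy.spatial.distance import cdist

    def next_observation(candidates, X_obs, sigma=1.0, noise=1e-3):
        # Generic uncertainty-sampling baseline (not the thesis's
        # method): fit a Gaussian-process posterior to the locations
        # observed so far and pick the candidate location where the
        # posterior variance, i.e. remaining uncertainty, is largest.
        k = lambda A, B: np.exp(-cdist(A, B, 'sqeuclidean')
                                / (2.0 * sigma**2))
        K = k(X_obs, X_obs) + noise * np.eye(len(X_obs))
        Kc = k(candidates, X_obs)
        var = 1.0 - np.sum((Kc @ np.linalg.inv(K)) * Kc, axis=1)
        return candidates[np.argmax(var)]
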
Throughout, we combine theoretical results with extensive empirical
evaluations to increase our understanding of the methods.

Thesis Committee:
Jeff Schneider, Chair
Barnabás Póczos
Maria-Florina Balcan
Arthur Gretton, University College London