Fwd: Thesis Oral - Dougal J. Sutherland - Today, 2pm!

Jeff Schneider schneide at cs.cmu.edu
Wed Sep 14 13:43:28 EDT 2016


Please come and listen to Dougal's defense!


---------- Forwarded message ---------
From: Debbie Cavlovich <deb at cs.cmu.edu>
Date: Wed, Sep 7, 2016, 9:11 AM
Subject: Thesis Oral - Dougal J. Sutherland - September 14, 2016
To: <cs-friends at cs.cmu.edu>, <cs-students at cs.cmu.edu>,
<cs-visitors at cs.cmu.edu>, <cs-advisors-ext at cs.cmu.edu>,
Catherine Copetas <copetas+ at cs.cmu.edu>


CALENDAR OF EVENTS

Dougal J. Sutherland

September 14, 2016

2:00 PM - GHC 6501

Thesis Oral

Title:  Scalable, Flexible, and Active Learning on Distributions

Abstract:


A wide range of machine learning problems, including astronomical
inference about galaxy clusters, natural image scene classification,
parametric statistical inference, and detection of potentially harmful
sources of radiation, can be well-modeled as learning a function on
(samples from) distributions. This thesis explores problems in learning
such functions via kernel methods, and applies the framework to yield
state-of-the-art results in several novel settings.
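
As a concrete illustration of this framework (a minimal sketch, not
taken from the thesis), each "data point" is itself a bag of samples,
and one standard kernel between two such bags is the mean map kernel:
the average pairwise kernel value between the bags.

    import numpy as np

    def mean_map_kernel(X, Y, bandwidth=1.0):
        # Estimate of the RKHS inner product <mu_P, mu_Q> between the
        # mean embeddings of P and Q, from samples X ~ P and Y ~ Q.
        sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq_dists / (2 * bandwidth ** 2)).mean()

    rng = np.random.default_rng(0)
    P_samples = rng.normal(0.0, 1.0, size=(100, 2))  # one "data point"
    Q_samples = rng.normal(0.5, 1.0, size=(120, 2))  # another
    print(mean_map_kernel(P_samples, Q_samples))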

One major challenge with this approach is computational efficiency
when learning from large numbers of distributions: the computation
required by typical methods scales between quadratically and cubically
in the number of distributions, and so they are not amenable to large
datasets. As a
solution, we investigate approximate embeddings into Euclidean spaces
such that inner products in the embedding space approximate kernel
values between the source distributions. We provide a greater
understanding of the standard existing tool for doing so on Euclidean
inputs, random Fourier features. We also present a new embedding for a
class of information-theoretic distribution distances, and evaluate it
and existing embeddings on several real-world applications.
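
For intuition, here is a minimal sketch of random Fourier features for
the Gaussian kernel (illustrative only; the bandwidth and feature count
are arbitrary choices here): inner products of the random features
approximate kernel values.

    import numpy as np

    def rff_features(X, n_features=2000, bandwidth=1.0, seed=0):
        # Random Fourier features z(x), with z(x) . z(y) approximating
        # the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    rng = np.random.default_rng(1)
    x, y = rng.normal(size=(2, 5))
    zx = rff_features(x[None, :])
    zy = rff_features(y[None, :])   # same seed, so same features
    approx = (zx @ zy.T).item()
    exact = np.exp(-np.sum((x - y) ** 2) / 2.0)
    print(approx, exact)  # the two values should be close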

The next challenge is that the choice of distance is important for
getting good practical performance, but how to choose a good distance
for a given problem is not obvious. We study this problem in the setting
of two-sample testing, where we attempt to distinguish two distributions
via the maximum mean discrepancy, and provide a new technique for kernel
choice in these settings, including the use of kernels defined by deep
learning-type models.
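
For reference, the standard unbiased estimator of the squared maximum
mean discrepancy, sketched here with a fixed Gaussian kernel (the
thesis's contribution concerns choosing the kernel, which this sketch
does not show):

    import numpy as np

    def mmd2_unbiased(X, Y, bandwidth=1.0):
        # Unbiased estimate of MMD^2 between P and Q from samples
        # X ~ P and Y ~ Q; large values are evidence that P != Q.
        def k(A, B):
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
            return np.exp(-sq / (2 * bandwidth ** 2))
        n, m = len(X), len(Y)
        Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
        # Diagonal terms are dropped to make the estimator unbiased.
        term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
        term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
        return term_x + term_y - 2.0 * Kxy.mean()

    rng = np.random.default_rng(2)
    X = rng.normal(0.0, 1.0, size=(200, 1))
    Y = rng.normal(0.3, 1.0, size=(200, 1))
    print(mmd2_unbiased(X, Y))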

In a related problem setting, common to physical observations,
autonomous sensing, and electoral polling, we have the following
challenge: when observing samples is expensive, but we can choose where
we would like to do so, how do we pick where to observe? We give a
method for a closely related problem where we search for instances of
patterns by making point observations.
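
As a generic illustration of choosing where to observe next
(uncertainty sampling under a Gaussian-process model, a standard
baseline, not the active pattern search method the thesis presents):

    import numpy as np

    def gp_posterior_variance(X_obs, X_cand, bandwidth=1.0, noise=1e-3):
        # Posterior variance of a unit-variance Gaussian process at each
        # candidate point, given the locations already observed. A simple
        # acquisition rule observes next where this variance is largest.
        def k(A, B):
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
            return np.exp(-sq / (2 * bandwidth ** 2))
        K = k(X_obs, X_obs) + noise * np.eye(len(X_obs))
        Kc = k(X_cand, X_obs)
        sol = np.linalg.solve(K, Kc.T)  # K^{-1} k(X_obs, x) per candidate
        return 1.0 - np.einsum('ij,ji->i', Kc, sol)

    rng = np.random.default_rng(3)
    X_obs = rng.uniform(0.0, 1.0, size=(5, 1))
    X_cand = np.linspace(0.0, 1.0, 50)[:, None]
    var = gp_posterior_variance(X_obs, X_cand)
    print(X_cand[np.argmax(var)])  # where to look next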

Throughout, we combine theoretical results with extensive empirical
evaluations to increase our understanding of the methods.

Thesis Committee:
Jeff Schneider, Chair
Barnabás Póczos
Maria-Florina Balcan
Arthur Gretton, University College London



