Fwd: Thesis Defense - 5/25/18 - Junier Oliva - Distribution and Histogram (DisH) Learning
Artur Dubrawski (awd at cs.cmu.edu)
Fri May 11 14:55:34 EDT 2018

This may be the last chance to see Junier as a student!
Artur
---------- Forwarded message ----------
From: Diane Stidle <stidle at andrew.cmu.edu>
Date: Fri, May 11, 2018 at 2:29 PM
Subject: Thesis Defense - 5/25/18 - Junier Oliva - Distribution and Histogram (DisH) Learning
To: "ml-seminar at cs.cmu.edu" <ML-SEMINAR at cs.cmu.edu>, Le Song <lsong at cc.gatech.edu>

Thesis Defense
Date: May 25, 2018
Time: 10:00am
Place: 8102 GHC
PhD Candidate: Junier Oliva
Title: Distribution and Histogram (DisH) Learning
Abstract: Machine learning has made incredible advances in the last couple
of decades. Nevertheless, much of this progress has been limited to
basic point-estimation tasks. That is, the bulk of attention has been
directed at solving problems that take in a static finite vector and map it
to another static finite vector. However, we do not navigate through life
in a series of point-estimation problems, mapping x to y. Instead, we find
broad patterns and gather a far-sighted understanding of data by
considering collections of points like sets, sequences, and distributions.
Thus, contrary to what various billionaires, celebrity theoretical
physicists, and sci-fi classics would lead you to believe, true machine
intelligence is fairly out of reach currently. In order to bridge this gap,
we have developed algorithms that understand data at an aggregate, holistic
level.
This thesis pushes machine learning past the realm of operating over static
finite vectors, to start reasoning ubiquitously with complex, dynamic
collections like sets and sequences. We develop algorithms that consider
distributions as functional covariates/responses, and methods that use
distributions as internal representations. We consider distributions since
they are a straightforward characterization of many natural phenomena and
provide a richer description than simple point data by detailing
information at an aggregate level. Our approach may be seen as addressing
two sides of the same coin: on one side, we use traditional machine
learning algorithms adjusted to directly operate on inputs and outputs that
are probability functions (and sample sets); on the other side, we develop
better estimators for traditional tasks by making use of and adjusting
internal distributions.
We begin by developing algorithms for traditional machine learning tasks
for the cases when one’s input (and/or possibly output) is not a finite
point, but is instead a distribution, or sample set drawn from a
distribution. We develop a scalable nonparametric estimator for regressing
a real-valued response given an input that is a distribution, a setting we
term distribution to real regression (DRR). Furthermore, we extend this
work to the case when both the output response and the input covariate are
distributions, a task we call distribution to distribution regression
(DDR).
Next, we look to expand the versatility and efficacy of traditional
machine learning tasks through novel methods that operate with
distributions of features. For example, we show that one may improve the
performance of kernel learning tasks by learning a kernel’s spectral
distribution in a data-driven fashion using Bayesian nonparametric
techniques. Moreover, we study how to perform sequential modeling by
looking at summary statistics from past points. Lastly, we also develop
methods for high-dimensional density estimation that make use of flexible
transformations of variables and autoregressive conditionals.
Thesis Committee:
Barnabas Poczos (Co-Chair)
Jeff Schneider (Co-Chair)
Ruslan Salakhutdinov
Le Song (Georgia Institute of Technology, lsong at cc.gatech.edu)
Link to draft document:
https://www.dropbox.com/s/z93s3qanl02fs8l/draft.pdf?dl=0
-- 
Diane Stidle
Graduate Programs Manager
Machine Learning Department
Carnegie Mellon University
diane at cs.cmu.edu
412-268-1299