<div dir="ltr">ICYMI I'll be proposing my thesis this Friday! Feel free to come by :)<div><br></div><div>Best,</div><div>Junier</div><div><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Diane Stidle</b> <span dir="ltr"><<a href="mailto:diane%2B@cs.cmu.edu">diane+@cs.cmu.edu</a>></span><br>Date: Fri, Mar 3, 2017 at 4:30 PM<br>Subject: Thesis Proposal - 3/10/17 - Junier Oliva - Distribution and Histogram (DisH) Learning<br>To: "<a href="mailto:ml-seminar@cs.cmu.edu">ml-seminar@cs.cmu.edu</a>" <<a href="mailto:ML-SEMINAR@cs.cmu.edu">ML-SEMINAR@cs.cmu.edu</a>>, Le Song <<a href="mailto:lsong@cc.gatech.edu">lsong@cc.gatech.edu</a>><br><br><br>

  <div text="#000000" bgcolor="#FFFFFF">

    <p><i>Thesis Proposal</i></p>

    <p>Date: 3/10/17<br>

      Time: 3:00pm<br>

      Place: 8102 GHC<br>

      Speaker: Junier Oliva</p>

    <p>Title: Distribution and Histogram (DisH) Learning</p>

    <p>Abstract:<br>

    </p>

    <div>

      <div>This thesis advances the explicit use of distributions in

        machine learning. We develop algorithms that consider

        distributions as functional covariates/responses, and methods

        that use distributions as internal representations. We consider

        distributions since they are a straightforward characterization

        of many natural phenomena and provide a richer description than

        simple point data by detailing information at an aggregate

        level. Our approach may be seen as addressing two sides of the

        same coin: on one side, we use traditional machine learning

        algorithms adjusted to directly operate on inputs and outputs

        that are probability functions (and sample sets); on the other

        side, we develop better estimators for traditional vector data

        by making use of, and adjusting, internal distributions.</div>

      <div><br>

      </div>

      <div>We begin by developing algorithms for traditional machine

        learning tasks for the cases when one's input (and/or possibly

        output) is not a finite point, but is instead a distribution, or

        sample set drawn from a distribution. We develop a scalable

        nonparametric estimator for regressing a real-valued response

        given an input that is a distribution, a setting we term

        distribution to real regression (DRR). Furthermore, we extend

        this work to the case when both the output response and the

        input covariate are distributions, a task we call distribution

        to distribution regression (DDR). Moreover, we propose flexible

        and scalable techniques for conditional density estimation where

        one regresses an output response that is a distribution given a

        real-valued covariate.</div>

      <div><br>

      </div>

      <div>Next, we look to expand the versatility and efficacy of

        traditional machine learning tasks through novel methods that

        operate with implicit or latent distributions. Take, for example,

        kernel methods that use a shift-invariant kernel. Here, one's

        kernel uniquely determines a distribution, the spectral density,

        that controls the frequencies considered over inputs. We show

        that one may improve the performance of kernel learning tasks by

        learning this spectral distribution in a data-driven fashion

        using Bayesian nonparametric techniques. Furthermore, we recast

        classification as a task on distributions. Namely, the Bayes

        classification risk is minimized when the distributions of

        features of instances from each particular class are

        non-overlapping. Hence, we propose a distribution-based task

        with ties to a Bayes risk to perform supervised feature

        learning.</div>

      <div><br>

      </div>

      <div>Leveraging the high-level, aggregate information provided by

        distributions in these algorithms allows us to improve

        performance in a broad range of domains, including cosmology,

        neuroscience, computer vision, and natural language processing.

        Furthermore, our algorithms are scalable enough to handle

        millions, even billions, of instances.<br>

        <br>

        Thesis Committee:<br>

        Barnabas Poczos (Co-Chair)<br>

        Jeff Schneider (Co-Chair)<br>

        Ruslan Salakhutdinov<br>

        Le Song (Georgia Institute of Technology)<br>

        <br>

        Link to draft document:<br>

<a class="m_-5944510183316843526moz-txt-link-freetext" href="https://www.dropbox.com/s/tbxj4v0oy35omky/Thesis_Proposal.pdf?dl=0" target="_blank">https://www.dropbox.com/s/<wbr>tbxj4v0oy35omky/Thesis_<wbr>Proposal.pdf?dl=0</a><span class="HOEnZb"><font color="#888888"><br>

      </font></span></div><span class="HOEnZb"><font color="#888888">

    </font></span></div><span class="HOEnZb"><font color="#888888">

    <pre class="m_-5944510183316843526moz-signature" cols="72">-- 

Diane Stidle

Graduate Programs Manager

Machine Learning Department

Carnegie Mellon University

<a class="m_-5944510183316843526moz-txt-link-abbreviated" href="mailto:diane@cs.cmu.edu" target="_blank">diane@cs.cmu.edu</a>

<a href="tel:(412)%20268-1299" value="+14122681299" target="_blank">412-268-1299</a></pre>

  </font></span></div>

</div><br></div></div>