<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    If you're free, please attend Manzil's presentation.<br>
    <div class="moz-forward-container"><br>
      <br>
      -------- Forwarded Message --------
      <table class="moz-email-headers-table" cellspacing="0"
        cellpadding="0" border="0">
        <tbody>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
            </th>
            <td>Reminder - Thesis Proposal - 1/12/18 - Manzil Zaheer -
              Representation Learning @ Scale</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
            <td>Thu, 11 Jan 2018 17:09:00 -0500</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
            <td>Diane Stidle <a class="moz-txt-link-rfc2396E" href="mailto:diane+@cs.cmu.edu"><diane+@cs.cmu.edu></a></td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
            <td><a class="moz-txt-link-abbreviated" href="mailto:ml-seminar@cs.cmu.edu">ml-seminar@cs.cmu.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:ML-SEMINAR@cs.cmu.edu"><ML-SEMINAR@cs.cmu.edu></a>,
              Alex Smola <a class="moz-txt-link-rfc2396E" href="mailto:alex.smola@gmail.com"><alex.smola@gmail.com></a>,
              <a class="moz-txt-link-abbreviated" href="mailto:mccallum@cs.umass.edu">mccallum@cs.umass.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:mccallum@cs.umass.edu"><mccallum@cs.umass.edu></a></td>
          </tr>
        </tbody>
      </table>
      <br>
      <br>
      <p><font face="Helvetica, Arial, sans-serif"><i>Thesis Proposal</i></font></p>
      <font face="Helvetica, Arial, sans-serif">Date: January 12, 2018<br>
        Time: 3:00 PM<br>
        Place: 8102 GHC<br>
        Speaker: Manzil Zaheer<br>
        <br>
        Title: Representation Learning @ Scale<br>
        <br>
        Abstract:<br>
        <br>
      </font>
      <div><font face="Helvetica, Arial, sans-serif">Machine learning
        techniques are reaching or exceeding human-level performance in
        tasks like image classification, translation, and text-to-speech.
        The success of these machine learning algorithms has been
        attributed to highly versatile representations learnt from data
        using deep networks or intricately designed Bayesian models.
        Representation learning has also provided hints in neuroscience,
        e.g. for understanding how humans might categorize objects.
        Despite these instances of success, many open questions remain.</font></div>
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Data come in all
        shapes and sizes: not just as images or text, but also as point
        clouds, sets, graphs, compressed data, or even heterogeneous
        mixtures of these data types. In this thesis, we want to develop
        representation learning algorithms for such unconventional data
        types by leveraging their structure and establishing new
        mathematical properties. Representations learned in this fashion
        were applied to diverse domains and found to be competitive with
        task-specific state-of-the-art methods.</font></div>
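 </f">
      <p><font face="Helvetica, Arial, sans-serif">As a concrete
        illustration of what a structure-aware model for one such data
        type could look like, below is a minimal NumPy sketch of a
        permutation-invariant set function of the form rho(sum of
        phi(x)), one standard construction for learning on sets. The
        layer sizes and random weights are illustrative assumptions, not
        details from the proposal.</font></p>
      <pre>
# Minimal sketch of a permutation-invariant set function,
# f(X) = rho(sum_i phi(x_i)). All sizes and weights are
# illustrative, not taken from the proposal.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID, D_OUT = 3, 64, 8                # illustrative dimensions
W_phi = rng.normal(0, 0.1, (D_IN, D_HID))    # per-element encoder phi
W_rho = rng.normal(0, 0.1, (D_HID, D_OUT))   # post-pooling decoder rho

def set_embedding(X):
    """Embed a set X of shape (n_elements, D_IN).

    Each element is encoded independently, the encodings are summed
    (an order-independent pooling), and the pooled vector is decoded.
    Reordering the rows of X cannot change the output.
    """
    H = np.tanh(X @ W_phi)   # phi, applied element-wise
    pooled = H.sum(axis=0)   # permutation-invariant pooling
    return np.tanh(pooled @ W_rho)

# Sanity check: a shuffled copy of the set embeds identically.
X = rng.normal(size=(5, D_IN))
assert np.allclose(set_embedding(X), set_embedding(X[rng.permutation(5)]))
      </pre>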
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Once we have the
        representations, in many applications their interpretability is
        as crucial as their accuracy. Deep models often yield better
        accuracy, but they require a large number of parameters, often
        regardless of the simplicity of the underlying data, which
        renders them uninterpretable; this is highly undesirable in
        tasks like user modeling. Bayesian models, on the other hand,
        produce sparse discrete representations that are easily amenable
        to human interpretation. In this thesis, we want to explore
        methods capable of learning mixed representations that retain
        the best of both worlds. Our experimental evaluations show that
        the proposed techniques compare favorably with several
        state-of-the-art baselines.</font></div>
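      <p><font face="Helvetica, Arial, sans-serif">As one generic
        illustration of blending the two paradigms (not necessarily the
        mixed-representation method proposed here), a relaxation such as
        Gumbel-softmax lets a deep model learn discrete, sparse latent
        codes; the sketch below shows how the temperature trades off
        between a near one-hot code and a smooth, trainable one. All
        numbers are illustrative.</font></p>
      <pre>
# Sketch of the Gumbel-softmax relaxation, a standard technique for
# learning discrete codes inside deep models. Shown only to make the
# "mixed representation" idea concrete; not the proposal's method.
import numpy as np

rng = np.random.default_rng(1)

def gumbel_softmax(logits, temperature):
    """Differentiable approximate sample from a categorical.

    Low temperatures push the output toward a one-hot vector, i.e.
    a sparse discrete code; high temperatures give a smooth mixture
    that is easy to train through.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()            # numerical stability
    e = np.exp(y)
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0, 0.1])
print(gumbel_softmax(logits, temperature=0.1))  # near one-hot
print(gumbel_softmax(logits, temperature=5.0))  # smooth mixture
      </pre>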
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Finally, one would
        want such interpretable representations to be inferred from
        large-scale data; however, there is often a mismatch between our
        computational resources and the statistical models. In this
        thesis, we want to bridge this gap with solutions that combine
        modern computational techniques and data structures on one side
        with modified statistical inference algorithms on the other. We
        introduce new ways to parallelize, reduce look-ups, handle
        variable state-space sizes, and escape saddle points. On latent
        variable models such as latent Dirichlet allocation (LDA), we
        find significant gains in performance.</font></div>
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">To summarize, in
        this thesis we want to explore three major aspects of
        representation learning: diversity, being able to handle
        different types of data; interpretability, being accessible to
        and understandable by humans; and scalability, being able to
        process massive datasets in a reasonable time and budget.</font></div>
      <font face="Helvetica, Arial, sans-serif"><br>
        Thesis Committee:<br>
        &nbsp;&nbsp;&nbsp; Barnabas Poczos, Co-Chair<br>
        &nbsp;&nbsp;&nbsp; Ruslan Salakhutdinov, Co-Chair<br>
        &nbsp;&nbsp;&nbsp; Alexander J. Smola (Amazon)<br>
        &nbsp;&nbsp;&nbsp; Andrew McCallum (UMass Amherst)<br>
      </font>
      <font face="Helvetica, Arial, sans-serif"><br>
        Link to proposal document:<br>
        <a href="http://manzil.ml/proposal.pdf">http://manzil.ml/proposal.pdf</a>
      </font><br>
      <pre class="moz-signature" cols="72">-- 
Diane Stidle
Graduate Programs Manager
Machine Learning Department
Carnegie Mellon University
<a class="moz-txt-link-abbreviated" href="mailto:diane@cs.cmu.edu" moz-do-not-send="true">diane@cs.cmu.edu</a>
412-268-1299</pre>
    </div>
  </body>
</html>