<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    If you're free, please attend Manzil's presentation.<br>
    <div class="moz-forward-container"><br>
      <br>
      -------- Forwarded Message --------
      <table class="moz-email-headers-table" cellspacing="0"
        cellpadding="0" border="0">
        <tbody>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
            </th>
            <td>Reminder - Thesis Proposal - 1/12/18 - Manzil Zaheer -
              Representation Learning @ Scale</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
            <td>Thu, 11 Jan 2018 17:09:00 -0500</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
            <td>Diane Stidle <a class="moz-txt-link-rfc2396E" href="mailto:diane+@cs.cmu.edu"><diane+@cs.cmu.edu></a></td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
            <td><a class="moz-txt-link-abbreviated" href="mailto:ml-seminar@cs.cmu.edu">ml-seminar@cs.cmu.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:ML-SEMINAR@cs.cmu.edu"><ML-SEMINAR@cs.cmu.edu></a>,
              Alex Smola <a class="moz-txt-link-rfc2396E" href="mailto:alex.smola@gmail.com"><alex.smola@gmail.com></a>,
              <a class="moz-txt-link-abbreviated" href="mailto:mccallum@cs.umass.edu">mccallum@cs.umass.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:mccallum@cs.umass.edu"><mccallum@cs.umass.edu></a></td>
          </tr>
        </tbody>
      </table>
      <br>
      <br>
      <p><font face="Helvetica, Arial, sans-serif"><i>Thesis Proposal</i></font></p>
      <font face="Helvetica, Arial, sans-serif">Date: January 12, 2018<br>
        Time: 3:00 PM<br>
        Place: 8102 GHC<br>
        Speaker: Manzil Zaheer<br>
        <br>
        Title: Representation Learning @ Scale<br>
        <br>
        Abstract:<br>
        <br>
      </font>
      <div><font face="Helvetica, Arial, sans-serif">Machine learning
        techniques are reaching or exceeding human-level performance in
        tasks like image classification, translation, and text-to-speech.
        The success of these machine learning algorithms has been
        attributed to highly versatile representations learnt from data
        using deep networks or intricately designed Bayesian models.
        Representation learning has also provided hints in neuroscience,
        e.g. for understanding how humans might categorize objects.
        Despite these instances of success, many open questions remain.</font></div>
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Data come in all
        shapes and sizes: not just as images or text, but also as point
        clouds, sets, graphs, compressed data, or even heterogeneous
        mixtures of these data types. In this thesis, we want to develop
        representation learning algorithms for such unconventional data
        types by leveraging their structure and establishing new
        mathematical properties. Representations learned in this fashion
        were applied to diverse domains and found to be competitive with
        task-specific state-of-the-art methods.</font></div>
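 </f">
      <p><font face="Helvetica, Arial, sans-serif">As a concrete
        illustration of what a structure-aware model for one such data
        type could look like, below is a minimal NumPy sketch of a
        permutation-invariant set function of the form rho(sum of
        phi(x)), one standard construction for learning on sets. The
        layer sizes and random weights are illustrative assumptions, not
        details from the proposal.</font></p>
      <pre>
# Minimal sketch of a permutation-invariant set function,
# f(X) = rho(sum_i phi(x_i)). All sizes and weights are
# illustrative, not taken from the proposal.
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID, D_OUT = 3, 64, 8                # illustrative dimensions
W_phi = rng.normal(0, 0.1, (D_IN, D_HID))    # per-element encoder phi
W_rho = rng.normal(0, 0.1, (D_HID, D_OUT))   # post-pooling decoder rho

def set_embedding(X):
    """Embed a set X of shape (n_elements, D_IN).

    Each element is encoded independently, the encodings are summed
    (an order-independent pooling), and the pooled vector is decoded.
    Reordering the rows of X cannot change the output.
    """
    H = np.tanh(X @ W_phi)   # phi, applied element-wise
    pooled = H.sum(axis=0)   # permutation-invariant pooling
    return np.tanh(pooled @ W_rho)

# Sanity check: a shuffled copy of the set embeds identically.
X = rng.normal(size=(5, D_IN))
assert np.allclose(set_embedding(X), set_embedding(X[rng.permutation(5)]))
      </pre>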
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Once we have the
        representations, in many applications their interpretability is
        as crucial as their accuracy. Deep models often yield better
        accuracy, but they require a large number of parameters, often
        regardless of the simplicity of the underlying data, which
        renders them uninterpretable; this is highly undesirable in
        tasks like user modeling. Bayesian models, on the other hand,
        produce sparse discrete representations that are easily amenable
        to human interpretation. In this thesis, we want to explore
        methods capable of learning mixed representations that retain
        the best of both worlds. Our experimental evaluations show that
        the proposed techniques compare favorably with several
        state-of-the-art baselines.</font></div>
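      <p><font face="Helvetica, Arial, sans-serif">As one generic
        illustration of blending the two paradigms (not necessarily the
        mixed-representation method proposed here), a relaxation such as
        Gumbel-softmax lets a deep model learn discrete, sparse latent
        codes; the sketch below shows how the temperature trades off
        between a near one-hot code and a smooth, trainable one. All
        numbers are illustrative.</font></p>
      <pre>
# Sketch of the Gumbel-softmax relaxation, a standard technique for
# learning discrete codes inside deep models. Shown only to make the
# "mixed representation" idea concrete; not the proposal's method.
import numpy as np

rng = np.random.default_rng(1)

def gumbel_softmax(logits, temperature):
    """Differentiable approximate sample from a categorical.

    Low temperatures push the output toward a one-hot vector, i.e.
    a sparse discrete code; high temperatures give a smooth mixture
    that is easy to train through.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()            # numerical stability
    e = np.exp(y)
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0, 0.1])
print(gumbel_softmax(logits, temperature=0.1))  # near one-hot
print(gumbel_softmax(logits, temperature=5.0))  # smooth mixture
      </pre>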
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">Finally, one would
        want such interpretable representations to be inferred from
        large-scale data; however, there is often a mismatch between our
        computational resources and the statistical models. In this
        thesis, we want to bridge this gap with solutions that combine
        modern computational techniques and data structures on one side
        with modified statistical inference algorithms on the other. We
        introduce new ways to parallelize, reduce look-ups, handle
        variable state-space sizes, and escape saddle points. On latent
        variable models such as latent Dirichlet allocation (LDA), we
        find significant gains in performance.</font></div>
      <div><br></div>
      <div><font face="Helvetica, Arial, sans-serif">To summarize, in
        this thesis we want to explore three major aspects of
        representation learning: diversity, being able to handle
        different types of data; interpretability, being accessible to
        and understandable by humans; and scalability, being able to
        process massive datasets in a reasonable time and budget.</font></div>
      <font face="Helvetica, Arial, sans-serif"><br>
        Thesis Committee:<br>
        &nbsp;&nbsp;&nbsp; Barnabas Poczos, Co-Chair<br>
        &nbsp;&nbsp;&nbsp; Ruslan Salakhutdinov, Co-Chair<br>
        &nbsp;&nbsp;&nbsp; Alexander J. Smola (Amazon)<br>
        &nbsp;&nbsp;&nbsp; Andrew McCallum (UMass Amherst)<br>
      </font>
      <font face="Helvetica, Arial, sans-serif"><br>
        Link to proposal document:<br>
        <a href="http://manzil.ml/proposal.pdf">http://manzil.ml/proposal.pdf</a>
      </font><br>
      <pre class="moz-signature" cols="72">-- 
Diane Stidle
Graduate Programs Manager
Machine Learning Department
Carnegie Mellon University
<a class="moz-txt-link-abbreviated" href="mailto:diane@cs.cmu.edu" moz-do-not-send="true">diane@cs.cmu.edu</a>
412-268-1299</pre>
    </div>
  </body>
</html>