<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
If you're free, please attend Manzil's presentation.<br>
<div class="moz-forward-container"><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
</th>
<td>Reminder - Thesis Proposal - 1/12/18 - Manzil Zaheer -
Representation Learning @ Scale</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
<td>Thu, 11 Jan 2018 17:09:00 -0500</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
<td>Diane Stidle <a class="moz-txt-link-rfc2396E" href="mailto:diane+@cs.cmu.edu"><diane+@cs.cmu.edu></a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:ml-seminar@cs.cmu.edu">ml-seminar@cs.cmu.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:ML-SEMINAR@cs.cmu.edu"><ML-SEMINAR@cs.cmu.edu></a>,
Alex Smola <a class="moz-txt-link-rfc2396E" href="mailto:alex.smola@gmail.com"><alex.smola@gmail.com></a>,
<a class="moz-txt-link-abbreviated" href="mailto:mccallum@cs.umass.edu">mccallum@cs.umass.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:mccallum@cs.umass.edu"><mccallum@cs.umass.edu></a></td>
</tr>
</tbody>
</table>
<br>
<br>
<p><font face="Helvetica, Arial, sans-serif"><i>Thesis Proposal</i><br>
</font></p>
<font face="Helvetica, Arial, sans-serif">Date: January 12, 2018<br>
Time: 3:00 PM<br>
Place: 8102 GHC<br>
Speaker: Manzil Zaheer<br>
<br>
Title: Representation Learning @ Scale<br>
<br>
Abstract:<br>
<br>
</font>
<div><font face="Helvetica, Arial, sans-serif">Machine learning
techniques are reaching or exceeding human-level performance in
tasks such as image classification, translation, and
text-to-speech. The success of these machine learning algorithms
has been attributed to highly versatile representations learnt
from data using deep networks or intricately designed Bayesian
models. Representation learning has also provided hints in
neuroscience, e.g. for understanding how humans might categorize
objects. Despite these instances of success, many open questions
remain.</font></div>
<div><font face="Helvetica, Arial, sans-serif"><br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Data come in all
shapes and sizes: not just as images or text, but also as point
clouds, sets, graphs, compressed data, or even heterogeneous
mixtures of these data types. In this thesis, we want to develop
representation learning algorithms for such unconventional data
types by leveraging their structure and establishing new
mathematical properties. Representations learned in this fashion
were applied to diverse domains and found to be competitive with
task-specific state-of-the-art methods.</font></div>
<div><font face="Helvetica, Arial, sans-serif"><br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Once we have the
representations, in various applications their interpretability
is as crucial as their accuracy. Deep models often yield better
accuracy, but require a large number of parameters, often
notwithstanding the simplicity of the underlying data, rendering
them uninterpretable, which is highly undesirable in tasks like
user modeling. On the other hand, Bayesian models produce sparse
discrete representations, easily amenable to human
interpretation. In this thesis, we want to explore methods that
are capable of learning mixed representations retaining the best
of both worlds. Our experimental evaluations show that the
proposed techniques compare favorably with several
state-of-the-art baselines.</font></div>
<div><font face="Helvetica, Arial, sans-serif"><br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Finally, one would
want such interpretable representations to be inferred from
large-scale data; however, there is often a mismatch between our
computational resources and the statistical models. In this
thesis, we want to bridge this gap with solutions based on a
combination of modern computational techniques and data
structures on one side and modified statistical inference
algorithms on the other. We introduce new ways to parallelize,
reduce look-ups, handle variable state space sizes, and escape
saddle points. On latent variable models, like latent Dirichlet
allocation (LDA), we find significant gains in
performance.</font></div>
<div><font face="Helvetica, Arial, sans-serif"><br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">To summarize, in
this thesis, we want to explore three major aspects of
representation learning --- diversity: being able to handle
different types of data; interpretability: being accessible to
and understandable by humans; and scalability: being able to
process massive datasets in a reasonable time and
budget.</font></div>
<font face="Helvetica, Arial, sans-serif"><br>
Thesis Committee:<br>
</font>
<div><font face="Helvetica, Arial, sans-serif">Barnabas Poczos, Co-Chair<br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Ruslan Salakhutdinov, Co-Chair<br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Alexander J. Smola (Amazon)<br>
</font></div>
<div><font face="Helvetica, Arial, sans-serif">Andrew McCallum (UMass Amherst)<br>
</font></div>
<font face="Helvetica, Arial, sans-serif"><br>
Link to proposal document:<br>
<a href="http://manzil.ml/proposal.pdf" moz-do-not-send="true">http://manzil.ml/proposal.pdf</a></font><br>
<pre class="moz-signature" cols="72">--
Diane Stidle
Graduate Programs Manager
Machine Learning Department
Carnegie Mellon University
<a class="moz-txt-link-abbreviated" href="mailto:diane@cs.cmu.edu" moz-do-not-send="true">diane@cs.cmu.edu</a>
412-268-1299</pre>
</div>
</body>
</html>