<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
It is today, 4pm, Gates 4405!<br>
<br>
A.<br>
<br>
<div class="moz-cite-prefix">On 11/21/2017 11:56 AM, Artur Dubrawski
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:d393a7cd-aa11-0cf1-2d84-89a9e863760c@cs.cmu.edu">
Team,<br>
<br>
Happy Thanksgiving!<br>
<br>
+ please mark your calendars for Monday next week. <br>
Attending Matt's proposal talk will surely help us burn the excess
calories from our turkey dinners.<br>
<br>
Cheers,<br>
Artur<br>
<div class="moz-forward-container"><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
</th>
<td>RI Ph.D. Thesis Proposal: Matt Barnes</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date:
</th>
<td>Mon, 20 Nov 2017 15:37:18 +0000</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">From:
</th>
<td>Suzanne Muth <a class="moz-txt-link-rfc2396E"
href="mailto:lyonsmuth@cmu.edu" moz-do-not-send="true">&lt;lyonsmuth@cmu.edu&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:ri-people@cs.cmu.edu"
moz-do-not-send="true">ri-people@cs.cmu.edu</a> <a
class="moz-txt-link-rfc2396E"
href="mailto:ri-people@cs.cmu.edu"
moz-do-not-send="true">&lt;ri-people@cs.cmu.edu&gt;</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<style type="text/css" style="display:none"><!-- P { margin-top: 0px; margin-bottom: 0px; }--></style>
<div id="Signature">
<div name="divtagdefaultwrapper" style="margin: 0px;">
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
style="font-family: Calibri, Arial, Helvetica,
sans-serif;">Date: 27 November 2017</span><br
style="font-family: Calibri, Arial, Helvetica,
sans-serif;">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Time:
4:00 p.m.</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
style="font-family: Calibri, Arial, Helvetica,
sans-serif;">Place: Gates Hillman Center 4405</span><br
style="font-family: Calibri, Arial, Helvetica,
sans-serif;">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Type:
Ph.D. Thesis Proposal</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Who:
Matt Barnes</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Topic:
Learning with Clusters</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-size: 11pt;"> <br
style="">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Abstract:</span></p>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">As machine learning
becomes more ubiquitous, clustering has evolved from
primarily a data analysis tool into an integrated
component of complex machine learning systems,
including those involving dimensionality reduction,
anomaly detection, network analysis, image
segmentation, and classification of groups of data. </span></font><span
style="font-size:14.6667px;color:rgb(33,33,33);font-family:Calibri">With
this integration into multi-stage systems comes a need
to better understand interactions between pipeline
components. Changing parameters of the clustering
algorithm will impact downstream components and, quite
unfortunately, it is usually not possible to simply
back-propagate through the entire system. Currently, as
with many machine learning systems, the output of the
clustering algorithm is taken as ground truth at the
next pipeline step. Our empirical results show this
false assumption can have a dramatic impact,
sometimes biasing results by upwards of 25%.</span></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">We address this gap by
developing estimators and methods to both quantify and
correct for clustering errors' impacts on downstream
learners. Our work is agnostic to the downstream
learners, and requires few assumptions on the
clustering algorithm. Theoretical and empirical
results demonstrate our methods and estimators are
superior to the current naive approaches, which do not
account for clustering errors.</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">Along these lines, we also
develop several new clustering algorithms and prove
theoretical bounds for existing algorithms, to be used
as inputs to our later error-correction methods. Not
surprisingly, we find learning on clusters of data is
both theoretically and empirically easier as the
number of clustering errors decreases. Thus, our work
is two-fold. We attempt to both provide the best
clustering possible and learn on inevitably noisy
clusters.</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">A major limiting factor in
our error-correction methods is scalability.
Currently, their computational complexity is O(n^3),
where n is the size of the training dataset. This
limits their applicability to very small machine
learning problems. We propose addressing this
scalability issue through approximation. It should be
possible to reduce the computational complexity to
O(p^3), where p is a small fixed constant,
independent of n, corresponding to the number of
parameters in the approximation.</span></font></div>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);"> <br>
</p>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);"> <span
style="font-family: Calibri, Arial, Helvetica,
sans-serif;"></span></p>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);"> <br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-family: Calibri,
Arial, Helvetica, sans-serif; font-size: 11pt;">Thesis
Committee Members:</span></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Artur
Dubrawski, Chair</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Geoff Gordon</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Kris Kitani</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Beka Steorts,
Duke University</span></font></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
style="font-family: Calibri, Arial, Helvetica,
sans-serif;">A copy of the thesis proposal document is
available at:</span></span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font color="#212121"><a
href="http://goo.gl/MpwTCN" target="_blank"
moz-do-not-send="true">http://goo.gl/MpwTCN</a></font><br>
</span></span></p>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>