<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Team,<br>
<br>
Happy Thanksgiving!<br>
<br>
+ please mark your calendars for Monday next week. <br>
Attending Matt's proposal talk will surely help us burn the excess
calories from our turkey dinners.<br>
<br>
Cheers,<br>
Artur<br>
<div class="moz-forward-container"><br>
<br>
-------- Forwarded Message --------
<table class="moz-email-headers-table" cellspacing="0"
cellpadding="0" border="0">
<tbody>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
</th>
<td>RI Ph.D. Thesis Proposal: Matt Barnes</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
<td>Mon, 20 Nov 2017 15:37:18 +0000</td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
<td>Suzanne Muth <a class="moz-txt-link-rfc2396E" href="mailto:lyonsmuth@cmu.edu">&lt;lyonsmuth@cmu.edu&gt;</a></td>
</tr>
<tr>
<th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
<td><a class="moz-txt-link-abbreviated" href="mailto:ri-people@cs.cmu.edu">ri-people@cs.cmu.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:ri-people@cs.cmu.edu">&lt;ri-people@cs.cmu.edu&gt;</a></td>
</tr>
</tbody>
</table>
<br>
<br>
<style type="text/css" style="display:none"><!-- P { margin-top: 0px; margin-bottom: 0px; }--></style>
<div id="Signature">
<div name="divtagdefaultwrapper" style="margin: 0px;">
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-size: 11pt;"><span style="font-family:
Calibri, Arial, Helvetica, sans-serif;">Date:
27 November 2017</span><br style="font-family:
Calibri, Arial, Helvetica, sans-serif;">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Time: 4:00 p.m.</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-size: 11pt;"><span style="font-family:
Calibri, Arial, Helvetica, sans-serif;">Place: Gates
Hillman Center 4405</span><br style="font-family:
Calibri, Arial, Helvetica, sans-serif;">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Type: Ph.D. Thesis
Proposal</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Who: Matt Barnes</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Topic: Learning with
Clusters</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-size: 11pt;"> <br style="">
</span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Abstract:</span></p>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">As machine learning becomes
more ubiquitous, clustering has evolved from primarily a
data analysis tool into an integrated component of
complex machine learning systems, including those
involving dimensionality reduction, anomaly detection,
network analysis, image segmentation and classifying
groups of data. </span></font><span
style="font-size:14.6667px;color:rgb(33,33,33);font-family:Calibri">With
this integration into multi-stage systems comes a need to
better understand interactions between pipeline
components. Changing parameters of the clustering
algorithm will impact downstream components and, quite
unfortunately, it is usually not possible to simply
back-propagate through the entire system. Currently, as
with many machine learning systems, the output of the
clustering algorithm is taken as ground truth at the next
pipeline step. Our empirical results show this false
assumption can have dramatic effects, sometimes biasing
results by upwards of 25%.</span></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">We address this gap by
developing estimators and methods to both quantify and
correct for clustering errors' impacts on downstream
learners. Our work is agnostic to the downstream
learners, and requires few assumptions on the clustering
algorithm. Theoretical and empirical results demonstrate
our methods and estimators are superior to the current
naive approaches, which do not account for clustering
errors.</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">Along these lines, we also
develop several new clustering algorithms and prove
theoretical bounds for existing algorithms, to be used
as inputs to our later error-correction methods. Not
surprisingly, we find learning on clusters of data is
both theoretically and empirically easier as the number
of clustering errors decreases. Thus, our work is
two-fold: we attempt both to provide the best clustering
possible and to learn on inevitably noisy clusters.</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px"><br>
</span></font></div>
<div style="font-family: Calibri, Arial, Helvetica,
sans-serif;"><font face="Calibri" color="#212121"><span
style="font-size:14.6667px">A major limiting factor in
our error-correction methods is scalability. Currently,
their computational complexity is O(n^3) where n is the
size of the training dataset. This limits their
applicability to very small machine learning problems.
We propose addressing this scalability issue through
approximation. It should be possible to reduce the
computational complexity to O(p^3) where p is a small
fixed constant and independent of n, corresponding to
the number of parameters in the approximation.</span></font></div>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);">
<br>
</p>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif;"></span></p>
<p style="font-family: Corbel, sans-serif; font-size: 16px;
background-color: rgb(255, 255, 255);">
<br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-family: Calibri, Arial, Helvetica,
sans-serif; font-size: 11pt;">Thesis Committee Members:</span></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Artur
Dubrawski, Chair</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Geoff Gordon</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Kris Kitani</span></font></p>
<p style="background-color: rgb(255, 255, 255);"><font
face="Calibri, Arial, Helvetica, sans-serif"><span
style="font-size: 14.666666984558105px;">Beka Steorts,
Duke University</span></font></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<br>
</p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-size: 11pt;"><span style="font-family:
Calibri, Arial, Helvetica, sans-serif;">A copy of the
thesis proposal document is available at:</span></span></p>
<p style="font-family: Corbel, sans-serif; background-color:
rgb(255, 255, 255);">
<span style="font-size: 11pt;"><span style="font-family:
Calibri, Arial, Helvetica, sans-serif;"><font
color="#212121"><a href="http://goo.gl/MpwTCN"
target="_blank" moz-do-not-send="true">http://goo.gl/MpwTCN</a></font><br>
</span></span></p>
</div>
</div>
</div>
</body>
</html>