<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Team,<br>
    <br>
    Happy Thanksgiving!<br>
    <br>
    Also, please mark your calendars for Monday next week.<br>
    Attending Matt's proposal talk will surely help us burn the excess
    calories from our turkey dinners.<br>
    <br>
    Cheers,<br>
    Artur<br>
    <div class="moz-forward-container"><br>
      <br>
      -------- Forwarded Message --------
      <table class="moz-email-headers-table" cellspacing="0"
        cellpadding="0" border="0">
        <tbody>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
            </th>
            <td>RI Ph.D. Thesis Proposal: Matt Barnes</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>
            <td>Mon, 20 Nov 2017 15:37:18 +0000</td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>
            <td>Suzanne Muth <a class="moz-txt-link-rfc2396E" href="mailto:lyonsmuth@cmu.edu">&lt;lyonsmuth@cmu.edu&gt;</a></td>
          </tr>
          <tr>
            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
            <td><a class="moz-txt-link-abbreviated" href="mailto:ri-people@cs.cmu.edu">ri-people@cs.cmu.edu</a> <a class="moz-txt-link-rfc2396E" href="mailto:ri-people@cs.cmu.edu">&lt;ri-people@cs.cmu.edu&gt;</a></td>
          </tr>
        </tbody>
      </table>
      <br>
      <br>
      <style type="text/css" style="display:none"><!-- P { margin-top: 0px; margin-bottom: 0px; }--></style>
      <div id="Signature">
        <div name="divtagdefaultwrapper" style="margin: 0px;">
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-size: 11pt;"><span style="font-family:
                Calibri, Arial, Helvetica, sans-serif;">Date:
                  27 November 2017</span><br style="font-family:
                Calibri, Arial, Helvetica, sans-serif;">
            </span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Time:   4:00 p.m.</span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-size: 11pt;"><span style="font-family:
                Calibri, Arial, Helvetica, sans-serif;">Place:  Gates
                Hillman Center 4405</span><br style="font-family:
                Calibri, Arial, Helvetica, sans-serif;">
            </span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Type:   Ph.D. Thesis
              Proposal</span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Who:   Matt Barnes</span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Topic:  Learning with
              Clusters</span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-size: 11pt;"> <br style="">
            </span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Abstract:</span></p>
          <p style="font-family: Corbel, sans-serif; font-size: 16px;
            background-color: rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"></span></p>
          <p style="font-family: Corbel, sans-serif; font-size: 16px;
            background-color: rgb(255, 255, 255);">
          </p>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px">As machine learning becomes
                more ubiquitous, clustering has evolved from primarily a
                data analysis tool into an integrated component of
                complex machine learning systems, including those
                involving dimensionality reduction, anomaly detection,
                network analysis, image segmentation, and classification
                of groups of data. </span></font><span
              style="font-size:14.6667px;color:rgb(33,33,33);font-family:Calibri">With
              this integration into multi-stage systems comes a need to
              better understand interactions between pipeline
              components. Changing parameters of the clustering
              algorithm will impact downstream components and, quite
              unfortunately, it is usually not possible to simply
              back-propagate through the entire system. Currently, as
              with many machine learning systems, the output of the
              clustering algorithm is taken as ground truth at the next
              pipeline step. Our empirical results show that this false
              assumption can have a dramatic impact, sometimes biasing
              results by upwards of 25%.</span></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px"><br>
              </span></font></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px">We address this gap by
                developing estimators and methods to both quantify and
                correct for clustering errors' impacts on downstream
                learners. Our work is agnostic to the downstream
                learners and requires few assumptions about the clustering
                algorithm. Theoretical and empirical results demonstrate
                our methods and estimators are superior to the current
                naive approaches, which do not account for clustering
                errors.</span></font></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px"><br>
              </span></font></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px">Along these lines, we also
                develop several new clustering algorithms and prove
                theoretical bounds for existing algorithms, to be used
                as inputs to our later error-correction methods. Not
                surprisingly, we find learning on clusters of data is
                both theoretically and empirically easier as the number
                of clustering errors decreases. Thus, our work is
                two-fold: we attempt both to provide the best clustering
                possible and to learn on inevitably noisy clusters.</span></font></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px"><br>
              </span></font></div>
          <div style="font-family: Calibri, Arial, Helvetica,
            sans-serif;"><font face="Calibri" color="#212121"><span
                style="font-size:14.6667px">A major limiting factor in
                our error-correction methods is scalability. Currently,
                their computational complexity is O(n^3) where n is the
                size of the training dataset. This limits their
                applicability to very small machine learning problems.
                We propose addressing this scalability issue through
                approximation. It should be possible to reduce the
                computational complexity to O(p^3), where p, the number
                of parameters in the approximation, is a small fixed
                constant independent of n.</span></font></div>
          <p style="font-family: Corbel, sans-serif; font-size: 16px;
            background-color: rgb(255, 255, 255);">
            <br>
          </p>
          <p style="font-family: Corbel, sans-serif; font-size: 16px;
            background-color: rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"></span></p>
          <p style="font-family: Corbel, sans-serif; font-size: 16px;
            background-color: rgb(255, 255, 255);">
             <br>
          </p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-family: Calibri, Arial, Helvetica,
              sans-serif; font-size: 11pt;">Thesis Committee Members:</span></p>
          <p style="background-color: rgb(255, 255, 255);"><font
              face="Calibri, Arial, Helvetica, sans-serif"><span
                style="font-size: 14.666666984558105px;">Artur
                Dubrawski, Chair</span></font></p>
          <p style="background-color: rgb(255, 255, 255);"><font
              face="Calibri, Arial, Helvetica, sans-serif"><span
                style="font-size: 14.666666984558105px;">Geoff Gordon</span></font></p>
          <p style="background-color: rgb(255, 255, 255);"><font
              face="Calibri, Arial, Helvetica, sans-serif"><span
                style="font-size: 14.666666984558105px;">Kris Kitani</span></font></p>
          <p style="background-color: rgb(255, 255, 255);"><font
              face="Calibri, Arial, Helvetica, sans-serif"><span
                style="font-size: 14.666666984558105px;">Beka Steorts,
                Duke University</span></font></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
             <br>
          </p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <br>
          </p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-size: 11pt;"><span style="font-family:
                Calibri, Arial, Helvetica, sans-serif;">A copy of the
                thesis proposal document is available at:</span></span></p>
          <p style="font-family: Corbel, sans-serif; background-color:
            rgb(255, 255, 255);">
            <span style="font-size: 11pt;"><span style="font-family:
                Calibri, Arial, Helvetica, sans-serif;"><font
                  color="#212121"><a href="http://goo.gl/MpwTCN"
                    target="_blank" moz-do-not-send="true">http://goo.gl/MpwTCN</a></font><br>
              </span></span></p>
        </div>
      </div>
    </div>
  </body>
</html>