<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    It is today, 4pm, Gates 4405!<br>
    <br>
    A.<br>
    <br>
    <div class="moz-cite-prefix">On 11/21/2017 11:56 AM, Artur Dubrawski
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:d393a7cd-aa11-0cf1-2d84-89a9e863760c@cs.cmu.edu">
      <meta http-equiv="content-type" content="text/html; charset=utf-8">
      Team,<br>
      <br>
      Happy Thanksgiving!<br>
      <br>
      + please mark your calendars for Monday next week. <br>
      Attending Matt's proposal talk will surely help us burn the excess
      calories from our turkey dinners.<br>
      <br>
      Cheers,<br>
      Artur<br>
      <div class="moz-forward-container"><br>
        <br>
        -------- Forwarded Message --------
        <table class="moz-email-headers-table" cellspacing="0"
          cellpadding="0" border="0">
          <tbody>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
              </th>
              <td>RI Ph.D. Thesis Proposal: Matt Barnes</td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date:
              </th>
              <td>Mon, 20 Nov 2017 15:37:18 +0000</td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From:
              </th>
              <td>Suzanne Muth <a class="moz-txt-link-rfc2396E"
                  href="mailto:lyonsmuth@cmu.edu" moz-do-not-send="true">&lt;lyonsmuth@cmu.edu&gt;</a></td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
              <td><a class="moz-txt-link-abbreviated"
                  href="mailto:ri-people@cs.cmu.edu"
                  moz-do-not-send="true">ri-people@cs.cmu.edu</a> <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:ri-people@cs.cmu.edu"
                  moz-do-not-send="true">&lt;ri-people@cs.cmu.edu&gt;</a></td>
            </tr>
          </tbody>
        </table>
        <br>
        <br>
        <meta http-equiv="Content-Type" content="text/html;
          charset=utf-8">
        <style type="text/css" style="display:none"><!-- P { margin-top: 0px; margin-bottom: 0px; }--></style>
        <div id="Signature">
          <div name="divtagdefaultwrapper" style="margin: 0px;">
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;">Date:   27 November 2017</span><br
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;">
              </span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Time:  
                4:00 p.m.</span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;">Place:  Gates Hillman Center 4405</span><br
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;">
              </span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Type:
                  Ph.D. Thesis Proposal</span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Who:
                  Matt Barnes</span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Topic:
                 Learning with Clusters</span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-size: 11pt;"> <br
                  style="">
              </span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Abstract:</span></p>
            <p style="font-family: Corbel, sans-serif; font-size: 16px;
              background-color: rgb(255, 255, 255);"> <span
                style="font-family: Calibri, Arial, Helvetica,
                sans-serif;"></span></p>
            <p style="font-family: Corbel, sans-serif; font-size: 16px;
              background-color: rgb(255, 255, 255);"> </p>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px">As machine learning
                  becomes more ubiquitous, clustering has evolved from
                  primarily a data analysis tool into an integrated
                  component of complex machine learning systems,
                  including those involving dimensionality reduction,
                  anomaly detection, network analysis, image
                  segmentation and classifying groups of data. </span></font><span
style="font-size:14.6667px;color:rgb(33,33,33);font-family:Calibri">With
                this integration into multi-stage systems comes a need
                to better understand interactions between pipeline
                components. Changing parameters of the clustering
                algorithm will impact downstream components and, quite
                unfortunately, it is usually not possible to simply
                back-propagate through the entire system. Currently, as
                with many machine learning systems, the output of the
                clustering algorithm is taken as ground truth at the
                next pipeline step. Our empirical results show that this
                false assumption can have a dramatic impact, sometimes
                biasing results by upwards of 25%.</span></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px"><br>
                </span></font></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px">We address this gap by
                  developing estimators and methods to both quantify and
                  correct for clustering errors' impacts on downstream
                  learners. Our work is agnostic to the downstream
                  learners, and requires few assumptions on the
                  clustering algorithm. Theoretical and empirical
                  results demonstrate our methods and estimators are
                  superior to the current naive approaches, which do not
                  account for clustering errors.</span></font></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px"><br>
                </span></font></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px">Along these lines, we also
                  develop several new clustering algorithms and prove
                  theoretical bounds for existing algorithms, to be used
                  as inputs to our later error-correction methods. Not
                  surprisingly, we find learning on clusters of data is
                  both theoretically and empirically easier as the
                  number of clustering errors decreases. Thus, our work
                  is two-fold: we attempt both to provide the best
                  clustering possible and to learn on inevitably noisy
                  clusters.</span></font></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px"><br>
                </span></font></div>
            <div style="font-family: Calibri, Arial, Helvetica,
              sans-serif;"><font face="Calibri" color="#212121"><span
                  style="font-size:14.6667px">A major limiting factor in
                  our error-correction methods is scalability.
                  Currently, their computational complexity is O(n^3)
                  where n is the size of the training dataset. This
                  limits their applicability to very small machine
                  learning problems. We propose addressing this
                  scalability issue through approximation. It should be
                  possible to reduce the computational complexity to
                  O(p^3) where p is a small fixed constant and
                  independent of n, corresponding to the number of
                  parameters in the approximation.</span></font></div>
            <p style="font-family: Corbel, sans-serif; font-size: 16px;
              background-color: rgb(255, 255, 255);"> <br>
            </p>
            <p style="font-family: Corbel, sans-serif; font-size: 16px;
              background-color: rgb(255, 255, 255);"> <span
                style="font-family: Calibri, Arial, Helvetica,
                sans-serif;"></span></p>
            <p style="font-family: Corbel, sans-serif; font-size: 16px;
              background-color: rgb(255, 255, 255);">  <br>
            </p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-family: Calibri,
                Arial, Helvetica, sans-serif; font-size: 11pt;">Thesis
                Committee Members:</span></p>
            <p style="background-color: rgb(255, 255, 255);"><font
                face="Calibri, Arial, Helvetica, sans-serif"><span
                  style="font-size: 14.666666984558105px;">Artur
                  Dubrawski, Chair</span></font></p>
            <p style="background-color: rgb(255, 255, 255);"><font
                face="Calibri, Arial, Helvetica, sans-serif"><span
                  style="font-size: 14.666666984558105px;">Geoff Gordon</span></font></p>
            <p style="background-color: rgb(255, 255, 255);"><font
                face="Calibri, Arial, Helvetica, sans-serif"><span
                  style="font-size: 14.666666984558105px;">Kris Kitani</span></font></p>
            <p style="background-color: rgb(255, 255, 255);"><font
                face="Calibri, Arial, Helvetica, sans-serif"><span
                  style="font-size: 14.666666984558105px;">Beka Steorts,
                  Duke University</span></font></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);">  <br>
            </p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <br>
            </p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;">A copy of the thesis proposal document is
                  available at:</span></span></p>
            <p style="font-family: Corbel, sans-serif; background-color:
              rgb(255, 255, 255);"> <span style="font-size: 11pt;"><span
                  style="font-family: Calibri, Arial, Helvetica,
                  sans-serif;"><font color="#212121"><a
                      href="http://goo.gl/MpwTCN" target="_blank"
                      moz-do-not-send="true">http://goo.gl/MpwTCN</a></font><br>
                </span></span></p>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>