<div dir="ltr">A reminder about Yichong's defense coming up this Monday at 9am.<div><br><div>All details below, it will be a fun talk so please come and join if you can!</div><div><br></div><div>Cheers,</div><div>Artur<br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">---------- Forwarded message ---------<br>From: <strong class="gmail_sendername" dir="auto">Diane Stidle</strong> <span dir="auto"><<a href="mailto:stidle@andrew.cmu.edu">stidle@andrew.cmu.edu</a>></span><br>Date: Mon, Jun 15, 2020 at 1:49 PM<br>Subject: Thesis Defense - June 29, 2020 - Yichong Xu - Learning and Decision Making from Diverse Forms of Information<br>To: <a href="mailto:ml-seminar@cs.cmu.edu">ml-seminar@cs.cmu.edu</a> <<a href="mailto:ML-SEMINAR@cs.cmu.edu">ML-SEMINAR@cs.cmu.edu</a>>,  <<a href="mailto:jcl@microsoft.com">jcl@microsoft.com</a>><br></div><br><br>

  
  <div>

    <p><i><b>Thesis Defense</b></i></p>

    <p>Date: June 29, 2020<br>

      Time: 9:00am (EDT)<br>

      PhD Candidate: Yichong Xu</p>

    <p>Virtual Presentation Link:<br>

      <a href="https://cmu.zoom.us/j/99909151454" target="_blank">https://cmu.zoom.us/j/99909151454</a></p>

    <p><font size="-1"><b>Title: </b><span style="font-weight:bold" lang="zh-CN">Learning and Decision Making from Diverse Forms

          of Information</span></font></p>

    <p><font size="-1">Abstract:<br>

        Classical machine learning posits that data are independently

        and identically distributed, in a single format usually the same

        as test data. In modern applications, however, additional

        information in other formats might be available freely or at a

        lower cost. For example, in data crowdsourcing we can collect

        preferences over data points at a lower cost than directly

        asking for the label of a single data point. In natural

        language understanding problems, we might have a limited amount of

        data in the target domain, but can use a large amount of general

        domain data for free.</font>

    </p>

    <div style="margin:0in;font-size:9pt">

      <p><font size="-1">This thesis studies how

          to efficiently incorporate these diverse forms of information

          into the learning and decision making process. We study two

          representative paradigms in this thesis. Firstly, we study

          learning and decision making problems with direct labels and

          comparisons. Our algorithms can efficiently combine

          comparisons with direct labels so that the total learning cost

          can be greatly reduced. Secondly, we study multi-task learning

          problems with data from multiple domains, and design algorithms to

          transfer data from a general, abundant domain to the

          target domain. We establish theoretical guarantees for our

          algorithms as well as their statistical minimax optimality through

          information-theoretic limits. On the practical side, we

          demonstrate promising experimental results on price estimation

          and natural language understanding tasks.</font></p>

    </div>


    <div style="margin:0in;font-size:9pt">

      <p><font size="-1"><b>Thesis Committee:</b><br>

          Artur Dubrawski, Co-Chair <br>

          Aarti Singh, Co-Chair <br>

          Sivaraman Balakrishnan<br>

          John Langford (Microsoft Research)</font></p>

    </div>

    <pre cols="72">-- 

Diane Stidle

Graduate Programs Manager

Machine Learning Department

Carnegie Mellon University

<a href="mailto:stidle@andrew.cmu.edu" target="_blank">stidle@andrew.cmu.edu</a>

412-268-1299</pre>

  </div>


</div></div></div></div>