[Research] Reminder - Thesis Defense - Yi Zhang - 4/30/12 - Learning with Limited Supervision by Input and Output Coding

Jeff Schneider schneide at cs.cmu.edu
Mon Apr 30 09:06:59 EDT 2012


Hi Everyone!

Today Yi will defend his thesis!  Please come to his talk and support the
next Auton Lab PhD!

Jeff.


-------- Original Message --------
Subject: Reminder - Thesis Defense - Yi Zhang - 4/30/12 - Learning with Limited 
Supervision by Input and Output Coding
Date: Fri, 27 Apr 2012 10:36:22 -0400
From: Diane Stidle <diane+ at cs.cmu.edu>
To: ml-seminar at cs.cmu.edu, Jerry Zhu <jerryzhu at cs.wisc.edu>

Thesis Defense

Date: April 30th (Monday)
Time: 1:00pm
Place: 6501 GHC
PhD Candidate: Yi Zhang

Title: Learning with Limited Supervision by Input and Output Coding

Abstract:

In many real-world applications of supervised learning, only a limited
number of labeled examples are available because the cost of obtaining
high-quality examples is high. Even with a relatively large number of
examples, the learning problem may still suffer from limited supervision
as the complexity of the prediction function increases. As a result,
learning with limited supervision presents a major challenge to machine
learning. With the goal of supervision reduction, this thesis studies
the representation, discovery and incorporation of extra input and
output information in learning.

Information about the input space can be encoded by regularization. We
first design a regularization method for text classification that
encodes the correlation of words inferred from seemingly irrelevant
unlabeled text. We then propose a matrix-normal penalty for multi-task
learning, which compactly encodes the covariance structure of the joint
input space of multiple tasks. To capture structure information that is
more general than covariance and correlation, we study a class of
regularization penalties on model compressibility. Then we design the
projection penalty, which can encode the input information highlighted
by a dimension reduction while controlling the risk of information loss
during the reduction.

Information about the output space can be exploited by error correcting
output codes. Inspired by composite likelihoods, we propose an improved
pairwise coding for multi-label classification. We then investigate
problem-dependent coding schemes, where the encoding is learned from
data instead of being predefined. We first propose a multi-label output
code using canonical correlation analysis, where predictability of the
code is optimized. We then argue that both discriminability and
predictability are critical for multi-label output codes, and propose a
max-margin formulation that promotes both discriminative and predictable
codes.

We empirically study our methods in a wide spectrum of applications,
including text categorization, landmine detection, face recognition,
brain signal classification, handwritten digit recognition, price
forecasting, music emotion prediction, medical decision, email analysis,
gene function classification and outdoor scene recognition. Under
limited supervision, our proposed methods for encoding input and output
information lead to significantly improved prediction performance.

Thesis Committee:
Jeff Schneider, Chair
Geoff Gordon
Tom Mitchell
Xiaojin (Jerry) Zhu, University of Wisconsin-Madison

Link to the draft document:
http://www.cs.cmu.edu/~yizhang1/docs/thesis_draft.pdf


-- 
*******************************************************************
Diane Stidle
Business & Graduate Programs Manager
Machine Learning Department
School of Computer Science
8203 Gates Hillman Complex
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA  15213-3891
Phone: 412-268-1299
Fax:   412-268-3431
Email: diane at cs.cmu.edu
URL:   http://www.ml.cmu.edu


