[AI Seminar] Talk: Learning deep representations for perception, reasoning, and decision-making
Chris Atkeson
cga at cs.cmu.edu
Fri Feb 23 00:28:36 EST 2018
Learning deep representations for perception, reasoning, and decision-making
Honglak Lee
Monday, Feb 26, 10:00am, NSH 3305
Abstract:
In recent years, deep learning has emerged as a powerful method for
learning feature representations from complex input data, and it has been
highly successful in many domains, such as computer vision, speech
recognition, and language processing. While many deep learning algorithms
focus on standard discriminative tasks with explicit supervision (e.g.,
classification), we aim to learn deep representations with less
supervision that can serve as explanatory models of the data and allow for
richer inference, reasoning, and control.
First, I will present techniques for learning deep representations in
weakly supervised settings that disentangle underlying factors of
variation. These learned representations model intricate interactions
between underlying factors of variation in the data (e.g., pose and
morphology of 3D objects in images) and allow for hypothetical
reasoning and improved control (e.g., grasping of objects). I will also
talk about learning via analogy-making and its connection to
disentangling. In the second part of the talk, I will describe my work on
learning deep representations from multiple heterogeneous input
modalities. Specifically, I will present a multimodal learning framework
via conditional prediction that explicitly encourages cross-modal
associations. This framework provides a theoretical guarantee about
learning a joint distribution and explains recent progress in deep
architectures that interface vision and language, such as caption
generation and conditional image synthesis. I will also describe related
ongoing work on learning joint embeddings of images and text for
zero-shot learning. Finally, I will present ongoing work on integrating
deep learning and reinforcement learning. Specifically, I will talk about
how learning a predictive generative model from sequential input data can
be useful for reinforcement learning. I will also talk about a
memory-based architecture that aids sequential decision-making in
first-person-view and active-perception settings, as well as zero-shot
multi-task generalization with hierarchical reinforcement learning given
task descriptions.
Bio:
Honglak Lee is a Research Scientist at Google Brain and an Associate
Professor of Computer Science and Engineering at the University of
Michigan, Ann Arbor. He received his Ph.D. from the Computer Science
Department at Stanford University in 2010, advised by Prof. Andrew Ng.
His research focuses on deep learning and representation learning,
spanning unsupervised, supervised, and semi-supervised learning,
transfer learning, structured prediction, reinforcement learning, and
optimization. His methods have been successfully applied to computer
vision and other perception problems. He received best paper awards at
ICML 2009 and CEAS 2005. He has served as an area chair for ICML, NIPS,
ICLR, ICCV, CVPR, ECCV, AAAI, and IJCAI, as well as a guest editor of the
IEEE TPAMI Special Issue on Learning Deep Architectures, an editorial
board member of Neural Networks, and an associate editor of IEEE TPAMI.
He received the Google Faculty Research Award (2011) and the NSF CAREER
Award (2015), was named one of AI's 10 to Watch by IEEE Intelligent
Systems (2013), and was selected as a research fellow by the Alfred P.
Sloan Foundation (2016).