[Research] Lab meeting this Wednesday: Noon, NSH 1507
Artur Dubrawski
awd at cs.cmu.edu
Mon Apr 4 08:48:04 EDT 2011
It will be a pre-AISTATS-athon!
3 speakers, 3 topics, a lot of fun.
Missing it is not an option.
See you all!
Artur
Barnabas:
Title: On the Estimation of alpha-Divergences
Abstract: We propose new nonparametric, consistent Renyi-alpha and
Tsallis-alpha divergence estimators for continuous
distributions. Given two independent and identically distributed
samples, a "naive" approach would be to simply estimate the
underlying densities and plug the estimated densities into the
corresponding formulas. Our proposed estimators,
in contrast, avoid density estimation completely, estimating the
divergences directly using only simple k-nearest-neighbor
statistics. We are nonetheless able to prove that the estimators are
consistent under certain conditions. We also describe how to apply
these estimators to mutual information and demonstrate their
efficiency via numerical experiments.
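To make the idea concrete, here is a minimal sketch of a k-nearest-neighbor estimator of this kind. The abstract gives no formulas, so everything specific below is an assumption: the divergence definitions in the docstrings, the bias-correction constant, and the function names are illustrative and may differ from what the talk presents. The sketch uses brute-force distances and only NumPy plus the standard library.

```python
import math
import numpy as np


def kth_neighbor_distance(queries, points, k, same_sample=False):
    """Distance from each query row to its k-th nearest neighbor in `points`."""
    diffs = queries[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    dists.sort(axis=1)
    # If queries and points are the same sample, column 0 holds the zero
    # distance of each point to itself, so the k-th neighbor is column k.
    return dists[:, k] if same_sample else dists[:, k - 1]


def alpha_integral_knn(x, y, alpha, k=5):
    """Estimate the integral of p^alpha * q^(1 - alpha) from x ~ p and y ~ q.

    Assumed form (not stated in the abstract): average over x_i of
    ((n - 1) * rho_i^d / (m * nu_i^d))^(1 - alpha), times a constant
    B = Gamma(k)^2 / (Gamma(k - alpha + 1) * Gamma(k + alpha - 1)).
    Requires k > |alpha - 1| so that the Gamma arguments stay positive.
    """
    n, d = x.shape
    m = y.shape[0]
    rho = kth_neighbor_distance(x, x, k, same_sample=True)  # within x
    nu = kth_neighbor_distance(x, y, k)                     # from x into y
    log_b = (2 * math.lgamma(k)
             - math.lgamma(k - alpha + 1)
             - math.lgamma(k + alpha - 1))
    ratio = ((n - 1) * rho ** d) / (m * nu ** d)
    return math.exp(log_b) * float(np.mean(ratio ** (1 - alpha)))


def renyi_divergence_knn(x, y, alpha, k=5):
    """Renyi-alpha divergence, taken here as log(integral) / (alpha - 1)."""
    return math.log(alpha_integral_knn(x, y, alpha, k)) / (alpha - 1)


def tsallis_divergence_knn(x, y, alpha, k=5):
    """Tsallis-alpha divergence, taken here as (integral - 1) / (alpha - 1)."""
    return (alpha_integral_knn(x, y, alpha, k) - 1) / (alpha - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p_sample = rng.normal(size=(500, 2))
    q_sample = rng.normal(size=(500, 2)) + np.array([2.0, 0.0])
    print(renyi_divergence_knn(p_sample, q_sample, alpha=0.5))
    print(tsallis_divergence_knn(p_sample, q_sample, alpha=0.5))
```

Note that no density is ever estimated: only the two k-th neighbor distances per point enter the computation. The brute-force distance matrix keeps the sketch short; a tree-based neighbor search would be the usual choice for larger samples.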
Liang:
Title: Hierarchical Probabilistic Models for Group Anomaly Detection
Abstract: Statistical anomaly detection typically focuses on finding
individual point anomalies. Often the most interesting or
unusual things in a data set are not odd individual points, but
rather larger-scale phenomena that only become apparent when groups
of points are considered. In this paper, we propose
generative models for detecting such group
anomalies. We evaluate our methods on synthetic data as well as
astronomical data from the Sloan Digital Sky Survey. The
empirical results show that the proposed models are effective in
detecting group anomalies.
Yi:
Title: Multi-Label Output Codes using Canonical Correlation Analysis
Abstract: Traditional error-correcting output codes (ECOCs) decompose a
multi-class classification problem into many binary problems. Although
it seems natural to use ECOCs for multi-label problems as well, doing so
naively creates issues related to the validity of the encoding, the
efficiency of the decoding, the predictability of the generated
codeword, and the exploitation of the label dependency. Using canonical
correlation analysis, we propose an error-correcting code for
multi-label classification. Label dependency is characterized as the
most predictable directions in the label space, which are extracted as
canonical output variates and encoded into the codeword. Predictions for
the codeword define a graphical model of labels with both Bernoulli
potentials (from classifiers on the labels) and Gaussian potentials
(from regression on the canonical output variates). Decoding is
performed by efficient mean-field approximation. We establish
connections between the proposed code and research areas such as
compressed sensing and ensemble learning. Some of these connections
contribute to a better understanding of the new code, and others lead to
practical improvements in code design. In our empirical study, the
proposed code leads to substantial improvements compared to various
competitors in music emotion classification and outdoor scene recognition.
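The encoding step described above can be sketched roughly as follows. This is only an illustration of extracting canonical output variates with CCA and appending them to the label vector; the function names, the ridge term `reg`, and the layout of the codeword are assumptions made for the example, and the mean-field decoding step is not reproduced.

```python
import numpy as np


def cca_output_directions(X, Y, n_components, reg=1e-3):
    """Return label-space CCA directions, canonical correlations, label mean.

    X: (n, p) feature matrix. Y: (n, q) binary label matrix.
    `reg` is a small ridge term (an assumption) that keeps the
    covariance matrices well conditioned.
    """
    n = X.shape[0]
    y_mean = Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Yc = Y - y_mean
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    # Whiten both sides using Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    _, corr, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the right singular vectors back to the original label space.
    V = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return V, corr[:n_components], y_mean


def encode_codeword(Y, V, y_mean):
    """Codeword = original labels followed by canonical output variates."""
    return np.hstack([Y, (Y - y_mean) @ V])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    Y = (X @ rng.normal(size=(5, 4)) > 0).astype(float)
    V, corr, y_mean = cca_output_directions(X, Y, n_components=2)
    print(encode_codeword(Y, V, y_mean).shape)
```

In this layout the first block of the codeword would be predicted by per-label classifiers and the second block by regression on the canonical variates, matching the two kinds of potentials mentioned in the abstract.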