[Research] Lab meeting this Wednesday: Noon, NSH 1507

Artur Dubrawski awd at cs.cmu.edu
Mon Apr 4 08:48:04 EDT 2011


It will be a pre-AISTATS-athon!
3 speakers, 3 topics, a lot of fun.
Missing it is not an option.

See you all!
Artur

Barnabas:
Title: On the Estimation of alpha-Divergences
Abstract: We propose new nonparametric, consistent Renyi-alpha and
Tsallis-alpha divergence estimators for continuous
distributions. Given two independent and identically distributed
samples, a "naive" approach would be to simply estimate the
underlying densities and plug the estimated densities into the
corresponding formulas. Our proposed estimators,
in contrast, avoid density estimation completely, estimating the
divergences directly using only simple k-nearest-neighbor
statistics. We are nonetheless able to prove that the estimators are
consistent under certain conditions. We also describe how to apply
these estimators to mutual information and demonstrate their
efficiency via numerical experiments.

Liang:
Title: Hierarchical Probabilistic Models for Group Anomaly Detection
Abstract: Statistical anomaly detection typically focuses on finding 
individual point anomalies. Often the most interesting or
unusual things in a data set are not odd individual points, but
rather larger scale phenomena that only become apparent when groups
of points are considered. In this paper, we propose
generative models for detecting such group
anomalies. We evaluate our methods on synthetic data as well as
astronomical data from the Sloan Digital Sky Survey. The
empirical results show that the proposed models are effective in
detecting group anomalies.

Yi:
Title: Multi-Label Output Codes using Canonical Correlation Analysis
Abstract: Traditional error-correcting output codes (ECOCs) decompose a 
multi-class classification problem into many binary problems. Although 
it seems natural to use ECOCs for multi-label problems as well, doing so 
naively creates issues related to the validity of the encoding, the
efficiency of the decoding, the predictability of the generated
codeword, and the exploitation of the label dependency. Using canonical
correlation analysis, we propose an error-correcting code for 
multi-label classification. Label dependency is characterized as the 
most predictable directions in the label space, which are extracted as 
canonical output variates and encoded into the codeword. Predictions for 
the codeword define a graphical model of labels with both Bernoulli 
potentials (from classifiers on the labels) and Gaussian potentials 
(from regression on the canonical output variates). Decoding is 
performed by efficient mean-field approximation. We establish 
connections between the proposed code and research areas such as 
compressed sensing and ensemble learning. Some of these connections 
contribute to a better understanding of the new code, and others lead to
practical improvements in code design. In our empirical study, the 
proposed code leads to substantial improvements compared to various 
competitors in music emotion classification and outdoor scene recognition.


