From awd at cs.cmu.edu  Mon Apr  4 08:48:04 2011
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Mon, 04 Apr 2011 08:48:04 -0400
Subject: [Research] Lab meeting this Wednesday: Noon, NSH 1507
Message-ID: <4D99BE04.2010007@cs.cmu.edu>

It will be a pre-AISTATS-athon!
3 speakers, 3 topics, a lot of fun.
Missing it is not an option.

See you all!
Artur


Barnabas:

Title: On the Estimation of alpha-Divergences

Abstract:
We propose new nonparametric, consistent Renyi-alpha and Tsallis-alpha
divergence estimators for continuous distributions. Given two
independent and identically distributed samples, a "naive" approach
would be to simply estimate the underlying densities and plug the
estimated densities into the corresponding formulas. Our proposed
estimators, in contrast, avoid density estimation completely,
estimating the divergences directly using only simple
k-nearest-neighbor statistics. We are nonetheless able to prove that
the estimators are consistent under certain conditions. We also
describe how to apply these estimators to mutual information and
demonstrate their efficiency via numerical experiments.


Liang:

Title: Hierarchical Probabilistic Models for Group Anomaly Detection

Abstract:
Statistical anomaly detection typically focuses on finding individual
point anomalies. Often the most interesting or unusual things in a
data set are not odd individual points, but rather larger scale
phenomena that only become apparent when groups of points are
considered. In this paper, we propose generative models for detecting
such group anomalies. We evaluate our methods on synthetic data as
well as astronomical data from the Sloan Digital Sky Survey. The
empirical results show that the proposed models are effective in
detecting group anomalies.


Yi:

Title: Multi-Label Output Codes using Canonical Correlation Analysis

Abstract:
Traditional error-correcting output codes (ECOCs) decompose a
multi-class classification problem into many binary problems.
Although it seems natural to use ECOCs for multi-label problems as
well, doing so naively creates issues related to: the validity of the
encoding, the efficiency of the decoding, the predictability of the
generated codeword, and the exploitation of the label dependency.
Using canonical correlation analysis, we propose an error-correcting
code for multi-label classification. Label dependency is characterized
as the most predictable directions in the label space, which are
extracted as canonical output variates and encoded into the codeword.
Predictions for the codeword define a graphical model of labels with
both Bernoulli potentials (from classifiers on the labels) and
Gaussian potentials (from regression on the canonical output
variates). Decoding is performed by efficient mean-field
approximation. We establish connections between the proposed code and
research areas such as compressed sensing and ensemble learning. Some
of these connections contribute to better understanding of the new
code, and others lead to practical improvements in code design. In our
empirical study, the proposed code leads to substantial improvements
compared to various competitors in music emotion classification and
outdoor scene recognition.

From donghanw at cs.cmu.edu  Thu Apr 14 10:51:26 2011
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Thu, 14 Apr 2011 10:51:26 -0400
Subject: [Research] MLcomp -- comparing machine learning programs
Message-ID:

Hey,

A cool website lets you

Do a comprehensive evaluation of your new algorithm.

Find the best algorithm (program) for your dataset.

Check it out.

--
Donghan (Jarod) Wang
Research Programmer
Robotics Institute
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
3122 Newell-Simon Hall
Tel: +1 412 268 1238
Cel: +1 646 623 4001

From donghanw at cs.cmu.edu  Thu Apr 14 10:52:08 2011
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Thu, 14 Apr 2011 10:52:08 -0400
Subject: [Research] MLcomp -- comparing machine learning programs
In-Reply-To:
References:
Message-ID:

and here is the link

http://mlcomp.org/

On Thu, Apr 14, 2011 at 10:51 AM, Donghan (Jarod) Wang wrote:
> Hey,
>
> A cool website lets you
>
> Do a comprehensive evaluation of your new algorithm.
>
> Find the best algorithm (program) for your dataset.
>
> Check it out.

--
Donghan (Jarod) Wang
Research Programmer
Robotics Institute
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
3122 Newell-Simon Hall
Tel: +1 412 268 1238
Cel: +1 646 623 4001

From awd at cs.cmu.edu  Thu Apr 14 11:07:49 2011
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Thu, 14 Apr 2011 11:07:49 -0400
Subject: [Research] MLcomp -- comparing machine learning programs
In-Reply-To:
References:
Message-ID: <4DA70DC5.4000502@cs.cmu.edu>

As soon as they also automate assembly of conference papers
based on these comparisons, we will know that UC Berkeley
has caught up with the 1996 CMU Auton Lab :)

Thanks Jarod.

On 4/14/2011 10:52 AM, Donghan (Jarod) Wang wrote:
> and here is the link
>
> http://mlcomp.org/
>
> On Thu, Apr 14, 2011 at 10:51 AM, Donghan (Jarod) Wang
> wrote:
>> Hey,
>>
>> A cool website lets you
>>
>> Do a comprehensive evaluation of your new algorithm.
>>
>> Find the best algorithm (program) for your dataset.
>>
>> Check it out.