Articles on combining multiple models

Chris Merz cmerz at saanen.ics.uci.edu
Thu Sep 17 12:48:43 EDT 1998


I am announcing the availability of several articles related to
classification and regression by combining models. The first list
below contains the titles, URLs, and FEATURES of each article. The
second list contains the abstracts.

Please contact me at cmerz at ics.uci.edu with any questions.

Thanks,
Chris Merz

================== Titles, URLs and *** FEATURES *** =================

1. My dissertation:  

       "Classification and Regression by Combining Models"
       Merz, Christopher J. (1998) 
       http://www.ics.uci.edu/~cmerz/thesis.ps

       *** DESCRIBES TWO ROBUST METHODS FOR COMBINING LEARNED MODELS ***
       *** USING TECHNIQUES BASED ON SINGULAR VALUE DECOMPOSITION.   ***
       *** CONTAINS COMPREHENSIVE BACKGROUND AND SURVEY CHAPTERS.    ***

2. Preprints of two accepted Machine Learning Journal articles:

       "A Principal Components Approach to Combining Regression Estimates", 
       Merz, C. J., Pazzani, M. J. (1997) 
       To appear in the Special Issue of Machine Learning on Integrating 
       Multiple Learned Models. 
       http://www.ics.uci.edu/~cmerz/jr.html/mlj.pcr.ps

       *** SHOWS HOW PCA MAY BE USED TO SYSTEMATICALLY EXPLORE  ***
       *** WEIGHT SETS WITH VARYING DEGREES OF REGULARIZATION.  ***

       "Using Correspondence Analysis to Combine Classifiers", 
       Merz, C. J. (1997) 
       To appear in the Special Issue of Machine Learning on Integrating 
       Multiple Learned Models. 
       http://www.ics.uci.edu/~cmerz/jr.html/mlj.scann.ps

       *** SHOWS THAT THE SCANN METHOD COMBINES BOOSTED MODEL  ***
       *** SETS BETTER THAN BOOSTING DOES.                     ***

3. A BibTeX file of the references in my dissertation survey:

       http://www.ics.uci.edu/~cmerz/bib.html/survey.bib

       *** COMPREHENSIVE BIBLIOGRAPHY - MANY ABSTRACTS INCLUDED ***

======================== Abstracts ==========================

1.      "Classification and Regression by Combining Models"

     Two novel methods for combining predictors are introduced in this
thesis; one for the task of regression, and the other for the task of
classification. The goal of combining the predictions of a set of
models is to form an improved predictor. This dissertation
demonstrates how a combining scheme can rely on the stability of the
consensus opinion and, at the same time, capitalize on the unique
contributions of each model.

     An empirical evaluation reveals that the new methods consistently
perform as well as or better than existing combining schemes for a
variety of prediction problems. The success of these algorithms is
explained empirically and analytically by demonstrating how they
adhere to a set of theoretical and heuristic guidelines.

     A byproduct of the empirical investigation is the evidence that
existing combining methods fail to satisfy one or more of the
guidelines defined. The new combining approaches satisfy these
criteria by relying upon Singular Value Decomposition as a tool for
filtering out the redundancy and noise in the predictions of the learned
models, and for characterizing the areas of the example space where
each model is superior. The SVD-based representation used in the new
combining methods aids in avoiding sensitivity to correlated
predictions without discarding any learned models. Therefore, the
unique contributions of each model can still be discovered and
exploited. An added advantage of the combining algorithms derived in
this dissertation is that they are not limited to models generated by
a single algorithm; they may be applied to model sets generated by a
diverse collection of machine learning and statistical modeling
methods.
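To make the SVD idea concrete, here is a small sketch (my own
illustration, not the algorithm from the dissertation) of how
truncating the SVD of a prediction matrix can filter out the shared
redundancy of correlated models before solving for combining weights.
The function name and the toy data are mine.

```python
import numpy as np

def svd_combine_weights(P, y, k):
    """P: (n_examples, n_models) matrix of model predictions.
    y: (n_examples,) true targets.
    k: number of singular components to keep.
    Returns a weight vector for combining the models."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    # Keep only the k strongest components; the discarded components
    # carry the redundancy/noise shared among correlated models.
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # Least-squares weights computed in the reduced space.
    return Vt_k.T @ ((U_k.T @ y) / s_k)

# Toy data: three models, two of them nearly identical (correlated).
rng = np.random.default_rng(0)
y = rng.normal(size=200)
P = np.column_stack([y + 0.1 * rng.normal(size=200),
                     y + 0.1 * rng.normal(size=200),  # near-duplicate
                     0.5 * y + 0.5 * rng.normal(size=200)])
w = svd_combine_weights(P, y, k=2)
combined = P @ w
```

Note that no model is discarded: both near-duplicate models keep
nonzero weight, so their independent noise still averages out.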

     The three main contributions of this dissertation are: 
        1. The introduction of two new combining methods capable of 
           robustly combining classification and regression estimates, and
           applicable to a broad range of model sets. 
        2. An in-depth analysis revealing how the new methods address the 
           specific problems encountered in combining multiple learned
           models. 
        3. A detailed account of existing combining methods and an 
           assessment of where they fall short in the criteria for 
           combining approaches. 

----------------

2. Preprints of two accepted Machine Learning Journal articles:

"A Principal Components Approach to Combining Regression Estimates"
Christopher J. Merz and Michael J. Pazzani

  Abstract

  The goal of combining the predictions of multiple learned 
  models is to form an improved estimator.  A combining strategy must be 
  able to robustly handle the inherent correlation, or 
  multicollinearity, of the learned models while identifying the unique 
  contributions of each.  A progression of existing approaches, and their 
  limitations with respect to these two issues, is discussed.  A new 
  approach, PCR*, based on principal components regression is proposed 
  to address these limitations.  An evaluation of the new approach on a 
  collection of domains reveals that 1) PCR* was the most robust 
  combining method, 2) correlation could be handled without eliminating 
  any of the learned models, and 3) the principal components of the 
  learned models provided a continuum of "regularized" weights from 
  which PCR* could choose.
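As a rough illustration of point 3, the sketch below (a
simplification of my own, not the published PCR* procedure) picks the
number of retained components on a validation split: k = 1 gives the
heaviest regularization, while k equal to the number of models
recovers ordinary least squares, so k indexes a continuum of weight
sets. All names here are mine.

```python
import numpy as np

def pcr_weights(P, y, k):
    """Combining weights using the top-k principal components of
    the (n_examples, n_models) prediction matrix P."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])

def select_k(P_tr, y_tr, P_val, y_val):
    """Pick the component count with the lowest validation error."""
    errs = {k: np.mean((P_val @ pcr_weights(P_tr, y_tr, k) - y_val) ** 2)
            for k in range(1, P_tr.shape[1] + 1)}
    return min(errs, key=errs.get)

# Toy data: four models, three strongly correlated, one weak and noisy.
rng = np.random.default_rng(1)
y = rng.normal(size=300)
noise = rng.normal(size=(300, 4))
P = y[:, None] + 0.2 * noise
P[:, 3] = 0.3 * y + noise[:, 3]
k = select_k(P[:150], y[:150], P[150:], y[150:])
w = pcr_weights(P[:150], y[:150], k)
```

By construction, the selected k does no worse on the validation split
than unregularized least squares (k = 4).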


"Using Correspondence Analysis to Combine Classifiers"
Christopher J. Merz

  Abstract

  Several effective methods have been developed recently for improving
  predictive performance by generating and combining multiple learned
  models.  The general approach is to create a set of learned models
  either by applying an algorithm repeatedly to different versions of
  the training data, or by applying different learning algorithms to the
  same data.  The predictions of the models are then combined according
  to a voting scheme.  This paper focuses on the task of combining the
  predictions of a set of learned models.  The method described uses the
  strategies of stacking and Correspondence Analysis to model the
  relationship between the learning examples and their classification by
  a collection of learned models.  A nearest neighbor method is then
  applied within the resulting representation to classify previously
  unseen examples.  The new algorithm performs no worse than, and
  frequently performs significantly better than, other combining
  techniques on a suite of data sets.
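The following is a loose sketch in the spirit of the method just
described, with one simplification of mine: a plain SVD of the 0/1
prediction-indicator matrix stands in for full correspondence
analysis. Each example is encoded by which class every base
classifier assigned it; a nearest neighbor in the reduced space then
supplies the final label. Function names and data are mine.

```python
import numpy as np

def indicator(preds, n_classes):
    """preds: (n_examples, n_models) class labels -> 0/1 block matrix
    of shape (n_examples, n_models * n_classes)."""
    n, m = preds.shape
    out = np.zeros((n, m * n_classes))
    for j in range(m):
        out[np.arange(n), j * n_classes + preds[:, j]] = 1.0
    return out

def fit_scann_like(train_preds, y_train, n_classes, dim=3):
    """Embed the training examples' prediction patterns in a
    dim-dimensional space via SVD of the centered indicator matrix."""
    X = indicator(train_preds, n_classes)
    mean = X.mean(0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    proj = Vt[:dim].T
    return mean, proj, (X - mean) @ proj, y_train

def predict(model, test_preds, n_classes):
    """Label each test example by its 1-nearest training neighbor
    in the reduced representation."""
    mean, proj, train_coords, y_train = model
    Z = (indicator(test_preds, n_classes) - mean) @ proj
    d = ((Z[:, None, :] - train_coords[None, :, :]) ** 2).sum(-1)
    return y_train[d.argmin(1)]

# Toy data: three classifiers voting on a two-class problem.
train_preds = np.array([[0,0,1],[0,0,0],[1,1,0],[1,1,1],[0,1,0],[1,0,1]])
y_train     = np.array([ 0,      0,      1,      1,      0,      1 ])
model = fit_scann_like(train_preds, y_train, n_classes=2)
test_preds = np.array([[0,0,0],[1,1,1]])
labels = predict(model, test_preds, n_classes=2)
```

The reduced representation places examples that the base classifiers
treat alike near one another, so the nearest-neighbor step exploits
agreement patterns rather than raw vote counts.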

----------------

3. The BibTeX file contains all of the references in my dissertation,
   including the survey. I've managed to paste in the abstracts of 
   many of the articles. I am willing to update this bibliography
   if any authors want to contribute references, abstracts, and/or URLs.
