Paper available.

Jerome H. Friedman jhf at playfair.Stanford.EDU
Tue Apr 16 19:37:49 EDT 1996




                     *** Paper Announcement ***


                  ON BIAS, VARIANCE, 0/1 - LOSS, AND
                     THE CURSE-OF-DIMENSIONALITY

                         Jerome H. Friedman
                         Stanford University
                     (jhf at playfair.stanford.edu)

                              ABSTRACT

The classification problem is considered in which an output variable
assumes discrete values with respective probabilities that depend upon the
simultaneous values of a set of input variables. At issue is how error in
the estimates of these probabilities affects classification error when the
estimates are used in a classification rule. These effects are seen to be
somewhat counterintuitive in both their strength and nature. In particular,
the bias and variance components of the estimation error combine to
influence classification in a very different way than they do with squared
error on the probabilities themselves. Certain types of (very high) bias can be
canceled by low variance to produce accurate classification. This can
dramatically mitigate the effect of the bias associated with some simple
estimators like "naive" Bayes, and the bias induced by the curse-of-
dimensionality on nearest-neighbor procedures. This helps explain why such
simple methods are often competitive with and sometimes superior to more
sophisticated ones for classification, and why "bagging/aggregating"
classifiers can often improve accuracy. These results also suggest simple
modifications to these procedures that can (sometimes dramatically) further
improve their classification performance.
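
The following short simulation (not part of the paper; the Gaussian model
for the probability estimates and all numbers are illustrative assumptions)
sketches the abstract's central point: a heavily biased but low-variance
estimate of P(y = 1 | x) can still classify perfectly, because 0/1 loss
cares only about which side of the 1/2 threshold the estimate falls on,
while an unbiased estimate with higher variance and comparable squared
error misclassifies a few percent of the time.

import numpy as np

rng = np.random.default_rng(0)
p_true = 0.8          # true P(y = 1 | x); the Bayes rule predicts class 1
n = 100_000           # number of simulated probability estimates

def evaluate(bias, sd):
    """Squared error and 0/1 disagreement with the Bayes rule for
    probability estimates drawn as N(p_true + bias, sd**2)."""
    p_hat = rng.normal(p_true + bias, sd, n)
    mse = np.mean((p_hat - p_true) ** 2)
    # Classification goes wrong only when the estimate lands on the
    # wrong side of the 1/2 threshold, no matter how far off it is.
    err01 = np.mean((p_hat > 0.5) != (p_true > 0.5))
    return mse, err01

# Heavily biased toward 1 but low variance: worse squared error,
# yet essentially zero classification error.
print(evaluate(bias=0.15, sd=0.05))   # roughly (0.025, 0.0)

# Unbiased but higher variance: slightly better squared error,
# yet the estimate crosses 1/2 a few percent of the time.
print(evaluate(bias=0.0, sd=0.15))    # roughly (0.022, 0.023)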

Available by ftp from:
"ftp://playfair.stanford.edu/pub/friedman/curse.ps.Z"



