new paper

Wed May 17 20:28:08 EDT 1995

The following paper (without figures)
is now available from the ftp site
utstat.toronto.edu as pub/bootpred.shar (a shar archive of PostScript files).

A paper copy with figures is available upon request from
karola at playfair.stanford.edu

   Cross-Validation and the Bootstrap: Estimating the Error Rate
    of a Prediction Rule

   Bradley Efron                      Robert Tibshirani
    Stanford Univ                      Univ of Toronto

A training set of data has been used to construct a rule for predicting
future responses.  What is the error rate of this rule?  The
traditional answer to this question is given by cross-validation.  The
cross-validation estimate of prediction error is nearly unbiased, but
can be highly variable.  This article discusses bootstrap estimates of
prediction error, which can be thought of as smoothed versions of
cross-validation.  A particular bootstrap method, the $.632+$ rule, is
shown to substantially outperform cross-validation in a catalog of 24
simulation experiments.  Besides providing point estimates, we also
consider estimating the variability of an error rate estimate.  All of
the results here are nonparametric, and apply to any possible
prediction rule.  The simulations include ``smooth'' prediction rules
like Fisher's Linear Discriminant Function, and unsmooth ones like
Nearest Neighbors.
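
For readers who want to experiment with the two kinds of estimate
compared above, here is a minimal Python sketch.  It is not the
authors' code, and the abstract does not give the formula.  The
weighting used is one common statement of the .632+ estimator for 0-1
loss, with the leave-one-out bootstrap error clipped at the
no-information rate throughout.  The names nn_rule, error_rate, loo_cv
and bootstrap_632_plus are hypothetical, and the prediction rule is a
toy one-dimensional nearest-neighbor classifier chosen only to keep
the example self-contained.

```python
import random


def nn_rule(train):
    """Fit a 1-nearest-neighbor rule on (x, y) pairs with scalar x (toy rule)."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict


def error_rate(rule, data):
    """Fraction of (x, y) pairs that the rule misclassifies (0-1 loss)."""
    return sum(rule(x) != y for x, y in data) / len(data)


def loo_cv(fit, data):
    """Leave-one-out cross-validation estimate of prediction error."""
    errors = 0
    for i, (x, y) in enumerate(data):
        rule = fit(data[:i] + data[i + 1:])
        errors += rule(x) != y
    return errors / len(data)


def bootstrap_632_plus(fit, data, n_boot=50, seed=0):
    """Sketch of a .632+ style estimate (assumed form, see the note above)."""
    rng = random.Random(seed)
    n = len(data)
    full_rule = fit(data)
    err_bar = error_rate(full_rule, data)  # apparent (training) error

    # Leave-one-out bootstrap: each point is scored only by rules fitted
    # to bootstrap samples that do not contain it.
    misses, counts = [0] * n, [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        rule = fit([data[j] for j in idx])
        for i, (x, y) in enumerate(data):
            if i not in in_bag:
                counts[i] += 1
                misses[i] += rule(x) != y
    per_point = [m / c for m, c in zip(misses, counts) if c > 0]
    if not per_point:
        raise ValueError("no out-of-sample points; increase n_boot")
    err1 = sum(per_point) / len(per_point)

    # No-information rate: error when predictions and responses are
    # paired at random (all n * n combinations).
    preds = [full_rule(x) for x, _ in data]
    gamma = sum(p != y for p in preds for _, y in data) / n ** 2

    # Relative overfitting rate, set to 0 outside its sensible range.
    err1_clipped = min(err1, gamma)
    if err1_clipped > err_bar and gamma > err_bar:
        r = (err1_clipped - err_bar) / (gamma - err_bar)
    else:
        r = 0.0
    w = 0.632 / (1 - 0.368 * r)
    return (1 - w) * err_bar + w * err1_clipped


if __name__ == "__main__":
    gen = random.Random(1)
    # Two overlapping classes on the real line (synthetic data).
    sample = [(gen.gauss(0, 1), 0) for _ in range(20)]
    sample += [(gen.gauss(1.5, 1), 1) for _ in range(20)]
    print("leave-one-out CV:", loo_cv(nn_rule, sample))
    print(".632+ sketch:    ", bootstrap_632_plus(nn_rule, sample))
```

With a nearest-neighbor rule the apparent error is zero, so the
weight w moves from .632 toward 1 as the relative overfitting rate
grows.  This is the situation the "+" correction is meant to handle.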

=============================================================
                                            | Rob Tibshirani
  The History of science                    | Dept. of Statistics 
      is full of fruitful errors            | Sequoia Hall
         and barren truths                  | Stanford Univ
                                            | Stanford, CA
          Arthur Koestler                   | USA  94305

Phone: 1-415-725-2237  Email: tibs at playfair.stanford.edu
         FAX: 1-416-725-8977