new paper
Rob Tibshirani
tibs at pc-tibs.Stanford.EDU
Wed May 17 20:28:08 EDT 1995
The following paper (without figures)
is now available at the ftp site
utstat.toronto.edu in pub/bootpred.shar (shar postscript files).
A paper copy with figures is available upon request from
karola at playfair.stanford.edu
Cross-Validation and the Bootstrap: Estimating the Error Rate
of a Prediction Rule
Bradley Efron (Stanford Univ) and Robert Tibshirani (Univ of Toronto)
A training set of data has been used to construct a rule for predicting
future responses. What is the error rate of this rule? The
traditional answer to this question is given by cross-validation. The
cross-validation estimate of prediction error is nearly unbiased, but
can be highly variable. This article discusses bootstrap estimates of
prediction error, which can be thought of as smoothed versions of
cross-validation. A particular bootstrap method, the $.632+$ rule, is
shown to substantially outperform cross-validation in a catalog of 24
simulation experiments. Besides providing point estimates, we also
consider estimating the variability of an error rate estimate. All of
the results here are nonparametric, and apply to any possible
prediction rule. The simulations include ``smooth'' prediction rules
like Fisher's Linear Discriminant Function, and unsmooth ones like
Nearest Neighbors.
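
For readers who want to experiment, here is a minimal sketch of the two
kinds of estimate the abstract compares: leave-one-out cross-validation
and a .632+ style bootstrap estimate, applied to a 1-nearest-neighbor
rule. The formulas (leave-one-out bootstrap error, no-information rate,
relative overfitting rate, weight .632/(1 - .368 R)) are not given in
this announcement; they follow the .632+ estimator as it is usually
described for 0-1 loss, and should be checked against the paper itself.
All function names (nn_fit, loo_cv_error, bootstrap_632_plus) and the
default of 50 bootstrap samples are hypothetical choices made for this
illustration.

```python
import random


def nn_fit(X, y):
    """Return a 1-nearest-neighbor prediction rule built from (X, y)."""
    def predict(x):
        j = min(range(len(X)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(X[k], x)))
        return y[j]
    return predict


def loo_cv_error(fit, X, y):
    """Leave-one-out cross-validation estimate of the misclassification rate."""
    n = len(y)
    wrong = 0
    for i in range(n):
        # Refit the rule with observation i held out, then predict it.
        rule = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        wrong += rule(X[i]) != y[i]
    return wrong / n


def bootstrap_632_plus(fit, X, y, n_boot=50, seed=0):
    """Sketch of a .632+ style estimate of the misclassification rate.

    Assumption: 0-1 loss and the usual .632+ weighting; n_boot and seed
    are illustrative parameters, not values taken from the paper.
    """
    rng = random.Random(seed)
    n = len(y)

    # Apparent (resubstitution) error of the rule fit to all the data.
    full_rule = fit(X, y)
    preds = [full_rule(x) for x in X]
    err_app = sum(p != t for p, t in zip(preds, y)) / n

    # Leave-one-out bootstrap error: each point is predicted only by
    # rules trained on bootstrap samples that do not contain it.
    errs = [0] * n
    counts = [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_sample = set(idx)
        rule = fit([X[j] for j in idx], [y[j] for j in idx])
        for i in range(n):
            if i not in in_sample:
                counts[i] += 1
                errs[i] += rule(X[i]) != y[i]
    per_point = [e / c for e, c in zip(errs, counts) if c > 0]
    err1 = sum(per_point) / len(per_point)

    # No-information error rate: the error expected if the responses
    # were independent of the predictors.
    classes = set(y) | set(preds)
    gamma = sum((y.count(c) / n) * (1 - preds.count(c) / n) for c in classes)

    # Relative overfitting rate R in [0, 1], then the .632+ weight.
    err1 = min(err1, gamma)
    if err1 > err_app and gamma > err_app:
        R = (err1 - err_app) / (gamma - err_app)
    else:
        R = 0.0
    w = 0.632 / (1 - 0.368 * R)
    return (1 - w) * err_app + w * err1
```

Both functions take the fitting procedure as an argument, so any
prediction rule that maps a training set to a predict function can be
substituted for nn_fit. Nearest neighbors is a useful test case because
its apparent error is zero when the training points are distinct, which
is exactly the situation where the weight w moves toward the
leave-one-out bootstrap error.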
=============================================================
The History of science
is full of fruitful errors
and barren truths
    Arthur Koestler

Rob Tibshirani
Dept. of Statistics
Sequoia Hall
Stanford Univ
Stanford, CA
USA 94305
Phone: 1-415-725 2237 Email: tibs at playfair.stanford.edu
FAX: 1-416-725-8977