gacv-paper available
Grace Wahba
wahba at stat.wisc.edu
Wed Sep 21 16:30:01 EDT 1994
The following paper is available by ftp in the
directory ftp.stat.wisc.edu/pub/wahba, in the
file gacv.ps.gz:
A Generalized Approximate Cross Validation for
Smoothing Splines with Non-Gaussian Data
by
Dong Xiang and Grace Wahba
Abstract
We consider the model

    Prob{Y_i = 1} = exp{f(t_i)}/(1 + exp{f(t_i)}),
    Prob{Y_i = 0} = 1/(1 + exp{f(t_i)}),
where t is a vector of predictor variables, t_i
is the vector of predictor variables for the i-th
subject/patient/instance, and Y_i is the outcome
(classification) for the i-th subject. f(\cdot) is
supposed to be a `smooth' function of t, and the
goal is to estimate f by choosing f in an appropriate
class of functions to minimize the penalized
negative log likelihood

    -log likelihood{Y_1, ..., Y_n | f} + \lambda J(f),

where J(f) is an appropriate penalty functional which
restricts the degrees of freedom for signal attributed to f.
Our results concentrate on the case where J(f) is a
`smoothness' penalty, which results in spline and
related (e.g., rbf) estimates.
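As a purely illustrative numerical sketch of this objective,
assume a quadratic roughness penalty J(f) = ||Df||^2 with an
illustrative penalty matrix D standing in for the spline
smoothness functional (the function and variable names below
are our assumptions, not code from the paper):

    import numpy as np

    def penalized_neg_log_lik(f_vals, y, lam, D):
        # f_vals: f(t_1), ..., f(t_n) at the data points
        # y:      0/1 outcomes Y_1, ..., Y_n
        # lam:    smoothing parameter lambda
        # D:      matrix with J(f) approximated by ||D f||^2
        #         (illustrative stand-in for the spline penalty)
        # Bernoulli log likelihood:
        #   sum_i [ y_i f_i - log(1 + exp(f_i)) ]
        log_lik = np.sum(y * f_vals - np.logaddexp(0.0, f_vals))
        penalty = np.sum((D @ f_vals) ** 2)
        return -log_lik + lam * penalty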
We propose a Generalized Approximate Cross Validation
(GACV) score for estimating $\lambda$ (internally) from
a relatively small data set. The GACV score is derived
by first obtaining an approximation to the
leaving-out-one cross validation function and
then, in a step reminiscent of the one used to get from
leaving-out-one cross validation to GCV in the Gaussian
data case, replacing the diagonal elements of certain
matrices by $\frac{1}{n}$ times their trace.
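The exact GACV formula is derived in the paper; as a minimal
sketch of the starting point of that derivation, here is
brute-force leaving-out-one cross validation for this model,
where fit_penalized is an assumed user-supplied solver for the
penalized likelihood problem above (all names here are
illustrative assumptions):

    import numpy as np

    def loo_cv_score(lam, t, y, fit_penalized):
        # Leaving-out-one CV: refit with observation i removed,
        # then score the held-out point by its negative
        # Bernoulli log likelihood. fit_penalized(t, y, lam) is
        # assumed to return a callable estimate f_hat(.).
        n = len(y)
        score = 0.0
        for i in range(n):
            keep = np.arange(n) != i
            f_hat = fit_penalized(t[keep], y[keep], lam)
            fi = f_hat(t[i])
            score += -(y[i] * fi - np.logaddexp(0.0, fi))
        return score / n

The GACV score approximates this function from a single fit,
avoiding the n refits.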
A numerical simulation with `data' Y_i, i = 1, 2, ..., n,
generated from a hypothesized `true' f is used to compare
the $\lambda$ chosen by minimizing this GACV score with
the $\lambda$ chosen by two often-used algorithms based
on the generalized cross validation procedure
(O'Sullivan {\em et al.} 1986; Gu 1990, 1992).
In the examples here, the GACV estimate produces a better
fit to the true f in terms of minimizing the
Kullback-Leibler distance of the estimate of f from
the true f.
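On the probability scale, one way to compute such a
Kullback-Leibler distance is to average over the design
points; a minimal sketch (the paper's exact weighting may
differ):

    import numpy as np

    def kl_distance(f_true, f_hat):
        # Average KL distance between the Bernoulli
        # distributions induced by the true and estimated f
        # at the design points; inputs are arrays of f values.
        p = 1.0 / (1.0 + np.exp(-f_true))  # true Prob{Y_i = 1}
        q = 1.0 / (1.0 + np.exp(-f_hat))   # estimated Prob{Y_i = 1}
        return np.mean(p * np.log(p / q)
                       + (1 - p) * np.log((1 - p) / (1 - q)))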
Figures suggest that the GACV may be an approximately
unbiased estimate of the Kullback-Leibler distance of
the estimate to the true f; however, a theoretical
proof is yet to be found. The work of Wong (1992) suggests
that an exact unbiased estimate does not exist in the
{0,1} data case. The present work is related to
Moody (1991), `The effective number of parameters: an
analysis of generalization and regularization in
nonlinear learning systems,' and Liu (199), `Unbiased
estimate of generalization error and model selection
in neural network.'
University of Wisconsin-Madison Statistics Department TR 930
September, 1994
Keywords: Generalized Approximate Cross Validation,
smoothing spline, penalized likelihood, generalized
cross validation, Kullback-Leibler distance.
Other papers of potential interest for supervised
machine learning in the directory ftp.stat.wisc.edu/pub/wahba
are in the files (some previously announced):

    nonlin-learn.ps.gz
    ml-bib.ps.gz
    soft-class.ps.gz
    ssanova.ps.gz
    theses/ywang.thesis.README
    nips6.ps.gz
    tuning-nwp.ps.gz
Department of Statistics, University of Wisconsin-Madison
wahba at stat.wisc.edu xiang at stat.wisc.edu
PS to Geoff Hinton: The database is a great idea!!