SVM's and the GACV
Grace Wahba
wahba at stat.wisc.edu
Wed Apr 7 19:19:17 EDT 1999
The following paper was the basis for my
NIPS*98 Large Margin Classifier Workshop talk.
It is now available as University of Wisconsin-Madison
Statistics Dept TR1006 in
http://www.stat.wisc.edu/~wahba -> TRLIST
..................................................
Generalized Approximate Cross Validation For
Support Vector Machines, or, Another Way to
Look at Margin-Like Quantities.
Grace Wahba, Yi Lin and Hao Zhang.
Abstract
We first review the steps connecting the
Support Vector Machine (SVM) paradigm in
reproducing kernel Hilbert space to the (dual)
mathematical programming problem traditional in
SVM classification problems. We then review the
Generalized Comparative
Kullback-Leibler Distance (GCKL) for the SVM
paradigm and observe that it is trivially a simple
upper bound on the expected misclassification rate.
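(A minimal sketch of why the bound is trivial, in standard
hinge-loss notation that may differ from the paper's: for labels
$y \in \{-1,+1\}$, $y f(x) \le 0$ implies $1 - y f(x) \ge 1$, so
$(1 - y f(x))_+ \;\ge\; \mathbf{1}[\,y f(x) \le 0\,]$ pointwise,
and taking expectations bounds the expected misclassification
rate by the expected hinge loss.)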
Next we revisit the Generalized Approximate
Cross Validation (GACV) as a computable proxy for
the GCKL, as a function of certain tuning
parameters in SVM kernels. We have found a justifiable
(new) approximation for the GACV which is readily computed
exactly along with the SVM solution to the dual
mathematical programming problem. This GACV
turns out, interestingly but not surprisingly, to
be simply related to what several authors
have identified as the (observed) VC dimension
of the estimated SVM. Some preliminary simulations
in a special case suggest that the minimizer of
the GACV is a good estimate of the minimizer of
the GCKL, although further simulation
and theoretical studies are warranted. It is hoped
that this preliminary work will lead to better
understanding of `tuning' issues in the
optimization of SVM's and related classifiers.
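For concreteness, here is a hedged sketch (not from the paper) of
the tuning workflow the abstract describes: scan a kernel tuning
parameter and keep the value that minimizes a computable criterion.
The paper's GACV, computable from the dual solution alone, is not
reproduced here; a held-out hinge-loss estimate of the GCKL stands
in for it, and the data, names, and parameter grid are hypothetical.

    import numpy as np
    from sklearn.svm import SVC

    # Hypothetical toy data: labels in {-1, +1} driven by the first coordinate.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + 0.3 * rng.normal(size=200) >= 0, 1, -1)
    X_tr, y_tr = X[:100], y[:100]
    X_ho, y_ho = X[100:], y[100:]

    best = None
    for gamma in (0.01, 0.1, 1.0, 10.0):   # tuning parameter of the RBF kernel
        clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X_tr, y_tr)
        f = clf.decision_function(X_ho)
        # Held-out mean hinge loss: a stand-in estimate of the GCKL,
        # used here in place of the paper's GACV formula.
        crit = np.mean(np.maximum(0.0, 1.0 - y_ho * f))
        if best is None or crit < best[0]:
            best = (crit, gamma)
    print("selected gamma:", best[1], "held-out criterion:", best[0])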
.................................................