Gaussian statistical models, Hilbert spaces
Grace Wahba
wahba at stat.wisc.edu
Tue Sep 1 15:00:30 EDT 1998
Readers of
...............
http://www.santafe.edu/~zhuh/draft/edmc.ps.gz
Error Decomposition and Model Complexity
Huaiyu Zhu
Bayesian information geometry provides a general error decomposition
theorem for arbitrary statistical models and a family of information
deviations that include Kullback-Leibler information as a special case.
When applied to Gaussian measures it takes the classical Hilbert space
(Sobolev space) theories for estimation (regression, filtering,
approximation, smoothing) as a special case. When the statistical and
computational models are properly distinguished, the dilemmas of
over-fitting and ``curse of dimensionality'' disappear, and the optimal
model order, disregarding computing cost, is always infinity.
.............
will no doubt be interested in the long history of the relationship
between reproducing kernel Hilbert spaces (rkhs), Gaussian measures
and regularization; see
1962 Proceedings of the Symposium on Time Series Analysis,
edited by Murray Rosenblatt, Wiley, esp. the paper by Parzen
1962 J. Hajek, On linear statistical problems in stochastic processes,
Czech. Math. J., v. 87.
1971 Kimeldorf and Wahba, Some results on Tchebycheffian spline functions,
J. Math. Anal. Applic., v. 33.
1990 G. Wahba, Spline Models for Observational Data, SIAM
1997 F. Girosi, An equivalence between sparse approximation and
support vector machines, to appear Neural Comp
1997 G. Wahba, Support vector machines, reproducing kernel Hilbert
spaces and the randomized GACV, to appear, Schoelkopf, Burges
and Smola, eds, forthcoming book on Support Vector Machines, MIT Press
1981 C. Micchelli and G. Wahba, Design problems for optimal
surface interpolation, Approximation Theory and Applications,
Z. Ziegler, ed., Academic Press. Also numerous works by
L. Plaskota and others on optimal bases. The first k eigenfunctions
of the reproducing kernel are well known to have certain
optimal properties under restricted circumstances,
see e.g. Ch. 12 of Spline Models and the references there, but
if there are n observations, then the Bayes estimates
are found in an at most n-dimensional subspace of the rkhs
associated with the prior (KW 1971); a small numerical sketch of
this point follows below. B. Silverman 1982,
`On the estimation of a probability density function by
the maximum penalized likelihood method', Ann. Statist.,
will also be of interest: convergence rates are related
to the rate of decay of the eigenvalues of the reproducing
kernel.
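
As a minimal numerical sketch of the KW 1971 point, assuming a Gaussian
(RBF) kernel, simulated one-dimensional data, and an illustrative smoothing
parameter lam (all of these choices are assumptions made only for
illustration, not taken from the papers above): with n observations the
penalized/Bayes estimate has the form f(t) = sum_i c_i K(t, x_i), so it
lies in the at most n-dimensional span of the kernel sections at the
data points, and the eigenvalue printout hints at the decay that drives
the convergence rates just mentioned.

import numpy as np

def rbf_kernel(x, y, length_scale=0.3):
    # Gaussian (RBF) reproducing kernel on the line; the length scale is
    # an illustrative choice.
    return np.exp(-0.5 * ((x[:, None] - y[None, :]) / length_scale) ** 2)

rng = np.random.default_rng(0)
n = 30
x = np.sort(rng.uniform(0.0, 1.0, n))                     # design points
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)  # noisy observations

lam = 1e-2                                       # smoothing parameter (assumed)
K = rbf_kernel(x, x)                             # n x n Gram matrix
c = np.linalg.solve(K + n * lam * np.eye(n), y)  # representer coefficients

# The estimate f(t) = sum_i c_i K(t, x_i) lives in the (at most)
# n-dimensional subspace of the rkhs spanned by the kernel sections K(., x_i).
t = np.linspace(0.0, 1.0, 200)
f_hat = rbf_kernel(t, x) @ c
print("estimate evaluated at", len(t), "points, built from", n, "kernel sections")

# Eigenvalues of the Gram matrix decay rapidly for a smooth kernel;
# this decay is what governs the convergence rates mentioned above.
eigvals = np.linalg.eigvalsh(K)[::-1]
print("largest 5 eigenvalues: ", np.round(eigvals[:5], 4))
print("smallest 5 eigenvalues:", np.round(eigvals[-5:], 10))

Solving the n x n linear system (K + n*lam*I)c = y is exactly the
finite-dimensional computation the n-dimensional-subspace result licenses,
even though the prior lives in an infinite-dimensional rkhs.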
Grace Wahba http://www.stat.wisc.edu/~wahba/