PhD Thesis available: Learning with Kernels

Alex Smola smola at first.gmd.de
Thu Jan 7 12:14:30 EST 1999


Dear Connectionists,

I am pleased to announce that my PhD thesis
"Learning with Kernels" is now available at

http://svm.first.gmd.de/papers/Smola98.ps.gz

Abstract

Support Vector (SV) Machines combine several techniques from
statistics, machine learning, and neural networks. Among their most
important ingredients are kernels, i.e. the idea of turning linear
algorithms into nonlinear ones via a map into feature spaces. The
present work focuses on the following issues:

- Extensions of Support Vector Machines.
- Extensions of kernel methods to other settings, such as
  unsupervised learning.
- Capacity bounds that are particularly well suited to kernel
  methods.
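
To make the kernel idea above concrete, here is a minimal illustrative
sketch in Python (not code from the thesis): the homogeneous polynomial
kernel k(x,y) = (x.y)^2 computes an inner product in a feature space of
degree-2 monomials, so a linear algorithm that only uses inner products
becomes nonlinear in input space. The explicit feature map phi below is
the standard one for this particular kernel and serves purely as an
illustration.

    import numpy as np

    def phi(x):
        # Explicit feature map for the degree-2 homogeneous polynomial
        # kernel on R^2: all degree-2 monomials (sqrt(2) on the cross term).
        x1, x2 = x
        return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

    def k(x, y):
        # Kernel evaluation (x . y)^2, computed without ever forming phi.
        return np.dot(x, y) ** 2

    x = np.array([1.0, 2.0])
    y = np.array([3.0, 0.5])

    print(k(x, y))                 # 16.0
    print(np.dot(phi(x), phi(y)))  # 16.0, the same inner product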

After a brief introduction to SV regression, it is shown how the
classical \epsilon-insensitive loss function can be replaced by other
cost functions while keeping the original advantages or adding
features such as automatic parameter adaptation.
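
For reference, the classical loss referred to above is

    |y - f(x)|_\epsilon := max(0, |y - f(x)| - \epsilon),

i.e. deviations smaller than \epsilon are not penalized at all and
larger deviations are penalized linearly; the "automatic parameter
adaptation" presumably refers to variants in which such parameters are
adjusted by the optimization itself rather than fixed in advance.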

Moreover, the connection between kernels and regularization is
pointed out. A theoretical analysis of several common kernels
follows, and criteria for checking Mercer's condition more easily are
presented.
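
A simple necessary consequence of Mercer's condition (a convenient
numerical sanity check, not one of the thesis's criteria) is that every
Gram matrix K_ij = k(x_i, x_j) must be symmetric positive semidefinite.
A minimal sketch in Python, using the Gaussian RBF kernel as an example:

    import numpy as np

    def gaussian_kernel(x, y, sigma=1.0):
        # Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
        return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))            # 20 sample points in R^3

    # Gram matrix on the sample.
    K = np.array([[gaussian_kernel(xi, xj) for xj in X] for xi in X])

    # A Mercer kernel yields a symmetric positive semidefinite Gram matrix;
    # check the smallest eigenvalue (up to numerical error).
    print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True for a valid kernel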

Further modifications lead to semiparametric models and greedy
approximation schemes. Next, three types of optimization algorithms,
namely interior point codes, subset selection algorithms, and
sequential minimal optimization (including pseudocode), are
presented. The primal-dual framework is used as an analytic tool in
this context.
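
To give a flavour of sequential minimal optimization (shown here for
the classification setting of Platt's formulation, as a hedged sketch
rather than the pseudocode from the thesis): the smallest possible
subproblem optimizes two Lagrange multipliers analytically while all
others are held fixed. A single such update step, assuming a Gram
matrix K (as a NumPy array), labels y in {-1, +1}, errors
E[k] = f(x_k) - y_k, and box constraint C, might look as follows.

    def smo_pair_update(i, j, alpha, y, K, E, C):
        # Optimize alpha[i], alpha[j] analytically, keeping every other
        # multiplier fixed, 0 <= alpha <= C, and sum_k alpha_k y_k constant.
        if y[i] != y[j]:
            L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
        else:
            L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
        eta = K[i, i] + K[j, j] - 2.0 * K[i, j]   # curvature along the constraint
        if L == H or eta <= 0.0:
            return alpha[i], alpha[j]             # skip degenerate pairs
        a_j = alpha[j] + y[j] * (E[i] - E[j]) / eta
        a_j = min(H, max(L, a_j))                 # clip to the feasible segment
        a_i = alpha[i] + y[i] * y[j] * (alpha[j] - a_j)
        return a_i, a_j

A full SMO solver repeats such steps over heuristically chosen pairs
and updates the threshold and error cache after each step.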

Unsupervised learning is treated as an extension of kernel methods to
new problems. Besides Kernel PCA, one can use the regularization
framework to obtain more general feature extractors. A second
approach leads to regularized quantization functionals which allow a
smooth transition between the Generative Topographic Map and
Principal Curves.
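
A minimal sketch of the Kernel PCA step mentioned above (the standard
centring-and-eigendecomposition recipe, not code from the thesis),
given an n x n Gram matrix K on the training sample:

    import numpy as np

    def kernel_pca(K, n_components=2):
        # Kernel PCA from an (n x n) Gram matrix K: centre the data
        # implicitly in feature space, then eigendecompose.
        n = K.shape[0]
        one_n = np.ones((n, n)) / n
        K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
        eigvals, eigvecs = np.linalg.eigh(K_c)        # ascending order
        eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
        # Projection of training point i onto component k is
        # sqrt(lambda_k) * v_k[i] for the unit-norm eigenvector v_k.
        lam = np.maximum(eigvals[:n_components], 0.0)
        return eigvecs[:, :n_components] * np.sqrt(lam)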

The second part of the thesis deals with uniform convergence bounds
for the algorithms and concepts presented so far. It starts with a
brief, self-contained overview of existing techniques and an
introduction to the functional-analytic tools which play a crucial
role in this problem. By viewing the class of kernel expansions as
the image of a linear operator, it is possible to give bounds on the
generalization ability of kernel expansions even when standard
concepts like the VC dimension fail or give overly conservative
estimates.
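
Schematically, the uniform convergence bounds in question have the
form (constants and the exact notion of covering number vary with the
formulation)

    P( sup_{f in F} |R_emp(f) - R(f)| > \epsilon )
        <= c_1 N(c_2 \epsilon, F, m) exp(-c_3 m \epsilon^2),

so controlling the covering numbers N of the function class F directly
yields generalization guarantees even where the VC dimension of F is
infinite.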

In particular, it is shown that the covering numbers of the given
hypothesis classes can be computed directly, instead of taking the
detour via the VC dimension. Applications of the new tools to SV
machines, convex combinations of hypotheses (i.e. boosting and sparse
coding), greedy approximation schemes, and principal curves conclude
the presentation.

-- 
    / Alex J. Smola           GMD FIRST             /
   / Room 214                Rudower Chaussee 5    /
  / Tel: (+49)30-6392-1833  12489 Berlin, Germany /
 / Fax: (+49)30-6392-1805  smola at first.gmd.de    /
/ URL: http://www.first.gmd.de/~smola           /

