Kernel geometry, invariance, and support vector machines

Chris Burges cjcb at molson.ho.lucent.com
Thu Jul 2 14:50:49 EDT 1998


The following paper is available at

    http://svm.research.bell-labs.com/SVMdoc.html



Geometry and Invariance in Kernel Based Methods

C.J.C. Burges, Bell Laboratories, Lucent Technologies

To Appear In:
Advances in Kernel Methods - Support Vector Learning,
Eds. B. Schoelkopf, C. Burges, A. Smola, MIT Press, Cambridge, USA, 1998


We explore the questions of (1) how to describe the intrinsic geometry of the
manifolds which occur naturally in methods, such as support vector machines
(SVMs), in which the choice of kernel specifies a nonlinear mapping of one's
data to a Hilbert space; and (2) how one can find kernels which are locally
invariant under some given symmetry.  The motivation for exploring the geometry
of support vector methods is to gain a better intuitive understanding of the
manifolds to which one's data is being mapped, and hence of the support vector
method itself: we show, for example, that the Riemannian metric induced on the
manifold by its embedding can be expressed in closed form in terms of the
kernel.  The motivation for looking for classes of kernels which instantiate
local invariances is to find ways to incorporate known symmetries of the problem
into the model selection (i.e. kernel selection) phase.  A
useful by-product of the geometry analysis is a necessary test which any
proposed kernel must pass if it is to be a support vector kernel (i.e. a kernel
which satisfies Mercer's positivity condition); as an example, we use this to
show that the hyperbolic tangent kernel (for which the SVM is a two-layer neural
network) violates Mercer's condition for various values of its parameters, a
fact noted previously only experimentally.  A basic result of the invariance
analysis is that directly imposing a symmetry on the class of kernels
effectively results in a preprocessing step, in which the preprocessed data lies
in a space whose dimension is reduced by the number of generators of the
symmetry group.  Any desired kernels can then be used on the preprocessed data.
We give a detailed example of vertical translation invariance for pixel data,
where the binning of the data into pixels has some interesting consequences.
The paper comprises two parts: Part 1 studies the geometry of the kernel
mapping, and Part 2 the incorporation of invariances by choice of kernel.
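
As a rough illustration of the metric-based necessary test mentioned above,
the sketch below assumes a dot-product kernel K(x, y) = f(x . y).  Writing the
induced metric as g_ij = <d_i Phi, d_j Phi> and differentiating the kernel
gives g = f'(r2) I + f''(r2) x x^T with r2 = |x|^2, whose eigenvalues are
f'(r2) (tangential, present for input dimension >= 2) and f'(r2) + r2 f''(r2)
(radial).  Since g is a Gram matrix of tangent vectors, a negative eigenvalue
rules the kernel out.  The function names and the parameter values are
hypothetical, chosen only for this example; the paper's own formulation of the
test may differ in detail.

```python
import math

def metric_eigenvalues(f1, f2, r2):
    """Eigenvalues of g = f'(r2) I + f''(r2) x x^T for K(x, y) = f(x . y).

    f1 and f2 are the first and second derivatives of f; r2 = |x|^2.
    Returns (tangential, radial); the tangential one needs dimension >= 2.
    """
    return f1(r2), f1(r2) + r2 * f2(r2)

def tanh_derivs(a, b):
    """Derivatives of f(u) = tanh(a*u + b)."""
    def f1(u):
        return a / math.cosh(a * u + b) ** 2
    def f2(u):
        s = a * u + b
        return -2.0 * a * a * math.tanh(s) / math.cosh(s) ** 2
    return f1, f2

def poly_derivs(p):
    """Derivatives of f(u) = (u + 1)**p, an inhomogeneous polynomial kernel."""
    def f1(u):
        return p * (u + 1.0) ** (p - 1)
    def f2(u):
        return p * (p - 1) * (u + 1.0) ** (p - 2)
    return f1, f2

def passes_metric_test(f1, f2, r2_values):
    """True if the metric is positive semidefinite at every sampled r2.

    This is only a necessary condition: passing does not prove that the
    kernel satisfies Mercer's condition, but failing rules it out.
    """
    return all(min(metric_eigenvalues(f1, f2, r2)) >= 0.0 for r2 in r2_values)

if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 2.0]  # sampled values of |x|^2 (arbitrary choice)
    print("tanh, a=1, b=0:", passes_metric_test(*tanh_derivs(1.0, 0.0), grid))
    print("poly, p=2     :", passes_metric_test(*poly_derivs(2), grid))
```

With these illustrative parameters the hyperbolic tangent kernel yields a
negative radial eigenvalue at |x|^2 = 1, so it fails the test there, while the
quadratic polynomial kernel passes on the whole sampled grid.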

