preprints available
Klaus Obermayer
oby at cs.tu-berlin.de
Wed Apr 29 06:37:47 EDT 1998
Dear Connectionists,
I am happy to announce a series of papers on topographic clustering,
self-organizing maps, dissimilarity data, and kernels.
Cheers
Klaus
------------------------------------------------------------------------
Prof. Klaus Obermayer           phone:  49-30-314-73442
FR2-1, NI, Informatik                   49-30-314-73120
Technische Universitaet Berlin  fax:    49-30-314-73121
Franklinstrasse 28/29           e-mail: oby at cs.tu-berlin.de
10587 Berlin, Germany           http://ni.cs.tu-berlin.de/
=========================================================================
A Stochastic Self-organizing Map for Proximity Data
T. Graepel and K. Obermayer
We derive an efficient algorithm for topographic mapping of proximity
data (TMP), which can be seen as an extension of Kohonen's Self-
Organizing Map to arbitrary distance measures. The TMP cost function is
derived in a Bayesian framework of Folded Markov Chains for the description
of autoencoders. It incorporates the data via a dissimilarity matrix
${\mathcal D}$ and the topographic neighborhood via a matrix ${\mathcal H}$
of transition probabilities. From the principle of Maximum Entropy a
non-factorizing Gibbs-distribution is obtained, which is approximated in
a mean-field fashion. This allows for Maximum Likelihood estimation using
an EM-algorithm. In analogy to the transition from Topographic Vector
Quantization (TVQ) to the Self-organizing Map (SOM) we suggest an
approximation to TMP which is computationally more efficient. In order to
prevent convergence to local minima, an annealing scheme in the temperature
parameter is introduced, for which the critical temperature of the first
phase-transition is calculated in terms of ${\mathcal D}$ and ${\mathcal
H}$. Numerical results demonstrate the working of the algorithm and confirm
the analytical results. Finally, the algorithm is used to generate a
connection map of areas of the cat's cerebral cortex.
to appear in: Neural Computation
preprint: http://ni.cs.tu-berlin.de/publications/#journals
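
To make the role of the neighborhood matrix concrete, here is a minimal
sketch of one way to build a matrix of transition probabilities between
map nodes. The abstract only states that ${\mathcal H}$ holds transition
probabilities; the 1-D chain of nodes, the Gaussian shape, and the names
neighborhood_matrix and sigma are assumptions made for illustration, not
details taken from the paper.

```python
import math

def neighborhood_matrix(n_nodes, sigma=1.0):
    """Row-stochastic matrix H for nodes on a 1-D chain (illustrative assumption).

    H[r][s] is read as the probability of a transition from node r to node s.
    """
    rows = []
    for r in range(n_nodes):
        # Unnormalized Gaussian weight on the index distance |r - s|.
        weights = [math.exp(-((r - s) ** 2) / (2.0 * sigma ** 2))
                   for s in range(n_nodes)]
        total = sum(weights)
        # Normalize so that each row sums to one.
        rows.append([w / total for w in weights])
    return rows

H = neighborhood_matrix(5, sigma=1.0)
for row in H:
    print(" ".join(f"{h:.3f}" for h in row))
```

Each row sums to one, and nodes that are close on the chain receive larger
transition probabilities than distant ones.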
-------------------------------------------------------------------------
Fuzzy Topographic Kernel Clustering
T. Graepel and K. Obermayer
A new topographic clustering algorithm is proposed, which - by the use
of integral operator kernel functions - efficiently estimates the centers
of clusters in a high-dimensional feature space, which is related to data
space by some nonlinear map. As in the Self-Organizing Map, topography
is imposed by assuming finite transition probabilities between cluster
indices. The optimization of the associated cost function is achieved by
estimating the parameters via an EM-scheme and deterministic annealing.
The effect of different radial basis function kernels on topographic maps
of handwritten digit data is examined in computer simulations.
In: W. Brauer, editor, Proceedings of the 5th GI Workshop Fuzzy Neuro
Systems '98, pages 90-97, 1998.
preprint: http://ni.cs.tu-berlin.de/publications/#conference
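
The idea of estimating cluster centers in feature space without computing
the nonlinear map can be illustrated with the standard kernel trick. The
sketch below assumes that a center is written as a weighted sum of mapped
data points, m = sum_i a_i phi(x_i), and uses a Gaussian radial basis
function kernel. The function names, the width parameter, and the
coefficient representation are illustrative assumptions, not the notation
of the paper.

```python
import math

def rbf_kernel(x, y, width=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * width^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * width ** 2))

def feature_space_sq_distance(x, data, coeffs, kernel=rbf_kernel):
    """Squared distance ||phi(x) - m||^2 with m = sum_i coeffs[i] * phi(data[i]).

    Only kernel evaluations are needed; phi is never computed explicitly.
    """
    k_xx = kernel(x, x)
    # Inner product <phi(x), m>.
    cross = sum(a * kernel(x, xi) for a, xi in zip(coeffs, data))
    # Squared norm <m, m>.
    norm_m = sum(ai * aj * kernel(xi, xj)
                 for ai, xi in zip(coeffs, data)
                 for aj, xj in zip(coeffs, data))
    return k_xx - 2.0 * cross + norm_m

# A center that is the feature-space mean of two points (coefficients sum to one).
data = [(0.0, 0.0), (1.0, 0.0)]
coeffs = [0.5, 0.5]
print(feature_space_sq_distance((0.5, 0.0), data, coeffs))
```

Changing the kernel or its width changes the induced distance, which is
the effect the simulations in the paper examine.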
-------------------------------------------------------------------------
An Annealed Self-Organizing Map for Source-Channel Coding
M. Burger, T. Graepel, and K. Obermayer
We derive and analyse robust optimization schemes for noisy vector
quantization on the basis of deterministic annealing. Starting from a
cost function for central clustering that incorporates distortions from
channel noise we develop a soft topographic vector quantization algorithm
(STVQ) which is based on the maximum entropy principle and which performs
a maximum-likelihood estimate in an expectation-maximization (EM) fashion.
Annealing in the temperature parameter $\beta$ leads to phase transitions
in the existing code vector representation during the cooling process for
which we calculate critical temperatures and modes as a function of
eigenvectors and eigenvalues of the covariance matrix of the data and the
transition matrix of the channel noise. A whole family of vector
quantization algorithms is derived from STVQ, among them a deterministic
annealing scheme for Kohonen's self-organizing map (SOM). This algorithm,
which we call SSOM, is then applied to vector quantization of image data
to be sent via a noisy binary symmetric channel. The algorithm's
performance is compared to that of LBG and STVQ. While it is naturally
superior to LBG, which does not take into account channel noise, its
results compare very well to those of STVQ, which is computationally much
more demanding.
to appear in: NIPS 10 proceedings
preprint: http://ni.cs.tu-berlin.de/publications/#conference
The theory is quite well described in:
T. Graepel, M. Burger, and K. Obermayer. Phase transitions in Stochastic
Self-Organizing Maps. Phys. Rev. E, 56(4):3876-3890, 1997.
preprint: http://ni.cs.tu-berlin.de/publications/#journals
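
The following is a rough sketch of an EM iteration with deterministic
annealing in the spirit of STVQ, not the authors' implementation. It
assumes that $\beta$ acts as an inverse temperature that is increased
step by step, that the E-step assigns P(r|x) proportional to
exp(-beta/2 * sum_s H[r][s] * ||x - w_s||^2), and that the M-step sets
each code vector to the corresponding weighted mean of the data. The
function name stvq, the schedule betas, and the toy data are
illustrative choices.

```python
import math

def stvq(data, H, init_w, betas, n_iter=20):
    """Soft topographic VQ sketch: EM at each value of beta, beta increasing."""
    w = [list(v) for v in init_w]
    n_nodes, dim = len(w), len(w[0])
    post = []
    for beta in betas:                      # annealing schedule (assumed)
        for _ in range(n_iter):
            # E-step: soft assignments P(r | x) under the noisy-channel cost.
            post = []
            for x in data:
                d = [sum((xi - wi) ** 2 for xi, wi in zip(x, ws)) for ws in w]
                e = [0.5 * sum(H[r][s] * d[s] for s in range(n_nodes))
                     for r in range(n_nodes)]
                e_min = min(e)              # shift for numerical stability
                p = [math.exp(-beta * (er - e_min)) for er in e]
                z = sum(p)
                post.append([pr / z for pr in p])
            # M-step: each code vector becomes a weighted mean of the data.
            for s in range(n_nodes):
                g = [sum(p[r] * H[r][s] for r in range(n_nodes)) for p in post]
                total = sum(g)
                if total > 0.0:
                    w[s] = [sum(gi * x[k] for gi, x in zip(g, data)) / total
                            for k in range(dim)]
    return w, post

# Toy 1-D data with two groups, two nodes, and a fixed transition matrix.
data = [(0.0,), (0.1,), (-0.1,), (5.0,), (5.1,), (4.9,)]
H = [[0.9, 0.1], [0.1, 0.9]]
w, post = stvq(data, H, init_w=[(2.4,), (2.6,)], betas=[0.5, 1.0, 2.0, 10.0])
print(w)
```

On this toy input the two code vectors separate toward the two groups,
while the off-diagonal entries of H keep each of them pulled slightly
toward the other group. The schedule here starts at a value of beta for
which the code vectors already split; if it started far below the first
transition, the code vectors could collapse onto each other and a small
perturbation would be needed for them to separate again.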
-----------------------------------------------------------------------
Review-style paper for practitioners:
Self-Organizing Maps: Generalization and New Optimization Techniques
T. Graepel, M. Burger, and K. Obermayer
We offer three algorithms for the generation of topographic mappings to
the practitioner of unsupervised data analysis. The algorithms are each
based on the minimization of a cost function which is performed using an
EM algorithm and deterministic annealing. The soft topographic vector
quantization algorithm (STVQ) -- like the original Self-Organizing Map
(SOM) -- provides a tool for the creation of self-organizing maps of
Euclidean data. Its optimization scheme, however, offers an alternative
to the heuristic stepwise shrinking of the neighborhood width in the SOM
and makes it possible to use a fixed neighborhood function solely to
encode desired neighborhood relations between nodes.
The kernel-based soft topographic mapping (STMK) is a generalization of
STVQ and introduces new distance measures in data space based on kernel
functions. Using the new distance measures corresponds to performing the
STVQ in a high-dimensional feature space, which is related to data space
by a nonlinear mapping. This preprocessing can reveal structure of the
data which may go unnoticed if the STVQ is performed in the standard
Euclidean space.
The soft topographic mapping for proximity data (STMP) is another
generalization of STVQ that enables the user to generate topographic maps
for data which are given in terms of pairwise proximities. It thus offers
a flexible alternative to multidimensional scaling methods and opens up
a new range of applications for Self-Organizing Maps. Both STMK and STMP
share the robust optimization properties of STVQ due to the application
of deterministic annealing. In our contribution we discuss the algorithms
together with their implementation and provide detailed pseudo-code and
explanations.
to appear in: Neurocomputing
preprint: http://ni.cs.tu-berlin.de/publications/#journals