preprints available

Klaus Obermayer oby at cs.tu-berlin.de
Wed Apr 29 06:37:47 EDT 1998


Dear Connectionists,

I am happy to announce a series of papers on topographic clustering,
self-organizing maps, dissimilarity data, and kernels.

Cheers

Klaus

------------------------------------------------------------------------

Prof. Klaus Obermayer             phone:  49-30-314-73442
FR2-1, NI, Informatik                     49-30-314-73120
Technische Universitaet Berlin    fax:    49-30-314-73121
Franklinstrasse 28/29             e-mail: oby at cs.tu-berlin.de
10587 Berlin, Germany             http://ni.cs.tu-berlin.de/

=========================================================================


A Stochastic Self-organizing Map for Proximity Data

T. Graepel and K. Obermayer

We derive an efficient algorithm for topographic mapping of proximity 
data (TMP), which can be seen as an extension of Kohonen's Self-
Organizing Map to arbitrary distance measures. The TMP cost function is 
derived in a Bayesian framework of Folded Markov Chains for the description 
of autoencoders. It incorporates the data via a dissimilarity matrix 
${\mathcal D}$ and the topographic neighborhood via a matrix ${\mathcal H}$
of transition probabilities. From the principle of Maximum Entropy a 
non-factorizing Gibbs-distribution is obtained, which is approximated in 
a mean-field fashion. This allows for Maximum Likelihood estimation using 
an EM-algorithm. In analogy to the transition from Topographic Vector
Quantization (TVQ) to the Self-organizing Map (SOM) we suggest an 
approximation to TMP which is computationally more efficient. In order to 
prevent convergence to local minima, an annealing scheme in the temperature
parameter is introduced, for which the critical temperature of the first
phase-transition is calculated in terms of ${\mathcal D}$ and
${\mathcal H}$. Numerical results demonstrate the working of the algorithm
and confirm
the analytical results. Finally, the algorithm is used to generate a 
connection map of areas of the cat's cerebral cortex.

to appear in: Neural Computation

preprint: http://ni.cs.tu-berlin.de/publications/#journals
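
To illustrate how a dissimilarity matrix alone can stand in for
coordinates, the sketch below uses a standard identity for squared
Euclidean distances: the squared distance from a point to a weighted mean
of the data can be computed from pairwise squared distances only. This
illustrates the general idea and is not the TMP update rule from the
paper. The function name `sq_dist_to_centroids` and the weight matrix `b`
are assumptions made for this example.

```python
import numpy as np

def sq_dist_to_centroids(d, b):
    """Squared distances from each item to weighted centroids, using only d.

    d: (T, T) matrix of squared Euclidean dissimilarities.
    b: (T, N) non-negative weights (hypothetical), each column summing to 1,
       so centroid s is the implicit weighted mean sum_t b[t, s] * x_t.
    Returns a (T, N) matrix of ||x_t - centroid_s||^2.
    """
    # sum_u b[u, s] * d[t, u]: weighted dissimilarity of item t to cluster s
    cross = d @ b
    # 0.5 * sum_{u,v} b[u, s] * d[u, v] * b[v, s]: spread inside cluster s
    within = 0.5 * np.einsum('us,uv,vs->s', b, d, b)
    return cross - within[None, :]
```

For dissimilarities that are not squared Euclidean distances the same
expression can still be evaluated, but it no longer has this geometric
interpretation.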

-------------------------------------------------------------------------

Fuzzy Topographic Kernel Clustering

T. Graepel and K. Obermayer

A new topographic clustering algorithm is proposed, which - by the use 
of integral operator kernel functions - efficiently estimates the centers
of clusters in a high-dimensional feature space, which is related to data 
space by some nonlinear map. As in the Self-Organizing Map, topography
is imposed by assuming finite transition probabilities between cluster 
indices. The optimization of the associated cost function is achieved by
estimating the parameters via an EM-scheme and deterministic annealing. 
The effect of different radial basis function kernels on topographic maps 
of handwritten digit data is examined in computer simulations.

In: W. Brauer, editor, Proceedings of the 5th GI Workshop Fuzzy Neuro 
Systems '98, pages 90-97, 1998.

preprint: http://ni.cs.tu-berlin.de/publications/#conference
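
As a rough illustration of how kernel functions allow cluster centers in
feature space to be handled without computing the nonlinear map, the
sketch below evaluates squared feature-space distances from a kernel
matrix alone. It assumes each center is a linear combination of mapped
data points with coefficients `A`; the names `rbf_kernel`,
`feature_space_sq_dist`, `A`, and `gamma` are chosen for this example and
are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def feature_space_sq_dist(K, A):
    """Squared feature-space distances from data points to cluster centers.

    K: (T, T) kernel matrix, K[t, u] = k(x_t, x_u).
    A: (N, T) coefficients (assumed form) with center_r = sum_t A[r, t] phi(x_t).
    Returns a (T, N) matrix of ||phi(x_t) - center_r||^2.
    """
    self_term = np.diag(K)[:, None]                    # k(x_t, x_t)
    cross = K @ A.T                                    # sum_u A[r, u] k(x_t, x_u)
    norm = np.einsum('rt,tu,ru->r', A, K, A)[None, :]  # ||center_r||^2
    return self_term - 2.0 * cross + norm
```

With a linear kernel, `K = X @ X.T`, the result reduces to ordinary
squared Euclidean distances to the centers `A @ X`, which is a convenient
sanity check.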

-------------------------------------------------------------------------

An Annealed Self-Organizing Map for Source-Channel Coding

M. Burger, T. Graepel, and K. Obermayer

We derive and analyse robust optimization schemes for noisy vector 
quantization on the basis of deterministic annealing. Starting from a 
cost function for central clustering that incorporates distortions from 
channel noise we develop a soft topographic vector quantization algorithm
(STVQ) which is based on the maximum entropy principle and which performs 
a maximum-likelihood estimate in an expectation-maximization (EM) fashion. 
Annealing in the temperature parameter $\beta$ leads to phase transitions 
in the existing code vector representation during the cooling process for
which we calculate critical temperatures and modes as a function of
eigenvectors and eigenvalues of the covariance matrix of the data and the
transition matrix of the channel noise. A whole family of vector
quantization algorithms is derived from STVQ, among them a deterministic
annealing scheme for Kohonen's self-organizing map (SOM). This algorithm,
which we call SSOM, is then applied to vector quantization of image data
to be sent via a noisy binary symmetric channel. The algorithm's
performance is compared to that of LBG and STVQ. While it is naturally
superior to LBG, which does not take into account channel noise, its
results compare very well to those of STVQ, which is computationally much
more demanding.

to appear in: NIPS 10 proceedings

preprint:  http://ni.cs.tu-berlin.de/publications/#conference

The theory is quite well described in:

T. Graepel, M. Burger, and K. Obermayer. Phase transitions in Stochastic
Self-Organizing Maps. Phys. Rev. E, 56(4):3876-3890, 1997.

preprint: http://ni.cs.tu-berlin.de/publications/#journals
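
The EM iteration with deterministic annealing described in the STVQ
abstract above can be sketched roughly as follows. This is a minimal
illustration, not the authors' implementation. It assumes the cost
function E = 1/2 sum_t sum_r m_tr sum_s h_rs ||x_t - w_s||^2, treats
`beta` as an inverse temperature that is increased geometrically, and uses
a Gaussian neighborhood on a one-dimensional chain of nodes. All function
names, parameter names, and default values are choices made for this
example.

```python
import numpy as np

def chain_neighborhood(n_nodes, sigma=1.0):
    """Row-normalized Gaussian neighborhood on a 1-D chain (assumed form)."""
    idx = np.arange(n_nodes)
    h = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    return h / h.sum(axis=1, keepdims=True)

def stvq(x, h, beta_start=0.1, beta_end=50.0, rate=1.2, em_steps=20, seed=0):
    """Soft topographic vector quantization sketch.

    x: (T, d) data; h: (N, N) neighborhood matrix with rows summing to 1.
    Returns code vectors w (N, d) and soft assignments p (T, N).
    """
    rng = np.random.default_rng(seed)
    n_nodes = h.shape[0]
    # Start all code vectors near the data mean; small noise breaks symmetry.
    w = x.mean(axis=0) + 1e-3 * rng.standard_normal((n_nodes, x.shape[1]))
    p = np.full((x.shape[0], n_nodes), 1.0 / n_nodes)
    beta = beta_start
    while beta <= beta_end:
        for _ in range(em_steps):
            # Squared distances between data points and code vectors, (T, N).
            d2 = ((x[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
            # Partial costs e[t, r] = 1/2 sum_s h[r, s] * d2[t, s].
            e = 0.5 * d2 @ h.T
            # E-step: Gibbs distribution over nodes at inverse temperature beta.
            logits = -beta * e
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            p = np.exp(logits)
            p /= p.sum(axis=1, keepdims=True)
            # M-step: neighborhood-smoothed weighted means of the data.
            q = p @ h                                    # q[t, s] = sum_r p[t, r] h[r, s]
            w = (q.T @ x) / q.sum(axis=0)[:, None]
        beta *= rate  # cool the system: raise the inverse temperature
    return w, p
```

A call such as `w, p = stvq(x, chain_neighborhood(5))` returns five code
vectors for data `x`. Because each code vector is a weighted mean of the
data, it always lies inside the bounding box of `x`.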

-----------------------------------------------------------------------

Review-style paper for practitioners:

Self-Organizing Maps: Generalization and New Optimization Techniques

T. Graepel, M. Burger, and K. Obermayer

We offer three algorithms for the generation of topographic mappings to 
the practitioner of unsupervised data analysis. The algorithms are each 
based on the minimization of a cost function which is performed using an
EM algorithm and deterministic annealing. The soft topographic vector
quantization algorithm (STVQ) -- like the original Self-Organizing Map
(SOM) -- provides a tool for the creation of self-organizing maps of 
Euclidean data. Its optimization scheme, however, offers an alternative 
to the heuristic stepwise shrinking of the neighborhood width in the SOM
and makes it possible to use a fixed neighborhood function solely to 
encode desired neighborhood relations between nodes.

The kernel-based soft topographic mapping (STMK) is a generalization of
STVQ and introduces new distance measures in data space based on kernel 
functions. Using the new distance measures corresponds to performing the 
STVQ in a high-dimensional feature space, which is related to data space 
by a nonlinear mapping. This preprocessing can reveal structure of the 
data which may go unnoticed if the STVQ is performed in the standard 
Euclidean space.

The soft topographic mapping for proximity data (STMP) is another 
generalization of STVQ that enables the user to generate topographic maps
for data which are given in terms of pairwise proximities. It thus offers
a flexible alternative to multidimensional scaling methods and opens up 
a new range of applications for Self-Organizing Maps. Both STMK and STMP
share the robust optimization properties of STVQ due to the application 
of deterministic annealing. In our contribution we discuss the algorithms
together with their implementation and provide detailed pseudo-code and
explanations.

to appear in: Neurocomputing

preprint: http://ni.cs.tu-berlin.de/publications/#journals


