Thesis available

Markus Svensen svensen at cns.mpg.de
Thu Apr 1 04:14:58 EST 1999


Dear Connectionists

My PhD thesis --- GTM: The Generative Topographic Mapping --- is
available for downloading in compressed postscript format from
http://www.ncrg.aston.ac.uk/GTM/ (this page also contains other material
on the GTM); abstract follows below.

It can also be obtained via anonymous ftp (see below) or from the NCRG
Publication database (http://www.ncrg.aston.ac.uk/Papers/, tech. rep.
no. NCRG/98/024, single-sided version only).


Markus Svensen          

Max-Planck-Institute            Email: svensen at cns.mpg.de
  of Cognitive Neuroscience     Phone: +49/0 341 9940 229
Postfach 500 355                Fax (not personal):
D-04303 LEIPZIG                        +49/0 341 9940 221
GERMANY


Abstract
========

This thesis describes the Generative Topographic Mapping (GTM) --- a
non-linear latent variable model, intended for modelling continuous,
intrinsically low-dimensional probability distributions, embedded in
high-dimensional spaces. It can be seen as a non-linear form of
principal component analysis or factor analysis. It also provides a
principled alternative to the self-organizing map --- a widely
established neural network model for unsupervised learning ---
resolving many of its associated theoretical problems.

An important potential application of the GTM is visualization of
high-dimensional data. Since the GTM is non-linear, the relationship
between data and its visual representation may be far from trivial,
but a better understanding of this relationship can be gained by
computing the so-called magnification factor. In essence, the
magnification factor relates the distances between data points, as
they appear when visualized, to the actual distances between those
data points.
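
As a rough illustration of the idea, the sketch below computes a local
magnification factor for a smooth mapping from a 2-D latent space into a
3-D data space. It assumes the factor is defined as the local area ratio
sqrt(det(J^T J)), where J is the Jacobian of the mapping; that definition
is an assumption made here, not a statement taken from this announcement.
The function `mapping` is a hypothetical stand-in, not a trained GTM.

```python
import numpy as np


def mapping(x):
    """Hypothetical smooth map from 2-D latent space to 3-D data space."""
    u, v = x
    return np.array([u, v, u**2 + v**2])


def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x, with shape (D, L)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def magnification_factor(f, x):
    """Assumed definition: local area ratio sqrt(det(J^T J)) at x."""
    J = jacobian(f, x)
    return float(np.sqrt(np.linalg.det(J.T @ J)))


if __name__ == "__main__":
    # The surface is flat at the origin and stretches away from it.
    for point in ([0.0, 0.0], [1.0, 0.0], [1.0, 1.0]):
        print(point, magnification_factor(mapping, point))
```

Where the value is close to 1, distances in the plot are roughly
faithful; larger values mark regions where nearby points in the plot
correspond to points that lie further apart in data space.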

There are two principal limitations of the basic GTM model. First, the
computational effort required grows exponentially with the intrinsic
dimensionality of the density model. However, if the intended
application is visualization, this will typically not be a problem.
The other limitation is the inherent structure of the GTM, which makes
it most suitable for modelling moderately curved probability
distributions of approximately rectangular shape. When the target
distribution is very different to that, the aim of maintaining an
`interpretable' structure, suitable for visualizing data, may come
into conflict with the aim of providing a good density model.
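
The first limitation can be made concrete with a small sketch. It
assumes the latent space is covered by a regular grid with the same
number of points along every axis; the abstract states only that the
cost grows exponentially, so the grid layout and the helper
`latent_grid_size` are illustrative assumptions.

```python
def latent_grid_size(points_per_axis, latent_dim):
    """Number of points in a regular grid over the latent space."""
    return points_per_axis ** latent_dim


if __name__ == "__main__":
    # Ten points per axis: the count is multiplied by ten per dimension.
    for latent_dim in range(1, 6):
        print(latent_dim, latent_grid_size(10, latent_dim))
```

With one or two latent dimensions, as used for visualization, the grid
stays small, which is why the cost is typically not a problem there.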

The fact that the GTM is a probabilistic model means that results from
probability theory and statistics can be used to address problems such
as model complexity. Furthermore, this framework provides solid ground
for extending the GTM to wider contexts than that of this thesis.

Keywords: latent variable model, visualization, magnification factor,
self-organizing map, principal component analysis


Availability via ftp
====================

The thesis is available via anonymous ftp from cs.aston.ac.uk, in the
directory neural/svensjfm/GTM. It is available in two versions:

NCRG_98_024.ps.Z contains the final version of the thesis, as it was
submitted to Aston University, formatted according to Aston's
regulations (1.5 line spacing and margins for single-sided printing);
112 pages on 112 A4 sheets.

NCRG_98_024_dblsided.ps.Z contains the same version of the thesis in
terms of content, but formatted slightly differently (single
line spacing and margins for double-sided printing, with blank pages
added as appropriate); 108 pages on 56 A4 sheets.

Naturally, the single-sided version will not look very nice when
printed double-sided, and vice versa!

