2 NIPS preprints on neuroprose

Richard S. Zemel zemel at salk.edu
Mon Jan 17 17:18:38 EST 1994


                    **DO NOT FORWARD TO OTHER GROUPS**

The following two papers have been placed in the neuroprose archive.  The
first prints on 9 pages, the second on 8.  The abstracts are given below,
followed by retrieval instructions.  Only electronic versions of these papers
are available.  Both are to appear in J.D. Cowan, G. Tesauro, and J.
Alspector (Eds.), Advances in Neural Information Processing Systems 6, San
Mateo, CA: Morgan Kaufmann.

 Rich Zemel
 e-mail: zemel at salk.edu

 FTP-host: archive.cis.ohio-state.edu
 FTP-filename: /pub/neuroprose/hinton.autoencoders.ps.Z
 FTP-filename: /pub/neuroprose/zemel.pop-codes.ps.Z

-----------------------------------------------------------------------------
   Autoencoders, Minimum Description Length and Helmholtz Free Energy

	      Geoffrey E. Hinton and Richard S. Zemel

An autoencoder network uses a set of {\it recognition} weights to convert an
input vector into a code vector.  It then uses a set of {\it generative}
weights to convert the code vector into an approximate reconstruction of the
input vector.  We derive an objective function for training autoencoders based
on the Minimum Description Length (MDL) principle.  The aim is to minimize the
information required to describe both the code vector and the reconstruction
error.  We show that this information is minimized by choosing code vectors
stochastically according to a Boltzmann distribution, where the generative
weights define the energy of each possible code vector given the input vector.
Unfortunately, if the code vectors use distributed representations, it is
exponentially expensive to compute this Boltzmann distribution because it
involves all possible code vectors.  We show that the recognition weights of
an autoencoder can be used to compute an approximation to the Boltzmann
distribution and that this approximation gives an upper bound on the
description length.  Even when this bound is poor, it can be used as a
Lyapunov function for learning both the generative and the recognition
weights.  We demonstrate that this approach can be used to learn factorial
codes.
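
(A rough numerical sketch, not the authors' code: it illustrates the
free-energy bound described above under strong simplifying assumptions --
binary code units, a factorial recognition distribution, toy layer sizes,
and a mean-code approximation to the expected reconstruction cost.  The
names R, G, and free_energy are invented for illustration.)

import numpy as np

rng = np.random.default_rng(0)

n_input, n_code = 8, 4                               # toy sizes (assumed)
R = rng.normal(scale=0.1, size=(n_code, n_input))    # recognition weights
G = rng.normal(scale=0.1, size=(n_input, n_code))    # generative weights

def free_energy(x, R, G, noise_var=1.0):
    """Approximate upper bound on the description length of input x.

    The recognition weights define a factorial distribution q over binary
    code vectors; the bound is E_q[energy] - entropy(q), i.e. a Helmholtz
    free energy, where the energy of a code combines the cost of coding
    the bits with the reconstruction error under the generative weights.
    """
    q = 1.0 / (1.0 + np.exp(-R @ x))      # prob. that each code bit is on
    entropy = -np.sum(q * np.log(q + 1e-12) + (1 - q) * np.log(1 - q + 1e-12))
    recon = G @ q                          # mean-code reconstruction (approx.)
    expected_energy = (0.5 * np.sum((x - recon) ** 2) / noise_var
                       + np.sum(q) * np.log(2.0))   # ~1 bit per active unit
    return expected_energy - entropy

x = rng.normal(size=n_input)
print(free_energy(x, R, G))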

-----------------------------------------------------------------------------
         Developing Population Codes By Minimizing Description Length

               Richard S. Zemel and Geoffrey E. Hinton

The Minimum Description Length principle (MDL) can be used to train the hidden
units of a neural network to extract a representation that is cheap to
describe but nonetheless allows the input to be reconstructed accurately.  We
show how MDL can be used to develop highly redundant population codes.  Each
hidden unit has a location in a low-dimensional {\em implicit} space.  If the
hidden unit activities form a bump of a standard shape in this space, they can
be cheaply encoded by the center of this bump.  So the weights from the input
units to the hidden units in an autoencoder are trained to make the
activities form a standard bump.  The coordinates of the hidden units in the
implicit space are also learned, giving the network the flexibility to
develop a discontinuous topography when presented with different input
classes.  Population coding in a space other than the input space enables a
network
to extract nonlinear higher-order properties of the inputs.
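
(Again a sketch rather than the paper's procedure: it assumes a
one-dimensional implicit space, a Gaussian bump of fixed width, and uses
the activity-weighted mean of the unit coordinates as the bump center;
bump_cost and its arguments are illustrative names.)

import numpy as np

def bump_cost(activities, coords, width=0.5):
    """Mismatch between hidden activities and a best-fitting standard bump.

    coords : learned position of each hidden unit in the implicit space.
    The bump center (the cheap code for this input) is taken here as the
    activity-weighted mean of the coordinates; the cost is the squared
    difference between the actual activities and an ideal Gaussian bump
    of standard shape centered there.
    """
    a = activities / (activities.sum() + 1e-12)
    center = np.sum(a * coords)                         # cheap code: bump center
    ideal = np.exp(-((coords - center) ** 2) / (2 * width ** 2))
    ideal *= activities.sum() / (ideal.sum() + 1e-12)   # match total activity
    return np.sum((activities - ideal) ** 2), center

acts = np.array([0.05, 0.2, 0.9, 0.7, 0.1])
locs = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
cost, center = bump_cost(acts, locs)
print(cost, center)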

-----------------------------------------------------------------------------

To retrieve from neuroprose:

unix> ftp cheops.cis.ohio-state.edu
Name (cheops.cis.ohio-state.edu:zemel): anonymous
Password: (use your email address)
ftp> cd pub/neuroprose
ftp> get zemel.pop-codes.ps.Z
ftp> get hinton.autoencoders.ps.Z
ftp> quit
unix> uncompress zemel.pop-codes.ps
unix> uncompress hinton.autoencoders.ps
unix> lpr zemel.pop-codes.ps
unix> lpr hinton.autoencoders.ps


