New paper in neuroprose

Tue Jun 6 06:52:25 EDT 2006

FTP-host: archive.cis.ohio-state.edu
FTP-filename: /pub/neuroprose/priel.2_layered_perc.ps.Z

The file priel.2_layered_perc.ps.Z is now available for copying
from the Neuroprose archive. The paper is 41 pages long and has been
submitted for publication in "Physical Review E".

A limited number of hard copies (10) is reserved for those who cannot
use the FTP server.

      Computational Capabilities of Restricted Two Layered Perceptrons

				     by

	  Avner Priel, Marcelo Blatt, Tal Grossman and Eytan Domany
	 Electronics Department, The Weizmann Institute of Science,
			   Rehovot 76100, Israel.

				     and
				 Ido Kanter
		 Department of Physics, Bar Ilan University,
			  52900 Ramat Gan, Israel.

Abstract:
We study the extent to which fixing the second layer weights
reduces the capacity and generalization ability of a two-layer perceptron.
Architectures with $N$ inputs, $K$ hidden units and a single
output are considered, with both overlapping and non-overlapping
receptive fields. We obtain from simulations one measure of the
strength of a network - its critical capacity, $\alpha_c$.
Using the ansatz $\tau_{med} \propto (\alpha_c - \alpha)^{-2}$ to
describe the manner in which the median learning time diverges
as $\alpha_c$ is approached, we estimate $\alpha_c$ in a manner
that does not depend on arbitrary impatience parameters. The $CHIR$
learning algorithm is used in our simulations.
For $K=3$ and overlapping receptive fields we show that the general
machine is equivalent to the Committee with the same architecture.
For $K=5$ and the same
connectivity the general machine is the union of four distinct
networks with fixed second layer weights, of which the Committee
is the one with the highest $\alpha_c$.
Since the capacity of the union of a finite set of machines
equals that of the strongest constituent, the capacity of the
general machine with $K=5$ equals that of the Committee.
We investigated the internal representations used by different
machines, and found that high correlations between the hidden units
and the output reduce the capacity. Finally, we studied the Boolean
functions that can be realized by networks with fixed second layer
weights. We discovered that two different machines implement two
completely distinct sets of Boolean functions.
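
The extrapolation described in the abstract can be illustrated with a
short sketch. If $\tau_{med} \propto (\alpha_c - \alpha)^{-2}$, then
$\tau_{med}^{-1/2}$ is linear in $\alpha$ and vanishes at
$\alpha = \alpha_c$, so a straight-line fit gives an estimate of
$\alpha_c$ without a learning-time cutoff. The function name
estimate_alpha_c, the use of a least-squares fit, and the synthetic
numbers below are assumptions made for illustration only; they are not
the authors' procedure or results.

```python
import numpy as np


def estimate_alpha_c(alphas, tau_med):
    """Estimate alpha_c assuming tau_med ~ (alpha_c - alpha)^(-2).

    Hypothetical helper, not taken from the paper.
    """
    alphas = np.asarray(alphas, dtype=float)
    tau_med = np.asarray(tau_med, dtype=float)
    # Under the ansatz, y = tau_med^(-1/2) = c * (alpha_c - alpha),
    # which is a straight line in alpha.
    y = tau_med ** -0.5
    slope, intercept = np.polyfit(alphas, y, 1)
    # The fitted line crosses zero at alpha = -intercept / slope.
    return -intercept / slope


# Synthetic data generated from the ansatz itself (made-up values,
# not simulation results from the paper).
assumed_alpha_c = 2.0
alphas = np.linspace(0.5, 1.5, 6)
tau_med = 10.0 * (assumed_alpha_c - alphas) ** -2

alpha_c_hat = estimate_alpha_c(alphas, tau_med)
print(alpha_c_hat)
```

With real simulation data the median learning times would be noisy,
so the fit would typically be restricted to loads close enough to
$\alpha_c$ for the ansatz to hold.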