Paper on Neocognitron Training available on neuroprose

Thomas H. Hildebrandt thildebr at aragorn.csee.lehigh.edu
Mon Mar 30 10:47:48 EST 1992


About two weeks ago, David Lovell posted to CONNECTIONISTS announcing
the note that he had placed in the neuroprose archive.  I have a few
comments on the paper.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

   A NOTE ON A CLOSED-FORM TRAINING ALGORITHM FOR THE NEOCOGNITRON

			  David Lovell, Ah Chung Tsoi & Tom Downs
  Intelligent Machines Laboratory, Department of Electrical Engineering
         University of Queensland, Queensland 4072, Australia

In this note, a difficulty with the application of Hildebrandt's
closed-form training algorithm for the neocognitron is reported.  In
applying this algorithm we have observed that S-cells frequently fail
to respond to features that they have been trained to extract.  We
present results which indicate that this training vector rejection is
an important factor in the overall classification performance of the
neocognitron trained using Hildebrandt's procedure.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In my paper, "Optimal training of thresholded linear correlation
classifiers" (IEEE Transactions on Neural Networks, Nov. 1991), a
one-step training procedure is outlined which configures the cells in
a layer of Fukushima-type neurons so that their classification regions
are mutually exclusive.
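As a rough sketch of the idea (my own illustration, with invented
function names, not the published algorithm): each cell can be modeled
as a thresholded correlation unit whose classification region is a
cone around its normalized weight vector, and mutual exclusion can be
enforced by choosing a common vertex half-angle no larger than half
the smallest pairwise angle between weight vectors.

    import numpy as np

    def train_mutually_exclusive(prototypes):
        # One-step configuration: normalize each class prototype and
        # pick a common vertex half-angle no larger than half the
        # smallest pairwise angle, so no two cones can overlap.
        W = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
        cosines = np.clip(W @ W.T, -1.0, 1.0)
        np.fill_diagonal(cosines, -1.0)          # ignore self-correlations
        half_angle = np.arccos(cosines.max()) / 2.0
        return W, half_angle

    def classify(x, W, half_angle):
        # Fire the cell whose cone contains x; None means "rejected".
        x = x / np.linalg.norm(x)
        c = np.clip(W @ x, -1.0, 1.0)
        best = int(np.argmax(c))
        return best if np.arccos(c[best]) <= half_angle else None

Note that with prototypes along orthogonal axes this rule yields
exactly the 45-degree half-angle discussed below.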

In three or more dimensions, this mutual exclusion condition causes
gaps to open up among the cones that define the classification
regions.  An input pattern that falls into one of these gaps is
rejected.  As the number of dimensions increases, so does the relative
volume assigned to these rejection regions.
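To make that growth concrete, here is a small Monte Carlo sketch (my
own illustration, with assumed parameters): one 45-degree cone around
each coordinate axis, which is the mutually exclusive configuration
described below, and random non-negative input directions (cell
activities are non-negative, hence the positive orthant):

    import numpy as np

    # Estimate the fraction of random non-negative directions that fall
    # into none of d mutually exclusive 45-degree cones around the axes.
    rng = np.random.default_rng(0)
    threshold = np.cos(np.radians(45.0))
    for d in (2, 3, 5, 10):
        X = np.abs(rng.normal(size=(100000, d)))       # positive orthant
        X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit directions
        rejected = (X < threshold).all(axis=1)         # inside no cone
        print(d, "rejected fraction ~", round(float(rejected.mean()), 3))

In two dimensions the two cones tile the positive quadrant exactly;
from three dimensions on, the rejected fraction is nonzero and grows
with the dimension.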

In the general linear model presented in the paper, if the number of
classes is fewer than the number of dimensions, then the half-angle at
the vertex of the cone is set to 45 degrees, in order to obtain mutual
exclusion.  On the other hand, to obtain complete coverage (no
rejections), it is necessary to choose an angle which depends on the
number of dimensions as follows:

vertex half-angle = arccos(1 / sqrt(dimensions))

So for the first 10 dimensions, the half-angles at which complete
coverage is achieved are (in degrees):

dimensions:    1    2      3     4      5      6      7      8      9     10
half-angle:    0   45  54.73    60  63.43  65.90  67.79  69.29  70.53  71.56
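These values follow directly from the formula; a minimal check (in
Python, my own snippet) reproduces the table up to rounding:

    import numpy as np

    # Complete-coverage vertex half-angle arccos(1/sqrt(d)), in degrees.
    for d in range(1, 11):
        print(d, round(float(np.degrees(np.arccos(1.0 / np.sqrt(d)))), 2))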


In Fukushima's training procedure, classification regions compete for
training patterns, so the final configuration of the network more
closely resembles one that achieves complete coverage than one that
achieves mutual exclusion of the classification regions.  In
classification problems, the correct classification rate and the
rejection rate are fundamentally in opposition.  It is therefore not
surprising that the network trained using Fukushima's procedure
achieved a higher classification rate than the one trained using my
one-step procedure.
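That opposition can be seen in a toy sweep (entirely my own
construction, on made-up data, standing in for neither network):
widening the vertex half-angle reduces rejections but admits more
ambiguous patterns, lowering the accuracy on the patterns accepted.

    import numpy as np

    # Toy rejection/accuracy trade-off: noisy samples around 5 random
    # unit prototypes in 10 dimensions, classified by best correlation.
    rng = np.random.default_rng(1)
    d, k, n = 10, 5, 5000
    W = rng.normal(size=(k, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    labels = rng.integers(k, size=n)
    X = W[labels] + 0.5 * rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)

    c = X @ W.T
    best, best_cos = c.argmax(axis=1), c.max(axis=1)
    for deg in (30, 45, 60, 75):            # sweep the vertex half-angle
        ok = best_cos >= np.cos(np.radians(deg))
        acc = float((best[ok] == labels[ok]).mean()) if ok.any() else 0.0
        print(deg, "reject", round(float(1 - ok.mean()), 3),
              "correct-of-accepted", round(acc, 3))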

A fairer comparison would be obtained by relaxing the mutual exclusion
of REGIONS in the latter network to the mutual exclusion of SAMPLES
(i.e., requiring only that each training sample fall in one and only
one classification region).  In that case, the rejection rate for my
network is expected to be lower in general, and the classification
rate correspondingly higher.
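One way to realize that relaxation (a sketch of my own, not taken from
either paper): keep the trained weight vectors, but widen each cone up
to the nearest training sample of any OTHER class, so that no training
sample falls inside more than one enlarged region while the gaps
between regions shrink.

    import numpy as np

    def sample_exclusive_half_angles(W, X, labels):
        # For each cell j (row of normalized weights W), widen the cone
        # to just short of the nearest training sample of another class.
        X = X / np.linalg.norm(X, axis=1, keepdims=True)
        angles = np.empty(len(W))
        for j, w in enumerate(W):
            foreign = X[labels != j]
            a = np.arccos(np.clip(foreign @ w, -1.0, 1.0))
            angles[j] = a.min()
        return angles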

				Thomas H. Hildebrandt
				CSEE Department
				Lehigh University


