Thesis available: The Neocognitron...Limitations and Improvements

Fri Jul 8 15:03:14 EDT 1994

FTP-host: archive.cis.ohio-state.edu
FTP-filename: /pub/neuroprose/Thesis/lovell.thesis.tar

Title: The Neocognitron as a System for Handwritten Character Recognition:
       Limitations and Improvements

Size:  3080192 bytes (compressed),
       234 pages (10 point, single spaced, double-sided format),
       100 figures.

No hardcopies available (sorry!)

Hi folks,
This is just to let you know that my doctoral dissertation can be retrieved
from the Neuroprose archive. I know 3 MB is pretty hefty for a compressed file
but there are a lot of detailed PostScript figures and bitmaps in the
document. Here's how to print it out (once you have retrieved it):

    tar xf lovell.thesis.tar
    cd dlovell-thesis
    zcat ch0.ps.Z | lpr -P(your PostScript printer)
    zcat ch1.ps.Z | lpr -P(your PostScript printer)
    zcat ch2.ps.Z | lpr -P(your PostScript printer)
    etc...

I have written a C shell script called "print-thesis.csh" which will automate
the uncompressing and printing process for you. The README file contains an
explanation of how to use "print-thesis.csh".
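
For readers who would rather loop over the chapters by hand, the manual steps
above can be sketched as a small shell function. This is only an illustration
and is not the contents of "print-thesis.csh". The function name
print_chapters, the PRINT_CMD override and the assumption that every file to
print matches ch*.ps.Z are all hypothetical; check the README for the actual
file list.

```shell
#!/bin/sh
# Hypothetical helper (not the author's print-thesis.csh).
# Usage: print_chapters <directory> <printer-name>
# Assumes the chapter files are named ch*.ps.Z, as in the commands above.
# PRINT_CMD is an assumed override so the loop can be tried without a printer,
# e.g. PRINT_CMD=cat sends the PostScript to standard output instead.
print_chapters() {
    dir=$1
    printer=$2
    for f in "$dir"/ch*.ps.Z; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        if [ -n "$PRINT_CMD" ]; then
            zcat "$f" | $PRINT_CMD
        else
            zcat "$f" | lpr -P"$printer"
        fi
    done
}
```

After unpacking the tar file, something like `print_chapters dlovell-thesis
myprinter` would send each chapter to the printer, where "myprinter" is a
placeholder for the name of your PostScript printer. Files are handled in the
order the shell expands the pattern.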

I hope that the thesis will be useful (or at least interesting) to anyone
working in the area of off-line character recognition with hierarchical neural
networks.

Best regards,
David

-------------------------------------------------------------------------------

     The Neocognitron as a System for Handwritten Character Recognition:
                     Limitations and Improvements

                           by David R. Lovell

A thesis submitted for the degree of Doctor of Philosophy, Department of
Electrical and Computer Engineering, University of Queensland.

ABSTRACT

This thesis is about the neocognitron, a neural network that was
proposed by Fukushima in 1979. Inspired by Hubel and Wiesel's serial
model of processing in the visual cortex, the neocognitron was
initially intended as a self-organizing model of vision; however, we
are concerned with the supervised version of the network, put forward
by Fukushima in 1983. Through "training with a teacher", Fukushima
hoped to obtain a character recognition system that was tolerant of
shifts and deformations in input images. Until now, though, it has not
been clear whether Fukushima's approach has resulted in a network that
can rival the performance of other recognition systems.

In the first three chapters of this thesis, the biological basis,
operational principles and mathematical implementation of the
supervised neocognitron are presented in detail. At the end of this
thorough introduction, we consider a number of important issues that
have not previously been addressed (at least not with any proven degree
of success). How should S-cell selectivity and other parameters be
chosen so as to maximize the network's performance? How sensitive is
the network's classification ability to the supervisor's choice of
training patterns? Can the neocognitron achieve state-of-the-art
recognition rates and, if not, what is preventing it from doing so?

Chapter 4 looks at the Optimal Closed-Form Training (OCFT) algorithm, a
method for adjusting S-cell selectivity, suggested by Hildebrandt in
1991. Experiments reveal flaws in the assumptions behind OCFT and
provide motivation for the development and testing (in Chapter 5) of
three new algorithms for selectivity adjustment: SOFT, SLOG and SHOP.
Of these methods, SHOP is shown to be the most effective, determining
appropriate selectivity values through the use of a validation set of
handwritten characters.

SHOP serves as a method for probing the behaviour of the neocognitron
and is used to investigate the effect of cell masks, skeletonization of
input data and choice of training patterns on the network's
performance. Even though SHOP is the best selectivity adjustment
algorithm to be described to date, the system's peak correct
recognition rate (for isolated ZIP code digits from the CEDAR database)
is around 75% (with 75% reliability) after SHOP training. It is clear
that the neocognitron, as originally described by Fukushima, is unable
to match the performance of today's most accurate digit recognition
systems, which typically achieve 90% correct recognition with near 100%
reliability.

After observing the neocognitron's failure to exploit the
distinguishing features of different kinds of digits in its
classification of images, Chapter 6 proposes modifications to enhance
the network's ability in this regard. Using this new architecture, a
correct classification rate of 84.62% (with 96.36% reliability) was
obtained on CEDAR ZIP codes, a substantial improvement but still a
level of performance that is somewhat less than state-of-the-art
recognition rates. Chapter 6 concludes with a critical review of the
hierarchical feature extraction paradigm.

The final chapter summarizes the material presented in this thesis and
draws the significant findings together in a series of conclusions. In
addition to the investigation of the neocognitron, this thesis also
contains a derivation of statistical bounds on the errors that arise in
multilayer feedforward networks as a result of weight perturbation
(Appendix E).

------------------------------------------------------------------------------
David Lovell - dlovell at elec.uq.oz.au      |
                                          |
Dept. Electrical and Computer Engineering | "Oh bother! The pudding is ruined
University of Queensland                  | completely now!" said Marjory, as
BRISBANE 4072                             | Henry the dachshund  leapt up and
Australia                                 |      into the lemon surprise.
                                          |
tel: (07) 365 3770                        |