doctoral thesis on automatic structuring available by ftp
Uli Bodenhausen
uli at ira.uka.de
Fri Feb 10 11:10:18 EST 1995
The following doctoral thesis is available by ftp.
Sorry, no hardcopies available.
ftp://archive.cis.ohio-state.edu/pub/neuroprose/Thesis/bodenhausen.thesis.ps.Z
FTP-host: archive.cis.ohio-state.edu
FTP-file: pub/neuroprose/Thesis/bodenhausen.thesis.ps.Z
-----------------------------------------------------------------------------
Automatic Structuring of Neural Networks for
Spatio-Temporal Real-World Applications
(153 pages)
Ulrich Bodenhausen
Doctoral Thesis
University of Karlsruhe, Germany
Abstract
The successful application of speech recognition (SR) and on-line
handwriting recognition (OLHR) systems to new domains depends greatly
on the tuning of a recognizer's architecture to the new task. Architectural
tuning is especially important if the amount of training data is small
because the amount of training data limits the number of trainable
parameters that can be estimated properly using an automatic learning
algorithm. The number of trainable parameters of a connectionist SR
or OLHR system depends on architectural parameters such as the width
of the input windows over time, the number of hidden units, and the
number of state units. Each of these architectural parameters provides
a different functionality in the system, and they cannot be optimized
independently.
Manual optimization of these architectural parameters is time-consuming
and expensive. Automatic optimization algorithms can free the developer
of SR and OLHR applications from this task.
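As a rough illustration of how these architectural parameters determine
the number of trainable parameters, consider a single-hidden-layer
network fed by a sliding window of input frames. The Python sketch below
is a generic back-of-the-envelope estimate under that assumption; the
function name and the example numbers are illustrative, not taken from
the thesis.

def approx_parameter_count(n_features, window_width, n_hidden, n_states):
    # Weights (plus biases) of a single-hidden-layer network whose
    # input is a window of `window_width` frames with `n_features`
    # coefficients each, and whose output has one unit per state.
    input_to_hidden = (n_features * window_width + 1) * n_hidden
    hidden_to_states = (n_hidden + 1) * n_states
    return input_to_hidden + hidden_to_states

# E.g. 16 coefficients per frame, a window of 5 frames, 12 hidden
# units and 10 states give (16*5+1)*12 + (12+1)*10 = 1102 weights,
# i.e. a "small" system of roughly 1,000 trainable parameters.
print(approx_parameter_count(16, 5, 12, 10))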
In this thesis I develop and evaluate novel methods that automatically
allocate connectionist resources for spatio-temporal classification
problems. The methods are evaluated against the following criteria:
- Suitability for small systems (~ 1,000 parameters) as well as for large
systems (more than 10,000 parameters): Is the proposed method efficient
for various sizes of the system?
- Ease of use for non-expert users: How much knowledge is necessary to
adapt the system to a customized application?
- Final performance: Can the automatically optimized system compete with
state-of-the-art, well-engineered systems?
Several algorithms were developed and evaluated in this thesis. The
Automatic Structure Optimization (ASO) algorithm performed best under the
above criteria. ASO automatically optimizes
- the width of the input windows over time, which allows the following
units of the neural network to capture a certain amount of temporal
context from the input signal;
- the number of hidden units, which allows the neural network to learn
non-linear classification boundaries;
- the number of states that are used to model segments of the
spatio-temporal input, such as acoustic segments of speech or strokes
of on-line handwriting.
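To make the role of each resource concrete, here is a minimal Python
sketch (using numpy) of a network governed by exactly these three
parameters. It assumes a simple sliding-window, single-hidden-layer
formulation and is illustrative only; it is not the architecture
implemented in the thesis.

import numpy as np

class SpatioTemporalNet:
    # Toy network whose capacity is set by window_width, n_hidden
    # and n_states, mirroring the three parameters listed above.
    def __init__(self, n_features, window_width, n_hidden, n_states, seed=0):
        rng = np.random.default_rng(seed)
        self.window_width = window_width
        # The input layer sees `window_width` consecutive frames at
        # once: a wider window hands more temporal context to the
        # following units.
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_features * window_width))
        # Hidden units supply the non-linear classification boundaries.
        self.W_out = rng.normal(0.0, 0.1, (n_states, n_hidden))

    def forward(self, frames):
        # frames: (T, n_features) array with T >= window_width.
        # Returns per-state scores for every window position; the
        # states stand in for segment models (acoustic segments,
        # strokes) in a real recognizer.
        T = frames.shape[0]
        w = self.window_width
        outputs = []
        for t in range(T - w + 1):
            x = frames[t:t + w].reshape(-1)   # windowed input slice
            h = np.tanh(self.W_in @ x)        # non-linear hidden layer
            outputs.append(self.W_out @ h)    # one score per state
        return np.stack(outputs)              # (T - w + 1, n_states)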
The ASO algorithm uses a constructive approach to find the best
architecture. Training starts with a neural network of minimum size.
Resources are added specifically to improve the parts of the network
that are involved in classification errors. ASO was developed on the
recognition of spoken letters, where it improved the accuracy on an
independent test set from 88.0% to 92.2% over a manually tuned
architecture. The performance of architectures found by ASO for
different domains and databases is also compared with that of
architectures optimized manually by other researchers. For example,
ASO improved the accuracy on on-line handwritten digits from 98.5% to
99.5% over a manually optimized architecture. It is also shown that
ASO can successfully adapt to different sizes of the training database
and that it can be applied to the recognition of connected spoken
letters.
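Continuing the sketch above, the constructive loop can be illustrated
as follows, reusing the SpatioTemporalNet class. This is a deliberately
simplified greedy search under my own assumptions: candidate training
is omitted, and instead of ASO's error-driven attribution of errors to
specific resources, every single-step increment is simply tried and the
best one kept.

def dev_error(net, dev_set):
    # Fraction of toy dev examples whose pooled state scores pick the
    # wrong label. A real run would train each candidate network
    # before scoring it; training is omitted here for brevity.
    wrong = 0
    for frames, label in dev_set:
        scores = net.forward(frames).sum(axis=0)   # pool over time
        wrong += int(int(np.argmax(scores)) != label)
    return wrong / len(dev_set)

def constructive_search(dev_set, n_features, n_classes, rounds=5):
    # Start with a minimum-size architecture and grow it step by step.
    arch = {"window_width": 1, "n_hidden": 2, "n_states": n_classes}
    best = dev_error(SpatioTemporalNet(n_features, **arch), dev_set)
    for _ in range(rounds):
        trials = []
        for key, step in (("window_width", 1), ("n_hidden", 2), ("n_states", 1)):
            cand = dict(arch, **{key: arch[key] + step})
            err = dev_error(SpatioTemporalNet(n_features, **cand), dev_set)
            trials.append((err, cand))
        err, cand = min(trials, key=lambda t: t[0])
        if err >= best:
            break                # added resources no longer help
        best, arch = err, cand
    return arch

# Toy usage: thirty random 8-coefficient frame sequences, 3 classes.
rng = np.random.default_rng(1)
dev = [(rng.normal(size=(20, 8)), int(rng.integers(3))) for _ in range(30)]
print(constructive_search(dev, n_features=8, n_classes=3))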
The ASO algorithm is applicable to all classification problems with
spatio-temporal input. It was tested on speech and on-line handwriting,
as two instances of such tasks. The approach is new, requires no
domain-specific knowledge from the user, and is efficient. It is shown
for the first time that fully automatic tuning of all relevant
architectural parameters of speech and on-line handwriting recognizers
(window widths, number of hidden units and states) to the domain and
the available amount of training data is actually possible with the
ASO algorithm. Automatic tuning by ASO is efficient, both in terms of
computational effort and final performance.
------------------------------------------------------------------------
Instructions for ftp retrieval of this paper are given below. Our
university requires that the title page be in German; the rest of the
thesis is in English.
FTP INSTRUCTIONS:
unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: <your e-mail address>
ftp> cd pub/neuroprose/Thesis
ftp> binary
ftp> get bodenhausen.thesis.ps.Z
ftp> quit
unix> uncompress bodenhausen.thesis.ps.Z
Thanks to Jordan Pollack for maintaining this archive.
Uli Bodenhausen
=======================================================================
Uli Bodenhausen
University of Karlsruhe
Germany
uli at ira.uka.de
=======================================================================