PhD Thesis available for FTP in neuroprose

Mr. H. Zhu hzhu at liverpool.ac.uk
Thu Oct 14 13:55:43 EDT 1993


FTP-host: archive.cis.ohio-state.edu (128.146.8.52)
FTP-file: pub/neuroprose/zhu.thesis.ps.Z

PhD Thesis (222 pages) available in neuroprose repository.
(An index entry and a sample ftp procedure follow the abstract.)

		NEURAL NETWORKS AND ADAPTIVE COMPUTERS:
	Theory and Methods of Stochastic Adaptive Computation

			    Huaiyu Zhu
	Department of Statistics and Computational Mathematics
	    Liverpool University, Liverpool L69 3BX, UK

ABSTRACT:  This thesis studies the theory of stochastic adaptive 
computation based on neural networks.  A mathematical theory of 
computation is developed in the framework of information geometry, 
which generalises Turing machine (TM) computation in three respects 
(it can be continuous, stochastic and adaptive) and retains TM 
computation as a subclass called ``data processing''.  The concepts 
of Boltzmann distribution, Gibbs sampler and simulated annealing are 
formally defined and their interrelationships are studied.  The 
concept of a ``trainable information processor'' (TIP), a 
parameterised stochastic mapping together with a rule for changing 
the parameters, is introduced as an abstraction of neural network 
models.  A mathematical theory of the class of homogeneous semilinear 
neural networks is developed; this class includes most commonly 
studied NN models, such as the back-propagation network, the 
Boltzmann machine and the Hopfield net, and a general scheme is 
developed to classify their structures, dynamics and learning rules.
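
For readers who have not met these ingredients, the following minimal
Python sketch (not taken from the thesis) illustrates how a Boltzmann
distribution, a one-variable Gibbs-style resampling step and a
simulated-annealing cooling schedule fit together; the toy energy
function, candidate states and cooling rate are arbitrary assumptions
made only for illustration.

import math
import random

def energy(x):
    # Toy energy function with more than one local minimum
    # (made up purely for illustration).
    return (x * x - 4.0) ** 2 + x

def boltzmann_sample(states, T):
    # Draw a state from the Boltzmann distribution p(s) ~ exp(-E(s)/T)
    # over a finite candidate set.  Subtracting the minimum energy
    # before exponentiating does not change the distribution but keeps
    # the arithmetic stable at low temperature.
    energies = [energy(s) for s in states]
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / T) for e in energies]
    r = random.uniform(0.0, sum(weights))
    cumulative = 0.0
    for s, w in zip(states, weights):
        cumulative += w
        if r <= cumulative:
            return s
    return states[-1]

def simulated_annealing(states, T0=10.0, cooling=0.95, steps=200):
    # Each step resamples the single variable from its Boltzmann
    # distribution at the current temperature, then lowers the
    # temperature; the final samples concentrate on global minima.
    T = T0
    x = states[0]
    for _ in range(steps):
        x = boltzmann_sample(states, T)
        T *= cooling
    return x

states = [i / 10.0 for i in range(-40, 41)]
x = simulated_annealing(states)
print(x, energy(x))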

All previously known general learning rules are based on gradient 
following (GF) and are therefore susceptible to local optima in 
weight space.  Contrary to the widely held belief that this is 
rarely a problem in practice, numerical experiments show that for 
most non-trivial learning tasks GF learning never converges to a 
global optimum.  To overcome local optima, simulated annealing 
is introduced into the learning rule, so that the network retains 
an adequate amount of ``global search'' during learning. 
Extensive numerical experiments confirm that the network then always 
converges to a global optimum in weight space.  The resulting 
learning rule is also easier to implement and more biologically 
plausible than the back-propagation and Boltzmann machine learning 
rules: only a scalar needs to be back-propagated for the whole network.
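
To make the idea concrete, here is a generic Python sketch (not the
learning rule developed in the thesis) of simulated annealing in
weight space: propose a random perturbation of the weights, evaluate
the change in a single scalar error, and accept or reject it with a
temperature-dependent Metropolis rule.  The single-unit network, the
toy task and the cooling schedule are assumptions made only for
illustration.

import math
import random

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0.0:
        return 1.0 / (1.0 + math.exp(-s))
    z = math.exp(s)
    return z / (1.0 + z)

# Toy task (logical AND) and a single semilinear unit with two input
# weights plus a bias; both are arbitrary assumptions.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
        ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

def output(weights, inputs):
    s = weights[2] + weights[0] * inputs[0] + weights[1] * inputs[1]
    return sigmoid(s)

def error(weights):
    # A single scalar summarises performance on the whole task; this
    # is the only quantity the annealing rule needs to feed back.
    return sum((output(weights, x) - t) ** 2 for x, t in data)

def anneal_train(T0=1.0, cooling=0.999, steps=5000, scale=0.3):
    w = [0.0, 0.0, 0.0]
    e, T = error(w), T0
    for _ in range(steps):
        # Propose a random perturbation of all weights ("global search").
        w_new = [wi + random.gauss(0.0, scale) for wi in w]
        e_new = error(w_new)
        # Metropolis rule: always accept an improvement, occasionally
        # accept a worse error with probability exp(-(e_new - e) / T),
        # so the search can escape local optima while T is high.
        if e_new <= e or random.random() < math.exp(-(e_new - e) / T):
            w, e = w_new, e_new
        T *= cooling
    return w, e

weights, final_error = anneal_train()
print(weights, final_error)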

Various connectionist models have been proposed in the literature for
solving particular classes of problems, without a general method by
which their merits can be combined.  Instead of proposing yet another
model, we try to build a modular structure in which each module is
essentially a TIP.  As an extension of simulated annealing to temporal
problems, we generalise the theory of dynamic programming and Markov
decision processes to allow adaptive learning, resulting in a
computational system called a ``basic adaptive computer''.  Its
advantage over earlier reinforcement learning systems, such as
Sutton's ``Dyna'', is that it can adapt in a combinatorial environment
and still converge to a global optimum.
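
As background only: the classical dynamic-programming ingredient being
generalised is value iteration on a Markov decision process.  The toy
Python sketch below, with a made-up two-state MDP, shows that
ingredient; the ``basic adaptive computer'' of the thesis is not this
algorithm but an adaptive generalisation of it.

# A made-up two-state, two-action Markov decision process:
# transition[s][a] = (next_state, reward).
transition = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (0, 0.0), 1: (1, 2.0)},
}
gamma = 0.9  # discount factor (arbitrary choice)

def value_iteration(tolerance=1e-8):
    # Repeatedly apply the Bellman optimality update
    #     V(s) <- max_a [ r(s, a) + gamma * V(s') ]
    # until the values stop changing.
    V = {s: 0.0 for s in transition}
    while True:
        delta = 0.0
        for s, actions in transition.items():
            best = max(r + gamma * V[s2] for (s2, r) in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tolerance:
            return V

print(value_iteration())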

  The theories are developed with a universal normalisation scheme 
for all the learning parameters so that the learning system can be 
built without prior knowledge of the problems it is to solve.
___________________________________________________________________

INDEX entry:

zhu.thesis.ps.Z hzhu at liverpool.ac.uk
222 pages.
Foundation of stochastic adaptive computation based on neural networks.
Simulated annealing learning rule superior to backpropagation and 
Boltzmann machine learning rules.  Reinforcement learning for 
combinatorial state space and action space.  
(Mathematics with simulation results plus philosophy.)

---------------------
Sample ftp procedure:

unix$ ftp archive.cis.ohio-state.edu
Name (archive.cis.ohio-state.edu:name): ftp (or anonymous)
Password: (your email address including @)
ftp> cd pub/neuroprose
ftp> binary
ftp> get zhu.thesis.ps.Z
ftp> quit
unix$ uncompress zhu.thesis.ps.Z
unix$ lpr -P<printer_name> zhu.thesis.ps

The last two steps can also be combined into a single command:
unix$ zcat zhu.thesis.ps.Z | lpr -P<printer_name>
which saves some disk space.

----------------------
Note:

This announcement is being sent simultaneously to the following three 
mailing lists:
   connectionists at cs.cmu.edu, anneal at sti.com, reinforce at cs.uwa.edu.au
My apologies to those who subscribe to more than one of them.

I'm sorry that there is no hard copy available.

-- 
Huaiyu Zhu			hzhu at liverpool.ac.uk
Dept. of Stat. & Comp. Math., University of Liverpool, L69 3BX, UK


