papers available by ftp: Recurrent net architectures

Bill Wilson bill at psy.uq.oz.au
Thu Jul 6 19:48:41 EDT 1995


The files wilson.recurrent.ps.Z and wilson.stability.ps.Z are now available
for copying from the Neuroprose repository:

File: wilson.recurrent.ps (4 pages)
Title: A Comparison of Architectural Alternatives for Recurrent Networks
Author: William H. Wilson

Abstract:
This paper describes a class of recurrent neural networks related to Elman
networks. The networks used herein differ from standard Elman networks in
that they may have more than one state vector. Such networks have an explicit
representation of the hidden unit activations from several steps back. In
principle, a single-state-vector network is capable of learning any sequential
task that a multi-state-vector network can learn. This paper describes
experiments which show that, in practice, and for the learning task used, a
multi-state-vector network can learn the task faster and better than a
single-state-vector network. The task used involved learning the graphotactic
structure of a sample of about 400 English words. The training method
and architecture used somewhat resemble backpropagation through time, but
differ in that multiple state vectors persist in the trained network, and in
that each state vector is connected to the hidden layer by its own set of
weights.
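
As a rough illustration of the architecture described above, here is a minimal
sketch (not the paper's code) of a forward pass for an Elman-style network with
several state vectors, i.e. copies of the hidden activations from the previous
few time steps, each feeding the hidden layer through its own weight matrix.
All names, sizes, and the choice of numpy are assumptions made for illustration.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MultiStateElman:
    # Illustrative only: an Elman-like net whose last n_states hidden
    # activation vectors are kept as explicit state vectors, each
    # connected to the hidden layer by its own weight matrix.
    def __init__(self, n_in, n_hidden, n_out, n_states, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_state = rng.normal(0.0, 0.1, (n_states, n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.states = np.zeros((n_states, n_hidden))

    def reset(self):
        # Zero every state vector, e.g. at a word boundary.
        self.states[:] = 0.0

    def step(self, x):
        # Hidden net input: current input plus a contribution from each
        # stored state vector through an independent weight matrix.
        net = self.W_in @ x
        for k in range(self.states.shape[0]):
            net += self.W_state[k] @ self.states[k]
        h = sigmoid(net)
        y = sigmoid(self.W_out @ h)
        # Shift the state history and store the newest hidden activations.
        self.states = np.roll(self.states, 1, axis=0)
        self.states[0] = h
        return y

With n_states = 1 this reduces to a standard Elman network, consistent with the
remark above that a single-state-vector network can in principle learn any
sequential task a multi-state-vector network can.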

-------------------------------------
File: wilson.stability.ps (4 pages)
Title: Stability of Learning in Classes of Recurrent and Feedforward Networks
Author: William H. Wilson

Abstract:
This paper concerns a class of recurrent neural networks related to Elman
networks (simple recurrent networks) and Jordan networks and a class of
feedforward networks architecturally similar to Waibel's TDNNs. The recurrent
nets used herein, unlike standard Elman/Jordan networks, may have more than
one state vector. It is known that such multi-state Elman networks have better
learning performance on certain tasks than standard Elman networks of similar
weight complexity. The task used involves learning the graphotactic structure
of a sample of about 400 English words. Learning performance was tested using
regimes in which the state vectors either are or are not zeroed between
words: zeroing yields a larger minimum total error, but avoids the large
oscillations in total error observed when the state vectors are never
zeroed. Learning performance comparisons of the three classes
of network favour the feedforward nets.
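
For concreteness, here is a sketch (again illustrative, not taken from the
paper) of the two state-zeroing regimes mentioned above, reusing the
MultiStateElman class and numpy import from the earlier sketch; 'words' is
assumed to be a list of words, each a sequence of (input, target) letter-vector
pairs.

def train_epoch(net, words, zero_between_words=True):
    # Sum-squared error over one pass through the word sample; in the
    # zeroed regime the state vectors are reset at every word boundary,
    # otherwise they carry over from word to word.
    total_error = 0.0
    for word in words:
        if zero_between_words:
            net.reset()
        for x, target in word:
            y = net.step(x)
            total_error += float(np.sum((target - y) ** 2))
            # (weight updates omitted; the first abstract notes a method
            # somewhat resembling backpropagation through time)
    return total_error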

Bill Wilson
Artificial Intelligence Laboratory
School of Computer Science and Engineering
University of New South Wales
Sydney 2052 Australia
Email: billw at cse.unsw.edu.au

