Reprint: LEARNING A CLASS OF LARGE FINITE STATE MACHINES WITH A RECURRENT NEURAL NETWORK
Lee Giles
giles at research.nj.nec.com
Thu Aug 4 18:00:26 EDT 1994
The following reprint is available via the University of Maryland Department
of Computer Science and the NEC Research Institute archives:
_____________________________________________________________________________
Learning a Class of Large Finite State Machines
with a Recurrent Neural Network
UNIVERSITY OF MARYLAND TECHNICAL REPORT UMIACS-TR-94-94 AND CS-TR-3328
C. L. Giles[1,2], B. G. Horne[1], T. Lin[1,3]
[1] NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
[2] UMIACS, University of Maryland, College Park, MD 20742
[3] EE Department, Princeton University, Princeton, NJ 08540
{giles,horne,lin}@research.nj.nec.com
One of the issues in any learning model is how it scales with problem size.
Neural networks have not been immune to scaling issues. We show that a
dynamically driven discrete-time recurrent network (DRNN) can learn rather
large grammatical inference problems when the strings of a finite memory
machine (FMM) are encoded as temporal sequences. FMMs are a subclass of
finite state machines which have a finite memory or a finite order of
inputs and outputs. The DRNN that learns the FMM is a neural network that
maps directly from the sequential machine implementation of the FMM. It has
feedback only from the output and not from any hidden units; an example is
the recurrent network of Narendra and Parthasarathy. (FMMs that have zero
order in the feedback of outputs are called definite memory machines and
are analogous to Time-delay or Finite Impulse Response neural networks.)
Due to their topology, these DRNNs are at least as powerful as any
sequential machine implementation of a FMM and should be capable of
representing any FMM. We choose to learn "particular" FMMs. Specifically,
these FMMs have a large number of states (simulations are for 256 and 512
state FMMs) but have minimal order, relatively small depth, and little logic
when the FMM is implemented as a sequential machine.
Simulations for the number of training examples versus generalization
performance and FMM extraction size show that the number of training
samples necessary for perfect generalization is less than that necessary to
completely characterize the FMM to be learned. This is, in a sense, a
best-case learning problem, since an arbitrarily chosen FMM with a minimal
number of states would have much greater order and string depth and would
most likely require more logic in its sequential machine implementation.
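
As an informal illustration (not part of the report), the short Python
sketch below shows the kind of output-feedback network described above:
each output is a one-hidden-layer feedforward function of a tapped delay
line of recent inputs and of the network's own past outputs, with no
feedback from hidden units. The orders, hidden-layer size, and random
weights are assumptions chosen only for this example.

import numpy as np

# Minimal sketch (an assumption, not the authors' code) of a DRNN whose only
# feedback comes from the output unit, in the style of Narendra and
# Parthasarathy: y(t) is a feedforward function of the last n inputs and the
# last m outputs.
class OutputFeedbackDRNN:
    def __init__(self, input_order=4, output_order=2, hidden=10, seed=0):
        rng = np.random.default_rng(seed)
        self.n, self.m = input_order, output_order      # tap-line lengths
        d = self.n + self.m                             # feedforward input size
        self.W1 = rng.normal(0.0, 0.5, (hidden, d))     # taps -> hidden
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.5, hidden)          # hidden -> output
        self.b2 = 0.0

    def run(self, symbols):
        """Feed a string of 0/1 symbols through the network; return the
        output produced at each time step."""
        u_taps = np.zeros(self.n)       # u(t), u(t-1), ..., u(t-n+1)
        y_taps = np.zeros(self.m)       # y(t-1), ..., y(t-m)
        outputs = []
        for u in symbols:
            u_taps = np.concatenate(([float(u)], u_taps[:-1]))
            h = np.tanh(self.W1 @ np.concatenate((u_taps, y_taps)) + self.b1)
            y = 1.0 / (1.0 + np.exp(-(self.W2 @ h + self.b2)))   # sigmoid
            y_taps = np.concatenate(([y], y_taps[:-1]))
            outputs.append(float(y))
        return outputs

# Example: run an (untrained) network over one input string.
net = OutputFeedbackDRNN(input_order=4, output_order=2, hidden=10)
print(net.run([1, 0, 1, 1, 0, 0, 1]))

Calling run() produces one output per input symbol; training (for example,
gradient descent on these weights) is omitted from the sketch.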
--------------------------------------------------------------------------
FTP INSTRUCTIONS
unix> ftp cs.umd.edu (128.8.128.8)
Name: anonymous
Password: (your_userid at your_site)
ftp> cd pub/papers/TRs
ftp> binary
ftp> get 3328.ps.Z
ftp> quit
unix> uncompress 3328.ps.Z
OR
unix> ftp external.nj.nec.com (138.15.10.100)
Name: anonymous
Password: (your_userid at your_site)
ftp> cd pub/giles/papers
ftp> binary
ftp> get large.fsm.ps.Z
ftp> quit
unix> uncompress large.fsm.ps.Z
--------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482