Corrected ps file of previously announced TR.

Lee Giles giles at research.nj.nec.com
Thu Aug 11 19:01:15 EDT 1994


A figure in the PostScript file of this TR caused printing problems on certain 
printers. We have changed the figure to eliminate the problem, and we apologize 
to anyone who was inconvenienced. The revised TR also includes some references 
that were unintentionally omitted from the original version.

Lee Giles, Bill Horne, Tsungnan Lin


_________________________________________________________________________________________



                 Learning a Class of Large Finite State Machines
                      with a Recurrent Neural Network

        UNIVERSITY OF MARYLAND TECHNICAL REPORT UMIACS-TR-94-94 AND CS-TR-3328

                 C. L. Giles[1,2], B. G. Horne[1], T. Lin[1,3] 

      [1] NEC Research Institute, 4 Independence Way, Princeton, NJ  08540
            [2] UMIACS, University of Maryland, College Park, MD 20742
          [3] EE Department, Princeton University, Princeton, NJ 08540

                      {giles,horne,lin}@research.nj.nec.com
                
One of the issues in any learning model is how it scales with problem size. Neural 
networks have not been immune to scaling issues. We show that a dynamically-
driven discrete-time recurrent network (DRNN) can learn rather large grammatical 
inference problems when the strings of a finite memory machine (FMM) are 
encoded as temporal sequences. FMMs are a subclass of finite state machines 
which have a finite memory or a finite order of inputs and outputs. The DRNN that 
learns the FMM is a neural network that maps directly from the sequential 
machine implementation of the FMM. It has feedback only from the output and not 
from any hidden units; an example is the recurrent network of Narendra and 
Parthasarathy. (FMMs that have zero order in the feedback of outputs are called 
definite memory machines and are analogous to Time-delay or Finite Impulse 
Response neural networks.) Due to their topology, these DRNNs are at least as 
powerful as any sequential machine implementation of an FMM and should be 
capable of representing any FMM. We choose to learn ``particular'' FMMs. 
Specifically, these FMMs have a large number of states (simulations are for $256$ 
and $512$ state FMMs) but have minimal order, relatively small depth, and little 
logic when the FMM is implemented as a sequential machine. Simulations of the 
number of training examples versus generalization performance and FMM extraction 
size show that the number of training samples necessary for perfect generalization 
is less than that sufficient to completely characterize the FMM to be learned. This 
is in a sense a best-case learning problem, since an arbitrarily chosen FMM with a 
minimal number of states would have much higher order and string depth and would 
most likely require more logic in its sequential machine implementation.
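
As an illustration only (this is not code from the TR, and the sizes and weights 
below are arbitrary choices), the following Python sketch shows the general 
output-feedback structure the abstract refers to, in the style of Narendra and 
Parthasarathy: the output at each time step is computed from a finite window of 
past inputs and past outputs, and feedback is taken only from the output unit, 
not from any hidden units.

import numpy as np

# Illustrative sketch only (not the architecture or code from the TR): an
# output-feedback recurrent network in the style of Narendra and Parthasarathy.
# The output at time t depends on a finite window of past inputs and past
# outputs, mirroring the finite memory machine (FMM) definition above.
# N_IN, N_OUT, and HIDDEN are arbitrary illustrative sizes.

N_IN, N_OUT, HIDDEN = 3, 2, 8          # input order, output order, hidden units
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(HIDDEN, N_IN + N_OUT))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.5, size=(1, HIDDEN))
b2 = np.zeros(1)

def step(u_window, y_window):
    """Map the last N_IN inputs and N_OUT outputs to the current output y(t)."""
    x = np.concatenate([u_window, y_window])   # tapped delay lines
    h = np.tanh(W1 @ x + b1)                   # single hidden layer
    return float(np.tanh(W2 @ h + b2)[0])      # feedback is taken only from here

def run(inputs):
    """Drive the network with a string encoded as a temporal sequence."""
    u_win, y_win = np.zeros(N_IN), np.zeros(N_OUT)
    outputs = []
    for u in inputs:
        u_win = np.roll(u_win, 1); u_win[0] = u    # shift the input delay line
        y = step(u_win, y_win)
        y_win = np.roll(y_win, 1); y_win[0] = y    # feed the output back
        outputs.append(y)
    return outputs

print(run([1, 0, 1, 1, 0, 0, 1]))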


--------------------------------------------------------------------------------------

                          FTP INSTRUCTIONS

                unix> ftp cs.umd.edu (128.8.128.8)
                Name: anonymous
                Password: (your_userid at your_site)
                ftp> cd pub/papers/TRs
                ftp> binary
                ftp> get 3328.ps.Z
                ftp> quit
                unix> uncompress 3328.ps.Z

                                 OR

                unix> ftp external.nj.nec.com (138.15.10.100)
                Name: anonymous
                Password: (your_userid at your_site)
                ftp> cd pub/giles/papers
                ftp> binary
                ftp> get large.fsm.ps.Z
                ftp> quit
                unix> uncompress large.fsm.ps.Z

--------------------------------------------------------------------------------------


--                                 
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482
==




