Reprint: Experimental Comparison of the Effect of Order in Recurrent Neural Networks
Lee Giles
giles@research.nj.nec.com
Fri Mar 5 12:22:26 EST 1993
The following reprint is available via the NEC Research Institute ftp archive
external.nj.nec.com. Instructions for retrieval from the archive follow
the summary.
--------------------------------------------------------------------------------
Experimental Comparison of the Effect of
Order in Recurrent Neural Networks
Clifford B. Miller(a) and C.L. Giles(a,b)
(a) NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
(b) Institute for Advanced Computer Studies, U. Maryland, College Park, MD 20742
ABSTRACT
There has been much interest in increasing the computational power of
neural networks. In addition, there has been much interest in "designing"
neural networks to better suit particular problems. Increasing the "order" of
the connectivity of a neural network permits both. Though order has played
a significant role in feedforward neural networks, its role in dynamically
driven recurrent networks is not yet well understood. This work explores the
effect of order in learning grammars. We present an experimental
comparison of first-order and second-order recurrent neural networks, as
applied to the task of grammatical inference. We show that for the small
grammars studied these two neural net architectures have comparable
learning and generalization power, and that both are reasonably capable of
extracting the correct finite state automata for the language in question.
However, for a larger, randomly generated ten-state grammar, second-order
networks significantly outperformed the first-order networks, both in
convergence time and generalization capability. We show that these
networks learn faster the more neurons they have (our experiments used up
to 10 hidden neurons), but that the solutions found by smaller networks are
usually of better quality (in terms of generalization performance after
training). Second-order nets have the advantage that they converge to a
solution more quickly and find it more reliably than first-order nets, but the
second-order solutions tend to be of poorer quality than those of the
first-order nets if both architectures are trained to the same error tolerance.
Despite this, finite state machines can be extracted more successfully from
second-order nets by applying heuristic clustering techniques to the internal
state representations. We speculate that this may be due to restrictions on
the ability of the first-order architecture to make full use of its internal
state representation power, and that this may have implications for the
performance of the two architectures when scaled up to larger problems.
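
For readers who want the first-order/second-order distinction made concrete,
here is a minimal sketch of the two state-update equations in generic notation
(ours, not necessarily the paper's): S_i are the state/hidden neurons, I_k the
input units, g a sigmoid.

  First-order:   S_i(t+1) = g( \sum_j W_{ij} S_j(t) + \sum_k V_{ik} I_k(t) )

  Second-order:  S_i(t+1) = g( \sum_{j,k} W_{ijk} S_j(t) I_k(t) )

The multiplicative S_j(t) I_k(t) term lets each input symbol select, in effect,
its own state-transition matrix, which is one reason second-order nets map
naturally onto finite state automata.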
List of key words:
recurrent neural networks, higher order, learning, generalization, automata,
grammatical inference, grammars.
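
As an illustration of the kind of heuristic extraction mentioned in the
abstract, the following is a minimal Python sketch (ours, not the paper's
code; rnn_step, the bin count q, and the breadth-first organization are all
assumptions). It quantizes each hidden unit's activation in [0, 1] into q bins
and grows a transition table over the resulting discrete states:

    from collections import deque

    def quantize(state, q):
        # Map real-valued activations in [0, 1] to a tuple of bin indices.
        return tuple(min(int(s * q), q - 1) for s in state)

    def extract_dfa(rnn_step, init_state, alphabet, q=4, max_states=1000):
        # Breadth-first search over quantized states; one representative
        # real-valued state is kept per discrete state encountered.
        start = quantize(init_state, q)
        real_of = {start: init_state}
        transitions = {}          # (discrete_state, symbol) -> discrete_state
        frontier = deque([start])
        while frontier and len(real_of) < max_states:
            ds = frontier.popleft()
            for sym in alphabet:
                nxt_real = rnn_step(real_of[ds], sym)
                nxt = quantize(nxt_real, q)
                transitions[(ds, sym)] = nxt
                if nxt not in real_of:
                    real_of[nxt] = nxt_real
                    frontier.append(nxt)
        return start, transitions

Accepting states can be read off a designated output unit for each discrete
state, and the resulting automaton reduced with standard DFA minimization
before comparison with the target grammar.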
--------------------------------------------------------------------------------
FTP INSTRUCTIONS
unix> ftp external.nj.nec.com
Name: anonymous
Password: (your_userid@your_site)
ftp> cd pub/giles/papers
ftp> binary
ftp> get miller-giles-ijprai.ps.Z
ftp> quit
unix> uncompress miller-giles-ijprai.ps.Z
---------------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482