Reprint: Experimental Comparison of the Effect of Order in Recurrent Neural Networks
Lee Giles
giles@research.nj.nec.com
Fri Mar 5 12:22:26 EST 1993
The following reprint is available via the NEC Research Institute ftp archive
external.nj.nec.com. Instructions for retrieval from the archive follow
the summary.
--------------------------------------------------------------------------------
Experimental Comparison of the Effect of
Order in Recurrent Neural Networks
Clifford B. Miller(a) and C.L. Giles(a,b)
(a) NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
(b) Institute for Advanced Computer Studies, U. Maryland, College Park, MD 20742
ABSTRACT
There has been much interest in increasing the computational power of
neural networks. In addition, there has been much interest in "designing"
neural networks to better suit particular problems. Increasing the "order" of
the connectivity of a neural network permits both. Though order has played
a significant role in feedforward neural networks, its role in dynamically
driven recurrent networks is not yet well understood. This work explores the
effect of order in learning grammars. We present an experimental
comparison of first-order and second-order recurrent neural networks, as
applied to the task of grammatical inference. We show that for the small
grammars studied these two neural net architectures have comparable
learning and generalization power, and that both are reasonably capable of
extracting the correct finite state automata for the language in question.
However, for a larger, randomly generated ten-state grammar, second-order
networks significantly outperformed the first-order networks, both in
convergence time and generalization capability. We show that these
networks learn faster the more neurons they have (our experiments used up
to 10 hidden neurons), but that the solutions found by smaller networks are
usually of better quality (in terms of generalization performance after
training). Second-order nets have the advantage that they converge to a
solution more quickly and find it more reliably than first-order nets, but the
second-order solutions tend to be of poorer quality than those of the
first-order nets if both architectures are trained to the same error tolerance.
Despite this, finite state machines can be extracted more successfully from
second-order nets by applying heuristic clustering techniques to the internal
state representations. We speculate that this may be due to restrictions on
the ability of the first-order architecture to make full use of its internal
state representation power, and that this may have implications for the
performance of the two architectures when scaled up to larger problems.
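
For readers who want the first-order/second-order distinction made concrete,
here is a minimal sketch of the two state-update equations in generic notation
(ours, not necessarily the paper's): S_i are the state/hidden neurons, I_k the
input units, g a sigmoid.

  First-order:   S_i(t+1) = g( \sum_j W_{ij} S_j(t) + \sum_k V_{ik} I_k(t) )

  Second-order:  S_i(t+1) = g( \sum_{j,k} W_{ijk} S_j(t) I_k(t) )

The multiplicative S_j(t) I_k(t) term lets each input symbol select, in effect,
its own state-transition matrix, which is one reason second-order nets map
naturally onto finite state automata.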
List of key words:
recurrent neural networks, higher order, learning, generalization, automata,
grammatical inference, grammars.
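
As an illustration of the kind of heuristic extraction mentioned in the
abstract, the following is a minimal Python sketch (ours, not the paper's
code; rnn_step, the bin count q, and the breadth-first organization are all
assumptions). It quantizes each hidden unit's activation in [0, 1] into q bins
and grows a transition table over the resulting discrete states:

    from collections import deque

    def quantize(state, q):
        # Map real-valued activations in [0, 1] to a tuple of bin indices.
        return tuple(min(int(s * q), q - 1) for s in state)

    def extract_dfa(rnn_step, init_state, alphabet, q=4, max_states=1000):
        # Breadth-first search over quantized states; one representative
        # real-valued state is kept per discrete state encountered.
        start = quantize(init_state, q)
        real_of = {start: init_state}
        transitions = {}          # (discrete_state, symbol) -> discrete_state
        frontier = deque([start])
        while frontier and len(real_of) < max_states:
            ds = frontier.popleft()
            for sym in alphabet:
                nxt_real = rnn_step(real_of[ds], sym)
                nxt = quantize(nxt_real, q)
                transitions[(ds, sym)] = nxt
                if nxt not in real_of:
                    real_of[nxt] = nxt_real
                    frontier.append(nxt)
        return start, transitions

Accepting states can be read off a designated output unit for each discrete
state, and the resulting automaton reduced with standard DFA minimization
before comparison with the target grammar.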
--------------------------------------------------------------------------------
FTP INSTRUCTIONS
unix> ftp external.nj.nec.com
Name: anonymous
Password: (your_userid@your_site)
ftp> cd pub/giles/papers
ftp> binary
ftp> get miller-giles-ijprai.ps.Z
ftp> quit
unix> uncompress miller-giles-ijprai.ps.Z
---------------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482