TR available
Lee Giles
giles at research.nj.nec.com
Fri Dec 13 09:52:54 EST 1996
The following Technical Report is available via the University of
Maryland Department of Computer Science and the NEC Research
Institute archives:
____________________________________________________________________
HOW EMBEDDED MEMORY IN RECURRENT NEURAL NETWORK
ARCHITECTURES HELPS LEARNING LONG-TERM DEPENDENCIES
Technical Report CS-TR-3626 and UMIACS-TR-96-28, Institute for
Advanced Computer Studies, University of Maryland, College Park, MD
20742
Tsungnan Lin{1,2}, Bill G. Horne{1}, C. Lee Giles{1,3}
{1}NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
{2}Department of Electrical Engineering, Princeton University,
Princeton, NJ 08540
{3}UMIACS, University of Maryland, College Park, MD 20742
ABSTRACT
Learning long-term temporal dependencies with recurrent neural
networks can be a difficult problem. It has recently been
shown that a class of recurrent neural networks called NARX
networks performs much better than conventional recurrent
neural networks for learning certain simple long-term dependency
problems. The intuitive explanation for this behavior is that
the output memories of a NARX network can be manifested as
jump-ahead connections in the time-unfolded network. These
jump-ahead connections can propagate gradient information more
efficiently, thus reducing the sensitivity of the network
to long-term dependencies.
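As a rough illustration of this intuition (a sketch of ours, not code
from the report), the fragment below implements a minimal single-layer
NARX-style recurrence in Python/NumPy with a configurable output memory
order Dy. When such a network is unfolded in time, each output tap
y(t-d) becomes a direct connection that skips d steps, which is the
"jump-ahead" path referred to above. All names, dimensions, and the
single-layer form are illustrative assumptions.

    import numpy as np

    def narx_step(x_taps, y_taps, Wx, Wy, b):
        """One step of a single-layer NARX recurrence (illustrative sketch).

        y(t) = tanh( sum_d Wx[d] @ x(t-d) + sum_d Wy[d] @ y(t-d) + b )

        x_taps: the Dx+1 most recent inputs  [x(t), ..., x(t-Dx)]
        y_taps: the Dy most recent outputs   [y(t-1), ..., y(t-Dy)]
        """
        s = b.copy()
        for Wxd, xd in zip(Wx, x_taps):
            s += Wxd @ xd
        for Wyd, yd in zip(Wy, y_taps):  # output taps: jump-ahead paths once unfolded
            s += Wyd @ yd
        return np.tanh(s)

    # Toy run: input dim 2, output dim 3, input order Dx=1, output order Dy=4.
    rng = np.random.default_rng(0)
    Dx, Dy, nx, ny, T = 1, 4, 2, 3, 20
    Wx = [0.1 * rng.standard_normal((ny, nx)) for _ in range(Dx + 1)]
    Wy = [0.1 * rng.standard_normal((ny, ny)) for _ in range(Dy)]
    b = np.zeros(ny)

    xs = [rng.standard_normal(nx) for _ in range(T)]
    ys = [np.zeros(ny) for _ in range(Dy)]          # zero initial output history
    for t in range(T):
        x_taps = [xs[t - d] if t - d >= 0 else np.zeros(nx) for d in range(Dx + 1)]
        y_taps = [ys[-d] for d in range(1, Dy + 1)]  # y(t-1), ..., y(t-Dy)
        ys.append(narx_step(x_taps, y_taps, Wx, Wy, b))
    print(ys[-1])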
This work gives empirical justification to our
hypothesis that similar improvements in learning long-term
dependencies can be achieved with other classes of recurrent
neural network architectures simply by increasing the order of
the embedded memory.
In particular, we explore learning simple long-term dependency
problems with three classes of recurrent neural network
architectures: globally recurrent networks, locally recurrent
networks, and NARX (output feedback) networks.
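For concreteness, and again only as an illustrative assumption about
what "embedded memory of order D" means here, a locally recurrent
neuron can be sketched as one whose own past activations feed back
through a tapped delay line of length D, in contrast to the global
output feedback of the NARX sketch above. The neuron form and names
below are ours, not the report's.

    import numpy as np

    def local_recurrent_neuron(xs, w_in, w_fb, D):
        """Locally recurrent neuron with self-feedback of order D (sketch).

        a(t) = tanh( w_in @ x(t) + sum_{d=1..D} w_fb[d-1] * a(t-d) )
        Only this neuron's past activations are fed back (local feedback).
        """
        a_hist = [0.0] * D                      # zero initial activation history
        out = []
        for x in xs:
            fb = sum(w_fb[d] * a_hist[-(d + 1)] for d in range(D))
            a = np.tanh(w_in @ x + fb)
            a_hist.append(a)
            out.append(a)
        return out

    # Toy run: scalar neuron, 2-dim input, feedback order D=3.
    rng = np.random.default_rng(1)
    xs = [rng.standard_normal(2) for _ in range(10)]
    print(local_recurrent_neuron(xs, rng.standard_normal(2),
                                 0.1 * rng.standard_normal(3), 3)[-1])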
Comparing the performance of these architectures with different
orders of embedded memory on two simple long-term dependency
problems shows that all of these classes of network
architectures demonstrate significant improvement in learning
long-term dependencies when the order of embedded memory is
increased. These results can be important to a user comfortable
with a specific recurrent neural network architecture, because
simply increasing the embedded memory order will make that
architecture more robust to the problem of long-term dependency
learning.
-------------------------------------------------------------------
KEYWORDS: discrete-time, memory, long-term dependencies, recurrent
neural networks, training, gradient-descent
PAGES: 15 FIGURES: 7 TABLES: 2
-------------------------------------------------------------------
http://www.neci.nj.nec.com/homepages/giles.html
http://www.cs.umd.edu/TRs/TR-no-abs.html
or
ftp://ftp.nj.nec.com/pub/giles/papers/UMD-CS-TR-3626.recurrent.arch.long.term.ps.Z
------------------------------------------------------------------------------------
--
C. Lee Giles / Computer Sciences / NEC Research Institute /
4 Independence Way / Princeton, NJ 08540, USA / 609-951-2642 / Fax 2482
www.neci.nj.nec.com/homepages/giles.html