TR announcement - long-term dependencies
Lee Giles
giles at research.nj.nec.com
Tue Jul 18 18:27:57 EDT 1995
The following Technical Report is available via the University of Maryland
Department of Computer Science and the NEC Research Institute archives:
_____________________________________________________________________________
LEARNING LONG-TERM DEPENDENCIES IS NOT AS DIFFICULT
WITH NARX RECURRENT NEURAL NETWORKS
Technical Report UMIACS-TR-95-78 and CS-TR-3500, Institute for
Advanced Computer Studies, University of Maryland, College Park, MD
20742
Tsungnan Lin{1,2}, Bill G. Horne{1}, Peter Tino{1,3}, C. Lee Giles{1,4}
{1}NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
{2}Department of Electrical Engineering, Princeton University, Princeton,
NJ 08540
{3}Dept. of Computer Science and Engineering, Slovak Technical University,
Ilkovicova 3, 812 19 Bratislava, Slovakia
{4}UMIACS, University of Maryland, College Park, MD 20742
ABSTRACT
It has recently been shown that gradient descent learning algorithms for
recurrent neural networks can perform poorly on tasks that involve long-
term dependencies, i.e. those problems for which the desired output
depends on inputs presented at times far in the past.
In this paper we explore the long-term dependencies problem for a class of
architectures called NARX recurrent neural networks, which have powerful
representational capabilities. We have previously reported that gradient
descent learning is more effective in NARX networks than in recurrent
neural network architectures that have ``hidden states'' on problems
including grammatical inference and nonlinear system identification.
Typically, the network converges much faster and generalizes better than
other networks. The results in this paper are an attempt to explain this
phenomenon.
We present some experimental results which show that NARX networks
can often retain information for two to three times as long as conventional
recurrent neural networks. We show that although NARX networks do not
circumvent the problem of long-term dependencies, they can greatly
improve performance on long-term dependency problems.
We also describe in detail some of the assumptions regarding what it means
to latch information robustly and suggest possible ways to loosen these
assumptions.
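
For readers unfamiliar with the term, a NARX model computes its output from
tapped delay lines of past inputs and past outputs rather than from a hidden
state: y(t) = f(u(t), ..., u(t-n_u), y(t-1), ..., y(t-n_y)). The sketch below
is not taken from the report. It assumes a single tanh unit in place of the
network f, and the names run_narx, narx_step, w_u, w_y and bias are
hypothetical, chosen only for illustration.

```python
import math
from collections import deque


def narx_step(u_taps, y_taps, w_u, w_y, bias):
    """Single-unit NARX output: tanh of a weighted sum over both delay lines."""
    s = bias
    s += sum(w * u for w, u in zip(w_u, u_taps))  # u(t), u(t-1), ...
    s += sum(w * y for w, y in zip(w_y, y_taps))  # y(t-1), y(t-2), ...
    return math.tanh(s)


def run_narx(inputs, w_u, w_y, bias=0.0):
    """Run the model over an input sequence, starting from zeroed delay lines."""
    # Bounded deques act as delay lines: appendleft drops the oldest tap.
    u_taps = deque([0.0] * len(w_u), maxlen=len(w_u))
    y_taps = deque([0.0] * len(w_y), maxlen=len(w_y))
    outputs = []
    for u in inputs:
        u_taps.appendleft(u)           # newest input becomes u(t)
        y = narx_step(u_taps, y_taps, w_u, w_y, bias)
        y_taps.appendleft(y)           # fed back as y(t-1) on the next step
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    # An input pulse at t=0 reaches the output again at t=2 via the y(t-2) tap.
    print(run_narx([1.0, 0.0, 0.0], w_u=[1.0], w_y=[0.0, 0.5]))
```

In this toy setting the output delay taps give a past value a direct path to
the current output, which is the kind of shortcut through time that the
abstract's comparison with hidden-state architectures is concerned with.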
----------------------------------------------------------------------------------
http://www.neci.nj.nec.com/homepages/giles.html
http://www.cs.umd.edu/TRs/TR-no-abs.html
or
ftp://ftp.nj.nec.com/pub/giles/papers/UMD-CS-TR-3500.long-term.dependencies.narx.ps.Z
------------------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540, USA / 609-951-2642 / Fax 2482
URL http://www.neci.nj.nec.com/homepages/giles.html
==