Reprint: SYNAPTIC NOISE IN DYNAMICALLY-DRIVEN RECURRENT NEURAL NETWORKS
Lee Giles
giles at research.nj.nec.com
Mon Aug 1 13:12:11 EDT 1994
****************************************************************************************
Reprint: SYNAPTIC NOISE IN DYNAMICALLY-DRIVEN RECURRENT NEURAL NETWORKS:
CONVERGENCE AND GENERALIZATION
The following reprint is available via the University of Maryland Department of Computer
Science Technical Report archive:
_________________________________________________________________________________________
"Synaptic Noise in Dynamically-driven Recurrent Neural Networks:
Convergence and Generalization"
UNIVERSITY OF MARYLAND TECHNICAL REPORT UMIACS-TR-94-89 AND CS-TR-3322
Kam Jim(a), C.L. Giles(a,b), B.G. Horne(a)
{kamjim,giles,horne}@research.nj.nec.com
(a) NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
(b) Institute for Advanced Computer Studies, U. of Maryland, College Park, MD 20742
There has been much interest in applying noise to feedforward neural networks in order
to observe its effect on network performance. We extend these results by introducing
and analyzing various methods of injecting synaptic noise into dynamically-driven
recurrent networks during training. By analyzing and comparing the effects of these
noise models on the error function, we find that applying a controlled amount of noise
during training can improve convergence time and generalization performance. In addition,
we analyze the effects of various noise parameters (additive vs. multiplicative,
cumulative vs. non-cumulative, per time step vs. per sequence) and predict that the best
overall performance can be achieved by injecting additive noise at each time step. Noise
contributes a second-order gradient term to the error function which can be viewed as an
anticipatory agent to aid convergence. This term appears to find promising regions of
weight space in the beginning stages of training when the training error is large and
should improve convergence on error surfaces with local minima. Synaptic noise also
enhances the error function by favoring internal representations where state nodes are
operating in the saturated regions of the sigmoid discriminant function, thus improving
generalization to longer sequences. We substantiate these predictions by performing
extensive simulations on learning the dual parity grammar from grammatical strings
encoded as temporal sequences with a second-order fully recurrent neural network.
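
For readers who want a concrete picture of the noise models compared above, the sketch
below is ours, not taken from the report: a plain NumPy example with a first-order
recurrent state update standing in for the second-order network used in the experiments.
It shows how additive or multiplicative synaptic noise can be injected into the recurrent
weights either once per sequence or anew at each time step during the forward pass.

    # Minimal illustrative sketch (not the authors' code) of synaptic noise injection
    # in a simple recurrent network. All names and shapes here are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(W_rec, W_in, sequence, noise_std=0.05,
                per_time_step=True, multiplicative=False):
        """Run one input sequence while perturbing the recurrent weights with
        zero-mean Gaussian noise. per_time_step=False samples the noise once per
        sequence; multiplicative=True scales the weights instead of adding to them."""
        state = np.zeros(W_rec.shape[0])
        noise = rng.normal(0.0, noise_std, W_rec.shape)   # per-sequence sample
        for x_t in sequence:
            if per_time_step:                              # resample at every step
                noise = rng.normal(0.0, noise_std, W_rec.shape)
            W_noisy = W_rec * (1.0 + noise) if multiplicative else W_rec + noise
            state = sigmoid(W_noisy @ state + W_in @ x_t)
        return state

    # Toy usage: 4 state nodes, 1-bit inputs, a short binary input sequence.
    W_rec = rng.normal(0.0, 0.5, (4, 4))
    W_in = rng.normal(0.0, 0.5, (4, 1))
    seq = [np.array([b]) for b in (1.0, 0.0, 1.0, 1.0)]
    print(forward(W_rec, W_in, seq))

During training, each forward pass would use a fresh noise sample while the gradient
updates are applied to the underlying (noise-free) weights; the report analyzes how the
choice among these noise variants changes the effective error function being minimized.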
--------------------------------------------------------------------------------------
FTP INSTRUCTIONS
unix> ftp cs.umd.edu (128.8.128.8)
Name: anonymous
Password: (your_userid@your_site)
ftp> cd pub/papers/TRs
ftp> binary
ftp> get 3322.ps.Z
ftp> quit
unix> uncompress 3322.ps.Z
--------------------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482