Reprint: SYNAPTIC NOISE IN DYNAMICALLY-DRIVEN RECURRENT NEURAL NETWORKS
Lee Giles
giles at research.nj.nec.com
Mon Aug 1 13:12:11 EDT 1994
****************************************************************************************
Reprint: SYNAPTIC NOISE IN DYNAMICALLY-DRIVEN RECURRENT NEURAL NETWORKS:
CONVERGENCE AND GENERALIZATION
The following reprint is available via the University of Maryland Department of Computer
Science Technical Report archive:
_________________________________________________________________________________________
"Synaptic Noise in Dynamically-driven Recurrent Neural Networks:
Convergence and Generalization"
UNIVERSITY OF MARYLAND TECHNICAL REPORT UMIACS-TR-94-89 AND CS-TR-3322
Kam Jim(a), C.L. Giles(a,b), B.G. Horne(a)
{kamjim,giles,horne}@research.nj.nec.com
(a) NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
(b) Institute for Advanced Computer Studies, U. of Maryland, College Park, MD 20742
There has been much interest in applying noise to feedforward neural networks in order
to observe its effect on network performance. We extend these results by introducing
and analyzing various methods of injecting synaptic noise into dynamically-driven
recurrent networks during training. By analyzing and comparing the effects of these
noise models on the error function, we find that applying a controlled amount of noise
during training can improve convergence time and generalization performance. In addition,
we analyze the effects of various noise parameters (additive vs. multiplicative,
cumulative vs. non-cumulative, per time step vs. per sequence) and predict that the best
overall performance can be achieved by injecting additive noise at each time step. Noise
contributes a second-order gradient term to the error function which can be viewed as an
anticipatory agent to aid convergence. This term appears to find promising regions of
weight space in the beginning stages of training when the training error is large and
should improve convergence on error surfaces with local minima. Synaptic noise also
enhances the error function by favoring internal representations where state nodes are
operating in the saturated regions of the sigmoid discriminant function, thus improving
generalization to longer sequences. We substantiate these predictions by performing
extensive simulations on learning the dual parity grammar from grammatical strings
encoded as temporal sequences with a second-order fully recurrent neural network.
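
For readers who want a concrete picture of the noise models compared above, the sketch
below is ours, not taken from the report: a plain NumPy example with a first-order
recurrent state update standing in for the second-order network used in the experiments.
It shows how additive or multiplicative synaptic noise can be injected into the recurrent
weights either once per sequence or anew at each time step during the forward pass.

    # Minimal illustrative sketch (not the authors' code) of synaptic noise injection
    # in a simple recurrent network. All names and shapes here are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(W_rec, W_in, sequence, noise_std=0.05,
                per_time_step=True, multiplicative=False):
        """Run one input sequence while perturbing the recurrent weights with
        zero-mean Gaussian noise. per_time_step=False samples the noise once per
        sequence; multiplicative=True scales the weights instead of adding to them."""
        state = np.zeros(W_rec.shape[0])
        noise = rng.normal(0.0, noise_std, W_rec.shape)   # per-sequence sample
        for x_t in sequence:
            if per_time_step:                              # resample at every step
                noise = rng.normal(0.0, noise_std, W_rec.shape)
            W_noisy = W_rec * (1.0 + noise) if multiplicative else W_rec + noise
            state = sigmoid(W_noisy @ state + W_in @ x_t)
        return state

    # Toy usage: 4 state nodes, 1-bit inputs, a short binary input sequence.
    W_rec = rng.normal(0.0, 0.5, (4, 4))
    W_in = rng.normal(0.0, 0.5, (4, 1))
    seq = [np.array([b]) for b in (1.0, 0.0, 1.0, 1.0)]
    print(forward(W_rec, W_in, seq))

During training, each forward pass would use a fresh noise sample while the gradient
updates are applied to the underlying (noise-free) weights; the report analyzes how the
choice among these noise variants changes the effective error function being minimized.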
--------------------------------------------------------------------------------------
FTP INSTRUCTIONS
unix> ftp cs.umd.edu (128.8.128.8)
Name: anonymous
Password: (your_userid@your_site)
ftp> cd pub/papers/TRs
ftp> binary
ftp> get 3322.ps.Z
ftp> quit
unix> uncompress 3322.ps.Z
--------------------------------------------------------------------------------------
--
C. Lee Giles / NEC Research Institute / 4 Independence Way
Princeton, NJ 08540 / 609-951-2642 / Fax 2482