What have neural networks achieved?

ANTHONY ROBINS. COSCAVR at rivendell.otago.ac.nz
Tue Sep 1 12:15:44 EDT 1998


> From: "Randall C. O'Reilly" <oreilly at grey.colorado.edu>
>
> Another angle on the hippocampal story has to do with the phenomenon
> of catastrophic interference (McCloskey & Cohen, 1989), and the notion
> that the hippocampus and the cortex are complementary learning systems


The catastrophic forgetting problem (also known as catastrophic
interference, or the serial learning problem) has come up in this
thread.  In most neural networks, most of the time, learning new
information disrupts (or even eliminates) old information.  I want to
quickly describe what we think is an interesting and general solution
to this problem.

First, a comment on rehearsal.  The catastrophic forgetting problem
can be solved with rehearsal: relearning old items as new items are
learned.  A range of rehearsal regimes has been explored (see, for
example, Murre, 1992; Robins, 1995; and the "interleaved learning"
referred to earlier in this thread by Jay McClelland, from
McClelland, McNaughton, & O'Reilly, 1995).
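
As a rough illustration, here is a hypothetical sketch of one such
regime (the names and the particular schedule are mine, not taken
from any of the cited papers): each new item is interleaved with a
small sample of the stored old items.

    # Hypothetical sketch of a rehearsal regime (Python).  net_step is
    # assumed to be a function that applies one learning update to
    # some network for a given (input, target) pair.
    import random

    def train_with_rehearsal(net_step, new_items, old_items, epochs=100):
        # Interleave every new item with a few stored old items, so
        # that old knowledge is continually relearned.
        for _ in range(epochs):
            for x, y in new_items:
                net_step(x, y)  # learn the new item
                for xo, yo in random.sample(old_items,
                                            min(3, len(old_items))):
                    net_step(xo, yo)  # rehearse stored old items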

Rehearsal is an effective solution as long as the previously learned
items are actually available for relearning.  It may be, however, that
the old items have been lost, or it is not practical for some reason
to store them.  In any case, retaining old items for rehearsal in a
network seems somewhat artificial, as it requires that they be
available on demand from some other source, which would seem to make
the network itself redundant.

It is possible to achieve the benefits of rehearsal, however, even
when there is no access to old items.  This "pseudorehearsal"
mechanism, introduced in Robins (1995), is based on the relearning of
artificially constructed populations of "pseudoitems" instead of the
actual old items.

In MLP / backprop type networks a pseudoitem is constructed by
generating a new input vector at random and passing it forward
through the network in the standard way.  Whatever output vector this
input generates becomes the associated target output.  Rehearsing
these pseudoitems during new learning protects the old items in the
same way that rehearsing the real old items does.  Why does it work?
The essence of preventing catastrophic forgetting is to localise
changes to the function instantiated by the network so that it changes
only in the immediate vicinity of the new item to be learned.
Rehearsal localises changes by relearning ("fixing") the original
training data points.  Pseudorehearsal localises change by relearning
("fixing") other points randomly chosen from the function (the
pseudoitems).  (Work in progress suggests that simply using a "local"
learning algorithm, such as an RBF network, is not enough.)
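
To make the construction concrete, here is a minimal sketch in
Python / numpy.  It is an illustration under my own assumptions (a
small two layer sigmoid network and random binary inputs; all names
are mine), not code from Robins (1995):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(W1, b1, W2, b2, x):
        # Standard forward pass through the (already trained) network.
        h = sigmoid(W1 @ x + b1)
        return sigmoid(W2 @ h + b2)

    def make_pseudoitems(W1, b1, W2, b2, n_items, n_inputs, rng):
        # A pseudoitem is a random input paired with whatever output
        # the current network happens to map it to.  Rehearsing such
        # pairs "fixes" randomly chosen points on the network's
        # function during new learning.
        pseudoitems = []
        for _ in range(n_items):
            x = rng.integers(0, 2, size=n_inputs).astype(float)
            y = forward(W1, b1, W2, b2, x)
            pseudoitems.append((x, y))
        return pseudoitems

    # Example: 50 pseudoitems from a 32-16-8 network (random weights
    # stand in here for a previously trained network).
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(16, 32)), np.zeros(16)
    W2, b2 = rng.normal(size=(8, 16)), np.zeros(8)
    pseudoitems = make_pseudoitems(W1, b1, W2, b2, 50, 32, rng)

These (input, target) pairs are then simply mixed into the training
set alongside the genuinely new items, exactly as stored old items
would be under ordinary rehearsal.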

Pseudorehearsal is the generation of approximations of old knowledge
to be rehearsed as needed.  The method is very effective, and has been 
further explored in a number of papers (Robins, 1996; Frean & Robins, 
1998; Ans & Rousset, 1997; French, 1997; and as a part of work 
described in Silver & Mercer, 1998).  Pseudorehearsal enables sequential 
learning (the learning of new information at any time) in a neural 
network.

Extending these ideas to dynamical networks (such as Hopfield nets),
we can rehearse randomly chosen attractors to preserve previously
learned items / attractors during new learning (Robins & McCallum,
1998). Here the distinction between rehearsal and pseudorehearsal
starts to break down, as randomly chosen attractors naturally contain
a mixture of both real old items / learned attractors and pseudoitems
/ spurious attractors.
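
A rough sketch of this attractor rehearsal (again with illustrative
names, and a simple synchronous update and Hebbian relearning rule
assumed; not the exact procedure of Robins & McCallum, 1998): sample
random states, let each settle to an attractor, and reinforce that
attractor while new learning is under way.

    import numpy as np

    def settle(W, state, max_steps=100):
        # Update synchronously until the state stops changing; the
        # result is an attractor, either a genuinely learned item or
        # a spurious one.
        for _ in range(max_steps):
            new = np.sign(W @ state)
            new[new == 0] = 1.0
            if np.array_equal(new, state):
                break
            state = new
        return state

    def rehearse_attractors(W, n_samples, lr, rng):
        # Start from random states, settle each to an attractor, and
        # reinforce that attractor with a small Hebbian update.
        n_units = W.shape[0]
        for _ in range(n_samples):
            s = rng.choice([-1.0, 1.0], size=n_units)
            a = settle(W, s)
            W += lr * np.outer(a, a) / n_units
            np.fill_diagonal(W, 0.0)
        return W

    # Example: a 64-unit net trained on one pattern, then 20 sampled
    # attractors rehearsed (as would happen during new learning).
    rng = np.random.default_rng(0)
    p = rng.choice([-1.0, 1.0], size=64)
    W = np.outer(p, p) / 64
    np.fill_diagonal(W, 0.0)
    W = rehearse_attractors(W, n_samples=20, lr=0.1, rng=rng)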

We have already linked pseudorehearsal in MLP networks to the
consolidation of information during sleep (Robins, 1996).  In the
context of Hopfield type nets another proposed solution to
catastrophic forgetting based on unlearning spurious attractors has
also been linked to sleep (e.g. Hopfield, Feinstein & Palmer, 1983;
Crick & Mitchison, 1983; Christos, 1996).  We are currently exploring
the relationship between this *unlearning* and our *relearning* based
accounts.  Details of the input patterns, architecture, and learning
algorithm are all significant in determining the efficacy of the two
approaches (we think our approach has advantages, but this is work in
progress!).


References

Ans, B. & Rousset, S. (1997) Avoiding Catastrophic Forgetting by
Coupling Two Reverberating Neural Networks.  Comptes Rendus de
l'Académie des Sciences, Sciences de la Vie, 320, 989-997.

Christos, G. (1996) Investigation of the Crick-Mitchison
Reverse-Learning Dream Sleep Hypothesis in a Dynamic Setting.  Neural
Networks, 9, 427-434.

Crick, F. & Mitchison, G. (1983) The Function of Dream Sleep.
Nature, 304, 111-114.

Frean, M.R. & Robins, A.V. (1998) Catastrophic forgetting and
"pseudorehearsal" in linear networks.  In Downs, T., Frean, M. &
Gallagher, M. (Eds), Proceedings of the Ninth Australian Conference
on Neural Networks.  Brisbane: University of Queensland, 173-178.

French, R.M. (1997) Pseudo-recurrent Connectionist Networks: An
Approach to the Sensitivity-Stability Dilemma.  Connection Science,
9, 353-380.

Hopfield, J., Feinstein, D. & Palmer, R. (1983) 'Unlearning' has a
Stabilizing Effect in Collective Memories.  Nature, 304, 158-159.

McClelland, J., McNaughton, B. & O'Reilly, R. (1995) Why there are
complementary learning systems in the hippocampus and neocortex:
Insights from the successes and failures of connectionist models of
learning and memory.  Psychological Review, 102, 419-457.

Murre, J.M.J. (1992) Learning and Categorization in Modular Neural
Networks.  Hillsdale, NJ: Erlbaum.

Robins, A. (1995) Catastrophic Forgetting, Rehearsal, and
Pseudorehearsal.  Connection Science, 7, 123-146.

Robins, A. (1996) Consolidation in Neural Networks and in the
Sleeping Brain.  Connection Science, 8, 259-275.

Robins, A. & McCallum, S. (1998) Pseudorehearsal and the Catastrophic
Forgetting Solution in Hopfield Type Networks.  Connection Science,
10, 121-135.

Silver, D. & Mercer, R. (1998) The Task Rehearsal Method of
Sequential Learning.  Department of Computer Science, University of
Western Ontario, Technical Report #517.



Anthony Robins  ----------------------------------------------------
Computer Science                                 coscavr at otago.ac.nz
University of Otago                             ph:    +64 3 4798314
Dunedin, NEW ZEALAND                            fax:   +64 3 4798529


