NIPS-93 Workshop on catastrophic interference
Bob French
french at willamette.edu
Mon Oct 25 12:07:29 EDT 1993
NIPS-93 Workshop:
================
CATASTROPHIC INTERFERENCE IN CONNECTIONIST NETWORKS:
CAN IT BE PREDICTED, CAN IT BE PREVENTED?
Date: Saturday, December 4, 1993, at Vail, Colorado
====
Intended audience: Connectionists, cognitive scientists and
=================  applications-oriented users of connectionist
                   networks interested in a better understanding of:
                   i)  when and why their networks can suddenly and
                       completely forget previously learned
                       information;
                   ii) how it is possible to reduce or even
                       eliminate this phenomenon.
Organizer: Bob French
=========  Computer Science Department
           Willamette University, Salem OR
           french at willamette.edu
Program:
========
When connectionist networks learn new information, they can
suddenly and completely forget everything they had previously learned.
This problem is called catastrophic forgetting or catastrophic
interference. Given the demonstrated severity of the problem, it is
striking that it has to date received so little attention. When new
information must be added to an already-trained connectionist network,
it is currently taken for granted that the network will simply cycle
through all of the old data again. Since relearning all of the old
data is both psychologically implausible and impractical for very
large data sets, is it possible to do
otherwise? Can connectionist networks be developed that do not forget
catastrophically -- or perhaps that do not forget at all -- in the
presence of new information? Or is catastrophic forgetting perhaps
the inevitable price for using fully distributed representations?
Under what circumstances will a network forget or not forget?
Further, can the amount of forgetting be predicted with any
reliability? These questions are of particular interest to anyone who
intends to use connectionist networks as a memory/generalization
device.
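For concreteness, the phenomenon is easy to reproduce. The sketch
below (plain NumPy, with random hypothetical item sets standing in
for real data) trains a small backpropagation network on one set of
items, then on a second set alone; accuracy on the first set
typically collapses toward chance even as the second set is mastered.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train(W1, W2, X, y, epochs=3000, lr=0.5):
        """Plain batch backpropagation, one hidden layer."""
        for _ in range(epochs):
            h = sigmoid(X @ W1)                # hidden-layer activations
            out = sigmoid(h @ W2)              # network output
            d_out = (out - y) * out * (1 - out)
            d_h = (d_out @ W2.T) * h * (1 - h)
            W2 -= lr * h.T @ d_out             # in-place weight updates
            W1 -= lr * X.T @ d_h

    def accuracy(W1, W2, X, y):
        out = sigmoid(sigmoid(X @ W1) @ W2)
        return float(np.mean((out > 0.5) == (y > 0.5)))

    # Two disjoint sets of random binary items with binary targets.
    X_A = rng.integers(0, 2, (8, 16)).astype(float)
    y_A = rng.integers(0, 2, (8, 1)).astype(float)
    X_B = rng.integers(0, 2, (8, 16)).astype(float)
    y_B = rng.integers(0, 2, (8, 1)).astype(float)

    W1 = rng.normal(0.0, 0.5, (16, 10))
    W2 = rng.normal(0.0, 0.5, (10, 1))

    train(W1, W2, X_A, y_A)
    print("after learning A: acc(A) =", accuracy(W1, W2, X_A, y_A))
    train(W1, W2, X_B, y_B)   # new items only; no rehearsal of set A
    print("after learning B: acc(A) =", accuracy(W1, W2, X_A, y_A),
          "acc(B) =", accuracy(W1, W2, X_B, y_B))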
This workshop will focus on:
        - the theoretical reasons for catastrophic interference;
        - the techniques that have been developed to eliminate it
          or to reduce its severity;
        - the side-effects of catastrophic interference;
        - the degree to which a priori prediction of catastrophic
          forgetting is or is not possible.
As connectionist networks become more and more a part of
applications packages, the problem of catastrophic interference will
have to be addressed. This workshop will bring the audience up to
date on current research on catastrophic interference.
Speakers: Stephan Lewandowsky (lewan at constellation.ecn.uoknor.edu)
========  Department of Psychology
          University of Oklahoma

          Phil A. Hetherington (het at blaise.psych.mcgill.ca)
          Department of Psychology
          McGill University

          Noel Sharkey (noel at dcs.exeter.ac.uk)
          Connection Science Laboratory
          Dept. of Computer Science
          University of Exeter, U.K.

          Bob French (french at willamette.edu)
          Computer Science Department
          Willamette University
Morning session:
---------------
7:30 - 7:45  Bob French: An Introduction to the Problem of
             Catastrophic Interference in Connectionist Networks
7:45 - 8:15  Stephan Lewandowsky: Catastrophic Interference: Causes,
             Solutions, and Side-Effects
8:15 - 8:30  Brief discussion
8:30 - 9:00  Phil Hetherington: Sequential Learning in Connectionist
             Networks: A Problem for Whom?
9:00 - 9:30  General discussion
Afternoon session:
-----------------
4:30 - 5:00  Noel Sharkey: Catastrophic Interference and
             Discrimination
5:00 - 5:15  Brief discussion
5:15 - 5:45  Bob French: Prototype Biasing and the Problem of
             Prediction
5:45 - 6:30  General discussion and closing remarks
Below are the abstracts for the talks to be presented in this workshop:
CATASTROPHIC INTERFERENCE: CAUSES, SOLUTIONS, AND SIDE-EFFECTS
Stephan Lewandowsky
Department of Psychology
University of Oklahoma
I briefly review the causes of catastrophic interference in
connectionist models and summarize some existing solutions. I then
focus on possible trade-offs between resolutions to catastrophic
interference and other desirable network properties. For example, it
has been suggested that reduced interference might impair
generalization or prototype formation. I suggest that these
trade-offs occur only if interference is reduced by altering the
response surfaces of hidden units.
--------------------------------------------------------------------------
SEQUENTIAL LEARNING IN CONNECTIONIST NETWORKS: A PROBLEM FOR WHOM?
Phil A. Hetherington
Department of Psychology
McGill University
Training networks in a strictly blocked, sequential manner normally
results in poor performance because new items overlap with old items
at the hidden unit layer. However, catastrophic interference is not a
necessary consequence of using distributed representations. First,
examination by the method of savings demonstrates that much of the
early information is still retained: Items thought lost can be
relearned within a couple of trials. Second, when items are learned
in a windowed, or overlapped fashion, less interference obtains. And
third, when items are presented in a strictly blocked, sequential
manner to a network that already possesses a relevant knowledge base,
interference may not occur at all. Thus, when modeling normal human
learning there is no catastrophic interference problem. Nor is there
a problem when modeling strictly sequential human memory experiments
with a network that has a relevant knowledge base. There is only a
problem when simple, unstructured, tabula rasa networks are expected
to model the intricacies of human memory.
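As an illustration of the savings measure mentioned above (a
hypothetical sketch, not the simulations reported in this talk), one
can compare the epochs a network needs to relearn a "lost" item set
after interfering training with the epochs a naive network needed to
learn it originally; relearning is typically far faster.

    import numpy as np

    rng = np.random.default_rng(1)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_epoch(W1, W2, X, y, lr=0.5):
        """One epoch of batch backprop; returns mean squared error."""
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        W1 -= lr * X.T @ d_h
        return float(np.mean((out - y) ** 2))

    def epochs_to_criterion(W1, W2, X, y, criterion=0.01,
                            max_epochs=20000):
        """Train until the error criterion is met; return epoch count."""
        for epoch in range(1, max_epochs + 1):
            if train_epoch(W1, W2, X, y) < criterion:
                return epoch
        return max_epochs

    # Hypothetical item sets: random 16-bit patterns, binary targets.
    X_A = rng.integers(0, 2, (8, 16)).astype(float)
    y_A = rng.integers(0, 2, (8, 1)).astype(float)
    X_B = rng.integers(0, 2, (8, 16)).astype(float)
    y_B = rng.integers(0, 2, (8, 1)).astype(float)

    W1 = rng.normal(0.0, 0.5, (16, 10))
    W2 = rng.normal(0.0, 0.5, (10, 1))

    original = epochs_to_criterion(W1, W2, X_A, y_A)  # learn set A
    epochs_to_criterion(W1, W2, X_B, y_B)             # interfering set B
    relearn = epochs_to_criterion(W1, W2, X_A, y_A)   # relearn set A
    print(f"set A: {original} epochs originally, {relearn} to relearn")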
--------------------------------------------------------------------------
CATASTROPHIC INTERFERENCE AND DISCRIMINATION
Noel Sharkey
Connection Science Laboratory
Dept. of Computer Science
University of Exeter
Exeter, U.K.
Connectionist learning techniques, such as backpropagation, have
been used increasingly for modelling psychological phenomena.
However, a number of recent simulation studies have shown that when a
connectionist net is trained, using backpropagation, to memorize sets
of items in sequence and without negative exemplars, newly learned
information seriously interferes with old. Three converging methods
were employed to show why and under what circumstances such
retroactive interference arises. First, a geometrical analysis
technique, derived from perceptron research, was introduced and
employed to determine the computational and representational
properties of feedforward nets with one and two layers of weights.
This analysis showed that the elimination of interference always
resulted in a breakdown of old-new discrimination. Second, a formally
guaranteed solution to the problems of interference and discrimination
was presented as the HARM model and used to assess the relative merits
of other proposed solutions. Third, two simulation studies were
reported that assessed the effects of providing nets with experience
of the experimental task. Prior knowledge of the encoding task was
provided to the nets either by block training them in advance or by
allowing them to extract the knowledge through sequential training.
The overall conclusion was that the interference and discrimination
problems are closely related. Sequentially trained nets employing the
backpropagation learning algorithm will unavoidably suffer from either
one or the other.
--------------------------------------------------------------------------
PROTOTYPE BIASING IN CONNECTIONIST NETWORKS
Bob French
Computer Science Dept.
Willamette University
Previously learned representations bias new representations. If
subjects are told that a newly encountered object X belongs to an
already familiar category P, they will tend to emphasize in their
representation of X features of the prototype they have for the
category P. This is the basis of prototype biasing, a technique that
appears to significantly reduce the effects of catastrophic forgetting.
The 1984 Congressional Voting Records database is used to
illustrate prototype biasing. This database contains the yes-no
voting records of Republican and Democratic members of Congress in
1984 on 16 separate issues. This database lends itself conveniently
to the use of a network having 16 "yes-no" input units, a hidden layer
and one "Republican/Democrat" output node. A "Republican" prototype
and a "Democrat" prototype are built, essentially by separately
averaging over Republican and Democrat hidden-layer representations.
These prototypes then "bias" subsequent representations of new
Democrats towards the Democrat prototype and of new Republicans
towards the Republican prototype.
Prototypes are learned by a second, separate backpropagation
network that associates teacher patterns with their respective
prototypes. Thus, ideally, when the "Republican" teacher pattern is fed
into this second network, it produces the "Republican" prototype as
output. The
output from this network is continually fed back to the hidden layer
of the primary network and is used to bias new representations.
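A simplified sketch of the biasing step follows (an illustration of
the idea only, not the exact implementation: the running-average table
below stands in for the second prototype-learning network, and random
votes stand in for the actual database). The hidden-layer error gets
an extra term pulling each item's hidden representation toward the
prototype of its class.

    import numpy as np

    rng = np.random.default_rng(2)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    N_IN, N_HID = 16, 10      # 16 yes/no votes in, one party node out
    W1 = rng.normal(0.0, 0.5, (N_IN, N_HID))
    W2 = rng.normal(0.0, 0.5, (N_HID, 1))

    # Running class prototypes over hidden representations
    # (0 = Democrat, 1 = Republican).
    prototypes = {0: np.zeros(N_HID), 1: np.zeros(N_HID)}

    def train_item(votes, party, lr=0.5, bias=0.1, proto_rate=0.05):
        x = votes.reshape(1, -1)
        h = sigmoid(x @ W1)
        out = sigmoid(h @ W2)
        d_out = (out - party) * out * (1 - out)
        # Standard backprop error plus a pull toward the class prototype.
        d_h = (d_out @ W2.T + bias * (h - prototypes[party])) * h * (1 - h)
        W2[:] -= lr * h.T @ d_out
        W1[:] -= lr * x.T @ d_h
        # Update the prototype as a running average of hidden codes.
        prototypes[party] = ((1 - proto_rate) * prototypes[party]
                             + proto_rate * h.ravel())

    # Hypothetical stand-in for the voting records: random yes/no votes.
    for _ in range(1000):
        party = int(rng.integers(0, 2))
        train_item(rng.integers(0, 2, N_IN).astype(float), party)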
Also discussed in this paper are the problems involved in
predicting the severity of catastrophic forgetting.