NIPS-93 Workshop on catastrophic interference
Bob French
french at willamette.edu
Mon Oct 25 12:07:29 EDT 1993
NIPS-93 Workshop:
================
CATASTROPHIC INTERFERENCE IN CONNECTIONIST NETWORKS:
CAN IT BE PREDICTED, CAN IT BE PREVENTED?
Date: Saturday, December 4, 1993, at Vail, Colorado
====
Intended audience: Connectionists, cognitive scientists and
=================  applications-oriented users of connectionist
                   networks interested in a better understanding of:
                   i)  when and why their networks can suddenly and
                       completely forget previously learned
                       information;
                   ii) how it is possible to reduce or even
                       eliminate this phenomenon.
Organizer: Bob French
=========  Computer Science Department
           Willamette University, Salem OR
           french at willamette.edu
Program:
========
When connectionist networks learn new information, they can
suddenly and completely forget everything they had previously learned.
This problem is called catastrophic forgetting or catastrophic
interference. Given the demonstrated severity of the problem, it is
striking that it has to date received so little attention. When new
information must be added to an already-trained connectionist network,
it is currently taken for granted that the network will simply cycle
through all of the old data again. Since relearning all of the old
data is both psychologically implausible and impractical for very
large data sets, is it possible to do
otherwise? Can connectionist networks be developed that do not forget
catastrophically -- or perhaps that do not forget at all -- in the
presence of new information? Or is catastrophic forgetting perhaps
the inevitable price for using fully distributed representations?
Under what circumstances will a network forget or not forget?
Further, can the amount of forgetting be predicted with any
reliability? These questions are of particular interest to anyone who
intends to use connectionist networks as a memory/generalization
device.
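For concreteness, the phenomenon is easy to reproduce. The sketch
below (plain NumPy, with random hypothetical item sets standing in
for real data) trains a small backpropagation network on one set of
items, then on a second set alone; accuracy on the first set
typically collapses toward chance even as the second set is mastered.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train(W1, W2, X, y, epochs=3000, lr=0.5):
        """Plain batch backpropagation, one hidden layer."""
        for _ in range(epochs):
            h = sigmoid(X @ W1)                # hidden-layer activations
            out = sigmoid(h @ W2)              # network output
            d_out = (out - y) * out * (1 - out)
            d_h = (d_out @ W2.T) * h * (1 - h)
            W2 -= lr * h.T @ d_out             # in-place weight updates
            W1 -= lr * X.T @ d_h

    def accuracy(W1, W2, X, y):
        out = sigmoid(sigmoid(X @ W1) @ W2)
        return float(np.mean((out > 0.5) == (y > 0.5)))

    # Two disjoint sets of random binary items with binary targets.
    X_A = rng.integers(0, 2, (8, 16)).astype(float)
    y_A = rng.integers(0, 2, (8, 1)).astype(float)
    X_B = rng.integers(0, 2, (8, 16)).astype(float)
    y_B = rng.integers(0, 2, (8, 1)).astype(float)

    W1 = rng.normal(0.0, 0.5, (16, 10))
    W2 = rng.normal(0.0, 0.5, (10, 1))

    train(W1, W2, X_A, y_A)
    print("after learning A: acc(A) =", accuracy(W1, W2, X_A, y_A))
    train(W1, W2, X_B, y_B)   # new items only; no rehearsal of set A
    print("after learning B: acc(A) =", accuracy(W1, W2, X_A, y_A),
          "acc(B) =", accuracy(W1, W2, X_B, y_B))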
This workshop will focus on:
        - the theoretical reasons for catastrophic interference;
        - the techniques that have been developed to eliminate it
          or to reduce its severity;
        - the side-effects of catastrophic interference;
        - the degree to which a priori prediction of catastrophic
          forgetting is or is not possible.
As connectionist networks become more and more a part of
applications packages, the problem of catastrophic interference will
have to be addressed. This workshop will bring the audience up to
date on current research on catastrophic interference.
Speakers: Stephan Lewandowsky (lewan at constellation.ecn.uoknor.edu)
========  Department of Psychology
          University of Oklahoma

          Phil A. Hetherington (het at blaise.psych.mcgill.ca)
          Department of Psychology
          McGill University

          Noel Sharkey (noel at dcs.exeter.ac.uk)
          Connection Science Laboratory
          Dept. of Computer Science
          University of Exeter, U.K.

          Bob French (french at willamette.edu)
          Computer Science Department
          Willamette University
Morning session:
---------------
7:30 - 7:45  Bob French: An Introduction to the Problem of
             Catastrophic Interference in Connectionist Networks
7:45 - 8:15  Stephan Lewandowsky: Catastrophic Interference: Causes,
             Solutions, and Side-Effects
8:15 - 8:30  Brief discussion
8:30 - 9:00  Phil Hetherington: Sequential Learning in Connectionist
             Networks: A Problem for Whom?
9:00 - 9:30  General discussion
Afternoon session:
-----------------
4:30 - 5:00  Noel Sharkey: Catastrophic Interference and
             Discrimination
5:00 - 5:15  Brief discussion
5:15 - 5:45  Bob French: Prototype Biasing and the Problem of
             Prediction
5:45 - 6:30  General discussion and closing remarks
Below are the abstracts for the talks to be presented in this workshop:
CATASTROPHIC INTERFERENCE: CAUSES, SOLUTIONS, AND SIDE-EFFECTS
Stephan Lewandowsky
Department of Psychology
University of Oklahoma
I briefly review the causes of catastrophic interference in
connectionist models and summarize some existing solutions. I then
focus on possible trade-offs between resolutions to catastrophic
interference and other desirable network properties. For example, it
has been suggested that reduced interference might impair
generalization or prototype formation. I suggest that these
trade-offs occur only if interference is reduced by altering the
response surfaces of hidden units.
--------------------------------------------------------------------------
SEQUENTIAL LEARNING IN CONNECTIONIST NETWORKS: A PROBLEM FOR WHOM?
Phil A. Hetherington
Department of Psychology
McGill University
Training networks in a strictly blocked, sequential manner normally
results in poor performance because new items overlap with old items
at the hidden unit layer. However, catastrophic interference is not a
necessary consequence of using distributed representations. First,
examination by the method of savings demonstrates that much of the
early information is still retained: Items thought lost can be
relearned within a couple of trials. Second, when items are learned
in a windowed, or overlapped fashion, less interference obtains. And
third, when items are presented in a strictly blocked, sequential
manner to a network that already possesses a relevant knowledge base,
interference may not occur at all. Thus, when modeling normal human
learning there is no catastrophic interference problem. Nor is there
a problem when modeling strictly sequential human memory experiments
with a network that has a relevant knowledge base. There is only a
problem when simple, unstructured, tabula rasa networks are expected
to model the intricacies of human memory.
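As an illustration of the savings measure mentioned above (a
hypothetical sketch, not the simulations reported in this talk), one
can compare the epochs a network needs to relearn a "lost" item set
after interfering training with the epochs a naive network needed to
learn it originally; relearning is typically far faster.

    import numpy as np

    rng = np.random.default_rng(1)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_epoch(W1, W2, X, y, lr=0.5):
        """One epoch of batch backprop; returns mean squared error."""
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        W1 -= lr * X.T @ d_h
        return float(np.mean((out - y) ** 2))

    def epochs_to_criterion(W1, W2, X, y, criterion=0.01,
                            max_epochs=20000):
        """Train until the error criterion is met; return epoch count."""
        for epoch in range(1, max_epochs + 1):
            if train_epoch(W1, W2, X, y) < criterion:
                return epoch
        return max_epochs

    # Hypothetical item sets: random 16-bit patterns, binary targets.
    X_A = rng.integers(0, 2, (8, 16)).astype(float)
    y_A = rng.integers(0, 2, (8, 1)).astype(float)
    X_B = rng.integers(0, 2, (8, 16)).astype(float)
    y_B = rng.integers(0, 2, (8, 1)).astype(float)

    W1 = rng.normal(0.0, 0.5, (16, 10))
    W2 = rng.normal(0.0, 0.5, (10, 1))

    original = epochs_to_criterion(W1, W2, X_A, y_A)  # learn set A
    epochs_to_criterion(W1, W2, X_B, y_B)             # interfering set B
    relearn = epochs_to_criterion(W1, W2, X_A, y_A)   # relearn set A
    print(f"set A: {original} epochs originally, {relearn} to relearn")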
--------------------------------------------------------------------------
CATASTROPHIC INTERFERENCE AND DISCRIMINATION
Noel Sharkey
Connection Science Laboratory
Dept. of Computer Science
University of Exeter
Exeter, U.K.
Connectionist learning techniques, such as backpropagation, have
been used increasingly for modelling psychological phenomena.
However, a number of recent simulation studies have shown that when a
connectionist net is trained, using backpropagation, to memorize sets
of items in sequence and without negative exemplars, newly learned
information seriously interferes with old. Three converging methods
were employed to show why and under what circumstances such
retroactive interference arises. First, a geometrical analysis
technique, derived from perceptron research, was introduced and
employed to determine the computational and representational
properties of feedforward nets with one and two layers of weights.
This analysis showed that the elimination of interference always
resulted in a breakdown of old-new discrimination. Second, a formally
guaranteed solution to the problems of interference and discrimination
was presented as the HARM model and used to assess the relative merits
of other proposed solutions. Third, two simulation studies were
reported that assessed the effects of providing nets with experience
of the experimental task. Prior knowledge of the encoding task was
provided to the nets either by block training them in advance or by
allowing them to extract the knowledge through sequential training.
The overall conclusion was that the interference and discrimination
problems are closely related. Sequentially trained nets employing the
backpropagation learning algorithm will unavoidably suffer from either
one or the other.
--------------------------------------------------------------------------
PROTOTYPE BIASING IN CONNECTIONIST NETWORKS
Bob French
Computer Science Dept.
Willamette University
Previously learned representations bias new representations. If
subjects are told that a newly encountered object X belongs to an
already familiar category P, they will tend to emphasize in their
representation of X features of the prototype they have for the
category P. This is the basis of prototype biasing, a technique that
appears to significantly reduce the effects of catastrophic forgetting.
The 1984 Congressional Voting Records database is used to
illustrate prototype biasing. This database contains the yes-no
voting records of Republican and Democratic members of Congress in
1984 on 16 separate issues. This database lends itself conveniently
to the use of a network having 16 "yes-no" input units, a hidden layer
and one "Republican/Democrat" output node. A "Republican" prototype
and a "Democrat" prototype are built, essentially by separately
averaging over Republican and Democrat hidden-layer representations.
These prototypes then "bias" subsequent representations of new
Democrats towards the Democrat prototype and of new Republicans
towards the Republican prototype.
Prototypes are learned by a second, separate backpropagation
network that associates teacher patterns with their respective
prototypes. Thus, ideally, when the "Republican" teacher pattern is fed
into this second network, it produces the "Republican" prototype as
output. The
output from this network is continually fed back to the hidden layer
of the primary network and is used to bias new representations.
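A simplified sketch of the biasing step follows (an illustration of
the idea only, not the exact implementation: the running-average table
below stands in for the second prototype-learning network, and random
votes stand in for the actual database). The hidden-layer error gets
an extra term pulling each item's hidden representation toward the
prototype of its class.

    import numpy as np

    rng = np.random.default_rng(2)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    N_IN, N_HID = 16, 10      # 16 yes/no votes in, one party node out
    W1 = rng.normal(0.0, 0.5, (N_IN, N_HID))
    W2 = rng.normal(0.0, 0.5, (N_HID, 1))

    # Running class prototypes over hidden representations
    # (0 = Democrat, 1 = Republican).
    prototypes = {0: np.zeros(N_HID), 1: np.zeros(N_HID)}

    def train_item(votes, party, lr=0.5, bias=0.1, proto_rate=0.05):
        x = votes.reshape(1, -1)
        h = sigmoid(x @ W1)
        out = sigmoid(h @ W2)
        d_out = (out - party) * out * (1 - out)
        # Standard backprop error plus a pull toward the class prototype.
        d_h = (d_out @ W2.T + bias * (h - prototypes[party])) * h * (1 - h)
        W2[:] -= lr * h.T @ d_out
        W1[:] -= lr * x.T @ d_h
        # Update the prototype as a running average of hidden codes.
        prototypes[party] = ((1 - proto_rate) * prototypes[party]
                             + proto_rate * h.ravel())

    # Hypothetical stand-in for the voting records: random yes/no votes.
    for _ in range(1000):
        party = int(rng.integers(0, 2))
        train_item(rng.integers(0, 2, N_IN).astype(float), party)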
Also discussed in this paper are the problems involved in
predicting the severity of catastrophic forgetting.