What have neural networks achieved?
Jaap Murre
murre at psy.uva.nl
Wed Sep 2 15:12:13 EDT 1998
Max Coltheart wrote:
Suppose a net learned Task A to criterion and then was trained on Task
B to criterion without any further exposure to A (no interleaving,
nothing corresponding to rehearsal of A). Then retest on A will reveal
catastrophic forgetting.
What happens to people here? If I spend 1998 learning to play golf,
and 1999 learning to play tennis and doing *nothing at all about golf*,
I would not expect my golf game to have been completely blown away
when I try it again on Jan 1, 2000.
Isn't this a big difference between how neural nets learn and how
people learn?
Circularity needs to be avoided here; e.g., it would not be good to
reply: If your golf game is still there, you must have been rehearsing
it in 1999.
If one focusses on only one element of current hippocampal models, this
circularity may appear.
In fact, there are two independent problems here:
1. How neural networks are able to learn sequential tasks without much
interference.
2. How the brain (e.g., hippocampus-cortex architecture) accomplishes this.
Neural networks that have very distributed architectures (such as
backpropagation networks) will tend to show catastrophic interference. That is,
when compared to the human data (e.g., Osgood, 1949), they will either
show too much forgetting or--what is less well known--they will show
too *much* learning (Murre, 1996a). As was pointed out in the debate,
this effect can be reduced by interleaved learning of various kinds,
bringing the network behavior in line with the human data. (It can also
be reduced by using localist, modular or semi-distributed architectures.)
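As a toy illustration of the interference effect and its reduction by
interleaving, here is a minimal sketch. It is not taken from any of the
cited models: it assumes a single linear unit trained with the delta
rule (swapped in for backpropagation to keep it short), and the two
overlapping patterns, the learning rate, and the names train,
abs_error, task_a and task_b are all hypothetical choices.

```python
# Toy sketch (assumed setup): one linear unit, delta rule, two tasks
# whose input patterns overlap on the middle input line.

def predict(w, x):
    """Output of a single linear unit."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(w, patterns, epochs=500, lr=0.1):
    """Delta-rule training; returns a new weight list."""
    w = list(w)
    for _ in range(epochs):
        for x, target in patterns:
            err = target - predict(w, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def abs_error(w, patterns):
    """Largest absolute output error over a set of patterns."""
    return max(abs(t - predict(w, x)) for x, t in patterns)

# Hypothetical tasks: the patterns share the second input unit.
task_a = [([1.0, 1.0, 0.0], 1.0)]
task_b = [([0.0, 1.0, 1.0], -1.0)]
w0 = [0.0, 0.0, 0.0]

# Sequential: A to criterion, then B to criterion with no exposure to A.
w_seq = train(train(w0, task_a), task_b)
# Interleaved: A and B presented alternately within every epoch.
w_mix = train(w0, task_a + task_b)

err_a_sequential = abs_error(w_seq, task_a)
err_a_interleaved = abs_error(w_mix, task_a)
err_b_interleaved = abs_error(w_mix, task_b)

print("error on A after sequential training: ", err_a_sequential)
print("error on A after interleaved training:", err_a_interleaved)
```

With these toy patterns, sequential training leaves an error of 0.75 on
Task A (target 1.0, output 0.25), whereas interleaved training drives
the error on both tasks to essentially zero. The sketch shows only the
forgetting side of the effect, not the excess transfer reported in
Murre (1996a).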
Hippocampus models that deal with the effects of hippocampal lesions
must be able to explain why such lesions tend to obliterate *recent* rather
than old memories (called the Ribot effect). In normal forgetting these
recent memories are most readily accessible; under lesioning they are
the first to go. This is typically explained by assuming (1) that
memories are first stored in or via the hippocampus (or medial
temporal lobe complex) and (2) that there is a process of consolidation
whereby the memories are strengthened at a cortical level. At least
three models have been published with neural network simulations of such
a process (Alvarez and Squire, 1994; McClelland, McNaughton, and O'Reilly,
1995; Murre, 1996b). Consolidation in these models is implemented by
selecting representations in the hippocampus by some random process and
giving them extra learning trials at a cortical level. (There is also
some process by which representations are gradually lost from the
hippocampus.) Some neurobiological data exists supporting such a
process (Wilson and McNaughton, 1994).
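The consolidation scheme described above can be sketched roughly as
follows. This is not the implementation of any of the three published
models: it assumes each memory is reduced to two scalar strengths
(hippocampal and cortical), that every night each memory is replayed
with a probability proportional to its hippocampal strength, and that
hippocampal strength decays geometrically. The function simulate and
all parameter values are hypothetical, and cortical forgetting is left
out.

```python
import random

def simulate(n_days=300, replay_prob=0.05, hippo_decay=0.99,
             cortex_gain=0.1, seed=0):
    """Toy consolidation: one new memory per day, random nightly replay."""
    rng = random.Random(seed)
    hippo, cortex = [], []          # index 0 is the oldest memory
    for _ in range(n_days):
        # A new memory is first stored via the hippocampus only.
        hippo.append(1.0)
        cortex.append(0.0)
        for i in range(len(hippo)):
            # Random selection for an extra cortical learning trial.
            if rng.random() < replay_prob * hippo[i]:
                cortex[i] += cortex_gain
            # Representations are gradually lost from the hippocampus.
            hippo[i] *= hippo_decay
    return hippo, cortex

def mean(values):
    return sum(values) / len(values)

hippo, cortex = simulate()
third = len(hippo) // 3

intact = [h + c for h, c in zip(hippo, cortex)]  # both stores available
lesioned = cortex                                # hippocampal part removed

remote_intact, recent_intact = mean(intact[:third]), mean(intact[-third:])
remote_lesioned, recent_lesioned = (mean(lesioned[:third]),
                                    mean(lesioned[-third:]))

print("intact:   remote %.2f, recent %.2f" % (remote_intact, recent_intact))
print("lesioned: remote %.2f, recent %.2f" % (remote_lesioned, recent_lesioned))
```

Under these assumptions, the intact system retrieves recent memories
more strongly than remote ones, while removing the hippocampal
contribution reverses the gradient, which is the qualitative pattern of
the Ribot effect.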
McClelland et al. use the catastrophic interference effect as an
in-principle argument for why this consolidation process, which
resembles interleaved learning, exists. Other arguments have been put
forward. We, for example, stress the fact that the cortex has a
'connectivity problem' making it somewhat time-consuming to set up
the long-range connections that underlie episodic memories (Murre
and Sturdy, 1995).
Evidence for the consolidation process is still a little thin
at a neurobiological level. Some new data in neuropsychology has
recently emerged that seems to carry out the thought experiment
"What would happen if consolidation could *not* take place?",
i.e., the case where the brain remains dependent primarily on the
representations in the hippocampus (and some remnants in the
cortex). This seems to be the case in a newly discovered form of
dementia, called semantic dementia, whereby the semantic
representations disappear but the episodic memory remains relatively
preserved. On the basis of modelling work, we have predicted and
found several new characteristics of semantic dementia (Graham and
Hodges, 1997; Murre, Graham, and Hodges, submitted).
Though there is clearly an enormous amount of work to be done, I
think that it is fair to say that neural network models have
contributed and continue to contribute towards our understanding
of human (and animal) learning and memory and that one cannot rule
out hippocampus/amnesia models on the basis of circularity.
References
Alvarez, P., & L.R. Squire (1994). Memory consolidation and the
medial temporal lobe: a simple network model. Proceedings
of the National Academy of Sciences (USA), 91, 7041-7045.
Graham, K.S., & J.R. Hodges (1997). Differentiating the roles of
the hippocampal complex and the neocortex in long-term memory
storage: evidence from the study of semantic dementia and
Alzheimer's disease. Neuropsychology, 11, 1-13.
McClelland, J.L., B.L. McNaughton, & R.C. O'Reilly (1995). Why
there are complementary learning systems in the hippocampus
and neocortex: insights from the successes and failures of
connectionist models of learning and memory. Psychological
Review, 102, 419-457.
Murre, J.M.J. (1996a). Hypertransfer in neural networks. Connection
Science, 8, 225-234.
Murre, J.M.J. (1996b). TraceLink: a model of amnesia and consolidation
of memory. Hippocampus, 6, 675-684.
Murre, J.M.J., & D.P.F. Sturdy (1995). The connectivity of the brain:
multi-level quantitative analysis. Biological Cybernetics,
73, 529-545.
Murre, J.M.J., K.S. Graham, & J.R. Hodges (submitted). Semantic
dementia: new constraints on connectionist models of long-term
memory. Submitted to Psychological Bulletin.
Osgood, C.E. (1949). The similarity paradox in human learning: a
resolution. Psychological Review, 56, 132-143.
Wilson, M.A., & B.L. McNaughton (1994). Reactivation of hippocampal
ensemble memories during sleep. Science, 265, 676-679.