What have neural networks achieved?

Wed Sep 2 15:12:13 EDT 1998

Max Coltheart wrote:

	Suppose a net learned Task A to criterion and then was trained on Task
	B to criterion without any further exposure to A (no interleaving,
	nothing corresponding to rehearsal of A). Then retest on A will reveal
	catastrophic forgetting.

	What happens to people here? If I spend 1998 learning to play golf,
	and 1999 learning to play tennis and doing *nothing at all about golf*,
	I would not expect my golf game to have been completely blown away
	when I try it again on Jan 1, 2000.

	Isn't this a big difference between how neural nets learn and how
	people learn?

	Circularity needs to be avoided here e.g. it would not be good to
	reply: If your golf game is still there, you must have been rehearsing
	it in 1999.

If one focusses on only one element of current hippocampal models, this
circularity may appear.

In fact, there are two independent problems here:

1. How neural networks are able to learn sequential tasks without much 
interference.

2. How the brain (e.g., hippocampus-cortex architecture) accomplishes this.

Neural networks that have very distributed architectures (such as 
backpropagation networks) will tend to show catastrophic interference. That is,
when compared to the human data (e.g., Osgood, 1949), they will either
show too much forgetting or--what is less well known--they will show 
too *much* learning (Murre, 1996a). As was pointed out in the debate, 
this effect can be reduced by interleaved learning of various kinds, 
bringing the network behavior in line with the human data. (It can also 
be reduced by using localist, modular or semi-distributed architectures.)
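
As a toy illustration of the contrast between sequential and 
interleaved training, here is a minimal sketch. It is not taken from 
any of the cited models: it uses a single linear unit trained with the 
delta rule rather than a multi-layer backpropagation network, and the 
two overlapping patterns, the learning rate and the epoch counts are 
all assumptions chosen only to make the effect visible.

```python
# Toy sketch only: one linear unit, delta rule, hypothetical patterns.

def output(w, x):
    """Weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(w, patterns, epochs, lr=0.1):
    """Delta-rule training on (input, target) pairs; returns new weights."""
    w = list(w)
    for _ in range(epochs):
        for x, target in patterns:
            err = target - output(w, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def error_on(w, task):
    """Absolute output error on a single (input, target) pair."""
    x, target = task
    return abs(target - output(w, x))

# Two tasks whose inputs share two of four units (assumed overlap).
TASK_A = ([1, 1, 1, 0], 1.0)
TASK_B = ([0, 1, 1, 1], -1.0)

# Sequential: A to criterion, then B alone, with no further exposure to A.
w_seq = train([0.0] * 4, [TASK_A], epochs=200)
err_a_before = error_on(w_seq, TASK_A)
w_seq = train(w_seq, [TASK_B], epochs=200)
err_a_sequential = error_on(w_seq, TASK_A)

# Interleaved: A and B alternate within every epoch.
w_int = train([0.0] * 4, [TASK_A, TASK_B], epochs=500)
err_a_interleaved = error_on(w_int, TASK_A)
err_b_interleaved = error_on(w_int, TASK_B)

print(f"A error after learning A:        {err_a_before:.3f}")
print(f"A error after then learning B:   {err_a_sequential:.3f}")
print(f"A error with interleaving:       {err_a_interleaved:.3f}")
print(f"B error with interleaving:       {err_b_interleaved:.3f}")
```

In this sketch the error on A rises from zero to about 1.11 after 
B is learned alone (the output for A changes sign), whereas 
interleaving finds weights that satisfy both tasks. How closely either 
curve matches human data such as Osgood's is a separate question that 
this toy example does not address.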

Hippocampus models that deal with the effects of hippocampal lesions
must be able to explain why such lesions tend to obliterate *recent* rather
than old memories (called the Ribot effect). In normal forgetting these
recent memories are most readily accessible; under lesioning they are 
the first to go. This is typically explained by assuming (1) that 
memories are first stored in or via the hippocampus (or medial
temporal lobe complex) and (2) that there is a process of consolidation
whereby the memories are strengthened at a cortical level. At least
three models have been published with neural network simulations of such 
a process (Alvarez and Squire, 1994; McClelland, McNaughton, and O'Reilly,
1995; Murre, 1996b). Consolidation in these models is implemented by
selecting representations in the hippocampus by some random process and
giving them extra learning trials at a cortical level. (There is also
some process by which representations are gradually lost from the
hippocampus.) Some neurobiological data exists supporting such a
process (Wilson and McNaughton, 1994).
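
The consolidation scheme described above can be sketched in a few 
lines. This is a generic illustration under stated assumptions, not 
the Alvarez and Squire, McClelland et al., or TraceLink model: each 
memory is reduced to two numbers (a hippocampal and a cortical trace 
strength), and the decay rate, replay probability, cortical gain and 
the rule for combining the two traces at recall are all hypothetical 
choices.

```python
import random

def simulate(days=100, decay=0.95, replay_prob=0.5, gain=0.1, seed=0):
    """One new memory per day; random replay strengthens cortical traces."""
    rng = random.Random(seed)
    hippo = []   # hippocampal trace strength per memory, oldest first
    cortex = []  # cortical trace strength per memory, oldest first
    for _ in range(days):
        # Representations are gradually lost from the hippocampus.
        hippo = [h * decay for h in hippo]
        # Today's memory is stored via the hippocampus first.
        hippo.append(1.0)
        cortex.append(0.0)
        # Random selection: stronger hippocampal traces are replayed more
        # often, and each replay is one extra cortical learning trial.
        for i, h in enumerate(hippo):
            if rng.random() < replay_prob * h:
                cortex[i] += gain * (1.0 - cortex[i])
    return hippo, cortex

def recall(hippo, cortex):
    """Assumed rule: a memory is recalled via either trace."""
    return [1.0 - (1.0 - h) * (1.0 - c) for h, c in zip(hippo, cortex)]

def mean(values):
    return sum(values) / len(values)

hippo, cortex = simulate()
intact = recall(hippo, cortex)
# Hippocampal lesion: all hippocampal traces are removed at once.
lesioned = recall([0.0] * len(hippo), cortex)

old_lesioned = mean(lesioned[:20])      # the 20 oldest memories
recent_lesioned = mean(lesioned[-20:])  # the 20 most recent memories

print(f"after lesion, oldest 20:  {old_lesioned:.2f}")
print(f"after lesion, newest 20:  {recent_lesioned:.2f}")
```

After the simulated lesion the oldest memories are recalled better 
than the most recent ones, because only the old memories have had 
time to accumulate cortical learning trials. That is the direction of 
the Ribot gradient; the size of the effect here depends entirely on 
the assumed parameters.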

McClelland et al. use the catastrophic interference effect as an 
in-principle argument for why this consolidation process, which
resembles interleaved learning, exists. Other arguments have been put 
forward. We, for example, stress the fact that the cortex has a 
'connectivity problem' making it somewhat time-consuming to set up 
the long-range connections that underlie episodic memories (Murre 
and Sturdy, 1995). 

Evidence for the consolidation process is still a little thin 
at a neurobiological level. Some new data in neuropsychology has
recently emerged that seems to carry out the thought experiment
"What would happen if consolidation could *not* take place?",
i.e., the case where the brain remains dependent primarily on the 
representations in the hippocampus (and some remnants in the 
cortex). This seems to be the case in a newly discovered form of 
dementia, called semantic dementia, whereby the semantic 
representations disappear but the episodic memory remains relatively 
preserved. On the basis of modelling work, we have predicted and 
found several new characteristics of semantic dementia (Graham and 
Hodges, 1997; Murre, Graham, and Hodges, submitted). 

Though there is clearly an enormous amount of work to be done, I
think that it is fair to say that neural network models have
contributed and continue to contribute towards our understanding 
of human (and animal) learning and memory and that one cannot rule 
out hippocampus/amnesia models on the basis of circularity.

References

Alvarez, P., & L.R. Squire (1994). Memory consolidation and the 
	medial temporal lobe: a simple network model. Proceedings 
	of the National Academy of Sciences (USA), 91, 7041-7045.

Graham, K.S., & J.R. Hodges (1997). Differentiating the roles of 
	the hippocampal complex and the neocortex in long-term memory 
	storage: evidence from the study of semantic dementia and 
	Alzheimer's disease. Neuropsychology, 11, 1-13.

McClelland, J.L., B.L. McNaughton, & R.C. O'Reilly (1995). Why 
	there are complementary learning systems in the hippocampus 
	and neocortex: insights from the successes and failures of 
	connectionist models of learning and memory. Psychological 
	Review, 102, 419-457.

Murre, J.M.J. (1996a). Hypertransfer in neural networks. Connection 
	Science, 8, 225-234.

Murre, J.M.J. (1996b). TraceLink: a model of amnesia and consolidation 
	of memory. Hippocampus, 6, 675-684.

Murre, J.M.J., & D.P.F. Sturdy (1995). The connectivity of the brain: 
	multi-level quantitative analysis. Biological Cybernetics, 
	73, 529-545.

Murre, J.M.J., K.S. Graham, & J.R. Hodges (submitted). Semantic 
	dementia: new constraints on connectionist models of long-term 
	memory. Submitted to Psychological Bulletin.

Osgood, C.E. (1949). The similarity paradox in human learning: a 
	resolution. Psychological Review, 56, 132-143.

Wilson, M.A., & B.L. McNaughton (1994). Reactivation of hippocampal 
	ensemble memories during sleep. Science, 265, 676-679.