What have neural networks achieved?

Thu Aug 27 15:21:05 EDT 1998

Jay McClelland <jlm at cnbc.cmu.edu> writes:

>There has been a great deal of connectionist work on the processing of
>regular and exceptional material, initiated by the
>Rumelhart-McClelland paper on the past tense.  Debate has raged on the
>subject of the past tense and work there is ongoing, but I won't claim
>a success story there at this time.  What I would like to point to
>instead is the related topic of single word reading.  Sejnowski and
>Rosenberg's NETTALK first extended connectionist ideas to this issue,
>and Seidenberg and McClelland went on to show that a connectionist
>model could account in great detail for the pattern of reaction times
>found in around 30 studies concerning the effects of regularity,
>frequency, and lexical neighbors on reading words aloud.  This was
>followed by a resounding critique along the lines of Pinker and
>Prince's critique of R&M, coming this time from Derrick Besner (and
>colleagues) and Max Coltheart (and colleagues).  Both pointed to the
>fact that the S&M model didn't do a very good job of reading nonwords,
>and both claimed that this reflected an in-principle limitation of a
>connectionist, single mechanism account: To do a good job with both,
>it was claimed, a dual route system was required.
>
>The success story is a paper by Plaut, McClelland, Seidenberg, and
>Patterson, in which it was shown in fact that a single mechanism,
>connectionist model can indeed account for human performance in
>reading both words and nonwords.  The model replicated all the S&M
>findings, and at the same time was able to read non-words as well as
>human subjects, showing the same types of neighbor-driven responses that
>human readers show (eg MAVE is sometimes read to rhyme with HAVE
>instead of SAVE).
>
>Of course there are still some loose ends but it is no longer possible
>to claim that a single-mechanism account cannot capture the basic
>pattern of word and non-word reading data. 

The demonstration that a single mechanism (ie a single, uniform network)
can deal with both regular and exception items does not speak to the issue
of which system humans are more likely to possess. For example, it is easy
to constrain a single backpropagation network to perform both "what" and
"where" vision tasks (Rueckl, Cave & Kosslyn, 1989), but the most efficient
way to do it is through a modular architecture (Jacobs, Jordan, & Barto,
1991); incidentally, this is also what the brain seems to be doing (in a very
broad sense). This is the general (and important) issue of modular
decomposition in learning (see Ghahramani & Wolpert, 1997, for recent
evidence that the brain uses a modular decomposition strategy to learn a
new visuomotor task).

With regard to the more specific issue of regular vs. exception and/or
words vs. non-words, a modular connectionist perspective (alternative to
the approach of Plaut, McClelland, Seidenberg, & Patterson, 1996) can be
found in papers (just appeared) by Zorzi, Houghton, and Butterworth (1998a,
1998b) for reading, and by Houghton and Zorzi (1998) for spelling (refs and
abstracts below). The main point here is that the regularities of a
"quasi-regular" domain such as reading or spelling are more easily and
quickly extracted by a network without hidden units and trained with the
simple delta rule; this also provides early and robust generalization to
novel forms (eg, non-words). The reading model has been shown to account
for a wide range of empirical findings, including experimental,
neuropsychological and developmental data.
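
To make the delta-rule point concrete, here is a minimal sketch of a
network without hidden units trained with the delta rule. It is a toy
illustration only: the word list, the input coding (one unit per onset
letter plus one shared unit for the body "-AVE"), the single linear output
unit, and the learning parameters are all assumptions made for this
example, not the representations or settings of the models cited above.

```python
# Toy illustration only: the words, the coding scheme, and all parameters
# below are assumptions, not the representations used in the cited models.

ONSETS = ["S", "G", "C", "W", "P", "H", "M"]  # "M" never appears in training
N_INPUTS = len(ONSETS) + 1                    # onset units + one unit for the body "-AVE"


def encode(word):
    """One-hot onset unit plus a shared unit for the body '-AVE'."""
    x = [0.0] * N_INPUTS
    x[ONSETS.index(word[0])] = 1.0
    x[-1] = 1.0
    return x


# Target 1.0 = vowel as in SAVE (regular), 0.0 = vowel as in HAVE (exception).
TRAINING_SET = [("SAVE", 1.0), ("GAVE", 1.0), ("CAVE", 1.0),
                ("WAVE", 1.0), ("PAVE", 1.0), ("HAVE", 0.0)]


def output(weights, word):
    """Linear output unit: weighted sum of the input units."""
    return sum(w * x for w, x in zip(weights, encode(word)))


def train(epochs=500, learning_rate=0.1):
    """Delta rule: change each weight by learning_rate * error * input."""
    weights = [0.0] * N_INPUTS
    for _ in range(epochs):
        for word, target in TRAINING_SET:
            x = encode(word)
            error = target - output(weights, word)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
    return weights


weights = train()
for word in ["SAVE", "HAVE", "MAVE"]:
    print(word, round(output(weights, word), 2))
```

In this toy setting the trained words are reproduced (SAVE near 1, HAVE
near 0), while the untrained nonword MAVE lands above 0.5 but below 1: the
shared body unit carries the regular pronunciation to the novel item, and
the exception HAVE pulls it down somewhat.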

-- Marco Zorzi

References:

Ghahramani, Z., & Wolpert, D.M. (1997). Modular decomposition in visuomotor
learning. Nature, 386, 392-395.

Houghton, G., & Zorzi, M. (1998). A model of the sound-spelling mapping in
English and its role in word and nonword spelling. In Proceedings of the
Twentieth Annual Conference of the Cognitive Science Society (pp. 490-501).
Mahwah (NJ): Erlbaum.

Jacobs, R.A., Jordan, M.I., & Barto, A.G. (1991). Task decomposition
through competition in a modular connectionist architecture: The What and
Where vision tasks. Cognitive Science, 15, 219-250.

Plaut, D. C., McClelland, J. L., Seidenberg, M. S. & Patterson, K. E.
(1996). Understanding normal and impaired word reading: Computational
principles in quasi-regular domains. Psychological Review, 103, 56-115.

Rueckl, J.G., Cave, K.R., & Kosslyn, S.M. (1989). Why are "What" and
"Where" processed by separate cortical visual systems? A computational
investigation. Journal of Cognitive Neuroscience, 1, 171-186.

Zorzi, M., Houghton, G., & Butterworth, B. (1998a). Two routes or one in
reading aloud? A connectionist dual-process model. Journal of Experimental
Psychology: Human Perception and Performance, 24, 1131-1161.

Zorzi, M., Houghton, G., & Butterworth, B. (1998b). The development of
spelling-sound relationships in a model of phonological reading. Language
and Cognitive Processes (Special Issue: Language Acquisition and
Connectionism), 13, 337-371.

Two Routes or One in Reading Aloud? A Connectionist Dual-Process Model

Marco Zorzi, George Houghton and Brian Butterworth
Journal of Experimental Psychology: Human Perception and Performance, 1998,
Vol. 24, No. 4, 1131-1161

A connectionist study of word reading is described that emphasizes the
computational demands of the spelling-sound mapping in determining the
properties of the reading system. It is shown that the phonological
assembly process can be implemented by a two-layer network, which
easily extracts the regularities in the spelling-sound mapping for
English from training data containing many exception words. It is
argued that productive knowledge about spelling-sound relationships
is more easily acquired and used if it is separated from case-specific
knowledge of the pronunciation of known words. It is then shown how
the interaction of assembled and retrieved phonologies can account
for the combined effects of frequency and regularity-consistency and
for the reading performance of dyslexic patients. It is concluded that
the organization of the reading system reflects the demands of the task
and that the pronunciations of nonwords and exception words are computed
by different processes.

The development of spelling-sound relationships in a model of phonological
reading.

Marco Zorzi, George Houghton and Brian Butterworth
Language and Cognitive Processes (Special Issue: Language Acquisition and
Connectionism), 1998, Vol. 13 (2/3), 337-371.

Developmental aspects of the spelling to sound mapping for English
monosyllabic words are investigated with a simple 2-layer network model
using a simple, general learning rule. The model is trained on both
regularly and irregularly spelled words, but extracts the regular spelling
to sound relationships which it can apply to new words, and which cause it
to regularize irregular words. These relationships are shown to include
single letter to phoneme mappings as well as mappings involving larger
units such as multi-letter graphemes and onset-rime structures. The
development of these mappings as a function of training is analyzed and
compared with relevant developmental data. We also show that the 2-layer
model can generalize after very little training, in comparison to a 3-layer
network. This ability relies on the fact that orthography and phonology can
make direct contact with each other, and its importance for self-teaching
is emphasized.

A model of the sound-spelling mapping in English and its role in word and
nonword spelling.

George Houghton and Marco Zorzi
In: Proceedings of the Twentieth Annual Conference of the Cognitive Science
Society (pp. 490-501), 1998.

A model of the productive sound-spelling mapping in English is described,
based on previous work on the analogous problem for reading (Zorzi,
Houghton & Butterworth, 1998a, 1998b). It is found that a two-layer network
can robustly extract this mapping from a representative corpus of English
monosyllabic sound-spelling pairs, but that good performance requires the
use of graphemic representations. Performance of the model is discussed for
both words and nonwords, direct comparison being made with the spelling of
surface dysgraphic MP (Behrmann & Bub, 1992). The model shows appropriate
contextual effects on spelling and exactly reproduces many of the subject’s
spellings. Effects of sound-spelling consistency are examined, and results
arising from the interaction of this system with a lexical spelling system
are compared with normal subject data.

----------------------------------------------------------------------
Marco Zorzi             email: marco at psychol.ucl.ac.uk
			http://www.psychol.ucl.ac.uk/marco.zorzi/marco.html

Department of Psychology    	voice: +44 171 5045393
University College London    	fax  : +44 171 4364276
Gower Street  	
London WC1E 6BT (UK)	   

(and)

Dipartimento di Psicologia   	voice: +39 40 6767325
Universita` di Trieste       	fax  : +39 40 312272
via dell'Universita` 7		email: zorzi at univ.trieste.it
34123 Trieste (Italy)  
----------------------------------------------------------------------