Subsymbolic Language Processing: PSYC Multiple Book Review

Stevan Harnad harnad at Princeton.EDU
Mon Aug 8 22:32:16 EDT 1994


                CALL FOR BOOK REVIEWERS

Below is the Precis of SUBSYMBOLIC NATURAL LANGUAGE PROCESSING by
Risto Miikkulainen. This book has been selected for multiple review in
PSYCOLOQUY. If you wish to submit a formal book review (see
Instructions following Precis) please write to psyc at pucc.bitnet
indicating what expertise you would bring to bear on reviewing the book
if you were selected to review it. (If you have never reviewed for
PSYCOLOQUY or Behavioral & Brain Sciences before, it would be helpful
if you could also append a copy of your CV to your message.) If you are
selected as one of the reviewers, you will be sent a copy of the book
directly by the publisher (please let us know if you have a copy
already). Reviews may also be submitted without invitation, but all
reviews will be refereed. The author will reply to all accepted
reviews.

----------------------------------------------------------------------
psycoloquy.94.5.46.language-network.1.miikkulainen  Monday  8 Aug 1994
ISSN 1055-0143 (34 paragraphs, 1 fig, 1 note, 16 references, 609 lines)
PSYCOLOQUY is sponsored by the American Psychological Association (APA)
                Copyright 1994 Risto Miikkulainen

                Precis of:
                SUBSYMBOLIC NATURAL LANGUAGE PROCESSING:
                AN INTEGRATED MODEL OF SCRIPTS, LEXICON, AND MEMORY
                Cambridge, MA: MIT Press, 1993
                15 chapters, 403 pages

                Risto Miikkulainen
                Department of Computer Sciences
                The University of Texas at Austin
                Austin, TX 78712
                risto at cs.utexas.edu

    ABSTRACT: Distributed neural networks have been very successful in
    modeling isolated cognitive phenomena, but complex high-level
    behavior has been amenable only to symbolic artificial intelligence
    techniques. Aiming to bridge this gap, this book describes
    DISCERN, a complete natural language processing system implemented
    entirely at the subsymbolic level. In DISCERN, distributed neural
    network models of parsing, generating, reasoning, lexical
    processing and episodic memory are integrated into a single system
    that learns to read, paraphrase, and answer questions about
    stereotypical narratives. Using DISCERN as an example, a general
    approach to building high-level cognitive models from distributed
    neural networks is introduced, and the special properties of such
    networks are shown to provide insight into human performance. In
    this approach, connectionist networks are not only plausible models
    of isolated cognitive phenomena, but also sufficient constituents
    for generating complex, high-level behavior.

    KEYWORDS: computational modeling, connectionism, distributed neural
    networks, episodic memory, lexicon, natural language processing,
    scripts.

I. MOTIVATION

1. Recently there has been a great deal of excitement in cognitive
science about the subsymbolic (i.e., parallel distributed processing,
or distributed connectionist, or distributed neural network) approach
to natural language processing. Subsymbolic systems seem to capture a
number of intriguing properties of human-like information processing
such as learning from examples, context sensitivity, generalization,
robustness of behavior, and intuitive reasoning. These properties have
been very difficult to model with traditional, symbolic techniques.

2. Within this new paradigm, the central issues are quite different
from (even incompatible with) the traditional issues in symbolic
cognitive science, and research has proceeded with little connection
to earlier work. However, the ultimate goal is still the same: to
understand how human cognition is put together. Even if cognitive
science is being built on a new foundation, as can be argued, many of
the results obtained through symbolic research are still valid, and
could be used as a guide for developing subsymbolic models of cognitive
processes.

3. This is where DISCERN, the computer-simulated neural network model
described in this book (Miikkulainen 1993), fits in. DISCERN is a
purely subsymbolic model, but at the high level it consists of modules
and information structures similar to those of symbolic systems, such
as scripts, lexicon, and episodic memory. At the highest level of
cognitive modeling, the symbolic and subsymbolic paradigms have to
address the same basic issues. Outlining a parallel distributed
approach to those issues is the purpose of DISCERN.

4. In more specific terms, DISCERN aims: (1) to demonstrate that
distributed artificial neural networks can be used to build a
large-scale natural language processing system that performs
approximately at the level of symbolic models; (2) to show that several
cognitive phenomena can be explained at the subsymbolic level using the
special properties of these networks; and (3) to identify central
issues in subsymbolic cognitive modeling and to develop well-motivated
techniques to deal with them. To the extent that DISCERN is successful
in these areas, it constitutes a first step towards subsymbolic natural
language processing.

II. THE SCRIPT PROCESSING TASK

5. Scripts (Schank and Abelson, 1977) are schemas of often-encountered,
stereotypic event sequences, such as visiting a restaurant, traveling
by airplane, and shopping at a supermarket. Each script divides further
into tracks, or established minor variations. A script can be
represented as a causal chain of events with a number of open roles.
Script-based understanding means reading a script-based story,
identifying the proper script and track, and filling its roles with the
constituents of the story. Events and role fillers that were not
mentioned in the story but are part of the script can then be inferred.
Understanding is demonstrated by generating an expanded paraphrase of
the original story, and by answering questions about the story.
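
To make the data structure concrete, here is a minimal Python sketch
of a script as a causal chain of events with open roles; the field
names and event templates are illustrative, not taken from the book:

    # Hypothetical sketch: a script as an ordered event chain with
    # open roles; instantiation fills the roles and expands all
    # events, including those never mentioned in the input story.
    from dataclasses import dataclass, field

    @dataclass
    class Script:
        name: str          # e.g., "restaurant"
        track: str         # e.g., "fancy"
        events: list       # causally ordered event templates
        roles: dict = field(default_factory=dict)

    restaurant = Script(
        name="restaurant", track="fancy",
        events=["$customer went to $place",
                "the waiter seated $customer",
                "$customer asked the waiter for $food",
                "$customer ate a good $food",
                "$customer paid the waiter",
                "$customer left a big tip",
                "$customer left $place"],
        roles={"customer": None, "place": None, "food": None})

    def instantiate(script, bindings):
        script.roles.update(bindings)
        expanded = []
        for event in script.events:
            for role, filler in script.roles.items():
                event = event.replace("$" + role, filler)
            expanded.append(event)
        return expanded

    for line in instantiate(restaurant, {"customer": "John",
                                         "place": "MaMaison",
                                         "food": "lobster"}):
        print(line)

The expanded output corresponds to the kind of paraphrase shown in
paragraph 7 below.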

6. To see what is involved in the task, let us consider an example of
DISCERN input/output behavior. The following input stories are examples
of the fancy-restaurant, plane-travel, and electronics-shopping
tracks:

    John went to MaMaison. John asked the waiter for lobster. John left
    the waiter a big tip.

    John went to LAX. John checked in for a flight to JFK. The plane
    landed at JFK.

    John went to Radio-Shack. John asked the staff questions
    about CD-players. John chose the best CD-player.

7. DISCERN reads the orthographic word symbols sequentially, one at a
time. An internal representation of each story is formed, where all
inferences are made explicit. These representations are stored in the
episodic memory. The system then answers questions about the stories:

    What did John buy at Radio-Shack?
      John bought a CD-player at Radio-Shack.

    Where did John fly to?
      John flew to JFK.

    What did John eat at MaMaison?
      John ate a good lobster.

With the question as a cue, the appropriate story representation is
retrieved from the episodic memory and the answer is generated word by
word. DISCERN also generates full paraphrases of the input stories.
For example, it generates an expanded version of the restaurant story:

    John went to MaMaison. The waiter seated John. John asked the
    waiter for lobster. John ate a good lobster. John paid the waiter.
    John left a big tip. John left MaMaison.

8. The answers and the paraphrase show that DISCERN has made a number
of inferences beyond the original story. For example, it inferred that
John ate the lobster and the lobster tasted good. The inferences are
not based on specific rules but are statistical and learned from
experience. DISCERN has read a number of similar stories in the past
and the unmentioned events and role bindings have occurred in most
cases. They are assumed immediately and automatically upon reading the
story, and they become part of the memory of the story. In a similar
fashion, human readers often confuse what was mentioned in the story
with what was only inferred (Bower et al., 1979; Graesser et al.,
1979).

9. A number of issues can be identified from the above examples.
Specifically, DISCERN has to (1) make statistical, script-based
inferences and account for learning them from experience; (2) store
items in the episodic memory in a single presentation and retrieve them
with a partial cue; (3) develop a meaningful organization for the
episodic memory, based on the stories it reads; (4) represent meanings
of words, sentences, and stories internally; and (5) organize a lexicon
of symbol and concept representations based on examples of how words
are used in the language and form a many-to-many mapping between them.
Script processing constitutes a good framework for studying these
issues, and a good domain for developing an approach towards the goals
outlined above.

III. APPROACH

10. Parallel distributed processing models typically have very little
internal structure. They produce the statistically most likely answer
given the input conditions in a process that is opaque to the external
observer. This is well suited to the modeling of isolated low-level
tasks, such as learning past tense forms of verbs (Rumelhart and
McClelland, 1986) or word pronunciation (Sejnowski and Rosenberg,
1987). Given the success of such models, a possible approach to
higher-level cognitive modeling would be to construct the system from
several submodules that work together to produce the higher-level
behavior.

11. In DISCERN, the immediate goal is to build a complete, integrated
system that performs well in the script processing task. In this sense,
DISCERN is very similar to traditional models in artificial
intelligence. However, DISCERN also aims to show how certain parts of
human cognition could actually be built. The components of DISCERN were
designed as independent cognitive models that can account for
interesting language processing and memory phenomena, many of which are
not even required in the DISCERN task. Combining these models into a
single, working system is one way of validating them. In DISCERN, the
components are not just models of isolated cognitive phenomena; they
are sufficient constituents for generating complex high-level
behavior.

IV. THE DISCERN MODEL

12. DISCERN can be divided into parsing, generating, question
answering, and memory subsystems, each with two modules (figure 1).
Each module is trained in its task separately and in parallel. During
performance, the modules form a network of networks, each feeding its
output to the input of another module.

                       Input text     Output text
                              |         ^
                              V         |
  =================        =================        =================
   Sentence Parser <-------     Lexicon     <------- Sentence Gener.
  =================        =================        =================
     |         |                                           ^     ^
     |         |                                           |     |
     |         +-------+-----------------------+           |     |
     |                 |                       |           |     |
     |                 V                       V           |     |
     |         =================       =================   |     |
     |             Cue Former           Answer Producer ---+     |
     |         =================       =================         |
     |                       |           ^                       |
     |                       |           |                       |
     V                       V           |                       |
  =================        =================        =================
    Story Parser   -------> Episodic Memory -------> Story Generator
  =================        =================        =================

                      Figure 1: The DISCERN Model.

13. The sentence parser reads the input words one at a time and forms
a representation of each sentence. The story parser combines the
sequence of sentences into an internal representation of the story,
which is then stored in the episodic memory. The story generator
receives the internal representation and generates the sentences of the
paraphrase one at a time. The sentence generator outputs the sequence
of words for each sentence. The cue former receives a question
representation, built by the sentence parser, and forms a cue pattern
for the episodic memory, which returns the appropriate story
representation. The answer producer receives the question and the story
and generates an answer representation, which is output word by word by
the sentence generator. The architecture and behavior of each of these
modules in isolation is outlined below.
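
The flow of information through these modules can be summarized in a
short sketch. The following Python fragment is only a control-flow
illustration, with each trained network replaced by a stub; the
module names follow Figure 1, but everything else is hypothetical:

    # Hypothetical control-flow sketch of DISCERN's paraphrase path;
    # each stub stands in for a trained distributed network.
    def make_stub(name):
        def module(pattern):
            print(name + ":", pattern)
            return pattern        # a real module transforms patterns
        return module

    sentence_parser    = make_stub("sentence parser")
    story_parser       = make_stub("story parser")
    memory_store       = make_stub("episodic memory (store)")
    story_generator    = make_stub("story generator")
    sentence_generator = make_stub("sentence generator")

    def paraphrase(sentences):
        parsed = [sentence_parser(s) for s in sentences]
        story = story_parser(parsed)
        memory_store(story)       # lay down a trace in memory
        return [sentence_generator(s) for s in story_generator(story)]

    paraphrase(["John went to MaMaison .",
                "John asked the waiter for lobster .",
                "John left the waiter a big tip ."])

The question-answering path is analogous: sentence parser, cue
former, episodic memory, answer producer, sentence generator.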

V. LEXICON

14. The input and output of DISCERN consist of distributed
representations for orthographic word symbols (also called lexical
words). Internally, DISCERN processes semantic concept representations
(semantic words). Both the lexical and semantic words are represented
distributively as vectors of gray-scale values between 0.0 and 1.0. The
lexical representations are based on the visual patterns of characters
that make up the written word; they remain fixed throughout the
training and performance of DISCERN. The semantic representations stand
for distinct meanings and are developed automatically by the system
while it is learning the processing task.

15. The lexicon stores the lexical and semantic representations and
translates between them. It is implemented as two feature maps
(Kohonen, 1989), one lexical and the other semantic. Words whose
lexical forms are similar, such as "LINE" and "LIKE", are represented
by nearby units in the lexical map. In the semantic map, words with
similar semantic content, such as "John" and "Mary", or "Leone's" and
"MaMaison" are mapped near each other. There is a dense set of
associative interconnections between the two maps. A localized activity
pattern representing a word in one map will cause a localized activity
pattern to form in the other map, representing the same word. The
output representation is then obtained from the weight vector of the
most highly active unit. The lexicon thus transforms a lexical input
vector into a semantic output vector and vice versa. Both maps and the
associative connections between them are organized simultaneously,
based on examples of co-occurring symbols and meanings.
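
A rough Python/numpy sketch of the translation step may help; it
reduces each map to a codebook of weight vectors and the localized
activity pattern to a single winning unit, which is a simplification
of the mechanism described above. All sizes and the random weights
are illustrative:

    # Hypothetical sketch of lexical -> semantic translation through
    # two feature maps joined by associative connections.
    import numpy as np

    rng = np.random.default_rng(0)
    N_LEX, N_SEM = 25, 25    # units per map (a 5x5 grid, flattened)
    D = 12                   # representation dimensionality

    lex_map = rng.random((N_LEX, D))    # lexical codebook vectors
    sem_map = rng.random((N_SEM, D))    # semantic codebook vectors
    assoc = rng.random((N_LEX, N_SEM))  # lexical-to-semantic links

    def to_semantic(lexical_vec):
        # The best-matching lexical unit stands in for the localized
        # activity pattern on the lexical map.
        win = np.argmin(np.linalg.norm(lex_map - lexical_vec, axis=1))
        # Its associative connections activate the semantic map; the
        # output is the weight vector of the most active semantic unit.
        return sem_map[np.argmax(assoc[win])]

    print(to_semantic(rng.random(D)))

The semantic-to-lexical direction works the same way, through the
connections running the other way.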

16. The lexicon architecture facilitates interesting behavior.
Localized damage to the semantic map results in category-specific
lexical deficits similar to those observed in human aphasia
(Caramazza, 1988; McCarthy and Warrington, 1990). For example, the
system selectively loses access to
restaurant names, or animate words, when that part of the map is
damaged. Dyslexic performance errors can also be modeled. If the
performance is degraded, for example, by adding noise to the
connections, parsing and generation errors that occur are quite similar
to those observed in human deep dyslexia (Coltheart et al., 1988). For
example, the system may confuse "Leone's" with "MaMaison", or "LINE"
with "LIKE", because they are nearby in the map and share similar
associative connections.

VI. FGREP PROCESSING MODULES

17. Processing in DISCERN is carried out by hierarchically organized
pattern-transformation networks. Each module performs a specific
subtask, such as parsing a sentence or generating an answer to a
question. All these networks have the same basic architecture: they are
three-layer, simple-recurrent backpropagation networks (Elman, 1990),
with the extension called FGREP that allows them to develop distributed
representations for their input/output words.

18. The network learns the processing task by adapting the connection
weights according to the standard on-line backpropagation procedure
(Rumelhart et al., 1986, pp. 327-329). The error signal is propagated
to the input layer, and the current input representations are modified
as if they were an extra layer of weights. The modified representation
vectors are put back in the lexicon, replacing the old representations.
Next time the same words occur in the input or output, their new
representations are used to form the input/output patterns for the
network. In FGREP, therefore, the required mappings change as the
representations evolve, and backpropagation is shooting at a moving
target.
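
The FGREP extension is easiest to see in code. The sketch below uses
a single weight layer and a plain sigmoid for brevity, whereas the
actual modules are three-layer simple recurrent networks; the sizes,
learning rate, and clipping to [0, 1] gray-scale values are
assumptions:

    # Hypothetical single-layer sketch of one FGREP training step:
    # ordinary backpropagation, plus propagating the error one step
    # further to update the input word representation itself.
    import numpy as np

    rng = np.random.default_rng(0)
    D = 8
    lexicon = {"john": rng.random(D)}   # evolving representations
    W = rng.random((D, D)) * 0.1
    lr = 0.1

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def fgrep_step(word, target):
        global W
        x = lexicon[word]
        y = sigmoid(W @ x)
        delta = (y - target) * y * (1.0 - y)  # output-layer error
        grad_x = W.T @ delta                  # error at the input layer
        W -= lr * np.outer(delta, x)          # normal weight update
        # FGREP: treat the input representation as an extra layer of
        # weights, then write the modified vector back to the lexicon.
        lexicon[word] = np.clip(x - lr * grad_x, 0.0, 1.0)

    fgrep_step("john", target=rng.random(D))
    print(lexicon["john"])

Because the stored representations change after each step, the
mappings the network must learn change with them; this is the moving
target mentioned above.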

19. The representations that result from this process have a number of
useful properties for cognitive modeling. (1) Since they adapt to the
error signal, they end up coding information most crucial to the task.
Representations for words that are used in similar ways in the examples
become similar. Thus, these profiles of continuous activity values can
be claimed to code the meanings of the words as well. (2) As a result,
the system never has to process very novel input patterns, because
generalization has already been done in the representations. (3) The
representation of a word is determined by all the contexts in which
that word has been encountered; consequently, it is also a
representation of all those contexts. Expectations emerge automatically
and cumulatively from the input word representations. (4) Single
representation components do not usually stand for identifiable
semantic features. Instead, the representation is holographic: word
categories can often be recovered from the values of single
components. (5) Holography makes the system very robust against noise
and damage. Performance degrades approximately linearly as
representation components become defective or inaccurate.

VII. EPISODIC MEMORY

20. The episodic memory in DISCERN consists of a hierarchical pyramid
of feature maps organized according to the taxonomy of script-based
stories. The highest level of the hierarchy is a single feature map
that lays out the different script classes. Beneath each unit of this
map there is another feature map that lays out the tracks within the
particular script. The different role bindings within each track are
separated at the bottom level. The map hierarchy receives a story
representation vector as its input and classifies it as an instance of
a particular script, track, and role binding. The hierarchy thereby
provides a unique memory representation for each script-based story as
the maximally responding units in the feature maps at the three
levels.
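
The classification step can be sketched as a nearest-match descent
through three levels of codebooks; the map sizes and random weights
below are illustrative, and a real feature map would of course be
trained by self-organization:

    # Hypothetical sketch of classification down the script/track/
    # role-binding hierarchy of feature maps.
    import numpy as np

    rng = np.random.default_rng(0)
    D = 16                                 # story representation size

    def winner(codebook, vec):
        return int(np.argmin(np.linalg.norm(codebook - vec, axis=1)))

    script_map = rng.random((9, D))        # top level: scripts
    track_maps = [rng.random((9, D)) for _ in range(9)]
    role_maps = [[rng.random((16, D)) for _ in range(9)]
                 for _ in range(9)]

    def classify(story_vec):
        s = winner(script_map, story_vec)      # which script
        t = winner(track_maps[s], story_vec)   # which track within it
        r = winner(role_maps[s][t], story_vec) # which role binding
        return s, t, r                         # unique memory address

    print(classify(rng.random(D)))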

21. Whereas the top and middle levels of the hierarchy serve only as
classifiers, selecting the appropriate track and role-binding map for
each input, at the bottom level a permanent trace of the story must
also be created. The role-binding maps are trace feature maps, with
modifiable lateral connections. When the story representation vector is
presented to a role-binding map, a localized activity pattern forms as
a response. Each lateral connection to a unit with higher activity is
made excitatory, while a connection to a unit with lower activity is
made inhibitory. The units within the response now "point" towards the
unit with highest activity, permanently encoding that the story was
mapped at that location.

22. A story is retrieved from the episodic memory by giving it a
partial story representation as a cue. Unless the cue is highly
deficient, the map hierarchy is able to recognize it as an instance of
the correct script and track and form a partial cue for the
role-binding map. The trace feature map mechanism then completes the
role binding. The initial response of the map is again a localized
activity pattern; because the map is topological, it is likely to be
located somewhere near the stored trace. If the cue is close enough,
the lateral connections pull the activity to the center of the stored
trace. The complete story representation is retrieved from the weight
vectors of the maximally responding units at the script, track, and
role-binding levels.
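
Storage and retrieval on a trace feature map can be approximated as
follows; the sketch replaces the excitatory and inhibitory lateral
connections with explicit pointers toward the trace center, and the
sizes, response function, and trace radius are all assumptions:

    # Hypothetical sketch of a trace feature map: storing a trace
    # makes units inside the response point at the maximally active
    # unit; retrieval follows the pointer from the cue's response.
    import numpy as np

    rng = np.random.default_rng(0)
    N, D = 20, 16
    codebook = rng.random((N, D))   # unit weight vectors
    pointer = np.full(N, -1)        # lateral structure; -1 = no trace

    def response(vec):
        d = np.linalg.norm(codebook - vec, axis=1)
        return np.exp(-d ** 2)      # localized activity pattern

    def store(story_vec):
        act = response(story_vec)
        win = int(np.argmax(act))
        codebook[win] = story_vec   # keep the full vector at the center
        inside = np.argsort(-act)[:5]  # units inside the response
        pointer[inside] = win          # they now point at the trace

    def retrieve(cue_vec):
        start = int(np.argmax(response(cue_vec)))
        if pointer[start] < 0:
            return None             # cue lands outside every trace
        return codebook[pointer[start]]

    story = rng.random(D)
    store(story)
    noisy_cue = story + 0.05 * rng.random(D)  # partial, inaccurate cue
    print(retrieve(noisy_cue) is not None)

A cue that falls on a unit with no trace pointer settles nowhere and
nothing is retrieved, which anticipates the behavior described in
paragraph 29 below.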

23. Hierarchical feature maps have a number of properties that make
them useful for memory organization: (1) The organization is formed in
an unsupervised manner, extracting it from the input experience of the
system. (2) The resulting order reflects the properties of the data,
the hierarchy corresponding to the levels of variation, and the maps
laying out the similarities at each level. (3) By dividing the data
first into major categories and gradually making finer distinctions
lower in the hierarchy, the most salient components of the input data
are singled out and more resources are allocated for representing them
accurately. (4) Because the representation is based on salient
differences in the data, the classification is very robust, and usually
correct even if the input is noisy or incomplete. (5) Because the
memory is based on classifying the similarities and storing the
differences, retrieval becomes a reconstructive process (Kolodner, 1984;
Williams and Hollan, 1981) similar to human memory.

24. The trace feature map exhibits interesting memory effects that
result from interactions between traces. Later traces capture units
from earlier ones, making later traces more likely to be retrieved.
The extent of the traces determines memory capacity. The smaller the
traces, the more of them will fit in the map, but more accurate cues
are required to retrieve them. If the memory capacity is exceeded,
older traces will be selectively replaced by newer ones. Traces that
are unique, that is, located in a sparse area of the map, are not
affected, no matter how old they are. Similar effects are common in
human long-term memory (Baddeley, 1976; Postman, 1971).

VIII. DISCERN HIGH-LEVEL BEHAVIOR

25. DISCERN is more than just a collection of individual cognitive
models. Interesting behavior results from the interaction of the
components in a complete story-processing system.

26. DISCERN was trained and tested with an artificially generated
corpus of script-based stories consisting of three scripts with three
tracks and three open roles each. The complete DISCERN system performs
very well: at the output, about 98 percent of the words are correct.
This is rather remarkable for a chain of networks that is 9 modules
long and consists of several different types of modules.

27. A modular neural network system can only operate if it is stable,
that is, if small deviations from the normal flow of information are
automatically corrected. It turns out that DISCERN has several built-in
safeguards against minor inaccuracies and noise. The semantic
representations are distributed and redundant, and inaccuracies in the
output of one module are cleaned up by the module that uses the
output. The memory modules clean up by categorical processing: a noisy
input is recognized as a representative of an established class and
replaced by the correct representation of that class. As a result,
small deviations do not throw the system off course, but rather the
system filters out the errors and returns to the normal course of
processing, which is an essential requirement for building robust
cognitive models.
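
One of these safeguards, categorical clean-up, is easy to illustrate:
a noisy vector passed between modules is snapped to the nearest
established representation, or rejected if nothing is close enough.
The prototypes and threshold below are illustrative:

    # Hypothetical sketch of categorical clean-up between modules.
    import numpy as np

    rng = np.random.default_rng(0)
    prototypes = {w: rng.random(8)
                  for w in ("john", "lobster", "mamaison")}

    def clean_up(vec, threshold=0.5):
        name = min(prototypes,
                   key=lambda w: np.linalg.norm(prototypes[w] - vec))
        proto = prototypes[name]
        # Replace the noisy vector by the class representation, or
        # reject it if it is too far from anything known.
        return proto if np.linalg.norm(proto - vec) <= threshold else None

    noisy = prototypes["john"] + 0.05 * rng.random(8)
    print(clean_up(noisy) is prototypes["john"])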

28. DISCERN also demonstrates strong script-based inferencing. Even
when the input story is incomplete, consisting of only a few main
events, DISCERN can usually form an accurate internal representation
of it. DISCERN was trained to form complete story representations
from the first sentence on, and because the stories are stereotypical,
missing sentences have little effect on the parsing process. Once the
story representation has been formed, DISCERN performs as if the script
had been fully instantiated. Questions about missing events and
role-bindings are answered as if they were part of the original story.
If events occurred in an unusual order, they are recalled in the
stereotypical order in the paraphrase. If there is not enough
information to fill a role, the most likely filler is selected and
maintained throughout the paraphrase generation. Such behavior
automatically results from the modular architecture of DISCERN and is
consistent with experimental observations on how people remember
stories of familiar event sequences (Bower et al., 1979; Graesser et
al., 1979).

29. In general, given the information in the question, DISCERN recalls
the story that best matches it in the memory. An interesting issue
is: what happens when DISCERN is asked a question that is inaccurate
or ambiguous, that is, one that does not uniquely specify a story? For
example, DISCERN might have read a story about John eating lobster at
MaMaison, and then about Mary doing the same at Leone's, and the
question could be "Who ate lobster?" Because later traces are more
prominent in the memory, DISCERN is more likely to retrieve the
Mary-at-Leone's story in this case. The earlier story is still in the
memory, but to recall it, more details need to be specified in the
question, such as "Who ate lobster at MaMaison?" Similarly, DISCERN can
robustly retrieve a story even if the question is slightly inaccurate.
When asked "How did John like the steak at MaMaison?", DISCERN
generates the answer "John thought lobster was good at MaMaison",
ignoring the inaccuracy in the question, because the cue is still close
enough to the stored trace. DISCERN does recognize, though, when a
question is too different from anything in the memory, and should not
be answered. For "Who ate at McDonald's?", the cue vector is not close
to any trace, the memory does not settle, and nothing is retrieved.
Note that these mechanisms were not explicitly built into DISCERN,
but they emerge automatically from the physical layout of the
architecture and representations.

IX. DISCUSSION

30. There is an important distinction between scripts (or more
generally, schemas) in symbolic systems, and scripts in subsymbolic
models such as DISCERN. In the symbolic approach, a script is stored
in memory as a separate, exact knowledge structure, coded by the
knowledge engineer. The script has to be instantiated by searching the
schema memory sequentially for a structure that matches the input.
After instantiation, the script is active in the memory and later
inputs are interpreted primarily in terms of this script. Deviations
are easy to recognize and can be taken care of with special
mechanisms.

31. In the subsymbolic approach, schemas are based on statistical
properties of the training examples, extracted automatically during
training. The resulting knowledge structures do not have explicit
representations. For example, a script exists in a neural network only
as statistical correlations coded in the weights. Every input is
automatically matched to every correlation in parallel. There is no
all-or-none instantiation of a particular knowledge structure. The
strongest, most probable correlations will dominate, depending on how
well they match the input, but all of them are simultaneously active at
all times. Regularities that make up scripts can be particularly well
captured by such correlations, making script-based inference a good
domain for the subsymbolic approach. Generalization and graceful
degradation give rise to inferencing that is intuitive, immediate, and
occurs without conscious control, just as script-based inference does
in humans. On the other hand, it is very difficult to recognize deviations
from the script and to initiate exception-processing when the automatic
mechanisms fail. Such sequential reasoning would require intervention
of a high-level "conscious" monitor, which has yet to be built in the
connectionist framework.

X. CONCLUSION

32. The main conclusion from DISCERN is that building subsymbolic
models is a feasible approach to understanding mechanisms underlying
natural language processing. DISCERN shows how several cognitive
phenomena may result from subsymbolic mechanisms. Learning word
meanings, script processing, and episodic memory organization are based
on self-organization and gradient-descent in error in this model.
Script-based inferences, expectations, and defaults automatically
result from generalization and graceful degradation. Several types of
performance errors in role binding, episodic memory, and lexical access
emerge from the physical organization of the system. Perhaps most
significantly, DISCERN shows how individual connectionist models can be
combined into a large, integrated system that demonstrates that these
models are sufficient constituents for generating sequential, symbolic,
high-level behavior.

33. Although processing simple script instantiations is a start, there
is a long way to go before subsymbolic models will rival the best
symbolic cognitive models. For example, in story understanding, symbolic
systems have been developed that analyze realistic stories in depth,
based on higher-level knowledge structures such as goals, plans,
themes, affects, beliefs, argument structures, plots, and morals. In
designing subsymbolic models that would do that, we are faced with two
major challenges: (1) how to implement connectionist control of
high-level processing strategies (making it possible to model processes
more sophisticated than a series of reflex responses), and (2) how to
represent and learn abstractions (making it possible to process
information at a higher level than correlations in the raw input
data). Progress in these areas would constitute a major step towards
extending the capabilities of subsymbolic natural language processing
models beyond those of DISCERN.

XI. NOTE

34. Software for the DISCERN system is available through anonymous ftp
from cs.utexas.edu:pub/neural-nets/discern. An X11 graphics demo,
showing DISCERN processing the example stories discussed in the
book, can be run remotely over the World Wide Web at
http://www.cs.utexas.edu/~risto/discern.html, or by telnet with "telnet
cascais.utexas.edu 30000".

XII. TABLE OF CONTENTS

PART I Overview
1 Introduction
2 Background
3 Overview of DISCERN

PART II Processing Mechanisms
4 Backpropagation Networks
5 Developing Representations in FGREP Modules
6 Building from FGREP Modules

PART III Memory Mechanisms
7 Self-Organizing Feature Maps
8 Episodic Memory Organization: Hierarchical Feature Maps
9 Episodic Memory Storage and Retrieval: Trace Feature Maps
10 Lexicon

PART IV Evaluation
11 Behavior of the Complete Model
12 Discussion
13 Comparison to Related Work
14 Extensions and Future Work
15 Conclusions

APPENDICES
A Story Data
B Implementation Details
C Instructions for Obtaining the DISCERN Software

XIII. REFERENCES

Baddeley, A.D. (1976) The Psychology of Memory. New York: Basic Books.

Bower, G.H., Black, J.B. and Turner, T.J. (1979) Scripts in memory for
text. Cognitive Psychology, 11:177-220.

Caramazza, A. (1988) Some aspects of language processing revealed
through the analysis of acquired aphasia: The lexical system. Annual
Review of Neuroscience, 11:395-421.

Coltheart, M., Patterson, K. and Marshall, J.C., editors (1988) Deep
Dyslexia. London; Boston: Routledge and Kegan Paul. Second edition.

Elman, J.L. (1990) Finding structure in time. Cognitive Science,
14:179-211.

Graesser, A.C., Gordon, S.E. and Sawyer, J.D. (1979) Recognition memory
for typical and atypical actions in scripted activities: Tests for the
script pointer+tag hypothesis. Journal of Verbal Learning and Verbal
Behavior, 18:319-332.

Kohonen, T. (1989) Self-Organization and Associative Memory. Berlin;
Heidelberg; New York: Springer. Third edition.

Kolodner, J.L. (1984) Retrieval and Organizational Strategies in
Conceptual Memory: A Computer Model. Hillsdale, NJ: Erlbaum.

McCarthy, R.A. and Warrington, E.K. (1990) Cognitive Neuropsychology: A
Clinical Introduction. New York: Academic Press.

Miikkulainen, R. (1993) Subsymbolic Natural Language Processing: An
Integrated Model of Scripts, Lexicon, and Memory. Cambridge, MA: MIT
Press.

Postman, L. (1971) Transfer, interference and forgetting. In Kling,
J.W., and Riggs, L.A., editors, Woodworth and Schlosberg's Experimental
Psychology, 1019-1132. New York: Holt, Rinehart and Winston. Third
edition.

Rumelhart, D.E. and McClelland, J.L. (1986) On learning past tenses of
English verbs. In Rumelhart, D.E., and McClelland, J.L., editors,
Parallel Distributed Processing: Explorations in the Microstructure of
Cognition, Volume 2, 216-271. Cambridge, MA: MIT Press.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986) Learning
internal representations by error propagation. In Rumelhart, D.E. and
McClelland, J.L., editors, Parallel Distributed Processing:
Explorations in the Microstructure of Cognition, Volume 1, 318-362.
Cambridge, MA: MIT Press.

Schank, R.C. and Abelson, R.P. (1977) Scripts, Plans, Goals, and
Understanding: An Inquiry into Human Knowledge Structures. Hillsdale,
NJ: Erlbaum.

Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that
learn to pronounce English text. Complex Systems, 1:145-168.

Williams, M.D. and Hollan, J.D. (1981) The process of retrieval from
very long-term memory. Cognitive Science, 5:87-119.

--------------------------------------------------------------------

        PSYCOLOQUY Book Review Instructions

The PSYCOLOQUY book review procedure is very similar to the commentary
procedure except that it is the book itself, not a target article, that
is under review. (The Precis summarizing the book is intended to permit
PSYCOLOQUY readers who have not read the book to assess the exchange,
but the reviews should address the book, not primarily the Precis.)

Note that as multiple reviews will be co-appearing, you need only
comment on the aspects of the book relevant to your own specialty and
interests, not necessarily the book in its entirety. Any substantive
comments and criticism -- including points calling for a detailed and
substantive response from the author -- are appropriate. Hence,
investigators who have already reviewed or intend to review this book
elsewhere are still encouraged to submit a PSYCOLOQUY review specifically
written with this specialized multilateral review-and-response feature
in mind.

1.  Before preparing your review, please read carefully
    the Instructions for Authors and Commentators and examine
    recent numbers of PSYCOLOQUY.

2.  Reviews should not exceed 500 lines. Where judged necessary
    by the Editor, reviews will be formally refereed.

3.  Please provide a title for your review. As many
    commentators will address the same general topic, your
    title should be a distinctive one that reflects the gist
    of your specific contribution and is suitable for the
    kind of keyword indexing used in modern bibliographic
    retrieval systems. Each review should also have a brief
    (~50-60 word) Abstract.

4.  All paragraphs should be numbered consecutively. Line length
    should not exceed 72 characters.  The review should begin with
    the title, your name and full institutional address (including zip
    code) and email address.  References must be prepared in accordance
    with the examples given in the Instructions.  Please read the
    sections of the Instructions for Authors concerning style.

    INSTRUCTIONS FOR PSYCOLOQUY AUTHORS AND COMMENTATORS

PSYCOLOQUY is a refereed electronic journal (ISSN 1055-0143) sponsored
on an experimental basis by the American Psychological Association
and currently estimated to reach a readership of 40,000. PSYCOLOQUY
publishes brief reports of new ideas and findings on which the author
wishes to solicit rapid peer feedback, international and
interdisciplinary ("Scholarly Skywriting"), in all areas of psychology
and its related fields (biobehavioral science, cognitive science,
neuroscience, social science, etc.). All contributions are refereed.

Target article length should normally not exceed 500 lines [c. 4500 words].
Commentaries and responses should not exceed 200 lines [c. 1800 words].

All target articles, commentaries and responses must have (1) a short
abstract (up to 100 words for target articles, shorter for commentaries
and responses), (2) an indexable title, (3) the authors' full name(s)
and institutional address(es).

In addition, for target articles only: (4) 6-8 indexable keywords,
(5) a separate statement of the authors' rationale for soliciting
commentary (e.g., why would commentary be useful and of interest to the
field? what kind of commentary do you expect to elicit?) and
(6) a list of potential commentators (with their email addresses).

All paragraphs should be numbered in articles, commentaries and
responses (see format of already published articles in the PSYCOLOQUY
archive; line length should be < 80 characters, no hyphenation).

It is strongly recommended that all figures be designed so as to be
screen-readable ascii. If this is not possible, the provisional
solution is the less desirable hybrid one of submitting them as
postscript files (or in some other universally available format) to be
printed out locally by readers to supplement the screen-readable text
of the article.

PSYCOLOQUY also publishes multiple reviews of books in any of the above
fields; these should normally be the same length as commentaries, but
longer reviews will be considered as well. Book authors should submit a
500-line self-contained Precis of their book, in the format of a target
article; if accepted, this will be published in PSYCOLOQUY together
with a formal Call for Reviews (of the book, not the Precis). The
author's publisher must agree in advance to furnish review copies to the
reviewers selected.

Authors of accepted manuscripts assign to PSYCOLOQUY the right to
publish and distribute their text electronically and to archive and
make it permanently retrievable electronically, but they retain the
copyright, and after it has appeared in PSYCOLOQUY authors may
republish their text in any way they wish -- electronic or print -- as
long as they clearly acknowledge PSYCOLOQUY as its original locus of
publication. However, except in very special cases, agreed upon in
advance, contributions that have already been published or are being
considered for publication elsewhere are not eligible to be considered
for publication in PSYCOLOQUY.

Please submit all material to psyc at pucc.bitnet or psyc at pucc.princeton.edu
Anonymous ftp archive is DIRECTORY pub/harnad/Psycoloquy HOST princeton.edu


