Connectionist Explanation: PSYC Call for Commentators (614 lines)

Tue Apr 14 09:39:12 EDT 1998

                Green: CONNECTIONIST EXPLANATION

    The target article below has just appeared in PSYCOLOQUY, a
    refereed journal of Open Peer Commentary sponsored by the American
    Psychological Association. Qualified professional biobehavioral,
    neural or cognitive scientists are hereby invited to submit Open
    Peer Commentary on this article. Please email for Instructions if
    you are not familiar with PSYCOLOQUY format and acceptance criteria
    (all submissions are refereed).

    To submit articles and commentaries or seek information:

EMAIL: psyc at pucc.princeton.edu
URL:   http://www.princeton.edu/~harnad/psyc.html
       http://www.cogsci.soton.ac.uk/psyc

AUTHOR'S RATIONALE FOR SOLICITING COMMENTARY: My reason for soliciting
commentary is quite straightforward. Connectionist models of cognition
are quite common in the psychological literature these days, but there
is very little discussion of the exact role they are thought to play in
science. If they are indeed theories, in the traditional sense, then
some explanation of the ways in which they seem to depart from
traditional theories is needed. If they are not traditional theories,
then a clear description of what they are, and an account of why we
should pay attention to them, is needed. Such a discussion should take
place among connectionists, philosophers of science and mind, and
psychologists. Psycoloquy seems like an ideal vehicle for such a
discussion.

-----------------------------------------------------------------------
psycoloquy.98.9.04.connectionist-explanation.1.green    Tue 14 Mar 1998
ISSN 1055-0143       (23 paragraphs, 16 references, 4 notes, 584 lines)
PSYCOLOQUY is sponsored by the American Psychological Association (APA)
               Copyright 1998 Christopher D. Green

        ARE CONNECTIONIST MODELS THEORIES OF COGNITION?

                Christopher D. Green
                Department of Psychology
                York University
                North York, Ontario 
                M3J 1P3 CANADA
                christo at yorku.ca
        http://www.yorku.ca/faculty/academic/christo

    ABSTRACT: This paper explores the question of whether connectionist
    models of cognition should be considered to be scientific theories
    of the cognitive domain. It is argued that in traditional
    scientific theories, there is a fairly close connection between the
    theoretical (unobservable) entities postulated and the empirical
    observations accounted for. In connectionist models, however,
    hundreds of theoretical terms are postulated -- viz., nodes and
    connections -- that are far removed from the observable phenomena.
    As a result, many of the features of any given connectionist model
    are relatively optional. This leads to the question of what,
    exactly, is learned about a cognitive domain modelled by a
    connectionist network.

    KEYWORDS: artificial intelligence, cognition, computer modelling,
    connectionism, epistemology, explanation, methodology, neural nets,
    philosophy of science, theory.

1. Connectionist models of cognition are all the rage now. It is not
clear, however, in what sense such models are to be considered THEORIES
of cognition. This may be problematic, for if connectionist models are
NOT to be considered THEORIES of cognition, in the traditional
scientific sense of the word, then the question arises as to what
exactly they are, and why we should pay attention to them. If, on the
other hand, they are to be regarded as scientific theories it should be
possible to explicate precisely in what sense this is true, and to show
how they fulfill the functions we normally associate with theories. In
this paper, I begin by examining the question of what it is to be a
scientific theory. Second, I describe in precisely what sense
traditional computational models of cognition can be said to perform
this role. Third, I examine whether or not connectionist models can be
said to do the same. My conclusion is that connectionist models could,
under a certain interpretation of what it is they model, be considered
to be theories, but that this interpretation is likely to be
unacceptable to many connectionists.

2. A typical complex scientific theory contains both empirical and
theoretical terms. The empirical terms refer to observable entities.
The theoretical terms refer to unobservable entities that improve the
predictive power of the theory as a whole. The exact ontological status
of objects referred to by theoretical terms is a matter of some debate.
Realists believe them to be actual objects that resist direct
observation for one reason or another. Instrumentalists consider them
to be mere "convenient fictions" that earn their scientific keep merely
by the predictive accuracy they lend to the theory. I think it is fair
to say that the vast majority of research psychologists are realists
about the theoretical terms they use, though they are, in the main,
unreflective realists who have never seriously considered alternative
possibilities.

3. Let us begin with a relatively uncontroversial theory from outside
psychology -- Mendelian genetics. In the Mendelian scheme, entities
called "genes" were said to be responsible for the propagation of
traits from one generation of organisms to another. Mendel was unable
to observe anything corresponding to "genes," but their invocation made
it possible for him to predict correctly the proportions in which
succeeding generations of organisms would express a variety of traits.
As such, the gene is a classic example of a theoretical entity. For
present purposes, it is important to note that each such theoretical
gene, though unobservable, was hypothesized to correspond to an
individual trait. That is, in addition to the predictive value each
theoretical gene provided, each also justified its existence by being
responsible for a particular phenomenon. There were no genes in the
system that were not directly tied to the expression of a trait.
Although some genes were said not to be expressed in the phenotype
(viz., recessive genes in heterozygous individuals), all were said to
be directly involved in the calculation of the expression of a specific
trait. That is to say, their inclusion in the theory was justified in
part by the SPECIFICITY of the role they were said to play. It is worth
noting that the actual existence of genes remained controversial until
the discovery of their molecular basis -- viz., DNA -- and our
understanding of them changed considerably with that discovery.
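
As an illustration of how a theoretical gene earns its predictive keep,
the following minimal Python sketch enumerates the offspring of a cross
between two heterozygous parents. The allele labels "A" and "a", the
function name, and the simple one-gene dominance rule are illustrative
assumptions, not notation taken from Mendel or from this article.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def phenotype_proportions(parent1, parent2, dominant="A"):
    """Return the expected proportion of each phenotype among offspring.

    Each parent is a two-letter string of alleles, e.g. "Aa". An offspring
    shows the dominant phenotype if it carries at least one dominant allele.
    """
    counts = Counter()
    # Each parent passes on one of its two alleles with equal probability,
    # so every pairing of one allele per parent is equally likely.
    for allele1, allele2 in product(parent1, parent2):
        carries_dominant = dominant in (allele1, allele2)
        counts["dominant" if carries_dominant else "recessive"] += 1
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}


if __name__ == "__main__":
    # Cross of two heterozygous parents.
    print(phenotype_proportions("Aa", "Aa"))
```

Every quantity in the sketch is tied to a single trait: removing the one
postulated gene removes the prediction for that trait and nothing else.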

4. Now consider, as a psychological example of theoretical entities,
the model of memory proposed by Atkinson and Shiffrin (1971). It is a
classic "box-and-arrow" theory. Information is fed from the sensory
register into a holding space called Short Term Store (STS). If
continuously rehearsed, a limited number of items can be stored there
indefinitely. If the number of items exceeds the capacity of the store,
some are lost. If rehearsal continues for an unspecified duration, it
is claimed that some or all of these items are transferred to another
holding space called Long Term Store (LTS). The capacity of LTS is
effectively unlimited, and items in LTS need not be continuously
rehearsed, but are said to be kept in storage effectively permanently.
STS and LTS are, like genes, theoretical entities. They cannot be
directly observed, but their postulation enables the psychologist to
predict correctly a number of memory phenomena. In each such
phenomenon, the activity of each store is carefully specified. The
precision of this specification seems to be at least part of the reason
that scientists are willing to accept them. Indeed, many experiments
aimed at confirming their existence are explicitly designed to block,
or interfere with the hypothesized activity of one in order to
demonstrate the features of the "pure" activity of the other. Whether
or not this could be successfully accomplished was once a dominant
question in memory theory. The issue of short term memory EFFECTS being
"contaminated" by the uncontrollable and unwanted activity of LTS
occupied many experiments of the 1960s and 1970s.

5. Over the last 30 years the Atkinson and Shiffrin model has been
elaborated and refined. As a result, the number of memory systems
hypothesized to exist has grown tremendously. Baddeley (1992), for
instance, has developed STS into a series of slave systems responsible
for information entering memory from the various sense modalities
(e.g., the phonological loop, the visuospatial sketchpad), the
activities of which are coordinated by a central executive. Tulving
(1985), on the other hand, has divided LTS into four hierarchically
arranged systems responsible for episodic memory (for personal events),
semantic memory (for general information), procedural memory (for
skills) and implicit memory (for priming). In order to establish the
existence of each of these many theoretical entities, thousands of
experiments have been performed, aimed at revealing the independent
activity of one or another by attempts to block the activity of the
others.

6. Once again, the question of whether the activity of a single memory
system can be studied in isolation has called into question the very
existence of that system. For over a decade, now, the elucidation of
implicit memory phenomena has been a major issue in memory theory
(Schacter, 1987, 1992; Roediger, 1990; Roediger & McDermott, 1993). In
the typical implicit memory experiment [1], subjects study a list of
items (both words and pictures have been used) by processing them
briefly. This can be as simple as reading the word or naming the
object, or it can be more involved, such as deciding whether the item
belongs to a certain class of items (e.g., is a car a kind of vehicle?)
or decomposing it into parts (e.g., counting the number of letters in
words, or counting the edges or corners in pictured items). The
subjects then take part in a memory test, although they are not told
that it is a memory test, and it could indeed be performed without
having studied the material. In this test, they see a new list of
items, some of which, unbeknownst to them, are the same as (or closely
related to) the items they have studied. Such tests are sometimes
puzzles of various sorts (e.g., completing incomplete words or
identifying the items in incomplete pictures). Sometimes they are as
simple as deciding whether the items are true words (as opposed to
pronounceable non-words such as BLICK) or possible objects. People
perform reliably better on these tasks when the items in question are
ones that were on the study list (or closely related to items on the
study list) than when the items are new. Upon post-experimental
debriefing, however, they are often unable to say which items they had
studied before and which they had not. This is the classic implicit
memory effect.

7. Recently, however, it has been argued (Roediger & McDermott, 1993)
that explicit memory may be "contaminating" the hypothesized effect of
the implicit memory system. The degree of this contamination is not
clear, but it is possible, in principle (though unlikely), that ALL
implicit memory phenomena are the result of covert explicit memory. The
evidence for this (Jacoby, 1991) comes from comparing the behavior of a
typical implicit memory group with that of a control group that goes
through the same procedure but is told EXPLICITLY that the answers to
some of the test problems are items they studied before. The outcome
is that these subjects do almost as well as the experimental subjects,
thus calling into question the "implicitness" of the traditional
subjects' memories. As a result, many have begun to question the very
existence of the implicit memory system. Many
psychologists argue that the implicit memory effects are the result of a
certain kind of processing of a more general memory system, not the
autonomous activity of a distinct system of its own.

8. With the entry of computer models into psychology, the theories have
become even more complex, using dozens of theoretical entities. A
recent version of Chomskyan linguistic theory, for instance, postulates
more than two dozen rules that are said to control the building and
interpretation of grammatical sentences (see, e.g., Berwick, 1985). But
even here the empirical data must bear fairly directly on each
theoretical entity. None of these rules is without specific predicted
effects. Each of the rules performs a certain function without which
the construction and interpretation of grammatical sentences could not
proceed correctly. For example, RULE ATTACH-VP, sensibly enough,
attaches verb phrases to sentences; RULE ATTACH-NOUN similarly attaches
nouns to noun phrases; and so forth. Part of what justifies the
inclusion in the theory of terms referring to each of these entities is
the fact that they are explicitly connected to specific empirical
phenomena.

9. In each of the models I have described so far, each theoretical
entity represents something in particular, even if that something is
itself theoretical. The existence and properties of the entities
represented are supported by empirical evidence relating specifically
to that entity. In a typical connectionist model, however, there are
dozens, sometimes hundreds, of simple units, bound together by
hundreds, sometimes thousands, of connections. Neither the units nor
the connections represent anything known to exist in the cognitive
domain the network is being used to model. Similarly, the rules that
govern how the activity of one unit will affect the activity of other
units to which it is connected are extremely simple, and not obviously
related to the domain that the network is being used to model. Ditto
for the rules that govern how the weights on the connections between
units are to be changed. In particular, the units of the network are
not thought to represent particular propositional attitudes (i.e.,
beliefs, desires, etc.) or the terms or concepts that might be thought
to underlie them. This is all considered a distinct advantage among
connectionists. Neither the units nor the connections correspond to
anything in the way that variables and rules did in traditional
computational models of cognition. Representations, to the degree that
they are admitted at all, are said to be distributed across the
activities of the units as a group. Any representation-level rules that
the model is said to use are likewise distributed across the weights of
all of the connections in the network. This gives connectionist
networks their characteristic flexibility: they are able to learn in a
wide variety of cognitive domains, to generalize their knowledge easily
to new cases, to continue working reasonably well despite incomplete
input or even moderate damage to their internal structure, etc. The
only real question is whether they are, indeed, TOO flexible to be good
theories, or whether, by contrast, there are heretofore unrecognized
features of good theories of which connectionist models can apprise
us.
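
To make concrete how simple and domain-neutral such rules are, here is a
minimal Python sketch of one common choice: a logistic activation rule
and a delta-rule weight update for a single unit. The article does not
commit to any particular rules; the function names, the logistic
function, and the learning rate are assumptions made for illustration.

```python
import math


def activation(inputs, weights, bias=0.0):
    """Logistic activation: squash the weighted sum of inputs into (0, 1)."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))


def delta_rule_update(inputs, weights, target, rate=0.5):
    """Return new weights after one delta-rule step for one logistic unit."""
    out = activation(inputs, weights)
    # Error signal, scaled by the slope of the logistic function at `out`.
    delta = (target - out) * out * (1.0 - out)
    return [w + rate * delta * x for x, w in zip(inputs, weights)]


if __name__ == "__main__":
    weights = [0.0, 0.0]
    # One learning step toward a target activation of 1.0.
    weights = delta_rule_update([1.0, 0.0], weights, target=1.0)
    print(weights, activation([1.0, 0.0], weights))
```

Nothing in either function mentions memory, grammar, or any other
cognitive domain; the same two rules would be used whatever the network
is meant to model.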

10. Each of the units, connections, and rules in a connectionist
network is a theoretical entity. Each name referring to it in a
description of the network is a theoretical term in the theory of
cognition that it embodies [2]. With the previously described theories,
it was evident that each theoretical entity had a specific job to do.
If it were removed, not only would the performance of the model as a
whole suffer, but it would suffer in predictable ways, viz., the
particular feature of the model's performance for which the theoretical
entity in question was responsible -- i.e., that which it represented
-- would no longer obtain. The units and connections in a connectionist
net -- precisely in virtue of the distributed nature of their activity
-- need not bear any such relation to the various activities of the
model. Although this seems to increase the model's overall efficiency,
it also seems to undermine the justification for each of the units and
connections in the network. To put things even more plainly, if one
were to ask, say, of Berwick's (1985) symbolic model of grammar, "What
is the justification for postulating RULE ATTACH-NOUN?" the answer
would be quite straightforward: "Because without it nouns would not be
attached to noun phrases and the resulting outputs would be
ungrammatical." The answer to the parallel question with respect to a
connectionist network -- viz., "What is the justification for
postulating (say) unit 123 in this network?" -- is not so
straightforward. Precisely because connectionist networks are so
flexible, the right answer is probably something like, "No reason in
particular. The network would probably perform just as well without it"
[3].

11. If this is true, we are led to an even more pressing question:
exactly what is it that we can actually be said to KNOW about a given
cognitive process once we have modelled it with a connectionist
network?  In the case of, say, the Atkinson and Shiffrin model of
memory, we can say that we have confirmation of the idea that there are
at least two forms of memory store -- short and long term -- and this
confirmation amounts to a justification of sorts for their postulation.
Are we similarly to say that a particular connectionist model with,
say, 326 units that correctly predicts activity in a given cognitive
domain confirms the idea that there are exactly 326 units governing
that activity? This seems ridiculous -- indeed almost meaningless. Aside
from the obvious fact that we don't know what the "units" are units OF,
we might well have gotten just as good results with 325, or 327 units,
or indeed with 300 or 350 units. Since none of the units correspond to
ANY particular aspect of the performance of the network, there is no
particular justification for any one of them. Some might argue that the
theory instantiated by the network is not meant to be read at this
level of precision -- that it is not the number of units, specifically,
that is being put forward for test, but only a network with a certain
general sort of architecture and certain sorts of activation and
learning rules. This seems simply too weak a claim to be
of much scientific value. As Popper told us, scientists should put
forward "bold conjectures" for test. The degree to which the hypothesis
is subject to refutation by the test is the degree to which it is
scientifically important. Even without accepting Popper's strong stand on
the unique status of refutation in scientific work, this much
remains clear: To back away from the details of one's theory -- to
shield them from the possibility of refutation --  is to make one's
theory scientifically less significant. Surely this is not a move
connectionist researchers want to make in the long run.
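
The arithmetic behind the 325/326/327 comparison can be sketched as
follows. The split into 20 input units and 6 output units, the single
fully connected hidden layer, and the function name are hypothetical
choices made only so that the totals match the figures above; the
article specifies no such architecture.

```python
def free_parameters(n_inputs, n_hidden, n_outputs, biases=True):
    """Count adjustable weights in a fully connected three-layer network."""
    # One weight per input-to-hidden and per hidden-to-output connection.
    count = n_inputs * n_hidden + n_hidden * n_outputs
    if biases:
        # One bias term per hidden unit and per output unit.
        count += n_hidden + n_outputs
    return count


if __name__ == "__main__":
    # Hypothetical layer sizes: 20 inputs and 6 outputs throughout.
    for n_hidden in (299, 300, 301):
        total_units = 20 + n_hidden + 6
        print(total_units, "units:", free_parameters(20, n_hidden, 6),
              "free parameters")
```

Under these assumptions each added or removed hidden unit changes the
number of adjustable quantities by a fixed amount, yet none of those
quantities is tied to any particular observation.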

12. It might be argued that the mapping of particular theoretical terms
on to particular aspects of the behavior being modelled is unnecessary;
it may just be an historical accident, primarily the result of our not
being able to keep simultaneous control of thousands of theoretical
terms until the advent of computers. Perhaps surprisingly, Carl Hempel
seems to have presaged this possibility in his classic essay,
Fundamentals of Concept Formation in Empirical Science: "A scientific
theory might ... be likened to a complex spatial network: Its terms are
represented by knots, while the threads connecting the latter
correspond, in part, to the definitions and, in part, to the
fundamental and derivative hypotheses included in the theory. The whole
system floats, as it were, above the plane of observation and is
anchored to it by rules of interpretation. These might be viewed as
strings which are not part of the network but link certain points of
the latter with specific places in the plane of observation. By virtue
of those interpretive connections, the network can function as a
scientific theory: From certain observational data, we may ascend, via
an interpretive string, to some point in the theoretical network,
thence proceed, via definitions and hypotheses, to other points, from
which another interpretive string permits a descent to the plane of
observation." (Hempel, 1952, p. 36)

13. Now, it is by no means clear that Hempel had in mind here that
there might be literally thousands of "knots in the network" between
those few that are connected to the "plane of observation," but by the
same token there is nothing in the passage that seems to definitely
preclude the possibility either.

14. The real question seems to be about what one can really be said to
have learned about the phenomenon of interest if one's model of that
phenomenon contains far more terms that are not tied down to the
"empirical plane," so to speak, than it does entities that are.
Consider the following analogy: suppose that an historian wants to
understand the events that lead up to political revolutions, so he
tries to simulate several revolutions and a variety of other less
successful political uprisings with a connectionist network. The input
units encode data on, say, the state of the economy in the years prior
to the uprising, the morale of the population, the kinds of political
ideas popular at the time, and a host of other important socio-
political variables. The output units encode various possible
outcomes:  revolution, uprising forcing significant political change,
uprising defused by superficial political concessions, uprising put
down by force, etc. Between the input and output units, let us say that
the historian places exactly 72 units which, he says, encode "a
distributed representation of the socio-political situation of the
time." His simulation runs beautifully. Indeed, let us say that because
he has learned the latest techniques of recurrent networks, he is
actually able to simulate events in the order in which they took place
over several years either side of each uprising.

15. What has he learned about revolution? That there must have been
(even approximately) 72 units involved? Certainly not. If the "hidden"
units corresponded to something in particular -- say, to political
leaders, or parties, or derivative socio-political variables -- that
is, if the network had been SYMBOLIC, then perhaps he would have a
case. Instead, he must simply repeat the mantra that they constitute "a
distributed representation of the situation," and that the network is
likely a close approximation to the situation because it plausibly
simulates so many different variants of it.

16. It must be concluded that he has not learned very much about
revolution at all. The simple fact of having a working
"simulation" seems to mean little. It is only if one can interpret the
INTERNAL ACTIVITY of the simulation that the simulation increases our
knowledge; i.e., it is only then that the simulation is to be
considered a scientific THEORY worthy of consideration.

17. Some might find this analogy invalid because of the widely
recognized problems with studying history with the methods of science.
My own opinion is that this is a non sequitur; but rather than arguing
the point let us turn to a less controversial case. Assume for the
moment that some aspiring amateur physicist, blithely unaware of the
work of Galileo and Newton, gets the idea that the way to study the
dynamics of balls rolling down inclined planes is to simulate their
movements with a connectionist network. He sets up the net with inputs
corresponding to variables such as the mass and volume of the ball, the
length and angle of the plane, etc. Perhaps, not really knowing what he
is after, he adds in some interesting variations such as ellipsoidal
balls and curved surfaces, and includes the pertinent features of these
in his encoding scheme. The activity of the output unit represents
simply the time it takes the ball to complete its descent down the
surface. He throws in a handful of hidden units, say 5, and runs the
simulation. Eventually the network is able to predict closely how long
it will take a certain ball to run down a certain surface, and it is
able to generalize its knowledge to new instances on which it was not
trained. If asked what the hidden units represent, the young physicist
says, "the individual units represent nothing in particular; just a
distributed representation of the physical situation as a whole." What
has he learned? Not much, it would seem. Certainly not what was
learned in the explanation of these kinds of phenomena in the theories
of Galileo and Newton, in which the theoretical entities
clearly REFER to relatively uncontroversial aspects of the world (e.g.,
distance, duration, size).
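
For contrast, here is a minimal Python sketch of the kind of account
the classical theory gives for the simplest version of this case. It
uses the standard textbook result for a body rolling from rest down a
straight incline without slipping, and it assumes a uniform solid
sphere (moment-of-inertia factor 2/5), no air resistance, and
g = 9.81 m/s^2. The function and parameter names are illustrative and
do not come from the article.

```python
import math

G = 9.81  # assumed gravitational acceleration, in m/s^2


def descent_time(length_m, angle_deg, inertia_factor=2.0 / 5.0, g=G):
    """Time for a body to roll from rest down a straight incline.

    inertia_factor is I / (m * r**2): 2/5 for a uniform solid sphere,
    0 for an idealised body sliding without friction.
    """
    # Acceleration along the incline for rolling without slipping.
    accel = g * math.sin(math.radians(angle_deg)) / (1.0 + inertia_factor)
    # From length = accel * t**2 / 2, solved for t.
    return math.sqrt(2.0 * length_m / accel)


if __name__ == "__main__":
    print(descent_time(length_m=1.0, angle_deg=30.0))
```

Every term in the function refers to something identifiable (a length,
an angle, an acceleration), and under these assumptions the mass and
volume of the ball drop out of the prediction altogether.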

18. One way we cognitive scientists might try to avoid the fate of our
hypothetical connectionist historian and physicist is to claim that
connectionist units DO correspond to something closely related to the
cognitive domain; viz., the neurons of the brain. Whether this is to be
considered an analogy or an actual literal claim is often left vague by
those who suggest it. Most connectionists seem wary of proclaiming too
boldly that their networks model the actual activity of the brain.
McClelland, Rumelhart, and Hinton (1986), for instance, say that
connectionist models "seem much more closely tied to the physiology of
the brain than other information-processing models" (p. 10), but then
they retreat to saying that their "physiological plausibility and
neural inspiration...are not the primary bases of their appeal to us"
(p. 11). Smolensky (1988), after having examined a number of possible
mappings, writes that "given the difficulty of precisely stating the
neural counterpart of components of subsymbolic [i.e., connectionist]
models, and given the very significant number of misses, even in the
very general properties considered..., it seems advisable to keep the
question open" (p. 9). Only with this caveat in place does he then go
on to claim that "there seems no denying, however, that the
subconceptual [i.e., connectionist] level is SIGNIFICANTLY CLOSER
[emphasis added] to the neural level than is the conceptual [i.e.,
symbolic] level" (p. 9). Precisely what metric he is using to measure
the "closeness" of various theoretical approaches to the neural level
of description is left unexplicated.

19. The general aversion to making very strong claims about the
relation between connectionist models and the brain is not without good
reason. Crick and Asanuma (1986) describe five properties that the
units of connectionist networks typically have that are rarely or never
seen in neurons, and two further properties of neurons that are rarely
found in the units of connectionist networks. Perhaps most important of
these is the fact that the success of connectionist models seems to
DEPEND upon the fact that any given unit can send excitatory impulses
to some units and inhibitory impulses to others. No neuron in the
mammalian brain is known to do this (though "dual-action" neurons have
been found in the abdominal ganglion of Aplysia; see Levitan &
Kaczmarek, 1991, pp. 196-197). Although it is certainly possible that
dual-action neurons will be found in the human brain, the vast majority
of cells do not seem to have this property, whereas the vast majority
of units in connectionist networks typically do. Even as strong a
promoter of connectionism as Paul Churchland (1990, p. 221) has
recognized this as a major hurdle to be overcome if connectionist nets
are to be taken seriously as models of brain activity. What is more,
despite some obvious but possibly superficial similarities between the
structure of connectionist units and the structure of neurons, there is
currently little hard evidence that any SPECIFIC aspect of cognition is
instantiated in the brain by neurons arranged in any SPECIFIED
connectionist configuration.

20. It would accordingly appear that at present the only way of
interpreting connectionist networks as serious candidates for theories
of cognition would be as literal models of the brain activity that
underpins cognition. This means, if Crick and Asanuma are right in
their critique, that connectionists should start restricting themselves
to units, connections, and rules that use all and only principles that
are known to be true of neurons. Other interpretations of connectionist
networks may be possible in principle, but at this point none seem to
have appeared on the intellectual horizon [4]. Without such an
interpretation, connectionist modelers are left more or less in the
position of our hypothetical connectionist historian. Even a simulation
that is successful in terms of transforming certain inputs into the
"right" outputs does not tell us much about the cognitive process it is
simulating unless there is a plausible interpretation of its inner
workings. All the researcher can claim is that the success of the
simulation confirms that SOME connectionist architecture is involved,
and perhaps something very general about the nature of that
architecture (e.g., that it is self-organizing, recurrent, etc.).
There is little or no confirmation of the specific features of the
network because so much of it is OPTIONAL.

21. Now, it might be argued that this situation is no different from
that of early atomic theory in physics. Visible bits of matter and
their interactions with other bits of matter were explained by the
postulation of not just thousands, but millions upon millions of
theoretical entities of mostly unknown character -- viz., atoms. This,
the argument would continue, is not so different from the situation in
connectionism. After all, as Lakatos (1970) taught us, new research
programs need a grace period in the beginning to get themselves
established. Although I don't have a demonstrative argument against
this line of thought, I think it has relatively little merit. We know
pretty well what atoms are, and where we would find them, were we able
to achieve the required optical resolution. Put very bluntly, if you
simply look closer and closer and closer at a material object, you'll
eventually see the atoms. Atoms are, at least in that sense, perfectly
ordinary material objects themselves. Although they constitute an
extension of our normal ontological categories, they do not REPLACE an
old well-understood category with a new ill-understood one.[5]

22. By contrast, the units of connectionist networks (unless identified
with neurons, or other bits of neural material) are quite different.
They are not a REDUCTION of mental concepts, and as such give us no
obvious path to follow to get from the "high level" of behavior and
cognition to the "low level" of units and connections. That it is not a
REDUCTIVE position is in fact often cited as a STRENGTH of
connectionism but, if I am right, it is also the primary source of the
ontological problems that have been discussed here.

23. To conclude, it is important to note that I am not arguing that
connectionist networks must give way to symbolic networks because
cognition is inherently symbolic (see, e.g., Fodor & Pylyshyn, 1988).
That is an entirely independent question. What I am suggesting,
however, is that the apparent success of connectionism in domains where
symbolic models typically fail may be due as much to the huge number of
additional "degrees of freedom" that connectionist networks are
afforded by virtue of the blanket claim of distributed representation
across large numbers of uninterpreted units, as it is to any inherent
virtues that connectionism has over symbolism in explaining cognitive
phenomena.

FOOTNOTES

[1] There are many ways of studying implicit memory. The experiment I
describe here is, I believe, a "classic" procedure, but by no means the
only one.

[2] A Psycoloquy reviewer of this paper suggested that it is not the
individual units that are theoretical entities, but only the units as a
general type. He explicitly compared the situation to that of
statistical dynamics, in which the phenomena are said to result from
the actions of large, but unspecified, numbers of molecules of a
general type. The difference is, of course, that we have lots of
independent evidence of the existence of molecules. We know quite a lot
about their properties. The same cannot be said of units in
connectionist networks. Their existence is posited SOLELY for the
purpose of making the networks behave the way we want them to. There is
no independent evidence of their properties or their existence at all.

[3] Notice that a version of the Sorites paradox threatens here. There
must come a point where the subtraction of a unit from the network
would lead to a decrement in its performance, but typically
connectionist researchers work well above this level in order to
optimize learning speed and generalization.

[4] One Psycoloquy referee suggested that units might correspond to
small neural circuits rather than individual neurons. This might be so,
but the evidential burden is clearly on the person who makes this
proposal to find some convincing empirical evidence for it.

[5] There may be a temptation to push this argument down to the quantum
level, and to claim that it breaks down there because of the physical
impossibility of seeing subatomic particles. First of all, relying on
our intuitions about the quantum world to illuminate other scientific
spheres is a very dangerous move because it is there more than anywhere
that our intuitions seem to fail. Second, the move would fail in any
case because the impossibility at issue is merely PHYSICAL, not
LOGICAL. In a world in which light turned
out to be continuous rather than particulate, the argument would carry
through perfectly well. Put less technically, we know WHERE to see
subatomic particles, we just don't know HOW to see them. The same
cannot be said for units in connectionist networks. They simply don't
seem to refer to ANYTHING in the system being studied at all.

REFERENCES

Atkinson, R. C. & Shiffrin, R. M. (1971) The control of short-term
memory. Scientific American 225:82-90.

Baddeley, A. (1992) Working memory. Science 255:556-559.

Berwick, R. C. (1985) The acquisition of syntactic knowledge. MIT
Press.

Churchland, P. M. (1990) Cognitive activity in artificial neural
networks. In: Thinking: An invitation to cognitive science (Vol. 3),
ed. D. N. Osherson & E. E. Smith, MIT Press.

Fodor, J. A. & Pylyshyn, Z. W. (1988) Connectionism and cognitive
architecture: A critical analysis. Cognition 28:3-71.

Hempel, C. G. (1952) Fundamentals of concept formation in empirical
science. University of Chicago Press.

Jacoby, L. L. (1991) A process dissociation framework: Separating
automatic from intentional uses of memory. Journal of Memory & Language
30:513-541.

Lakatos, I. (1970) Falsification and the methodology of scientific
research programmes. In: Criticism and the growth of knowledge, ed. I.
Lakatos & A. Musgrave, Cambridge University Press.

Levitan, I. B. & Kaczmarek, L. K. (1991) The neuron: Cell and
molecular biology. New York: Oxford University Press.

McClelland, J. L., Rumelhart, D. E., & Hinton, G. E. (1986) The appeal
of parallel distributed processing. In: Parallel distributed
processing: Explorations in the microstructure of cognition (Vol. 1),
ed. D. E. Rumelhart & J. L. McClelland, MIT Press.

Roediger, H. L., III (1990) Implicit memory: Retention without
remembering. American Psychologist 45:1043-1056.

Roediger, H. L., III, & McDermott, K. B. (1993) Implicit memory in
normal human subjects. In: Handbook of neuropsychology (Vol. 8, pp.
63-131), ed. F. Boller & J. Grafman, Elsevier.

Schacter, D. L. (1987) Implicit memory: History and current status.
Journal of Experimental Psychology: Learning, Memory, and Cognition
13:501-518.

Schacter, D. L. (1992) Understanding implicit memory: A cognitive
neuroscience approach. American Psychologist 47:559-569.

Smolensky, P. (1988) On the proper treatment of connectionism.
Behavioral and Brain Sciences 11:1-73.

Tulving, E. (1985) How many memory systems are there? American
Psychologist 40:385-398.