AI, NN, Neurobiology, architectures and design space
Aaron Sloman
aarons at cogs.sussex.ac.uk
Fri Dec 28 17:40:02 EST 1990
I'd like to make some comments on the recent discussions from the
standpoint of a philosopher who has dabbled in various aspects of AI for
about 20 years and believes that in principle it should be possible (in
the VERY distant future) to replicate all or most interesting features
of human minds in machines of some kind, because I don't believe in
magic of any kind. Also I have seen several fashions come and go.
I'll start with some background comments before getting to more meaty
matters (bringing together, and extending, some of the points already
made by other people).
1. Despite many written and spoken words about the differences between
Connectionism and AI, I think it is clear that PDP/NN/Connectionist
models have FAR more in common with AI models than they have with human
brains, both in terms of what they do and how they work, and in terms of
what they don't (yet) do (see below).
Unfortunately people who know little about AI (e.g. those who think that
expert systems and automatic theorem provers exhaust AI, because they
don't know about AI work on speech, vision, robotics, numeric-based
learning systems, etc.) are easily misled into believing exaggerated
claims about the differences.
A good antidote for such exaggerations is the technical report "Symbol
Processing Systems, Connectionist Networks, and Generalized
Connectionist Networks," by Honavar and Uhr, recently announced in this
mail forum. Another good antidote is to survey IJCAI (International
Joint Conference on AI) proceedings over the years.
Some of the authors of narrowly focussed text-books on AI are partly to
blame for the misconceptions about the scope of AI (e.g. books that
present AI as logic).
(Heated and noisy but fairly vacuous, and often transient, disputes
between factions or fashions are quite common in science and academe. I
leave it to others to offer sociological explanations in terms of
patterns of mutual excitation and inhibition.)
2. I am not saying that there are no differences between NNs and other
AI models, but that the technical, scientific, and philosophical
significance of the differences has been exaggerated. E.g. I think
there is NO significant philosophical difference. The biological
differences are more concerned with what they are trying to explain than
with correctness as models. The technical differences are still barely
understood. (More on this below).
NNs are closer to some non-NN AI models (e.g. in speech and vision) than
those AI models are to other non-NN AI models (e.g. in theorem proving,
natural language processing). E.g. so-called sub-symbolic, sub-cognitive,
or micro-features in NNs have much in common with low
level or intermediate representations in vision. Distributed
representations have much in common with intermediate databases in
vision programs. There's also a loose analogy between distributed
representations and theorems implicit in (=distributed over?) an axiom
set.
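To make the last analogy concrete, here is a deliberately trivial sketch
(in Python-like pseudo-code, runnable as written; the axioms and rules
are invented purely for illustration): the "theorem" mortal(socrates) is
stored nowhere, yet it is recoverable from the axiom set as a whole,
rather as an item in a distributed representation is recoverable from a
pattern spread over many units.

    # Facts "distributed over" an axiom set: mortal('socrates') is
    # nowhere stored explicitly, but is implicit in the axioms as a
    # whole, recoverable by forward chaining over the rules.
    axioms = {"human": {"socrates", "plato"}}
    rules = [("human", "mortal"), ("mortal", "will_die")]

    def closure(axioms, rules):
        derived = {k: set(v) for k, v in axioms.items()}
        changed = True
        while changed:
            changed = False
            for pre, post in rules:
                for x in list(derived.get(pre, set())):
                    if x not in derived.setdefault(post, set()):
                        derived[post].add(x)
                        changed = True
        return derived

    print(closure(axioms, rules)["mortal"])  # {'socrates', 'plato'}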
Like some of the earlier discussants, I see all AI and NN work as
exploring sub-regions in the space of possible explanatory designs. We
understand too little of this space to know where the major
discontinuities are.
3. Concerning details of brain structure, both seem miles off.
Jim Bower wrote (Thu, 27 Dec 90 13:01:35 PST)
| .....As a neurobiologist, however, I would assert
| that even a cursory look at the brain reveals a structure having
| very little in common with connectionist models.
The same can be said about existing AI models and intelligence. I'll put
the point differently: it is clear from even a cursory study of
literature on the anatomy and physiology of the brain that far more
complex and varied designs exist than anyone has yet modelled.
The same conclusion can be reached on the basis of cursory reflection on
the differences between the abilities of people, squirrels,
nest-building birds, etc. and the abilities of current AI or NN models.
(To say nothing of what neurobiologists can explain!)
4. People working "at the coalface" in a scientific or technical
discipline often find it hard to see the limitations of what they are
doing, and therefore exaggerate its importance. Drew McDermott once
wrote a paper called "Artificial Intelligence meets natural stupidity"
(reprinted in John Haugeland, ed. Mind Design, MIT press 1981),
criticising (from an AI standpoint) AI researchers who, among other
things, use words and phrases from ordinary language as "wishful
mnemonics", e.g. "Goal", "Understand", "Planner", "General Problem
Solver". Much of what he says can be applied equally to NN research
where words like "Learn", "Recognise" and "Interpret" are used to
describe mechanisms that do little more than map vectors from one space
into another, or store vectors using content-addressable memories.
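To underline how modest such mechanisms can be, here is a minimal sketch
(Python-like pseudo-code, runnable as written; my illustration, not
anyone's published model): a one-layer network trained by the delta
rule. What gets called "learning" here is, mathematically, just the
iterative fitting of a linear map between two vector spaces.

    # What "learning" amounts to in a very simple NN: fitting a linear
    # map W from input vectors to target vectors by the delta rule.
    import random

    def train(pairs, n_in, n_out, rate=0.1, epochs=200):
        # W[i][j] is the weight from input j to output unit i.
        W = [[random.uniform(-0.1, 0.1) for _ in range(n_in)]
             for _ in range(n_out)]
        for _ in range(epochs):
            for x, t in pairs:
                y = [sum(W[i][j] * x[j] for j in range(n_in))
                     for i in range(n_out)]
                for i in range(n_out):
                    for j in range(n_in):
                        W[i][j] += rate * (t[i] - y[i]) * x[j]
        return W

    # "Recognising" two patterns: really just vector association.
    pairs = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])]
    W = train(pairs, n_in=2, n_out=1)
    print([sum(w * c for w, c in zip(W[0], x)) for x, _ in pairs])
    # approximately [1.0, 0.0]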
Maybe a prize should be offered for the best essay by a student
re-writing Drew's paper with a title something like "Artificial nets
meet dumb brains"?
5. I am not attacking NN research: I am merely pointing out
commonalities between NN and AI research. The limitations of both are to
be expected: because most scientific research is inevitably piecemeal,
and progress depends in part on systematic exploration of detail,
researchers commonly focus on tiny sub-mechanisms without considering
the global architecture within which those mechanisms might function. (I suspect
this is equally true of people who study real brains, but I don't know
their work so well. It is certainly true of many psychologists.)
By consideration of the "global architecture" I mean the study of the
capabilities of the whole system, and the analysis of a system into
distinct sub-systems or sub-mechanisms with different causal roles
(defined by how they interact with the environment and with other
sub-systems), contributing to (and explaining) the global capabilities
of the whole system. (I think this is closely related to what Jim
Hendler wrote: Thu, 20 Dec 90 09:18:09 -0500).
This study of global architecture is a very hard thing to do, especially
when the functional decomposition may not be closely related to
anatomical decomposition. Much of the analysis therefore has to be
inspired by a design standpoint (how could we make something like this?). This
is best done in parallel with the more detailed studies: with feedback
between them.
Notes:
5.1. Don't assume that the division into sub-systems has to be rigidly
defined: sub-mechanisms may share functions or change functions.
5.2. Don't assume that every system has a fixed architecture: one
important kind of capability may be creation, destruction or
modification of sub-structures or sub-mechanisms. Some of these may be
virtual machines of changing complexity implemented in a physical
mechanism of fixed complexity: a common feature of sophisticated
software. Perhaps the conceptual development of a child is best
understood as the development of new (virtual?) sub-mechanisms that make
new processes possible: e.g. percepts or thoughts of increasing
complexity, more complex motivational patterns, etc.
5.3. Don't assume that there is only one level of decomposition into
sub-systems. A proper understanding of how the system works may require
some of the sub-systems to be thought of as themselves implemented in
lower level mechanisms with different capabilities. There may be many
levels.
5.4. Don't assume there's a fixed separation of levels of
implementation: some of the very high level functionality of a large
scale sub-mechanism may be closely coupled with some very low level mechanism.
An example might be chemical processes that alter speed of processing,
or turn whole sub-mechanisms on or off. (How does alcohol alter decision
making?) (Compare a computer whose programs are run by a microcode
interpreter that can be altered by those very programs, or an
interpreter that uses subroutines written in the language being interpreted.)
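A toy illustration of such level-crossing (a sketch of mine in
Python-like pseudo-code, runnable as written; not a serious model of
anything biological): an interpreter whose instruction table is an
ordinary data structure in the very language it interprets, so that the
interpreted program can redefine how its own instructions are executed.

    # Close coupling between implementation levels: the interpreter's
    # dispatch table is visible to, and modifiable by, the program
    # being interpreted ("redef" rewrites the interpreter itself).
    def run(program):
        env = {"acc": 0}

        def do_add(arg):
            env["acc"] += arg

        def do_print(arg):
            print(env["acc"])

        def do_redef(arg):
            # The program alters the mechanism that runs it,
            # like self-modifying microcode.
            name, fn = arg
            ops[name] = fn

        ops = {"add": do_add, "print": do_print, "redef": do_redef}

        for op, arg in program:
            ops[op](arg)

    run([
        ("add", 3),
        ("print", None),                     # prints 3
        ("redef", ("add", lambda a: None)),  # "add" now does nothing
        ("add", 100),
        ("print", None),                     # still prints 3
    ])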
6. It's clear that the global architecture of human beings includes a
lot of coarse-grained parallelism. I.e. there are often many concurrent
processes, e.g. simultaneous walking, talking, thinking, eating,
hearing, seeing, scratching one's ear, feeling hungry, feeling cold,
etc. to say nothing of the regulation of internal physiological
processes we are not aware of, or the decomposition of processes like
walking, or talking, into different concurrent sub-processes (generating
motor control signals, internal monitoring, external sensory monitoring,
etc. etc.) Moreover a process that might have one label in ordinary
language (e.g. "seeing") can simultaneously perform several major
sub-functions (e.g. providing optical flow information for posture
control, providing information about the nearby 3-d structure of the
environment for short term motor planning, providing 2-d alignment
information for checking direction of movement, providing information
for future path-finding, providing enjoyment of the scenery, etc. etc.)
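A crude computational gloss on "coarse-grained parallelism" (Python-like
pseudo-code, runnable as written; the sub-process names and numbers are
invented labels, not a theory): several functionally distinct
sub-processes run concurrently over shared information, each at its own
characteristic rate.

    # Coarse-grained parallelism: functionally distinct sub-processes
    # running concurrently over shared state, at different rates.
    import threading, time

    state = {"optic_flow": 0.0, "goal": "reach shelf", "stop": False}

    def posture_control():
        while not state["stop"]:
            # fast loop: balance correction (would drive muscles)
            correction = -0.5 * state["optic_flow"]
            time.sleep(0.01)

    def motor_planning():
        while not state["stop"]:
            # slower loop: revise the current plan toward the goal
            plan = ["turn", "step", "reach"] if state["goal"] else []
            time.sleep(0.1)

    threads = [threading.Thread(target=posture_control),
               threading.Thread(target=motor_planning)]
    for t in threads:
        t.start()
    time.sleep(0.5)
    state["stop"] = True
    for t in threads:
        t.join()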
I suspect that it would be quite useful for a new sub-stream of AI
research to address the problem of how best to think of the high-level
decomposition of a typical human mind. (Re-invent faculty psychology
from an engineering standpoint?)
Although some AI people have always emphasised the need to think about
complete systems (the motivation for some AI robot projects), it has
simply not been technically possible to aim at anything but vastly
oversimplified designs. Roboticists don't normally include the study of
motivation, emotions, visual enjoyment, conceptual learning, social
interaction, etc. etc. So very little is known about what kind of high
level architecture might be required for designing something close to
human capabilities.
The techniques of software engineering (e.g. requirements analysis)
coupled with philosophical conceptual analysis and surveys of what is
known from psychology, linguistics, anthropology, etc. might eventually
lead us to some useful conjectures about the global architecture that
could then be tested by a combination of implementational experiments
(using AI, NN, or any other relevant techniques) and directed
neurobiological studies. (I've been trying to do this recently in
connection with attitudes, motivation, emotions and attention.)
7. Top-down illumination from this kind of architectural analysis may be
required to give some direction (a) to conventional AI (since the space
of possible software systems is too vast to be explored only bottom up)
(b) to the exploration of NNs (since the space of possible networks,
with different topologies, different self-modification algorithms,
different threshold functions, etc. etc. is also too vast to be explored
bottom up) and (c) to the study of real brains, since without good
hypotheses about what a complex and intricate mechanism might be for,
and how its functions might be implemented, it is too easy to
concentrate on the wrong features: e.g. details that have little
relevance to how the total system works. (Like measuring the shape,
density, elasticity, etc. of something because you don't know that it's
primarily a resistor in an electronic circuit. How can neurobiologists
tell whether they are making this sort of mistake?)
8. The formal study of global architectures is somewhat different from
mathematical analysis of formalisms or algorithms in computer science,
and different from the kind of numerical mathematics that has so far
dominated NN research. It will require at least a considerable
generalisation of the mathematics of control theory to incorporate
techniques for representing mutual causal interactions between systems
undergoing qualitative and structural changes that cannot be
accommodated in a set of differential equations relating a fixed set of
variables.
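To make the contrast explicit: the classical framework assumes something
of the standard form

    \dot{x}(t) = f(x(t), u(t)), \qquad y(t) = g(x(t))

with state x, input u, and output y ranging over spaces of fixed
dimension. The generalisation required would allow the set of variables
itself, and the coupling structure f, to change as sub-mechanisms are
created, destroyed, or reorganised: structural change of that sort has
no place in the fixed-variable format.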
It will probably require the invention (or discovery?) of a host of new
organising technical concepts, roughly analogous to the earlier
discovery of concepts like feedback, information (in the mathematical
sense), formal grammars, etc. (I think Lev Goldfarb was saying something
similar (Sat, 22 Dec 90 00:57:29 AST), but I am not sure I understood it
right.)
9. Minimising the significance of the AI/NN divide:
I conjecture that most of the things that interest psychologists and
cognitive scientists about human beings, e.g. our ability to perceive,
learn, think, plan, act, communicate, co-operate, have desires, have
emotions, be self-aware, etc. etc. depend more on the global
architecture (i.e. how many co-existing, functionally distinct, causally
interacting, sub-mechanisms there are, what their causal relationships
are, and what functions they support in the total system) than on the
implementation details of sub-mechanisms.
It is not obvious what difference it makes how the various
sub-components are implemented. E.g. for many components the difference
between an NN implementation and a more conventional AI implementation
may make a difference to speed (on particular electronic technology), or
flexibility, or reliability, or modifiability -- differences that are
marginal compared with the common functionality that arises not from the
implementation details of individual systems but from the causal
relations with other parts of the whole system. (Philosophical,
scientific, and engineering issues converge here.)
(Compare replacing one make of capacitor or resistor in an electronic
circuit with another that has approximately the same behaviour: if the
circuit is well designed, the differences in detailed behaviour will not
matter, except perhaps in highly abnormal conditions, e.g. high
temperatures, high mechanical stress, or when exposed to a particular
kind of corrosive gas etc. If the gas turns up often, the difference is
important. Otherwise not.)
It is quite likely that different sorts of implementation techniques
will be needed for different sub-functions. E.g. rapid visual-motor
feedback involved in posture control in upright bipeds (who are
inherently very unstable) may be best served by NNs that map input
vectors into output vectors. For representing the main high level steps
in a complex sequence of actions (e.g. tying a shoelace) or for working
out a plan to achieve a number of goals in a richly structured
environment, very different mechanisms may be more suitable, even if
NNs are useful for transforming low-level plan details into signals for
co-operating muscles. NNs as currently studied may not be the best
mechanism for accurate storage of long sequences of items, e.g. the
alphabet, a poem, a dance routine, a memorised piano sonata, etc. When
we have a good theory of the global architecture we'll be in a better
position to ask which sub-mechanisms are best implemented in which
ways.
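The contrast can be made vivid with a deliberately crude sketch
(Python-like pseudo-code, runnable as written; all details invented for
illustration): the posture loop is naturally a fixed vector-to-vector
map, whereas the shoelace routine is naturally an explicit sequence of
steps that can be inspected, reordered, or repaired mid-execution.

    # (a) Posture control: a fixed linear map from sensed sway to
    #     corrective signals -- natural territory for an NN.
    W = [[-0.8, 0.0],
         [0.0, -0.8]]

    def posture_correction(sway):
        return [sum(w * s for w, s in zip(row, sway)) for row in W]

    # (b) Tying a shoelace: an explicit sequence of high-level steps,
    #     open to inspection and revision, each step dispatched to
    #     lower-level (perhaps NN) controllers.
    shoelace_plan = ["cross laces", "pull tight", "make loop",
                     "wrap lace", "push through", "pull loops"]

    print(posture_correction([0.1, -0.2]))  # approx. [-0.08, 0.16]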
However, using a less suitable mechanism for one of the components may,
like changing a resistor in a circuit, produce only a difference in
degree, not kind, of capability for the whole system. (Which is why I
think the NN/AI distinction is of no philosophical significance. This
agrees with Beth Preston (21 Dec 90 16:35:15 EST) but for different
reasons.)
10. The conjecture that the implementation details of sub-mechanisms
are relatively unimportant in explaining global capabilities will be
false (or partly false) to the extent that high level functionality depends
essentially on close-coupling of different implementation levels.
Are the mechanisms that allow alcohol (or other drugs) to produce
qualitative changes in high level processes intimately bound up with
chemical control mechanisms that are essential for normal human
functioning, in the sense that no other implementation would have
worked, or are they side-effects of biologically inessential
implementation details that result from accidents of evolutionary
history? We know that natural heart valves and kidneys can be replaced
by artificial ones made very differently. We don't yet know which brain
sub-mechanisms could also be replaced because most of the detail is
inessential.
When we know more, it may turn out that in order to explain human
behaviour when drugged it is necessary to look at details that are
irrelevant, from the design standpoint, when explaining normal
capabilities.
Of course, some neurobiologists will not accept that two essentially
similar circuits have the same functionality for the same reason if
their components are made differently: but that's just a kind of
scientific myopia. (I am not sure whether Jim Bower was saying that.)
11. It may also turn out that some aspect of the global architecture,
for example the nature of the required causal links between different
sub-mechanisms, favours one kind of implementation over another. Is
there any intrinsic difference in the kind of information flow, or
control flow, that can be implemented (a) between two or more components
linked only by a few high speed parseable byte-streams, (b) between
components linked by shared recursive data-structures accessed via
pointers in a shared name-space, and (c) between two components linked
by a web of connecting fibres? (I am not saying these are the only
options for communication between sub-mechanisms.)
I suspect not: only differences in speed and resistance to physical
damage, etc. But perhaps there are important relevant theorems I don't
know about (perhaps not yet proven?).
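To fix ideas, here is how options (a) and (b) might look in software (a
rough sketch in Python-like pseudo-code, runnable as written; the
components and data are invented): with a byte-stream the sender must
serialise and the receiver must re-parse, and subsequent changes stay
local; with a shared structure both components hold pointers into one
object, so a change made by either is immediately accessible to the
other, with no transmission step at all.

    # (a) vs (b): byte-stream coupling versus shared-structure coupling.
    import json

    plan = {"goal": "fetch", "steps": ["locate", "grasp"]}

    # (a) Parseable byte-stream: serialise, transmit, re-parse.
    stream = json.dumps(plan).encode()      # component 1 emits bytes
    received = json.loads(stream.decode())  # component 2 re-parses
    received["steps"].append("verify")      # change stays local
    print(plan["steps"])                    # ['locate', 'grasp']

    # (b) Shared recursive structure: two names, one object.
    view1 = plan                            # component 1's pointer
    view2 = plan                            # component 2's pointer
    view2["steps"].append("verify")         # no transmission step
    print(view1["steps"])                   # ['locate', 'grasp', 'verify']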
12. There are many problems not yet addressed by either NN or AI
research, and some addressed but not solved. E.g. I suspect that neither
has much to say about the representation of (arbitrary) shapes in visual
systems, such that the representation can both be quickly derived from
sampling the optic array and can also usefully serve a multiplicity of
purposes, including: recognition, finding symmetries, seeing similarity
of structure despite differences of detail, fine motor control, motor
planning, explaining an object's capabilities, predicting points of
contact of moving objects, etc. etc. Adequate representations of spatial
structure and motion may require the invention of quite new techniques.
13. Although not everyone should put all their efforts into this, I
commend the interdisciplinary exploration of the space of possible
global architectures to people working in AI, NN, and neurobiology. (We
may need some help from philosophers, software engineers,
mathematicians, linguists, psychologists, ....)
I fear there are probably a lot more fads and fashions waiting to turn
up.
Apologies: this message grew too long.
Aaron Sloman,
EMAIL aarons at cogs.sussex.ac.uk
aarons%uk.ac.sussex.cogs at nsfnet-relay.ac.uk
aarons%uk.ac.sussex.cogs%nsfnet-relay.ac.uk at relay.cs.net