Mon Jun 5 16:42:55 EDT 2006


about Turing equivalence or lack thereof lose some of their force.
I would hope that everyone would agree that no finite connectionist
hardware could be more powerful than a conventional finite-state
automaton, and it would be nice if everybody could also agree that
no amoeba, no starfish, no koala bear, and no human being can be more
powerful than a finite-state automaton either.  There is, of course,
the question of whether a connectionist machine would be as powerful
as a finite-state automaton, but this strikes me as trivial.  (Is it?)
Some of you may then ask why anybody bothers with Turing machines or
any other machines more powerful than finite-state ones.  For example, what
of all the arguments of Chomsky et al. about NLs not being finite-state?
This leads to the next point.

(3) Equivalence with respect to I/O behavior is not the sum total of
what the theory of computation has taught us.
Thus, while I would claim that no physically realized system can
be more powerful than a finite-state machine in the weak sense,
the same is not true in other senses.  The correct view of the
matter is, I am pretty sure, that machines more powerful than
the finite-state models are useful mathematical tools for proving
results which do in fact apply to physically realized (and hence
really finite-state) systems, but which would be more difficult
to prove otherwise.  Thus, the claim that NLs are not finite-state,
for example, should really be taken to mean that NLs have constructions
(such as center-embedding or, much more commonly, repetition) with the
following feature: given a finite-state machine of the usual sort, the
size of the machine must grow in proportion to the size of the finite
set of tokens of such a construction that it is to recognize.  Hence,
in a sense which can be made more precise, one wants to say that the
finite-state machine cannot "know" these patterns.  On the other hand,
a machine with pushdown storage for center-embedding or queue storage
for repetition, even if it is strictly
finite and so only recognizes a finite set of such tokens, can
be modified only slightly to recognize a larger such set (the
modification would consist in extending the storage by a cell).
In a sense that can be made precise, such a machine "knows" the infinite
pattern even though it can only recognize at any given time a finite
set of tokens of the pattern.  It has always seemed more convenient
to state the relevant results in terms of idealized machines which
actually have infinite memory, but in reality we are talking about
machines that are finite-state in power but have a cleverer kind of
design, which allows them in a sense to "know" more than a conventional
finite-state machine.  Thus, we have to have a broader conception of
what mathematical equivalence is.  A finite-state machine is weakly
equivalent to a pushdown machine with bounded storage, yet there is
a well-defined sense in which the latter is more powerful.
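
To make the contrast concrete, here is a small sketch in Python (the toy
pattern a^n b^n stands in for center-embedding, and the names are mine,
purely by way of illustration).  The explicit finite-state machine has to
be rebuilt with additional states every time the bound is raised, while
the bounded pushdown recognizer only gets one more cell of storage:

    def build_fsm(bound):
        """Explicit finite-state machine for a^n b^n with 1 <= n <= bound.
        The transition table has a distinct state for every depth, so
        raising the bound means rebuilding the machine wholesale."""
        trans = {}
        for k in range(bound):                    # reading the a's
            trans[(('a', k), 'a')] = ('a', k + 1)
        for k in range(1, bound + 1):             # switching to, and reading, the b's
            trans[(('a', k), 'b')] = ('b', k - 1)
            trans[(('b', k), 'b')] = ('b', k - 1)
        return trans, ('a', 0), ('b', 0)          # table, start state, accept state

    def run_fsm(machine, s):
        trans, state, accept = machine
        for ch in s:
            state = trans.get((state, ch))
            if state is None:
                return False
        return state == accept

    def bounded_pushdown(s, cells=3):
        """Pushdown recognizer for the same pattern with strictly finite storage.
        To handle deeper embeddings only the storage is extended by a cell
        (raise `cells`); the control program itself never changes."""
        stack, i = [], 0
        while i < len(s) and s[i] == 'a':
            if len(stack) == cells:               # storage exhausted
                return False
            stack.append('a')
            i += 1
        if not stack:
            return False
        while i < len(s) and s[i] == 'b' and stack:
            stack.pop()
            i += 1
        return i == len(s) and not stack

Going from a bound of 3 to a bound of 4 means calling build_fsm(4) and
getting a genuinely new transition table, whereas bounded_pushdown(s, cells=4)
is the same program with one more cell.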

(4) Hence, it would be useful to know whether connectionist models
(the theoretical ones) are equivalent to Turing machines or
at least to some class of machines that can handle center-embedding,
repetition, and a few other non-finite-state properties of NLs, for
example.  For the idea would be that this would tell us IN WHAT SENSE
connectionist models (those actually implementable) are equivalent
to finite-state machines.  The crucial point is this: a finite-state
machine designed to handle repetition must be drastically altered
each time we increase the size of the repetitions we want it to handle.  A simple
TM (or a queue machine) with a bound on its memory is I/O equivalent
to a finite-state machine, but in order for it to handle larger and
larger repetitions, it suffices to keep extending its tape (queue),
a much less radical change (in a well-defined sense) than that required
in the case of a finite-state machine.  Turning back to connectionist
models, the question then is whether, in order to handle non-finite-state
linguistic constructions (or other such cognitive tasks), they have to be altered as
radically as finite-state machines do (in effect, by adding new states)
or less radically (as in the case of TMs and other classes of automata
traditionally assumed to have infinite memory).
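
Again purely by way of illustration (the copy language w w over {a, b} as a
stand-in for repetition, and the names are mine), a recognizer with a bounded
queue; a genuine queue automaton would have to guess the midpoint
nondeterministically, but the point about extendibility survives the shortcut:

    from collections import deque

    def bounded_queue_ww(s, cells=4):
        """Recognizer for the repetition pattern w w with a strictly bounded
        queue.  The midpoint is computed directly here as a shortcut.
        Handling longer repetitions requires only raising `cells`, i.e.
        extending the queue by a cell; the control program never changes."""
        if not s or len(s) % 2 != 0:
            return False
        half = len(s) // 2
        if half > cells:                  # storage exhausted
            return False
        q = deque()
        for ch in s[:half]:
            q.append(ch)                  # enqueue the first half
        for ch in s[half:]:
            if q.popleft() != ch:         # dequeue and compare, first in first out
                return False
        return True

A conventional finite-state machine for the same bounded pattern would need
on the order of 2^cells states (one for each possible first half), all of
which would have to be added by hand whenever the bound grows.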

(5) Perhaps I should add, by way of analogy, that there are many other
situations where it is more clearly understood that the theory of
computation deals with a class of formalisms that cannot be physically
realized but the results are really intended to tell us about formalisms
that are realizable.  The case of nondeterminism is an obvious one.
Nondeterministic machines (in the sense used in automata theory; this
is quite different from nondeterminism in physical theory or in colloquial
usage) cannot be physically realized.  But it is convenient to be able to
pull tricks such as (a) prove the equivalence of regular grammars to
nondeterministic finite-state machines, (b) prove the equivalence of
nondeterministic and deterministic finite-state machines, and (c) conclude
that regular grammars might be a useful tool for studying
physically realizable deterministic finite-state machines.  It is
much easier to do things this way than to do things directly, in many
cases, but the danger is that people (even the initiated and certainly
the uninitiated) will then assume either that the impossible devices
are really possible or that the theory is chimerical.  I claim that this is
precisely what has happened with the concept of machines with infinite
memory (such as Turing machines).
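
The trick in (b), for instance, is the familiar subset construction; here is
a minimal sketch (again in Python, again only illustrative, with epsilon
moves omitted for brevity):

    def nfa_to_dfa(alphabet, delta, start, accepting):
        """Subset construction: from a nondeterministic finite-state machine
        (delta maps (state, symbol) to a set of states) to an equivalent
        deterministic one whose states are sets of the original states."""
        start_set = frozenset([start])
        dfa_trans, dfa_accept = {}, set()
        todo, seen = [start_set], {start_set}
        while todo:
            current = todo.pop()
            if current & accepting:               # any accepting NFA state inside?
                dfa_accept.add(current)
            for sym in alphabet:
                nxt = frozenset(q for p in current for q in delta.get((p, sym), ()))
                dfa_trans[(current, sym)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return dfa_trans, start_set, dfa_accept

    # Example: nondeterministic machine for strings over {a, b} ending in "ab".
    delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}}
    table, start, accept = nfa_to_dfa('ab', delta, 0, {2})

Only three subsets of states turn out to be reachable in this example, which
is a small instance of why it is "much easier to do things this way."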

(6) Given what has been said, we can I think also make sense of the
results that show that certain connectionist models are MORE
powerful than Turing machines.  This might seem to contradict
the Church-Turing thesis (which I have been assuming throughout),
but in fact the Church-Turing thesis implicitly refers only to
situations in which time and memory may be unbounded while the
algorithm itself is finitely specified, whereas the results I am
alluding to deal with (theoretical) models of nets that are themselves infinite.
In more traditional terms, these would correspond to machines like
finite-state machines but with an infinite number of states.  There
is no question but that with an infinite number of states you could
do all sorts of things that you cannot do with a finite number,
but it is unclear to me that such results offer any comfort to the
proponents of net architectures.

(7) Finally (at least for now), it seems to me that there remain
significant gaps between what the theory of computation tells us about
and what we need when we try to understand the behavior of complex
systems (not only human beings but even real computer
software and hardware!) even after some of the confusions I have been
trying to dispel are dispelled.  While there are many
formal notions of equivalence beyond that of I/O equivalence which
can make our discussion of theoretical models of language or cognition
more useful (higher-level if you will) without sacrificing precision,
I don't think that we have developed the mathematical definitions that
we need to really capture what goes on in the highest levels of current
theoretical thinking.  Thus, I would not want to say that the question
of the equivalence of connectionist and standard models can be put to
rest with current tools of the theory of computation.  Rather, I
would say that (a) at certain levels the two are equivalent (assuming
somebody can do the necessary work of drafting and proving the theorems),
(b) at certain other levels they are not, (c) the level at which
most people in the field are thinking about the problem intuitively
has probably not been formalized, (d) only if this level is formalized
will we know what we are really talking about, and it is only by deepening
the mathematical models that we will achieve this, not by scuttling them,
and (e) ultimately the answers we want will have to come from a marriage
of a more complete mathematical theory with a richer empirical science of
cognition.  But neither is achievable without the other, for

         Every science gets the math that it deserves,
                       and vice versa.

Alexis Manaster-Ramer
POB 704
IBM Research
Yorktown Heights NY 10598
amr at ibm.com
(914) 789-7239

