Connectionist symbol processing: any progress?

Mitsu Hadeishi mitsu at ministryofthought.com
Thu Aug 13 16:11:14 EDT 1998


I'm not quite sure I agree with your analysis.  Since I haven't looked
at it in great detail, I present this as a tentative critique of your
presentation.

Since recurrent ANNs can be made to carry out any Turing operation
(i.e., modulo their finite size, they are Turing equivalent), unless
you are saying that your representation cannot be implemented even on
a Turing machine, it is clearly NOT the case that recurrent ANNs
*cannot* learn arbitrary symbolic representations.  Not having looked
at your scheme in detail, I of course don't know whether it is somehow
unimplementable even on a Turing machine, but it seems to me you
cannot be claiming this.
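Just to give the flavour of what that equivalence buys you, here is a
throwaway sketch (entirely my own construction, deliberately the
simplest possible case, and not anything from your papers): a single
recurrent "counter" unit whose real-valued state serves as memory,
enough to recognize balanced parentheses, a non-regular and therefore
genuinely "symbolic" decision that no fixed finite-state device can
make.  The published Turing-equivalence constructions push the same
idea much further.

def balanced(string):
    c = 0.0    # recurrent counter state (self-connection 1; input weight +1/-1)
    ok = 1.0   # threshold latch: stays 1 as long as c has never gone negative
    for ch in string:
        x = 1.0 if ch == '(' else -1.0     # input encoding
        c = c + x                          # linear recurrent update
        ok = min(ok, 1.0 if c >= 0.0 else 0.0)
    return ok == 1.0 and c == 0.0          # balanced iff latch never tripped
                                           # and the final count is zero

print(balanced('(()())'))    # True
print(balanced('())('))      # False
print(balanced('((())'))     # False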

You seem to be basing your argument on the notion that the input space
to a recurrent ANN is a set of numbers, which you interpret as the
coordinates of a vector.  However, this is only a kind of vague
analogy, since the field operations of the vector space (addition,
multiplication, etc.) have no clear meaning on the input space.
"Adding" two input vectors does not necessarily result in anything
meaningful except in the sense that the recurrent ANN to be useful
must be locally stable with respect to small variations in the input.
However, the actual structure or metric of the input space is in some
sense determined not a priori but by the state of the recurrent ANN
itself, and can change over time both as a result of training and as a
result of iteration.  The input space is numbers, yes, but that
doesn't make it a vector space.  For example, what properties of the
input would be preserved if I added, say, the vector (10^25, 10^25,
...) to the input?  If the input space really were a vector space,
that operation would preserve something sensible, some symmetry, and
it obviously does not.  Thus, while I sympathize with your claim that
the vector-space structure of R^n does not supply the structure needed
to make much symbolic structure visible, this in itself by no means
dooms connectionist symbol processing.
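To put a number on the (10^25, 10^25, ...) point, here is a throwaway
numeric sketch of my own, using an arbitrary little random tanh
network, so only the qualitative behaviour matters: adding a huge
constant vector drives every hidden unit into saturation, and the
output no longer depends on the original input at all.

import math, random

random.seed(0)
# Arbitrary 4-8-2 tanh network with random weights (no training involved).
W1 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
W2 = [[random.gauss(0, 1) for _ in range(8)] for _ in range(2)]

def net(x):
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

x = [0.3, -1.2, 0.7, 0.1]
shift = [1e25] * 4    # the "vector space" operation in question

print(net(x))                                    # depends on x
print(net([a + s for a, s in zip(x, shift)]))    # every tanh pinned at +/-1 ...
print(net([s - a for a, s in zip(x, shift)]))    # ... identical output: the
                                                 # result no longer depends on x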

Your argument does have weight when applied to a single-layer
perceptron, which is, after all, just a thresholded/distorted linear
transformation.  Although it seemed to take the early connectionist
community by surprise, it should be no surprise at all that a
single-layer perceptron cannot learn the parity problem, because
obviously the parity problem is not linearly separable, and how could
any linear discriminator possibly learn a non-linearly-separable
problem?  However, we do not live in a world of single-layer
perceptrons.  Because networks are more complex than this, arguments
about the linearity of the input space seem to me rather irrelevant.
I suspect you mean something else, however.
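(As an aside, that failure takes a minute to reproduce.  This
throwaway sketch of mine runs the classic perceptron learning rule on
two-bit parity, i.e. XOR, and necessarily finishes with at least one
of the four cases misclassified, because no single hyperplane
separates them.)

# Two-bit parity (XOR): the 1s and 0s are not linearly separable, so one
# thresholded linear unit cannot represent the function, let alone learn it.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w, b, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for epoch in range(1000):              # far more epochs than XOR "should" need
    for x, target in DATA:
        err = target - predict(x)      # classic perceptron update rule
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b    += lr * err

errors = sum(predict(x) != t for x, t in DATA)
print("misclassified after training:", errors, "of 4")   # always at least 1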

I think the intuitive point you are perhaps trying to make is that
symbolic representations are arbitrarily nestable (recursively
recombinable), and an input space with a fixed number of dimensions
cannot handle recursive combinations all at once.  However, one can
use time sequencing to get around this problem, as we all do when we
read and write.  Rather than making our eyes capable of taking in
arbitrarily recombinable input in a single glance, we sequence the
input to our eyes by reading material over time.  The same trick can
be used with recurrent networks, as in the sketch below.
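To tie this back to the parity problem quoted below, here is a
hand-wired toy of my own (not a learned network, and not anything from
Lev's message): a fixed two-unit recurrent state reads the string one
symbol at a time and ends up deciding membership in the parity class C.

def step(v):
    return 1.0 if v > 0.5 else 0.0

def in_parity_class(string):
    s = 0.0                            # recurrent state: 1 iff an odd number
                                       # of 'b's has been seen so far
    for ch in string:
        x = 1.0 if ch == 'b' else 0.0  # an 'a' leaves the state alone
        h1 = step(s - x)               # fires iff s = 1 and x = 0
        h2 = step(x - s)               # fires iff x = 1 and s = 0
        s = step(h1 + h2)              # OR of the two, i.e. XOR(s, x)
    return s == 0.0                    # even number of 'b's -> in the class

for w in ['aababbbaabbaa', 'baabaaaababa', 'aaa', 'aab']:
    print(w, in_parity_class(w))       # True, True, True, False

Whether such a representation can be *learned* without smuggling in
the very structure being sought is, I take it, your real question, and
this sketch does not speak to that.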

Mitsu

Lev Goldfarb wrote:

> David,
> I'm afraid I haven't got the "good news", but, who knows, some good may
> still come out of it.
>
> About 8-9 years ago, soon after the birth of the connectionists mailing
> list, there was a discussion somewhat related to the present one. I recall
> stating, in essence, that it doesn't make sense to talk about
> connectionist symbol processing simply because the connectionist
> representation space--the vector space over the reals--by its very
> definition (recall the several axioms that define it) doesn't allow one to
> "see" practically any symbolic operations, and therefore one cannot
> construct, or learn, in it (without cheating) the corresponding inductive
> class representation. I have been reluctant to put a substantial effort
> into a formal proof of this statement since I believe (after so many years
> of working with the symbolic data) that it is, in some sense, quite
> obvious (see also [1-3]).
>
> Let me try, again, to clarify the above. Hacking apart, the INPUT SPACE of
> a learning machine must be defined axiomatically, as is the now universal
> practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of
> the learning machine, i.e. the bias related to the class of permitted
> object operations (compare with the central CS concept of abstract data
> type). There could be, of course, other, additional, biases related to
> different classes of learning algorithms each operating, however, in the
> SAME input space (compare, for example, with the Chomsky overall framework
> for languages and its various subclasses of languages).
>
> It appears that the present predicament is directly related to the fact
> that, historically, in mathematics, there was, essentially, no work done
> on the formalization of the concept of "symbolic" representation space.
> Apparently, such spaces are nontrivial generalizations of the classical
> representation spaces, the latter being used in all the sciences and having
> evolved from the "numeric" spaces. I emphasize "in mathematics" since
> logic (including computability theory) does not deal with the
> representation spaces, where the "representation space" could be thought
> of as a generalization of the concept of MEASUREMENT SPACE. By the way,
> "measurement" implies the presence of some distance measure(s) defined on
> the corresponding space, and that is the reason why the study of such
> spaces belongs to the domain of mathematics rather than logic.
>
> It appears to us now that there are fundamental differences between the two
> classes of "measurement spaces": the "symbolic" and the "numeric" spaces
> (see my home page). To give you at least some idea about the differences,
> I am presenting below the "symbolic solution" (without the learning
> algorithm) to the generalized parity problem, the problem quite notorious
> within the connectionist community.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                         THE PARITY CLASS PROBLEM
>
> The alphabet:   A = {a, b}
> ------------
>
> Input set S (i.e. the input space without the distance function):
> -----------
>              The set of strings over A.
>
> The parity class C: The set of strings with an even number of b's.
> ------------------
>
> Example of a positive training set C+:   aababbbaabbaa
> -------------------------------------    baabaaaababa
>                                          abbaaaaaaaaaaaaaaa
>                                          bbabbbbaaaaabab
>                                          aaa
>
> Solution to the parity problem, i.e. inductive (parity) class representation:
> -----------------------------------------------------------------------------
>
> One element from C+, e.g. 'aaa', plus the following 3 weighted operations
> (note that the sum of the weights is 1):
>                        deletion/insertion of 'a'  (weight 0)
>                        deletion/insertion of 'b'  (weight 1)
>                        deletion/insertion of 'bb' (weight 0)
>
> This means, in particular, that the DISTANCE FUNCTION  D between any two
> strings from the input set S is now defined as the shortest weighted path
> (based on the above set of operations) between these strings. The class is
> now defined as the set of all strings in the measurement space (S,D) whose
> distance from aaa is 0.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Why, then, do so many people work on "connectionist symbol
> processing"?  On the one hand, many of us feel (correctly, in my opinion)
> that the symbolic representation is a very important topic. On the other
> hand, and I am quite sure of that, if we look CAREFULLY at any
> corresponding concrete implementation, we will see that in order to
> "learn"  the chosen symbolic class one had to smuggle into the model, in
> some form, some additional structure "equivalent" to the sought symbolic
> structure (e.g. in the form of the recurrent ANN's architecture). This is,
> again, due to the fact that in the vector space one simply cannot detect
> (in a formally reasonable manner) any non-vector-space operations.
>
>
> [1] L. Goldfarb, J. Abela, V.C. Bhavsar, V.N. Kamat, Can a vector space
>     based learning model discover inductive class generalization in a
>     symbolic environment? Pattern Recognition Letters, 16 (7), 1995, pp.
>     719-726.
>
> [2] L. Goldfarb and J. Hook, Why classical models for pattern recognition
>     are not pattern recognition models, to appear in Proc. Intern. Conf.
>     on Advances in Pattern Recognition (ICAPR), ed. Sameer Singh,
>     Plymouth, UK, 23-25 Nov. 1998, Springer.
>
> [3] V.C. Bhavsar, A.A. Ghorbany, L. Goldfarb, Artificial neural networks
>     are not learning machines, Tech. Report, Faculty of Computer Science,
>     U.N.B.
>
>
> --Lev Goldfarb
>
>       http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm

