Connectionist symbol processing: any progress?

Bryan B. Thompson bryan at cog-tech.com
Sun Aug 16 10:18:49 EDT 1998


Lev wrote:

> Can I give you a one sentence answer? If you look very carefully at
> the topologies induced on the set of strings (over an alphabet of
> size > 1) by various symbolic distances (of type given in the parity
> class problem), then you will discover that they have hardly
> anything to do with the continuous topologies we are used to from
> the classical mathematics. In this sense, the difficulties ANNs have
> with the parity problem are only the tip of the iceberg.
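
A small illustration of Lev's point, though mine rather than his: under
the ordinary Hamming metric on bit strings, every nearest neighbor of a
string has the opposite parity, so the parity function is discontinuous
at every point of that topology.

    # Parity vs. the Hamming metric: flipping any single bit flips the
    # parity, so "nearby" strings never share the function value.

    def parity(s):
        """Parity of the number of 1s in a bit string."""
        return s.count("1") % 2

    def hamming_neighbors(s):
        """All strings at Hamming distance 1 from s."""
        return [s[:i] + ("1" if c == "0" else "0") + s[i + 1:]
                for i, c in enumerate(s)]

    s = "10110"
    print(s, parity(s))          # 10110 1
    for n in hamming_neighbors(s):
        print(n, parity(n))      # every neighbor has parity 0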

Mitsu wrote:

> I see no reason why what you are talking about would prevent a
> connectionist approach (based on a recurrent or more sophisticated
> architecture) from being able to discover the same symbolic
> metric---because, as I say, the input space is not in any meaningful
> sense a vector space, and the recurrent architecture allows the
> "metric" of the learning algorithm, it seems to me, to acquire
> precisely the kind of structure that you need it to---or, at least,
> I do not see in principle why it cannot.  The reason this is so is
> again because the input is spread out over multiple presentations to
> the network.
>
>  There are good reasons to use connectionist schemes, however, I
> believe, as opposed to purely symbolic schemes.  For one: symbolic
> techniques are inevitably limited to highly discrete
> representations, whereas connectionist architectures can at least in
> theory combine both discrete and continuous representations.


"Connectionist" is too broad a term to distinguish inherently symbolic
from approaches which are not inherently symbolic, but which have yet
to be clearly excluded from being able to induce approximately
symbolic processing solutions.  In an attempt to characterize these
two approaches, the one builds in symbolic processing structure (this
is certainly true for Shruti and, from reading Lev's messages, appears
to be true of that research as well), while the other intends to
utilize a "recurrent or more sophisticated architecture" to induce the
desired behavior without "special" mechanisms.

It is certainly true that we can, and of necessity must, construct
connectionist systems with different inductive biases.  A recurrent
MLP (multi-layer perceptron) *typically* builds in scalar weights,
sigmoid transfer functions, high forward connectivity, recurrent
connections, etc.  Simultaneous recurrent networks are similar, but
build in a settling process by which an output/behavior is computed.
In the work with Shruti, we have built into a simultaneous recurrent
network localized structure and transfer functions which facilitate
"symbolic" processing.  While such specialized structure does not
preclude using, e.g., backpropagation for learning, it also opens the
structure space to explicit search by methods more akin to
evolutionary programming.
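
For concreteness, here is a minimal sketch of the settling process in
a simultaneous recurrent network.  It is an illustration under simple
assumptions (one fully connected layer, small random weights so that
the iteration contracts), not the structure we actually use with
Shruti:

    # Iterate the same layer on its own output until an (approximate)
    # fixed point is reached; the settled activations are the output.
    import numpy as np

    def settle(W, x, tol=1e-6, max_iters=500):
        """Iterate y <- sigmoid(W @ y + x) to an approximate fixed point."""
        y = np.zeros(len(x))
        for _ in range(max_iters):
            y_next = 1.0 / (1.0 + np.exp(-(W @ y + x)))
            if np.max(np.abs(y_next - y)) < tol:
                break
            y = y_next
        return y_next

    rng = np.random.default_rng(0)
    W = 0.5 * rng.standard_normal((4, 4))  # small weights -> contraction
    x = rng.standard_normal(4)
    print(settle(W, x))                    # the network's settled output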

My point here is not that we have the "right" solution, but that the
architectural variations being discussed need not be exclusive.
Given a focus on "symbolic" processing, I suggest that two issues
have dominated this discussion:

	- What inductive biases should be built into connectionist
	  architectures for this class of problems?  This question
	  should include choices of "structure" and "learning rules".

	- What meaningful differences exist in the learned behavior
	  of systems with different inductive biases?  In particular,
	  questions of the rigidity and generalization of solutions,
	  the efficiency of learning, and the preservation of
	  plasticity seem important.

I feel that Lev is concerned that learning algorithms using recurrent
networks with distributed representations have an inductive bias which
limits their practical capacity to induce solutions (internal
representations / transforms) for domains in which symbol processing
is critical.  I agree with this "intuitively," but I would like to
see a firmer characterization of why such networks are ill-suited for
"symbolic processing" (quoted to indicate that good solutions need not
be purely symbolic and could exhibit aspects of more classical ANNs).

I am thinking of an effort made several years ago to characterize
problems (and representations) which were "GA-hard" -- that is, ill
suited to the transforms and inductive biases of (certain classes of)
genetic algorithms.  A similar effort with various classes of
connectionist architectures would be quite useful in moving beyond
such "intuitive" senses of the fitness of different approaches, and
of the expected utility of research in different connectionist
solutions for different classes of problems.
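
For concreteness, the kind of construction I have in mind is a
"deceptive" trap function; the sketch below is my own illustration
rather than any particular paper's:

    # A k-bit trap function: fitness rewards strings with more 0s,
    # except that all-1s is the global optimum.  Bit-flip hill climbing
    # (and GAs whose operators respect this representation) are drawn
    # toward the deceptive all-0s attractor.

    def trap(bits):
        ones = sum(bits)
        k = len(bits)
        return k if ones == k else (k - 1) - ones

    print(trap([1, 1, 1, 1]))  # 4 (global optimum)
    print(trap([0, 0, 0, 0]))  # 3 (deceptive attractor)
    print(trap([1, 1, 1, 0]))  # 0 (one flip from the optimum, worst)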

I feel it is a reasonable argument that evolution has endowed us with
both gross and localized structure.  That includes the body and the
brain.  Within the "brain" there are clearly systems that are
structurally (pre-)disposed toward different kinds of computing;
witness the cerebellum vs. the cerebral cortex.  We do not need to,
and should not, make the same structural choices in connectionist
solutions for different classes of problems.

My own intuitive "argument" leads me to believe that distributed
connectionist solutions are unlikely to prove suitable for symbolic
processing.  Recurrent, and simultaneous recurrent, distributed
networks may possess the representational capacity, but I maintain
doubts concerning their inductive capacity for "symbolic" domains.
Perhaps a fruitful approach would be to enumerate the characteristics
of a system which facilitate learning and behavior in domains
considered "symbolic" (including variable binding, appropriate
generalization, plasticity, etc.), and to see how those properties
might be realized or approximated within the temporal dynamics of a
class of distributed recurrent networks.  This effort must, of
course, not seek to allocate too much responsibility to a single
system and therefore needs to be part of a broader theory of the
structure of mind and organism.
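
As a toy illustration of one such characteristic, variable binding,
here is a sketch in the spirit of Shruti's temporal-synchrony binding
(Shastri & Ajjanagadde); the discrete phase slots are a deliberate
simplification of the real dynamics:

    # A role and its filler "fire in phase": binding is represented by
    # coincidence in a shared phase slot, and recovered the same way.

    phase_of = {}   # entity -> phase slot
    fires_in = {}   # node (role or entity) -> phase slot

    def bind(role, entity):
        """Bind a role to an entity by making them fire in phase."""
        phase = phase_of.setdefault(entity, len(phase_of))
        fires_in[role] = phase
        fires_in[entity] = phase

    def filler_of(role):
        """Recover a binding by phase coincidence."""
        return [e for e, p in phase_of.items() if p == fires_in[role]]

    bind("giver", "John")
    bind("recipient", "Mary")
    print(filler_of("giver"))      # ['John']
    print(filler_of("recipient"))  # ['Mary']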

If we consider the primary mechanism of recurrence in distributed
representations to be the enfolding of space into time, I still have
reservations about the complexity that the agent / organism faces in
learning an enfolding of mechanisms sufficient to support symbolic
processing.
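
One way to see what such an enfolding buys: a single hand-wired
recurrent state computes parity when the string arrives one bit per
time step.  The representational capacity is clearly there; whether
learning reliably finds such a solution is exactly my reservation.
(Again a toy of my own, not a claim about any particular
architecture.)

    # state <- XOR(state, input) at each time step: parity "in time".
    def parity_by_recurrence(bits):
        state = 0
        for b in bits:
            state ^= b   # the recurrent update
        return state

    print(parity_by_recurrence([1, 0, 1, 1, 0]))  # 1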

--bryan thompson

PS: I will be on vacation next week (Aug 17-21) and will be unable to
answer any replies until I return.

