Connectionist symbol processing: any progress?

Tue Aug 18 07:37:49 EDT 1998

As the question of metrics in pattern recognition seems to be a matter of
some controversy, here is my point of view:

It is quite possible to construct algorithms for
learning the metric of some set of patterns in the supervised learning
paradigm. This means that, given a set of patterns for class A and
a set of patterns for class B, we can induce a metric such that class
A patterns are close (similar) to each other and dissimilar to class B
patterns.
This metric will usually be rather distinct from metrics based
on numeric properties, such as the Euclidean metric.
It is especially useful in the case of binary patterns, which code some
features with several distinct ("symbolic") values.
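
To illustrate the idea, here is a minimal sketch of inducing such a metric
for symbolic patterns. It is not the algorithm from the reports cited below.
It assumes a weighted mismatch count, where each feature is weighted by how
much more often it differs across classes than within a class. The names
learn_feature_weights and distance, and the toy data, are hypothetical.

```python
from itertools import combinations

def learn_feature_weights(class_a, class_b):
    """Assumed heuristic: weight = cross-class mismatch rate minus
    within-class mismatch rate, clipped at zero."""
    n_features = len(class_a[0])
    within = list(combinations(class_a, 2)) + list(combinations(class_b, 2))
    across = [(a, b) for a in class_a for b in class_b]

    def mismatch_rate(pairs, i):
        # Fraction of pattern pairs whose i-th feature values differ.
        return sum(p[i] != q[i] for p, q in pairs) / len(pairs)

    return [
        max(0.0, mismatch_rate(across, i) - mismatch_rate(within, i))
        for i in range(n_features)
    ]

def distance(p, q, weights):
    """Induced distance: sum of the weights of the features that differ."""
    return sum(w for w, x, y in zip(weights, p, q) if x != y)

# Toy patterns with three symbolic features; only feature 0 tracks the class.
class_a = [("x", "p", "m"), ("x", "q", "n"), ("x", "p", "n")]
class_b = [("y", "p", "m"), ("y", "q", "n"), ("y", "q", "m")]

weights = learn_feature_weights(class_a, class_b)
print(weights)
print(distance(class_a[0], class_a[1], weights))  # same class
print(distance(class_a[0], class_b[0], weights))  # different classes
```

Under the induced weights, patterns of the same class are close even when
they differ in two of three features, whereas an unweighted mismatch count
would place them farther apart than some cross-class pairs.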

Such a metric can be rather involved: for instance, the metrics
that determine phonemic similarity in different languages are
quite different (think of "r" and "l" for speakers of Indo-European
and of some Asian languages). This approach has therefore been applied
to the question of how phonemes (classes) relate to phonetic features
(pattern sets).

However, I believe, as has also been pointed out by Pao and possibly Kohonen,
that the problems of learning distance metrics and of finding hypersurfaces
(as in back-propagation) are in substance related. In the first case,
we have a distorted topology (with respect to Euclidean space) but
simple dividing lines (such as circles); in the second case the topology
stays fixed (Euclidean space), but the dividing surfaces may be rather complex.
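
A small worked example of this relation, assuming (purely for illustration,
not taken from Pao or Kohonen) that the learned metric has a quadratic form
with a symmetric positive definite matrix M = L^T L, a prototype c and a
radius r:

```latex
d_M(x, c)^2 = (x - c)^\top M \,(x - c) = \| L x - L c \|_2^2
\{\, x : d_M(x, c) \le r \,\} = \{\, x : \| L x - L c \|_2 \le r \,\}
```

The decision region is a plain sphere in the distorted space x -> Lx, and
the same region is an ellipsoid, that is a quadratic hypersurface, in the
original Euclidean space. A more strongly non-linear metric would correspond
to a correspondingly more complex surface.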

Although in certain cases we may prefer one method to the other,
the question of symbolic vs. non-symbolic ("numeric"?) representations
really does not have much to do with it.

Recall that back-propagation is one method of universal function
approximation, so any class distinction can be approximated,
provided you have found a sufficient and suitable training set (which
is, of course, the practical problem).

Nonetheless, efforts to build pattern classification schemes based on
the induction of different metrics for different problems are, I believe,
really interesting and may change some ideas on what constitutes "easy"
and "hard" problems. (For instance, I agree with Lev that classification
according to parity, as well as several symmetry and similar problems,
becomes very easy with distance-measure-based approaches.)
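
As a rough illustration of the parity remark, here is a sketch that compares
nearest-prototype classification under two distances. The pseudometric
parity_distance (Hamming distance modulo 2) is hand-picked for this example
rather than induced from data, and it is not necessarily what Lev had in
mind. All function names are hypothetical.

```python
from itertools import product

def hamming(x, y):
    """Number of positions in which two bit patterns differ."""
    return sum(a != b for a, b in zip(x, y))

def parity_distance(x, y):
    """Assumed pseudometric: 0 if x and y have the same parity, else 1."""
    return hamming(x, y) % 2

def parity(x):
    return sum(x) % 2

def nearest_prototype(x, prototypes, dist):
    """Return the label of the prototype closest to x under dist."""
    return min(prototypes, key=lambda proto: dist(x, proto[0]))[1]

def accuracy(patterns, prototypes, dist):
    hits = sum(nearest_prototype(x, prototypes, dist) == parity(x)
               for x in patterns)
    return hits / len(patterns)

# All 4-bit patterns, and one prototype per parity class.
patterns = list(product((0, 1), repeat=4))
prototypes = [((0, 0, 0, 0), 0), ((1, 0, 0, 0), 1)]

accuracy_parity = accuracy(patterns, prototypes, parity_distance)
accuracy_hamming = accuracy(patterns, prototypes, hamming)
print(accuracy_parity, accuracy_hamming)
```

With the parity-aware distance, two prototypes suffice for every pattern.
With the plain Hamming distance, the same two prototypes only split the
space on the first bit, so half of the patterns are misclassified.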

Gabriele

References:
Pao, Y.H.: Adaptive Pattern Recognition and Neural Networks.
Addison-Wesley, 1989.
Kohonen, T.: Self-Organization and Associative Memory. Springer, 1989.
Scheler, G.: Feature Selection with Exception Handling - An Example from
Phonology. In: Trappl, R. (ed.) Proceedings of EMCSR 94, Springer, 1994.
Scheler, G.: Pattern Classification with Adaptive Distance Measures.
Tech Rep FKI-188-94, 1994.
(some more references at www7.informatik.tu-muenchen.de/~scheler/publications.html).