Turing Machines and New Reconfigurable Learning Machines

GOLDFARB%UNB.CA at VMA.CC.CMU.EDU
Thu Sep 27 16:28:09 EDT 1990


In connection with the discussion on the relation between Turing machines
(TM) and neural nets (NN), I would like to draw your attention to my
recent paper in Pattern Recognition, Vol. 23, No. 6 (June 1990), pp. 595-616,
"On the Foundations of Intelligent Processes I: An Evolving Model for
Pattern Learning" (as well as several other submitted papers). In it I
proposed a finite model of a learning machine (LM) which could be viewed
as a far-reaching "symbolic" generalization of the NN and which is more
powerful than any other known finite machine (including the NN).  This
learning power is achieved only because the LM embodies the first model
of a reconfigurable machine, i.e., a machine that can learn any set of
classes by modifying its set of operations.  In retrospect, the idea of
the reconfigurable machine appears quite natural, and it is the only
realistic alternative if we want a finite machine to achieve potentially
unbounded learning capabilities in some infinite environment.

The new model is quite different from the known NNs, and I expect some
may not see any similarity immediately. NNs operate on vectors, while
the LM can operate on any chosen pattern representation (vectors,
strings, trees, etc.; discrete or continuous).
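
To make the contrast concrete, here is a rough illustrative sketch in
Python (entirely an illustration of the idea, not the formalism of the
paper): the same weighted-dissimilarity interface expressed over two
different representations, vectors and strings, each built from its own
set of elementary weighted operations.

def weighted_vector_distance(x, y, weights):
    # Each coordinate difference is an elementary "operation" with its
    # own weight.
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))

def weighted_edit_distance(s, t, ins_w=1.0, del_w=1.0, sub_w=1.0):
    # Elementary string operations (insertion, deletion, substitution),
    # each carrying its own weight.
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_w
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + ins_w
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if s[i - 1] == t[j - 1] else sub_w
            d[i][j] = min(d[i - 1][j] + del_w,
                          d[i][j - 1] + ins_w,
                          d[i - 1][j - 1] + cost)
    return d[m][n]

print(weighted_vector_distance([1.0, 2.0], [2.0, 0.0], [0.5, 1.0]))  # 2.5
print(weighted_edit_distance("abca", "abba"))                        # 1.0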

NNs have essentially two groups of numeric operations: the accumulating
additive operations and the transmitting nonlinear operations (the
second group does not "actively" participate in the learning, since no
weights are associated with these operations). In addition, the
structure (global and local) of the NN imposes restrictions on the
variety of metric point transformations that can be realized by each
layer as well as by the NN itself (think of each layer as changing the
metric structure of the input vector space; see section 5.5 of
Y.-H. Pao, "Adaptive Pattern Recognition and Neural Networks",
Addison-Wesley, 1989).

In the proposed model the operations are not necessarily numeric;
rather, they correspond to the chosen pattern representation. During
learning, to find the best separation of the training classes, the LM
seeks an optimal weighting scheme for its current set of operations. If
the current set of operations is not sufficient, new operations, formed
as compositions of the current operations, are gradually introduced
(with the help of several functions defined on the current space of
weights) and the optimization process is repeated.
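
The following is a rough sketch of this learning loop (illustrative
only; the operations, the grid search over weights, and the separation
measure are stand-ins of my own, not the functions defined in the
paper).

import itertools

def separation(distance, classes):
    # Ratio of the smallest between-class distance to the largest
    # within-class distance; larger means better separation.
    within, between = [0.0], []
    for members in classes.values():
        for a, b in itertools.combinations(members, 2):
            within.append(distance(a, b))
    for (_, m1), (_, m2) in itertools.combinations(classes.items(), 2):
        for a in m1:
            for b in m2:
                between.append(distance(a, b))
    return min(between) / (max(within) + 1e-9)

def learn(classes, operations, grid=(0.0, 0.5, 1.0), rounds=3, target=2.0):
    ops = list(operations)
    for _ in range(rounds):
        best = None
        for weights in itertools.product(grid, repeat=len(ops)):
            if not any(weights):
                continue
            dist = lambda a, b, w=weights: sum(wi * op(a, b)
                                               for wi, op in zip(w, ops))
            score = separation(dist, classes)
            if best is None or score > best[0]:
                best = (score, weights)
        if best[0] >= target:
            return ops, best
        # Current operations insufficient: introduce compositions of the
        # existing operations (here simple products) and optimize again.
        ops = ops + [lambda a, b, f=f, g=g: f(a, b) * g(a, b)
                     for f, g in itertools.combinations(ops, 2)]
    return ops, best

# Toy example over 2-D vectors with two primitive coordinate operations.
op_x = lambda a, b: abs(a[0] - b[0])
op_y = lambda a, b: abs(a[1] - b[1])
classes = {"A": [(0.0, 0.0), (0.1, 0.0)], "B": [(1.0, 1.0), (1.0, 1.1)]}
ops, (score, weights) = learn(classes, [op_x, op_y])
print(len(ops), score, weights)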

Even for vector patterns, one of the most important distinctions between
the NN and the LM is that the operations realized by the nodes of the NN
are connected in a fixed manner, while the operations of the LM are
decoupled (to the extent that they have to be decoupled) and may grow
in number as the need arises during learning. There is strong evidence
that the learning stage for the LM always converges, is much more
efficient, and produces a much smaller machine than the NN.

