Reconfigurable Learning Machines (RLM)

Lev Goldfarb GOLDFARB%UNB.CA at UNBMVS1.csd.unb.ca
Tue Jan 15 13:34:35 EST 1991


Here is an informal outline of the model proposed in "On the Foundations
of Intelligent Processes I: An Evolving Model for Pattern Learning,"
Pattern Recognition, Vol. 23, No. 6, 1990. Think of an object
representation as a "molecule" (vectors and strings are special types
of such molecules). Let O denote the set of all objects from the
environment, and let S = {Si} denote the set of BASIC SUBSTITUTION
OPERATIONS, where each operation transforms one object into another by
removing a piece of the molecule and replacing it with another small
molecule. For string molecules these could be the operations of
single-letter deletion and insertion. In addition, a small FIXED set CR
of COMPOSITION RULES for forming new operations from the existing ones
is also given. Think of these rules as specifications for gluing
several operations together into one operation.
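
As a toy illustration of these definitions, here is a minimal sketch in
Python (the names and the particular composition rule are mine, not the
paper's): string molecules, two basic substitution operations, and a
rule that glues two operations into one.

    # Basic substitution operations on string "molecules":
    # single-letter deletion and insertion.

    def delete_at(s, i):
        """Remove the letter at position i."""
        return s[:i] + s[i+1:]

    def insert_at(s, i, letter):
        """Insert a letter before position i."""
        return s[:i] + letter + s[i:]

    # One possible composition rule: glue two operations into a single
    # new operation that applies them in sequence.
    def compose(op1, op2):
        return lambda s: op2(op1(s))

    # A composed operation rewriting "cat" into "cut": delete 'a' at
    # position 1, then insert 'u' there.
    a_to_u = compose(lambda s: delete_at(s, 1),
                     lambda s: insert_at(s, 1, "u"))
    assert a_to_u("cat") == "cut"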

The intrinsic distance D between two objects is defined as the minimum
number of operations from S that are necessary to transform one molecule
into the other. D depends on the set S of operations: enlarging S can
only reduce some of the distances. This idea of distance embodies a very
important, perhaps the most important, physical principle -- the least-
action principle, which was characterized by Max Planck as follows:
     Amid the more or less general laws which mark the achievements of
     physical science during the course of the last centuries, the
     principle of least action is perhaps that which, as regards form
     and content, may claim to come nearest to that ideal final aim of
     theoretical research.
In a vector setting, D is the city-block distance between two vectors.
In a non-vector setting, however, even small sets of patterns (4-5) with
such distances cannot be represented in a Euclidean vector space of ANY
dimension. (For example, under single-letter deletions and insertions
the strings "a", "b", "c" are pairwise at distance 2, while the empty
string is at distance 1 from each of them; no Euclidean space of any
dimension realizes these four distances, since every point is at
distance at least 2/sqrt(3) > 1 from some vertex of an equilateral
triangle of side 2.) The adjective "intrinsic" in the above definition
refers to the fact that the distance D does not reflect any empirical
knowledge about the role of the substitution operations. Thus, we are
led to the most natural extension of this concept, obtained by allowing
different substitution operations to carry different weights: assign to
each operation Si a nonnegative weight w_i, subject to the single
restriction that the weights sum to 1. This constraint is necessary to
ensure that during learning the operations are forced to compete
cooperatively for the weights. The new weighted distance WD is defined
like D above, but with the minimum number of operations replaced by the
minimum total weight of a sequence of operations (the shortest weighted
path) transforming one molecule into the other.
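
For the string special case, WD can be computed by dynamic programming.
The sketch below is mine, not the paper's: it assumes the only basic
operations are single-letter deletions and insertions, each carrying
its own weight. With all weights equal to 1 it computes the intrinsic
distance D; the general model permits arbitrary substitution operations
and takes the cheapest weighted sequence of operations.

    def weighted_distance(x, y, w_del, w_ins):
        """Minimum total weight of deletions/insertions turning x into y."""
        m, n = len(x), len(y)
        wd = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):          # delete all of x[:i]
            wd[i][0] = wd[i-1][0] + w_del[x[i-1]]
        for j in range(1, n + 1):          # insert all of y[:j]
            wd[0][j] = wd[0][j-1] + w_ins[y[j-1]]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                best = min(wd[i-1][j] + w_del[x[i-1]],   # delete x[i-1]
                           wd[i][j-1] + w_ins[y[j-1]])   # insert y[j-1]
                if x[i-1] == y[j-1]:       # matching letters need no operation
                    best = min(best, wd[i-1][j-1])
                wd[i][j] = best
        return wd[m][n]

    # With unit weights WD coincides with D, e.g. D("cat", "cut") = 2
    # (delete 'a', insert 'u').
    unit = {c: 1.0 for c in "abcdefghijklmnopqrstuvwxyz"}
    assert weighted_distance("cat", "cut", unit, unit) == 2.0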

In the basic learning process, i.e. that of learning to recognize one
class, the RLM is presented with two finite sets C^+ (of positive
training patterns) and C^- (of negative training patterns). The main
objective of the learning process is to produce, if necessary, an
expanded set S of operations and at least one corresponding weight
vector w*, such that with the help of the distance WD(w*) (which
induces in the set O the corresponding similarity field) the RLM can
classify new patterns as positive or negative. The basic step in the
learning process is maximization of the function F(w) = F1(w)/(c + F2(w)),
where F1 is the smallest WD(w) distance between C^+ and C^-, F2 is the
average WD(w) distance within C^+, and c is a small positive constant
that prevents overflow when F2 approaches 0. One can show that this
continuous optimization problem can be reduced to a discrete one.
During the learning process the new operations to be added to the
current set S are chosen from among the compositions of the "optimum"
current operations. Adding such operations "improves" the value of F,
and therefore the learning process is guaranteed to converge. The
concept of a non-probabilistic class entropy, or complexity, w.r.t. the
(current) state of the RLM can also be introduced. During the learning
process this entropy DECREASES.
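
Continuing the string sketch above (and reusing its weighted_distance
and unit; the toy data and the value of c below are made up for
illustration), the learning criterion can be written down directly:

    from itertools import combinations

    def F(c_plus, c_minus, w_del, w_ins, c=0.01):
        """F1 / (c + F2): between-class separation over within-class spread."""
        wd = lambda x, y: weighted_distance(x, y, w_del, w_ins)
        # F1: smallest distance between the positive and negative sets.
        f1 = min(wd(p, q) for p in c_plus for q in c_minus)
        # F2: average distance within the positive set.
        pairs = list(combinations(c_plus, 2))
        f2 = sum(wd(p, q) for p, q in pairs) / len(pairs)
        return f1 / (c + f2)

    # Toy training sets: positives share the motif "ab".
    c_plus, c_minus = ["ab", "abb", "aab"], ["ba", "bba"]
    print(F(c_plus, c_minus, unit, unit))   # larger F, better separation

The learner would then search over weight vectors summing to 1 (and,
when needed, over composed operations added to S) for a w* maximizing
this F.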

Within the proposed model it is not difficult to see the relation
between the learning process and propositional class descriptions.
Moreover, most of the extensive psychological observations related, for
example, to object perception (Object Perception: Structure and Process,
eds. B.E. Shepp and S. Ballesteros, Lawrence Erlbaum Associates, 1989)
can be explained naturally.

--Lev Goldfarb


