ART 1
Michael J. Healy (206) 865-3123
mjhealy at espresso.rt.cs.boeing.com
Wed Jan 26 21:08:03 EST 1994
> Recently I heard an argument against Gail Carpenter and Stephen
> Grossberg's ART (Adaptive Resonance Theory). The basic argument was
> that ART is simply the 'leader clustering algorithm' enclosed in a
> load of neural net terminology. I am not very familiar with the
> leader clustering algorithm and was wondering whether anyone would
> like to remark for or against this argument, as I am very interested
> in ART. Does anyone know of any paper on this subject (ART vs. leader
> clustering, or even leader clustering on its own)?
>
I thought it would be informative to post my reply, since I have done
some work with ART. I would like to make two points:
First, it is incorrect to state that the binary pattern clustering
algorithm implemented by ART1 is equivalent to the leader clustering
algorithm (ART is much more general than the ART1 architecture; I
assume the reference was to ART1). There are two significant
differences:
1. ART1 is meant to function as a real-time clustering algorithm.
This means that it (1) accepts and clusters input patterns in sequence, as
they would appear in an application requiring an online system that learns
as it processes data, and (2) is capable of finding a representation of the
inputs that is arguably general (see below).
The leader clustering algorithm, as I understand it, is supposed to have
all its inputs available at once so that it can scan the set globally to
form clusters. It is hardly a real-time algorithm in any sense of the word.
2. The leader clustering algorithm does not generalize about its inputs.
To explain, the patterns that it uses to represent its clusters are simply
the input patterns that initiate the clusters (the "leaders"). ART1, on
the other hand, forms a synaptic (in the neurobiological sense of the
word) memory consisting of patterns that are templates for the patterns in
each of the (real-time, dynamic) clusters that it forms. It updates these
templates as it processes its inputs. Each template is the bitwise AND of
all the input patterns that have been assigned to the corresponding cluster
at some time in the learning history of ART1. This bitwise AND is a
consequence of the Hebbian-like (actually, Weber-Fechner law) learning
at each synapse in the outstar of F2 ---> F1 feedback connections from the
F2 node that represents the cluster. A corresponding change occurs in the
F1 ---> F2 connections to that same node, which form an adaptive filter
for screening the inputs that come in through the F1 layer.
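To make this concrete, here is a minimal sketch of the fast-learning
limit of that update (my own rendering and naming, not the
differential-equation model itself):

    def update_template(template, pattern):
        # Fast-learning ART1 template update: the stored template
        # becomes the bitwise AND of itself and the adopted input,
        # so only the features common to every member of the
        # cluster survive.
        return [t & p for t, p in zip(template, pattern)]

    # A template starts as a copy of the input that created the
    # cluster, then shrinks as further inputs are adopted:
    t = [1, 1, 0, 1]                       # first input, copied
    t = update_template(t, [1, 0, 0, 1])   # second adopted input
    print(t)                               # [1, 0, 0, 1]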
Whether an input pattern is adopted by a particular cluster or not depends
upon two measures of input pattern/template similarity that the ART1 system
computes. The first measure is a result of F2 layer competition through
inhibitory interconnections (again, synaptic). The second is computed by
F2 ---> F1 gain control and the vigilance mechanism. The F2 ---> F1 gain
control, the F1 ---> vigilance node inhibitory connections, the
input layer ---> vigilance node connections, and the vigilance
node ---> F2 connections (all synaptic) effect the computation.
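In algorithmic form, these two measures reduce to the familiar choice
and vigilance tests. The following rendering, with its parameter names
beta and rho, is my own; the network computes the same quantities
through the connections just described:

    def overlap(a, b):
        # Number of 1-bits the input and template share.
        return sum(x & y for x, y in zip(a, b))

    def choice(pattern, template, beta=0.5):
        # First measure: bottom-up activation driving the F2
        # competition; templates with large, relevant overlap win.
        return overlap(pattern, template) / (beta + sum(template))

    def vigilance_ok(pattern, template, rho=0.7):
        # Second measure: the fraction of the input's 1-bits matched
        # by the template must reach the vigilance level rho, or the
        # winning F2 node is reset and the search continues.
        return overlap(pattern, template) / sum(pattern) >= rho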
The result is
(1) Generalization. In fact, if the F1 nodes are thought of as implementing
predicates in a two-valued logic, it is possible to prove that the ART1
templates represent conjunctive generalizations about the objects or events
represented by the input patterns that have been adopted by a cluster. That
is, each ART1 cluster represents a concept class.
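As a toy illustration of that reading (the predicate names here are
invented for the example):

    # Each F1 node is read as a predicate; a template's surviving
    # 1-bits name the conjunction every member of the class satisfies.
    predicates = ["has_wings", "lays_eggs", "has_fur", "swims"]

    def template_as_conjunction(template):
        return " AND ".join(p for p, bit in zip(predicates, template) if bit)

    print(template_as_conjunction([1, 1, 0, 0]))  # has_wings AND lays_eggs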
Each template also corresponds to a formula about any future objects
that might be recognized as members of its concept class. This is more
complicated than a simple conjunction of F1 predicates, but can be broken
down into component conjunctions. I have a technical report on this, but
the following reference is more useful relative to ART1 and its algorithm:
Healy, M. J., Caudell, T. P. and Smith, S. D. G.,
A Neural Architecture for Pattern Sequence Verification Through
Inferencing, IEEE Transactions on Neural Networks, Vol. 4, No. 1,
1993, pp. 9-20.
(2) Stability. Suppose it is important to stabilize the memory on a
fixed set of training patterns, and suppose it is desirable to know how
many cycles of repeatedly presenting the set to the ART1 system are
necessary to accomplish this; that is, how many cycles until the
templates no longer change and each input pattern is consistently
recognized as corresponding to a single template? Further, can the
patterns be presented in some randomized order each time, or do they
have to be presented in a particular order?
The answer is as follows: suppose that the number of distinct sizes of
patterns (size being the number of 1-bits in a binary pattern) is M
(obviously, M <= N, where N is the number of training patterns). Then
at most M cycles are required. Further, the order of presentation can
be arbitrary, and can be different with each cycle. Reference:
M. Georgiopoulos, G. L. Heileman, and J. Huang,
Properties of Learning Related to Pattern Diversity in ART1,
Neural Networks, Vol. 4, pp. 751-757, 1991.
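To see the result in action, here is a small self-contained
fast-learning simulation (my own rendering, using the same choice and
vigilance rules sketched above) that counts presentation cycles until a
full pass leaves every template unchanged. The toy training set below
has M = 2 distinct pattern sizes:

    def art1_cluster(patterns, rho=0.5, beta=0.5, max_cycles=20):
        # Present the training set repeatedly until a full cycle
        # produces no template change; return the templates and the
        # number of that first quiet cycle.
        AND = lambda a, b: [x & y for x, y in zip(a, b)]
        templates = []
        for cycle in range(1, max_cycles + 1):
            changed = False
            for p in patterns:
                # F2 competition: try categories in order of choice value.
                order = sorted(range(len(templates)),
                               key=lambda j: -sum(AND(p, templates[j]))
                                             / (beta + sum(templates[j])))
                for j in order:
                    # Vigilance test: on failure, reset and keep searching.
                    if sum(AND(p, templates[j])) / sum(p) >= rho:
                        new = AND(templates[j], p)
                        if new != templates[j]:
                            templates[j], changed = new, True
                        break
                else:
                    templates.append(list(p))  # uncommitted node adopts input
                    changed = True
            if not changed:
                return templates, cycle
        return templates, max_cycles

    # Two distinct pattern sizes (2 and 3), so M = 2: the templates
    # stop changing within 2 cycles, and the quiet pass confirms it.
    pats = [[1,1,0,0], [1,0,1,0], [1,1,1,0], [0,1,1,1]]
    templates, quiet = art1_cluster(pats)
    print(templates, "no changes by cycle", quiet)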
This does not mean that the FORM of the templates is independent of the order
of presentation. In fact, learning in ART1 is order-dependent, as it is in
all clustering algorithms. I'll bet that leader clustering, even though it
views the training set all at once, is also order-dependent. The inputs
still have to be processed in some order and then deleted from the training
set on each cycle. You could redo the entire training process for all N!
possible presentation orders, but you would still have to somehow find the
"best" of all the N! clusterings.
My second point addresses the relevance of the argument that ART (meaning
ART1) is "simply the leader clustering algorithm enclosed in a load of
neural net terminology":
ART1 represents a neural network, complete with a dynamical system model.
Watch for
Heileman, G.,
A Dynamical Adaptive Resonance Architecture,
IEEE Transactions on Neural Networks (soon to appear)
Given the relevance of ART1 to neural systems, including
those that may actually exist in the brain, and given the proven
stability of the ART1 algorithm, it seems to me that any argument that
ART1 is simply this, that, or the other algorithm is moot.
I hope this sheds some light on the relationship between ART1 and the
leader clustering algorithm. My thanks to the author of the original
posting.
Mike Healy