2-parameter search in ACT-R

Alexander Petrov apetrov at andrew.cmu.edu
Sat Jan 20 12:38:57 EST 2001


> This suggests an advantage of an architecture like ACT-R over neural 
> networks, namely that the parameters are readily interpretable (and 
> generally fewer).  

Models have two complementary sets of parameters -- (a) global
parameters, which are few in number and apply throughout the
system, and (b) local parameters associated with individual
processing elements.

In ACT-R, the global parameters include the decay rate, activation
noise, W, etc., as well as the specific form of knowledge representation
(e.g. chunks with three slots vs. chunks with four). The local
parameters in ACT-R include the base-level activations of the
chunks, the similarities between them, associative strengths, and
the various utility parameters of productions.

In backprop networks the local parameters are the weights but
there are also global, interpretable parameters just as in ACT-R.
For instance, global parameters may include the decay rate, the
gain of the sigmoid function, the number of units in the hidden
layer (which incidentally plays a similar role to that of ACT-R's W),
the pattern of connectivity between layers, etc.
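To make the split concrete, here is a toy sketch of my own (not from
any actual ACT-R or network code): a single sigmoid unit learning AND
by gradient descent. The gain, learning rate, and training length are
global parameters set by hand; the weights are local parameters the
learning algorithm settles on.

```python
import math
import random

# Global parameters -- chosen, interpreted, and reported by the modeler:
GAIN = 1.0           # gain of the sigmoid squashing function
LEARNING_RATE = 0.5
EPOCHS = 2000

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-GAIN * x))

random.seed(0)
# Local parameters -- found by the learning algorithm, not tweaked at will:
w = [random.uniform(-0.5, 0.5) for _ in range(3)]  # two weights + a bias

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
for _ in range(EPOCHS):
    for (x1, x2), target in data:
        out = sigmoid(w[0] * x1 + w[1] * x2 + w[2])
        err = target - out
        grad = err * GAIN * out * (1 - out)   # delta rule for a sigmoid unit
        w[0] += LEARNING_RATE * grad * x1
        w[1] += LEARNING_RATE * grad * x2
        w[2] += LEARNING_RATE * grad

for (x1, x2), target in data:
    print((x1, x2), round(sigmoid(w[0] * x1 + w[1] * x2 + w[2])))
```

After training, the rounded outputs match the AND targets; change the
global GAIN or LEARNING_RATE and the same local weights come out
differently, which is exactly the division of labor described above.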

In both paradigms the global parameters are generally set by
the human modeler, interpreted, reported in publications, and so on.
In contrast, the local parameters are constrained by some learning
algorithm and the human modeler cannot "tweak" them at will.
The backpropagation algorithm minimizes the "error" defined as
some sum of squares, while ACT-R learning algorithms maximize
some Bayesian posterior probability.  This, in my view, is not a
principled difference. For example, the value of each individual
weight is just as sharply nailed down by backpropagation within the
context of the surrounding weights as the utility of an ACT-R
production is nailed down by the PG-C formula within the context of 
the surrounding productions.
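For readers who have not seen it, the PG-C formula gives the expected
gain of a production as E = P*G - C, with P estimated from the
production's history of successes and failures. A minimal sketch, with
numbers I made up for illustration:

```python
# Sketch of the PG-C expected-gain computation (illustrative numbers only).

def expected_gain(successes, failures, goal_value, cost):
    """E = P*G - C, where P = successes / (successes + failures)."""
    p = successes / (successes + failures)
    return p * goal_value - cost

# A hypothetical production that succeeded 18 times out of 20 attempts,
# pursuing a goal worth 20 units at a cost of 2 units of time:
print(expected_gain(18, 2, 20.0, 2.0))  # 0.9 * 20 - 2 = 16.0
```

Given the surrounding productions' success counts, each utility is
pinned down just as tightly as each backprop weight is.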

It is not fair to count the local parameters in one model, not count
them in another, and pretend that the second has fewer parameters
than the first.

I agree with Christian that one advantage of ACT-R is that even
its local parameters are interpretable. Another advantage is that,
due to rational analysis, the effect of many of its learning algorithms
is available in closed form. Therefore, no search is involved -- one
can just update the base-level activation of a chunk according to
the closed-form (though approximate) activation equation from the
book.
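Concretely, one can compute the base-level learning equation directly,
in its exact form B = ln( sum_j t_j^(-d) ) over the ages t_j of each
use of the chunk, or in the closed-form approximation
B = ln( n / (1 - d) ) - d*ln(L) for n uses over a lifetime L. A sketch
(the usage history below is hypothetical; d = 0.5 is the conventional
default decay rate):

```python
import math

# Base-level learning equation (Anderson & Lebiere, 1998).

def base_level_exact(ages, d=0.5):
    """Exact form: B = ln( sum over uses of age^(-d) )."""
    return math.log(sum(t ** -d for t in ages))

def base_level_approx(n, lifetime, d=0.5):
    """Closed-form approximation: B = ln(n / (1 - d)) - d * ln(L)."""
    return math.log(n / (1 - d)) - d * math.log(lifetime)

# Hypothetical chunk used 5 times, evenly spread over 100 time units:
ages = [10, 30, 50, 70, 90]
print(base_level_exact(ages))      # exact sum over the usage history
print(base_level_approx(5, 100))   # approximation from n and L alone
```

No search is needed: the approximation requires only the number of
uses and the lifetime, not the full usage history.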

Alex

-----------------------------------------------------------
Alexander Alexandrov Petrov     apetrov+ at andrew.cmu.edu
                         http://www.andrew.cmu.edu/~apetrov
Post-doctoral associate
Department of Psychology     Baker Hall 345B, (412)268-3498
Carnegie Mellon University   Pittsburgh, PA 15213, USA
-----------------------------------------------------------



