2-parameter search in ACT-R (e.g. Slow Kendler)
Christian Lebiere
cl at andrew.cmu.edu
Sat Jan 20 10:21:27 EST 2001
The steep ravine with gently sloping floor has been a cherished part of
connectionist lore since at least 1985 in the early days of
backpropagation.
A number of numerical optimization techniques have been used to try to
speed up the weight learning (a.k.a. parameter tweaking), including the
momentum method, quickprop, and various second-order methods, all with
varying degrees of success. But poorly conditioned search spaces are a
fundamental computational problem for which no magic bullet is likely to
exist. Otherwise we would already have a trillion-unit backprop net with
the capacities of the human brain.
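To make the ravine concrete, here is a small Python sketch (entirely my
own toy example, nothing ACT-R-specific; the quadratic loss, learning
rate and momentum value are arbitrary choices): plain gradient descent
zigzags across the steep walls and crawls along the gently sloping
floor, while the momentum method damps the zigzag and makes much faster
progress.

# Toy "ravine": a quadratic 100x steeper in y than in x.
def loss(x, y):
    return 0.5 * (x**2 + 100.0 * y**2)

def grad(x, y):
    return x, 100.0 * y

def descend(momentum, lr=0.015, steps=200):
    x, y = -5.0, 1.0          # start on the ravine wall
    vx = vy = 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Momentum accumulates the consistent downhill component
        # along the floor and cancels the oscillation across it.
        vx = momentum * vx - lr * gx
        vy = momentum * vy - lr * gy
        x, y = x + vx, y + vy
    return loss(x, y)

print("plain gradient descent:", descend(momentum=0.0))
print("with momentum 0.9:     ", descend(momentum=0.9))

With these (arbitrary) settings, plain descent is still well away from
the minimum after 200 steps, while the momentum run is essentially at
the bottom of the ravine.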
The ravine results from tightly coupled parameters, in which the value of
one (or more) strongly determines the optimal value of the other(s). In
the case of connectionist networks, for example, the value of the weights
from the input units to the hidden units will strongly determine the value
of the weights from the hidden units to the output units, because the
former determine the meaning of the latter. Such coupling is likely to
arise in any system with multiple parameters, unless those parameters
are independent of each other.
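A toy two-parameter loss (again my own made-up example) shows the
coupling directly: with something like (a*b - 1)^2, the best value of b
is entirely dictated by the current value of a, so the valley floor is
the curve b = 1/a rather than a point you can approach one parameter at
a time.

import numpy as np

def loss(a, b):
    # The a*b term couples the two parameters; the weak second
    # term merely pulls a toward 1 so a unique minimum exists.
    return (a * b - 1.0)**2 + 0.01 * (a - 1.0)**2

# For each setting of a, the best b tracks 1/a.
for a in (0.5, 1.0, 2.0, 4.0):
    bs = np.linspace(0.01, 4.0, 4000)
    best_b = bs[np.argmin(loss(a, bs))]
    print("a = %.1f  ->  best b = %.2f" % (a, best_b))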
The basic problem in this case is the lack of data, as Niels suggested.
The impact of the :rule parameter is particularly strong initially but will
fade with experience because its influence will be reduced in the Bayesian
weighting, whereas :egs is a constant architectural parameter. Therefore
one would expect that having the learning-curve data in addition to the
aggregate performance data would more strongly constrain a single
parameter set. For example, in my work on cognitive arithmetic (Lebiere, 1998;
Lebiere, 1999), I found that the level of (activation) noise will
fundamentally determine the slope of the learning curve, whereas other
parameters will only shift it up and down by a constant factor. Other
parameter explorations for a model of implicit learning can be found in
(Lebiere & Wallach, 2000).
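To illustrate that point with toy numbers (these are not the cognitive
arithmetic results, just an artificial example), two power-law learning
curves can produce nearly the same average error over a block of trials
while differing sharply in slope; only the curve itself, not the
aggregate, tells the two parameter sets apart.

import numpy as np

trials = np.arange(1, 101, dtype=float)

def curve(scale, slope):
    # Standard power-law learning curve: error = scale * trial^(-slope)
    return scale * trials**(-slope)

shallow = curve(scale=0.30, slope=0.20)   # low start, shallow slope
steep   = curve(scale=0.92, slope=0.55)   # high start, steep slope

# Aggregate fit cannot distinguish them...
print("mean error, shallow:", round(shallow.mean(), 3))
print("mean error, steep:  ", round(steep.mean(), 3))
# ...but the shape of the curves clearly can.
print("shallow, trial 1 and 100:", round(shallow[0], 2), round(shallow[-1], 2))
print("steep,   trial 1 and 100:", round(steep[0], 2), round(steep[-1], 2))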
This suggests an advantage of an architecture like ACT-R over neural
networks, namely that the parameters are readily interpretable (and
generally fewer). This (sometimes) allows one to set them by hand
through careful analysis of their effect on model behavior rather than
through brute-force search. Not that we don't sometimes have to resort
to that as well. The parameter optimizer available on the ACT-R web site tries to
deal with the valley problem by resetting the direction of search according
to the conjugate gradient technique. Richard, I would be interested to
know how well it does on your example. Roman and Wheeler, could you
please make your parameter search program available on the ACT-R web
site by emailing it to db30+ at andrew.cmu.edu? Different techniques
perform best on different problems, so it is important to have a wide
assortment available.
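For those who have not run into it, here is a generic textbook sketch
of what conjugate gradient search with direction resets looks like
(Polak-Ribiere variant). I should stress that this is my own
illustration of the general idea, not the actual optimizer program on
the web site:

import numpy as np

def minimize_cg(f, grad, x, iters=50, restart_every=2):
    g = grad(x)
    d = -g                              # start with steepest descent
    for i in range(iters):
        # Crude grid line search along d; a real optimizer does better.
        alphas = np.linspace(0.0, 1.0, 101)[1:]
        alpha = min(alphas, key=lambda a: f(x + a * d))
        x = x + alpha * d
        g_new = grad(x)
        if g_new @ g_new < 1e-12:
            break                       # gradient vanished: converged
        if (i + 1) % restart_every == 0:
            d = -g_new                  # reset the direction of search
        else:
            beta = g_new @ (g_new - g) / (g @ g)   # Polak-Ribiere weight
            d = -g_new + max(beta, 0.0) * d        # conjugate direction
        g = g_new
    return x

# The same two-parameter ravine as in the earlier sketch: the conjugate
# direction cuts along the valley floor instead of zigzagging.
f = lambda p: 0.5 * (p[0]**2 + 100.0 * p[1]**2)
grad = lambda p: np.array([p[0], 100.0 * p[1]])
print(minimize_cg(f, grad, np.array([-5.0, 1.0])))   # ~ [0, 0]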
In and of itself there is nothing wrong with parameter tuning. But of
course it is not predictive, and therefore fits to the data that result
from parameter tuning cannot be taken as support for the model or theory.
That is why we try to determine constant values (or ranges of values) for
architectural parameters (e.g. :egs [though take note of Werner Tack's
arguments regarding that parameter at the 2000 workshop]) and rules and
constraints for setting initial values of knowledge parameters (e.g. :rule).
Christian
Lebiere, C. (1998). The dynamics of cognition: An ACT-R model of cognitive
arithmetic. Ph.D. Dissertation. CMU Computer Science Dept Technical
Report CMU-CS-98-186. Pittsburgh, PA.
Available at http://reports-archive.adm.cs.cmu.edu/.
Lebiere, C. (1999). The dynamics of cognitive arithmetic.
Kognitionswissenschaft [Journal of the German Cognitive Science Society]
Special issue on cognitive modelling and cognitive architectures, D.
Wallach & H. A. Simon (Eds.), 8(1), 5-19.
Lebiere, C., & Wallach, D. (2000). Sequence learning in the ACT-R
cognitive architecture: Empirical analysis of a hybrid model. In Sun, R. &
Giles, C. L. (Eds.), Sequence Learning: Paradigms, Algorithms, and
Applications. Springer LNCS/LNAI, Germany.