[ACT-R-users] ACT-R/Soar on a robot

Kelley, Troy (Civ,ARL/HRED) tkelley at arl.army.mil
Fri Nov 4 10:25:02 EST 2005


I think it is a good idea to try to start a discussion about questions
like yours.
----
Yes, I was hoping there would be more discussion on these topics; I am
surprised more people have not responded.


> 1)  ACT-R grew out of the General Problem Solver (GPS) production system
> architecture developed by Newell and Simon.  Their intent was to develop
> a *general* problem solver using various syntax-based strategies.  These
> are also known as weak method problem solvers in AI.  However, it seems
> as if ACT-R has moved toward strong method problem solvers, which use
> *specific* domain knowledge to solve problems.  So the question is, can
> we develop ACT-R/Soar models that are general, adaptable problem solvers
> that are NOT domain specific?  Or does domain specificity so influence
> the problem solving process that one cannot extricate oneself from the
> domain?

I think research on expertise has shown that problem solving _is_ 
quite domain specific. Think of those "memory-artists" who have been 
trained to memorize lists of 70 or more digits but perform at a normal 
level when asked to memorize lists of letters, or of chess-experts 
trying to memorize random positions. On the other hand, it cannot be 
denied that humans are to a certain extent capable of weak method 
problem solving. But even in weak method problem solving a lot of 
general world knowledge is potentially relevant, which is difficult 
(but not impossible) to model. Also, problem solving is often based on 
induction, which is not one of the topics that has been extensively 
dealt with in the ACT-R community.
-----
I think weak method learning techniques are quite possible to model.
This is what Newell and Simon's GPS was all about, and it is basically
what natural language translation methods have been doing for years.  We
are also using some of the machine learning techniques that Tom Mitchell
has developed.  The basic idea is to represent productions in their
canonical, semantic form.  In our robotic environment SS-RICS, that means
we can't name a production anything we want; we must follow a
constrained syntax.  Once we have developed a syntax for productions,
we can mix and match productions to create new production systems.
This is similar to Schank's conceptual dependency work.  I think the key
concept is developing a syntax for production naming; then you can use
productions across models, or use the same productions for solving other
problems, which is what the weak methods are all about.
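
To make the naming idea concrete, here is a rough sketch in Python (not
actual SS-RICS code; the verbs and the <verb>/<arg-type> naming scheme
are just illustrative assumptions): productions carry canonical names
built from a constrained syntax, so rules written for one task can be
looked up and recombined in another.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Production:
    verb: str       # canonical action label, e.g. "move-to"
    arg_type: str   # canonical argument category, e.g. "location"
    fire: Callable[[dict], dict]   # condition/action as state -> state

    @property
    def name(self) -> str:
        return f"{self.verb}/{self.arg_type}"   # constrained naming syntax

class ProductionLibrary:
    def __init__(self):
        self.rules: dict[str, Production] = {}

    def add(self, p: Production) -> None:
        self.rules[p.name] = p

    def compose(self, *names: str) -> list[Production]:
        # "Mix and match": assemble a new production system from rules
        # written for earlier models, retrieved by canonical name.
        return [self.rules[n] for n in names]

lib = ProductionLibrary()
lib.add(Production("move-to", "location", lambda s: {**s, "at": s["goal"]}))
lib.add(Production("grasp", "object", lambda s: {**s, "holding": s["target"]}))

# A new task reuses the old rules without rewriting them:
fetch_task = lib.compose("move-to/location", "grasp/object")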


> 4) There is a distinction in psychological literature between long term,
> short term and working memory, but what does that mean computationally?
> For us, we have implemented different decay rates for LTM and STM.
> However, those decay rates are really part of a continuum of decay rates
> for all memories.  For example, if we have one million memories on our
> robot, don't those memories form a continuum of various decay rates?
> Why should I have to have a placeholder for LTM and STM and WM?  Why
> can't all my memories just have various decay rates?  It seems as if
> LTM, STM and WM are simply convenient labels for something that is
> really a continuum.  If we define WM as those memories currently in use,
> does that preclude LTM or STM from the problem space?

In my classes, I demonstrate how ACT-R's base-level learning equation 
nicely reproduces primacy and recency effects and their sensitivity to 
retention intervals in list learning. So why would you want to 
distinguish between STM and LTM? On the other hand, I miss an explicit 
theory of working memory in ACT-R. (To me it seems that WM functions 
and structures are implicitly distributed across the architecture. 
For example, the goal-chunk usually provides the functionality of WM 
without reflecting all we know about the dynamics of WM.)
------
Distinguishing between STM and LTM seems to be just a convenient box to
me.  Computationally, why not represent memory as a single continuum
with different parameters (i.e., decay rates)?
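
For what it's worth, ACT-R's base-level equation already behaves like
such a continuum: B_i = ln(sum_j t_j^(-d)), where t_j is the time since
the j-th use of chunk i and d is the decay parameter (0.5 by
convention).  A quick sketch:

import math

def base_level_activation(use_times, now, d=0.5):
    # ACT-R base-level learning: B_i = ln( sum_j (now - t_j)^(-d) )
    # use_times: times at which the memory was created or used
    # d: decay parameter; ACT-R's conventional default is 0.5
    return math.log(sum((now - t) ** -d for t in use_times))

# Recency falls out directly: no STM/LTM boundary, just lower activation
# the longer a memory has gone unused (and higher with repeated use).
print(base_level_activation([99.0], now=100.0))           # recent:     0.0
print(base_level_activation([10.0], now=100.0))           # old:       -2.25
print(base_level_activation([10.0, 50.0, 99.0], 100.0))   # rehearsed:  0.22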

It seems to me that an entire ACT-R model is working memory.  Ideally,
you don't include anything in declarative memory that is not used by the
model, and everything in the production cycle could be considered
working memory as well.  This is the difference between our system
SS-RICS and ACT-R: we need to distinguish among millions of memories,
not just the memories we have programmed into declarative memory.  So we
truly have LTM (i.e., memories from other tasks that are remembered but
not used in the current task).
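
A hypothetical sketch of that continuum at robot scale (the class names
and the exponential-decay form are my assumptions, not the actual
SS-RICS code): one store, per-trace decay, and "WM" as nothing more
than whatever clears a retrieval threshold at the moment.

import math
from dataclasses import dataclass

@dataclass
class Trace:
    content: str
    decay: float       # per-trace decay rate: the continuum parameter
    last_used: float

    def activation(self, now: float) -> float:
        return math.exp(-self.decay * (now - self.last_used))

store = [
    Trace("location of the charging dock", decay=0.02, last_used=0.0),
    Trace("current navigation goal", decay=0.50, last_used=99.0),
]

# "Working memory" is just the slice of the store that is active right
# now; "LTM" is everything else.  No separate structures are needed.
now = 100.0
working = [t.content for t in store if t.activation(now) > 0.3]
print(working)   # ['current navigation goal']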


> 5)  The transition from a rule-based, symbolic understanding of a
> problem to a proceduralized, intuitive understanding of a problem is
> difficult to represent computationally.  What does proceduralization
> really mean?  At the symbolic level, we can change latency values, or
> skip over rules, to speed up the procedure.  But really,
> psychologically, it seems as if the symbolic level rules are being
> "rewritten" or "re-compiled" as subsymbolic representations.  On our
> robot, it is unclear how we could take symbolic production systems and
> recompile them in a subsymbolic way to produce the improved performance
> seen in humans.

I would restate the question as "are there learned perceptual-motor 
shortcuts that completely bypass the central bottleneck?" I'm not sure 
about the answer.
----
I think there are.  Perhaps they don't completely bypass the central
bottleneck, but they certainly get turned into something different from
the serialized nature of a production.  That, in turn, frees up the
"central executive" to concentrate on other tasks.
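
One concrete reading of "re-compiled" is ACT-R's production compilation
mechanism: two rules that reliably fire in sequence are collapsed into
one specialized rule with the retrieved fact baked in, so the retrieval
step disappears.  A toy sketch (not ACT-R's actual implementation):

def retrieve_sum(state):
    # Rule 1: IF the goal is to add x and y THEN retrieve the sum fact
    state["retrieved"] = state["facts"][(state["x"], state["y"])]
    return state

def report_sum(state):
    # Rule 2: IF a sum fact was retrieved THEN report it as the answer
    state["answer"] = state["retrieved"]
    return state

def compile_rules(rule1, rule2, example_state):
    # Run the pair once, then bake the retrieved value into a single new
    # rule: future firings take one cycle and never touch memory.
    retrieved = rule1(dict(example_state))["retrieved"]
    def compiled(state):
        state["retrieved"] = retrieved
        return rule2(state)
    return compiled

state = {"x": 3, "y": 4, "facts": {(3, 4): 7}}
add_3_4 = compile_rules(retrieve_sum, report_sum, state)
print(add_3_4({"x": 3, "y": 4, "facts": {}})["answer"])  # 7, no retrieval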


> 6) ... that means there is a lot of information unavailable to the
> production system.  For example, points get organized into lines which
> then get organized into shapes at a Gestalt layer, which then become
> memories for the production system.  So, a lot of processing goes on
> before our sensory data becomes a memory.

That's the same in human information processing (see, e.g., Sperling's 
experiments). The difference is that in humans the details have the 
potential to be attended to by the "production system" under certain 
circumstances (see, e.g., the cocktail party phenomenon), which is 
hard to model.
-----------
Yes, very hard to model.  Within SS-RICS we have toyed with the idea of
sets of productions having their own activation level or strength.  The
highest-level goal is a goal of "attention," which moderates the other
goals based on goal activation.  So the goal of listening to another
person at a cocktail party would have an overall activation, and that
activation might not be high enough to block out the activation created
by hearing your name from across the room.  This means that low-level
stimuli would get activations as soon as they are created and could push
into the attention goal depending on their activation.  There is some
psychological support for this.  For example, the visual system treats
movement stimuli differently from other stimuli, suggesting that
movement has some kind of special importance.  So some lower-level
stimuli would have higher activations than others and would "overwhelm"
the attention goal.
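
A hypothetical sketch of that scheme (the activation values and labels
are made up for illustration; this is not SS-RICS code): goals and
incoming stimuli compete in one activation-ordered pool, and the
attention goal simply yields to whatever is most active.

import heapq

class AttentionGoal:
    def __init__(self):
        self._pool = []   # max-heap via negated activation

    def post(self, activation: float, label: str) -> None:
        # Goals and low-level stimuli are posted the same way, each with
        # an activation assigned as soon as it is created.
        heapq.heappush(self._pool, (-activation, label))

    def focus(self) -> str:
        # Attention goes to the most active candidate, so a sufficiently
        # strong stimulus can push past the current goal.
        return self._pool[0][1]

att = AttentionGoal()
att.post(0.6, "goal: listen to conversation partner")
att.post(0.2, "stimulus: background chatter")
print(att.focus())   # conversation goal holds attention
att.post(0.9, "stimulus: own name from across the room")
print(att.focus())   # the name overwhelms the attention goal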


> 7) Is cognition sensory specific?  Much of our code right now for the
> robot seems to be for interpreting information the robot gets from each
> one of its sensors.  So, if we were to put a new sensor on the robot,
> would it still be able to make sense of the world?  That is a direction
> we are going toward, but it can be very difficult.  Many of the
> productions we end up writing seem to be tied directly to specific
> sensory stimuli, and it is difficult to write productions that are
> general and not tied to sensory information.  So, is cognition bound to
> the sensory information returned from perceptual mechanisms (as Rodney
> Brooks would say) or can it be separated from sensory information?

This amounts to the problem of modeling the transformation of 
subsymbolic to symbolic information. Since the most peripheral 
elements in ACT-R/PM are also symbols, it does not really solve this 
problem. It would be useful to have an "ACT-R compatible" theory of 
the preattentive aspects of pattern recognition. The aforementioned 
problem of modeling induction plays a role here, too.
---
Agreed.  Actually, I think we have solved the above problem by making
everything the production system uses a memory, not a stimulus directly.
So a sensor will perceive the environment, create a label for what it
perceives (e.g., a chair), and put this into memory.  Then the
production system can act on the memory, not the stimulus.  How the
memory arrives there is irrelevant, because the production system is
just using the memory.  This creates more layers of programming, but it
separates cognition from perception.  I am not sure I agree this is how
the human system works, but for a robot to be flexible, we need this
layer of separation so that other sensors can be used.
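
A sketch of that separation layer (the sensor names and label vocabulary
are illustrative assumptions, not SS-RICS's actual interfaces): every
sensor deposits labeled memories into one store, and productions match
only on memories, never on raw sensor data.

from typing import Protocol

class Sensor(Protocol):
    def sense(self) -> list[str]: ...   # returns symbolic labels

class Camera:
    def sense(self) -> list[str]:
        return ["chair", "doorway"]     # stand-in for real vision output

class Sonar:
    def sense(self) -> list[str]:
        return ["obstacle-ahead"]       # stand-in for real range data

memory: set[str] = set()

def perceive(sensors: list[Sensor]) -> None:
    # Every sensor writes labels into the same memory store; productions
    # never learn which sensor (or what raw signal) produced a label.
    for s in sensors:
        memory.update(s.sense())

def avoid_production(mem: set[str]):
    # Matches the *memory* "obstacle-ahead" regardless of its source, so
    # adding a new sensor requires no changes to the productions.
    return "turn-left" if "obstacle-ahead" in mem else None

perceive([Camera(), Sonar()])
print(avoid_production(memory))   # turn-left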

Troy  




