[ACT-R-users] ACT-R/Soar on a robot

Wolfgang Schoppek Wolfgang.Schoppek at uni-bayreuth.de
Fri Nov 4 09:20:19 EST 2005


Troy,

I think it is a good idea to start a discussion about questions 
like yours.

> 1)  ACT-R grew out of the General Problem Solver (GPS) production system 
> architecture developed by Newell and Simon.  Their intent was to develop 
> a *general* problem solver using various syntax based strategies.  These 
> are also known as weak method problem solvers in AI.  However, it seems 
> as if ACT-R has moved toward strong method problem solvers, which use 
> *specific* domain knowledge to solve problems.  So the question is, can 
> we develop ACT-R/Soar models that are general adaptable problem solvers 
> that are NOT domain specific?  Or does domain specificity so influence 
> the problem solving process, that one cannot extricate oneself from the 
> domain?

I think research on expertise has shown that problem solving _is_ 
quite domain specific. Think of those "memory-artists" who have been 
trained to memorize lists of 70 or more digits but perform at a normal 
level when asked to memorize lists of letters, or of chess-experts 
trying to memorize random positions. On the other hand, it cannot be 
denied that humans are to a certain extent capable of weak method 
problem solving. But even in weak method problem solving a lot of 
general world knowledge is potentially relevant, which is difficult 
(but not impossible) to model. Also, problem solving is often based on 
induction, a topic that has not yet been dealt with extensively in the 
ACT-R community.


> 4) There is a distinction in psychological literature between long term, 
> short term and working memory, but what does that mean computationally?  
> For us, we have implemented different decay rates for LTM and STM.  
> However, those decay rates are really part of a continuum of decay rates 
> for all memories.  For example, if we have one million memories on our 
> robot, don’t those memories form a continuum of various decay rates?  
> Why should I have to have a place holder for LTM and STM and WM?  Why 
> can’t all my memories just have various decay rates?  It seems as if 
> LTM, STM and WM are simply convenient labels for something that is 
> really a continuum.  If we define WM as those memories currently in use, 
> does that preclude LTM or STM from the problem space?

In my classes, I demonstrate how ACT-R's baselevel-learning equation 
nicely reproduces primacy and recency effects and their sensitivity to 
retention intervals in list learning. So why would you want to 
distinguish between STM and LTM? On the other hand, what I miss is an 
explicit theory of working memory in ACT-R. (To me it seems that WM 
functions and structures are implicitly distributed across the 
architecture. For example, the goal chunk usually provides WM 
functionality without reflecting all we know about the dynamics of WM.)
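For what it's worth, the base-level learning equation I have in mind is 
B_i = ln(sum_j t_j^(-d)), and the continuum point can be seen in a few 
lines of Python. The presentation times below are made up for 
illustration; only the equation itself and the conventional default 
d = 0.5 come from ACT-R.

```python
import math

def base_level_activation(presentation_times, now, d=0.5):
    """ACT-R base-level learning: B_i = ln(sum_j t_j^(-d)),
    where t_j is the time elapsed since the j-th presentation of
    the chunk and d is the decay parameter (0.5 by convention)."""
    return math.log(sum((now - t) ** (-d) for t in presentation_times))

# Two chunks, each presented twice, probed at time 10. The one
# rehearsed recently is more active than the one rehearsed early --
# a recency effect, with no separate STM/LTM stores anywhere.
recent = base_level_activation([8.0, 9.0], now=10.0)
old = base_level_activation([1.0, 2.0], now=10.0)
assert recent > old
```

The same equation, run over a whole study list with rehearsal 
concentrated on the first items, also yields the primacy end of the 
serial-position curve.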


> 4) Has there ever been any use of *different* decay rates *within* an 
> ACT-R model?  I am talking about different base level activations for 
> different chunks at the beginning of the model run.  For example, we 
> have found it very helpful to code perceptual processing as building on 
> lower level memories that decay very quickly.  Once the higher level 
> concepts are formed, the lower level memories are quickly forgotten, and 
> the higher level concepts remain.  So, the higher level concepts have a 
> slower decay rate than the lower level perceptions.  I know right now, 
> in ACT-R, we request information from the buffers, but I have been 
> working on some memory research that says that information will “get in” 
> even if there is not a “request” for it (or an attentional shift to 
> it).  Perhaps this is more of a statement than a question then; what do 
> people think of having more control over decay rates for individual 
> chunks, especially those coming in from the perceptual components?

See my answer to 3).
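To make the suggestion concrete: per-chunk decay is not how standard 
ACT-R is parameterized (d is normally a global parameter), but the idea 
of fast-decaying perceptual traces versus slow-decaying concepts can be 
sketched by simply letting d vary per chunk. The particular d values and 
times below are arbitrary.

```python
import math

def activation(presentation_times, now, d):
    # Same base-level equation as before, but with a chunk-specific
    # decay parameter d instead of a single global value.
    return math.log(sum((now - t) ** (-d) for t in presentation_times))

# Both chunks presented once at t = 0, probed at t = 20.
percept = activation([0.0], now=20.0, d=1.0)   # low-level trace, fast decay
concept = activation([0.0], now=20.0, d=0.3)   # higher-level concept, slow decay
assert concept > percept  # the percept is effectively forgotten first
```

Whether such per-chunk decay rates are psychologically warranted is of 
course exactly the question under discussion.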


> 5)  The transition from a rule-based, symbolic understanding of a 
> problem, to a proceduralized, intuitive, understanding of a problem is 
> difficult to represent computationally. What does proceduralization 
> really mean?  At the symbolic level, we can change latency values, or 
> skip over rules, to speed up the procedure.  But really, 
> psychologically, it seems as if the symbolic level rules are being 
> “rewritten” or “re-compiled” as subsymbolic representations.  On our 
> robot, it is unclear how we could take symbolic production systems and 
> recompile them in a subsymbolic way to produce improved performance seen 
> in humans.

I would restate the question as "are there learned perceptual-motor 
shortcuts that completely bypass the central bottleneck?" I'm not sure 
about the answer.
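As a toy illustration of what "recompiling" might mean (this is a loose 
Python sketch of production compilation, not ACT-R's actual mechanism; 
all names and the arithmetic fact are invented):

```python
# Interpretive path: two general rules fire in sequence, with an
# explicit declarative retrieval in between.
def rule_retrieve(goal, memory):
    return {"answer": memory[goal["problem"]]}

def rule_respond(retrieval):
    return retrieval["answer"]

# Compiled path: one composed rule with the retrieved fact "baked
# in" -- faster and retrieval-free, but specific to this problem.
def compiled_rule(goal):
    assert goal["problem"] == "3+4"
    return 7

memory = {"3+4": 7}
goal = {"problem": "3+4"}

slow = rule_respond(rule_retrieve(goal, memory))  # two cycles
fast = compiled_rule(goal)                        # one cycle
assert slow == fast == 7
```

The open question remains whether, beyond collapsing rule chains like 
this, learned perceptual-motor shortcuts can bypass the central 
bottleneck entirely.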

> 6) ... that means there is a lot of information unavailable to the production 
> system.  For example, points get organized into lines which then get 
> organized into shapes at a Gestalt layer, which then become memories for 
> the production system.  So, a lot of processing goes on before our 
> sensory data becomes a memory.

That's the same in human information processing (see, e.g., Sperling's 
experiments). The difference is that in humans the details have the 
potential to be attended to by the "production system" under certain 
circumstances (see, e.g., the cocktail party phenomenon), which is 
hard to model.


> 7) Is cognition sensory specific?  Much of our code right now for the 
> robot seems to be for interpreting information the robot gets from each 
> one of its sensors.  So, if we were to put a new sensor on the robot, 
> would it still be able to make sense of the world?  That is a direction 
> we are going toward, but it can be very difficult.  Many of the 
> productions we end up writing seem to be tied directly to specific 
> sensory stimuli, and it is difficult to write productions that are 
> general and not tied to sensory information.  So, is cognition bound to 
> the sensory information returned from perceptual mechanisms (as Rodney 
> Brooks would say) or can it be separated from sensory information.  

This amounts to the problem of modeling the transformation of 
subsymbolic to symbolic information. Since the most peripheral 
elements in ACT-R/PM are also symbols, it does not really solve this 
problem. It would be useful to have an "ACT-R compatible" theory of 
the preattentive aspects of pattern recognition. The aforementioned 
problem of modeling induction plays a role here, too.

Best regards
-- Wolfgang
-----------------------------------------------------------------
  Dr. Wolfgang Schoppek          Universitaet Bayreuth
  Tel.: +49 921 554140
  http://www.uni-bayreuth.de/departments/psychologie/schoppek/
-----------------------------------------------------------------



