Connectionists: autonomous neural system for learning planned action sequences towards a rewarded goal
Stephen Grossberg
steve at cns.bu.edu
Mon Oct 1 16:33:24 EDT 2007
The following article is now available at
http://www.cns.bu.edu/Profiles/Grossberg:
Gnadt, W. and Grossberg, S.
SOVEREIGN: An autonomous neural system for incrementally learning
planned action sequences to navigate towards a rewarded goal.
Neural Networks, in press.
ABSTRACT
How do reactive and planned behaviors interact in real time? How are
sequences of such behaviors released at appropriate times during
autonomous navigation to realize valued goals? Controllers for both
animals and mobile robots, or animats, need reactive mechanisms for
exploration, and learned plans to reach goal objects once an
environment becomes familiar. The SOVEREIGN (Self-Organizing, Vision,
Expectation, Recognition, Emotion, Intelligent, Goal-oriented
Navigation) animat model embodies these capabilities, and is tested
in a 3D virtual reality environment. SOVEREIGN includes several
interacting subsystems which model complementary properties of
cortical What and Where processing streams and which clarify
similarities between mechanisms for navigation and arm movement
control. As the animat explores an environment, visual inputs are
processed by networks that are sensitive to visual form and motion in
the What and Where streams, respectively. Position-invariant and
size-invariant recognition categories are learned by real-time
incremental learning in the What stream. Estimates of target position
relative to the animat are computed in the Where stream, and can
activate approach movements toward the target. Motion cues from
animat locomotion can elicit head-orienting movements to bring a new
target into view. Approach and orienting movements are alternately
performed during animat navigation. Cumulative estimates of each
movement are derived from interacting proprioceptive and visual cues.
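The fusion of proprioceptive and visual cues into a cumulative movement estimate can be illustrated with a toy blend. This is only a hypothetical sketch under assumed names (`fuse_estimate`, a fixed `visual_weight`); SOVEREIGN's actual integration is a neural circuit, not a scalar filter.

```python
# Hypothetical sketch: blend proprioceptive (dead-reckoning) and visual
# movement estimates with a fixed weight. NOT the model's equations --
# just an illustration of cue interaction.

def fuse_estimate(proprioceptive, visual, visual_weight=0.7):
    """Blend two step estimates; fall back to proprioception without vision."""
    if visual is None:                       # no visual fix this step
        return proprioceptive
    return visual_weight * visual + (1.0 - visual_weight) * proprioceptive

# Cumulative estimate over a short sequence of movement steps
position = 0.0
for prop_step, vis_step in [(1.0, 1.1), (1.0, None), (1.0, 0.9)]:
    position += fuse_estimate(prop_step, vis_step)
```

Dropped visual input (the `None` step) simply reverts to the proprioceptive estimate, so the cumulative estimate degrades gracefully rather than stalling.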
Movement sequences are stored within a motor working memory.
Sequences of visual categories are stored in a sensory working
memory. These working memories trigger learning of sensory and motor
sequence categories, or plans, which together control planned
movements. Predictively effective chunk combinations are selectively
enhanced via reinforcement learning when the animat is rewarded.
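The idea of selectively enhancing predictively effective chunks when reward arrives can be sketched with a simple discrete update. This is an assumed toy stand-in (class `ChunkMemory`, a scalar strength per chunk), not the paper's differential-equation dynamics.

```python
# Hypothetical sketch of reward-modulated chunk selection. Each stored
# movement sequence (a "plan" chunk) carries a strength; reward pushes the
# strength of the chunk that predicted it toward 1, so that chunk comes to
# dominate the competition for planned movement control.

class ChunkMemory:
    def __init__(self, learning_rate=0.2):
        self.strengths = {}              # chunk (tuple of moves) -> strength
        self.learning_rate = learning_rate

    def store(self, sequence):
        # Working memory triggers learning of a sequence category (chunk).
        self.strengths.setdefault(tuple(sequence), 0.5)

    def reinforce(self, sequence, reward):
        # Selectively enhance the predictively effective chunk when rewarded.
        key = tuple(sequence)
        s = self.strengths[key]
        self.strengths[key] = s + self.learning_rate * reward * (1.0 - s)

    def best_plan(self):
        # The strongest chunk wins control of planned movement.
        return max(self.strengths, key=self.strengths.get)

memory = ChunkMemory()
memory.store(["orient", "approach", "orient", "approach"])  # long reactive route
memory.store(["approach", "approach"])                      # shorter route
for _ in range(5):
    memory.reinforce(["approach", "approach"], reward=1.0)  # rewarded at the goal
```

After repeated reward, `memory.best_plan()` returns the shorter sequence, mirroring the gradual shift from reactive exploration to efficient planned routes described above.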
Selected planning chunks effect a gradual transition from variable
reactive exploratory movements to efficient goal-oriented planned
movement sequences. Volitional signals gate interactions between
model subsystems and the release of overt behaviors. The model can
control different motor sequences under different motivational states
and learns more efficient sequences to rewarded goals as exploration
proceeds.
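The volitional gating described in the abstract can be pictured as a multiplicative gate on a subsystem's output: plans can be rehearsed covertly, and are released as overt behavior only when the gate opens. The function and names below are illustrative assumptions, not SOVEREIGN's actual circuitry.

```python
# Hypothetical sketch of volitional gating: a gate signal multiplies a
# subsystem's output, so activity drives overt behavior only when the
# volitional signal opens the gate.

def gated_output(plan_activity, volition):
    """Release overt behavior only when the volitional gate is open."""
    gate = 1.0 if volition else 0.0
    return [gate * a for a in plan_activity]

covert = gated_output([0.8, 0.3], volition=False)  # rehearsal: nothing released
overt = gated_output([0.8, 0.3], volition=True)    # gate open: plan drives action
```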