Connectionists: Papers on TD-FALCON and Minefield Navigation Simulator

Thu Apr 10 04:04:37 EDT 2008

Dear Colleagues,

We are pleased to announce the availability of two recently published
articles and their associated minefield navigation simulation platform.

The preprints of the papers can be downloaded from
http://www.ntu.edu.sg/home/asahtan/ under Publications.

The minefield simulator (written in Java) can be downloaded from
http://www.ntu.edu.sg/home/asahtan/ under Downloads.

Article 1:
Ah-Hwee Tan, Ning Lu and Dan Xiao. Integrating Temporal Difference
Methods and Self-Organizing Neural Networks for Reinforcement Learning
with Delayed Evaluative Feedback. IEEE Transactions on Neural Networks,
Vol. 19, No. 2 (February 2008), 230-244.

Abstract: This paper presents a neural architecture for learning
category nodes encoding mappings across multi-modal patterns involving
sensory inputs, actions, and rewards. By integrating Adaptive Resonance
Theory (ART) and temporal difference (TD) methods, the proposed neural
model, called TD-FALCON, enables an autonomous agent to adapt and
function in a dynamic environment with immediate as well as delayed
evaluative feedback (reinforcement) signals. TD-FALCON learns the value
functions of the state-action space estimated through on-policy and
off-policy temporal difference learning methods, specifically SARSA and
Q-Learning. The learned value functions are then used to determine the
optimal actions based on an action selection policy. We have developed
TD-FALCON systems using various TD learning strategies and compared
their performance in terms of task completion, learning speed, as well
as time and space efficiency. Experiments based on a minefield
navigation task have shown that TD-FALCON systems are able to learn
effectively with both immediate and delayed reinforcement and achieve a
stable performance at a pace much faster than that of standard
gradient-descent-based reinforcement learning systems.
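
For readers less familiar with the on-policy/off-policy distinction
mentioned in the abstract, the following is a minimal tabular sketch of
the two TD updates. It is an illustration only and is not the TD-FALCON
architecture, which represents value functions with a self-organizing
neural network rather than a lookup table. The function name td_update,
the state and action labels, and the values of alpha and gamma are all
assumptions made for this example.

```python
# Illustrative sketch only: plain tabular TD updates, NOT TD-FALCON itself.
# All names and parameter values here are hypothetical.
from collections import defaultdict


def td_update(q, state, action, reward, next_state, next_action, actions,
              alpha=0.5, gamma=0.9, on_policy=True):
    """Apply one TD update to q[(state, action)] and return the TD error."""
    if on_policy:
        # SARSA: bootstrap from the action the agent actually takes next.
        bootstrap = q[(next_state, next_action)]
    else:
        # Q-learning: bootstrap from the greedy action in the next state.
        bootstrap = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * bootstrap - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


actions = ["left", "right"]

# Two identical value tables, one per learning rule.
q_sarsa = defaultdict(float)
q_sarsa[("s1", "left")] = 1.0
q_sarsa[("s1", "right")] = 2.0
q_qlearn = defaultdict(float, q_sarsa)

# Same transition for both: the agent moves from s0 to s1 with zero reward
# and then (non-greedily) chooses "left" in s1.
td_update(q_sarsa, "s0", "right", 0.0, "s1", "left", actions, on_policy=True)
td_update(q_qlearn, "s0", "right", 0.0, "s1", "left", actions, on_policy=False)
```

With the same transition, the SARSA estimate for ("s0", "right") moves
toward the value of the action actually chosen in s1, while the
Q-learning estimate moves toward the value of the best action in s1, so
the two tables diverge whenever the agent explores.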

Article 2:
Dan Xiao and Ah-Hwee Tan. Self-Organizing Neural Architectures and
Cooperative Learning in Multi-Agent Environment. IEEE Transactions on
Systems, Man, and Cybernetics - Part B, Vol. 37, No. 6 (December 2007),
1567-1580.

Abstract: TD-FALCON (Temporal Difference - Fusion Architecture for
Learning, COgnition, and Navigation) is a generalization of Adaptive
Resonance Theory (a class of self-organizing neural networks) that
incorporates Temporal Difference (TD) methods for real-time
reinforcement learning. In this paper, we investigate how a team of
TD-FALCON networks may cooperate to learn and function in a dynamic
multi-agent environment based on a minefield navigation task and a
predator/prey pursuit task. Experiments on the navigation task
demonstrate that TD-FALCON agent teams are able to adapt and function
well in a multi-agent environment without an explicit mechanism of
collaboration. In comparison, traditional Q-learning agents using
gradient-descent-based feedforward neural networks, trained with the
standard backpropagation and the resilient propagation algorithms,
produce a significantly poorer level of performance. For the
predator/prey pursuit task, we experiment with various cooperative
strategies and find that a combination of a high-level compressed state
representation and a hybrid reward function produces the best results.
Using the same cooperative strategy, the TD-FALCON team also outperforms
the resilient-propagation-based reinforcement learners in terms of both
task completion rate and learning efficiency.

______________________________________________________________
Dr. Ah-Hwee Tan
Director, Emerging Research Lab <http://erlab.ntu.edu.sg>  
Associate Professor, School of Computer Engineering
<http://www.ntu.edu.sg/sce/default.asp> 
Nanyang Technological University
Homepage: http://www.ntu.edu.sg/home/asahtan
______________________________________________________________
