Stanford Adaptive Networks Colloquium

Mark Gluck netlist at psych.Stanford.EDU
Fri Sep 23 09:19:12 EDT 1988


         Stanford University Interdisciplinary Colloquium Series:
                 Adaptive Networks and their Applications
   

                     Oct. 4th (Tuesday, 3:15pm) 

**************************************************************************

                 Connectionist Prediction Systems:
   Relationship to Least-Squares Estimation and Dynamic Programming


                       RICHARD S. SUTTON

                       GTE Laboratories Incorporated
                       40 Sylvan Road
                       Waltham, MA 02254
                       <netmail to: rich%gte.com at RELAY.CS.NET>

**************************************************************************
                            - Abstract -


In this talk I will present two examples of productive interplay between
connectionist machine learning and more traditional engineering areas.
The first concerns the problem of learning to predict time series.  I
will briefly review previous approaches including least squares linear
estimation and the newer nonlinear backpropagation methods, and then
present a new class of methods called Temporal-Difference (TD)
methods.  Whereas previous methods are driven by the error or
difference between predictions and actual outcomes, TD methods are
similarly driven by the difference between temporally successive
predictions.  This idea is also the key idea behind the learning in
Samuel's checker player, in Holland's bucket brigade, and in Barto,
Sutton & Anderson's pole-balancer.  TD methods can be more efficient
computationally because their errors are available immediately after
the predictions are made, without waiting for a final outcome.  More
surprisingly, they can also be more efficient in terms of how much
data is needed to achieve a particular level of accuracy.  Formal
results will be presented concerning the computational complexity,
convergence, and optimality of TD methods.  Possible areas of
application of TD methods include temporal pattern recognition such as
speech recognition and weather forecasting, the learning of heuristic
evaluation functions, and learning control.

Second, I would like to present work on the theory of TD methods used in
conjunction with reinforcement learning techniques to solve control
problems.
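
The contrast drawn above, between updates driven by the final outcome
and updates driven by the next prediction, can be illustrated with a
small sketch.  It is not taken from the talk: the five-state random
walk, the table-lookup predictor, the step size alpha, and all function
names are assumptions chosen for illustration, and only the simplest
one-step form of a TD update is shown.

```python
import random


def random_walk(n_states=5, rng=random):
    """One episode of a symmetric random walk (hypothetical test problem).

    Starts in the middle state and steps left or right until it leaves
    the range 0..n_states-1.  Returns the visited states and the outcome:
    1.0 if the walk exits on the right, 0.0 if it exits on the left.
    """
    s = n_states // 2
    states = []
    while 0 <= s < n_states:
        states.append(s)
        s += rng.choice((-1, 1))
    outcome = 1.0 if s >= n_states else 0.0
    return states, outcome


def outcome_update(w, states, outcome, alpha=0.1):
    """Outcome-driven rule: move each prediction toward the final outcome.

    The error (outcome - w[s]) is only known once the episode has ended.
    """
    for s in states:
        w[s] += alpha * (outcome - w[s])
    return w


def td_update(w, states, outcome, alpha=0.1):
    """One-step TD rule: move each prediction toward the next prediction.

    The error is the difference between temporally successive predictions,
    so it is available one step later.  Only the last prediction of the
    episode is compared with the actual outcome.
    """
    for t, s in enumerate(states):
        if t + 1 < len(states):
            target = w[states[t + 1]]  # next prediction
        else:
            target = outcome           # final step: actual outcome
        w[s] += alpha * (target - w[s])
    return w


def train(update, episodes=2000, n_states=5, alpha=0.05, seed=0):
    """Train a table of predictions w[s] with the given update rule."""
    rng = random.Random(seed)
    w = [0.5] * n_states
    for _ in range(episodes):
        states, outcome = random_walk(n_states, rng)
        update(w, states, outcome, alpha)
    return w


if __name__ == "__main__":
    # For this walk the true probabilities of exiting on the right
    # are 1/6, 2/6, ..., 5/6 for states 0 through 4.
    print("TD:     ", [round(v, 2) for v in train(td_update)])
    print("Outcome:", [round(v, 2) for v in train(outcome_update)])
```

Both rules estimate the same quantity, the probability of exiting on
the right from each state.  In this sketch they differ only in the
target each prediction is moved toward.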


**************************************************************************

Location: Room 380-380W, which can be reached through the lower level
 between the Psychology and Mathematical Sciences buildings. 

Technical Level: These talks will be technically oriented and are intended 
 for persons actively working in related areas. They are not intended
 for the newcomer seeking general introductory material. 

Information: To be added to the network mailing list, netmail to
             netlist at psych.stanford.edu.  For additional information,
             contact Mark Gluck (gluck at psych.stanford.edu).

Upcoming talks:
      Nov. 22:     Mike Jordan (MIT)
      Dec.  6:     Ralph Linsker (IBM) 

                           *    *    *

Co-Sponsored by: Departments of Electrical Engineering (B. Widrow) and
       Psychology (D. Rumelhart, M. Pavel, M. Gluck), Stanford Univ.


