New book on Neuro-Dynamic Programming/Reinforcement Learning

Dimitri Bertsekas dimitrib at MIT.EDU
Sat Oct 12 01:46:39 EDT 1996


Dear colleagues,

Our Neuro-Dynamic Programming book has just been published, and we are
attaching a description.

Dimitri Bertsekas (dimitrib at mit.edu)
John Tsitsiklis (jnt at mit.edu)

********************************************************************

                   NEURO-DYNAMIC PROGRAMMING
                             by 
          Dimitri P. Bertsekas and John N. Tsitsiklis
            Massachusetts Institute of Technology
         
      (512 pages, hardcover, ISBN:1-886529-10-8, $79.00)
                     published by 
            Athena Scientific, Belmont, MA
            http://world.std.com/~athenasc/
            
Neuro-Dynamic Programming (NDP for short) is a recent class of
reinforcement learning methods that can be used to solve very large and
complex dynamic optimization problems. 

NDP combines simulation, learning, neural networks or other
approximation architectures, and the central ideas in dynamic
programming. It provides a rigorous framework for addressing
challenging and often intractable problems from a broad variety of
fields.  

This book provides the first systematic presentation of the science
and the art behind this far-reaching methodology.

Among its special features, the book:

-----------------------------------------------------------------------
** Describes and unifies a large number of reinforcement learning 
methods, including several that are new

** Rigorously explains the mathematical principles behind NDP

** Describes new approaches to formulation and approximate solution of
problems in stochastic optimal control, sequential decision making, 
and discrete optimization

** Illustrates through examples and case studies the practical
application of NDP to complex problems from resource allocation,
data communications, game playing, and combinatorial optimization

** Presents extensive background and new research material on dynamic
programming and neural network training
-----------------------------------------------------------------------

CONTENTS

1.  Introduction
  1.1. Cost-to-go Approximations in Dynamic Programming
  1.2. Approximation Architectures
  1.3. Simulation and Training
  1.4. Neuro-Dynamic Programming

2.  Dynamic Programming
  2.1. Introduction
  2.2. Stochastic Shortest Path Problems
  2.3. Discounted Problems
  2.4. Problem Formulation and Examples

3.  Neural Network Architectures and Training
  3.1. Architectures for Approximation
  3.2. Neural Network Training

4.  Stochastic Iterative Algorithms
  4.1. The Basic Model 
  4.2. Convergence Based on a Smooth Potential Function
  4.3. Convergence under Contraction or Monotonicity Assumptions
  4.4. The ODE Approach
  
5.  Simulation Methods for a Lookup Table Representation
  5.1. Some Aspects of Monte Carlo Simulation 
  5.2. Policy Evaluation by Monte Carlo Simulation 
  5.3. Temporal Difference Methods
  5.4. Optimistic Policy Iteration 
  5.5. Simulation-Based Value Iteration 
  5.6. Q-Learning 

6.  Approximate DP with Cost-to-Go Function Approximation
  6.1. Generic Issues -- From Parameters to Policies
  6.2. Approximate Policy Iteration
  6.3. Approximate Policy Evaluation Using TD(lambda)
  6.4. Optimistic Policy Iteration
  6.5. Approximate Value Iteration
  6.6. Q-Learning and Advantage Updating
  6.7. Value Iteration with State Aggregation
  6.8. Euclidean Contractions and Optimal Stopping
  6.9. Value Iteration with Representative States
  6.10. Bellman Error Methods
  6.11. Continuous States and the Slope of the Cost-to-Go
  6.12. Approximate Linear Programming
  6.13. Overview
  
7.  Extensions
  7.1. Average Cost per Stage Problems
  7.2. Dynamic Games
  7.3. Parallel Computation Issues

8.  Case Studies
  8.1. Parking
  8.2. Football
  8.3. Tetris
  8.4. Combinatorial Optimization -- Maintenance and Repair
  8.5. Dynamic Channel Allocation
  8.6. Backgammon

Appendix A: Mathematical Review
Appendix B: On Probability Theory and Markov Chains

********************************************************************

PREFACE: http://world.std.com/~athenasc/

********************************************************************

PUBLISHER'S INFORMATION:

Athena Scientific, P.O. Box 391, Belmont, MA 02178-9998, U.S.A.
Email: athenasc at world.std.com, Tel: (617) 489-3097, FAX: (617) 489-2017
WWW Site for Info and Ordering: http://world.std.com/~athenasc/

********************************************************************


