No subject

Fri Feb 22 07:29:04 EST 1991

Technical Reports available:

                             Planning with an
                           Adaptive World Model

                    S. Thrun, K. Moeller, A. Linden

We present a new connectionist planning method.  By interaction with an
unknown environment, a world model is progressively constructed using
gradient descent. To derive actions that are optimal with respect to future
reinforcement, planning proceeds in two steps: an experience network
proposes a plan, which is subsequently optimized by gradient descent through
a chain of world models, so that optimal reinforcement may be obtained when
the plan is actually executed.  The appropriateness of this method is
demonstrated by a robotics application and a pole balancing task.

(to appear in the proceedings of NIPS*90)
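
The second planning step described above (optimizing a proposed plan by
gradient descent through a chain of world models) can be illustrated with
a minimal sketch. This is not the method from the report: the world model
here is a fixed scalar linear map x' = a*x + b*u standing in for a learned
network, the initial plan stands in for the proposal of the experience
network, and the reinforcement is assumed to be the negative squared
distance of the final state from a goal. All names and parameter values
are hypothetical.

```python
def rollout(x0, plan, a=0.9, b=0.5):
    """Run a plan through a chain of identical world models.

    Assumption: the world model is the scalar linear map x' = a*x + b*u,
    a stand-in for a learned network. Returns all visited states.
    """
    states = [x0]
    for u in plan:
        states.append(a * states[-1] + b * u)
    return states


def plan_gradient(x0, plan, goal, a=0.9, b=0.5):
    """Gradient of (x_T - goal)^2 w.r.t. each action in the plan,
    obtained by propagating the error backwards through the chain."""
    states = rollout(x0, plan, a, b)
    dx = 2.0 * (states[-1] - goal)      # dE/dx_T
    grads = [0.0] * len(plan)
    for t in reversed(range(len(plan))):
        grads[t] = dx * b               # dE/du_t = dE/dx_{t+1} * b
        dx = dx * a                     # dE/dx_t = dE/dx_{t+1} * a
    return grads


def optimize_plan(x0, plan, goal, lr=0.1, steps=200, a=0.9, b=0.5):
    """Improve a proposed plan by plain gradient descent on its actions."""
    plan = list(plan)
    for _ in range(steps):
        grads = plan_gradient(x0, plan, goal, a, b)
        plan = [u - lr * g for u, g in zip(plan, grads)]
    return plan


if __name__ == "__main__":
    proposed = [0.0, 0.0, 0.0]          # stand-in for the experience network
    improved = optimize_plan(0.0, proposed, goal=1.0)
    print(improved, rollout(0.0, improved)[-1])
```

Only the actions are adjusted in this sketch; the world model stays fixed
while the plan is optimized.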

-------------------------------------------------------------------------

                  A General Feed-Forward Algorithm
                        for Gradient Descent
                      in Connectionist Networks

                        S. Thrun, F. Smieja

An extended feed-forward algorithm for recurrent connectionist networks is
presented. This algorithm, which works locally in time, is derived both for
discrete-in-time networks and for continuous networks.  Several standard
gradient descent algorithms for connectionist networks (e.g. Williams/Zipser
88, Pineda 87, Pearlmutter 88, Gherrity 89, Rohwer 87, Waibel 88, and
especially the backpropagation algorithm of Rumelhart/Hinton/Williams 86)
are mathematically derived from this algorithm.  The learning rule presented
in this paper is a superset of gradient descent learning algorithms for
multilayer networks, recurrent networks and time-delay networks, and allows
any combination of their components.

In addition, the paper presents feed-forward approximation procedures for
initial activations and external input values.  The former is used for
optimizing starting values of the so-called context nodes; the latter
turned out to be very useful for finding spurious input attractors of a
trained connectionist network.  Finally, we compare the time, processor and
space complexities of this algorithm with those of backpropagation for an
unfolded-in-time network and present some simulation results.

(in: "GMD Arbeitspapiere Nr. 483")
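
To give a feeling for what "feed-forward" and "local in time" mean for a
gradient computation, here is a minimal sketch. It is not the algorithm
from the report: it assumes a single recurrent unit
y(t) = tanh(w*y(t-1) + v*x(t)) with a summed squared error, and carries
the derivatives of the activation forward in time together with the
activation itself, so no unfolding in time and no backward pass is needed.
The derivative with respect to the initial activation y0 is included as a
simple analogue of optimizing a starting value. All names are hypothetical.

```python
import math


def forward_gradients(w, v, y0, inputs, targets):
    """Return (error, dE/dw, dE/dv, dE/dy0) for a one-unit recurrent net.

    Assumption: y(t) = tanh(w*y(t-1) + v*x(t)) and
    E = sum_t 0.5 * (y(t) - d(t))^2.
    The sensitivities p_* = dy(t)/d(parameter) are updated at every step
    from the previous step's values only.
    """
    y = y0
    p_w, p_v, p_y0 = 0.0, 0.0, 1.0      # dy/dw, dy/dv, dy/dy0 at time 0
    error = g_w = g_v = g_y0 = 0.0
    for x, d in zip(inputs, targets):
        y_new = math.tanh(w * y + v * x)
        slope = 1.0 - y_new * y_new     # derivative of tanh at this step
        # Forward sensitivity updates (chain rule through one time step).
        p_w = slope * (y + w * p_w)
        p_v = slope * (x + w * p_v)
        p_y0 = slope * (w * p_y0)
        y = y_new
        diff = y - d
        error += 0.5 * diff * diff
        # Accumulate the gradient as soon as the error at time t is known.
        g_w += diff * p_w
        g_v += diff * p_v
        g_y0 += diff * p_y0
    return error, g_w, g_v, g_y0


if __name__ == "__main__":
    print(forward_gradients(0.5, -0.3, 0.2, [1.0, -1.0, 0.5], [0.3, -0.2, 0.1]))
```

The extra cost of this forward scheme is one sensitivity value per
parameter per unit, which is why comparing its complexity with that of
backpropagation on an unfolded network is of interest.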

-------------------------------------------------------------------------

Both reports can be retrieved by ftp:

             unix>         ftp cis.ohio-state.edu

             Name:         anonymous
             Guest Login ok, send ident as password
             Password:     neuron
             ftp>          binary
             ftp>          cd pub
             ftp>          cd neuroprose
             ftp>          get thrun.nips90.ps.Z
             ftp>          get thrun.grad-desc.ps.Z
             ftp>          bye

             unix>         uncompress thrun.nips90.ps
             unix>         uncompress thrun.grad-desc.ps
             unix>         lpr thrun.nips90.ps
             unix>         lpr thrun.grad-desc.ps

-------------------------------------------------------------------------

For readers in Europe: the same files can be retrieved from gmdzi.gmd.de
(129.26.1.90), directory pub/gmd, which is probably a bit cheaper.

-------------------------------------------------------------------------

If you have trouble ftping the files, do not hesitate to contact me.

                                            --- Sebastian Thrun
                                          (st at gmdzi.uucp, st at gmdzi.gmd.de)