TR: online local step size adaptation
Nici Schraudolph
nic at idsia.ch
Mon Mar 30 04:20:51 EST 1998
Dear colleagues,
the following technical report (10 pages, 143kB gzipped postscript)
is available by anonymous ftp from the address given below.
Best regards,
--
Dr. Nicol N. Schraudolph    Tel: +41-91-970-3877
IDSIA                       Fax: +41-91-911-9839
Corso Elvezia 36
CH-6900 Lugano              http://www.idsia.ch/~nic/
Switzerland                 http://www.cnl.salk.edu/~schraudo/
Technical Report IDSIA-09-98:
Online Local Gain Adaptation for Multi-Layer Perceptrons
--------------------------------------------------------
Nicol N. Schraudolph
We introduce a new method for adapting the step size of each individual
weight in a multi-layer perceptron trained by stochastic gradient descent.
Our technique derives from the K1 algorithm for linear systems (Sutton,
1992), which in turn is based on a diagonalized Kalman Filter. We expand
upon Sutton's work in two regards: K1 is a) extended to nonlinear systems,
and b) made more efficient by linearizing an exponentiation operation.
The resulting ELK1 (extended, linearized K1) algorithm is only slightly
more expensive computationally than alternative proposals (Zimmermann,
1994; Almeida et al., 1997, 1998), and does not require an arbitrary
smoothing parameter. On a first benchmark problem, ELK1 clearly
outperforms these alternatives, as well as stochastic gradient descent
with momentum, even when the number of floating-point operations required
per weight update is taken into account. Unlike the method of Almeida
et al., ELK1 does not require statistical independence between successive
training patterns.
ftp://ftp.idsia.ch/pub/nic/olga.ps.gz
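
For readers unfamiliar with per-weight step size adaptation, the general
idea mentioned in the abstract can be sketched as follows. This is NOT the
ELK1 or K1 update rule (see the report for those); it is a simplified
illustration for a single linear unit, loosely in the style of Sutton's
gain adaptation work. The names linearized_gain, train_step, meta_rate,
trace and floor are hypothetical, and the lower clip on the gain factor is
an assumed safeguard. The point it shows is that a multiplicative step
size update of the form exp(u) can be replaced by its first-order
approximation 1 + u, which avoids the exponentiation.

```python
def linearized_gain(u, floor=0.5):
    """First-order stand-in for exp(u): 1 + u, clipped from below.

    The clip (an assumed safeguard) keeps the step size positive.
    """
    return max(floor, 1.0 + u)


def train_step(w, step, trace, x, target, meta_rate=0.01):
    """One online update of a linear unit with one step size per weight.

    w, step and trace are lists of equal length and are modified in place.
    Returns the prediction error measured before the update.
    """
    err = target - sum(wi * xi for wi, xi in zip(w, x))
    for i, xi in enumerate(x):
        # Grow the step size when the current gradient agrees with the
        # trace of recent updates to this weight; shrink it otherwise.
        step[i] *= linearized_gain(meta_rate * err * xi * trace[i])
        # Ordinary stochastic gradient step, using the per-weight step size.
        delta = step[i] * err * xi
        w[i] += delta
        # Decaying trace of recent updates to this weight.
        trace[i] = trace[i] * max(0.0, 1.0 - step[i] * xi * xi) + delta
    return err


if __name__ == "__main__":
    w, step, trace = [0.0, 0.0], [0.1, 0.1], [0.0, 0.0]
    patterns = [([1.0, 0.0], 1.0), ([0.0, 1.0], -2.0)]
    for _ in range(200):
        for x, target in patterns:
            train_step(w, step, trace, x, target)
    print("weights:", w)
    print("step sizes:", step)
```

In this toy setting each weight keeps receiving gradients of the same
sign, so its step size grows slightly from the initial value while the
weights approach the targets. Patterns are presented one at a time, as in
stochastic gradient descent.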