Batch Backprop versus Incremental

Scott E. Fahlman sef+ at cs.cmu.edu
Fri Jan 28 00:06:43 EST 1994


    ...a gentleman from the U.K.
    suggested that Batch mode learning could possibly be unstable in
    the long term for backpropagation.  I did not know the gentleman
    and when I asked for a reference he could not provide one.
    
    Does anyone have any kind of proof stating that one method is better 
    than another?  Or that possibly batch backprop is unstable in <<Some>>
    sense?
    
Those U.K. guys get some funny ideas.  I think it's the intoxicating fumes
from wet sheep.  :-)

Batch mode backprop is actually more stable (other things being equal) than
online (also known as "stochastic") updating.  In batch mode, each weight
update is made with respect to the true error gradient, computed over the
whole training set.  In online mode, each update is made with respect to a
single sample, and a few atypical samples in a row can take you very far
afield, especially if you use one of the fast training methods that adapt
the step size.  In addition, online training never settles down into a stable
minimum, since the network continues to be buffeted by the individual
training cases as they arrive.  (You can, of course, reduce the learning
rate once the net seems to have found a solution.)
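
To make the contrast concrete, here is a minimal sketch in plain Python.
Everything in it is an assumption made for illustration: a single weight w,
a squared-error loss on a linear unit, a hand-picked training set with two
atypical cases in a row, and the hypothetical helper names grad_one,
batch_step and online_epoch.  It is a toy, not a backprop implementation.

```python
# Toy model (assumed for illustration): one weight w, prediction w * x,
# per-sample loss 0.5 * (w * x - y) ** 2.

def grad_one(w, x, y):
    """Gradient of the per-sample loss with respect to w."""
    return (w * x - y) * x

def batch_step(w, data, lr):
    """One update along the gradient averaged over the whole training set."""
    g = sum(grad_one(w, x, y) for x, y in data) / len(data)
    return w - lr * g

def online_epoch(w, data, lr):
    """One pass over the data, updating w after every single sample."""
    path = []
    for x, y in data:
        w = w - lr * grad_one(w, x, y)
        path.append(w)
    return w, path

# Four typical cases (target 2) and two atypical ones in a row (target -6).
data = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0),
        (1.0, -6.0), (1.0, -6.0), (1.0, 2.0)]
lr = 0.5

# First epoch, both methods starting from w = 2.0.
first_batch = batch_step(2.0, data, lr)
w_online, first_path = online_epoch(2.0, data, lr)
print("batch after one update:", first_batch)
print("online path during one epoch:", first_path)

# Keep training with the same fixed learning rate.
w_batch = first_batch
path = first_path
for _ in range(50):
    w_batch = batch_step(w_batch, data, lr)
    w_online, path = online_epoch(w_online, data, lr)

spread = max(path) - min(path)
print("batch weight after 51 updates:", w_batch)
print("online spread within the last epoch:", spread)
```

In this toy, the least-squares solution is the mean target, -2/3.  The
batch update moves steadily toward it, while the two atypical cases push
the online weight out to -4 in the first epoch, and with the learning rate
held fixed the online weight keeps swinging within every later epoch.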

Perhaps the origin of this myth about batch learning is that, when the
gradient is summed over the training cases, you need to scale down the
gradient values (or the learning rate parameter) as the number of training
cases goes up.  If you don't, the effective learning rate grows with the
size of the training set and the learning will be unstable.
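
The scaling issue can be sketched in a few lines of Python.  The setup is
assumed for illustration only: a single weight, every training case with
the same target 1.0, a batch gradient that is summed over the cases, and a
hypothetical function train_batch with an average flag that divides the
gradient by the number of cases.

```python
def train_batch(n_cases, lr, average, steps=20):
    """Batch gradient descent on 0.5 * (w - y) ** 2, with y = 1.0 for
    every case (a toy problem assumed for illustration)."""
    targets = [1.0] * n_cases
    w = 0.0
    for _ in range(steps):
        g = sum(w - y for y in targets)  # gradient summed over all cases
        if average:
            g /= n_cases                 # scale down by the number of cases
        w -= lr * g
    return w

# Same learning rate throughout; only the training set size changes.
for n in (5, 50, 500):
    summed = train_batch(n, lr=0.1, average=False)
    averaged = train_batch(n, lr=0.1, average=True)
    print(n, "summed error:", abs(summed - 1.0),
          "averaged error:", abs(averaged - 1.0))
```

With the summed gradient, each step multiplies the error by
(1 - lr * n_cases), so in this toy the same learning rate that works for 5
cases blows up for 50 or 500.  Dividing by the number of cases (or,
equivalently, dividing the learning rate) makes the behavior independent
of the training set size.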

This isn't to say that batch learning is necessarily better.  As long as
you take small steps, online updating will usually be stable, and it can be
much faster than batch updating if the training set is large and redundant.
A net trained by online update might converge before even a single epoch
has been completed.
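
A rough sketch of the redundancy effect, again in plain Python and again
under assumed conditions: a single weight, noise-free targets y = 3 * x,
a training set made of four distinct patterns repeated 1000 times, and the
hypothetical helpers online_samples_to_converge and batch_one_epoch.  The
convergence test peeks at the known answer 3.0, which is only possible in
a toy like this one.

```python
TRUE_W = 3.0  # known only because the toy data are generated from it

# Four distinct patterns, repeated: 4000 cases, highly redundant.
base = [(0.5, 1.5), (1.0, 3.0), (-1.0, -3.0), (2.0, 6.0)]
data = base * 1000

def online_samples_to_converge(data, lr, tol):
    """Update after every sample; return how many samples were seen
    before w came within tol of TRUE_W (None if it never did)."""
    w = 0.0
    for seen, (x, y) in enumerate(data, start=1):
        w -= lr * (w * x - y) * x
        if abs(w - TRUE_W) < tol:
            return seen
    return None

def batch_one_epoch(data, lr):
    """One full pass over the data, producing a single averaged update."""
    w = 0.0
    g = sum((w * x - y) * x for x, y in data) / len(data)
    return w - lr * g

seen = online_samples_to_converge(data, lr=0.1, tol=1e-3)
w_batch = batch_one_epoch(data, lr=0.1)
print("online: samples needed:", seen, "out of", len(data))
print("batch: error after one full epoch:", abs(w_batch - TRUE_W))
```

In this toy the online run reaches the tolerance after a small fraction of
the first epoch, while the batch run has spent the whole epoch computing a
single step and is still far from the solution.  How large the gap is on a
real problem depends on how redundant the training set actually is.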

-- Scott

===========================================================================
Scott E. Fahlman			Internet:  sef+ at cs.cmu.edu
Senior Research Scientist		Phone:     412 268-2575
School of Computer Science              Fax:       412 681-5739
Carnegie Mellon University		Latitude:  40:26:33 N
5000 Forbes Avenue			Longitude: 79:56:48 W
Pittsburgh, PA 15213
===========================================================================


More information about the Connectionists mailing list