Does backprop need the derivative?

Radford Neal  radford@cs.toronto.edu
Sun Feb 7 12:24:15 EST 1993


Other posters have discussed, regarding backprop...

> ... the question whether the derivative can be replaced by a constant, 

To clarify, I believe the intent is that the "constant" have the same
sign as the derivative, but have constant magnitude.
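
For concreteness, here is a small sketch of the two per-weight update
rules being contrasted. (Python, purely illustrative; the function names
and learning rate are my own, not anything from the posts under
discussion.)

    import math

    def standard_update(w, g, eta=0.1):
        # Ordinary backprop step: the magnitude of the derivative g matters.
        return w - eta * g

    def sign_update(w, g, eta=0.1):
        # "Constant magnitude" step: only the sign of g is used.
        step = 0.0 if g == 0.0 else math.copysign(1.0, g)
        return w - eta * step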

Marwan Jabri says...

> Regarding (1), it is likely as Scott Fahlman suggested any derivative 
> that "preserves" the error sign may do the job. 

One would expect this to work only for BATCH training. On-line training
approximates the batch result only if the net result of updating the 
weights on many training cases mimics the summing of derivatives in
the batch scheme. This will not be the case if a training case where the
derivative is +0.00001 counts as much as one where it is +10000.
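
A toy calculation makes this vivid. (Again Python, with numbers invented
to match the example above.)

    # One case has derivative -10000; 999 others have derivative +0.00001.
    derivs = [-10000.0] + [0.00001] * 999
    eta = 0.01

    # Batch step: sum the derivatives first, then move once.
    batch_step = -eta * sum(derivs)     # about +100, so the weight goes UP

    # On-line sign-only steps: every case moves the weight by the same
    # amount, so the 999 tiny derivatives outvote the one large one.
    sign_steps = sum(-eta * (1.0 if g > 0.0 else -1.0) for g in derivs)
    # -eta * (999 - 1) = -9.98, so the weight goes DOWN: the wrong way.

    print(batch_step, sign_steps)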

This is not to say that it cannot work in some cases. There's just no
reason to think that it will work in general.

    Radford Neal


