Does backprop need the derivative ??
Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU
Sun Feb 7 12:56:03 EST 1993
As the intention of the inquirer is the analog implementation of
backprop, I see two problems: 1- the question whether the derivative can
be replaced by a constant, and more importantly 2- whether the precision
of the analog implementation will be high enough for backprop to work.
...
Regarding (2), there have been several reports indicating that
backpropagation simply does not work when the number of bits is reduced
towards 6-8 bits!
It is true that several studies show a sudden failure of backprop learning
when you use fixnum arithmetic and reduce the number of bits per word. The
point of failure seems to be problem-specific, but is often around 10-14
bits (including sign).
Marcus Hoehfeld and I studied this issue and found that the source of the
failure was a quantization effect: the learning algorithm needs to
accumulate lots of small steps, for weight-update or whatever, and since
these are smaller than half the low-order bit, it ends up accumulating a
lot of zeros instead. We showed that if a form of probabilistic rounding
(dithering) is used to smooth over these quantization steps, learning
continues on down to 4 bits or fewer, with only a gradual degradation in
learning time, number of units/weights required, and quality of the result.
This study used Cascor, but we believe that the results hold for backprop
as well.
Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited
Numerical Precision Using the Cascade-Correlation Learning Algorithm"
in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp.
602-611.
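To make the quantization effect concrete, here is a minimal Python sketch. It is not the code from the paper: the names (round_nearest, round_stochastic, accumulate) and the constants STEP, DELTA, and N are assumptions chosen for illustration. A single weight receives many updates that are each smaller than half the low-order bit. Round-to-nearest discards every one of them, while probabilistic rounding rounds up with probability equal to the fractional remainder, so the expected result equals the unrounded value and the small steps add up on average.

```python
import math
import random


def round_nearest(x, step):
    """Deterministic rounding to the nearest multiple of `step`."""
    return step * math.floor(x / step + 0.5)


def round_stochastic(x, step, rng):
    """Round to a neighboring multiple of `step`, choosing the upper one
    with probability equal to the fractional remainder."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    return step * (lower + (1 if rng.random() < frac else 0))


def accumulate(rounder, n_updates, delta, step):
    """Apply `n_updates` identical small updates to a weight that is
    re-quantized after every update."""
    w = 0.0
    for _ in range(n_updates):
        w = rounder(w + delta, step)
    return w


# Illustrative values only (assumptions, not taken from the study).
STEP = 2.0 ** -4     # value of the low-order bit of the stored weight
DELTA = STEP / 10    # each update is well under half the low-order bit
N = 10_000           # number of updates

rng = random.Random(0)
ideal = N * DELTA
w_nearest = accumulate(round_nearest, N, DELTA, STEP)
w_stochastic = accumulate(
    lambda x, s: round_stochastic(x, s, rng), N, DELTA, STEP
)

print("ideal (unlimited precision):", ideal)
print("round-to-nearest:           ", w_nearest)
print("probabilistic rounding:     ", w_stochastic)
```

With round-to-nearest the weight never leaves zero, which is the "accumulating a lot of zeros" failure. With probabilistic rounding the weight tracks the ideal sum to within random fluctuation, at the cost of some added noise in each individual update.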
Of course, a learning system implemented in analog hardware might have only
a few bits of accuracy due to noise and nonlinearity in the circuits, but
it wouldn't suffer from this quantization effect, since you get a sort of
probabilistic dithering for free.
-- Scott
===========================================================================
Scott E. Fahlman Internet: sef+ at cs.cmu.edu
Senior Research Scientist Phone: 412 268-2575
School of Computer Science Fax: 412 681-5739
Carnegie Mellon University Latitude: 40:26:33 N
5000 Forbes Avenue Longitude: 79:56:48 W
Pittsburgh, PA 15213
===========================================================================