Does backprop need the derivative ?? 
    Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU 
       
    Sun Feb  7 12:56:03 EST 1993
    
    
  
    As the intention of the inquirer is the analog implementation of
    backprop, I see two problems: 1- the question whether the derivative can
    be replaced by a constant, and more importantly 2- whether the precision
    of the analog implementation will be high enough for backprop to work.
    
    ...
    
    Regarding (2), there have been several reports indicating that
    backpropagation simply does not work when the number of bits is reduced
    towards 6-8 bits! 
    
It is true that several studies show a sudden failure of backprop learning
when you use fixnum arithmetic and reduce the number of bits per word.  The
point of failure seems to be problem-specific, but is often around 10-14
bits (including sign).

Marcus Hoehfeld and I studied this issue and found that the source of the
failure was a quantization effect: the learning algorithm needs to
accumulate lots of small steps, for weight-update or whatever, and since
these are smaller than half the low-order bit, it ends up accumulating a
lot of zeros instead.  We showed that if a form of probabilistic rounding
(dithering) is used to smooth over these quantization steps, learning
continues on down to 4 bits or fewer, with only a gradual degradation in
learning time, number of units/weights required, and quality of the result.
This study used Cascor, but we believe that the results hold for backprop
as well.

    Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited
    Numerical Precision Using the Cascade-Correlation Learning Algorithm"
    in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp.
    602-611.
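
To make the quantization effect above concrete, here is a minimal Python
sketch. It is only an illustration and not the procedure from the paper:
the step size (4 fractional bits), the update magnitude, the number of
steps, and the function names quantize_nearest, quantize_dithered, and
accumulate are all assumptions chosen for the demonstration. It adds the
same small update repeatedly to a quantized weight, once with ordinary
round-to-nearest and once with probabilistic rounding.

```python
import math
import random

def quantize_nearest(x, step):
    """Deterministically round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)

def quantize_dithered(x, step, rng):
    """Probabilistic rounding: round x down or up to a multiple of step,
    rounding up with probability equal to the fractional remainder,
    so the expected result equals x."""
    scaled = x / step
    low = math.floor(scaled)
    frac = scaled - low
    return step * (low + (1 if rng.random() < frac else 0))

def accumulate(update, n_steps, quantizer):
    """Add the same small update n_steps times, quantizing after each add."""
    w = 0.0
    for _ in range(n_steps):
        w = quantizer(w + update)
    return w

# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
step = 2.0 ** -4      # value of the low-order bit
update = step / 10    # smaller than half the low-order bit
n_steps = 10000

w_nearest = accumulate(update, n_steps, lambda x: quantize_nearest(x, step))
w_dithered = accumulate(update, n_steps, lambda x: quantize_dithered(x, step, rng))
ideal = update * n_steps  # what unlimited precision would accumulate

print("ideal:", ideal)
print("round-to-nearest:", w_nearest)
print("probabilistic rounding:", w_dithered)
```

With round-to-nearest every update is rounded away, so the weight never
leaves zero. With probabilistic rounding each update occasionally moves
the weight by one full step, and the total tracks the ideal sum on
average, at the cost of some added noise.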
Of course, a learning system implemented in analog hardware might have only
a few bits of accuracy due to noise and nonlinearity in the circuits, but
it wouldn't suffer from this quantization effect, since you get a sort of
probabilistic dithering for free.
-- Scott
===========================================================================
Scott E. Fahlman			Internet:  sef+ at cs.cmu.edu
Senior Research Scientist		Phone:     412 268-2575
School of Computer Science              Fax:       412 681-5739
Carnegie Mellon University		Latitude:  40:26:33 N
5000 Forbes Avenue			Longitude: 79:56:48 W
Pittsburgh, PA 15213
===========================================================================
    
    