Does backprop need the derivative?

Mon Jun 5 16:42:55 EDT 2006

Hi,
I have done some experiments on this and am working on
"Robustness of BP to transfer function and derivative" (tentative title).

I have found that the actual derivative need not be used.
Any "derivative" substitute (a constant or some other function) will do, so long
as it indicates the direction in which the transfer function is increasing or
decreasing, whether immediately or potentially. That is, if the transfer
function is increasing, or will increase, any positive value for the derivative
will do.

Hence for the sigmoid function, which is increasing everywhere, one may use a
positive constant.
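
A minimal sketch of the idea in Python, for a single sigmoid unit trained with
the delta rule on OR. The names train_unit, true_deriv and const_deriv, the
learning rate and the epoch count are illustrative assumptions, not taken from
the experiments mentioned above. Since y(1 - y) is always positive, replacing it
with 1.0 changes only the size of each update, never its sign.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def true_deriv(y):
    """Actual sigmoid derivative, written in terms of the output y."""
    return y * (1.0 - y)

def const_deriv(y):
    """Positive constant used in place of the derivative (assumed value 1.0)."""
    return 1.0

def train_unit(samples, deriv, lr=0.5, epochs=2000):
    """Train one two-input sigmoid unit; deriv(y) stands in for f'(x)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), t in samples:
            y = sigmoid(w[0] * x1 + w[1] * x2 + b)
            delta = (t - y) * deriv(y)  # error times the "derivative"
            w[0] += lr * delta * x1
            w[1] += lr * delta * x2
            b += lr * delta
    return w, b

def predict(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

# Logical OR as a toy training set.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
```

Calling train_unit(OR_DATA, const_deriv) and thresholding predict at 0.5 should
reproduce OR. With const_deriv the update reduces to lr * (t - y) * x, which is
the same update that gradient descent on the cross-entropy error gives for a
sigmoid unit.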

For the unit step function

f(x) = 1 for x >= 0
     = 0 for x < 0

we could use some high positive value at and near x = 0 and some low positive
value further away. Although the true derivative is zero everywhere except at
x = 0, using zero would jam (stop) the whole backprop process, since the
backpropagated error would (eventually) be zero in all nodes.
Using a low value in this case can therefore be interpreted as an indication
that if we move in the positive direction we may possibly increase the output
of the node.
Many variations of the derivative are possible.
I have tried many and they work (most of the time).
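
A rough Python sketch of one such variation, assuming a simple two-level
stand-in for the derivative. The width, high and low values and the names
pseudo_deriv and hidden_delta are hypothetical, chosen only for illustration. It
shows the error reaching a hidden step unit through a single outgoing weight.

```python
def step(x):
    """Unit step transfer function: 1 for x >= 0, else 0."""
    return 1.0 if x >= 0 else 0.0

def zero_deriv(x):
    """The true derivative of the step function away from x = 0."""
    return 0.0

def pseudo_deriv(x, width=0.5, high=1.0, low=0.1):
    """Assumed stand-in: `high` within `width` of 0, `low` further away."""
    return high if abs(x) <= width else low

def hidden_delta(w_out, delta_out, net, deriv):
    """Error passed back to a hidden unit whose net input is `net`."""
    return w_out * delta_out * deriv(net)
```

With zero_deriv, hidden_delta is 0 for every net input, so no weight feeding the
hidden unit would ever change. With pseudo_deriv it stays nonzero and keeps the
sign of w_out * delta_out.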
One problem with this is that if the output of the node is already "1", then
increasing the input would not increase the output as our derivative suggests.
What we need to do in this case is to check the direction of the backpropagated
error (i.e. +ve or -ve) and use two different values of our derivative depending
on the direction. I am still working on it.
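
One possible reading of this, sketched in Python. This is only a guess at the
rule, since the details are not settled above. It assumes err > 0 means the
output should rise and err < 0 that it should fall, and the name
directional_deriv and the values active and blocked are hypothetical.

```python
def directional_deriv(x, err, active=0.1, blocked=0.01):
    """Pick the derivative value from the error direction (assumed rule).

    x   : net input of a unit step node (output is 1 for x >= 0, else 0)
    err : backpropagated error; the sign convention is an assumption
    """
    output_is_one = x >= 0
    wants_increase = err > 0
    if output_is_one and wants_increase:
        return blocked  # output is already 1 and cannot go higher
    if not output_is_one and not wants_increase:
        return blocked  # output is already 0 and cannot go lower
    return active       # moving the input can still change the output
```

Here blocked is kept small but positive rather than zero, so that the error is
damped rather than cut off entirely.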

Hope this helps. Please contact me with any comments or for discussion.

Regards,
Tiong_Hwee Goh