backprop vs. gradient descent

Thomas VLSI Edwards tedwards at wam.umd.edu
Mon Sep 16 12:06:18 EDT 1991


->   From: Tomaso Poggio <tp-temp at ai.mit.edu>
->   Date: Mon, 9 Sep 91 20:58:53 EDT
->
->   Why call gradient descent backpropagation?
->
->
--> skl (Sean K. Lehman)
 LEHMAN2 at llnl.gov
 lehman at tweety.llnl.gov      (128.115.53.23)  ("I tot I taw a puddy tat") says:

 >I think you are confusing gradient descent, a mathematical
 >method for finding a local minimum, with backpropagation, a
 >learning algorithm for artificial neural networks.

Probably the best way to resolve this is to treat backpropagation
as a method for obtaining the error gradient of a neural net with
multiplicative weights and (typically) non-linear transfer functions:
the error must be propagated back through the net to obtain the
gradient, even if that gradient is then handed to something like
conjugate gradient rather than plain gradient descent.
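
For concreteness, here is a minimal sketch (my own illustrative
Python, assuming a one-hidden-layer net with sigmoid units and a
squared-error cost; all names are made up) of backpropagation used
purely as a gradient computation, with the resulting gradient left
for whatever optimizer you prefer:

    import numpy as np

    def forward(x, W1, W2):
        # forward pass through a tiny one-hidden-layer net with sigmoid units
        h = 1.0 / (1.0 + np.exp(-W1 @ x))    # hidden activations
        y = 1.0 / (1.0 + np.exp(-W2 @ h))    # output activations
        return h, y

    def backprop_gradient(x, target, W1, W2):
        # Backpropagation proper: compute the error gradient w.r.t. the
        # weights by propagating the output error back through the net.
        # The gradient can then be handed to any optimizer (plain
        # gradient descent, conjugate gradient, ...).
        h, y = forward(x, W1, W2)
        err = y - target                                 # dE/dy for squared error
        delta_out = err * y * (1.0 - y)                  # error at the output layer
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)   # error propagated back
        grad_W2 = np.outer(delta_out, h)
        grad_W1 = np.outer(delta_hid, x)
        return grad_W1, grad_W2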


Extending the definition of backpropagation to include the weight
update delta_w = -(learning rate) x (gradient) confuses people,
although it should be mentioned that this simple gradient descent
rule was often used in early backpropagation applications.
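
That update rule is just one possible consumer of the gradient; an
illustrative sketch (again my own, with an arbitrary learning rate)
might look like:

    def gradient_descent_step(weights, grads, learning_rate=0.1):
        # The simple rule often paired with backprop in early applications:
        # delta_w = -learning_rate * gradient.  The learning rate here is
        # only an illustrative value.
        return [W - learning_rate * g for W, g in zip(weights, grads)]

Nothing about the gradient computation above cares whether this rule
or a more sophisticated optimizer consumes its output.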

-Thomas Edwards



