<div dir="ltr">Colleagues,<br>I have had the benefit of learning from many of the pioneers in the field, including LeCun, Werbos, Widrow, and others. I also had the benefit of spending two years working with David Rumelhart's research group.<br><br>David worked in a number of areas, including the modeling of cognition and engineering applications, and he was even one of the first to map brain activity using magnetic resonance. Besides his contributions to back propagation, he made another primary contribution: the development of forward models, in collaboration with Jordan. Though many people took David's developments and extended them or even applied them commercially, he was always supportive and happy that they had extended his work, and he asked for nothing in return. He never felt that his contributions had been diminished by others extending his work.<br><br>I think Dave would have been happy to see these researchers build upon his work to achieve their award.<br><div>David Bisant, PhD</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 8, 2024 at 3:53 PM Stephen José Hanson &lt;<a href="mailto:jose@rubic.rutgers.edu">jose@rubic.rutgers.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#edeac4">
<p>Hi Steve,</p>
<p></p>
<p style="line-height:100%;margin-bottom:0in">The problem every writer encounters is deciding what counts as settled knowledge rather than new or novel knowledge. In the law this is of course “legal precedent”: does the reference point to a recent precedent,
or does the one from the 17<sup>th</sup> century take precedence? In the present case, I agree that calculating gradients of functions using the chain rule was invented (Legendre, least squares) long before Rumelhart and Hinton applied it to error gradients
in acyclic/cyclic networks, and of course there were others, as you say, in the 20<sup>th</sup> century who also applied error gradients to networks (Parker, LeCun et al.). Schmidhuber says all that matters is the “math”, not the applied context. However, I
seriously doubt that Legendre could have imagined that propagating gradients of function error through successive application in an acyclic network would produce a hierarchical kinship relationship (distinguishing between the mothers, fathers,
sons, aunts, grandparents, etc. of an Italian and an English family) in the hidden units of a network, simply by observing individuals with fixed feature relations. I think any reasonable person would maintain that this application is completely novel and could not have been predicted, in or out of
context, from the “math”, and certainly not from the 18<sup>th</sup> century. Hidden units were new in this context, and so was their representational nature. Scope of reference is also based on logical or causal proximity to the reference.
In this case, whether to reference Darwin or Newton in every biology or physics paper should depend on the outcome of a metaphorical test: do the recent results tie back to the original source in some direct line? For example, was Oswald’s grandfather responsible
for the death of President John F. Kennedy? Failing this test suggests that the older reference may not have scope. But of course this can be subjective.</p>
<p style="line-height:100%;margin-bottom:0in">Steve<br>
</p>
<p></p>
<div>On 10/8/24 2:38 PM, Grossberg, Stephen wrote:<br>
</div>
<blockquote type="cite">
<div>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif">Actually, Paul Werbos developed back propagation into its modern form, and worked out computational examples, for his 1974 Harvard PhD thesis.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif">Then David Parker rediscovered it in 1982, etc.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif">Schmidhuber provides an excellent and wide-ranging history of many contributors to Deep Learning and its antecedents:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><a href="https://www.sciencedirect.com/science/article/pii/S0893608014002135?casa_token=k47YCzFwcFEAAAAA:me_ZGF5brDqjRihq5kHyeQBzyUMYBypJ3neSinZ-cPn1pnyi69DGyM9eKSyLsdiRf759I77c7w" target="_blank">https://www.sciencedirect.com/science/article/pii/S0893608014002135?casa_token=k47YCzFwcFEAAAAA:me_ZGF5brDqjRihq5kHyeQBzyUMYBypJ3neSinZ-cPn1pnyi69DGyM9eKSyLsdiRf759I77c7w</a><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif">This article has been cited over 23,000 times.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:16pt;font-family:Arial,sans-serif"><u></u> <u></u></span></p>
<div id="m_-1505310067835002155mail-editor-reference-message-container">
<div>
<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(181,196,223);padding:3pt 0in 0in">
<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="font-size:12pt;color:black">From:
</span></b><span style="font-size:12pt;color:black">Connectionists <a href="mailto:connectionists-bounces@mailman.srv.cs.cmu.edu" target="_blank">
&lt;connectionists-bounces@mailman.srv.cs.cmu.edu&gt;</a> on behalf of Stephen José Hanson
<a href="mailto:jose@rubic.rutgers.edu" target="_blank">&lt;jose@rubic.rutgers.edu&gt;</a><br>
<b>Date: </b>Tuesday, October 8, 2024 at 2:25 PM<br>
<b>To: </b>Jonathan D. Cohen <a href="mailto:jdc@princeton.edu" target="_blank">
&lt;jdc@princeton.edu&gt;</a>, Connectionists <a href="mailto:connectionists@cs.cmu.edu" target="_blank">
&lt;connectionists@cs.cmu.edu&gt;</a><br>
<b>Subject: </b>Re: Connectionists: 2024 Nobel Prize in Physics goes to Hopfield and Hinton<u></u><u></u></span></p>
</div>
<p>Yes, Jon, good point here, although there is a through line from Hopfield to Hinton and Sejnowski, i.e., Boltzmann machines, and on to DL and LLMs.</p>
<p>Dave of course invented BP, Geoff would always say; his own contribution was to try to talk Dave out of it, as it had so many computational problems and could in no way be considered biologically plausible.</p>
<p>Steve<u></u><u></u></p>
<div>
<p class="MsoNormal"><span style="font-size:11pt">On 10/8/24 8:47 AM, Jonathan D. Cohen wrote:<u></u><u></u></span></p>
</div>
<blockquote style="margin-top:5pt;margin-bottom:5pt">
<pre>I’d like to add, in this context, a note in memoriam of David Rumelhart, who was an integral contributor to the work honored by today’s Nobel Prize.<u></u><u></u></pre>
<pre><u></u> <u></u></pre>
<pre>jdc<u></u><u></u></pre>
<pre><u></u> <u></u></pre>
</blockquote>
<pre>-- <u></u><u></u></pre>
<pre>Stephen José Hanson<u></u><u></u></pre>
<pre>Professor, Psychology Department<u></u><u></u></pre>
<pre>Director, RUBIC (Rutgers University Brain Imaging Center)<u></u><u></u></pre>
<pre>Member, Executive Committee, RUCCS<u></u><u></u></pre>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote></div>