<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"></div><div dir="ltr">Hi, Juergen,</div><div dir="ltr"><br></div><div dir="ltr">Thanks for your reply.  Restricting your title to “modern” AI as you did is a start, but I think still not enough. For example, from what I understand about NNAISENSE, through talking with you and Bas Steunebrink, there’s quite a bit of hybrid AI in what you are doing at your company, not well represented in the review. The related open-access book certainly draws heavily on both traditions (<a href="https://link.springer.com/book/10.1007/978-3-031-08020-3">https://link.springer.com/book/10.1007/978-3-031-08020-3</a>).</div><div dir="ltr"><br></div><div dir="ltr">Likewise, there is plenty of, e.g., symbolic planning in modern navigation systems, most robots, etc.; still plenty of use of symbolic trees in game playing; lots of people still use taxonomies and inheritance, etc., and AFAIK nobody has built a trustworthy virtual assistant, even in a narrow domain, with only deep learning. And so on. </div><div dir="ltr"><br></div><div dir="ltr">In the end, it’s really a question about balance, which is what I think Andrzej was getting at; you go miles deep on the history of deep learning, which I respect, but just give relatively superficial pointers (not none!) outside that tradition. Definitely better, to be sure, in having at least a few pointers than in having none, and I would agree that the future is uncertain. I think you strike the right note there!</div><div dir="ltr"><br></div><div dir="ltr">As an aside, saying that everything can be formulated as RL is maybe no more helpful than saying that everything we (currently) know how to do can be formulated in terms of a Turing machine. True, but doesn’t carry you far enough in most real-world applications. 
I personally see RL as part of an answer, but most useful in (and here we might partly agree) the context of systems with rich internal models of the world. </div><div dir="ltr"><br></div><div dir="ltr">My own view is that we will get to more reliable AI only once the field more fully embraces the project of articulating how such models work and how they are developed. </div><div dir="ltr"><br></div><div dir="ltr">Which is maybe the one place where you (eg <a href="https://arxiv.org/pdf/1803.10122.pdf">https://arxiv.org/pdf/1803.10122.pdf</a>), Yann LeCun (eg <a href="https://openreview.net/forum?id=BZ5a1r-kVsf">https://openreview.net/forum?id=BZ5a1r-kVsf</a>), and I (eg  <a href="https://arxiv.org/abs/2002.06177">https://arxiv.org/abs/2002.06177</a>) are most in agreement.</div><div dir="ltr"><br></div><div dir="ltr">Best,</div><div dir="ltr">Gary</div><div dir="ltr"><br><blockquote type="cite">On Jan 15, 2023, at 23:04, Schmidhuber Juergen <juergen@idsia.ch> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><span>Thanks for these thoughts, Gary! </span><br><span></span><br><span>1. 
Well, the survey is about the roots of “modern AI” (as opposed to all of AI) which is mostly driven by “deep learning.” Hence the focus on the latter and the URL "deep-learning-history.html.” On the other hand, many of the most famous modern AI applications actually combine deep learning and other cited techniques (more on this below).</span><br><span></span><br><span>Any problem of computer science can be formulated in the general reinforcement learning (RL) framework, and the survey points to ancient relevant techniques for search & planning, now often combined with NNs:</span><br><span></span><br><span>"Certain RL problems can be addressed through non-neural techniques invented long before the 1980s: Monte Carlo (tree) search (MC, 1949) [MOC1-5], dynamic programming (DP, 1953) [BEL53], artificial evolution (1954) [EVO1-7][TUR1] (unpublished), alpha-beta-pruning (1959) [S59], control theory and system identification (1950s) [KAL59][GLA85],  stochastic gradient descent (SGD, 1951) [STO51-52], and universal search techniques (1973) [AIT7].</span><br><span></span><br><span>Deep FNNs and RNNs, however, are useful tools for _improving_ certain types of RL. In the 1980s, concepts of function approximation and NNs were combined with system identification [WER87-89][MUN87][NGU89], DP and its online variant called Temporal Differences [TD1-3], artificial evolution [EVONN1-3] and policy gradients [GD1][PG1-3]. Many additional references on this can be found in Sec. 6 of the 2015 survey [DL1]. </span><br><span></span><br><span>When there is a Markovian interface [PLAN3] to the environment such that the current input to the RL machine conveys all the information required to determine a next optimal action, RL with DP/TD/MC-based FNNs can be very successful, as shown in 1994 [TD2] (master-level backgammon player) and the 2010s [DM1-2a] (superhuman players for Go, chess, and other games). 
For more complex cases without Markovian interfaces, …”</span><br><span></span><br><span>Theoretically optimal planners/problem solvers based on algorithmic information theory are mentioned in Sec. 19.</span><br><span></span><br><span>2. Here are a few relevant paragraphs from the intro:</span><br><span></span><br><span>"A history of AI written in the 1980s would have emphasized topics such as theorem proving [GOD][GOD34][ZU48][NS56], logic programming, expert systems, and heuristic search [FEI63,83][LEN83]. This would be in line with topics of a 1956 conference in Dartmouth, where the term "AI" was coined by John McCarthy as a way of describing an old area of research seeing renewed interest. </span><br><span></span><br><span>Practical AI dates back at least to 1914, when Leonardo Torres y Quevedo built the first working chess end game player [BRU1-4] (back then chess was considered as an activity restricted to the realms of intelligent creatures). AI theory dates back at least to 1931-34 when Kurt Gödel identified fundamental limits of any type of computation-based AI [GOD][BIB3][GOD21,a,b].</span><br><span></span><br><span>A history of AI written in the early 2000s would have put more emphasis on topics such as support vector machines and kernel methods [SVM1-4], Bayesian (actually Laplacian or possibly Saundersonian [STI83-85]) reasoning [BAY1-8][FI22] and other concepts of probability theory and statistics [MM1-5][NIL98][RUS95], decision trees, e.g. [MIT97], ensemble methods [ENS1-4], swarm intelligence [SW1], and evolutionary computation [EVO1-7][TUR1]. Why? 
Because back then such techniques drove many successful AI applications.</span><br><span></span><br><span>A history of AI written in the 2020s must emphasize concepts such as the even older chain rule [LEI07] and deep nonlinear artificial neural networks (NNs) trained by gradient descent [GD’], in particular, feedback-based recurrent networks, which are general computers whose programs are weight matrices [AC90]. Why? Because many of the most famous and most commercial recent AI applications depend on them [DL4]."</span><br><span></span><br><span>3. Regarding the future, you mentioned your hunch on neurosymbolic integration. While the survey speculates a bit about the future, it also says: "But who knows what kind of AI history will prevail 20 years from now?” </span><br><span></span><br><span>Juergen</span><br><span></span><br><span></span><br><blockquote type="cite"><span>On 14. Jan 2023, at 15:04, Gary Marcus <gary.marcus@nyu.edu> wrote:</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Dear Juergen,</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>You have made a good case that the history of deep learning is often misrepresented. But, by parity of reasoning, a few pointers to a tiny fraction of the work done in symbolic AI does not in any way make this a thorough and balanced exercise with respect to the field as a whole.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>I am 100% with Andrzej Wichert, in thinking that vast areas of AI such as planning, reasoning, natural language understanding, robotics and knowledge representation are treated very superficially here. A few pointers to theorem proving and the like does not solve that. 
</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Your essay is a fine if opinionated history of deep learning, with a special emphasis on your own work, but of somewhat limited value beyond a few terse references in explicating other approaches to AI. This would be ok if the title and aspiration didn’t aim for the field as a whole; if you really want the paper to reflect the field as a whole, and the ambitions of the title, you have more work to do. </span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>My own hunch is that in a decade, maybe much sooner, a major emphasis of the field will be on neurosymbolic integration. Your own startup is heading in that direction, and the commercial desire to make LLMs reliable and truthful will also push in that direction. </span><br></blockquote><blockquote type="cite"><span>Historians looking back on this paper will see too little about the roots of that trend documented here.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Gary </span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><blockquote type="cite"><span>On Jan 14, 2023, at 12:42 AM, Schmidhuber Juergen <juergen@idsia.ch> wrote:</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Dear Andrzej, thanks, but come on, the report cites lots of “symbolic” AI from theorem proving (e.g., Zuse 1948) to later surveys of expert systems and “traditional” AI. Note that Sec. 18 and Sec. 19 go back even much further in time (not even speaking of Sec. 20). The survey also explains why AI histories written in the 1980s/2000s/2020s differ. 
Here again the table of contents:</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 1: Introduction</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 2: 1676: The Chain Rule For Backward Credit Assignment</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 3: Circa 1800: First Neural Net (NN) / Linear Regression / Shallow Learning</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 4: 1920-1925: First Recurrent NN (RNN) Architecture. ~1972: First Learning RNNs</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 5: 1958: Multilayer Feedforward NN (without Deep Learning)</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 6: 1965: First Deep Learning</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 7: 1967-68: Deep Learning by Stochastic Gradient Descent </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 8: 1970: Backpropagation. 1982: For NNs. 1960: Precursor. </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 9: 1979: First Deep Convolutional NN (1969: Rectified Linear Units) </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 10: 1980s-90s: Graph NNs / Stochastic Delta Rule (Dropout) / More RNNs / Etc</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 11: Feb 1990: Generative Adversarial Networks / Artificial Curiosity / NN Online Planners</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 
12: April 1990: NNs Learn to Generate Subgoals / Work on Command </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 13: March 1991: NNs Learn to Program NNs. Transformers with Linearized Self-Attention</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 14: April 1991: Deep Learning by Self-Supervised Pre-Training. Distilling NNs</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 15: June 1991: Fundamental Deep Learning Problem: Vanishing/Exploding Gradients</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 16: June 1991: Roots of Long Short-Term Memory / Highway Nets / ResNets</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 17: 1980s-: NNs for Learning to Act Without a Teacher </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 18: It's the Hardware, Stupid!</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 19: But Don't Neglect the Theory of AI (Since 1931) and Computer Science</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 20: The Broader Historic Context from Big Bang to Far Future</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 21: Acknowledgments</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sec. 
22: 555+ Partially Annotated References (many more in the award-winning survey [DL1])</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Tweet: https://twitter.com/SchmidhuberAI/status/1606333832956973060?cxt=HHwWiMC8gYiH7MosAAAA </span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Jürgen</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>On 13. Jan 2023, at 14:40, Andrzej Wichert <andreas.wichert@tecnico.ulisboa.pt> wrote:</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Dear Juergen,</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>You make the same mistake as was made in the early 1970s. 
You identify deep learning with modern AI; the paper should instead be called “Annotated History of Deep Learning.”</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Otherwise, you ignore symbolic AI, like search, production systems, knowledge representation, planning, etc., as if it is not part of AI anymore (as suggested by your title).</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Best,</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Andreas</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>--------------------------------------------------------------------------------------------------</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Prof. 
Auxiliar Andreas Wichert   </span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>http://web.tecnico.ulisboa.pt/andreas.wichert/</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>-</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>https://www.amazon.com/author/andreaswichert</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Instituto Superior Técnico - Universidade de Lisboa</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Campus IST-Taguspark</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Avenida Professor Cavaco Silva                 Phone: +351  214233231</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>2744-016 Porto Salvo, Portugal</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>On 13 Jan 2023, at 08:13, Schmidhuber Juergen <juergen@idsia.ch> 
wrote:</span><br></blockquote></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Machine learning is the science of credit assignment. My new survey credits the pioneers of deep learning and modern AI (supplementing my award-winning 2015 survey):</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>https://arxiv.org/abs/2212.11279</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>https://people.idsia.ch/~juergen/deep-learning-history.html</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>This was already reviewed by several deep learning pioneers and other experts. 
Nevertheless, let me know at juergen@idsia.ch if you can spot any remaining errors or have suggestions for improvements.</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Happy New Year!</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Jürgen</span><br></blockquote></blockquote></blockquote></blockquote><span></span><br><span></span><br></div></blockquote></body></html>