<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"></div><div dir="ltr">Dear Geoff,</div><div dir="ltr"><br></div><div dir="ltr">Apology accepted, and I am very sorry that I misread the temporal contingency as causal. (I have noticed the delays of which you speak, so I can see in hindsight exactly how that might have happened.)</div><div dir="ltr"><br></div><div dir="ltr">To answer your question, I never really wrote much about image synthesis; I am a language and higher-level cognition researcher, not a vision guy (though I try to stay current). What I might have said, if you had asked me in 2010, would have been to point to Pinker’s old example about <i>man bites dog</i>, which follows directly from Fodor and Pylyshyn’s 1988 Cognition article on systematicity. I wrote an update on where all that stands a week or two ago, <a href="https://garymarcus.substack.com/p/horse-rides-astronaut?s=w">https://garymarcus.substack.com/p/horse-rides-astronaut?s=w</a>, and also wrote a bit more on the topic here <a href="https://arxiv.org/abs/2204.13807">https://arxiv.org/abs/2204.13807</a>, explaining why I think that there is a very direct relation between the limits that current draw-from-text systems face and the limits on the language side, in compositionality and semantics, that Fodor, Pylyshyn, Pinker, and I all foresaw.</div><div dir="ltr"><br></div><div dir="ltr">That said, I am genuinely impressed with the art, and certainly would not have anticipated the photorealism or the flexibility. I see how it’s done, but I wouldn’t have seen it coming, and I am particularly impressed by the consistency of perspective and lighting. The NeRF scene rendering stuff is also astonishingly good. In some ways deep learning has been absolutely brilliant.</div><div dir="ltr"><br></div><div dir="ltr">When I have said “deep learning is hitting a wall”, I don’t mean that there is no progress, but rather that there are certain things that deep learning on its own can’t do. </div><div dir="ltr"><br></div><div dir="ltr">A lot of them actually have to do with the relations between wholes and parts, which I know have been a focus of your own work on the vision side. Despite our history of friction, I was super positive when Tech Review asked me to comment on your GLOM framework:</div><div dir="ltr"><span style="-webkit-text-size-adjust: auto; font-family: Independent, serif; font-size: 18px; background-color: rgb(255, 255, 255);"><br></span></div><div dir="ltr"><span style="-webkit-text-size-adjust: auto; font-family: Independent, serif; font-size: 18px; background-color: rgb(255, 255, 255);"><i>Marcus admires Hinton’s willingness to challenge something that brought him fame, to admit it’s not quite working. “It’s brave,” he says. “And it’s a great corrective to say, ‘I’m trying to think outside the box.’”</i></span></div><div dir="ltr"><span style="-webkit-text-size-adjust: auto; font-family: Independent, serif; font-size: 18px; background-color: rgb(255, 255, 255);"><br></span></div><div dir="ltr">That whole sphere of questions is crucial, and not yet solved. As I told them, I admire you for trying.</div><div dir="ltr"><br></div><div dir="ltr">Again, I really appreciate the apology, and am sorry that I misread that email. It really would be good for the field if we could cultivate a better relationship and raise the level of our mutual discussion, e.g., by comparing the progress and obstacles in language versus vision.
Personally, I would be thrilled; Steve Pinker has told me several times about the amazing and unexpected 3-d example you challenged him with when the two of you first met. He still remembers it vividly, 45 years later. </div><div dir="ltr"><br></div><div dir="ltr">I also know that you and I are both dissatisfied with benchmarkitis, and neither of us is fully satisfied with mere scaling. Instead of arguing about what we might have predicted in the past, it would be fun to challenge the field together.</div><div dir="ltr"><br></div><div dir="ltr">Best regards,</div><div dir="ltr">Gary</div><div dir="ltr"><br></div><div dir="ltr"><br></div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On Jun 11, 2022, at 11:01, Geoffrey Hinton <geoffrey.hinton@gmail.com> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr"><div dir="auto">I am very sorry if you thought I was accusing you of being deranged. I do not think that at all. I just think you are stuck in a failed paradigm. </div><div dir="auto"><br></div><div dir="auto">I did accuse you of seeking attention and I would be happy to provide evidence for that if you really want to go there.</div><div dir="auto"><br></div><div dir="auto">My comment about the need to moderate deranged rantings was sent to connectionists at 1.23pm on June 8 and appeared on the list at 5.58am on June 9. It was in response to a completely different email. As you know, there is a delay between sending something to the list and it appearing on the list. Your email, which seemed to be soliciting a response from me, appeared on June 9 at 3.41am which was after I sent my email about deranged rantings. It's very unfortunate that my email about deranged rantings appeared a few hours after your email appeared, but there was no causal connection.</div><div dir="auto"><br></div><div>I am all in favor of providing clearly specified tests for whether a model "understands" and I gave one example of a good step in that direction in my podcast with Pieter Abbeel.</div><div><br></div><div>I also think that the ability to draw pictures when given captions is pretty convincing evidence of understanding the caption. My intense irritation with your comments is largely driven by my belief that in about 2010 you would have confidently predicted that the current performance of neural nets at drawing pictures from captions was unattainable by a purely connectionist system. But I guess we will never know.</div><div><br></div><div>Geoff</div><div><br></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 10, 2022 at 11:26 PM Gary Marcus <<a href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto"><div dir="ltr"></div><div dir="ltr">Let’s review:</div><div dir="ltr"><br></div><div dir="ltr">Hinton accuses me of not setting clear criteria.</div><div dir="ltr"><br></div><div dir="ltr">I offer 6 reasonably clear criteria.</div><div dir="ltr"><br></div><div dir="ltr">A significant sample of the ML community elsewhere applauds the criteria, and engages seriously. </div><div dir="ltr"><br></div><div dir="ltr">Hinton says it’s deranged to discuss them; after that, nobody here dares. 
</div><div dir="ltr"><br></div><div dir="ltr">Hanson derides the whole project and stacks the deck; ignores the cold fusion, flying jet packs, driverless taxis, and so on that haven’t become practical despite promises, citing only the numerator but not the denominator of history, further stifling any serious discussion of what Hinton’s requested targets might be.</div><div dir="ltr"><br></div><div dir="ltr">Was Hinton’s request for clear criteria agenuine good faith request? Does anyone on this list have better criteria? Do you always find it appropriate to tag team people for responding to requests in good faith?</div><div dir="ltr"><br></div><div dir="ltr">Open scientific discussion, literally for decades a hallmark of this list, appears to have left the building. Very unfortunate.</div><div dir="ltr"><br></div><div dir="ltr">Gary</div><div dir="ltr"><br></div><div dir="ltr"><br></div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On Jun 10, 2022, at 8:14 AM, Stephen Jose Hanson <<a href="mailto:stephen.jose.hanson@rutgers.edu" target="_blank">stephen.jose.hanson@rutgers.edu</a>> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr">
</div></blockquote></div><div dir="auto"><blockquote type="cite"><div dir="ltr">
<p><font size="+1"><font face="monospace">Bets? The <i>Augus</i><i>t</i> discussion months ago has reduced to bets? Really?<br>
</font></font></p>
<p><font size="+1">Gentleman, lets step back a bit... on the one hand this seems like schoolyard squabble about who can jump from the highest point on a wall without breaking a leg..</font></p>
<p><font size="+1">On the other hand.. it also feels like a troll* standing in a North Carolina field saying to Orville.. .."OK, so it worked for 12 seconds, I bet this never fly across an ocean!"</font></p>
<p><font size="+1">OR</font></p>
<p><font size="+1">" (1961) sure sure, you got a capsule in the upper stratosphere, but I bet you will never get to the moon".</font></p>
<p><font size="+1">OR</font></p>
<p><font size="+1">"1994, Ok, your computational biology model can do protein folding with about 40% match.. 20 years later not much improvement (60%).. so I bet you'll never reach 90% match". (in 2020, Deepmind published Alphafold--which reached over 94%
matches).<br>
</font></p>
<p><font size="+1"><br>
</font></p>
<p><font size="+1">So this type of counterfactual silliness, is simply due to our deep ignorance of the technologies in the future.. but who could know the tech of the future?
<br>
</font></p>
<p><font size="+1">Its really really really early in what is happening in AI now. .snipping at it at this point is sort of pointless. As we just don't know alot yet.</font></p>
<p><font size="+1">(1) how do DL models learn? (2) how do DL models represent knowledge? (3) What do DL models have to do with Brain?</font></p>
<p><font size="+1">Instead here's a useful project:<br>
</font></p>
<p><font size="+1">Recent work in language acquisition due to Yang an Piantidosi (PNAS 2022) who developed a symbolic model--similar to what Chomsky described as a Universal learning model (starting with recursion), seems to work surprisingly well. They provide
a large archive number of learning problems (FSM, CF, CS) cases.. which would be an interesting project for someone interested in RNN-DLs or LSTMs to show the same results, without the symbolic alg, they defined.<br>
</font></p>
<p><font size="+1">Y Yang and S.T. Piantadosi One model for the learning of language January 24, 2022, PNAS.</font></p>
<p><font size="+1">Finally, AGI.. so this is old idea and a borrowed idea from LL Thurstone, who in 1930, defined different types of Human Intelligence including a type of "GENERAL Intelligence". This lead to IQ tests and frustrating attempts at finding
it ... instead leading Thurstone to invent Factor analysis. Its difficult enough to try and define human intelligence, without claiming some sort of "G" factor for AI. With due respect to my friends at DeepMind... This seems like a deadend.</font></p>
<p><font size="+1">Cheers,</font></p>
<p><font size="+1">Steve<br>
</font></p>
<p><font size="+1"></font><br>
</p>
<p>* a troll is a person who posts inflammatory, insincere, digressive, extraneous, or off-topic messages in an online community, with the intent of provoking readers into displaying emotional responses or of manipulating others' perceptions<br>
</p>
<div>On 6/9/22 4:33 PM, Gary Marcus wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Dear Dr. Hinton,</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">You very directly asked my side to produce some tangible goals. Ernest Davis and I did precisely what you asked, and in return you described me (in a separate but public message that also appears to have come from your account) as deranged. There
is no world in which that is socially acceptable, or a positive step towards science. </div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Your reaction is particularly striking because it is a clear outlier. In general, despite the perfectly reasonable questions that you asked about wording in your subsequent email (which would presumably need be negotiated in any actually-implemented
bet, as one moved from term sheet to contract), the community reaction has actually been quite favorable. LongNow offered to host it, Metaculus added to their forecast site, Christian Szegedy placed a side bet and ran a lengthy discussion about the math proposition,
etc. Your reactions are so far out of range with any of the other reaction that I have received than I am frankly concerned for your well-being.</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">§</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">All that said, you raised one substantive question, and I will address it. I am reasonably familiar with the PaLM model that you refer to, and aware of the claim that it “understands” jokes. I am also aware of the fact that what is provided in
the paper was a numerator without a denominator.</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">There is no clarity about what was tried, or how general it was, or under what conditions it failed; subset is indeed the operative word. In no world would what was written pass traditional peer review. Even for a conference paper, what is reported
about joke interpretation is woefully thin. It’s really not much rest to your theory on. </div>
<div dir="ltr"><br>
</div>
<div dir="ltr">I am also aware of the fact that your colleagues at Google failed to supply more detail when I requested it—never a good sign. </div>
<div dir="ltr"><br>
</div>
<div dir="ltr">How general is the ability? Is it a handful of paraphrases of jokes in vast memorized database? Would it extend to other kinds of jokes? Could it (or related models like Gato, with visual input) explain this cartoon?</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><img alt="image1.jpeg" style="width:896px;max-width:100%" src="cid:1815393991198796b131"></div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Or why this bit of dialog with GPT (prompt in bold) is inadvertently funny?</div>
<blockquote type="cite">
<div dir="ltr">
<p><b>Sally's cow died yesterday. When will the cow be alive again? Let's think step by step.</b></p>
<p>First, the cow must be buried. This takes a few hours.</p>
<p>Next, a new cow must be born. This takes nine months.</p>
<p>Therefore, the earliest the cow will be alive again is nine months from now.</p>
</div>
</blockquote>
<div dir="ltr">Probably not. </div>
<div dir="ltr"><br>
</div>
<div dir="ltr">§</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">What we have known since Eliza is that humans are easily seduced into anthropomorphizing machines. I am going to stand by my claim that current AI lacks understanding: </div>
<div dir="ltr">
<ul>
<li>one cannot derive a set of logic propositions from a large language model</li><li>one cannot reliably update a world model based on an LLM’s calculations (a point that LeCun has also made, in slightly different terms)</li><li>one cannot reliably reason from what an LLM derives</li><li>LLMs themselves cannot reliably reason from what they are told.</li></ul>
</div>
<div dir="ltr">My point is not a Searlean one about the impossibility of machines thinking, just a reality of the limits of contemporary systems. On the latter point, I would also urge you to read my recent essay called “Horse rides Astronaut”, to see how
easy it is make up incorrect rationalization about these models when they make errors. </div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Inflated appraisals of their capabilities may serve some sort of political end, but will not serve science.</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">I cannot undo whatever slight some reviewer did to Yann decades ago, but I can call the current field as I see it; I don’t believe that current systems have gotten significantly closer to what I described in that 2016 conversation that you quote
from. I absolutely stand by the claim that we are a long way from answering “<span style="color:rgb(26,26,26);font-family:Spectral,serif,-apple-system,system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol"">the
deeper questions in artificial intelligence, like how we understand language or how we reason about the world." SInce you are found of quoting stuff I right 6 or 7 years ago, here’s a challenge that I proposed in the New Yorker 2014; to me I see real progress
on this sort of thing, thus far:</span></div>
<div dir="ltr"><span></span></div>
<blockquote type="cite">
<div dir="ltr"><span><br>
</span></div>
<div dir="ltr"><span style="font-family:TNYAdobeCaslonPro,"Times New Roman",Times,serif;background-color:rgb(255,255,255);font-size:20px"><i>allow me to propose a Turing Test
for the twenty-first century: build a computer program that can watch any arbitrary TV program or YouTube video and answer questions about its content—“Why did Russia invade Crimea?” or “Why did Walter White consider taking a hit out on Jessie?” Chatterbots
like Goostman can hold a short conversation about TV, but only by bluffing. (When asked what “Cheers” was about, it responded, “How should I know, I haven’t watched the show.”) But no existing program—not Watson, not Goostman, not Siri—can currently come close
to doing what any bright, real teenager can do: watch an episode of “The Simpsons,” and tell us when to laugh.</i></span></div>
</blockquote>
<div dir="ltr"><span style="font-family:TNYAdobeCaslonPro,"Times New Roman",Times,serif;background-color:rgb(255,255,255);font-size:17px"><br>
</span></div>
<div dir="ltr"><span style="font-family:TNYAdobeCaslonPro,"Times New Roman",Times,serif;background-color:rgb(255,255,255);font-size:17px">Can Palm-E do that? I seriously doubt it. </span></div>
<div dir="ltr"><span> </span></div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Dr. Gary Marcus</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Founder, Geometric Intelligence (acquired by Uber)</div>
<div dir="ltr">Author of 5 books, including Rebooting AI, one of Forbes 7 Must read books in AI, and The Algebraic Mind, one of the key early works advocating neurosymbolic AI</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><br>
<blockquote type="cite">On Jun 9, 2022, at 11:34, Geoffrey Hinton <a href="mailto:geoffrey.hinton@gmail.com" target="_blank">
<geoffrey.hinton@gmail.com></a> wrote:<br>
<br>
</blockquote>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">I shouldn't respond because your main aim is to get attention without going to the trouble of building something that works (personal communication, Y. LeCun) but I cannot resist pointing out the following Marcus claim from 2016:
<div><span><br>
</span></div>
<div><span>"People are very
excited about big data and what it's giving them right now, but I'm not sure it's taking us closer to the deeper questions in artificial intelligence, like how we understand language or how we reason about the world. "</span></div>
<div><font face="Spectral, serif, -apple-system, system-ui,
Segoe UI, Roboto, Helvetica, Arial, sans-serif, Apple
Color Emoji, Segoe UI Emoji, Segoe UI Symbol" color="#1a1a1a"><br>
</font></div>
<div><font face="Spectral, serif, -apple-system, system-ui,
Segoe UI, Roboto, Helvetica, Arial, sans-serif, Apple
Color Emoji, Segoe UI Emoji, Segoe UI Symbol" color="#1a1a1a">Given that big neural nets can now explain why
a joke is funny (for some subset of jokes) do you still want to stick with this claim? It seems to me that the reason you made this claim is because you have a strong prior belief about how language understanding and reasoning must work and this belief is
remarkably resistant to evidence. Deep learning researchers have seen this before. Yann had a paper rejected by a vision conference even though it beat the state-of-the-art and one of the reasons given was that the model learned everything and therefore
taught us nothing about how to do vision. That particular referee had a strong idea of how computer vision must work and failed to notice that the success of Yann's model showed that that prior belief was spectacularly wrong. </font></div>
<div><font face="Spectral, serif, -apple-system, system-ui,
Segoe UI, Roboto, Helvetica, Arial, sans-serif, Apple
Color Emoji, Segoe UI Emoji, Segoe UI Symbol" color="#1a1a1a"><br>
</font></div>
<div><font face="Spectral, serif, -apple-system, system-ui,
Segoe UI, Roboto, Helvetica, Arial, sans-serif, Apple
Color Emoji, Segoe UI Emoji, Segoe UI Symbol" color="#1a1a1a">Geoff</font></div>
<div><font face="Spectral, serif, -apple-system, system-ui,
Segoe UI, Roboto, Helvetica, Arial, sans-serif, Apple
Color Emoji, Segoe UI Emoji, Segoe UI Symbol" color="#1a1a1a"><br>
</font>
<div><br>
</div>
<div><br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Jun 9, 2022 at 3:41 AM Gary Marcus <<a href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="auto">
<div dir="ltr">Dear Connectionists, and especially Geoff Hinton,
<div dir="ltr">
<div dir="ltr">
<div><br>
</div>
<div>It has come to my attention that Geoff Hinton is looking for challenging targets. In a just-released episode of The Robot Brains podcast [<a href="https://www.youtube.com/watch?v=4Otcau-C_Yc" target="_blank">https://www.youtube.com/watch?v=4Otcau-C_Yc</a>],
he said
<div><br>
</div>
<div><i>“If any of the people who say [deep learning] is hitting a wall would just write down a list of the things it’s not going to be able to do then five years later, we’d be able to show we’d done them.”</i></div>
<div><br>
</div>
<div>Now, as it so happens, I (with the help of Ernie Davis) did just write down exactly such a list of things last week, and indeed offered Elon Musk a $100,000 bet along similar lines.</div>
<div><br>
</div>
<div>Precise details are here, towards the end of the essay: </div>
<div><br>
</div>
<div><a href="https://urldefense.com/v3/__https://garymarcus.substack.com/p/dear-elon-musk-here-are-five-things?s=w__;!!BhJSzQqDqA!Xh3JO9ofzqekK6I5uDA0F9J35tYqCEKqe2VyJXZaTtWlhk_g0aLu79J2fMwGE1WT43F66Osn0VHJ10Uf2t-8BN37K60l$" target="_blank">https://garymarcus.substack.com/p/dear-elon-musk-here-are-five-things</a></div>
<div><br>
</div>
<div>Five are specific milestones, in video and text comprehension, cooking, math, etc.; the sixth is the proviso that for an intelligence to be deemed “general” (which is what Musk was discussing in a remark that prompted my proposal), it would need to solve
a majority of the problems. We can probably all agree that narrow AI for any single problem on its own might be less interesting.</div>
<div><br>
</div>
<div>Although there is no word yet from Elon, Kevin Kelly offered to host the bet at LongNow.Org, and Metaculus.com has transformed the bet into 6 questions that the community can comment on. Vivek Wadhwa, cc’d, quickly offered to double the bet, and several
others followed suit; the bet to Elon (should he choose to take it) currently stands at $500,000.</div>
<div><br>
</div>
<div>If you’d like in on the bet, Geoff, please let me know. </div>
<div><br>
</div>
<div>More generally, I’d love to hear what the connectionists community thinks of the six criteria I laid out (as well as the arguments at the top of the essay, as to why AGI might not be as imminent as Musk seems to think).</div>
<div><br>
</div>
<div>Cheers,</div>
<div>Gary Marcus</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</blockquote>
<div>-- <br>
<img border="0" style="width:556px;max-width:100%" alt="signature.png" src="cid:1815393991113d051522"></div>
</div></blockquote></div></blockquote></div></div>
</div>
</div></blockquote></body></html>