<div dir="ltr">Judging from the responses below, ChatGPT understands counting and sorting to the extent that it can write a correct Python program but does not have the understanding to bring the proper execution results.<div><img src="cid:ii_lf8hhck82" alt="count_and_sort.png" width="340" height="555"><br>Here's the program that ChatGPT created:<br>---<br>sentences = [<br> "The quick brown fox jumps over the lazy dog",<br> "Python is a popular programming language",<br> "I like to eat pizza for dinner",<br> "The capital of France is Paris",<br> "The cat in the hat wears a red and white striped hat",<br> "My favorite color is blue",<br> "The United States has fifty states",<br> "There are seven days in a week",<br> "I enjoy listening to music while I work",<br> "Mount Everest is the highest mountain in the world"<br>]<br><br># sort the sentences by number of words<br>sentences.sort(key=lambda x: len(x.split()))<br><br># print the sorted sentences with the number of words in parentheses<br>for sentence in sentences:<br> num_words = len(sentence.split())<br> print(f"{sentence} ({num_words})")<br>---<br><br>The execution of this program yields the following correct results:<br>---<br>My favorite color is blue (5)<br>Python is a popular programming language (6)<br>The capital of France is Paris (6)<br>The United States has fifty states (6)<br>I like to eat pizza for dinner (7)<br>There are seven days in a week (7)<br>I enjoy listening to music while I work (8)<br>The quick brown fox jumps over the lazy dog (9)<br>Mount Everest is the highest mountain in the world (9)<br>The cat in the hat wears a red and white striped hat (12)<br>---<br><br>Oka Natsuki<br>Miyazaki Sangyo-keiei University<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">2023年3月13日(月) 17:45 Gary Marcus <<a href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto"><div dir="ltr"></div><div dir="ltr">Geoff, Terry (mentioned below) and others,</div><div dir="ltr"><br></div><div dir="ltr">You raise an important question.</div><div dir="ltr"><br></div><div dir="ltr">Of course learning disabled people can understand some things and not others. Just as some computer scientists understand computer science and not psychology, etc. (and vice versa; unfortunately a lot of psychologists have never written a line of code, and that often undermines their work).</div><div dir="ltr"><br></div><div dir="ltr">That said your remark was itself a deflection away from my own questions, which I will reprint here, since you omitted them.</div><div dir="ltr"><span style="background-color:rgb(255,255,255)"><br></span></div><div dir="ltr"><blockquote type="cite"><span style="background-color:rgb(255,255,255)"><i>If a broken clock were correct twice a day, would we give it credit for patches of understanding of time? If n-gram model produced a sequence that was 80% grammatical, would we attribute to an underlying understanding of grammar?</i></span></blockquote></div><div dir="ltr"><br></div><div dir="ltr">The point there (salient to every good cognitive psychologist) is that you can’t infer underlying psychology and internal representations <i>directly</i> from behavior.</div><div dir="ltr"><br></div><div dir="ltr">A broken clock is behaviorally correct (occasionally) but it doesn’t have a functioning internal representation of time. 
<div dir="ltr">Psychology is hard. Almost any “correct” behavior can be created in a multiplicity of ways; that’s why (cognitive) psychologists who are interested in underlying representations so often look to errors and tests of generalization.</div><div dir="ltr"><br></div><div dir="ltr">In the case of LLMs, it’s clear that even when they produce a correct output, they rarely if ever derive the same abstractions that a human would, or that a symbolic machine might use (perhaps preprogrammed) in a similar circumstance.</div><div dir="ltr"><br></div><div dir="ltr">Minerva, for example, is trained on an immense amount of data and ostensibly captures two-digit arithmetic, but it fails altogether on four-digit multiplication. The parsimonious explanation is that it is doing a kind of pattern recognition over stored examples (with two-digit cases more densely sampled than four-digit cases) rather than genuinely understanding what multiplication is about.</div><div dir="ltr"><br></div><div dir="ltr">The same goes for essentially everything an LLM talks about; there is a degree of generalization to similar examples, but distribution shift is hard (the crux of my own work going back to 1998), and nearly any generalization can be easily broken.</div><div dir="ltr"><br></div><div dir="ltr">As a last example, consider the following, where it initially sort of seems as if ChatGPT has understood both counting and sorting in the context of a complex query (which would be truly impressive), but on inspection it gets the details wrong, because it is relying on similarity rather than actually inducing the abstractions that define counting or sorting.</div><div dir="ltr"><br></div><div dir="ltr"><img src="cid:186e0feb627cb971f161" width="957.5" alt="ChatGPT's response to the counting-and-sorting request"></div><div dir="ltr"><br></div><div dir="ltr">This example, by the way, also speaks against what Terry erroneously alleged yesterday (“<span style="font-family:Helvetica;font-size:12px">If you ask a nonsense question, you get a nonsense answer... LLMs mirror the intelligence of the prompt”). </span>The request is perfectly clear and in no way nonsensical; the system just isn’t up to the job.</div><div dir="ltr"><br></div><div dir="ltr">Cheers, </div><div dir="ltr">Gary </div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On Mar 10, 2023, at 10:57, Geoffrey Hinton <<a href="mailto:geoffrey.hinton@gmail.com" target="_blank">geoffrey.hinton@gmail.com</a>> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr">A clever deflection. But can you please say whether you think learning disabled people understand some things even though they do not understand others?
This should be an area in which you actually have some relevant expertise.<div><br><div>Geoff</div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Mar 10, 2023 at 1:45 PM Gary Marcus <<a href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto"><div dir="ltr"></div><div dir="ltr">I think you should really pose this question to Yann LeCun, who recently said “LLMs have a more superficial understanding of the world than a house cat” (<a href="https://twitter.com/ylecun/status/1621861790941421573?s=61&t=eU_JMbqlN1G6Dkgee1AzlA" target="_blank">https://twitter.com/ylecun/status/1621861790941421573?s=61&t=eU_JMbqlN1G6Dkgee1AzlA</a>).</div><div dir="ltr"><br></div><div dir="ltr">Curious to hear how the conversation goes. </div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On Mar 10, 2023, at 10:04 AM, Geoffrey Hinton <<a href="mailto:geoffrey.hinton@gmail.com" target="_blank">geoffrey.hinton@gmail.com</a>> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr"><br><div>A former student of mine, James Martens, came up with the following way of demonstrating ChatGPT's lack of understanding. He asked it how many legs the rear left side of a cat has. </div><div>It said 4. </div><div><br></div><div>I asked a learning disabled young adult the same question. He used the index finger and thumb of both hands pointing downwards to represent the legs on the two sides of the cat and said 4.</div><div>He has problems understanding some sentences, but he gets by quite well in the world and people are often surprised to learn that he has a disability. </div><div><br></div><div>Do you really want to use the fact that he misunderstood this question to say that he has no understanding at all?</div><div>Are you really happy with using the fact that ChatGPT sometimes misunderstands to claim that it never understands?</div><div><br></div><div>Geoff</div><div><br></div></div>
</div></blockquote></div></blockquote></div>
</div></blockquote></div></blockquote></div></div>