Connectionists: Statistics versus “Understanding” in Generative AI.

Brad Wyble bwyble at gmail.com
Mon Feb 19 10:53:51 EST 2024


Iam, the difference is that while you may need an external source to
remember all 50 states, for the ones that you have remembered or looked up,
you are able to verify that they do or do not contain specific letters
without consulting a resource or writing code to verify it.  What is even
worse is that if you push the models on their mistakes, they are still
unable to correct them.
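
(For what it's worth, the check itself is trivial once a tool is allowed;
here is a minimal Python sketch, with the state list truncated to a handful
of names for illustration:

   STATES = ["Texas", "Alaska", "Wyoming", "Tennessee", "New Jersey", "Ohio"]
   no_a = [s for s in STATES if "a" not in s.lower()]
   print(no_a)  # -> ['Wyoming', 'Tennessee', 'New Jersey', 'Ohio']

The point is precisely that the models fail at a check this easy to
automate, and then cannot be talked out of the failure.)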

A better counterargument to the example Dave provides is that perhaps LLMs
simply cannot break things down at the letter level because of their
reliance on tokens.  Humans can do this, of course, but a good analogy for
us might be the Müller-Lyer illusion, which is essentially impenetrable to
our cognitive faculties.  That is, we are unable to force ourselves to see
the lines at their true lengths on the page, because the basis of our
representations does not permit it.  This is perhaps similar to the way
that LLM representations preclude them from accessing the letter level.
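
(A rough way to see what the model actually receives is to print the
subword pieces rather than the characters.  This sketch assumes the
tiktoken package, and the exact token boundaries will vary by model:

   import tiktoken
   enc = tiktoken.get_encoding("cl100k_base")
   ids = enc.encode("Tennessee")
   print(ids)                             # a short list of integer token IDs
   print([enc.decode([i]) for i in ids])  # the subword strings the model sees

The word arrives as a few opaque chunks, with nothing in the input
explicitly marking the individual letters.)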

However, I think a good counterpoint to this is that while people are
unable to un-see the Müller-Lyer illusion, it is not that difficult to
teach someone about this blind spot and get them to reason around it, with
no external tools, just their own reasoning faculties.  LLMs seem unable to
achieve this level of self-knowledge no matter how patiently things are
explained.  They do not have the metacognitive faculty that would allow
them even to understand their blind spot about letters.




On Mon, Feb 19, 2024 at 10:06 AM Gary Marcus <gary.marcus at nyu.edu> wrote:

> Correct; also tool integration has actually been less successful than some
> people believe:
>
>
> https://open.substack.com/pub/garymarcus/p/getting-gpt-to-work-with-external
>
> On Feb 19, 2024, at 5:49 AM, Thomas Trappenberg <tt at cs.dal.ca> wrote:
>
>
> Good point, but Dave's point stands, as the models he is referring to did
> not even comprehend that they had made mistakes.
>
> Cheers, Thomas
>
> On Mon, Feb 19, 2024, 4:43 a.m. <wuxundong at gmail.com> wrote:
>
>> That can be attributed to the models' underlying text encoding and
>> processing mechanisms, specifically tokenization, which removes the
>> spelling information from those words. If you use GPT-4 instead, it can
>> handle this properly by resorting to external tools.
>>
>> On Mon, Feb 19, 2024 at 3:45 PM Dave Touretzky <dst at cs.cmu.edu> wrote:
>>
>>> My favorite way to show that LLMs don't know what they're talking about
>>> is this simple prompt:
>>>
>>>    List all the US states whose names don't contain the letter "a".
>>>
>>> ChatGPT, Bing, and Gemini all make a mess of this, e.g., putting "Texas"
>>> or "Alaska" on the list and leaving out states like "Wyoming" and
>>> "Tennessee".  And you can have a lengthy conversation with them about
>>> this, pointing out their errors one at a time, and they still can't
>>> manage to get it right.  Gemini insisted that all 50 US states have an
>>> "a" in their name.  It also claimed "New Jersey" has two a's.
>>>
>>> -- Dave Touretzky
>>>
>>

-- 
Brad Wyble
Professor of Psychology
Penn State University