Connectionists: ChatGPT’s “understanding” of maps and infographics

J. Jost jost at mis.mpg.de
Fri Feb 16 16:43:04 EST 2024


Hi Gary,

Why don't you give a precise definition of "understanding"? You write:
"Understanding, for me, is about converting conversations and visual inputs
and so on into sound, reliable, interrogable internal cognitive models."
But this contains a couple of terms whose meaning is not completely clear.

There is hope for progress in this debate only if everybody defines his or
her terms clearly and precisely. And when people attack you, they, too,
should first state clearly the definitions they are employing.

And if we all agree on some definition, then we might want to discuss to
what extent humans, such as the average college student, satisfy the
corresponding criteria.

Juergen (J.)



On Thu, Feb 15, 2024 at 5:44 PM Gary Marcus <gary.marcus at nyu.edu> wrote:

> For your fill-in-the-blanks ["GPT-4 can ___ that scrambled text better
> than I can"] I would urge “respond to”
>
> That is nice, non-judgmental, non-anthropomorphic language for describing a
> process into which we thus far have limited insight, and which seems to
> respond very differently from the way we do.
>
> With respect to your assertion that various problems “can be remedied”: I
> first warned of hallucination errors in 2001 in my book The Algebraic
> Mind, and the problem has not gone away in over 20 years. I also addressed
> something quite similar to the reversal curse, and anticipated a number of
> other issues in semantics and compositionality, also still unsolved. For
> many problems the phrase “can be remedied” remains a promissory note.
> Techniques like RAG hold some promise but are very far from perfect.
>
> Yes, I am sure that AI that is not subject to hallucination, the
> reversal curse, and so on can someday be built; I am also sure that some form
> of AI that can reliably produce accounts of its internal processing can be
> built. But I am not convinced that *LLMs*, which happen to be popular
> right now, are the right foundation for such things.
>
> They remain opaque, unreliable, and poorly grounded in facts.
> Understanding, for me, is about converting conversations and visual inputs
> and so on into sound, reliable, interrogable internal cognitive models. I
> am with LeCun in finding what passes for cognitive models in current
> systems to be lacking.
>
> Gary
>
>
>
> On Feb 15, 2024, at 07:56, Iam Palatnik <iam.palat at gmail.com> wrote:
>
> 
> I understand why using the word 'understanding' might seem too generous
> when models still have the failure modes mentioned. Some of the failure
> modes (like the reversal curse
> <https://arxiv.org/pdf/2309.12288.pdf>)
> can be remedied with access to tools, external context, or self-reflection
> prompts, but there are failures that cannot yet be remedied.
>
> I just don't know what better word to use in the sentence "GPT-4 can ___
> that scrambled text better than I can". 'Understand' just flows very
> naturally in how we commonly use this word, even if it turns out that what
> GPT-4 is doing is shallower or less general than what my brain is doing.
> 'Parse' or 'process' doesn't seem enough because the scrambled text
> contains an instruction and GPT-4 does follow through with it. What word
> should we use for this?
>
>
> On Thu, Feb 15, 2024 at 12:20 PM Gary Marcus <gary.marcus at nyu.edu> wrote:
>
>> Selectively looking at a single example (which happens to involve images)
>> and ignoring all the other language-internal failures that I and others
>> have presented is not a particularly effective way of getting to a general
>> truth.
>>
>> More broadly, you are, in my judgement, mistaking correlation for a
>> deeper level of understanding.
>>
>> Gary
>>
>> On Feb 15, 2024, at 07:05, Iam Palatnik <iam.palat at gmail.com> wrote:
>>
>> 
>> Dear all,
>>
>> yrnlcruet ouy aer diergna na txraegadeeg xalemep arpagaprh tcgnnoaini an
>> iuonisntrtc tub eht estetrl hntiwi aehc etmr rea sbcaedrml od ont seu nay
>> cedo adn yimlsp ucmanlsrbe shti lynaalmu ocen ouy musrncbea htis orvpe htta
>> oyu cloedtmep hte tska by llayerlti ooifwlgln this citnotsirun taets
>> itcyxellpi that oyu uderdnoost eht gsaninesmt
>>
>> Copy-pasting just the above paragraph into GPT-4 should show the kind of
>> behavior that makes some researchers say LLMs understand something, in
>> some form (a sketch of how such a scrambled prompt can be generated
>> appears below). We already use words such as 'intelligence' in AI and
>> 'learning' in ML. This is not to say these are the same as human
>> intelligence and learning; it is to say the behavior is similar enough
>> that the same word fits, while specifically qualifying the machine
>> counterpart as something different (artificial/machine).
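>>
>> For concreteness, here is a minimal Python sketch of one way to produce a
>> scrambled prompt like the one above. The exact procedure used for that
>> paragraph is an assumption; this sketch simply shuffles all the letters of
>> each word, and the plaintext instruction is a hypothetical stand-in:
>>
>> import random
>>
>> def scramble_words(text, seed=None):
>>     """Shuffle the letters inside each whitespace-separated word."""
>>     rng = random.Random(seed)
>>     scrambled = []
>>     for word in text.split():
>>         letters = list(word)
>>         rng.shuffle(letters)
>>         scrambled.append("".join(letters))
>>     return " ".join(scrambled)
>>
>> # Hypothetical plaintext; the original unscrambled wording is not
>> # reproduced here.
>> instruction = ("unscramble this text manually and confirm "
>>                "that you understood the assignment")
>> print(scramble_words(instruction, seed=0))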
>>
>> Can this debate be resolved by coining a concept such as
>> 'artificial/machine understanding'? GPT-4 then 'machine understands' the
>> scrambled paragraph above. It 'machine understands' arbitrary scrambled
>> text better than humans 'human understand' it. Matrix-multiplying
>> rotational semantic embeddings of byte-pair-encoded tokens is part of
>> 'machine understanding' but not of 'human understanding'. At the same
>> time, there are plenty of examples of things we 'human understand' and
>> GPT-4 doesn't 'machine understand', or doesn't understand without tool
>> access and self-reflective prompts.
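>>
>> To unpack the phrase "matrix-multiplying rotational semantic embeddings
>> of byte-pair-encoded tokens", here is a toy numpy sketch of that kind of
>> operation: integer token ids standing in for BPE tokens, an embedding
>> lookup, a rotation, and a matrix product. All sizes and values below are
>> made up for illustration:
>>
>> import numpy as np
>>
>> rng = np.random.default_rng(0)
>>
>> vocab_size, dim = 1000, 8
>> embeddings = rng.normal(size=(vocab_size, dim))   # learned lookup table
>> token_ids = np.array([17, 402, 993])              # toy "BPE" token ids
>>
>> # Rotate the first two embedding dimensions by a fixed angle (a crude
>> # stand-in for rotary-style position encoding).
>> theta = 0.1
>> rotation = np.eye(dim)
>> rotation[:2, :2] = [[np.cos(theta), -np.sin(theta)],
>>                     [np.sin(theta),  np.cos(theta)]]
>>
>> x = embeddings[token_ids] @ rotation              # rotated token embeddings
>> scores = x @ x.T                                  # token-token similarity via matmul
>> print(scores.shape)                               # (3, 3)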
>>
>> As to the map-generation example, there are multiple tasks overlaid
>> there. The language component of GPT-4 seems to have 'machine understood'
>> that it has to generate an image, and what the contents of the image have
>> to be. It understood which tool it has to call to create the image. The
>> tool generated an infographic-style map of the correct country, but the
>> states and landmarks are wrong: the markers are on the wrong cities and
>> some of the drawings are bad. Is it too far-fetched to say GPT-4 'machine
>> understood' the assignment (generating a map with markers in the style of
>> an infographic), but its image-generation component (DALL-E 3) lacks
>> detailed, accurate geographic knowledge?
>>
>> I'm also confused about why the linguistic understanding capabilities of
>> GPT-4 are being tested by asking DALL-E 3 to generate images. Aren't these
>> two completely separate models, with GPT-4 just function-calling DALL-E 3
>> for image generation? Isn't this actually a sign that GPT-4 did its job by
>> 'machine understanding' what the user wanted, making the correct function
>> call, and creating and sending the correct prompt to DALL-E 3, while
>> DALL-E 3 fumbled it because it's not good at generating detailed, accurate
>> maps?
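>>
>> To make that division of labor concrete, here is a minimal, purely
>> illustrative Python sketch of the pattern described above; the tool name,
>> the dispatch table, and the hard-coded "plan" are assumptions, not
>> OpenAI's actual internal interface:
>>
>> from dataclasses import dataclass
>>
>> @dataclass
>> class ToolCall:
>>     name: str        # which tool the language model decided to invoke
>>     arguments: dict  # the prompt/parameters it composed for that tool
>>
>> def plan_with_language_model(user_request: str) -> ToolCall:
>>     # Stand-in for the GPT-4 step: decide on a tool and write its prompt.
>>     prompt = f"Infographic-style map: {user_request}"
>>     return ToolCall(name="generate_image", arguments={"prompt": prompt})
>>
>> def generate_image(prompt: str) -> str:
>>     # Stand-in for DALL-E 3: rendering accuracy lives entirely here.
>>     return f"<image rendered from: {prompt!r}>"
>>
>> TOOLS = {"generate_image": generate_image}
>>
>> call = plan_with_language_model("map with markers on the major cities")
>> result = TOOLS[call.name](**call.arguments)
>> # If the rendered map puts markers on the wrong cities, the error arose in
>> # the image tool, not in the planning step that chose the tool and wrote
>> # the prompt.
>> print(result)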
>>
>> Cheers,
>>
>> Iam
>>
>> On Thu, Feb 15, 2024 at 5:20 AM Gary Marcus <gary.marcus at nyu.edu> wrote:
>>
>>> I am having a genuinely hard time comprehending some of the claims
>>> recently made in this forum. (Not one of which engaged with any of the
>>> specific examples or texts I linked.)
>>>
>>> Here’s yet another example, a dialog about geography that was just sent
>>> to me by entrepreneur Phil Libin. Do we really want to call outputs like
>>> these (to two prompts, with three generated responses zoomed in below)
>>> understanding?
>>>
>>> In what sense do these responses exemplify the word “understanding”?
>>>
>>> I am genuinely baffled. To me a better word would be “approximations”,
>>> and poor approximations at that.
>>>
>>> Worse, I don’t see any AI system on the horizon that could reliably do
>>> better, across a broad range of related questions. If these kinds of
>>> outputs are any indication at all, we are still a very long way from
>>> reliable general-purpose AI.
>>>
>>> Gary
>>>