Connectionists: ChatGPT’s “understanding” of maps and infographics

Iam Palatnik iam.palat at gmail.com
Thu Feb 15 10:56:18 EST 2024


I understand why using the word 'understanding' might seem too generous
when models still exhibit the failure modes mentioned. Some of these
failure modes (like the reversal curse
<https://arxiv.org/pdf/2309.12288.pdf>, where a model trained that 'A is
B' often fails to infer 'B is A') can be remedied with access to tools,
external context, or self-reflection prompts, but others cannot yet be
remedied.

I just don't know what better word to use in the sentence "GPT-4 can ___
that scrambled text better than I can". 'Understand' just flows very
naturally in how we commonly use this word, even if it turns out that what
GPT-4 is doing is shallower or less general than what my brain is doing.
'Parse' or 'process' doesn't seem sufficient, because the scrambled text
contains an instruction and GPT-4 does follow through with it. What word
should we use for this?


On Thu, Feb 15, 2024 at 12:20 PM Gary Marcus <gary.marcus at nyu.edu> wrote:

> Selectively looking at a single example (which happens to involve images)
> and ignoring all the other language-internal failures that I and others
> have presented is not a particularly effective way of getting to a general
> truth.
>
> More broadly, you are, in my judgement, mistaking correlation for a deeper
> level of understanding.
>
> Gary
>
> On Feb 15, 2024, at 07:05, Iam Palatnik <iam.palat at gmail.com> wrote:
>
> 
> Dear all,
>
> yrnlcruet ouy aer diergna na txraegadeeg xalemep arpagaprh tcgnnoaini an
> iuonisntrtc tub eht estetrl hntiwi aehc etmr rea sbcaedrml od ont seu nay
> cedo adn yimlsp ucmanlsrbe shti lynaalmu ocen ouy musrncbea htis orvpe htta
> oyu cloedtmep hte tska by llayerlti ooifwlgln this citnotsirun taets
> itcyxellpi that oyu uderdnoost eht gsaninesmt
>
> Copy-pasting just the paragraph above into GPT-4 should show the kind of
> behavior that makes some researchers say LLMs understand something, in
> some form.
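>
> For anyone who wants to reproduce the test, below is a minimal sketch of
> one way to generate text scrambled in this manner (letters shuffled
> within each word); the function name and seed are illustrative choices,
> not a claim about how the paragraph above was actually produced:
>
>     import random
>
>     def scramble_words(text, seed=0):
>         # Shuffle the letters within each whitespace-delimited word.
>         rng = random.Random(seed)
>         scrambled = []
>         for word in text.split():
>             letters = list(word)
>             rng.shuffle(letters)
>             scrambled.append("".join(letters))
>         return " ".join(scrambled)
>
>     # Example call; the exact output depends on the seed.
>     print(scramble_words("simply unscramble this text manually"))
>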
> We already use words such as 'intelligence' in AI and 'learning' in ML.
> This is not to say these are the same as human intelligence or learning;
> it is to say the behavior is similar enough that the same word fits,
> while specifically qualifying the machine counterpart as something
> different (artificial/machine).
>
> Can this debate be resolved by coining a concept such as
> 'artificial/machine understanding'? GPT-4 then 'machine understands' the
> paragraph above. It 'machine understands' arbitrarily scrambled text
> better than humans 'human understand' it. Matrix-multiplying rotational
> semantic embeddings of byte-pair-encoded tokens is part of 'machine
> understanding' but not of 'human understanding'. At the same time, there
> are plenty of things we 'human understand' that GPT-4 doesn't 'machine
> understand', or doesn't understand without tool access and
> self-reflective prompts.
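>
> As an aside on that phrase: 'rotational semantic embeddings' presumably
> alludes to the rotary position embeddings used in many current LLMs.
> GPT-4's internals are not public, so the sketch below is purely
> illustrative of the general idea, with assumed dimensions and constants:
>
>     import numpy as np
>
>     def rope(x, position, base=10000.0):
>         # Rotate pairs of embedding dimensions by position-dependent
>         # angles, so that dot products encode relative position.
>         half = x.shape[-1] // 2
>         freqs = base ** (-2.0 * np.arange(half) / x.shape[-1])
>         angles = position * freqs        # one angle per dimension pair
>         cos, sin = np.cos(angles), np.sin(angles)
>         x1, x2 = x[..., :half], x[..., half:]
>         return np.concatenate([x1 * cos - x2 * sin,
>                                x1 * sin + x2 * cos], axis=-1)
>
>     token = np.random.randn(64)   # a toy embedding of one BPE token
>     rotated = rope(token, position=5)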
>
> As for the map-generation example, there are multiple tasks overlaid
> there. The language component of GPT-4 seems to have 'machine understood'
> that it has to generate an image, and what the contents of that image
> have to be. It understood which tool it has to call to create the image.
> The tool generated an infographic-style map of the correct country, but
> the states and landmarks are wrong. The markers are on the wrong cities,
> and some of the drawings are bad. Is it too far-fetched to say GPT-4
> 'machine understood' the assignment (generating a map with markers in the
> style of an infographic), but its image-generation component (Dall-E 3)
> is bad at detailed, accurate geographic knowledge?
>
> I'm also confused about why the linguistic understanding capabilities of
> GPT-4 are being tested by asking Dall-E 3 to generate images. Aren't
> these two completely separate models, with GPT-4 merely function-calling
> Dall-E 3 for image generation? Isn't this actually a sign that GPT-4 did
> its job by 'machine understanding' what the user wanted, making the
> correct function call, and creating and sending the correct prompt to
> Dall-E 3, which then fumbled it because it's not good at generating
> detailed, accurate maps?
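>
> To make that division of labor concrete, here is a hypothetical sketch of
> the kind of routing being described; the tool name and JSON shape are
> assumptions for illustration, not OpenAI's actual interface:
>
>     import json
>
>     def generate_image(prompt):
>         # Stand-in for a separate image model such as Dall-E 3.
>         return f"<image rendered from: {prompt!r}>"
>
>     def handle_model_output(output):
>         # Route a structured tool call emitted by the language model to
>         # the image model; plain text passes through unchanged. Once the
>         # call is dispatched, image fidelity is out of the language
>         # model's hands.
>         try:
>             call = json.loads(output)
>         except json.JSONDecodeError:
>             return output
>         if call.get("tool") == "image_generation":
>             return generate_image(call["prompt"])
>         return output
>
>     print(handle_model_output('{"tool": "image_generation", '
>                               '"prompt": "infographic map with markers"}'))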
>
> Cheers,
>
> Iam
>
> On Thu, Feb 15, 2024 at 5:20 AM Gary Marcus <gary.marcus at nyu.edu> wrote:
>
>> I am having a genuinely hard time comprehending some of the claims
>> recently made in this forum. (Not one of which engaged with any of the
>> specific examples or texts I linked.)
>>
>> Here’s yet another example, a dialog about geography that was just sent
>> to me by entrepreneur Phil Libin. Do we really want to call outputs like
>> these (to two prompts, with three generated responses zoomed in below)
>> understanding?
>>
>> In what sense do these responses exemplify the word “understanding”?
>>
>> I am genuinely baffled. To me, a better word would be “approximations”,
>> and poor approximations at that.
>>
>> Worse, I don’t see any AI system on the horizon that could reliably do
>> better across a broad range of related questions. If these kinds of
>> outputs are any indication at all, we are still a very long way from
>> reliable general-purpose AI.
>>
>> Gary
>>
>> [images omitted: attachments scrubbed from the plain-text archive]