Connectionists: Stephen Hanson in conversation with Geoff Hinton

Asim Roy ASIM.ROY at asu.edu
Fri Feb 11 21:49:04 EST 2022


Dear John,

If I understand correctly, all learning systems do something along the lines of maximum likelihood learning or error minimization, like your DN. What’s your point?

JOHN: “Of course, the brain network does not remember all shapes and all configurations of parts.  That is why our DN must do maximum likelihood optimality, using a limited number of resources to best estimate such a huge space of cluttered scenes.”

So, can your DN model identify the parts of objects in the cluttered images below? Here is my earlier note:

ASIM: “And we can also identify parts of wholes in these scenes. Here are some example scenes. In the first two scenes, we can identify the huskies along with the ears, eyes, legs, faces and so on. In the satellite image below, we can identify parts of the planes, like the fuselage, tail, wing and so on. That’s the fundamental part of DARPA’s XAI program – to be able to identify the parts to confirm the whole object. And if you can identify the parts, a school bus will never become an ostrich with the change of a few pixels. So you get a lot of things with explainable models of this form – a symbolic XAI model, robustness against adversarial attacks, and a model that you can trust. Explainable AI of this form can become the best defense against adversarial attacks. You may not need any adversarial training of any kind.”
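To make the part-based idea concrete, here is a minimal illustrative sketch in Python, not DARPA's XAI system, our own model, or any specific published implementation; the part lists, function names and threshold are hypothetical. The idea is simply that a whole-object label is accepted only if enough of its expected parts are independently detected.

# Hypothetical sketch of part-based verification of a whole-object label.
# The part inventories and the 60% threshold are illustrative, not from any
# specific system; detected_parts stands in for the output of a part detector.
EXPECTED_PARTS = {
    "husky": {"ear", "eye", "leg", "face", "tail"},
    "plane": {"fuselage", "tail", "wing"},
}

def verify_label(label, detected_parts, min_fraction=0.6):
    """Accept the whole-object label only if enough expected parts are present."""
    expected = EXPECTED_PARTS.get(label, set())
    if not expected:
        return False
    found = expected & set(detected_parts)
    return len(found) / len(expected) >= min_fraction

print(verify_label("husky", ["ear", "eye", "leg", "face"]))  # True: 4 of 5 parts found
print(verify_label("plane", ["ear", "eye"]))                 # False: no plane parts found

The point is only that a few adversarially changed pixels may flip a class score, but they are unlikely to fabricate the part-level evidence (fuselage, wing, tail) that such a check requires.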


Best,
Asim Roy
Professor, Information Systems
Arizona State University
Lifeboat Foundation Bios: Professor Asim Roy <https://lifeboat.com/ex/bios.asim.roy>
Asim Roy | iSearch (asu.edu) <https://isearch.asu.edu/profile/9973>

[Image: a dog and a cat lying on a bed]   [Image: a wolf walking in the snow]   [Image: an aerial view of a city]

From: Juyang Weng <juyang.weng at gmail.com>
Sent: Friday, February 11, 2022 3:40 PM
To: Asim Roy <ASIM.ROY at asu.edu>
Cc: John K Tsotsos <tsotsos at cse.yorku.ca>; connectionists at mailman.srv.cs.cmu.edu; Gary Marcus <gary.marcus at nyu.edu>
Subject: Re: Connectionists: Stephen Hanson in conversation with Geoff Hinton

Dear Asim,
We should not just assume your point 1, a dog, since there are many articulated objects and each articulated object looks very different in different situations.  Furthermore, there are many articulated objects other than the dog, as you will see below.

Since you want explainable AI, you must not start with a single object symbol like "dog".  A symbol is already "abstract".  That is why Michael Jordan complained that neural networks do not abstract well.

Let us do your 2.  How many possible combinations?
The number of shapes of a part: suppose a part has a uniform color and m pixels along its boundary, and each boundary pixel has n possible positions.  Then the number of shapes of a part is O(m^n), already exponential in n.  (We assumed that every pixel of the part has the same color, which is not true.)
Suppose that each part is centered at a location l; then the number of configurations of the p parts of your object (dog) is O(p^l), another exponential complexity.
Thus the number of combinations of your object (dog) against a clean background, as Geoff Hinton proposed to do, is already the product of two exponential complexities:
O(m^n) O(p^l) = O(m^{n} p^{l}).
Suppose that there are b objects in a cluttered scene; then the number of combinations of objects in the cluttered scene is
[O(m^{n} p^{l})]^b = O(m^{nb} p^{lb}).
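A small back-of-the-envelope calculation makes these bounds concrete.  The following Python sketch simply evaluates the counting model above; the particular values of m, n, p, l and b are hypothetical, chosen only to show how quickly the counts explode.

# Illustrative only: evaluates the counting model above
# (m boundary pixels with n positions each, p parts over l candidate
# center locations, b objects in the cluttered scene).
# The parameter values are hypothetical.
m, n = 20, 8     # boundary pixels per part, positions per boundary pixel
p, l = 5, 100    # parts per object, candidate part-center locations
b = 3            # objects in the cluttered scene

shapes_per_part = m ** n                      # the O(m^n) term
part_configs = p ** l                         # the O(p^l) term
per_object = shapes_per_part * part_configs   # O(m^n p^l)
whole_scene = per_object ** b                 # O(m^{nb} p^{lb})

for name, value in [
    ("shapes per part, m^n", shapes_per_part),
    ("part configurations, p^l", part_configs),
    ("one object, m^n * p^l", per_object),
    ("cluttered scene, (m^n * p^l)^b", whole_scene),
]:
    print(f"{name}: about 10^{len(str(value)) - 1}")

Even these modest hypothetical values give roughly 10^10 shapes per part, about 10^80 combinations for a single object, and about 10^240 for the three-object scene.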

Of course, the brain network does not remember all shapes and all configurations of parts.  That is why our DN must do maximum likelihood optimality, using a limited number of resources to best estimate such a huge space of cluttered scenes.
I am not talking about abstraction yet; that is another subject, about what "pops up" in the brain.
I guess that many people on this list are not familiar with such a complexity analysis. Gary Marcus, sorry to overload you with this.
That is what I have said in several talks: the brain is an elephant and all disciplines are blind men.
Even people with a PhD in computer science may not be skilled in such exponential-complexity analysis in vision, since many computer science programs have dropped automata theory from their required course lists.

I asked Gary Marcus to suggest how to solve this huge problem.  But instead, he asked me to try a data set, which is a dead end; many have tried, with data sets like ImageNet, and hit that dead end.
Best regards,
-John

On Fri, Feb 11, 2022 at 4:59 PM Asim Roy <ASIM.ROY at asu.edu> wrote:
Dear John,


  1.  Let’s start with a simple case, say a dog, and enumerate how many possible parts and objects one would need to remember or recognize for a dog.
  2.  How many possible combinations did you use in your calculation for the DN?

Best,
Asim

From: Juyang Weng <juyang.weng at gmail.com>
Sent: Friday, February 11, 2022 12:33 PM
To: Asim Roy <ASIM.ROY at asu.edu>; John K Tsotsos <tsotsos at cse.yorku.ca>
Cc: connectionists at mailman.srv.cs.cmu.edu; Gary Marcus <gary.marcus at nyu.edu>
Subject: Re: Connectionists: Stephen Hanson in conversation with Geoff Hinton

Dear Asim,

Thank you for saying "we can".
Please provide:
(1) a neural network that does all you said "we can", and
(2) the complexity analysis for all possible combinations among all possible parts and all possible objects.
This chain of conversations is very useful for those who are not yet familiar with the "complexity of vision" (NP-hard) that John Tsotsos wrote papers arguing about.
John Tsotsos:
Our DN solves this problem like a brain, in constant time (frame time)!  The solution simply pops up.

Best regards,
-John

On Thu, Feb 10, 2022 at 3:01 AM Asim Roy <ASIM.ROY at asu.edu> wrote:
Dear John,

We can deal with cluttered scenes. And we can also identify parts of wholes in these scenes. Here are some example scenes. In the first two scenes, we can identify the huskies along with the ears, eyes, legs, faces and so on. In the satellite image below, we can identify parts of the planes, like the fuselage, tail, wing and so on. That’s the fundamental part of DARPA’s XAI program – to be able to identify the parts to confirm the whole object. And if you can identify the parts, a school bus will never become an ostrich with the change of a few pixels. So you get a lot of things with explainable models of this form – a symbolic XAI model, robustness against adversarial attacks, and a model that you can trust. Explainable AI of this form can become the best defense against adversarial attacks. You may not need any adversarial training of any kind.

Best,
Asim Roy
Professor, Information Systems
Arizona State University
Lifeboat Foundation Bios: Professor Asim Roy <https://lifeboat.com/ex/bios.asim.roy>
Asim Roy | iSearch (asu.edu) <https://isearch.asu.edu/profile/9973>


[Image: a dog and a cat lying on a bed]   [Image: a wolf walking in the snow]   [Image: an aerial view of a city]


From: Connectionists <connectionists-bounces at mailman.srv.cs.cmu.edu> On Behalf Of Juyang Weng
Sent: Wednesday, February 9, 2022 3:19 PM
To: Post Connectionists <connectionists at mailman.srv.cs.cmu.edu>
Subject: Re: Connectionists: Stephen Hanson in conversation with Geoff Hinton

Dear Gary,

As my reply to Asim Roy indicated, the parts-and-wholes problem that Geoff Hinton considered is ill-posed, since it bypasses how a brain network segments the "whole" from 1000 parts in a cluttered scene when only 10 of those parts belong to the whole.

The relation problem has also been solved and mathematically proven, if one understands emergent universal Turing machines realized by a Developmental Network (DN).   The solution to relations is a special case of the solution to the compositionality problem, which in turn is a special case of the emergent universal Turing machine.

I am not telling you "a son looks like his father because the father makes money to feed the son".   The solution is supported by biology and a mathematical proof.

Best regards,
-John

Date: Mon, 7 Feb 2022 07:57:34 -0800
From: Gary Marcus <gary.marcus at nyu.edu>
To: Juyang Weng <juyang.weng at gmail.com>
Cc: Post Connectionists <connectionists at mailman.srv.cs.cmu.edu>
Subject: Re: Connectionists: Stephen Hanson in conversation with Geoff Hinton
Message-ID: <D0E77E54-78C0-4605-B40C-434E2B8F1E7C at nyu.edu>
Content-Type: text/plain; charset="utf-8"

Dear John,

I agree with you that cluttered scenes are critical, but Geoff's GLOM paper [https://www.cs.toronto.edu/~hinton/absps/glomfinal.pdf] might actually have some relevance. It may well be that we need to do a better job with parts and wholes before we can fully address clutter, and Geoff is certainly taking that question seriously.

Geoff's "Stable islands of identical vectors" do sound suspiciously like symbols to me (in a good way!), but regardless, they seem to me to be a plausible candidate as a foundation for coping with clutter.

And not just cluttered scenes, but also relations between multiple objects in a scene, which is another example of the broader issue you raise, challenging for pure MLPs but critical for deeper AI.

Gary

--
Juyang (John) Weng


--
Juyang (John) Weng


--
Juyang (John) Weng