<div dir="ltr"><div dir="ltr">Dear Asim,</div>
<div dir="ltr">The following information might be very useful to Gary Marcus and many others on this list. </div>
<div>I will try to make my complexity analysis a little more complete, since John Tsotsos seems to have considered (2) below, if my memory serves me correctly.</div>
<div dir="ltr">(1) The exponential complexity of recognizing and segmenting a part: <font face="Calibri, sans-serif"><span style="font-size:14.6667px">if each pixel can take c colors, a part with e pixel elements has O(c^e) complexity.</span></font></div>
<div dir="ltr"><div>(2) Grouping "parts" into an attended object:<br>"<span style="font-family:Calibri,sans-serif;font-size:14.6667px">Suppose that each part is centered at location l, the number of combinations of p parts of your object (dog) is O(p^l), another exponential </span><span style="font-family:Calibri,sans-serif;font-size:14.6667px">complexity."</span></div>
<div><span style="font-family:Calibri,sans-serif;font-size:14.6667px">This exponential </span><span style="font-family:Calibri,sans-serif;font-size:14.6667px">O(p^l) complexity has never been addressed by any neural network other than our DN.</span></div>
<div><span style="font-family:Calibri,sans-serif;font-size:14.6667px">(3) Segmenting an object from a cluttered background. Suppose a cluttered scene has m parts, m &gt;&gt; p. Segmenting an object from such a scene (many parts!) has complexity O(2^m), where the base 2 reflects that each part either belongs or does not belong to the object.</span></div>
<div><font face="Calibri, sans-serif"><span style="font-size:14.6667px">The real complexity is at least the product of the above three exponential complexities: O(c^e p^l 2^m).</span></font></div>
<div><font face="Calibri, sans-serif"><span style="font-size:14.6667px">In other words, what you wrote "</span></font><i>we can also identify parts of wholes in these scenes" is an illusion, since you have not discussed how your network deals with NP-hard problems. Three examples alone are an illusion. It is a toy illusion. </i></div>
<div><i>Of course, our DN can do all of the above and more, with a constant frame complexity (maximum likelihood, ML), but the network is of brain size. </i></div>
<div><i>I am not saying that we have solved the NP-completeness problem. The NP-completeness problem is purely symbolic. The problem the brain faces is not symbolic (e.g., pixels).<br>We should not expect to do better than humans, contrary to what Li Fei-Fei incorrectly claimed.  </i></div>
<div><i>Best regards,</i></div>
<div><i>-John</i></div></div>
<div dir="auto"></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 11, 2022, 9:49 PM Asim Roy &lt;<a href="mailto:ASIM.ROY@asu.edu" target="_blank">ASIM.ROY@asu.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


<div lang="EN-US">

<div>

<p class="MsoNormal">Dear John,<u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">If I understand correctly, all learning systems do something along the lines of maximum likelihood learning or error minimization, like your DN. What’s your point?<u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal"><span style="background:lime">JOHN: </span>

<i><span style="background:yellow">“Of course, the brain network does not remember all shapes and all configurations of parts.  That is why our DN must do maximum likelihood optimality, using a limited number of resources to best estimate

 such a huge space of cluttered scenes.”</span><u></u><u></u></i></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">So, can your DN model identify the parts of objects in the cluttered images below? Here was my note:<u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal"><span style="background:lime">ASIM:</span><span style="background:yellow">

<i>“And we can also identify parts of wholes in these scenes. Here are some example scenes. In the first two scenes, we can identify the huskies along with the ears, eyes, legs, faces and so on. In the satellite image below, we can identify parts of the planes

 like the fuselage, tail, wing and so on. That’s the fundamental part of DARPA’s XAI model – to be able to

</i></span><i><span style="background:red">identify the parts to confirm the whole object</span><span style="background:yellow">. And if you can identify the parts, a school bus will never become an ostrich with a change

 of a few pixels. So you get a lot of things with Explainable models of this form –

</span><span style="background:red">a symbolic XAI model, robustness against adversarial attacks, and a model that you can trust</span><span style="background:yellow">. Explainable AI of this form can become the best defense

 against adversarial attacks. You may not need any adversarial training of any kind.</span>”<u></u><u></u></i></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal">Best,<u></u><u></u></p>

<p class="MsoNormal">Asim Roy<u></u><u></u></p>

<p class="MsoNormal">Professor, Information Systems<u></u><u></u></p>

<p class="MsoNormal">Arizona State University<u></u><u></u></p>

<p class="MsoNormal"><a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__lifeboat.com_ex_bios.asim.roy&d=DwMFaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=wQR1NePCSj6dOGDD0r6B5Kn1fcNaTMg7tARe7TdEDqQ&m=waSKY67JF57IZXg30ysFB_R7OG9zoQwFwxyps6FbTa1Zh5mttxRot_t4N7mn68Pj&s=oDRJmXX22O8NcfqyLjyu4Ajmt8pcHWquTxYjeWahfuw&e=" rel="noreferrer" target="_blank">Lifeboat

 Foundation Bios: Professor Asim Roy</a><u></u><u></u></p>

<p class="MsoNormal"><a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__isearch.asu.edu_profile_9973&d=DwMFaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=wQR1NePCSj6dOGDD0r6B5Kn1fcNaTMg7tARe7TdEDqQ&m=waSKY67JF57IZXg30ysFB_R7OG9zoQwFwxyps6FbTa1Zh5mttxRot_t4N7mn68Pj&s=jCesWT7oGgX76_y7PFh4cCIQ-Ife-esGblJyrBiDlro&e=" rel="noreferrer" target="_blank">Asim

 Roy | iSearch (asu.edu)</a><u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<p class="MsoNormal"><img border="0" width="480" height="360" style="width:5in;height:3.75in" id="m_-9221473681776046933m_-3689652050252558740gmail-m_7070342535326291021gmail-m_7914905735634183612m_1775684907010914047_x0000_i1030" alt="A dog and a cat lying on a bed"> 

<img border="0" width="308" height="231" style="width:3.2083in;height:2.4062in" id="m_-9221473681776046933m_-3689652050252558740gmail-m_7070342535326291021gmail-m_7914905735634183612m_1775684907010914047_x0000_i1029" alt="A wolf walking in the snow">    <img border="0" width="1443" height="1452" style="width:15.0312in;height:15.125in" id="m_-9221473681776046933m_-3689652050252558740gmail-m_7070342535326291021gmail-m_7914905735634183612m_1775684907010914047_x0000_i1028" alt="An aerial view of a city"><u></u><u></u></p>

<p class="MsoNormal"><u></u> <u></u></p>

<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">

<p class="MsoNormal"><br></p></div><div><div><div>

</div>

</div>

</div>

</div>

</div>


</blockquote></div>
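<div>To give a rough feel for the magnitudes in the estimate O(c^e p^l 2^m) at the top of this message, here is a minimal Python sketch. The function name and the values chosen for c, e, p, l and m are purely illustrative assumptions, not measurements from any dataset or network. The sketch works in log10 so that the product itself never has to be formed.</div>
<pre>
```python
import math


def log10_search_space(c, e, p, l, m):
    """Return log10 of c**e * p**l * 2**m, the product of the three terms."""
    part_term = e * math.log10(c)      # (1) c^e: colorings of a part with e pixel elements
    grouping_term = l * math.log10(p)  # (2) p^l: grouping term, as stated in the message
    clutter_term = m * math.log10(2)   # (3) 2^m: each of m scene parts is in or out
    return part_term + grouping_term + clutter_term


# Hypothetical toy values, chosen only to show how quickly the exponents add up.
digits = log10_search_space(c=256, e=100, p=10, l=20, m=200)
print(f"search space is roughly 10^{digits:.0f}")
```
</pre>
<div>Even with these small toy values the exponent runs into the hundreds, which is the point of the estimate: exhaustive enumeration is out of the question, so any practical system has to approximate.</div>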

</div>