Connectionists: Stephen Hanson in conversation with Geoff Hinton
Stephen José Hanson
jose at
Fri Feb 4 07:00:45 EST 2022
Tom, understanding is a theorem?
you mean it should be a theorem?
and yes, if you are having brain surgery.. you hope your surgeon,
"understands" what they are doing..
On 2/3/22 12:31 PM, Dietterich, Thomas wrote:
> “Understanding” is not a Boolean. It is a theorem that no system can
> enumerate all of the consequences of a state of affairs in the world.
> For low-stakes application work, we can be satisfied by a system that
> “does the right thing”. If the system draws a good picture, that’s
> sufficient. It “understood” the request.
> But for higher-stakes applications---and for advancing the
> science---we seek a causal account of how the components of a system
> cause it to do the right thing. We are hoping that a small set of
> mechanisms can produce broad coverage of intelligent behavior. This
> gives us confidence that the system will respond correctly outside of
> the narrow tasks on which we have tested it.
> --Tom
> Thomas G. Dietterich, Distinguished Professor Emeritus
> School of Electrical Engineering and Computer Science
> US Mail: 1148 Kelley Engineering Center
> Office: 2067 Kelley Engineering Center
> Oregon State Univ., Corvallis, OR 97331-5501
> Voice: 541-737-5559; FAX: 541-737-1300
> *From:* Connectionists <connectionists-bounces at>
> *On Behalf Of *Gary Marcus
> *Sent:* Thursday, February 3, 2022 8:26 AM
> *To:* Danko Nikolic <danko.nikolic at>
> *Cc:* connectionists at; AIhub <aihuborg at>
> *Subject:* Re: Connectionists: Stephen Hanson in conversation with
> Geoff Hinton
> Dear Danko,
> Well said. I had a somewhat similar response to Jeff Dean’s 2021 TED
> talk, in which he said (paraphrasing from memory, because I don’t
> remember the precise words) that the famous 2012 Quoc Le unsupervised
> model
> had learned the concept of a cat. In reality the model had clustered
> together some catlike images based on the image statistics that it had
> extracted, but it was a long way from a full,
> counterfactual-supporting concept of a cat, much as you describe below.
> I fully agree with you that the reason for even having a semantics is
> as you put it, "to 1) learn with a few examples and 2) apply the
> knowledge to a broad set of situations.” GPT-3 sometimes gives the
> appearance of having done so, but it falls apart under close
> inspection, so the problem remains unsolved.
> Gary
> On Feb 3, 2022, at 3:19 AM, Danko Nikolic <danko.nikolic at
> <mailto:danko.nikolic at>> wrote:
> G. Hinton wrote: "I believe that any reasonable person would admit
> that if you ask a neural net to draw a picture of a hamster
> wearing a red hat and it draws such a picture, it understood the
> request."
> I would like to suggest why drawing a hamster with a red hat does
> not necessarily imply understanding of the statement "hamster
> wearing a red hat".
> To understand "hamster wearing a red hat" would mean
> inferring, in newly emerging situations involving this hamster, all the
> real-life implications that the red hat brings to the little animal.
> What would happen to the hat if the hamster rolls on its back?
> (Would the hat fall off?)
> What would happen to the red hat when the hamster enters its lair?
> (Would the hat fall off?)
> What would happen to that hamster when it goes foraging? (Would
> the red hat have an influence on finding food?)
> What would happen in a situation of being chased by a predator?
> (Would it be easier for predators to spot the hamster?)
> ...and so on.
> Countless questions can be asked. One has understood "hamster
> wearing a red hat" only if one can answer many such real-life
> questions reasonably well. Similarly, a student
> has understood the materials in a class only if they can apply the
> materials in real-life situations (e.g., applying the Pythagorean
> theorem). If a student gives a correct answer to a multiple choice
> question, we don't know whether the student understood the
> material or whether this was just rote learning (often, it is rote
> learning).
> I also suggest that understanding comes together with
> effective learning: We store new information in such a way that we
> can recall it later and use it effectively, i.e., make good
> inferences in newly emerging situations based on this knowledge.
> In short: Understanding makes us humans able to 1) learn with a
> few examples and 2) apply the knowledge to a broad set of situations.
> No neural network today has such capabilities and we don't know
> how to give them such capabilities. Neural networks need large
> amounts of training examples that cover a large variety of
> situations and then the networks can only deal with what the
> training examples have already covered. Neural networks cannot
> extrapolate in that 'understanding' sense.
> I suggest that understanding truly extrapolates from a piece of
> knowledge. It is not about satisfying a task such as translation
> between languages or drawing hamsters with hats. It is how you got
> the capability to complete the task: Did you only have a few
> examples that covered something different but related and then you
> extrapolated from that knowledge? If yes, this is going in the
> direction of understanding. Have you seen countless examples and
> then interpolated among them? Then perhaps it is not understanding.
> So, for the case of drawing a hamster wearing a red hat,
> understanding perhaps would have taken place if the following
> happened before that:
> 1) first, the network learned about hamsters (not many examples)
> 2) after that the network learned about red hats (outside the
> context of hamsters and without many examples)
> 3) finally the network learned about drawing (outside of the
> context of hats and hamsters, not many examples)
> After that, the network is asked to draw a hamster with a red hat.
> If it does it successfully, maybe we have started cracking the
> problem of understanding.
> Note also that this requires the network to learn sequentially
> without exhibiting catastrophic forgetting of the previous
> knowledge, which is possibly also a consequence of human learning
> by understanding.
> Danko
> Dr. Danko Nikolić
> --- A progress usually starts with an insight ---
> On Thu, Feb 3, 2022 at 9:55 AM Asim Roy <ASIM.ROY at
> <mailto:ASIM.ROY at>> wrote:
> Without getting into the specific dispute between Gary and
> Geoff, I think with approaches similar to GLOM, we are finally
> headed in the right direction. There’s plenty of
> neurophysiological evidence for single-cell abstractions and
> multisensory neurons in the brain, which one might claim
> correspond to symbols. And I think we can finally reconcile
> the decades-old dispute between Symbolic AI and Connectionism.
> GARY: (Your GLOM, which as you know I praised publicly, is in
> many ways an effort to wind up with encodings that effectively
> serve as symbols in exactly that way, guaranteed to serve as
> consistent representations of specific concepts.)
> GARY: I have /never/ called for dismissal of neural networks,
> but rather for some hybrid between the two (as you yourself
> contemplated in 1991); the point of the 2001 book was to
> characterize exactly where multilayer perceptrons succeeded
> and broke down, and where symbols could complement them.
> Asim Roy
> Professor, Information Systems
> Arizona State University
> *From:* Connectionists
> <connectionists-bounces at
> <mailto:connectionists-bounces at>> *On
> Behalf Of *Gary Marcus
> *Sent:* Wednesday, February 2, 2022 1:26 PM
> *To:* Geoffrey Hinton <geoffrey.hinton at
> <mailto:geoffrey.hinton at>>
> *Cc:* AIhub <aihuborg at <mailto:aihuborg at>>;
> connectionists at
> <mailto:connectionists at>
> *Subject:* Re: Connectionists: Stephen Hanson in conversation
> with Geoff Hinton
> Dear Geoff, and interested others,
> What, for example, would you make of a system that often drew
> the red-hatted hamster you requested, and perhaps a fifth of
> the time gave you utter nonsense? Or say one that you trained
> to create birds but sometimes output stuff like this:
> <image001.png>
> One could
> a. avert one’s eyes and deem the anomalous outputs irrelevant
> or
> b. wonder if it might be possible that sometimes the system
> gets the right answer for the wrong reasons (e.g., partial
> historical contingency), and wonder whether another approach
> might be indicated.
> Benchmarks are harder than they look; most of the field has
> come to recognize that. The Turing Test has turned out to be a
> lousy measure of intelligence, easily gamed. It has turned out
> empirically that the Winograd Schema Challenge did not measure
> common sense as well as Hector might have thought. (As it
> happens, I am a minor coauthor of a very recent review on this
> very topic.)
> But its conquest in no way means machines now have common
> sense; many people from many different perspectives recognize
> that (including, e.g., Yann LeCun, who generally tends to be
> more aligned with you than with me).
> So: on the goalpost of the Winograd schema, I was wrong, and
> you can quote me; but what you said about me and machine
> translation remains your invention, and it is inexcusable that
> you simply ignored my 2019 clarification. On the essential
> goal of trying to reach meaning and understanding, I remain
> unmoved; the problem remains unsolved.
> All of the problems LLMs have with coherence, reliability,
> truthfulness, misinformation, etc. stand witness to that fact.
> (Their persistent inability to filter out toxic and insulting
> remarks stems from the same.) I am hardly the only person in
> the field to see that progress on any given benchmark does not
> inherently mean that the deep underlying problems have been solved.
> You, yourself, in fact, have occasionally made that point.
> With respect to embeddings: Embeddings are very good for
> natural language /processing/; but NLP is not the same as
> NL/U/ – when it comes to /understanding/, their worth is still
> an open question. Perhaps they will turn out to be necessary;
> they clearly aren’t sufficient. In their extreme, they might
> even collapse into being symbols, in the sense of uniquely
> identifiable encodings, akin to the ASCII code, in which a
> specific set of numbers stands for a specific word or concept.
> (Wouldn’t that be ironic?)
> (Your GLOM, which as you know I praised publicly, is in many
> ways an effort to wind up with encodings that effectively
> serve as symbols in exactly that way, guaranteed to serve as
> consistent representations of specific concepts.)
> Notably absent from your email is any kind of apology for
> misrepresenting my position. It’s one thing to say that “many
> people thirty years ago once thought X” and another to say
> “Gary Marcus said X in 2015”, when I didn’t. I have
> consistently felt throughout our interactions that you have
> mistaken me for Zenon Pylyshyn; indeed, you once (at NeurIPS
> 2014) apologized to me for having made that error. I am still
> not he.
> Which maybe connects to the last point; if you read my work,
> you would see thirty years of arguments /for/ neural networks,
> just not in the way that you want them to exist. I have ALWAYS
> argued that there is a role for them; characterizing me as a
> person “strongly opposed to neural networks” misses the whole
> point of my 2001 book, which was subtitled “Integrating
> Connectionism and Cognitive Science.”
> In the last two decades or so you have insisted (for reasons
> you have never fully clarified, so far as I know) on
> abandoning symbol-manipulation, but the reverse is not the
> case: I have /never/ called for dismissal of neural networks,
> but rather for some hybrid between the two (as you yourself
> contemplated in 1991); the point of the 2001 book was to
> characterize exactly where multilayer perceptrons succeeded
> and broke down, and where symbols could complement them. It’s
> a rhetorical trick (which is what the previous thread was
> about) to pretend otherwise.
> Gary
> On Feb 2, 2022, at 11:22, Geoffrey Hinton
> <geoffrey.hinton at
> <mailto:geoffrey.hinton at>> wrote:
> Embeddings are just vectors of soft feature detectors and
> they are very good for NLP. The quote on my webpage from
> Gary's 2015 chapter implies the opposite.
> A few decades ago, everyone I knew then would have agreed
> that the ability to translate a sentence into many
> different languages was strong evidence that you
> understood it.
> But once neural networks could do that, their critics
> moved the goalposts. An exception is Hector Levesque who
> defined the goalposts more sharply by saying that the
> ability to get pronoun references correct in Winograd
> sentences is a crucial test. Neural nets are improving at
> that but still have some way to go. Will Gary agree that
> when they can get pronoun references correct in Winograd
> sentences they really do understand? Or does he want to
> reserve the right to weasel out of that too?
> Some people, like Gary, appear to be strongly opposed to
> neural networks because they do not fit their preconceived
> notions of how the mind should work.
> I believe that any reasonable person would admit that if
> you ask a neural net to draw a picture of a hamster
> wearing a red hat and it draws such a picture, it
> understood the request.
> Geoff
> On Wed, Feb 2, 2022 at 1:38 PM Gary Marcus
> <gary.marcus at <mailto:gary.marcus at>> wrote:
> Dear AI Hub, cc: Stephen Hanson and Geoffrey Hinton,
> and the larger neural network community,
> There has been a lot of recent discussion on this list
> about framing and scientific integrity. Often the
> first step in restructuring narratives is to bully and
> dehumanize critics. The second is to misrepresent
> their position. People in positions of power are
> sometimes tempted to do this.
> The Hinton-Hanson interview that you just published is
> a real-time example of just that. It opens with a
> needless and largely content-free personal attack on a
> single scholar (me), with the explicit intention of
> discrediting that person. Worse, the only substantive
> thing it says is false.
> Hinton says “In 2015 he [Marcus] made a prediction
> that computers wouldn’t be able to do machine
> translation.”
> I never said any such thing.
> What I predicted, rather, was that multilayer
> perceptrons, as they existed then, would not (on their
> own, absent other mechanisms) /understand/ language.
> Seven years later, they still haven’t, except in the
> most superficial way.
> I made no comment whatsoever about machine
> translation, which I view as a separate problem,
> solvable to a certain degree by correspondence without
> semantics.
> I specifically tried to clarify Hinton’s confusion in
> 2019, but, disappointingly, he has continued to purvey
> misinformation despite that clarification. Here is
> what I wrote privately to him then, which should have
> put the matter to rest:
> You have taken a single out of context quote [from
> 2015] and misrepresented it. The quote, which you have
> prominently displayed at the bottom on your own web
> page, says:
> Hierarchies of features are less suited to challenges
> such as language, inference, and high-level planning.
> For example, as Noam Chomsky famously pointed out,
> language is filled with sentences you haven't seen
> before. Pure classifier systems don't know what to do
> with such sentences. The talent of feature detectors
> -- in identifying which member of some category
> something belongs to -- doesn't translate into
> understanding novel sentences, in which each sentence
> has its own unique meaning.
> It does /not/ say "neural nets would not be able to
> deal with novel sentences"; it says that hierarchies of
> feature detectors (on their own, if you read the
> context of the essay) would have trouble
> /understanding/ novel sentences.
> Google Translate does not yet /understand/ the content
> of the sentences it translates. It cannot reliably
> answer questions about who did what to whom, or why,
> it cannot infer the order of the events in paragraphs,
> it can't determine the internal consistency of those
> events, and so forth.
> Since then, a number of scholars, such as the
> computational linguist Emily Bender, have made similar
> points, and indeed current LLM difficulties with
> misinformation, incoherence and fabrication all follow
> from these concerns. Quoting from Bender’s
> prizewinning 2020 ACL article on the matter with
> Alexander Koller,
> also emphasizing issues of understanding and meaning:
> /The success of the large neural language models on
> many NLP tasks is exciting. However, we find that
> these successes sometimes lead to hype in which these
> models are being described as “understanding” language
> or capturing “meaning”. In this position paper, we
> argue that a system trained only on form has a priori
> no way to learn meaning. .. a clear understanding of
> the distinction between form and meaning will help
> guide the field towards better science around natural
> language understanding. /
> Her later article with Gebru on language models as
> “stochastic parrots” is in some ways an extension of
> this point; machine translation requires mimicry, true
> understanding (which is what I was discussing in 2015)
> requires something deeper than that.
> Hinton’s intellectual error here is in equating
> machine translation with the deeper comprehension that
> robust natural language understanding will require; as
> Bender and Koller observed, the two appear not to be
> the same. (There is a longer discussion of the
> relation between language understanding and machine
> translation, and why the latter has turned out to be
> more approachable than the former, in my 2019 book
> with Ernest Davis).
> More broadly, Hinton’s ongoing dismissiveness of
> research from perspectives other than his own (e.g.
> linguistics) has done the field a disservice.
> As Herb Simon once observed, science does not have to
> be zero-sum.
> Sincerely,
> Gary Marcus
> Professor Emeritus
> New York University
> On Feb 2, 2022, at 06:12, AIhub
> <aihuborg at <mailto:aihuborg at>>
> wrote:
> Stephen Hanson in conversation with Geoff Hinton
> In the latest episode of this video series for
> <>,
> Stephen Hanson talks to Geoff Hinton about neural
> networks, backpropagation, overparameterization,
> digit recognition, voxel cells, syntax and
> semantics, Winograd sentences, and more.
> You can watch the discussion, and read the
> transcript, here:
> <>
> About AIhub:
> AIhub is a non-profit dedicated to connecting the
> AI community to the public by providing free,
> high-quality information through
> <>
> (
> <>).
> We help researchers publish the latest AI news,
> summaries of their work, opinion pieces, tutorials
> and more. We are supported by many leading
> scientific organizations in AI, namely AAAI
> <>,
> NeurIPS
> <>,
> <>,
> <>/IJCAI
> <>,
> <>,
> <>
> and RoboCup
> <>.
> Twitter: @aihuborg