Connectionists: Annotated History of Modern AI and Deep Learning: Boltzmann Machine and Adaptive Resonance Theory

Grossberg, Stephen steve at bu.edu
Wed Jan 25 15:38:51 EST 2023


Dear Geoff,

It's good to hear from you!

I of course know about the Boltzmann Machine learning algorithm that you published with David Ackley and Terry Sejnowski:
https://onlinelibrary.wiley.com/doi/pdfdirect/10.1207/s15516709cog0901_7

Because your article was published in 1985, I did not include it in a list of early algorithms.

I do discuss it, however, in my Magnum Opus on p. 156:
https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes-ebook/dp/B094W6BBKN/ref=tmm_kin_swatch_0?_encoding=UTF8&qid=&sr=


As you know better than I do, there is more to the Boltzmann Machine than an Ising model, as your use of the name of Boltzmann, one of the greatest founders of statistical mechanics, suggests.

In particular, your model requires that an external parameter, such as a formal temperature variable, be slowly adjusted to control the approach to equilibrium.

The Boltzmann Machine is thus neither an autonomous nor a non-parametric algorithm.
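
To make the role of that temperature parameter concrete for readers who have not implemented the algorithm, here is a minimal sketch in Python of the two-phase Boltzmann learning rule with a hand-picked annealing schedule. It is my own illustrative toy, not the code or notation from your 1985 article; for brevity, all units are visible, so the clamped phase reduces to data statistics, whereas a network with hidden units would also sample them in the clamped phase.

import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep(s, W, b, T):
    """One sweep of Gibbs sampling over all binary units at temperature T."""
    for i in range(len(s)):
        net = W[i] @ s + b[i]                 # net input to unit i (W has zero diagonal)
        p_on = 1.0 / (1.0 + np.exp(-net / T))
        s[i] = 1.0 if rng.random() < p_on else 0.0
    return s

def anneal(s, W, b, schedule=(5.0, 2.0, 1.0, 1.0)):
    """Slowly lower the external temperature parameter toward equilibrium."""
    for T in schedule:
        s = gibbs_sweep(s, W, b, T)
    return s

def learning_step(data, W, b, lr=0.05):
    """Weight change = clamped correlations minus free-running correlations."""
    clamped = np.zeros_like(W)
    free = np.zeros_like(W)
    for v in data:
        clamped += np.outer(v, v)             # positive phase: visible units clamped to data
        s = rng.integers(0, 2, size=len(v)).astype(float)
        s = anneal(s, W, b)                   # negative phase: free running, annealed
        free += np.outer(s, s)
    dW = lr * (clamped - free) / len(data)
    np.fill_diagonal(dW, 0.0)                 # no self-connections
    return W + dW, b                          # biases left fixed for brevity

# Toy usage: four visible units, two binary patterns.
W = np.zeros((4, 4))
b = np.zeros(4)
data = [np.array([1., 1., 0., 0.]), np.array([0., 0., 1., 1.])]
for _ in range(50):
    W, b = learning_step(data, W, b)

Note that nothing in the network itself chooses the annealing schedule or the final temperature; they are parameters that the experimenter must supply, which is the point I am making above.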

I had also published quite a few models by 1985, notably foundational models on Competitive Learning and Adaptive Resonance Theory, or ART, between 1976 and 1980.

ART can autonomously learn to attend, classify, recognize, and predict objects and events in a changing world that is filled with unexpected events.

Unsupervised ART models, such as those that I published between 1976 and 1980, do not require any external supervision; e.g.,

Grossberg, S. (1976). Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction, and illusions. Biological Cybernetics, 23, 187-202.
https://sites.bu.edu/steveg/files/2016/06/Gro1976BiolCyb_II.pdf

Grossberg, S. (1980). How does a brain build a cognitive code? Psychological Review, 87, 1-51.
https://sites.bu.edu/steveg/files/2016/06/Gro1980PsychRev.pdf

See the Appendices, starting on p. 45, for some theorems.

Starting in 1987, Gail Carpenter and I began to publish ART learning algorithms with a full suite of mathematical theorems and parametric computer simulations, including a proof that these ART models do not experience catastrophic forgetting:

Carpenter, G.A., and Grossberg, S. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54-115.
https://sites.bu.edu/steveg/files/2016/06/CarGro1987CVGIP.pdf
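
For readers who have not worked through ART 1, here is a highly simplified sketch in Python of its search-and-match cycle. The class and parameter names are my own illustrative choices, and the real model is a real-time dynamical system defined by differential equations, not a discrete program; the sketch only illustrates the vigilance test, mismatch-triggered search, and fast learning that underlie the stability results.

import numpy as np

class ART1Sketch:
    """Toy ART 1-style classifier for binary input vectors."""
    def __init__(self, vigilance=0.75, beta=1.0):
        self.rho = vigilance      # vigilance: how well a category must match the input
        self.beta = beta          # small constant in the choice function
        self.templates = []       # learned category templates (binary vectors)

    def _choice(self, x, w):
        # Category choice signal: |x AND w| / (beta + |w|)
        return np.minimum(x, w).sum() / (self.beta + w.sum())

    def _match(self, x, w):
        # Vigilance test: fraction of the input matched by the template
        return np.minimum(x, w).sum() / x.sum()

    def learn(self, x):
        """Present one nonzero binary pattern; return the chosen category index."""
        x = np.asarray(x, dtype=float)
        # Search committed categories in order of decreasing choice signal.
        order = sorted(range(len(self.templates)),
                       key=lambda j: self._choice(x, self.templates[j]),
                       reverse=True)
        for j in order:
            if self._match(x, self.templates[j]) >= self.rho:
                # Resonance: fast learning prunes the template toward x AND w.
                self.templates[j] = np.minimum(x, self.templates[j])
                return j
            # Otherwise: mismatch reset; search the next category.
        # No committed category matches well enough: commit a new one.
        self.templates.append(x.copy())
        return len(self.templates) - 1

net = ART1Sketch(vigilance=0.7)
for pattern in ([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 1]):
    print(net.learn(pattern))    # prints 0, 0, 1

In this cartoon, a committed template can only lose features that new exemplars of its own category fail to share, and inputs that match it poorly trigger search, or the creation of a new category, instead of recoding it; the theorems in the article above make the corresponding memory-stability property of the full model precise.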

Supervised ARTMAP models, which I began to publish with Gail and some of our students starting in 1991, were simulated using challenging benchmark databases and compared with other algorithms. They are "supervised" by environmental feedback, which may or may not include a human teacher. See:

Carpenter, G.A., Grossberg, S., and Reynolds, J.H. (1991). ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Networks, 4, 565-588.
https://sites.bu.edu/steveg/files/2016/06/CarGroRey1991NN.pdf

as well as many other increasingly powerful ART algorithms that can be downloaded from sites.bu.edu/steveg and http://techlab.bu.edu/members/gail/publications.html

Best,

Steve

________________________________
From: Geoffrey Hinton <geoffrey.hinton at gmail.com>
Sent: Wednesday, January 25, 2023 2:02 PM
To: Grossberg, Stephen <steve at bu.edu>
Cc: connectionists at cs.cmu.edu <connectionists at cs.cmu.edu>
Subject: Re: Connectionists: Annotated History of Modern AI and Deep Learning: Early binary, linear, and continuous-nonlinear neural networks, some which included learning

Dear Stephen,

Thanks for letting us know about your Magnum Opus.

There is actually a learning algorithm for the Ising model and it works even when you can only observe the states of a subset of the units. It's called the Boltzmann Machine learning algorithm.

Geoff


On Wed, Jan 25, 2023 at 1:25 PM Grossberg, Stephen <steve at bu.edu> wrote:
Dear Juergen,

Thanks for mentioning the Ising model!

As you know, it is a binary model, with just two states, and it does not learn.


My Magnum Opus
https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes/dp/0190070552

reviews some of the early binary neural network models, such as the McCulloch-Pitts, Caianiello, and Rosenblatt models, starting on p. 64. It then reviews early linear models that included learning, such as the Adaline and Madaline models of Bernie Widrow and the Brain-State-in-a-Box model of Jim Anderson, before turning to continuous and nonlinear models of various kinds, including models that are still used today.

Best,

Steve

________________________________
From: Connectionists <connectionists-bounces at mailman.srv.cs.cmu.edu> on behalf of Schmidhuber Juergen <juergen at idsia.ch>
Sent: Wednesday, January 25, 2023 11:40 AM
To: connectionists at cs.cmu.edu <connectionists at cs.cmu.edu>
Subject: Re: Connectionists: Annotated History of Modern AI and Deep Learning: Early recurrent neural networks for serial verbal learning and associative pattern learning

Dear Steve,

thanks - I hope you noticed that the survey mentions your 1969 work!

And of course it also mentions the origin of this whole recurrent network business: the Ising model or Lenz-Ising model introduced a century ago. See Sec. 4: 1920-1925: First Recurrent NN (RNN) Architecture

https://people.idsia.ch/~juergen/deep-learning-history.html#rnn

"The first non-learning RNN architecture (the Ising model or Lenz-Ising model) was introduced and analyzed by physicists Ernst Ising and Wilhelm Lenz in the 1920s [L20][I24,I25][K41][W45][T22]. It settles into an equilibrium state in response to input conditions, and is the foundation of the first learning RNNs ...”

Jürgen


> On 25. Jan 2023, at 18:42, Grossberg, Stephen <steve at bu.edu> wrote:
>
> Dear Juergen and Connectionists colleagues,
>
> In his attached email below, Juergen mentioned a 1972 article of my friend and colleague, Shun-Ichi Amari, about recurrent neural networks that learn.
>
> Here are a couple of my own early articles from 1969 and 1971 about such networks. I introduced them to explain paradoxical data about serial verbal learning, notably the bowed serial position effect:
>
> Grossberg, S. (1969). On the serial learning of lists. Mathematical Biosciences, 4, 201-253.
> https://sites.bu.edu/steveg/files/2016/06/Gro1969MBLists.pdf
>
> Grossberg, S. and Pepe, J. (1971). Spiking threshold and overarousal effects in serial learning. Journal of Statistical Physics, 3, 95-125.
> https://sites.bu.edu/steveg/files/2016/06/GroPepe1971JoSP.pdf
>
> Juergen also mentioned that Shun-Ichi's work was a precursor of what some people call the Hopfield model, whose most cited articles were published in 1982 and 1984.
>
> I actually started publishing articles on this topic in the 1960s. Here are two of them:
>
> Grossberg, S. (1969). On learning and energy-entropy dependence in recurrent and nonrecurrent signed networks. Journal of Statistical Physics, 1, 319-350.
> https://sites.bu.edu/steveg/files/2016/06/Gro1969JourStatPhy.pdf
>
> Grossberg, S. (1971). Pavlovian pattern learning by nonlinear neural networks. Proceedings of the National Academy of Sciences, 68, 828-831.
> https://sites.bu.edu/steveg/files/2016/06/Gro1971ProNatAcaSci.pdf
>
> An early use of Lyapunov functions to prove global limit theorems in associative recurrent neural networks is found in the following 1980 PNAS article:
>
> Grossberg, S. (1980). Biological competition: Decision rules, pattern formation, and oscillations. Proceedings of the National Academy of Sciences, 77, 2338-2342.
> https://sites.bu.edu/steveg/files/2016/06/Gro1980PNAS.pdf
>
> Subsequent results culminated in my 1983 article with Michael Cohen, which was in press when the Hopfield (1982) article was published:
>
> Cohen, M.A. and Grossberg, S. (1983). Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 815-826.
>  https://sites.bu.edu/steveg/files/2016/06/CohGro1983IEEE.pdf
>
> Our article introduced a general class of neural networks for associative spatial pattern learning, which included the Additive and Shunting neural networks that I had earlier introduced, as well as a Lyapunov function for all of them.
>
> This article proved global limit theorems about all these systems using that Lyapunov function.
>
> The Hopfield article describes the special case of the Additive model.
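>
> Written schematically in LaTeX notation (my own shorthand summary here, not a quotation from the article), the systems covered by our theorem have the form
>
>    \frac{dx_i}{dt} = a_i(x_i)\Big[\, b_i(x_i) - \sum_{k=1}^{n} c_{ik}\, d_k(x_k) \Big], \qquad c_{ik} = c_{ki},
>
> and the Lyapunov function is
>
>    V(x) = -\sum_{i=1}^{n} \int_{0}^{x_i} b_i(\xi)\, d_i'(\xi)\, d\xi \;+\; \frac{1}{2} \sum_{j,k=1}^{n} c_{jk}\, d_j(x_j)\, d_k(x_k),
>
> which is nonincreasing along trajectories under the symmetry and monotonicity hypotheses of the theorem (each a_i \ge 0 and each d_k monotone nondecreasing). The Additive model arises from the special choices a_i(x_i) = 1, b_i(x_i) = -A_i x_i + I_i, c_{ik} = -w_{ik}, and d_k = f_k.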
>
> His article proved no theorems.
>
> Best to all,
>
> Steve
>
> Stephen Grossberg
> http://en.wikipedia.org/wiki/Stephen_Grossberg
> http://scholar.google.com/citations?user=3BIV70wAAAAJ&hl=en
> https://youtu.be/9n5AnvFur7I
> https://www.youtube.com/watch?v=_hBye6JQCh4
> https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes/dp/0190070552
>
> Wang Professor of Cognitive and Neural Systems
> Director, Center for Adaptive Systems
> Professor Emeritus of Mathematics & Statistics,
>        Psychological & Brain Sciences, and Biomedical Engineering
> Boston University
> sites.bu.edu/steveg
> steve at bu.edu
>
> From: Connectionists <connectionists-bounces at mailman.srv.cs.cmu.edu> on behalf of Schmidhuber Juergen <juergen at idsia.ch>
> Sent: Wednesday, January 25, 2023 8:44 AM
> To: connectionists at cs.cmu.edu <connectionists at cs.cmu.edu>
> Subject: Re: Connectionists: Annotated History of Modern AI and Deep Learning
>
> Some are not aware of this historic tidbit in Sec. 4 of the survey: half a century ago, Shun-Ichi Amari published a learning recurrent neural network (1972) which was later called the Hopfield network.
>
> https://people.idsia.ch/~juergen/deep-learning-history.html#rnn
>
> Jürgen
>
>
>
>
> > On 13. Jan 2023, at 11:13, Schmidhuber Juergen <juergen at idsia.ch> wrote:
> >
> > Machine learning is the science of credit assignment. My new survey credits the pioneers of deep learning and modern AI (supplementing my award-winning 2015 survey):
> >
> > https://arxiv.org/abs/2212.11279
> >
> > https://people.idsia.ch/~juergen/deep-learning-history.html
> >
> > This was already reviewed by several deep learning pioneers and other experts. Nevertheless, let me know under juergen at idsia.ch if you can spot any remaining error or have suggestions for improvements.
> >
> > Happy New Year!
> >
> > Jürgen
> >

