Connectionists: Deep Belief Nets (2006) / Neural History Compressor (1991) or Hierarchical Temporal Memory

Mon Feb 10 14:40:35 EST 2014

Nice to see this started again, even after the “get me off the mailing list” email.  :-)  For those of you relatively new to the field - it was discussions like this, I believe, that were responsible for growing connectionists to begin with - 25 years ago.  Anyway:

Well put - although there is a long history of engineers and others coming up with interesting new ideas after contemplating biological structures - ideas that actually made a contribution to engineering.  There are lots of current examples.  However, success in the engineering world does not necessarily mean that this is how the brain actually does it.

One more point - it is almost certain that a great deal of the computational power of the nervous system comes from interactions in the dendrite - which almost certainly cannot be boiled down to the traditional summation of synaptic inputs over time and space followed by some simple thresholding mechanism.  Therefore, in addition to the vow of chastity for any of you who are really in this business for the love of neuroscience, I also suggest that you focus on the computational erogenous zone of the dendrites.  The Internet is a remarkable and complex network, but without understanding how the information it delivers is rendered and influences the computers it is connected to, it is probably rather difficult to figure out the network itself.

Jim

On Feb 10, 2014, at 9:56 AM, Ali Minai <minaiaa at gmail.com> wrote:

> I agree with both Juergen and John. On the one hand, most neural processing must - almost necessarily - emerge from the dynamics of many recurrent networks interacting at multiple scales. In that sense, deep learning with recurrent networks is a fruitful place to start in trying to understand this. On the other hand, I also think that the term "deep learning" has become unnecessarily constrained to refer to a particular style of layered architecture and certain types of learning algorithms. We need to move beyond these - broaden the definition to include networks with more complex architectures and learning processes that include development, and even evolution. And to extend the model beyond just "neural" networks to encompass the entire brain-body network, including its mechanical and autonomic components.
> 
> One problem is that when engineers and computer scientists try to understand the brain, we keep getting distracted by all the sexy "applications" that arise as a side benefit of our models, go chasing after them, and eventually lose track of the original goal of understanding how the brain works. This results in a lot of very useful neural network models for vision, time-series prediction, data analysis, etc., but doesn't tell us much about the brain. Some of us need to take a vow of chastity and commit ourselves anew to the discipline of biology.
> 
> Ali
> 
> 
> On Mon, Feb 10, 2014 at 10:26 AM, Juergen Schmidhuber <juergen at idsia.ch> wrote:
> John,
> 
> perhaps your view is a bit too pessimistic. Note that a single RNN already is a general computer. In principle, dynamic RNNs can map arbitrary observation sequences to arbitrary computable sequences of motoric actions and internal attention-directing operations, e.g., to process cluttered scenes, or to implement development (the examples you mentioned). From my point of view, the main question is how to exploit this universal potential through learning. A stack of dynamic RNN can sometimes facilitate this. What it learns can later be collapsed into a single RNN [3].
> 
> Juergen
> 
> http://www.idsia.ch/~juergen/whatsnew.html
> 
> 
> 
> On Feb 7, 2014, at 12:54 AM, Juyang Weng <weng at cse.msu.edu> wrote:
> 
> > Juergen:
> >
> > You wrote: A stack of recurrent NN.  But it is a wrong architecture as far as the brain is concerned.
> >
> > Although my joint work with Narendra Ahuja and Thomas S. Huang at UIUC was probably the first
> > learning network that used the deep learning idea for learning from cluttered scenes (Cresceptron ICCV 1992 and IJCV 1997),
> > I gave up this static deep learning idea later after we considered the Principle 1: Development.
> >
> > The deep learning architecture is wrong for the brain.  It is too restricted, static in architecture, and cannot learn directly from cluttered scenes required by Principle 1.  The brain is not a cascade of recurrent NN.
> >
> > I quote from Antonio Damasio "Descartes' Error": p. 93: "But intermediate communication occurs also via large subcortical nuclei such as those in the thalamus and basal ganglia, and via small nuclei such as those in the brain stem."
> >
> > Of course, the cerebral pathways themselves are not a stack of recurrent NN either.
> >
> > There are many fundamental reasons for that.  I give only one here based on our DN brain model:  Looking at a human, the brain must dynamically attend to the tip of the nose, the entire nose, the face, or the entire human body on the fly.  For example, when the network attends to the nose, the entire human body becomes the background!  Without a brain network that has both shallow and deep connections (unlike your stack of recurrent NN), your network is only for recognizing a set of static patterns in a clean background.  This is still an overworked pattern recognition problem, not a vision problem.
> >
> > -John
> >
> > On 2/6/14 7:24 AM, Schmidhuber Juergen wrote:
> >> Deep Learning in Artificial Neural Networks (NN) is about credit assignment across many subsequent computational stages, in deep or recurrent NN.
> >>
> >> A popular Deep Learning NN is the Deep Belief Network (2006) [1,2].  A stack of feedforward NN (FNN) is pre-trained in unsupervised fashion. This can facilitate subsequent supervised learning.
> >>
> >> Let me re-advertise a much older, very similar, but more general, working Deep Learner of 1991. It can deal with temporal sequences: the Neural Hierarchical Temporal Memory or Neural History Compressor [3]. A stack of recurrent NN (RNN) is pre-trained in unsupervised fashion. This can greatly facilitate subsequent supervised learning.
> >>
> >> The RNN stack is more general in the sense that it uses sequence-processing RNN instead of FNN with unchanging inputs. In the early 1990s, the system was able to learn many previously unlearnable Deep Learning tasks, one of them requiring credit assignment across 1200 successive computational stages [4].
> >>
> >> Related developments: In the 1990s there was a trend from partially unsupervised [3] to fully supervised recurrent Deep Learners [5]. In recent years, there has been a similar trend from partially unsupervised to fully supervised systems. For example, several recent competition-winning and benchmark record-setting systems use supervised LSTM RNN stacks [6-9].
> >>
> >>
> >> References:
> >>
> >> [1] G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 2006. http://www.cs.toronto.edu/~hinton/science.pdf
> >>
> >> [2] G. W. Cottrell. New Life for Neural Networks. Science, Vol. 313. no. 5786, pp. 454-455, 2006. http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks
> >>
> >> [3] J. Schmidhuber. Learning complex, extended sequences using the principle of history compression, Neural Computation, 4(2):234-242, 1992. (Based on TR FKI-148-91, 1991.)  ftp://ftp.idsia.ch/pub/juergen/chunker.pdf  Overview: http://www.idsia.ch/~juergen/firstdeeplearner.html
> >>
> >> [4] J. Schmidhuber. Habilitation thesis, TUM, 1993. ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf . Includes an experiment with credit assignment across 1200 subsequent computational stages for a Neural Hierarchical Temporal Memory or History Compressor or RNN stack with unsupervised pre-training [3] (try Google Translate in your mother tongue): http://www.idsia.ch/~juergen/habilitation/node114.html
> >>
> >> [5] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Based on TR FKI-207-95, 1995.  ftp://ftp.idsia.ch/pub/juergen/lstm.pdf . Lots of follow-up work on LSTM under http://www.idsia.ch/~juergen/rnn.html
> >>
> >> [6] S. Fernandez, A. Graves, J. Schmidhuber. Sequence labelling in structured domains with hierarchical recurrent neural networks. In Proc. IJCAI'07, p. 774-779, Hyderabad, India, 2007.  ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf
> >>
> >> [7] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. NIPS'22, p 545-552, Vancouver, MIT Press, 2009.  http://www.idsia.ch/~juergen/nips2009.pdf
> >>
> >> [8] 2009: First very deep (and recurrent) learner to win international competitions with secret test sets: deep LSTM RNN (1995-) won three connected handwriting contests at ICDAR 2009 (French, Arabic, Farsi), performing simultaneous segmentation and recognition.  http://www.idsia.ch/~juergen/handwriting.html
> >>
> >> [9] A. Graves, A. Mohamed, G. E. Hinton. Speech Recognition with Deep Recurrent Neural Networks. ICASSP 2013, Vancouver, 2013.   http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf
> >>
> >>
> >>
> >> Juergen Schmidhuber
> >> http://www.idsia.ch/~juergen/whatsnew.html
> >
> > --
> > --
> > Juyang (John) Weng, Professor
> > Department of Computer Science and Engineering
> > MSU Cognitive Science Program and MSU Neuroscience Program
> > 428 S Shaw Ln Rm 3115
> > Michigan State University
> > East Lansing, MI 48824 USA
> > Tel: 517-353-4388
> > Fax: 517-432-1061
> > Email: weng at cse.msu.edu
> > URL: http://www.cse.msu.edu/~weng/
> > ----------------------------------------------
> >
> 
> 
> 
> 
> 
> -- 
> Ali A. Minai, Ph.D.
> Professor
> Complex Adaptive Systems Lab
> Department of Electrical Engineering & Computing Systems
> University of Cincinnati
> Cincinnati, OH 45221-0030
> 
> Phone: (513) 556-4783
> Fax: (513) 556-7326
> Email: Ali.Minai at uc.edu
>           minaiaa at gmail.com
> 
> WWW: http://www.ece.uc.edu/~aminai/

Dr. James M. Bower Ph.D.

Professor of Computational Neurobiology

Barshop Institute for Longevity and Aging Studies.

15355 Lambda Drive

University of Texas Health Science Center 

San Antonio, Texas  78245

Phone:  210 382 0553

Email: bower at uthscsa.edu

Web: http://www.bower-lab.org

twitter: superid101

linkedin: Jim Bower

