Connectionists: Deep Belief Nets (2006) / Neural History Compressor (1991) or Hierarchical Temporal Memory

Mon Feb 10 10:56:04 EST 2014

I agree with both Juergen and John. On the one hand, most neural processing
must - almost necessarily - emerge from the dynamics of many recurrent
networks interacting at multiple scales. In that sense, deep learning with
recurrent networks is a fruitful place to start in trying to understand
this. On the other hand, I also think that the term "deep learning" has
become unnecessarily constrained to refer to a particular style of layered
architecture and certain types of learning algorithms. We need to move
beyond these - broaden the definition to include networks with more complex
architectures and learning processes that include development, and even
evolution. We also need to extend the model beyond just "neural" networks to
encompass the entire brain-body network, including its mechanical and
autonomic components.

One problem is that when engineers and computer scientists try to
understand the brain, we keep getting distracted by all the sexy
"applications" that arise as a side benefit of our models, go chasing after
them, and eventually lose track of the original goal of understanding how
the brain works. This results in a lot of very useful neural network models
for vision, time-series prediction, data analysis, etc., but doesn't tell
us much about the brain. Some of us need to take a vow of chastity and
commit ourselves anew to the discipline of biology.

Ali

On Mon, Feb 10, 2014 at 10:26 AM, Juergen Schmidhuber <juergen at idsia.ch> wrote:

> John,
>
> perhaps your view is a bit too pessimistic. Note that a single RNN already
> is a general computer. In principle, dynamic RNNs can map arbitrary
> observation sequences to arbitrary computable sequences of motoric actions
> and internal attention-directing operations, e.g., to process cluttered
> scenes, or to implement development (the examples you mentioned). From my
> point of view, the main question is how to exploit this universal potential
> through learning. A stack of dynamic RNN can sometimes facilitate this.
> What it learns can later be collapsed into a single RNN [3].
>
> Juergen
>
> http://www.idsia.ch/~juergen/whatsnew.html
>
>
>
> On Feb 7, 2014, at 12:54 AM, Juyang Weng <weng at cse.msu.edu> wrote:
>
> > Juergen:
> >
> > You wrote: A stack of recurrent NN.  But it is a wrong architecture as
> > far as the brain is concerned.
> >
> > Although my joint work with Narendra Ahuja and Thomas S. Huang at UIUC
> > was probably the first learning network that used the deep learning idea
> > for learning from cluttered scenes (Cresceptron ICCV 1992 and IJCV 1997),
> > I gave up this static deep learning idea later after we considered the
> > Principle 1: Development.
> >
> > The deep learning architecture is wrong for the brain.  It is too
> > restricted, static in architecture, and cannot learn directly from
> > cluttered scenes required by Principle 1.  The brain is not a cascade of
> > recurrent NN.
> >
> > I quote from Antonio Damasio, "Descartes' Error", p. 93: "But intermediate
> > communication occurs also via large subcortical nuclei such as those in
> > the thalamus and basal ganglia, and via small nuclei such as those in the
> > brain stem."
> >
> > Of course, the cerebral pathways themselves are not a stack of recurrent
> > NN either.
> >
> > There are many fundamental reasons for that.  I give only one here, based
> > on our DN brain model:  Looking at a human, the brain must dynamically
> > attend to the tip of the nose, the entire nose, the face, or the entire
> > human body on the fly.  For example, when the network attends to the nose,
> > the entire human body becomes the background!  Without a brain network that
> > has both shallow and deep connections (unlike your stack of recurrent NN),
> > your network is only for recognizing a set of static patterns in a clean
> > background.  This is still an overworked pattern recognition problem, not a
> > vision problem.
> >
> > -John
> >
> > On 2/6/14 7:24 AM, Schmidhuber Juergen wrote:
> >> Deep Learning in Artificial Neural Networks (NN) is about credit
> >> assignment across many subsequent computational stages, in deep or
> >> recurrent NN.
> >>
> >> A popular Deep Learning NN is the Deep Belief Network (2006) [1,2].  A
> >> stack of feedforward NN (FNN) is pre-trained in unsupervised fashion. This
> >> can facilitate subsequent supervised learning.
> >>
> >> Let me re-advertise a much older, very similar, but more general,
> >> working Deep Learner of 1991. It can deal with temporal sequences: the
> >> Neural Hierarchical Temporal Memory or Neural History Compressor [3]. A
> >> stack of recurrent NN (RNN) is pre-trained in unsupervised fashion. This
> >> can greatly facilitate subsequent supervised learning.
> >>
> >> The RNN stack is more general in the sense that it uses
> >> sequence-processing RNN instead of FNN with unchanging inputs. In the early
> >> 1990s, the system was able to learn many previously unlearnable Deep
> >> Learning tasks, one of them requiring credit assignment across 1200
> >> successive computational stages [4].
> >>
> >> Related developments: In the 1990s there was a trend from partially
> >> unsupervised [3] to fully supervised recurrent Deep Learners [5]. In recent
> >> years, there has been a similar trend from partially unsupervised to fully
> >> supervised systems. For example, several recent competition-winning and
> >> benchmark record-setting systems use supervised LSTM RNN stacks [6-9].
> >>
> >>
> >> References:
> >>
> >> [1] G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of
> >> data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507,
> >> 2006. http://www.cs.toronto.edu/~hinton/science.pdf
> >>
> >> [2] G. W. Cottrell. New Life for Neural Networks. Science, Vol. 313,
> >> no. 5786, pp. 454-455, 2006.
> >> http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks
> >>
> >> [3] J. Schmidhuber. Learning complex, extended sequences using the
> >> principle of history compression. Neural Computation, 4(2):234-242, 1992.
> >> (Based on TR FKI-148-91, 1991.)
> >> ftp://ftp.idsia.ch/pub/juergen/chunker.pdf  Overview:
> >> http://www.idsia.ch/~juergen/firstdeeplearner.html
> >>
> >> [4] J. Schmidhuber. Habilitation thesis, TUM, 1993.
> >> ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf . Includes an experiment
> >> with credit assignment across 1200 subsequent computational stages for a
> >> Neural Hierarchical Temporal Memory or History Compressor or RNN stack with
> >> unsupervised pre-training [2] (try Google Translate in your mother tongue):
> >> http://www.idsia.ch/~juergen/habilitation/node114.html
> >>
> >> [5] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural
> >> Computation, 9(8):1735-1780, 1997. Based on TR FKI-207-95, 1995.
> >> ftp://ftp.idsia.ch/pub/juergen/lstm.pdf . Lots of follow-up work on
> >> LSTM under http://www.idsia.ch/~juergen/rnn.html
> >>
> >> [6] S. Fernandez, A. Graves, J. Schmidhuber. Sequence labelling in
> >> structured domains with hierarchical recurrent neural networks. In Proc.
> >> IJCAI'07, p. 774-779, Hyderabad, India, 2007.
> >> ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf
> >>
> >> [7] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with
> >> Multidimensional Recurrent Neural Networks. NIPS'22, p 545-552, Vancouver,
> >> MIT Press, 2009.  http://www.idsia.ch/~juergen/nips2009.pdf
> >>
> >> [8] 2009: First very deep (and recurrent) learner to win international
> >> competitions with secret test sets: deep LSTM RNN (1995-) won three
> >> connected handwriting contests at ICDAR 2009 (French, Arabic, Farsi),
> >> performing simultaneous segmentation and recognition.
> >> http://www.idsia.ch/~juergen/handwriting.html
> >>
> >> [9] A. Graves, A. Mohamed, G. E. Hinton. Speech Recognition with Deep
> >> Recurrent Neural Networks. ICASSP 2013, Vancouver, 2013.
> >> http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf
> >>
> >>
> >>
> >> Juergen Schmidhuber
> >> http://www.idsia.ch/~juergen/whatsnew.html
> >
> > --
> > --
> > Juyang (John) Weng, Professor
> > Department of Computer Science and Engineering
> > MSU Cognitive Science Program and MSU Neuroscience Program
> > 428 S Shaw Ln Rm 3115
> > Michigan State University
> > East Lansing, MI 48824 USA
> > Tel: 517-353-4388
> > Fax: 517-432-1061
> > Email: weng at cse.msu.edu
> > URL: http://www.cse.msu.edu/~weng/
> > ----------------------------------------------
> >
>
>
>

-- 
Ali A. Minai, Ph.D.
Professor
Complex Adaptive Systems Lab
Department of Electrical Engineering & Computing Systems
University of Cincinnati
Cincinnati, OH 45221-0030

Phone: (513) 556-4783
Fax: (513) 556-7326
Email: Ali.Minai at uc.edu
          minaiaa at gmail.com

WWW: http://www.ece.uc.edu/~aminai/