<div dir="ltr"><div><div>I agree with both Juergen and John. On the one hand, most neural processing must - almost necessarily - emerge from the dynamics of many recurrent networks interacting at multiple scales. In that sense, deep learning with recurrent networks is a fruitful place to start in trying to understand this. On the other hand, I also think that the term "deep learning" has become unnecessarily constrained to refer to a particular style of layered architecture and certain types of learning algorithms. We need to move beyond these - broaden the definition to include networks with more complex architectures and learning processes that include development, and even evolution. And to extend the model beyond just "neural" networks to encompass the entire brain-body network, including its mechanical and autonomic components.<br>

<br></div>One problem is that when engineers and computer scientists try to understand the brain, we keep getting distracted by all the sexy "applications" that arise as a side benefit of our models, go chasing after them, and eventually lose track of the original goal of understanding how the brain works. This results in a lot of very useful neural network models for vision, time-series prediction, data analysis, etc., but doesn't tell us much about the brain. Some of us need to take a vow of chastity and commit ourselves anew to the discipline of biology.<br>

<br></div>Ali<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Feb 10, 2014 at 10:26 AM, Juergen Schmidhuber <span dir="ltr"><<a href="mailto:juergen@idsia.ch" target="_blank">juergen@idsia.ch</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">John,<br>

<br>

perhaps your view is a bit too pessimistic. Note that a single RNN already is a general computer. In principle, dynamic RNNs can map arbitrary observation sequences to arbitrary computable sequences of motoric actions and internal attention-directing operations, e.g., to process cluttered scenes, or to implement development (the examples you mentioned). From my point of view, the main question is how to exploit this universal potential through learning. A stack of dynamic RNN can sometimes facilitate this. What it learns can later be collapsed into a single RNN [3].<br>


<br>

Juergen<br>

<br>

<a href="http://www.idsia.ch/~juergen/whatsnew.html" target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>

<div class="HOEnZb"><div class="h5"><br>

<br>

<br>

On Feb 7, 2014, at 12:54 AM, Juyang Weng <<a href="mailto:weng@cse.msu.edu">weng@cse.msu.edu</a>> wrote:<br>

<br>

> Juergen:<br>

><br>

> You wrote: A stack of recurrent NN.  But it is a wrong architecture as far as the brain is concerned.<br>

><br>

> Although my joint work with Narendra Ahuja and Thomas S. Huang at UIUC was probably the first<br>

> learning network that used the deep learning idea for learning from cluttered scenes (Cresceptron ICCV 1992 and IJCV 1997),<br>

> I gave up this static deep learning idea later after we considered the Principle 1: Development.<br>

><br>

> The deep learning architecture is wrong for the brain.  It is too restricted, static in architecture, and cannot learn directly from cluttered scenes, as required by Principle 1.  The brain is not a cascade of recurrent NN.<br>


><br>

> I quote from Antonio Damasio, "Descartes' Error", p. 93: "But intermediate communication occurs also via large subcortical nuclei such as those in the thalamus and basal ganglia, and via small nuclei such as those in the brain stem."<br>


><br>

> Of course, the cerebral pathways themselves are not a stack of recurrent NN either.<br>

><br>

> There are many fundamental reasons for that.  I give only one here, based on our DN brain model:  Looking at a human, the brain must dynamically attend to the tip of the nose, the entire nose, the face, or the entire human body on the fly.  For example, when the network attends to the nose, the entire human body becomes the background!  Without a brain network that has both shallow and deep connections (unlike your stack of recurrent NN), your network is only for recognizing a set of static patterns in a clean background.  This is still an overworked pattern recognition problem, not a vision problem.<br>


><br>

> -John<br>

><br>

> On 2/6/14 7:24 AM, Schmidhuber Juergen wrote:<br>

>> Deep Learning in Artificial Neural Networks (NN) is about credit assignment across many subsequent computational stages, in deep or recurrent NN.<br>

>><br>

>> A popular Deep Learning NN is the Deep Belief Network (2006) [1,2].  A stack of feedforward NN (FNN) is pre-trained in unsupervised fashion. This can facilitate subsequent supervised learning.<br>

>><br>

>> Let me re-advertise a much older, very similar, but more general, working Deep Learner of 1991. It can deal with temporal sequences: the Neural Hierarchical Temporal Memory or Neural History Compressor [3]. A stack of recurrent NN (RNN) is pre-trained in unsupervised fashion. This can greatly facilitate subsequent supervised learning.<br>


>><br>

>> The RNN stack is more general in the sense that it uses sequence-processing RNN instead of FNN with unchanging inputs. In the early 1990s, the system was able to learn many previously unlearnable Deep Learning tasks, one of them requiring credit assignment across 1200 successive computational stages [4].<br>


>><br>

>> Related developments: In the 1990s there was a trend from partially unsupervised [3] to fully supervised recurrent Deep Learners [5]. In recent years, there has been a similar trend from partially unsupervised to fully supervised systems. For example, several recent competition-winning and benchmark record-setting systems use supervised LSTM RNN stacks [6-9].<br>


>><br>

>><br>

>> References:<br>

>><br>

>> [1] G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 2006. <a href="http://www.cs.toronto.edu/~hinton/science.pdf" target="_blank">http://www.cs.toronto.edu/~hinton/science.pdf</a><br>


>><br>

>> [2] G. W. Cottrell. New Life for Neural Networks. Science, Vol. 313. no. 5786, pp. 454-455, 2006. <a href="http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks" target="_blank">http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks</a><br>


>><br>

>> [3] J. Schmidhuber. Learning complex, extended sequences using the principle of history compression, Neural Computation, 4(2):234-242, 1992. (Based on TR FKI-148-91, 1991.)  <a href="ftp://ftp.idsia.ch/pub/juergen/chunker.pdf" target="_blank">ftp://ftp.idsia.ch/pub/juergen/chunker.pdf</a>  Overview: <a href="http://www.idsia.ch/~juergen/firstdeeplearner.html" target="_blank">http://www.idsia.ch/~juergen/firstdeeplearner.html</a><br>


>><br>

>> [4] J. Schmidhuber. Habilitation thesis, TUM, 1993. <a href="ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf" target="_blank">ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf</a> . Includes an experiment with credit assignment across 1200 subsequent computational stages for a Neural Hierarchical Temporal Memory or History Compressor or RNN stack with unsupervised pre-training [3] (try Google Translate in your mother tongue): <a href="http://www.idsia.ch/~juergen/habilitation/node114.html" target="_blank">http://www.idsia.ch/~juergen/habilitation/node114.html</a><br>


>><br>

>> [5] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Based on TR FKI-207-95, 1995.  <a href="ftp://ftp.idsia.ch/pub/juergen/lstm.pdf" target="_blank">ftp://ftp.idsia.ch/pub/juergen/lstm.pdf</a> . Lots of follow-up work on LSTM under <a href="http://www.idsia.ch/~juergen/rnn.html" target="_blank">http://www.idsia.ch/~juergen/rnn.html</a><br>


>><br>

>> [6] S. Fernandez, A. Graves, J. Schmidhuber. Sequence labelling in structured domains with hierarchical recurrent neural networks. In Proc. IJCAI'07, p. 774-779, Hyderabad, India, 2007.  <a href="ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf" target="_blank">ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf</a><br>


>><br>

>> [7] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. NIPS'22, p 545-552, Vancouver, MIT Press, 2009.  <a href="http://www.idsia.ch/~juergen/nips2009.pdf" target="_blank">http://www.idsia.ch/~juergen/nips2009.pdf</a><br>


>><br>

>> [8] 2009: First very deep (and recurrent) learner to win international competitions with secret test sets: deep LSTM RNN (1995-) won three connected handwriting contests at ICDAR 2009 (French, Arabic, Farsi), performing simultaneous segmentation and recognition.  <a href="http://www.idsia.ch/~juergen/handwriting.html" target="_blank">http://www.idsia.ch/~juergen/handwriting.html</a><br>


>><br>

>> [9] A. Graves, A. Mohamed, G. E. Hinton. Speech Recognition with Deep Recurrent Neural Networks. ICASSP 2013, Vancouver, 2013.   <a href="http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf" target="_blank">http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf</a><br>


>><br>

>><br>

>><br>

>> Juergen Schmidhuber<br>

>> <a href="http://www.idsia.ch/~juergen/whatsnew.html" target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>

><br>

> --<br>


> Juyang (John) Weng, Professor<br>

> Department of Computer Science and Engineering<br>

> MSU Cognitive Science Program and MSU Neuroscience Program<br>

> 428 S Shaw Ln Rm 3115<br>

> Michigan State University<br>

> East Lansing, MI 48824 USA<br>

> Tel: <a href="tel:517-353-4388" value="+15173534388">517-353-4388</a><br>

> Fax: <a href="tel:517-432-1061" value="+15174321061">517-432-1061</a><br>

> Email: <a href="mailto:weng@cse.msu.edu">weng@cse.msu.edu</a><br>

> URL: <a href="http://www.cse.msu.edu/~weng/" target="_blank">http://www.cse.msu.edu/~weng/</a><br>

> ----------------------------------------------<br>

><br>

<br>

<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br>Ali A. Minai, Ph.D.<br>Professor<br>Complex Adaptive Systems Lab<br>Department of Electrical Engineering & Computing Systems<br>University of Cincinnati<br>

Cincinnati, OH 45221-0030<br><br>Phone: (513) 556-4783<br>Fax: (513) 556-7326<br>Email: <a href="mailto:Ali.Minai@uc.edu" target="_blank">Ali.Minai@uc.edu</a><br>          <a href="mailto:minaiaa@gmail.com" target="_blank">minaiaa@gmail.com</a><br>

<br>WWW: <a href="http://www.ece.uc.edu/%7Eaminai/" target="_blank">http://www.ece.uc.edu/~aminai/</a>

</div>