<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    I really like Gary Marcus' last paragraph as well.  It ties in with
    what I have been hoping to see for some time - vast networks of
    ensembles (NOT just hierarchical - which is a very limited view)
    that use "multiple conflicting hypotheses" (see the rough sketch
    after these lists) involving: <br>
    <ul>
      <li>data - sensory, learned/evolved.  In other words, both
        personal and evolutionary (extra-generational) sources of a huge
        amount of information <br>
      </li>
      <li>functional or associative structures - highly specialized (not
        just sensory, motor, etc.) to highly generalized (not just logic,
        specialized classifiers, etc.) <br>
      </li>
      <li>architectures - static (often used, always available),
        adaptive (for new challenges that can use existing
        capabilities), dynamic (newly developed for one-time or future
        use)<br>
      </li>
    </ul>
    and "brought to life" through :<br>
    <ul>
      <li>processes - memory, parameter adjustment, learning, evolution
        (David Fogel has suggested that at some point learning requires
        evolutionary processes; Nik Kasabov's Evolving Connectionist
        Systems; etc.), goal-setting, decisions, optimisations (NOTE
        that it isn't necessary to always pick a winning idea, even for
        actions!).  </li>
      <li>systems </li>
      <li>capabilities - and their adaptation to other situations,
        sometimes through analogy/metaphor/abduction</li>
      <li>consciousness</li>
      <li>goal-setting <br>
      </li>
      <li>behaviors</li>
      <li>personalities (human, or in a very limited way, what computer
        systems do)</li>
    </ul>
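    As a rough illustration of the "multiple conflicting hypotheses"
    idea, here is a minimal, hypothetical Python sketch of an ensemble
    that keeps several competing hypotheses alive rather than always
    picking a winner; the member models, scoring, and combination rule
    are illustrative assumptions of mine, not anyone's published
    system: <br>
    <pre>
# A minimal, hypothetical sketch: an ensemble that keeps several conflicting
# hypotheses alive instead of forcing a single winner. Member models, names,
# and the combination rule are illustrative assumptions only.

import math
import random

def softmax(scores):
    """Turn raw support scores into a probability distribution over hypotheses."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class HypothesisEnsemble:
    def __init__(self, members):
        # 'members' are callables mapping an observation to (hypothesis, score).
        self.members = members

    def evaluate(self, observation):
        # Collect every member's hypothesis and its support; losers are kept.
        proposals = [m(observation) for m in self.members]
        hypotheses = [h for h, _ in proposals]
        weights = softmax([s for _, s in proposals])
        return list(zip(hypotheses, weights))

    def act(self, observation, commit=False):
        ranked = self.evaluate(observation)
        if commit:
            # Commit to a single winner only when an action truly demands it.
            return max(ranked, key=lambda hw: hw[1])[0]
        # Otherwise sample, so minority hypotheses still influence behaviour.
        hypotheses, weights = zip(*ranked)
        return random.choices(hypotheses, weights=weights, k=1)[0]

# Toy usage: two "specialists" disagree about the same noisy observation.
edge_detector = lambda x: ("edge", 2.0 if x > 0.5 else 0.5)
texture_model = lambda x: ("texture", 1.5)
ensemble = HypothesisEnsemble([edge_detector, texture_model])
print(ensemble.evaluate(0.7))  # both hypotheses survive, with weights
print(ensemble.act(0.7))       # sampled action; not always the top-weighted one
</pre>
    The point is only that commitment to a single hypothesis can be
    deferred or avoided entirely, as per the NOTE in the "processes"
    item above. <br>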
    These are all there (and much more!) biologically, and all have been
    addressed, but the integration of effective overall systems is
    perhaps more limited.  The ability to utilise and morph specialized
    functional nets to new or dynamic challenges, and the development of
    overall architectures, seem to me to be less developed at present.
    Some of the cognitive work seems to address this more than the
    "engineering" type of ANN applications.  <br>
    <br>
    It seems to me that biology, through the dramatic but mostly ignored
    example of "instinct", goes far beyond current thinking.  How genetic
    & non-genetic DNA (and other "molecules") and epigenetics can
    code all of this in the form of "Mendelian heredity", and how there
    may also be "Lamarckian heredity" coding (perhaps suggested by the work
    of Michael Meaney and others), are questions that biologists look at,
    but this seems to be only partly addressed in the area of
    [evolutionary computation-ANNs] (e.g. genetic algorithms).  It is
    the coding of all of the above that actually interests me the most
    (mostly related to ANNs; at present I'm too confused about the
    biological coding!).  I keep thinking that the "unused DNA" alone
    in each cell provides a huge repository of powerful programming
    available for other biological applications, and that includes brain
    & mind. <br>
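    To make the [evolutionary computation-ANNs] remark concrete, here is a
    minimal, hypothetical sketch of a genetic algorithm searching the
    weight vector of a tiny feedforward net; the task (XOR), network
    size, population size, and mutation scale are illustrative
    assumptions only, not a model of biological coding: <br>
    <pre>
# A minimal, hypothetical neuroevolution sketch: a genetic algorithm searches
# the weight vector of a tiny feedforward net. Task (XOR), network size,
# population size, and mutation scale are illustrative assumptions only.

import math
import random

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
N_WEIGHTS = 9  # 2 inputs -> 2 tanh hidden units (6 weights incl. biases) -> 1 output (3)

def forward(w, x):
    """2-2-1 network with tanh hidden units and a sigmoid output."""
    h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    o = w[6] * h1 + w[7] * h2 + w[8]
    return 1.0 / (1.0 + math.exp(-o))

def fitness(w):
    """Negative squared error over the XOR table (higher is better)."""
    return -sum((forward(w, x) - y) ** 2 for x, y in XOR)

def mutate(w, scale=0.3):
    # Gaussian perturbation of every weight (the "coding" is just a flat vector here).
    return [wi + random.gauss(0.0, scale) for wi in w]

def evolve(pop_size=50, generations=200):
    population = [[random.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 5]          # truncation selection
        population = parents + [mutate(random.choice(parents))
                                for _ in range(pop_size - len(parents))]
    return max(population, key=fitness)

best = evolve()
for x, y in XOR:
    print(x, y, round(forward(best, x), 2))
</pre>
    Even this toy example shows how far such a flat weight "genome" is
    from the kind of developmental and epigenetic coding discussed
    above. <br>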
    <br>
    It is natural to focus on one aspect to advance one's research,
    and to favour one approach or the other.  It is perhaps too easy to
    ignore the greater picture of what we can possibly infer at present:
    that the overall capabilities result from an enormous, powerful, and
    beautiful collection of diverse options that will grow as we learn
    more about this area over the decades.  <br>
    <br>
    <br>
    <a class="moz-txt-link-abbreviated" href="http://www.BillHowell.ca">www.BillHowell.ca</a><br>
    <br>
    <br>
    <div class="moz-cite-prefix">Ali Minai wrote, On 14-02-10 02:37 PM:<br>
    </div>
    <blockquote
cite="mid:CABG3s4uaC6yetV+-u+rsHsJbvEv4jCATFJrVdcQbsoAmhvMusw@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>I think Gary's last paragraph is absolutely key. Unless we
          take both the evolutionary and the developmental processes
          into account, we will neither understand complex brains fully
          nor replicate their functionality too well in our robots etc.
          We build complex robots that know nothing and then ask them to
          learn complex things, setting up a hopelessly difficult
          learning problem. But that isn't how animals learn, or why
          animals have the brains and bodies they have. A purely
          abstract computational approach to neural models makes the
          same category error that connectionists criticized symbolists
          for making, just at a different level.<br>
          <br>
        </div>
        <div>Ali<br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Mon, Feb 10, 2014 at 11:38 AM, Gary
          Marcus <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div style="word-wrap:break-word">Juergen and others,
              <div><br>
              </div>
              <div>I am with John on his two basic concerns, and think
                that your appeal to computational universality is a red
                herring; I cc the entire group because I think that
                these issues lie at the center of why many of the
                hardest problems in AI and neuroscience continue to lie
                outside of reach, despite in-principle proofs about
                computational universality. </div>
              <div><br>
              </div>
              <div>John’s basic points, which I have also made before
                (e.g. in my books The Algebraic Mind and The Birth of
                the Mind and in my periodic New Yorker posts) are two:</div>
              <div><br>
              </div>
              <div>a. It is unrealistic to expect that hierarchies of
                pattern recognizers will suffice for the full range of
                cognitive problems that humans (and strong AI systems)
                face. Deep learning, to take one example, excels at
                classification, but has thus far had relatively little
                to contribute to inference or natural language
                understanding.  Socher et al’s impressive CVG work, for
                instance, is parasitic on a traditional (symbolic)
                parser, not a soup-to-nuts neural net induced from
                input. </div>
              <div><br>
              </div>
              <div>b. it is unrealistic to expect that all the relevant
                information can be extracted by any general purpose
                learning device.</div>
              <div><br>
              </div>
              <div>Yes, you can reliably map any arbitrary input-output
                relation onto a multilayer perceptron or recurrent net,
                but <i>only</i> if you know the complete input-output
                mapping in advance. Alas, you can’t be guaranteed to do
                that in general given arbitrary subsets of the complete
                space; in the real world, learners see subsets of
                possible data and have to make guesses about what the
                rest will be like. Wolpert’s No Free Lunch work is
                instructive here (and also in line with how cognitive
                scientists like Chomsky, Pinker, and myself have thought
                about the problem). For any problem, I presume that
                there exists an appropriately-configured net, but there
                is no guarantee that in the real world you are going to
                be able to correctly induce the right system via
                a general-purpose learning algorithm given a finite amount
                of data, with a finite amount of training. Empirically,
                neural nets of roughly the form you are discussing have
                worked fine for some problems (e.g. backgammon) but been
                no match for their symbolic competitors in other domains
                (chess) and worked only as an adjunct rather than a
                central ingredient in still others (parsing,
                question-answering a la Watson, etc); in other domains,
                like planning and common-sense reasoning, there has been
                essentially no serious work at all.</div>
              <div><br>
              </div>
              <div>My own take, informed by evolutionary and
                developmental biology, is that no single general purpose
                architecture will ever be a match for the endproduct of
                a billion years of evolution, which includes, I suspect,
                a significant amount of customized architecture that
                need not be induced anew in each generation.  We learn
                as well as we do precisely because evolution has
                preceded us, and endowed us with custom tools for
                learning in different domains. Until the field of neural
                nets more seriously engages in understanding what the
                contribution from evolution to neural wetware might be,
                I will remain pessimistic about the field’s prospects.</div>
              <div><br>
              </div>
              <div>Best,</div>
              <div>Gary Marcus</div>
              <div><br>
              </div>
              <font face="HelveticaNeue-Light"><span>Professor of
                  Psychology</span><br>
              </font>
              <div>
                <div
                  style="text-align:-webkit-auto;word-wrap:break-word">
                  <div
                    style="text-align:-webkit-auto;word-wrap:break-word">
                    <div
                      style="text-align:-webkit-auto;word-wrap:break-word"><font
                        face="HelveticaNeue-Light">
                        <div>
                          <div>
                            <div style="margin:0in 0in 0.0001pt">New
                              York University</div>
                            <div style="margin:0in 0in 0.0001pt">Visiting
                              Cognitive Scientist</div>
                            <div style="margin:0in 0in 0.0001pt">Allen
                              Institute for Brain Science</div>
                            <div style="margin:0in 0in 0.0001pt">Allen
                              Institute for Artificial Intelligence</div>
                          </div>
                          <div>
                          </div>
                        </div>
                        co-edited book coming late 2014:</font></div>
                    <div
                      style="text-align:-webkit-auto;word-wrap:break-word"><span
                        style="text-align:-webkit-auto"><font
                          face="HelveticaNeue-Light">The Future of the
                          Brain: Essays By The World’s Leading
                          Neuroscientists</font></span></div>
                    <div
                      style="text-align:-webkit-auto;word-wrap:break-word"><a
                        moz-do-not-send="true"
                        href="http://garymarcus.com/"
                        style="font-family:HelveticaNeue-Light"
                        target="_blank">http://garymarcus.com/</a></div>
                    <div><br>
                    </div>
                  </div>
                </div>
              </div>
              <div>
                <div class="h5">
                  <div>
                    <div>On Feb 10, 2014, at 10:26 AM, Juergen
                      Schmidhuber <<a moz-do-not-send="true"
                        href="mailto:juergen@idsia.ch" target="_blank">juergen@idsia.ch</a>>
                      wrote:</div>
                    <br>
                    <blockquote type="cite">John,<br>
                      <br>
                      perhaps your view is a bit too pessimistic. Note
                      that a single RNN already is a general computer.
                      In principle, dynamic RNNs can map arbitrary
                      observation sequences to arbitrary computable
                      sequences of motoric actions and internal
                      attention-directing operations, e.g., to process
                      cluttered scenes, or to implement development (the
                      examples you mentioned). From my point of view,
                      the main question is how to exploit this universal
                      potential through learning. A stack of dynamic RNN
                      can sometimes facilitate this. What it learns can
                      later be collapsed into a single RNN [3].<br>
                      <br>
                      Juergen<br>
                      <br>
                      <a moz-do-not-send="true"
                        href="http://www.idsia.ch/%7Ejuergen/whatsnew.html"
                        target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>
                      <br>
                      <br>
                      <br>
                      On Feb 7, 2014, at 12:54 AM, Juyang Weng <<a
                        moz-do-not-send="true"
                        href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>>
                      wrote:<br>
                      <br>
                      <blockquote type="cite">Juergen:<br>
                        <br>
                        You wrote: A stack of recurrent NN.  But it is a
                        wrong architecture as far as the brain is
                        concerned.<br>
                        <br>
                        Although my joint work with Narendra Ahuja and
                        Thomas S. Huang at UIUC was probably the first<br>
                        learning network that used the deep Learning
                        idea for learning from cluttered scenes
                        (Cresceptron ICCV 1992 and IJCV 1997),<br>
                        I gave up this static deep learning idea later
                        after we considered the Principle 1:
                        Development.<br>
                        <br>
                        The deep learning architecture is wrong for the
                        brain.  It is too restricted, static in
                        architecture, and cannot learn directly from
                        cluttered scenes required by Principle 1.  The
                        brain is not a cascade of recurrent NN.<br>
                        <br>
                        I quote from Antonio Damasio's "Descartes' Error",
                        p. 93: "But intermediate communications occurs
                        also via large subcortical nuclei such as those
                        in the thalamus and basal ganglia, and via small
                        nuclei such as those in the brain stem."<br>
                        <br>
                        Of course, the cerebral pathways themselves are
                        not a stack of recurrent NN either.<br>
                        <br>
                        There are many fundamental reasons for that.  I
                        give only one here, based on our DN brain model:
                         Looking at a human, the brain must dynamically
                        attend the tip of the nose, the entire nose, the
                        face, or the entire human body on the fly.  For
                        example, when the network attends the nose, the
                        entire human body becomes the background!
                         Without a brain network that has both shallow
                        and deep connections (unlike your stack of
                        recurrent NN), your network is only for
                        recognizing a set of static patterns in a clean
                        background.  This is still an overworked pattern
                        recognition problem, not a vision problem.<br>
                        <br>
                        -John<br>
                        <br>
                        On 2/6/14 7:24 AM, Schmidhuber Juergen wrote:<br>
                        <blockquote type="cite">Deep Learning in
                          Artificial Neural Networks (NN) is about
                          credit assignment across many subsequent
                          computational stages, in deep or recurrent NN.<br>
                          <br>
                          A popular Deep Learning NN is the Deep Belief
                          Network (2006) [1,2].  A stack of feedforward
                          NN (FNN) is pre-trained in unsupervised
                          fashion. This can facilitate subsequent
                          supervised learning.<br>
                          <br>
                          Let me re-advertise a much older, very
                          similar, but more general, working Deep
                          Learner of 1991. It can deal with temporal
                          sequences: the Neural Hierarchical Temporal
                          Memory or Neural History Compressor [3]. A
                          stack of recurrent NN (RNN) is pre-trained in
                          unsupervised fashion. This can greatly
                          facilitate subsequent supervised learning.<br>
                          <br>
                          The RNN stack is more general in the sense
                          that it uses sequence-processing RNN instead
                          of FNN with unchanging inputs. In the early
                          1990s, the system was able to learn many
                          previously unlearnable Deep Learning tasks,
                          one of them requiring credit assignment across
                          1200 successive computational stages [4].<br>
                          <br>
                          Related developments: In the 1990s there was a
                          trend from partially unsupervised [3] to fully
                          supervised recurrent Deep Learners [5]. In
                          recent years, there has been a similar trend
                          from partially unsupervised to fully
                          supervised systems. For example, several
                          recent competition-winning and benchmark
                          record-setting systems use supervised LSTM RNN
                          stacks [6-9].<br>
                          <br>
                          <br>
                          References:<br>
                          <br>
                          [1] G. E. Hinton, R. R. Salakhutdinov.
                          Reducing the dimensionality of data with
                          neural networks. Science, Vol. 313. no. 5786,
                          pp. 504 - 507, 2006. <a
                            moz-do-not-send="true"
                            href="http://www.cs.toronto.edu/%7Ehinton/science.pdf"
                            target="_blank">http://www.cs.toronto.edu/~hinton/science.pdf</a><br>
                          <br>
                          [2] G. W. Cottrell. New Life for Neural
                          Networks. Science, Vol. 313. no. 5786, pp.
                          454-455, 2006. <a moz-do-not-send="true"
href="http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks"
                            target="_blank">http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks</a><br>
                          <br>
                          [3] J. Schmidhuber. Learning complex, extended
                          sequences using the principle of history
                          compression, Neural Computation, 4(2):234-242,
                          1992. (Based on TR FKI-148-91, 1991.)  <a
                            moz-do-not-send="true"
                            href="ftp://ftp.idsia.ch/pub/juergen/chunker.pdf"
                            target="_blank">ftp://ftp.idsia.ch/pub/juergen/chunker.pdf</a>
                           Overview: <a moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/firstdeeplearner.html"
                            target="_blank">http://www.idsia.ch/~juergen/firstdeeplearner.html</a><br>
                          <br>
                          [4] J. Schmidhuber. Habilitation thesis, TUM,
                          1993. <a moz-do-not-send="true"
                            href="ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf"
                            target="_blank">ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf</a>
                          . Includes an experiment with credit
                          assignment across 1200 subsequent
                          computational stages for a Neural Hierarchical
                          Temporal Memory or History Compressor or RNN
                          stack with unsupervised pre-training [2] (try
                          Google Translate in your mother tongue): <a
                            moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/habilitation/node114.html"
                            target="_blank">http://www.idsia.ch/~juergen/habilitation/node114.html</a><br>
                          <br>
                          [5] S. Hochreiter, J. Schmidhuber. Long
                          Short-Term Memory. Neural Computation,
                          9(8):1735-1780, 1997. Based on TR FKI-207-95,
                          1995.  <a moz-do-not-send="true"
                            href="ftp://ftp.idsia.ch/pub/juergen/lstm.pdf"
                            target="_blank">ftp://ftp.idsia.ch/pub/juergen/lstm.pdf</a>
                          . Lots of follow-up work on LSTM under <a
                            moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/rnn.html"
                            target="_blank">http://www.idsia.ch/~juergen/rnn.html</a><br>
                          <br>
                          [6] S. Fernandez, A. Graves, J. Schmidhuber.
                          Sequence labelling in structured domains with
                          hierarchical recurrent neural networks. In
                          Proc. IJCAI'07, p. 774-779, Hyderabad, India,
                          2007.  <a moz-do-not-send="true"
                            href="ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf"
                            target="_blank">ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf</a><br>
                          <br>
                          [7] A. Graves, J. Schmidhuber. Offline
                          Handwriting Recognition with Multidimensional
                          Recurrent Neural Networks. NIPS'22, p 545-552,
                          Vancouver, MIT Press, 2009.  <a
                            moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/nips2009.pdf"
                            target="_blank">http://www.idsia.ch/~juergen/nips2009.pdf</a><br>
                          <br>
                          [8] 2009: First very deep (and recurrent)
                          learner to win international competitions with
                          secret test sets: deep LSTM RNN (1995-) won
                          three connected handwriting contests at ICDAR
                          2009 (French, Arabic, Farsi), performing
                          simultaneous segmentation and recognition.  <a
                            moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/handwriting.html"
                            target="_blank">http://www.idsia.ch/~juergen/handwriting.html</a><br>
                          <br>
                          [9] A. Graves, A. Mohamed, G. E. Hinton.
                          Speech Recognition with Deep Recurrent Neural
                          Networks. ICASSP 2013, Vancouver, 2013.   <a
                            moz-do-not-send="true"
                            href="http://www.cs.toronto.edu/%7Ehinton/absps/RNN13.pdf"
                            target="_blank">http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf</a><br>
                          <br>
                          <br>
                          <br>
                          Juergen Schmidhuber<br>
                          <a moz-do-not-send="true"
                            href="http://www.idsia.ch/%7Ejuergen/whatsnew.html"
                            target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>
                        </blockquote>
                        <br>
                        -- <br>
                        --<br>
                        Juyang (John) Weng, Professor<br>
                        Department of Computer Science and Engineering<br>
                        MSU Cognitive Science Program and MSU
                        Neuroscience Program<br>
                        428 S Shaw Ln Rm 3115<br>
                        Michigan State University<br>
                        East Lansing, MI 48824 USA<br>
                        Tel: <a moz-do-not-send="true"
                          href="tel:517-353-4388" value="+15173534388"
                          target="_blank">517-353-4388</a><br>
                        Fax: <a moz-do-not-send="true"
                          href="tel:517-432-1061" value="+15174321061"
                          target="_blank">517-432-1061</a><br>
                        Email: <a moz-do-not-send="true"
                          href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a><br>
                        URL: <a moz-do-not-send="true"
                          href="http://www.cse.msu.edu/%7Eweng/"
                          target="_blank">http://www.cse.msu.edu/~weng/</a><br>
                        ----------------------------------------------<br>
                        <br>
                      </blockquote>
                      <br>
                      <br>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
        -- <br>
        Ali A. Minai, Ph.D.<br>
        Professor<br>
        Complex Adaptive Systems Lab<br>
        Department of Electrical Engineering & Computing Systems<br>
        University of Cincinnati<br>
        Cincinnati, OH 45221-0030<br>
        <br>
        Phone: (513) 556-4783<br>
        Fax: (513) 556-7326<br>
        Email: <a moz-do-not-send="true" href="mailto:Ali.Minai@uc.edu"
          target="_blank">Ali.Minai@uc.edu</a><br>
                  <a moz-do-not-send="true"
          href="mailto:minaiaa@gmail.com" target="_blank">minaiaa@gmail.com</a><br>
        <br>
        WWW: <a moz-do-not-send="true"
          href="http://www.ece.uc.edu/%7Eaminai/" target="_blank">http://www.ece.uc.edu/~aminai/</a>
      </div>
    </blockquote>
    <br>
  </body>
</html>