<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
I really like Gary Marcus' last paragraph as well. It ties in with
what I have been hoping to see for some time - vast networks of
ensembles (NOT just hierarchical - which is a very limited view)
that use "multiple conflicting hypothesis" involving : <br>
<ul>
    <li>data - sensory, learned/evolved. In other words, both
personal and evolutionary (extra-generational) sources of a huge
amount of information <br>
</li>
    <li>functional or associative structures - highly specialized (not
just sensory, motor, etc) to highly generalized (not just logic,
specialized classifiers, etc) <br>
</li>
<li>architectures - static (often used, always available),
adaptive (for new challenges that can use existing
      capabilities), dynamic (newly developed for one-time or future
use)<br>
</li>
</ul>
and "brought to life" through :<br>
<ul>
    <li>processes - memory, parameter adjustment, learning, evolution
      (David Fogel has suggested that at some point learning requires
      evolutionary processes; Nik Kasabov's evolving connectionist
      systems; etc.), goal-setting, decisions, optimisations (NOTE
      that it isn't necessary to always pick a winning idea, even for
      actions!). </li>
<li>systems </li>
<li>capabilities - and their adaptation to other situations,
sometimes through analogy/metaphor/abduction</li>
<li>consciousness</li>
<li>goal-setting <br>
</li>
<li>behaviors</li>
<li>personalities (human, or in a very limited way, what computer
systems do)</li>
</ul>
These are all there (and much more!) biologically, and all have been
addressed in research, but their integration into effective overall
systems is perhaps more limited. The ability to utilise and morph
specialized functional nets to new or dynamic challenges, and the
development of overall architectures, seem to me to be less developed
at present.
Some of the cognitive work seems to address this more than the
"engineering" type of ANN applications. <br>
<br>
It seems to me that biology, through the dramatic but mostly ignored
example of "instinct", goes far beyond current thinking. How genetic
& non-genetic DNA (and other "molecules") and epigenetics can
code all of this in the form of "Mendelian heredity", and how there
may also be "Lamarckian heredity" coding (perhaps suggested by the
work of Michael Meaney and others), are questions that biologists
look at, but this seems to be only partly addressed in the area of
[evolutionary computation-ANNs] (e.g., genetic algorithms). It is
the coding of all of the above that actually interests me the most
(mostly related to ANNs; at present I'm too confused about the
biological coding!). I keep thinking that the "unused DNA" alone
in each cell provides a huge repository of powerful programming
available for other biological applications, and that includes brain
& mind. <br>
<br>
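As a toy illustration of what I mean by the [evolutionary
computation-ANN] sort of "coding" (my own minimal sketch, and of course
not a biological claim): a genetic algorithm can treat the flattened
weights of a tiny feedforward net as its "genome" and select on task
performance.<br>
<pre>
# Minimal sketch (hypothetical, illustration only): a genetic algorithm whose
# "genome" is the flattened weight vector of a tiny 2-2-1 tanh network,
# selected for performance on XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])
N_W = 2 * 2 + 2 + 2 + 1                      # weights and biases of a 2-2-1 net

def forward(genome, x):
    W1, b1 = genome[0:4].reshape(2, 2), genome[4:6]
    W2, b2 = genome[6:8], genome[8]
    h = np.tanh(x @ W1 + b1)                 # hidden layer
    return np.tanh(h @ W2 + b2) * 0.5 + 0.5  # squash output into (0, 1)

def fitness(genome):
    preds = np.array([forward(genome, x) for x in X])
    return -np.mean((preds - y) ** 2)        # higher is better

pop = rng.normal(0.0, 1.0, size=(50, N_W))   # random initial population
for generation in range(200):
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]       # truncation selection
    children = parents[rng.integers(0, 10, size=40)] \
               + rng.normal(0.0, 0.3, size=(40, N_W))  # mutation only
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(g) for g in pop])]
print([round(float(forward(best, x)), 2) for x in X])  # hopefully near [0, 1, 1, 0]
</pre>
<br>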
It is natural to focus on one aspect to advance with one's research,
and to favour one approach or the other. It is perhaps too easy to
ignore the greater picture of what we can already infer: that the
overall capabilities likely result from an enormous, powerful, and
beautiful collection of diverse options, a picture that will only
grow as we learn more about this area over the decades. <br>
<br>
<br>
<a class="moz-txt-link-abbreviated" href="http://www.BillHowell.ca">www.BillHowell.ca</a><br>
<br>
<br>
<div class="moz-cite-prefix">Ali Minai wrote, On 14-02-10 02:37 PM:<br>
</div>
<blockquote
cite="mid:CABG3s4uaC6yetV+-u+rsHsJbvEv4jCATFJrVdcQbsoAmhvMusw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>I think Gary's last paragraph is absolutely key. Unless we
take both the evolutionary and the developmental processes
into account, we will neither understand complex brains fully
        nor replicate their functionality very well in our robots, etc.
We build complex robots that know nothing and then ask them to
learn complex things, setting up a hopelessly difficult
learning problem. But that isn't how animals learn, or why
animals have the brains and bodies they have. A purely
abstract computational approach to neural models makes the
same category error that connectionists criticized symbolists
for making, just at a different level.<br>
<br>
</div>
<div>Ali<br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Feb 10, 2014 at 11:38 AM, Gary
Marcus <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:gary.marcus@nyu.edu" target="_blank">gary.marcus@nyu.edu</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">Juergen and others,
<div><br>
</div>
<div>I am with John on his two basic concerns, and think
that your appeal to computational universality is a red
herring; I cc the entire group because I think that
            these issues lie at the center of why many of the
            hardest problems in AI and neuroscience continue to lie
outside of reach, despite in-principle proofs about
computational universality. </div>
<div><br>
</div>
<div>John’s basic points, which I have also made before
(e.g. in my books The Algebraic Mind and The Birth of
            the Mind and in my periodic New Yorker posts) are two:</div>
<div><br>
</div>
<div>a. It is unrealistic to expect that hierarchies of
pattern recognizers will suffice for the full range of
cognitive problems that humans (and strong AI systems)
face. Deep learning, to take one example, excels at
classification, but has thus far had relatively little
to contribute to inference or natural language
understanding. Socher et al’s impressive CVG work, for
instance, is parasitic on a traditional (symbolic)
parser, not a soup-to-nuts neural net induced from
input. </div>
<div><br>
</div>
          <div>b. It is unrealistic to expect that all the relevant
information can be extracted by any general purpose
learning device.</div>
<div><br>
</div>
<div>Yes, you can reliably map any arbitrary input-output
relation onto a multilayer perceptron or recurrent net,
but <i>only</i> if you know the complete input-output
mapping in advance. Alas, you can’t be guaranteed to do
that in general given arbitrary subsets of the complete
space; in the real world, learners see subsets of
possible data and have to make guesses about what the
rest will be like. Wolpert’s No Free Lunch work is
instructive here (and also in line with how cognitive
scientists like Chomsky, Pinker, and myself have thought
about the problem). For any problem, I presume that
there exists an appropriately-configured net, but there
is no guarantee that in the real world you are going to
be able to correctly induce the right system via
            a general-purpose learning algorithm given a finite amount
of data, with a finite amount of training. Empirically,
neural nets of roughly the form you are discussing have
worked fine for some problems (e.g. backgammon) but been
no match for their symbolic competitors in other domains
            (chess) and worked only as an adjunct rather than a
central ingredient in still others (parsing,
question-answering a la Watson, etc); in other domains,
like planning and common-sense reasoning, there has been
essentially no serious work at all.</div>
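          <div>(To make the point concrete, here is a toy sketch, with
            scikit-learn's MLPClassifier standing in, purely
            hypothetically, for "an appropriately-configured net": train
            it on six of the eight rows of 3-bit parity, and whatever it
            outputs on the two held-out rows is a guess that the training
            subset simply does not determine.)</div>
          <pre>
# Toy sketch (scikit-learn assumed; illustration only): a small MLP fit on a
# subset of 3-bit parity. The training rows do not determine the held-out
# rows, so the net's answers there are guesses, not guarantees.
import itertools
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array(list(itertools.product([0, 1], repeat=3)))  # all 8 binary inputs
y = X.sum(axis=1) % 2                                     # parity labels

train_idx = [0, 1, 2, 3, 4, 5]   # the subset the learner gets to see
test_idx = [6, 7]                # the part of the space it must guess about

net = MLPClassifier(hidden_layer_sizes=(8,), activation='tanh',
                    max_iter=5000, random_state=0)
net.fit(X[train_idx], y[train_idx])

print("train accuracy:", net.score(X[train_idx], y[train_idx]))  # typically 1.0
print("held-out guesses:", net.predict(X[test_idx]), "true:", y[test_idx])
          </pre>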
<div><br>
</div>
<div>My own take, informed by evolutionary and
developmental biology, is that no single general purpose
            architecture will ever be a match for the end product of
a billion years of evolution, which includes, I suspect,
a significant amount of customized architecture that
need not be induced anew in each generation. We learn
as well as we do precisely because evolution has
preceded us, and endowed us with custom tools for
learning in different domains. Until the field of neural
nets more seriously engages in understanding what the
contribution from evolution to neural wetware might be,
I will remain pessimistic about the field’s prospects.</div>
<div><br>
</div>
<div>Best,</div>
<div>Gary Marcus</div>
<div><br>
</div>
<font face="HelveticaNeue-Light"><span>Professor of
Psychology</span><br>
</font>
<div>
<div
style="text-align:-webkit-auto;word-wrap:break-word">
<div
style="text-align:-webkit-auto;word-wrap:break-word">
<div
style="text-align:-webkit-auto;word-wrap:break-word"><font
face="HelveticaNeue-Light">
<div>
<div>
<div style="margin:0in 0in 0.0001pt">New
York University</div>
<div style="margin:0in 0in 0.0001pt">Visiting
Cognitive Scientist</div>
<div style="margin:0in 0in 0.0001pt">Allen
Institute for Brain Science</div>
<div style="margin:0in 0in 0.0001pt">Allen
                          Institute for Artificial Intelligence</div>
</div>
<div>
</div>
</div>
co-edited book coming late 2014:</font></div>
<div
style="text-align:-webkit-auto;word-wrap:break-word"><span
style="text-align:-webkit-auto"><font
face="HelveticaNeue-Light">The Future of the
Brain: Essays By The World’s Leading
Neuroscientists</font></span></div>
<div
style="text-align:-webkit-auto;word-wrap:break-word"><a
moz-do-not-send="true"
href="http://garymarcus.com/"
style="font-family:HelveticaNeue-Light"
target="_blank">http://garymarcus.com/</a></div>
<div><br>
</div>
</div>
</div>
</div>
<div>
<div class="h5">
<div>
<div>On Feb 10, 2014, at 10:26 AM, Juergen
Schmidhuber <<a moz-do-not-send="true"
href="mailto:juergen@idsia.ch" target="_blank">juergen@idsia.ch</a>>
wrote:</div>
<br>
<blockquote type="cite">John,<br>
<br>
perhaps your view is a bit too pessimistic. Note
that a single RNN already is a general computer.
In principle, dynamic RNNs can map arbitrary
observation sequences to arbitrary computable
sequences of motoric actions and internal
attention-directing operations, e.g., to process
cluttered scenes, or to implement development (the
examples you mentioned). From my point of view,
the main question is how to exploit this universal
potential through learning. A stack of dynamic RNN
can sometimes facilitate this. What it learns can
later be collapsed into a single RNN [3].<br>
<br>
Juergen<br>
<br>
<a moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/whatsnew.html"
target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>
<br>
<br>
<br>
On Feb 7, 2014, at 12:54 AM, Juyang Weng <<a
moz-do-not-send="true"
href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>>
wrote:<br>
<br>
<blockquote type="cite">Juergen:<br>
<br>
                  You wrote: A stack of recurrent NN. But that is the
                  wrong architecture as far as the brain is
concerned.<br>
<br>
Although my joint work with Narendra Ahuja and
Thomas S. Huang at UIUC was probably the first<br>
                  learning network that used the deep learning
                  idea for learning from cluttered scenes
                  (Cresceptron, ICCV 1992 and IJCV 1997),<br>
                  I gave up this static deep learning idea later
                  after we considered Principle 1:
Development.<br>
<br>
The deep learning architecture is wrong for the
brain. It is too restricted, static in
architecture, and cannot learn directly from
                  cluttered scenes as required by Principle 1. The
brain is not a cascade of recurrent NN.<br>
<br>
                  I quote from Antonio Damasio, "Descartes' Error",
                  p. 93: "But intermediate communication occurs
                  also via large subcortical nuclei such as those
                  in the thalamus and basal ganglia, and via small
                  nuclei such as those in the brain stem."<br>
<br>
Of course, the cerebral pathways themselves are
not a stack of recurrent NN either.<br>
<br>
There are many fundamental reasons for that. I
                  give only one here, based on our DN brain model:
                  Looking at a human, the brain must dynamically
                  attend to the tip of the nose, the entire nose, the
                  face, or the entire human body on the fly. For
                  example, when the network attends to the nose, the
                  entire human body becomes the background!
Without a brain network that has both shallow
and deep connections (unlike your stack of
recurrent NN), your network is only for
recognizing a set of static patterns in a clean
background. This is still an overworked pattern
recognition problem, not a vision problem.<br>
<br>
-John<br>
<br>
On 2/6/14 7:24 AM, Schmidhuber Juergen wrote:<br>
<blockquote type="cite">Deep Learning in
Artificial Neural Networks (NN) is about
credit assignment across many subsequent
computational stages, in deep or recurrent NN.<br>
<br>
                    A popular Deep Learning NN is the Deep Belief
Network (2006) [1,2]. A stack of feedforward
NN (FNN) is pre-trained in unsupervised
fashion. This can facilitate subsequent
supervised learning.<br>
<br>
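                    (For concreteness, a toy sketch of this
                    "unsupervised pre-training of a stack, then
                    supervised learning on top" recipe; scikit-learn's
                    BernoulliRBM is assumed here purely as a stand-in,
                    and the joint fine-tuning of a real Deep Belief
                    Network [1] is omitted:)<br>
                    <pre>
# Toy sketch (scikit-learn assumed; not the original [1] code): two RBMs
# pre-trained greedily without labels, then a supervised classifier on top.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

digits = load_digits()
X = digits.data / 16.0        # scale pixels into [0, 1] for the Bernoulli RBMs
y = digits.target

stack = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.06,
                          n_iter=20, random_state=0)),  # unsupervised layer 1
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.06,
                          n_iter=20, random_state=0)),  # unsupervised layer 2
    ("clf", LogisticRegression(max_iter=1000)),         # supervised read-out
])

stack.fit(X, y)               # the RBMs ignore y; only the read-out uses it
print("train accuracy:", round(stack.score(X, y), 3))
                    </pre>
                    <br>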
Let me re-advertise a much older, very
similar, but more general, working Deep
Learner of 1991. It can deal with temporal
sequences: the Neural Hierarchical Temporal
Memory or Neural History Compressor [3]. A
stack of recurrent NN (RNN) is pre-trained in
unsupervised fashion. This can greatly
facilitate subsequent supervised learning.<br>
<br>
The RNN stack is more general in the sense
that it uses sequence-processing RNN instead
of FNN with unchanging inputs. In the early
1990s, the system was able to learn many
previously unlearnable Deep Learning tasks,
one of them requiring credit assignment across
1200 successive computational stages [4].<br>
<br>
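                    (A very rough sketch of the history-compression
                    principle itself, with a simple frequency-based
                    predictor standing in for the lower RNN; this is a
                    simplification for illustration, not the actual [3]
                    system. The lower level tries to predict the next
                    symbol, and only the unpredicted "surprising"
                    symbols are passed up, so the level above sees a
                    much shorter sequence:)<br>
                    <pre>
# Rough sketch of history compression (simplified illustration, not the [3]
# system): a lower-level predictor guesses the next symbol; only the symbols
# it gets wrong ("surprises") are passed to the level above, so the higher
# level works on a compressed version of the sequence.
from collections import defaultdict

def surprises(sequence):
    """Return (position, symbol) pairs the level-1 predictor failed to predict."""
    counts = defaultdict(lambda: defaultdict(int))  # symbol -> follower counts
    unpredicted, prev = [], None
    for t, sym in enumerate(sequence):
        if prev is None:
            unpredicted.append((t, sym))            # first symbol: always a surprise
        else:
            followers = counts[prev]
            guess = max(followers, key=followers.get) if followers else None
            if guess != sym:
                unpredicted.append((t, sym))        # pass the surprise up the stack
            followers[sym] += 1                     # online update of the predictor
        prev = sym
    return unpredicted

# A long, mostly predictable sequence: "ab" repeated, with two rare 'x' events.
seq = list("ababababab" * 5)
seq[13], seq[37] = "x", "x"

level2_input = surprises(seq)
print(len(seq), "symbols in, only", len(level2_input), "reach level 2")
print(level2_input)
                    </pre>
                    <br>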
Related developments: In the 1990s there was a
trend from partially unsupervised [3] to fully
supervised recurrent Deep Learners [5]. In
recent years, there has been a similar trend
from partially unsupervised to fully
supervised systems. For example, several
recent competition-winning and benchmark
record-setting systems use supervised LSTM RNN
stacks [6-9].<br>
<br>
<br>
References:<br>
<br>
[1] G. E. Hinton, R. R. Salakhutdinov.
Reducing the dimensionality of data with
                      neural networks. Science, Vol. 313, no. 5786,
                      pp. 504-507, 2006. <a
moz-do-not-send="true"
href="http://www.cs.toronto.edu/%7Ehinton/science.pdf"
target="_blank">http://www.cs.toronto.edu/~hinton/science.pdf</a><br>
<br>
[2] G. W. Cottrell. New Life for Neural
                      Networks. Science, Vol. 313, no. 5786, pp.
454-455, 2006. <a moz-do-not-send="true"
href="http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks"
target="_blank">http://www.academia.edu/155897/Cottrell_Garrison_W._2006_New_life_for_neural_networks</a><br>
<br>
[3] J. Schmidhuber. Learning complex, extended
sequences using the principle of history
compression, Neural Computation, 4(2):234-242,
1992. (Based on TR FKI-148-91, 1991.) <a
moz-do-not-send="true"
href="ftp://ftp.idsia.ch/pub/juergen/chunker.pdf"
target="_blank">ftp://ftp.idsia.ch/pub/juergen/chunker.pdf</a>
Overview: <a moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/firstdeeplearner.html"
target="_blank">http://www.idsia.ch/~juergen/firstdeeplearner.html</a><br>
<br>
[4] J. Schmidhuber. Habilitation thesis, TUM,
1993. <a moz-do-not-send="true"
href="ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf"
target="_blank">ftp://ftp.idsia.ch/pub/juergen/habilitation.pdf</a>
. Includes an experiment with credit
assignment across 1200 subsequent
computational stages for a Neural Hierarchical
Temporal Memory or History Compressor or RNN
stack with unsupervised pre-training [2] (try
Google Translate in your mother tongue): <a
moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/habilitation/node114.html"
target="_blank">http://www.idsia.ch/~juergen/habilitation/node114.html</a><br>
<br>
[5] S. Hochreiter, J. Schmidhuber. Long
Short-Term Memory. Neural Computation,
9(8):1735-1780, 1997. Based on TR FKI-207-95,
1995. <a moz-do-not-send="true"
href="ftp://ftp.idsia.ch/pub/juergen/lstm.pdf"
target="_blank">ftp://ftp.idsia.ch/pub/juergen/lstm.pdf</a>
                      . Lots of follow-up work on LSTM under <a
moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/rnn.html"
target="_blank">http://www.idsia.ch/~juergen/rnn.html</a><br>
<br>
[6] S. Fernandez, A. Graves, J. Schmidhuber.
Sequence labelling in structured domains with
hierarchical recurrent neural networks. In
                      Proc. IJCAI'07, pp. 774-779, Hyderabad, India,
2007. <a moz-do-not-send="true"
href="ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf"
target="_blank">ftp://ftp.idsia.ch/pub/juergen/IJCAI07sequence.pdf</a><br>
<br>
[7] A. Graves, J. Schmidhuber. Offline
Handwriting Recognition with Multidimensional
                      Recurrent Neural Networks. NIPS'22, pp. 545-552,
Vancouver, MIT Press, 2009. <a
moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/nips2009.pdf"
target="_blank">http://www.idsia.ch/~juergen/nips2009.pdf</a><br>
<br>
[8] 2009: First very deep (and recurrent)
learner to win international competitions with
secret test sets: deep LSTM RNN (1995-) won
three connected handwriting contests at ICDAR
2009 (French, Arabic, Farsi), performing
simultaneous segmentation and recognition. <a
moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/handwriting.html"
target="_blank">http://www.idsia.ch/~juergen/handwriting.html</a><br>
<br>
[9] A. Graves, A. Mohamed, G. E. Hinton.
Speech Recognition with Deep Recurrent Neural
Networks. ICASSP 2013, Vancouver, 2013. <a
moz-do-not-send="true"
href="http://www.cs.toronto.edu/%7Ehinton/absps/RNN13.pdf"
target="_blank">http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf</a><br>
<br>
<br>
<br>
Juergen Schmidhuber<br>
<a moz-do-not-send="true"
href="http://www.idsia.ch/%7Ejuergen/whatsnew.html"
target="_blank">http://www.idsia.ch/~juergen/whatsnew.html</a><br>
</blockquote>
<br>
-- <br>
--<br>
Juyang (John) Weng, Professor<br>
Department of Computer Science and Engineering<br>
MSU Cognitive Science Program and MSU
Neuroscience Program<br>
428 S Shaw Ln Rm 3115<br>
Michigan State University<br>
East Lansing, MI 48824 USA<br>
Tel: <a moz-do-not-send="true"
href="tel:517-353-4388" value="+15173534388"
target="_blank">517-353-4388</a><br>
Fax: <a moz-do-not-send="true"
href="tel:517-432-1061" value="+15174321061"
target="_blank">517-432-1061</a><br>
Email: <a moz-do-not-send="true"
href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a><br>
URL: <a moz-do-not-send="true"
href="http://www.cse.msu.edu/%7Eweng/"
target="_blank">http://www.cse.msu.edu/~weng/</a><br>
----------------------------------------------<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
Ali A. Minai, Ph.D.<br>
Professor<br>
Complex Adaptive Systems Lab<br>
Department of Electrical Engineering & Computing Systems<br>
University of Cincinnati<br>
Cincinnati, OH 45221-0030<br>
<br>
Phone: (513) 556-4783<br>
Fax: (513) 556-7326<br>
Email: <a moz-do-not-send="true" href="mailto:Ali.Minai@uc.edu"
target="_blank">Ali.Minai@uc.edu</a><br>
<a moz-do-not-send="true"
href="mailto:minaiaa@gmail.com" target="_blank">minaiaa@gmail.com</a><br>
<br>
WWW: <a moz-do-not-send="true"
href="http://www.ece.uc.edu/%7Eaminai/" target="_blank">http://www.ece.uc.edu/~aminai/</a>
</div>
</blockquote>
<br>
</body>
</html>