<div dir="ltr">I can't comment on most of this, but I am not sure if all models of sparsity and sparse coding fall into the connectionist realm either because some make statistical assumptions.<div>-Tsvi</div></div><div class="gmail_extra">
<br><br><div class="gmail_quote">On Tue, Apr 8, 2014 at 9:19 PM, Juyang Weng <span dir="ltr"><<a href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Tsvi:<br>
<br>
Let me explain a little more detail:<br>
<br>
There are two large categories of biological neurons, excitatory and inhibitory. Both develop mainly through signal statistics, <br>
and are not specified primarily by the genome. Not everyone agrees with this point, but please tolerate my view for now. <br>
I give a more detailed discussion of this view in my NAI book. <br>
<br>
The main effect of inhibitory connections, with the help of excitatory connections, is to reduce the number of firing neurons (David Field called this sparse coding). <br>
This sparse coding is important because the neurons that do not fire constitute the long-term memory of the area at that point in time.<br>
My view here differs from David Field's: he wrote that sparse coding serves the current representations, whereas I think sparse coding is <br>
necessary for long-term memory. Not everyone agrees with this point, but please tolerate my view for now. <br>
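<br>
A minimal sketch in Python of this view (illustrative only, not the mechanism of any particular published model; all names and parameters are made up): lateral inhibition is idealized as letting only the top-k most responsive neurons fire, and only those firing neurons adapt, so the weights of the silent neurons are left untouched as the area's long-term memory at that moment.<br>
<pre>
import numpy as np

def sparse_step(x, W, k, lr=0.1):
    """One update of a toy layer: inhibition is idealized as keeping only the
    k most responsive neurons firing (top-k).  Only the firing neurons adapt;
    the silent neurons keep their weights unchanged, so they serve as the
    area's long-term memory at this moment.  Illustrative sketch only."""
    responses = W @ x                                  # bottom-up pre-responses
    winners = np.argpartition(responses, -k)[-k:]      # neurons that survive inhibition
    y = np.zeros_like(responses)
    y[winners] = responses[winners]                    # sparse firing pattern
    W[winners] += lr * (x - W[winners])                # Hebbian-like update of firing neurons only
    W[winners] /= np.linalg.norm(W[winners], axis=1, keepdims=True)
    return y, W

rng = np.random.default_rng(0)
W = rng.normal(size=(50, 16)); W /= np.linalg.norm(W, axis=1, keepdims=True)
x = rng.normal(size=16); x /= np.linalg.norm(x)
y, W = sparse_step(x, W, k=3)
print(int((y != 0).sum()), "of", y.size, "neurons fire")
</pre>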
<br>
However, this reduction requires very fast parallel neuronal updates to avoid uncontrollable large-magnitude oscillations. <br>
Even with fast biological parallel neuronal updates, we still see slow but small-magnitude oscillations, such as the <br>
well-known theta waves and alpha waves. My view is that such slow but small-magnitude oscillations are side effects of <br>
the many loops formed by excitatory and inhibitory connections, not something really desirable for brain operation (sorry, <br>
Paul Werbos). Not everyone agrees with this point, but please tolerate my view for now. <br>
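<br>
A toy numerical illustration of this point (not a model of any brain area; the parameters are arbitrary): two mutually inhibiting units updated in parallel swing between extremes when each update jumps all the way to its new target (coarse, infrequent updates), but settle smoothly when the updates are small and frequent.<br>
<pre>
import numpy as np

def settle(alpha, steps=60, b=1.0, c=2.0):
    """Two mutually inhibiting units, updated in parallel.
    alpha is the fraction of the way each update moves toward its new target:
    alpha=1.0 mimics coarse, infrequent updates (each update jumps all the way),
    while a small alpha mimics fast, fine-grained parallel updates."""
    x = np.zeros(2)
    history = []
    for _ in range(steps):
        target = np.maximum(b - c * x[::-1], 0.0)   # inhibition from the other unit
        x = (1 - alpha) * x + alpha * target
        history.append(x.copy())
    return np.array(history)

coarse = settle(alpha=1.0)   # swings between (0,0) and (1,1): large oscillation
fine   = settle(alpha=0.1)   # settles smoothly toward a graded state
print("coarse last steps:\n", coarse[-4:])
print("fine last steps:\n", fine[-4:])
</pre>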
<br>
Therefore, as far as I understand, computer simulations of spiking neurons have not shown major brain functions,<br>
because they have to deal with slow oscillations that are very different from the brain's, e.g., as Dr. Henry Markram reported<br>
(40 Hz?). <br>
<br>
The above discussion again shows the power and necessity of an
overarching brain theory like that in my NAI book. <br>
Those who only simulate biological neurons using superficial
biological phenomena are not going to demonstrate <br>
any major brain functions. They can talk about signal statistics
from their simulations, but signal statistics are far from brain
functions. <br>
<br>
-John<div><div class="h5"><br>
<br>
<div>On 4/8/14 1:30 AM, Tsvi Achler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi John,
<div>ART evaluates distance between the contending
representation and the current input through vigilance. If
they are too far apart, a poor vigilance signal will be
triggered.</div>
<div>The best resonance will be achieved when they have the
least amount of distance.</div>
<div>If, in your model, K-nearest neighbors is used without a neural equivalent, then your model is not quite in the spirit of a connectionist model.</div>
<div>For example, Bayesian networks do a great job emulating brain behavior by modeling the integration of priors, and they have been invaluable for modeling cognitive studies. However, they assume a statistical configuration of connections and distributions that it is not quite known how to emulate with neurons. Thus pure Bayesian models are also questionable in terms of connectionist modeling. But some connectionist models can emulate some statistical models; for example, see section 2.4 of Thomas & McClelland's chapter in Sun's 2008 book (<a href="http://www.psyc.bbk.ac.uk/people/academic/thomas_m/TM_Cambridge_sub.pdf" target="_blank">http://www.psyc.bbk.ac.uk/people/academic/thomas_m/TM_Cambridge_sub.pdf</a>).</div>
<div>I am not suggesting Hodgkin-Huxley-level detailed neuron models; however, connectionist models should have their connections explicitly defined. </div>
<div>Sincerely,</div>
<div>-Tsvi</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Apr 7, 2014 at 10:58 AM, Juyang
Weng <span dir="ltr"><<a href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Tsvi,<br>
<br>
Note that ART uses a vigilance value to pick the first "acceptable" match in its sequential bottom-up and top-down search.<br>
I believe that is what Steve meant when he mentioned vigilance. <br>
<br>
Why do you think of ART as "a neural way to implement a K-nearest neighbor algorithm"? <br>
If not all the neighbors have participated in the sequential search,<br>
how can ART find the nearest neighbor, let alone the K nearest neighbors?<br>
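<br>
To make my question concrete, here is a toy contrast in Python (it is not ART; the activation and match measures are arbitrary) between an exhaustive nearest-neighbor search and a sequential search that stops at the first candidate exceeding an acceptance threshold:<br>
<pre>
import numpy as np

def nearest_neighbor(x, prototypes):
    """Exhaustive 1-nearest-neighbor search: every stored prototype participates."""
    return int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))

def first_acceptable(x, prototypes, vigilance):
    """A caricature of a sequential, acceptance-threshold search (not ART itself):
    candidates are examined in order of bottom-up activation, and the first one
    whose match exceeds the threshold is accepted.  With a low threshold, the
    search can stop before the truly nearest prototype has ever been examined."""
    activation = prototypes @ x                      # bottom-up "choice" signal
    for j in np.argsort(-activation):                # examine the strongest candidate first
        match = x @ prototypes[j] / (np.linalg.norm(x) * np.linalg.norm(prototypes[j]) + 1e-12)
        if match >= vigilance:                       # "acceptable" match: stop searching
            return int(j)
    return None                                      # nothing acceptable: recruit a new category

rng = np.random.default_rng(1)
P = rng.normal(size=(5, 8))
x = rng.normal(size=8)
print(nearest_neighbor(x, P), first_acceptable(x, P, vigilance=0.2))
</pre>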
<br>
Our DN uses an explicit k-nearest mechanism to find the k nearest neighbors in every network update, <br>
to avoid the problems of slow resonance in existing models of spiking neural networks. <br>
The explicit k-nearest mechanism itself is not meant to be biologically plausible, <br>
but it gives a computational advantage for the software simulation of large networks <br>
at speeds slower than 1000 network updates per second.<br>
<br>
I guess that more detailed molecular simulations of individual neuronal spikes (such as those using the Hodgkin-Huxley model of<br>
a neuron, the <a href="http://www.neuron.yale.edu/neuron/" target="_blank">NEURON software</a>, or <a href="http://bluebrain.epfl.ch/" target="_blank">the Blue Brain project</a> directed by the respected Dr. Henry Markram) <br>
are very useful for showing some detailed molecular, synaptic, and neuronal properties.<br>
However, they miss so many necessary brain-system-level mechanisms that it is difficult for them <br>
to show major brain-scale functions <br>
(such as learning to recognize and detect natural objects directly from natural cluttered scenes). <br>
<br>
According to my understanding, if one uses a detailed
neuronal model for each of a variety of neuronal types and<br>
connects those simulated neurons of different types
according to a diagram of Brodmann areas, <br>
his simulation is NOT going to lead to any major brain
function. <br>
He still needs brain-system-level knowledge such as that
taught in the BMI 871 course. <br>
<br>
-John <br>
<div>
<div> <br>
<div>On 4/7/14 8:07 AM, Tsvi Achler wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Dear Steve, John</div>
I think such discussions are great for sparking interest in feedback models (output back to input), which I feel should be given much more attention.
<div>In this vein, it may be better to discuss more of the details here than to suggest reading a reference.</div>
<div><br>
</div>
<div>Basically, I see ART as a neural way to implement a K-nearest neighbor algorithm. Clearly the effort ART makes to overcome the neural hurdles is immense, especially in figuring out how to coordinate neurons. However, it is also important to summarize such methods in algorithmic terms, which I attempt to do here (please comment/correct).</div>
<div>Instar learning is used to find the best weights for quick feedforward recognition without too much resonance (otherwise more resonance will be needed). Outstar learning is used to find the expectation of the patterns. The resonance mechanism evaluates distances between the "neighbors", i.e., how close the expectations of the differing outputs are to the input pattern. By choosing one winner, the network is equivalent to a 1-nearest-neighbor model. If you open it up to more winners (e.g., k winners), as you suggest, then it becomes a k-nearest-neighbor mechanism.</div>
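<div>In rough code form, here is what I mean (a deliberately simplified sketch with made-up names and update rules, not any published ART algorithm):</div>
<pre>
import numpy as np

def art_like_step(x, W_in, W_out, k=1, lr=0.5):
    """A sketch of the algorithmic summary above (illustrative only, not any
    published ART algorithm): instar-trained bottom-up weights nominate the k
    most active category nodes, outstar-trained top-down weights give each
    node's expectation, and the node whose expectation is closest to the input
    wins -- which is where the (k-)nearest-neighbor flavor comes from."""
    activation = W_in @ x                                       # quick feedforward pass
    candidates = np.argsort(-activation)[:k]                    # k contending categories
    dists = [np.linalg.norm(x - W_out[j]) for j in candidates]  # "resonance" as closeness of expectation
    winner = int(candidates[int(np.argmin(dists))])
    W_in[winner]  += lr * (x - W_in[winner])                    # instar: weights move toward the input
    W_out[winner] += lr * (x - W_out[winner])                   # outstar: expectation tracks the pattern
    return winner

rng = np.random.default_rng(2)
W_in, W_out = rng.random((10, 6)), rng.random((10, 6))
print(art_like_step(rng.random(6), W_in, W_out, k=3))
</pre>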
<div><br>
</div>
<div>Clearly I focused here on the main ART
modules and did not discuss other additions.
But I want to just focus on the main idea at
this point.</div>
<div>Sincerely,</div>
<div>-Tsvi</div>
</div>
<div class="gmail_extra"> <br>
<br>
<div class="gmail_quote">On Sun, Apr 6, 2014 at
1:30 PM, Stephen Grossberg <span dir="ltr"><<a href="mailto:steve@cns.bu.edu" target="_blank">steve@cns.bu.edu</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word"><font face="Arial" size="5">Dear John,</font>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">Thanks for
your questions. I reply below.</font></div>
<div> <font face="Arial" size="5"><br>
</font>
<div>
<div>
<div><font face="Arial" size="5">On
Apr 5, 2014, at 10:51 AM, Juyang
Weng wrote:</font></div>
<font face="Arial" size="5"><br>
</font>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"><font face="Arial" size="5"> Dear Steve,<br>
<br>
This is one of my long-time questions that I did not have a chance to ask you when I met you many times before. <br>
But it may be useful for some people on this list. <br>
Please accept my apology if my question implies any false impression that I did not intend.<br>
<br>
(1) Your statement below seems
to have confirmed my
understanding: <br>
Your top-down process in ART in
the late 1990's is basically for
finding an acceptable match <br>
between the input feature vector
and the stored feature vectors
represented by neurons (not
meant for the nearest match). <br>
</font></div>
</blockquote>
<div><font face="Arial" size="5"><br>
</font></div>
</div>
<font face="Arial" size="5">ART has
developed a lot since the 1990s. A
non-technical but fairly comprehensive
review article was published in 2012
in <i>Neural Networks</i> and can be
found at <a href="http://cns.bu.edu/%7Esteve/ART.pdf" target="_blank">http://cns.bu.edu/~steve/ART.pdf</a>.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">I do not
think about the top-down process in
ART in quite the way that you state
above. My reason for this is
summarized by the acronym CLEARS for
the processes of Consciousness,
Learning, Expectation, Attention,
Resonance, and Synchrony. </font><span style="font-family:Arial;font-size:x-large">All the CLEARS processes
come into this story, and </span><span style="font-family:Arial;font-size:x-large">ART top-down mechanisms
contribute to all of them. For me, the
most fundamental issues concern how
ART dynamically self-stabilizes the
memories that are learned within the
model's bottom-up adaptive filters and
top-down expectations. </span></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">In
particular, during learning, a big
enough mismatch can lead to hypothesis
testing and search for a new, or
previously learned, category that
leads to an acceptable match. The
criterion for what is "big enough
mismatch" or "acceptable match" is
regulated by a vigilance parameter
that can itself vary in a
state-dependent way.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">After
learning occurs, a bottom-up input
pattern typically directly selects the
best-matching category, without any
hypothesis testing or search. And even
if there is a reset due to a large
initial mismatch with a previously
active category, a single reset event
may lead directly to a matching
category that can directly resonate
with the data. </font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">I should
note that all of the foundational
predictions of ART now have
substantial bodies of psychological
and neurobiological data to support
them. See the review article if you
would like to read about them.</font></div>
<div>
<div><font face="Arial" size="5"><br>
</font>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"><font face="Arial" size="5"> The currently active
neuron is the one being examined
by the top down process<br>
</font></div>
</blockquote>
<div><font face="Arial" size="5"><br>
</font></div>
</div>
<font face="Arial" size="5">I'm not sure
what you mean by "being examined", but
perhaps my comment above may deal with
it.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">I should
comment, though, about your use of the
word "currently active neuron". I
assume that you mean at the category
level. </font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">In this
regard, there are two ART's. The first
aspect of ART is as a cognitive and
neural theory whose scope, which
includes perceptual, cognitive, and
adaptively timed cognitive-emotional
dynamics, among other processes, is
illustrated by the above referenced
2012 review article in <i>Neural
Networks</i>. In the biological
theory, there is no general commitment
to just one "currently active neuron".
One always considers the neuronal
population, or populations, that
represent a learned category.
Sometimes, but not always, a
winner-take-all category is chosen. </font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">The 2012
review article illustrates some of the
large data bases of psychological and
neurobiological data that have been
explained in a principled way,
quantitatively simulated, and
successfully predicted by ART over a
period of decades. ART-like processing
is, however, certainly not the only
kind of computation that may be needed
to understand how the brain works. The
paradigm called Complementary
Computing that I introduced awhile ago
makes precise the sense in which ART
may be just one kind of dynamics
supported by advanced brains. This is
also summarized in the review article.<br>
</font>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">The
second aspect of ART is as a series
of algorithms that mathematically
characterize key ART design
principles and mechanisms in a
focused setting, and provide
algorithms for large-scale
applications in engineering and
technology. ARTMAP, fuzzy ARTMAP,
and distributed ARTMAP are among
these, all of them developed with
Gail Carpenter. Some of these
algorithms use winner-take-all
categories to enable the proof of
mathematical theorems that
characterize how underlying design
principles work. In contrast, the
distributed ARTMAP family of
algorithms, developed by Gail
Carpenter and her colleagues, allows
for distributed category
representations without losing the
benefits of fast, incremental,
self-stabilizing learning and
prediction in response to large
non-stationary databases that can
include many unexpected events. </font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">See,
e.g., <a href="http://techlab.bu.edu/members/gail/articles/115_dART_NN_1997_.pdf" target="_blank">http://techlab.bu.edu/members/gail/articles/115_dART_NN_1997_.pdf</a>
and <a href="http://techlab.bu.edu/members/gail/articles/155_Fusion2008_CarpenterRavindran.pdf" target="_blank">http://techlab.bu.edu/members/gail/articles/155_Fusion2008_CarpenterRavindran.pdf</a>.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">I
should note that FAST learning is a
technical concept: it means that
each adaptive weight can converge to
its new equilibrium value on EACH
learning trial. That is why ART
algorithms can often successfully
carry out one-trial incremental
learning of a data base. This is not
true of many other algorithms, such
as back propagation, simulated
annealing, and the like, which all
experience catastrophic forgetting
if they try to do fast learning.
Almost all other learning algorithms
need to be run using slow learning,
which allows only a small increment
in the values of adaptive weights on
each learning trial, to avoid
massive memory instabilities, and
work best in response to stationary
data. Such algorithms often fail to
detect important rare cases, among
other limitations. ART can provably
learn in either the fast or slow
mode in response to non-stationary
data.</font></div>
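<div>As a bare numerical illustration of this technical sense of fast versus slow learning (a toy sketch only, not an ART algorithm; the vectors and rates are arbitrary):</div>
<pre>
import numpy as np

def learning_trial(w, x, rate):
    """One learning trial for a single adaptive weight vector.
    rate=1.0 is 'fast' learning in the technical sense: the weight reaches its
    new equilibrium value on this trial.  A small rate is 'slow' learning:
    only a small increment per trial.  (A bare illustration of the definition,
    not of how ART avoids catastrophic forgetting.)"""
    return w + rate * (x - w)

w = np.zeros(4)
x = np.array([1.0, 0.0, 0.5, 0.2])
print(learning_trial(w, x, rate=1.0))   # fast: the weight equals x after one trial
print(learning_trial(w, x, rate=0.05))  # slow: only a 5% step toward x
</pre>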
<div>
<div><font face="Arial" size="5"><br>
</font></div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"><font face="Arial" size="5"> in a sequential
fashion: one neuron after
another, until an acceptable
neuron is found.<br>
<br>
(2) The input to ART in the late 1990's is a single feature vector, taken as a monolithic input. <br>
By monolithic, I mean that all neurons take the entire input feature vector as input. <br>
I raise this point here because a neuron in ART in the late 1990's does not have an explicit local sensory receptive field (SRF), <br>
i.e., it is fully connected to all components of the input vector. A local SRF means that
each neuron is only connected to
a small region <br>
in an input image. <br>
</font></div>
</blockquote>
<div><font face="Arial" size="5"><br>
</font></div>
</div>
<font face="Arial" size="5">Various ART
algorithms for technology do use fully
connected networks. They represent a
single-channel case, which is often
sufficient in applications and which
simplifies mathematical proofs.
However, the single-channel case is,
as its name suggests, not a necessary
constraint on ART design. </font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">In
addition, many ART biological models
do not restrict themselves to the
single-channel case, and do have
receptive fields. These include the
LAMINART family of models that predict
functional roles for many identified
cell types in the laminar circuits of
cerebral cortex. These models
illustrate how variations of a shared
laminar circuit design can carry out
very different intelligent functions,
such as 3D vision (e.g., 3D LAMINART),
speech and language (e.g., cARTWORD),
and cognitive information processing
(e.g., LIST PARSE). They are all
summarized in the 2012 review article,
with the archival articles themselves
on my web page <a href="http://cns.bu.edu/%7Esteve" target="_blank">http://cns.bu.edu/~steve</a>. </font></div>
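<div>To make the connectivity distinction concrete, here is a toy sketch (not LAMINART or any other ART model; the grid layout and names are invented purely for illustration) of fully connected versus local-receptive-field connectivity:</div>
<pre>
import numpy as np

def connectivity_mask(input_side, n_neurons, srf_radius=None):
    """Boolean connection mask from an input_side x input_side image to
    n_neurons units.  With srf_radius=None every neuron is fully connected
    (the single-channel case); otherwise each neuron sees only a small square
    window of the image -- one simple reading of a local sensory receptive
    field.  Illustrative only; centers are placed along the image diagonal."""
    n_pixels = input_side * input_side
    if srf_radius is None:
        return np.ones((n_neurons, n_pixels), dtype=bool)
    mask = np.zeros((n_neurons, n_pixels), dtype=bool)
    centers = np.linspace(0, input_side - 1, n_neurons).astype(int)
    rows, cols = np.divmod(np.arange(n_pixels), input_side)
    for i, c in enumerate(centers):
        mask[i] = (np.abs(rows - c) <= srf_radius) & (np.abs(cols - c) <= srf_radius)
    return mask

print(connectivity_mask(8, 4).sum(axis=1))                 # fully connected: 64 inputs per neuron
print(connectivity_mask(8, 4, srf_radius=1).sum(axis=1))   # local SRFs: at most 9 inputs per neuron
</pre>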
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">The
existence of these laminar
variations-on-a-theme provides an
existence proof for the exciting goal
of designing a family of chips whose
specializations can realize all
aspects of higher intelligence, and
which can be consistently connected
because they all share a similar
underlying design. Work on achieving
this goal can productively occupy lots
of creative modelers and technologists
for many years to come.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">I hope
that the above replies provide some
relevant information, as well as
pointers for finding more.</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">Best,</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5">Steve</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div><font face="Arial" size="5"><br>
</font></div>
<div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">
<div> <font face="Arial" size="5"><br>
My apology again if my understanding above has errors, although I have examined the above two points carefully <br>
through several of your papers.<br>
<br>
Best regards,<br>
<br>
-John<br>
<br>
</font></div>
<div>
<pre cols="72"><font face="Arial"><span style="font-size:18px">Juyang (John) Weng, Professor
Department of Computer Science and Engineering
MSU Cognitive Science Program and MSU Neuroscience Program
428 S Shaw Ln Rm 3115
Michigan State University
East Lansing, MI 48824 USA
Tel: <a href="tel:517-353-4388" value="+15173534388" target="_blank">517-353-4388</a>
Fax: <a href="tel:517-432-1061" value="+15174321061" target="_blank">517-432-1061</a>
Email: <a href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>
URL: <a href="http://www.cse.msu.edu/%7Eweng/" target="_blank">http://www.cse.msu.edu/~weng/</a>
----------------------------------------------
</span></font></pre>
</div>
</div>
</blockquote>
</div>
<div><font face="Arial" size="5"><br>
</font>
<div> <font face="Arial" size="5"><span style="line-height:normal;text-indent:0px;border-collapse:separate;letter-spacing:normal;text-align:-webkit-auto;font-variant:normal;text-transform:none;font-style:normal;white-space:normal;font-weight:normal;word-spacing:0px"><span style="line-height:normal;text-indent:0px;border-collapse:separate;letter-spacing:normal;text-align:-webkit-auto;font-variant:normal;text-transform:none;font-style:normal;white-space:normal;font-weight:normal;word-spacing:0px">
<div style="word-wrap:break-word">
<span style="line-height:normal;text-indent:0px;border-collapse:separate;letter-spacing:normal;text-align:-webkit-auto;font-variant:normal;text-transform:none;font-style:normal;white-space:normal;font-weight:normal;word-spacing:0px">
<div style="word-wrap:break-word">
<div>
<div>
<div>
<div>Stephen
Grossberg</div>
<div>Wang Professor
of Cognitive and
Neural Systems</div>
<div>Professor of
Mathematics,
Psychology, and
Biomedical
Engineering</div>
<div>
<div>Director,
Center for
Adaptive
Systems <a href="http://www.cns.bu.edu/about/cas.html" target="_blank">http://www.cns.bu.edu/about/cas.html</a></div>
</div>
<div><a href="http://cns.bu.edu/%7Esteve" target="_blank">http://cns.bu.edu/~steve</a></div>
<div><a href="mailto:steve@bu.edu" target="_blank">steve@bu.edu</a></div>
</div>
</div>
</div>
<div><br>
</div>
</div>
</span></div>
</span><br>
</span><br>
</font></div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
<pre cols="72"><span><font color="#888888">--
--
Juyang (John) Weng, Professor
Department of Computer Science and Engineering
MSU Cognitive Science Program and MSU Neuroscience Program
428 S Shaw Ln Rm 3115
Michigan State University
East Lansing, MI 48824 USA
Tel: <a href="tel:517-353-4388" value="+15173534388" target="_blank">517-353-4388</a></font></span><div>
Fax: <a href="tel:517-432-1061" value="+15174321061" target="_blank">517-432-1061</a>
Email: <a href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>
URL: <a href="http://www.cse.msu.edu/%7Eweng/" target="_blank">http://www.cse.msu.edu/~weng/</a>
----------------------------------------------
</div></pre>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
<pre cols="72">--
--
Juyang (John) Weng, Professor
Department of Computer Science and Engineering
MSU Cognitive Science Program and MSU Neuroscience Program
428 S Shaw Ln Rm 3115
Michigan State University
East Lansing, MI 48824 USA
Tel: <a href="tel:517-353-4388" value="+15173534388" target="_blank">517-353-4388</a>
Fax: <a href="tel:517-432-1061" value="+15174321061" target="_blank">517-432-1061</a>
Email: <a href="mailto:weng@cse.msu.edu" target="_blank">weng@cse.msu.edu</a>
URL: <a href="http://www.cse.msu.edu/~weng/" target="_blank">http://www.cse.msu.edu/~weng/</a>
----------------------------------------------
</pre>
</div></div></div>
</blockquote></div><br></div>