Connectionists: how the brain works? (UNCLASSIFIED)

Mon Apr 7 13:58:56 EDT 2014

Tsvi,

Note that ART uses a vigilance value to pick up the first "acceptable" 
match in its sequential bottom-up and top-down search.
I believe that was what Steve meant when he mentioned vigilance.

Why do you see "ART as a neural way to implement a K-nearest neighbor
algorithm"?
If not all the neighbors have sequentially participated,
how can ART find the nearest neighbor, let alone K-nearest neighbor?
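
To make the distinction concrete, here is a minimal sketch, not ART itself.
It assumes a fuzzy-ART-style match ratio |min(x, w)| / |x| and a fixed,
hypothetical search order (in ART the order is set by bottom-up activation).
All function names are illustrative. It contrasts stopping at the first
match that passes vigilance with an exhaustive comparison in which every
stored template takes part.

```python
from typing import Optional, Sequence


def match_score(x: Sequence[float], w: Sequence[float]) -> float:
    """Assumed match measure: fraction of the input covered by the template."""
    overlap = sum(min(xi, wi) for xi, wi in zip(x, w))
    total = sum(x)
    return overlap / total if total > 0 else 0.0


def first_acceptable(x, templates, search_order, vigilance) -> Optional[int]:
    """Visit templates in search_order; stop at the first match >= vigilance."""
    for j in search_order:
        if match_score(x, templates[j]) >= vigilance:
            return j
    return None  # no stored template is acceptable


def best_match(x, templates) -> int:
    """Exhaustive comparison: every template participates."""
    return max(range(len(templates)), key=lambda j: match_score(x, templates[j]))


# Hypothetical stored templates and input, chosen only for illustration.
templates = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
x = [1.0, 1.0, 1.0, 0.0]

early = first_acceptable(x, templates, search_order=[0, 1, 2], vigilance=0.6)
nearest = best_match(x, templates)
print(early, nearest)
```

With these made-up values the sequential search stops at template 0 (match
2/3, which passes a vigilance of 0.6), while the exhaustive comparison
selects template 1 (match 1.0). Raising the vigilance to 0.9 makes both
agree, which is why the outcome depends on vigilance and search order.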

Our DN uses an explicit k-nearest mechanism to find the k-nearest 
neighbors in every network update,
to avoid the problems of slow resonance in existing models of spiking 
neuronal networks.
The explicit k-nearest mechanism itself is not meant to be biologically 
plausible,
but it gives a computational advantage for software simulation of large 
networks
at a speed slower than 1000 network updates per second.
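
A rough sketch of what an explicit k-nearest (top-k) selection can look
like in software follows. This is not the DN code: the inner-product
response, the function names, and the example weights are all assumptions
made for illustration. The point is only that the k winners are found in
one pass over the responses, with no iterative settling.

```python
import heapq
from typing import List, Sequence, Tuple


def inner_product(x: Sequence[float], w: Sequence[float]) -> float:
    """Assumed response measure: plain inner product of input and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))


def top_k_winners(
    x: Sequence[float], weights: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (neuron index, response) for the k largest responses."""
    responses = [(inner_product(x, w), j) for j, w in enumerate(weights)]
    winners = heapq.nlargest(k, responses)  # single pass, no resonance loop
    return [(j, r) for r, j in winners]


# Hypothetical weight vectors, one per neuron.
weights = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
x = [0.8, 0.6, 0.0]

winners = top_k_winners(x, weights, k=2)
winner_indices = [j for j, _ in winners]
print(winner_indices)
```

In this example neurons 1 and 0 win, in that order. All other neurons
would simply be suppressed for that update.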

I guess that more detailed molecular simulations of individual neuronal 
spikes (such as using the Hodgkin-Huxley model of
a neuron, using the NEURON software, 
<http://www.neuron.yale.edu/neuron/> or like the Blue Brain project 
<http://bluebrain.epfl.ch/> directed by the respected Dr. Henry Markram)
are very useful for showing some detailed molecular, synaptic, and 
neuronal properties.
However, they miss necessary brain-system-level mechanisms so much that 
it is difficult for them
to show major brain-scale functions
(such as learning to recognize objects and detection of natural objects 
directly from natural cluttered scenes).

According to my understanding, if one uses a detailed neuronal model for 
each of a variety of neuronal types and
connects those simulated neurons of different types according to a 
diagram of Brodmann areas,
the simulation is NOT going to lead to any major brain function.
One still needs brain-system-level knowledge such as that taught in the
BMI 871 course.

-John

On 4/7/14 8:07 AM, Tsvi Achler wrote:
> Dear Steve, John
> I think such discussions are great for sparking interest in feedback
> (output back to input) models, which I feel should be given much
> more attention.
> In this vein, it may be better to discuss more of the details here than
> to suggest reading a reference.
>
> Basically I see ART as a neural way to implement a K-nearest neighbor 
> algorithm.  Clearly the way ART overcomes the neural hurdles is 
> immense especially in figuring out how to coordinate neurons.  However 
> it is also important to summarize such methods in algorithmic terms 
>  which I attempt to do here (and please comment/correct).
> Instar learning is used to find the best weights for quick feedforward 
> recognition without too much resonance (otherwise more resonance will 
> be needed).  Outstar learning is used to find the expectation of the 
> patterns.  The resonance mechanism evaluates distances between the 
> "neighbors" evaluating how close differing outputs are to the input 
> pattern (using the expectation).  By choosing one winner the network 
> is equivalent to a 1-nearest neighbor model.  If you open it up to 
> more winners (e.g., k winners), as you suggest, then it becomes a
> k-nearest neighbor mechanism.
>
> Clearly I focused here on the main ART modules and did not discuss 
> other additions.  But I want to just focus on the main idea at this point.
> Sincerely,
> -Tsvi
>
>
> On Sun, Apr 6, 2014 at 1:30 PM, Stephen Grossberg
> <steve at cns.bu.edu> wrote:
>
>     Dear John,
>
>     Thanks for your questions. I reply below.
>
>     On Apr 5, 2014, at 10:51 AM, Juyang Weng wrote:
>
>>     Dear Steve,
>>
>>     These are long-time questions of mine that I did not have a
>>     chance to ask you on the many occasions when we met.
>>     They may be useful for some people on this list.
>>     Please accept my apology if my questions give any false
>>     impression that I did not intend.
>>
>>     (1) Your statement below seems to have confirmed my understanding:
>>     Your top-down process in ART in the late 1990's is basically for
>>     finding an acceptable match
>>     between the input feature vector and the stored feature vectors
>>     represented by neurons (not meant for the nearest match).
>
>     ART has developed a lot since the 1990s. A non-technical but
>     fairly comprehensive review article was published in 2012 in
>     /Neural Networks/ and can be found at
>     http://cns.bu.edu/~steve/ART.pdf.
>
>     I do not think about the top-down process in ART in quite the way
>     that you state above. My reason for this is summarized by the
>     acronym CLEARS for the processes of Consciousness, Learning,
>     Expectation, Attention, Resonance, and Synchrony. All the CLEARS
>     processes come into this story, and ART top-down mechanisms
>     contribute to all of them. For me, the most fundamental issues
>     concern how ART dynamically self-stabilizes the memories that are
>     learned within the model's bottom-up adaptive filters and top-down
>     expectations.
>
>     In particular, during learning, a big enough mismatch can lead to
>     hypothesis testing and search for a new, or previously learned,
>     category that leads to an acceptable match. The criterion for what
>     is "big enough mismatch" or "acceptable match" is regulated by a
>     vigilance parameter that can itself vary in a state-dependent way.
>
>     After learning occurs, a bottom-up input pattern typically
>     directly selects the best-matching category, without any
>     hypothesis testing or search. And even if there is a reset due to
>     a large initial mismatch with a previously active category, a
>     single reset event may lead directly to a matching category that
>     can directly resonate with the data.
>
>     I should note that all of the foundational predictions of ART now
>     have substantial bodies of psychological and neurobiological data
>     to support them. See the review article if you would like to read
>     about them.
>
>>     The currently active neuron is the one being examined by the top
>>     down process
>
>     I'm not sure what you mean by "being examined", but perhaps my
>     comment above may deal with it.
>
>     I should comment, though, about your use of the word "currently
>     active neuron". I assume that you mean at the category level.
>
>     In this regard, there are two ART's. The first aspect of ART is as
>     a cognitive and neural theory whose scope, which includes
>     perceptual, cognitive, and adaptively timed cognitive-emotional
>     dynamics, among other processes, is illustrated by the above
>     referenced 2012 review article in /Neural Networks/. In the
>     biological theory, there is no general commitment to just one
>     "currently active neuron". One always considers the neuronal
>     population, or populations, that represent a learned category.
>     Sometimes, but not always, a winner-take-all category is chosen.
>
>     The 2012 review article illustrates some of the large data bases
>     of psychological and neurobiological data that have been explained
>     in a principled way, quantitatively simulated, and successfully
>     predicted by ART over a period of decades. ART-like processing is,
>     however, certainly not the only kind of computation that may be
>     needed to understand how the brain works. The paradigm called
>     Complementary Computing that I introduced a while ago makes precise
>     the sense in which ART may be just one kind of dynamics supported
>     by advanced brains. This is also summarized in the review article.
>
>     The second aspect of ART is as a series of algorithms that
>     mathematically characterize key ART design principles and
>     mechanisms in a focused setting, and provide algorithms for
>     large-scale applications in engineering and technology. ARTMAP,
>     fuzzy ARTMAP, and distributed ARTMAP are among these, all of them
>     developed with Gail Carpenter. Some of these algorithms use
>     winner-take-all categories to enable the proof of mathematical
>     theorems that characterize how underlying design principles work.
>     In contrast, the distributed ARTMAP family of algorithms,
>     developed by Gail Carpenter and her colleagues, allows for
>     distributed category representations without losing the benefits
>     of fast, incremental, self-stabilizing learning and prediction in
>     response to large non-stationary databases that can include many
>     unexpected events.
>
>     See, e.g.,
>     http://techlab.bu.edu/members/gail/articles/115_dART_NN_1997_.pdf
>     and
>     http://techlab.bu.edu/members/gail/articles/155_Fusion2008_CarpenterRavindran.pdf.
>
>     I should note that FAST learning is a technical concept: it means
>     that each adaptive weight can converge to its new equilibrium
>     value on EACH learning trial. That is why ART algorithms can often
>     successfully carry out one-trial incremental learning of a data
>     base. This is not true of many other algorithms, such as back
>     propagation, simulated annealing, and the like, which all
>     experience catastrophic forgetting if they try to do fast
>     learning. Almost all other learning algorithms need to be run
>     using slow learning, that allows only a small increment in the
>     values of adaptive weights on each learning trial, to avoid
>     massive memory instabilities, and work best in response to
>     stationary data. Such algorithms often fail to detect important
>     rare cases, among other limitations. ART can provably learn in
>     either the fast or slow mode in response to non-stationary data.
>
>>     in a sequential fashion: one neuron after another, until an
>>     acceptable neuron is found.
>>
>>     (2) The input to the ART in the late 1990's is for a single
>>     feature vector as a monolithic input.
>>     By monolithic, I mean that all neurons take the entire input
>>     feature vector as input.
>>     I raise this point here because neurons in ART in the late 1990's
>>     do not have an explicit local sensory receptive field (SRF),
>>     i.e., they are fully connected from all components of the input
>>     vector.   A local SRF means that each neuron is only connected to
>>     a small region
>>     in an input image.
>
>     Various ART algorithms for technology do use fully connected
>     networks. They represent a single-channel case, which is often
>     sufficient in applications and which simplifies mathematical
>     proofs. However, the single-channel case is, as its name suggests,
>     not a necessary constraint on ART design.
>
>     In addition, many ART biological models do not restrict themselves
>     to the single-channel case, and do have receptive fields. These
>     include the LAMINART family of models that predict functional
>     roles for many identified cell types in the laminar circuits of
>     cerebral cortex. These models illustrate how variations of a
>     shared laminar circuit design can carry out very different
>     intelligent functions, such as 3D vision (e.g., 3D LAMINART),
>     speech and language (e.g., cARTWORD), and cognitive information
>     processing (e.g., LIST PARSE). They are all summarized in the 2012
>     review article, with the archival articles themselves on my web
>     page http://cns.bu.edu/~steve.
>
>     The existence of these laminar variations-on-a-theme provides an
>     existence proof for the exciting goal of designing a family of
>     chips whose specializations can realize all aspects of higher
>     intelligence, and which can be consistently connected because they
>     all share a similar underlying design. Work on achieving this goal
>     can productively occupy lots of creative modelers and
>     technologists for many years to come.
>
>     I hope that the above replies provide some relevant information,
>     as well as pointers for finding more.
>
>     Best,
>
>     Steve
>
>
>
>>
>>     My apologies again if my understanding above has errors, although I
>>     have examined the above two points carefully
>>     through multiple papers of yours.
>>
>>     Best regards,
>>
>>     -John
>>
>>     Juyang (John) Weng, Professor
>>     Department of Computer Science and Engineering
>>     MSU Cognitive Science Program and MSU Neuroscience Program
>>     428 S Shaw Ln Rm 3115
>>     Michigan State University
>>     East Lansing, MI 48824 USA
>>     Tel: 517-353-4388
>>     Fax: 517-432-1061
>>     Email: weng at cse.msu.edu
>>     URL: http://www.cse.msu.edu/~weng/
>>     ----------------------------------------------
>>
>
>     Stephen Grossberg
>     Wang Professor of Cognitive and Neural Systems
>     Professor of Mathematics, Psychology, and Biomedical Engineering
>     Director, Center for Adaptive Systems
>     http://www.cns.bu.edu/about/cas.html
>     http://cns.bu.edu/~steve
>     steve at bu.edu
>
>
>
>
>

-- 
Juyang (John) Weng, Professor
Department of Computer Science and Engineering
MSU Cognitive Science Program and MSU Neuroscience Program
428 S Shaw Ln Rm 3115
Michigan State University
East Lansing, MI 48824 USA
Tel: 517-353-4388
Fax: 517-432-1061
Email: weng at cse.msu.edu
URL: http://www.cse.msu.edu/~weng/
----------------------------------------------
