Connectionists: how the brain works? (UNCLASSIFIED)

Tsvi Achler achler at gmail.com
Thu Apr 10 20:42:15 EDT 2014


I can't comment on most of this, but I am not sure that all models of
sparsity and sparse coding fall into the connectionist realm either,
because some make statistical assumptions.
-Tsvi


On Tue, Apr 8, 2014 at 9:19 PM, Juyang Weng <weng at cse.msu.edu> wrote:

>  Tsvi:
>
> Let me explain a little more detail:
>
> There are two large categories of biological neurons, excitatory and
> inhibitory.   Both develop mainly through signal statistics and are
> not specified primarily by the genome.   Not all people agree with
> this point, but please tolerate my view for now.
> I give a more detailed discussion of this view in my NAI book.
>
> The main effect of inhibitory connections is to reduce the number of
> firing neurons (David Field called it sparse coding), with the help of
> excitatory connections.  This sparse coding is important because the
> neurons that do not fire hold the long-term memory of the area at that
> point in time.  My view differs from David Field's: he wrote that sparse
> coding is for the current representations, whereas I think sparse coding
> is necessary for long-term memory.  Not all people agree with this point,
> but please tolerate my view for now.
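>
> As a minimal sketch of this sparsening effect (assuming a simple
> rate-coded layer and a generic top-k rule, not any particular biological
> circuit), in Python:
>
>   import numpy as np
>
>   def k_winner_take_all(pre_activation, k):
>       """Keep only the k most active neurons; silence the rest.
>
>       A crude stand-in for the net effect of inhibition: only a few
>       neurons fire (a sparse code), while the silent ones keep their
>       stored weights untouched.
>       """
>       out = np.zeros_like(pre_activation)
>       winners = np.argsort(pre_activation)[-k:]  # indices of the k largest responses
>       out[winners] = pre_activation[winners]
>       return out
>
>   responses = np.random.rand(100)           # pre-inhibition responses of 100 neurons
>   sparse = k_winner_take_all(responses, 5)   # only 5 neurons remain active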
>
> However, this reduction requires very fast parallel neuronal updates to
> avoid uncontrollable large-magnitude oscillations.
> Even with the fast biological parallel neuronal updates, we still see slow
> but small-magnitude oscillations such as the
> well-known theta waves and alpha waves.   My view is that such slow but
> small-magnitude oscillations are side effects of
> excitatory and inhibitory connections that form many loops, not something
> really desirable for brain operation (sorry,
> Paul Werbos).  Not all people agree with this point, but please
> tolerate my view for now.
>
> Therefore, as far as I understand, computer simulations of spiking
> neurons have not shown major brain functions,
> because they have to deal with slow oscillations that are very
> different from the brain's, e.g., the oscillations Dr. Henry Markram
> reported (40 Hz?).
>
> The above discussion again shows the power and necessity of an overarching
> brain theory like that in my NAI book.
> Those who only simulate biological neurons using superficial biological
> phenomena are not going to demonstrate
> any major brain functions.  They can talk about signal statistics from
> their simulations, but signal statistics are far from brain functions.
>
> -John
>
>
> On 4/8/14 1:30 AM, Tsvi Achler wrote:
>
> Hi John,
> ART evaluates the distance between the contending representation and the
> current input through vigilance.  If they are too far apart, the
> vigilance test fails and a reset signal is triggered.
> The best resonance is achieved when the distance between them is
> smallest.
> If, in your model, K-nearest neighbors is used without a neural equivalent,
> then your model is not quite in the spirit of a connectionist model.
> For example, Bayesian networks do a great job of emulating brain behavior,
> modeling the integration of priors, and have been invaluable for modeling
> cognitive studies.  However, they assume a statistical configuration of
> connections and distributions that it is not quite known how to emulate with
> neurons.  Thus pure Bayesian models are also questionable in terms of
> connectionist modeling.  But some connectionist models can emulate some
> statistical models; for example, see section 2.4 in Thomas & McClelland's
> chapter in Sun's 2008 book (
> http://www.psyc.bbk.ac.uk/people/academic/thomas_m/TM_Cambridge_sub.pdf).
> I am not suggesting Hodgkin-Huxley-level detailed neuron models; however,
> connectionist models should have their connections explicitly defined.
> Sincerely,
> -Tsvi
>
>
>
> On Mon, Apr 7, 2014 at 10:58 AM, Juyang Weng <weng at cse.msu.edu> wrote:
>
>>  Tsvi,
>>
>> Note that ART uses a vigilance value to pick up the first "acceptable"
>> match in its sequential bottom-up and top-down search.
>> I believe that is what Steve meant when he mentioned vigilance.
>>
>> Why do you think of ART as "a neural way to implement a K-nearest neighbor
>> algorithm"?
>> If not all the neighbors have sequentially participated,
>> how can ART find the nearest neighbor, let alone the K nearest neighbors?
>>
>> Our DN uses an explicit k-nearest mechanism to find the k-nearest
>> neighbors in every network update,
>> to avoid the problems of slow resonance in existing models of spiking
>> neuronal networks.
>> The explicit k-nearest mechanism itself is not meant to be biologically
>> plausible,
>> but it gives a computational advantage for software simulation of large
>> networks
>> at a speed slower than 1000 network updates per second.
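>>
>> (As a minimal sketch of what such an explicit top-k step can look like
>> inside one network update; a single rate-coded layer with made-up names,
>> not DN's actual equations:)
>>
>>   import numpy as np
>>
>>   def network_update(x, W, k):
>>       """One update: compute all responses, then keep the k best directly.
>>
>>       Sorting replaces the many settling iterations that lateral
>>       inhibition or resonance would otherwise need, which is what makes
>>       large software simulations fast.
>>       """
>>       r = W @ x                      # bottom-up responses of all neurons
>>       y = np.zeros_like(r)
>>       top_k = np.argsort(r)[-k:]     # explicit top-k (k-nearest) selection
>>       y[top_k] = r[top_k]
>>       return y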
>>
>> I guess that more detailed molecular simulations of individual neuronal
>> spikes (such as those using the Hodgkin-Huxley model of
>> a neuron with the NEURON software <http://www.neuron.yale.edu/neuron/>, or the
>> Blue Brain project <http://bluebrain.epfl.ch/> directed by the respected Dr.
>> Henry Markram)
>> are very useful for showing some detailed molecular, synaptic, and
>> neuronal properties.
>> However, they miss so many necessary brain-system-level mechanisms that
>> it is difficult for them
>> to show major brain-scale functions
>> (such as learning to recognize and detect natural objects
>> directly from natural cluttered scenes).
>>
>> According to my understanding, if one uses a detailed neuronal model for
>> each of a variety of neuronal types and
>> connects those simulated neurons of different types according to a
>> diagram of Brodmann areas,
>> his simulation is NOT going to lead to any major brain function.
>> He still needs brain-system-level knowledge such as that taught in the
>> BMI 871 course.
>>
>> -John
>>
>> On 4/7/14 8:07 AM, Tsvi Achler wrote:
>>
>>  Dear Steve, John
>> I think such discussions are great for sparking interest in feedback (output
>> back to input) models, which I feel should be given much more
>> attention.
>> In this vein it may be better to discuss more of the details here than to
>> suggest reading a reference.
>>
>>  Basically I see ART as a neural way to implement a K-nearest neighbor
>> algorithm.  Clearly the effort ART puts into overcoming the neural hurdles
>> is immense, especially in figuring out how to coordinate neurons.  However,
>> it is also important to summarize such methods in algorithmic terms, which
>> I attempt to do here (please comment/correct).
>> Instar learning is used to find the best weights for quick feedforward
>> recognition without too much resonance (otherwise more resonance will be
>> needed).  Outstar learning is used to find the expectation of the patterns.
>> The resonance mechanism evaluates distances between the "neighbors",
>> evaluating how close the differing outputs are to the input pattern (using
>> the expectation).  By choosing one winner, the network is equivalent to a
>> 1-nearest-neighbor model.  If you open it up to more winners (e.g., k
>> winners), as you suggest, then it becomes a k-nearest-neighbor mechanism.
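>>
>> (To make this algorithmic reading concrete, here is a minimal
>> nearest-neighbor sketch in Python, with a plain distance standing in for
>> ART's resonance machinery; the names are mine, not ART's:)
>>
>>   import numpy as np
>>
>>   def nearest_categories(x, prototypes, k=1):
>>       """Return the indices of the k stored prototypes closest to input x.
>>
>>       k = 1 corresponds to choosing a single winner; a larger k
>>       corresponds to letting k winners stay active.
>>       """
>>       d = np.linalg.norm(prototypes - x, axis=1)  # distance to every prototype
>>       return np.argsort(d)[:k]                    # the k nearest prototypes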
>>
>>  Clearly I focused here on the main ART modules and did not discuss
>> other additions.  But I want to just focus on the main idea at this point.
>> Sincerely,
>> -Tsvi
>>
>>
>> On Sun, Apr 6, 2014 at 1:30 PM, Stephen Grossberg <steve at cns.bu.edu> wrote:
>>
>>> Dear John,
>>>
>>>  Thanks for your questions. I reply below.
>>>
>>>   On Apr 5, 2014, at 10:51 AM, Juyang Weng wrote:
>>>
>>>   Dear Steve,
>>>
>>> This is one of the long-standing questions that I did not have a chance to
>>> ask you when I met you many times before.
>>> But it may be useful for some people on this list.
>>> Please accept my apology if my question implies any false impression
>>> that I did not intend.
>>>
>>> (1) Your statement below seems to have confirmed my understanding:
>>> Your top-down process in ART in the late 1990's is basically for finding
>>> an acceptable match
>>> between the input feature vector and the stored feature vectors
>>> represented by neurons (not meant for the nearest match).
>>>
>>>
>>>  ART has developed a lot since the 1990s. A non-technical but fairly
>>> comprehensive review article was published in 2012 in *Neural Networks* and can be found at
>>> http://cns.bu.edu/~steve/ART.pdf.
>>>
>>>  I do not think about the top-down process in ART in quite the way that
>>> you state above. My reason for this is summarized by the acronym CLEARS for
>>> the processes of Consciousness, Learning, Expectation, Attention,
>>> Resonance, and Synchrony. All the CLEARS processes come into this
>>> story, and ART top-down mechanisms contribute to all of them. For me,
>>> the most fundamental issues concern how ART dynamically self-stabilizes the
>>> memories that are learned within the model's bottom-up adaptive filters and
>>> top-down expectations.
>>>
>>>  In particular, during learning, a big enough mismatch can lead to
>>> hypothesis testing and search for a new, or previously learned, category
>>> that leads to an acceptable match. The criterion for what is "big enough
>>> mismatch" or "acceptable match" is regulated by a vigilance parameter that
>>> can itself vary in a state-dependent way.
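>>>
>>>  (For concreteness, in the fuzzy ART algorithm the "acceptable match" test
>>> for a chosen category J takes roughly the following form; this is a minimal
>>> sketch, not the full search loop:)
>>>
>>>   import numpy as np
>>>
>>>   def match_is_acceptable(I, w_J, rho):
>>>       """Fuzzy ART vigilance test: |I ^ w_J| / |I| >= rho.
>>>
>>>       '^' is the component-wise minimum (fuzzy AND).  If the test fails,
>>>       the chosen category is reset and the search moves on to another one.
>>>       """
>>>       return np.minimum(I, w_J).sum() / I.sum() >= rho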
>>>
>>>  After learning occurs, a bottom-up input pattern typically directly
>>> selects the best-matching category, without any hypothesis testing or
>>> search. And even if there is a reset due to a large initial mismatch with a
>>> previously active category, a single reset event may lead directly to a
>>> matching category that can directly resonate with the data.
>>>
>>>  I should note that all of the foundational predictions of ART now have
>>> substantial bodies of psychological and neurobiological data to support
>>> them. See the review article if you would like to read about them.
>>>
>>>   The currently active neuron is the one being examined by the top down
>>> process
>>>
>>>
>>>  I'm not sure what you mean by "being examined", but perhaps my comment
>>> above may deal with it.
>>>
>>>  I should comment, though, about your use of the word "currently active
>>> neuron". I assume that you mean at the category level.
>>>
>>>  In this regard, there are two ART's. The first aspect of ART is as a
>>> cognitive and neural theory whose scope, which includes perceptual,
>>> cognitive, and adaptively timed cognitive-emotional dynamics, among other
>>> processes, is illustrated by the above referenced 2012 review article in *Neural
>>> Networks*. In the biological theory, there is no general commitment to
>>> just one "currently active neuron". One always considers the neuronal
>>> population, or populations, that represent a learned category. Sometimes,
>>> but not always, a winner-take-all category is chosen.
>>>
>>>  The 2012 review article illustrates some of the large data bases of
>>> psychological and neurobiological data that have been explained in a
>>> principled way, quantitatively simulated, and successfully predicted by ART
>>> over a period of decades. ART-like processing is, however, certainly not
>>> the only kind of computation that may be needed to understand how the brain
>>> works. The paradigm called Complementary Computing that I introduced a while
>>> ago makes precise the sense in which ART may be just one kind of dynamics
>>> supported by advanced brains. This is also summarized in the review article.
>>>
>>>  The second aspect of ART is as a series of algorithms that
>>> mathematically characterize key ART design principles and mechanisms in a
>>> focused setting, and provide algorithms for large-scale applications in
>>> engineering and technology. ARTMAP, fuzzy ARTMAP, and distributed ARTMAP
>>> are among these, all of them developed with Gail Carpenter. Some of these
>>> algorithms use winner-take-all categories to enable the proof of
>>> mathematical theorems that characterize how underlying design principles
>>> work. In contrast, the distributed ARTMAP family of algorithms, developed
>>> by Gail Carpenter and her colleagues, allows for distributed category
>>> representations without losing the benefits of fast, incremental,
>>> self-stabilizing learning and prediction in response to large
>>> non-stationary databases that can include many unexpected events.
>>>
>>>  See, e.g.,
>>> http://techlab.bu.edu/members/gail/articles/115_dART_NN_1997_.pdf and
>>> http://techlab.bu.edu/members/gail/articles/155_Fusion2008_CarpenterRavindran.pdf
>>> .
>>>
>>>  I should note that FAST learning is a technical concept: it means that
>>> each adaptive weight can converge to its new equilibrium value on EACH
>>> learning trial. That is why ART algorithms can often successfully carry out
>>> one-trial incremental learning of a database. This is not true of many
>>> other algorithms, such as back propagation, simulated annealing, and the
>>> like, which all experience catastrophic forgetting if they try to do fast
>>> learning. Almost all other learning algorithms need to be run using slow
>>> learning, which allows only a small increment in the values of adaptive
>>> weights on each learning trial, to avoid massive memory instabilities, and
>>> work best in response to stationary data. Such algorithms often fail to
>>> detect important rare cases, among other limitations. ART can provably
>>> learn in either the fast or slow mode in response to non-stationary data.
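>>>
>>>  (Again taking fuzzy ART as the concrete instance: the learning rate beta
>>> below controls fast versus slow learning; with beta = 1 the weight of the
>>> resonating category reaches its new equilibrium value on a single trial.
>>> A sketch only:)
>>>
>>>   import numpy as np
>>>
>>>   def update_weights(I, w_J, beta):
>>>       """Fuzzy ART weight update for the resonating category J.
>>>
>>>       beta = 1.0 is fast learning: w_J jumps to I ^ w_J on this one trial.
>>>       A small beta is slow learning: only a small increment per trial.
>>>       """
>>>       return beta * np.minimum(I, w_J) + (1.0 - beta) * w_J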
>>>
>>>   in a sequential fashion: one neuron after another, until an
>>> acceptable neuron is found.
>>>
>>> (2) The input to ART in the late 1990's is a single feature
>>> vector, taken as a monolithic input.
>>> By monolithic, I mean that all neurons take the entire input feature
>>> vector as input.
>>> I raise this point here because a neuron in ART in the late 1990's does
>>> not have an explicit local sensory receptive field (SRF),
>>> i.e., it is fully connected to all components of the input vector.   A
>>> local SRF means that each neuron is only connected to a small region
>>> in an input image.
>>>
>>>
>>>  Various ART algorithms for technology do use fully connected networks.
>>> They represent a single-channel case, which is often sufficient in
>>> applications and which simplifies mathematical proofs. However, the
>>> single-channel case is, as its name suggests, not a necessary constraint on
>>> ART design.
>>>
>>>  In addition, many ART biological models do not restrict themselves to
>>> the single-channel case, and do have receptive fields. These include the
>>> LAMINART family of models that predict functional roles for many identified
>>> cell types in the laminar circuits of cerebral cortex. These models
>>> illustrate how variations of a shared laminar circuit design can carry out
>>> very different intelligent functions, such as 3D vision (e.g., 3D
>>> LAMINART), speech and language (e.g., cARTWORD), and cognitive information
>>> processing (e.g., LIST PARSE). They are all summarized in the 2012 review
>>> article, with the archival articles themselves on my web page
>>> http://cns.bu.edu/~steve.
>>>
>>>  The existence of these laminar variations-on-a-theme provides an
>>> existence proof for the exciting goal of designing a family of chips whose
>>> specializations can realize all aspects of higher intelligence, and which
>>> can be consistently connected because they all share a similar underlying
>>> design. Work on achieving this goal can productively occupy lots of
>>> creative modelers and technologists for many years to come.
>>>
>>>  I hope that the above replies provide some relevant information, as
>>> well as pointers for finding more.
>>>
>>>  Best,
>>>
>>>  Steve
>>>
>>>
>>>
>>>
>>> My apologies again if my understanding above has errors, although I have
>>> examined the above two points carefully
>>> across several of your papers.
>>>
>>> Best regards,
>>>
>>> -John
>>>
>>>  Juyang (John) Weng, Professor
>>> Department of Computer Science and Engineering
>>> MSU Cognitive Science Program and MSU Neuroscience Program
>>> 428 S Shaw Ln Rm 3115
>>> Michigan State University
>>> East Lansing, MI 48824 USA
>>> Tel: 517-353-4388
>>> Fax: 517-432-1061
>>> Email: weng at cse.msu.edu
>>> URL: http://www.cse.msu.edu/~weng/
>>> ----------------------------------------------
>>>
>>>
>>>
>>>      Stephen Grossberg
>>> Wang Professor of Cognitive and Neural Systems
>>> Professor of Mathematics, Psychology, and Biomedical Engineering
>>>  Director, Center for Adaptive Systems
>>> http://www.cns.bu.edu/about/cas.html
>>>  http://cns.bu.edu/~steve
>>> steve at bu.edu
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Juyang (John) Weng, Professor
>> Department of Computer Science and Engineering
>> MSU Cognitive Science Program and MSU Neuroscience Program
>> 428 S Shaw Ln Rm 3115
>> Michigan State University
>> East Lansing, MI 48824 USA
>> Tel: 517-353-4388
>>
>> Fax: 517-432-1061
>> Email: weng at cse.msu.edu
>> URL: http://www.cse.msu.edu/~weng/
>> ----------------------------------------------
>>
>>
>>
>
> --
> Juyang (John) Weng, Professor
> Department of Computer Science and Engineering
> MSU Cognitive Science Program and MSU Neuroscience Program
> 428 S Shaw Ln Rm 3115
> Michigan State University
> East Lansing, MI 48824 USA
> Tel: 517-353-4388
> Fax: 517-432-1061
> Email: weng at cse.msu.edu
> URL: http://www.cse.msu.edu/~weng/
> ----------------------------------------------
>
>
>