Connectionists: New paper on why modules evolve, and how to evolve modular neural networks

Andrew Coward andrew.coward at anu.edu.au
Sat Feb 23 13:34:50 EST 2013


I think there are a number of conflicting (sometimes implicit) definitions of "module", which generates a lot of confusion.

In the late 1970s and early 1980s I was involved in perhaps the first design of a very complex telecommunications system using electronics (rather than electromechanical relays). The project required transistor design, integrated circuit design, printed circuit assembly design, system design, software language design, and the writing of millions of lines of code. Several thousand engineers worked for about 5 years before delivery of the first system. The experience provided a fair amount of insight into the architectural constraints on extremely complex systems, constraints which, through natural selection, apply to the brain even though it is a completely different kind of system.

In this context, the driving force for modules is the need to limit the physical information processing resources required. A module is a set of physical resources that performs a group of similar information processes. The resources of each module are optimized to perform one type of process very efficiently, minimizing the overall resources required.

The criterion for grouping information processes together is simply that they can be performed efficiently on the same customized physical resources, so the groupings have little to do with the way system behaviour is divided up into features. Any system feature (or cognitive feature) will require information processes performed by many different modules, and there will be minimal correspondence between modules and features.

A problem therefore arises because a change to a feature is implemented by changes to some of the modules it uses. The more features that use a type of information process (i.e. share the same module), the greater the risk that changes to one feature (e.g. through learning) will result in undesirable side effects on other features using the same modules. Furthermore, exchange of information between modules means that a change to one module may affect the behaviour of other modules, allowing undesirable side effects to proliferate through those other modules. Information exchange between modules must therefore be minimized as far as possible to control this proliferation of side effects.

Unfortunately, an information exchange between two modules generally means that processing performed by one module does not have to be duplicated in the other, so there is a resource cost to minimizing information exchange.

So defining the information processes that each module performs, and the information exchanges between modules, is a very complex optimization problem: finding an adequate compromise between limiting the physical information processing resources required and retaining the ability to change features without excessive undesirable side effects on other features.
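
As a toy sketch of this trade-off (every process, feature, and cost below is invented purely for illustration): assign each information process to a module, charge one unit of resources per distinct module, and charge a penalty whenever two features share a module, since changes made for one feature can then side-effect the other.

from itertools import product

PROCESSES = ["p1", "p2", "p3", "p4"]
# Which processes each feature depends on (assumed for the sketch).
FEATURES = {"f1": {"p1", "p2"}, "f2": {"p2", "p3"}, "f3": {"p3", "p4"}}

def resource_cost(assign):
    # One unit of customized physical resources per distinct module used.
    return len(set(assign.values()))

def side_effect_risk(assign):
    # Count module sharings between pairs of features: a shared module
    # couples the features, so changing one risks disturbing the other.
    names = list(FEATURES)
    risk = 0
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            mods_a = {assign[p] for p in FEATURES[names[a]]}
            mods_b = {assign[p] for p in FEATURES[names[b]]}
            risk += len(mods_a & mods_b)
    return risk

def total_cost(assign, lam=1.0):
    # lam sets the compromise between resources and changeability (assumed).
    return resource_cost(assign) + lam * side_effect_risk(assign)

# Exhaustive search over all assignments of 4 processes to up to 4 modules.
best = min(
    (dict(zip(PROCESSES, combo)) for combo in product(range(4), repeat=4)),
    key=total_cost,
)
print(best, total_cost(best))

Raising lam pushes the search toward more separate modules (fewer shared resources, fewer side effects); lowering it pushes toward fewer, shared modules (less hardware, more coupling).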





All this is documented in my 2001 paper.

 

On 22/02/13, "Levine, Daniel S"  <levine at uta.edu> wrote:
> 
>  
> 
> 
> 
>  
>  Dear Steve, John, Richard et al.,
>   
>  I am reminded of a recent IJCNN when I heard Ali Minai describe a neural architecture for idea generation as "modular." Then about a day later I heard Steve describe another architecture for something else as "not modular." But both were describing a network composed of subnetworks with distinct functions, subnetworks that were not independent of one another but mutually interacting and strongly influencing one another. The same is true of all of my own model networks. In other words, Ali's "modularity" and Steve's "non-modularity" were essentially describing the same concept! Since then I have strenuously avoided use of the term "modular" as too ambiguous.
>   
>  Best,
>  Dan Levine
>   
>   
> From: connectionists-bounces at mailman.srv.cs.cmu.edu [connectionists-bounces at mailman.srv.cs.cmu.edu] On Behalf Of Juyang Weng [weng at cse.msu.edu]
> Sent: Friday, February 22, 2013 8:03 PM
> To: Stephen Grossberg
> Cc: steve Grossberg; connectionists
> Subject: Re: Connectionists: New paper on why modules evolve, and how to evolve modular neural networks
>  
>  
>  
>  Dear Richard, Steve and other connectionist colleagues,
> 
> 
> 
> Many researchers have said that neuroscience today is rich in data and poor in theory. I agree. Unlike other organs in the body, the brain is basically a signal processor. Therefore, it should have an overarching theory that is well explained in mathematics.
> 
> 
> 
> However, I am probably in the minority in holding the following position. After coming up with an overarching theory of the brain, I started to disbelieve the modular view of the brain. A modular view of the brain is like categorizing plants based on their apparent look instead of their genes.
> 
> 
> 
> The apparent Brodmann areas in the brain should be largely due to the body organs (eyes, ears, skin, muscles, glands, etc.). The re-assignment of visual areas to other sensing modalities in the brain of a blind person seems to justify this theoretical view of mine, since my theory explains why and how this re-assignment takes place. If my theory is correct, neuroscience textbooks should be written very differently in the future. Until then, few care to pay attention to this theory.
> 
> 
> 
> 
> Humbly,
> 
> 
> 
> -Juyang Weng
> 
> 
> Juyang (John) Weng, Professor
> Department of Computer Science and Engineering
> MSU Cognitive Science Program and MSU Neuroscience Program
> 3115 Engineering Building
> Michigan State University
> East Lansing, MI 48824 USA
> Tel: 517-353-4388
> Fax: 517-432-1061
> Email: weng at cse.msu.edu
> URL: http://www.cse.msu.edu/~weng/
>     On 2/22/13 8:02 PM, Stephen Grossberg wrote:
>  
>  
> > Dear Richard and other Connectionist colleagues, 
> >  
> >  I think that it is important to clarify how the word "module" is being used. Many people think of modules as implying
> > independent modules that should be able to fully compute their particular processes on their own. However, much behavioral and neurobiological data argue against this possibility. The brain’s organization into distinct anatomical areas and processing streams shows that brain processing is indeed specialized. However, specialization does not imply the kind of independence that modularity is often taken to imply. Then what is the nature of this specialization? 
> >  
> >  Complementary Computing concerns the proposal that pairs of parallel cortical processing streams compute complementary properties in the brain. Each stream has complementary computational strengths and weaknesses, much as in physical principles like the Heisenberg Uncertainty Principle. Each cortical stream can also possess multiple processing stages. These stages realize a hierarchical resolution of uncertainty. "Uncertainty" here means that computing one set of properties at a given stage prevents computation of a complementary set of properties at that stage. Complementary Computing proposes that the computational unit of brain processing that has behavioral significance consists of parallel interactions between complementary cortical processing streams with multiple processing stages to compute complete information about a particular type of biological intelligence. It has been suggested that such complementary processing streams may arise from a hierarchical multi-scale process of morphogenetic symmetry-breaking.
> >  
> >  
> >  The concept of Complementary Computing arose as it gradually became clear, as a result of decades of behavioral and neural modeling, that essentially all biological neural models exhibit such complementary processes. Articles that provide examples of Complementary Computing can be found on my web page http://cns.bu.edu/~steve. They include:
> >  
> >  
> >  Grossberg, S. (2000). The complementary brain: Unifying brain dynamics and modularity. Trends in Cognitive Sciences, 4, 233-246. 
> >  
> >  
> >  Grossberg, S. (2012). Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks, 37, 1-47. 
> >  
> >  
> >  About minimum wire length: It's important to keep in mind the work of van Essen (1997, Nature, 385, 313-318) concerning his tension-based theory of morphogenesis and compact wiring, which clarifies how folds in the cerebral cortex may develop and make connections more compact, i.e., shorter.
> >  
> >  
> >  A possible role of tension in other developmental processes, such as in the formation during morphogenesis of a gastrula from a blastula, illustrates that such a mechanism may be used in biological systems other than brains. The article below describes such a process heuristically, also on my web page:
> >  
> >  
> >  Grossberg, S. (1978). Communication, Memory, and Development. In R. Rosen and F. Snell (Eds.), Progress in theoretical biology, Volume 5. New York: Academic Press, pp. 183-232. See Sections XIV - XVI.
> >  
> >  
> >  About cortical columns: They are important, but no more important than the long-range horizontal interactions among columns that are ubiquitous in the cerebral cortex. Indeed, understanding how bottom-up, horizontal, and top-down interactions interact in neocortex has led to the paradigm of Laminar Computing, which attempts to clarify how specializations of this shared laminar design embody different types of biological intelligence, including vision, speech and language, and cognition. On my web page, articles with colleagues like Cao (2005), Raizada (2000, 2001), and Yazdanbakhsh (2005) for vision, Pearson (2008) for cognitive working memory and list chunking, and Kazerounian (2011) for speech perception illustrate this theme.
> >  
> >  
> >  Laminar Computing has begun to explain how the laminar design of neocortex may realize the best properties of feedforward and feedback processing, digital and analog processing, and bottom-up data-driven processing and top-down attentive hypothesis-driven processing. Embodying such designs into VLSI chips promises to enable the development of increasingly general-purpose adaptive autonomous algorithms for multiple applications.
> >  
> >  
> >  The existence and critical importance of long-range horizontal connections in neocortex raises the following issue: Why is the spatial resolution of columns as fine as it is? Why doesn't the long-range correlation length force the columns to become spatially more diffuse than they are? The following article on my web page suggests, at least for the case of cortical area V1, how the cortical subplate may play a role in this:
> >  
> >  
> >  Grossberg, S. and Seitz, A. (2003). Laminar development of receptive fields, maps, and columns in visual cortex: The coordinating role of the subplate. Cerebral Cortex, 13, 852-863. 
> >  
> >  
> >  Best,
> >  
> >  
> >        Steve Grossberg
> >  
> >  
> >  Wang Professor of Cognitive and Neural Systems
> >  Professor of Mathematics, Psychology, and Biomedical Engineering
> >   Director, Center for Adaptive Systems http://www.cns.bu.edu/about/cas.html
> >  
> >  http://cns.bu.edu/~steve
> >  steve at bu.edu
> > 
> >    On Feb 22, 2013, at 4:18 PM, Richard Loosemore wrote:
> > > 
> > > I hate to say this, but during discussions with fellow students back in 1987, I remember pointing out that it was not terribly surprising that the cortex consisted of columns (i.e. modules) with dense internal connectivity, with less-dense connections between columns -- not surprising, because the alternative was to try to make the brain less modular and connect every neuron in each column to all the neurons in all the other columns, and the result would be brains that were a million times larger than they are (due to all the extra wiring).
> > > 
> > > 
> > > 
> > > The same logic applies in all systems where it is costly to connect every element to every other: the optimal connectivity is well-connected, tightly clustered groups of elements.
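> > > 
> > > As a back-of-envelope sketch of that wiring saving (every number below is assumed purely for illustration, and the exact ratio depends entirely on the numbers chosen):
> > > 
> > > # Compare total wiring: all-to-all versus columnar connectivity.
> > > N = 10**7                        # neurons (assumed)
> > > K = 100                          # neurons per column (assumed)
> > > M = N // K                       # number of columns
> > > full = N * (N - 1) // 2          # every neuron wired to every other
> > > within = M * (K * (K - 1) // 2)  # dense wiring inside each column
> > > between = M * 1000               # sparse long-range links per column (assumed)
> > > modular = within + between
> > > print(f"all-to-all needs {full / modular:,.0f}x more wires")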
> > > 
> > > 
> > > 
> > > During those discussions the point was considered so obvious that it sparked little comment. Ever since then I have told students in my lectures that this would be the evolutionary reason for cortical columns to exist.
> > > 
> > > 
> > > 
> > > So I am a little confused now. Can someone explain what I am missing .........?
> > > 
> > > 
> > > 
> > > Richard Loosemore
> > > Department of Physical and Mathematical Sciences,
> > > Wells College
> > > 
> > > On 2/13/13 9:48 AM, Juergen Schmidhuber wrote:
> > > > 
> > > > The paper mentions that Santiago Ramón y Cajal already pointed out that evolution has created mostly short connections in animal brains.
> > > > 
> > > > Minimization of connection costs should also encourage modularization, e.g., http://arxiv.org/abs/1210.0118 (2012).
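> > > > 
> > > > For concreteness, here is a minimal sketch of how such a wire length term can enter an objective optimized by a toy (1+1) evolutionary hill climber. Every position, task, and parameter below is invented for illustration and is not taken from any of the papers cited here:
> > > > 
> > > > import random
> > > > 
> > > > random.seed(0)
> > > > POS = [0.0, 1.0, 2.0, 3.0]           # assumed 1-D positions of 4 units
> > > > PAIRS = [(i, j) for i in range(4) for j in range(4) if i < j]
> > > > TARGET = {(0, 1): 1.0, (2, 3): 1.0}  # toy task favouring short links
> > > > 
> > > > def performance(w):
> > > >     # Toy score: squared error between weights and the target pattern.
> > > >     return -sum((w.get(p, 0.0) - TARGET.get(p, 0.0)) ** 2 for p in PAIRS)
> > > > 
> > > > def wire_length(w):
> > > >     # Each connection costs |weight| times the distance it spans.
> > > >     return sum(abs(v) * abs(POS[i] - POS[j]) for (i, j), v in w.items())
> > > > 
> > > > def fitness(w, lam=0.05):
> > > >     # Performance minus a wire length penalty, weighted by lam (assumed).
> > > >     return performance(w) - lam * wire_length(w)
> > > > 
> > > > genome = {p: random.uniform(-1.0, 1.0) for p in PAIRS}
> > > > for _ in range(5000):
> > > >     child = dict(genome)
> > > >     p = random.choice(PAIRS)
> > > >     child[p] = child.get(p, 0.0) + random.gauss(0.0, 0.1)
> > > >     if fitness(child) > fitness(genome):  # (1+1) selection
> > > >         genome = child
> > > > print({p: round(v, 2) for p, v in genome.items() if abs(v) > 0.1})
> > > > 
> > > > The penalty drives weights on long-distance pairs toward zero while the short target links survive, which is the connection-cost pressure toward local, modular structure.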
> > > > 
> > > > But who first had such a wire length term in an objective function to be minimized by evolutionary computation or other machine learning methods? I am aware of pioneering work by Legenstein and Maass:
> > > > 
> > > > R. A. Legenstein and W. Maass. Neural circuits for pattern recognition with small total wire length. Theoretical Computer Science, 287:239-249, 2002.
> > > > 
> > > > R. A. Legenstein and W. Maass. Wire length as a circuit complexity measure. Journal of Computer and System Sciences, 70:53-72, 2005.
> > > > 
> > > > Is there any earlier relevant work? Pointers will be appreciated.
> > > > 
> > > > Jürgen Schmidhuber
> > > > http://www.idsia.ch/~juergen/whatsnew.html
> > > > 
> > > > On Feb 10, 2013, at 3:14 AM, Jeff Clune wrote:
> > > > 
> > > > > Hello all,
> > > > > 
> > > > > I believe that many in the neuroscience community will be interested in a new paper that sheds light on why modularity evolves in biological networks, including neural networks. The same discovery also provides AI researchers a simple technique for evolving neural networks that are modular and have increased evolvability, meaning that they adapt faster to new environments.
> > > > > 
> > > > > Cite: Clune J, Mouret J-B, Lipson H (2013) The evolutionary origins of modularity. Proceedings of the Royal Society B. 280: 20122863.
> > > > > http://dx.doi.org/10.1098/rspb.2012.2863 (pdf)
> > > > > 
> > > > > Abstract: A central biological question is how natural organisms are so evolvable (capable of quickly adapting to new environments). A key driver of evolvability is the widespread modularity of biological networks—their organization as functional, sparsely connected subunits—but there is no consensus regarding why modularity itself evolved. Although most hypotheses assume indirect selection for evolvability, here we demonstrate that the ubiquitous, direct selection pressure to reduce the cost of connections between network nodes causes the emergence of modular networks. Computational evolution experiments with selection pressures to maximize network performance and minimize connection costs yield networks that are significantly more modular and more evolvable than control experiments that only select for performance. These results will catalyse research in numerous disciplines, such as neuroscience and genetics, and enhance our ability to harness evolution for engineering purposes.
> > > > > 
> > > > > Video:
> > > > > http://www.youtube.com/watch?feature=player_embedded&v=SG4_aW8LMng
> > > > > 
> > > > > There has been some nice coverage of this work in the popular press, in case you are interested:
> > > > > 
> > > > > • National Geographic:
> > > > > http://phenomena.nationalgeographic.com/2013/01/30/the-parts-of-life/
> > > > > • MIT's Technology Review:
> > > > > http://www.technologyreview.com/view/428504/computer-scientists-reproduce-the-evolution-of-evolvability/
> > > > > • Fast Company:
> > > > > http://www.fastcompany.com/3005313/evolved-brains-robots-creep-closer-animal-learning
> > > > > • Cornell Chronicle:
> > > > > http://www.news.cornell.edu/stories/Jan13/modNetwork.html
> > > > > • ScienceDaily:
> > > > > http://www.sciencedaily.com/releases/2013/01/130130082300.htm
> > > > > 
> > > > > I hope you enjoy the work. Please let me know if you have any questions.
> > > > > 
> > > > > Best regards,
> > > > > Jeff Clune
> > > > > 
> > > > > Assistant Professor
> > > > > Computer Science
> > > > > University of Wyoming
> > > > > jeffclune at uwyo.edu
> > > > > jeffclune.com

