Connectionists: generalizing language in neural networks [was Re: Computational Modeling of Bilingualism Special Issue]

Gary Marcus gary.marcus at nyu.edu
Tue Mar 26 10:55:47 EDT 2013


Dear Thomas,

Thanks for your note.  I don't doubt that Deep Learning could be souped up to address the task, but my gentle nudge in my New Yorker essay on deep learning (in which I also raised this issue) didn't -- so far as I know -- yield any results. 

Deep learning is good at creating abstract features such as equivalence classes (in which two or more inputs should be treated equally), but less good, I suspect, at the sort of one-to-one mapping problem that I described in my previous message, in which each unique input corresponds to a unique output.  

Topography, which Thivierge and I pointed to, is a good example of where nature gravitates towards one-to-one maps, in a way that's beginning to be fairly well-understood. In a lot of neural networks, however, individual output nodes are fully orthogonal and logically independent; they get drawn on a page as if they were ordered, but they aren't really, and as a result the localism of their outputs doesn't dovetail well with functions in which ordering matters.

If anyone reading this list works on deep learning and would like to collaborate on this issue, drop me a note. I think it would be an interesting project, regardless of the outcome. 

Cheers,
Gary

Gary Marcus
Professor of Psychology
New York University
Author of Guitar Zero
http://garymarcus.com/
New Yorker blog 




On Mar 26, 2013, at 7:16 AM, Thomas Trappenberg <tt at cs.dal.ca> wrote:

> Hello Gary,
> 
> Keep in mind that simple back-propagating networks are not the ultimate solution, and that deep networks (many layers) are necessary for more advanced representations. Ultimately, we think such networks could even develop higher-order abstract representations that would support more human-like generalization.  
> 
> Regarding topography, this is a good point. I heard that mice don't have the topography in early visual areas that is found in cats, etc., which seems to contradict your statement that "topography seems to be mandatory". However, mice are not very visual, and their barrel cortex is topographically organized. 
> 
> Regards, Thomas
> 
> ---------
> Dr. Thomas Trappenberg
> Professor
> Faculty of Computer Science 
> Dalhousie University
> Halifax, Canada
> 
> 
> 
> On Tue, Mar 26, 2013 at 1:09 AM, Gary Marcus <gary.marcus at nyu.edu> wrote:
> I posed some important challenges for language-like generalization in PDP and SRN models in 1998 in an article in Cognitive Psychology, with further discussion in a 1999 Science article (providing data from human infants) and a 2001 MIT Press book, The Algebraic Mind.
> 
> For example, if one trains a standard PDP autoassociator on identity, with integers represented by a distributed representation consisting of binary digits, and exposes the model only to even numbers, the model will not generalize to odd numbers (i.e., it will not generalize identity to the least significant bit), even though (depending on the details of implementation) it can generalize to some new even numbers. Another way to put this is that these sorts of models can interpolate within some cloud around a space of training examples, but can't generalize universally-quantified one-to-one mappings outside that space.
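> 
> A minimal sketch of that setup (not the original 1998 simulations; the one-hidden-layer architecture, 8-bit encoding, and training details below are illustrative assumptions):
> 
>     # Autoassociator trained with plain back-propagation on identity over
>     # 8-bit binary codes, but only on even numbers (least significant bit = 0).
>     import numpy as np
> 
>     rng = np.random.default_rng(0)
>     BITS = 8
> 
>     def to_bits(n):
>         return np.array([(n >> i) & 1 for i in range(BITS)], dtype=float)
> 
>     train = np.array([to_bits(n) for n in range(0, 256, 2)])  # even numbers only
>     test = np.array([to_bits(n) for n in range(1, 256, 2)])   # odd numbers
> 
>     H = 16
>     W1 = rng.normal(0, 0.5, (BITS, H)); b1 = np.zeros(H)
>     W2 = rng.normal(0, 0.5, (H, BITS)); b2 = np.zeros(BITS)
>     sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
> 
>     lr = 0.5
>     for epoch in range(2000):
>         h = sigmoid(train @ W1 + b1)             # hidden layer
>         y = sigmoid(h @ W2 + b2)                 # reconstruction
>         d_out = (y - train) * y * (1 - y)        # output delta (identity target)
>         d_hid = (d_out @ W2.T) * h * (1 - h)     # hidden delta
>         W2 -= lr * (h.T @ d_out) / len(train); b2 -= lr * d_out.mean(0)
>         W1 -= lr * (train.T @ d_hid) / len(train); b1 -= lr * d_hid.mean(0)
> 
>     # Bit 0 was 0 in every training pattern, so nothing the model learned
>     # tells it to copy that bit; on odd inputs it keeps predicting ~0.
>     h = sigmoid(test @ W1 + b1)
>     print("mean predicted LSB on odd (untrained) inputs:",
>           sigmoid(h @ W2 + b2)[:, 0].mean())
> 
> The model reconstructs the bits it saw varying during training reasonably well, but the least significant bit stays near zero even when the input bit is one: interpolation within the training cloud, not a universally quantified identity function.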
> 
> Likewise, training an Elman-style SRN with localist inputs (one word, one node, as in Elman's work on SRNs) on a set of sentences like "a rose is a rose" and "a tulip is a tulip" leads the model to learn those individual relationships, but not to generalize to "a blicket is a blicket", where blicket represents an untrained node.
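> 
> A minimal sketch of the localist point (not Elman's actual SRN; the vocabulary size, the "blicket" index, and the training details below are illustrative assumptions): words are one-hot nodes, and the task is reduced to predicting the sentence-final word from the sentence-initial word, as in "a rose is a rose".
> 
>     # One-hot (localist) word coding; word 9 plays the role of "blicket"
>     # and never appears during training.
>     import numpy as np
> 
>     rng = np.random.default_rng(1)
>     VOCAB = 10
>     W = rng.normal(0, 0.1, (VOCAB, VOCAB))
>     softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
> 
>     lr = 0.5
>     for epoch in range(500):
>         for w in range(9):                     # every word except "blicket"
>             x = np.zeros(VOCAB); x[w] = 1.0    # localist input
>             p = softmax(x @ W)
>             W -= lr * np.outer(x, p - x)       # cross-entropy gradient: only row w changes
> 
>     x = np.zeros(VOCAB); x[9] = 1.0            # the untrained "blicket" node
>     # Row 9 of W still holds its random initial values: nothing learned about
>     # the trained words transfers to the novel node.
>     print("P(blicket | blicket) =", softmax(x @ W)[9])                  # near chance
>     print("P(rose | rose)       =", softmax(np.eye(VOCAB)[0] @ W)[0])   # near 1
> 
> Because each word has its own input node, the weights fanning out of the "blicket" node receive no training signal at all; the "an X is an X" regularity learned for the other words simply never reaches it.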
> 
> These problems have to do with a kind of localism that is inherent in the back-propagation rule. In the 2001 book, I discuss some of the ways around them, and the compromises that known workarounds lead to.  I believe that some alternative kind of architecture is called for.
> 
> Since the human brain is pretty quick to generalize universally-quantified one-to-one mappings, even to novel elements, and even on the basis of small amounts of data, I consider these to be important -- but largely unsolved -- problems. The brain must do it, but we still don't really understand how.  (J. P. Thivierge and I made one suggestion in this paper in TINS.)
> 
> Sincerely,
> 
> Gary Marcus
> 
> 
> Gary Marcus
> Professor of Psychology
> New York University
> Author of Guitar Zero
> http://garymarcus.com/
> New Yorker blog 
> 
> On Mar 25, 2013, at 11:30 PM, Janet Wiles <janetw at itee.uq.edu.au> wrote:
> 
>> Recurrent neural networks can represent, and in some cases learn and generalise, classes of languages beyond finite state machines. For a review of their capabilities, see the excellent edited book by Kolen and Kremer; e.g., ch. 8 is on "Representation beyond finite states" and ch. 9 is "Universal Computation and Super-Turing Capabilities".
>>  
>> Kolen and Kremer (2001) "A Field Guide to Dynamical Recurrent Networks", IEEE Press.
>>  
>> From: connectionists-bounces at mailman.srv.cs.cmu.edu [mailto:connectionists-bounces at mailman.srv.cs.cmu.edu] On Behalf Of Juyang Weng
>> Sent: Sunday, 24 March 2013 9:17 AM
>> To: connectionists at mailman.srv.cs.cmu.edu
>> Subject: Re: Connectionists: Computational Modeling of Bilingualism Special Issue
>>  
>> Ping Li:
>> 
>> As far as I understand, traditional connectionist architectures cannot do abstraction well, as Marvin Minsky, Michael Jordan, and many others correctly stated.  For example, traditional neural networks could not learn a finite automaton (FA) until recently (i.e., the proof of our Developmental Network).  We all know that an FA is the basis for all probabilistic symbolic networks (e.g., Markov models), but those networks are not themselves connectionist.
>> 
>> After seeing your announcement, I am confused by the title
>> "Bilingualism Special Issue: Computational Modeling of Bilingualism" taken together with your comment that "most of the models are based on connectionist architectures."  
>> 
>> Without further clarification from you, I have to predict that these connectionist architectures in the book are all grossly wrong in terms
>> of brain-capable connectionist natural language processing, since they cannot learn an FA.   This means that they cannot generalize to state-equivalent but unobserved word sequences.   Without this basic capability required for natural language processing, how can they claim to do connectionist natural language processing, let alone bilingualism?
>> 
>> I am concerned that many papers proceed with specific problems without understanding the fundamental problems of traditional connectionism. The fact that the biological brain is connectionist does not necessarily mean that all connectionist researchers know about the brain's connectionism.
>> 
>> -John Weng
>> 
>> On 3/22/13 6:08 PM, Ping Li wrote:
>> Dear Colleagues,
>>  
>> A Special Issue on Computational Modeling of Bilingualism has been published. Most of the models are based on connectionist architectures. 
>>  
>> All the papers are available for free viewing until April 30, 2013 (follow the link below to its end):
>>  
>> http://cup.linguistlist.org/2013/03/bilingualism-special-issue-computational-modeling-of-bilingualism/
>>  
>> Please let me know if you have difficulty accessing the above link or viewing any of the PDF files on Cambridge University Press's website.
>>  
>> With kind regards,
>>  
>> Ping Li
>>  
>>  
>> =================================================================
>> Ping Li, Ph.D. | Professor of Psychology, Linguistics, Information Sciences & Technology  |  Co-Chair, Inter-College Graduate Program in Neuroscience | Co-Director, Center for Brain, Behavior, and Cognition | Pennsylvania State University  | University Park, PA 16802, USA  | 
>> Editor, Bilingualism: Language and Cognition, Cambridge University Press | Associate Editor: Journal of Neurolinguistics, Elsevier Science Publisher
>> Email: pul8 at psu.edu  | URL: http://cogsci.psu.edu
>> =================================================================
>>  
>> 
>> 
>> -- 
>> --
>> Juyang (John) Weng, Professor
>> Department of Computer Science and Engineering
>> MSU Cognitive Science Program and MSU Neuroscience Program
>> 428 S Shaw Ln Rm 3115
>> Michigan State University
>> East Lansing, MI 48824 USA
>> Tel: 517-353-4388
>> Fax: 517-432-1061
>> Email: weng at cse.msu.edu
>> URL: http://www.cse.msu.edu/~weng/
>> ----------------------------------------------
>>  
> 
> 
