Connectionists: generalizing language in neural networks [was Re: Computational Modeling of Bilingualism Special Issue]

Lee Giles giles at ist.psu.edu
Tue Mar 26 15:52:39 EDT 2013


Here are some of our journal papers on this topic. Our focus was
primarily on grammatical inference.

Best

Lee Giles

> 1.     C.L. Giles, S. Lawrence, A-C. Tsoi, "Noisy Time Series
> Prediction Using a Recurrent Neural Network and Grammatical
> Inference," Machine Learning, 44, 161-183, 2001.
>
> 2.     S. Lawrence, C.L. Giles, S. Fong, "Natural Language Grammatical
> Inference with Recurrent Neural Networks," IEEE Trans. on Knowledge
> and Data Engineering, 12(1), p.126, 2000.
>
> 3.     C.L. Giles, C.W. Omlin, K. K. Thornber, "Equivalence in
> Knowledge Representation: Automata, Recurrent Neural Networks, and
> Dynamical Fuzzy Systems," Proceedings of the IEEE, 87(9), 1623-1640,
> 1999 (invited).
>
> 4.     C.W. Omlin, K. K. Thornber, C.L. Giles, "Deterministic Fuzzy
> Finite State Automata Can Be Deterministically Encoded into Recurrent
> Neural Networks," IEEE Trans. on Fuzzy Systems, 6(1), p. 76, 1998.
>
> 5.     D.S. Clouse, C.L. Giles, B.G. Horne, G.W. Cottrell, "Time-Delay
> Neural Networks: Representation and Induction of Finite State
> Machines," IEEE Trans. on Neural Networks, 8(5), p. 1065, 1997.
>
> 6.     H.T. Siegelmann, C.L. Giles, "The Complexity of Language
> Recognition by Neural Networks," Neurocomputing, Special Issue on
> "Recurrent Networks for Sequence Processing," (eds) M. Gori, M. Mozer,
> A.H. Tsoi, W. Watrous, 15, p. 327, 1997.
>
> 7.     H.T. Siegelmann, B.G. Horne, C.L. Giles, "Computational
> capabilities of recurrent NARX neural networks," IEEE Trans. on
> Systems, Man and Cybernetics: Part B - Cybernetics, 27(2), p.208, 1997.
>
> 8.     S. Lawrence, C.L. Giles, A-C. Tsoi, A. Back, "Face Recognition:
> A Convolutional Neural Network Approach," IEEE Trans. on Neural
> Networks, Special Issue on "Pattern Recognition" 8(1), p. 98, 1997.
>
> 9.     C.W. Omlin, C.L. Giles, "Constructing Deterministic
> Finite-State Automata in Recurrent Neural Networks," Journal of the
> ACM, 43(6), p. 937, 1996.
>
> 10.   C.W. Omlin, C.L. Giles, "Rule Revision with Recurrent Neural
> Networks," IEEE Trans. on Knowledge and Data Engineering, 8(1), p.
> 183, 1996.
>
> 11.   C.W. Omlin, C.L. Giles, "Stable Encoding of Large Finite-State
> Automata in Recurrent Neural Networks with Sigmoid Discriminants,"
> Neural Computation, 8(4), p. 675, 1996.
>
> 12.   C.W. Omlin, C.L. Giles, "Extraction of Rules from Discrete-Time
> Recurrent Neural Networks," Neural Networks, 9(1), p. 41, 1996.
>
> 13.   C.L. Giles, B.G. Horne, T. Lin, "Learning a Class of Large
> Finite State Machines with a Recurrent Neural Network," Neural
> Networks, 8(9), p. 1359, 1995.
>
> 14.   C.L. Giles, C.W. Omlin, "Extraction, Insertion and Refinement of
> Production Rules in Recurrent Neural Networks," Connection Science,
> Special Issue on "Architectures for Integrating Symbolic and Neural
> Processes" 5(3-4), p. 307, 1993.
>
> 15.   C.L. Giles, C.B. Miller, D. Chen, H.H. Chen, G.Z. Sun, Y.C. Lee,
> "Learning and Extracting Finite State Automata with Second-Order
> Recurrent Neural Networks," Neural Computation, 4(3), 393-405, 1992.
>


On 3/26/13 12:48 PM, Juergen Schmidhuber wrote:
> More than a decade ago, Long Short-Term Memory recurrent neural
> networks (LSTM) learned certain context-free and context-sensitive
> languages that cannot be represented by finite state automata such as
> HMMs. Parts of the network became stacks or event counters.
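> To make the task concrete, the standard context-sensitive example here is
> a^n b^n c^n, and the counter solution that the trained LSTMs are reported
> to discover can be written down directly. A minimal Python sketch of such
> a counter-based recognizer (illustrative only, not the trained network;
> the function name and details are invented, not from the cited papers):
>
>     def accepts_anbncn(s):
>         # Scan left to right, insisting on the block order a..b..c and
>         # counting each symbol; accept iff the three counts are equal
>         # and nonzero.  No finite-state machine can do this, because it
>         # needs unbounded counts, which is the point of the result above.
>         counts = {'a': 0, 'b': 0, 'c': 0}
>         order = 'abc'
>         phase = 'a'
>         for ch in s:
>             if ch not in counts:
>                 return False
>             if order.index(ch) < order.index(phase):
>                 return False           # blocks out of order, e.g. "aabba"
>             phase = ch
>             counts[ch] += 1
>         return counts['a'] == counts['b'] == counts['c'] > 0
>
>     assert accepts_anbncn('aaabbbccc')
>     assert not accepts_anbncn('aaabbccc')
>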
>
> F. A. Gers and J. Schmidhuber. LSTM Recurrent Networks Learn Simple
> Context Free and Context Sensitive Languages. IEEE Transactions on
> Neural Networks 12(6):1333-1340, 2001.
>
> J. Schmidhuber, F. Gers, D. Eck. Learning nonregular languages: A
> comparison of simple recurrent networks and LSTM. Neural Computation,
> 14(9):2039-2041, 2002.
>
> F. Gers, J. A. Perez-Ortiz, D. Eck, and J. Schmidhuber. Learning
> Context Sensitive Languages with LSTM Trained with Kalman Filters.
> Proceedings of ICANN'02, Madrid, pp. 655-660, Springer, Berlin, 2002.
>
>
> Old slides on this:
> http://www.idsia.ch/~juergen/lstm/sld028.htm
>
> Juergen
>
>
>
> On Mar 26, 2013, at 5:09 AM, Gary Marcus wrote:
>
>> I posed some important challenges for language-like generalization in
>> PDP and SRN models in a 1998 article in Cognitive Psychology, with
>> further discussion in a 1999 Science article (providing data from
>> human infants) and in a 2001 MIT Press book, The Algebraic Mind.
>>
>> For example, if one trains a standard PDP autoassociator on identity,
>> with integers represented by distributed representations consisting of
>> binary digits, and exposes the model only to even numbers, the model
>> will not generalize to odd numbers (i.e., it will not generalize
>> identity to the least significant bit), even though (depending on the
>> details of implementation) it can generalize to some new even numbers.
>> Another way to put this is that these sorts of models can interpolate
>> within some cloud around the space of training examples, but cannot
>> generalize universally quantified one-to-one mappings outside that
>> space.
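>> For concreteness, a minimal NumPy sketch of the kind of experiment
>> described above (the architecture, sizes, and training details are
>> illustrative choices, not the original simulations): train a
>> one-hidden-layer sigmoid autoassociator on the 8-bit binary codes of
>> the even numbers, then check whether it reproduces the least
>> significant bit of the odd numbers.
>>
>>     import numpy as np
>>
>>     rng = np.random.default_rng(0)
>>
>>     def bits(n, width=8):
>>         # little-endian binary code; index 0 is the least significant bit
>>         return np.array([(n >> i) & 1 for i in range(width)], dtype=float)
>>
>>     X_train = np.stack([bits(n) for n in range(0, 256, 2)])  # evens: bit 0 == 0
>>     X_test  = np.stack([bits(n) for n in range(1, 256, 2)])  # odds:  bit 0 == 1
>>
>>     n_in, n_hid = 8, 16
>>     W1 = rng.normal(0.0, 0.5, (n_in, n_hid))
>>     W2 = rng.normal(0.0, 0.5, (n_hid, n_in))
>>     sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
>>
>>     lr = 0.5
>>     for epoch in range(2000):          # plain batch backprop on squared error
>>         h = sigmoid(X_train @ W1)
>>         y = sigmoid(h @ W2)
>>         d2 = (y - X_train) * y * (1 - y)
>>         d1 = (d2 @ W2.T) * h * (1 - h)
>>         W2 -= lr * (h.T @ d2) / len(X_train)
>>         W1 -= lr * (X_train.T @ d1) / len(X_train)
>>
>>     def lsb_match(X):
>>         y = sigmoid(sigmoid(X @ W1) @ W2)
>>         return np.mean((y[:, 0] > 0.5) == (X[:, 0] > 0.5))
>>
>>     print("LSB reproduced on trained (even) items:", lsb_match(X_train))
>>     print("LSB reproduced on novel  (odd)  items:", lsb_match(X_test))
>>
>> The network typically reconstructs the trained items and some held-out
>> even numbers, but outputs 0 for the least significant bit no matter
>> what the input is, so the identity mapping does not extend to the odd
>> numbers.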
>>
>> Likewise, training an Elman-style SRN with localist inputs (one word,
>> one node, as in Elman's work on SRNs) on a set of sentences like "a
>> rose is a rose" and "a tulip is a tulip" leads the model to learn
>> those individual relationships, but not to generalize to "a blicket
>> is a blicket", where "blicket" is represented by an untrained node.
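>> The reason the untrained node fails is easy to see with localist
>> coding: in backpropagation, the gradient of every weight leaving input
>> unit i is proportional to that unit's activation, and with one-hot
>> inputs that activation is zero for every word except the current one.
>> A tiny sketch (the vocabulary and error signal are made up for
>> illustration):
>>
>>     import numpy as np
>>
>>     vocab = ["a", "rose", "tulip", "is", "blicket"]   # "blicket" never occurs in training
>>     one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
>>
>>     x = one_hot["rose"]                # current input word
>>     delta = np.ones(3)                 # some back-propagated error vector
>>     grad_W_in = np.outer(x, delta)     # gradient for the input-to-hidden weights
>>     print(grad_W_in[vocab.index("blicket")])   # -> [0. 0. 0.]
>>
>> The row of weights leaving the "blicket" unit is never updated during
>> training on "a rose is a rose" and "a tulip is a tulip", so at test
>> time the model has learned nothing it can apply to that unit.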
>>
>> These problems have to do with a kind of localism that is inherent in
>> the back-propagation rule. In the 2001 book, I discuss some of the ways
>> around them, and the compromises that the known workarounds lead to. I
>> believe that some alternative kind of architecture is called for.
>>
>> Since the human brain is pretty quick to generalize universally
>> quantified one-to-one mappings, even to novel elements, and even on the
>> basis of small amounts of data, I consider these to be important, but
>> largely unsolved, problems. The brain must do it, but we still don't
>> really understand how. (J. P. Thivierge and I made one suggestion in a
>> paper in TINS, Trends in Neurosciences.)
>>
>> Sincerely,
>>
>> Gary Marcus
>>
>>
>> Gary Marcus
>> Professor of Psychology
>> New York University
>> Author of Guitar Zero
>> http://garymarcus.com/
>> New Yorker blog
>>
>> On Mar 25, 2013, at 11:30 PM, Janet Wiles <janetw at itee.uq.edu.au> wrote:
>>
>>> Recurrent neural networks can represent, and in some cases learn and
>>> generalise, classes of languages beyond finite state machines. For a
>>> review of their capabilities, see the excellent edited book by Kolen
>>> and Kremer; e.g., Ch. 8 is on "Representation beyond Finite States"
>>> and Ch. 9 is on "Universal Computation and Super-Turing Capabilities".
>>>
>>> Kolen and Kremer (2001), "A Field Guide to Dynamical Recurrent
>>> Networks", IEEE Press.
>>>
>>> From: connectionists-bounces at mailman.srv.cs.cmu.edu
>>> [mailto:connectionists-bounces at mailman.srv.cs.cmu.edu] On Behalf Of
>>> Juyang Weng
>>> Sent: Sunday, 24 March 2013 9:17 AM
>>> To: connectionists at mailman.srv.cs.cmu.edu
>>> Subject: Re: Connectionists: Computational Modeling of Bilingualism
>>> Special Issue
>>>
>>> Ping Li:
>>>
>>> As far as I understand, traditional connectionist architectures
>>> cannot do abstraction well, as Marvin Minsky, Michael Jordan, and many
>>> others have correctly stated. For example, traditional neural networks
>>> could not learn a finite automaton (FA) until recently (i.e., until
>>> the proof for our Developmental Network). We all know that FAs are the
>>> basis for all probabilistic symbolic networks (e.g., Markov models),
>>> but none of these are connectionist.
>>>
>>> After seeing your announcement, I am confused by the title
>>> "Bilingualism Special Issue: Computational Modeling of Bilingualism"
>>> taken together with your comment that "most of the models are based
>>> on connectionist architectures."
>>>
>>> Without further clarification from you, I have to predict that the
>>> connectionist architectures in the book are all grossly wrong in terms
>>> of brain-capable connectionist natural language processing, since they
>>> cannot learn an FA. This means that they cannot generalize to
>>> state-equivalent but unobserved word sequences. Without this basic
>>> capability, which is required for natural language processing, how can
>>> they claim connectionist natural language processing, let alone
>>> bilingualism?
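>>> To make the notion of state equivalence concrete, here is a toy DFA
>>> sketch (the alphabet and transitions are invented for illustration):
>>> two different observed prefixes can leave the automaton in the same
>>> state, so they license exactly the same continuations, and a learner
>>> that has captured the FA generalizes from one to the other for free.
>>>
>>>     # Toy DFA over {a, b} that accepts strings with an even number of b's.
>>>     # States: 0 = even number of b's (accepting), 1 = odd number of b's.
>>>     DELTA = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
>>>
>>>     def run(state, s):
>>>         for ch in s:
>>>             state = DELTA[(state, ch)]
>>>         return state
>>>
>>>     # "abba" and "aaaa" are different sequences but state-equivalent:
>>>     # both end in state 0, so any continuation accepted after one
>>>     # is accepted after the other.
>>>     assert run(0, "abba") == run(0, "aaaa") == 0
>>>     assert run(0, "abba" + "b") == run(0, "aaaa" + "b") == 1
>>>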
>>>
>>> I am concerned that many papers proceed with specific problems
>>> without understanding the fundamental problems of traditional
>>> connectionism. The fact that the biological brain is connectionist
>>> does not necessarily mean that all connectionist researchers know
>>> about the brain's connectionism.
>>>
>>> -John Weng
>>>
>>> On 3/22/13 6:08 PM, Ping Li wrote:
>>> Dear Colleagues,
>>>
>>> A Special Issue on Computational Modeling of Bilingualism has been
>>> published. Most of the models are based on connectionist architectures.
>>>
>>> All the papers are available for free viewing until April 30, 2013
>>> (follow the link below to its end):
>>>
>>> http://cup.linguistlist.org/2013/03/bilingualism-special-issue-computational-modeling-of-bilingualism/
>>>
>>>
>>> Please let me know if you have difficulty accessing the above link
>>> or viewing any of the PDF files on Cambridge University Press's
>>> website.
>>>
>>> With kind regards,
>>>
>>> Ping Li
>>>
>>>
>>> =================================================================
>>> Ping Li, Ph.D. | Professor of Psychology, Linguistics, Information
>>> Sciences & Technology  |  Co-Chair, Inter-College Graduate Program
>>> in Neuroscience | Co-Director, Center for Brain, Behavior, and
>>> Cognition | Pennsylvania State University  | University Park, PA
>>> 16802, USA  |
>>> Editor, Bilingualism: Language and Cognition, Cambridge University
>>> Press | Associate Editor: Journal of Neurolinguistics, Elsevier
>>> Science Publisher
>>> Email: pul8 at psu.edu  | URL: http://cogsci.psu.edu
>>> =================================================================
>>>
>>>
>>>
>>> -- 
>>> Juyang (John) Weng, Professor
>>> Department of Computer Science and Engineering
>>> MSU Cognitive Science Program and MSU Neuroscience Program
>>> 428 S Shaw Ln Rm 3115
>>> Michigan State University
>>> East Lansing, MI 48824 USA
>>> Tel: 517-353-4388
>>> Fax: 517-432-1061
>>> Email: weng at cse.msu.edu
>>> URL: http://www.cse.msu.edu/~weng/
>>> ----------------------------------------------
>>>
>>
>