journal paper on neural networks and NLP

Lee Giles giles at research.nj.nec.com
Mon Jul 27 13:48:52 EDT 1998



The following journal paper on natural language learning and neural
networks has been accepted for publication in IEEE Transactions on
Knowledge and Data Engineering and is now available at:

http://www.neci.nj.nec.com/homepages/lawrence/
http://www.neci.nj.nec.com/homepages/giles/

The labeled corpus data set for this study is also available at:

http://www.neci.nj.nec.com/homepages/sandiway/pappi/rnn/index.html

**************************************************************************

"Natural Language Grammatical Inference with Recurrent Neural Networks"

       Steve Lawrence (1), C. Lee Giles (1,2), Sandiway Fong (1)


(1) NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA
 (2) Institute for Advanced Computer Studies, University of Maryland,
                   College Park, MD 20742, USA

            {lawrence,giles,sandiway}@research.nj.nec.com

			      ABSTRACT

This paper examines the inductive inference of a complex grammar with
neural networks -- specifically, the task considered is that of
training a network to classify natural language sentences as
grammatical or ungrammatical, thereby exhibiting the same kind of
discriminatory power provided by the Principles and Parameters
linguistic framework, or Government-and-Binding theory.  Neural
networks are trained, without the division into learned vs. innate
components assumed by Chomsky, in an attempt to produce the same
judgments as native speakers on sharply grammatical/ungrammatical
data.  How a recurrent neural network could possess linguistic
capability is discussed, along with the properties of various common
recurrent neural network architectures. The problem exhibits training
behavior that is often not present with smaller grammars, and
training was initially difficult. However, after implementing several
techniques aimed at improving the convergence of the gradient descent
backpropagation-through-time training algorithm, significant learning
was possible.  It was found that certain architectures are better able
to learn an appropriate grammar. The operation of the networks and
their training is analyzed. Finally, the extraction of rules in the
form of deterministic finite state automata is investigated.
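The training task the abstract describes can be sketched in outline: an Elman-style recurrent network reads a sentence one token at a time, a single output unit scores grammaticality, and the error gradient is propagated backwards through time. The sketch below is a minimal illustration only; the toy part-of-speech vocabulary, the invented "det noun verb" grammar, and all network sizes are assumptions, not the corpus or architectures used in the paper.

```python
import math, random

random.seed(0)
VOCAB = ["det", "noun", "verb"]   # toy part-of-speech alphabet (illustrative only)
V, H = len(VOCAB), 6              # one-hot input width, number of hidden units

def rnd(r, c):
    return [[random.uniform(-0.5, 0.5) for _ in range(c)] for _ in range(r)]

Wx, Wh = rnd(H, V), rnd(H, H)     # input->hidden and hidden->hidden weights
b = [0.0] * H
w = [random.uniform(-0.5, 0.5) for _ in range(H)]  # hidden->output weights
c = 0.0

def forward(seq):
    """Run the net over a token sequence; return hidden states and P(grammatical)."""
    hs = [[0.0] * H]
    for tok in seq:
        x = VOCAB.index(tok)
        h = [math.tanh(Wx[i][x] + sum(Wh[i][j] * hs[-1][j] for j in range(H)) + b[i])
             for i in range(H)]
        hs.append(h)
    z = sum(w[i] * hs[-1][i] for i in range(H)) + c
    return hs, 1.0 / (1.0 + math.exp(-z))

def train_step(seq, y, lr=0.5):
    """One backpropagation-through-time step on a single labeled sentence."""
    global c
    hs, p = forward(seq)
    d = p - y                                  # dLoss/dz for cross-entropy loss
    gw = [d * hs[-1][i] for i in range(H)]
    dh = [d * w[i] for i in range(H)]
    gWx = [[0.0] * V for _ in range(H)]
    gWh = [[0.0] * H for _ in range(H)]
    gb = [0.0] * H
    for t in range(len(seq), 0, -1):           # walk backwards through time
        x = VOCAB.index(seq[t - 1])
        da = [dh[i] * (1.0 - hs[t][i] ** 2) for i in range(H)]   # tanh derivative
        for i in range(H):
            gWx[i][x] += da[i]
            gb[i] += da[i]
            for j in range(H):
                gWh[i][j] += da[i] * hs[t - 1][j]
        dh = [sum(Wh[i][j] * da[i] for i in range(H)) for j in range(H)]
    for i in range(H):                         # apply the accumulated gradients
        w[i] -= lr * gw[i]
        b[i] -= lr * gb[i]
        for j in range(V):
            Wx[i][j] -= lr * gWx[i][j]
        for j in range(H):
            Wh[i][j] -= lr * gWh[i][j]
    c -= lr * d
    return -math.log(p if y else 1.0 - p)      # cross-entropy loss on this example

# Invented toy data: "det noun verb" is grammatical, other orders are not.
data = [(["det", "noun", "verb"], 1), (["noun", "det", "verb"], 0),
        (["verb", "det", "noun"], 0), (["det", "noun", "verb"], 1)]
losses = [sum(train_step(s, y) for s, y in data) for _ in range(300)]
```

On this tiny invented dataset plain gradient descent suffices; the convergence techniques the abstract alludes to matter precisely because the full natural-language problem is much harder than a toy grammar like this one.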


Keywords: recurrent neural networks, natural language
processing, grammatical inference, government-and-binding theory,
gradient descent, simulated annealing, principles-and-parameters
framework, automata extraction.
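The automata-extraction step listed in the keywords can also be sketched: quantize the network's continuous hidden states into discrete DFA states and follow transitions breadth-first, keeping one representative hidden state per discrete state. Everything below is a self-contained illustration under stated assumptions: the "network" is a hand-built recurrent map tracking the parity of '1' symbols, standing in for a trained network's state-update and output functions, and sign quantization stands in for whatever state-space partition the paper actually investigates.

```python
from collections import deque

def rnn_step(h, sym):
    """Stand-in for a trained network's hidden-state update (toy: parity of '1's)."""
    return -h if sym == "1" else h

def rnn_accepts(h):
    """Stand-in for the network's output unit: accept in the 'even parity' state."""
    return h > 0.0

def quantize(h):
    """Map a continuous hidden state to a discrete DFA state (here: its sign)."""
    return 1 if h > 0.0 else -1

def extract_dfa(step, out, quantize, h0, alphabet):
    """Breadth-first DFA extraction, one representative hidden state per DFA state."""
    start = quantize(h0)
    reps, accept, trans = {start: h0}, set(), {}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if out(reps[q]):
            accept.add(q)
        for sym in alphabet:
            h2 = step(reps[q], sym)          # follow the network's own dynamics
            q2 = quantize(h2)
            trans[(q, sym)] = q2
            if q2 not in reps:               # first visit to this discrete state
                reps[q2] = h2
                queue.append(q2)
    return start, accept, trans

def dfa_accepts(dfa, string):
    start, accept, trans = dfa
    q = start
    for sym in string:
        q = trans[(q, sym)]
    return q in accept

dfa = extract_dfa(rnn_step, rnn_accepts, quantize, 1.0, "01")
```

Here the extracted machine is the two-state even-parity automaton; with a real trained network the interest lies in how faithfully the quantized transition structure reproduces the network's grammaticality judgments.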


**********************************************************************

A previously published related book chapter:
 
Steve Lawrence, Sandiway Fong, C. Lee Giles, "Natural Language
Grammatical Inference: A Comparison of Recurrent Neural Networks and
Machine Learning Methods," in Symbolic, Connectionist, and Statistical
Approaches to Learning for Natural Language Processing, Lecture Notes
in AI, edited by Stefan Wermter, Ellen Riloff and Gabriele Scheler,
Springer-Verlag, New York, pp. 33-47, 1996.

is available upon request.

__                                
C. Lee Giles / Computer Science / NEC Research Institute / 
4 Independence Way / Princeton, NJ 08540, USA / 609-951-2642 / Fax 2482
www.neci.nj.nec.com/homepages/giles
==


