Ph.D. Thesis Announcement

Fabrizio Costa costa at dsi.unifi.it
Fri Mar 8 06:39:19 EST 2002


Dear Connectionists,

I am pleased to announce that my PhD thesis entitled

------------------------------------------------------------------------
   A Connectionist Approach to First-pass Attachment Prediction in
		       Natural Language Parsing

			    Fabrizio Costa

		      Dept. of Computer Science
		    University of Florence - Italy
------------------------------------------------------------------------

is now available at

	http://www.dsi.unifi.it/~costa/online/thesis.ps (and .pdf)


Please find the summary of my thesis below.

Fabrizio Costa


Summary
=======

The apparently effortless capability that humans show in understanding
natural language is one of the central problems of modern cognitive science.
Natural language is extremely ambiguous, and many theories of human parsing
claim that ambiguity is resolved by resorting to linguistic experience, i.e.
ambiguities are preferentially resolved in the way that has been successful
most frequently in the past.

Current research focuses on finding which kinds of features are relevant
for conditioning choice preferences.  In this thesis we employ a
connectionist paradigm (Recursive Neural Networks) capable of processing
acyclic graphs to perform supervised learning on syntactic trees extracted
from a large corpus of parsed sentences. The architecture is capable of
extracting the relevant features directly from inspection of the syntactic
structure and of adaptively encoding this information for the
disambiguation task.
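To make the idea concrete, here is a minimal sketch (not the thesis code;
all names, sizes, and the toy label set are illustrative assumptions) of how
a recursive neural network can encode a labeled syntactic tree bottom-up
into a fixed-size vector, to which a classifier can then be attached:

```python
import numpy as np

# Illustrative sketch: a recursive network folds a tree into one vector
# by applying the same transition function at every node, feeding each
# node its label encoding and its children's already-computed states.

LABELS = ["S", "NP", "VP", "Det", "N", "V"]   # toy nonterminal/POS labels
LABEL_DIM = len(LABELS)                        # one-hot label encoding
STATE_DIM = 8                                  # hidden state size (assumed)
MAX_ARITY = 2                                  # max children per node (assumed)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(STATE_DIM, LABEL_DIM + MAX_ARITY * STATE_DIM))
b = np.zeros(STATE_DIM)

def one_hot(label):
    v = np.zeros(LABEL_DIM)
    v[LABELS.index(label)] = 1.0
    return v

def encode(tree):
    """Recursively encode (label, children) into a state vector."""
    label, children = tree
    child_states = [encode(c) for c in children]
    # pad missing children with the zero ("frontier") state
    while len(child_states) < MAX_ARITY:
        child_states.append(np.zeros(STATE_DIM))
    x = np.concatenate([one_hot(label)] + child_states)
    return np.tanh(W @ x + b)

# the tree for "the dog barks", in (label, children) form
tree = ("S", [("NP", [("Det", []), ("N", [])]),
              ("VP", [("V", [])])])
state = encode(tree)          # fixed-size vector for the whole tree
print(state.shape)            # (8,)
```

Because the same weights are reused at every node, the encoder handles
trees of arbitrary shape and depth, which is what lets such a model learn
directly from treebank structures rather than from hand-picked features.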

Following a widely accepted hypothesis in psycholinguistics, we assume an
incremental parsing process (one word at a time) that keeps a connected
partial parse tree at all times. In this thesis we show how the model can
be trained from a large parsed corpus (treebank), leading to good
disambiguation performance both on unrestricted text and on a collection
of examples from the psycholinguistic experimental literature.
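The incremental regime described above can be summarized schematically:
the parser always keeps one connected partial tree and, for each incoming
word, chooses among the candidate attachments (first-pass attachment
prediction). The candidate generator and scoring function below are
illustrative stand-ins, not the thesis implementation:

```python
# Schematic sketch of word-by-word incremental parsing with a learned
# attachment preference.  Trees are represented as flat lists of
# (word, attachment_site) pairs purely for illustration.

def candidate_trees(partial_tree, word):
    """Enumerate legal ways of attaching `word` to the partial tree
    (in the thesis, via 'connection paths' extracted from a treebank)."""
    return [partial_tree + [(word, site)] for site in ("low", "high")]

def score(tree):
    """Stand-in for the trained recursive network's preference score."""
    # toy preference: penalize each high attachment
    return -sum(1 for node in tree if node[1] == "high")

def parse_incrementally(words):
    tree = []                              # connected partial parse
    for w in words:
        # first-pass attachment: commit to the best-scoring successor
        tree = max(candidate_trees(tree, w), key=score)
    return tree

print(parse_incrementally(["the", "dog", "barks"]))
```

The key property is that at every step the structure stays connected and
a single preferred analysis is carried forward, which is what the trained
network's ranking must support.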

The goals of the thesis are twofold: (1) to demonstrate that a strongly
incremental approach is computationally feasible and (2) to present an
application of a connectionist architecture that directly processes
complex structured information such as syntactic trees.

In Chapter 1 we set out the background for the current work. We highlight
a trend in the field of statistical parsing whereby recently proposed
models condition the statistics they collect on a growing number of
linguistic features. We then review recent incremental approaches to
parsing that show the viability of incrementality from both a
computational and a psycholinguistic viewpoint.  A review of
connectionist models applied to NLP follows.

In Chapter 2 we formally introduce the recursive neural
network architecture and we go through the issues of training and
generalization capabilities of the model.

In Chapter 3 we introduce the incremental grammar
formalism used in the current work. We formalize the task of
discriminating the correct successor tree, and we then describe how to use
recursive neural networks to incrementally predict syntactic trees.

In Chapter 4 we try to gain insight into the relations between the
parameters of the proposed system and the learned preferences.

In Chapter 5 we study the relationship between the generalization error
and the features of the domain. Some transformations are then applied to
the input and to the architecture to enhance the generalization
capabilities of the model. The effectiveness of the introduced
modifications is then validated through a range of experiments.

In Chapter 6 we investigate how the modeling properties of the proposed
architecture relate to the ambiguity resolution mechanisms employed by
human beings.

In Chapter 7 we employ the first-pass attachment
prediction mechanism to build an initial version of a syntactic
incremental probabilistic parser.
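When attachment predictions come with scores, they can drive a beam
search over partial analyses, as a probabilistic incremental parser
requires. The following is a hedged sketch of that control flow; the
successor probabilities and beam mechanics are illustrative assumptions,
not the parser of Chapter 7:

```python
import math

# Beam-search sketch: each beam entry is a (partial tree, log prob)
# pair; every incoming word expands each entry into its scored
# successor trees, and only the best `beam_size` analyses survive.

def successors(state, word):
    """Candidate (next_state, log_prob) pairs for attaching `word`.
    A stand-in for scoring connection paths with the network."""
    return [(state + [(word, i)], math.log(p))
            for i, p in enumerate([0.6, 0.3, 0.1])]

def beam_parse(words, beam_size=2):
    beam = [([], 0.0)]                      # start: empty tree, log prob 0
    for w in words:
        expanded = [(s2, lp + dlp)
                    for s, lp in beam
                    for s2, dlp in successors(s, w)]
        expanded.sort(key=lambda x: x[1], reverse=True)
        beam = expanded[:beam_size]         # keep best beam_size analyses
    return beam[0]                          # most probable full parse

tree, logp = beam_parse(["the", "dog", "barks"])
```

Keeping more than one analysis in the beam softens the strict
one-analysis commitment of first-pass attachment, trading memory for
robustness to early misattachments.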


       ------------------------------------------------------

                               Contents

Introduction

1 Parsing, Incrementality and Connectionism
1.1 Introduction
	1.1.1 The ambiguity issue
1.2 Adding information to Context-Free Grammars
1.3 Incremental Parsing
	1.3.1 Incremental parsers
	1.3.2 Top-down left-corner parsing
	1.3.3 Incremental Cascaded Markov Model
1.4 Connectionist Models for Natural Language Processing
	1.4.1 Recurrent Networks
	1.4.2 Recursive Auto Associative Memory
	1.4.3 Simple Synchrony Networks
1.5 Conclusions

2 Recursive Neural Networks
2.1 Introduction
2.2 Definitions
2.3 Recursive representation
2.4 Modeling assumptions
2.5 Recursive network
2.6 Encoding network
2.7 Recursive processing with neural networks
2.8 Generalization Capabilities

3 Learning first-pass attachment preferences
3.1 Introduction
3.2 Incremental trees and connection paths
	3.2.1 Definitions
	3.2.2 Connection paths extraction
	3.2.3 Is incrementality plausible?
	3.2.4 POS tag level
	3.2.5 Left recursion treatment
3.3 Formal definition of the learning problem
	3.3.1 Universe of connection paths
	3.3.2 The forest of candidates
	3.3.3 The learning task
3.4 Recursive networks for ranking alternatives
3.5 Learning procedure
	3.5.1 Dataset compilation
	3.5.2 Network unfolding
	3.5.3 Parameter estimation

4 Network performance
4.1 Experimental setting
	4.1.1 The data set
4.2 Learning and generalizing capabilities of the system
	4.2.1 Target function variation
	4.2.2 Parameter variation
	4.2.3 Learning rate variation
	4.2.4 Training set variation
	4.2.5 Network generalization

5 What is the Network really choosing?
5.1 Structural analyses
	5.1.1 Statistical information on incremental tree features
	5.1.2 Correlation between features and error
	5.1.3 Correct and incorrect predictions characteristics
	5.1.4 Influence of connection paths' frequency
	5.1.5 Filtering out the connection paths' frequency effect
	5.1.6 Filtering out the anchor attachment ambiguity
5.2 Comparison with linguistic heuristics
5.3 Enhancing the network performance
	5.3.1 Connection path set reduction
	5.3.2 Tree reduction
	5.3.3 Network specialization
	5.3.4 Anchor shortcut

6 Modeling human ambiguity resolution
6.1 Introduction
6.2 Experience-based models
6.3 Experimental setting
6.4 Psycholinguistic examples
	6.4.1 NP/S ambiguities
	6.4.2 Relative clause attachment
	6.4.3 Prepositional phrase attachment to noun phrases
	6.4.4 Adverb attachment
	6.4.5 Closure ambiguity
	6.4.6 PP attachment ambiguities
6.5 Conclusion

7 Toward a connectionist incremental parser
7.1 Probabilistic framework
	7.1.1 Defining the model structure
	7.1.2 Maximum likelihood estimation
7.2 An example: PCFG
7.3 Probabilistic Incremental Grammar
7.4 Proposal for an incremental parser and modeling issues
7.5 How to evaluate the parser output
	7.5.1 Definitions
	7.5.2 Evaluation metrics
7.6 Parser evaluation
	7.6.1 Influence of tree reduction procedure
	7.6.2 Parsing performance issues
	7.6.3 Influence of the connection path coverage
	7.6.4 Influence of the beam size
7.7 Final remarks

Conclusion

-----------------------------------------------------------------
Fabrizio Costa
Dept. of Systems and Computer Science, University of Florence
Via di Santa Marta 3, I-50139 Firenze - Italy
Phone: +39 055 4796 361 Fax: +39 055 4796 363
