Ph.D. Thesis Announcement

Mon Mar 4 14:41:50 EST 2002

You may be interested in the availability of my recently completed
Ph.D. thesis, entitled "A Connectionist Model of Sentence
Comprehension and Production."  I have included an abstract and a
summary of the table of contents below.

The thesis can be downloaded in .ps or .pdf format at this site:

   http://tedlab.mit.edu/~dr/Thesis/

Although the thesis is a bit long, you may wish to take a look at the
Introduction and the Discussion to see if you might find anything of
interest in between.

Cheers,
Doug Rohde

       ------------------------------------------------------

    A Connectionist Model of Sentence Comprehension and Production

                        Douglas L. T. Rohde

      Carnegie Mellon University, Dept. of Computer Science and
             the Center for the Neural Basis of Cognition

Thesis Committee:
   Dr. David C. Plaut, Chair
   Dr. James L. McClelland
   Dr. David S. Touretzky
   Dr. Maryellen C. MacDonald

Abstract:

The predominant language processing theories have, for some time,
been based largely on structured knowledge and relatively simple
rules.  These symbolic models intentionally segregate syntactic
information processing from statistical information as well as
semantic, pragmatic, and discourse influences, thereby minimizing the
importance of these potential constraints in learning and processing
language.  While such models have the advantage of being relatively
simple and explicit, they are inadequate to account for learning and
for empirically validated ambiguity resolution phenomena.  In recent years,
interactive constraint-based theories of sentence processing have
gained increasing support, as a growing body of empirical evidence
demonstrates early influences of various factors on comprehension
performance.  Connectionist networks are one form of model that
naturally reflects many properties of constraint-based theories, and
thus provide a form in which those theories may be instantiated.

Unfortunately, most of the connectionist language models implemented
until now have involved severe limitations, restricting the phenomena
they could address.  Comprehension and production models have, by and
large, been limited to simple sentences with small vocabularies
(St. John & McClelland, 1990).  Most models that have addressed the
problem of complex, multi-clausal sentence processing have been
prediction networks (Elman, 1991; Christiansen & Chater, 1999).
Although a useful component of a language processing system,
prediction does not get at the heart of language: the interface
between syntax and semantics.

The current thesis focuses on the design and testing of the
Connectionist Sentence Comprehension and Production (CSCP) model, a
recurrent neural network that has been trained to both comprehend and
produce a relatively complex subset of English.  This language
includes such features as tense and number, adjectives and adverbs,
prepositional phrases, relative clauses, subordinate clauses, and
sentential complements, with a vocabulary of about 300 total words.
It is broad enough that it permits the model to address a wide range
of sentence processing phenomena.  The experiments reported here
involve such issues as the relative comprehensibility of various
sentence types, the resolution of lexical ambiguities, generalization
to novel sentences, the comprehension of main verb/reduced relative,
sentential complement, subordinate clause, and prepositional phrase
attachment ambiguities, agreement attraction and other production
errors, and structural priming.

The model is able to replicate many key aspects of human sentence
processing across these domains, including sensitivity to lexical and
structural frequencies, semantic plausibility, inflectional
morphology, and locality effects.  A critical feature of the model is
its suggestion of a tight coupling between comprehension and
production and the idea that language production is primarily learned
through the formulation and testing of covert predictions during
comprehension.  I believe this work represents a major advance in the
attested ability of connectionist networks to process natural language
and a significant step towards a more complete understanding of the
human language faculty.

       ------------------------------------------------------

                               Contents

1  Introduction
    1.1 Why implement models?
    1.2 Properties of human language processing
    1.3 Properties of symbolic models
    1.4 Properties of connectionist models
    1.5 The CSCP model
    1.6 Chapter overview

2  An Overview of Connectionist Sentence Processing
    2.1 Parsing
    2.2 Comprehension
    2.3 Word prediction
    2.4 Production
    2.5 Other language processing models

3  Empirical Studies of Sentence Processing
    3.1 Introduction
    3.2 Relative clauses
    3.3 Main verb/reduced-relative ambiguities
    3.4 Sentential complements
    3.6 Prepositional phrase attachment
    3.7 Effects of discourse context
    3.8 Production
    3.9 Summary of empirical findings

4  Analysis of Syntax Statistics in Parsed Corpora
    4.1 Extracting syntax statistics isn't easy
    4.2 Verb phrases
    4.3 Relative clauses
    4.4 Sentential noun phrases
    4.5 Determiners and adjectives
    4.6 Prepositional phrases
    4.7 Coordination and subordination
    4.8 Conclusion

5  The Penglish Language
    5.1 Language features
    5.2 Penglish grammar
    5.3 The lexicon
    5.4 Phonology
    5.5 Semantics
    5.6 Statistics

6  The CSCP Model
    6.1 Basic architecture
    6.2 The semantic system
    6.3 The comprehension, prediction, and production system
    6.4 Training
    6.5 Testing
    6.6 Claims and limitations of the model

7  General Comprehension Results
    7.1 Overall performance
    7.2 Representation
    7.3 Experiment 2: Comparison of sentence types
    7.4 Lexical ambiguity
    7.5 Experiment 4: Adverbial attachment
    7.6 Experiment 5: Prepositional phrase attachment
    7.7 Reading time
    7.8 Individual differences

8  The Main Verb/Reduced Relative Ambiguity
    8.1 Empirical results
    8.2 Experiment 6
    8.3 Verb frequency effects
    8.4 Summary

9  The Sentential Complement Ambiguity
    9.1 Empirical results
    9.2 Experiment 7
    9.3 Summary

10 The Subordinate Clause Ambiguity
    10.1 Empirical results
    10.2 Experiment 8
    10.3 Experiment 9
    10.4 Experiment 10: Incomplete reanalysis
    10.5 Summary and discussion

11 Relative Clauses
    11.1 Empirical results
    11.2 Experiment 11
    11.3 Discussion

12 Production
    12.1 Word-by-word production
    12.2 Free production
    12.3 Agreement attraction
    12.4 Structural priming
    12.5 Summary

13 Discussion
    13.1 Summary of results
    13.2 Accomplishments of the model
    13.3 Problems with the model
    13.4 Model versus theory
    13.5 Properties, principles, processes
    13.6 Conclusion

Appendices
A  Lens: The Light, Efficient Network Simulator
    A.1 Performance benchmarks
    A.2 Optimizations
    A.3 Parallel training
    A.4 Customization
    A.5 Interface
    A.6 Conclusion

B  SLG: The Simple Language Generator
    B.1 The grammar
    B.2 Resolving the grammar
    B.3 Minimizing the grammar
    B.4 Parsing
    B.5 Word prediction
    B.6 Conclusion

C  TGrep2: A Tool for Searching Parsed Corpora
    C.1 Preparing corpora
    C.2 Command-line arguments
    C.3 Specifying patterns
    C.4 Controlling the output
    C.5 Differences from TGrep

D  Details of the Penglish Language
    D.1 The Penglish SLG grammar
    D.2 The Penglish lexicon

       ------------------------------------------------------