Thesis: Self-Organizing Maps in Natural Language Processing
Timo Honkela
tho at james.hut.fi
Wed Jan 7 10:02:08 EST 1998
The following Dr.Phil. thesis is available at:
http://www.cis.hut.fi/~tho/thesis/honkela.ps.Z (compressed postscript)
http://www.cis.hut.fi/~tho/thesis/honkela.ps (postscript)
http://www.cis.hut.fi/~tho/thesis/ (html)
----------------------------------------------------------------------
SELF-ORGANIZING MAPS IN NATURAL LANGUAGE PROCESSING
Timo Honkela
Helsinki University of Technology
Neural Networks Research Centre
P.O.Box 2200 (Rakentajanaukio 2C)
FIN-02015 HUT, Finland
Timo.Honkela at hut.fi
Kohonen's Self-Organizing Map (SOM) is one of the most popular
artificial neural network algorithms. Word category maps are
SOMs that have been organized according to word similarities,
measured by the similarity of the short contexts of the words.
Conceptually interrelated words tend to fall into the same or
neighboring map nodes. Nodes may thus be viewed as word categories.
Although no a priori information about classes is given, during
the self-organizing process a model of the word classes emerges.
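The idea of a word category map can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the toy corpus, the count-based encoding of left and right neighbors, the 3x3 grid, and all function names and training parameters below are assumptions chosen for brevity. Each word is represented by a vector describing its immediate context, and a small SOM is trained on those vectors so that words with similar contexts land on the same or nearby nodes.

```python
import math
import random


def context_vectors(sentences):
    """Encode each word by counts of its left and right neighbors (unit length)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    vecs = {w: [0.0] * (2 * n) for w in vocab}
    for s in sentences:
        for pos, w in enumerate(s):
            if pos > 0:
                vecs[w][index[s[pos - 1]]] += 1.0        # left neighbor
            if pos + 1 < len(s):
                vecs[w][n + index[s[pos + 1]]] += 1.0    # right neighbor
    for w, v in vecs.items():
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vecs[w] = [x / norm for x in v]
    return vecs


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(codebook, v):
    """Return the grid coordinates of the node whose model vector is closest to v."""
    return min(codebook, key=lambda node: sq_dist(codebook[node], v))


def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a small SOM with a Gaussian neighborhood that shrinks over time."""
    rng = random.Random(seed)
    dim = len(data[0])
    codebook = {(r, c): [rng.random() for _ in range(dim)]
                for r in range(rows) for c in range(cols)}
    for t in range(epochs):
        frac = 1.0 - t / epochs
        alpha = 0.5 * frac + 0.01                     # learning rate
        radius = max(rows, cols) / 2.0 * frac + 0.5   # neighborhood width
        for v in rng.sample(data, len(data)):         # shuffled pass over the data
            br, bc = best_matching_unit(codebook, v)
            for (r, c), w in codebook.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += alpha * h * (v[i] - w[i])
    return codebook


def word_category_map(sentences, **som_args):
    """Map every word in the corpus to the grid node that best matches its context."""
    vecs = context_vectors(sentences)
    words = sorted(vecs)
    codebook = train_som([vecs[w] for w in words], **som_args)
    return {w: best_matching_unit(codebook, vecs[w]) for w in words}


if __name__ == "__main__":
    corpus = [s.split() for s in (
        "the cat eats fish",
        "the dog eats meat",
        "the cat drinks milk",
        "the dog drinks water",
    )]
    for word, node in sorted(word_category_map(corpus).items()):
        print(word, node)
```

In this toy corpus "cat" and "dog" occur in identical contexts, so they are necessarily assigned to the same node. No class labels are supplied at any point; whatever grouping appears comes from the context statistics alone.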
The central topic of the thesis is the use of the SOM in natural
language processing. The approach based on word category maps
is compared with methods that are widely used in artificial
intelligence research. The modeling of gradience, conceptual change,
and subjectivity in natural language interpretation is also considered.
The main application area is information retrieval and textual data
mining, for which a specific SOM-based method called WEBSOM has
been developed. The WEBSOM method organizes a document collection
on a map display that provides an overview of the collection and
facilitates interactive browsing.
-------------------  ---------------------------
Timo Honkela         Timo.Honkela at hut.fi  http://www.cis.hut.fi/~tho/
Neural Networks      Research Centre, Helsinki Univ of Technology
and                  P.O.Box 2200 FIN-02015 HUT, Finland
Nat Lang Proc        Tel. +358-9-451 3275, Fax +358-9-451 3277