Thesis: Self-Organizing Maps in Natural Language Processing

Timo Honkela tho at james.hut.fi
Wed Jan 7 10:02:08 EST 1998


The following Dr.Phil. thesis is available at:

http://www.cis.hut.fi/~tho/thesis/honkela.ps.Z  (compressed postscript)
http://www.cis.hut.fi/~tho/thesis/honkela.ps    (postscript)
http://www.cis.hut.fi/~tho/thesis/              (html)

----------------------------------------------------------------------

      SELF-ORGANIZING MAPS IN NATURAL LANGUAGE PROCESSING 

                          Timo Honkela

                Helsinki University of Technology
                 Neural Networks Research Centre
                P.O.Box 2200 (Rakentajanaukio 2C)
                     FIN-02015 HUT, Finland
                       Timo.Honkela at hut.fi


Kohonen's Self-Organizing Map (SOM) is one of the most popular
artificial neural network algorithms.  Word category maps are
SOMs that have been organized according to word similarities,
measured by the similarity of the short contexts of the words.
Conceptually interrelated words tend to fall into the same or
neighboring map nodes. Nodes may thus be viewed as word categories.
Although no a priori information about classes is given, during
the self-organizing process a model of the word classes emerges.
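
The idea in the paragraph above can be illustrated with a small,
self-contained sketch.  It is not the implementation used in the
thesis.  The assumptions are: each word is described by counts of its
immediate left and right neighbours (a simplified stand-in for the
"short contexts"), the map is a small rectangular grid with a Gaussian
neighbourhood, and the learning rate and radius decay linearly.  All
function names and parameter values are hypothetical.

```python
import math
import random


def context_vectors(tokens):
    """Describe each word by counts of its immediate left and right neighbours.

    Assumption: one-hot neighbour counts stand in for whatever context
    encoding the thesis actually uses.
    """
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    dim = len(vocab)
    counts = {w: [0.0] * (2 * dim) for w in vocab}
    for pos, word in enumerate(tokens):
        if pos > 0:
            counts[word][index[tokens[pos - 1]]] += 1.0        # left neighbour
        if pos + 1 < len(tokens):
            counts[word][dim + index[tokens[pos + 1]]] += 1.0  # right neighbour
    vectors = {}
    for word, vec in counts.items():
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors[word] = [x / norm for x in vec]  # scale to unit length
    return vectors


def best_matching_unit(weights, x):
    """Return the grid coordinates of the node whose weights are closest to x."""
    return min(weights,
               key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)))


def train_som(data, rows, cols, epochs=100, seed=0):
    """Train a rows x cols SOM with online updates and a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            remaining = 1.0 - step / total_steps
            alpha = 0.5 * remaining                              # learning rate
            sigma = max(0.5, 0.5 * max(rows, cols) * remaining)  # neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += alpha * h * (x[i] - w[i])  # pull node towards x
            step += 1
    return weights


def word_category_map(tokens, rows=3, cols=3):
    """Map every word in the token list to its best-matching map node."""
    vectors = context_vectors(tokens)
    weights = train_som(list(vectors.values()), rows, cols)
    return {word: best_matching_unit(weights, vec) for word, vec in vectors.items()}


if __name__ == "__main__":
    corpus = ("the cat eats fish . the dog eats meat . "
              "the cat drinks milk . the dog drinks water .").split()
    for word, node in sorted(word_category_map(corpus).items()):
        print(word, node)
```

In this toy corpus "cat" and "dog" occur in identical contexts, so
they necessarily share a map node.  No word classes are supplied
anywhere in the code; any grouping comes from the context statistics
alone.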

The central topic of the thesis is the use of the SOM in natural
language processing.  The approach based on the word category maps
is compared with the methods that are widely used in artificial
intelligence research.  The modeling of gradience, conceptual change,
and subjectivity in natural language interpretation is considered.
The main application area is information retrieval and textual data
mining, for which a specific SOM-based method called WEBSOM has
been developed.  The WEBSOM method organizes a document collection
on a map display that provides an overview of the collection and
facilitates interactive browsing.



                -------------------   ---------------------------
 Timo Honkela   Timo.Honkela at hut.fi   http://www.cis.hut.fi/~tho/
Neural Networks Research Centre,      Helsinki Univ of Technology
     and        P.O.Box 2200          FIN-02015 HUT, Finland
 Nat Lang Proc  Tel. +358-9-451 3275, Fax +358-9-451 3277




