[ACT-R-users] on similarity

Raluca.Budiu at parc.com Raluca.Budiu at parc.com
Fri Jun 17 13:28:32 EDT 2005


Yes, it's true that the results depend on the number of eigenvectors.
The intuition is that the more eigenvectors are used, the more noise (i.e.,
accidental co-occurrences) the system takes into account. With fewer
eigenvectors, more generalization is allowed.
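
To illustrate the effect of the number of retained dimensions, here is a
minimal Python sketch. It is not the GLSA server's pipeline: it applies a
plain truncated SVD (as in traditional LSA) to a made-up word-by-context
count matrix, and the names truncated_embeddings, cosine, and similarity
are hypothetical helpers. It only shows that the same word pair can get a
different cosine similarity depending on how many dimensions k are kept.

```python
import numpy as np

def truncated_embeddings(matrix, k):
    """Represent each row of `matrix` by its coordinates on the top-k singular directions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy word-by-context count matrix (made-up numbers, for illustration only).
words = ["apple", "orange", "intel"]
counts = np.array([
    [4.0, 3.0, 0.0, 1.0],  # apple
    [3.0, 4.0, 0.0, 0.0],  # orange
    [0.0, 0.0, 5.0, 1.0],  # intel
])

def similarity(w1, w2, k):
    """Cosine similarity of two words after keeping k dimensions."""
    emb = truncated_embeddings(counts, k)
    return cosine(emb[words.index(w1)], emb[words.index(w2)])

# Print the similarities for each choice of k.
for k in (1, 2, 3):
    print(k, similarity("apple", "orange", k), similarity("apple", "intel", k))
```

With k=1 every word lies on a single axis, so all cosines collapse to +1
or -1; with all three dimensions kept, the cosines equal those of the raw
count rows.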

Ayman has run some tests and found that the optimal number of
eigenvectors is around 200-300 (this is also true for traditional LSA).

As for the results of GLSA being called "similarities", that is a
topic very much open to debate. As Peter has said, there is some
theoretical motivation for calling them so (PMIs can be shown
to be roughly equivalent to the original definition of strengths of
association in ACT-R 4.0). Whether association is the same as
similarity is another story. In ACT-R 5.0, I think there was a
tacit understanding (maybe not so tacit, if I correctly remember some
of the discussions at the ACT-R workshops and postgraduate summer
school) that there is some need to experiment with Sji-s and see
exactly what they should reflect (co-occurrence or perhaps some other
measure of semantic similarity).
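
For readers unfamiliar with PMI, a minimal sketch of the standard
pointwise mutual information computation from co-occurrence counts
follows. The counts and the function name pmi are made up for
illustration, and this is the textbook definition, not necessarily the
exact variant the GLSA server uses.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information: log( p(x, y) / (p(x) * p(y)) ).

    count_xy: number of contexts containing both words
    count_x, count_y: number of contexts containing each word
    total: total number of contexts
    """
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))

# Made-up counts: the two words co-occur five times more often than
# they would if they were independent, so the PMI is positive.
print(pmi(count_xy=50, count_x=100, count_y=100, total=1000))

# Made-up counts: co-occurrence at chance level, so the PMI is close to zero.
print(pmi(count_xy=10, count_x=100, count_y=100, total=1000))
```

A positive value means the words co-occur more often than chance, which
is a statement about association and not, by itself, about similarity.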

Raluca

From: Roman Belavkin <R.Belavkin at mdx.ac.uk>
Date: Jun 17, 2005 7:58 AM 
Subject: [ACT-R-users] on similarity
To: act-r-users at act-r.psy.cmu.edu

One more observation:  The results depend greatly on the number of
eigenvectors used (I understand this controls the number of dimensions
PCA reduces the space to).  I found that the `optimal' value for the word
apple is about 50 dimensions, because the results are:

apple carnation 0.6078935
apple fruit 0.5972236
apple blackberry 0.59559596
apple mac 0.5931967
apple orange 0.570567
apple sweet 0.5677523
apple palm 0.5576204
apple radish 0.55353457
apple intel 0.5511009
apple persimmon 0.54269683

They are quite different as you can see.

Cheers!
Roman

        -----Original Message-----
        From: act-r-users-admin at act-r.psy.cmu.edu on behalf of Roman Belavkin
        Sent: Fri 6/17/2005 15:17
        To: Kelley, Troy (Civ,ARL/HRED); act-r-users at act-r.psy.cmu.edu
        Cc:
        Subject: RE: [SPAM: 4.500] RE: [ACT-R-users] ACT-R output from GLSA server at PARC



        Hi,

        I think the word `similar' is not really appropriate here.  LSA
        really just shows co-occurrence of the two words, and the higher
        co-occurrence for the word man can be explained simply because man
        also means human, while woman is more specific and thus less ambiguous.


        Cheers,
        Roman

                -----Original Message-----
                From: act-r-users-admin at act-r.psy.cmu.edu on behalf of Kelley, Troy (Civ,ARL/HRED)
                Sent: Thu 6/9/2005 21:52
                To: act-r-users at act-r.psy.cmu.edu
                Cc:
                Subject: [SPAM: 4.500] RE: [ACT-R-users] ACT-R output from GLSA server at PARC



                Here is a sample output from the GLSA server at PARC for
                the word Love

                love    love    0.9999997
                love    fun     0.16717705
                love    city    0.103955254
                love    place   0.2078263
                love    man     0.36086833
                love    friend  0.19896048
                love    neighbor        -0.037641484
                love    woman   0.21863273
                love    fondness        0.0076576206

                Interesting: love is more similar to the word "man" than
                to "woman", and more similar to "man" than to "fondness"
                or "friend".

                Troy

                -----Original Message-----
                From: act-r-users-admin at act-r.psy.cmu.edu
                [mailto:act-r-users-admin at act-r.psy.cmu.edu] On Behalf Of
                Raluca.Budiu at parc.com
                Sent: Thursday, June 09, 2005 3:54 PM
                To: act-r-users at act-r.psy.cmu.edu
                Subject: [ACT-R-users] ACT-R output from GLSA server at PARC



                Some of you may have heard about PARC's effort to build
                an external GLSA/PMI server; it is now available at:

                http://glsa.parc.com/

                and it produces ACT-R output (other formats are
                supported as well).

                GLSA (Generalized Latent Semantic Analysis) is an
                LSA-like method of computing word similarities, but it
                has the advantage of an adjustable, web-based corpus. It
                takes as input a list of word pairs and provides
                similarities between those words.

                Just very recently it started providing ACT-R output;
                this is an ACT-R file that defines a meaning chunk type
                and sets the Sij-s between words to their similarity
                value as computed by the server.

                The server is still in a development phase, but please
                feel free to experiment with it. Comments and
                suggestions are very welcome.

                Raluca Budiu

                _______________________________________________
                ACT-R-users mailing list
                 ACT-R-users at act-r.psy.cmu.edu
                http://act-r.psy.cmu.edu/mailman/listinfo/act-r-users 






