[ACT-R-users] similarity
Roy Wilson
rwilson+ at pitt.edu
Fri Jun 17 11:15:46 EDT 2005
On Friday 17 June 2005 10:59, Roman Belavkin wrote:
> I have looked at the site, and it states that it uses mutual information as
> a metric, which, as we know, measures statistical dependence (so it is not
> just correlations between terms). Still, I am not sure whether similarity, as
> we understand it, and statistical dependence are the same thing. For example,
> here are the similarities for the word apple:
>
> apple mac 0.23059334
> apple microsoft 0.21670483
> apple dkz 0.21022835
> etc...
>
> So, the most `similar' word to apple is mac (or how about `dkz'?). To me,
> orange or fruit seem like more similar terms. What these numbers show is the
> degree of statistical dependence between two terms in the documents analysed.
>
> It would be an interesting project for the ACT-R community to investigate
> this difference. What do you think?
I don't know what the answer is, but Manning and Schütze have a fairly
extensive discussion of these issues in (I think) "Foundations of Statistical
Natural Language Processing" (MIT Press, 1999).
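
As a rough illustration of the distinction Roman raises, here is a minimal sketch of pointwise mutual information (PMI) computed from document co-occurrence. The toy corpus and the helper name make_pmi are hypothetical, and this is the textbook PMI definition, not necessarily what the GLSA server computes. It only shows how a dependence measure can rank "mac" above "orange" for "apple" when the documents happen to pair them more often.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is reduced to its set of terms.
docs = [
    {"apple", "mac", "computer"},
    {"apple", "mac", "microsoft"},
    {"apple", "orange", "fruit"},
    {"orange", "fruit", "juice"},
    {"orange", "fruit"},
    {"microsoft", "computer"},
]

def make_pmi(docs):
    """Return a function pmi(x, y) based on document-level co-occurrence."""
    n = len(docs)
    # Number of documents containing each term.
    term_counts = Counter(t for d in docs for t in d)
    # Number of documents containing each unordered pair of terms.
    pair_counts = Counter(
        frozenset(p) for d in docs for p in combinations(sorted(d), 2)
    )

    def pmi(x, y):
        joint = pair_counts[frozenset((x, y))]
        if joint == 0:
            # The pair never co-occurs, so log2(0) is taken as -infinity.
            return float("-inf")
        p_xy = joint / n
        p_x = term_counts[x] / n
        p_y = term_counts[y] / n
        return math.log2(p_xy / (p_x * p_y))

    return pmi

pmi = make_pmi(docs)
for other in ("mac", "orange", "fruit"):
    print("apple", other, round(pmi("apple", other), 3))
```

In this corpus "apple" scores higher with "mac" than with "orange", even though orange is the closer concept, simply because of which terms share documents.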
>
> Roman
>
> -----Original Message-----
> From: act-r-users-admin at act-r.psy.cmu.edu on behalf of Roman Belavkin
> Sent: Fri 6/17/2005 15:17
> To: Kelley, Troy (Civ,ARL/HRED); act-r-users at act-r.psy.cmu.edu
> Cc:
> Subject: RE: [SPAM: 4.500] RE: [ACT-R-users] ACT-R output from GLSA
> server at PARC
>
>
>
> Hi,
>
> I think the word `similar' is not really appropriate here. LSA really just
> shows co-occurrence of the two words, and the higher co-occurrence for the
> word man can be explained simply: the word man also means human, while woman
> is more specific and thus less ambiguous.
>
>
> Cheers,
> Roman
>
> -----Original Message-----
> From: act-r-users-admin at act-r.psy.cmu.edu on behalf of Kelley,
> Troy (Civ,ARL/HRED)
> Sent: Thu 6/9/2005 21:52
> To: act-r-users at act-r.psy.cmu.edu
> Cc:
> Subject: [SPAM: 4.500] RE: [ACT-R-users] ACT-R output from GLSA
> server at PARC
>
>
>
> Here is a sample output from the GLSA server at PARC for the word
> Love
>
> love love 0.9999997
> love fun 0.16717705
> love city 0.103955254
> love place 0.2078263
> love man 0.36086833
> love friend 0.19896048
> love neighbor -0.037641484
> love woman 0.21863273
> love fondness 0.0076576206
>
> Interestingly, love is more similar to the word "man" than to "woman",
> and more similar to "man" than to "fondness" or "friend".
>
> Troy
>
> -----Original Message-----
> From: act-r-users-admin at act-r.psy.cmu.edu
> [mailto:act-r-users-admin at act-r.psy.cmu.edu] On Behalf Of
> Raluca.Budiu at parc.com
> Sent: Thursday, June 09, 2005 3:54 PM
> To: act-r-users at act-r.psy.cmu.edu
> Subject: [ACT-R-users] ACT-R output from GLSA server at PARC
>
>
>
> Some of you may have heard about PARC's effort to build an
> external GLSA/PMI server; it is now available at:
>
> http://glsa.parc.com/
>
> and it produces ACT-R output (other formats are supported as
> well).
>
> GLSA (Generalized Latent Semantic Analysis) is an LSA-like method
> of computing word similarities, but it has the advantage of an adjustable,
> web-based corpus. It takes as input a list of word pairs and provides
> similarities between those words.
>
> Just very recently it started providing ACT-R output; this is an
> ACT-R file that defines a meaning chunk type and sets the Sij-s between
> words to their similarity value as computed by the server.
>
> The server is still in a development phase, but please feel free
> to experiment with it. Comments and suggestions are very welcome.
>
> Raluca Budiu
>
> _______________________________________________
> ACT-R-users mailing list
> ACT-R-users at act-r.psy.cmu.edu
> http://act-r.psy.cmu.edu/mailman/listinfo/act-r-users
>
--
Roy Wilson
Learning Research Development Center
University of Pittsburgh
webpage: www.pitt.edu/~rwilson
email: rwilson at pitt.edu