<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">


<head>

<meta http-equiv=Content-Type content="text/html; charset=us-ascii">

<meta name=Generator content="Microsoft Word 11 (filtered medium)">

<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"

 name="country-region"/>

<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"

 name="PlaceName"/>

<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"

 name="PlaceType"/>

<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"

 name="City"/>

<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"

 name="place"/>

<!--[if !mso]>

<style>

st1\:*{behavior:url(#default#ieooui) }

</style>

<![endif]-->

<style>

<!--

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {color:blue;

        text-decoration:underline;}

span.EmailStyle18

        {mso-style-type:personal-reply;

        font-family:Arial;

        color:navy;}

@page Section1

        {size:8.5in 11.0in;

        margin:1.0in 1.25in 1.0in 1.25in;}

div.Section1

        {page:Section1;}

-->

</style>


</head>


<body lang=EN-US link=blue vlink=blue>


<div class=Section1>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Yes, it’s true that the results
depend on the number of eigenvectors. The intuition is that the more
eigenvectors are kept, the more noise (i.e., accidental co-occurrences) the
system takes into account. With fewer eigenvectors, more generalization is
allowed.<o:p></o:p></span></font></p>
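<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>To make the intuition concrete, here is a
minimal sketch, not the GLSA server's actual code. It assumes a plain truncated
SVD of a small, made-up word-by-context count matrix and compares cosine
similarities when 1, 2, or 3 singular vectors are kept. The function name
truncated_similarity, the words, and the counts are all hypothetical.<o:p></o:p></span></font></p>

<pre>
```python
import numpy as np

def truncated_similarity(matrix, k, i, j):
    """Cosine similarity of rows i and j after keeping the top-k singular vectors."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    vecs = u[:, :k] * s[:k]          # word vectors in the k-dimensional space
    a, b = vecs[i], vecs[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical word-by-context counts (rows: words, columns: contexts).
words = ["apple", "orange", "intel"]
counts = np.array([
    [4.0, 3.0, 0.0, 2.0],   # apple: mostly fruit contexts, some computer contexts
    [5.0, 4.0, 0.0, 0.0],   # orange: fruit contexts only
    [0.0, 0.0, 1.0, 5.0],   # intel: computer contexts only
])

# Print apple/orange and apple/intel similarities for each number of dimensions.
for k in (1, 2, 3):
    sim_fruit = truncated_similarity(counts, k, 0, 1)
    sim_tech = truncated_similarity(counts, k, 0, 2)
    print(k, round(sim_fruit, 3), round(sim_tech, 3))
```
</pre>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>With a single dimension every pair of words
collapses to a similarity of 1 (maximal generalization), while keeping all
three dimensions reproduces the cosine of the raw count rows, accidental
co-occurrences included.<o:p></o:p></span></font></p>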


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Ayman has run some tests and found
that the optimal number of eigenvectors is around 200-300 (this is also true
of traditional LSA).<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>As for the results of GLSA being
called “similarities”, that’s a topic very much open to
debate indeed. As Peter has said, there is some theoretical motivation for
calling them so (PMIs can be shown to be roughly equivalent to the original
definition of strengths of association in ACT-R 4.0). Whether association is
the same as similarity is another story. In ACT-R 5.0, I think
there was a tacit understanding (maybe not so tacit, if I remember correctly some
of the discussions at the ACT-R workshops and postgraduate summer school) that
there is some need to experiment with Sji-s and see exactly what they should reflect
(co-occurrence or perhaps some other measure of semantic similarity).<o:p></o:p></span></font></p>
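<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>For readers unfamiliar with PMI, the
following is a minimal sketch of the standard pointwise mutual information
formula, computed from made-up co-occurrence counts. It also shows the same
value written as a log ratio of a conditional probability to a base rate, which
is the general shape of the comparison to strengths of association mentioned
above. The exact ACT-R 4.0 definition is not reproduced here, and the function
names and counts are hypothetical.<o:p></o:p></span></font></p>

<pre>
```python
import math

def pmi(pair_count, count_x, count_y, total):
    """Pointwise mutual information: log of p(x, y) / (p(x) * p(y))."""
    p_xy = pair_count / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))

def log_ratio(pair_count, count_x, count_y, total):
    """The same quantity written as log of p(y | x) / p(y)."""
    p_y_given_x = pair_count / count_x
    p_y = count_y / total
    return math.log(p_y_given_x / p_y)

# Hypothetical counts over 1000 context windows.
total = 1000
print(pmi(30, 100, 50, total))        # pair seen more often than chance
print(log_ratio(30, 100, 50, total))  # same quantity, conditional form
print(pmi(5, 100, 50, total))         # pair seen at chance level
```
</pre>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>A positive value only says that two words
co-occur more often than chance would predict. Whether that should count as
similarity is exactly the open question.<o:p></o:p></span></font></p>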


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Raluca<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>




<div style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt'>


<p class=MsoNormal><span class=gmailquote><font size=3 face="Times New Roman"><span

style='font-size:12.0pt'>From: <b><span style='font-weight:bold'>Roman Belavkin</span></b>

&lt;<a href="mailto:R.Belavkin@mdx.ac.uk">R.Belavkin@mdx.ac.uk</a>&gt;</span></font></span><br>

<span class=gmailquote>Date: Jun 17, 2005 7:58 AM </span><br>

<span class=gmailquote>Subject: [ACT-R-users] on similarity</span><br>

<span class=gmailquote>To: <a href="mailto:act-r-users@act-r.psy.cmu.edu">act-r-users@act-r.psy.cmu.edu</a></span><br>

<br>

One more observation: the results depend greatly on the number of
eigenvectors used (I understand this controls the number of dimensions that PCA
reduces the space to). I found that the `optimal' value for the word
apple is about 50 dimensions, because the results are:<br>

<br>

apple carnation 0.6078935<br>

apple fruit 0.5972236<br>

apple blackberry 0.59559596<br>

apple mac 0.5931967<br>

apple orange 0.570567<br>

apple sweet 0.5677523<br>

apple palm 0.5576204<br>

apple radish 0.55353457<br>

apple intel 0.5511009<br>

apple persimmon 0.54269683<br>

<br>

They are quite different as you can see.<br>

<br>

Cheers!<br>

Roman<br>

<br>

        -----Original Message-----<br>

        From: <a

href="mailto:act-r-users-admin@act-r.psy.cmu.edu">act-r-users-admin@act-r.psy.cmu.edu</a>

on behalf of Roman Belavkin<br>

        Sent: Fri 6/17/2005 15:17<br>

        To: Kelley, Troy

(Civ,ARL/HRED); <a href="mailto:act-r-users@act-r.psy.cmu.edu">act-r-users@act-r.psy.cmu.edu

</a><br>

        Cc:<br>

        Subject: RE: [SPAM: 4.500] RE:

[ACT-R-users] ACT-R output from GLSA server@PARC<br>

<br>

<br>

<br>

        Hi,<br>

<br>

        I think the word `similar' is
not really appropriate here. LSA really just shows the co-occurrence of the two
words, and the higher co-occurrence for the word man can be explained simply
by the fact that man also means human, while woman is more specific and thus
less ambiguous.<br>

<br>

<br>

        Cheers,<br>

        Roman<br>

<br>

                -----Original

Message-----<br>

                From:

<a href="mailto:act-r-users-admin@act-r.psy.cmu.edu">act-r-users-admin@act-r.psy.cmu.edu</a>

on behalf of Kelley, Troy (Civ,ARL/HRED)<br>

                Sent:

Thu 6/9/2005 21:52<br>

                To:

<a href="mailto:act-r-users@act-r.psy.cmu.edu">act-r-users@act-r.psy.cmu.edu</a><br>

                Cc:<br>

                Subject:

[SPAM: 4.500] RE: [ACT-R-users] ACT-R output from GLSA server@PARC<br>

<br>

<br>

<br>

                Here

is a sample output from the GLSA server@PARC for the word Love<br>

<br>

                love    love    0.9999997<br>

                love    fun    

0.16717705<br>

                love    city    0.103955254<br>

                love    place  

0.2078263<br>

                love    man    

0.36086833<br>

                love    friend  0.19896048<br>

                love    neighbor        -0.037641484<br>

                love    woman  

0.21863273<br>

                love    fondness        0.0076576206<br>

<br>

                Interestingly,
love is more similar to the word "man" than to "woman", and<br>
                more
similar to "man" than to "fondness" or "friend".<br>

<br>

                Troy<br>

<br>

                -----Original

Message-----<br>

                From:

<a href="mailto:act-r-users-admin@act-r.psy.cmu.edu">act-r-users-admin@act-r.psy.cmu.edu</a><br>

                [mailto:<a

href="mailto:act-r-users-admin@act-r.psy.cmu.edu">act-r-users-admin@act-r.psy.cmu.edu</a>]

On Behalf Of <br>

                <a

href="mailto:Raluca.Budiu@parc.com">Raluca.Budiu@parc.com</a><br>

                Sent:

Thursday, June 09, 2005 3:54 PM<br>

                To:

<a href="mailto:act-r-users@act-r.psy.cmu.edu">act-r-users@act-r.psy.cmu.edu</a><br>

                Subject:

[ACT-R-users] ACT-R output from GLSA server@PARC<br>

<br>

<br>

<br>

                Some

of you may have heard about PARC's effort to build an external<br>

                GLSA/PMI

server; it is now available at:<br>

<br>

                <a

href="http://glsa.parc.com/">http://glsa.parc.com/</a><br>

<br>

                and

it produces ACT-R output (other formats are supported as well).<br>

<br>

                GLSA
(Generalized Latent Semantic Analysis) is an LSA-like method of<br>
                computing
word similarities, but it has the advantage of an adjustable,<br>
                web-based
corpus. It takes as input a list of word pairs and provides<br>
                similarities
between those words.<br>

<br>

                It
very recently started providing ACT-R output: an ACT-R<br>
                file
that defines a meaning chunk type and sets the Sij-s between words<br>
                to
their similarity values as computed by the server.<br>

<br>

                The

server is still in a development phase, but please feel free to<br>

                experiment

with it. Comments and suggestions are very welcome.<br>

<br>

                Raluca

Budiu<br>

<br>

                _______________________________________________<br>

                ACT-R-users

mailing list<br>

                
<a href="mailto:ACT-R-users@act-r.psy.cmu.edu">ACT-R-users@act-r.psy.cmu.edu</a><br>

                <a

href="http://act-r.psy.cmu.edu/mailman/listinfo/act-r-users">http://act-r.psy.cmu.edu/mailman/listinfo/act-r-users</a>

<br>

<br>


<br>


<br>

<br>


<br>

<br>

<o:p></o:p></p>


</div>


</div>


</body>


</html>