information function vs. squared error

thanasis kehagias ST401843%BROWNVM.BITNET at VMA.CC.CMU.EDU
Wed Mar 8 11:36:31 EST 1989


i am looking for pointers to papers discussing the use of an alternative
criterion to squared error in back-propagation algorithms. the
alternative function i have in mind is called (in different contexts
and/or by different authors) cross entropy, entropy, information,
information divergence, and so on. it is defined something like:


    G = sum_{i=1}^{N} p_i * log(p_i)


i am not quite sure what the index i runs through: units, weights, or
something else. i know people have been talking about this a lot; i just
cannot remember where i read about it ... it seems like Geoff Hinton's
group has worked on this.
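
for concreteness, here is a minimal sketch (in Python with numpy; all
names are hypothetical, not taken from any particular paper) of what the
two criteria might look like for target values t_i and output
activations o_i, assuming the index i runs over the output units:

    import numpy as np

    def squared_error(o, t):
        # standard sum-of-squares criterion
        return 0.5 * np.sum((o - t) ** 2)

    def cross_entropy(o, t, eps=1e-12):
        # cross-entropy criterion for sigmoid output units; assumes
        # outputs lie in (0, 1) and targets in [0, 1]; eps guards
        # against log(0)
        o = np.clip(o, eps, 1.0 - eps)
        return -np.sum(t * np.log(o) + (1.0 - t) * np.log(1.0 - o))

one attraction of the cross-entropy form, if i remember it right, is
that for sigmoid output units its derivative with respect to the net
input reduces to o_i - t_i, dropping the o_i*(1 - o_i) factor that
appears in the squared-error gradient.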



      thanks,

          Thanasis

