information function vs. squared error
thanasis kehagias
ST401843%BROWNVM.BITNET at VMA.CC.CMU.EDU
Wed Mar 8 11:36:31 EST 1989
i am looking for pointers to papers discussing the use of an alternative
criterion to squared error in back propagation algorithms. the
alternative function i have in mind is called (depending on context
and author) cross entropy, entropy, information, information divergence
and so on. it is defined something like:

G = sum_{i=1}^{N} p_i * log(p_i)

i am not quite sure what the index i runs through: units, weights or
something else. i know people have been talking about this a lot, i just
cannot remember where i read about it ... it seems like Geoff Hinton's
group works on this.
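
for concreteness, here is a minimal sketch of how such a criterion
compares to squared error for a single layer of sigmoid output units.
the setting (binary targets t, outputs y) and all names are my own
illustrative assumptions, not taken from any particular paper:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squared_error(t, y):
    # standard sum-of-squares criterion
    return 0.5 * np.sum((t - y) ** 2)

def cross_entropy(t, y, eps=1e-12):
    # binary cross-entropy: -sum[ t*log(y) + (1-t)*log(1-y) ]
    y = np.clip(y, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))

# for a sigmoid unit y = sigmoid(net), the gradient of cross-entropy
# with respect to net simplifies to (y - t); the y*(1-y) factor that
# slows squared-error learning when a unit saturates cancels out.
t = np.array([1.0, 0.0, 1.0])
net = np.array([2.0, -1.0, 0.5])
y = sigmoid(net)
print("squared error:", squared_error(t, y))
print("cross entropy:", cross_entropy(t, y))
print("dE/dnet (cross-entropy):", y - t)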
thanks,
Thanasis