A computationally efficient squashing function
Michael P. Perrone
mpp at cns.brown.edu
Thu Feb 18 15:42:34 EST 1993
Recently on the comp.ai.neural-nets bboard, there has been a discussion of
more computationally efficient squashing functions. Some colleagues of
mine suggested that many members of the Connectionists mailing list may not
have access to the comp.ai.neural-nets bboard; so I have included a summary
below.
Michael
------------------------------------------------------
David L. Elliot mentioned using the following neuron activation function:
f(x) = x / (1 + |x|)
He argues that this function has the same qualitative properties as the
hyperbolic tangent function but is, in practice, faster to calculate.
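For concreteness, here is a minimal Python/NumPy sketch (mine, not from
Elliot's note) of the function and its derivative; the names elliott and
elliott_deriv are my own:

    import numpy as np

    def elliott(x):
        # f(x) = x / (1 + |x|): bounded in (-1, 1), monotonically
        # increasing, and sigmoid-shaped like tanh, but computed with
        # only an absolute value, an add, and a divide.
        return x / (1.0 + np.abs(x))

    def elliott_deriv(x):
        # f'(x) = 1 / (1 + |x|)^2, also free of transcendental calls;
        # like tanh, the slope at the origin is 1.
        return 1.0 / (1.0 + np.abs(x)) ** 2

    x = np.linspace(-5.0, 5.0, 11)
    print(elliott(x))    # compare against the next line
    print(np.tanh(x))

Note that this function approaches its asymptotes only polynomially (for
x > 0 it equals 1 - 1/(1 + x)) rather than exponentially, so it saturates
more slowly than tanh.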
I have suggested a similar speed-up for radial basis function networks:
f(x) = 1 / (1 + x^2)
which avoids the transcendental (exponential) calculation associated with
Gaussian RBF nets.
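Again as a rough sketch (my code, not part of the original discussion),
the rational basis function next to the Gaussian it replaces; rational_rbf
and gaussian_rbf are hypothetical names:

    import numpy as np

    def rational_rbf(x):
        # f(x) = 1 / (1 + x^2): bell-shaped with peak 1 at x = 0,
        # computed with one multiply, one add, and one divide.
        return 1.0 / (1.0 + x * x)

    def gaussian_rbf(x):
        # The usual Gaussian basis function, requiring a call to exp.
        return np.exp(-x * x)

One design consequence worth noting: the rational version has much heavier
tails (it decays as 1/x^2 rather than exponentially), so inputs far from a
center retain noticeably more influence than under a Gaussian.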
I have run simulations using the above squashing function in various
backprop networks. The performance is comparable to the usual training
with hyperbolic tangents (sometimes worse, sometimes better). I also
found that network performance varied very little when the activation
functions were switched (i.e., two networks with identical weights but
different activation functions give comparable performance on the same
data). I tested these results on two databases: the NIST OCR database
(preprocessed by Nestor Inc.) and the Turk and Pentland human face
database.
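The identical-weights comparison can be illustrated with a toy forward
pass; the layer sizes and random weights below are placeholders of mine,
not the networks used in the experiments above:

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((4, 3))   # hidden-layer weights
    W2 = rng.standard_normal((1, 4))   # output-layer weights
    x = rng.standard_normal(3)

    h = W1 @ x
    out_tanh = W2 @ np.tanh(h)                 # tanh hidden units
    out_elliott = W2 @ (h / (1 + np.abs(h)))   # Elliot hidden units
    # The two squashing functions agree to first order near 0 and share
    # the same sign and saturation behavior, so the outputs are correlated.
    print(out_tanh, out_elliott)

How close the two outputs stay for trained weights is an empirical
question; the simulations reported above suggest they often track each
other well.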
--------------------------------------------------------------------------------
Michael P. Perrone Email: mpp at cns.brown.edu
Institute for Brain and Neural Systems Tel: 401-863-3920
Brown University Fax: 401-863-3934
Providence, RI 02912