Preprint available
Christian Omlin
omlin at waterbug.cs.sun.ac.za
Thu Nov 25 11:55:04 EST 1999
Dear Colleagues,
The technical report below is available from our website:
http://www.cs.sun.ac.za/projects/tech_reports/US-CS-TR-99-14.ps.gz
We welcome any comments you may have.
With kind regards,
Christian
Christian W. Omlin                e-mail: omlin at cs.sun.ac.za
Department of Computer Science    phone (direct): +27-21-808-4308
University of Stellenbosch        phone (secretary): +27-21-808-4232
Private Bag X1                    fax: +27-21-808-4416
Stellenbosch 7602                 http://www.cs.sun.ac.za/people/staff/omlin
SOUTH AFRICA                      http://www.neci.nj.nec.com/homepages/omlin
------------------------------- cut here ------------------------------
What Inductive Bias Gives Good
Neural Network Training Performance?
S. Snyders C.W. Omlin
Department of Computer Science
University of Stellenbosch
7602 Stellenbosch
South Africa
E-mail: {snyders,omlin}@cs.sun.ac.za
ABSTRACT
There has been an increased interest in the use of prior knowl-
edge for training neural networks. Prior knowledge in the form of
Horn clauses has been the predominant paradigm for knowledge-
based neural networks. Given a set of training examples and an
initial domain theory, a neural network is constructed that fits
the training examples by preprogramming some of the weights. The
initialized neural network is then trained using backpropagation
to refine the knowledge. The prior knowledge presumably defines
a good starting point in weight space and provides an inductive
bias leading to faster convergence; it overrides backpropaga-
tion's bias toward a smooth interpolation resulting in small
weights. This paper proposes a heuristic for determining the
strength of the inductive bias by making use of gradient informa-
tion in weight space in the direction of the programmed weights.
The network starts its search in weight space where the gradient
is maximal, thus speeding up convergence. Tests on a benchmark
problem from molecular biology demonstrate that our heuristic on
average reduces the training time by 60% compared to a random
choice of the strength of the inductive bias; this performance is
within 20% of the training time that can be achieved with optimal
inductive bias. The difference in generalization performance is
not statistically significant.
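
The heuristic described in the abstract can be illustrated with a small
sketch. It is not the authors' implementation: the abstract does not give
the network architecture or the rule encoding, so everything below is an
assumption made for illustration. The sketch assumes a KBANN-style encoding
of a single conjunctive rule in a toy two-layer sigmoid network, writes the
programmed weights as H times a fixed pattern, estimates the slope of the
training error along that pattern (dE/dH) by central differences, and scans
a grid of candidate strengths for the one with the largest slope magnitude.
All function names, the toy rule, and the candidate grid are hypothetical.

```python
import itertools
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def network_error(H, X, y, signs):
    """Squared error of a toy two-layer network programmed with strength H.

    Assumed encoding of one conjunctive rule: the hidden unit gets weight +H
    for each positive antecedent, -H for each negated one, and bias
    -(P - 0.5) * H, where P is the number of positive antecedents. The output
    unit reads the hidden unit with weight H and bias -0.5 * H.
    """
    P = np.sum(signs > 0)
    hidden = sigmoid(H * (X @ signs - (P - 0.5)))
    out = sigmoid(H * (hidden - 0.5))
    return 0.5 * np.sum((out - y) ** 2)

def slope_along_programmed_weights(H, X, y, signs, eps=1e-4):
    """Central-difference estimate of dE/dH, the error slope along the
    direction of the programmed weights."""
    upper = network_error(H + eps, X, y, signs)
    lower = network_error(H - eps, X, y, signs)
    return (upper - lower) / (2.0 * eps)

def pick_strength(X, y, signs, candidates):
    """Return the candidate H at which |dE/dH| is largest."""
    slopes = [abs(slope_along_programmed_weights(H, X, y, signs))
              for H in candidates]
    return float(candidates[int(np.argmax(slopes))])

# Toy rule (hypothetical): y = x0 AND x1 AND NOT x2, over all 8 binary inputs.
X = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
y = ((X[:, 0] == 1) & (X[:, 1] == 1) & (X[:, 2] == 0)).astype(float)
signs = np.array([1.0, 1.0, -1.0])

candidates = np.linspace(0.0, 10.0, 101)
best_H = pick_strength(X, y, signs, candidates)
print("chosen strength H:", best_H)
```

In this toy setting the slope vanishes at H = 0 and again as the units
saturate for large H, so the chosen strength lies between the two extremes.
Backpropagation training would then start from the weights programmed with
the chosen H.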