Ill-conditioning in NNs (Tech. Rep.)

Sirpa Saarinen saarinen at csrd.uiuc.edu
Fri Dec 20 12:55:44 EST 1991


Technical report available:
CSRD Report no. 1089

Ill-Conditioning in Neural Network Training Problems

S. Saarinen, R. Bramley and G. Cybenko

Center for Supercomputing Research and Development,
University of Illinois, Urbana, IL 61801, USA

Abstract

The training problem for feedforward neural networks is a
nonlinear parameter estimation problem that can be solved by a
variety of optimization techniques.  Much of the literature on
neural networks has focused on variants of gradient descent.
Training neural networks with such techniques is known to be
slow, and more sophisticated techniques do not always perform
significantly better.
In this paper, we show that feedforward neural networks can 
have ill-conditioned Hessians and that this ill-conditioning can
be quite common.  The analysis and experimental results in this
paper lead to the conclusion that many network training problems
are ill-conditioned and may not be solved more efficiently by
higher-order optimization methods.  While our analyses are
for completely connected layered networks, they extend to networks with
sparse connectivity as well.  Our results suggest that neural networks
can have considerable redundancy in parameterizing the function
space in a neighborhood of a local minimum, independently of whether
or not the solution has a small residual.


If you would like a copy of this report, please write to

nichols at csrd.uiuc.edu

and ask for report 1089.


Merry Christmas,
Sirpa Saarinen
saarinen at csrd.uiuc.edu

