Complexity of Second Order Learning Algorithms
Raymond Watrous
watrous at linc.cis.upenn.edu
Wed Aug 3 09:39:03 EDT 1988
It is generally assumed that second order learning algorithms are
computationally too expensive for use on large problems, since the cost
of each Hessian update is O(N**2), where N is the number of links.
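For concreteness, the update usually meant here is a rank-two quasi-Newton
revision of the inverse Hessian approximation, as in BFGS. A minimal sketch,
assuming numpy and a symmetric H (the function and variable names are my own
illustration, not from the report):

    import numpy as np

    def bfgs_inverse_hessian_update(H, s, y):
        # H : N x N inverse-Hessian approximation (symmetric)
        # s : parameter step, w_new - w_old (length N)
        # y : gradient change, g_new - g_old (length N)
        # Expanded into one matrix-vector product and three outer
        # products, every term touches N**2 entries, hence O(N**2).
        rho = 1.0 / np.dot(y, s)   # requires the curvature condition y.s > 0
        Hy = H @ y                 # O(N**2) matrix-vector product
        return (H
                + (rho * rho * np.dot(y, Hy) + rho) * np.outer(s, s)
                - rho * (np.outer(s, Hy) + np.outer(Hy, s)))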
It turns out that each function and gradient evaluation is O(NT), where
T is the number of training samples, since the forward and backward passes
cost O(N) per sample. For the training data to be statistically adequate,
T should be at least on the order of N, and in practice is typically greater
than N. Thus, the computational cost of the function and gradient evaluations
exceeds that of the update algorithm.
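To see the O(NT) count, consider a single linear unit standing in for a full
network (a multilayer net's forward and backward passes are likewise O(N) per
sample); the sketch below is illustrative, not taken from the report:

    import numpy as np

    def sse_and_gradient(w, X, t):
        # w : weight vector, length N (one weight per link)
        # X : T x N matrix of training inputs
        # t : length-T vector of targets
        # Each sample costs an O(N) forward dot product and an O(N)
        # backward accumulation, so one evaluation is O(NT) overall.
        y = X @ w                 # forward pass over T samples: O(NT)
        e = y - t                 # per-sample errors
        f = 0.5 * np.dot(e, e)    # summed squared error
        g = X.T @ e               # gradient accumulation: O(NT)
        return f, g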
Moreover, since the line search generally requires more than two function and
gradient evaluations per Hessian update, the optimization process is dominated
by the function and gradient evaluations rather than by the update operation.
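A back-of-envelope comparison makes the point; the numbers below are
hypothetical, chosen only so that T exceeds N and the evaluation ratio
exceeds two:

    N = 1000                  # number of links (weights)
    T = 2000                  # training samples, with T > N as argued above
    evals_per_update = 2.5    # function/gradient evaluations per Hessian update

    eval_cost = evals_per_update * N * T    # O(NT) work per iteration
    update_cost = N ** 2                    # O(N**2) work per iteration

    print(eval_cost / update_cost)          # 5.0 -- evaluations dominate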
The complexity details and several examples are discussed in the technical
report (revised):
Learning Algorithms for Connectionist Networks: Applied
Gradient Methods of Nonlinear Optimization
MS-CIS-87-51
available from:
James Lotkowski
Technical Report Facility
Room 269/Moore Building
University of Pennsylvania
200 S 33rd Street
Philadelphia, PA 19104-6389
james at cis.upenn.edu
Ray Watrous