new paper in neuroprose

Wolfgang Maass maass at igi.tu-graz.ac.at
Thu Jul 1 13:19:37 EDT 1993




FTP-host: archive.cis.ohio-state.edu
FTP-filename: /pub/neuroprose/maass.agnostic.ps.Z

The file maass.agnostic.ps.Z is now available for copying
from the Neuroprose repository. The paper is 22 pages long.
Hardcopies are not available.


      AGNOSTIC PAC-LEARNING OF FUNCTIONS ON ANALOG NEURAL NETS  

                               by
                          Wolfgang Maass                    

           Institute for Theoretical Computer Science   
         Technische Universitaet Graz, A-8010 Graz, Austria
                  email: maass at igi.tu-graz.ac.at   
 

Abstract:
This is a revised version of a preprint from May 1993. In this new
version we have added a parallelized version of the learning algorithm
LEARN whose computation time is polynomial in ALL relevant parameters.

We consider learning of real-valued functions on analog neural nets
in Haussler's refinement of Valiant's model for probably approximately
correct learning ("PAC-learning"). One commonly refers to this refined
model as "agnostic PAC-learning", since it requires no a-priori
assumptions about the structure of the learning target. The learning 
target need not even be a function, instead it may be any distribution 
of input/output pairs. In particular, arbitrary errors and imprecisions 
are permitted in the training data. Hence the setup of this model is 
well-suited for the analysis of real-world learning problems on neural 
Cnets. 
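
To make the agnostic setup concrete, here is a minimal Python sketch
(not from the paper; the distribution and the noise model are our own
illustrative choices) of training data in which the output y is not a
deterministic function of the input x:

    import random

    def draw_example():
        # One (x, y) pair from an arbitrary joint distribution D.
        # y is correlated with x but is not a function of x, and the
        # outputs carry additive noise, as the agnostic model permits.
        x = random.uniform(-1.0, 1.0)
        y = x ** 2 if random.random() < 0.8 else -x
        return x, y + random.gauss(0.0, 0.3)

    # a training sample drawn i.i.d. from D
    sample = [draw_example() for _ in range(500)]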

The goal of the learner in this model is to find a hypothesis whose
true error (resp. loss) is within epsilon of the true error of the
best hypothesis from the considered hypothesis class. In our
application to neural nets this hypothesis class consists of all
functions that can be computed on some fixed neural net N
(with arbitrary real weights).
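
In symbols (our paraphrase, stated here for the quadratic loss): with
probability at least 1 - delta over the training sample, the hypothesis
h returned by the learner must satisfy

    E_(x,y)~D [ (h(x)-y)^2 ]  <=  inf_{h' in H} E_(x,y)~D [ (h'(x)-y)^2 ]  +  epsilon .

A small Monte Carlo sketch of this success criterion (the sampler
draw_pair, the quadratic loss, and the estimate itself are illustrative
assumptions, not the paper's definitions):

    def true_loss(h, draw_pair, n_mc=100000):
        # Monte Carlo estimate of the expected quadratic loss of h under D.
        pairs = [draw_pair() for _ in range(n_mc)]
        return sum((h(x) - y) ** 2 for x, y in pairs) / n_mc

    def meets_agnostic_goal(h, best_loss, epsilon, draw_pair):
        # Success: h's true loss is within epsilon of the best loss
        # achievable by any hypothesis in the class.
        return true_loss(h, draw_pair) <= best_loss + epsilon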

We prove a positive result about agnostic PAC-learning on an arbitrary
fixed analog neural net N (with arbitrary piecewise polynomial
activation functions). We first construct for such an N a somewhat
larger neural net N' (the "learning network"). Then we exhibit a
learning algorithm LEARN that computes from the training examples an
assignment of rational numbers to the weights of N' such that, with
high probability, the true error of the function computed by N' with
these weights is within epsilon of that of the best hypothesis that is
computable on N (with arbitrary real weights).
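
The construction of N' and the algorithm LEARN itself are the technical
heart of the paper and are not reproduced here. Purely as an
illustration of the general shape of such a weight-fitting step, and
emphatically not as the paper's procedure, one may picture an
exhaustive search over a finite grid of rational candidate weights that
returns a minimizer of the error on the training examples (the
interface names below are hypothetical):

    from itertools import product

    def fit_weights_sketch(train, evaluate_net, candidate_values, n_weights):
        # Toy stand-in: try every assignment of candidate (rational)
        # weights and keep the one with the smallest quadratic error on
        # the training sample. evaluate_net(weights, x) is assumed to
        # compute the output of the learning network for that assignment.
        best_weights, best_err = None, float("inf")
        for weights in product(candidate_values, repeat=n_weights):
            err = sum((evaluate_net(weights, x) - y) ** 2
                      for x, y in train) / len(train)
            if err < best_err:
                best_weights, best_err = weights, err
        return best_weights

Since the candidate assignments are independent of one another, a
search of this shape also parallelizes trivially; the parallelization
of LEARN in the paper is, again, obtained by different means.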

If the number of gates in N is regarded as a constant, then the
computation time of the learning algorithm LEARN is polynomial (in all
other relevant parameters). In addition, one can parallelize this
learning algorithm in such a way that its computation time is polynomial
in ALL relevant parameters, including the number of gates in N.

It should be noted that, in contrast to common learning algorithms such
as backpropagation, the learning algorithm LEARN is guaranteed to solve
its learning task (with high probability).

As part of this positive learning result we show that the pseudo-
dimension of neural nets with piecewise polynomial activation functions
can be bounded by a polynomial in the number of weights of the neural
net. It has previously been shown by Haussler that the pseudo-dimension
is the appropriate generalization of the VC-dimension for learning
real-valued functions. With the help of our upper bound on the pseudo-
dimension of neural nets with piecewise polynomial activation functions
we can bound the number of training examples that are needed for
agnostic PAC-learning. In this way one can reduce the task of minimizing
the true error of the neural net to the finite task of minimizing its
apparent error (i.e. its error on the training examples).
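
For orientation, sample bounds obtained in this way have the familiar
Haussler-style shape m = O( (1/epsilon^2) * ( d * log(1/epsilon) +
log(1/delta) ) ), where d is the pseudo-dimension; the constant and the
exact form used in the paper may differ. A small sketch (the constant c
is a placeholder, not the paper's bound):

    import math

    def sample_size_estimate(pseudo_dim, epsilon, delta, c=64.0):
        # Rough Haussler-style sample size; c and the exact form are
        # placeholders, not the bound proved in the paper.
        return math.ceil((c / epsilon ** 2) *
                         (pseudo_dim * math.log(1.0 / epsilon)
                          + math.log(1.0 / delta)))

    def apparent_error(h, train):
        # The "apparent error": average quadratic loss on the training set.
        return sum((h(x) - y) ** 2 for x, y in train) / len(train)

    # e.g. sample_size_estimate(pseudo_dim=200, epsilon=0.1, delta=0.05)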


