Efficient Learning

John B. Hampshire II hamps@richibucto.jpl.nasa.gov
Mon Jan 10 20:37:05 EST 1994


       A Differential Theory of Learning for Efficient
              Statistical Pattern Recognition
	      

	           J. B. Hampshire II
	   Jet Propulsion Laboratory, M/S 238-420
             California Institute of Technology
	          4800 Oak Grove Drive
	        Pasadena, CA  91109-8099
	         hamps@bvd.jpl.nasa.gov


                        ABSTRACT
                        --------
There is more to learning stochastic concepts for robust statistical
pattern recognition than the learning itself:  computational
resources must be allocated and information must be obtained.
Therein lies the key to a learning strategy that is efficient,
requiring the fewest resources and the least information
necessary to produce classifiers that generalize well.
Probabilistic learning strategies currently used with connectionist
(as well as most traditional) classifiers are often inefficient,
requiring high classifier complexity and large training sample sizes
to ensure good generalization.  An asymptotically efficient
**differential learning strategy** is set forth.  It guarantees
the best generalization allowed by the choice of classifier paradigm
as long as the training sample size is large; this guarantee also
holds for small training sample sizes when the classifier is an
"improper parametric model" of the data (as it often is).
Differential learning requires a classifier with the minimum
functional complexity necessary --- under a broad range of
accepted complexity measures --- for Bayesian (i.e., minimum
probability-of-error) discrimination.
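
The abstract does not write the competing objectives down, so as a
rough, hypothetical illustration of the distinction, the Python
sketch below contrasts a difference-based objective (a smooth,
monotone function of the margin separating the output for the true
class from its strongest rival) with a conventional probabilistic
one (softmax cross-entropy).  The sigmoid form, the scale
parameter, and the function names are assumptions of this sketch,
not the formulation in the thesis.

  import numpy as np

  def differential_loss(outputs, label, scale=4.0):
      # Differential (discriminative) objective: a smooth, monotone
      # function of the *difference* between the output for the true
      # class and its strongest competitor.  Once the correct class
      # merely out-ranks the rest, the loss is near zero, so the
      # classifier need only be complex enough to discriminate.
      # (Illustrative form only; the thesis's objective may differ.)
      rivals = np.delete(outputs, label)
      margin = outputs[label] - np.max(rivals)
      return 1.0 - 1.0 / (1.0 + np.exp(-scale * margin))

  def probabilistic_loss(outputs, label):
      # Probabilistic objective (softmax cross-entropy): drives the
      # outputs toward the full class posteriors, demanding more of
      # the model than correct ranking alone.
      z = outputs - np.max(outputs)             # numerical stability
      log_post = z - np.log(np.sum(np.exp(z)))
      return -log_post[label]

  # Outputs that already rank class 0 first: the differential loss
  # is nearly zero, while cross-entropy still presses for confidence.
  y = np.array([0.9, 0.4, 0.1])
  print(differential_loss(y, 0))   # ~0.12: ranking is correct
  print(probabilistic_loss(y, 0))  # ~0.72: posterior fit still poor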


The theory is demonstrated in several real-world
machine learning/pattern recognition tasks associated with
Fisher's Iris data, optical character recognition,
medical diagnosis, and airborne remote sensing imagery
interpretation.  These applications focus on the implementation
of differential learning and illustrate its advantages
and limitations in a series of experiments that complement
the theory.  The experiments demonstrate that differentially
generated classifiers consistently generalize better than their
probabilistically generated counterparts across a wide
range of real-world learning-and-classification tasks.
The discrimination improvements range from moderate to significant,
depending on the statistical nature of the learning
task and its relationship to the functional basis of the classifier
used.


============================================================

RETRIEVING DOCUMENTS:

To obtain a list of the materials/documents that can be retrieved
electronically, use anonymous ftp as follows (the IP address of
speech1.cs.cmu.edu is 128.2.254.145):

>  ftp speech1.cs.cmu.edu
>  user: anonymous
>  passwd: <your_name@your_host>
>  cd /usr0/hamps/public
>  get README

Read the file README and choose what you want to retrieve.
All files are in /usr0/hamps/public and /usr0/hamps/public/thesis.
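
For those who prefer to script the retrieval, here is a minimal
Python equivalent of the session above, using the standard ftplib
module; the host, paths, and file names are copied from the
instructions above and may no longer be reachable.

  from ftplib import FTP

  # Scripted version of the manual anonymous-ftp session above.
  ftp = FTP("speech1.cs.cmu.edu")          # 128.2.254.145
  ftp.login(user="anonymous", passwd="your_name@your_host")
  ftp.cwd("/usr0/hamps/public")
  with open("README", "wb") as fh:
      ftp.retrbinary("RETR README", fh.write)
  ftp.quit()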
I welcome your comments and constructive criticism.
Happy reading.


-JBH2


