PhD thesis available: Diversity in Neural Network Ensembles

Gavin Brown G.Brown at cs.bham.ac.uk
Tue Feb 24 11:23:09 EST 2004


Dear Connectionists,

My PhD thesis on "Diversity in Neural Network Ensembles" is now available
on the web at:

  http://www.cs.bham.ac.uk/~gxb/research/gbrown_thesis.ps.gz

Abstract:
---------------------------------------
We study the issue of error diversity in ensembles of neural networks.
In ensembles of regression estimators, the \emph{measurement} of diversity
can be formalised as the Bias-Variance-Covariance decomposition.  In
ensembles of classifiers, no comparably neat theory exists in the
literature to date.  Our objective is to understand how to precisely
define, measure, and create diverse errors in both cases.  As a focal
point we study one algorithm, \emph{Negative Correlation (NC) Learning},
which claimed, with supporting empirical evidence, to \emph{enforce}
useful error diversity, creating neural network ensembles with very
competitive performance on both classification and regression problems.
Given the lack of a solid understanding of its dynamics, we undertake a
theoretical and empirical investigation.
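
For context, the decomposition referred to (due to Ueda and Nakano;
stated here from the literature, not quoted from the thesis) reads,
for a uniformly averaged ensemble \bar{f} = \frac{1}{M}\sum_{i=1}^{M} f_i:

  E[(\bar{f} - d)^2] = \overline{bias}^2 + \frac{1}{M}\,\overline{var}
                       + \left(1 - \frac{1}{M}\right)\overline{covar}

where \overline{bias}, \overline{var} and \overline{covar} denote the
averaged bias of the members, their averaged variance, and their
averaged pairwise covariance respectively.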

In the theoretical investigation, we find that NC succeeds by
exploiting the \emph{Ambiguity decomposition} of mean squared error.  We
ground NC in the statistical context of bias, variance and covariance,
and link it to a number of other algorithms that have exploited
Ambiguity.  The discoveries we make regarding NC are not limited to
neural networks: most of our observations are in fact properties of the
mean squared error function.  NC is therefore best viewed as a
\emph{framework} rather than an algorithm in itself, meaning several
other learning techniques could make use of it.
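
For reference, the Ambiguity decomposition in question (due to Krogh
and Vedelsby; stated here from the literature, not quoted from the
thesis) says, for a uniformly averaged ensemble:

  (\bar{f} - d)^2 = \frac{1}{M}\sum_i (f_i - d)^2
                    - \frac{1}{M}\sum_i (f_i - \bar{f})^2

NC exploits the second (ambiguity) term directly: following Liu and
Yao's formulation, each member i is trained on the penalised error
e_i = \frac{1}{2}(f_i - d)^2 + \lambda\, p_i, with
p_i = (f_i - \bar{f})\sum_{j \neq i}(f_j - \bar{f}).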

We further study the configurable parameter in NC, previously thought
to be entirely problem-dependent, and find that one part of it can be
determined analytically for any ensemble architecture.  We proceed to
define an upper bound on the remaining part of the parameter, and show
considerable empirical evidence that a lower bound also exists.  As the
size of the ensemble increases, the upper and lower bounds converge,
indicating that the optimal parameter can be determined exactly.  We
describe a number of experiments with different datasets and ensemble
architectures, including the first comparisons of NC to other popular
ensemble methods; we find NC to be a competitive technique, worthy of
further application.
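
As a minimal sketch of how the penalty enters training (my own
illustration in Python, not code from the thesis; the member
predictions and the fixed penalty strength lam below are made-up
values), the per-member gradient signal under NC is:

  import numpy as np

  def nc_error_signals(preds, target, lam):
      # Plain gradient descent on 0.5*(f_i - d)^2 would use (f_i - d);
      # NC subtracts lam*(f_i - fbar), pushing each member away from
      # the ensemble mean and so encouraging decorrelated errors.
      fbar = preds.mean()              # simple-average ensemble output
      return (preds - target) - lam * (preds - fbar)

  preds = np.array([0.9, 1.2, 1.0, 0.7, 1.1])   # five members' outputs
  for lam in (0.0, 0.5):     # lam = 0 recovers independent training
      print(lam, nc_error_signals(preds, 1.0, lam))

Setting lam = 0 removes the coupling between members entirely, which
is why the choice of this coefficient, and the bounds on it, matter.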

Finally, we conclude with observations on how this investigation has
shaped our understanding of diversity in general, and note several
possible new directions suggested by this work, including links to
evolutionary computation, Mixtures of Experts, and regularisation
techniques.

---------------------------------------------


Enjoy,
-Gav



