committee's
Lars Kai Hansen
lars at eiffel.ei.dth.dk
Fri Jul 30 04:33:31 EDT 1993
It is great that attention is focussed on the effective use
of solution space samples for non-linear models.
Allow me to promote our pre-historic work on network voting:
NEURAL NETWORK ENSEMBLES by L.K. Hansen and P. Salamon
IEEE Trans. Pattern Analysis and Machine Intell. {\bf 12}, 993-1001, (1990)
Besides finding experimentally that the ensemble consensus is often
'better than the best', we derived expressions
for the ensemble error rate under different assumptions
about error correlations. The key invention is to describe the ensemble by its
'difficulty distribution'. This description was inspired by earlier work on
so-called 'N-version programming' by Eckhardt and Lee:
A THEORETICAL BASIS FOR THE ANALYSIS OF MULTIVERSION SOFTWARE
SUBJECT TO COINCIDENT ERRORS by D.E. Eckhardt and L.D. Lee
IEEE Trans. Software Eng. {\bf 11} 1511-1517 (1985)
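To illustrate the simplest of these assumptions, the sketch below computes
the chance that a majority of N networks is wrong when each network errs
independently with the same probability p. This is only the textbook
binomial case, not the difficulty-distribution expressions from the paper;
the function name and the example numbers are made up for illustration.

```python
from math import comb

def majority_error(p, n):
    """Probability that more than half of n voters are wrong.

    Assumes (hypothetically) that every voter errs independently with
    the same probability p. Use odd n to avoid ties.
    """
    first_losing_count = n // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(first_losing_count, n + 1)
    )

# Illustrative values only: individual error rate 0.2, growing ensemble size.
for n in (1, 3, 7):
    print(n, majority_error(0.2, n))
```

For p below 1/2 the majority error shrinks as N grows; for p above 1/2 it
grows. Correlated errors weaken the effect, which is why the independence
assumption matters.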
In a feasibility study on handwritten digits, the viability of
voting among small ensembles was confirmed (the consensus outperformed
the best individual by 25%) and the theoretical
estimate of ensemble performance was found to agree well with
the observed performance. Further, the work of Schwartz et al.
[Neural Computation {\bf 2}, 371-382 (1990)] was applied to estimate the
learning curve based on the distribution of generalizations of
a small ensemble:
ENSEMBLE METHODS FOR HANDWRITTEN DIGIT RECOGNITION by
L.K. Hansen, Chr. Liisberg, and P. Salamon
In proceedings of The Second IEEE Workshop on Neural
Networks for Signal Processing: NNSP'92 Eds. S.Y. Kung et al.,
IEEE Service Center Piscataway, 333-342, (1992)
While I refer to these methods as *ensemble* methods
(to emphasize the statistical relation and to invoke
associations to artistic ensembles), I note that theorists have
reserved *committee machines* for a special, constrained,
network architecture (see e.g. Schwarze and Hertz
[Euro.Phys.Lett. {\bf 20}, 375-380, (1992)]).
In the theorists' committee (TC), all weights from the hidden units to the output are fixed
to unity during training. This is very different from voting
among independently trained networks: while the TC explores
the function space of a large set of parameters (hence needs
very many training examples), a voting system based on independently
trained nets only explores the function space of the individual
network. The voting system can improve generalization by reducing
'random' errors due to training algorithms etc.
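As a minimal sketch of such a voting system, assuming each independently
trained network has already produced one class label per test pattern, the
consensus can be taken by plurality. The function name, the tie-breaking
rule, and the toy labels are assumptions for illustration only.

```python
from collections import Counter

def plurality_vote(predictions):
    """Combine label predictions from several networks by plurality.

    predictions: one list of labels per network, all of equal length.
    Ties go to the label that appears first among the tied votes,
    an arbitrary choice made for this sketch.
    """
    consensus = []
    for votes in zip(*predictions):  # all votes cast for one test pattern
        label, _count = Counter(votes).most_common(1)[0]
        consensus.append(label)
    return consensus

# Toy example: three networks, three digit patterns with true labels 3, 1, 7.
net_a = [3, 1, 7]
net_b = [3, 2, 7]  # errs on the second pattern
net_c = [5, 1, 7]  # errs on the first pattern
print(plurality_vote([net_a, net_b, net_c]))  # [3, 1, 7]
```

Here the two individual errors fall on different patterns, so each is
outvoted; errors shared by most of the networks would survive the vote.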
---------------------
Lars Kai Hansen, Tel: (+45) 4593 1222 (tone) 3889
CONNECT, Electronics Institute B349 Fax: (+45) 4288 0117
Technical University of Denmark email: lars at eiffel.ei.dth.dk
DK-2800 Lyngby DENMARK