Multiple Models, Committee of nets etc...
Michael P. Perrone
mpp at cns.brown.edu
Thu Jul 29 03:27:21 EDT 1993
Tom Dietterich writes:
> This analysis predicts that using a committee of very diverse
> algorithms (i.e., having diverse approximation errors) would yield
> better performance (as long as the committee members are competent)
> than a committee made up of a single algorithm applied multiple times
> under slightly varying conditions.
and David Wolpert writes:
>There is a good deal of heuristic and empirical evidence supporting
>this claim. In general, when using stacking to combine generalizers,
>one wants them to be as "orthogonal" as possible, as Tom maintains.
One minor result from my thesis shows that when the estimators are
orthogonal in the sense that

	E[n_i(x) n_j(x)] = 0 for all i != j

where n_i(x) = f(x) - f_i(x), f(x) is the target function, f_i(x) is
the i-th estimator, and the expected value is over the underlying
distribution, then the MSE of the average estimator is 1/N times the
average of the MSEs of the individual estimators, where N is the
number of estimators in the population.  (The cross terms
E[n_i(x) n_j(x)] vanish under orthogonality, so
E[((1/N) sum_i n_i(x))^2] = (1/N^2) sum_i E[n_i(x)^2].)
This is a shocking result because all we have to do to get arbitrarily
good performance is to increase the size of our estimator population!
Of course, in practice the nets are correlated and the result no
longer holds.
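A quick numerical sketch of the claim (my own illustration, not code
from the thesis): draw errors n_i(x) as independent zero-mean
Gaussians, which makes them orthogonal in expectation, and compare the
MSE of the averaged estimator with 1/N times the average individual
MSE.

```python
# Simulate N estimators whose errors n_i(x) are independent zero-mean
# Gaussians, so E[n_i n_j] = 0 for i != j (the orthogonality condition).
import random

random.seed(0)
N = 20           # number of estimators in the population
M = 100_000      # number of sample points x

avg_mse_individual = 0.0   # average of the individual estimators' MSEs
mse_of_average = 0.0       # MSE of the averaged estimator

for _ in range(M):
    # independent errors -> orthogonal in expectation
    errors = [random.gauss(0.0, 1.0) for _ in range(N)]
    avg_mse_individual += sum(e * e for e in errors) / N
    mean_error = sum(errors) / N   # error of the average estimator
    mse_of_average += mean_error * mean_error

avg_mse_individual /= M
mse_of_average /= M

print(mse_of_average)              # close to 1/N = 0.05
print(avg_mse_individual / N)      # close to 0.05 as well
```

With correlated errors the cross terms no longer cancel, and the
1/N reduction degrades toward no improvement at all.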
Michael
--------------------------------------------------------------------------------
Michael P. Perrone Email: mpp at cns.brown.edu
Institute for Brain and Neural Systems Tel: 401-863-3920
Brown University Fax: 401-863-3934
Providence, RI 02912