Combining (averaging) NNs

Sherif Hashem shashem at ecn.purdue.edu
Sat Aug 21 18:08:11 EDT 1993


I have recently joined Connectionists, and I have read some of the email
messages debating the combining/averaging of NNs.  Unfortunately, I missed
the earlier discussion that started this thread.

I am interested in combining NNs; in fact, my Ph.D. thesis is on optimal
linear combinations of NNs.
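
For those who have not seen the idea written down, here is a minimal sketch
of one way to compute MSE-optimal combination weights: the outputs of the
trained networks are treated as regressors, and the weights are obtained by
least squares on held-out data.  The code is illustrative only (Python/NumPy,
with hypothetical names such as mse_optimal_weights); it is not taken from
the thesis or from the papers cited below.

    import numpy as np

    def mse_optimal_weights(preds, y):
        # preds: (n_samples, n_networks) outputs of the trained networks
        #        on a held-out data set
        # y:     (n_samples,) target values
        # Returns the weight vector w minimizing ||preds @ w - y||^2.
        w, *_ = np.linalg.lstsq(preds, y, rcond=None)
        return w

    def combine(preds, w):
        # Linear combination of the network outputs with the given weights.
        return preds @ w

With the weights fixed at 1/k (k being the number of networks), this reduces
to the simple averaging that has been discussed on the list.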

Averaging a number of estimators has been suggested, debated, and examined in
the literature for a long time, dating back at least to 1818 (Laplace 1818).
Clemen (1989) cites more than 200 papers in his review of the literature on
combining forecasts (estimators), including contributions from the
forecasting, psychology, statistics, and management science literatures.

Numerous empirical studies have been conducted to assess the benefits and
limitations of combining estimators (Clemen 1989).  In addition, quite a few
analytical results have been established in the area.  Most of these studies
and results are in the forecasting literature (more than 100 publications in
the last 20 years).

I think it is fair to say that, as long as no "absolute" best estimator can
be identified, combining estimators may provide a superior alternative to
picking the apparently best one from a population of estimators.
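
As a toy illustration of that point (my own hedged example, not a result from
any of the papers cited below): if several unbiased estimators carry
independent noise, their simple average typically attains a lower mean squared
error than whichever single estimator happens to look best.

    import numpy as np

    rng = np.random.default_rng(0)
    truth = np.sin(np.linspace(0.0, np.pi, 200))    # target function values
    k = 5                                           # number of estimators
    estimates = truth + 0.2 * rng.standard_normal((k, truth.size))

    mse = ((estimates - truth) ** 2).mean(axis=1)   # MSE of each estimator
    avg_mse = ((estimates.mean(axis=0) - truth) ** 2).mean()

    print("best single estimator MSE:", mse.min())
    print("simple average MSE:       ", avg_mse)    # usually much smaller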

I have published some preliminary results on the benefits of combining NNs
(Hashem & Schmeiser 1992, 1993a; Hashem et al. 1993b), and based on this
experience I join Michael Perrone in advocating combinations of NNs as a way
to enhance the estimation accuracy of NN-based models.


Sherif Hashem
email: shashem at ecn.purdue.edu


References:
-----------

Clemen, R.T. (1989). Combining Forecasts: A Review and Annotated Bibliography.
        International Journal of Forecasting, Vol. 5, pp. 559-583.

Hashem, S., Y. Yih, & B. Schmeiser (1993b). An Efficient Model
       for Product Allocation using Optimal Combinations of
       Neural Networks. In Intelligent Engineering Systems through
       Artificial Neural Networks, Vol. 3, C. Dagli, L. Burke, B. Fernandez,
       & J. Ghosh (Eds.), ASME Press, forthcoming.

Hashem, S., & B. Schmeiser (1993a). Approximating a Function
       and its Derivatives using MSE-Optimal Linear Combinations of 
       Trained Feedforward Neural Networks. Proceedings of the
       World Congress on Neural Networks, Lawrence Erlbaum
       Associates, New Jersey, Vol. 1, pp. 617-620.

Hashem, S., & B. Schmeiser (1992). Improving Model Accuracy using Optimal 
        Linear Combinations of Trained Neural Networks, Technical Report 
        SMS92-16, School of Industrial Engineering, Purdue University.
        (Submitted)

Laplace, P.S. de (1818). Deuxieme Supplement a la Theorie Analytique
        des Probabilites. Courcier, Paris; reprinted (1847) in Oeuvres
        Completes de Laplace, Vol. 7, Gauthier-Villars, Paris, pp. 531-580.


