Energy Prediction Competition

David J.C. MacKay mackay at mrao.cam.ac.uk
Sat Jun 26 12:36:00 EDT 1993


The following preprint is now available by anonymous ftp. 

***********************************************************************

Bayesian Non-linear Modeling for the Energy Prediction Competition

             David J.C. MacKay

                      University of Cambridge
                      Cavendish Laboratory
                      Madingley Road
                      Cambridge CB3 0HE
                      mackay at mrao.cam.ac.uk


Bayesian probability theory provides a unifying framework for data
modeling. A model space may include numerous control parameters which
influence the complexity of the model (for example regularisation
constants).  Bayesian methods can automatically set such parameters so
that the model becomes probabilistically well-matched to the data.

   The 1993 energy prediction competition involved the prediction of a
series of building energy loads from a series of environmental input
variables.  Non-linear regression using `neural networks' is a popular
technique for such modeling tasks.  Since it is not obvious how large
a time-window of inputs is appropriate, or what preprocessing of
inputs is best, this can be viewed as a regression problem in which
there are many possible input variables, some of which may actually be
irrelevant to the prediction of the output variable. Because a finite
data set will show random correlations between the irrelevant inputs
and the output, any conventional neural network (even with `weight
decay') will not set the coefficients for these junk inputs to zero.
Thus the irrelevant variables will hurt the model's performance.
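
A minimal sketch of the effect described above, assuming a linear model
with one shared weight-decay constant in place of a neural network. The
function name, data sizes and noise level are illustrative assumptions
and are not taken from the preprint. It fits a target that depends on
one input only, alongside several inputs that are pure noise, and
returns the fitted weights and the sample correlations of the junk
inputs with the target.

```python
import numpy as np

def junk_weights_under_weight_decay(n=50, n_junk=10, decay=1.0, seed=0):
    """Fit ridge regression (one shared decay constant) on 1 relevant + n_junk junk inputs."""
    rng = np.random.default_rng(seed)
    x_relevant = rng.normal(size=n)
    junk = rng.normal(size=(n, n_junk))          # generated independently of the target
    y = x_relevant + 0.5 * rng.normal(size=n)    # target depends on x_relevant only
    X = np.column_stack([x_relevant, junk])

    # Weight-decay (ridge) solution: the same penalty applies to every input.
    w = np.linalg.solve(X.T @ X + decay * np.eye(X.shape[1]), X.T @ y)

    # Sample correlation of each junk input with the target in this finite data set.
    corr = np.array([np.corrcoef(junk[:, j], y)[0, 1] for j in range(n_junk)])
    return w, corr

if __name__ == "__main__":
    w, corr = junk_weights_under_weight_decay()
    print("weight on relevant input:", w[0])
    print("weights on junk inputs:  ", w[1:])
    print("junk/target correlations:", corr)
```

In this sketch the shared penalty shrinks the junk weights but leaves
them non-zero, because the penalty cannot distinguish them from the
relevant weight.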

   The Automatic Relevance Determination (ARD) model puts a prior over
the regression parameters which embodies the concept of relevance.
This is done in a simple and `soft' way by introducing multiple
`weight decay' constants, one `$\alpha$' associated with each input.
Using Bayesian methods, the decay rates for junk inputs are
automatically inferred to be large, preventing those inputs from
causing significant overfitting.
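
The idea can be sketched for a linear model, where each input has a
single weight and therefore its own `$\alpha$'. This is a simplified
stand-in, not the neural-network implementation used for the
competition entry: the function names, the synthetic data, and the
clipping bounds are assumptions made for illustration. The
re-estimation rules are the usual evidence-approximation updates for a
Gaussian prior and Gaussian noise.

```python
import numpy as np

def ard_linear_regression(X, y, n_iter=50, alpha_min=1e-8, alpha_max=1e8):
    """Illustrative ARD for y = X w + noise, with one decay constant per input."""
    n, d = X.shape
    alpha = np.ones(d)   # per-input weight-decay constants (prior precisions)
    beta = 1.0           # noise precision, 1 / noise variance
    for _ in range(n_iter):
        # Gaussian posterior over the weights for the current alpha and beta.
        A = np.diag(alpha) + beta * (X.T @ X)
        Sigma = np.linalg.inv(A)
        w = beta * (Sigma @ (X.T @ y))

        # gamma[i] lies between 0 and 1 and measures how well the data,
        # rather than the prior, determines weight i.
        gamma = 1.0 - alpha * np.diag(Sigma)

        # Re-estimate the hyperparameters; the clipping bounds are an
        # assumption added here for numerical safety.
        alpha = np.clip(gamma / (w ** 2 + 1e-12), alpha_min, alpha_max)
        resid = y - X @ w
        beta = (n - gamma.sum()) / (resid @ resid)
    # w is the posterior mean from the final pass through the loop.
    return w, alpha, beta

def make_data(n=200, seed=0):
    """Synthetic data (hypothetical): inputs 0 and 1 are relevant, inputs 2..5 are junk."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 6))
    y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.normal(size=n)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    w, alpha, beta = ard_linear_regression(X, y)
    print("weights:", w)
    print("alphas: ", alpha)
    print("beta:   ", beta)
```

On data like this the decay constants for the junk inputs should come
out much larger than those for the two relevant inputs, which is the
`soft' pruning behaviour described above.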

   An entry using the ARD model won the prediction competition by a
significant margin.

***********************************************************************

The preprint "Bayesian Non-linear Modeling for the Energy Prediction Competition" 
may be obtained as follows: 

ftp 131.111.48.8
anonymous
(your name)
cd pub/mackay
binary
mget pred.*.ps.Z
quit
uncompress pred.*.ps.Z

This preprint is 24 pages long and contains a large number of figures. 
A more concise version may be released later. 

Table of contents: 

Overview of Bayesian modeling methods
	Neural networks for regression
	Neural network learning as inference
	Setting regularisation constants $\alpha$ and $\beta$
Automatic Relevance Determination
Prediction competition: part A
	The task
	Preliminaries
	Round 1
	Round 2
	Creating a committee
	Results
	Additional performance criteria
	How much did ARD help?
	How much did the use of a committee help?
Prediction competition: part B
	The task
	Preprocessing
	Time-delayed and time-filtered inputs
	Results
Summary and Discussion
	What I might have done differently
	How to better model this sort of problem

Appendix
Training data: problem A
	Omitted data
	Coding of holidays
	Pre- and post-processing


