Hyperparameters: optimise, or integrate out?
David J.C. MacKay
mackay at mrao.cam.ac.uk
Sat Jun 26 12:34:00 EDT 1993
The following preprint is now available by anonymous ftp.
========================================================================
Hyperparameters: optimise, or integrate out?
David J.C. MacKay
University of Cambridge
Cavendish Laboratory
Madingley Road
Cambridge CB3 0HE
mackay at mrao.cam.ac.uk
I examine two computational methods for the implementation of Bayesian
hierarchical models, that is, models which include unknown
hyperparameters such as regularisation constants. In the `evidence
framework' the model parameters are {\em integrated} over, and the
resulting evidence is {\em maximised} over the hyperparameters. In the
alternative `MAP' method, the `true posterior probability' is found by
{\em integrating} over the hyperparameters, and this is then {\em
maximised} over the model parameters. The similarities of the two
approaches, and their relative merits, are discussed. It is shown
that, in severely ill-posed problems, significant biases arise in the
second method.
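In symbols (a sketch in generic notation of my own choosing, which may
differ from the paper's): writing $w$ for the model parameters, $\alpha$
for a hyperparameter, and $D$ for the data, the evidence framework
computes
\[
  P(D \mid \alpha) = \int P(D \mid w)\, P(w \mid \alpha)\, dw ,
  \qquad
  \hat{\alpha} = \arg\max_{\alpha} P(D \mid \alpha) ,
\]
and then infers $w$ from $P(w \mid D, \hat{\alpha})$; the MAP method
instead forms the true posterior by integrating out $\alpha$, and then
maximises it over $w$:
\[
  P(w \mid D) = \int P(w \mid \alpha, D)\, P(\alpha \mid D)\, d\alpha ,
  \qquad
  \hat{w} = \arg\max_{w} P(w \mid D) .
\]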
========================================================================
The preprint "Hyperparameters: optimise, or integrate out?"
may be obtained as follows:
ftp 131.111.48.8
anonymous
(your name)
cd pub/mackay
binary
get alpha.ps.Z
quit
uncompress alpha.ps.Z
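For readers who prefer a scripted transfer, here is a minimal sketch
using Python's standard ftplib module (host, directory, and filename
taken from the instructions above):

    from ftplib import FTP

    # Anonymous login to the FTP server given above.
    ftp = FTP('131.111.48.8')
    ftp.login()  # user defaults to 'anonymous'

    # Change to the preprint directory and fetch the compressed
    # PostScript file in binary mode.
    ftp.cwd('pub/mackay')
    with open('alpha.ps.Z', 'wb') as f:
        ftp.retrbinary('RETR alpha.ps.Z', f.write)
    ftp.quit()

Then run uncompress alpha.ps.Z as before.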
This document is 16 pages long.
Table of contents:
Outline
Making inferences
The ideal approach
The Evidence framework
The MAP method
The effective $\alpha$ of the general MAP method
Pros and cons
In favour of the MAP method
Magnifying the differences
An example
The curvature of the true prior, and MAP error bars
Discussion
Appendices:
Conditions for the evidence approximation
Distance between probability distributions
A method for evaluating distances $D(p(t), q(t))$
What I mean by saying that the approximation `works'
Predictions
The evidence
$\sigma_N$ and $\sigma_{N-1}$