PhD thesis available on Variational Bayes

Matthew Beal beal at cs.toronto.edu
Mon Aug 18 12:52:05 EDT 2003


Dear Connectionists,

I would like to announce my thesis, some companion Matlab software, and a
website dedicated to Variational Bayesian techniques.

o My thesis "Variational Algorithms for Approximate Bayesian Inference" is
  available from

	http://www.cs.toronto.edu/~beal/papers.html

o Software for VB Mixtures of Factor Analysers, VB Hidden Markov Models,
  and VB State Space Models (Linear Dynamical Systems) is available from

	http://www.cs.toronto.edu/~beal/software.html

o Variational-Bayes.org: a repository of papers, software, and links
  related to the use of variational methods for approximate Bayesian
  learning

	http://www.variational-bayes.org

  We welcome your feedback to help build this site.

Below are the abstract and a short table of contents of my thesis.

Cheers
-Matt

----------

Abstract:

The Bayesian framework for machine learning allows for the incorporation
of prior knowledge in a coherent way, avoids overfitting problems, and
provides a principled basis for selecting between alternative models.
Unfortunately, the computations required are usually intractable. This
thesis presents a unified variational Bayesian (VB) framework which
approximates these computations in models with latent variables using a
lower bound on the marginal likelihood.
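
For readers new to the approach: the bound referred to is the standard
variational free-energy lower bound. In one common notation (chosen here
for illustration, not quoted from the thesis; y denotes the observed
data, x the latent variables, theta the model parameters, m the model,
and q a free variational distribution) it reads

	\log p(y \mid m) \;\ge\; \mathcal{F}(q)
	    = \int q(x, \theta) \log \frac{p(y, x, \theta \mid m)}{q(x, \theta)} \, dx \, d\theta ,

with equality exactly when q(x, theta) equals the true posterior
p(x, theta | y, m), so maximising F(q) over a tractable family tightens
the approximation to the log marginal likelihood.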

Chapter 1 presents background material on Bayesian inference, graphical
models, and propagation algorithms. Chapter 2 forms the theoretical core
of the thesis, generalising the expectation-maximisation (EM) algorithm
for learning maximum likelihood parameters to the VB EM algorithm, which
integrates over model parameters. The algorithm is then specialised to the
large family of conjugate-exponential (CE) graphical models, and several
theorems are presented to pave the way for automated VB derivation
procedures in both directed and undirected graphs (Bayesian and Markov
networks, respectively).
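
To make the alternation concrete, below is a minimal Python sketch of VB
EM for a toy CE model: a mixture of unit-variance Gaussians with a
Dirichlet prior on the mixing weights and zero-mean Gaussian priors on
the component means. All names and hyperparameter values here are
assumptions of this sketch; it is not taken from the thesis or its
companion Matlab code.

  import numpy as np
  from scipy.special import digamma

  def vbem_mixture(x, K=2, iters=50, alpha0=1.0, tau0=1e-2, seed=0):
      """Illustrative VB EM for a mixture of unit-variance 1-d Gaussians."""
      rng = np.random.default_rng(seed)
      N = x.shape[0]
      r = rng.dirichlet(np.ones(K), size=N)          # q(z): responsibilities
      for _ in range(iters):
          # VB-M step: update the variational posteriors over parameters.
          Nk = r.sum(axis=0)                         # expected counts per component
          alpha = alpha0 + Nk                        # Dirichlet posterior over weights
          tau = tau0 + Nk                            # precision of Gaussian q(mu_k)
          m = (r * x[:, None]).sum(axis=0) / tau     # mean of q(mu_k), prior mean 0
          # VB-E step: responsibilities use EXPECTED sufficient statistics,
          # integrating over parameters rather than plugging in point estimates
          # as ordinary EM would.
          Elog_pi = digamma(alpha) - digamma(alpha.sum())
          Elog_lik = -0.5 * (np.log(2 * np.pi) + (x[:, None] - m) ** 2 + 1.0 / tau)
          log_r = Elog_pi + Elog_lik
          log_r -= log_r.max(axis=1, keepdims=True)  # stabilise the softmax
          r = np.exp(log_r)
          r /= r.sum(axis=1, keepdims=True)
      return alpha, m, 1.0 / tau, r

On data drawn from two well-separated Gaussians, vbem_mixture(x, K=2)
recovers posterior means m close to the true component means; running it
for a range of K and comparing the optimised lower bounds (not computed
in this short sketch) is the flavour of the model selection experiments
described below.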

Chapters 3-5 derive and apply the VB EM algorithm to three commonly used
and important models: hidden Markov models, mixtures of factor analysers,
and linear dynamical systems. It is shown how model selection tasks, such
as determining the dimensionality, cardinality, or number of variables,
can be carried out using VB approximations. Also explored are methods
for combining sampling procedures with variational approximations, to
estimate the tightness of VB bounds and to obtain more effective sampling
algorithms. Chapter 6 applies VB learning to the long-standing problem
of scoring discrete-variable directed acyclic graphs, and compares its
performance to annealed importance sampling, amongst other methods.
Throughout, the VB approximation is compared to other methods, including
sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC. The
thesis concludes with a discussion of evolving directions for model
selection including infinite models and alternative approximations to the
marginal likelihood.
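
As a rough guide to how such comparisons work (standard definitions, not
results from the thesis): VB selects the model with the highest optimised
lower bound, whereas BIC penalises the maximised log likelihood with a
complexity term,

	\hat{m}_{VB} = \arg\max_m \mathcal{F}_m , \qquad
	\log p(y \mid m) \approx \log p(y \mid \hat{\theta}_m, m) - \frac{d_m}{2} \log N \quad \text{(BIC)} ,

where F_m is the optimised bound for model m, \hat{\theta}_m its maximum
likelihood parameter estimate, d_m its number of free parameters, and N
the number of data points.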

----------

Table of Contents:

Chapter 1 Introduction
    1.1 Probabilistic inference
    1.2 Bayesian model selection
    1.3 Practical Bayesian approaches
    1.4 Summary of the remaining chapters
Chapter 2 Variational Bayesian Theory
    2.1 Introduction
    2.2 Variational methods for ML / MAP learning
    2.3 Variational methods for Bayesian learning
    2.4 Conjugate-Exponential models
    2.5 Directed and undirected graphs
    2.6 Comparisons of VB to other criteria
    2.7 Summary
Chapter 3 Variational Bayesian Hidden Markov Models
    3.1 Introduction
    3.2 Inference and learning for maximum likelihood HMMs
    3.3 Bayesian HMMs
    3.4 Variational Bayesian formulation
    3.5 Experiments
    3.6 Discussion
Chapter 4 Variational Bayesian Mixtures of Factor Analysers
    4.1 Introduction
    4.2 Bayesian Mixture of Factor Analysers
    4.3 Model exploration: birth and death
    4.4 Handling the predictive density
    4.5 Synthetic experiments
    4.6 Digit experiments
    4.7 Combining VB approximations with Monte Carlo
    4.8 Summary
Chapter 5 Variational Bayesian Linear Dynamical Systems
    5.1 Introduction
    5.2 The Linear Dynamical System model
    5.3 The variational treatment
    5.4 Synthetic experiments
    5.5 Elucidating gene expression mechanisms
    5.6 Possible extensions and future research
    5.7 Summary
Chapter 6 Learning the structure of discrete-variable graphical models
          with hidden variables
    6.1 Introduction
    6.2 Calculating marginal likelihoods of DAGs
    6.3 Estimating the marginal likelihood
    6.4 Experiments
    6.5 Open questions and directions
    6.6 Summary
Chapter 7 Conclusion
    7.1 Discussion
    7.2 Summary of contributions
Appendix A Conjugate Exponential family examples
Appendix B Useful results from matrix theory
    B.1 Schur complements and inverting partitioned matrices
    B.2 The matrix inversion lemma
Appendix C Miscellaneous results
    C.1 Computing the digamma function
    C.2 Multivariate gamma hyperparameter optimisation
    C.3 Marginal KL divergence of gamma-Gaussian variables

----------