Two papers on probabilistic approximations to Bayesian networks

Volker Tresp Volker.Tresp at mchp.siemens.de
Fri Oct 1 09:55:33 EDT 1999

We would like to announce the availability of two papers.

In the first paper we apply different mixture approximations to
Bayesian networks, with the goal of obtaining insight into the modeled
domain. One of the approximations is obtained via a mixture of mean
field solutions. That approach, together with a general formulation of
the mean field approximation applicable to arbitrary Bayesian
networks, can be found in the second paper.


-------------------------------------------------------------------------
              Mixture Approximations to Bayesian Networks

               Volker Tresp, Michael Haft and Reimar Hofmann
                    Siemens AG, Corporate Technology
                          Neural Computation 
                  Dept. Information and Communications
                  Otto-Hahn-Ring 6, 81730 Munich, Germany 


Published in Laskey, K. B., Prade, H. (Eds.), Uncertainty in Artificial
Intelligence: Proceedings of the Fifteenth Conference, Morgan Kaufmann
Publishers, 1999, pp. 639-646.


Structure and parameters in a Bayesian network uniquely specify the
probability distribution of the modeled domain. The locality of both
structure and probabilistic information is a great benefit of Bayesian
networks: the modeler need only specify local information. On the
other hand, this locality of information might prevent the modeler
(and even more so any other person) from obtaining a general overview
of the important relationships within the domain. The goal of the work
presented in this paper is to provide an "alternative" view of the
knowledge encoded in a Bayesian network, which can be helpful for
gaining insight into the underlying domain. The basic idea is to
calculate a mixture approximation to the probability distribution
represented by the Bayesian network. The mixture component densities
can be thought of as representing typical scenarios implied by the
Bayesian model, providing intuition about the basic relationships. As
an additional benefit, performing inference in the approximate model
is simple and intuitive and can provide further insight. The
computational complexity of calculating the mixture approximations
depends critically on the measure that defines the distance between
the probability distribution represented by the Bayesian network and
the approximate distribution. Both the KL-divergence and the backward
KL-divergence lead to inefficient algorithms; the latter is used in
recent work on mixtures of mean field solutions, to which the work
presented here is closely related. We show, however, that using a
mean squared error cost function leads to update equations which can
be solved using the junction tree algorithm. We conclude that the mean
squared error cost function can be used for Bayesian networks in which
inference based on the junction tree is tractable. For large networks,
however, one may have to rely on mean field approximations.
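
For reference, the two distance measures contrasted above are the
forward and backward KL-divergences (standard definitions, not taken
from the paper; p is the network's distribution, q the approximation):

    KL(p||q) = \sum_x p(x) \log [ p(x)/q(x) ]    (forward)
    KL(q||p) = \sum_x q(x) \log [ q(x)/p(x) ]    (backward)

The mean squared error cost is presumably of the form
\sum_x ( p(x) - q(x) )^2.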

http://www7.informatik.tu-muenchen.de/~hofmannr/uai99_abstr.html
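
As a toy illustration of the idea (not the authors' algorithm: the
network, the number of components, and the use of scipy's generic
optimizer are all our own choices), the following Python sketch fits a
two-component mixture of fully factorized distributions to the
enumerable joint of a three-node binary network by minimizing the
squared error:

import itertools
import numpy as np
from scipy.optimize import minimize

# Toy network A -> B, A -> C over binary variables (made-up CPTs).
pA = np.array([0.3, 0.7])                    # P(A)
pB_A = np.array([[0.9, 0.1], [0.2, 0.8]])    # P(B|A), rows indexed by A
pC_A = np.array([[0.7, 0.3], [0.4, 0.6]])    # P(C|A)

states = list(itertools.product([0, 1], repeat=3))
p = np.array([pA[a] * pB_A[a, b] * pC_A[a, c] for a, b, c in states])

K = 2  # number of mixture components (an arbitrary choice here)

def unpack(theta):
    # Softmax for mixture weights, sigmoid for Bernoulli means.
    w = np.exp(theta[:K]); w /= w.sum()
    mu = 1.0 / (1.0 + np.exp(-theta[K:].reshape(K, 3)))
    return w, mu

def q(theta):
    # Mixture of fully factorized binary distributions,
    # evaluated on all 2^3 states.
    w, mu = unpack(theta)
    out = np.zeros(len(states))
    for k in range(K):
        for i, s in enumerate(states):
            out[i] += w[k] * np.prod(np.where(s, mu[k], 1 - mu[k]))
    return out

def cost(theta):
    # Squared-error distance between the network joint and the mixture.
    return np.sum((p - q(theta)) ** 2)

rng = np.random.default_rng(0)
res = minimize(cost, x0=rng.normal(size=K + 3 * K))
w, mu = unpack(res.x)
print("mixture weights:", np.round(w, 3))
print("component means:", np.round(mu, 3))

Each row of the fitted component means can be read off as one
"typical scenario" in the sense of the abstract.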

-----------------------------------------------------------------------
-----------------------------------------------------------------------



        Model-Independent Mean Field Theory as a Local Method for
                 Approximate Propagation of Information

                   M. Haft, R. Hofmann and V. Tresp
         Corporate Technology, Department: Information and Communications
                Siemens AG, 81730 Munich, Germany

Published in Network: Computation in Neural Systems, 10, 1999,
pp. 93-105 (based on a 1997 technical report).

We present a systematic approach to mean field theory (MFT) in a
general probabilistic setting without assuming a particular model.
The mean field equations derived here may serve as a local, and thus
very simple, method for approximate inference in probabilistic models
such as Boltzmann machines or Bayesian networks. Our approach is
'model-independent' in the sense that we do not assume a particular
type of dependency; in a Bayesian network, for example, we allow
arbitrary conditional probability tables. In general, there are
multiple solutions to the mean field equations. We show that improved
estimates can be obtained by forming a weighted mixture of the
multiple mean field solutions, and we give simple approximate
expressions for the mixture weights. The general formalism is then
evaluated for the special case of Bayesian networks. The benefits of
taking multiple solutions into account are demonstrated by using MFT
for inference in a small and in a very large Bayesian network, and
the results are compared with exact results.

http://www7.informatik.tu-muenchen.de/~hofmannr/network99_abstr.html
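
As a concrete anchor, here is the textbook mean field fixed-point
iteration for a small Boltzmann machine, written in Python (a standard
special case, not the paper's general model-independent derivation;
the couplings and biases below are arbitrary illustrative values):

import numpy as np

def mean_field(W, b, m0, n_iter=500):
    # Standard fixed-point iteration m_i <- tanh(sum_j W_ij m_j + b_i)
    # for +/-1 units; converges to one of possibly several solutions.
    m = m0.copy()
    for _ in range(n_iter):
        m = np.tanh(W @ m + b)
    return m

rng = np.random.default_rng(1)
n = 5
W = rng.normal(scale=0.8, size=(n, n))
W = (W + W.T) / 2            # symmetric couplings
np.fill_diagonal(W, 0.0)     # no self-couplings
b = rng.normal(scale=0.1, size=n)

# Different starting points may reach different fixed points.
for seed in range(3):
    m0 = np.random.default_rng(seed).uniform(-1, 1, size=n)
    print(np.round(mean_field(W, b, m0), 3))

This multiplicity of solutions is exactly what the paper's weighted
mixture of mean field solutions is designed to exploit.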

-----------------------------------------------------------------------

