Bayes nets for classification, compression and error-correction

Brendan J. Frey frey at cs.toronto.edu
Mon Aug 11 16:12:54 EDT 1997



                      Doctoral dissertation

          Bayesian Networks for Pattern Classification, 
              Data Compression, and Channel Coding

                         Brendan J. Frey
                 http://www.cs.utoronto.ca/~frey


Pattern classification, data compression, and channel coding are
tasks that usually must deal with complex but structured natural
or artificial systems. Patterns that we wish to classify are a
consequence of a causal physical process. Images that we wish to
compress are also a consequence of a causal physical process. Noisy
outputs from a telephone line are corrupted versions of a signal
produced by a structured man-made telephone modem. Not only are
these tasks characterized by complex structure, but they also
contain random elements. Graphical models such as Bayesian networks
provide a way to describe the relationships between random
variables in a stochastic system.

In this thesis, I use Bayesian networks as an overarching framework
to describe and solve problems in the areas of pattern
classification, data compression, and channel coding. Results on
the classification of handwritten digits show that Bayesian network
pattern classifiers outperform other standard methods, such as the
k-nearest neighbor method. When Bayesian networks are used as
source models for data compression, an exponentially large number
of codewords are associated with each input pattern. It turns out
that the code can still be used efficiently, if a new technique
called ``bits-back coding'' is used. Several new error-correcting
decoding algorithms are instances of ``probability propagation'' in
various Bayesian networks. These new schemes are rapidly closing
the gap between the performances of practical channel coding
systems and Shannon's 50-year-old channel coding limit. The
Bayesian network framework exposes the similarities between these
codes and leads the way to a new class of ``trellis-constraint
codes'' which also operate close to Shannon's limit.


Brendan.




More information about the Connectionists mailing list