Modified PCA Model
Sam Roweis
roweis at cns.caltech.edu
Thu Sep 11 02:21:50 EDT 1997
Paper Available Online
------------------------
For those interested in modifications to basic principal component
analysis (PCA) models, especially those which define a proper
probability model in the data space, I am making available a paper
which will appear in NIPS'97. This paper introduces a new model called
_sensible_ principal component analysis (SPCA) which is closely
related to the independent work of Michael Tipping and Chris Bishop
recently announced on this list. The paper also shows how both this
new model and the regular PCA model can be learned efficiently with
an EM algorithm.
There are two basic ideas in the paper:
1) Given some data, a very simple thing to do is to build a single Gaussian
density model using the sample mean and covariance. The likelihood of
new data can be easily obtained by evaluation under this Gaussian.
PCA identifies the axes of such a Gaussian model. But if we discard
some of the principal components by projecting the data onto the
leading (highest-variance) subspace, we no longer have a proper
density model. (The squared
distance of new data from the subspace is one measure of "novelty"
but it is insensitive to translations within the principal subspace.)
We can recapture a proper probability model by replacing all of
the unwanted principal components with a single spherical noise model
whose scale can be estimated from the data without ever having to find
the unwanted components. Because this noise model is characterized by
only a single scalar, such a model is almost as economical as regular
PCA but, like factor analysis, defines a proper probability density
model (a sketch of the resulting model appears after point 2 below).
2) Traditional methods for finding the first few principal components
of a dataset can be quite costly when there are many datapoints and/or
the data space is of very high dimension. These methods can also
fail when the sample covariance matrix is not of full rank and cannot
deal easily with missing data values.
Using the EM algorithms presented in the paper to fit a regular PCA
model or an SPCA model can be significantly more efficient, and can
require less data, than other techniques. These algorithms also deal
naturally with missing values, even when every datapoint has some
missing coordinates (a sketch of the basic iteration follows below).
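
To make point 1 concrete, here is one way to write the resulting
density model. The notation is mine (C holds the k retained
components, y the low-dimensional coordinates, sigma^2 the spherical
noise variance), and the sigma^2 formula is the usual
maximum-likelihood estimate for this kind of model rather than a
quotation from the paper:

  x = C y + v, \quad y \sim \mathcal{N}(0, I), \quad
      v \sim \mathcal{N}(0, \sigma^2 I)
  \Rightarrow \quad x \sim \mathcal{N}(\mu,\; C C^\top + \sigma^2 I),
  \qquad \sigma^2 = \frac{\mathrm{tr}(S) - \sum_{j=1}^{k} \lambda_j}{d - k}

Here S is the sample covariance, \lambda_1,...,\lambda_k are the
retained eigenvalues, d is the data dimension and k the number of
retained components. Estimating sigma^2 needs only the trace of S and
the retained eigenvalues, which is why the unwanted components never
have to be found, and the full-rank covariance C C^\top + \sigma^2 I
assigns a proper likelihood to new data.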
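
For point 2, the following is a minimal sketch (in present-day
Python/NumPy, purely illustrative and not the code from the paper) of
the EM iteration for the regular PCA case with complete data: the
E-step projects the data onto the current subspace estimate, and the
M-step re-fits the subspace from those projections.

  import numpy as np

  def em_pca(Y, k, n_iters=100):
      """Fit a k-dimensional principal subspace to the columns of Y
      (one datapoint per column). Illustrative sketch only."""
      d, N = Y.shape
      Y = Y - Y.mean(axis=1, keepdims=True)   # remove the sample mean
      C = np.random.randn(d, k)               # random initial basis
      for _ in range(n_iters):
          # E-step: least-squares projection of the data onto span(C)
          X = np.linalg.solve(C.T @ C, C.T @ Y)
          # M-step: new basis that best reconstructs Y from X
          C = Y @ X.T @ np.linalg.inv(X @ X.T)
      return C   # columns span the leading principal subspace

The columns of the result span the leading principal subspace but are
not themselves orthonormal eigenvectors; orthogonalizing the returned
basis and running ordinary PCA in the resulting k-dimensional
coordinates recovers them. Handling missing values and the SPCA noise
term requires the fuller E- and M-steps given in the paper.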
EM ALGORITHMS FOR PCA AND SPCA
(to appear in NIPS'97)
Sam Roweis
Computation and Neural Systems 139-74
California Institute of Technology
ftp://hope.caltech.edu/pub/roweis/Empca/empca.ps or
ftp://hope.caltech.edu/pub/roweis/Empca/empca.ps.gz
Abstract
I present an expectation-maximization (EM) algorithm for principal
component analysis (PCA). The algorithm allows a few eigenvectors and
eigenvalues to be extracted from large collections of high dimensional
data. It is computationally efficient and does not require computing
the sample covariance of the data. It also naturally accommodates
missing information. I also introduce a new variation of PCA known as
{\em sensible} principal component analysis (SPCA) which defines a
proper density model in the data space. Learning for SPCA is also done
with an EM algorithm. I include results of simulations showing that
these EM algorithms correctly and efficiently find the leading
eigenvectors of the covariance of datasets in a few iterations using
up to thousands of datapoints in hundreds of dimensions.
Sam
roweis at cns.caltech.edu http://hope.caltech.edu/roweis