Modified PCA Model
Sam Roweis
roweis at cns.caltech.edu
Thu Sep 11 02:21:50 EDT 1997
Paper Available Online
------------------------
For those interested in modifications to basic principal component
analysis (PCA) models, especially those which define a proper
probability model in the data space, I am making available a paper
which will appear in NIPS'97. This paper introduces a new model called
_sensible_ principal component analysis (SPCA) which is closely
related to the independent work of Michael Tipping and Chris Bishop
recently announced on this list. The paper also shows how both this
new model and the regular PCA model can be learned efficiently with
an EM algorithm.
There are two basic ideas in the paper:
1) Given some data, a very simple thing to do is to build a single Gaussian
density model using the sample mean and covariance. The likelihood of
new data can be easily obtained by evaluation under this Gaussian.
PCA identifies the axes of such a Gaussian model. But if we discard
some of the principal components by projecting the data onto the
leading (highest-variance) subspace, we no longer have a proper
density model. (The squared
distance of new data from the subspace is one measure of "novelty"
but it is insensitive to translations within the principal subspace.)
We can recapture a proper probability model by replacing all of
the unwanted principal components with a single spherical noise model
whose scale can be estimated from the data without ever having to find
the unwanted components. Because this noise model is characterized by
only a single scalar, such a model is almost as economical as regular
PCA but, like factor analysis, defines a proper probability density
model (a sketch of the resulting model appears after point 2 below).
2) Traditional methods for finding the first few principal components
of a dataset can be quite costly when there are many datapoints and/or
the data space is of very high dimension. These methods can also
fail when the sample covariance matrix is not of full rank and cannot
deal easily with missing data values.
Using the EM algorithms presented in the paper to fit a regular PCA
model or an SPCA model can be significantly more efficient, and can
require less data, than other techniques. These algorithms also deal
naturally with missing values, even when every datapoint has some
missing coordinates (a sketch of the basic iteration follows below).
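
To make point 1 concrete, here is one way to write the resulting
density model. The notation is mine (C holds the k retained
components, y the low-dimensional coordinates, sigma^2 the spherical
noise variance), and the sigma^2 formula is the usual
maximum-likelihood estimate for this kind of model rather than a
quotation from the paper:

  x = C y + v, \quad y \sim \mathcal{N}(0, I), \quad
      v \sim \mathcal{N}(0, \sigma^2 I)
  \Rightarrow \quad x \sim \mathcal{N}(\mu,\; C C^\top + \sigma^2 I),
  \qquad \sigma^2 = \frac{\mathrm{tr}(S) - \sum_{j=1}^{k} \lambda_j}{d - k}

Here S is the sample covariance, \lambda_1,...,\lambda_k are the
retained eigenvalues, d is the data dimension and k the number of
retained components. Estimating sigma^2 needs only the trace of S and
the retained eigenvalues, which is why the unwanted components never
have to be found, and the full-rank covariance C C^\top + \sigma^2 I
assigns a proper likelihood to new data.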
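
For point 2, the following is a minimal sketch (in present-day
Python/NumPy, purely illustrative and not the code from the paper) of
the EM iteration for the regular PCA case with complete data: the
E-step projects the data onto the current subspace estimate, and the
M-step re-fits the subspace from those projections.

  import numpy as np

  def em_pca(Y, k, n_iters=100):
      """Fit a k-dimensional principal subspace to the columns of Y
      (one datapoint per column). Illustrative sketch only."""
      d, N = Y.shape
      Y = Y - Y.mean(axis=1, keepdims=True)   # remove the sample mean
      C = np.random.randn(d, k)               # random initial basis
      for _ in range(n_iters):
          # E-step: least-squares projection of the data onto span(C)
          X = np.linalg.solve(C.T @ C, C.T @ Y)
          # M-step: new basis that best reconstructs Y from X
          C = Y @ X.T @ np.linalg.inv(X @ X.T)
      return C   # columns span the leading principal subspace

The columns of the result span the leading principal subspace but are
not themselves orthonormal eigenvectors; orthogonalizing the returned
basis and running ordinary PCA in the resulting k-dimensional
coordinates recovers them. Handling missing values and the SPCA noise
term requires the fuller E- and M-steps given in the paper.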
EM ALGORITHMS FOR PCA AND SPCA
(to appear in NIPS'97)
Sam Roweis
Computation and Neural Systems 139-74
California Institute of Technology
ftp://hope.caltech.edu/pub/roweis/Empca/empca.ps or
ftp://hope.caltech.edu/pub/roweis/Empca/empca.ps.gz
Abstract
I present an expectation-maximization (EM) algorithm for principal
component analysis (PCA). The algorithm allows a few eigenvectors and
eigenvalues to be extracted from large collections of high dimensional
data. It is computationally efficient and does not require computing
the sample covariance of the data. It also naturally accommodates
missing information. I also introduce a new variation of PCA known as
{\em sensible} principal component analysis (SPCA) which defines a
proper density model in the data space. Learning for SPCA is also done
with an EM algorithm. I include results of simulations showing that
these EM algorithms correctly and efficiently find the leading
eigenvectors of the covariance of datasets in a few iterations using
up to thousands of datapoints in hundreds of dimensions.
Sam
roweis at cns.caltech.edu http://hope.caltech.edu/roweis