paper available: Classification / Feature Selection on Matrix Data

Thu Apr 15 11:25:54 EDT 2004

Apologies for multiple postings.

Dear Colleagues,

We would like to announce the following technical report:

"Classification, Regression, and Feature Selection on Matrix Data"

by Sepp Hochreiter and Klaus Obermayer

Abstract
---------------------------------------------------------------------
We describe a new technique for the analysis of data which is given in
matrix form. We consider two sets of objects, the ``row'' and the
``column'' objects, and we represent these objects by a matrix of
numerical values which describe their mutual relationships. We then
introduce a new technique, the ``Potential Support Vector Machine''
(P-SVM), as a large-margin based method for the construction of
classifiers and regression functions for the ``column''
objects. In contrast to standard support vector machine (SVM) approaches,
the P-SVM minimizes a scale-invariant capacity measure under a new set
of constraints. As a result, the P-SVM can handle data matrices which
are neither positive definite nor square, and leads to a usually
sparse expansion of the classification boundary or the regression
function in terms of the ``row'' rather than the ``column''
objects. We introduce two complementary regularization schemes in
order to avoid overfitting for noisy data sets. The first scheme
improves generalization performance for classification and regression
problems; the second scheme leads to the selection of a small and
informative set of ``row'' objects and can be applied to feature
selection. A fast optimization algorithm based on the ``Sequential
Minimal Optimization'' (SMO) technique is provided.

We first apply the new method to so-called pairwise data,
i.e. ``row'' and ``column'' objects are from the same set.  Pairwise
data can be represented in two ways. The first representation uses
vectorial data and constructs a Gram matrix from feature vectors using
a kernel function.  Benchmark results show that the P-SVM method
provides superior classification and regression results and has the
additional advantage that kernel functions are no longer restricted
to be positive definite. The second representation uses a measured
matrix of mutual relations between objects rather than vectorial data.
The new classification and regression method performs very well
compared to standard techniques on benchmark data sets.  More
importantly, however, experiments show that the P-SVM can be very
effectively used for feature selection. Then we apply the P-SVM to
genuine matrix data, where ``row'' and ``column'' objects are from
different sets, and, again, the data matrix is either constructed via
a kernel function combining ``row'' and ``column'' objects or
obtained by measurements.  On various benchmark data sets we
demonstrate the new method's excellent performance for classification,
regression, and feature selection problems. For both pairwise and
matrix data, benchmarks are performed not only with toy data but also
with several real-world data sets, including data from the UCI
repository, protein classification, web-page classification, and DNA
microarray data.
----------------------------------------------------------------------------
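
For readers who prefer a concrete picture of the data representations
named in the abstract, here is a minimal Python sketch. It is not the
P-SVM optimizer and is not taken from the report. The helper names
(gram_matrix, neg_sq_dist, quadratic_form, decision_values), the toy
objects, and the values of alpha and b are all hypothetical and chosen
by hand. The sketch only shows (1) a pairwise matrix built from feature
vectors with a kernel that is not positive definite, and (2) a
non-square matrix over distinct ``row'' and ``column'' objects, with a
function on the ``column'' objects written as a sparse expansion over
the ``row'' objects.

```python
def gram_matrix(row_objs, col_objs, kernel):
    """Return K with K[i][j] = kernel(row_objs[i], col_objs[j])."""
    return [[kernel(r, c) for c in col_objs] for r in row_objs]


def neg_sq_dist(x, y):
    """Negative squared Euclidean distance (an indefinite 'kernel')."""
    return -sum((a - b) ** 2 for a, b in zip(x, y))


def dot(x, y):
    """Plain inner product, used here to relate row and column objects."""
    return sum(a * b for a, b in zip(x, y))


def quadratic_form(K, v):
    """Compute v^T K v for a square matrix K."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))


def decision_values(K, alpha, b):
    """f_j = sum_i alpha[i] * K[i][j] + b, one value per column object."""
    n_cols = len(K[0])
    return [sum(a * row[j] for a, row in zip(alpha, K)) + b
            for j in range(n_cols)]


# (1) Pairwise data: row and column objects come from the same set.
points = [(0.0,), (1.0,)]
K_pair = gram_matrix(points, points, neg_sq_dist)
# A negative quadratic form shows K_pair is not positive semidefinite.
q = quadratic_form(K_pair, [1.0, 1.0])

# (2) Genuine matrix data: 3 row objects, 4 column objects (non-square).
row_objs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
col_objs = [(2.0, 0.0), (0.0, 3.0), (-1.0, 0.0), (0.0, -2.0)]
K_rect = gram_matrix(row_objs, col_objs, dot)

# Hypothetical coefficients, NOT produced by any optimizer. The zero
# entry mimics sparsity: the third row object is not used at all.
alpha = [1.0, 1.0, 0.0]
b = 0.0
f = decision_values(K_rect, alpha, b)
labels = [1 if value >= 0 else -1 for value in f]
```

In this toy setting, the row objects with nonzero alpha play the role of
the selected features. Which row objects a real P-SVM would select
depends on the optimization described in the report.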

URL (under ``Theses and Tech. Reports''):

http://ni.cs.tu-berlin.de/publications/ni-pubs-mlearn.html

FTP:

ftp://ftp.cs.tu-berlin.de/pub/local/ni/papers/Hochreiter04techrep.ps.gz

Sincerely Yours,

Sepp Hochreiter