outlier, robust statistics

Mon Feb 14 17:37:50 EST 1994

   [terry at salk.edu]
   One man's outlier is another man's data point.  Another
   way to handle outliers is not to remove them but to model them
   explicitly.  Geoff Hinton has pointed out that character
   recognition can be made more robust by including models
   for background noise such as postmarks.

Explicitly modeling an occluding or transparently combined "outlier"
process is a powerful way to build a robust estimator. As mentioned in
other replies to this post, estimators that use a mixture model
(either implicitly or explicitly), such as those fit with the EM
algorithm, are a promising way to implement this type of strategy.
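
As a rough illustration of modeling the outliers explicitly rather than
deleting them, here is a minimal sketch of EM for a two-component 1-D
mixture: a Gaussian "inlier" process plus a uniform "outlier" process.
It is not taken from any of the work mentioned in this thread. The
function name, the uniform background on a known range [lo, hi], and
the initial values are all assumptions made for the example.

```python
import math


def em_gaussian_plus_uniform(xs, lo, hi, iters=50):
    """EM for a Gaussian inlier process plus a uniform outlier process.

    Hypothetical helper for illustration. Assumes all data lie in
    [lo, hi] and that the outlier process is uniform on that range.
    Returns (mu, sigma, pi_in, resp), where resp[i] is the posterior
    probability that xs[i] came from the Gaussian component.
    """
    n = len(xs)
    mu = sorted(xs)[n // 2]        # start at the median
    sigma = (hi - lo) / 4.0        # deliberately broad initial width
    pi_in = 0.5                    # initial inlier mixing weight
    u = 1.0 / (hi - lo)            # density of the uniform component
    resp = [0.5] * n
    for _ in range(iters):
        # E-step: soft assignment of each point to the inlier process.
        resp = []
        for x in xs:
            g = (pi_in * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
                 / (sigma * math.sqrt(2.0 * math.pi)))
            o = (1.0 - pi_in) * u
            resp.append(g / (g + o))
        # M-step: responsibility-weighted mean, width and mixing weight.
        w = sum(resp)
        mu = sum(r * x for r, x in zip(resp, xs)) / w
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, xs)) / w
        sigma = max(math.sqrt(var), 1e-6)
        # Keep the weight away from 0 and 1 so both processes stay alive.
        pi_in = min(max(w / n, 1e-6), 1.0 - 1e-6)
    return mu, sigma, pi_in, resp
```

No point is ever thrown away here: each one is only down-weighted by
its responsibility, so the estimate of mu is driven mainly by points
the Gaussian explains well, and the responsibilities themselves give a
soft segmentation into "signal" and "outlier".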

One issue that often complicates matters is deciding how many
objects or processes there are in the signal, e.g. determining K in the
EM estimator. Does anyone have a pointer to work on estimating K in
the context of an EM estimator or similar methods? Often the
appropriate cardinality of the model is not known a priori.

   Steve Nowlan and I recently used mixtures of expert networks
   to separate multiple interpenetrating flow fields -- the
   transparency problem for visual motion.  The gating network
   was used to select regions of the visual field that 
   contained reliable estimates of local velocity for 
   which there was coherent global support.  There is
   evidence for such selection neurons in area MT of primate
   visual cortex, a region of cortex that specializes in
   the detection of coherent motion.

I'd also like to add a pointer to some related work Sandy Pentland,
Eero Simoncelli and I have done in this domain developing a strategy
for robust estimation ("outlier exclusion") based on minimum
description length theory. Our method effectively implements a
clustering method to find how many processes there are (i.e. to
estimate K), and then iteratively refines estimates of the parameters
and "support" (segmentation) of those processes.  We have developed
versions of this method for range and motion segmentation, both for
occluded and transparently combined processes.
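
To make the general idea of choosing K by description length concrete,
here is a minimal sketch. It is not the method from the papers listed
below: it swaps in a plain 1-D Gaussian mixture fit by EM and a
BIC-style two-part code length (negative log-likelihood plus half the
parameter count times log n). The function names, the quantile
initialisation, and the floor on sigma are all assumptions made for
the example.

```python
import math


def fit_gmm_1d(xs, k, iters=60):
    """Fit a k-component 1-D Gaussian mixture by EM (hypothetical helper).

    Returns the log-likelihood computed in the last E-step.
    """
    n = len(xs)
    s = sorted(xs)
    data_range = s[-1] - s[0]
    # Deterministic initialisation: means at evenly spaced quantiles.
    mus = [s[int((j + 0.5) * n / k)] for j in range(k)]
    sigmas = [data_range / (2.0 * k) or 1.0] * k
    pis = [1.0 / k] * k
    # Floor on sigma so no component collapses onto a single point.
    floor = 0.01 * data_range or 1e-6
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities and the current log-likelihood.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [pis[j] * math.exp(-0.5 * ((x - mus[j]) / sigmas[j]) ** 2)
                    / (sigmas[j] * math.sqrt(2.0 * math.pi))
                    for j in range(k)]
            total = sum(dens) + 1e-300
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            w = sum(r[j] for r in resp) + 1e-12
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / w
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / w
            sigmas[j] = max(math.sqrt(var), floor)
            pis[j] = w / n
    return loglik


def description_length(xs, k):
    """Two-part code length in nats: data cost plus parameter cost."""
    n_params = 3 * k - 1           # k means, k sigmas, k - 1 free weights
    return -fit_gmm_1d(xs, k) + 0.5 * n_params * math.log(len(xs))


def select_k(xs, k_max=6):
    """Return the K in 1..k_max with the shortest description length."""
    return min(range(1, k_max + 1), key=lambda k: description_length(xs, k))
```

Calling select_k(xs) on a list of floats returns the chosen K. Adding
a component always lowers the data cost a little, so the parameter
term is what stops K from growing without bound. A full segmentation
method would also have to refine the support of each process, which
this sketch leaves out.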

   [pluto at cs.ucsd.edu:]
   >I look forward to reading (Liu 94).  Can you (or anyone else)
   >point me to other references utilizing a similar definition
   >of "outlier?"  (IMHO) "outlier" is quite a value-laden term
   >that I tend to avoid since I feel it has multiple and
   >often ambiguous interpretations/definitions.  

Here are some references to conference papers on our work. A longer
journal paper that combines these is in the works; email me if you
would like a preprint when it becomes available.

Darrell, Sclaroff and Pentland, "Segmentation by Minimal Description",
Proc. 3rd Intl. Conf. Computer Vision, Osaka, Japan, 1990 (also
avail. as MIT Media Lab Percom TR-163.)

Darrell and Pentland, "Robust Estimation of a Multi-Layer Motion
Representation", Proc. IEEE Workshop on Visual Motion, Princeton,
October 1991.

Darrell and Pentland, "Against Edges: Function Approximation with
Multiple Support Maps", NIPS 4, 1992.

Darrell and Simoncelli, "Separation of Transparent Motion into Layers
using Velocity-tuned Mechanisms", Assn. for Research in Vision and
Ophthalmology (ARVO), 1993 (also avail. as MIT Media Lab Percom
TR-244.)

(Percom TR's can be anon. ftp'ed from whitechapel.media.mit.edu)

--trevor