Reinforcement learning, particle filters, and human motion

Tue Feb 6 20:50:22 EST 2001

Hi,  

Please take note of the following new papers:

       Kernel-based reinforcement learning in average-cost problems.
       D. Ormoneit and P. W. Glynn.
       http://robotics.stanford.edu/~ormoneit/publications/control.ps

       Lattice Particle Filters.
       C. Lemieux, D. Ormoneit, and David J. Fleet. 
       http://robotics.stanford.edu/~ormoneit/publications/lattice.ps.gz

       Functional analysis of human motion data.
       D. Ormoneit, T. Hastie, and M. Black.
       http://robotics.stanford.edu/~ormoneit/publications/motion.ps.gz

More detailed descriptions are enclosed below.

Cheers,

Dirk

_________________________________________________________________

    Kernel-based reinforcement learning in average-cost problems

		     D. Ormoneit and P. W. Glynn.
       http://robotics.stanford.edu/~ormoneit/publications/control.ps

Reinforcement learning (RL) is concerned with the identification of
optimal controls in Markov Decision Processes (MDP) where no explicit
model of the transition probabilities is available.  Many existing
approaches to RL, including ``temporal-difference learning'', employ
simulation-based approximations of the value function for this
purpose \cite{sutton88,tsitsiklis97}.  This approach frequently leads
to numerical instabilities of the resulting learning algorithm,
especially if the function approximators used are parametric, such as
linear combinations of basis functions or neural networks.  In this
work, we propose an alternative class of RL algorithms that always
produces stable estimates of the value function.  Specifically, we use
``local averaging'' methods to construct an approximate dynamic
programming (ADP) algorithm.
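
To make the ``local averaging'' idea concrete, here is a minimal
sketch in Python.  It is not the algorithm of the paper: it uses a
discounted cost criterion instead of the average-cost criterion, a
Gaussian kernel, and a made-up one-dimensional toy problem.  The names
kernel_weights and kernel_adp and all parameter values are
assumptions chosen for illustration.  Because the kernel weights are
nonnegative and sum to one, each update is an average of costs plus
discounted values, so the estimates stay bounded.

```python
import math
import random


def kernel_weights(query, centers, bandwidth):
    """Normalized Gaussian kernel weights of `query` relative to `centers`."""
    raw = [math.exp(-((query - c) / bandwidth) ** 2) for c in centers]
    total = sum(raw)  # assumed > 0: query lies within range of some center
    return [w / total for w in raw]


def kernel_adp(samples, gamma=0.9, bandwidth=0.1, sweeps=200):
    """Approximate value iteration by local averaging (discounted variant).

    `samples` maps each action to a list of observed transitions
    (state, cost, next_state).  Returns the successor states and the
    estimated cost-to-go at each of them.
    """
    actions = sorted(samples)
    successors = [y for a in actions for (_, _, y) in samples[a]]
    # offset[a]: where the transitions of action a start inside `successors`
    offset, position = {}, 0
    for a in actions:
        offset[a] = position
        position += len(samples[a])
    # weights[a][i][s]: weight of transition s of action a, seen from successors[i]
    weights = {
        a: [kernel_weights(q, [x for (x, _, _) in samples[a]], bandwidth)
            for q in successors]
        for a in actions
    }
    values = [0.0] * len(successors)
    for _ in range(sweeps):
        updated = []
        for i in range(len(successors)):
            q_values = [
                sum(w * (cost + gamma * values[offset[a] + s])
                    for s, (w, (_, cost, _))
                    in enumerate(zip(weights[a][i], samples[a])))
                for a in actions
            ]
            updated.append(min(q_values))  # costs are minimized
        values = updated
    return successors, values


# Hypothetical toy problem: state in [0, 1], cost equals the state,
# "left" and "right" move the state by 0.1 plus a little noise.
rng = random.Random(0)
samples = {"left": [], "right": []}
for _ in range(40):
    x = rng.random()
    for action, step in (("left", -0.1), ("right", 0.1)):
        y = min(1.0, max(0.0, x + step + rng.gauss(0.0, 0.02)))
        samples[action].append((x, x, y))

successors, values = kernel_adp(samples)
```

Values are only stored at the successor states of the sampled
transitions, which is all the averaging update needs.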
_________________________________________________________________

		       Lattice Particle Filters

	     C. Lemieux, D. Ormoneit, and David J. Fleet.
  http://robotics.stanford.edu/~ormoneit/publications/lattice.ps.gz

A common way to formulate visual tracking is to adopt a Bayesian
approach, and to use particle filters to cope with nonlinear dynamics
and nonlinear observation equations. While particle filters can deal
with such filtering tasks in principle, their performance often varies
significantly due to their stochastic nature. We present a class of
algorithms, called lattice particle filters, that circumvent this
difficulty by placing the particles deterministically according to a
Quasi-Monte Carlo integration rule.  We describe a practical
realization of this idea and discuss its theoretical properties.
Experimental results with a synthetic 2D tracking problem show that
the lattice particle filter yields a performance improvement over
conventional particle filters that is equivalent to an increase of
between 10 and 60% in the number of particles, depending on their
``sparsity'' in the state-space.  We also present results on inferring
3D human motion from moving light displays.
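
The basic idea of replacing random samples by a deterministic
Quasi-Monte Carlo point set can be illustrated with a rank-1 lattice
rule on the unit square.  The sketch below is an illustration under
assumptions, not the construction used in the paper: the point count
n = 89 with generating vector (1, 55) is the classical Fibonacci
lattice, and the function names and the test integrand are made up.
In a particle filter, such points would still have to be mapped to the
state space, which is not shown here.

```python
import math
import random


def rank1_lattice(n, z, shift=None):
    """Points frac(i * z / n + shift) for i = 0..n-1 in the unit cube."""
    if shift is None:
        shift = [0.0] * len(z)
    return [
        tuple(((i * zj) / n + sj) % 1.0 for zj, sj in zip(z, shift))
        for i in range(n)
    ]


def estimate(points, f):
    """Equal-weight average of f over the point set."""
    return sum(f(*p) for p in points) / len(points)


def integrand(x, y):
    # Smooth periodic test function; its exact integral over [0, 1]^2 is 1.
    return 1.0 + math.cos(2.0 * math.pi * (x + y))


n = 89
rng = random.Random(0)
lattice_points = rank1_lattice(n, (1, 55))
random_points = [(rng.random(), rng.random()) for _ in range(n)]

lattice_estimate = estimate(lattice_points, integrand)
random_estimate = estimate(random_points, integrand)
```

For this particular integrand the lattice estimate equals the exact
value 1 up to rounding, while the random estimate fluctuates with the
seed.  Passing a random `shift` keeps the lattice structure but
randomizes its position.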

_________________________________________________________________

	       Functional analysis of human motion data

		 D. Ormoneit, T. Hastie, and M. Black.
   http://robotics.stanford.edu/~ormoneit/publications/motion.ps.gz

We present a method for the modeling of 3D human motion data using
functional analysis.  First, we estimate a statistical model of
typical activities from a large set of 3D human motion data.  For this
purpose, the human body is represented as a set of articulated
cylinders and the evolution of a particular joint angle is described
by a time-series.  Specifically, we consider periodic motion such as
``walking'' in this work, and we develop a new set of tools that
allows for the automatic segmentation of the training data into a
sequence of identical ``motion cycles''.  Then we compute the mean and
the principal components of these cycles using a new algorithm that
accounts for missing information and that enforces smooth transitions
between cycles.  As an application of this methodology we consider the
visual tracking of human motion in a 2D video sequence.  Here the
principal components serve to define a low-dimensional representation
of the human 3D poses in a state-space model that treats the 2D video
images as observations.  We apply (approximate) Bayesian inference
using a particle filter to the state-space model to infer the body
poses at each time-step.  The resulting algorithm is able to track
human subjects in monocular video sequences and to recover their 3D
motion in complex unknown environments.
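
As a rough illustration of the step that computes the mean and the
principal components of the motion cycles, here is a minimal Python
sketch.  It assumes the cycles are already segmented and resampled to
equal length, and it uses plain power iteration to find only the first
component.  It does not handle missing information or enforce smooth
transitions between cycles, so it is not the algorithm described
above.  The function names and the synthetic joint-angle data are
assumptions.

```python
import math
import random


def mean_cycle(cycles):
    """Pointwise mean of equal-length cycles."""
    count = len(cycles)
    return [sum(c[t] for c in cycles) / count for t in range(len(cycles[0]))]


def first_principal_component(cycles, iterations=100, seed=0):
    """Return the mean cycle and the leading principal direction."""
    mean = mean_cycle(cycles)
    centered = [[c[t] - mean[t] for t in range(len(mean))] for c in cycles]
    # Random start so that the vector is not orthogonal to the component.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in mean]
    for _ in range(iterations):
        # Apply the (unnormalized) covariance implicitly as X^T (X v).
        scores = [sum(r * x for r, x in zip(row, v)) for row in centered]
        w = [sum(s * row[t] for s, row in zip(scores, centered))
             for t in range(len(mean))]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variation among the cycles
            break
        v = [x / norm for x in w]
    return mean, v


# Synthetic data: one joint angle over a 20-step cycle, with the
# amplitude varying from cycle to cycle.
period = 20
amplitudes = [0.8, 1.0, 1.2, 1.4]
cycles = [
    [a * math.sin(2.0 * math.pi * t / period) for t in range(period)]
    for a in amplitudes
]
mean, component = first_principal_component(cycles)
```

Here the cycles differ only in amplitude, so the first component is
proportional to the sine shape itself, up to sign.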
_________________________________________________________________
Dirk Ormoneit
Department of Computer Science
Gates Building, 1A-148
Stanford University
Stanford, CA 94305-9010

ph.: (650) 725-8797
fax: (650) 725-1449

ormoneit at cs.stanford.edu
http://robotics.stanford.edu/~ormoneit/