Papers: POMDP approximation; cost-sensitive learning

Thomas G. Dietterich tgd at CS.ORST.EDU
Sat Feb 12 14:40:06 EST 2000


The following papers are available from
http://www.cs.orst.edu/~tgd/cv/pubs.html 

Bayer, V., Dietterich, T. G. (2000). A POMDP Approximation Algorithm
that Anticipates the Need to Observe. Technical Report 00-30-01,
Department of Computer Science, Oregon State University.

Abstract: This paper introduces the even-odd POMDP, an approximation
to POMDPs in which the world is assumed to be fully observable every
other time step. The even-odd POMDP can be converted into an
equivalent MDP, the 2MDP, whose value function, $V^*_{2MDP}$, can be
combined online with a 2-step lookahead search to provide a good POMDP
policy. We prove that this gives an approximation to the POMDP's
optimal value function that is at least as good as methods based on
the optimal value function of the underlying MDP. We present
experimental evidence that the method gives better policies, and we
show that it can find a good policy for a POMDP with 10,000 states and
observations.
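
To give the flavor of the online lookahead step, here is a minimal
Python sketch. It is not code from the paper: the function name, the
tabular array layouts (T, O, R), the default depth, and the leaf rule
of scoring a belief by its expected value under a precomputed vector
V2 approximating $V^*_{2MDP}$ are all illustrative assumptions.

    import numpy as np

    def choose_action(b, T, O, R, V2, gamma, depth=2):
        """Depth-limited lookahead from belief b, scoring leaf beliefs
        with V2, an approximation of $V^*_{2MDP}$ (assumed layout):

        b  : belief over states, shape (S,)
        T  : list of transition matrices, T[a][s, s'] = P(s' | s, a)
        O  : list of observation matrices, O[a][s', o] = P(o | a, s')
        R  : reward matrix, R[s, a]
        V2 : state-value vector, shape (S,)
        """
        def value(bel, d):
            if d == 0:
                return bel @ V2                  # leaf: expected 2MDP value
            return max(qvalue(bel, a, d) for a in range(len(T)))

        def qvalue(bel, a, d):
            q = bel @ R[:, a]                    # expected immediate reward
            pred = bel @ T[a]                    # predicted next-state dist.
            for o in range(O[a].shape[1]):
                p_o = float(pred @ O[a][:, o])   # P(o | b, a)
                if p_o > 1e-12:
                    b_next = O[a][:, o] * pred   # unnormalized Bayes update
                    q += gamma * p_o * value(b_next / b_next.sum(), d - 1)
            return q

        return max(range(len(T)), key=lambda a: qvalue(b, a, depth))

With depth=2 this expands two action/observation levels before falling
back on V2, which is one plausible reading of the 2-step lookahead
described in the abstract.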


Margineantu, D. D., Dietterich, T. G. (2000). Bootstrap Methods for
the Cost-Sensitive Evaluation of Classifiers. Technical Report
00-30-02, Department of Computer Science, Oregon State University.

Abstract: Many machine learning applications require classifiers that
minimize an asymmetric cost function rather than the misclassification
rate, and several recent papers have addressed this problem.  However,
these papers have either applied no statistical testing or have
applied statistical methods that are not appropriate for the
cost-sensitive setting.  Without good statistical methods, it is
difficult to tell whether these new cost-sensitive methods are better
than existing methods that ignore costs, and it is also difficult to
tell whether one cost-sensitive method is better than another.  To
rectify this problem, this paper presents two statistical methods for
the cost-sensitive setting.  The first constructs a confidence
interval for the expected cost of a single classifier.  The second
constructs a confidence interval for the expected difference in costs
of two classifiers.  In both cases, the basic idea is to separate the
problem of estimating the probability of each cell in the confusion
matrix (which is independent of the cost matrix) from the problem of
computing the expected cost.  We show experimentally that these
bootstrap tests work better than applying standard $z$ tests based on
the normal distribution.
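
To illustrate the basic idea, the following Python sketch computes a
plain percentile-bootstrap confidence interval for the expected cost
of a single classifier. It is only a sketch of the separation argument
above, not the exact procedure in the report; the function name and
the choice of percentile interval are assumptions.

    import numpy as np

    def expected_cost_ci(y_true, y_pred, cost, n_boot=1000,
                         alpha=0.05, seed=None):
        """Percentile-bootstrap CI for one classifier's expected cost.

        y_true, y_pred : integer class labels on a held-out test set
        cost           : cost[i, j] = cost of predicting j when truth is i
        """
        rng = np.random.default_rng(seed)
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        n, k = len(y_true), cost.shape[0]
        stats = np.empty(n_boot)
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)       # resample with replacement
            conf = np.zeros((k, k))
            np.add.at(conf, (y_true[idx], y_pred[idx]), 1.0)
            stats[b] = np.sum((conf / n) * cost)   # expected cost of resample
        return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

Note how the cell probabilities (conf / n) are estimated independently
of the cost matrix and only combined with it at the end. For the second
problem, comparing two classifiers, one natural variant is to apply the
same resampled indices to both prediction vectors and bootstrap the
difference of the two expected costs.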


