NFL, practice, and CV

hicks@cs.titech.ac.jp
Mon Dec 18 23:58:07 EST 1995


Huaiyu Zhu wrote:
>You can't make every term positive in your balance sheet, if the grand
>total is bound to be zero.

There ARE functions which are everywhere non-negative, but which under 
an appropriate measure integrate to 0.
It only requires that 

	1) the support of the strictly positive values is vanishingly small
	   (i.e., a set of measure zero),
	2) the positive values are bounded 
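A concrete example (mine, not from the original exchange): under Lebesgue measure on [0,1], the indicator of the rationals is non-negative everywhere, positive on a dense set, and still integrates to 0, because the rationals have measure zero:

```latex
f(x) =
\begin{cases}
  1, & x \in \mathbb{Q} \cap [0,1] \\
  0, & \text{otherwise}
\end{cases}
\qquad\Longrightarrow\qquad
\int_0^1 f \, d\lambda = 0 .
```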

So the above statement by Dr. Zhu is not true.  In fact I think this ability
of pointwise positive values to disappear under integration is key to the
"zero-sum" aspect of the NFL theorem holding true, despite the fact that we
obviously see so many examples of working algorithms.

My key point:  A zero-sum (infinite) universe doesn't require negative values.

----

There is another important issue which needs to be clarified, and that is the
definition of CV and the kinds of problems to which it can be applied.  Now
anybody can make whatever definition they want, and then come to some
conclusions based upon that definition, and that conclusion may be correct
given that definition.  However, there are also advantages to sharing a common
intellectual currency.  

	I quote below from "An Introduction to the Bootstrap" by Efron and
Tibshirani, 1993, Chapter 17.1.  It describes well what I meant when I talked
about monitoring prediction error in a previous posting, and describes CV as a
method for doing that.

==================================================

	In our discussion so far we have focused on a number of measures of
statistical accuracy: standard errors, biases, and confidence intervals.  All
of these are measures of accuracy for parameters of a model.  Prediction error
is a different quantity that measures how well a model predicts the response
value of a future observation.  It is often used for model selection, since
it is sensible to choose a model that has the lowest prediction error among a
set of candidates.

	Cross-validation is a standard tool for estimating prediction error.
It is an old idea (predating the bootstrap) that has enjoyed a comeback in
recent years with the increase in available computing power and speed.  In
this chapter we discuss cross-validation, the bootstrap, and some other
closely related techniques for estimation of prediction error.

	In regression models, prediction error refers to the expected squared 
difference between a future response and its prediction from the model:

	PE = E(y - \hat{y})^2.
	
The expectation refers to repeated sampling from the true population. 
Prediction error also arises in the classification problem, where the
response falls into one of k unordered classes.  For example, the possible
responses might be Republican, Democrat, or Independent in a political survey.
In classification problems prediction error is commonly defined as the
probability of an incorrect classification

	PE = Prob(\hat{y} \neq y),

also called the misclassification rate.  The methods described in this chapter
apply to both definitions of prediction error, and also to others.

==================================================
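To make the quoted definitions concrete, here is a minimal Python sketch (mine, not from Efron and Tibshirani) that estimates the squared-error PE by k-fold cross-validation of an ordinary least-squares line fit, plus the empirical misclassification rate for the classification definition.  The function names and the toy data are illustrative assumptions.

```python
import random

def kfold_cv_pe(xs, ys, k=5, seed=0):
    """Estimate PE = E(y - yhat)^2 by k-fold cross-validation.

    The model is an illustrative least-squares line fit y = a + b*x,
    refit on each training fold and scored on the held-out fold.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errs = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        n = len(train)
        mx = sum(xs[i] for i in train) / n
        my = sum(ys[i] for i in train) / n
        sxx = sum((xs[i] - mx) ** 2 for i in train)
        sxy = sum((xs[i] - mx) * (ys[i] - my) for i in train)
        b = sxy / sxx
        a = my - b * mx
        # squared prediction errors on the held-out fold only
        sq_errs.extend((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return sum(sq_errs) / len(sq_errs)

def misclass_rate(y, yhat):
    """PE = Prob(yhat != y), estimated as the empirical error rate."""
    return sum(a != b for a, b in zip(y, yhat)) / len(y)

# toy data: a noisy line y = 2x + 1
rng = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [2 * x + 1 + rng.gauss(0, 0.1) for x in xs]
pe = kfold_cv_pe(xs, ys)
```

Since the noise standard deviation is 0.1, the CV estimate of PE should come out near 0.01, i.e. near the irreducible noise variance.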

Craig Hicks
Tokyo Institute of Technology


More information about the Connectionists mailing list