validation sets

Jude Shavlik shavlik at cs.wisc.edu
Thu Nov 21 15:19:02 EST 1991


The question of whether or not validation sets are useful can easily be
answered, at least on specific datasets.  We have run that experiment and
found that devoting some training examples to validation is useful (i.e.,
training on N examples does worse than training on N-k and validating on k).
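To make the comparison concrete, here is a minimal sketch of such an experiment.  Everything here is my own illustration, not the experiment reported above: the synthetic noisy dataset, the nearest-neighbor learner, and the particular split are all assumptions.  The learner given all N examples is stuck with a blindly chosen complexity setting; holding out k examples lets us pick the setting that actually generalizes.

```python
import random

random.seed(0)
NOISE = 0.25  # probability of flipping a training label (an assumed noise model)

def make_data(n):
    """Noisy 1-D binary data: true label is (x > 0.5), flipped w.p. NOISE."""
    xs = [random.random() for _ in range(n)]
    ys = [(x > 0.5) != (random.random() < NOISE) for x in xs]
    return list(zip(xs, ys))

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) * 2 >= k

def accuracy(train, pts, k, true_labels=False):
    """Fraction of pts classified correctly (against noisy or true labels)."""
    correct = 0
    for x, y in pts:
        target = (x > 0.5) if true_labels else y
        correct += knn_predict(train, x, k) == target
    return correct / len(pts)

data = make_data(300)
test = [(random.random(), None) for _ in range(500)]  # scored against true labels

# (a) Train on all N = 300 examples with no validation set: we must commit
#     to some default complexity in advance -- here, 1-nearest-neighbor.
acc_all = accuracy(data, test, 1, true_labels=True)

# (b) Train on N - k = 240 examples, validate on k = 60, and use the
#     validation set to select the neighborhood size.
train, val = data[:240], data[240:]
best_k = max([1, 3, 5, 9, 15], key=lambda k: accuracy(train, val, k))
acc_val = accuracy(train, test, best_k, true_labels=True)

print("no validation:", acc_all, " with validation:", acc_val)
```

On noisy data like this, the validation-selected model typically beats the blind one on the test set, which is the effect described above.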

This same issue comes up with decision-tree learners, where the validation set
is often called a "tuning set" because it is used to prune the decision tree.
I believe people there have also found it useful to devote some examples to
pruning/validation.

I think there is also an important point about "proper" experimental
methodology lurking in this discussion.  If one uses N examples for weight
adjustment (or whatever kind of learning one is doing) and also uses k
examples for selecting among possible final answers, one should report that
the test-set accuracy resulted from N+k training examples.

Here's a brief argument for counting the validation examples just like
"regular" ones.  Let N=0 and k=<some big number>.  Randomly guess some very
large number of answers and return the one that does best on the validation
set.  Most likely the answer returned will do well on the testset (and all we
ever got from the tuning set was a single number).  Certainly our algorithm
didn't learn from zero examples!
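That thought experiment can be simulated directly.  The sketch below is my own illustration (the 2-D linearly separable data and the random linear rules are assumptions): it "trains" on zero examples, draws thousands of random classifiers, and keeps the one scoring best on the k validation examples.  The winner tends to do well on fresh test data, so the information clearly came from the validation set.

```python
import random

random.seed(1)

def make_data(n):
    """Linearly separable 2-D points in the unit square: label = (x + y > 1)."""
    pts = [(random.random(), random.random()) for _ in range(n)]
    return [((x, y), x + y > 1) for x, y in pts]

def predict(w, p):
    """A random linear rule: sign of a*x + b*y + c."""
    a, b, c = w
    x, y = p
    return a * x + b * y + c > 0

def accuracy(w, data):
    return sum(predict(w, p) == label for p, label in data) / len(data)

val = make_data(50)      # the k validation examples -- the only data we touch
test = make_data(1000)   # fresh examples to check the selected answer

# N = 0: no learning at all, just a very large number of random guesses.
candidates = [tuple(random.uniform(-1, 1) for _ in range(3))
              for _ in range(5000)]
best = max(candidates, key=lambda w: accuracy(w, val))

print("validation accuracy:", accuracy(best, val))
print("test accuracy:      ", accuracy(best, test))
```

Each candidate only ever received a single number back from the tuning set (its score), yet the selected rule generalizes far above chance, which is exactly why those k examples should be counted as training data.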

					Jude Shavlik
					University of Wisconsin
					shavlik at cs.wisc.edu

