whoops
Geoffrey Hinton
hinton at ai.toronto.edu
Tue Sep 10 18:04:27 EDT 1991
I said that K nearest neighbors does well at generalizing parity from a
randomly sampled training set when K is small and even. This was a mistake.
What I meant was that K nearest neighbors does well when K is set so that
the modal Hamming distance to the K nearest neighbors is small and even.
For example, suppose the training set contains a fraction, p, of all possible
input vectors of length n. Since there are n vectors one bit away from any
given vector and n(n-1)/2 vectors two bits away, if we set K to be:
	np + n(n-1)p/2
we can expect about np neighbors one bit away and about n(n-1)p/2 neighbors
two bits away. Neighbors one bit away have the opposite parity to the query,
while neighbors two bits away share its parity, so if n is not very small,
the two-bit neighbors will far outnumber the one-bit neighbors, the majority
vote will come out right, and generalization will be correct. The use of a
validation set to fix K should therefore give good generalization for n not
too small.
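A minimal sketch of this recipe (not from the original post; the values of n
and p and the brute-force neighbor search are illustrative choices):

```python
import itertools
import random

random.seed(0)

n = 10   # input length (illustrative choice)
p = 0.3  # fraction of all 2**n vectors sampled for training

def parity(v):
    return sum(v) % 2

all_vecs = list(itertools.product((0, 1), repeat=n))
train = [v for v in all_vecs if random.random() < p]
train_set = set(train)
test = [v for v in all_vecs if v not in train_set]

# Set K to the expected number of training vectors within two bits:
# about np one bit away plus about n(n-1)p/2 two bits away.
K = round(n * p + n * (n - 1) * p / 2)

def knn_predict(x):
    # Brute-force K nearest neighbors by Hamming distance.
    nearest = sorted(train, key=lambda t: sum(a != b for a, b in zip(x, t)))[:K]
    ones = sum(parity(t) for t in nearest)
    return 1 if 2 * ones > K else 0

sample = random.sample(test, 200)
accuracy = sum(knn_predict(x) == parity(x) for x in sample) / len(sample)
print(f"K = {K}, accuracy = {accuracy:.2f}")
```

With n = 10 and p = 0.3 there are about 3 one-bit neighbors (wrong parity)
and about 13.5 two-bit neighbors (right parity), so the vote is dominated by
the two-bit shell and accuracy on held-out vectors is well above chance.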
Geoff