Some more on NFL

David Wolpert dhw at santafe.edu
Mon Nov 20 13:54:43 EST 1995


Barak Pearlmutter writes:



>>>
not having low Kolmogorov complexity is equivalent to being
random... (the) description of anything that does not have low
Kolmogorov complexity is "nothing but noise."
>>>

I don't necessarily disagree with such sentiments, but one should
definitely be careful about them; there are *many* definitions of
"random". (Seth Lloyd has counted about 30 versions of its flip side,
"amount of information".) High Kolmogorov complexity is only one of
them.

To illustrate just one of the possible objections to measuring
randomness with Kolmogorov complexity: Would you say that a
macroscopic gas with a specified temperature is "random"? To describe
it exactly takes a huge Kolmogorov complexity. And certainly in many
regards its position in phase space is "nothing but noise". (Indeed,
in a formal sense, its position is a random sample of the Boltzmann
distribution.) Yet physicists can (and do) make extraordinarily
accurate predictions about such creatures with ease.
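
To make this concrete, here is a minimal Python sketch (my own
illustration, not from the original discussion). It samples a gas
"microstate" from the Maxwell-Boltzmann distribution and uses zlib
compression as a crude, computable stand-in for Kolmogorov
complexity; the particle mass is that of an N2 molecule. The
microstate is essentially incompressible, yet the macroscopic
prediction comes out right:

import zlib
import numpy as np

rng = np.random.default_rng(0)
k_B, T, m, N = 1.380649e-23, 300.0, 4.65e-26, 100_000  # m ~ N2 molecule

# Each velocity component is Gaussian with variance k_B*T/m
# (the Maxwell-Boltzmann distribution).
v = rng.normal(0.0, np.sqrt(k_B * T / m), size=(N, 3))

# Macroscopic prediction: mean kinetic energy per particle = (3/2) k_B T.
ke_empirical = 0.5 * m * (v ** 2).sum(axis=1).mean()
ke_predicted = 1.5 * k_B * T
print(f"empirical  {ke_empirical:.3e} J")
print(f"predicted  {ke_predicted:.3e} J")  # agree to within ~1%

# The microstate itself looks like noise to a general-purpose compressor.
raw = v.tobytes()
print(f"compression ratio: {len(zlib.compress(raw)) / len(raw):.2f}")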

Another important (and related) point concerns the encoding constant
that gets buried in the definition of Kolmogorov complexity: in
practice it can be very important. To put it another way, one
person's "random" is another person's "highly regular"; this is
precisely why the basis you use in supervised learning matters so
much.
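
A quick way to see the encoding constant at work (again a minimal
sketch of my own, with zlib as a crude proxy for a general-purpose
description language): the output of a seeded pseudo-random generator
defeats the compressor, yet relative to a language that includes the
generator, its description is tiny -- "seed 1995, emit 100000 bytes".

import zlib
import random

random.seed(1995)
stream = bytes(random.getrandbits(8) for _ in range(100_000))

# zlib finds no structure: the ratio is ~1.0, i.e. "nothing but noise".
ratio = len(zlib.compress(stream, level=9)) / len(stream)
print(f"zlib ratio: {ratio:.3f}")

# Yet the few lines above *are* a short description of the stream.
# Whether it counts as "random" depends on the reference machine --
# just as whether a target looks learnable depends on the basis.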




>>>
as scientists, we have had some success by positing the
existence of non-random structure in the real world.  So it seems to
me that there is at least some reason to believe that the functions we
optimize in practice are not completely random, in this Kolmogorov
complexity sense.
>>>

Oh, most definitely. To give one simple example:

Cross-validation works quite well in practice. However, both of the
following hold:

1) for *any* fixed target (indeed, for any prior over targets),
averaged over all sets of generalizers you're choosing among,
cross-validation works no better than anti-cross-validation (choose
the generalizer in the set at hand having *largest* cross-validation
error); and

2) for a fixed set of generalizers, averaged over all targets, it
works no better than anti-cross-validation.

So for cross-validation to work, there must be a very subtle
interrelationship between the prior over targets and the set of
generalizers you're choosing among. In particular, cross-validation
cannot be given a Bayesian justification without regard to the set of
generalizers.
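
The contrast between the two selection rules is easy to state in
code. The following is a minimal sketch of my own (polynomial fits of
two degrees standing in for the set of generalizers); on a structured
target like this one, cross-validation wins, and the NFL point is
that the advantage evaporates once you average over all targets:

import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-1, 1, 40)
y = np.sin(3 * x) + 0.1 * rng.normal(size=x.size)  # structured target + noise
degrees = (1, 9)                                   # two candidate generalizers

def cv_error(deg, k=5):
    """Mean squared error over k randomly assigned held-out folds."""
    folds = np.array_split(rng.permutation(x.size), k)
    errs = []
    for test in folds:
        train = np.setdiff1d(np.arange(x.size), test)
        coef = np.polyfit(x[train], y[train], deg)
        errs.append(np.mean((np.polyval(coef, x[test]) - y[test]) ** 2))
    return float(np.mean(errs))

errors = {d: cv_error(d) for d in degrees}
cv_pick = min(errors, key=errors.get)    # cross-validation: smallest error
anti_pick = max(errors, key=errors.get)  # anti-cross-validation: largest
print(errors)
print("CV picks degree", cv_pick, "| anti-CV picks degree", anti_pick)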

Nonetheless, I (and every other even marginally rational
statistician) have used cross-validation in the past, and will do so
again in the future.


