near-optimal computable predictions

Thu Jul 18 10:18:20 EDT 2002

 The Speed Prior: a new simplicity measure yielding near-optimal
 computable predictions             (Juergen Schmidhuber, IDSIA)

In J. Kivinen and R. H. Sloan, eds, Proc. 15th Annual Conf. on
Computational Learning Theory (COLT), 216-228, Springer, 2002;
based on section 6 of http://arXiv.org/abs/quant-ph/0011122 (2000)

http://www.idsia.ch/~juergen/speedprior.html
ftp://ftp.idsia.ch/pub/juergen/colt.ps

Solomonoff's optimal but noncomputable method for inductive
inference assumes that observation sequences x are drawn from an
recursive prior distribution mu(x). Instead of using the unknown
mu(x) he predicts using the celebrated universal enumerable prior
M(x) which for all x exceeds any recursive mu(x), save for a
constant factor independent of x. The simplicity measure M(x)
naturally implements "Occam's razor" and is closely related to the
Kolmogorov complexity of x. However, M assigns high probability to
certain data x that are extremely hard to compute. This does not match
our intuitive notion of simplicity. Here we suggest a more plausible
measure derived from the fastest way of computing data.  In absence
of contrarian evidence, we assume that the physical world is generated
by a computational process, and that any possibly infinite sequence of
observations is therefore computable in the limit (this assumption is
more radical and stronger than Solomonoff's). Then we replace M by the
novel Speed Prior S, under which the cumulative a priori probability
of all data whose computation through an optimal algorithm requires
more than O(n) resources is 1/n.  We show that the Speed Prior allows
for deriving a computable strategy for optimal prediction of future
y, given past x.  Then we consider the case that the data actually
stem from a nonoptimal, unknown computational process, and use
Hutter's recent results to derive excellent expected loss bounds for
S-based inductive inference.

Assuming our own universe is sampled from S, we predict:
it won't get many times older than it is now;
large scale quantum computation won't work well;
beta decay is not random but due to some fast pseudo-random
generator which we should try to discover.

Juergen Schmidhuber                    http://www.idsia.ch/~juergen