Non-randomness is no panacea.

Juergen Schmidhuber juergen at idsia.ch
Wed Dec 6 04:39:11 EST 1995


In response to David's response dated Mon, 4 Dec 95:

I wrote ``Loosely speaking, in a universe whose
history is compressible, we may expect to generalize
well.''. To make this more precise, let us consider a
very simple 1-bit universe --- suppose the problem
is to extrapolate a  sequence of symbols (bits, without
loss of generality).  We have already observed a bitstring
s and would like to predict the next bit.  Let si denote
the event ``s is followed by symbol i'' for  i in {0,1}.

David is absolutely right by reminding us that we need a
prior before applying Bayes. And he is right by pointing
out that only if we have information concerning the prior,
we can exploit it to generalize well. In the context of
the present discussion, however, an interesting point is:
there is a special prior that is biased towards
*arbitrary* compressibility/structure/regularity.

Following Solomonoff/Levin/Chaitin/Li&Vitanyi, define P(s), the
a priori probability of a bitstring s, as the probability of
guessing a (halting) program that computes s on a universal
Turing machine U. Here, the way of guessing is defined by the
following procedure: initially, the input tape consists of
a single square.  Whenever the scanning head of the input
tape shifts to the right, do: (1) Append a new square.
(2) With probability 1/2 fill it with a 0; with probability 1/2
fill it with a 1.  Bayes tells us
P(s0|s) = P(s|s0)P(s0)/P(s) P(s0/P(s); P(s1|s) = P(s1)/P(s).
We are going to predict ``the next bit will be 0'' if
P(s0) > P(s1), and vice versa.  Due to the coding theorem
(Levin 74, Chaitin 75), P(si) = O((1/2)^K(si)) for  i in
{0,1} (K(x) denotes x' Kolmogorov complexity), the continuation
with lower Kolmogorov complexity will (in general) be more
likely. If s is ``noisy'' then this will be reflected by
its relatively high Kolmogorov complexity.

I am not saying anything new here. I'd just like to point
that if you know nothing about your universe except that it
is regular in some way, then P is of interest.  Sadly, most
possible universes are completely irregular and incompressible.
But for the few (but infinetly many) that are not, P is a
prior to consider (at least if we don't care for computing
time and constant factors).

Perhaps there are too many threads in the current discussion.
I'll shut up for a while.

Juergen Schmidhuber
IDSIA



More information about the Connectionists mailing list