learning 5-bit parity: RMHC vs. GP vs. MLP
Kevin Lang
kevin at research.nj.nec.com
Tue Aug 15 14:49:12 EDT 1995
Comments on "A Response to ..." [Ko95]
Kevin J. Lang
NEC Research Institute
July 1995
These remarks constitute a brief technical reply to [Ko95],
which was handed out at the ML-95 and ICGA-95 conferences.
SCALING FROM 3-BIT TO 5-BIT PARITY:
hill climbing still wins
[Ko95, section 8] extends the experimental comparison in [La95] of the
search efficiency of random mutation hill climbing and genetic search
to the task of learning 5-bit parity. It is shown that any given run
of RMHC is four times less likely than genetic search to find a
circuit which computes 5-bit parity. However, when RMHC manages to
find a solution, it does so about fifty times faster than genetic search.
Hence, by using RMHC in an iterated manner (i.e. multiple independent
runs), it is possible to generate solutions much more cheaply than
with genetic search. For example, by restarting each run of RMHC
after 75,000 candidates one could obtain a candidate to solution ratio
of about 300,000. This compares well with the value of 2,913,583
candidates per solution reported for genetic search.
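The restart arithmetic above can be sketched as follows. If each capped run of T candidates succeeds with probability p, the number of runs until a success is geometric, so the expected cost is roughly T / p candidates per solution. (The 25% per-run success rate below is a hypothetical value chosen to reproduce the ballpark figure in the text, not a number from [La95].)

```python
def candidates_per_solution(run_length, success_prob):
    """Expected candidates examined per solution under iterated restarts:
    run_length / success_prob, by the mean of a geometric distribution."""
    return run_length / success_prob

# Illustrative: 75,000-candidate runs succeeding 25% of the time
# (an assumed rate) give a ratio in the neighborhood quoted above.
ratio = candidates_per_solution(75_000, 0.25)
print(ratio)  # 300000.0
```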
The above argument could be formalized by calculating performance
curves for the two algorithms, as discussed in [Ko92, chapter 8].
[Ko95] asserts that these curves would have permitted a meaningful
comparison of the algorithms to be made, but does not provide them,
citing the large computational expense that would supposedly have to
be incurred.
Actually, the relevant portion of the I(M,i,z) curve for RMHC could be
estimated in one day by doing fifty runs out to 100,000 candidates.
Since this is roughly the same amount of work as a single run of
genetic search on this problem, estimating the performance curve for
genetic search _would_ be a daunting task.
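For concreteness, a performance curve of the kind discussed in [Ko92, chapter 8] can be computed from an estimated success probability. The sketch below follows the usual form of that measure: the number of independent runs R needed to find a solution with probability z, times the individuals processed per run. The 10% success rate is a hypothetical value standing in for an estimate from fifty trial runs.

```python
import math

def individuals_to_process(M, i, z, P):
    """Performance measure in the style of [Ko92]: individuals that must be
    processed to obtain a solution by generation i with probability z,
    given an estimated cumulative success probability P = P(M, i)."""
    # Independent runs needed so that at least one succeeds with prob. z.
    R = math.ceil(math.log(1.0 - z) / math.log(1.0 - P))
    return M * (i + 1) * R

# Illustrative: "population" 1 (hill climbing), runs capped at 100,000
# candidates, and an assumed 10% per-run success rate.
print(individuals_to_process(1, 99_999, 0.99, 0.10))
```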
DON'T USE RMHC:
the power of multi-layer perceptrons
I would like to emphasize that the hill climbing algorithm described in
[La95] was not intended to be either novel or good, and that I do NOT
advocate its use. By deliberately using a bad version of an old
algorithm, I sought to underline the negative character of my results.
My positive advice is this: when learning functions, use multi-layer
perceptrons. By adopting this representation for hypotheses one can
exploit the powerful gradient-based search procedures that have been
developed by the numerical analysis community. To illustrate the
advantages of this approach, I invested 7 seconds of computer time in
ten runs of conjugate gradient search for MLP weights to compute 5-bit
parity. The resulting candidate to solution ratio was 393. This is
roughly 750 times better than RMHC on boolean circuits, and 7400 times
better than genetic search on boolean circuits.
   candidates      found a
    examined       solution?
   ----------      ---------
        31         yes
        43         yes
        62         yes
       151         yes
       196         no, stuck in local optimum
       239         no, stuck in local optimum
       271         no, stuck in local optimum
       274         no, stuck in local optimum
       308         no, stuck in local optimum
       392         yes
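Reading the table as per-run candidate counts, the quoted ratio can be checked directly: 1,967 candidates were examined in all, and 5 of the 10 runs found a solution.

```python
# Per-run candidates examined in the ten conjugate gradient runs above.
candidates = [31, 43, 62, 151, 196, 239, 271, 274, 308, 392]
solved     = [True, True, True, True, False, False, False, False, False, True]

total = sum(candidates)              # 1967 candidates in all
ratio = total / solved.count(True)   # 5 successful runs
print(round(ratio))  # 393, the figure quoted in the text
```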
Unlike RMHC and genetic search, the conjugate gradient search
procedure used here has no control parameters to tweak, and should
yield good results in the hands of any user. Also, it detects when it
is stuck in a local optimum, thus permitting an immediate restart to
be made from random weights. This transforms the iterated methodology
from an after-the-fact accounting system into a truly useful algorithm.
Notes on the MLP network used in the above experiment:
5 input units
5 hidden units computing tanh
1 output unit computing tanh
random initial weights drawn uniformly from the interval [-1,+1]
true and false encoded by +1 and -1
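The network just described can be sketched as a forward pass in a few lines. This is only the hypothesis representation, with random initial weights and the ±1 encoding noted above; the conjugate gradient training procedure itself is not reproduced here.

```python
import math
import random

random.seed(0)

def init_weights():
    """5 hidden units, each with 5 input weights plus a bias, and one
    output unit with 5 weights plus a bias; all drawn from [-1, +1]."""
    hidden = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
    output = [random.uniform(-1, 1) for _ in range(6)]
    return hidden, output

def forward(x, hidden, output):
    """tanh hidden layer followed by a tanh output unit."""
    h = [math.tanh(w[5] + sum(wi * xi for wi, xi in zip(w[:5], x)))
         for w in hidden]
    return math.tanh(output[5] + sum(wi * hi for wi, hi in zip(output[:5], h)))

# All 32 input patterns, with true/false encoded as +1/-1, and the
# 5-bit parity target for each.
patterns = [[1 if (n >> b) & 1 else -1 for b in range(5)] for n in range(32)]
targets  = [1 if sum(b == 1 for b in p) % 2 else -1 for p in patterns]

hidden, output = init_weights()
error = sum((forward(p, hidden, output) - t) ** 2
            for p, t in zip(patterns, targets))
print(len(patterns), error > 0.0)
```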
REFERENCES
[Ko95] John Koza, "A Response to the ML-95 Paper entitled "Hill
Climbing Beats Genetic Search on a Boolean Circuit Synthesis
Task of Koza's"", informally published and distributed document.
[La95] Kevin Lang, "Hill Climbing Beats Genetic Search on a Boolean
Circuit Synthesis Task of Koza's", The Twelfth International
Conference on Machine Learning, pp. 340-343, 1995.
Note: a copy of this paper can be obtained by anonymous ftp
from ftp.nj.nec.com:/pub/kevin/lang-ml95.ps
[Ko92] John Koza, "Genetic Programming", MIT Press, 1992, pp. 205-236.