wgm@santafe.edu
Fri Aug 18 13:42:01 EDT 1995


In response to the recent postings by Kevin Lang and by Bill Macready
and David Wolpert, Barak Pearlmutter writes:

>>>
But this point is somewhat moot, because Kevin Lang's recent post was
not claiming that "his" "new" algorithm was better than current
algorithms.
>>>

Nor did we ever assume he was making this claim. We simply wanted to
point to our own work, which arose from many of the same motivations
and investigates these issues from the most general perspective.

>>>
Lang published a well reasoned and careful scientific paper, which
cast doubt on the practicality and importance of a particular
algorithm (GP) by showing that on a very simple problem *taken from
the GP book* it compares unfavorably to a silly strawman algorithm.
This paper passed scientific muster, and was accepted into ML, a
respected refereed conference.
>>>

Again, we want to stress that we were not attacking Lang's work *at
all*. We couldn't agree more that the GP community needs more
comparisons of its results with other techniques. For those interested
in such comparisons we would also point to the papers of Una-May
O'Reilly, which can be found at
http://www.santafe.edu/sfi/publications/94wplist.html

***

As regards No Free Lunch issues in optimization and supervised
learning,

>>>
It might be true that in a universe where everything was equally
likely, all search algorithms would be equally awful. 
>>>

It is NOT true that "in a universe where everything was equally
likely, all search algorithms would be equally awful." For example,
there are major head-to-head minimax distinctions between
algorithms. (The true magnitude of those distinctions is coming to
light in the work with our student that we mentioned in our previous
post.)

This is a crucially important point. There is a huge amount of
structure distinguishing algorithms even in "a universe where
everything was equally likely". All that would be "equally awful" in
such a universe is expected performance.
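To make this concrete, here is a minimal brute-force sketch (ours, in
Python, not from the original exchange; the two toy search rules and
the four-point space are invented for illustration). It enumerates
every function on a tiny space and confirms that two quite different
search algorithms have identical performance once averaged over all
functions, exactly as the NFL theorems require, even though they
behave very differently on any single function:

from itertools import product

X = range(4)   # tiny search space
Y = (0, 1)     # possible objective values
M = 3          # number of distinct points each algorithm samples

def fixed_order(f):
    # strawman rule: always sample points 0, 1, 2 in that order
    return [f[x] for x in range(M)]

def adaptive(f):
    # adaptive rule: start at the last point, then jump to the lowest
    # unvisited point after seeing a 0 and the highest after seeing a 1
    seen, visited, x = [], set(), len(f) - 1
    for _ in range(M):
        v = f[x]
        seen.append(v)
        visited.add(x)
        rest = [c for c in X if c not in visited]
        x = min(rest) if v == 0 else max(rest)
    return seen

def perf(values):
    return max(values)   # performance = best value found in M samples

for name, alg in (("fixed", fixed_order), ("adaptive", adaptive)):
    scores = [perf(alg(f)) for f in product(Y, repeat=len(X))]
    print(name, sum(scores) / len(scores))   # identical means: 0.875

The toy is too small to exhibit the head-to-head minimax distinctions
mentioned above, but it isolates exactly what the theorems do and do
not say: only the average over all functions is algorithm-independent.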

And of course in supervised learning there are other major a priori
distinctions between algorithms. E.g., for quadratic loss, if you give
me a learning algorithm with *any* random component (backprop with
random initial weights, anyone?), I can always come up with an
algorithm with assuredly (!) superior performance, *independent of the
target*. (And therefore independent of the "universe that we live
in".) The construction is simply to average the algorithm's output
over its random component: since quadratic loss is convex, Jensen's
inequality guarantees that the averaged predictor does at least as
well on every target, and strictly better whenever the randomness
actually moves the predictions.

Bill Macready and David Wolpert


