No subject

dhw@santafe.edu dhw at santafe.edu
Wed Nov 8 20:26:57 EST 1995


Pau Bofill writes


>>>
I happened to read the message from Bill Macready on Free Lunch
Theorems for Search, and I was surprised
by his (and D.H. Wolpert's) statement that

	"any two search algorithms have exactly the same expected
	performance, over the space of all fitness functions."

which seemed to invalidate statements like

	"my search algorithm beats your search algorithm according 
	to a particular performance measure for the following fitness
	function."
>>>

We don't see how the two statements could be interpreted as
contradictory. The first one is explicitly probabilistic (note the
term "expected") whereas the second one is not. They explicitly refer
to different things.
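To make the probabilistic nature of the first statement concrete, here
is a small sketch (ours, not from the paper) that exhaustively checks
the NFL result on a toy search space. The two algorithms `alg_a` and
`alg_b` are illustrative inventions: one enumerates blindly, the other
adapts to the fitness values it sees - yet averaged over ALL fitness
functions they perform identically.

```python
from itertools import product

X = [0, 1, 2]          # toy search space
Y = [0, 1]             # possible fitness values

def run(algorithm, f, m=2):
    """Sample m distinct points with `algorithm`; return best fitness seen."""
    visited, values = [], []
    for _ in range(m):
        x = algorithm(visited, values)
        visited.append(x)
        values.append(f[x])
    return max(values)

def alg_a(visited, values):
    # Fixed enumeration order 0, 1, 2 (never revisits a point)
    return [x for x in X if x not in visited][0]

def alg_b(visited, values):
    # Adaptive: start at 1, then move right if the last value was good,
    # left otherwise (never revisits a point)
    if not visited:
        return 1
    order = (2, 0) if values[-1] == 1 else (0, 2)
    return [x for x in order if x not in visited][0]

# Average best-of-2 performance over ALL 2^3 fitness functions f: X -> Y
funcs = list(product(Y, repeat=len(X)))
avg_a = sum(run(alg_a, f) for f in funcs) / len(funcs)
avg_b = sum(run(alg_b, f) for f in funcs) / len(funcs)
assert avg_a == avg_b   # identical expected performance
```

On any *single* fitness function the two algorithms can of course
differ - which is exactly why the second quoted statement is not
contradicted.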



>>>
As far as I can see, the goal of a particular optimization PROBLEM is
to find a target POINT in search space with specific properties.
The role of any fitness function, then, is to assign a cost value to
each point in search space in such a way that it helps the algorithm
being used to find the target point.
>>>

Not at all! In the traveling salesperson problem, for example, the
goal is to find a tour of low total distance - if you can find the
shortest, fine, but in practice you must almost always settle for
something suboptimal. The choice of tour-distance as one's fitness
function was never made to "help the algorithm ... find the target
point". (There is not even a single "target point", per se!) Rather,
it is made because ... that is what is of interest.

The problem *is* the fitness function. We would argue that this is the
case for almost any real-world problem.
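A minimal sketch of this point: a TSP instance just *is* its fitness
function - a map from tours (points in the search space) to tour
lengths. The city coordinates below are made up purely for
illustration.

```python
import math

# Hypothetical city coordinates (illustrative only): a 3x4 rectangle
cities = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]

def tour_length(tour):
    """The fitness function: total distance of a closed tour.
    This mapping from tours to costs IS the problem."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

print(tour_length([0, 1, 2, 3]))  # rectangle perimeter: 14.0
print(tour_length([0, 2, 1, 3]))  # a self-crossing tour is longer: 18.0
```

Nothing about `tour_length` was chosen to "help" any particular
algorithm; it simply encodes what we care about.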

Now one's algorithm can perform transformations of the function so as
to "help the algorithm ... find (a good) point". But that's another
issue entirely.


Aside to reader: Note that Bofill does not use "target" the way a
machine-learner would.




>>>
If I understood them properly, Wolpert and Macready define the 
space of all possible fitness functions as the set of ALL possible assignments
of cost values to points in search space.
>>>

Or to use Bofill's language, the space of all possible problems; they
are synonymous.


>>>
And they use as a generalized performance measure the histogram of cost values, 
regardless of the points where they were found. 
>>>

In most (almost all?) real-world problems the set of points sampled
(d_X values in our terminology) has essentially no implications for
the efficacy of the search algorithm used. The fitness values at those
points, and perhaps the computational complexity of the algorithm,
have such implications. But not the points themselves. Indeed, if
there were some aspect of the points that *were* important, then (by
definition!) it would go into the fitness function.

Again consider TSP. The performance of an algorithm is determined by
the tour-lengths (fitnesses) of the tours the algorithm has
constructed, not by the tours themselves.
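As a sketch of this (our illustration, not from the paper): a
performance measure that depends only on the histogram of sampled
fitness values judges two runs identically whenever their fitness
values agree, however different the sampled points (d_X) were. The
tour names and lengths below are invented.

```python
from collections import Counter

def performance(sampled_fitnesses):
    """Performance depends only on the d_Y values (here: best found)."""
    return min(sampled_fitnesses)

# Two runs that sampled entirely different tours (d_X differs) ...
run1 = {"tour_A": 14.0, "tour_B": 18.0, "tour_C": 14.0}
run2 = {"tour_X": 18.0, "tour_Y": 14.0, "tour_Z": 14.0}

# ... but produced the same histogram of tour lengths (d_Y agrees) ...
assert Counter(run1.values()) == Counter(run2.values())

# ... and so are judged identical, whatever the points themselves were.
assert performance(run1.values()) == performance(run2.values())
```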





>>>
and their statement is equivalent to,

	"any two search algorithms have exactly the same expected
	performance, over the space of all optimization problems
	that can be defined within a search space."
>>>

Loosely speaking.


>>>
which stated otherwise would mean,

	"If the target doesn't matter, all algorithms perform alike."
>>>

This makes no sense. Of course the target matters - it matters more
than anything else. The point is that *unless you take explicit
consideration of the target (or more generally the fitness function)
into account when using your algorithm*, all algorithms are alike. And
*many* users of genetic algorithms, tabu search, hill-climbing,
simulated annealing, etc., do not take into account the fitness
function. (In fact, in many competitions, knowledge concerning the
fitness function is considered cheating!)

This - relatively minor - point of the paper is a cry for people to do
the obvious: incorporate knowledge of the fitness function into their
algorithm. On any problem, best results will accrue to such a strategy
(e.g., the winners at TSP are those algorithms that are tailored to
TSP). Yet people still try to use fixed algorithms as though they were
"general purpose". The NFL results point out the lack of formal
justifiability of such strategies.
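One hedged sketch of what "incorporating knowledge of the fitness
function" can mean: a nearest-neighbour construction exploits the
distance structure underlying the TSP fitness function, whereas a
fixed "general purpose" enumeration ignores it. The construction is
the standard greedy heuristic; the instance below is made up, and on
it (though not on every instance) the tailored algorithm wins.

```python
import math

# Hypothetical instance (illustrative only)
cities = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5), (1, 9)]

def dist(a, b):
    return math.dist(cities[a], cities[b])

def tour_length(tour):
    return sum(dist(tour[i], tour[(i + 1) % len(tour)])
               for i in range(len(tour)))

def nearest_neighbour(start=0):
    """Tailored to TSP: greedily exploits the metric structure
    that the fitness function is built from."""
    tour = [start]
    remaining = set(range(len(cities))) - {start}
    while remaining:
        nxt = min(remaining, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

nn = nearest_neighbour()
fixed = list(range(len(cities)))        # "general purpose" fixed order
assert sorted(nn) == fixed              # a valid tour
assert tour_length(nn) < tour_length(fixed)  # tailoring pays off here
```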



>>>
In particular, algorithms should be compared on PROBLEMS, not on
fitness functions, with properly defined performance measures.
>>>

This is a content-free statement. In the real world, problems *are*
fitness functions.




Bill and David






