A neat idea from L. Breiman
Tom Dietterich
tgd at ICSI.Berkeley.EDU
Wed Aug 19 13:44:02 EDT 1992
I recently read the following paper by Leo Breiman:
Breiman, L. (1991) The $\Pi$ method for estimating multivariate functions from noisy
data. {\it Technometrics, 33} (2), 125--160. With discussion.
In this paper, Breiman presents a very neat technique called "back
fitting" that is a very general algorithm idea for improving greedy
algorithms. Suppose we are executing a greedy algorithm for some
task, and at any given point in the process, we have already made
decisions d_1, d_2, ..., d_{k-1} and we are about to make decision
d_k. In the standard greedy algorithm, we choose d_k to be
the locally best decision and then go on to consider d_{k+1}.
However, with backfitting, we first perform the following double loop:
repeat until quiescence:
  for i from 1 to k-1 do
    "undo" decision d_i (holding all other decisions d_j, j<>i, fixed)
    and re-make d_i to be the best decision (locally).
In other words, we first hold {d_2, ..., d_{k-1}} constant and see if
we can improve things by re-making decision d_1. Then we hold {d_1,
d_3, ..., d_{k-1}} constant and consider re-making decision d_2, and
so on. We cycle through the previous k-1 decisions making local
improvements until no further improvements can be found. THEN, we
make decision d_k (and repeat the process, of course).
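
To make this concrete, here is a rough sketch in Python (my own, not
Breiman's) of a generic greedy loop with backfitting. The interface --
candidates(i) giving the allowed values for decision i, and
objective(decisions) scoring a partial list of decisions, higher being
better -- and the toy task at the end are assumptions made purely for
illustration:

def greedy_with_backfitting(num_decisions, candidates, objective):
    decisions = []
    for k in range(num_decisions):
        # Backfit: cycle over the earlier decisions d_1..d_{k-1}, re-making
        # each one locally optimal with the others held fixed, until no
        # single re-made decision improves the objective (quiescence).
        improved = True
        while improved:
            improved = False
            for i in range(len(decisions)):
                current = decisions[i]
                best_value, best_score = current, objective(decisions)
                for value in candidates(i):
                    decisions[i] = value
                    score = objective(decisions)
                    if score > best_score:
                        best_value, best_score = value, score
                decisions[i] = best_value
                if best_value != current:
                    improved = True
        # Then make decision d_k exactly as the plain greedy algorithm would.
        best_value, best_score = None, float("-inf")
        for value in candidates(k):
            score = objective(decisions + [value])
            if score > best_score:
                best_value, best_score = value, score
        decisions.append(best_value)
    return decisions

# Toy illustration (made up): choose 3 values from a pool so that their
# sum lands as close to 10 as possible.
pool = [1, 2, 3, 5, 7]
print(greedy_with_backfitting(3, lambda i: pool, lambda ds: -abs(10 - sum(ds))))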
In general, this backfitting process will cost another factor of n in
the algorithm (assuming there are n decisions to be made), since each
new decision now triggers a sweep over up to n-1 earlier ones. In
experiments with three different learning algorithms, Breiman has found
that this method finds much better solutions than a simple greedy
algorithm. Breiman (in various collaborations) is currently applying
this idea to improving CART trees and neural networks.
This idea would probably also find good application in COBWEB-style
algorithms and greedy MDL algorithms.
--Tom