Generalization vs. Interpolation

Thomas H. Hildebrandt thildebr at athos.csee.lehigh.edu
Wed Sep 4 12:02:42 EDT 1991


Hildebrandt --

   "I have come to treat interpolation and generalization as the same
   animal, since obtaining good generalization is a matter of
   interpolating in the right metric space (i.e. the one that best models
   the underlying process)."

Wolpert --

   Certainly I am sympathetic to this point of view. Simple versions
   of nearest neighbor interpolation (i.e., memory-based reasoners) do
   very well in many circumstances. (In fact, I've published a couple of
   papers making just that point.) However it is trivial to construct
   problems where the target function is extremely volatile and
   non-smooth in any "reasonable" metric; who are we to say that Nature
   should not be allowed to have such target functions? Moreover, for a
   number of discrete, symbolic problems, the notion of a "metric" is
   ill-defined, to put it mildly.

I do not presume to tell Nature what to do.  We may consider problems
for which there is no simple transformation from the input (sensor)
space into a (linear) metric space to be "hard" problems, in a sense.

Discrete problems, which naturally inhibit interpolation, must be
handled by table look-up, i.e., each case treated separately.
However, table look-up can be considered an extreme case of
interpolation -- the transition between one recorded data point and a
neighboring one being governed by a Heaviside (threshold) function
rather than a straight line.
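
As a minimal sketch, assuming an arbitrary one-dimensional table of
recorded points: nearest-neighbor look-up switches its output at the
midpoint between stored points (the Heaviside transition), while
linear interpolation draws a straight line between them.

   import numpy as np

   # Stored (input, output) pairs: the "table".
   x_table = np.array([0.0, 1.0, 2.0, 3.0])
   y_table = np.array([1.0, 3.0, 2.0, 5.0])

   def table_lookup(x):
       # Nearest-neighbor prediction: a step (Heaviside) transition
       # halfway between neighboring table entries.
       idx = np.argmin(np.abs(x_table[:, None] - np.atleast_1d(x)), axis=0)
       return y_table[idx]

   def linear_interp(x):
       # Straight-line transition between neighboring table entries.
       return np.interp(x, x_table, y_table)

   queries = np.linspace(0.0, 3.0, 7)
   print(table_lookup(queries))   # piecewise-constant output
   print(linear_interp(queries))  # piecewise-linear output

Both rules reproduce the recorded points exactly; they differ only in
how they bridge the gaps between them.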

Wolpert --

   I am not claiming that metric-based generalizers will necessarily do
   poorly for these kinds of problems. Rather I'm simply saying that it is a
   bit empty to state that

Hildebrandt --

   "If the form of the underlying metric space is unknown, then it is a
   toss-up whether sigmoidal sheets, RBFs, piece-wise hyperplanar, or any
   number of other basis functions will work best."

Wolpert --

   That's like saying that if the underlying target function is unknown,
   then it is a toss-up what hypothesis function will work best.

   Loosely speaking, "interpolation" is something you do once you've
   decided on the metric. In addition to such interpolation,
   "generalization" also involves the preceding step of performing the
   "toss up" between metrics in a (hopefully) rational manner.

It would be splitting hairs to suggest that the process of choosing an
appropriate set of basis functions be called "learning to generalize"
rather than "generalization".  I could not agree with you more that
the search for an appropriate basis set is one of the important open
problems in connectionist research.
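
As a sketch of one such "rational manner" -- assuming, purely for
illustration, a simulated process, three candidate bases, and a simple
held-out split -- one can fit each candidate basis by least squares
and keep the one with the smallest error on the held-out points:

   import numpy as np

   rng = np.random.default_rng(0)
   x = rng.uniform(-1.0, 1.0, 40)
   y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.shape)  # stand-in for the unknown process

   def design(x, kind):
       # Expand the inputs in one of several candidate bases.
       centers = np.linspace(-1.0, 1.0, 6)
       if kind == "polynomial":
           return np.vander(x, 6, increasing=True)              # 1, x, ..., x^5
       if kind == "rbf":
           return np.exp(-((x[:, None] - centers) ** 2) / 0.1)  # Gaussian bumps
       if kind == "sigmoid":
           return 1.0 / (1.0 + np.exp(-10.0 * (x[:, None] - centers)))
       raise ValueError(kind)

   idx = rng.permutation(len(x))
   train, test = idx[:30], idx[30:]
   for kind in ("polynomial", "rbf", "sigmoid"):
       w, *_ = np.linalg.lstsq(design(x[train], kind), y[train], rcond=None)
       err = np.mean((design(x[test], kind) @ w - y[test]) ** 2)
       print(kind, "held-out mean squared error:", round(float(err), 4))

Whether such held-out comparison amounts to a rational choice or
merely to systematic trial-and-error is, of course, part of what the
questions below are asking.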

If nothing is known about the process to be modelled, is there any
more efficient way to select a basis than trial-and-error?

Are some sets of basis functions more likely to efficiently describe a
randomly selected process?  Aside from compactness, what other
properties can be ascribed to a desirable basis?

Given a particular set of basis functions, what criteria must be met
by the underlying process in order for the bases to generalize well?
Can these criteria be tested easily?

These are just a few of the questions that come to mind.  I'll be
interested in any thoughts you have in this area.

				Thomas H. Hildebrandt


