Supervised and Unsupervised Learning

Fri Jun 2 15:43:11 EDT 1989

Here are some thoughts on the distinction between
supervised and unsupervised learning. First, despite
the apparent exhaustiveness of the terminology, these
two types of learning are not all that there are (see below).
Supervised learning means that the classes of
the training instances are known and available to the
learning system. When the problem is not one of
classification, it means that particular function values are
available for the training instances (although perhaps
corrupted by noise). Unsupervised learning usually refers
to the case where the class labels or function values are
not provided with the training instances. I agree with
Terry Sanger that there can be all sorts of problems
falling between these extremes, but there are still other
kinds of learning tasks.

I've never much liked the term unsupervised learning because
(aside from the fact that together with supervised learning
it misleadingly implies that there are no other kinds of
learning) it seems sometimes to be understood as meaning
a method that can do the same thing a supervised method can
but doesn't need the supervision. My view (and I welcome
comments on this) is that what is usually called unsupervised
learning is a kind of supervised learning with a fixed,
built-in teacher. This teacher is embodied in some principle,
such as principal component analysis, clustering with a
specific criterion function on clusterings, infomax,
etc. So, according to this view, unsupervised learning
is a more specific type of process than is supervised
learning. Unsupervised methods are usually not built to
accommodate the possibility of some input specifying what
criterion to use in the learning process.
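
To make the "built-in teacher" view concrete, here is a
minimal sketch in Python. It is only an illustration under
stated assumptions: one-dimensional data, two prototypes, and
a nearest-prototype criterion standing in for the clustering
principle. All names (supervised_update, builtin_teacher,
run) are hypothetical and do not come from any particular
system. The same update rule is driven either by labels
supplied from outside or by labels that a fixed internal
criterion produces.

```python
import random

def supervised_update(centers, x, label, rate=0.1):
    """Move the prototype for class `label` a small step toward instance x."""
    centers[label] += rate * (x - centers[label])

def builtin_teacher(centers, x):
    """Fixed criterion (an assumption here): label x by its nearest prototype."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

def run(data, centers, teacher=None, labels=None, epochs=20):
    """Train prototypes; labels come from `labels` or from `teacher`."""
    centers = list(centers)
    for _ in range(epochs):
        for i, x in enumerate(data):
            label = labels[i] if teacher is None else teacher(centers, x)
            supervised_update(centers, x, label)
    return centers

# Synthetic data: two well-separated groups around 0.0 and 5.0.
rng = random.Random(0)
data = ([rng.gauss(0.0, 0.3) for _ in range(50)]
        + [rng.gauss(5.0, 0.3) for _ in range(50)])
labels = [0] * 50 + [1] * 50

# "Supervised": class labels are provided with the training instances.
external = run(data, [1.0, 4.0], labels=labels)
# "Unsupervised": the labels come from the fixed, built-in criterion.
internal = run(data, [1.0, 4.0], teacher=builtin_teacher)
```

In this sketch the learner cannot be told to use a different
criterion without replacing builtin_teacher itself, which is
the sense in which the process is more specific than the
supervised one.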

Then there are other sorts of learning tasks where
the information given by the learning system's environment
is not as specific as a specification of desired responses.
For example, in a reinforcement learning task there
are desired responses but the system is not told what they
are: it has to search for them (in addition to the search
in parameter space for weights to remember the results of
previous searches). Widrow, Gupta, and Maitra
(IEEE Trans. Systems, Man, and Cybernetics, vol. 3, 1973, pp. 455-465)
discuss "learning with a critic". Rewards and punishments
are logically different from signed errors or desired
responses. Lots of other people (including me) have
discussed this type of learning task.
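
The difference between being told the desired response and
having to search for it can be sketched as follows. This is
a toy illustration, not the method of the paper cited above:
it assumes a small discrete set of responses, a critic that
returns 1.0 or 0.0, and optimistic initial estimates as one
simple way to force a search. All names are hypothetical.

```python
def make_critic(desired):
    """Build a critic that scores responses but never reveals `desired`."""
    def critic(response):
        return 1.0 if response == desired else 0.0
    return critic

def learn_with_teacher(desired):
    """Supervised case: the desired response is given, so no search is needed."""
    return desired

def learn_with_critic(critic, n_responses, trials=20, rate=0.5):
    """Reinforcement case: find the rewarded response by trying responses."""
    # Optimistic initial estimates make the greedy choice below try each
    # untested response before settling on one that is actually rewarded.
    value = [1.0] * n_responses
    tried = []
    for _ in range(trials):
        # Pick the response with the highest current estimate (first on ties).
        response = max(range(n_responses), key=lambda r: value[r])
        reward = critic(response)
        # Remember the outcome of this search step.
        value[response] += rate * (reward - value[response])
        tried.append(response)
    best = max(range(n_responses), key=lambda r: value[r])
    return best, tried

critic = make_critic(desired=3)
best, tried = learn_with_critic(critic, n_responses=5)
```

The learner in learn_with_critic sees only the reward value
for the response it actually made; it gets no signed error
and no indication of which other response would have been
better.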

In thinking about supervised and unsupervised learning
tasks as these terms are used in everyday language, I think
it is greatly misleading to assume that the technical
meanings of these terms adequately characterize the
kind of learning we see in animals. Usually, it seems to me,
the kind of supervision we see really is a process
whereby a sequence of problem-solving tasks is presented,
often with graded difficulty, and for each the learning
system faces a complex learning control task. Of course, to
solve these tasks, various parts of the system will probably
be facing unsupervised, supervised, reinforcement, and
probably other kinds of tasks that are all worked on
together.  Similarly, unsupervised learning used in the
vernacular seems to mean a process where all these things go
on, except that the system itself is creating the sequence of
problem-solving tasks (cf. Mitchell's LEX learning system).

Finally, I think it is important to distinguish
carefully between learning tasks and learning algorithms
or procedures (although I haven't been particularly careful
about this above). Tasks are characterized by the kinds of
information the system is allowed to get, how much it has at
the start, what the objective is, etc. A specific learning
algorithm is more or less capable of achieving various
objectives in various kinds of tasks. Usually a real
application encompasses lots of different learning tasks
depending on where you draw the line between the learning
system and the rest.

A. Barto
agb11 at phx.cam.ac.uk