supervised learning

Geoffrey Hinton hinton at ai.toronto.edu
Fri Jun 2 15:45:52 EDT 1989


I think the current usage of the terms "supervised" and "unsupervised"
learning is pretty much OK. To try to change the meanings of these terms now
would cause even more confusion.  There is a natural and important distinction
between data sets (wherever they come from) that consist of input-output pairs
where the task is to predict the output given the input, and data sets that
simply consist of an ensemble of vectors where the task (if it's specified at
all) is typically to find natural clusters, or a compact code, or a code with
independent components.

Naturally there are tricky cases.  One of these is when we create a
"multi-completion" task by trying to predict each part of the input from all
the other parts.  We are really turning an unsupervised task into a whole set
of supervised tasks.  Another problem arises when we have some LOCAL objective
function (internal to the network) that acts as a teacher. If we backpropagate
derivatives from this LOCAL function, are we doing supervised or unsupervised
learning?  It might be reasonable to say that we are doing learning that is
locally supervised but globally unsupervised (i.e. the local supervision
information is not derived from more global supervision information that is
supplied with the data).
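
To make the "multi-completion" construction concrete, here is a minimal
sketch, assuming the data are plain fixed-length vectors of numbers. The
function name multi_completion_tasks and the toy ensemble are hypothetical,
chosen only for illustration. It shows just how the input-output pairs are
formed, not how any particular network would be trained on them.

```python
from typing import List, Sequence, Tuple

# One supervised example: (all the other components, the held-out component).
Pair = Tuple[List[float], float]


def multi_completion_tasks(vectors: Sequence[Sequence[float]]) -> List[List[Pair]]:
    """Turn an unlabeled ensemble of vectors into one supervised task per component.

    Task i is a list of (input, target) pairs in which the target is
    component i of a vector and the input is all of its other components.
    Assumes every vector has the same length.
    """
    if not vectors:
        return []
    dim = len(vectors[0])
    tasks: List[List[Pair]] = []
    for i in range(dim):
        pairs: List[Pair] = []
        for v in vectors:
            # Drop component i from the input; it becomes the target.
            inputs = [x for j, x in enumerate(v) if j != i]
            pairs.append((inputs, v[i]))
        tasks.append(pairs)
    return tasks


# Hypothetical toy ensemble: three unlabeled 3-component vectors.
ensemble = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
tasks = multi_completion_tasks(ensemble)
```

No target is supplied with the data: each of the three tasks takes its
"teacher" signal from the vectors themselves, which is what makes this a
tricky case for the supervised/unsupervised distinction.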

It's worth noting that the statistics literature makes a very similar basic
distinction.  It uses the term "discriminant analysis" for supervised learning
and "cluster analysis" for unsupervised learning (or the subset of
unsupervised learning that they know how to do).

Geoff


More information about the Connectionists mailing list