Supervised learning

Pankaj Mehra mehra at ptolemy.arc.nasa.gov
Mon Jun 5 13:37:32 EDT 1989


The difference between "supervised" learning and reinforcement learning
lies largely in the nature of the feedback. To borrow a term from Ron
Williams, the feedback is "prescriptive" in the former (it tells the learner
what the output should have been) and "evaluative" in the latter (it only
scores the output the learner actually produced).
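
To make the prescriptive/evaluative contrast concrete, here is a minimal
sketch for a one-weight linear unit. Everything in it is my own illustrative
assumption (the function names, the learning rate, the perturb-and-compare
rule); it is not Williams's formulation, just one simple way an evaluative
learner might use a scalar score.

```python
def prescriptive_update(weight, x, target, lr=0.1):
    """Supervised step: the teacher supplies the correct output (target)."""
    output = weight * x
    error = target - output  # signed error says which way to move
    return weight + lr * error * x


def evaluative_update(weight, x, perturbation, score_fn, lr=0.1):
    """Reinforcement-style step: only a scalar score of a tried output is seen.

    Hypothetical rule: try a perturbed weight, compare its score with the
    unperturbed one, and move in the perturbation direction if it scored better.
    """
    baseline = score_fn(weight * x)
    tried = score_fn((weight + perturbation) * x)
    return weight + lr * (tried - baseline) * perturbation


if __name__ == "__main__":
    # Assumed environment: the (hidden) desired output is 2.0.
    def score(y):
        return -(y - 2.0) ** 2

    print(prescriptive_update(0.0, 1.0, target=2.0))
    print(evaluative_update(0.0, 1.0, perturbation=0.5, score_fn=score))
```

In the first function the learner is told the answer; in the second it must
discover the direction of improvement by trying something and being graded.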

Yet another distinction, first noted by Barto, Sutton, and Anderson in
their 1983 paper, concerns the synchronicity and delay of feedback. If
feedback is delayed, the learner needs a "memory" of recent decisions and
a temporal credit assignment mechanism (Sutton, 1988) to distribute the
feedback among the memorized decisions.
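
As a rough sketch of what "memory plus credit assignment" can look like:
the learner keeps a list of its recent decisions and, when feedback finally
arrives, shares it out with a decaying recency weight. This is similar in
spirit to an eligibility trace, but it is not the temporal-difference method
of (Sutton, 1988); the function name and the decay parameter are assumptions
made for illustration.

```python
def distribute_delayed_feedback(decisions, feedback, decay=0.5):
    """Share one delayed feedback value among remembered decisions.

    decisions: decision labels in the order they were made (oldest first).
    feedback:  scalar reinforcement received after the last decision.
    decay:     assumed recency discount in (0, 1]; older decisions get less.
    Returns a dict mapping each decision label to its share of the feedback.
    """
    credit = {}
    weight = 1.0  # the most recent decision gets full weight
    for decision in reversed(decisions):
        credit[decision] = credit.get(decision, 0.0) + weight * feedback
        weight *= decay
    return credit


if __name__ == "__main__":
    memory = ["push_left", "push_right", "push_left_again"]
    print(distribute_delayed_feedback(memory, feedback=-1.0))
```

Note that this assumes the learner at least knows when the feedback arrived
relative to its decisions, which is the synchronous case.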

Asynchronicity in feedback can (roughly) be defined as the property of a
training environment in which it cannot be determined precisely which
output of the network will be followed by reinforcement or correction.
IMHO, this is an important difference between knowledge-based and connectionist
learning systems. The AI model of learning (Dietterich, 1981) can handle
only synchronous delays in feedback. Artificial neural systems (ANSs) can
handle asynchronous delays in feedback because their architecture is
inherently asynchronous.

- Pankaj {mehra at cs.uiuc.edu}

References:
(Sutton, 88) Machine Learning, vol. 3, pp. 9-44
(Dietterich and Buchanan, 81) The Role of the Critic in Learning Systems,
		Stanford TR STAN-CS-81-891
(Barto et al., 83) IEEE Trans. Sys. Man Cyb., vol. SMC-13, pp. 834-846

