paper available
Matthias Heger
heger at Informatik.Uni-Bremen.DE
Mon Feb 28 07:27:12 EST 1994
FTP-host: ftp.gmd.de
FTP-filename: /Learning/rl/papers/heger.consider-risk.ps.Z
The file heger.consider-risk.ps.Z is now available for copying from the RL
papers repository:
***************************************************
* Consideration of Risk in Reinforcement Learning *
***************************************************
(Revised submission to the 11th International Conference on
Machine Learning (ML94), 15 pages)
Abstract
--------
Most Reinforcement Learning (RL) work treats a policy for a sequential
decision task as optimal if it minimizes the expected total discounted
cost (e.g. Q-Learning [Wat 89], AHC [Bar Sut And 83]). However, it is
well known that the expected value is not always a reliable decision
criterion and can be treacherous to use [Tha 87]. Decision theory offers
many alternative criteria that take risk into account in a more
sophisticated way, but most RL researchers have not yet addressed this
subject. The purpose of this paper is to draw the reader's attention to
the problems of the expected value criterion in Markov Decision Processes
and to give Dynamic Programming algorithms for an alternative criterion,
namely the Minimax criterion. A counterpart to Watkins' Q-Learning for
the Minimax criterion is presented. The new algorithm, called Q^-Learning
(Q-hat-Learning), finds policies that minimize the >>worst-case<< total
discounted cost. Most mathematical details are not presented here but can
be found in [Heg 94].
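
To make the contrast between the two criteria concrete, here is a minimal
sketch of value iteration under the expected value criterion and under a
Minimax (worst-case) criterion. It is not taken from the paper: the
two-state MDP, its costs and probabilities, the discount factor GAMMA and
the names TRANSITIONS, backup and value_iteration are all assumptions made
for illustration, and the worst-case backup shown is one straightforward
reading of "minimize the worst-case total discounted cost".

```python
# Hypothetical two-state MDP; every number here is illustrative only.
# TRANSITIONS[state][action] = list of (probability, next_state, cost)
GAMMA = 0.9
TRANSITIONS = {
    "start": {
        "safe": [(1.0, "goal", 2.0)],
        "risky": [(0.9, "goal", 1.0), (0.1, "goal", 10.0)],
    },
    "goal": {
        "stay": [(1.0, "goal", 0.0)],
    },
}


def backup(outcomes, values, criterion):
    """One-step lookahead cost of an action under the given criterion."""
    # Only outcomes that can actually occur (p > 0) are considered.
    totals = [(p, cost + GAMMA * values[nxt])
              for p, nxt, cost in outcomes if p > 0]
    if criterion == "expected":
        return sum(p * total for p, total in totals)
    if criterion == "minimax":
        # Worst case: the most expensive possible outcome.
        return max(total for _, total in totals)
    raise ValueError("unknown criterion: " + criterion)


def value_iteration(criterion, sweeps=100):
    """Return (state values, greedy policy) after a fixed number of sweeps."""
    values = {state: 0.0 for state in TRANSITIONS}
    for _ in range(sweeps):
        values = {
            state: min(backup(outcomes, values, criterion)
                       for outcomes in actions.values())
            for state, actions in TRANSITIONS.items()
        }
    policy = {
        state: min(actions,
                   key=lambda a: backup(actions[a], values, criterion))
        for state, actions in TRANSITIONS.items()
    }
    return values, policy


if __name__ == "__main__":
    for criterion in ("expected", "minimax"):
        values, policy = value_iteration(criterion)
        print(criterion, policy["start"], values["start"])
```

In this toy problem the expected value criterion prefers the "risky"
action, whose average cost is slightly lower, while the worst-case
criterion prefers "safe", because "risky" can occasionally cost 10.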
----------------------------------------------------------------------------
Here is an example of retrieving and printing the file:
-> ftp ftp.gmd.de
Connected to gmdzi.gmd.de.
220 gmdzi FTP server (Version 5.72 Fri Nov 20 20:35:05 MET 1992) ready.
Name (ftp.gmd.de:heger): anonymous
331 Guest login ok, send your email-address as password.
Password:
230-This is an experimental FTP Server. See /README for details.
This site is in Germany, Europe. Please restrict downloads to
our non-working hours (i.e outside of 08:00-18:00 MET, Mo-Fr)
*** Local time is 12:25:22 MET
230 Guest login ok, access restrictions apply.
ftp> cd Learning/rl/papers
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get heger.consider-risk.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for heger.consider-risk.ps.Z (100477 bytes).
226 Transfer complete.
local: heger.consider-risk.ps.Z remote: heger.consider-risk.ps.Z
100477 bytes received in 3.2e+02 seconds (0.3 Kbytes/s)
ftp> quit
221 Goodbye.
-> uncompress heger.consider-risk.ps.Z
-> lpr heger.consider-risk.ps
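
The same retrieval can be scripted. The following is a minimal sketch
using Python's standard ftplib module; the host, directory and filename
are those given above, while the function name fetch_paper, the
ftp_factory parameter and the placeholder email address are assumptions
made for illustration. Whether the server is still reachable is not
something this sketch can guarantee.

```python
from ftplib import FTP

HOST = "ftp.gmd.de"
REMOTE_DIR = "/Learning/rl/papers"
FILENAME = "heger.consider-risk.ps.Z"


def fetch_paper(ftp_factory=FTP, local_path=FILENAME,
                email="user@example.org"):
    """Download the compressed paper via anonymous FTP.

    ftp_factory is a hypothetical hook so that a stand-in class can be
    used instead of ftplib.FTP; email is a placeholder password.
    """
    ftp = ftp_factory(HOST)           # connects to the host
    try:
        ftp.login("anonymous", email)  # guest login, email as password
        ftp.cwd(REMOTE_DIR)
        with open(local_path, "wb") as out:
            # retrbinary transfers in binary (image) mode, like "binary"
            # followed by "get" in the interactive session.
            ftp.retrbinary("RETR " + FILENAME, out.write)
        ftp.quit()
    finally:
        ftp.close()                    # safe even after quit()
    return local_path
```

Calling fetch_paper() leaves heger.consider-risk.ps.Z in the current
directory; it still has to be uncompressed and printed as shown above.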
-------------------------------------------------------------------------------
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ Matthias Heger +
+ Zentrum fuer Kognitionswissenschaften, Universitaet Bremen, +
+ Postfach 330 440 +
+ D-28334 Bremen, Germany +
+ +
+ email: heger at informatik.uni-bremen.de +
+ Tel.: +49 (0) 421 218 4659 +
+ Fax: +49 (0) 421 218 3054 +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++