<table style="color: rgb(0, 0, 0);"><tbody><tr><td><p></p><pre style="border: 0pt; padding: 0pt; background-image: none; font-size: 2.5ex; line-height: 2.8ex; white-space: pre-wrap; word-wrap: break-word; background-position: initial initial; background-repeat: initial initial;"><i>Apologies for any cross-postings:</i></pre><pre style="border: 0pt; padding: 0pt; background-image: none; font-size: 2.5ex; line-height: 2.8ex; white-space: pre-wrap; word-wrap: break-word; background-position: initial initial; background-repeat: initial initial;"><strong><em>Emerging techniques and applications in Multi-Objective Reinforcement Learning (MORL)</em></strong></pre><pre style="border: 0pt; padding: 0pt; background-image: none; line-height: 2.8ex; white-space: pre-wrap; word-wrap: break-word;"><span style="font-size: 2.5ex; line-height: 2.8ex;">Direct links:</span></pre><pre style="border: 0pt; padding: 0pt; background-image: none; line-height: 2.8ex; white-space: pre-wrap; word-wrap: break-word;"><a href="https://www.elen.ucl.ac.be/esann/" style="font-size: 2.5ex;" target="_blank">https://www.elen.ucl.ac.be/esann/</a><span style="font-size: 2.5ex;"><br /><br /></span><a href="https://ai.vub.ac.be/ESANN_2015_MORL_special_session" style="font-size: 2.5ex;" target="_blank">https://ai.vub.ac.be/ESANN_2015_MORL_special_session</a><span style="font-size: 2.5ex;"><br /><br /></span><font face="Courier">Multi-objective optimization (MOO) and reinforcement learning (RL) are two well-established research fields in the area of learning, optimization, and control. RL addresses sequential decision-making problems in initially unknown stochastic environments, involving stochastic policies and unknown temporal delays between actions and observable effects. MOO, which is a sub-area of multi-criteria decision making (MCDM), considers the optimization of more than one objective simultaneously, and a decision maker, i.e.
an algorithm or a technique, decides either which solutions are important for the user or when to present these solutions to the user for further consideration. Currently, MOO algorithms are seldom used for stochastic optimization, which makes this combination a largely unexplored but promising research area.</font></pre><p><strong>State of the art</strong></p><p>Examples of approaches that combine MOO and RL are:</p><p><em>Multi-objective reinforcement learning</em> is an extension of RL to multi-criteria stochastic rewards (also called utilities in decision theory). Techniques from multi-objective evolutionary computation have been used for multi-objective RL in order to improve the exploration-exploitation tradeoff. The resulting algorithms are hybrids between MCDM and stochastic optimization. The RL algorithms are enriched with the intuition and efficiency of MOO in handling multi-objective problems.</p><p><em>Preference-based reinforcement learning</em> combines reinforcement learning and preference learning, extending RL with qualitative reward vectors, e.g. ranking functions, that can be directly used by the user. As in MORL algorithms, RL is extended with new order relations to rank the policies.</p><p>Some multi-objective evolutionary algorithms also use methods inspired by reinforcement learning to cope with noisy and uncertain environments.</p><p><br /></p><p><strong>Aim and scope</strong></p><p>The main goal of this special session is to solicit research on potential synergies between multi-objective optimization, evolutionary computation and reinforcement learning. We encourage submissions describing applications of MOO for agents acting in difficult environments that are possibly dynamic, uncertain and partially observable, e.g.
in games, multi-agent applications such as scheduling, and other real-world applications.</p><p><br /></p><p><strong>Topics of interest</strong></p><ul><li>Novel frameworks combining both MOO and RL</li><li>Multi-objective optimization algorithms such as meta-heuristics and evolutionary algorithms for dynamic and uncertain environments</li><li>Theoretical results on learnability in multi-objective dynamic and uncertain environments</li><li>On-line self-adapting systems or automatic configuration systems</li><li>Solving multi-objective sequential decision-making problems with RL</li><li>Real-world multi-objective applications in engineering, business, computer science, biological sciences, and scientific computation</li></ul><p><strong>Organizers</strong></p><p><strong>Madalina M. Drugan</strong> (mdrugan@vub.ac.be), <strong>Bernard Manderick</strong> (Bernard.Manderick@vub.ac.be) and <strong>Ann Nowe</strong> (anowe@vub.ac.be), Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium</p><p><br /></p><p><strong>Dates</strong></p><p>Submission of papers: <strong>21 November 2014</strong></p><p>Notification of acceptance: 31 January 2015</p><p>ESANN conference: 22 - 24 April 2015 in Bruges, Belgium</p><p><br /></p><p><strong>Author guidelines</strong></p><ul><li>Papers must not exceed 6 pages, including figures and references.</li><li>More information: <a href="https://www.elen.ucl.ac.be/esann/index.php?pg=guidelines" target="_blank">https://www.elen.ucl.ac.be/esann/index.php?pg=guidelines</a></li></ul></td></tr></tbody></table>