<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
/////////////////////////////////////////////////////////////////////
<br>
<br>
PhD Studentship Available on
<br>
<br>
Motivated Multi-scale Self-Organization for the
<br>
emergence of coordinated sensorimotor behaviors
<br>
<br>
BISCUIT team, CentraleSupelec/Loria,
<br>
Nancy, France
<br>
<br>
Keywords: artificial intelligence, reinforcement learning,
<br>
self-organization, fine-grained collective dynamical systems.
<br>
<br>
/////////////////////////////////////////////////////////////////////
<br>
<br>
We invite applications for a PhD studentship in the BISCUIT team at
the CentraleSupelec/Loria laboratory to investigate the emergence of
coordinated sensorimotor behaviors in artificial agents. We focus in
particular on Spatialized and Distributed Population Computing
(SDP-Computing), where a collective behavior emerges from the
interaction of massively parallel, distributed, decentralized and
adaptive simple computing units with local communications. As posited
by embodiment theory, interaction with a rich and complex environment
is necessary for interesting behavior to emerge.
<br>
<br>
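To make this concrete, here is a minimal toy sketch in Python (our
own illustration, not the team's actual model) of such a population:
a ring of simple units, each communicating only with its two
neighbours, whose collective activity nevertheless comes to peak
around a localized stimulus even though no unit has global knowledge:
<pre>
# Toy example of spatialized, distributed population computing:
# a ring of simple units with purely local (nearest-neighbour)
# communication; the activity profile self-organizes around the
# strongest input.
import numpy as np

N = 100                      # number of units on the ring
dt, tau = 0.1, 1.0           # integration step and time constant
u = np.zeros(N)              # unit activations
rng = np.random.default_rng(0)
inputs = rng.random(N) * 0.1
inputs[40:45] = 1.0          # a localized stimulus

for _ in range(500):
    # each unit only sees its immediate neighbours
    neighbours = 0.5 * (np.roll(u, 1) + np.roll(u, -1))
    du = (-u + np.tanh(neighbours + inputs)) / tau
    u = u + dt * du

print("peak of the emergent activity:", int(np.argmax(u)))  # near 42
</pre>
<br>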
We work with learning agents that start with a minimal set of
"innate" skills and that develop more abstract representations and
functionalities allowing them to survive in their environment. The
learning of these skills is driven by intrinsic motivations and
external rewards.
<br>
<br>
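As a hedged illustration of this last point (a standard
curiosity-style recipe, not necessarily the model that will be used
in the thesis), the total reward can mix the external reward with an
intrinsic bonus proportional to the agent's prediction error; all
names below are illustrative:
<pre>
# Sketch: the total reward adds an intrinsic "surprise" bonus
# (forward-model prediction error) to the external reward.
# beta and the function names are illustrative placeholders.
import numpy as np

beta = 0.5  # weight of the intrinsic drive (illustrative value)

def total_reward(r_ext, predicted_obs, actual_obs):
    # intrinsic reward = prediction error on the next observation
    r_int = np.linalg.norm(actual_obs - predicted_obs)
    return r_ext + beta * r_int
</pre>
<br>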
Thus, during this thesis, we would like to work with learning models
that evolve continuously in time and that are compatible with the
general framework of Reinforcement Learning (Sutton and Barto, 1998),
and to see how such models could allow an agent to build more
abstract and complex behaviors.
<br>
<br>
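For reference, the basic building block of that framework is the
agent-environment interaction loop with a temporal-difference update.
The sketch below (our illustration; the env object is a hypothetical
stand-in following the usual reset()/step() convention) shows tabular
TD(0) value estimation:
<pre>
# Minimal sketch of the RL interaction loop with a TD(0) update
# (Sutton and Barto, 1998). The env object and env.sample_action()
# are hypothetical placeholders for the sensorimotor environment.
import numpy as np

def td0(env, n_states, episodes=100, alpha=0.1, gamma=0.99):
    V = np.zeros(n_states)                 # state-value estimates
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            a = env.sample_action()        # e.g. a random policy
            s_next, r, done = env.step(a)
            # temporal-difference update toward the bootstrapped target
            target = r + gamma * V[s_next] * (0.0 if done else 1.0)
            V[s] = V[s] + alpha * (target - V[s])
            s = s_next
    return V
</pre>
<br>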
For example, several works have shown that models specified within
Dynamical Systems Theory (DST) (Kelso, 1995) can learn simple, reflex
sensorimotor behaviors (Beer, 1995; Spencer et al., 2011), but these
behaviors alone do not allow the agent to reach long-term, distant
goals.
<br>
<br>
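For concreteness, a typical DST model in this line of work is the
continuous-time recurrent neural network (CTRNN) used by Beer (1995)
for simple sensorimotor behaviors. The following generic sketch
(random placeholder weights, not a trained controller) shows how such
a network is integrated:
<pre>
# Generic CTRNN sketch, Euler-integrating
#   tau * dy/dt = -y + W @ sigma(y + theta) + I
# Weights and biases are random placeholders for illustration.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 5                                  # number of neurons
tau = np.ones(n)                       # membrane time constants
W = rng.normal(0, 1, size=(n, n))      # recurrent weights
theta = rng.normal(0, 1, size=n)       # biases
y = np.zeros(n)                        # neuron states
dt = 0.01

def step(y, external_input):
    dy = (-y + W @ sigmoid(y + theta) + external_input) / tau
    return y + dt * dy

for t in range(1000):
    y = step(y, external_input=np.zeros(n))
</pre>
<br>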
Some of the questions we would like to address in this thesis are:
<br>
<br>
* while we know that central pattern generators might be candidates
for providing motor primitives, it is still unclear how a distributed
model can build, on top of these motor primitives, sensorimotor
behaviors operating on a longer time scale;
<br>
<br>
* how could these higher-level, more abstract sensorimotor behaviors
be integrated by the model to articulate its overall behavior?
<br>
<br>
* how could the system learn ever more complex behaviors, gradually
bootstrapping its sensorimotor capabilities?
<br>
<br>
* can we operationalize the theories of Warren (2006) and Keijzer
(2001), which advocate anticipatory behaviors?
<br>
<br>
* is it possible to take inspiration from work on hierarchical
reinforcement learning with options (Sutton et al., 1999; Dietterich,
2000) in a continuous framework? (A sketch of the options framework
is given after this list.)
<br>
<br>
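As announced in the last question above, here is a hedged sketch of
the options framework of Sutton et al. (1999): an option is a
temporally extended action defined by an initiation set, an
intra-option policy, and a termination condition. All names below are
illustrative, and the env API is a hypothetical placeholder:
<pre>
# Sketch of an option (Sutton et al., 1999) and of its execution
# until the stochastic termination condition fires.
from dataclasses import dataclass
from typing import Callable
import random

@dataclass
class Option:
    can_start: Callable    # initiation set I: state -&gt; bool
    policy: Callable       # intra-option policy pi: state -&gt; action
    terminates: Callable   # termination beta: state -&gt; probability

def run_option(env, state, option, gamma=0.99):
    # run the option, accumulating the discounted reward along the way
    total, discount = 0.0, 1.0
    while True:
        action = option.policy(state)
        state, reward, done = env.step(action)   # hypothetical env API
        total += discount * reward
        discount *= gamma
        if done or option.terminates(state) > random.random():
            return state, total
</pre>
<br>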
////////////////////////////////////////////////////////////////////////
<br>
<br>
The final decision about the funding will be made in the first days
of June 2018. In accordance with French law, the gross monthly salary
is between 1,684.93 and 2,024.70 euros.
<br>
<br>
Application deadline: 30 April 2018
<br>
Contact: <a href="mailto:alain.dutech@loria.fr">alain.dutech@loria.fr</a> AND
<a href="mailto:jeremy.fix@centralesupelec.fr">jeremy.fix@centralesupelec.fr</a>
<br>
<br>
More details can be found at:
<br>
<a href="http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/">http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/</a>
<br>
<br>
////////////////////////////////////////////////////////////////////////
<br>
<br>
** References **
<br>
Beer, R. D. (1995). A dynamical systems perspective on
agent-environment interaction. Artificial Intelligence,
72(1–2):173–215.
<br>
<br>
Dietterich, T. (2000). Hierarchical reinforcement learning with the
MAXQ value function decomposition. Journal of Artificial Intelligence
Research (JAIR), 13:227–303.
<br>
<br>
Kelso, J. S. (1995). Dynamic Patterns. MIT Press.
<br>
<br>
Keijzer, F. (2001). Representation and Behavior. MIT Press.
<br>
<br>
Sutton, R., Precup, D., and Singh, S. (1999). Between MDPs and
semi-MDPs: A framework for temporal abstraction in reinforcement
learning. Artificial Intelligence, 112(1–2):181–211.
<br>
<br>
Sutton, R. and Barto, A. (1998). Reinforcement Learning: An
Introduction. MIT Press, Cambridge, MA.
<br>
<br>
Warren, W. H. (2006). The dynamics of perception and action.
Psychological Review, 113(2):358–389.
</body>
</html>