<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

/////////////////////////////////////////////////////////////////////

    <br>

    <br>

                  PhD Studentship Available on

    <br>

    <br>

         Motivated Multi-scale Self-Organization for the

    <br>

         emergence of coordinated sensorimotor behaviors

    <br>

    <br>

               BISCUIT team, CentraleSupelec/Loria,

    <br>

                          Nancy, France

    <br>

    <br>

    Keywords: artificial intelligence, reinforcement learning,

    <br>

    self-organization, fine-grained collective dynamical systems.

    <br>

    <br>

/////////////////////////////////////////////////////////////////////

    <br>

    <br>

    We invite applications for PhD studentships in the BISCUIT team at

    the CentraleSupelec/Loria laboratory to investigate the emergence of

    coordinated sensorimotor behaviors in artificial agents. We

    especially focus on Spatialized and Distributed Population Computing

    (SDP-Computing) where a collective behavior emerges from the


    interaction of massively parallel, distributed, decentralized and

    adaptive simple computing units with local communications. According

    to embodiment theory, interaction with a rich and complex

    environment is necessary for interesting behavior to emerge.

    <br>

    <br>

    We work with learning agents that start with a minimal set of

    "innate" skills and that develop more abstract representations and

    functionalities allowing them to survive in their environment.

    Learning of these skills is driven by intrinsic motivations and

    external rewards.

    <br>

    <br>

    In this thesis, we therefore want to work with learning models that

    evolve continuously in time and are compatible with the general

    framework of Reinforcement Learning (Sutton and Barto, 2016), and to

    study how these models could allow an agent to build more abstract

    and complex behaviors.

    <br>

    <br>

    For example, several studies have shown that models specified within

    Dynamic Systems Theory (DST) (Kelso, 1995) can learn simple reflex

    sensorimotor behaviors (Beer, 1995; Spencer et al., 2011), but these

    behaviors alone do not let an agent reach long-term, distant

    goals.

    <br>

    <br>

    Some of the questions we would like to address in this thesis are:

    <br>

    <br>

    * central pattern generators are plausible candidates for providing

    motor primitives, but how can a distributed model build

    sensorimotor behaviors on a longer time scale on top of these

    primitives?

    <br>

    <br>

    * how could these higher-level, more abstract sensorimotor behaviors

    be integrated by the model to articulate its behavior?

    <br>

    <br>

    * how could the system learn even more complex behaviors, gradually

    bootstrapping its sensorimotor capabilities?

    <br>

    <br>

    * can we make operational the theories of Warren (2006) and Keijzer

    (2001), which advocate anticipatory behaviors?

    <br>

    <br>

    * is it possible to take inspiration from work on hierarchical

    reinforcement learning using options (Sutton et al., 1999;

    Dietterich, 2000) in a continuous framework?

    <br>

    <br>

////////////////////////////////////////////////////////////////////////

    <br>

    <br>

    The final decision about the funding will be made in the first days

    of June 2018. According to French law, the gross monthly salary is

    between 1,684.93 and 2,024.70 euros.

    <br>

    <br>

    Application deadline: 30 April 2018

    <br>

    Contact: <a class="moz-txt-link-abbreviated"

      href="mailto:alain.dutech@loria.fr">alain.dutech@loria.fr</a> and

    <a class="moz-txt-link-abbreviated"

      href="mailto:jeremy.fix@centralesupelec.fr">jeremy.fix@centralesupelec.fr</a>

    <br>

    <br>

    More details can be found at:

    <br>

    <a class="moz-txt-link-freetext"

href="http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/">http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/</a>

    <br>

    <br>

////////////////////////////////////////////////////////////////////////

    <br>

    <br>

    ** References **

    <br>

    Beer, R. D. (1995). A dynamical systems perspective on

    agent-environment interaction. Artificial Intelligence,

    72(1–2):173–215.

    <br>

    <br>

    Dietterich, T. (2000). Hierarchical reinforcement learning with the

    MAXQ value function decomposition. Journal of Artificial

    Intelligence Research (JAIR), 13:227–303.

    <br>

    <br>

    Keijzer, F. (2001). Representation and behavior. MIT Press.

    <br>

    <br>

    Kelso, J. S. (1995). Dynamic Patterns. MIT Press.

    <br>

    <br>

    Sutton, R., Precup, D., and Singh, S. (1999). Between MDPs and

    semi-MDPs: A framework for temporal abstraction in reinforcement

    learning. Artificial Intelligence, 112(1–2):181–211.

    <br>

    <br>

    Sutton, R. and Barto, A. (2016). Reinforcement learning: An

    Introduction. Bradford Book, MIT Press, Cambridge, MA.

    <br>

    <br>

    Warren, W. H. (2006). The dynamics of perception and action.

    Psychological Review, 113(2):358–389.

  </body>

</html>