Connectionists: PhD Position at CentraleSupelec/Loria, Nancy, France

Jeremy Fix Jeremy.Fix at centralesupelec.fr
Mon Mar 12 12:26:54 EDT 2018


/////////////////////////////////////////////////////////////////////

               PhD Studentship Available on

      Motivated Multi-scale Self-Organization for the
      emergence of coordinated sensorimotor behaviors

            BISCUIT team, CentraleSupelec/Loria,
                       Nancy, France

Keywords: artificial intelligence, reinforcement learning,
self-organization, fine-grained collective dynamical systems.

/////////////////////////////////////////////////////////////////////

We invite applications for a PhD studentship in the BISCUIT team at the 
CentraleSupelec/Loria laboratory to investigate the emergence of 
coordinated sensorimotor behaviors in artificial agents. We focus in 
particular on Spatialized and Distributed Population Computing 
(SDP-Computing), in which a collective behavior emerges from the 
interaction of massively parallel, distributed, decentralized, and 
adaptive simple computing units with local communications. As embodiment 
theory posits, interaction with a rich and complex environment is 
necessary for interesting behaviors to emerge.

We work with learning agents that start with a minimal set of "innate" 
skills and that develop more abstract representations and 
functionalities allowing them to survive in their environment. The 
learning of these skills is driven by intrinsic motivations and external 
rewards.
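
One common way to make "intrinsic motivation plus external reward" 
concrete (an illustrative reading only, not the team's specific 
proposal) is to add a curiosity bonus proportional to the prediction 
error of a learned forward model. In the minimal Python sketch below, 
the linear model, the toy dynamics, and all constants are assumptions 
chosen for illustration.

    import numpy as np

    # Sketch: reward shaping by intrinsic motivation. The agent gets the
    # external reward plus a curiosity bonus equal to the prediction
    # error of a simple learned forward model (illustrative assumption).
    rng = np.random.default_rng(1)
    W = np.zeros((2, 2))        # linear forward model of the next state
    eta = 0.05                  # forward-model learning rate
    beta = 0.5                  # weight of the intrinsic bonus

    def true_step(s):
        # unknown environment dynamics (toy linear system)
        return np.array([[0.9, 0.1], [-0.1, 0.9]]) @ s

    s = rng.normal(size=2)
    for step in range(1000):
        s_next = true_step(s)
        pred = W @ s
        pred_error = np.linalg.norm(s_next - pred)   # "surprise"
        r_ext = -np.linalg.norm(s_next)              # toy external reward
        r_total = r_ext + beta * pred_error          # motivated signal
        # improve the forward model (normalized LMS), so the intrinsic
        # bonus fades as the world becomes predictable
        W += eta * np.outer(s_next - pred, s) / (s @ s + 1e-8)
        s = s_next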

During this thesis, we would thus like to work with learning models 
that evolve continuously in time and that are compatible with the 
general framework of Reinforcement Learning (Sutton and Barto, 1998), 
and to see how these models could allow an agent to build more abstract 
and complex behaviors.
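
As a purely illustrative sketch of what "continuous in time yet 
compatible with RL" can mean (not a model prescribed by this project), 
the Python fragment below couples a continuously evolving state, 
integrated with an Euler scheme, to a standard tabular TD(0) update 
whose discount is scaled to the time step. All dynamics, parameters, 
and the random placeholder policy are assumptions.

    import numpy as np

    dt = 0.01          # integration time step (s)
    gamma = 0.99       # discount per unit time
    alpha = 0.1        # learning rate
    n_states = 50
    V = np.zeros(n_states)               # tabular value estimate

    def dynamics(x, u):
        # toy continuous dynamics: drift plus control
        return -0.5 * x + u

    def to_bin(x):
        # discretize the continuous state for the tabular critic
        return int(np.clip((x + 5.0) / 10.0 * n_states, 0, n_states - 1))

    x = np.random.uniform(-5, 5)         # initial continuous state
    for step in range(10_000):
        u = np.random.uniform(-1, 1)     # placeholder random policy
        x_next = x + dt * dynamics(x, u) # Euler integration step
        r = -x_next**2 * dt              # reward rate integrated over dt
        s, s_next = to_bin(x), to_bin(x_next)
        # TD(0) update with discounting scaled to the time step
        V[s] += alpha * (r + (gamma ** dt) * V[s_next] - V[s])
        x = x_next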

For example, several works have shown that models specified within 
Dynamical Systems Theory (DST) (Kelso, 1995) can learn simple, 
reflex-like sensorimotor behaviors (Beer, 1995; Spencer et al., 2011), 
but such behaviors by themselves do not allow an agent to reach 
long-term, distant goals.
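
Beer's (1995) analyses are typically carried out with continuous-time 
recurrent neural networks (CTRNNs). The following minimal sketch 
Euler-integrates the standard CTRNN equation 
tau_i * dy_i/dt = -y_i + sum_j w_ji * sigma(y_j + theta_j) + I_i; all 
sizes, constants, and inputs are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5                                # number of neurons
    tau = rng.uniform(0.5, 2.0, n)       # membrane time constants
    w = rng.normal(0.0, 1.0, (n, n))     # recurrent weights w[j, i]
    theta = rng.normal(0.0, 1.0, n)      # biases
    y = np.zeros(n)                      # neuron states
    dt = 0.01

    def sigma(z):
        # logistic activation
        return 1.0 / (1.0 + np.exp(-z))

    for step in range(5000):
        I = np.zeros(n)                  # external (sensory) input; zero here
        dy = (-y + w.T @ sigma(y + theta) + I) / tau
        y = y + dt * dy                  # Euler integration step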

Some of the questions we would like to address in this thesis are:

* while central pattern generators are plausible candidates for 
providing motor primitives, it is still unclear how a distributed model 
can build, on top of these motor primitives, sensorimotor behaviors 
operating on a longer time scale (a minimal sketch of such primitives 
follows this list);

* how could these higher-level, more abstract sensorimotor behaviors be 
integrated by the model to articulate its overall behavior?

* how could the system learn even more complex behaviors, gradually 
bootstrapping its sensorimotor capabilities?

* can we operationalize the theories of Warren (2006) and Keijzer 
(2001), which advocate anticipatory behaviors?

* is it possible to take inspiration from work on hierarchical 
reinforcement learning with options (Sutton et al., 1999; Dietterich, 
2000) in a continuous framework? (see the options sketch after this 
list)
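
As a concrete reading of the first question above, central pattern 
generators are often modeled as coupled phase oscillators. The sketch 
below uses a common Kuramoto-style formulation (an illustrative 
assumption, not a model prescribed by this project): two oscillators 
whose anti-phase relation locks, yielding rhythmic outputs usable as 
motor primitives.

    import numpy as np

    dt = 0.01
    omega = np.array([2.0 * np.pi, 2.0 * np.pi])  # intrinsic freqs (rad/s)
    k = 2.0                                       # coupling strength
    phi = np.array([0.0, 2.5])                    # initial phases

    for step in range(2000):
        # each oscillator is pulled toward anti-phase with the other
        coupling = k * np.sin(phi[::-1] - phi - np.pi)
        phi = phi + dt * (omega + coupling)

    motor_command = np.sin(phi)   # rhythmic motor-primitive outputs
    print(motor_command)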
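
For the last question, an option in the sense of Sutton et al. (1999) 
is a triple (I, pi, beta): an initiation set, an intra-option policy, 
and a termination condition. The minimal sketch below writes that 
triple down directly; the concrete "go right" option and the integer 
environment are illustrative placeholders, not part of the project.

    from dataclasses import dataclass
    from typing import Callable
    import random

    @dataclass
    class Option:
        can_initiate: Callable[[int], bool]   # I: where the option may start
        policy: Callable[[int], int]          # pi: intra-option policy
        termination: Callable[[int], float]   # beta: P(terminate | state)

    def run_option(opt, state, step_env):
        """Execute an option until its termination condition fires."""
        assert opt.can_initiate(state)
        while True:
            action = opt.policy(state)
            state = step_env(state, action)
            if random.random() < opt.termination(state):
                return state

    # Illustrative "move right until state 10" option on an integer chain.
    go_right = Option(
        can_initiate=lambda s: s < 10,
        policy=lambda s: +1,
        termination=lambda s: 1.0 if s >= 10 else 0.0,
    )
    print(run_option(go_right, 0, step_env=lambda s, a: s + a))  # -> 10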

////////////////////////////////////////////////////////////////////////

The final decision about the funding will be made in the first days of 
June 2018. In accordance with French law, the gross monthly salary is 
between 1684.93 and 2024.70 euros.

Application deadline: 30 April 2018
Contact: alain.dutech at loria.fr AND jeremy.fix at centralesupelec.fr

More details can be found at:
http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/ 


////////////////////////////////////////////////////////////////////////

** References **
Beer, R. D. (1995). A dynamical systems perspective on agent-environment 
interaction. Artificial Intelligence, 72(1–2):173–215.

Dietterich, T. G. (2000). Hierarchical reinforcement learning with the 
MAXQ value function decomposition. Journal of Artificial Intelligence 
Research (JAIR), 13:227–303.

Kelso, J. S. (1995). Dynamic Patterns. MIT Press.

Keijzer, F. (2001). Representation and behavior. MIT Press.

Sutton, R. S., Precup, D., and Singh, S. (1999). Between MDPs and 
semi-MDPs: A framework for temporal abstraction in reinforcement 
learning. Artificial Intelligence, 112(1–2):181–211.

Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An 
Introduction. Bradford Book, MIT Press, Cambridge, MA.

Warren, W. H. (2006). The dynamics of perception and action. 
Psychological Review, 113(2):358–389.