Connectionists: PhD Position at CentraleSupelec/Loria, Nancy, France
Jeremy Fix
Jeremy.Fix at centralesupelec.fr
Mon Mar 12 12:26:54 EDT 2018
/////////////////////////////////////////////////////////////////////
PhD Studentship Available on
Motivated Multi-scale Self-Organization for the
emergence of coordinated sensorimotor behaviors
BISCUIT team, CentraleSupelec/Loria,
Nancy, France
Keywords: artificial intelligence, reinforcement learning,
self-organization, fine-grained collective dynamical systems.
/////////////////////////////////////////////////////////////////////
We invite applications for PhD studentships in the BISCUIT team at the
CentraleSupelec/Loria laboratory to investigate the emergence of
coordinated sensorimotor behaviors in artificial agents. We especially
focus on Spatialized and Distributed Population Computing
(SDP-Computing) where a collective behavior emerges from the
interaction of massively parallel, distributed, decentralized and
adaptive simple computing units with local communications. According to
embodiment theory, interaction with a rich and complex environment is
necessary for interesting behavior to emerge.
We work with learning agents that start with a minimum set of "innate"
skills and that develop more abstract representations and
functionalities allowing them to survive in their environment. Learning
of these skills is driven by intrinsic motivations and external rewards.
Thus, during this thesis, we would like to work with learning models
that evolve continuously in time and that are compatible with the
general framework of Reinforcement Learning (Sutton and Barto, 2016) and see how
these models could allow an agent to build more abstract and complex
behaviors.
For example, several works have shown that models specified within
Dynamic Systems Theory (DST) (Kelso, 1995) can learn simple, reflex-like
sensorimotor behaviors (Beer, 1995; Spencer et al., 2011), but these
behaviors alone do not allow an agent to reach long-term, distant goals.
Some of the questions we would like to address in this thesis are:
* while central pattern generators are plausible candidates for
providing motor primitives, it remains unclear how a distributed model
can build sensorimotor behaviors on a longer time scale on top of these
motor primitives;
* how could these higher-level, more abstract sensorimotor behaviors be
integrated by the model to articulate its behavior?
* how could the system learn even more complex behaviors, gradually
bootstrapping its sensorimotor capabilities?
* can we make operational the theories of Warren (2006) and Keijzer
(2001), which advocate anticipatory behaviors?
* is it possible to take inspiration from work on hierarchical
reinforcement learning using options (Sutton et al., 1999; Dietterich,
2000) in a continuous framework?
////////////////////////////////////////////////////////////////////////
The final decision about the funding will be made in the first days of
June 2018. According to French law, the gross monthly salary is between
1684.93 and 2024.70 euros.
Application deadline: 30 April 2018
Contact: alain.dutech at loria.fr AND jeremy.fix at centralesupelec.fr
More details can be found at:
http://www.loria.fr/en/jobs-training/phd-offer-motivated-multi-scale-self-organization-for-the-emergence-of-coordinated-sensorimotor-behaviors/
////////////////////////////////////////////////////////////////////////
** References **
Beer, R. D. (1995). A dynamical systems perspective on agent-environment
interaction. Artificial Intelligence, 72(1–2):173–215.
Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ
value function decomposition. Journal of Artificial Intelligence
Research (JAIR), 13:227–303.
Kelso, J. S. (1995). Dynamic Patterns. MIT Press.
Keijzer, F. (2001). Representation and behavior. MIT Press.
Sutton, R., Precup, D., and Singh, S. (1999). Between MDPs and
semi-MDPs: A framework for temporal abstraction in reinforcement
learning. Artificial Intelligence, 112(1–2):181–211.
Sutton, R. and Barto, A. (2016). Reinforcement learning: An
Introduction. Bradford Book, MIT Press, Cambridge, MA.
Warren, W. H. (2006). The dynamics of perception and action.
Psychological Review, 113(2):358–389.