Connectionists: Postdoc position on recurrent neural networks for music improvisation

Mon Sep 21 09:31:15 EDT 2015

LAB: MULTISPEECH team, Inria Nancy, France
PI: Emmanuel Vincent (emmanuel.vincent at inria.fr)
START: between December 2015 and March 2016
DURATION: 16 months
TO APPLY: send a CV, a motivation letter, a list of publications, and 
one or more recommendation letters to emmanuel.vincent at inria.fr by 
October 9, 2015

Automatic music improvisation aims to enable a machine to listen to 
other musicians and improvise with them in real time. While recurrent 
neural networks (RNNs) have proven beneficial for generating pitch 
sequences [1] and polyphonic music [2,3], current improvisation 
systems still rely on variable-order N-grams of pitch sequences, which 
can be learned in real time [4].

The goal of this postdoc position is to introduce the use of 
(potentially deep) RNNs in the context of automatic music improvisation. 
One or more of the following challenges shall be investigated:
- learn the RNN from a small amount of data using musically-motivated 
network architectures and parameter tying strategies
- update the RNN and generate meaningful music in real time given input 
by the other musicians
- jointly model heterogeneous musical dimensions (pitch, rhythm, 
harmony...) along the lines of [5]
- jointly account for multiple time scales (tatum, beat, bar, structural 
block...)
To do so, we will leverage recent advances both in deep learning and in 
music modeling, e.g., [6,7].

This position is part of a funded project with Ircam. The successful 
candidate will collaborate with a PhD student and participate in project 
meetings at Ircam.

Salary: 2600 €/month gross, plus free health insurance and additional 
benefits

Ideal profile:
Prospective candidates should hold, or be about to obtain, a PhD in 
machine learning or speech and music processing. Knowledge of RNNs and 
hands-on RNN programming experience (e.g., with Theano) are required. 
Previous experience with music is not required but would be an asset.

[1] D. Eck and J. Schmidhuber, "Finding temporal structure in music: 
Blues improvisation with LSTM recurrent networks", in Proc. NNSP, 2002.

[2] N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent, "Modeling 
temporal dependencies in high-dimensional sequences: Application to 
polyphonic music generation and transcription", in Proc. ICML, 2012.

[3] I.-T. Liu and B. Ramakrishnan, "Bach in 2014: Music composition with 
recurrent neural network", arXiv:1412.3191, 2014.

[4] G. Assayag and S. Dubnov, "Using factor oracles for machine 
improvisation", Soft Computing, 2004.

[5] G. Bickerman, S. Bosley, P. Swire, and R. M. Keller, "Learning to 
create jazz melodies using deep belief nets", in Proc. ICCC, 2010.

[6] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. 
Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with 
convolutions", in Proc. CVPR, 2015.

[7] F. Bimbot, G. Sargent, E. Deruty, C. Guichaoua, and E. Vincent, 
"Semiotic description of music structure: An introduction to the 
Quaero/Metiss structural annotations", in Proc. AES 53rd Int. Conf. on 
Semantic Audio, 2014.

-- 
Emmanuel Vincent
Multispeech Project-Team
Inria Nancy - Grand Est
615 rue du Jardin Botanique, 54600 Villers-lès-Nancy, France
Phone: +33 3 8359 3083 - Fax: +33 3 8327 8319
Web: http://www.loria.fr/~evincent/