*Post-doc position: Information retrieval for medical scientific 
publications*


- M. Constant (U. of Lorraine). Website :

- M. Clausel (U. Lorraine). Website :

- R.S. Stoica (U. Lorraine). Website :

*Other partners of the project:* C. Francois (INIST), P. Oudet 
(Cancéropôle Est), F. Schaffner (Cancéropôle Est), N. Thouvenin (INIST).

*Location: *University of Lorraine, Nancy (France)

*Keywords: *Natural language processing, word embeddings, biomedical 
text mining, graph matching


The Cancéropôle Est is one of the 7 Cancéropôles created by the first 
national Cancer Plan in 2003. Its mission is to organize, coordinate, 
and strengthen cancer research in partnership with academic and 
clinical institutions by bringing together researchers, healthcare 
professionals, industry partners and patients.

The aim of the project is to map the scientific research in oncology 
in the two French regions Grand Est and Bourgogne-Franche-Comté, using 
the full text of the scientific publications of each research team in 
the two regions.

*Description of the position:*

This position is funded by AMIES, University of Lorraine and Cancéropôle 
Est. With this position, we would like to use text mining techniques to 
extract characteristics related to the scientific content of the 
publications of each research team in Grand Est and 
Bourgogne-Franche-Comté.

The recruited person will work on the following points:

- Preprocessing of the data. The data will be provided by the 
Cancéropôle Est and will consist of several full texts in XML or PDF 
format.

- Learning of oncology embeddings (see e.g. [1]). INIST will provide 
training data to learn the embeddings and the ontology.

- Extraction of characteristics related to the scientific content of 
publications for each research team.

- Combination of these characteristics with the collaboration graph of 
each team (see e.g. [2]) to provide general characteristics for each team.

- Integration in a visualization tool.
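
As a rough illustration of the first and third points above, here is a minimal sketch, not the project's actual pipeline. It uses only the Python standard library: it strips the markup from XML full texts and ranks the terms that distinguish one team from the others with a simple TF-IDF score. The sample documents, the team names and all function names are hypothetical, and the project itself would presumably rely on learned embeddings rather than raw term counts.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: (team name, XML full text) pairs.
SAMPLE_DOCS = [
    ("team_a", "<article><title>Tumor imaging</title>"
               "<body>MRI imaging of tumor growth in glioma.</body></article>"),
    ("team_b", "<article><title>Tumor genomics</title>"
               "<body>Sequencing of tumor mutations in glioma cohorts.</body></article>"),
]

def xml_to_text(xml_string):
    """Concatenate all non-empty text nodes of an XML document."""
    root = ET.fromstring(xml_string)
    return " ".join(chunk.strip() for chunk in root.itertext() if chunk.strip())

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def team_term_scores(docs):
    """Return {team: {term: tf-idf}}, pooling each team's texts into one document."""
    counts = {}
    for team, xml_string in docs:
        counts.setdefault(team, Counter()).update(tokenize(xml_to_text(xml_string)))
    n_teams = len(counts)
    # Number of teams in which each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    scores = {}
    for team, c in counts.items():
        total = sum(c.values())
        scores[team] = {
            term: (n / total) * math.log(n_teams / doc_freq[term])
            for term, n in c.items()
        }
    return scores

def top_terms(scores, team, k=3):
    """Return the k highest-scoring terms of a team (ties broken alphabetically)."""
    ranked = sorted(scores[team].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

if __name__ == "__main__":
    scores = team_term_scores(SAMPLE_DOCS)
    for team in scores:
        print(team, top_terms(scores, team))
```

Terms shared by every team (here "tumor" or "glioma") get a score of zero, so only team-specific vocabulary surfaces. This is the kind of signal that could later be combined with the collaboration graph.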

The recruited person will benefit from the expertise of Cancéropôle Est, 
INIST and University of Lorraine in text mining and statistical learning.

We would ideally like to recruit an 11-month post-doc with the following 
preferred skills:

- Knowledgeable in natural language processing, text mining and word 
embeddings

- Knowledgeable in machine learning

- Good programming skills in Python (classical NLP libraries, 
scikit-learn, PyTorch and/or TensorFlow)

- Very good English (understanding and writing)

The candidates should send a CV, 2 names of referees and a cover letter 
to the researchers mentioned above (Mathieu.Constant at, 
marianne.clausel at, radu-stefan.stoica at 
The selected candidates will be interviewed in February for an expected 
start in March/April 2019.


[1] J. Lee et al. BioBERT: a pre-trained biomedical language 
representation model for biomedical text mining. Ed. Jonathan Wren. 
Bioinformatics (2019).

[2] Q. Laporte-Chabasse et al. Morpho-statistical description of 
networks through graph modelling and Bayesian inference. Preprint (2019).

