Connectionists: [Post-Doc] Inria Bordeaux, FR: “Decoding Transformers”: Analyze Transformers for Brain Language Decoding and Encoding

Xavier Hinaut xavier.hinaut at inria.fr
Mon Apr 24 11:33:47 EDT 2023


A post-doc position is available at Inria Bordeaux, France:
"Decoding Transformers: Analyze Transformers for Brain Language Decoding and Encoding"

Email contact for questions (**until 3rd of May**): xavier dot hinaut at inria dot fr
Apply here: https://recrutement.inria.fr/public/classic/en/offres/2023-06169
Application deadline: **13th May 2023**

# Keywords
Brain encoding/decoding; Transformers; Language; Computational Neuroscience; fMRI/MEG/EEG brain data

# Duration & start date
24 months starting 1st October 2023

# Scientific research context
The development of new mechanistic models of brain activity can help us better understand how the brain works. This is particularly true for language, where a better understanding of the cognitive mechanisms involved could lead to improved treatments for developmental language disorders in children and to improved rehabilitation methods after brain injuries that cause aphasia. The development of theoretical models of the neural dynamics underlying brain functions, including learning, is essential to (1) better understand the general functioning of the brain and (2) explore new paths that are not accessible through purely experimental (neurobiological) methods.
In this project we will consider several imaging methods with different spatial and temporal resolutions: functional Magnetic Resonance Imaging (fMRI, high spatial resolution), ElectroEncephalography (EEG, high temporal resolution), MagnetoEncephalography (MEG, high temporal resolution and better spatial resolution than EEG), and ElectroCorticoGraphy (ECoG, similar to EEG but less noisy, as it is recorded directly on the surface of the cortex and covers a smaller area of the brain).
The number and quality of openly available brain imaging datasets have increased rapidly in recent years [Li et al., 2021, Nastase et al., 2021, Gwilliams et al., 2022]. This is a great opportunity for modellers, who do not usually have easy access to such data, and it also makes it possible to compare different algorithms on the same data, making the research more robust and reproducible.

# Work description
The long-term objective is to obtain mechanistic models that are both explanatory and good predictors of brain imaging data. At the moment we have mechanistic models that are not very predictive (e.g. based on Reservoir Computing) and predictive models that are not very explanatory (e.g. Transformers). The goal is to obtain the best of both worlds: identifying the mechanisms necessary for language processing while improving the predictive capacities of mechanistic models. To this end, this research project aims to understand the mechanisms that allow Transformers to predict brain activity. In the long term, the project also aims to build models that are more biologically plausible than Transformers, by reusing the Transformer components that best predict brain activity and by introducing constraints derived from our knowledge of cognitive and brain functions.
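For readers unfamiliar with the mechanistic side, the sketch below shows what such a Reservoir Computing model looks like in practice with our ReservoirPy library (linked in the signature): a fixed random recurrent network whose states are read out by a single trained linear layer. The task (one-step-ahead prediction of a sine wave) and all hyperparameter values are illustrative placeholders only, not part of the project specification.

```python
# Minimal ReservoirPy sketch of a mechanistic (Echo State Network) model.
# The task and hyperparameters are illustrative placeholders only.
import numpy as np
from reservoirpy.nodes import Reservoir, Ridge

# Toy signal: predict the next time step of a sine wave.
X = np.sin(np.linspace(0, 6 * np.pi, 300)).reshape(-1, 1)

reservoir = Reservoir(units=100, lr=0.3, sr=1.25)  # fixed random recurrent pool
readout = Ridge(ridge=1e-6)                        # trained linear readout
esn = reservoir >> readout                         # chain nodes into a model

esn.fit(X[:-1], X[1:], warmup=10)   # learn one-step-ahead prediction
y_pred = esn.run(X[:-1])            # reservoir states -> readout predictions
print("prediction MSE:", float(np.mean((y_pred - X[1:]) ** 2)))
```

Only the readout is trained; the recurrent dynamics are fixed and inspectable, which is what makes this family of models explanatory even when their raw predictive scores lag behind Transformers.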
The objective of this project is divided into two sub-objectives: 1. to identify the components of the Transformer architecture that best predict the activity recorded by different imaging techniques (fMRI, EEG, ECoG, MEG) during linguistic tasks; 2. to identify which types of models fail to predict brain activity in a meaningful way, in order to derive a "set of neural computations" needed for the prediction of brain activity.
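As an illustration of sub-objective 1, the standard method in the literature is a layer-wise encoding analysis: regress each Transformer layer's hidden states onto the recorded brain responses and compare cross-validated predictive scores across layers. The sketch below is a minimal, hypothetical version of that pipeline in Python (HuggingFace transformers + scikit-learn); the stimulus, the random "brain" matrix, and the choice of GPT-2 with ridge regression are placeholders, not project specifications.

```python
# Hypothetical sketch of a layer-wise brain-encoding analysis: which GPT-2
# layer best predicts (placeholder) brain responses to a word sequence?
import numpy as np
import torch
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score
from transformers import GPT2Model, GPT2Tokenizer

words = ["the", "quick", "brown", "fox"] * 50    # placeholder stimulus text
rng = np.random.default_rng(0)
brain = rng.standard_normal((len(words), 100))   # placeholder data: (words, voxels)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

inputs = tokenizer(" ".join(words), return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).hidden_states  # (n_layers + 1) tensors

scores = {}
for layer, h in enumerate(hidden_states):
    X = h.squeeze(0).numpy()                       # (n_tokens, hidden_dim)
    n = min(len(X), len(brain))                    # naive token/word alignment
    ridge = RidgeCV(alphas=np.logspace(-2, 4, 7))
    # Cross-validated R^2 of predicting all voxels from this layer's states.
    scores[layer] = cross_val_score(ridge, X[:n], brain[:n], cv=5).mean()

best = max(scores, key=scores.get)
print(f"best-predicting layer: {best} (mean CV R^2 = {scores[best]:.3f})")
```

On real data the alignment step is the delicate part (tokens must be mapped to word or scan onsets, and fMRI responses require convolution with a haemodynamic response function), but the regress-and-compare-layers logic stays the same across modalities.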
The majority of the analyses will be based on publicly available data; some will be based on data already acquired by Gaël Jobard. Because language models such as Transformers evolve rapidly, we will adapt the mechanisms sought and the data analysed according to advances in language models. For example, new multimodal models (e.g. taking images and language into account simultaneously) should be available soon; in that case, we will exploit these models and also analyse corpora containing images. This may allow us to get closer to human representations, which are multimodal in nature, while mitigating the "symbol grounding problem" [Harnad, 1990] that applies to purely linguistic models such as ChatGPT.
The hired postdoc will collaborate with the PhD students in the team. If time permits, in parallel to the data analysis, the postdoc will set up an fMRI experimental protocol using the experimental facilities available to Gaël Jobard.

# Required Knowledge and background
- Good background in computational neuroscience, computer science, physics and/or mathematics;
- A strong interest in neuroscience, linguistics and the physiological processes underlying learning;
- Python programming, with experience with scientific libraries such as NumPy/SciPy (or a similar language: MATLAB, etc.);
- Experience in machine learning or data mining;
- Experience with neural imaging data;
- Independence and ability to manage a project;
- Good English reading/speaking skills.

# Advisors
Xavier Hinaut & Gaël Jobard
This work will be co-supervised by Gaël Jobard, also at the Institute for Neurodegenerative Diseases (Institut des Maladies Neurodégénératives, IMN, Pellegrin Hospital Campus, Bordeaux).

# Other positions 
Other open positions are available here: https://github.com/neuronalX/phd-and-postdoc-positions

Best regards,

Xavier Hinaut
Inria Research Scientist
www.xavierhinaut.com -- +33 5 33 51 48 01
Mnemosyne team, Inria, Bordeaux, France -- https://team.inria.fr/mnemosyne
& LaBRI, Bordeaux University --  https://www4.labri.fr/en/formal-methods-and-models
& IMN (Neurodegenerative Diseases Institute) -- http://www.imn-bordeaux.org/en
---
Our Reservoir Computing library: https://github.com/reservoirpy/reservoirpy

