<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>

<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">

<div name="quoted-content">

<div style="font-family: Verdana;font-size: 12.0px;">

<div>

<div>Objective of the thesis: The objective is to improve the modelling of relationships between chemical properties and biochemical/biological activity. We have recently concentrated on so-called "pharmacophores", molecular substructures based on emerging patterns that are supported by large amounts of data. Based on these pharmacophores, one can define a pharmacophoric space (where each molecule is represented by numerical coordinates) that is somewhat orthogonal to standard chemical spaces. To support this, we will work on distance measures between pharmacophores, starting from graph edit distances. The resulting distance matrix will make it possible to reduce dimensionality so that the data can be represented in 2 or 3 dimensions, i.e. in a way that human experts can interpret (ideally, a pharmacophoric space that preserves the neighbor relations of the original data as closely as possible), and so that molecules can be grouped effectively by their biological activity (via clustering). The original data will be chosen in close collaboration with CERMN, the "Centre d'Etudes et de Recherche sur le Médicament de Normandie".</div>
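
<div>As an illustration of what "representing the neighbor relations of the original data" could mean in practice, here is a minimal sketch in Python. It assumes a precomputed distance matrix and a candidate 2D embedding, and scores the embedding by the mean fraction of each item's k nearest neighbors that it preserves. The function names, the toy distances, and the choice of this particular score are assumptions made for illustration only; they are not part of the project or of Norns.</div>

<pre>
```python
import math


def k_nearest(dist_row, i, k):
    """Indices of the k nearest neighbours of item i, excluding i itself."""
    others = [j for j in range(len(dist_row)) if j != i]
    # Ties are broken by index so that the result is deterministic.
    return set(sorted(others, key=lambda j: (dist_row[j], j))[:k])


def neighbor_preservation(dist, coords, k):
    """Mean fraction of each item's k nearest neighbours (by dist) kept in coords."""
    n = len(dist)
    total = 0.0
    for i in range(n):
        # Euclidean distances from item i to every item in the embedding.
        emb_row = [math.dist(coords[i], coords[j]) for j in range(n)]
        original = k_nearest(dist[i], i, k)
        embedded = k_nearest(emb_row, i, k)
        total += len(original.intersection(embedded)) / k
    return total / n


# Hypothetical distances between four pharmacophores (symmetric, zero diagonal).
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 4],
    [4, 3, 0, 1],
    [5, 4, 1, 0],
]

# Two hypothetical 2D embeddings of the same four items.
good = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
bad = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

score_good = neighbor_preservation(dist, good, k=1)
score_bad = neighbor_preservation(dist, bad, k=1)
print(score_good, score_bad)
```
</pre>

<div>A score of 1 means every item keeps all of its k nearest neighbors in the embedding; a score of 0 means none are kept. A score of this kind could be used to compare the output of different dimensionality reduction methods on the same distance matrix.</div>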


<div> </div>


<div>Methods:<br/>

1) becoming familiar with the Norns platform for pharmacophore generation, and improving an existing program (resulting from a prior project) that defines and calculates distances between pharmacophores<br/>

2) integration of linear (PCA, MDS, ...) and non-linear (Sammon, isoMDS, SOM, GTM, ...) dimensionality reduction methods to arrive at a 2D or 3D representation space based on pharmacophores<br/>

3) integration of clustering methods (k-Means, SVM, ...)<br/>

4) integration of predictive approaches based on the groupings derived from the prior step<br/>

5) preliminary integration of expert feedback options that allow improving both the final representation and its interpretation (in close collaboration with CERMN). Predictive methods and expert feedback will be addressed in parallel. Predictive approaches will use chemical and pharmacophoric similarities as starting points (the former being based on chemical fingerprints). Expert feedback will start from evaluating how well expert constraints are represented by the clusters, and how good those clusters are. The intended result of this step is close agreement between the expert point of view and the automatically derived grouping of the data.</div>
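
<div>To make the last step more concrete, the following Python sketch shows one possible way to measure how well expert constraints are represented by a clustering. It assumes, purely for illustration, that expert knowledge is given as "must-link" and "cannot-link" pairs of molecules; the molecule identifiers, the cluster labels, and the function name are hypothetical and do not come from the project.</div>

<pre>
```python
def constraint_agreement(labels, must_link, cannot_link):
    """Fraction of expert pair constraints that a cluster assignment satisfies."""
    satisfied = 0
    # A must-link pair is satisfied when both molecules share a cluster.
    for a, b in must_link:
        satisfied += labels[a] == labels[b]
    # A cannot-link pair is satisfied when the molecules are separated.
    for a, b in cannot_link:
        satisfied += labels[a] != labels[b]
    total = len(must_link) + len(cannot_link)
    # With no constraints at all, nothing is violated.
    return satisfied / total if total else 1.0


# Hypothetical cluster labels for five molecules.
labels = {"m1": 0, "m2": 0, "m3": 1, "m4": 1, "m5": 0}

# Hypothetical expert constraints (assumed format).
must_link = [("m1", "m2"), ("m3", "m5")]
cannot_link = [("m1", "m3"), ("m2", "m4")]

agreement = constraint_agreement(labels, must_link, cannot_link)
print(agreement)
```
</pre>

<div>In this toy example, three of the four constraints are satisfied, so the agreement is 0.75. A value close to 1 would correspond to the close agreement between expert view and automatic grouping that this step aims for.</div>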


<div> </div>


<div>This master's thesis combines two cutting-edge fields of research: data mining and chemoinformatics. It involves two distinct but closely collaborating research groups, CoDaG (Constraints, Data, and Graphs) of the Greyc laboratory and CERMN. If a project currently under review for French funding is accepted, a PhD thesis could follow on from the work done during the master's thesis.<br/>

Master's theses in France are considered internships and are therefore remunerated at approximately 530 euros/month.</div>


<div> </div>


<div>Contact: albrecht.zimmermann@unicaen.fr or bertrand.cuissart@unicaen.fr.</div>

</div>

</div>

</div>

</div>

</div></div></body></html>