The genetic diagnosis of rare diseases has been revolutionized by high-throughput sequencing (HTS) technologies, which generate massive volumes of data and thus raise new data-handling challenges. In this context, the MDLab is involved in:

  • the adaptation of the throughput, computing and storage capacities required to process massive data from next-generation sequencing (NGS), including Whole Genome Sequencing (WGS);
  • the development of pipelines and computer tools for data analysis;
  • the establishment of clinical-genomic databases interoperable with European databases;
  • the implementation of two research projects: one on the cross-referencing of multi-omics data, and one on determining the fraction of fetal DNA circulating in maternal blood for non-invasive screening for trisomy 21 by NGS.

The Genetics project currently involves the following two sub-projects:

GenomeMixer and NiPTUNE: novel bioinformatics tools to improve Non-Invasive Prenatal Testing for fetal aneuploidies

Introduction. The discovery of free circulating fetal DNA in maternal blood has led to the development of Non-Invasive Prenatal Testing (NIPT) techniques for detecting fetal aberrations such as trisomy 21. NIPT is based on the analysis by massively parallel sequencing of small fragments of DNA circulating freely in the maternal blood. Its reliability relies heavily on both the presence in the maternal blood of a sufficient amount of fetal DNA (ff for fetal fraction) and a sufficient sequencing depth (sd). Bioinformatics tools have been developed to determine the ff but there is currently no reference method in clinical practice.
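
To illustrate why the fetal fraction matters, here is a minimal sketch of a chromosome 21 z-score, a statistic commonly used in NIPT. The text does not state which statistic NiPTUNE uses, so the function names, the reference values and the z-score approach itself are assumptions made for illustration only. The sketch relies on the approximation that a trisomic fetus raises the share of chromosome 21 reads by a factor of about (1 + ff/2).

```python
from statistics import mean, stdev


def expected_trisomy_fraction(euploid_fraction, ff):
    """Approximate chr21 read fraction when the fetus carries trisomy 21.

    The fetus contributes three copies of chr21 instead of two, so the
    chr21 share rises by roughly ff / 2 (the small change in the total
    read count is ignored here).
    """
    return euploid_fraction * (1 + ff / 2)


def chr21_z_score(sample_fraction, reference_fractions):
    """Z-score of a sample's chr21 read fraction against euploid references."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)
    return (sample_fraction - mu) / sigma


# Hypothetical chr21 read fractions from euploid reference samples.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]

z_high_ff = chr21_z_score(expected_trisomy_fraction(0.0130, 0.10), reference)
z_low_ff = chr21_z_score(expected_trisomy_fraction(0.0130, 0.02), reference)
print(f"z at ff=10%: {z_high_ff:.2f}, z at ff=2%: {z_low_ff:.2f}")
```

With these made-up reference values, the same trisomy yields a much smaller z-score at a low fetal fraction, which is why a minimum ff is needed for a reliable call. Sequencing depth acts on the other side of the ratio: fewer reads mean a noisier reference distribution and therefore a larger denominator.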

Materials and Methods. We developed GenomeMixer, an approach to estimate confidence intervals in aneuploidy prediction, which can be used by all diagnostic laboratories, and NiPTUNE, a software package to automate NIPT analysis. GenomeMixer creates synthetic sequencing data by mixing reads from pregnant women carrying an aneuploid fetus with reads from non-pregnant women, in order to determine the minimum thresholds of both ff and sd necessary to detect aneuploidies reliably. The NiPTUNE pipeline is composed of several modules, including quality control, ff estimation by different methods, and aneuploidy prediction.
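
The mixing idea can be sketched as follows. This is not GenomeMixer's actual implementation, which is not described here: the function names, the representation of reads as list items and the simple proportional dilution are all assumptions. Diluting a pregnant sample of known fetal fraction with reads from a non-pregnant woman lowers the fetal fraction in proportion to the share of reads taken from the pregnant sample.

```python
import random


def mixing_counts(source_ff, target_ff, total_reads):
    """Return (n_pregnant, n_nonpregnant) reads to draw for the target ff.

    Assumes the non-pregnant sample contains no fetal DNA, so the mixed
    fetal fraction equals source_ff * n_pregnant / total_reads.
    """
    if not 0 < target_ff <= source_ff <= 1:
        raise ValueError("need 0 < target_ff <= source_ff <= 1")
    n_pregnant = round(total_reads * target_ff / source_ff)
    return n_pregnant, total_reads - n_pregnant


def mix_reads(pregnant_reads, nonpregnant_reads, source_ff, target_ff,
              total_reads, seed=0):
    """Build a synthetic sample with the requested ff and read count."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n_p, n_np = mixing_counts(source_ff, target_ff, total_reads)
    if n_p > len(pregnant_reads) or n_np > len(nonpregnant_reads):
        raise ValueError("not enough reads in one of the input samples")
    mixed = rng.sample(pregnant_reads, n_p) + rng.sample(nonpregnant_reads, n_np)
    rng.shuffle(mixed)
    return mixed


# Hypothetical read identifiers standing in for real sequencing reads.
pregnant = [f"p{i}" for i in range(500)]
nonpregnant = [f"n{i}" for i in range(800)]
synthetic = mix_reads(pregnant, nonpregnant, source_ff=0.10,
                      target_ff=0.04, total_reads=1000)
```

Varying `target_ff` and `total_reads` over a grid and running aneuploidy prediction on each synthetic sample is one way to locate the lowest ff and sd at which detection remains reliable.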

Results. We tested our tools on two cohorts of 377 and 1055 samples, including 10 and 24 aneuploidies respectively. GenomeMixer determined the minimum thresholds of both ff and sd. NiPTUNE was validated by a retrospective study performed on the two cohorts.

Conclusions. We have developed bioinformatics tools to enable each diagnostic laboratory to determine the confidence intervals of both ff and sd necessary for a reliable NIPT.



MitoIntegrOMICS: an integrated multi-OMICS approach to increase the ability to diagnose mitochondrial diseases

Mitochondrial diseases (MD) are rare disorders caused by deficiency of the mitochondrial respiratory chain, which provides energy in each cell. MD are caused by alterations (variants) in the genes involved in mitochondrial functions. MD diagnosis is based on the identification of the gene(s) responsible for the disease, which then makes it possible to offer genetic counselling and prenatal diagnosis, to consider therapeutic approaches and to improve patient care. The technologies currently used to detect causal variants are far from exhaustive, identifying only 25 to 50% of them.

To address this need, the MDLab proposes to bring together three domains (medicine, bioinformatics and machine learning) in order to set up an integrated multi-omics approach to identify novel causal variants. We foresee that this project will contribute to the introduction of new diagnostic tools that reduce the number of patients left in a diagnostic stalemate. This study will define the milestones for transferring the joint use of multi-omics technologies from research to diagnostic settings, and it can be divided into three main steps:

1) performing a bioinformatic analysis of multi-omics data;

2) developing a multi-omics integrational approach;

3) implementing a new variant prioritization AI algorithm.

This project aims to develop new algorithms that will find application not only in the diagnosis of MD, but also in that of other genetic disorders and cancer, and will support the development of personalized medicine to improve patient healthcare.

Project members: Prof. Sylvie Bannwarth (CHU Nice, IRCAN UCA), Dr. John Boudjarane (CHU Timone, Marseille), Dr. Véronique Duboc (CHU Nice, IRCAN UCA), Prof. Véronique Paquis-Flucklinger (CHU Nice, IRCAN UCA), Prof. Vincent Procaccio (CHU Angers), Dr. Cécile Rouzier (CHU Nice, IRCAN UCA), Dr. Samira Saadi (CHU Nice, IRCAN UCA).