Our Publications

Nature Medicine
Pierre Courtiol, Charles Maussion, Matahi Moarii, Elodie Pronier, Samuel Pilcer, Meriem Sefta, Pierre Manceron, Sylvain Toldo, Mikhail Zaslavskiy, Nolwenn Le Stang, Nicolas Girard, Olivier Elemento, Andrew G. Nicholson, Jean-Yves Blay, Françoise Galateau-Sallé, Gilles Wainrib, Thomas Clozel

Malignant mesothelioma (MM) is an aggressive cancer primarily diagnosed on the basis of histological criteria. The World Health Organization classification subdivides mesothelioma tumors into three histological types: epithelioid MM (EMM), biphasic MM (BMM), and sarcomatoid MM (SMM). MM is a highly complex and heterogeneous disease, which makes its diagnosis and histological typing difficult and leads to suboptimal patient care and treatment decisions.

Here, we developed a new approach based on deep convolutional neural networks (CNNs), called MesoNet, to accurately predict the overall survival (OS) of mesothelioma patients from whole-slide digitized images (WSIs) without any pathologist-provided locally annotated regions. We validated MesoNet on both an internal validation cohort from the French MESOBANK and an independent cohort from The Cancer Genome Atlas (TCGA). We demonstrated that the model was more accurate in predicting patient survival than current pathology practices. Furthermore, unlike classical black-box deep learning methods, MesoNet identified regions contributing to patient outcome prediction.

Strikingly, we found that these regions are mainly located in the stroma and are histological features associated with inflammation, cellular diversity and vacuolization. These findings suggest that deep learning models can identify new features predictive of patient survival and potentially lead to new biomarker discoveries.

Benoît Schmauch, Alberto Romagnoni, Elodie Pronier, Charlie Saillard, Pascale Maillé, Julien Calderaro, Meriem Sefta, Sylvain Toldo, Thomas Clozel, Matahi Moarii, Pierre Courtiol, Gilles Wainrib

Deep learning methods for digital pathology analysis have proved an effective way to address multiple clinical questions, from diagnosis to prognosis and even to prediction of treatment outcomes. They have also recently been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed.

We propose a novel approach based on the integration of multiple data modes, and show that our deep learning model, HE2RNA, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without the need for expert annotation. HE2RNA is interpretable by design, opening up new opportunities for virtual staining. In fact, it provides virtual spatialization of gene expression, as validated by double-staining on an independent dataset. Moreover, the transcriptomic representation learned by HE2RNA can be transferred to improve predictive performance for other tasks, particularly for small datasets.

As an example of a task with direct clinical impact, we studied the prediction of microsatellite instability from hematoxylin & eosin stained images and our results show that better performance can be achieved in this setting.

Nature, Scientific Reports
Alberto Romagnoni, Simon Jégou, Kristel Van Steen, Gilles Wainrib, Jean-Pierre Hugot & International Inflammatory Bowel Disease Genetics Consortium (IIBDGC)

Crohn Disease (CD) is a complex genetic disorder for which more than 140 genes have been identified using genome-wide association studies (GWAS). However, the genetic architecture of the trait remains largely unknown. The recent development of machine learning (ML) approaches prompted us to apply them to classify healthy and diseased people according to their genomic information.

The Immunochip dataset, containing 18,227 CD patients and 34,050 healthy controls enrolled and genotyped by the International Inflammatory Bowel Disease Genetics Consortium (IIBDGC), was re-analyzed using a set of ML methods: penalized logistic regression (LR), gradient boosted trees (GBT) and artificial neural networks (NN). The main score used to compare the methods was the area under the ROC curve (AUC). An assessment of the impact of quality control (QC), imputation and coding methods on LR results showed that QC methods and imputation of missing genotypes may artificially increase the scores. In contrast, neither the patient/control ratio nor marker preselection or coding strategies significantly affected the results. LR methods, including Lasso, Ridge and ElasticNet, provided similar results with a maximum AUC of 0.80. GBT methods such as XGBoost, LightGBM and CatBoost, together with dense NN with one or more hidden layers, provided similar AUC values, suggesting limited epistatic effects in the genetic architecture of the trait.
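As a minimal sketch of the comparison metric, the AUC can be computed directly from case/control labels and model scores using the rank (Mann-Whitney) formulation. The function name `roc_auc` and the toy inputs are illustrative assumptions, not part of the study's code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    labels: sequence of 0/1 values (1 = case, 0 = control).
    scores: sequence of model scores, higher meaning "more likely a case".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")

    # U statistic of the cases, normalised by the number of case/control pairs.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical toy example: two controls and two cases.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls.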

ML methods detected nearly all the genetic variants previously identified by GWAS among the best predictors, plus additional predictors with lower effects. The robustness and complementarity of the different methods are also studied. Compared to LR, non-linear models such as GBT or NN may provide robust complementary approaches to identify and classify genetic markers.

Computational Toxicology
Mikhail Zaslavskiy, Simon Jégou, Eric W. Tramel, Gilles Wainrib

Timely assessment of compound toxicity is one of the biggest challenges facing the pharmaceutical industry today. A significant proportion of compounds identified as potential leads are ultimately discarded due to the toxicity they induce.

In this paper, we propose a novel machine learning approach for the prediction of molecular activity on ToxCast targets. We combine extreme gradient boosting with fully-connected and graph-convolutional neural network architectures trained on QSAR physical molecular property descriptors, PubChem molecular fingerprints, and SMILES sequences. Our ensemble predictor leverages the strengths of each individual technique, significantly outperforming existing state-of-the-art models on the ToxCast and Tox21 toxicity-related bioactivity-prediction datasets.

We provide free access to molecule bioactivity prediction using our model at

Diagnostic and Interventional Imaging
P. Herent, B. Schmauch, P. Jehanno, O. Dehaene, C. Saillard, C. Balleyguier, J. Arfi-Rouche, S. Jégou

The purpose of this study was to assess the potential of a deep learning model to discriminate between benign and malignant breast lesions using magnetic resonance imaging (MRI) and characterize different histological subtypes of breast lesions.

We developed a deep learning model that simultaneously learns to detect lesions and characterize them. We created a lesion-characterization model based on a single two-dimensional T1-weighted fat-suppressed MR image, selected by radiologists, obtained after intravenous administration of a gadolinium chelate. The data included 335 MR images from 335 patients, representing 17 different histological subtypes of breast lesions grouped into four categories (mammary gland, benign lesions, invasive ductal carcinoma and other malignant lesions). Algorithm performance was evaluated on an independent test set of 168 MR images using weighted sums of the area under the curve (AUC) scores. On the training set, we obtained a weighted average receiver operating characteristic (ROC)-AUC of 0.817, computed as the mean over three-shuffle three-fold cross-validation. Our model reached a weighted mean AUC of 0.816 on the independent challenge test set.
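To illustrate what a weighted mean AUC over several categories involves, here is a minimal sketch. The helper names, the category labels and the weights are hypothetical; the weighting actually used in the challenge is not stated here.

```python
def one_vs_rest_auc(labels, scores, positive):
    """AUC for one category against all others.

    Equals the probability that a sample of class `positive` receives a
    higher score than a sample of any other class (ties count one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need samples both inside and outside the category")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def weighted_mean_auc(per_task_auc, weights):
    """Weighted average of per-task AUC values (weights need not sum to 1)."""
    total_weight = sum(weights[task] for task in per_task_auc)
    weighted_sum = sum(weights[task] * auc for task, auc in per_task_auc.items())
    return weighted_sum / total_weight


# Hypothetical per-category AUCs and weights, for illustration only.
example_aucs = {"benign": 0.8, "malignant": 0.9}
example_weights = {"benign": 1.0, "malignant": 3.0}
example_score = weighted_mean_auc(example_aucs, example_weights)
```

Weighting lets a challenge emphasise the clinically more important tasks, so a single summary number can hide large differences between categories.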

This study shows good performance of a supervised-attention model with deep learning for breast MRI. This method should be validated on a larger and independent cohort.

Diagnostic and Interventional Imaging
B. Schmauch, P. Herent, P. Jehanno, O. Dehaene, C. Saillard, C. Aubé, A. Luciani, N. Lassau, S. Jégou

The purpose of this study was to create an algorithm that simultaneously detects and characterizes (benign vs. malignant) focal liver lesion (FLL) using deep learning.

We trained our algorithm on a dataset proposed during a data challenge organized at the 2018 Journées Francophones de Radiologie. The dataset was composed of 367 two-dimensional ultrasound images from 367 individual livers, captured at various institutions. The algorithm was guided using an attention mechanism with annotations made by a radiologist. The algorithm was then tested on a new dataset from 177 patients. The models reached mean ROC-AUC scores of 0.935 for FLL detection and 0.916 for FLL characterization over three shuffled three-fold cross-validations performed with the training data. On the new dataset of 177 patients, our models reached a weighted mean ROC-AUC score of 0.891 for seven different tasks.
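The evaluation protocol can be sketched as follows, assuming that "three shuffled three-fold cross-validations" means three independent reshuffles of the data, each split into three folds, for nine train/validation pairs in total. The function names and the `fit_and_score` callback are hypothetical.

```python
import random


def repeated_shuffled_kfold(n_samples, n_folds=3, n_repeats=3, seed=0):
    """Yield (train_indices, validation_indices) for repeated shuffled k-fold.

    Each repeat reshuffles all sample indices, then cuts them into
    `n_folds` disjoint validation folds of near-equal size.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            validation = folds[k]
            train = [idx for j, fold in enumerate(folds) if j != k for idx in fold]
            yield train, validation


def mean_cv_score(n_samples, fit_and_score, **kfold_options):
    """Average a user-supplied score over every train/validation split.

    `fit_and_score(train, validation)` is a hypothetical callback that
    trains a model on `train` and returns its score (for example a
    ROC-AUC) on `validation`.
    """
    scores = [
        fit_and_score(train, validation)
        for train, validation in repeated_shuffled_kfold(n_samples, **kfold_options)
    ]
    return sum(scores) / len(scores)
```

Averaging over several reshuffles reduces the dependence of the reported score on one particular partition, which matters with only a few hundred images.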

This study uses a supervised-attention mechanism for FLL detection and characterization from liver ultrasound images. This method could prove to be highly relevant for medical imaging once validated on a larger independent cohort.

Educational – Medium
Simon Jégou, Paul Herent

Healthcare is an industry that raises the highest hopes regarding the potential benefits of Artificial Intelligence (AI). Physicians and medical researchers will not become programmers or data scientists overnight, nor will they be replaced by them, but they will need an understanding of what AI actually is and how it works. Similarly, data scientists will need to collaborate closely with doctors to focus on relevant medical questions and understand the patients behind the data.

This case study aims to connect both audiences (physicians/medical personnel and data scientists) by providing insights into how to apply machine learning to a specific medical use case. We will walk you through the reasoning of our approach and will enable you to accompany us on a practical journey (via our Colab notebook) focused on understanding the underlying mechanics of an applied machine learning model.

Our experiment focuses on creating and comparing algorithms of increasing complexity in a successful attempt to estimate the physiological age of a brain based on Magnetic Resonance Imaging (MRI) data. Based on this experiment we propose how this imaging biomarker could have an impact on the understanding of neurodegenerative diseases such as Alzheimer’s.

Eric W. Tramel, Marylou Gabrié, Andre Manoel, Francesco Caltagirone, Florent Krzakala

Restricted Boltzmann machines (RBMs) are energy-based neural networks which are commonly used as the building blocks for deep neural architectures.

In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely-connected systems with weak interactions coming from spin-glass theory. While the TAP approach has been extensively studied for fully-visible binary spin systems, our construction is generalized to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support.

In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and are able to show interesting features of unsupervised learning which could not be directly observed with sampling. Additionally, we demonstrate how to utilize our TAP-based framework for leveraging trained RBMs as joint priors in denoising problems.
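To give a feel for what deterministic, sampling-free inference looks like, here is a minimal sketch of a TAP fixed-point iteration. It assumes the standard second-order TAP expansion for an RBM with binary {0, 1} units, which is a simpler setting than the generalized real-valued construction described above. The function name, argument layout and iteration count are illustrative assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tap_magnetizations(W, a, b, n_iter=50):
    """Iterate second-order TAP equations for a binary {0, 1} RBM.

    W[i][j]: coupling between visible unit i and hidden unit j.
    a, b:    visible and hidden biases.
    Returns (m_v, m_h), the estimated mean activations of each layer.
    """
    n_v, n_h = len(a), len(b)
    m_v = [0.5] * n_v
    m_h = [0.5] * n_h
    for _ in range(n_iter):
        # Hidden update: mean-field term minus the second-order
        # (Onsager reaction) correction, using the previous m_h.
        m_h = [
            sigmoid(b[j] + sum(
                W[i][j] * m_v[i]
                - W[i][j] ** 2 * (m_h[j] - 0.5) * (m_v[i] - m_v[i] ** 2)
                for i in range(n_v)
            ))
            for j in range(n_h)
        ]
        # Visible update, using the freshly updated hidden magnetizations.
        m_v = [
            sigmoid(a[i] + sum(
                W[i][j] * m_h[j]
                - W[i][j] ** 2 * (m_v[i] - 0.5) * (m_h[j] - m_h[j] ** 2)
                for j in range(n_h)
            ))
            for i in range(n_v)
        ]
    return m_v, m_h
```

With all couplings set to zero the iteration reduces to independent units, so each magnetization is simply the sigmoid of its bias. This gives a quick sanity check for the sketch.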


Analysis of histopathology slides is a critical step for many diagnoses, and in particular in oncology where it defines the gold standard. In the case of digital histopathological analysis, highly trained pathologists must review vast whole-slide images of extreme digital resolution (100,000² pixels) across multiple zoom levels in order to locate abnormal regions of cells, or in some cases single cells, out of millions. The application of deep learning to this problem is hampered not only by small sample sizes, as typical datasets contain only a few hundred samples, but also by the generation of ground-truth localized annotations for training interpretable classification and segmentation models.

We propose a method for disease localization in the context of weakly supervised learning, where only image-level labels are available during training. Even without pixel-level annotations, we are able to demonstrate performance comparable with models trained with strong annotations on the Camelyon-16 lymph node metastases detection challenge.

We accomplish this through the use of pre-trained deep convolutional networks, feature embedding, as well as learning via top instances and negative evidence, a multiple instance learning technique from the field of semantic segmentation and object detection.
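The idea of top instances and negative evidence can be sketched in a few lines: each tile of a slide receives a score, and only the highest-scoring and lowest-scoring tiles are kept as evidence for the slide-level decision. The function names, the value of `r` and the plain averaging step are illustrative assumptions. A trained classifier would normally consume the selected scores.

```python
def top_and_negative_instances(tile_scores, r=2):
    """Select the r highest and r lowest tile scores of one slide.

    The top scores act as evidence for the positive class, the bottom
    scores as negative evidence. Returned in descending order.
    """
    if r < 1 or len(tile_scores) < 2 * r:
        raise ValueError("need at least 2 * r tile scores and r >= 1")
    ranked = sorted(tile_scores, reverse=True)
    return ranked[:r] + ranked[-r:]


def slide_score(tile_scores, r=2):
    """Stand-in aggregation: average of the selected extreme scores.

    This replaces the learned slide-level classifier with a plain mean,
    purely for illustration.
    """
    selected = top_and_negative_instances(tile_scores, r)
    return sum(selected) / len(selected)


# Hypothetical tile scores for a single slide.
example_tiles = [0.1, 0.9, -0.5, 0.3, 0.0, 0.7]
example_selection = top_and_negative_instances(example_tiles, r=2)
```

Because the selected tiles can be traced back to their position on the slide, the same mechanism that drives the image-level prediction also indicates where the model found its evidence.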

Baptiste Goujaud, Eric W. Tramel, Pierre Courtiol, Mikhail Zaslavskiy, Gilles Wainrib

Detection of interactions between treatment effects and patient descriptors in clinical trials is critical for optimizing the drug development process. The increasing volume of data accumulated in clinical trials provides a unique opportunity to discover new biomarkers and further the goal of personalized medicine, but it also requires innovative robust biomarker detection methods capable of detecting non-linear, and sometimes weak, signals.

We propose a set of novel univariate statistical tests, based on the theory of random walks, which are able to capture non-linear and non-monotonic covariate-treatment interactions. We also propose a novel combined test, which leverages the power of all of our proposed univariate tests into a single general-case tool.

We present results for both synthetic and real-world clinical trials, where we compare our method with state-of-the-art techniques and demonstrate the utility and robustness of our approach.
