Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer
Abstract
Triple-negative breast cancer (TNBC) is a rare cancer, characterized by high metastatic potential and poor prognosis, and has limited treatment options.
The current standard of care in nonmetastatic settings is neoadjuvant chemotherapy (NACT), but treatment efficacy varies substantially across patients. This heterogeneity is still poorly understood, partly due to the paucity of curated TNBC data. Here we investigate the use of machine learning (ML) leveraging whole-slide images and clinical information to predict, at diagnosis, the histological response to NACT for women with early TNBC. To overcome the biases of small-scale studies while respecting data privacy, we conducted a multicentric TNBC study using federated learning, in which patient data remain secured behind hospitals’ firewalls. We show that local ML models relying on whole-slide images can predict response to NACT but that collaborative training of ML models further improves performance, on par with the best current approaches in which ML models are trained using time-consuming expert annotations. Our ML model is interpretable and is sensitive to specific histological patterns.
This proof-of-concept study, in which federated learning is applied to real-world datasets, paves the way for future biomarker discovery using unprecedentedly large datasets.