Collaborative Federated Learning behind Hospitals’ Firewalls for Predicting Histological Response to Neoadjuvant Chemotherapy in Triple-Negative Breast Cancer
Abstract
Triple-Negative Breast Cancer (TNBC) is a rare cancer, characterized by high metastatic potential and poor prognosis, with limited treatment options compared to other breast cancers. The current standard of care in non-metastatic settings is neoadjuvant chemotherapy (NACT), with the goals of enabling breast-conserving surgery and providing an in vivo assessment of chemosensitivity. However, the efficacy of this treatment varies significantly across patients, and this heterogeneity in histological response is still poorly understood, partly due to the paucity of available curated TNBC data.
Motivated by this problem, we investigate the use of machine learning (ML) to predict, at diagnosis, the histological response to NACT for early TNBC patients. To overcome the known biases of related small-scale studies while respecting data privacy, we conduct, for the first time, a multi-centric TNBC study behind hospitals’ firewalls using collaborative Federated Learning (FL), thereby gaining access to enough TNBC data to sustain a complete investigation of response heterogeneity. We show evidence that local ML models relying on Whole-Slide Images (WSIs) at diagnosis are able to predict the histological response to NACT as accurately as current clinical approaches, which rely on time-consuming expert annotations. We demonstrate that collaborative training further improves performance over single-center training, outperforming clinical methods.
Our ML model is interpretable by design, and we show that it is sensitive to specific histological patterns. While we identify known predictive biomarkers among them, this proof of concept for real-world collaborative FL paves the way for future biomarker discovery using unprecedentedly large datasets.