MELLODDY Project Meets its Year One Objective: Deployment of the world’s first secure platform for multi-task federated learning in drug discovery


Tags: Cancer / Investment / ML


Date: September 17th, 2020


The project has the potential to solve the challenges of data sharing within pharmaceutical research while also significantly advancing drug discovery.

Machine Learning Ledger Orchestration for Drug Discovery (MELLODDY) announced today that it has met its year one objective: the creation of a secure predictive modelling platform and the first successful Federated Learning run using this new platform. MELLODDY is an Innovative Medicines Initiative-funded consortium of 10 pharmaceutical partners (Amgen; Astellas; AstraZeneca; Bayer; Boehringer Ingelheim; GSK; Institut De Recherches Servier; Janssen Pharmaceutica NV; Merck KGaA; and Novartis) and seven technical partners (Budapesti Muszaki Es Gazdasagtudomanyi Egyetem; Iktos; Kubermatic; KU Leuven; NVIDIA; Owkin; and Substra Foundation).

Federated Learning (FL) is a Machine Learning (ML) technique that enables researchers to train artificial intelligence (AI) models on distributed data, at scale, across multiple institutions, without centralizing the data. The MELLODDY project is creating a solution that will facilitate a new form of ‘coopetition’, in which competitors have a mutual interest in building predictive models that benefit from a parallel effort while still protecting their private research, data, information, and models. MELLODDY’s successful deployment of the platform is the world’s first FL experiment in drug discovery performed at this scale and between competitive industrial partners. With this milestone achievement, 10 pharmaceutical companies, which are otherwise in competition with one another, have simultaneously trained their predictive models to learn from all the data submitted by each pharmaceutical partner.

The aim of the MELLODDY consortium is to develop a cutting-edge FL platform that enables the generation and enhancement of predictive ML models using distributed pharmaceutical data, without exposing or revealing any individual company’s proprietary data and models. This type of collaborative, yet still protected, data collection process has the potential to solve the challenges of data sharing within pharmaceutical research while also significantly advancing drug discovery and development opportunities.

Hugo Ceulemans, Project Leader of MELLODDY and Scientific Director, Discovery Data Sciences at Janssen Pharmaceutica NV

We now have an operational platform, rigorously vetted by the consortium’s 10 pharmaceutical partners, found to be secure to host their data – an enormous accomplishment. Over the next year, we’ll turn our focus on studying the hypothesis that multi-partnered modelling will yield superior predictive models for drug discovery.

How the collaborative model works

Partners securely register their proprietary datasets in their own local instance of the distributed platform, which allows the private models to learn from the aggregated knowledge of all partners, without sharing private data. The development of the MELLODDY platform and the execution of the first federated run were a key milestone and a technical triumph, achieved by the MELLODDY consortium as the first-year objective of this three-year project.

The design, implementation and operation of the secure platform for multi-task FL were built on the previous work and expertise of the seven technical partners. The research partners (BME, Iktos, and NVIDIA) focused on implementing ML for drug discovery, ensuring privacy, and optimizing training speed on NVIDIA GPUs, while the operational partners (Owkin, Kubermatic, KU Leuven, and Substra Foundation) developed and provided the code for the platform. Owkin provided Owkin Connect, its novel privacy-preserving framework to enable multi-task FL. KU Leuven provided SparseChem, an open-source library for training ML models specific to drug discovery, and Kubermatic (formerly Loodse) deployed their Kubermatic Kubernetes Platform to build the scalable infrastructure for each pharmaceutical partner. Finally, Substra Foundation managed the technical operations, monitored the executions of the platform, and hosted the open-source code which is part of Owkin Connect.
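For readers who want a concrete picture of how private models can learn from aggregated knowledge without sharing data, the following is a minimal, illustrative sketch in plain Python. It is not MELLODDY platform code and does not use Owkin Connect or SparseChem. It assumes a toy linear model in which each partner trains shared weights plus a private offset on its own data, and a coordinator combines only the shared weights by size-weighted averaging (a standard federated averaging scheme chosen here for illustration; the announcement does not specify the aggregation method). All function names, data, and parameters are hypothetical, and the security measures of a real deployment are omitted.

```python
import random


def local_update(shared_w, private_b, data, lr=0.05, epochs=5):
    """Train on one partner's local data (hypothetical helper).

    Only the updated shared weights are meant to leave the partner;
    the raw data and the private offset stay local.
    """
    w, b = list(shared_w), private_b
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def federated_average(weight_lists, sizes):
    """Average the partners' shared weights, weighted by dataset size."""
    total = sum(sizes)
    dim = len(weight_lists[0])
    return [
        sum(w[i] * n for w, n in zip(weight_lists, sizes)) / total
        for i in range(dim)
    ]


def make_partner_data(rng, true_w, offset, n):
    """Generate synthetic data for one partner (purely illustrative)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x)) + offset + rng.gauss(0, 0.01)
        data.append((x, y))
    return data


def run_federation(rounds=30, seed=0):
    rng = random.Random(seed)
    true_w = [2.0, -1.0, 0.5]      # structure common to all partners
    offsets = [0.0, 1.0, -2.0]     # partner-specific part of each task
    partners = [make_partner_data(rng, true_w, off, 40) for off in offsets]

    shared_w = [0.0] * len(true_w)
    private_b = [0.0] * len(partners)
    for _ in range(rounds):
        updates = []
        for k, data in enumerate(partners):
            w_k, private_b[k] = local_update(shared_w, private_b[k], data)
            updates.append(w_k)    # only shared weights are sent out
        shared_w = federated_average(updates, [len(d) for d in partners])
    return shared_w, private_b


if __name__ == "__main__":
    shared, private = run_federation()
    print("shared weights:", [round(v, 2) for v in shared])
    print("private offsets:", [round(v, 2) for v in private])
```

In this toy setting, the coordinator sees only weight vectors, never the synthetic datasets, and each partner keeps a component of its model that no one else receives. That mirrors, in a much simplified form, the split between shared learning and protected private models described above.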

The platform passed extensive and rigorous security audits by an external company and by the IT teams of each pharmaceutical partner to ensure data privacy and protection.

Mathieu Galtier, Project Coordinator of MELLODDY and Chief Product Officer at Owkin

This was an absolute prerequisite for the deployment of the platform on sensitive data and was the project’s major challenge in the first year. After a very intense year of collaborative effort, the platform is now functional, audited and tested at scale. From a technical perspective, it simply works. It is now up to our pharmaceutical partners to translate this development success into scientific breakthroughs.

To that end, the pharmaceutical partners have already begun an extensive scientific and business case assessment of the results of the first cycle of modelling runs. The outcome, de-identified and aggregated across all partners, is being considered for publication. Over the next two years, the MELLODDY project will focus on improving the performance of the common predictive model by exposing it to an increasing amount of data.

Learn more

About MELLODDY (platform, project, partners) – visit
About the Innovative Medicines Initiative – visit
About potentially participating in future federated runs – visit our website’s FAQ page