How to develop a COVID-19 AI-Severity Score in a Race Against a Pandemic
Faced with the rapid spread of COVID-19, research institutions, pharmaceutical and biotech companies, and government institutions across the globe have come together in a collective race to understand this pandemic and discover treatments for it. Owkin has published a paper in Nature Communications outlining a new AI-based multi-modal COVID-19 risk score, called AI-Severity, which improves severity prediction of hospitalized COVID-19 patients compared to existing scores.
This research is the result of a successful multi-center collaboration led by Nathalie Lassau at Institut Gustave Roussy. Conceived and submitted for publication within two months during a global health crisis, the AI-Severity score demonstrates how Owkin can improve medical research by enabling faster, more adaptable and better-targeted tools for understanding a disease.
The models developed in this paper are available through an open-sourced repository to help others advance their COVID-19 research and care.
A Global COVID-19 Research Race
As of January 2021, the COVID-19 global pandemic has affected 220 countries and infected over 87 million people. During its initial peak in spring 2020, roughly 1 in 5 people affected needed hospital care. Because of this need, the intensive care units (‘ICU’) of many hospitals were overrun and unable to cope. The alarming rate at which institutions have filled throughout the pandemic highlights the need for treatments for the virus, as well as an understanding of the disease and its progression.
To stop virus transmission, much of the world has focused on vaccine research. Several countries have approved and begun to roll out the vaccines of AstraZeneca, Moderna, and Pfizer, among others. This is no small feat. According to the World Economic Forum, it can typically take up to 10 years and $500 million to develop a vaccine. These COVID-19 vaccine efforts have produced stunning results in a matter of months. What is remarkable is not only the speed at which research institutions, pharmaceutical and biotech companies, and government institutions across the globe have moved, but also the breadth and depth of the collaboration undertaken to meet the challenges posed by COVID-19.
It was in this new “space race,” where the stakes were undeniably high, that Owkin kicked off several initiatives to address the pandemic threat. The most recent result of this endeavor, published in Nature Communications, came out of a successful multi-center collaboration led by Nathalie Lassau at Institut Gustave Roussy. The ScanCovIA consortium conceived, conducted, and submitted the research for publication in just two months at the height of the first wave of the pandemic. The result of this project is a new AI-based multi-modal COVID-19 risk score, called the AI-Severity score, which improves severity prediction of hospitalized COVID-19 patients compared to existing scores. Ultimately, this improved score can better support institutions in their management and treatment of COVID-19 patients.
How We Developed the AI-Severity Score
One of the more alarming aspects of the pandemic has been the rate of hospitalizations in times and areas of unmitigated spread. Inundated institutions struggle to balance the resources needed to treat COVID-19 patients against those needed for patients with other medical needs. In spring 2020, 1 in 5 people affected by the novel coronavirus required hospitalization. Once hospitalized, a portion of COVID-19 patients are likely to develop severe outcomes requiring mechanical ventilation or high-flow oxygenation. Among hospitalized patients, 14 to 30% will require admission to an ICU, 12 to 33% will require mechanical ventilation, and 20 to 33% will die [1, 2]. Once the hospital admits a patient, it is important to detect those at risk of severe outcomes in order to deliver proper care, to optimize the use of limited ICU resources, and to manage overall hospital resources and protocols.
How is Owkin uniquely positioned to help with COVID-19?
Since its inception in 2016, Owkin has championed a method of collaborative research, carried out through our platform, in order to improve drug development and patient outcomes while preserving patient privacy and data security. The Owkin Platform consists of:
A network of leading academic medical centers and multimodal datasets;
Our Federated Learning software that secures distributed, multiparty datasets and enables an unprecedented breadth of collaboration in healthcare, called Owkin Connect;
Our expertise in ML, AI, Biology and Medicine that enables us to derive actionable, interpretable insights from real-world data.
Through this carefully refined ecosystem, we can funnel our resources to fight the COVID-19 threat and to help rapidly develop a solution that relieves pressure on ICUs, thereby improving patient outcomes. In addition, Owkin leads other COVID-19 projects through its Covid-19 Open AI (‘COAI’) consortium.
Kicking off the ScanCovIA Project for COVID-19
Since the beginning of the COVID-19 pandemic, Owkin has focused on addressing this threat. We identified a potential research avenue for COVID-19: the prediction of disease prognosis. As with most of our projects, we first contacted the partners within our network. We discovered that Institut Gustave Roussy (‘IGR’) had initiated a similar project, and we joined forces.
Nathalie Lassau, Professor of Radiology at IGR, was the Principal Investigator of this multicentric study. The ScanCovIA consortium consisted of:
IGR: To coordinate the project, provide and annotate the data;
Kremlin-Bicêtre APHP (‘KB’): To provide the data;
Owkin: To help build, curate and anonymize the dataset, provide the biomedical machine learning expertise, and write the paper;
Digital Vision Center of CentraleSupélec and INRIA: To provide computer vision expertise.
The consortium aimed to identify prognostic biomarkers of disease severity. We based these biomarkers on routine clinical and biological data, as well as initial computerized tomography (CT) scans. CT scans permit radiologists and clinicians to observe whether lesions (damaged tissue) are present in the lungs.
COVID-19 on the Clock
A key feature of this project was the speed with which we conceived and conducted the research, and then wrote and submitted the findings for publication. Processes that are usually complicated and lengthy (contract negotiation, IT set-up, data collection and anonymization, machine learning analysis, and drafting a publication) were all completed in the record time of two months. This speed was only possible thanks to an existing IT server, installed at IGR for prior IGR and Owkin collaborative projects, and a substantial group effort from all teams to work together efficiently and around the clock in pursuit of this common goal.
“We are very proud to have set up a research partnership between two major French institutions and multiple partners across the data and medical research spectrum and to build a very rich cohort in such a short amount of time; less than 4 weeks,” Nathalie Lassau commented on the project. “I wanted to emphasize the prognostic value of imaging data, to show that it could be better used in clinical care. Finding such good predictors is a major public health priority and could help us better understand the disease.”
Why did we develop the 6-variable AI-Severity Score?
The ScanCovIA consortium set out to develop an improved risk score for hospitalized patients suffering from COVID-19. Current risk scores for severe deterioration combine factors such as age, sex, and comorbidities (other diseases in the same patient) with clinical and biological variables. Clinical variables are observations made in the clinic, such as low oxygen saturation (low levels of oxygen in the blood) and elevated respiratory rate. Biological variables are measured through blood tests and can include markers that reflect multi-organ failure.
Our collaboration showed that CT scans also contain valuable prognostic information (information that can help predict patient outcomes). Our goal was to evaluate the extent to which adding this new modality (type of data) to the existing variables analyzed upon hospitalization could improve the outcome prediction for patients.
How did we develop the AI-Severity Score?
To answer this question, we collected clinical and biological data, CT-scan images, and radiology reports from 1003 patients at two French hospitals, KB and IGR, and followed the steps outlined below.
Step 1: We determined the prognostic variables associated with COVID-19 severity
Firstly, we evaluated how clinical and biological variables measured at admission were associated with future severe progression. An oxygen flow rate of 15 L/min or higher, and/or mechanical ventilation, and/or patient death typically signifies severe progression.
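As a rough illustration, the outcome definition above could be encoded as a simple labeling rule. This is a minimal sketch and not the consortium's code: the class, field, and function names are hypothetical, and we assume the oxygen flow rate is available as the maximum value recorded during the stay.

```python
from dataclasses import dataclass

# Threshold taken from the text: 15 L/min or higher counts as severe.
SEVERE_OXYGEN_FLOW_L_MIN = 15.0


@dataclass
class PatientOutcome:
    """Hypothetical record of what happened to a patient after admission."""
    max_oxygen_flow_l_min: float   # assumed: highest oxygen flow rate recorded
    mechanically_ventilated: bool
    died: bool


def is_severe(outcome: PatientOutcome) -> bool:
    """Return True if any of the three severity criteria is met."""
    return (
        outcome.max_oxygen_flow_l_min >= SEVERE_OXYGEN_FLOW_L_MIN
        or outcome.mechanically_ventilated
        or outcome.died
    )
```

A label of this kind would serve as the prediction target for every model described in the steps below.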
Secondly, we calculated the severity odds ratios (the size of the effect on severity) for each variable at each hospital. Then, we combined the results from the two centers. This analysis identified 12 variables that were significantly associated with severity.
Afterwards, we assessed the predictive value of specific features in the radiology reports. Doing so, we identified 3 significant features in the lungs associated with future severity. One of the more notable features is the extent of lung lesions. Radiologists assess and record this extent when they inspect CT-scan images.
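To make the idea of per-hospital odds ratios concrete, here is a minimal sketch for a single binary variable. The article does not say how the two centers were combined, so the inverse-variance fixed-effect pooling below is an assumption (one common choice), and the patient counts in the usage lines are invented purely for illustration.

```python
import math


def log_odds_ratio(exposed_severe, exposed_mild, unexposed_severe, unexposed_mild):
    """Return (log odds ratio, standard error) for a 2x2 table (Woolf's method)."""
    a, b, c, d = exposed_severe, exposed_mild, unexposed_severe, unexposed_mild
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def pool_fixed_effect(estimates):
    """Combine (log OR, SE) pairs with inverse-variance weights (assumed method)."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * lo for w, (lo, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Convert a log odds ratio to (OR, lower, upper) with a 95% interval."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts for one binary variable at two hospitals.
hospital_a = log_odds_ratio(20, 10, 10, 20)
hospital_b = log_odds_ratio(30, 20, 15, 35)
pooled_or, lower, upper = odds_ratio_with_ci(*pool_fixed_effect([hospital_a, hospital_b]))
print(f"pooled OR = {pooled_or:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A variable would be called significantly associated with severity when the pooled interval excludes 1. Continuous variables such as age would typically be handled with a regression model rather than a 2x2 table.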
Step 2: We built CT image-based deep learning models that predicted severity better than the radiologist-labeled extent of lesions
To evaluate the prognostic value of CT-scans we, along with INRIA, developed deep learning models based on the raw CT-scan images. We trained these models using data from the Kremlin-Bicêtre APHP hospital to predict future severity of hospitalized patients.
The patient populations at Institut Gustave Roussy and Kremlin-Bicêtre APHP differed markedly: 85% of the IGR population were cancer patients vs. 7% at KB. Despite this difference, the models’ performance was consistent across the two hospitals. This indicates robustness and generalizability, which we define as the ability to transfer performance learned on one dataset to a new dataset.
The CT-scan models were able to predict severity better than the radiologist quantification of disease extent. A deeper analysis of the neural network shows that it captures clinical information from the lung CTs, such as sex or age, in addition to the known COVID-19 radiology features. In other words, the model can reveal features that radiologists have not previously noted within the images. These findings reinforce our hypothesis of the importance of CT scan data in predicting COVID-19 severity.
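One way to check this kind of generalizability is to compute the same discrimination metric on the training hospital and on the external hospital, then compare the two. The sketch below assumes the area under the ROC curve (AUC) as the metric, which this article does not name, and uses a plain-Python pairwise definition. The labels and scores are toy values, not study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a severe case outscores a non-severe one.

    labels: 1 for severe, 0 for non-severe. Ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one severe and one non-severe case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: model scores for four patients at a hypothetical external hospital.
external_labels = [0, 0, 1, 1]
external_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(external_labels, external_scores))  # 0.75
```

If the AUC on the external cohort stays close to the AUC on held-out patients from the training hospital, the model's performance has transferred to the new population.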
Step 3: We built and evaluated the multimodal AI-Severity score to predict severity based on the CT-scan deep learning model plus 5 clinical and biological variables
We selected the minimal number of variables that convey additional severity information not contained in the deep learning model. These variables are age, sex, urea, platelet count and oxygen saturation. Then, we combined these 5 variables with the deep learning model to obtain the AI-Severity score.
Additionally, we trained alternative scores based on clinical and biological variables only. We found that AI-Severity performs better than these alternative scores, although the gain in performance was modest. Overall, CT-scan information increases the predictive ability of severity scores, but only to a limited extent. This is because the neural network analysis of CT-scans provides a prognosis score that is correlated with other well-known markers of severity: oxygenation, tissue damage as measured by lactate dehydrogenase (LDH), and C-reactive protein (CRP) levels. These results show that routine CT-scans performed at admission capture much of the information carried by these severity biomarkers. Furthermore, they show these scans provide useful and interpretable elements for prognosis.
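The sketch below shows one plausible way to combine a CT model output with the five variables into a single score. The article does not state how the combination was done, so the logistic form is an assumption, and every weight is a placeholder chosen only so the example runs. These are not the published AI-Severity coefficients, and the field names and units are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Admission:
    """Hypothetical admission record holding the six inputs named in the text."""
    age_years: float
    is_male: bool
    urea_mmol_l: float
    platelets_g_l: float
    oxygen_saturation_pct: float
    ct_model_score: float  # assumed: scalar output of the CT deep learning model


# Placeholder values for illustration only, NOT the published model.
PLACEHOLDER_INTERCEPT = 2.0
PLACEHOLDER_WEIGHTS = {
    "age_years": 0.02,
    "is_male": 0.3,
    "urea_mmol_l": 0.05,
    "platelets_g_l": -0.003,
    "oxygen_saturation_pct": -0.05,
    "ct_model_score": 1.5,
}


def combined_severity_score(admission: Admission) -> float:
    """Logistic combination of the six inputs, returning a value in (0, 1)."""
    z = PLACEHOLDER_INTERCEPT
    for name, weight in PLACEHOLDER_WEIGHTS.items():
        z += weight * float(getattr(admission, name))
    return 1.0 / (1.0 + math.exp(-z))


patient = Admission(60, True, 6.0, 200.0, 95.0, 0.2)
print(round(combined_severity_score(patient), 3))
```

In practice the weights would be fitted on the training cohort, and the resulting score would then be evaluated on the second hospital.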
AI-Severity Score Impacts
AI-Severity outperformed 11 previously published severity or mortality scores. These other scores were developed using 200 to 50,000 patients in the development and validation cohort. For example, AI-Severity provides significantly better performance than the 4C mortality score of the UK-wide ISARIC4C consortium that was developed with more than 30,000 patients but that did not include CT-scan information.
In this paper, we showed that adding radiology data to clinical and biological data improves prediction of severity. This improvement occurs both when a radiologist records the lesions present and when we deploy a deep learning model. However, there are several advantages to capturing CT-scan information through deep learning models:
Good reproducibility: This is a key element for imaging biomarkers such as disease extent. The visual inspection of images can introduce variability that can hinder its clinical application.
Quicker diagnosis: Radiologists must read and prioritize large numbers of cases in a pandemic. AI analysis of radiological images has the potential to reduce this burden and speed up their reading time.
More accurate severity scores: The unimodal prognosis scores obtained with deep learning models trained on CT-scans are more predictive of severity than manually extracted radiological features.
The successful publication in Nature Communications of this clinically relevant risk score for COVID-19 patients – developed in just 2 months – is a validation of Owkin’s approach to collaborative research and a great example of its Platform in practice. Our AI and radiology expertise, along with our software tools, our network of leading academic institutions, and our ability to quickly adapt resources in a time of crisis helped make this collaboration possible.
All consortium members believed this research should remain publicly available to get one step closer to beating this pandemic. As such, we have open-sourced this research and the corresponding models; you can access them via this repository.
Additionally, researchers can use AI-Severity in Owkin Studio and apply it to their own imaging datasets of patients with lung disease.
Nathalie Lassau, Samy Ammary, Marie France Bellin, Olivier Meyrignac, Emilie Chouzenoux, Michaël Blum, Paul Herent, Simon Jégou, Etienne Bendjebbar
1. Myers, L. C., Parodi, S. M., Escobar, G. J. & Liu, V. X. Characteristics of Hospitalized Adults With COVID-19 in an Integrated Health Care System in California. JAMA (2020) doi:10.1001/jama.2020.7202.
2. Docherty, A. B. et al. Features of 20 133 UK patients in hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: prospective observational cohort study. BMJ 369, m1985 (2020).