Early Prognosis of COVID-19 Infections Via Machine Learning

Santiago Mazuelas

Nationality Spanish

Year of selection 2020

Institution Basque Center for Applied Mathematics (BCAM)

Country Spain

Risk Health

AXA Awards

3 years

The 2020 COVID-19 outbreak has shown that infections can lead to markedly different outcomes: some patients remain asymptomatic throughout the infection, others experience moderate symptoms for a few weeks, and still others suffer acute or even critical complications. This range of outcomes poses a key challenge for COVID-19 containment, since the most effective countermeasures once an infection is detected differ markedly for each type of patient.

To address this challenge, Dr. Santiago Mazuelas, an AXA Research Fund grantee at the Basque Center for Applied Mathematics (BCAM) in Spain, will develop machine learning techniques for the early prognosis of COVID-19 infections. These techniques predict the future severity of an infection from health data obtained at the time the infection is detected. Such predictions could help medical staff and public health stakeholders make timely decisions that lead to favorable outcomes. For instance, an infected patient with a negative early prognosis could be transferred directly to semi-intensive care, and one with a positive early prognosis to a regular hospital ward, before they experience notable symptoms. In addition, the prediction algorithms developed in the project could be used to closely monitor individuals who are not infected but who have a high probability of being asymptomatic or suffering complications if they do contract COVID-19.

Research has already shown that certain health data, such as age and past medical history (PMH), correlate strongly with the severity of COVID-19 infections, and some recent studies also suggest a correlation with certain blood test features. The techniques developed by Dr. Mazuelas and his team will use multimodal, information-rich health data to predict the future severity of COVID-19 infections. This health data comprises simple clinical data, such as age, sex, weight, blood pressure, body temperature, heart rate, respiratory rate, and PMH, together with more detailed measurements, such as those obtained from biochemical tests and electrocardiograms. The learning techniques developed in the project will also use a large number of electronic health records to elucidate the complex relationship between health data and the future severity of COVID-19 infections. In particular, numerous electronic health records of patients infected with COVID-19 will be used to obtain training data. Each training sample will consist of the patient’s health data, obtained at an early stage of the infection, together with a categorical value that identifies the severity the patient experienced over the course of the infection.
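The structure of a training sample described above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions, not the project's actual data format: the field names, the three-level `Severity` label (loosely following the asymptomatic, moderate, and acute or critical outcomes mentioned earlier), the numeric encoding, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Severity(Enum):
    """Hypothetical categorical label for how severe the infection became."""
    ASYMPTOMATIC = 0
    MODERATE = 1
    SEVERE = 2  # acute or critical complications


@dataclass
class TrainingSample:
    """Health data recorded early in the infection, plus the eventual severity.

    All field names are assumptions chosen for illustration.
    """
    age: int
    sex: str                    # "F" or "M"
    weight_kg: float
    body_temp_c: float
    heart_rate_bpm: int
    respiratory_rate_bpm: int
    severity: Severity          # the label, known only after the infection ran its course
    past_medical_history: List[str] = field(default_factory=list)


def to_features_and_label(sample: TrainingSample) -> Tuple[List[float], int]:
    """Turn one sample into a numeric feature vector and an integer class label."""
    features = [
        float(sample.age),
        1.0 if sample.sex == "F" else 0.0,
        sample.weight_kg,
        sample.body_temp_c,
        float(sample.heart_rate_bpm),
        float(sample.respiratory_rate_bpm),
        float(len(sample.past_medical_history)),  # crude summary of PMH
    ]
    return features, sample.severity.value


# Made-up values for illustration only; not real patient data.
example = TrainingSample(
    age=67, sex="M", weight_kg=82.0, body_temp_c=38.4,
    heart_rate_bpm=96, respiratory_rate_bpm=22,
    severity=Severity.SEVERE,
    past_medical_history=["hypertension"],
)
features, label = to_features_and_label(example)
```

A supervised classifier would then be trained on many such feature and label pairs. A real pipeline would also need richer encodings for the biochemical and electrocardiogram data, which this sketch leaves out.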

Dr. Mazuelas’ research addresses several scientific and technical challenges, both in data processing and in learning algorithm design, including the use of unbalanced training samples affected by selection bias* and the development of privacy-preserving and cost-sensitive techniques.
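To illustrate what handling unbalanced training samples can involve, the sketch below computes inverse-frequency class weights, one common way to make a learning algorithm pay more attention to rare classes. It is an assumption chosen for illustration, not necessarily the approach taken in the project; the function name and the label counts are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def inverse_frequency_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more
    when the weights are plugged into a weighted training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up label distribution: severe cases are the minority.
labels = ["asymptomatic"] * 6 + ["moderate"] * 3 + ["severe"] * 1
weights = inverse_frequency_weights(labels)

for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

With these illustrative counts, the single severe case receives the largest weight and the six asymptomatic cases the smallest. Cost-sensitive techniques generalize this idea by assigning different penalties to different kinds of mistakes, for example penalizing a missed severe case more heavily than a false alarm.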

“I was looking for an application of machine learning that would be useful in this pandemic, both for society in general and for the health system,” Dr. Mazuelas says. “Prognosis via machine learning is already being investigated for other diseases, such as cancer and viral infections, but this avenue of research has not been explored yet for COVID-19, even though research on COVID-19 is currently evolving rapidly.” The results of his work could lead to significant improvements in how medical and public health decisions are made in the treatment and management of COVID-19 infections.

*In data science, selection bias is the bias introduced when individuals, groups, or data are selected for analysis in a way that does not achieve proper randomization, so that the resulting sample is not representative of the population intended to be analyzed.