HYBRID EVENT: You can participate in person at Rome, Italy or Virtually from your home or work.

3rd Edition of International Precision Medicine Conference

March 17-19, 2025

2022 Speakers

Ilaria Gandin

University of Trieste, Italy
Title: A machine learning approach based on electronic health records to develop a risk model for new onset of atrial fibrillation

Abstract:

Atrial fibrillation (AF) is a type of cardiac arrhythmia associated with major adverse events, such as
hospitalizations and death. Tools and techniques for the prediction of AF are required to identify high-risk
individuals. Currently, the most widely used tool for this purpose is the CHARGE-AF risk model [Alonso 2013], a
standard survival model based on a small number of variables readily available in primary care settings.
The present work has two main objectives: 1) the development of a predictive risk score based on machine
learning (ML) algorithms applicable in the clinical context; 2) an extensive comparison between standard
and ML methods, focusing on the benefits of the non-linear contribution of a large number of input
variables and on possible drawbacks.
In this study we analysed data from 16,887 patients examined at the Cardiovascular Observatory of Trieste
(Italy) between 2009 and 2014 with no previous history of AF. The data consist of Electronic Health
Records (EHRs) that include a wide variety of variables: demographics, clinical parameters measured
during the cardiological visit, ECG parameters, prescriptions, previous diagnoses, and comorbidities. We
investigated two approaches for the prediction of AF onset within 5 years of the visit date: the
application of CHARGE-AF and the development of prediction models based on 96 features extracted from
EHRs. For the latter we implemented two algorithms: penalised logistic regression (LR) and gradient
boosting (XGB), an ensemble method based on weak tree learners.
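To make the idea of penalised logistic regression concrete, here is a minimal sketch on synthetic data. The abstract does not state the penalty type, the hyperparameters, or the software used, so the L2 (ridge) penalty, the values of `l2`, `lr` and `n_iter`, and all function names below are assumptions chosen for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_penalised_lr(X, y, l2=0.1, lr=0.5, n_iter=500):
    """Gradient descent on the mean log-loss plus an L2 penalty on the weights.

    The intercept b is left unpenalised. All hyperparameters are illustrative.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        # The penalty contributes l2 * w_j to the gradient of each weight.
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

# Synthetic data (not the study data): feature 0 is informative, feature 1 is noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([x0, x1])
    y.append(1 if rng.random() < sigmoid(2.0 * x0 - 0.5) else 0)

w_weak, b_weak = fit_penalised_lr(X, y, l2=0.01)   # weak penalty
w_strong, b_strong = fit_penalised_lr(X, y, l2=1.0)  # strong penalty shrinks weights
probs = predict_proba(X, w_weak, b_weak)
print("weak penalty weights:  ", w_weak)
print("strong penalty weights:", w_strong)
```

Comparing the two fits shows the role of the penalty: a larger `l2` pulls the coefficients towards zero, which is what limits overfitting when many features are used.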
Using CHARGE-AF we obtain an area under the ROC curve (AUC) of 0.70 (95% CI [0.70, 0.71]), whereas for
the LR and XGB models we obtain AUC=0.80 (95% CI [0.784, 0.823]) and AUC=0.84 (95% CI [0.82, 0.86]),
respectively. However, we observe that the XGB model is poorly calibrated (calibration slope=0.54,
intercept=-0.38). As for variable importance, in both LR and XGB we identify good overlap with the factors
involved in the CHARGE-AF score, but hemoglobin emerges as a relevant variable only in the XGB model.
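The metrics reported above can be illustrated with a short sketch on synthetic data. The abstract does not say how the confidence intervals or the calibration parameters were computed, so two assumptions are made here: the AUC interval uses a percentile bootstrap, and the calibration slope and intercept come from a joint logistic fit of the outcome on the logit of the predicted probability (some authors instead estimate the intercept with the slope fixed at 1). All names and the simulated overconfident model are hypothetical.

```python
import math
import random

def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def logit(p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def calibration_slope_intercept(y_true, probs, n_iter=25):
    """Fit P(y=1) = logistic(a + b * logit(p)) by Newton's method; return (b, a)."""
    z = [logit(p) for p in probs]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for zi, yi in zip(z, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            w = mu * (1.0 - mu)
            g_a += yi - mu
            g_b += (yi - mu) * zi
            h_aa += w
            h_ab += w * zi
            h_bb += w * zi * zi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return b, a

# Synthetic example (not the study data): the true log-odds are t, but the
# hypothetical model predicts with log-odds 2*t, i.e. it is overconfident.
rng = random.Random(1)
y_true, probs = [], []
for _ in range(300):
    t = rng.gauss(0.0, 1.0)
    y_true.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-t)) else 0)
    probs.append(1.0 / (1.0 + math.exp(-2.0 * t)))

auc_point = auc(y_true, probs)
ci_low, ci_high = bootstrap_auc_ci(y_true, probs)
slope, intercept = calibration_slope_intercept(y_true, probs)
print(f"AUC = {auc_point:.3f} (95% CI [{ci_low:.3f}, {ci_high:.3f}])")
print(f"calibration slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A slope well below 1, as in this simulation, indicates predictions that are too extreme. Discrimination is unaffected because the AUC depends only on the ranking of the scores, which is why a model can have a high AUC and still be poorly calibrated.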
Further analyses are required to understand the benefits of using ML algorithms in this context. The next
developments of the work are: implementation of more advanced tuning techniques for model hyperparameters
(instead of grid search); recalibration techniques, in particular isotonic regression fitting and
assessment of recalibrated predictions using the Brier score and reliability plots; investigation of the
role of newly emerging high-impact variables.
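The recalibration step mentioned above can be sketched as follows, assuming a held-out calibration set, a pool-adjacent-violators fit, and a step-function mapping for new predictions. This is an illustration on synthetic data, not the authors' pipeline; the function names, the sample sizes and the simulated overconfidence factor are all hypothetical.

```python
import bisect
import math
import random

def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, y_true)) / len(y_true)

def pava(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block holds [mean, number of pooled points]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def fit_isotonic(probs, y_true):
    """Sort by predicted probability and fit a monotone map to the outcomes."""
    pairs = sorted(zip(probs, y_true), key=lambda t: t[0])
    xs = [p for p, _ in pairs]
    ys = pava([y for _, y in pairs])
    return xs, ys

def apply_isotonic(model, probs):
    """Map new predictions through the fitted step function (no interpolation)."""
    xs, ys = model
    out = []
    for p in probs:
        i = bisect.bisect_right(xs, p) - 1
        out.append(ys[min(max(i, 0), len(ys) - 1)])
    return out

def simulate(n, rng, overconfidence=3.0):
    # True log-odds are t; the hypothetical model predicts with log-odds 3*t.
    y, p = [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-t)) else 0)
        p.append(1.0 / (1.0 + math.exp(-overconfidence * t)))
    return y, p

rng = random.Random(0)
y_cal, p_cal = simulate(2000, rng)    # used only to fit the recalibration map
y_test, p_test = simulate(2000, rng)  # used only to assess it

model = fit_isotonic(p_cal, y_cal)
p_recal = apply_isotonic(model, p_test)
brier_before = brier_score(y_test, p_test)
brier_after = brier_score(y_test, p_recal)
print(f"Brier score before: {brier_before:.4f}, after: {brier_after:.4f}")
```

Fitting the map on one set and scoring it on another matters because isotonic regression is flexible enough to overfit. A reliability plot would show the same effect graphically by binning `p_recal` and comparing each bin's mean prediction with its observed event rate.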

Biography:

Dr. Ilaria Gandin is a mathematician. In 2016 she received a PhD in Molecular Genetics focused on the
implementation of generalized mixed models for massive genetic datasets. In 2017 she became a researcher
at Area Science Park (Italy), a technology-transfer research institute, and expanded her research interests
towards machine learning techniques. Since 2020, she has been a researcher in Medical Statistics at the
University of Trieste (Italy), working on machine learning models for longitudinal health data, in particular
patient trajectories extracted from Electronic Health Records and electrocardiogram signals.
