Journal Club by SWISS/KNIFE

Original Paper

"Machine learning predictions of unplanned readmissions using electronic medical records: Predictor importance across medical and surgical patient populations"

Michael M Havranek, Aljoscha B Hwang, Ilona Funka, Dominique Kuhlen, Daniel Liedtke, Stefan Boes. PLoS One 2025 Sep 4;20(9):e0331263. doi: 10.1371/journal.pone.0331263. eCollection 2025. 

This study investigates the use of machine learning to predict unplanned hospital readmissions using electronic medical records (EMRs) from eight Swiss hospitals. The authors aim to improve upon previous models, which often relied on administrative data and demonstrated only modest predictive accuracy. By leveraging EMR data available before discharge and applying robust feature engineering, the study seeks to identify key predictors and assess their importance across medical and surgical patient populations.

In the study, 200,799 inpatient stays between 2018 and 2024 were analyzed, using the Centers for Medicare and Medicaid Services (CMS) definition of unplanned readmissions within 30 days. The dataset was divided into three cohorts: hospital-wide, medical, and surgical. Random forest models were developed and evaluated using area under the curve (AUC) metrics, achieving scores of 0.78 for the hospital-wide and surgical cohorts, and 0.72 for the medical cohort. These results outperform many existing models and demonstrate the feasibility of using EMR data for predictive modelling.
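To make the AUC figures easier to interpret: an AUC of 0.78 means that, for a randomly chosen pair of one readmitted and one non-readmitted stay, the model assigns the higher risk score to the readmitted stay about 78% of the time. The sketch below computes AUC in exactly this pairwise way. The function name `roc_auc` and the toy labels and scores are invented for illustration and are not taken from the study.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the fraction of (readmitted, not readmitted) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = unplanned readmission within 30 days
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of stays, so it is meant only to illustrate the definition; on a dataset of this size a rank-based implementation would be used instead.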

Diagnoses emerged as the most important predictor group, contributing 18.4% to the model’s decisions. Nursing assessments followed closely at 11.2%, and procedural information accounted for 10.8%. Interestingly, traditional predictors such as demographics, admission details, and diagnosis-related groups (DRGs) had minimal impact on model performance. The study also found notable differences between medical and surgical patients. For medical patients, medications and prior healthcare use were more predictive, while procedural details and physician caseload were more influential for surgical patients.

Feature engineering played a critical role in the study. Structured and unstructured EMR data were transformed into usable formats, with categorical variables converted into binary dummies and continuous variables aggregated using various statistical methods. Missing data were imputed using median values. Unstructured nursing notes were processed using a large language model, enhancing the richness of the dataset.
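The three structured-data steps described above (binary dummies, statistical aggregation, median imputation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the field names `admission_type` and `heart_rate`, the choice of minimum, maximum, and mean as aggregates, and the toy records are all hypothetical.

```python
from statistics import mean, median

# Hypothetical inpatient stays; the third has no heart-rate measurements
stays = [
    {"admission_type": "emergency", "heart_rate": [88, 95, 102]},
    {"admission_type": "elective", "heart_rate": [70, 72]},
    {"admission_type": "emergency", "heart_rate": []},
]


def dummy_encode(rows, column):
    """Turn one categorical column into 0/1 indicator features."""
    categories = sorted({r[column] for r in rows})
    return [{f"{column}_{c}": int(r[column] == c) for c in categories} for r in rows]


def aggregate(values):
    """Summarise repeated measurements of one stay; None marks missing data."""
    if not values:
        return {"min": None, "max": None, "mean": None}
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def impute_median(rows, key):
    """Replace missing values of one feature with the median of observed values."""
    fill = median(r[key] for r in rows if r[key] is not None)
    for r in rows:
        if r[key] is None:
            r[key] = fill


features = []
for stay, dummies in zip(stays, dummy_encode(stays, "admission_type")):
    row = dict(dummies)
    for name, value in aggregate(stay["heart_rate"]).items():
        row[f"heart_rate_{name}"] = value
    features.append(row)

for key in ("heart_rate_min", "heart_rate_max", "heart_rate_mean"):
    impute_median(features, key)

for row in features:
    print(row)
```

In a real pipeline the imputation medians would be computed on the training data only and then reused for validation and test data, so that no information leaks across the split.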

The study compared random forest models with XGBoost and found similar performance across both algorithms. Feature importance was assessed using Gini impurity reduction, revealing that the presence of specific ICD-10, CHOP, and ATC codes was more predictive than the actual values of lab tests or vital signs. This suggests that clinical decision-making, such as ordering a test, may serve as a proxy for patient acuity.
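Gini importance is built from the impurity reduction achieved each time a tree splits on a feature. The sketch below computes that reduction for a single split on a binary "code present" flag, which is the kind of indicator the study found informative. The variable names and the toy data are hypothetical. A random forest would sum such reductions, weighted by the share of samples reaching each node, over all splits and trees.

```python
def gini(labels):
    """Gini impurity of a set of binary labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def impurity_reduction(flags, labels):
    """Drop in Gini impurity when splitting on a 0/1 feature."""
    left = [y for f, y in zip(flags, labels) if f == 0]
    right = [y for f, y in zip(flags, labels) if f == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Toy data (hypothetical): was a given lab test ordered, and was the stay
# followed by an unplanned readmission?
test_ordered = [1, 1, 1, 1, 0, 0, 0, 0]
readmitted = [1, 1, 1, 0, 0, 0, 0, 0]

reduction = impurity_reduction(test_ordered, readmitted)
print(f"Gini impurity before split: {gini(readmitted):.5f}")
print(f"Impurity reduction from the flag: {reduction:.5f}")
```

In this toy example, the mere fact that the test was ordered separates the outcomes well, without any reference to the test result.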

From a clinical perspective, the study highlights the potential of EMR-based models to support decision-making and identify high-risk patients before discharge. Nursing documentation, often overlooked, proved to be a valuable source of predictive information. The findings suggest that hospitals can tailor interventions based on patient type—for example, focusing on medication reconciliation for medical patients and procedural follow-up for surgical cases.

However, the study has limitations. It only includes internal readmissions within the hospital group, potentially underestimating true readmission rates. The generalizability of the findings may be limited due to the Swiss healthcare context and the private hospital setting. Additionally, the use of Gini importance may overestimate the relevance of certain variables, and missing data could introduce bias despite imputation efforts.

In conclusion, the study demonstrates that machine learning models using EMR data can predict unplanned readmissions with meaningful accuracy. The authors' work underscores the importance of nuanced clinical data and thoughtful feature engineering, offering a promising path forward for personalized discharge planning and quality improvement in healthcare.

Interview with Dr. med. Michael Havranek (Lucerne)

 

What inspired you to conduct this study?

Hospital readmissions prolong patient suffering and contribute significantly to healthcare costs. Since Switzerland incorporated unplanned 30-day readmission rates into its national quality monitoring in 2022, hospitals have been seeking tools to identify high-risk patients. So, we were inspired to investigate whether Swiss electronic medical record (EMR) data available before discharge—combined with systematic feature engineering based on our earlier meta-analytical work—could lead to accurate predictions of unplanned readmissions to identify high-risk patients.

Were there any unexpected findings?

Yes, while we expected diagnoses and procedures to be strong predictors, we were surprised by the added value of nursing assessments. Variables from structured nursing documentation (e.g., mobility and self-care) proved highly relevant, complementing medical and procedural data. In addition, simple indicators, such as whether a laboratory test was performed regardless of the actual value, also carried predictive power, likely reflecting underlying clinical judgment.

What is the direct impact on the surgeon's work?

Our findings demonstrate that integrating such predictive models into clinical workflows could help surgeons identify at-risk patients before discharge, allowing for more targeted follow-up and potentially reducing readmissions.

What is your learning point from this project?

One key learning is that predictions of readmissions are most successful when leveraging the full breadth of Swiss EMR data: not only established risk factors such as particular diagnoses, but also nursing documentation, prior healthcare use, and even seemingly indirect indicators (like the laboratory tests mentioned above). In addition, we learned that robust results are achievable despite challenges with missing data, if careful feature engineering is applied.

Are there any subsequent projects planned?

Future work may focus on prospective validation of these models and exploring integration into real-time clinical decision support systems. At the moment, we are using other study designs to continue our work on readmissions, but it is likely that we will revisit the approaches used in the current study in the near to medium future.

 
