Date of Award

Spring 5-2024

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computational and Data Sciences

First Advisor

Dr. Cyril Rakovski

Second Advisor

Dr. Hanna Lu

Third Advisor

Dr. Adrian Vajiac

Abstract

We design and implement a multi-stage modeling approach for predicting unplanned 30-day all-cause intensive care unit (ICU) hospital readmissions using the Medical Information Mart for Intensive Care (MIMIC-IV) dataset. Structured data (demographic information, comorbidities, lab results, and vital signs) are combined with features extracted from medical text (patients’ diagnoses, procedures, and discharge notes). The text features are engineered using several methods, including Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), and word embeddings.

We sequentially implement three distinct Dense Neural Networks (DNNs) combined with the LightGBM gradient-boosting framework. Our model attained a 5-fold cross-validated area under the receiver operating characteristic curve (AUROC) of 0.81. Our results demonstrate the effectiveness of the proposed modeling approach in identifying patients at high risk of unplanned ICU readmission within 30 days of initial discharge.
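As an illustration of the evaluation metric reported above, the following minimal sketch computes AUROC from ranked scores and averages it over folds. It is not the thesis code: the function names `auroc` and `cross_validated_auroc` are hypothetical, the toy labels and scores are invented for illustration, and averaging per-fold values (rather than pooling out-of-fold predictions) is an assumption about how the cross-validated figure might be obtained.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    labels are 0/1 (1 = readmitted); tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC is undefined without both classes present")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def cross_validated_auroc(
    fold_labels: List[Sequence[int]], fold_scores: List[Sequence[float]]
) -> float:
    """Mean of per-fold AUROCs (assumed aggregation; pooling is an alternative)."""
    per_fold = [auroc(y, s) for y, s in zip(fold_labels, fold_scores)]
    return sum(per_fold) / len(per_fold)


# Toy held-out predictions for two folds (invented values, not MIMIC-IV data).
toy_labels = [[0, 0, 1, 1], [0, 1]]
toy_scores = [[0.1, 0.4, 0.35, 0.8], [0.2, 0.9]]
cv_auroc = cross_validated_auroc(toy_labels, toy_scores)  # folds: 0.75 and 1.0
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of readmitted from non-readmitted patients, which is the scale on which the reported 0.81 should be read.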

This research adds to the expanding field of healthcare informatics by highlighting the effectiveness of incorporating multiple Natural Language Processing (NLP) features and other engineered patient features in a single model. These findings offer valuable insights for healthcare professionals and decision-makers aiming to decrease ICU readmissions.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Available for download on Wednesday, May 06, 2026

Included in

Data Science Commons
