Date of Award
Spring 5-2024
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computational and Data Sciences
First Advisor
Dr. Cyril Rakovski
Second Advisor
Dr. Hanna Lu
Third Advisor
Dr. Adrian Vajiac
Abstract
We design and implement a multi-stage modeling approach for predicting unplanned 30-day all-cause intensive care unit (ICU) readmissions using the Medical Information Mart for Intensive Care (MIMIC IV) dataset. Structured data (demographic information, comorbidities, lab results, and vital signs) are combined with features extracted from medical text (patients’ diagnoses, procedures, and discharge notes). The text features are engineered using several methods, including Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), and word embeddings.
We sequentially implement three distinct Dense Neural Networks (DNNs) combined with the LightGBM gradient-boosting framework. Our model attained a 5-fold cross-validated area under the ROC curve (AUROC) of 0.81. Our results demonstrate the effectiveness of the proposed modeling approach in identifying high-risk patients for unplanned ICU readmissions within 30 days of initial discharge.
This research adds to the expanding field of healthcare informatics by highlighting the effectiveness of combining a range of Natural Language Processing (NLP) features with other engineered patient features in a single model. These findings offer valuable insights for healthcare professionals and decision-makers aiming to decrease ICU readmissions.
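As an illustration of the evaluation metric reported in the abstract, the following is a minimal sketch of how a 5-fold cross-validated AUROC can be computed. It is not the thesis's code: the function names, the unstratified fold split, and the `fit` callable (a hypothetical stand-in for any model, such as the DNN and LightGBM stages) are assumptions made for this example only.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC is undefined unless both classes are present")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds (not stratified)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def cross_validated_auroc(labels, features, fit, k=5, seed=0):
    """Mean held-out AUROC over k folds.

    `fit(train_features, train_labels)` is a hypothetical callable that
    returns a scoring function mapping one feature row to a risk score.
    """
    fold_scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        preds = [model(features[i]) for i in held_out]
        fold_scores.append(auroc([labels[i] for i in held_out], preds))
    return sum(fold_scores) / k
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of readmitted from non-readmitted patients. With an outcome as imbalanced as readmission, a stratified split would normally be preferable so that every fold contains both classes; this sketch raises an error otherwise.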
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Recommended Citation
D. Licerio, "Predicting 30-day unplanned ICU readmissions using deep learning and natural language processing techniques: A MIMIC IV data analysis," M.S. thesis, Chapman University, Orange, CA, 2024. https://doi.org/10.36837/chapman.000557