Date of Award

Summer 8-2020

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computational and Data Sciences

First Advisor

Cyril Rakovski

Second Advisor

Daniel Alpay

Third Advisor

Louis Ehwerhemuepha

Fourth Advisor

Gary Doran


This thesis presents the results of three research projects that underline the breadth and depth of my interests.

Firstly, I studied the well-known Box-Pierce goodness-of-fit tests for time series models, which have been an important research topic over the last few decades. All previously proposed tests focus on changes to the test statistic. Instead, I adopted a different approach that takes the best-performing test and modifies its rejection region. Thus, I developed a semiparametric correction of the Adjusted Box-Pierce test that attains the best type I error rates for all sample sizes and lags and outperforms all previous global time series goodness-of-fit approaches.
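For readers unfamiliar with this family of tests, the following is a minimal sketch of the classical Box-Pierce statistic and its Ljung-Box variant, computed from the residuals of a fitted model. It is not the semiparametric correction developed in the thesis, whose details are not given in this abstract. The function names and the `residuals` and `max_lag` parameters are illustrative assumptions.

```python
def sample_autocorrelations(residuals, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [x - mean for x in residuals]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def box_pierce(residuals, max_lag):
    """Classical Box-Pierce statistic: Q = n * sum_k r_k^2."""
    n = len(residuals)
    r = sample_autocorrelations(residuals, max_lag)
    return n * sum(rk * rk for rk in r)


def ljung_box(residuals, max_lag):
    """Ljung-Box variant: Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    r = sample_autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(
        rk * rk / (n - k) for k, rk in enumerate(r, start=1)
    )


# Toy residual series, chosen only to make the arithmetic easy to follow.
toy_residuals = [1.0, -1.0, 1.0, -1.0]
q_bp = box_pierce(toy_residuals, max_lag=2)
q_lb = ljung_box(toy_residuals, max_lag=2)
```

In the classical procedure, the statistic for an ARMA(p, q) fit is compared with a chi-square quantile with `max_lag - p - q` degrees of freedom. The approach described above keeps the test statistic and instead adjusts that rejection threshold.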

Secondly, I aimed to identify novel risk factors significantly associated with 72-hour return visits to emergency departments (EDs). Using the Cerner® Health Facts Database, I queried data on 185,000 ED visits by patients younger than 18 years in the United States. I built a nested mixed-effects logistic regression model to provide statistical inference on associated risk factors and selected a representative set of machine learning algorithms for the predictive modeling task. Respiratory conditions, including acute bronchiolitis, pneumonia, and asthma, were identified as new risk factors for return visits to the ED.

Thirdly, I aimed to design and implement a comprehensive study to identify new clinical and demographic factors associated with prolonged length of stay (more than two weeks) among pediatric patients (aged 18 years and under) in a number of free-standing pediatric and mixed medical facilities. I implemented a mixed-effects model to assess the statistical significance and effect sizes of age, race/ethnicity, number of medications, medical family history, presence of infectious agents (fungi, bacteria, viruses), cancer diagnoses, and other conditions, as well as some clinical variables. A stochastic gradient boosting model was also implemented for prediction. From the mixed-effects model, 11 main-effect predictors were found to be statistically significantly associated with an increase in the odds of prolonged length of stay. The area under the receiver operating characteristic curve (AUROC) for the mixed-effects model was 0.887 (0.885, 0.889), and the extreme gradient boosting model attained an AUROC of 0.931 (0.930, 0.933).
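As an illustration of the AUROC metric reported above, the following minimal sketch computes it as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are assumptions made for this example; they are not data or code from the thesis.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = prolonged stay, in this context).
    scores: predicted scores or probabilities, in the same order as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based computation would be preferable for datasets of the size described here.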

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.


