Date of Award

Spring 8-2023

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computational and Data Sciences

First Advisor

Dr. Cyril Rakovski

Second Advisor

Dr. Adrian Vajiac

Third Advisor

Dr. Sidy Danioko


This dissertation investigates Parkinson’s Disease (PD) using machine learning and causal inference methods. I present a descriptive analysis of PD in a vast, high-quality database, including the costs associated with PD medications. I also apply a causal inference method to assess the effect of Carbidopa-Levodopa on two-year survival, and a causal survival analysis comparing no drug use with Carbidopa-Levodopa over one to five years of survival in PD patients.

To classify Parkinson’s gait, patients were monitored with a smartphone and six additional Inertial Measurement Unit (IMU) sensors to collect clinical gait measures. I used classical machine learning algorithms on raw smartphone data to distinguish between ON and OFF times. With an average accuracy of 92.5%, this work demonstrates the feasibility of using smartphone data to distinguish ON from OFF walking and lays the groundwork for a real-world, corrective feedback system.

I also studied the causal effect of the most prevalent PD medication on survival. In particular, I used the doubly robust method to compare the probability of two-year survival between PD patients taking Carbidopa-Levodopa and PD patients with no drug use. The estimated difference in causal effects was 0.013 in favor of Carbidopa-Levodopa, indicating that this medication had a significant positive effect on the two-year survival of PD patients.
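As a rough illustration of what a doubly robust comparison involves, the sketch below computes an augmented inverse probability weighted (AIPW) estimate of a difference in survival probabilities. It is not the dissertation's implementation: the data are synthetic, the single binary covariate `x`, the function name `aipw_effect`, and all probabilities are assumptions chosen only for the example, and the propensity and outcome models are simple stratum means.

```python
import numpy as np

def aipw_effect(x, a, y):
    """AIPW (doubly robust) estimate of E[Y(1)] - E[Y(0)].

    x: binary confounder, a: binary treatment, y: binary outcome
    (for example, survival at two years). All names are hypothetical.
    """
    x, a, y = map(np.asarray, (x, a, y))
    e = np.empty(len(x), dtype=float)   # propensity score P(A=1 | X)
    m1 = np.empty(len(x), dtype=float)  # outcome model E[Y | A=1, X]
    m0 = np.empty(len(x), dtype=float)  # outcome model E[Y | A=0, X]
    for level in (0, 1):
        s = x == level
        e[s] = a[s].mean()
        m1[s] = y[s & (a == 1)].mean()
        m0[s] = y[s & (a == 0)].mean()
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    mu1 = np.mean(m1 + a * (y - m1) / e)
    mu0 = np.mean(m0 + (1 - a) * (y - m0) / (1 - e))
    return mu1 - mu0

# Synthetic cohort (assumed values): x lowers survival and raises the
# chance of treatment, so a naive comparison is confounded.
rng = np.random.default_rng(0)
n = 200_000
x = rng.binomial(1, 0.5, n)
a = rng.binomial(1, np.where(x == 1, 0.7, 0.3))
y = rng.binomial(1, 0.80 + 0.05 * a - 0.20 * x)  # simulated treatment effect: +0.05

effect = aipw_effect(x, a, y)
naive = y[a == 1].mean() - y[a == 0].mean()
print(f"AIPW estimate: {effect:.3f}, naive difference: {naive:.3f}")
```

In this simulated setting the naive difference in survival rates is biased by the confounder, while the AIPW estimate should land near the simulated effect. A real analysis would use many covariates and fitted regression models in place of stratum means.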

I then extended this study with a causal survival analysis of one-to-five-year survival under two treatments, no drug use and Carbidopa-Levodopa. The results showed that Carbidopa-Levodopa had a significant effect on survival when the drug was prescribed within three years of first diagnosis, and that no drug use had a significant effect at four and five years of survival.

To make better use of the available data, I also conducted a descriptive statistical analysis. I studied Cerner Real-World Data, a vast and high-quality database, and focused on people who were diagnosed with Parkinson’s Disease from 2016 to 2022. I examined the demographics, comorbidities, and medications of PD patients. After cleaning the database, my final cohort size was 110,037 subjects.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Available for download on Tuesday, May 20, 2025

Included in

Data Science Commons