Date of Award

Spring 5-28-2019

Document Type


Degree Name

Master of Science (MS)


Computational and Data Sciences

First Advisor

Erik Linstead, Ph.D.

Second Advisor

Elizabeth Stevens, Ph.D.

Third Advisor

Dennis Dixon, Ph.D.


Understanding and treating challenging behaviors in individuals with Autism Spectrum Disorder (ASD) is paramount to the success of behavioral therapy, and an essential step in this process is labeling the challenging behaviors demonstrated in therapy sessions. These behaviors manifest differently across individuals and within individuals over time; thus, the appropriate classification of a challenging behavior can be unclear when only qualitative factors are considered. In this thesis we seek to add quantitative depth to this otherwise qualitative task of challenging behavior classification. We do so by applying natural language processing techniques to behavioral descriptions extracted from the CARD Skills dataset. Specifically, we construct three sets of 50-dimensional document embeddings to represent the 1,917 recorded instances of challenging behaviors demonstrated in Applied Behavior Analysis therapy. These embeddings are learned through three processes: a TF-IDF weighted sum of Word2Vec embeddings, Doc2Vec embeddings that use hierarchical softmax as an output layer, and Doc2Vec embeddings that optimize the original Doc2Vec architecture through negative sampling. Once created, these embeddings are first used as input to a Support Vector Machine classifier to demonstrate the success of binary classification within this problem set. This preliminary exploration achieves promising classification accuracies ranging from 78.2% to 100% and establishes the separability of challenging behaviors given their neural embeddings. We next construct a multi-class classification model via a Gaussian Process Classifier fitted with the Laplace approximation. This classification model, trained on an 80/20 stratified split of the seven most frequently occurring behaviors in the dataset, produces an accuracy of 82.7%.
Through this exploration we demonstrate that the semantic cues derived from the language of challenging behavior descriptions, modeled using natural language processing techniques, can be successfully leveraged in classification architectures. This study is the first of its kind, providing a proof of concept for the application of machine learning to the observations of challenging behaviors demonstrated in ASD, with the ultimate goal of improving the efficacy of the behavioral treatments that intrinsically rely on the accurate identification of these behaviors.
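
As an illustration of the first embedding process named in the abstract, the following is a minimal sketch of a TF-IDF weighted sum of word vectors, written in Python. It is not the thesis implementation: the function names `tfidf_weights` and `embed_documents`, the smoothed IDF formula, the toy 2-dimensional vectors (standing in for trained 50-dimensional Word2Vec vectors), and the example descriptions are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Relative term frequency times a smoothed inverse document frequency
        # (the smoothing is an assumption, not taken from the thesis).
        weights.append({
            tok: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return weights


def embed_documents(docs, word_vectors, dim):
    """Build one document vector per document as a TF-IDF weighted sum of word vectors."""
    embeddings = []
    for doc_weights in tfidf_weights(docs):
        vec = [0.0] * dim
        for tok, weight in doc_weights.items():
            if tok not in word_vectors:
                continue  # skip out-of-vocabulary tokens
            for i, x in enumerate(word_vectors[tok]):
                vec[i] += weight * x
        embeddings.append(vec)
    return embeddings


# Hypothetical toy vectors standing in for trained Word2Vec vectors.
toy_vectors = {"hits": [1.0, 0.0], "peer": [0.0, 1.0], "screams": [-1.0, 0.0]}
# Hypothetical tokenized behavior descriptions.
toy_docs = [["hits", "peer"], ["screams", "loudly"]]
doc_embeddings = embed_documents(toy_docs, toy_vectors, dim=2)
```

In a pipeline like the one the abstract describes, document vectors of this kind would then be passed to a classifier; the sketch stops at the embedding step.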

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.


