Using a Small Dataset to Classify Strength-Interactions with an Elastic Display: A Case Study for the Screening of Autism Spectrum Disorder
Collecting health data from children with autism spectrum disorder (ASD) is challenging, time-consuming, and expensive; thus, working with small datasets is inevitable in this area. The low diagnosis rate in ASD leads to several challenges, including imbalanced classes, potential overfitting, and sampling bias, which make it difficult to demonstrate the potential of data-driven approaches in real-life situations. This paper presents a data analytics pilot case study that uses a small dataset and leverages domain-specific knowledge to uncover differences between the gestural patterns of children with ASD and those of neurotypical children. During a sensing campaign, we collected data from 59 children using an elastic display we developed, and from 9 children who used the elastic display as part of a therapeutic program. We extracted strength-related features and selected the most relevant ones based on how the motor atypicality of children with ASD influences their interactions: children with ASD make smaller and narrower gestures and experience variations in the use of strength. The proposed machine learning models can correctly classify children with ASD with 97.3% precision and recall even when the classes are unbalanced. Increasing the size of the dataset via synthetic data improved the model precision to 99%. We conclude by discussing the importance of leveraging domain-specific knowledge in the learning process to cope with some of the challenges of working with small datasets in a concrete, real-life scenario.
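To illustrate why precision and recall, rather than plain accuracy, are the informative metrics when classes are unbalanced, the following minimal Python sketch computes both on a toy example. The function name precision_recall and every label value below are assumptions made for illustration only; they are not the study's data, and this is not the authors' implementation.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the chosen positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels (NOT the study's data): 1 = ASD, 0 = neurotypical.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial baseline that always predicts the majority class.
majority = [0] * len(y_true)

# A hypothetical classifier that misses one positive and raises one false alarm.
model = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Accuracy of the trivial baseline looks acceptable despite finding no positives.
accuracy_majority = sum(t == p for t, p in zip(y_true, majority)) / len(y_true)

if __name__ == "__main__":
    print("majority accuracy:", accuracy_majority)
    print("majority precision/recall:", precision_recall(y_true, majority))
    print("model precision/recall:", precision_recall(y_true, model))
```

In this toy setting the majority-class baseline reaches 75% accuracy while identifying none of the positive cases, which is why per-class precision and recall are the more meaningful figures to report for an unbalanced screening task.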
Monarca, I., Cibrian, F.L., Chavez, E. et al. Using a small dataset to classify strength-interactions with an elastic display: a case study for the screening of autism spectrum disorder. Int. J. Mach. Learn. & Cyber. (2022). https://doi.org/10.1007/s13042-022-01554-2