Franceli L. Cibrian
Introduction: Speech-to-text technology has become key to supporting technologies such as voice assistants (e.g., Alexa, Siri). Unfortunately, speech from some groups, including people with accents, women, children, and individuals with disabilities such as Down Syndrome, is not well recognized, which limits inclusivity. The first step toward making this technology more inclusive is to identify where speech-to-text algorithms (YouTube, IBM, Zoom, and Azure) make errors when recognizing dialog from diverse populations.
Methods: We analyzed 10 videos from the ‘Special Books by Special Kids’ YouTube channel. The videos include 15 people with Down Syndrome and 6 neurotypical individuals. To compare how the algorithms perform, we developed a Python script to compute the word error rate along with mismatch, insertion, and deletion errors.
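The authors' script is not reproduced in this abstract. As a minimal sketch of how such metrics are commonly computed, the following aligns a reference transcript with an algorithm's output using word-level edit distance, then counts each error type. It assumes that "mismatch" corresponds to what is usually called a substitution, and that text is lowercased and split on whitespace; the function name `word_error_counts` and these normalization choices are illustrative, not taken from the study.

```python
def word_error_counts(reference: str, hypothesis: str) -> dict:
    """Count mismatches, deletions, and insertions between two transcripts.

    Hypothetical helper: normalization (lowercase, whitespace split) is an
    assumption, not the authors' documented procedure.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    n, m = len(ref), len(hyp)

    # dp[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i  # delete every reference word
    for j in range(1, m + 1):
        dp[0][j] = j  # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j - 1] + cost,  # match or mismatch
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
            )

    # Walk back through the table to attribute each edit to an error type.
    # When several alignments tie, this prefers mismatches over deletions
    # and insertions, so the per-type split is one valid choice among ties.
    mismatches = deletions = insertions = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            mismatches += 1
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            deletions += 1
            i -= 1
        else:
            insertions += 1
            j -= 1

    errors = mismatches + deletions + insertions
    # Word error rate: total errors divided by reference length.
    # max(n, 1) avoids division by zero for an empty reference.
    return {
        "mismatches": mismatches,
        "deletions": deletions,
        "insertions": insertions,
        "wer": errors / max(n, 1),
    }


if __name__ == "__main__":
    # Made-up sentences for illustration only, not data from the study.
    print(word_error_counts("i like to read books", "i like read book"))
```

Because word error rate divides by the length of the reference transcript, it can exceed 1.0 when an algorithm inserts many extra words. The total error count is the same for every tied alignment, but the split across error types can vary with the tie-breaking rule.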
Results: Every algorithm performed better for neurotypical individuals than for individuals with Down Syndrome, by almost 40%. Overall, the most accurate algorithm was Azure, both for individuals with Down Syndrome (46%) and for neurotypical individuals (87%). Across all algorithms, the most common error was mismatched words, followed by deleted words; inserted words were the least common error.
Conclusion: Even though Azure performs better than the other algorithms, it still does not work well for individuals with Down Syndrome. To further understand the limitations and potential improvements of these algorithms, we propose a phonetic analysis to identify key sounds that each algorithm has difficulty detecting. The end goal is to determine the best algorithm for analyzing speech from individuals with Down Syndrome and ultimately to provide a more inclusive and accurate algorithm. We also plan to evaluate state-of-the-art AI speech recognition systems such as those from OpenAI and AssemblyAI.
Acknowledgments: The first three authors contributed equally to this paper. We also thank Dr. Vivian Genaro Motti for her contributions to this research.
Anderson, Kayla; Abrahamsson, Cecilia Marie; Chen, Yingying 'Yuki'; and Cibrian, Franceli L., "Analysis of Speech-to-Text Algorithms in Recognizing Down Syndrome Conversations" (2023). Student Scholar Symposium Abstracts and Posters. 575.