AI-powered Speech Intelligibility Assessment for Children

Authors: Vicky Zhang¹, Arun Sebastian¹, Jessica Monaghan¹

¹National Acoustic Laboratories

Background: Speech intelligibility (SI) is crucial for effective communication, particularly for young children, whose social and academic development can be significantly affected by how well their speech is understood by others in everyday life. Monitoring SI is a valuable way to track speech development after amplification; however, there is ongoing debate about how best to measure it. This study explores an AI-based method for the automatic assessment and scoring of SI in 5-year-old children and compares the accuracy and consistency of AI and human transcriptions at the word level.

Methods: We used 18 pre-trained AI models to transcribe 2,990 recorded sentences spoken by children with normal hearing (NH; 850 sentences) and children with hearing loss (HL; 2,140 sentences). Each sentence was also transcribed by human listeners. We then compared word-level accuracy between the AI methods and naïve human listeners.
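
A minimal sketch of this transcription-and-scoring step is shown below, assuming one openly available pre-trained ASR model. The model name ("openai/whisper-small"), audio file, and target sentence are illustrative placeholders; the abstract does not identify the 18 models used or the exact scoring rules.

```python
from transformers import pipeline
import jiwer

# Load one pre-trained ASR model (model choice is an illustrative assumption)
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

def words_correct(reference: str, hypothesis: str) -> int:
    """Count reference words reproduced by the transcription, via word alignment."""
    out = jiwer.process_words(reference.lower(), hypothesis.lower())
    return out.hits  # words matched exactly after alignment

# Transcribe one recorded sentence and score it against the target sentence
hypothesis = asr("child_sentence_001.wav")["text"]  # hypothetical file name
target = "the cat sat on the mat"  # illustrative target sentence
print(f"{words_correct(target, hypothesis)} of {len(target.split())} words correct")
```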

Results: The findings show a high level of agreement between the AI method and human listeners for both the NH and HL groups of children's speech recordings (ICC > 0.9 for each group). The number of correct words from the AI method correlated significantly with the averaged listener scores (p < 0.001 for both the NH and HL groups). The overall percentage scores from the AI model also indicate that children with hearing aids had significantly lower SI than their NH peers and children with cochlear implants (p < 0.001).
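
A sketch of how such agreement statistics could be computed is shown below, assuming per-sentence correct-word counts stored in long format with the AI treated as one rater alongside the human listeners. The file name, column names, and the ICC variant are assumptions; the abstract does not specify which ICC form was reported.

```python
import pandas as pd
import pingouin as pg
from scipy.stats import pearsonr

# Hypothetical long-format table: one row per (sentence, rater) pair
scores = pd.read_csv("word_scores.csv")  # columns: sentence, rater, n_correct

# Intraclass correlation across raters, with the AI as one rater among listeners
icc = pg.intraclass_corr(data=scores, targets="sentence",
                         raters="rater", ratings="n_correct")
print(icc[["Type", "ICC", "pval"]])

# Correlation between AI scores and the averaged listener scores, per sentence
wide = scores.pivot(index="sentence", columns="rater", values="n_correct")
listener_mean = wide.drop(columns="AI").mean(axis=1)
r, p = pearsonr(wide["AI"], listener_mean)
print(f"Pearson r = {r:.2f}, p = {p:.3g}")
```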

Conclusions: The AI-based method transcribed children's speech with word-level accuracy closely matching that of naïve human listeners across both the NH and HL groups, suggesting it could provide an automatic and consistent way to assess and monitor SI in young children.