Searching for individual differences in audiovisual integration of speech in noise

Authors: Chris Sumner1, Samuel Smith2, Jens Roeser1, Thom Baguley1, Paula Stacey1

1Nottingham Trent University
2Harvard University

Background: Speech comprehension is often aided by watching a talker’s face. There are clear individual differences in the ability to understand auditory speech in noise, and in the ability to understand visual speech (“lip-reading”), although most people are poor at the latter. Whether people also differ in how they integrate auditory and visual cues remains a matter of debate.

Methods: To address this question, we applied a new method for quantifying multisensory integration, based on signal detection theory (SDT), to a large dataset of online audiovisual speech perception performance.
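As an illustration only, one common SDT benchmark predicts audiovisual sensitivity from the two unisensory sensitivities under an optimal (independent-channel) integration assumption, d'_AV = sqrt(d'_A^2 + d'_V^2), and compares observed audiovisual performance against that prediction. The sketch below shows this standard benchmark; the function names and example hit/false-alarm rates are hypothetical, and the specific SDT model used in the study may differ.

    # Illustrative sketch of a standard SDT optimal-integration benchmark;
    # not necessarily the exact model applied in this study.
    import numpy as np
    from scipy.stats import norm

    def dprime(hit_rate, fa_rate, eps=1e-3):
        """Convert hit and false-alarm rates to d', clipping to avoid infinite z-scores."""
        h = np.clip(hit_rate, eps, 1 - eps)
        f = np.clip(fa_rate, eps, 1 - eps)
        return norm.ppf(h) - norm.ppf(f)

    def predicted_av_dprime(d_a, d_v):
        """Optimal-integration prediction: d'_AV = sqrt(d'_A**2 + d'_V**2)."""
        return np.sqrt(d_a**2 + d_v**2)

    # Hypothetical participant: auditory-only and visual-only performance
    d_a = dprime(hit_rate=0.75, fa_rate=0.20)   # auditory-only sensitivity
    d_v = dprime(hit_rate=0.60, fa_rate=0.30)   # visual-only sensitivity
    d_av_pred = predicted_av_dprime(d_a, d_v)   # benchmark for observed AV d'
    print(f"d'_A = {d_a:.2f}, d'_V = {d_v:.2f}, predicted d'_AV = {d_av_pred:.2f}")

Observed audiovisual d' at or near the prediction would indicate integration consistent with the benchmark, whereas systematic shortfalls or excesses would indicate individual differences in integration itself.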

Results: Participants (>200) varied in self-reported age, hearing loss, English-language experience, language-specific impairments, and neurodiversity. Audiovisual performance was, for the most part, accurately predicted by unisensory performance. In contrast, there were striking individual differences in both auditory-only and visual-only speech perception. There was also some evidence of small systematic differences in auditory performance across the demographic groups.

Conclusion: This suggests that the “audiovisual integration function” for speech is relatively consistent across a diverse population, with individual differences attributable to differences in unisensory perception.