Cortical tracking of speech: Effects of intelligibility and spectral degradation

Authors: Alexis D Deighton MacIntyre*1; Tobias Goehring1

1 University of Cambridge

During speech listening, recurring patterns of neural activity become temporally coupled to stimulus features. Such “cortical tracking” can be measured using electroencephalography (EEG). This technique may hold promise in clinical applications—for example, as an objective measure to guide cochlear implant (CI) fitting. Although cortical tracking is established in experimental settings, the effect of spectral degradation associated with CI signal processing is unclear.

We simulate CI listening by presenting natural and spectrally degraded speech to typically hearing listeners (n = 36) undergoing EEG recording. To dissociate sensory from linguistic-phonological processing, we use intelligible (English) and non-intelligible (Dutch) speech produced by the same bilingual speaker. To maintain auditory attention irrespective of intelligibility, we devised a novel prosodic target detection task. Decoding models were trained to reconstruct the speech amplitude envelope from the neural response and were evaluated on held-out data, with the correlation between the reconstructed and true stimulus envelopes providing a measure of cortical tracking.
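The decoding step can be illustrated with a minimal sketch. The abstract does not state which model was used, so the ridge-regularised linear decoder on time-lagged EEG below is an assumption, as are the function names, the lag range, the regularisation value, and the synthetic data, all of which are hypothetical and for illustration only.

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack time-shifted copies of eeg (samples x channels), one per lag.

    For lag k >= 0, row t holds the EEG at time t + k, so the decoder can
    use neural activity that follows the stimulus sample at time t.
    """
    n, c = eeg.shape
    X = np.zeros((n, c * len(lags)))
    for i, lag in enumerate(lags):
        X[: n - lag, i * c:(i + 1) * c] = eeg[lag:]
    return X

def fit_decoder(eeg, envelope, lags, alpha):
    """Ridge regression from lagged EEG to the stimulus envelope."""
    X = lag_matrix(eeg, lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + alpha * np.eye(XtX.shape[0]), X.T @ envelope)

def tracking_score(eeg, envelope, weights, lags):
    """Pearson correlation between reconstructed and true envelope."""
    reconstruction = lag_matrix(eeg, lags) @ weights
    return np.corrcoef(reconstruction, envelope)[0, 1]

# Synthetic stand-in data (assumption): 8 channels that each carry a
# delayed, scaled copy of the envelope plus independent noise.
rng = np.random.default_rng(0)
n_samples, n_channels, delay = 4000, 8, 5
envelope = rng.standard_normal(n_samples)
gains = np.linspace(0.5, 1.5, n_channels)
eeg = np.roll(envelope, delay)[:, None] * gains
eeg = eeg + rng.standard_normal((n_samples, n_channels))

lags = list(range(10))      # hypothetical lag window, in samples
half = n_samples // 2       # first half for training, second half held out
weights = fit_decoder(eeg[:half], envelope[:half], lags, alpha=1.0)
score = tracking_score(eeg[half:], envelope[half:], weights, lags)
print(f"held-out reconstruction correlation: {score:.2f}")
```

Scoring on data that the decoder never saw during fitting keeps the correlation from being inflated by overfitting, which is why the score is computed on the second half only.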

Cortical tracking was slightly, but significantly, reduced for non-intelligible speech. We find no clear effect of spectral degradation. The behavioural prosody-detection task was performed similarly well across conditions, although both intelligibility and spectral degradation adversely impacted reaction times. Hence, whereas we find behavioural differences as a result of spectral degradation, these differences are potentially too small to be captured by this measure of cortical tracking.

Determining the audiological applications of cortical tracking requires a nuanced understanding of which aspects of acoustic speech processing it is capable of representing. Cortical tracking may not reveal subtle differences in spectral resolution, but could in principle be used as an objective measure of speech envelope tracking with CI.