Correlates of linguistic processing in the frequency following response to naturalistic speech

Mikolaj Kegler1, Hugo Weissbart2, Tobias Reichenbach1,3

1 Department of Bioengineering & Centre for Neurotechnology, Imperial College London, London, UK; 2 Donders Centre for Cognitive Neuroimaging & Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands; 3 Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany

Background: Comprehension of spoken language requires rapid and continuous integration of incoming acoustic information. Most studies of the neural correlates of natural language comprehension focus on comparatively slow cortical activity. However, fast neural activity in subcortical and cortical areas can also track the fundamental frequency (f0) of voiced speech. Whether this fast neural tracking plays a role in the linguistic aspects of speech processing remains unclear. Here, we investigated whether this neural response is influenced by linguistic cues.

Methods: We measured EEG while participants listened to audiobooks. We then used a language model to compute linguistic features describing each word from the stories. Each word was characterized by its frequency out of context and by context-dependent surprisal and precision. We used a linear model to find a mapping between the fundamental waveform, which oscillated at the f0 of the speech signal, and the EEG. The model quantified the neural tracking of the fundamental waveform through a reconstruction score. Finally, we established a multiple regression model that predicted the reconstruction score for each word from its linguistic features.
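The analysis steps above can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation. It assumes a backward (stimulus-reconstruction) ridge regression, a real-valued fundamental waveform, Pearson correlation as the per-word reconstruction score, and no cross-validation. All function names, the ridge parameter, the lag range, and the toy signals are hypothetical choices made for illustration.

```python
import numpy as np

def lagged_design(eeg, lags):
    """Stack time-shifted copies of all EEG channels into one design matrix."""
    cols = []
    for lag in lags:
        shifted = np.roll(eeg, -lag, axis=0)  # row t holds the EEG at time t + lag
        if lag > 0:
            shifted[-lag:] = 0.0  # zero the samples that wrapped around the end
        elif lag < 0:
            shifted[:-lag] = 0.0  # zero the samples that wrapped around the start
        cols.append(shifted)
    return np.hstack(cols)

def fit_backward_model(eeg, stimulus, lags, ridge=1.0):
    """Ridge regression that reconstructs the stimulus from the lagged EEG."""
    X = lagged_design(eeg, lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ stimulus)

def word_scores(eeg, stimulus, weights, lags, word_bounds):
    """Pearson correlation between reconstruction and stimulus within each word."""
    recon = lagged_design(eeg, lags) @ weights
    return np.array([np.corrcoef(recon[a:b], stimulus[a:b])[0, 1]
                     for a, b in word_bounds])

def regress_scores(scores, features):
    """Ordinary least squares: score ~ intercept + linguistic features."""
    X = np.column_stack([np.ones(len(scores)), features])
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coef

# Toy data (assumed): a 100 Hz sine stands in for the fundamental waveform,
# and four "EEG" channels carry it with an 11-sample (11 ms) delay plus noise.
rng = np.random.default_rng(0)
fs, n, delay = 1000, 20000, 11
stimulus = np.sin(2 * np.pi * 100 * np.arange(n) / fs)
eeg = np.zeros((n, 4))
eeg[delay:, :] = stimulus[:-delay, None]
eeg += rng.standard_normal(eeg.shape)

lags = range(0, 20)                                   # 0 to 19 ms at 1 kHz
word_bounds = [(i, i + 500) for i in range(0, n, 500)]  # 40 pseudo-words

weights = fit_backward_model(eeg, stimulus, lags)
scores = word_scores(eeg, stimulus, weights, lags, word_bounds)

# Random stand-ins for word frequency, surprisal and precision; because they
# are unrelated to the scores, the fitted slopes here carry no meaning.
features = rng.standard_normal((len(word_bounds), 3))
coef = regress_scores(scores, features)
print("mean reconstruction score:", scores.mean())
print("intercept and slopes:", coef)
```

In a real analysis the model would be trained and scored on separate data, and the feature columns would come from the language model rather than a random generator.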

Results: The neural response estimated by the linear model had a low latency (11 ms) and a high frequency (above 50 Hz), both characteristic of neural tracking of the f0. The coefficients of the multiple regression model indicated that single-word neural phase-locking to the f0 was significantly influenced by the context-dependent linguistic features, namely word precision and surprisal.

Conclusion: We showed that the neural response to the f0 of continuous speech in naturalistic narratives is modulated by context-dependent linguistic cues. Given the low latency of the response, our findings suggest that it is under top-down control from higher processing centers. Our results show that the neural response at the f0 plays an active role in the rapid and continuous processing of spoken language.

Figure: A) Latency of the frequency following response to natural speech estimated by the complex linear model. B) Magnitudes of the model coefficients at 11 ms. C) Phases of the model coefficients at 11 ms.