Non-Intrusive Prediction of Speech Intelligibility and Listening Effort in Real-Time

Authors: Martin Berdau1

1Medizinische Physik, Carl-von-Ossietzky-Universität Oldenburg

Speech Intelligibility (SI) and Listening Effort (LE) are essential measures for assessing speech perception in an acoustic scenario. This is especially the case in the context of hearing aids, where the aim is to maximize SI or minimize LE by applying different hearing aid programs and signal processing algorithms. Making accurate model-based predictions can become quite challenging in dynamic acoustic scenarios with, e.g., varying Signal-to-Noise Ratios, reverberation, or speaker positions. Handling these challenges requires fast adaptation to the scene. Moreover, to be usable in hearing aid devices, a model must be applicable in real time and non-intrusive. However, current modeling is often done offline, where audio of a fixed length is used and all signal properties at any point in time are known, making those models unusable for practical applications in unknown listening scenarios. In addition, many other models require separate clean target and interferer source signals of an acoustic scene, which are hard to acquire. Therefore, we propose a non-intrusive real-time model for predicting SI and LE, which consists of a binaural front-end and a single-channel back-end. The front-end receives the two ear signals and applies processing that accounts for binaural hearing, outputting a single-channel audio signal. The back-end predicts SI and LE from this output by using phoneme probability distributions as employed in Automatic Speech Recognition systems. To evaluate the model predictions, measurements are planned in which participants are required to make real-time assessments. This work may allow for easy real-time monitoring in the context of Ecological Momentary Assessment in the near future.
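
As a rough illustration of how frame-wise phoneme probability distributions could be turned into a causal, running quantity, the following Python sketch computes the normalized entropy of each frame and smooths it with an exponential moving average that only uses past frames. This is not the model proposed here: the entropy measure, the smoothing constant `alpha`, and the function names `frame_entropy` and `running_score` are assumptions made for the example, and the posteriors are assumed to come from some ASR acoustic model that is not shown.

```python
import numpy as np


def frame_entropy(posteriors, eps=1e-12):
    """Shannon entropy (in nats) of each row of a (frames, phonemes) matrix.

    Each row is assumed to be a phoneme probability distribution
    (non-negative, summing to 1), e.g. from an ASR acoustic model.
    """
    p = np.clip(posteriors, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def running_score(posteriors, alpha=0.1):
    """Causal exponential smoothing of the normalized frame entropy.

    Hypothetical measure for illustration only: values near 0 mean the
    phoneme distributions are sharply peaked, values near 1 mean they
    are close to uniform. Only current and past frames are used, so the
    score can be updated frame by frame.
    """
    n_phonemes = posteriors.shape[1]
    h = frame_entropy(posteriors) / np.log(n_phonemes)  # scale to [0, 1]
    out = np.empty_like(h)
    state = h[0]  # initialize the smoother with the first frame
    for t, value in enumerate(h):
        state = (1.0 - alpha) * state + alpha * value
        out[t] = state
    return out
```

Calling `running_score` on a matrix whose rows are uniform distributions gives values close to 1, while rows that put all probability on one phoneme give values close to 0. How such a quantity would relate to SI or LE is exactly what a model and its evaluation would have to establish.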