Authors: Franklin Y Alvarez Cardinale * 1 ; Waldo Nogueira 2
Affiliations:
1 Medizinische Hochschule Hannover
2 Medical University Hannover, Cluster of Excellence Hearing4all
Predicting the speech understanding performance of hearing-impaired people has been a challenging task for researchers, clinicians and developers. An objective measure of speech intelligibility consists of an algorithm that transforms the sound signal reaching the ears into a speech intelligibility index. Intrusive measures use information from the target speech to evaluate the signal. Additionally, some of these algorithms apply a hearing loss model to the perceived signal to account for individual hearing loss.
This work presents an intrusive objective speech intelligibility algorithm that uses a physiological computational model of the human peripheral auditory system. Such a model consists of a population of auditory nerve fibers that transform acoustic signals into neural activity in the form of spikes. Mutual information between the spike activity obtained from an ideal listener and that obtained from a hearing-aided listener is computed to calculate an index that is mapped to a speech intelligibility score. This algorithm is called the spike activity mutual information index (SAMII).
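The mutual information step can be illustrated with a minimal sketch. The abstract does not state how spikes are binned or which estimator SAMII uses, so the binary, time-binned spike trains and the plug-in (histogram) estimator below are assumptions made for illustration, and the function name `mutual_information_bits` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information_bits(reference, degraded):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences.

    Assumption: each sequence is a binned spike train (1 = spike in the bin,
    0 = no spike). This is an illustrative estimator, not the SAMII estimator.
    """
    if len(reference) != len(degraded):
        raise ValueError("spike trains must have the same number of bins")
    n = len(reference)
    if n == 0:
        raise ValueError("spike trains must not be empty")

    joint = Counter(zip(reference, degraded))  # counts of (x, y) pairs
    count_x = Counter(reference)               # marginal counts of x
    count_y = Counter(degraded)                # marginal counts of y

    mi = 0.0
    for (x, y), count_xy in joint.items():
        p_xy = count_xy / n
        # p(x, y) / (p(x) * p(y)) written with counts to avoid extra divisions
        ratio = count_xy * n / (count_x[x] * count_y[y])
        mi += p_xy * log2(ratio)
    return mi


# Hypothetical binned spike trains for one auditory nerve fiber.
ideal_listener = [0, 1, 0, 1, 1, 0, 0, 1]
aided_listener = [0, 1, 0, 0, 1, 0, 1, 1]
print(mutual_information_bits(ideal_listener, aided_listener))
```

In this sketch, identical spike trains yield the full entropy of the reference train, and statistically independent trains yield zero, so a higher value indicates that more of the ideal listener's neural information survives in the hearing-aided listener's response.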
The performance of SAMII was compared to that of the modified binaural short-time objective intelligibility (MBSTOI) measure. The experimental baseline was taken from the first Clarity Prediction Challenge. It consisted of simulations of different scenes in which a target speaker, a noise source and a hearing-aided listener were randomly located in a reverberant room. The root mean square error (RMSE) between the predicted percentage of words recognized and real listener performance was used to evaluate both algorithms. SAMII and MBSTOI obtained RMSEs of 35.16% and 36.52%, respectively.
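For reference, the RMSE metric can be sketched as follows. The function name `rmse` and the example scores are hypothetical and are not data from the challenge. Both inputs are assumed to be percentages of words recognized, one value per scene and listener.

```python
from math import sqrt


def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences of scores."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one score is required")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical scores in percent of words recognized (not challenge data).
predicted_scores = [80.0, 45.0, 10.0, 95.0]
measured_scores = [100.0, 20.0, 0.0, 90.0]
print(rmse(predicted_scores, measured_scores))
```

Because the scores are percentages, the resulting RMSE is expressed in percentage points, which is how the figures reported for SAMII and MBSTOI should be read.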
Although SAMII does not show a significant improvement over MBSTOI, it was able to predict speech understanding using a low-level representation of the sound. With further improvements, SAMII has the advantage that it can include highly detailed models of the peripheral auditory system to predict speech intelligibility, with applications beyond hearing aids.