Machine learning for models of auditory perception

Author: Dr Josef Schlittenlacher, University of Manchester, UK

Background: Machine-learning techniques have created new possibilities in hearing healthcare and auditory modelling. Because they can quantify uncertainty, learn from datasets and compute efficiently on parallel hardware, they are particularly useful for perceptual models that are complex or incorporate individual parameters. We present three applications to models of auditory perception: knowledge distillation to speed up computation, a combination of first principles and deep learning to model speech recognition, and Bayesian active learning of individual model parameters.

Methods: (1) A three-layer perceptron with simple rectified linear unit activation functions was trained on about 1.7 million spectra of speech and artificial sounds to predict the same loudness as the Cambridge loudness model, but considerably faster. (2) An automatic speech recognition (ASR) system was built to allow modelling of impairments in the spectral domain, such as lesions, and in the time domain, such as the size of temporal processing windows. It consists of causal and non-causal neural networks and a hidden Markov model to predict phonemes (see figure). (3) The edge frequency of a dead region and the outer hair cell loss at that place were learned in a Bayesian active-learning hearing test to determine these parameters of an individual model for audibility.
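
The knowledge-distillation idea in (1) can be illustrated with a minimal sketch: a small perceptron with two hidden rectified-linear layers (the student) is fitted to the outputs of a slower reference model (the teacher). Everything below is an assumption made for illustration. The function `teacher_loudness` is a hypothetical toy stand-in and not the Cambridge loudness model, and the data, layer sizes, learning rate and plain gradient-descent training are arbitrary choices rather than the settings used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_loudness(spectra):
    """Hypothetical slow reference model (NOT the Cambridge loudness model):
    a compressive power-law sum over frequency bands."""
    return np.sum(spectra ** 0.3, axis=1, keepdims=True)

# Toy training set: random non-negative "spectra", labelled by the teacher.
n_bands, n_hidden = 8, 32
x_train = rng.uniform(0.0, 1.0, size=(2000, n_bands))
y_train = teacher_loudness(x_train)

# Student: two hidden ReLU layers and a linear output (sizes are assumptions).
sizes = [n_bands, n_hidden, n_hidden, 1]
weights = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Return the activations of every layer, input first, output last."""
    activations = [x]
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ w + b
        is_output = i == len(weights) - 1
        activations.append(z if is_output else np.maximum(z, 0.0))
    return activations

def mse(x, y):
    return float(np.mean((forward(x)[-1] - y) ** 2))

loss_before = mse(x_train, y_train)
learning_rate = 0.005
for epoch in range(500):
    acts = forward(x_train)
    grad = 2.0 * (acts[-1] - y_train) / len(x_train)  # d(loss)/d(output)
    for i in reversed(range(len(weights))):
        grad_w = acts[i].T @ grad
        grad_b = grad.sum(axis=0)
        if i > 0:
            # Propagate through the weights, then through the ReLU of layer i-1.
            grad = (grad @ weights[i].T) * (acts[i] > 0)
        weights[i] -= learning_rate * grad_w
        biases[i] -= learning_rate * grad_b
loss_after = mse(x_train, y_train)

print(f"student MSE vs teacher: {loss_before:.3f} -> {loss_after:.3f}")
```

Once trained, the student replaces the teacher at inference time; its cost is a few matrix multiplications per spectrum, which is what makes batch processing on parallel hardware attractive.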

Results: (1) Predictions were accurate for all kinds of sounds, and a 24-hour recording can be processed within minutes on a graphics processing unit; the reference model takes about 50 times real time. (2) The ASR system is a good predictor of the speech-recognition and phoneme-specific performance of cochlear-implant users. (3) The test was able to identify the individual parameters within 15 minutes.
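
The principle behind a Bayesian active-learning test such as the one in (3) can be sketched generically: keep a posterior over the individual parameters, present the stimulus whose response is expected to be most informative, and update the posterior with the observed response. The sketch below is an assumption-laden toy, not the test used in the study. It replaces the dead-region edge frequency and outer hair cell loss with two hypothetical parameters (threshold and slope of a logistic psychometric function), uses a simple grid posterior, and simulates the listener.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameter grid standing in for the two individual parameters.
thresholds = np.linspace(0.0, 80.0, 41)   # dB
slopes = np.linspace(0.1, 1.0, 10)        # 1/dB
T, S = np.meshgrid(thresholds, slopes, indexing="ij")
levels = np.linspace(0.0, 80.0, 41)       # candidate stimulus levels in dB

def p_heard(level, threshold, slope):
    """Assumed logistic psychometric function: probability of a 'heard' response."""
    return 1.0 / (1.0 + np.exp(-slope * (level - threshold)))

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

posterior = np.full(T.shape, 1.0 / T.size)   # uniform prior over the grid
true_threshold, true_slope = 42.0, 0.5       # simulated listener (assumption)

for trial in range(60):
    # P('heard') for every candidate level under every hypothesis: (levels, T, S).
    like = p_heard(levels[:, None, None], T, S)
    p_yes = (like * posterior).sum(axis=(1, 2))
    # Expected information gain = entropy of the predicted response
    # minus the posterior-averaged entropy of the response given the parameters.
    info_gain = binary_entropy(p_yes) - (binary_entropy(like) * posterior).sum(axis=(1, 2))
    level = levels[np.argmax(info_gain)]

    # Simulate the response and apply Bayes' rule on the grid.
    heard = rng.random() < p_heard(level, true_threshold, true_slope)
    trial_like = p_heard(level, T, S)
    posterior *= trial_like if heard else (1.0 - trial_like)
    posterior /= posterior.sum()

threshold_estimate = float((posterior * T).sum())
slope_estimate = float((posterior * S).sum())
print(f"posterior mean threshold: {threshold_estimate:.1f} dB, slope: {slope_estimate:.2f} per dB")
```

Choosing each stimulus by expected information gain concentrates trials where the posterior is still uncertain, which is the reason such tests can be short; the grid posterior is used here only because it keeps the update explicit.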

Conclusions: Machine learning has various applications in auditory modelling, and in combination these approaches will transform individual testing and processing in hearing devices.