VCCA2020 SUBMITTED ABSTRACT
Authors: Leontien Pragt1, Peter van Hengel1, Dagmar Grob1, Jan-Willem Wasmann1
1Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Centre Nijmegen, the Netherlands
Background Speech recognition software has become increasingly sophisticated and accurate due to progress in information technology. The software converts speech into text using artificial intelligence. Most of these apps are intended for voice commands and note-taking, but some have been developed specifically for deaf and hearing-impaired users. This project aims to examine the performance of speech recognition apps and to explore which audiological tests give a representative measure of their ability to convert speech into text.
Method Four apps were tested on an iOS and an Android smartphone: AVA (iOS, Android), Earfy (iOS, Android), Live Transcribe (Android), and Speechy (iOS). The audiological test battery consisted of speech audiometry (NVA lists of Dutch CVC words), the DIN (Digits-in-Noise) test, and the PLOMP test (Dutch sentences), presented with and without noise. Lastly, we presented a dialogue in Dutch and in English to the apps and scored all correct, incorrect, and missing words.
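The word-level scoring of the dialogue can be illustrated with a minimal sketch. The abstract does not describe how the scoring was actually carried out, so everything here is an assumption for illustration: the function name `score_words`, the alignment via Python's standard `difflib.SequenceMatcher`, and the choice to ignore extra words that the app inserts.

```python
from difflib import SequenceMatcher
import string


def score_words(reference: str, transcript: str) -> dict:
    """Count correct, incorrect, and missing reference words (illustrative only)."""
    # Normalise: lower-case and strip surrounding punctuation from each word.
    def tokens(text):
        words = (w.strip(string.punctuation) for w in text.lower().split())
        return [w for w in words if w]

    ref, hyp = tokens(reference), tokens(transcript)
    counts = {"correct": 0, "incorrect": 0, "missing": 0}

    # Align the two word sequences and classify each aligned block.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        n_ref, n_hyp = i2 - i1, j2 - j1
        if tag == "equal":
            counts["correct"] += n_ref
        elif tag == "replace":
            # Reference words replaced by other words count as incorrect;
            # any surplus reference words in the block count as missing.
            counts["incorrect"] += min(n_ref, n_hyp)
            counts["missing"] += max(n_ref - n_hyp, 0)
        elif tag == "delete":
            counts["missing"] += n_ref
        # "insert" blocks (extra words in the transcript) are ignored here.
    return counts


result = score_words("The cat sat on the mat.", "the cat sat in the")
percent_correct = 100 * result["correct"] / sum(result.values())
print(result, round(percent_correct, 1))
```

With this definition the three counts always sum to the number of words in the reference dialogue, so a percentage of correct words follows directly from them.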
Results All apps scored at least 50% phonemes correct for speech audiometry above 65 dB SPL. Earfy (iOS) and AVA (iOS) achieved 100% phoneme discrimination at 70 dB SPL. AVA (iOS, Android) and Live Transcribe (Android) had the poorest SNR, +8 dB, on the DIN test. All apps had a speech reception threshold (SRT) between 50 and 60 dB SPL on the PLOMP test without noise. With noise added, the best SNR measured was +8 to +9 dB, for Earfy (Android) and Live Transcribe (Android). Overall, the number of correctly transcribed words was higher in English than in Dutch. In Dutch, Earfy (Android) and Speechy (iOS) reached the highest scores, about 80% correct words. In all tests, the test-retest variability appears to be higher for the apps than for humans.
Conclusion Because test outcomes vary among apps across the different audiological tests, no app stands out from the others in terms of performance. Compared to normal-hearing listeners, all apps show a worse (higher) SRT. A normal-hearing listener achieves 100% phoneme discrimination at 50 dB SPL, whereas the apps need speech levels at least 20 dB higher. Moreover, all apps require a positive SNR of around +8 dB, while the SNR for normal hearing is between -8 and -10 dB, a gap of roughly 16 to 18 dB. Interestingly, several (profoundly) hearing-impaired people report that they benefit from the apps in certain listening situations.