Background and Objectives: The need for a precise, objective assessment of speech recognition ability and errors has been emphasized. However, because current clinical speech recognition tests report only a patient's average percent correct score, it is difficult to determine whether an incorrect response reflects phonemic confusion or unfamiliarity with the word. This study aimed to conduct a speech recognition test and to systematically analyze incorrect responses in terms of error rate, error type, error pattern, and speech perception similarity and distance. The results were intended to be implemented in an algorithm for the development of a 'hearing-based personalized monosyllabic word error diagnosis system' that classifies and diagnoses the recognition factors that are difficult for patients with hearing loss.
Methods: A speech recognition test was administered to 72 elderly participants using 363 words composed of CVC, CV, V, and VC combinations, selected from an initial pool of 3,192 words drawn from the Korean standard pronunciation dictionary. The words were chosen on the basis of a 90% correct score among eight adults with normal hearing. All stimuli were recorded by two trained announcers (one male, one female), and any inaccurate or over-articulated stimuli were re-recorded to ensure stimulus reliability. Subjects were classified into normal hearing, mild, moderate, and severe hearing loss groups according to the British Society of Audiology (BSA) hearing threshold criteria. Stimuli were presented in random order at each participant's most comfortable level (MCL), and participants were asked to repeat what they heard. All responses were recorded and transcribed verbatim. Incorrect responses were then categorized by error rate, error type, error pattern, and speech perception similarity and distance.
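The abstract does not describe the scoring procedure at the implementation level. As a minimal, hypothetical sketch (all names and types are invented for illustration, not the authors' actual code), phoneme-level errors in a Korean monosyllable could be classified by comparing the initial, medial, and final positions of the target and the transcribed response:

```python
# Hypothetical sketch: classifying a monosyllabic response against its target.
from dataclasses import dataclass
from typing import Optional

POSITIONS = ("initial", "medial", "final")  # Korean onset / vowel / coda

@dataclass
class Syllable:
    initial: Optional[str]  # initial consonant; None if absent (V, VC words)
    medial: str             # medial vowel (always present)
    final: Optional[str]    # final consonant; None if absent (CV, V words)

def classify_errors(target: Syllable, response: Optional[Syllable]) -> list:
    """Return (position, error_type, target_phoneme, response_phoneme) tuples."""
    if response is None:                      # no response at all
        return [(None, "failure", None, None)]
    errors = []
    for pos in POSITIONS:
        t, r = getattr(target, pos), getattr(response, pos)
        if t == r:
            continue                          # correct at this position
        if t is not None and r is None:
            errors.append((pos, "omission", t, r))
        elif t is None and r is not None:
            errors.append((pos, "addition", t, r))
        else:
            errors.append((pos, "substitution", t, r))
    return errors

# Example: target /kam/ heard as /kan/ -> final-consonant substitution.
target = Syllable(initial="k", medial="a", final="m")
response = Syllable(initial="k", medial="a", final="n")
print(classify_errors(target, response))
# [('final', 'substitution', 'm', 'n')]
```

Under this scheme, a response wrong at more than one position yields a combined error (e.g., two substitutions), and per-position error rates follow directly by aggregating tuples over all test items.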
Results: As hearing thresholds increased, so did the error rate. Among the three phoneme positions, the initial consonant showed the highest error rate, with errors spanning various frequency ranges. In all hearing groups, the error rate for diphthongs was approximately 20 to 40% higher than for monophthongs. For the final consonant, the phonemes with the highest error rates differed across groups. Words with high error rates were influenced mainly by the initial consonant, and this influence grew as hearing loss became more severe. In terms of error type, substitution was the most common error in all hearing groups, followed by combined errors with substitutions at more than one phoneme position; however, failure errors increased sharply in the severe hearing loss group. The phonemes with the highest error frequency differed by phoneme position: for the initial consonant, errors in the normal hearing and mild hearing loss groups clustered by manner of articulation (obstruent vs. sonorant); for the medial vowel, by tongue position; and for the final consonant, again by manner of articulation (obstruent vs. sonorant). As hearing loss severity increased, both the frequency of confusions and the degree of similarity between phonemes increased, making phonemes harder to identify depending on their position. The error pattern analysis was consistent with the speech perception distance analysis, which revealed shorter distances between the phonemes grouped together in the error patterns; this relationship became clearer as hearing loss severity increased. Finally, based on these results, we implemented an algorithm that identifies the target hearing level, sets and presents test items according to the degree of hearing loss, analyzes error patterns, and reports comprehensive results.
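The abstract does not specify the perception distance metric. As one hedged illustration, a symmetric confusion-based distance could be derived from response counts, so that phonemes confused more often come out "closer"; all counts, phonemes, and names below are invented for the example:

```python
# Hypothetical sketch: perceptual distance from a phoneme confusion matrix.
from collections import Counter
from itertools import combinations

# Invented confusion counts for one hearing group:
# (target_phoneme, response_phoneme) -> count from transcribed responses.
confusions = Counter({
    ("p", "p"): 40, ("p", "t"): 8, ("p", "k"): 6,
    ("t", "t"): 38, ("t", "p"): 9, ("t", "k"): 5,
    ("k", "k"): 35, ("k", "p"): 7, ("k", "t"): 10,
})

def confusion_prob(a: str, b: str) -> float:
    """Estimate P(respond b | target a) from the counts."""
    total = sum(c for (t, _), c in confusions.items() if t == a)
    return confusions[(a, b)] / total if total else 0.0

def perceptual_distance(a: str, b: str) -> float:
    """Symmetric distance in [0, 1]; 0 would mean always confused."""
    p = 0.5 * (confusion_prob(a, b) + confusion_prob(b, a))
    return 1.0 - p

for a, b in combinations(["p", "t", "k"], 2):
    print(f"d({a},{b}) = {perceptual_distance(a, b):.2f}")
```

With a metric of this kind, the reported finding would appear as shrinking distances between phonemes that the error pattern analysis grouped together, with the shrinkage growing more pronounced in the more severe hearing loss groups.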
Conclusions: To accurately identify recognition difficulties, it is important to focus on errors and analyze their underlying factors. The study results indicate that the initial consonant plays the most important role in speech recognition. Because the phonemes with the highest error rates differed across hearing groups, individual speech recognition may vary with the syllable combination, suggesting that varied combinations should be used to improve the accuracy of test results. The error pattern characteristics of elderly listeners according to their degree of hearing loss were confirmed and implemented in the algorithm presented in this study. With further clinical trials verifying the validity and reliability of the algorithm, a faster and more accurate test of speech recognition and discrimination for hearing-impaired elderly listeners can be expected.