Experiments comparing isolated word recognition by human listeners with recognition by automatic speech recognition systems are valuable because error analyses may lead to improvements in speech recognition technology. Isolated word recognition in adult human listeners has been compared with recognition performance by two commercially available speech‐recognition systems. The test stimuli were drawn from the Lincoln Laboratory Stressed‐Speech database. The database consists of 6930 stimuli (two iterations of each of 35 words spoken by nine different people in 11 different speaking styles). The vocabulary contains confusable words (i.e., go, hello, oh, no, and zero); the speaking styles include a wide range of naturally occurring variations (e.g., normal, slow, fast, soft, loud, and angry). Analyses show that the acoustic characteristics of individual words vary considerably across talkers, and across styles within talkers. Performance of human listeners and the two machine‐based recognition systems was tested in a single‐talker, multistyle condition, and in a multitalker, multistyle condition. All tests were conducted under two listening conditions: normal, and in the presence of masking noise. The data to be presented are the error patterns of human listeners, versus those of the machine‐recognition systems, exhibited across talkers, across speaking styles, and across training conditions (multitalker, multistyle training versus single‐talker, single‐style training). [Work supported by Boeing Aerospace and Electronics.]
November 1989
August 13 2005
Word recognition by humans and machines: Tests on a multitalker, multistyle database
Patricia K. Kuhl
Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98195
Kerry P. Green
Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98195
John W. Gordon
Boeing Aerospace and Electronics, Seattle, WA 98124
David L. Sanford
Boeing Aerospace and Electronics, Seattle, WA 98124
Caroline Fu
Boeing Aerospace and Electronics, Seattle, WA 98124
J. Acoust. Soc. Am. 86, S77 (1989)
Citation
Patricia K. Kuhl, Kerry P. Green, John W. Gordon, David L. Sanford, Caroline Fu; Word recognition by humans and machines: Tests on a multitalker, multistyle database. J. Acoust. Soc. Am. 1 November 1989; 86 (S1): S77. https://doi.org/10.1121/1.2027648