Several algorithms have been shown to generate a metric corresponding to the Speech Transmission Index (STI) using speech as a probe stimulus [e.g., Goldsworthy and Greenberg, J. Acoust. Soc. Am. 116, 3679–3689 (2004)]. The time-domain approaches work well on long speech segments and have the added potential to be used for short-time analysis. This study investigates the performance of the Envelope Regression (ER) time-domain STI method as a function of window length, in acoustically degraded environments with multiple talkers and speaking styles. The ER method is compared with a short-time Theoretical STI, derived from octave-band signal-to-noise ratios and reverberation times. For windows as short as 0.3 s, the ER method tracks short-time Theoretical STI changes in stationary speech-shaped noise, fluctuating restaurant babble and stationary noise plus reverberation. The metric is also compared to intelligibility scores on conversational speech and speech articulated clearly but at normal speaking rates (Clear/Norm) in stationary noise. Correlation between the metric and intelligibility scores is high and, consistent with the subject scores, the metrics are higher for Clear/Norm speech than for conversational speech and higher for the first word in a sentence than for the last word.
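The Theoretical STI mentioned above is derived from octave-band signal-to-noise ratios and reverberation times. The following is a minimal sketch of one common way such a value can be computed, using the classical modulation transfer function for stationary noise plus exponential reverberation. It is not the authors' implementation: the function names, the equal band weighting, and the example band values are assumptions made for illustration, and a short-time version would simply be fed SNRs measured within each analysis window.

```python
import math

# Modulation frequencies (Hz) conventionally used for STI, in 1/3-octave steps.
MOD_FREQS = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
             3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_index(f_mod, snr_db, rt60):
    """Modulation reduction for stationary noise plus exponential reverberation."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod * rt60 / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def band_transmission_index(snr_db, rt60):
    """Average transmission index over all modulation frequencies for one band."""
    total = 0.0
    for f_mod in MOD_FREQS:
        m = modulation_index(f_mod, snr_db, rt60)
        # Keep m strictly inside (0, 1) so the logarithm stays finite.
        m = min(max(m, 1e-6), 1.0 - 1e-6)
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))
        # Apparent SNR is limited to the range -15 dB to +15 dB.
        apparent_snr = min(max(apparent_snr, -15.0), 15.0)
        total += (apparent_snr + 15.0) / 30.0
    return total / len(MOD_FREQS)


def theoretical_sti(band_snrs_db, band_rt60s, weights=None):
    """Weighted sum of band transmission indices.

    Assumption: equal band weights unless `weights` is given. Published STI
    variants use band-specific weights, which are not reproduced here.
    """
    n = len(band_snrs_db)
    if len(band_rt60s) != n:
        raise ValueError("need one reverberation time per band")
    if weights is None:
        weights = [1.0 / n] * n
    tis = [band_transmission_index(s, t) for s, t in zip(band_snrs_db, band_rt60s)]
    return sum(w * ti for w, ti in zip(weights, tis))


if __name__ == "__main__":
    # Hypothetical values for seven octave bands: 5 dB SNR, 0.6 s reverberation.
    print(round(theoretical_sti([5.0] * 7, [0.6] * 7), 3))
```

In this sketch, an SNR of 0 dB with no reverberation gives an index of 0.5 in every band, and adding reverberation or noise can only lower the result.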
November 2013
November 01 2013
Comparison of a short-time speech-based intelligibility metric to the speech transmission index and intelligibility data a)
Karen L. Payton b)
ECE Department, University of Massachusetts Dartmouth, 285 Old Westport Road, North Dartmouth, Massachusetts 02747
Mona Shrestha
ECE Department, University of Massachusetts Dartmouth, 285 Old Westport Road, North Dartmouth, Massachusetts 02747
b) Author to whom correspondence should be addressed. Electronic mail: kpayton@umassd.edu
a) Portions of this work were presented in “Analysis of short-time speech transmission index algorithms,” Proceedings of Acoustics'08 Conference, Paris, France, June 30, 2008, and “Evaluation of short-time speech-based intelligibility metrics,” Proceedings of Int. Comm. Bio. Effects of Noise Conference, Foxwoods, CT, 23 July 2008.
J. Acoust. Soc. Am. 134, 3818–3827 (2013)
Article history
Received: August 31 2012
Accepted: August 30 2013
Citation
Karen L. Payton, Mona Shrestha; Comparison of a short-time speech-based intelligibility metric to the speech transmission index and intelligibility data. J. Acoust. Soc. Am. 1 November 2013; 134 (5): 3818–3827. https://doi.org/10.1121/1.4821216