An objective metric that predicts speech intelligibility under different types of noise and distortion would be valuable in voice communication. To date, most studies of speech intelligibility metrics have focused on predicting the effects of individual noise or distortion mechanisms. This study proposes an objective metric, the spectrogram orthogonal polynomial measure (SOPM), that attempts to predict speech intelligibility for listeners with normal hearing under adverse conditions. The SOPM is developed by extracting features from the spectrogram using Krawtchouk moments. Its performance is evaluated for several types of noise (steady-state and fluctuating), distortion (peak clipping, center clipping, and phase jitter), ideal time-frequency segregation, and reverberation in both quiet and noisy environments. The proposed metric achieves high correlation (0.97–0.996) with subjective scores from normal-hearing subjects across these conditions.
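As a rough illustration of the feature-extraction idea named in the abstract, the sketch below computes two-dimensional weighted Krawtchouk moments of a small time-frequency matrix. It is not the authors' implementation: the abstract does not give the block size, moment orders, or parameter values used in SOPM, so the function names (`weighted_krawtchouk`, `krawtchouk_moments`, `reconstruct`), the default `p = 0.5`, and the random stand-in "spectrogram" are all assumptions made for illustration. The direct hypergeometric sum is only numerically reasonable for small blocks.

```python
from math import comb

import numpy as np


def weighted_krawtchouk(size, p=0.5):
    """Return a (size x size) matrix whose row n holds the weighted
    Krawtchouk polynomial of order n sampled at x = 0 .. size-1.
    Assumes 0 < p < 1 and a small size (direct summation)."""
    N = size - 1
    x = np.arange(size, dtype=float)
    # Binomial weight w(x; p, N) = C(N, x) p^x (1-p)^(N-x)
    w = np.array([comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(size)])
    K = np.empty((size, size))
    for n in range(size):
        # K_n(x) = 2F1(-n, -x; -N; 1/p), summed term by term
        term = np.ones(size)
        poly = np.ones(size)
        for k in range(n):
            term = term * (k - n) * (k - x) / ((k - N) * (k + 1) * p)
            poly = poly + term
        # Squared norm rho(n) = ((1-p)/p)^n / C(N, n)
        rho = ((1 - p) / p) ** n / comb(N, n)
        K[n] = poly * np.sqrt(w / rho)
    return K


def krawtchouk_moments(spec, p_freq=0.5, p_time=0.5):
    """Project a 2-D array (frequency x time) onto the weighted basis."""
    Kf = weighted_krawtchouk(spec.shape[0], p_freq)
    Kt = weighted_krawtchouk(spec.shape[1], p_time)
    return Kf @ spec @ Kt.T


def reconstruct(moments, p_freq=0.5, p_time=0.5):
    """Invert the full-order transform (the basis matrices are orthogonal)."""
    Kf = weighted_krawtchouk(moments.shape[0], p_freq)
    Kt = weighted_krawtchouk(moments.shape[1], p_time)
    return Kf.T @ moments @ Kt


# Hypothetical stand-in for a small spectrogram block (8 bands x 12 frames).
rng = np.random.default_rng(0)
spec_block = rng.random((8, 12))
moments = krawtchouk_moments(spec_block)
```

Because the weighted polynomials form an orthonormal basis, keeping all orders makes the transform invertible, and truncating to low orders gives a compact description of the block. How such features are compared between a clean reference and a degraded signal in SOPM is not stated in the abstract and is left out here.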
September 2021
September 10 2021
An intrusive method for estimating speech intelligibility from noisy and distorted signals
Nursadul Mamun (1, a), Muhammad S. A. Zilany (2), John H. L. Hansen (1, b), Evelyn E. Davies-Venn (3)
1 Department of Electrical and Computer Engineering, University of Texas at Dallas, Richardson, Texas 75080, USA
2 Department of Electrical and Computer Engineering, Texas A&M University at Qatar, Doha, Qatar
3 Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
a) Electronic mail: [email protected]
b) ORCID: 0000-0003-1382-9929.
J. Acoust. Soc. Am. 150, 1762–1778 (2021)
Article history
Received: May 18 2020
Accepted: July 29 2021
Citation
Nursadul Mamun, Muhammad S. A. Zilany, John H. L. Hansen, Evelyn E. Davies-Venn; An intrusive method for estimating speech intelligibility from noisy and distorted signals. J. Acoust. Soc. Am. 1 September 2021; 150 (3): 1762–1778. https://doi.org/10.1121/10.0005899