Ideal binary masking (IBM) uses prior information to improve speech intelligibility by attenuating noise-dominant components with a scaling factor. The main challenge is to construct a decision model that identifies whether a component is noise- or speech-dominant. In this study, we based that decision on the signal-to-noise ratio (SNR) of the temporal amplitude envelope in the time-frequency domain. Using MATLAB, we first divided the noisy speech from 200 Hz to 6 kHz into 16 contiguous subbands, each with a bandwidth of approximately 1.5 times the equivalent rectangular bandwidth. The subband envelopes were obtained from the absolute value of the signal. SNRs of the temporal envelope were calculated over 40 ms windows. The mask was unity when the SNR was greater than −5 dB; otherwise, it was 0.5. We evaluated the proposed IBM using word scores obtained for speech in speech-spectrum shaped noise at SNRs of −2, −4, −6, and −8 dB. Sixteen native speakers (age 28 ± 3 years) with normal hearing were recruited and completed the Modified Rhyme Test to assess intelligibility. The IBM produced statistically significant increases of up to 20% in mean word scores. [Work supported by NIOSH.]
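The masking rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation, and several details are assumptions because the abstract does not state them: the band edges use the Glasberg and Moore ERB-number scale, the envelope SNR of a window is taken as 20 log10 of the ratio of mean absolute values, windows do not overlap, and the speech and noise subband signals are assumed to be available separately (the prior information an ideal mask requires). All function names are hypothetical, and the bandpass filterbank itself is omitted.

```python
import math

def erb_number(f_hz):
    # ERB-number scale (Glasberg and Moore); an assumed choice, the
    # abstract does not say which ERB formula was used.
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_number_to_hz(e):
    # Inverse of erb_number().
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def band_edges(f_lo=200.0, f_hi=6000.0, n_bands=16):
    # Edges of n_bands contiguous subbands, equally wide on the ERB-number scale.
    e_lo, e_hi = erb_number(f_lo), erb_number(f_hi)
    step = (e_hi - e_lo) / n_bands
    return [erb_number_to_hz(e_lo + i * step) for i in range(n_bands + 1)]

def envelope_snr_db(speech_band, noise_band, fs, win_s=0.040):
    # Envelope SNR per non-overlapping window for one subband.
    # Envelope = absolute value of the signal, as in the abstract.
    # The SNR definition (ratio of mean envelopes) is an assumption.
    n = int(round(win_s * fs))
    eps = 1e-12  # avoids log of zero in silent windows
    snrs = []
    # Trailing samples that do not fill a whole window are ignored.
    for start in range(0, len(speech_band) - n + 1, n):
        s_env = sum(abs(x) for x in speech_band[start:start + n]) / n
        n_env = sum(abs(x) for x in noise_band[start:start + n]) / n
        snrs.append(20.0 * math.log10((s_env + eps) / (n_env + eps)))
    return snrs

def mask_gains(snrs_db, threshold_db=-5.0, floor_gain=0.5):
    # Gain is 1 when the envelope SNR exceeds the threshold, otherwise 0.5.
    return [1.0 if snr > threshold_db else floor_gain for snr in snrs_db]

def apply_mask(noisy_band, gains, fs, win_s=0.040):
    # Scale each full window of the noisy subband by its gain.
    n = int(round(win_s * fs))
    out = list(noisy_band)
    for w, g in enumerate(gains):
        for i in range(w * n, (w + 1) * n):
            out[i] *= g
    return out
```

With these assumptions, splitting 200 Hz to 6 kHz into 16 equal steps on the ERB-number scale gives bands about 1.55 ERB wide, which is consistent with the "approximately 1.5" stated in the abstract. A full pipeline would apply `apply_mask` to every subband of the noisy speech and sum the subbands to resynthesize the signal.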
October 2021
Meeting abstract.
October 01 2021
Using ideal binary masking based on signal-to-noise ratio of temporal amplitude envelope to improve the intelligibility of speech in noise
Rahim Soleymanpour;
Biomedical Eng., Univ. of Connecticut, 263 Farmington Ave., Farmington, CT 06030
Insoo Kim;
Biomedical Eng., Univ. of Connecticut, Farmington, CT
Hillary Marquis;
Dept. of Surgery, Div. of Otolaryngol., Head and Neck Surgery, Univ. of Connecticut, Farmington, CT
Anthony J. Brammer
Biomedical Eng., Univ. of Connecticut, Farmington, CT
J. Acoust. Soc. Am. 150, A275 (2021)
Citation
Rahim Soleymanpour, Kia Golzari, Insoo Kim, Erin Heiney, Hillary Marquis, Anthony J. Brammer; Using ideal binary masking based on signal-to-noise ratio of temporal amplitude envelope to improve the intelligibility of speech in noise. J. Acoust. Soc. Am. 1 October 2021; 150 (4_Supplement): A275. https://doi.org/10.1121/10.0008276