The SII model in its present form (ANSI S3.5-1997, American National Standards Institute, New York) can accurately describe intelligibility for speech in stationary noise but fails to do so for nonstationary noise maskers. Here, an extension to the SII model is proposed with the aim of predicting speech intelligibility in both stationary and fluctuating noise. The basic principle of the present approach is that both the speech and the noise signal are partitioned into small time frames. Within each time frame the conventional SII is determined, yielding the speech information available to the listener in that time frame. Next, the SII values of these time frames are averaged, resulting in the SII for that particular condition. When tested against speech reception threshold (SRT) data from the literature, the extended SII model gives a good account of SRTs in stationary noise, fluctuating speech noise, interrupted noise, and multiple-talker noise. The predictions for sinusoidally intensity-modulated (SIM) noise and for real speech or speech-like maskers are better than with the original SII model, but are still not accurate. For the latter types of masker, informational masking may play a role.
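
The frame-and-average principle described above can be illustrated with a minimal sketch. It is not the ANSI S3.5-1997 procedure: it keeps only the band-audibility step, in which a band signal-to-noise ratio is mapped linearly from -15 to +15 dB onto 0 to 1, and it leaves out hearing threshold, spread of masking, and level distortion. The function names, the two equally weighted bands, and the example levels are all assumptions made for illustration, and the speech level is held constant across frames for simplicity.

```python
import math


def band_audibility(snr_db):
    """Map a band SNR in dB onto [0, 1] with the (SNR + 15) / 30 rule."""
    return min(1.0, max(0.0, (snr_db + 15.0) / 30.0))


def frame_sii(speech_db, noise_db, weights):
    """Simplified SII for one time frame: importance-weighted audibility."""
    return sum(w * band_audibility(s - n)
               for w, s, n in zip(weights, speech_db, noise_db))


def extended_sii(speech_frames, noise_frames, weights):
    """Average the per-frame SII values over all time frames."""
    values = [frame_sii(s, n, weights)
              for s, n in zip(speech_frames, noise_frames)]
    return sum(values) / len(values)


def long_term_level(levels_db):
    """Level in dB of the mean power of a sequence of frame levels."""
    mean_power = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_power)


# Hypothetical example: two bands with equal importance (an assumption).
weights = [0.5, 0.5]

# Speech at 60 dB in both bands; interrupted noise alternating 80 / 40 dB.
speech_frames = [[60.0, 60.0]] * 4
noise_frames = [[80.0, 80.0], [40.0, 40.0], [80.0, 80.0], [40.0, 40.0]]

# Frame-based value: frames with the noise "off" contribute fully.
sii_frames = extended_sii(speech_frames, noise_frames, weights)

# Long-term value: same noise, but only its average level is used.
noise_lt = long_term_level([frame[0] for frame in noise_frames])
sii_long_term = frame_sii([60.0, 60.0], [noise_lt, noise_lt], weights)

print(sii_frames, sii_long_term)
```

In this toy case the frame-based value is 0.5, because half of the frames are fully audible, whereas the value computed from the long-term noise level is 0, because the average noise level is dominated by the loud frames. This is the kind of gap between stationary and interrupted maskers that the extension is meant to capture.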
