When a target voice is masked by an increasingly similar masker voice, energetic masking is likely to increase because of greater spectro-temporal overlap in the competing speech waveforms. However, the impact of this increase may be obscured by informational masking effects related to the increased confusability of the target and masking utterances. In this study, the effects of target-masker similarity and the number of competing talkers on the energetic component of speech-on-speech masking were measured with an ideal time-frequency segregation (ITFS) technique that retained all the target-dominated time-frequency regions of a multitalker mixture but eliminated all the time-frequency regions dominated by the maskers. The results show that target-masker similarity has a small but systematic impact on energetic masking, with roughly a 1 dB release from masking for same-sex maskers relative to same-talker maskers and roughly an additional 1 dB release from masking for different-sex masking voices. The results of a second experiment measuring ITFS performance with up to 18 interfering talkers indicate that energetic masking increased systematically with the number of competing talkers. These results suggest that energetic masking differences related to target-masker similarity have a much smaller impact on multitalker listening performance than either energetic masking effects related to the number of competing talkers in the stimulus or non-energetic masking effects related to the confusability of the target and masking voices.
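
The ITFS operation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-unit energies of the target and of the combined maskers are already available as small time-frequency grids, that the mixture energy in each unit is approximately the sum of the two, and that a unit counts as target-dominated when the local target-to-masker ratio exceeds a criterion of 0 dB. The function names `ideal_binary_mask` and `apply_mask`, the `criterion_db` parameter, and the example values are all hypothetical.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, criterion_db=0.0):
    """Return a 0/1 grid: 1 where the target exceeds the masker by criterion_db.

    Both inputs are lists of rows (frequency channels) of per-unit energies.
    The 0 dB default criterion is an assumption made for this sketch.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0.0:
                keep = False          # no target energy: nothing to retain
            elif m <= 0.0:
                keep = True           # no masker energy: target dominates
            else:
                keep = 10.0 * math.log10(t / m) > criterion_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Keep target-dominated units of the mixture and zero out the rest."""
    return [[x * k for x, k in zip(x_row, k_row)]
            for x_row, k_row in zip(mixture_energy, mask)]


# Hypothetical energies: 2 frequency channels x 3 time frames.
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 0.0]]
masker = [[1.0, 2.0, 2.0], [0.4, 0.3, 1.0]]

# Assumption: energies add in each unit (uncorrelated target and masker).
mixture = [[t + m for t, m in zip(t_row, m_row)]
           for t_row, m_row in zip(target, masker)]

mask = ideal_binary_mask(target, masker)
retained = apply_mask(mixture, mask)
```

In this sketch the retained units still contain the masker energy that fell inside target-dominated regions, so only the energetic overlap of the maskers survives the processing, while the masker-dominated regions that could be confused with the target are removed.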
