Three experiments used the Coordinated Response Measure task to examine the roles that differences in F0 and differences in vocal-tract length play in the ability to attend to one of two simultaneous speech signals. The first experiment asked how increases in the natural F0 difference between two sentences (originally spoken by the same talker) affected listeners’ ability to attend to one of the sentences. The second experiment used differences in vocal-tract length, and the third used both F0 and vocal-tract length differences. Differences in F0 greater than 2 semitones produced systematic improvements in performance. Differences in vocal-tract length produced systematic improvements in performance when the ratio of lengths was 1.08 or greater, particularly when the shorter vocal tract belonged to the target talker. Neither of these manipulations produced improvements in performance as great as those produced by a different-sex talker. Systematic changes in both F0 and vocal-tract length that simulated an incremental shift in gender produced substantially larger improvements in performance than did differences in F0 or vocal-tract length alone. In general, shifting one of two utterances spoken by a female voice towards a male voice produced a greater improvement in performance than shifting a male voice towards a female voice. The increase in performance varied with the intonation patterns of individual talkers, being smallest for those talkers who showed the most variability in their intonation patterns between different utterances.
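
The two thresholds reported above are stated in different units: a semitone difference for F0 and a length ratio for the vocal tract. The minimal sketch below converts between semitones and frequency ratios using the standard equal-tempered relation (12 semitones per octave). The function names and the 200 Hz example value are illustrative assumptions and do not come from the study, and expressing the 1.08 length ratio in semitone units is only a convenient way to put both thresholds on one logarithmic scale, not a claim about how the stimuli were generated.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio corresponding to a difference in semitones."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(ratio: float) -> float:
    """Difference in semitones corresponding to a positive ratio."""
    return 12.0 * math.log2(ratio)


def shift_f0(f0_hz: float, semitones: float) -> float:
    """Shift an F0 value (in Hz) up or down by a number of semitones."""
    return f0_hz * semitones_to_ratio(semitones)


if __name__ == "__main__":
    # The 2-semitone F0 threshold expressed as a frequency ratio.
    print(f"2 semitones -> ratio {semitones_to_ratio(2.0):.4f}")

    # The 1.08 vocal-tract length ratio expressed on the semitone scale
    # (illustrative comparison only).
    print(f"ratio 1.08 -> {ratio_to_semitones(1.08):.3f} semitones")

    # Hypothetical example: a 200 Hz F0 raised by 2 semitones.
    print(f"200 Hz + 2 semitones -> {shift_f0(200.0, 2.0):.1f} Hz")
```

A logarithmic scale is used because equal semitone steps correspond to equal frequency ratios, so the same shift applies regardless of a talker's baseline F0.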
