In Mandarin Chinese, the fundamental frequency (F0) contour defines lexical “Tones” that distinguish the meanings of syllables that are otherwise phonetically identical. Flattening the F0 contour impairs the intelligibility of Mandarin Chinese in background sounds. This might occur because the flattening introduces misleading lexical information. To avoid this effect, two types of speech were used: single-Tone speech contained Tones 1 and 0 only, which have a flat F0 contour; multi-Tone speech contained all Tones and had a varying F0 contour. The intelligibility of speech in steady noise was slightly better for single-Tone speech than for multi-Tone speech. The intelligibility of speech in a two-talker masker, with the difference in mean F0 between the target and masker matched across conditions, was worse for the multi-Tone target in the multi-Tone masker than for any other combination of target and masker, probably because informational masking was maximal for this combination. The introduction of a perceived spatial separation between the target and masker, via the precedence effect, led to better performance for all target-masker combinations, especially the multi-Tone target in the multi-Tone masker. In summary, a flat F0 contour does not reduce the intelligibility of Mandarin Chinese when the introduction of misleading lexical cues is avoided.
