This paper presents three experiments on the integration of speaker gender cues in Cantonese tone perception. Experiment 1 compared tone identification for F0-matched stimuli produced in voices of different genders and showed that listeners tended to hear lower tones in female-sounding voices and higher tones in male-sounding voices. Experiment 2 investigated whether a similar voice gender normalization effect occurs in pitch perception. The results showed that, whereas tone categorization shifted systematically with voice gender, voice gender interfered with pitch perception in listener-specific ways. In particular, musicians whose pitch perception was not affected by voice gender still showed a tone boundary shift induced by voice gender. Experiment 3 evaluated the influence of gender cues other than voice on tone identification by presenting stimuli under the guise of gendered names. The results showed that gendered names induced barely any shift on their own when applied to an identical set of gender-ambiguous stimuli; however, they enhanced the shift when paired with gender-prototypical voices of the matching gender. These findings support an additional phonological normalization process on top of psychoacoustic sensation. They also suggest that speaker normalization involves fine-grained processing of rich social cues conveyed by the acoustic signal rather than merely abstract social labels.

