Speakers use acoustic-phonetic cues to convey a variety of social information in spoken language, and learning a second language affects speech production in social settings. It remains unclear how speaking different dialects affects the acoustic measures underlying intended communicative meanings. Nine Chinese Bayannur-Mandarin bidialectal speakers produced single-digit numbers in statements in both Standard Mandarin and the Bayannur dialect with different levels of intended confidence. Fifteen listeners judged the presence of an intention and the confidence level. Prosodically unmarked and marked stimuli differed significantly in perceived intention, and a higher intended level of confidence was perceived as more confident. The acoustic analysis revealed that segmental (third and fourth formants, center of gravity), suprasegmental (mean fundamental frequency, fundamental frequency range, duration), and source features (harmonics-to-noise ratio, cepstral peak prominence) can distinguish between confident and doubtful expressions. Most features also distinguished between dialect and Mandarin productions. Interactions involving the fourth formant and mean fundamental frequency suggested that speakers made greater use of acoustic parameters to encode confidence and doubt in the Bayannur dialect than in Mandarin. In machine learning experiments, above-chance overall classification rates for confidence and doubt, together with an in-group advantage, supported the dialect theory.
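The logic of an in-group advantage in classification can be illustrated with a toy sketch. The abstract does not name the classifier, the feature set, or the training scheme used in the study, so everything below is an assumption made for illustration: the data are synthetic random numbers, the two "conditions" are hypothetical, and a nearest-centroid classifier stands in for whatever model was actually used. The sketch trains on one condition and compares accuracy on held-out tokens from the same condition with accuracy on tokens from a condition that marks confidence on a different cue.

```python
import numpy as np

def make_group(rng, n_per_class, cue_index, n_features=3, separation=6.0):
    """Synthetic feature vectors for one hypothetical speaking condition.

    Confident (label 1) and doubtful (label 0) tokens differ only on the
    feature at cue_index; every other feature is pure noise.
    """
    labels = np.repeat([0, 1], n_per_class)
    features = rng.normal(0.0, 1.0, size=(2 * n_per_class, n_features))
    features[labels == 1, cue_index] += separation
    return features, labels

def fit_centroids(features, labels):
    """Return the mean feature vector of each class (row 0: doubtful, row 1: confident)."""
    return np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, features):
    """Assign each token to the nearest class centroid (Euclidean distance)."""
    distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def accuracy(centroids, features, labels):
    """Proportion of tokens whose predicted label matches the true label."""
    return float((predict(centroids, features) == labels).mean())

rng = np.random.default_rng(0)
# Hypothetical conditions: A marks confidence on feature 0, B on feature 1.
train_a, train_a_labels = make_group(rng, 200, cue_index=0)
test_a, test_a_labels = make_group(rng, 200, cue_index=0)
test_b, test_b_labels = make_group(rng, 200, cue_index=1)

# Train on condition A only, then test within and across conditions.
centroids_a = fit_centroids(train_a, train_a_labels)
within_accuracy = accuracy(centroids_a, test_a, test_a_labels)
cross_accuracy = accuracy(centroids_a, test_b, test_b_labels)
print(f"within-condition accuracy: {within_accuracy:.2f}")
print(f"cross-condition accuracy:  {cross_accuracy:.2f}")
```

In this deliberately extreme setup the two conditions share no cue at all, so cross-condition accuracy falls to roughly chance. A pattern in which cross-condition accuracy is above chance but still below within-condition accuracy would instead correspond to partially shared cues, which is the kind of outcome the abstract describes.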
