The auditory gating paradigm was used to study how much acoustic information is needed to recognize emotions from speech prosody and music performances. In Study 1, brief utterances conveying ten emotions were segmented into temporally fine-grained gates and presented to listeners; in Study 2, the stimuli were musically expressed emotions instead. Emotion recognition accuracy increased with gate duration and generally stabilized after a certain duration, with different trajectories for different emotions. For both speech and music, above-chance accuracy was observed for ≤100 ms stimuli for anger, happiness, neutral, and sadness, and for ≤250 ms stimuli for most other emotions. This suggests that emotion recognition is a fast process that allows discrimination of several emotions based on low-level physical characteristics. The emotion identification points, which reflect the amount of information required for stable recognition, were shortest for anger and happiness for both speech and music, but recognition took longer to stabilize for music than for speech. This, in turn, suggests that acoustic cues that develop over time also play a role in emotion inferences (especially for music). Finally, acoustic cue patterns were positively correlated between speech and music, suggesting a shared acoustic code for expressing emotions.
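As an illustration of what an emotion identification point measures, the sketch below computes one for a single listener and stimulus. It assumes an operational definition that is common in gating studies: the shortest gate at which the listener gives the intended emotion and keeps giving it at every longer gate. The abstract does not state the exact criterion used in these studies, so this definition, the function name `identification_point`, and the example gate durations and responses are all hypothetical.

```python
from typing import Optional, Sequence


def identification_point(
    gate_ms: Sequence[int],
    responses: Sequence[str],
    target: str,
) -> Optional[int]:
    """Return the shortest gate duration (ms) from which the response
    matches the target emotion at that gate and at all longer gates.

    Returns None if the final (longest) gate is not recognized correctly.
    The criterion itself is an assumption, not taken from the source.
    """
    if len(gate_ms) != len(responses):
        raise ValueError("gate_ms and responses must have the same length")

    point: Optional[int] = None
    # Walk through the gates from shortest to longest.
    for duration, response in sorted(zip(gate_ms, responses)):
        if response == target:
            # Start of a (possibly) stable run of correct responses.
            if point is None:
                point = duration
        else:
            # An error at a longer gate cancels any earlier candidate.
            point = None
    return point


# Hypothetical data: one listener judging one "anger" stimulus.
gates = [50, 100, 250, 500, 1000]
answers = ["sadness", "anger", "fear", "anger", "anger"]
print(identification_point(gates, answers, "anger"))  # prints 500
```

In this made-up example the listener is correct at 100 ms but errs again at 250 ms, so the identification point is 500 ms, the first gate after which the response no longer changes. Averaging such points over listeners and stimuli would give one value per emotion, which is the kind of quantity the abstract compares between speech and music.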

