Recent studies have demonstrated perceptual adaptation to nonlinguistic properties of speech involving voice gender and emotional expression. The present study extends this work by examining the contribution of fundamental frequency (F0) to these effects. Voice recordings of vowel-consonant-vowel (VCV) syllables from six talkers were processed using the STRAIGHT vocoder and an auditory morphing technique to synthesize gender (experiment 1) and expressive (experiment 2) speech sound continua ranging from one category endpoint to the other (female to male; angry to happy). Continuum endpoints served as adaptors in F0-present and F0-removed conditions. F0-removed stimuli were created by replacing the periodic excitation source with broadband noise. Confirming previous findings, aftereffects were found in the F0-present condition: listeners were less likely to identify test stimuli as belonging to the adaptor category. No aftereffects appeared when F0 was removed, highlighting the importance of F0 in adaptation. However, in an identification test, listeners were still able to categorize F0-removed stimuli at better-than-chance levels, indicating that residual cues for gender and emotion were available even when F0 was not present.
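
The F0-removal manipulation described above can be illustrated with a toy source-filter sketch. This is not the STRAIGHT vocoder and not the authors' procedure; it only shows, under simplifying assumptions, what swapping a periodic excitation for broadband noise does to a signal's periodicity while the filter (the spectral envelope) stays the same. All function names and parameter values (sampling rate, F0, single formant) are hypothetical choices made for illustration.

```python
import math
import random


def resonator(x, freq_hz, bw_hz, fs):
    """Two-pole resonator standing in for a single formant (assumed filter)."""
    r = math.exp(-math.pi * bw_hz / fs)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + a1 * y1 + a2 * y2
        y.append(out)
        y2, y1 = y1, out
    return y


def pulse_train(n, f0_hz, fs):
    """Periodic excitation: one unit pulse per pitch period."""
    period = int(round(fs / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def broadband_noise(n, seed=0):
    """Aperiodic excitation: Gaussian white noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def normalized_autocorr(x, lag):
    """Autocorrelation at one lag, normalized by signal energy."""
    num = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    den = sum(v * v for v in x)
    return num / den


FS = 16000                # sampling rate in Hz (illustrative)
F0 = 200.0                # fundamental frequency in Hz (illustrative)
N = 8000                  # 0.5 s of signal
FORMANT = (700.0, 100.0)  # centre frequency and bandwidth in Hz (illustrative)

# Same filter, two different excitation sources.
f0_present = resonator(pulse_train(N, F0, FS), *FORMANT, FS)
f0_removed = resonator(broadband_noise(N), *FORMANT, FS)

# Periodicity measured at the lag of one pitch period.
lag = int(round(FS / F0))
periodicity_voiced = normalized_autocorr(f0_present, lag)
periodicity_noise = normalized_autocorr(f0_removed, lag)
```

In this sketch the pulse-excited signal is strongly correlated with itself at the pitch-period lag, while the noise-excited signal is not, even though both pass through the same resonator. That mirrors the logic of the manipulation: the F0 cue is removed while filter-related cues remain in the signal.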
