Spectral properties of earlier sounds (context) influence recognition of later sounds (target). Acoustic variability in context stimuli can disrupt this process. When mean fundamental frequencies (f0’s) of preceding context sentences were highly variable across trials, shifts in target vowel categorization [due to spectral contrast effects (SCEs)] were smaller than when sentence mean f0’s were less variable; when sentences were rearranged to exhibit high or low variability in mean first formant frequencies (F1) in a given block, SCE magnitudes were equivalent [Assgari, Theodore, and Stilp (2019) J. Acoust. Soc. Am. 145(3), 1443–1454]. However, since sentences were originally chosen based on variability in mean f0, stimuli underrepresented the extent to which mean F1 could vary. Here, target vowels (/ɪ/-/ɛ/) were categorized following context sentences that varied substantially in mean F1 (experiment 1) or mean F3 (experiment 2) with variability in mean f0 held constant. In experiment 1, SCE magnitudes were equivalent whether context sentences had high or low variability in mean F1; the same pattern was observed in experiment 2 for new sentences with high or low variability in mean F3. Variability in some acoustic properties (mean f0) can be more perceptually consequential than others (mean F1, mean F3), but these results may be task-dependent.

1. Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., and Evershed, J. K. (2020). “Gorilla in our midst: An online behavioral experiment builder,” Behav. Res. 52(1), 388–407.
2. Assgari, A. A., and Stilp, C. E. (2015). “Talker information influences spectral contrast effects in speech categorization,” J. Acoust. Soc. Am. 138(5), 3023–3032.
3. Assgari, A. A., Theodore, R. M., and Stilp, C. E. (2019). “Variability in talkers' fundamental frequencies shapes context effects in speech perception,” J. Acoust. Soc. Am. 145(3), 1443–1454.
4. Assmann, P. F., Dembling, S., and Nearey, T. M. (2006). “Effects of frequency shifts on perceived naturalness and gender information in speech,” in Proceedings of the Ninth International Conference on Spoken Language Processing, September 17–21, Pittsburgh, PA, pp. 889–892.
5. Assmann, P. F., and Nearey, T. M. (2008). “Identification of frequency-shifted vowels,” J. Acoust. Soc. Am. 124(5), 3203–3212.
6. Assmann, P. F., Nearey, T. M., and Hogan, J. T. (1982). “Vowel identification: Orthographic, perceptual, and acoustic aspects,” J. Acoust. Soc. Am. 71(4), 975–989.
7. Attneave, F. (1954). “Some informational aspects of visual perception,” Psychol. Rev. 61(3), 183–193.
8. Bachorowski, J.-A., and Owren, M. J. (1999). “Acoustic correlates of talker sex and individual talker identity are present in a short vowel segment produced in running speech,” J. Acoust. Soc. Am. 106(2), 1054–1063.
9. Barlow, H. B. (1961). “Possible principles underlying the transformation of sensory messages,” in Sensory Communication, edited by W. A. Rosenblith (MIT, Cambridge, MA), pp. 53–85.
10. Barreda, S. (2012). “Vowel normalization and the perception of speaker changes: An exploration of the contextual tuning hypothesis,” J. Acoust. Soc. Am. 132(5), 3453–3464.
11. Bates, D. M., Maechler, M., Bolker, B., and Walker, S. (2014). “lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1-7,” https://cran.r-project.org/web/packages/lme4/index.html (Last viewed June 21, 2022).
12. Baumann, O., and Belin, P. (2010). “Perceptual scaling of voice identity: Common dimensions for different vowels and speakers,” Psychol. Res. 74(1), 110–120.
13. Boersma, P., and Weenink, D. (2019). “Praat: Doing phonetics by computer (version 6.1), [computer program],” http://www.praat.org (Last viewed July 13, 2019).
14. Bradlow, A. R., Nygaard, L. C., and Pisoni, D. B. (1999). “Effects of talker, rate, and amplitude variation on recognition memory for spoken words,” Percept. Psychophys. 61(2), 206–219.
15. Childers, D. G., and Wu, K. (1991). “Gender recognition from speech. Part II: Fine analysis,” J. Acoust. Soc. Am. 90(4), 1841–1856.
16. Choi, J. Y., Hu, E. R., and Perrachione, T. K. (2018). “Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing,” Atten. Percept. Psychophys. 80(3), 784–797.
17. Coleman, R. O. (1971). “Male and female voice quality and its relationship to vowel formant frequencies,” J. Speech Hear. Res. 14(3), 565–577.
18. Compton, A. J. (1963). “Effects of filtering and vocal duration upon the identification of speakers, aurally,” J. Acoust. Soc. Am. 35(11), 1748–1752.
19. Creelman, C. D. (1957). “Case of the unknown talker,” J. Acoust. Soc. Am. 29(5), 655.
20. Drown, L., and Theodore, R. M. (2020). “Effects of phonetic and indexical variability on talker normalization,” J. Acoust. Soc. Am. 148, 2504.
21. Fant, G. (1970). Acoustic Theory of Speech Production with Calculations Based on X-Ray Studies of Russian Articulations (Mouton de Gruyter, Berlin).
22. Field, D. J. (1987). “Relations between the statistics of natural images and the response properties of cortical cells,” J. Opt. Soc. Am. A 4(12), 2379–2394.
23. Garofolo, J., Lamel, L., Fisher, W., Fiscus, J., Pallett, D., and Dahlgren, N. (1990). “DARPA TIMIT acoustic-phonetic continuous speech corpus CDROM,” NIST Order No. PB91-505065, National Institute of Standards and Technology, Gaithersburg, MD.
24. Geisler, W. S., Perry, J. S., Super, B. J., and Gallogly, D. P. (2001). “Edge co-occurrence in natural images predicts contour grouping performance,” Vision Res. 41(6), 711–724.
25. Gervain, J., and Geffen, M. N. (2019). “Efficient neural coding in auditory and speech perception,” Trends Neurosci. 42(1), 56–65.
26. Goldinger, S. D. (1996). “Words and voices: Episodic traces in spoken word identification and recognition memory,” J. Exp. Psychol. Learn. Mem. Cogn. 22(5), 1166–1183.
27. Goldinger, S. D., Pisoni, D. B., and Logan, J. S. (1991). “On the nature of talker variability effects on recall of spoken word lists,” J. Exp. Psychol. Learn. Mem. Cogn. 17(1), 152–162.
28. Hillenbrand, J. M., and Clark, M. J. (2009). “The role of f0 and formant frequencies in distinguishing the voices of men and women,” Atten. Percept. Psychophys. 71(5), 1150–1166.
29. Hillenbrand, J. M., Getty, L. A., Clark, M. J., and Wheeler, K. (1995). “Acoustic characteristics of American English vowels,” J. Acoust. Soc. Am. 97(5), 3099–3111.
30. Johnson, K., and Sjerps, M. J. (2021). “Speaker normalization in speech perception,” in The Handbook of Speech Perception, 2nd ed., edited by J. S. Pardo, L. C. Nygaard, R. E. Remez, and D. B. Pisoni (Wiley, New York), pp. 145–176.
31. Kluender, K. R., Stilp, C. E., and Kiefte, M. (2013). “Perception of vowel sounds within a biologically realistic model of efficient coding,” in Vowel Inherent Spectral Change, edited by G. S. Morrison and P. F. Assmann (Springer, Berlin), pp. 117–151.
32. Kluender, K. R., Stilp, C. E., and Llanos, F. (2019). “Longstanding problems in speech perception dissolve within an information-theoretic perspective,” Atten. Percept. Psychophys. 81(4), 861–883.
33. Ladefoged, P., and Broadbent, D. E. (1957). “Information conveyed by vowels,” J. Acoust. Soc. Am. 29(1), 98–104.
34. Lammert, A. C., and Narayanan, S. S. (2015). “On short-time estimation of vocal tract length from formant frequencies,” PLoS ONE 10(7), e0132193.
35. LaRiviere, C. (1975). “Contributions of fundamental frequency and formant frequencies to speaker identification,” Phonetica 31(3), 185–197.
36. Lavner, Y., Gath, I., and Rosenhouse, J. (2000). “Effects of acoustic modifications on the identification of familiar voices speaking isolated vowels,” Speech Commun. 30(1), 9–26.
37. Long, J. A. (2019). “Interactions: Comprehensive, user-friendly toolkit for probing interactions. R package version 1.1.3,” https://cran.r-project.org/web/packages/interactions/index.html (Last viewed June 21, 2022).
38. Magnuson, J. S., and Nusbaum, H. C. (2007). “Acoustic differences, listener expectations, and the perceptual accommodation of talker variability,” J. Exp. Psychol. Hum. Percept. Perform. 33(2), 391–409.
39. Martin, C. S., Mullennix, J. W., Pisoni, D. B., and Summers, W. V. (1989). “Effects of talker variability on recall of spoken word lists,” J. Exp. Psychol. Learn. Mem. Cogn. 15(4), 676–684.
40. Mullennix, J. W., and Pisoni, D. B. (1990). “Stimulus variability and processing dependencies in speech perception,” Percept. Psychophys. 47(4), 379–390.
41. Mullennix, J. W., Pisoni, D. B., and Martin, C. S. (1989). “Some effects of talker variability on spoken word recognition,” J. Acoust. Soc. Am. 85(1), 365–378.
42. Nordström, P. E., and Lindblom, B. (1975). “A normalization procedure for vowel formant data,” in Proceedings of the Eighth International Congress of Phonetic Sciences, August 17–23, Leeds, UK.
43. Nygaard, L. C., Sommers, M. S., and Pisoni, D. B. (1995). “Effects of stimulus variability on perception and representation of spoken words in memory,” Percept. Psychophys. 57(7), 989–1001.
44. Olshausen, B. A., and Field, D. J. (1996). “Natural image statistics and efficient coding,” Network 7(2), 333–339.
45. Peterson, G. E. (1951). “The phonetic value of vowels,” Language 27(4), 541–553.
46. Peterson, G. E., and Barney, H. L. (1952). “Control methods used in a study of the vowels,” J. Acoust. Soc. Am. 24(2), 175–184.
47. R Development Core Team (2021). “R: A Language and Environment for Statistical Computing,” R Foundation for Statistical Computing, Vienna, Austria.
48. Ruderman, D. L., Cronin, T. W., and Chiao, C. C. (1998). “Statistics of cone responses to natural images: Implications for visual coding,” J. Opt. Soc. Am. A 15(8), 2036–2045.
49. Schwartz, O., and Simoncelli, E. P. (2001). “Natural signal statistics and sensory gain control,” Nat. Neurosci. 4(8), 819–825.
50. Smith, D. R. R., and Patterson, R. D. (2005). “The interaction of glottal-pulse rate and vocal-tract length in judgements of speaker size, sex, and age,” J. Acoust. Soc. Am. 118(5), 3177–3186.
51. Spahr, A. J., Dorman, M. F., Litvak, L. M., van Wie, S., Gifford, R. H., Loizou, P. C., Loiselle, L. M., Oakes, T., and Cook, S. (2012). “Development and validation of the AzBio sentence lists,” Ear Hear. 33(1), 112–117.
52. Stilp, C. E. (2020). “Acoustic context effects in speech perception,” Wiley Interdiscip. Rev. Cogn. Sci. 11(1), 1–18.
53. Stilp, C. E., and Theodore, R. M. (2020). “Talker normalization is mediated by structured indexical information,” Atten. Percept. Psychophys. 82, 2237–2243.
54. Sommers, M. S., Nygaard, L. C., and Pisoni, D. B. (1994). “Stimulus variability and spoken word recognition. I. Effects of variability in speaking rate and overall amplitude,” J. Acoust. Soc. Am. 96(3), 1314–1324.
55. Tkačik, G., Prentice, J. S., Victor, J. D., and Balasubramanian, V. (2010). “Local statistics in natural scenes predict the saliency of synthetic textures,” Proc. Natl. Acad. Sci. U.S.A. 107(42), 18149–18154.
56. Van Lancker, D., Kreiman, J., and Emmorey, K. (1985). “Familiar voice recognition: Patterns and parameters Part I: Recognition of backward voices,” J. Phon. 13(1), 19–38.
57. Wakita, H. (1977). “Normalization of vowels by vocal-tract length and its application to vowel identification,” IEEE Trans. Acoust. Speech Signal Process. 25, 183–192.
58. Walden, B. E., Montgomery, A. A., Gibeily, G. J., Prosek, R. A., and Schwartz, D. M. (1978). “Correlates of psychological dimensions in talker similarity,” J. Speech Hear. Res. 21(2), 265–275.
59. Winn, M. B., and Litovsky, R. Y. (2015). “Using speech sounds to test functional spectral resolution in listeners with cochlear implants,” J. Acoust. Soc. Am. 137(3), 1430–1442.
60. Woods, K. J. P., Siegel, M. H., Traer, J., and McDermott, J. H. (2017). “Headphone screening to facilitate web-based auditory experiments,” Atten. Percept. Psychophys. 79(7), 2064–2072.
61. Zhang, C., and Chen, S. (2016). “Toward an integrative model of talker normalization,” J. Exp. Psychol. Hum. Percept. Perform. 42(8), 1252–1268.
