Although listeners routinely perceive both the sex and individual identity of talkers from their speech, explanations of these abilities are incomplete. Here, variation in vocal production-related anatomy was assumed to affect vowel acoustics thought to be critical for indexical cueing. Integrating this approach with source-filter theory, patterns of acoustic parameters that should represent sex and identity were identified. Due to sexual dimorphism, the combination of fundamental frequency (F0, reflecting larynx size) and vocal tract length cues (VTL, reflecting body size) was predicted to provide the strongest acoustic correlates of talker sex. Acoustic measures associated with presumed variations in supralaryngeal vocal tract-related anatomy occurring within sex were expected to be prominent in individual talker identity. These predictions were supported by results of analyses of 2500 tokens of the /ɛ/ phoneme, extracted from the naturally produced speech of 125 subjects. Classification by talker sex was virtually perfect when F0 and VTL were used together, whereas talker classification depended primarily on the various acoustic parameters associated with vocal-tract filtering.
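As an illustration of how a vocal tract length cue can be derived from formant frequencies under source-filter theory, the following minimal sketch assumes the textbook approximation of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose n-th resonance is F_n = (2n - 1)c / (4L). This is not necessarily the VTL measure used in the study. The function name, the speed-of-sound constant, and the example formant values are hypothetical choices made for illustration only.

```python
# Assumed constant: approximate speed of sound in warm, humid air, in cm/s.
SPEED_OF_SOUND_CM_PER_S = 35000.0


def vtl_from_formants(formants_hz):
    """Estimate vocal tract length (cm) from a list of formant frequencies (Hz).

    Assumes a uniform tube closed at one end, so the n-th formant is
    F_n = (2n - 1) * c / (4 * L). Each formant yields one length estimate,
    and the estimates are averaged.
    """
    if not formants_hz:
        raise ValueError("at least one formant frequency is required")
    estimates = [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical formants of an idealized neutral vowel (not data from the study).
neutral_formants = [500.0, 1500.0, 2500.0]
print(vtl_from_formants(neutral_formants))  # 17.5
```

Because /ɛ/ is not a perfectly neutral vowel, an estimate of this kind is only a rough index of length for real tokens. The sketch does show why uniformly higher formants imply a shorter tract.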
