An evaluation of vowel normalization procedures for the purpose of studying language variation is presented. The procedures were compared on how effectively they (a) preserve phonemic information, (b) preserve information about the talker’s regional background (or sociolinguistic information), and (c) minimize anatomical/physiological variation in acoustic representations of vowels. Recordings were made for 80 female talkers and 80 male talkers of Dutch. These talkers were stratified according to their gender and regional background. The normalization procedures were applied to measurements of the fundamental frequency and the first three formant frequencies for a large set of vowel tokens. The normalization procedures were evaluated through statistical pattern analysis. The results show that normalization procedures that use information across multiple vowels (“vowel-extrinsic” information) to normalize a single vowel token performed better than those that include only information contained in the vowel token itself (“vowel-intrinsic” information). Furthermore, the results show that normalization procedures that operate on individual formants performed better than those that use information across multiple formants (e.g., “formant-extrinsic” F2-F1).
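The distinction between vowel-extrinsic and formant-extrinsic procedures can be illustrated with a minimal sketch. It assumes a per-talker z-score of each formant (in the style of Lobanov's procedure) as the vowel-extrinsic, formant-intrinsic example, and a plain F2-F1 difference in Hz as the vowel-intrinsic, formant-extrinsic example. The formant values, the function names, and the uniform 1.2 scaling used to mimic a shorter vocal tract are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def z_normalize(tokens):
    """Vowel-extrinsic, formant-intrinsic sketch: z-score each formant
    separately, using the mean and sample standard deviation over all
    vowel tokens of ONE talker. `tokens` is a list of (F1, F2) pairs in Hz."""
    normalized_columns = []
    for column in zip(*tokens):          # one column per formant
        mu = mean(column)
        sigma = stdev(column)            # sample SD (n - 1), an assumption here
        normalized_columns.append([(value - mu) / sigma for value in column])
    return list(zip(*normalized_columns))  # back to one tuple per token

def formant_difference(token):
    """Vowel-intrinsic, formant-extrinsic sketch: F2 - F1 of a single token."""
    f1, f2 = token
    return f2 - f1

# Hypothetical (F1, F2) values in Hz for three vowels of one talker.
talker_a = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]
# A second hypothetical talker whose formants are all 20% higher,
# a deliberately simplified stand-in for anatomical differences.
talker_b = [(f1 * 1.2, f2 * 1.2) for f1, f2 in talker_a]

z_a = z_normalize(talker_a)
z_b = z_normalize(talker_b)

# The z-scores of the two talkers coincide (up to rounding), while the
# raw F2 - F1 differences still carry the 20% scaling.
print(z_a[0], z_b[0])
print(formant_difference(talker_a[0]), formant_difference(talker_b[0]))
```

Under this simplified scaling assumption, the z-scored values of the two talkers match, whereas the Hz differences do not. This only illustrates the two categories of procedure and says nothing about how they ranked in the evaluation.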
