The effects of age, sex, and vocal tract configuration on the glottal excitation signal in speech are only partially understood, yet understanding these effects is important for both recognition and synthesis of speech as well as for medical purposes. In this paper, three acoustic measures related to the voice source are analyzed for five vowels from 3145 CVC utterances spoken by 335 talkers (8–39 years old) from the CID database [Miller et al., Proceedings of ICASSP, 1996, Vol. 2, pp. 849–852]. The measures are: the fundamental frequency (F0), the difference between the “corrected” (denoted by an asterisk) first two spectral harmonic magnitudes, H1*−H2* (related to the open quotient), and the difference between the “corrected” magnitudes of the first spectral harmonic and that of the third formant peak, H1*−A3* (related to source spectral tilt). The correction refers to compensating for the influence of formant frequencies on spectral magnitude estimation. Experimental results show that the three acoustic measures are dependent to varying degrees on age and vowel. Age dependencies are more prominent for male talkers, while vowel dependencies are more prominent for female talkers, suggesting a greater vocal tract-source interaction. All talkers show a dependency of F0 on sex and on F3, and of H1*−A3* on vowel type. For low-pitched talkers (F0 ≤ 175 Hz), H1*−H2* is positively correlated with F0, while for high-pitched talkers, H1*−H2* is dependent on F1 or vowel height. For high-pitched talkers there were no significant sex dependencies of H1*−H2* and H1*−A3*. The statistical significance of these results is shown.
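As a rough illustration of the harmonic-difference measure described above, the following sketch computes an uncorrected H1−H2 (in dB) from a signal whose F0 is assumed to be known. It is not the procedure used in the paper: the formant correction that turns H1−H2 into H1*−H2* is not implemented, and the function names, the ±20 Hz search band, and the synthetic test signal are all assumptions made for this example.

```python
import numpy as np

def harmonic_magnitude_db(signal, sample_rate, freq, search_hz=20.0):
    """Peak spectral magnitude in dB within +/- search_hz of freq.

    Hypothetical helper: the search band width is an assumption, not a
    value taken from the paper.
    """
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = (freqs >= freq - search_hz) & (freqs <= freq + search_hz)
    return 20.0 * np.log10(spectrum[band].max())

def h1_minus_h2(signal, sample_rate, f0):
    """Uncorrected H1-H2: first minus second harmonic magnitude, in dB.

    No compensation for formant influence is applied here, so this is
    not the corrected H1*-H2* measure.
    """
    h1 = harmonic_magnitude_db(signal, sample_rate, f0)
    h2 = harmonic_magnitude_db(signal, sample_rate, 2.0 * f0)
    return h1 - h2

# Synthetic example (assumed values): a 150 Hz tone whose second
# harmonic has half the amplitude of the first.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of samples
f0 = 150.0
signal = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)

# An amplitude ratio of 2 corresponds to 20*log10(2), about 6.02 dB.
print(round(h1_minus_h2(signal, sample_rate, f0), 2))
```

On real speech the harmonic magnitudes are shaped by the vocal tract resonances, which is why the abstract's measures apply a correction before taking the difference; this sketch only shows the raw spectral difference.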

1. Ananthapadmanabha, T. V. (1984). “Acoustic analysis of voice source dynamics,” STL-QPSR 25, 1–24.
2. Baken, R. J. (1987). Clinical Measurement of Speech and Voice (Taylor and Francis, London).
3. Doval, B., and d’Alessandro, C. (1999). “The spectrum of glottal flow models,” Technical Report, LIMSI-CNRS, Orsay, France.
4. El-Jaroudi, A., and Makhoul, J. (1991). “Discrete all-pole modeling,” IEEE Trans. Signal Process. 39, 411–423.
5. Esposito, C. (2005). “An acoustic and electroglottographic study of phonation in Santa Ana del Valle Zapotec,” Poster at the 79th Meeting of the Linguistic Society of America, 2005.
6. Fant, G. (1960). Acoustic Theory of Speech Production (Mouton, The Hague, Paris).
7. Fant, G. (1982). “The voice source-acoustic modeling,” STL-QPSR 23, 28–48.
8. Fant, G. (1995). “The LF model revisited. Transformations and frequency domain analysis,” STL-QPSR 36, 119–156.
9. Fant, G., and Kruckenberg, A. (1996). “Voice source properties of speech code,” TMH-QPSR 37, 45–56.
10. Fant, G., Kruckenberg, A., Liljencrants, J., and Hertegård, S. (2000). “Acoustic-phonetic studies of prominence in Swedish,” TMH-QPSR 41, 1–52.
11. Fant, G., Liljencrants, J., and Lin, Q. (1985). “A four-parameter model of glottal flow,” STL-QPSR 26, 1–13.
12. Fröhlich, M., Michaelis, D., and Strube, H. W. (2001). “Sim-Simultaneous inverse filtering and matching of a glottal flow model for acoustic speech signals,” J. Acoust. Soc. Am. 110, 479–488.
13. Hanson, H. M. (1995). “Glottal characteristics of female speakers,” Ph.D. dissertation, Harvard University, Cambridge, MA.
14. Hanson, H. M. (1997). “Glottal characteristics of female speakers: Acoustic correlates,” J. Acoust. Soc. Am. 101, 466–481.
15. Hanson, H. M., and Chuang, E. S. (1999). “Glottal characteristics of male speakers: Acoustic correlates and comparison with female data,” J. Acoust. Soc. Am. 106, 1064–1077.
16. Hedelin, P. (1984). “A glottal LPC-vocoder,” in Proc. IEEE, 1.6.1–1.6.4.
17. Henrich, N., d’Alessandro, C., and Doval, B. (2001). “Spectral correlates of voice open quotient and glottal flow asymmetry: Theory, limits and experimental data,” in Proceedings of EUROSPEECH, Scandinavia, pp. 47–50.
18. Hertegård, S., and Gauffin, J. (1992). “Acoustic properties of the Rothenberg mask,” STL-QPSR 33, 9–18.
19. Holmberg, E. B., Hillman, R. E., Perkell, J. S., Guiod, P., and Goldman, S. L. (1995). “Comparisons among aerodynamic, electroglottographic, and acoustic spectral measures of female voice,” J. Speech Hear. Res. 38, 1212–1223.
20. Holmes, J. N. (1973). “Influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer,” IEEE Trans. Audio Electroacoust., 298–305.
22. Iseli, M., and Alwan, A. (2004). “An improved correction formula for the estimation of harmonic magnitudes and its application to open quotient estimation,” in Proceedings of ICASSP, Montreal, Canada, Vol. 1, pp. 669–672.
23. Klatt, D. H., and Klatt, L. C. (1990). “Analysis, synthesis, and perception of voice quality variations among female and male talkers,” J. Acoust. Soc. Am. 87, 820–857.
24. Koreman, J. (1996). “Decoding linguistic information in the glottal airflow,” Ph.D. thesis, University of Nijmegen.
25. Lee, S., Potamianos, A., and Narayanan, S. (1999). “Acoustics of children’s speech: Developmental changes of temporal and spectral parameters,” J. Acoust. Soc. Am. 105, 1455–1468.
26. Lehiste, I., and Peterson, G. E. (1961). “Some basic considerations in the analysis of intonation,” J. Acoust. Soc. Am. 33, 419–425.
27. Maddieson, I., and Ladefoged, P. (1985). “Tense and lax in four minority languages of China,” J. Phonetics 13, 433–454.
28. Mannell, R. H. (1998). “Formant diphone parameter extraction utilising a labelled single speaker database,” in Proceedings of the ICSLP (ASSTA, Sydney, Australia), Vol. 5, pp. 2003–2006.
29. Marasek, K. (1996). “Glottal correlates of the word stress and the tense-lax opposition in German,” in Proceedings ICSLP, Philadelphia, PA, pp. 1573–1576.
30. Markel, J. D., and Gray, A. H., Jr. (1976). Linear Prediction of Speech (Springer, New York).
31. Mártony, J. (1965). “Studies of the voice source,” STL-QPSR 6, 4–9.
32. Miller, J., Lee, S., Uchanski, R., Heidbreder, A., Richman, B., and Tadlock, J. (1996). “Creation of two children’s speech databases,” in Proceedings of ICASSP, Vol. 2, pp. 849–852.
33. Miller, R. L. (1959). “Nature of the vocal cord wave,” J. Acoust. Soc. Am. 31, 667–677.
34. Peterson, G. E., and Barney, H. L. (1952). “Control methods used in a study of the vowels,” J. Acoust. Soc. Am. 24, 175–184.
35. Rabiner, L. R., and Schafer, R. W. (1978). Digital Processing of Speech Signals (Prentice Hall, Englewood Cliffs, NJ).
36. Rosenberg, A. E. (1971). “Effect of glottal pulse shape on the quality of natural vowels,” J. Acoust. Soc. Am. 49, 583–590.
37. Ross, M. J., Shaffer, H. L., Cohen, A., Freudberg, R., and Manley, H. (1974). “Average magnitude difference function pitch extractor,” IEEE Trans. Acoust., Speech, Signal Process. 22, 353–362.
38. Rothenberg, M. (1973). “A new inverse-filtering technique for deriving the glottal airflow during voicing,” J. Acoust. Soc. Am. 53, 1632–1645.
39. Sjölander, K. (2004). “Snack sound toolkit,” KTH Stockholm, Sweden, http://www.speech.kth.se/snack/ (last viewed January 2007).
40. Sluijter, A., and Van Heuven, V. (1996). “Spectral balance as an acoustic correlate of linguistic stress,” J. Acoust. Soc. Am. 100, 2471–2485.
41. Sluijter, A., Van Heuven, V., and Pacilly, J. (1997). “Spectral balance as a cue in the perception of linguistic stress,” J. Acoust. Soc. Am. 101, 503–513.
42. Swerts, M., and Veldhuis, R. (2001). “The effect of speech melody on voice quality,” Speech Commun. 33, 297–303.
43. Titze, I. R. (2004). “A theoretical study of f0-f1 interaction with application to resonant speaking and singing voice,” J. Voice 18, 292–298.
44. Wakita, H. (1977). “Normalization of vowels by vocal-tract length and its application to vowel identification,” IEEE Trans. Acoust., Speech, Signal Process. 25, 183–192.