The dynamics of airflow during speech production often result in some degree of turbulence. In this paper, the geometry of speech turbulence, as reflected in the fragmentation of the time signal, is quantified using fractal models. An efficient algorithm for estimating the short-time fractal dimension of speech signals, based on multiscale morphological filtering, is described, and its potential for speech segmentation and phonetic classification is discussed. Also reported are experimental results on using the short-time fractal dimension of speech signals at multiple scales as additional features in an automatic speech-recognition system based on hidden Markov models; these features provide a modest improvement in speech-recognition performance.
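
As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of a morphological-covering estimate of fractal dimension, applied frame by frame. It is not the paper's exact algorithm. It assumes a flat 3-sample structuring element, edge padding at the frame boundaries, and a single least-squares line fit over scales 1 to `max_scale`. The function names (`morphological_cover_areas`, `fractal_dimension`, `short_time_fractal_dimension`) and all parameter values are hypothetical. The idea is that the area between the dilated and eroded signal at scale s grows roughly as s^(2-D), so D can be read from the slope of log(area) against log(s).

```python
import numpy as np

def morphological_cover_areas(x, max_scale):
    """Area between the dilation and erosion of x at scales 1..max_scale.

    Assumption: a flat 3-sample structuring element applied repeatedly,
    so scale s corresponds to a window of 2*s + 1 samples.
    """
    upper = np.asarray(x, dtype=float).copy()
    lower = upper.copy()
    areas = []
    for _ in range(max_scale):
        pu = np.pad(upper, 1, mode="edge")
        pl = np.pad(lower, 1, mode="edge")
        # One more dilation (running max) and erosion (running min).
        upper = np.maximum(np.maximum(pu[:-2], pu[1:-1]), pu[2:])
        lower = np.minimum(np.minimum(pl[:-2], pl[1:-1]), pl[2:])
        areas.append(np.sum(upper - lower))
    return np.array(areas)

def fractal_dimension(x, max_scale=10):
    """Estimate D from the slope of log(area) versus log(scale)."""
    areas = morphological_cover_areas(x, max_scale)
    if np.any(areas <= 0):
        # A constant frame has zero cover area; treat it as a smooth curve.
        return 1.0
    scales = np.arange(1, max_scale + 1)
    slope, _ = np.polyfit(np.log(scales), np.log(areas), 1)
    return 2.0 - slope

def short_time_fractal_dimension(x, frame_len, hop, max_scale=10):
    """Fractal dimension of successive frames of x (hypothetical framing)."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - frame_len + 1, hop)
    return np.array([fractal_dimension(x[i:i + frame_len], max_scale)
                     for i in starts])
```

Under these assumptions, a smooth signal such as a linear ramp gives an estimate close to 1, while a noise-like frame gives a clearly larger value, which is what would make such a feature a candidate for separating turbulent from non-turbulent speech segments.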
