This study applies the information geometry of normal distributions to model Japanese vowels on the basis of the first and second formants. The distributions of the Kullback-Leibler (KL) divergence and of its decomposed components were investigated to reveal statistical invariance in the vowel system. The results suggest that, although significant variability exists among individual KL divergence distributions, the population distribution tends to converge to a specific log-normal distribution. This distribution can be regarded as an invariant distribution for the population of Standard Japanese speakers. Furthermore, the mean and variance components of the KL divergence were found to be linearly related in the population distribution. The significance of these invariant features is discussed.
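
As an illustration of the quantities named in the abstract, the following is a minimal sketch, assuming each vowel is modelled as a bivariate normal distribution over the first and second formants (F1, F2). It uses the standard closed-form KL divergence between two multivariate normal distributions and splits it into a term driven by the difference of the means and a term that depends only on the covariances. Whether this split matches the decomposition used in the study is an assumption. The function name `kl_components` and the formant values below are hypothetical and purely illustrative, not data from the study.

```python
import numpy as np


def kl_components(mean_p, cov_p, mean_q, cov_q):
    """Return (mean_term, cov_term) with KL(P || Q) = mean_term + cov_term.

    P and Q are multivariate normal distributions. This is the standard
    closed form; treating the two terms as the "mean" and "variance"
    components of the study is an assumption made for illustration.
    """
    mean_p = np.asarray(mean_p, dtype=float)
    mean_q = np.asarray(mean_q, dtype=float)
    cov_p = np.asarray(cov_p, dtype=float)
    cov_q = np.asarray(cov_q, dtype=float)

    k = mean_p.shape[0]                 # dimensionality (2 for F1, F2)
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mean_q - mean_p

    # Term that vanishes when the two means coincide.
    mean_term = 0.5 * float(diff @ cov_q_inv @ diff)

    # Term that vanishes when the two covariances coincide.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    cov_term = 0.5 * float(
        np.trace(cov_q_inv @ cov_p) - k + logdet_q - logdet_p
    )
    return mean_term, cov_term


# Made-up (F1, F2) statistics in Hz for two vowel categories of one speaker.
mean_a, cov_a = [750.0, 1200.0], [[6400.0, 0.0], [0.0, 10000.0]]
mean_i, cov_i = [300.0, 2300.0], [[2500.0, 0.0], [0.0, 22500.0]]

mean_term, cov_term = kl_components(mean_a, cov_a, mean_i, cov_i)
print("mean component:", mean_term)
print("covariance component:", cov_term)
print("KL divergence:", mean_term + cov_term)
```

Computing these two terms for every ordered vowel pair of every speaker would give the kind of per-speaker and pooled distributions the abstract refers to. Note that the KL divergence is asymmetric, so swapping the two vowels generally changes both terms.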
