A new method, the spectral locus control method (SLCM), is proposed that effectively and accurately approximates the dynamic characteristics of the speech spectrum, such as the transitions from vowels to consonants and from consonants to vowels. The main procedure is as follows. Continuous speech is segmented into VCV units, and these units are grouped by consonant. The spectrum patterns of the V1CV2 units in each group are analyzed to construct a statistical model that, given the spectra of V1 and V2, generates the spectrum loci for V1CV2 units. To synthesize continuous speech, a spectrum appropriate to the given consonantal context is first selected for each vowel V in every CVC sequence in the text. The temporal sequence of spectrum patterns for the entire V1CV2 unit is then calculated from the spectra of the stationary parts of V1 and V2. Because the VCV segment spectra are adapted to their consonantal environment, the synthesized speech is highly natural, especially in transitions.
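The abstract does not say what form the statistical model takes, so the following is only a minimal sketch under stated assumptions: one affine least-squares map per consonant group, spectra represented as fixed-length coefficient vectors, and every V1CV2 trajectory time-normalized to the same number of frames. The names `fit_locus_model`, `predict_locus`, and `toy_locus`, and the synthetic training data, are hypothetical and are not taken from the original work.

```python
import numpy as np

def fit_locus_model(v1, v2, loci):
    """Fit an affine map for ONE consonant group (assumed model form).

    v1, v2: (n_units, n_coef) steady-state spectra of V1 and V2.
    loci:   (n_units, n_frames, n_coef) time-normalized V1CV2 trajectories.
    Returns weights of shape (2 * n_coef + 1, n_frames * n_coef).
    """
    n_units = v1.shape[0]
    # Predictors: V1 spectrum, V2 spectrum, and a bias term.
    X = np.hstack([v1, v2, np.ones((n_units, 1))])
    # Targets: each trajectory flattened frame by frame.
    Y = loci.reshape(n_units, -1)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def predict_locus(W, v1, v2, n_frames):
    """Generate a spectral trajectory from the V1 and V2 spectra."""
    x = np.concatenate([v1, v2, [1.0]])
    return (x @ W).reshape(n_frames, -1)

# Synthetic data standing in for analyzed VCV units (illustrative only).
rng = np.random.default_rng(0)
n_units, n_frames, n_coef = 40, 5, 4
alpha = np.linspace(0.0, 1.0, n_frames)     # V1 -> V2 interpolation weight
dip = np.sin(np.pi * alpha)                 # consonant pull, strongest mid-unit
consonant_offset = rng.normal(size=n_coef)  # hypothetical consonant effect

def toy_locus(v1, v2):
    """Toy ground truth: vowel-to-vowel glide plus a consonant deflection."""
    return ((1 - alpha)[:, None] * v1 + alpha[:, None] * v2
            + dip[:, None] * consonant_offset)

train_v1 = rng.normal(size=(n_units, n_coef))
train_v2 = rng.normal(size=(n_units, n_coef))
train_loci = np.stack([toy_locus(a, b) for a, b in zip(train_v1, train_v2)])

W = fit_locus_model(train_v1, train_v2, train_loci)
new_v1, new_v2 = rng.normal(size=n_coef), rng.normal(size=n_coef)
predicted = predict_locus(W, new_v1, new_v2, n_frames)
```

In this sketch a separate `W` would be fitted for each consonant group, and synthesis would select the group's `W` before calling `predict_locus` with the stationary-part spectra of the two vowels. Real trajectories need not be affine in the vowel spectra, so the linear form is a simplification chosen for illustration.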
November 1988
August 13 2005
Statistical modeling of dynamic spectral patterns for a speech synthesizer
Sateshi Takahashi; Yasuaki Satoh; Takeshi Ohno; Katsuhiko Shirai
Department of Electrical Engineering, Waseda University, 3‐4‐1 Ohkubo, Shinjuku‐ku, Tokyo, 160 Japan
J. Acoust. Soc. Am. 84, S23 (1988)
Citation
Sateshi Takahashi, Yasuaki Satoh, Takeshi Ohno, Katsuhiko Shirai; Statistical modeling of dynamic spectral patterns for a speech synthesizer. J. Acoust. Soc. Am. 1 November 1988; 84 (S1): S23. https://doi.org/10.1121/1.2026230