Motivated by the source-filter model of speech production, emotional speech has been analyzed extensively using inverse filtering. However, the relative contributions of glottal source and vocal tract cues to the perception of emotions in speech remain unclear, especially once the effects of the known dominant factors (e.g., F0, intensity, and duration) are removed. In the present study, the glottal source and vocal tract parameters were estimated simultaneously, modified in a controlled way, and then used to resynthesize emotional Japanese vowels with a recently developed analysis-by-synthesis method. The resynthesized vowels were presented to native Japanese listeners with normal hearing, who rated the perceived emotion along the valence and arousal dimensions. Results showed that glottal source information played a dominant role in the perception of emotions in vowels, while vocal tract information contributed to valence and arousal perception after the effects of F0, intensity, and duration cues were neutralized.
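The source-filter idea behind the abstract can be illustrated with a minimal sketch. This is not the analysis-by-synthesis method used in the study: it assumes a simple raised-cosine glottal pulse (a stand-in for a proper glottal model) and a cascade of two-pole resonators for the vocal tract. The function names, the formant values for the vowel, and the `open_quotient` parameter are all hypothetical and chosen only for illustration. The point is that the source and the filter are set independently, so one can be changed while the other is held fixed.

```python
import math

def glottal_source(f0, n_samples, fs, open_quotient=0.6):
    """Glottal flow as one raised-cosine pulse per pitch period (illustrative stand-in)."""
    period = fs / f0                    # samples per pitch period
    open_len = open_quotient * period   # samples during which the glottis is open
    flow = []
    for n in range(n_samples):
        phase = n % period              # position within the current period
        if phase < open_len:
            flow.append(0.5 * (1.0 - math.cos(2.0 * math.pi * phase / open_len)))
        else:
            flow.append(0.0)            # closed phase: no flow
    return flow

def resonator(signal, freq, bandwidth, fs):
    """Two-pole resonator modelling one formant, normalized to unity gain at DC."""
    r = math.exp(-math.pi * bandwidth / fs)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = -r * r
    gain = 1.0 - a1 - a2
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize_vowel(f0, formants, fs=16000, duration=0.3, open_quotient=0.6):
    """Pass the glottal source through a cascade of formant resonators."""
    n_samples = round(fs * duration)
    signal = glottal_source(f0, n_samples, fs, open_quotient)
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, fs)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]   # peak-normalize so intensity is held constant

# Hypothetical (frequency, bandwidth) pairs in Hz for an /a/-like vowel.
FORMANTS_A = [(800.0, 80.0), (1200.0, 90.0), (2500.0, 120.0)]

# Same F0, duration, peak level, and vocal tract; only the glottal source differs.
neutral = synthesize_vowel(120.0, FORMANTS_A, open_quotient=0.6)
modified = synthesize_vowel(120.0, FORMANTS_A, open_quotient=0.4)
```

Holding F0, duration, and peak level fixed while varying only the source parameter mirrors, in a very simplified form, the kind of controlled manipulation the abstract describes.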
August 2018
August 22 2018
Contributions of the glottal source and vocal tract cues to emotional vowel perception in the valence-arousal space
Yongwei Li
School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
Junfeng Li
Institute of Acoustics, Chinese Academy of Sciences, 21 North 4th Ring Road, Haidian District, Beijing 100190, People's Republic of China
Masato Akagi
School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
Electronic mail: [email protected]
Also at: School of Electronic Electrical and Communication Engineering, University of Chinese Academy of Sciences, 19(A) Yuquan Road Shijingshan District, Beijing 100049, People's Republic of China.
J. Acoust. Soc. Am. 144, 908–916 (2018)
Article history
February 12 2018
August 06 2018
Yongwei Li, Junfeng Li, Masato Akagi; Contributions of the glottal source and vocal tract cues to emotional vowel perception in the valence-arousal space. J. Acoust. Soc. Am. 1 August 2018; 144 (2): 908–916.