The theory of Task Dynamics provides a method of predicting articulatory kinematics from a discrete, phonologically relevant representation (a “gestural score”). However, because implementations of that model (e.g., Nam et al., 2004) have generally used a simplified articulatory geometry (Mermelstein et al., 1981) whose forward model (from articulator to constriction coordinates) can be derived analytically, quantitative predictions of the model for individual human vocal tracts have not been possible. Recently, methods of deriving individual-speaker forward models from real-time MRI data have been developed (Sorensen et al., 2019). This has in turn enabled the development of task-dynamic models for individual speakers, which make quantitative predictions. Thus far, however, these models (Alexander et al., 2019) could synthesize only limited types of utterances because of their inability to model temporally overlapping gestures. An updated implementation is presented that accommodates overlapping gestures and incorporates an optimization loop to improve the fit of modeled articulatory trajectories to the observed ones. Using an analysis-by-synthesis approach, the updated implementation can be used: (1) to refine hypothesized speaker-general gestural parameters (target, stiffness) for individual speakers; and (2) to test different degrees of temporal overlap among multiple gestures, as in a CCVC syllable. [Work supported by NSF, Grant 1908865.]
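In the task-dynamic framework the abstract builds on, each gesture drives a tract variable (e.g., lip aperture) toward its target as a critically damped second-order system, with stiffness as the gestural parameter governing movement speed. As a minimal, illustrative sketch of that dynamic (not the authors' implementation; parameter values and the lip-aperture scenario are hypothetical):

```python
import math

def simulate_gesture(x0, target, k, dt=0.001, dur=0.3):
    """Simulate one gesture as a critically damped second-order
    point attractor, the standard task-dynamic formulation:
        x'' = -k * (x - target) - b * x',  with b = 2 * sqrt(k).
    Returns the tract-variable trajectory sampled every dt seconds."""
    b = 2.0 * math.sqrt(k)      # critical damping: approach target without overshoot
    x, v = x0, 0.0              # initial position and velocity
    traj = [x]
    for _ in range(int(dur / dt)):
        a = -k * (x - target) - b * v   # gestural restoring force + damping
        v += a * dt                      # semi-implicit Euler integration
        x += v * dt
        traj.append(x)
    return traj

# Hypothetical lip-aperture closing gesture: from 10 mm toward a 0 mm target.
traj = simulate_gesture(x0=10.0, target=0.0, k=400.0)
```

Overlapping gestures, which the updated implementation accommodates, would additionally require blending the parameters (target, stiffness) of concurrently active gestures on a shared tract variable; this sketch covers only a single gesture.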
October 2021
Meeting abstract. No PDF available.
October 01 2021
Modeling speaker-specific vocal tract kinematics from gestural scores
Yijing Lu
Linguist, Univ. of Southern California, 3601 Watt Way, Grace Ford Salvatori 301, Los Angeles, CA 90089, [email protected]
Shrikanth Narayanan
Elec. and Comput. Eng., Univ. of Southern California, Los Angeles, CA
Louis Goldstein
Linguist, Univ. of Southern California, Los Angeles, CA
Asterios Toutios
Elec. and Comput. Eng., Univ. of Southern California, Los Angeles, CA
J. Acoust. Soc. Am. 150, A188–A189 (2021)
Citation
Yijing Lu, Justin Ly, Shrikanth Narayanan, Louis Goldstein, Asterios Toutios; Modeling speaker-specific vocal tract kinematics from gestural scores. J. Acoust. Soc. Am. 1 October 2021; 150 (4_Supplement): A188–A189. https://doi.org/10.1121/10.0008077