This study reports an investigation of the well-known context-dependent variation in English /r/ using a biomechanical tongue-jaw-hyoid model. The simulation results show that preferred /r/ variants require less volume displacement, relative strain, and relative muscle stress than variants that are not preferred. This study also uncovers a previously unknown mechanism in tongue biomechanics for /r/ production: Torque in the sagittal plane about the mental spine. This torque enables raising of the tongue anterior for retroflexed [ɻ] by activation of hyoglossus and relaxation of anterior genioglossus. The results provide a deeper understanding of the articulatory factors that govern contextual phonetic variation.
I. Introduction
Although articulatory factors are widely assumed to be responsible for much phonetically conditioned sound variation, a clear understanding of the mechanics underlying these factors has remained out of reach. One of the most prominent and extreme cases of phonetic variation has been that of North American English /r/, a segment long known to exhibit categorical variability in tongue shape across dialects and speakers, and even across phonetic contexts within speaker (Delattre and Freeman, 1968; Ong and Stone, 1998; Guenther et al., 1999; Tiede et al., 2004; Zhou et al., 2008; Campbell et al., 2010; Mielke et al., 2010). We use a modeling approach to examine biomechanical factors underlying the observation that contextual vowel quality influences the selection of “bunched” vs “tip-up” /r/ variants.
Using a dynamic three-dimensional (3D) jaw-tongue-hyoid model developed by Stavness et al. (2011) in the ARTISYNTH modeling toolkit (www.artisynth.org; see also Lloyd et al., 2011), we evaluate the biomechanical basis of preferred tongue postures for English /r/ in the context of /i/ vs /a/ vowels. Mielke et al. (2010) found that bunched /r/ is generally more widely represented than tip-up /r/. However, for those American English speakers who use multiple /r/ variants (11 out of 27, or about 40% of speakers in their study), bunched /r/ was more likely to occur adjacent to the vowel /i/, whereas tip-up postures tended to occur with /a/ and /o/. Based on these findings, we test whether the categorical selection of /r/ variants is based on minimization of tongue displacement, relative strain, and relative muscle-induced stress between /r/ and an adjacent vowel.
II. Methods
A. Model
We used a computational model of coupled jaw-tongue-hyoid dynamics to analyze tongue biomechanics for English /r/ variants. The model was developed with the ARTISYNTH simulation toolkit and has been described in detail previously (Stavness et al., 2011). The 3D finite-element method (FEM) tongue model was based on a reference model reported by Buchaillard et al. (2009) and used a hyper-elastic, incompressible Mooney–Rivlin material. Contact between the tongue and teeth was detected and handled as a constraint on the FEM dynamics, allowing the lateral aspects of the tongue to brace against the teeth, as is known to occur in English /r/ (Ong and Stone, 1998).
The tongue musculature was modeled using a transversely isotropic FEM material (Weiss et al., 1996). The muscle geometry was consistent with Buchaillard et al. (2009), who used simplified line-based muscles; however, the FEM muscle material provided a better distribution of muscle stress and a more realistic representation of muscle mechanics, including stress stiffening during muscle activation. Mesh elements that were associated with a particular muscle were assigned a fiber direction that represented the muscle’s principal line of action. Stress was added in the fiber direction to represent passive muscle stress (which varied nonlinearly with strain along the fiber direction) and active muscle stress (which varied linearly with muscle activation, from 0 to 10 kPa).
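The fiber-direction stress described above can be sketched in a few lines. This is a minimal illustration, not the ARTISYNTH implementation: it assumes that the 0 to 10 kPa range corresponds to 0% to 100% activation, and it leaves the passive term as a caller-supplied function because the text states only that it is nonlinear in fiber strain. The function names are hypothetical.

```python
MAX_ACTIVE_STRESS_KPA = 10.0  # assumed active stress at 100% activation


def active_fiber_stress_kpa(activation_percent):
    """Active stress along the fiber direction, linear in activation (0-100%)."""
    if not 0.0 <= activation_percent <= 100.0:
        raise ValueError("activation must be between 0 and 100 percent")
    return MAX_ACTIVE_STRESS_KPA * activation_percent / 100.0


def total_fiber_stress_kpa(activation_percent, fiber_strain, passive_stress_kpa):
    """Sum of passive and active stress along the fiber direction.

    passive_stress_kpa is a hypothetical callable mapping fiber strain to
    stress in kPa; its functional form is not given in the text.
    """
    return passive_stress_kpa(fiber_strain) + active_fiber_stress_kpa(activation_percent)


# GGm is activated at 30% for bunched /r/ in Table I.
print(active_fiber_stress_kpa(30))  # 3.0
```

Under this reading, the activation levels in Table I map directly onto active stresses, e.g., 30% activation of GGm corresponds to 3 kPa.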
Muscle activations were manually set to achieve canonical tongue postures for English vowels /i/ and /a/ and for variants of English /r/, as shown in Fig. 1. Muscle activations were arrived at by interactively adjusting activation magnitudes during a forward dynamics simulation of the tongue model, until the 3D shape of the tongue model conformed to the posture of interest, as determined by qualitative evaluation of matched postures from ultrasound and magnetic resonance imaging (MRI) images. All /r/ variants had tongue constrictions at the same location (just behind the alveolar ridge) on the anterior hard palate. Tongue shapes for vowels (/i/ and /a/) and three /r/ variants (one bunched and two tip-up) were achieved by activating muscles as indicated in Table I. The postures attained from the two tip-up /r/ variants were similar (see Fig. 2) and therefore only the simulations of tip-up /r/ using primarily hyoglossus activation were used for the comparative analysis.
Fig. 1. (Color online) Sagittal cut-away view of displacement (A), relative strain (B), and relative muscle-induced stress (C) for tongue /r/ postures relative to /a/ (top row) and /i/ (bottom row). Smaller values were found for bunched /r/ in the context of /i/ (lower left-hand panel of all three subfigures) and for tip-up /r/ in the context of /a/ (upper right-hand panel of all three subfigures).
Table I. Muscle activations (%) for the five simulated tongue postures (abbreviations are listed below the table).
Muscle | /a/ | /i/ | bunched /r/ | tip-up /r/ (SL) | tip-up /r/ (HG + SL) |
---|---|---|---|---|---|
GGp | — | 15 | — | — | — |
GGm | 10 | 15 | 30 | 10 | 10 |
GGa | 15 | 5 | 5 | — | — |
SLa | — | 5 | 5 | 15 | 8 |
SLa_lat | — | 60 | 60 | — | — |
ILa | — | 60 | 60 | — | — |
HG | 15 | — | — | — | 8 |
TRANSp | — | — | — | — | — |
TRANSa | — | 5 | 5 | — | — |
VERTp | — | 10 | — | — | — |
VERTa | 5 | — | 5 | — | — |
jaw_open | 4 | — | — | — | — |
jaw_close | — | 3 | 3 | 3 | 3 |
Abbreviations: anterior/middle/posterior genioglossus (GGa/m/p), superior longitudinal (SL), anterior and anterior-lateral fibers of superior longitudinal (SLa and SLa_lat), anterior fibers of inferior longitudinal (ILa), hyoglossus (HG), anterior/posterior fibers of transversus (TRANSa/p), anterior/posterior fibers of verticalis (VERTa/p), and jaw opening and closing muscles (jaw_open/close).
Fig. 2. (Color online) Compressive muscle stress for two variants of tip-up /r/ relative to /a/ posture. The two tip-up postures were created using different muscle activations: primarily superior longitudinal activation (left) and primarily hyoglossus activation with reduced superior longitudinal activation (right).
B. Metrics
Tongue deformation was characterized by the distribution of stress and strain in the FEM mesh, which was induced by tongue muscle forces given the material properties of the FEM model and the constraint of incompressibility. Our metrics for comparing the two alternative /r/ postures relative to the reference /a/ and /i/ postures were displacement, relative strain, and relative muscle-induced stress. For each metric, an average value was calculated as the mean over all nodes in the FEM mesh, and a percentage change was calculated as the difference between the /r/ posture and vowel posture values divided by the vowel posture value. Displacement was calculated as the Euclidean distance between each FEM node in the /r/ posture and the same node in the reference posture; it measured the position change of the tongue due to both jaw motion and tongue deformation. Von Mises strain measured tongue shape change, i.e., the deformation of each finite element, and is invariant to translation and rotation. Muscle stress measured the additional stress in the fiber direction of each finite element corresponding to passive and active muscle stress. Distributions are plotted in Fig. 1.
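The displacement, percentage-change, and strain calculations can be sketched as follows. This is an illustrative reconstruction rather than the code used in the study: the function names are hypothetical, the node coordinates in the example are made up, and the von Mises equivalent strain uses one common definition, sqrt(2/3 · e_dev : e_dev), which may differ from the toolkit's own formula.

```python
import math


def mean_displacement(nodes_r, nodes_ref):
    """Mean Euclidean distance between corresponding nodes of two postures."""
    if not nodes_r or len(nodes_r) != len(nodes_ref):
        raise ValueError("postures must share the same non-empty node set")
    return sum(math.dist(p, q) for p, q in zip(nodes_r, nodes_ref)) / len(nodes_r)


def percent_change(difference, vowel_value):
    """Difference between /r/ and vowel values as a percentage of the vowel value."""
    return 100.0 * difference / vowel_value


def von_mises_strain(e):
    """Equivalent strain of a symmetric 3x3 strain tensor (assumed definition)."""
    mean = (e[0][0] + e[1][1] + e[2][2]) / 3.0
    # Remove the volumetric part so that only shape change remains.
    dev = [[e[i][j] - (mean if i == j else 0.0) for j in range(3)] for i in range(3)]
    return math.sqrt(2.0 / 3.0 * sum(dev[i][j] ** 2 for i in range(3) for j in range(3)))


# Toy two-node mesh (coordinates in mm, invented for illustration).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
r_posture = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
print(mean_displacement(r_posture, reference))  # 2.5
```

Because displacement is computed from node positions, it includes rigid motion from the jaw, whereas the strain measure depends only on the local deformation of each element.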
We also calculated compressive muscle stress (CMS) in order to compare the two simulated variants of tip-up /r/. CMS is useful because it represents stress that is the result of active work done by the tongue muscles (omitting changes in stress that are the result of relaxation), thus providing a characterization of the distribution of work involved in implementing a shape transition. CMS plots comparing the two tip-up /r/ variants relative to /a/ posture are shown in Fig. 2.
III. Results
The resulting 3D dynamic simulations for vowel and /r/ postures are illustrated in Mm. 1.
Mm. 1. Video showing 3D finite-element simulations of the tongue for vowel and /r/ postures. Color plots show tissue strain relative to rest posture. This is a file of type “mov” (11 Mb, H.264 encoding). [URL: http://dx.doi.org/10.1121/1.3695407.1]
A. Displacement
Average displacement of the tongue relative to /i/ posture was smaller for bunched /r/ (2.4 mm, 53%) than for tip-up /r/ (5.5 mm, 120%); average displacement relative to /a/ posture was smaller for tip-up /r/ (6.0 mm, 126%) than for bunched /r/ (6.8 mm, 141%). Maximum displacements were highest for the tip-up /r/, which is expected given the vertical displacement of the tongue tip. However, the volume of tissue in the tongue tip is small, so displacing the tip was preferred over displacing the much larger tongue body and tongue root, as was required in moving from /a/ to bunched /r/ [Fig. 1(A), upper-left panel].
B. Relative strain
Average strain difference relative to /i/ posture was smaller for bunched /r/ (0.18,¹ 42%) than for tip-up /r/ (0.28, 65%); average strain difference relative to /a/ posture was smaller for tip-up /r/ (0.17, 54%) than for bunched /r/ (0.19, 62%). Figure 1(B) illustrates that the majority of the strain change between /i/ and bunched /r/ occurred at the tongue root, while the tongue body and tip kept the same shape; between /a/ and tip-up /r/, by contrast, the majority of the strain change occurred at the tongue tip, while the tongue root shape remained constant.
C. Relative muscle stress
Average muscle-stress difference relative to /i/ posture was smaller for bunched /r/ (1.4 kPa, 54%) than for tip-up /r/ (2.5 kPa, 95%); average muscle-stress difference relative to /a/ posture was smaller for tip-up /r/ (0.8 kPa, 83%) than for bunched /r/ (1.7 kPa, 166%). The small muscle-stress difference between /a/ and tip-up /r/ arose because the change in muscle activation amplitudes from /a/ to tip-up /r/ was small: relaxation of anterior genioglossus (GGa) and a low-level activation of superior longitudinal (SL).
D. Tip-up variants
We simulated tip-up /r/ with two different muscle activation schemes: One with primarily SL activation and one with primarily hyoglossus (HG) activation. The resulting tongue postures were very similar. The main difference was that HG activation (which was independently required for the /a/ posture) permitted a reduction in SL activation due to torque about the mental spine of the mandible, and resulted in less difference in muscle stress relative to /a/ posture, as shown in Fig. 2.
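The sense of the torque described above can be illustrated with a two-dimensional cross product in the sagittal plane. This is a deliberately crude rigid-lever sketch, not output from the FEM model: the tongue is a deformable, incompressible body, and the attachment point and force vector below are hypothetical values chosen only to show the sign of the moment. Axes are assumed to be x anterior and y superior, with the mental spine as the pivot.

```python
def sagittal_torque(pivot, point, force):
    """z-component of r x F in the sagittal plane (x anterior, y superior)."""
    rx = point[0] - pivot[0]
    ry = point[1] - pivot[1]
    return rx * force[1] - ry * force[0]


mental_spine = (0.0, 0.0)        # pivot (mm)
hg_application = (-30.0, 20.0)   # hypothetical point in the posterior tongue body (mm)
hg_force = (-0.5, -1.0)          # hypothetical posterior-inferior pull toward the hyoid (N)

torque = sagittal_torque(mental_spine, hg_application, hg_force)
print(torque)  # 40.0 (N mm)
```

With anterior drawn to the right and superior up, a positive value is a counterclockwise moment, which in this idealization lowers tissue behind the pivot and lifts tissue ahead of it, matching the direction of the effect attributed to HG activation.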
IV. Discussion
All three of the measures we applied in this study—tissue displacement, relative strain, and relative muscle stress—aligned to support the notion that contextual /r/ variation in speech production is indeed governed by mechanical articulatory factors. Specifically, our simulations showed reductions in all three measures for transitions between bunched /r/ and the vowel /i/, and between tip-up /r/ and the vowel /a/. These simulation results are consistent with previous production experimental results that showed a preference for bunched /r/ in the context of /i/ and tip-up /r/ in the context of /a/, for those speakers who exhibited variation in /r/ shape (Mielke et al., 2010).
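The agreement among the three measures can be checked directly from the averages reported in Sec. III. The short script below is only a bookkeeping aid with hypothetical names: it records the reported values and picks, for each metric and vowel context, the /r/ variant with the smaller value.

```python
# Average values reported in Sec. III (displacement in mm, strain
# dimensionless, muscle stress in kPa), keyed by vowel context and variant.
REPORTED = {
    "displacement_mm": {
        "/i/": {"bunched": 2.4, "tip-up": 5.5},
        "/a/": {"bunched": 6.8, "tip-up": 6.0},
    },
    "strain": {
        "/i/": {"bunched": 0.18, "tip-up": 0.28},
        "/a/": {"bunched": 0.19, "tip-up": 0.17},
    },
    "muscle_stress_kpa": {
        "/i/": {"bunched": 1.4, "tip-up": 2.5},
        "/a/": {"bunched": 1.7, "tip-up": 0.8},
    },
}


def preferred_variant(values):
    """Return the variant with the smallest metric value."""
    return min(values, key=values.get)


summary = {
    metric: {vowel: preferred_variant(variants) for vowel, variants in by_vowel.items()}
    for metric, by_vowel in REPORTED.items()
}

for metric, by_vowel in summary.items():
    for vowel, variant in by_vowel.items():
        print(f"{metric:18s} next to {vowel}: {variant}")
```

Every metric selects bunched /r/ next to /i/ and tip-up /r/ next to /a/, although the margins differ considerably (e.g., 0.17 vs 0.19 for strain relative to /a/).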
Our biomechanical analysis enabled a deeper comparison of tongue postures than possible with traditional kinematic analysis. Although the displacement and strain metrics are kinematic measures, they comprise a full 3D comparison of tongue shape change, which is hard to characterize experimentally. Further, the muscle-stress metric characterized tongue posture with respect to the contractile energy used to generate and maintain tissue deformation. Muscle stress is difficult to measure experimentally and therefore a model-based approach is warranted.
Tongue muscle activity is challenging to measure experimentally, and therefore it is difficult to compare the muscle activations used in our simulations to real data. Simulations do, however, allow one to evaluate the set of feasible muscle strategies. Given the complex arrangement of tongue musculature, quite different muscle activation strategies may produce similar tongue postures. We found two such strategies for creating a tip-up /r/ posture: one with primarily superior longitudinal activation and one with primarily hyoglossus activation. Also, the similarity of muscle stress patterns between bunched /r/ and /i/, as well as between tip-up /r/ and /a/, suggests that similar strategies may be employed as part of speech planning in producing these paired tongue postures.
The appearance of the hyoglossus-induced tip-up /r/ variant was an unexpected result of our simulations, and emerged largely as a result of relaxing the GGa muscles during /a/. The existence of this mechanism highlights a hitherto unnoticed aspect of tongue mechanics: Torque (in this case induced by the HG) about the tongue’s central bone attachment point at the mental spine of the mandible. This factor is likely to play into global tongue movement patterns observed in previous studies (e.g., Iskarous, 2005).
In addition to offering new insights into the mechanics underlying tongue postures, these simulations provide a basis for further research into influences on conditioned speech variability in general. Our framework for model-based analysis also provides avenues for uncovering additional factors that may play into other types of variation, such as cases of apparently free variation, and sound change.
Acknowledgments
This work was supported by the Natural Sciences and Engineering Research Council of Canada. We acknowledge John Lloyd and the ArtiSynth team at UBC, as well as Yohan Payan from TIMC-Lab and Pascal Perrier from Gipsa-Lab, Grenoble, for their contributions to the tongue model.
¹ Strain is a dimensionless quantity.