Fronting of the vowels /u, ʊ, o/ is observed throughout most North American English varieties, but has been analyzed mainly in terms of acoustics rather than articulation. Because an increase in F2, the acoustic correlate of vowel fronting, can be the result of any gesture that shortens the front cavity of the vocal tract, acoustic data alone do not reveal the combination of tongue fronting and/or lip unrounding that speakers use to produce fronted vowels. It is furthermore unresolved to what extent the articulation of fronted back vowels varies according to consonantal context and how the tongue and lips contribute to the F2 trajectory throughout the vowel. This paper presents articulatory and acoustic data on fronted back vowels from two varieties of American English: coastal Southern California and South Carolina. Through analysis of dynamic acoustic, ultrasound, and lip video data, it is shown that speakers of both varieties produce fronted /u, ʊ, o/ with rounded lips, and that high F2 observed for these vowels is associated with a front-central tongue position rather than unrounded lips. Examination of time-varying formant trajectories and articulatory configurations shows that the degree of vowel-internal F2 change is predominantly determined by coarticulatory influence of the coda.

The fronting of the back vowels /u/, /ʊ/, and /o/ (goose, foot, and goat; Wells, 1982) has been widely observed in global varieties of English, including in North America (Labov et al., 2006), Britain and Ireland (Ferragne and Pellegrino, 2010), Australia (Cox, 1999), New Zealand (Gordon et al., 2004), and South Africa (Mesthrie, 2010). Although this change has garnered significant attention in sociophonetic work, most previous studies, particularly of North American English, rely exclusively on acoustic analysis, leaving a gap in our understanding of the articulatory processes underlying this shift. Studies examining this change from an articulatory perspective have mostly been limited to British and Irish varieties (e.g., Harrington et al., 2011; Lawson et al., 2019), even though there is evidence that fronting across dialects differs with respect to the particular vowels involved, the phonological conditioning, and the resulting acoustic quality of fronted vowels (Ferragne and Pellegrino, 2010; Labov et al., 2006). The temporal dynamics of fronted vowels likewise remain understudied. Only a handful of recent studies (Gorman and Kirkham, 2020; Stanley et al., 2021; Strycharczuk and Scobbie, 2017) examine the time-varying articulatory and/or acoustic quality of fronted back vowels, despite long-standing evidence that internal vowel dynamics are perceptually relevant (Chládková et al., 2016; Nearey and Assmann, 1986) and also show cross-dialectal variation (Farrington et al., 2018; Fox and Jacewicz, 2009). This paper addresses these gaps through a dynamic articulatory and acoustic analysis of fronted back vowels in two varieties of American English.

Back vowel fronting is widespread throughout North America. It can be observed to some extent in nearly all parts of the United States and Canada, albeit with regional variation. The Atlas of North American English (ANAE; Labov et al., 2006) identifies three broad patterns. The South, Midlands, and Mid-Atlantic regions exhibit strong fronting of both /u/ and /o/, whereas New York City, the Western US, and Canada are reported to front only /u/. The only region in which neither /u/ nor /o/ is fronted is Eastern New England, while the North shows moderate fronting of /u/ without fronting of /o/.

Labov et al. (2006) show that the distribution of F2 for /u/, in general, exhibits a three-way split according to phonological context. Fronting of /u/ is strongly favored by preceding coronal consonants; their regression analysis indicates that the coefficient for coronal onsets is more than two times greater than any other term in their model. As a result of this influence, the F2 for post-coronal /u/ often approaches that of /i/. In pre-lateral environments, /u/ retains a low F2 and remains in the high back region of the vowel space. Following non-coronal onsets, /u/ is fronted to a moderate degree, such that it is acoustically similar to central [ʉ]. The resulting distribution is bimodal: post-coronal and pre-lateral tokens are tightly clustered at the ends of the F2 range, while non-coronal tokens are broadly distributed throughout the center of the vowel space. In comparison, the fronting of /o/ is less strongly conditioned by onset place of articulation and its F2 has a correspondingly unimodal distribution (Labov et al., 2006, p. 55).

Fronting of both /u/ and /o/ is particularly advanced in the South, where the change has been observed among speakers born as early as the late 19th century (Stanley et al., 2021; Thomas, 2001). In addition to being among the most advanced regions of the United States in terms of overall fronting of /u/ and /o/, the South also exhibits some degree of fronting before /l/. Fronting before lateral codas is unusual due to the velarization associated with dark coda [ɫ], and is not observed in most other varieties. The South can be further distinguished from other regions by its relatively strong fronting after non-coronals (Labov et al., 2006), which is particularly evident in Charleston, South Carolina. Baranowski (2006, 2008) shows that both men and women exhibit a normalized F2 of post-coronal /u/ comparable to that of /i/ and /ɪ/, with all three vowels typically occupying the 2200–2400 Hz range. Even in non-coronal contexts, some Charlestonian women exhibit a mean F2 for /u/ as high as 2200 Hz. The F2 of /o/ may exceed 1800 Hz, approaching the F2 of /e/ and /ɛ/ (approximately 2000 Hz).

The emergence of back vowel fronting in California is more recent. Present-day California speech is characterized by the California Vowel Shift (CVS; Eckert, 2008; Hall-Lew, 2009; Podesva, 2011), which involves a counterclockwise rotation of the front and low vowels accompanied by the fronting of /u/, /ʊ/, and /o/.1 These changes have progressed throughout the late 20th and early 21st centuries, first being recognized as a distinct phenomenon in the 1980s. During the mid-20th century, Reed and Metcalf (1952) reported /u/ to be (minimally) fronted by a small number of speakers, while /o/ was not fronted at all. Hinton et al. (1987) identified fronting of /o/, /ʊ/, and /u/ among speakers from Northern California, with /o/-fronting found for younger (but not older) speakers. More recently, Kennedy and Grama (2012) reported a normalized F2 for /u/ around 1800 Hz, within the F2 ranges of /ɪ/, /ɛ/, and /æ/, with no significant differences between men and women. The F2 of /o/ was comparable to that of /ɑ/: both vowels exhibit an F2 between 1200 and 1400 Hz and are the two furthest back vowels in the system. In general, Labov et al. (2006) find the West, including most of California, to be conservative with respect to fronting of /o/. San Diego, one of the locations for this study, is one of the few cities in their sample with advanced fronting of /o/, with an F2 exceeding 1400 Hz.

The use of the term “fronted” to describe the quality of these vowels is potentially misleading, as it implies that the change is achieved through a forward repositioning of the tongue. Yet because an increase in the F2 of back vowels can be the result of any gesture that shortens the front cavity of the vocal tract, i.e., tongue fronting or lip unrounding (Stevens and House, 1955; Stevens et al., 1986), acoustic data alone do not reveal the strategies speakers employ to produce these vowels. Articulatory-acoustic mappings are known to be complex, non-linear, and to some extent speaker-specific (see, e.g., Blackwood Ximenes et al., 2017; Noiray et al., 2014). Direct observation of the articulatory strategies used to achieve fronting is therefore critical for understanding the phonetic and phonological mechanisms underlying this change. Previous articulatory study of fronted back vowels has, however, been limited to a handful of dialects.

Harrington et al. (2011) show that fronted /u/ in Southern Standard British English (SSBE) has not been unrounded: it is more similar to /ɔ/ than to /i/ with respect to lip rounding, but has a tongue position closer to /i/ and /ɪ/ than to /ɒ/ or /ɔ/. Apparent-time comparison of younger vs older speakers shows that, for both groups, /u/ exerts a similar degree of coarticulatory influence on the spectral peak of a preceding /s/, indicating that the degree of rounding on /u/ has not changed over time. Nevertheless, dialects may differ in terms of how fronted vowels are realized. In Scottish English, fronted /u/ has a lower tongue position and less strongly protruded lips than in Anglo or Irish varieties (Lawson et al., 2019; Scobbie et al., 2012). Lawson et al. note that fronting in Scotland is a much older change than in other English varieties, having been reported during the 1930s (McAllister, 1938) and perhaps having developed in Middle English as early as the 13th century. By contrast, /u/-fronting in SSBE began during the second half of the 20th century (Harrington et al., 2008; Hawkins and Midgley, 2005; Kleber et al., 2012; Wells, 1982).

Despite significant attention given to the social and geographic distribution of back vowel fronting in North American English (e.g., Eckert, 2008; Fought, 1999; Lee, 2016), instrumental articulatory data are sparse. Articulatory descriptions for these varieties are based mostly on inferences from acoustic measurements or from impressionistic observations. In their early description of the California Vowel Shift, Hinton et al. (1987, p. 119) write that the California back vowels “are clearly more front and less rounded,” a view later reiterated by Hagiwara (1997, p. 657), who describes them as “typically unrounded.” Likewise, Zsiga (2013, p. 432) describes Western American English as involving “fronting and unrounding of back vowels, so that ‘dude’ is [dɨd].”

On the other hand, Thomas (2001, p. 34) is “skeptical” that fronted /u/ is also unrounded. Eckert (2008) distinguishes between two types of vowel fronting in California, dubbed the “Surfer” and “Valley Girl” variants. She transcribes the Surfer variant as [y], i.e., [dyd] dude, also suggesting fronting without unrounding. The Valley Girl variant is diphthongized, and only partially unrounded: food is [fɪwd] and goes is [ɡɛwz]. Yet the vowel-internal dynamics of fronted back vowels remain understudied, as discussed in the following section.

De Jong (1995, p. 70) proposes that the articulation of /u/ varies on a regional basis: Southern California speakers exhibit “little or no rounding” on non-low back vowels, contrasting keek [kik] with kook [kɯk], while Midwest speakers “seem to be losing the backing contrast,” i.e., [kik] vs [kyk]. The latter strategy is supported by X-ray microbeam data for one speaker from St. Louis, suggesting that a collapse of tongue contrast may occur even in the context of surrounding velars, where fronting is not motivated by coarticulation. Either way, these and limited other articulatory data (e.g., Blackwood Ximenes et al., 2017) do not settle how fronted back vowels are articulated in North American varieties.

The aim of the present study is to examine the phonetic realization of /u, ʊ, o/ in two varieties of American English where fronting is known to be present. Data were collected in San Diego, CA, and Columbia, SC, as speakers from these regions potentially differ in the phonological conditioning of fronting or in the resulting acoustic and articulatory quality of /u, ʊ, o/. As noted previously, the South is among the most advanced regions with respect to the parallel fronting of /u/ and /o/, as well as in the fronting of these vowels even in non-coronal or lateral environments (Baranowski, 2008; Labov et al., 2006). California English shows similarly advanced fronting of /u/, which is often described as being unround (De Jong, 1995; Hagiwara, 1997; Hinton et al., 1987). Vowel fronting in California shows greater sensitivity to consonantal context, however, and is a more recent change, particularly for /o/. Note that while these regions differ in the recency of back vowel fronting as a diachronic sound change, the present study does not incorporate apparent-time analysis (cf. Harrington et al., 2011). The focus is instead on the synchronic phonetic quality of /u, ʊ, o/ after fronting has occurred.

An additional goal is to consider whether similar articulatory strategies are used to produce fronted vowels that differ in height and/or tenseness. Although /u/, /ʊ/, and /o/ often undergo fronting either in parallel or in sequence, the question of the phonetic and phonological relationship between such changes remains open. While these vowels share [+round] and [+back] features, they do not strictly form a natural class because the back round vowel /ɔ/ is typically excluded from fronting.2,3 Even among /u, ʊ, o/, however, fronting of one vowel does not entail fronting of the others. In SSBE, /ʊ/ undergoes fronting alongside /u/ (Kleber et al., 2012), but that is not the case for all British varieties (Gorman and Kirkham, 2020). For dialects in which multiple vowels are involved in fronting, the fronting of /u/ typically precedes that of /o/ or /ʊ/, as has been found in California (Kennedy and Grama, 2012) and SSBE (Kleber et al., 2012). Fronted /u/ furthermore differs from /o/ in the extent of the increase to F2, with /o/ seldom fronting beyond the center of the vowel space (Hall-Lew, 2009), in part because the overall F2 range is smaller for mid than for high vowels. In this study, acoustic, ultrasound, and lip video data are analyzed to determine whether fronted variants of /u, ʊ, o/ are produced with fronted tongue postures, unrounded lips, or a combination of the two.

A further aim of this study is to examine the time-varying acoustic and articulatory quality of fronted back vowels throughout their duration. Vowel-inherent spectral change, i.e., dynamic changes in vowel quality not attributable to phonological context (Nearey and Assmann, 1986), is not only known to vary between languages and dialects (Fox and Jacewicz, 2009), but is also a potential sociolinguistic variable (Farrington et al., 2018). With specific regard to fronted back vowels, Chládková et al. (2016) argue that diphthongization inherent to /u/ has facilitated its fronting by inhibiting perceptual merger with /i/. It is not fully known, however, how dynamic adjustments to tongue and lip position correspond to the acoustic trajectories of /u, ʊ, o/, or how such trajectories interact with coarticulatory effects in various phonological contexts. Differences in diphthongization and articulatory timing have potentially interesting implications for the phonological representation of fronted vowels. For instance, if it is found that the tongue remains front while F2 decreases during the offglide, this would suggest that speakers have lost the [+back] feature (or gesture) altogether and that /u/ and /i/ are differentiated by (change in) rounding alone. On the other hand, if lip rounding is stable throughout the vowel's duration, and change in F2 is achieved by backing the tongue, this would suggest that speakers have retained both [+back] and [+round] vowel qualities, but that the tongue and lip gestures have perhaps been temporally reorganized.

Formant trajectories are also expected to differ according to the coarticulatory strength of their flanking consonants, which is a function of those segments' own articulatory requirements (Recasens et al., 1997). There is further potential for the coarticulatory influence of onset vs coda consonants to vary given that anticipatory vs perseverative coarticulatory effects may be planned to differing extents (Recasens, 1989; Whalen, 1990). Due to the inherent physical constraints of the vocal tract, perseverative coarticulation is partly automatic, while anticipatory coarticulation is known to arise through speech planning (Whalen, 1990). The temporal extent of coarticulation varies on a speaker-specific and language-specific basis (Magen, 1997; Manuel, 1990), demonstrating that coarticulatory planning is a core component of the phonological grammar. The coarticulatory production patterns of individual speakers relate to their ability to perceptually compensate for coarticulation (Beddor et al., 2018), which is one of the key motivating factors for the phonologization of fronted back vowels (Harrington et al., 2008; Ohala, 1981; Sóskuthy et al., 2018).

Back vowel fronting is strongly linked to coronal consonants not only because they exert coarticulatory pressure on the tongue, but also because the lexical distribution of /u/ is disproportionately skewed toward coronal contexts (Harrington, 2007). Listeners who fail to perceptually compensate for this type of predictable variability, however, are observed to assume a more front target for the vowel category as a whole (Ohala, 1981), which in turn reduces the amount of between-context acoustic variability for /u/. Harrington et al. (2008) provide support for this model by demonstrating that younger speakers exhibit both fronter vowel targets and less contextual variability than older speakers, which corresponds to their front-shifted perceptual boundary between /i/ and /u/. With regard to parallel fronting in coronal and non-coronal contexts, Stanley et al. (2021) examine apparent-time change in the acoustic dynamics of back vowels in Southern American English. Along the lines of previous findings (Harrington et al., 2008; Kleber et al., 2012), they show that the progression of fronting as a diachronic change results in gradual approximation (but not convergence) of F2 values in coronal and non-coronal contexts. Intriguingly, fronting of /u/ in non-coronal contexts was not found to impact its degree of diphthongization. The shape of the formant trajectories remained stable even as the vowels underwent generational change to their overall position in the vowel space.

Strycharczuk and Scobbie (2017) and Gorman and Kirkham (2020) examine articulatory-acoustic trajectories for /u, ʊ/ in fronting (pre-coronal) vs non-fronting contexts (pre-lateral). They find covert variation in tongue position that is not clearly reflected in acoustics, as well as some between-dialect differences in articulatory trajectories. Yet most other articulatory studies of fronted vowels have controlled for contextual effects either by restricting analysis to a single context or by averaging across them. Given the theoretical considerations noted previously, there is strong motivation to explore the competing (or perhaps mutually enhancing) effects of both onset and coda consonants on the dynamic trajectories of /u, ʊ, o/. The current study therefore evaluates the influence of onset and coda consonants on fronted vowels through a time-varying analysis of tongue position, lip rounding, and acoustic quality throughout the vowel interval.

Participants were 23 adult native speakers of American English, born and raised in either South Carolina or coastal Southern California at least through age 18. Thirteen speakers (six men, seven women) were from Southern California and were between the ages of 18 and 34 years [mean 22.3, standard deviation (SD) 5.9]. Ten speakers (three men, seven women) were from South Carolina and were between the ages of 18 and 50 years (mean 27.8, SD 11.7).4 None of the participants from Southern California have lived outside the region. Four participants from South Carolina have spent time outside the state, for a period from one to four years during adulthood. Demographic information is provided in Table III (see Appendix A). Three additional speakers (not included in the counts above) took part in the experiment but were excluded from analysis due to data processing issues or equipment malfunction.

Prompts included 193 English words, mostly monosyllabic except for a small number of disyllabic words where lexical gaps exist. In disyllabic words, primary stress fell on the target vowel. The vowels included in the wordlist were /i u ɪ ʊ e o ɑ ɔ/, which include the three back vowels (/u o ʊ/) observed in previous studies to undergo fronting, their front unround counterparts (/i e ɪ/), and the low back vowels /ɑ/ and /ɔ/. Each vowel appeared in a variety of phonological contexts, with onsets including labial, coronal, and dorsal stops, as well as the fricatives /s/, /ʃ/, and /h/. Syllables were either open or contained a labial, coronal, dorsal, or lateral coda. Consonants were voiceless to the extent possible, but voiced consonants were used in some contexts to overcome lexical gaps. The full word list is provided in Table IV (see Appendix B).

Recording took place in sound-attenuated rooms at the University of South Carolina in Columbia, SC, and at the University of California, San Diego. Identical methods and equipment were used in both locations. Participants were asked to repeat the wordlist with each word spoken in three repetitions of the carrier phrase “say ______ again.” Words were presented in a unique pseudorandom order for each participant in Articulate Assistant Advanced (AAA; Articulate Instruments Ltd., 2012).

Midsagittal ultrasound data were captured using an Articulate Instruments Micro ultrasound system with a 20 mm radius 2–4 MHz transducer and a typical frame rate of 81.5 fps. During recording, participants were seated with the ultrasound transducer held in place beneath their chin with a stabilizing headset (Articulate Instruments Ltd., 2008). Front-view video of the speaker's lips was captured at 120 fps using a Sony DSC RX10 IV camera (Sony, Tokyo, Japan) at a resolution of 1920 × 1080 pixels. Side-view video was also recorded at 60 fps using a headset-mounted camera, used here to rotationally align the ultrasound images (described in the following section). Audio was captured with an AKG C544 L cardioid headset condenser microphone (AKG, San Fernando, CA) and simultaneously recorded (48 kHz, 16 bit) to a Marantz PMD661 Mk2 recorder (Marantz, Carlsbad, CA) and to AAA. Total duration of the recording session was approximately 35 min.

FAVE-align v1.2.2 (Rosenfelder et al., 2015) was used to generate a force-aligned first-pass phonetic transcription, which was then manually corrected. Dynamic LPC formant measurements were taken throughout the vowel duration using the Fast Track plugin for Praat (Barreda, 2021; Boersma and Weenink, 2023). Maximum formant ranges used for analysis were initially set to 4500–6500 Hz for men and 5000–7000 Hz for women. The accuracy of the formant tracking was manually checked for each speaker, and settings were adjusted when necessary. Formant measurements were z-score normalized with the Lobanov method (Lobanov, 1971) using the phonTools package for R (Barreda, 2015; R Core Team, 2018).
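As an illustration of the normalization step, the following is a minimal Python sketch of Lobanov z-scoring, in which each formant is centered and scaled within speaker across all of that speaker's vowel tokens. It is not the phonTools implementation used in the study. The function name `lobanov_normalize`, the token dictionaries, the formant values, and the use of the sample standard deviation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def lobanov_normalize(tokens, formants=("F1", "F2")):
    """Return copies of the tokens with per-speaker z-scored formants.

    Each token is a dict with a 'speaker' key and one key per formant (Hz).
    The sample standard deviation (n - 1) is an assumption of this sketch.
    """
    by_speaker = defaultdict(list)
    for tok in tokens:
        by_speaker[tok["speaker"]].append(tok)

    # Mean and SD of each formant, pooled over all vowels of one speaker.
    stats = {}
    for speaker, toks in by_speaker.items():
        for f in formants:
            values = [t[f] for t in toks]
            stats[(speaker, f)] = (mean(values), stdev(values))

    normalized = []
    for tok in tokens:
        new_tok = dict(tok)
        for f in formants:
            m, s = stats[(tok["speaker"], f)]
            new_tok[f + "_z"] = (tok[f] - m) / s
        normalized.append(new_tok)
    return normalized


# Hypothetical measurements for two speakers (values invented for illustration).
tokens = [
    {"speaker": "S1", "vowel": "i", "F1": 300.0, "F2": 2300.0},
    {"speaker": "S1", "vowel": "u", "F1": 500.0, "F2": 1700.0},
    {"speaker": "S1", "vowel": "ɑ", "F1": 700.0, "F2": 1100.0},
    {"speaker": "S2", "vowel": "i", "F1": 350.0, "F2": 2700.0},
    {"speaker": "S2", "vowel": "u", "F1": 400.0, "F2": 2000.0},
    {"speaker": "S2", "vowel": "ɑ", "F1": 850.0, "F2": 1300.0},
]
for tok in lobanov_normalize(tokens):
    print(tok["speaker"], tok["vowel"], round(tok["F2_z"], 2))
```

On this scale a value of zero marks the speaker's own F2 midpoint, which is why later comparisons can be stated relative to "the overall F2 midpoint at zero."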

Ultrasound data were analyzed in Articulate Assistant Advanced v220.02 (Articulate Instruments Ltd., 2012). Still ultrasound images were aligned to the audio recording using pulses generated by the Articulate Instruments PStretch unit, recorded in AAA alongside the speech signal. Tongue splines were automatically fit with DeepLabCut (Mathis et al., 2018; Nath et al., 2019) using the MobileNet1.0-based neural network implemented in AAA (Wrench and Balch-Tomes, 2022). Tongue coordinates were rotated to a common horizontal plane (Lawson et al., 2019; Scobbie et al., 2011) by visually estimating the orientation of the ultrasound probe and camera from the side-view lip video data. The side-view camera and ultrasound probe were attached to the same point on the ultrasound headset, so they are in a fixed spatial relationship to one another such that the pitch of the probe (relative to the speaker's head) is the same as the angle of roll for the camera. Further information regarding this procedure is provided in supplementary material, but note that the occlusal plane may also be directly imaged using a biteplate as described by Scobbie et al. (2011). Tongue backness was quantified as the location of the tongue body coordinate along the x-axis. The point used for analysis was the Body1 coordinate from the DeepLabCut tongue model, shown in Fig. 1. Models were also run using the Dorsum1 coordinate, which yielded nearly identical results. The x coordinate of this point was z-score normalized for each speaker across all vowels recorded.
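The rotation and normalization described above can be sketched as follows. This is a hedged illustration rather than the procedure implemented in AAA: the function names `rotate_points` and `zscore`, the rotation origin, the sign convention (positive angles rotate counterclockwise), the 12 degree angle, and the coordinate values are all hypothetical.

```python
import math
from statistics import mean, stdev


def rotate_points(points, angle_deg, origin=(0.0, 0.0)):
    """Rotate (x, y) points counterclockwise by angle_deg about origin."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    ox, oy = origin
    rotated = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        rotated.append((ox + dx * cos_t - dy * sin_t,
                        oy + dx * sin_t + dy * cos_t))
    return rotated


def zscore(values):
    """Z-score a list of values using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical Body1 coordinates (mm) for one speaker, one per vowel token.
body1 = [(42.0, 55.0), (47.5, 58.0), (51.0, 52.5), (45.0, 60.0)]

# Probe pitch estimated from the side-view video (assumed value).
probe_pitch_deg = 12.0

# Undo the probe pitch, then normalize the horizontal coordinate.
aligned = rotate_points(body1, -probe_pitch_deg)
backness_z = zscore([x for x, _ in aligned])
print([round(z, 2) for z in backness_z])
```

Whether larger x values correspond to a more anterior or more posterior tongue body depends on the orientation of the ultrasound image, so the direction of the backness scale has to be fixed separately.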

FIG. 1.

(Color online) Sample ultrasound frame with coordinates from DeepLabCut.

Lip video was synchronized by aligning transient acoustic landmarks (e.g., stop bursts) in the continuously-recorded audio signals from the PMD661 recorder and the internal microphone of the RX10 camera. Alignment was verified several times throughout the recording, confirming that there was no detectable drift between the two signals. Lip movement was analyzed using a ResNet50-based neural network (He et al., 2016; Insafutdinov et al., 2016), also trained in DeepLabCut (version 2.3.2). Twenty frames from each speaker in the dataset were manually labeled to train the network. Points labeled were the left and right oral commissures, the upper and lower lips at the midsagittal plane along the oral mucosa, and four stable points on the ultrasound headset (camera mounting screws, diagram shown in the supplementary material). The network was trained for 800 000 iterations, after which outlier frames were extracted using the “jump” algorithm. Twenty outlier frames were labeled for each speaker and added to the training set, which was then used to train a new network for another 800 000 iterations.
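The synchronization above relied on matching transient landmarks between the two audio tracks. One way to express the same idea computationally, offered here only as a sketch and not as the procedure used in the study, is to estimate the offset between two short audio excerpts by brute-force cross-correlation. The function name `best_lag` and the toy signals are hypothetical.

```python
def best_lag(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means events occur `lag` samples later in `other`
    than in `ref`. Cross-correlation is a stand-in for manual landmark
    alignment in this sketch.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best, best_score = lag, score
    return best


# Toy signals: a burst-like transient at sample 5 in one track, sample 8 in the other.
recorder = [0.0] * 20
camera = [0.0] * 20
recorder[5], recorder[6] = 1.0, 0.5
camera[8], camera[9] = 1.0, 0.5

print(best_lag(recorder, camera, max_lag=5))
```

Repeating such an estimate on excerpts from the beginning and end of a session would give one possible check for drift: if the two lags agree, the clocks have not diverged measurably over the recording.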

Vertical lip openness and horizontal lip spread were calculated as the Euclidean distance between the upper and lower lip points and between the left and right oral commissures, respectively. These values were used to calculate the area of an ellipse approximating lip aperture using the formula A = π × (openness / 2) × (spread / 2). Because the distance from the speaker to the external RX10 camera was not fixed, measurements were scaled relative to the distance between stable points on the headset. Lip aperture measurements were z-score normalized to account for individual speaker differences.
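The aperture calculation can be written out as a short Python sketch. The function name `lip_aperture`, the pixel coordinates, and the choice to divide each distance by a single headset reference distance before computing the area are assumptions; the text specifies only that measurements were scaled relative to stable headset points.

```python
import math


def lip_aperture(upper, lower, left, right, ref_a, ref_b):
    """Ellipse area approximating lip aperture, in headset-relative units.

    upper/lower: midsagittal lip points; left/right: oral commissures;
    ref_a/ref_b: two stable headset points used as a scale reference.
    All arguments are (x, y) pixel coordinates.
    """
    scale = math.dist(ref_a, ref_b)           # headset reference distance
    openness = math.dist(upper, lower) / scale
    spread = math.dist(left, right) / scale
    return math.pi * (openness / 2) * (spread / 2)


# Hypothetical pixel coordinates from one video frame.
area = lip_aperture(
    upper=(960.0, 500.0), lower=(960.0, 540.0),
    left=(900.0, 520.0), right=(1020.0, 520.0),
    ref_a=(700.0, 300.0), ref_b=(1220.0, 300.0),
)
print(round(area, 4))
```

Because both axes are divided by the same reference distance, a speaker moving toward or away from the camera rescales every distance equally and leaves the computed area unchanged.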

Normalized mean formant measurements for all speakers are presented in Fig. 2. This plot includes group and individual mean formant measurements for all vowels represented in the wordlist, with measurements in non-lateral environments taken during the vowel nucleus, which was manually identified. Measurements for pre-lateral /ul/ and /ol/, taken during the vowel/lateral transition, are also provided as a reference for the high back periphery of the vowel space.

FIG. 2.

(Color online) Mean Lobanov-normalized formant measurements for all speakers. Except for /ul/ and /ol/, measurements were taken during the steady state portion of the vowel nucleus in non-lateral environments. For /ul/ and /ol/, measurements were taken during the steady state portion of the vowel offglide in pre-lateral tokens. Individual points indicate vowel category means for each speaker; cross-marks indicate group means. Ellipses indicate 95% confidence intervals.

The two dialects are rather similar with respect to the overall degree of back vowel fronting. In both dialects, /u/ occupies the entire high central region of the vowel space, with a mean F2 slightly exceeding the overall F2 midpoint at zero. As can be observed from the wide confidence interval ellipse, the F2 for /u/ is highly variable in both dialects. This variability is largely attributable to the phonological context, as illustrated in Fig. 3, which provides kernel density estimates for F2 in each vowel, onset, and dialect combination. Following coronal consonants, including /t, d, s, ʃ/, the F2 for /u/ is exceptionally high, frequently approaching 1.5 z, encroaching on the F2 distribution for /i/. Following non-coronal consonants, the F2 for /u/ is concentrated around the center of the vowel space. Although lower than the post-coronal contexts, this figure indicates that /u/ has fronted even in the absence of coarticulatory bias from an anterior tongue position. In pre-lateral contexts, the nucleus for /u/ shows some influence from the preceding consonant and has a higher F2 following coronal vs non-coronal onsets. Nevertheless, even after coronals, acoustic fronting of /u/ before laterals is suppressed and the F2 of /u/ remains relatively low.

FIG. 3.

(Color online) Kernel density estimates of F2 distribution by onset and dialect for /u, ʊ, o/ and pre-lateral /ul, ol/. Measurements from steady state portion of vowel nucleus.

Regarding the fronting of /o/ and /ʊ/, Fig. 3 indicates that the two dialects differ somewhat in that the mean F2 for /o/ among South Carolina speakers is higher than that of /ʊ/, while the opposite is true for California speakers. Inspection of Fig. 2 suggests this pattern is partly driven by a handful of South Carolina speakers for whom the nucleus of /o/ is fronted beyond the centerline of the vowel space. Nevertheless, both /o/ and /ʊ/ are fronted to an acoustically centralized position in both dialects. Examining the distribution of these vowels in Fig. 3 reveals that the two dialects also differ with regard to the strength of fronting following various onset consonants. Among South Carolina speakers, the F2 of /ʊ/ is clustered around the midpoint of the vowel space in all environments except following labials. Similarly, South Carolina speakers also show a greater proportion of fronted /o/ tokens in non-coronal contexts.

Previous work has observed that F2 of /u/ in non-lateral contexts is highly sensitive to the influence of the preceding consonant, with the greatest acoustic fronting observed after coronal onsets and only moderate fronting following non-coronal onsets. Fronting of /o/ is likewise affected by the preceding consonant, but shows less fronting than /u/ even after coronals, perhaps in part because /o/ is more limited in how much it can be fronted before encroaching on the mid front vowels. The relative strength of phonological context has been suggested to differ between dialects, with previous acoustic studies showing that fronting of /u/ and /o/ in the South is relatively strong in all phonological contexts, with some speakers showing fronting of /u/ even before laterals.

Between-dialect differences in phonological conditioning were quantified using linear mixed effects regression models fit to the normalized F2 for each vowel. Models were built using the lme4 package for R, with p-values for each term calculated by lmerTest (Bates et al., 2015; Kuznetsova et al., 2017; R Core Team, 2018). The fixed effects were onset and dialect (both sum coded), as well as their interaction. Random effects were random intercepts for both the speaker and word. Model comparison was performed by fitting additional models with either onset class, dialect, or the interaction term dropped from the model. For vowels in non-lateral contexts, models were also fit with an additional fixed effect of coda (sum coded). For the tense vowels, the inclusion of this term did not significantly improve model fit [/u/: χ²(3) = 3.2, p = 0.359; /o/: χ²(4) = 3.3, p = 0.511], suggesting that coda consonants (other than /l/) do not significantly influence the F2 of /u, o/ at their nuclei. The inclusion of this term did, however, improve model fit for the lax vowel /ʊ/ [χ²(3) = 15, p = 0.001]. Summaries of the best fit model for each vowel are provided in Table I.
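The χ² statistics reported for the model comparisons are consistent with likelihood-ratio tests between nested models, which is assumed in the sketch below. The statistic is twice the difference in log-likelihood, referred to a χ² distribution whose degrees of freedom equal the number of parameters dropped. The functions `chi2_sf` and `likelihood_ratio_test` and the log-likelihood values are hypothetical and do not reproduce the fitted lme4 models.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: start from the df = 1 tail and add the recurrence terms.
    tail = math.erfc(math.sqrt(half))
    series = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                 for k in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * series


def likelihood_ratio_test(loglik_full, loglik_reduced, df_diff):
    """Return (chi-square statistic, p-value) for two nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods for a model with and without a coda term.
stat, p = likelihood_ratio_test(loglik_full=-1000.0,
                                loglik_reduced=-1001.65,
                                df_diff=4)
print(round(stat, 2), round(p, 3))
```

Small discrepancies between p-values recomputed this way and those reported in the text are expected, since the reported statistics are rounded.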

TABLE I.

Summaries for best fit linear mixed effects regression models for static acoustic measurements. Cells give β (SE), followed by the t-value. *p < 0.05; **p < 0.01; ***p < 0.001.

| Term | /u/ | /ʊ/ | /o/ | /ul/ | /ol/ |
|---|---|---|---|---|---|
| (Intercept) | 0.246 (0.055), t = 4.45*** | −0.091 (0.102), t = −0.89 | −0.268 (0.046), t = −5.85*** | −1.12 (0.25), t = −4.40*** | −1.20 (0.21), t = −5.66*** |
| Onset | | | | | |
| /p, b/ | −0.663 (0.066), t = −9.99*** | −0.36 (0.11), t = −3.43*** | −0.332 (0.062), t = −5.33*** | −0.52 (0.45), t = −1.15 | −0.36 (0.48), t = −0.74 |
| /k, g/ | −0.324 (0.066), t = −4.89*** | 0.041 (0.098), t = 0.42 | −0.060 (0.068), t = −0.89 | −0.43 (0.45), t = −0.95 | −0.26 (0.48), t = −0.54 |
| /t, d/ | 0.562 (0.066), t = 8.48*** | 0.15 (0.13), t = 1.16 | 0.151 (0.068), t = 2.22* | 0.29 (0.36), t = 0.80 | 0.24 (0.37), t = 0.64 |
| /s, z/ | 0.412 (0.074), t = 5.56*** | 0.064 (0.106), t = 0.60 | 0.214 (0.068), t = 3.16** | | 0.30 (0.48), t = 0.62 |
| /ʃ/ | 0.653 (0.088), t = 7.42*** | 0.219 (0.098), t = 2.24* | 0.42 (0.09), t = 4.63*** | | 0.45 (0.48), t = 0.94 |
| Dialect | | | | | |
| South Carolina | 0.047 (0.046), t = 1.04 | 0.0081 (0.0295), t = 0.28 | 0.089 (0.033), t = 2.70** | | 0.065 (0.039), t = 1.68 |
| Onset × Dialect | | | | | |
| /p, b/ × SC | 0.0034 (0.0185), t = 0.18 | 0.036 (0.014), t = 2.55* | 0.050 (0.011), t = 4.49*** | | −0.0094 (0.0264), t = −0.35 |
| /k, g/ × SC | 0.010 (0.019), t = 0.55 | −0.012 (0.017), t = −0.71 | 0.055 (0.012), t = 4.50*** | | −0.072 (0.025), t = −2.86** |
| /t, d/ × SC | −0.023 (0.018), t = −1.27 | 0.030 (0.021), t = 1.40 | −0.016 (0.012), t = −1.25 | | 0.033 (0.020), t = 1.68 |
| /s, z/ × SC | −0.07 (0.02), t = −3.44*** | −0.061 (0.017), t = −3.69*** | −0.052 (0.012), t = −4.19*** | | 0.017 (0.026), t = 0.65 |
| /ʃ/ × SC | 0.0014 (0.0246), t = 0.06 | −0.048 (0.016), t = −3.02** | −0.113 (0.016), t = −7.00*** | | 0.065 (0.025), t = 2.57* |
| Coda | | | | | |
| /k/ | | −0.23 (0.11), t = −2.06* | | | |
| /p/ | | −0.055 (0.215), t = −0.25 | | | |
| /t/ | | 0.048 (0.179), t = 0.27 | | | |

A pairwise post hoc test of onset by dialect was performed for each model using the emmeans package for R (Lenth, 2024; R Core Team, 2018). For /u/, the two dialects differ from one another only following /h/, such that South Carolina speakers have a significantly higher F2 [estimated difference (est. diff) = 0.13, p < 0.05]. Inspection of Fig. 3 indicates that the distribution of /u/ tokens in this context is bimodal, reflecting interspeaker variation among South Carolina speakers. For /ʊ/, pairwise comparisons were not significant for any onset context. For the pre-lateral vowel /ul/, dropping of terms related to dialect did not worsen model fit [χ²(4) = 7, p = 0.134], indicating that the two dialects do not differ in the realization of this vowel.

The model for /o/, on the other hand, indicates that South Carolina speakers produce this vowel with a significantly higher F2 than California speakers following non-coronal onsets [est. diff = /p, b/: 0.14, p < 0.001; /k, g/: 0.14, p < 0.001; /h/: 0.16, p < 0.001] and following /t, d/ (0.073, p < 0.05). This finding is consistent with previous work, which has found that /o/-fronting in the South is less sensitive to phonological context than in other regions of North America (Labov et al., 2006). For pre-lateral /ol/, South Carolina speakers show a significantly higher F2 following the coronal onsets /t, d/ (0.098, p < 0.05) and /ʃ/ (0.13, p < 0.01), also consistent with previous work (Baranowski, 2008).
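The estimated differences reported for /o/ can be related back to the fixed-effect estimates in Table I. The Python sketch below is a rough reconstruction under two assumptions that the text does not state explicitly: (1) the onset class missing from Table I is /h/, so that under sum coding its interaction term is the negative sum of the listed terms, and (2) the dialect term is scaled so that the dialect estimate plus the interaction estimate gives the South Carolina minus California difference. The function name `dialect_differences` is hypothetical, and the post hoc values in the text come from emmeans, so agreement is expected only up to rounding.

```python
# Fixed-effect estimates for /o/ copied from Table I.
DIALECT_O = 0.089  # South Carolina
INTERACTION_O = {
    "/p, b/": 0.050,
    "/k, g/": 0.055,
    "/t, d/": -0.016,
    "/s, z/": -0.052,
    "/ʃ/": -0.113,
}

def dialect_differences(dialect_beta, interaction_betas, omitted="/h/"):
    """Approximate between-dialect F2 difference for each onset class.

    Assumption: onset is sum coded, so the omitted level's interaction term
    equals minus the sum of the listed interaction terms.
    Assumption: dialect_beta + interaction is the between-dialect difference.
    """
    terms = dict(interaction_betas)
    terms[omitted] = -sum(interaction_betas.values())
    return {onset: dialect_beta + beta for onset, beta in terms.items()}

diffs_o = dialect_differences(DIALECT_O, INTERACTION_O)
for onset, diff in diffs_o.items():
    print(f"{onset}: {diff:+.3f}")
```

Under these assumptions, the reconstructed values for /p, b/, /k, g/, and /t, d/ round to the reported 0.14, 0.14, and 0.073, and the implied /h/ value lies close to the reported 0.16.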

Dynamic acoustic and articulatory data were examined in order to assess contextual variation in vowel quality throughout the vowel duration, as well as to examine how positions of the tongue and lips contribute to the acoustic trajectory. Generalized additive mixed models (GAMMs; Wood, 2017) were used to model the non-linear trajectories of F2 frequency, lip aperture, and location of the tongue body along the x-axis. Models were fit with both a parametric term and smoothing term capturing the four-way interaction of dialect, vowel category, onset place of articulation, and coda place of articulation, allowing the trajectory in each context to receive an independent best fit curve. Random smooths for speaker and word were also included, using the random reference-difference smooth approach proposed by Sóskuthy (2021). This approach includes fitting both a random reference smooth for speaker (without the grouping variable of vowel * onset * coda) and a difference smooth (with the grouping variable), in order to avoid an overconservative fit that assigns too much variance to the random smooths. An autoregressive error term was also included in the model in order to capture the relationship between successive measurements within the same token. The degree of autocorrelation was calculated by fitting a model without the AR term and using the function itsadug::acf_resid() (van Rij et al., 2022) to calculate the appropriate value for rho.
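The quantity used for rho is, in essence, the lag-1 autocorrelation of the residuals from the model fit without the AR term. The following Python sketch illustrates that quantity on made-up residuals; it is not the itsadug implementation, whose exact handling of token boundaries and averaging may differ, and the function name `lag1_autocorrelation` is hypothetical.

```python
def lag1_autocorrelation(tokens):
    """Pooled lag-1 autocorrelation of residuals, computed within tokens only.

    `tokens` is a list of residual sequences, one per vowel token, so that the
    last sample of one token is never paired with the first sample of the next.
    """
    values = [r for token in tokens for r in token]
    mean = sum(values) / len(values)
    denom = sum((r - mean) ** 2 for r in values)
    numer = sum(
        (a - mean) * (b - mean)
        for token in tokens
        for a, b in zip(token, token[1:])
    )
    return numer / denom

# Hypothetical residuals for two tokens (not taken from the study's data).
toy_tokens = [
    [0.4, 0.2, 0.0, -0.2, -0.4],
    [-0.4, -0.2, 0.0, 0.2, 0.4],
]
rho = lag1_autocorrelation(toy_tokens)
print(f"rho = {rho:.2f}")
```

Smoothly drifting residuals such as these yield a positive rho, which is the situation an AR error term is meant to absorb; residuals that alternate in sign from sample to sample would yield a negative value instead.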

1. Acoustic trajectories

Figure 4 (upper row) shows the fitted F2 trajectories for /u/, /o/, and /ʊ/ in six phonological contexts, including coronal (T_) vs non-coronal (K_) onsets with coronal (_T) vs non-coronal (_K) vs lateral (_L) codas. Inspection of the 95% confidence intervals around the trajectories indicates that, in most cases, the two dialects do not differ significantly from one another in either acoustics or articulation, although marginal differences can be observed for /o/ (e.g., in T_T contexts). The following discussion will therefore not distinguish between the two dialects. In general, the F2 trajectories for both /u/ and /o/ show a decrease in F2 throughout the vowel duration. In all contexts, the F2 for these vowels is higher at the vowel onset than at the vowel offset, although the magnitude and timing of this change varies greatly according to context. By comparison, the trajectory for /i/ is relatively stable in all non-lateral contexts, with consistently high F2 throughout the vowel.

FIG. 4.

(Color online) Predicted F2, tongue body, and lip aperture trajectories for /i, u, ʊ, o/ by dialect and phonological context. T, coronal; K, noncoronal; L, lateral. Shading indicates 95% confidence interval. For lip aperture, smaller values indicate increased lip rounding. For tongue body, smaller values indicate a more posterior tongue position.

For /u/, F2 displacement is greatest in words with coronal onsets and non-coronal, non-lateral codas (e.g., too, duke, dupe). In this environment, F2 at the vowel onset is solidly situated in the front half of the vowel space, with a mean of approximately 1 z. Comparing this value to the static acoustic measurements indicates that this is similar to the mean F2 for the nucleus of /e/. Beyond 25% of vowel duration, F2 descends past the center of the vowel space, with an onset-offset difference of nearly 2 z. /o/ follows a similar trajectory in this environment, but has a smaller overall displacement from onset to offset, with a lower F2 than /u/ throughout the vowel.

In words with coronal codas (e.g., boot, toot), /u/ is nearly monophthongal, with the smallest F2 decrease occurring in words with non-coronal onsets and coronal codas. In that environment, F2 remains stable around the center of the vowel space. In the other non-lateral environments, i.e., words in which both the onset and coda are coronal (T_T) or non-coronal (K_K), F2 shows a moderately negative slope throughout the vowel. The overall location of the F2 trajectory within the vowel space differs between these environments, however. In fully coronal contexts (e.g., dude), F2 remains in the front half of the vowel space throughout, with the minimum F2 at vowel offset just higher than the center of the vowel space. In fully non-coronal contexts (e.g., coop), F2 remains in the back half of the vowel space, with a maximum F2 around 0 z and a minimum F2 in the high back quadrant, around −1 z.

While the F2 for /o/ is consistently lower than that of /u/, the shape of the trajectories for the two vowels is broadly similar. /o/ shows a distinct trajectory from /u/ in words like tote and coat (T_T, K_T), however, in that minimum F2 occurs around 65% of vowel duration before rising again ahead of the coronal coda. In all other contexts, the F2 minimum for both /u/ and /o/ occurs at the vowel offset.

In contrast to /u/ and /o/, the F2 for /ʊ/ is relatively monophthongal and exhibits a stable value around the center of the vowel space. This difference is most apparent in words with non-coronal codas (e.g., took, hook), where /u/ and /o/ showed the greatest F2 displacement. There is nevertheless some influence of consonantal context for /ʊ/, such that the F2 is slightly higher immediately following coronal onsets and immediately preceding coronal codas. In the non-coronal context (K_K), the F2 trajectory is basically flat, remaining just to the back of the vowel space center for its entire duration.

The normalized F2 for pre-lateral /u/ and /o/ also does not exceed 0 z. In words with coronal onsets like duel, there is evidence of coarticulatory influence from the preceding consonant, with a mildly raised F2 at the vowel onset. However, F2 decreases relatively rapidly during the first 25% of the vowel duration. Unlike the T_K context, in which F2 is higher than K_K throughout the vowel, the F2 for T_L reaches a comparable minimum to that of K_L. Acoustic trajectories in K_L contexts (cool, pool) are effectively level. F2 trajectories for /o/ and /u/ show almost complete overlap of their confidence intervals in lateral contexts, in contrast to other environments where the F2 of /u/ is higher. The influence of lateral codas can also be observed in the F2 for /i/, which is high only during the first half of the vowel and decreases thereafter.

2. Articulatory trajectories

Independent contributions of the tongue and lips to vowel F2 can be assessed through inspection of the GAMM smooths for tongue body backness and lip aperture, given in the second and third rows of Fig. 4. The tongue body measure tracks the horizontal location of the back of the tongue along the x-axis, where higher values indicate a more front tongue position. For lip aperture, higher values correspond to lip opening and/or spreading, while lower values correspond to lip protrusion and/or closure. Thus, higher values in either measure indicate an articulatory configuration more similar to the high front vowel /i/.

In the T_T environment, it can be observed that the tongue positions for /i/ and /u/ are nearly identical throughout the vowel. This result indicates that the tongue position for /u/ is high and front and is thus articulatorily more similar to [y] than to [ʉ]. The trajectory of F2 in this environment does not overlap with /i/, however, and shows a small but measurable decrease through the vowel duration. Inspection of the corresponding lip trajectory indicates that acoustic distance between /i/ and /u/ in coronal environments can be almost entirely attributed to lip rounding. Lip aperture for /u/ differs significantly from /i/ throughout the vowel and does not support the assertion that acoustically fronted /u/ is unround. A relatively small decrease in lip aperture can be observed toward the vowel offset, consistent with the small acoustic change observed in F2.

In the environments T_K and K_T, the tongue position for /u/ is also strongly fronted, but to a lesser extent than in fully coronal environments. To illustrate this difference, midsagittal tongue contours for one representative speaker from Southern California are provided in Fig. 5. These contours were extracted at the vowel midpoint and modeled using polar SSANOVA (Mielke, 2015). These contours show that for /u/, environments with a coronal consonant in either position (onset or coda) have a more fronted tongue position. Thus, the difference between /i/ and /u/ is greatest in K_L and smallest in T_T, with an overall ranking of T_T < T_K  K_T < K_K < T_L < K_L.

FIG. 5.

(Color online) Representative SSANOVA tongue contours for Speaker Cal013. Tongue front is to the left. 95% confidence intervals are narrower than the plotted splines and therefore not included.

However, these environments differ with respect to the dynamic trajectories of tongue position. In T_K words, /i/ and /u/ show overlap in tongue position at the onset of the vowel. The distance between /i/ and /u/ increases throughout most of the vowel duration, indicating retraction of the tongue body for /u/. The point of maximum tongue body retraction for /u/ occurs close to the vowel offset, with the greatest distance from /i/ at approximately 75% of vowel duration. In contrast, /i/ and /u/ exhibit a more consistent tongue body difference in the K_T environment, with the maximum distance between them around the vowel midpoint. In words with non-coronal consonants in both onset and coda position (K_K), the tongue position for /u/ is most distinct from /i/, but is still relatively fronted and temporally stable. In pre-lateral contexts, the tongue body for /u/ is relatively retracted throughout the vowel duration, reaching the maximum difference from /i/ near the midpoint of the vowel. /i/ shows a similar rate of retraction during the second half of the vowel, consistent with the F2 trajectories.

/o/ and /ʊ/ have a consistently lower and more retracted tongue position than /u/. The tongue position for these two vowels is similar in all environments and the tongue body trajectories in Fig. 4 are nearly identical, with fully overlapping confidence intervals. The tongue splines in Fig. 5 demonstrate that in environments with non-coronal codas, /o/ and /ʊ/ show rather similar tongue shapes. Given this overlap in both the static F1 × F2 measurements and in tongue position, contrast between these vowels is apparently maintained mostly through differences in duration and/or lip trajectory (with corresponding change in F2). The differences in lip rounding and in F2 are confirmed in Fig. 4. Both the lip and F2 trajectories for /ʊ/ are stable throughout the vowel, while /o/ shows a gradual increase in rounding similar to /u/. Although /ʊ/ and /o/ are acoustically and articulatorily similar at the vowel onset, the two vowels increasingly diverge toward the vowel offset, with /o/ showing a decrease in F2 driven by an increase in lip rounding.

With respect to the rounding of /ʊ/, the confidence intervals for /i/ vs /ʊ/ show marginal overlap in Fig. 4. Although this vowel is clearly less round than /u/, note that such overlaps do not strictly indicate non-significance (Sóskuthy, 2021). Examination of the estimated difference curves in Fig. 6 (extracted with itsadug::get_difference) shows that /ʊ/ has a significantly smaller lip aperture than /i/ in all contexts, at least at the vowel onset. In some cases, the two vowels do not differ significantly at the vowel offset, although this can be attributed to a decrease in lip spread for /i/, given that the lip trajectory for /ʊ/ is level. These curves further confirm that the lip aperture for /ʊ/ is intermediate between that of /i/ and /u/.
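The criterion applied to the difference curves (a difference counts as significant wherever its 95% confidence interval excludes zero) can be sketched as follows. The values are invented for illustration and are not taken from Fig. 6; the function name `significant_spans` is hypothetical, and this is not the itsadug implementation.

```python
def significant_spans(times, diff, ci_half_width):
    """Return (start, end) time spans where the CI around diff excludes zero."""
    spans = []
    start = None
    prev = None
    for t, d, h in zip(times, diff, ci_half_width):
        significant = (d - h > 0) or (d + h < 0)
        if significant and start is None:
            start = t  # a significant stretch begins here
        elif not significant and start is not None:
            spans.append((start, prev))  # the stretch ended at the previous sample
            start = None
        prev = t
    if start is not None:
        spans.append((start, prev))
    return spans

# Hypothetical lip aperture difference (/i/ minus /ʊ/) at five points in the vowel.
times = [0, 25, 50, 75, 100]            # percent of vowel duration
diff = [0.8, 0.7, 0.5, 0.3, 0.1]        # estimated difference
half = [0.2, 0.2, 0.2, 0.25, 0.3]       # half-width of the 95% CI
print(significant_spans(times, diff, half))
```

In this toy case the difference is significant from the vowel onset through 75% of its duration and not at the offset, which mirrors the kind of pattern described above for /i/ vs /ʊ/ in some contexts.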

FIG. 6.

(Color online) Predicted pairwise differences in lip aperture for /i/ vs /ʊ/ and /u/ vs /ʊ/. The difference is significant when the 95% confidence interval excludes zero.

3. Correlation of articulatory and acoustic measures

As further confirmation of these results, a direct comparison between the articulatory and acoustic measurements was performed. Figure 7 provides F2 measurements plotted against tongue body and lip aperture measurements. Each point represents mean values calculated on a per-speaker, per-context basis, sampled at 10% intervals throughout the vowel's duration. To test the relative effect of tongue position and lip aperture on F2 in each coda context, a linear mixed effects regression model was fit for each vowel (/u, ʊ, o/, using raw values rather than the mean values shown in Fig. 7). The response variable was normalized F2, with fixed effects of normalized tongue body backness and normalized lip aperture, each interacting with coda type (sum coded for coronal, non-coronal, lateral). Random effects were random intercepts for speaker and word as well as random slopes for tongue and lip by speaker. Results of the models are provided in Table II, while estimated marginal slopes with 95% confidence intervals (calculated using marginaleffects::avg_slopes; Arel-Bundock, 2024) are shown in Fig. 7.

FIG. 7.

(Color online) Relationship of F2 to tongue backness and lip rounding. Points represent mean values calculated on a per-speaker, per-environment basis at 10% intervals throughout the vowel's duration. Slopes with 95% confidence intervals indicate estimated marginal effects computed from the linear mixed effects model for each vowel.

TABLE II.

Mixed effects linear regression model summaries for F2 by tongue backness and lip aperture before non-coronal vs coronal vs lateral codas. Cells give β (SE), followed by the t-value. LA, lip aperture; TB, tongue backness.

| Term | /u/ | /ʊ/ | /o/ |
|---|---|---|---|
| (Intercept) | −0.153 (0.065), t = −2.34* | −0.132 (0.048), t = −2.75** | −0.39 (0.04), t = −9.81*** |
| Coda | | | |
| Non-coronal | 0.093 (0.063), t = 1.49 | −0.098 (0.041), t = −2.37* | 0.100 (0.021), t = 4.70*** |
| Coronal | 0.329 (0.076), t = 4.35*** | | 0.168 (0.026), t = 6.36*** |
| Articulator | | | |
| Lip aperture | 0.381 (0.039), t = 9.75*** | 0.098 (0.021), t = 4.71*** | 0.292 (0.029), t = 10.18*** |
| Tongue backness | 0.549 (0.032), t = 17.32*** | 0.290 (0.031), t = 9.45*** | 0.419 (0.028), t = 14.96*** |
| Coda × Articulator | | | |
| Non-coronal × LA | 0.096 (0.011), t = 8.68*** | −0.0046 (0.0055), t = −0.84 | 0.0717 (0.0057), t = 12.66*** |
| Coronal × LA | −0.024 (0.015), t = −1.63 | | −0.014 (0.007), t = −2.08* |
| Non-coronal × TB | 0.105 (0.011), t = 9.51*** | −0.0585 (0.0078), t = −7.50*** | 0.0103 (0.0073), t = 1.41 |
| Coronal × TB | −0.072 (0.015), t = −4.76*** | | −0.0601 (0.0093), t = −6.48*** |

Several patterns are apparent. For /o/, the average marginal effects are significant for both lip aperture (β = 0.31, standard error (SE) = 0.03, z = 10.98, p < 0.001) and tongue backness (β = 0.43, SE = 0.03, z = 15.22, p < 0.001), indicating that both articulators significantly influence the acoustic output. The estimated slope for tongue backness is greater than that of lip rounding, suggesting that variation in F2 is, in general, more strongly associated with differences in tongue backness than in lip rounding. However, the two articulators have rather different effects over time depending on the coda context. A pairwise comparison of the slopes in each context was performed using ggeffects::hypothesis_test. The slope for lip rounding is significantly stronger in non-coronal than in coronal contexts (Δ = 0.08, SE = 0.01, z = 9.02, p < 0.001), and weakest in lateral contexts (Δ = 0.13, SE = 0.01, z = 10.27, p < 0.001). In non-lateral contexts, there is a clear association between F2, lip aperture, and time, as was suggested by the GAMM smooths. Measurements taken earlier in the vowel tend to have both a larger lip aperture (i.e., less rounding) and a higher F2, while measurements taken later in the vowel have both a smaller lip aperture and lower F2. In lateral contexts, on the other hand, the temporal gradient of lip aperture is relatively more vertical, showing that as F2 decreases throughout the vowel duration, there is no (or little) corresponding change in lip aperture. Instead, the F2 of pre-lateral /ol/ is more strongly influenced by tongue body position, as expected given the velarizing coarticulatory pressure of coda [ɫ].

The effect of tongue backness is strongest in lateral contexts, being slightly weaker in non-coronal contexts (Δ = 0.04, SE = 0.01, z = 3.25, p = 0.003) and much weaker in coronal contexts (Δ = 0.12, SE = 0.02, z = 7.11, p < 0.001). Examining tongue body position for pre-coronal /o/ over time shows that the tongue tends to remain relatively front even at the vowel offset. Recall from the GAMM predictions that tongue body position for /o/ showed a positive trajectory in K_T contexts. Given the expected direction of offgliding for /o/ (realized as [ɵʊ]), it appears that change in tongue position during /o/ is primarily in the vertical dimension, with a secondary effect of forward tongue body movement as the tongue raises toward [ʊ]. This forward/upward movement runs counter to the observed changes in F2; the more front tongue position would be expected to increase F2, whereas low-F2 measurements tend to be concentrated later in the vowel.

For /ʊ/, the overall effect of lip aperture (β = 0.10, SE = 0.02, z = 4.68, p < 0.001) is much weaker than that of tongue backness (β = 0.28, SE = 0.03, z = 9.27, p < 0.001), and approaches zero in non-coronal contexts. Lip aperture for /ʊ/ thus has little effect on its F2, which is more strongly determined by tongue position. Neither articulator has a strong association with F2 over time, although there is a slight tendency toward tongue fronting in advance of coronal codas. This result suggests that F2 for /ʊ/ varies predominantly on a between-token rather than within-token basis, which is consistent with its highly stable GAMM trajectories.

For /u/, the effects of lip rounding and tongue position vary greatly across all three coda contexts. Tongue backness has a significantly stronger association with F2 in non-coronal than in pre-coronal contexts (Δ = 0.18, SE = 0.02, z = 7.63, p < 0.001), and also exhibits clear variance over time. The effect of lip aperture on F2 is also strongest in non-coronal contexts (K–T: p < 0.001; K–L: p < 0.001). This finding suggests that the highly dynamic F2 trajectories in T_K and K_K contexts are not simply due to the lack of anticipatory coarticulation from coronal codas, but that increasing lip rounding also contributes to vowel-internal F2 change. The opposite is true for pre-coronal contexts, where neither tongue backness nor lip aperture shows a strong relationship to F2 over time. There is some increase in lip rounding during the first 25% of the vowel's duration, but the lips otherwise maintain a stable level of roundness. Tongue body measurements occupy a relatively narrow range in terms of backness and are evenly distributed across timepoints. Thus, like /ʊ/, pre-coronal /u/ shows relatively little within-token variance in F2. For pre-lateral /ul/, the influence of lip rounding on F2 does not differ significantly from that of coronal contexts (Δ = 0.05, SE = 0.03, z = 1.82, p = 0.207), although there is some tendency toward less rounding at the vowel offset than at the onset. Within-token decrease in F2 for /ul/ is more strongly associated with lingual retraction, as was also the case for /ol/, although tongue position is stable after the initial 25% of the vowel's duration. The relative stability of tongue position is such that the overall relationship of tongue backness to F2 does not differ significantly between pre-coronal and pre-lateral contexts (Δ = 0.03, SE = 0.03, z = 1.24, p = 0.641).
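The context-specific slopes compared above can be approximated directly from the coefficients in Table II. The Python sketch below assumes that coda is sum coded with lateral as the level omitted from the table, so that its interaction term is the negative sum of the two listed terms; this is an inference from the table layout rather than something stated in the text. The function name `slopes_by_coda` is hypothetical. The Δ values in the text were computed with marginal-effects tooling on the fitted models, so this reconstruction may differ from them in the second decimal place.

```python
def slopes_by_coda(main, non_coronal, coronal):
    """Per-coda slope of F2 on an articulator measure.

    Assumption: coda is sum coded with lateral as the omitted level, so the
    lateral interaction term is minus the sum of the two listed terms.
    """
    lateral = -(non_coronal + coronal)
    return {
        "non-coronal": main + non_coronal,
        "coronal": main + coronal,
        "lateral": main + lateral,
    }

# Coefficients copied from Table II (main effect, non-coronal x, coronal x).
o_lip = slopes_by_coda(0.292, 0.0717, -0.014)      # /o/, lip aperture
o_tongue = slopes_by_coda(0.419, 0.0103, -0.0601)  # /o/, tongue backness
u_tongue = slopes_by_coda(0.549, 0.105, -0.072)    # /u/, tongue backness

print(round(o_lip["non-coronal"] - o_lip["lateral"], 3))
print(round(o_tongue["lateral"] - o_tongue["non-coronal"], 3))
print(round(u_tongue["non-coronal"] - u_tongue["coronal"], 3))
```

Under these assumptions, the three printed contrasts come out close to the reported Δ values of 0.13, 0.04, and 0.18, respectively.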

The articulatory-acoustic relationship for each coda context can be generalized as follows. Before coronal codas and for lax /ʊ/, the tongue body is stable and is not associated with changes to F2 over time, although more front tongue positions correspond to a higher F2 in general. The relationship of lip rounding to pre-coronal F2 trajectory varies on a vowel-specific basis. Increased rounding lowers F2 during /o/, but less clearly so for /u, ʊ/. Before non-coronal consonants, on the other hand, there is a clear increase in both tongue body retraction and lip rounding throughout /u, o/, both of which are associated with a continuous decrease in F2. For pre-lateral vowels, there is moderate retraction of the tongue body at the vowel onset, but the tongue body is stable for most of the vowel's duration. Tongue body retraction corresponds to a decrease in F2, but time-varying changes to lip aperture do not. The mapping from lip aperture to F2 in lateral contexts is likely not straightforward (see also Gorman and Kirkham, 2020), given the greater acoustic complexity of lateral(ized) sounds.

This study has examined the time-varying acoustic and articulatory quality of fronted back vowels in two dialects of American English. Those varieties, spoken in South Carolina and Southern California, were chosen based on previously reported differences in terms of when back vowel fronting first appeared, whether both /u/ and /o/ are fronted in parallel, and how fronting is conditioned by phonological context. It was found that fronted back vowels in these varieties are remarkably similar in both acoustics and articulation. The degree of acoustic fronting for the high vowels /u/ and /ʊ/ is generally comparable in both regions. While /ʊ/ is fronted to the center of the vowel space in all phonological environments, fronting of /u/ shows strong conditioning by consonantal context, exceeding the center of the vowel space in non-coronal contexts only for a subset of speakers. The dialects differ from one another mainly in terms of the frontedness of /o/, which is significantly greater for South Carolina speakers than for California speakers in some contexts. However, this difference emerged only in the static acoustic measurements; a tendency toward acoustically more front /o/ for South Carolina speakers was also seen in the GAMMs, although the difference was not significant. As both Southern California and South Carolina were found to be similarly advanced in the fronting of all three vowels, future studies may consider whether the patterns observed here differ from other regions in which /o/ does in fact remain back, such as Eastern New England or the Upper Midwest, or where fronting in general is less strongly phonologized. The degree of between-dialect variability may also have been partially obscured by the relatively formal style of laboratory speech. Socially stratified variability in fronting that has been observed in sociophonetic work (e.g., Hall-Lew, 2009; Koops, 2010; Lee, 2016) may have emerged with a speaker sample more carefully controlled for sociolinguistic factors, which was not a primary aim of this study. Closer consideration of sociolinguistic factors is therefore left to future work.

With respect to articulation, both /u/ and /o/ show significantly smaller lip aperture than /i/, suggesting that fronting is not the result of unrounding. Although /ʊ/ does not achieve the same maximum degree of lip rounding as /u/ or /o/, it differs significantly from /i/ in all contexts, at least at the vowel onset. In general, these findings are consistent with articulatory studies of back vowel fronting in other varieties of English. As discussed previously, work on SSBE, Scottish English, Irish English, Australian English, and West Yorkshire English has found that each of these varieties has retained at least some degree of rounding for /u/ as it has undergone fronting. Although the degree of rounding varies between dialects (e.g., greater in SSBE than in Scottish or West Yorkshire English; Gorman and Kirkham, 2020; Lawson et al., 2019) or between speakers (Gorman and Kirkham, 2020), no articulatory study has yet shown fronted back vowels to be unround, contrary to many impressionistic observations (e.g., Hagiwara, 1997; Hinton et al., 1987). Unrounding may be predicted as a viable strategy for increasing F2, at least when considered on a purely acoustic basis. There are a number of reasons, however, why unrounding is arguably disfavored by coarticulatory, perceptual, and phonological factors.

As noted in the introduction, the fronting of /u, ʊ/ (and perhaps also /o/) has been convincingly linked to the relationship between coarticulatory effects in production and the listener's ability to compensate for such effects in perception (Harrington et al., 2008; Kleber et al., 2012; Ohala, 1981). Listeners who show less compensation for the coarticulatory fronting effect of coronals are more likely to identify high-F2 tokens as /u/, which is compounded by the fact that /u/ follows either /j/ or an alveolar consonant over 70% of the time, at least in British English (Harrington, 2007). Because phonological contrast between /ju/ (dew) vs /u/ (do) has collapsed in American English, Labov (2008) argues that this merger of the [iw]-[uw] categories introduces a frontward bias for post-coronal /u/. Either way, these factors result in a shift of the production target for /u/ toward [ʉ] or [y]. Phonological extension of this shift to non-coronal contexts then results in a decrease in the degree of contextual variability.

Nevertheless, it is clear from this study (as well as the earlier findings of Stanley et al., 2021 and Harrington et al., 2008) that contextual differences in the quality of /u/ persist to some extent throughout and after the phonologization of fronting. In this study, coronal onsets promoted an overall raising of the vowel's F2 trajectory, while coronal codas induced a less dynamic F2 trajectory through anticipatory coarticulation. This coarticulatory influence is often such that there is no significant difference in tongue backness for /i/ vs /u/. Eliminating lip rounding in these contexts would therefore lead to a decrease or loss of perceptual contrast with /i/, making it an unlikely strategy.

In non-coronal contexts, fronting is driven by phonologization, not coarticulation, so there is no articulatory basis for preferring a fronted tongue position to unrounded lips. Speakers aiming to realize non-post-coronal /u/ with a similar acoustic quality as post-coronal /u/ may hypothetically rely on unrounding (instead of or in addition to fronting) in order to achieve a comparably high F2. Given that unrounding is not viable in coronal contexts, however, this would yield disparate articulatory configurations that are counterproductive toward the goal of maintaining a stable phonological category. Crucial to the hypothesis put forward by Kleber et al. (2012) is the notion that sound change affects the /u, ʊ/ categories across contexts precisely because change in perceptual correction for coarticulation precedes change in production, leading speakers to modify their production patterns in order to maintain phonological cohesion.

A loss of lip rounding is not only unmotivated by coarticulatory and phonological pressures, but is also likely undesirable from a perceptual standpoint. It has long been established that visual cues contribute to the perception of speech sounds (McGurk and MacDonald, 1976), although it has only recently been considered to what extent such cues influence the direction of sound change. McGuire and Babel (2012) argue that visual cues contribute to perceptual asymmetry for /f/ and /θ/, while King and Chitoran (2022) argue that the distinct visual quality of British labiodental /r/ (i.e., [ʋ]) helps to distinguish this variant from /w/. Most relevant to the present study, Havenhill and Do (2018) argue that visual perceptibility of lip rounding has contributed to the preservation of rounding on /ɔ/ as it undergoes fronting as part of the Northern Cities Shift (NCS). In contrast to the fronting of /u/, fronting of /ɔ/ in the NCS is not driven by coarticulation, but by pressure to maintain dispersion within the vowel space (Labov, 1994). Yet articulatory and perceptual data for speakers from both Chicago (Havenhill, 2018) and Metro Detroit (Havenhill and Do, 2018) show that nearly all speakers who retain the /ɑ/-/ɔ/ contrast distinguish the two vowels by a difference in lip rounding. When confronted with audiovisual stimuli in which lip rounding is removed, speakers are significantly more likely to identify /ɔ/ as /ɑ/, suggesting that visual cues play a role in maintaining perceptibility of the /ɑ/-/ɔ/ contrast. In the case of high/mid back vowel fronting, the model proposed by Harrington et al. (2008) (see also Ohala, 1993) asserts that listeners may fail to correctly apprehend the acoustic effects of coarticulation. While those acoustic effects are auditorily ambiguous with respect to tongue position and lip rounding, lip rounding is visually distinct.
It is, therefore, reasonable to predict that listeners will be more likely to fail to apprehend the position of the tongue than to fail to apprehend the degree of lip rounding, and thus more likely to lose the tongue backness contrast than the rounding contrast.

It is not clear from this study whether vowel-internal variability in lip rounding is the result of diachronic change, i.e., whether /u/ and /o/ were historically rounder throughout their duration. Apparent-time analysis is necessary to address this question (cf. Harrington et al., 2008, 2011; Kleber et al., 2012). Although it is likely impossible to obtain such data for Southern American English, where /u/-fronting began during the 19th century, apparent-time analysis remains possible for varieties where fronting is more recent. Both articulatory and acoustic data show that fronted /u/ in SSBE is relatively monophthongal and maintains a consistently high degree of rounding (Gorman and Kirkham, 2020; Harrington et al., 2008), which partly differs from the patterns in the present study. Monophthongal vs diphthongal variants of /u/ have been previously documented in American English (Eckert, 2008; Koops, 2010; Thomas, 2001) and are suggested to show social stratification. Here, the degree of diphthongization was also found to vary according to consonantal context, with coronal codas, in particular, promoting a flatter F2 trajectory. It remains an open question to what extent dialects differ in the degree of diphthongization, whether dialects differ in how diphthongization is articulated, and whether such differences are attributable to distinct mechanisms of change, e.g., those proposed by Harrington et al. (2008) vs Labov (2008). Closer examination of these variants in future work will help to elucidate the range of possible outcomes for back vowel fronting, the phonological representation of diphthongs and diphthongized vowels, and the diachronic development of diphthongs through processes like breaking.
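As a rough illustration of how the degree of diphthongization discussed above might be summarized for a single token, the following Python sketch computes the net onset-to-offset F2 change and the total F2 path length from a sampled formant track. The function name, the input format, and the sample values are hypothetical and chosen only for illustration; this is not the procedure used in the present study, which modeled full trajectories with GAMMs.

```python
from typing import Dict, Sequence


def f2_dynamics(f2_track: Sequence[float]) -> Dict[str, float]:
    """Summarize vowel-internal F2 movement for one vowel token.

    f2_track holds F2 values in Hz sampled at equally spaced points
    from vowel onset to vowel offset (an assumed input format).
    """
    if len(f2_track) < 2:
        raise ValueError("need at least two F2 samples")
    # Net change: offset minus onset (negative means F2 falls over the vowel).
    net_change = f2_track[-1] - f2_track[0]
    # Path length: sum of absolute sample-to-sample changes.
    path_length = sum(abs(b - a) for a, b in zip(f2_track, f2_track[1:]))
    return {"net_change_hz": net_change, "path_length_hz": path_length}


# Made-up values for illustration only (not data from this study).
falling = [1900.0, 1800.0, 1600.0, 1350.0, 1200.0]  # more diphthongal token
flat = [1850.0, 1840.0, 1830.0, 1835.0, 1825.0]     # flatter trajectory

print(f2_dynamics(falling))
print(f2_dynamics(flat))
```

Under these assumptions, a token with a large negative net change would count as more diphthongal than one whose F2 stays within a few tens of Hz, as might be expected before a coronal coda. A path length much larger than the absolute net change would instead indicate a non-monotonic trajectory.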

The issue of the underlying structure of diphthongs has been debated. While some proposals suggest that diphthongs involve two separate articulatory targets (Lehiste and Peterson, 1961), evidence for the consistent presence or perceptual relevance of steady state targets is mixed (Gay, 1968; Jacewicz et al., 2003; Morrison and Nearey, 2007), motivating approaches that emphasize the vowel's dynamic acoustic-kinematic trajectory (Gay, 1968; Gottfried et al., 1993; Hillenbrand, 2013). In terms of articulatory gestures, it is possible that vowel-internal dynamics involve separate tongue and lip gestures that are asynchronously timed, particularly in the case of multiply articulated (e.g., round or nasal) vowels. The specific lingual target(s) for fronted /u/ are somewhat unclear, however; there are no two contexts exhibiting the same degree of backness, while some contexts show no obvious steady-state targets. The frequently-proposed transcriptions of [ɪʊ], [ɪʊ], or [ɪw] (e.g., Eckert, 2008; Labov, 2008) may be applicable in fully non-coronal contexts, but the tongue trajectory in coronal contexts is typically stable, i.e., [y] or [ʉ]. Supposing that the K_K context reflects the canonical lingual target for /u/ in the absence of coarticulation, then the usual situation is for this target to fail to be achieved. The K_T context, for instance, exhibits a more front tongue position as early as the vowel onset, apparently due to anticipatory coarticulation. In the T_K context, perseverative coarticulation results in a vowel offglide that is only as back as the frontest position seen in the K_K context. Coda [ɫ] is known to be coarticulatorily aggressive (Recasens et al., 1997) and was observed to strongly influence the lingual trajectory throughout most or all of the vowel duration.

With respect to lip rounding, both /u/ and /o/ showed greater rounding (smaller aperture) at the end of the vowel than at the beginning, typically corresponding to a decrease in F2. This presents several possibilities for the underlying representation of these vowels. One is that the increase in rounding may be attributed to a single lip gesture with its peak timed to occur late in the vowel's duration. Although /u/ and /o/ were both found to be significantly rounder than /i/ at their onset, that difference could reflect anticipatory rounding in preparation for rounding at the vowel offset. The vowel onset may then be considered unround or unspecified for rounding, with /u/ and /o/ ultimately represented as [ɨʉ] and [ɘ¨ʊ]. On the other hand, lip aperture at the onset of /u/ and /o/ is similar in magnitude to that of /ʊ/, which also shows a significantly smaller lip aperture than /i/. This moderate degree of lip rounding suggests a second possibility, that “rounding” encompasses several distinct gestures, rather than a binary round vs unround opposition. In that case, fronted /u/ and /o/ in these varieties may be more accurately transcribed as [ʉʿʉʾ] and [ɵʿ¨ʊʾ]. Segmental representation does not easily capture a third possibility, which is that the articulatory-acoustic specification for vowels is inherently dynamic and may have no static targets at all. The data presented in this study would be consistent with that account, as the acoustic and articulatory trajectories for /u/ and /o/ show (in most cases) continuous change throughout their duration, rather than a transition between two steady state periods. By way of comparison, the trajectories for /u/ and /o/ are clearly distinct from that of pre-lateral /i/, which is stable during its nucleus and has a relatively rapid change during the transition to /l/.
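The contrast drawn above between continuous change and a steady state followed by a rapid transition can be made concrete with a small sketch. The Python function below reports what share of a trajectory's total movement falls in each third of the vowel; it applies equally to an F2 track or a lip-aperture track. The function name, the three-way split, and the sample values are illustrative assumptions rather than measures used in this study.

```python
from typing import List, Sequence


def change_by_third(track: Sequence[float]) -> List[float]:
    """Share of total absolute change falling in each third of a trajectory.

    track holds equally spaced samples from vowel onset to offset
    (an assumed input format, e.g., F2 in Hz or lip aperture in mm).
    """
    if len(track) < 4:
        raise ValueError("need at least four samples")
    steps = [abs(b - a) for a, b in zip(track, track[1:])]
    total = sum(steps)
    if total == 0:
        return [0.0, 0.0, 0.0]
    shares = [0.0, 0.0, 0.0]
    for i, step in enumerate(steps):
        # Assign each sample-to-sample step to the third in which it begins.
        third = min(3 * i // len(steps), 2)
        shares[third] += step / total
    return shares


# Made-up values for illustration only (not data from this study).
gradual = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0]         # continuous change
late_shift = [10.0, 10.0, 10.0, 10.0, 10.0, 7.0, 4.0]  # stable, then rapid change

print(change_by_third(gradual))
print(change_by_third(late_shift))
```

In this sketch, roughly equal shares across the three thirds would correspond to the continuous change described for /u/ and /o/, whereas a trajectory with nearly all of its movement in the final third would resemble the pattern described for pre-lateral /i/.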

Through dynamic acoustic and articulatory analysis, this study provides insight into the underlying articulation and contextual variability of fronted back vowels in American English. Static acoustic analysis alone cannot reveal how fronted vowels are articulated, which highlights the importance of examining both tongue and lip movement, in addition to acoustics, in the study of vowel production. Large-scale articulatory studies remain technically challenging, although ongoing advances in imaging and analysis techniques will increasingly allow for more sophisticated inquiries into articulatory-acoustic mapping. Further research combining analyses of multimodal perception and articulatory-acoustic dynamics, along with sociolinguistic factors, will deepen our understanding of how vowel systems develop over time and how the reinterpretation of coarticulatory acoustic effects contributes to the maintenance, modification, loss, or re-timing of articulatory gestures.

See the supplementary material for additional information about ultrasound spline rotation and GAMM model summaries.

Thank you to May Pik Yu Chan and Arthur Thompson for their assistance with data processing and to Lisa Zsiga, Jen Nycz, and Youngah Do for their comments on earlier versions of this project. Special thanks to Eric Holt, Marc Garellek, Sharon Rose, and Eric Baković for their generosity in providing lab space and assistance with participant recruitment. Thanks also to the editor, Ewa Jacewicz, and two anonymous reviewers for their valuable suggestions. Model training and statistical analysis were performed using research computing facilities offered by Information Technology Services, the University of Hong Kong.

The author has no conflicting interests to declare.

This study was carried out in accordance with the recommendations of the Georgetown University Social and Behavioral Sciences Institutional Review Board (IRB-C) with written informed consent from all subjects.

Data and code for the analysis are available on OSF at https://doi.org/10.17605/OSF.IO/RBHMG.

See Table III for demographics.

TABLE III.

Self-reported participant demographic information. Origin indicates counties where the participant was raised; Ex. indicates years lived outside South Carolina/Southern California.

Speaker  Gender  Age  Ethnicity       Origin                   Ex.
Cal 01           21   White           Los Angeles
Cal 03           19   White           San Diego
Cal 04           21   Asian           San Diego
Cal 06           20   Hispanic        San Diego
Cal 07           22   White           LA, Ventura
Cal 08           18   Asian           Los Angeles
Cal 09           21   Hispanic        Orange County
Cal 10           34   Asian           San Diego
Cal 11           20   Afghan          Orange County
Cal 12           21   White/Asian     LA, Ventura
Cal 13           18   unknown         Orange County
Cal 14           18   Asian           Los Angeles
Cal 16           18   Asian           Riverside
SC 01            30   White           Richland
SC 02            27   White           Lexington
SC 03            22   White           Spartanburg
SC 04            27   White           Berkeley, Dorchester
SC 05            20   White           Greenville, Lexington
SC 07            50   White           Greenville, Spartanburg
SC 08            27   White           Aiken
SC 09            18   White           Kershaw
SC 11            22   White/Hispanic  Charleston
SC 12            20   White           Greenville

Table IV contains the wordlist for the experiment.

TABLE IV.

Wordlist for the experiment. Italics indicate environments where lexical gaps or ambiguous orthography necessitated the use of voiced consonants. Boldface indicates words differing in syllable count or complexity.

Context  /i/    /ɪ/     /e/      /u/     /ʊ/      /o/     /ɑ/    /ɔ/
h_#      he             hay      who              hoe     ha     haw
h_t      heat   hit     hate     hoot    hood     hoed    hot    hawed
h_p      heap   hip              hoop             hope    hop
h_k             hick                     hook             hock   hawk
h_l      heel   hill    hale                      hole           haul
t_#      tea            Tay      too              toe     ta     tawny
t_t      teat   tit     Tate     toot             tote    tot    taught
t_p      deep   dip     tape     dupe             dope    top
t_k      teak   tick    take     duke    took     toke    tock   talk
t_l      teal   till    tale     tool             toll           tall
         deal   dill    dale     duel             dole           doll
p_#      bee            bay      boo              Bo      pa     paw
p_t      Pete   pit     bait     boot    put      boat    pot
p_p      peep   pip     paper    boop             pope    pop    pauper
p_k      peek   pick    bake     pookie  book     poke    bock   balk
p_l      peel   pill    pail     pool             pole           Paul
k_#      key            Kay      coo              go             caw
k_t      Keats  kit     Kate     coot    could    coat    cot    caught
k_p      keep   kip     cape     coop             cope    cop
k_k      geek   kick    cake     kook    cook     coke    cock   gawk
k_l      keel   kill    kale     cool             coal           call
s_#      see            say      sue              so             saw
s_p      seep   sip     sapiens  soup             soap    sop
s_t      seat   sit     sate     suit    soot     sewed   sot    sought
s_k      seek   sick    sake             forsook  soak    sock   Salk
s_l      seal   sill    sail     Zuul             sole           Saul
ʃ_#      she            shay     shoe             show           shaw
ʃ_p      sheep  ship    shape                             shop
ʃ_t      sheet  shit    shade    shoot   should   showed  shot
ʃ_k      chic   Schick  shake            shook            shock
ʃ_l             shill   shale                     shoal          shawl
1

Note that unlike the Northern Cities Shift or Southern Shift, the CVS is not considered a chain shift, i.e., a reorganization of the vowel space in which a change to the quality of one vowel triggers an interrelated series of changes to its neighbors (Labov, 1994). The term CVS instead refers to a set of independent changes that, in combination, characterize the vowel system of California English (Eckert, 2008).

2

Fronting of /ɔ/ is found in the Great Lakes region of the United States but is due to the Northern Cities Shift (a chain shift) and thus has distinct phonetic motivations from fronting of /u, ʊ, o/. As noted by Labov et al. (2006), fronting of /u, ʊ, o/ in this region tends to be more moderate. For articulatory analyses of /ɔ/-fronting, see Majors and Gordon (2008) and Havenhill and Do (2018).

3

The phonological status of /ɔ/ may also differ between Californian and South Carolinian speakers. In California, /ɔ/ and /ɑ/ are typically merged as a result of the cot-caught merger, although the phonological specification of the merged category (i.e., whether it is [+round]) is not entirely clear. In the South, /ɔ/ and /ɑ/ are more often contrastive in which case both [+back] and [+round] features are necessary to distinguish /ɔ/.

4

Experiments were conducted during short data collection trips to each location and participants were therefore recruited by convenience sampling. As back vowel fronting in the South has been well-established for several generations (Stanley et al., 2021; Thomas, 2001), the inclusion of speakers of differing ages is not expected to adversely influence results. Physiological changes to the vocal tract due to aging would be expected to decrease formant values (Harrington, 2006), but the one older speaker in the sample exhibits slightly above-average F2 for /u, ʊ, o/.

1.
Arel-Bundock
,
V.
(
2024
). “
marginaleffects: Predictions, comparisons, slopes, marginal means, and hypothesis tests
,” https://CRAN.R-project.org/package=marginaleffects (Last viewed February 29, 2024).
2.
Articulate Instruments Ltd
. (
2008
). “
Ultrasound stabilisation headset users manual: Revision 1.4
” (
Articulate Instruments Ltd.
,
Musselburg, UK
).
3.
Articulate Instruments Ltd
. (
2012
). “
Articulate assistant advanced
,” (
Articulate Instruments Ltd.
,
Musselburg, UK
).
4.
Baranowski
,
M.
(
2006
). “
Phonological variation and change in the dialect of Charleston, South Carolina
,” Ph.D. thesis,
University of Pennsylvania
,
Philadelphia, PA
.
5.
Baranowski
,
M. A.
(
2008
). “
The fronting of the back upgliding vowels in Charleston, South Carolina
,”
Lang. Var. Change
20
(
3
),
527
551
.
6.
Barreda
,
S.
(
2015
). “
phonTools: Functions for phonetics in R, r package version 0.2-2.1
,” https://github.com/santiagobarreda/phonTools/ (Last viewed February 29, 2024).
7.
Barreda
,
S.
(
2021
). “
Fast Track: Fast (nearly) automatic formant-tracking using Praat
,”
Ling. Vanguard
7
(
1
),
20200051
.
8.
Bates
,
D.
,
Mächler
,
M.
,
Bolker
,
B.
, and
Walker
,
S.
(
2015
). “
Fitting linear mixed-effects models using lme4
,”
J. Stat. Soft.
67
(
1
),
1
48
.
9.
Beddor
,
P. S.
,
Coetzee
,
A. W.
,
Styler
,
W.
,
McGowan
,
K. B.
, and
Boland
,
J. E.
(
2018
). “
The time course of individuals' perception of coarticulatory information is linked to their production: Implications for sound change
,”
Language
94
(
4
),
931
968
.
10.
Blackwood Ximenes
,
A.
,
Shaw
,
J. A.
, and
Carignan
,
C.
(
2017
). “
A comparison of acoustic and articulatory methods for analyzing vowel differences across dialects: Data from American and Australian English
,”
J. Acoust. Soc. Am.
142
(
1
),
363
377
.
11.
Boersma
,
P.
, and
Weenink
,
D.
(
2023
). “
Praat: Doing phonetics by computer
, (version 6.3.11) [computer program],” http://www.praat.org/ (Last viewed July 20, 2023).
12.
Chládková
,
K.
,
Hamann
,
S.
,
Williams
,
D.
, and
Hellmuth
,
S.
(
2016
). “
F2 slope as a perceptual cue for the front–back contrast in Standard Southern British English
,”
Lang. Speech
60
(
3
),
377
398
.
13.
Cox
,
F.
(
1999
). “
Vowel change in Australian English
,”
Phonetica
56
(
1–2
),
1
27
.
14.
De Jong
,
K.
(
1995
). “
On the status of redundant features: The case of backing and rounding in American English
,” in
Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV
, edited by
B.
Connell
and
A.
Arvaniti
(
Cambridge University Press
,
Cambridge, UK
), pp.
68
86
.
15.
Eckert
,
P.
(
2008
). “
Where do ethnolects stop?
,”
Int. J. Bilingualism
12
(
1–2
),
25
42
.
16.
Farrington
,
C.
,
Kendall
,
T.
, and
Fridland
,
V.
(
2018
). “
Vowel dynamics in the southern vowel shift
,”
Am. Speech
93
(
2
),
186
222
.
17.
Ferragne
,
E.
, and
Pellegrino
,
F.
(
2010
). “
Formant frequencies of vowels in 13 accents of the British Isles
,”
J. Int. Phonetic Assoc.
40
(
1
),
1
34
.
18.
Fought
,
C.
(
1999
). “
A majority sound change in a minority community: /u/-fronting in Chicano Englsih
,”
J. Sociolinguistics
3
(
1
),
5
23
.
19.
Fox
,
R. A.
, and
Jacewicz
,
E.
(
2009
). “
Cross-dialectal variation in formant dynamics of American English vowels
,”
J. Acoust. Soc. Am.
126
(
5
),
2603
2618
.
20.
Gay
,
T.
(
1968
). “
Effect of speaking rate on diphthong formant movements
,”
J. Acoust. Soc. Am.
44
(
6
),
1570
1573
.
21.
Gordon
,
E.
,
Campbell
,
L.
,
Hay
,
J.
,
Maclagan
,
M.
,
Sudbury
,
A.
, and
Trudgill
,
P.
(
2004
).
New Zealand English: Its Origins and Evolution
(
Cambridge University Press
,
Cambridge, UK
).
22.
Gorman
,
E.
, and
Kirkham
,
S.
(
2020
). “
Dynamic acoustic-articulatory relations in back vowel fronting: Examining the effects of coda consonants in two dialects of British English
,”
J. Acoust. Soc. Am.
148
(
2
),
724
733
.
23.
Gottfried
,
M.
,
Miller
,
J. D.
, and
Meyer
,
D. J.
(
1993
). “
Three approaches to the classification of American English diphthongs
,”
J. Phon.
21
(
3
),
205
229
.
24.
Hagiwara
,
R.
(
1997
). “
Dialect variation and formant frequency: The American English vowels revisited
,”
J. Acoust. Soc. Am.
102
(
1
),
655
658
.
25.
Hall-Lew
,
L.
(
2009
). “
Ethnicity and phonetic variation in a San Francisco neighborhood
,” Ph.D. thesis,
Stanford University
,
Stanford, CA
.
26.
Harrington
,
J.
(
2006
). “
An acoustic analysis of ‘happy-tensing’ in the Queen's christmas broadcasts
,”
J. Phon.
34
(
4
),
439
457
.
27.
Harrington
,
J.
(
2007
). “
Evidence for a relationship between synchronic variability and diachronic change in the Queen's annual christmas broadcasts
,” in
Laboratory Phonology
, edited by
J.
Cole
and
J. I.
Hualde
(
Mouton de Gruyter
,
Berlin
), Vol.
9
, pp.
125
143
.
28.
Harrington
,
J.
,
Kleber
,
F.
, and
Reubold
,
U.
(
2008
). “
Compensation for coarticulation, /u/-fronting, and sound change in Standard Southern British: An acoustic and perceptual study
,”
J. Acoust. Soc. Am.
123
(
5
),
2825
2835
.
29.
Harrington
,
J.
,
Kleber
,
F.
, and
Reubold
,
U.
(
2011
). “
The contributions of the lips and the tongue to the diachronic fronting of high back vowels in Standard Southern British English
,”
J. Int. Phonetic Assoc.
41
(
2
),
137
156
.
30.
Havenhill
,
J.
(
2018
). “
Constraints on articulatory variability: Audiovisual perception of lip rounding
,” Ph.D. thesis,
Georgetown University
,
Washington, DC
.
31.
Havenhill
,
J.
, and
Do
,
Y.
(
2018
). “
Visual speech perception cues constrain patterns of articulatory variation and sound change
,”
Front. Psychol.
9
,
728
.
32.
Hawkins
,
S.
, and
Midgley
,
J.
(
2005
). “
Formant frequencies of RP monophthongs in four age groups of speakers
,”
J. Int. Phonetic Assoc.
35
(
2
),
183
199
.
33.
He
,
K.
,
Zhang
,
X.
,
Ren
,
S.
, and
Sun
,
J.
(
2016
). “
Deep residual learning for image recognition
,” in
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, June 27–30, Las Vegas, NV, pp.
770
778
.
34.
Hillenbrand
,
J. M.
(
2013
).
Static and Dynamic Approaches to Vowel Perception
(
Springer
,
Berlin
).
35.
Hinton
,
L.
,
Moonwomon
,
B.
,
Bremner
,
S.
,
Luthin
,
H.
,
Van Clay
,
M.
,
Lerner
,
J.
, and
Corcoran
,
H.
(
1987
). “
It's not just the Valley Girls: A study of California English
,”
BLS
13
,
117
128
.
36.
Insafutdinov
,
E.
,
Pishchulin
,
L.
,
Andres
,
B.
,
Andriluka
,
M.
, and
Schiele
,
B.
(
2016
). “
DeeperCut: A deeper, stronger, and faster multi-person pose estimation model
,” in
European Conference on Computer Vision
(
Springer
,
New York
), pp.
34
50
.
37.
Jacewicz
,
E.
,
Fujimura
,
O.
, and
Fox
,
R. A.
(
2003
). “
Dynamics in diphthong perception
,” in
Proceedings of the 15th International Congress of Phonetic Sciences
, edited by
M.-J.
Solé
(
ICPhS Organizing Committee
,
Barcelona, Spain
), pp.
993
996
.
38.
Kennedy
,
R.
, and
Grama
,
J.
(
2012
). “
Chain shifting and centralization in California vowels: An acoustic analysis
,”
Am. Speech
87
(
1
),
39
56
.
39.
King
,
H.
, and
Chitoran
,
I.
(
2022
). “
Difficult to hear but easy to see: Audio-visual perception of the /r/-/w/contrast in Anglo-English
,”
J. Acoust. Soc. Am.
152
(
1
),
368
379
.
40.
Kleber
,
F.
,
Harrington
,
J.
, and
Reubold
,
U.
(
2012
). “
The relationship between the perception and production of coarticulation during a sound change in progress
,”
Lang. Speech
55
(
3
),
383
405
.
41.
Koops
,
C.
(
2010
). “
/u/-fronting is not monolithic: Two types of fronted /u/ in Houston Anglos
,”
Univ. Pennsylvania Work. Papers Linguistics
16
(
2
),
14
, available at https://repository.upenn.edu/handle/20.500.14332/44782.
42.
Kuznetsova
,
A.
,
Brockhoff
,
P. B.
, and
Christensen
,
R. H. B.
(
2017
). “
lmerTest package: Tests in linear mixed effects models
,”
J. Stat. Softw.
82
(
13
),
1
26
.
43.
Labov
,
W.
(
1994
).
Principles of Linguistic Change, Volume 1: Internal Factors
(
Wiley-Blackwell
,
Malden, MA
).
44.
Labov
,
W.
(
2008
). “
Triggering events
,” in
Studies in the History of the English Language IV: Empirical Analytical Advances Study English Language Change
(
De Gruyter
,
London
), pp.
11
54
.
45.
Labov
,
W.
,
Ash
,
S.
, and
Boberg
,
C.
(
2006
).
The Atlas of North American English
(
Walter de Gruyter
,
Berlin
).
46.
Lawson
,
E.
,
Stuart-Smith
,
J.
, and
Rodger
,
L.
(
2019
). “
A comparison of acoustic and articulatory parameters for the GOOSE vowel across British Isles Englishes
,”
J. Acoust. Soc. Am.
146
(
6
),
4363
4381
.
47.
Lee
,
S.
(
2016
). “
High and mid back vowel fronting in Washington, D.C.
,”
Am. Speech
91
(
4
),
425
471
.
48.
Lehiste
,
I.
, and
Peterson
,
G. E.
(
1961
). “
Transitions, glides, and diphthongs
,”
J. Acoust. Soc. Am.
33
(
3
),
268
277
.
49.
Lenth
,
R. V.
(
2024
). “
emmeans: estimated marginal means, aka least-squares means, r package version 1.10.0
,” https://CRAN.R-project.org/package=emmeans (Last viewed February 29, 2024).
50.
Lobanov
,
B. M.
(
1971
). “
Classification of Russian vowels spoken by different speakers
,”
J. Acoust. Soc. Am.
49
,
606
608
.
51.
Magen
,
H. S.
(
1997
). “
The extent of vowel-to-vowel coarticulation in English
,”
J. Phon.
25
(
2
),
187
205
.
52.
Majors
,
T.
, and
Gordon
,
M. J.
(
2008
). “
The [+spread] of the Northern Cities Shift
,”
Univ. Pennsylvania Work. Papers Linguistics
14
(
2
),
111
120
, available at https://repository.upenn.edu/handle/20.500.14332/44693.
53.
Manuel
,
S. Y.
(
1990
). “
The role of contrast in limiting vowel-to-vowel coarticulation in different languages
,”
J. Acoust. Soc. Am.
88
(
3
),
1286
1298
.
54.
Mathis
,
A.
,
Mamidanna
,
P.
,
Cury
,
K. M.
,
Abe
,
T.
,
Murthy
,
V. N.
,
Mathis
,
M. W.
, and
Bethge
,
M.
(
2018
). “
DeepLabCut: Markerless pose estimation of user-defined body parts with deep learning
,”
Nat. Neurosci.
21
(
9
),
1281
1289
.
55.
McAllister
,
A.
(
1938
).
A Year's Course in Speech Training
(
University of London Press
,
London
).
56.
McGuire
,
G.
, and
Babel
,
M.
(
2012
). “
A cross-modal account for synchronic and diachronic patterns of /f/ and /θ/ in English
,”
Lab. Phonol.
3
(
2
),
1
41
.
57.
McGurk
,
H.
, and
MacDonald
,
J.
(
1976
). “
Hearing lips and seeing voices
,”
Nature
264
,
746
748
.
58.
Mesthrie
,
R.
(
2010
). “
Socio-phonetics and social change: Deracialisation of the GOOSE vowel in South African English
,”
J. Socioling.
14
(
1
),
3
33
.
59.
Mielke
,
J.
(
2015
). “
An ultrasound study of Canadian French rhotic vowels with polar smoothing spline comparisons
,”
J. Acoust. Soc. Am.
137
(
5
),
2858
2869
.
60.
Morrison
,
G. S.
, and
Nearey
,
T. M.
(
2007
). “
Testing theories of vowel inherent spectral change
,”
J. Acoust. Soc. Am.
122
(
1
),
EL15
EL22
.
61.
Nath
,
T.
,
Mathis
,
A.
,
Chen
,
A. C.
,
Patel
,
A.
,
Bethge
,
M.
, and
Mathis
,
M. W.
(
2019
). “
Using DeepLabCut for 3D markerless pose estimation across species and behaviors
,”
Nat. Protoc.
14
(
7
),
2152
2176
.
62.
Nearey
,
T. M.
, and
Assmann
,
P. F.
(
1986
). “
Modeling the role of inherent spectral change in vowel identification
,”
J. Acoust. Soc. Am.
80
(
5
),
1297
1308
.
63.
Noiray
,
A.
,
Iskarous
,
K.
, and
Whalen
,
D. H.
(
2014
). “
Variability in English vowels is comparable in articulation and acoustics
,”
Lab. Phonol.
5
(
2
),
271
288
.
64.
Ohala
,
J. J.
(
1981
). “
The listener as a source of sound change
,” in
Papers from the Parasession on Language and Behavior
, edited by
C. S.
Masek
,
R. A.
Hendrick
, and
M. F.
Miller
(
Chicago Linguistic Society
,
Chicago, IL
), pp.
178
203
.
65.
Ohala
,
J. J.
(
1993
). “
The phonetics of sound change
,” in
Historical Linguistics: Problems and Perspectives
, edited by
C.
Jones
(
Longman
,
London
), pp.
237
278
.
66.
Podesva
,
R. J.
(
2011
). “
The California Vowel Shift and gay identity
,”
Am. Speech
86
(
1
),
32
51
.
67.
R Core Team
(
2018
).
R: A Language and Environment for Statistical Computing
(
R Foundation for Statistical Computing
,
Vienna, Austria
).
68.
Recasens
,
D.
(
1989
). “
Long range coarticulation effects for tongue dorsum contact in VCVCV sequences
,”
Speech Commun.
8
(
4
),
293
307
.
69.
Recasens
,
D.
,
Pallarès
,
M. D.
, and
Fontdevila
,
J.
(
1997
). “
A model of lingual coarticulation based on articulatory constraints
,”
J. Acoust. Soc. Am.
102
(
1
),
544
561
.
70.
Reed
,
D. W.
, and
Metcalf
,
A. A.
(
1952
).
Linguistic Atlas of the Pacific Coast
(
Bancroft Library at the University of California
,
Berkeley, CA
).
71.
Rosenfelder
,
I.
,
Fruehwald
,
J.
,
Evanini
,
K.
,
Seyfarth
,
S.
,
Gorman
,
K.
,
Prichard
,
H.
, and
Yuan
,
J.
(
2015
). “
FAVE (forced alignment and vowel extraction) program suite v1.2.2 [computer software]
.”
72.
Scobbie
,
J. M.
,
Lawson
,
E.
,
Cowen
,
S.
,
Cleland
,
J.
, and
Wrench
,
A. A.
(
2011
). “
A common co-ordinate system for mid-sagittal articulatory measurement
,” QMU CASL Work. Papers WP-20 (
Queen Margaret University
,
Musselburgh, UK
), available at https://eresearch.qmu.ac.uk/handle/20.500.12289/3597.
73. Scobbie, J. M., Lawson, E., and Stuart-Smith, J. (2012). “Back to front: A socially-stratified ultrasound tongue imaging study of Scottish English /u/,” Riv. Ling. 24(1), 103–148, available at https://www.italian-journal-linguistics.com/2012-2/.
74. Sóskuthy, M. (2021). “Evaluating generalised additive mixed modelling strategies for dynamic speech analysis,” J. Phon. 84, 101017.
75. Sóskuthy, M., Foulkes, P., Hughes, V., and Haddican, B. (2018). “Changing words and sounds: The roles of different cognitive units in sound change,” Top. Cogn. Sci. 10(4), 787–802.
76. Stanley, J. A., Renwick, M. E. L., Kuiper, K. I., and Olsen, R. M. (2021). “Back vowel dynamics and distinctions in Southern American English,” J. English Ling. 49(4), 389–418.
77. Stevens, K. N., and House, A. S. (1955). “Development of a quantitative description of vowel articulation,” J. Acoust. Soc. Am. 27(3), 484–493.
78. Stevens, K. N., Keyser, S. J., and Kawasaki, H. (1986). “Toward a phonetic and phonological theory of redundant features,” in Invariance and Variability in Speech Processes, edited by J. S. Perkell and D. H. Klatt (Psychology Press, London), pp. 426–449.
79. Strycharczuk, P., and Scobbie, J. M. (2017). “Fronting of southern British English high-back vowels in articulation and acoustics,” J. Acoust. Soc. Am. 142(1), 322–331.
80. Thomas, E. R. (2001). An Acoustic Analysis of Vowel Variation in New World English, Number 85 in Publications of the American Dialect Society (Duke University Press, Durham, NC).
81. van Rij, J., Wieling, M., Baayen, R. H., and van Rijn, H. (2022). “itsadug: Interpreting time series and autocorrelated data using GAMMs, R package version 2.4.1” (Last viewed June 1, 2023).
82. Wells, J. C. (1982). Accents of English (Cambridge University Press, Cambridge, UK).
83. Whalen, D. H. (1990). “Coarticulation is largely planned,” J. Phon. 18(1), 3–35.
84. Wood, S. (2017). Generalized Additive Models: An Introduction with R, 2nd ed. (Chapman and Hall, New York).
85. Wrench, A., and Balch-Tomes, J. (2022). “Beyond the edge: Markerless pose estimation of speech articulators from ultrasound and camera images using DeepLabCut,” Sensors 22, 1133.
86. Zsiga, E. C. (2013). The Sounds of Language: An Introduction to Phonetics and Phonology, 7 (John Wiley & Sons, New York).

Supplementary Material