This paper presents an acoustic analysis of Mixean Low Navarrese, an endangered variety of Basque. The manuscript includes an overview of previous acoustic studies performed on different Basque varieties in order to synthesize the sparse acoustic descriptions of the language that are available. This synthesis serves as a basis for the acoustic analysis performed in the current study, in which the various acoustic analyses given in previous studies are replicated in a single, cohesive general acoustic description of Mixean Basque. The analyses include formant and duration measurements for the six-vowel system, voice onset time measurements for the three-way stop system, spectral center of gravity for the sibilants, and number of lingual contacts in the alveolar rhotic tap and trill. Important findings include: a centralized realization ([ʉ]) of the high-front rounded vowel usually described as /y/; a data-driven confirmation of the three-way laryngeal opposition in the stop system; evidence in support of an alveolo-palatal to apical sibilant merger; and the discovery of a possible incipient merger of rhotics. These results show how using experimental acoustic methods to study under-represented linguistic varieties can result in revelations of sound patterns otherwise undescribed in more commonly studied varieties of the same language.
I. INTRODUCTION
Basque is a well-known language in terms of general linguistic description. A number of reference grammars can be consulted for particular details on its syntax or its phonology (especially Hualde and Ortiz de Urbina, 2003). Nevertheless, there are no general acoustic descriptions of the standard language, but rather a limited number of papers that include acoustic characterizations of specific segments as produced in particular varieties of Basque. In this paper, we present a general acoustic description of an endangered and under-documented variety of Basque, namely Mixean Low Navarrese—henceforth, “Mixean Basque” or “Mixean.” Most of the work on the linguistic description of this variety has been carried out by the Basque dialectologist Iñaki Camino, who started performing fieldwork in the Mixe region in the early 1980s. His work involves general phonological, morpho-syntactic, and lexical descriptions (Camino, 2016). Given that the acoustics of Mixean Basque have never been discussed in previous work, we will compare our results with the acoustical description of each group of segments as described in different varieties of the language, insofar as they can be found in the literature.
A. General background
The region of Mixe (Amiküze in Mixean Basque) is situated in the South-West of France, in the Pyrénées-Atlantiques department. Within the Basque Country, it is located in the northern part of the historical province of Low Navarre or Nafarroa Beherea. The population of Mixe was 7856 as of 2015 (L'Institut national de la statistique et des études économiques, 2019), although the number of speakers of Basque in the region is much lower (Camino, 2016). The Mixe region is formed by 32 towns, the main city being Donapaleu (Dnaplü in Mixean Basque and Saint-Palais in French), with a population of 1842 in 2016 (L'Institut national de la statistique et des études économiques, 2019). The variety of Basque spoken in the region, Mixean Basque or amiküzera, is usually classified as part of the Low Navarrese dialect (Michelena, 2011) or as a transition variety (Zuazo, 2008). Nevertheless, Mixean Basque has more in common, phonologically, with the neighboring Zuberoan dialect, usually considered the most deviant variety of Basque, than with Low Navarrese.
Mixean Basque has historically been in contact with Bearnese Gascon to the North (Mixe being South of the region of Béarn) and Zuberoan Basque to the East.1 Due to an increasing French influence at all levels of the society, the number of speakers of Basque and Gascon has steadily decreased in the region during the last century (Camino, 2016, pp. 48–49). Although no study regarding the current language use in Mixe has been performed recently, Camino (2016, p. 51) describes the language as being on its way to disappearing, with children currently raised and schooled in French. According to a 2016 study by the association Zabalik (Camino, 2016, p. 55), only 9% of children (75 individuals) are currently schooled in a Basque-speaking model (Zabalik, 2016), where children will be taught the standard language instead of their local varieties.
In short, Mixean is one of the varieties of Basque that differs most from the rest (alongside Zuberoan), but it remains understudied: it is underrepresented in the literature and is not considered a dialect of its own by most Basque dialectologists. Given that it is an endangered variety whose speakers belong to the oldest segment of the population, failure to study this variety now may result in the inability to study it at all.
B. Mixean phonology
The phonemic inventory of Mixean Basque has not been clearly established in the previous literature. Nevertheless, a provisional description might be inferred from the study by Camino (2016) and the work on neighboring varieties (Egurtzegi, 2018b; Hualde, 2003). On this basis, Mixean ostensibly includes at least 34 contrastive consonants: 12 stops /p, t, c, k, pʰ, tʰ, cʰ, kʰ, b, d, ɟ, ɡ/, ten sibilants /, , ʃ, , , tʃ, , , ʒ, /, a labiodental fricative /f/, nine sonorants /m, n, ɲ, l, ʎ, ɾ, r, j, w/, and two laryngeals /h, h̃/ (which are sonorant-like except word-initially). It also includes six contrastive vowels /a, e, o, i, y, u/, although not all segments are equally frequent.1 In addition, some non-nativized French loanwords are produced with /ʀ/, /v/, /ɛ/, /œ/, or nasalized vowels (phonemic segments in French), although these are rare and generally not considered part of the Mixean Basque inventory. Excluding recent loanwords, there are no phonemic nasal vowels in Mixean Basque (Camino, 2016, p. 200), but coarticulatory vowel nasalization can be found in contact with any of the nasal segments /m, n, ɲ, h̃/.
Some segments are only contrastive in particular phonological environments. Laryngeals and aspirated stops can only be found in the onset of the first two syllables of the word; stops are absent from word-medial codas and their laryngeal configuration is neutralized in word-final position (to the plain voiceless series). Rhotics only contrast intervocalically; they are absent from word-initial position and they neutralize in tautosyllabic obstruent-rhotic onset clusters and in coda position to a segment that is not clearly described in the literature on this or any close variety. In his general description of Basque, Hualde (2003, p. 30) describes this neutralized rhotic as a trill realized with fewer vibrations. Regarding sibilants, /ʧ/ is the only affricate found word-initially, and voiced sibilants are mostly present in loanwords and compound boundaries such as deuse /deue/ “nothing (at all)” (deus “nothing” + ere “as well”). Sibilants preceding a voiced obstruent are also voiced. The sonorants /m, ɲ, ʎ/ do not occur word-finally. See Hualde (2003) for a description of the phonotactic restrictions of most Basque dialects.
The Mixean variety of Low Navarrese Basque shows a more complex phonemic inventory than general Low Navarrese or most other varieties of Basque. The only variety of Basque that has a more complex inventory is Zuberoan, which has a phonemic inventory similar to that of Mixean Basque but also includes four contrastive nasalized vowels (Egurtzegi, 2015). The most notable differences between Mixean and Zuberoan Basque, on the one hand, and the other Basque varieties, on the other, are the set of aspirated stops /pʰ, tʰ, cʰ, kʰ/, the nasalized laryngeal /h̃/, and the rounded high front vowel /y/. All other Basque varieties only have five vowels /a, e, o, i, u/, two sets of stops (voiced and voiceless unaspirated), and either one (/h/) or no laryngeal approximants. Aspiration, both as a phonological segment (Egurtzegi, 2018b; Hualde, 2018) and as a feature of a two-way stop distinction (Egurtzegi, 2018a), was arguably part of an earlier, common stage of the language. However, contrastive aspiration is limited to the Eastern-most varieties today, and it has been reported to show recession even in the dialects where it is most present, namely Zuberoan and Mixean (Camino, 2016). Although both Zuberoan and Mixean have been reported to have a front rounded vowel, Mixean /y/ has been auditorily described as different from Zuberoan /y/, and closer to /ø/, especially in pre-pausal position (Haase, 1992, p. 29). Another difference between Mixean and Zuberoan is that Zuberoan lost the tap in the 19th century, while Mixean still preserves two rhotics (/ɾ, r/) intervocalically.
II. PREVIOUS ACOUSTIC DESCRIPTIONS
Most acoustic descriptions of particular varieties of the Basque language focus either on sibilants or on the vocalic inventory. In addition to these, there are some studies that focus on rhotics and a couple of studies on the stop system of Zuberoan Basque. We are not aware of any acoustical study that analyzes all segments in any variety of the Basque language. The current study will focus on the sets of segments that have been analyzed in other varieties of Basque: vowels, stops, sibilants, and rhotics. Although prosody has been widely studied in Basque, accentuation is not contrastive in Mixean,2 so it will not be analyzed in this paper.
A. Previous findings: Vowels
The acoustics of vowels have received the most attention in the literature on Basque. The most general and complete work on the acoustics of vowel systems is Urrutia et al. (1995a), which summarizes previous works and presents descriptions and formant charts of multiple dialects of the language. Among the varieties studied in this book, we can find Eastern Low Navarrese (ELN) (Urrutia et al., 1995a, pp. 151–175; results for ELN were also published in Urrutia et al., 1995b) which includes data from a sample of three male speakers from Donibane Garazi (Cizean Low Navarrese), Donapaleu (Mixean Low Navarrese), and the Salazar Valley (Salazarese). This somewhat heterogeneous group is the closest reference to our data that can be found in the literature.
Urrutia et al. (1995a) follow the dialectal classification of Yrizar (1981, pp. 39–45), who classifies very different varieties of Basque under ELN. The study involved three speakers, each representing a different sub-variety. The results are not presented by speaker, but only as aggregates, so we cannot extract the particular information about the Mixean informant. In addition, there is no information about /y/, a vowel which has been claimed to be contrastive in Mixean Low Navarrese (Michelena, 2011), but is not present in the varieties of the other two informants (Cizean Low Navarrese and Salazarese). We could find no information on whether the instances of /y/ were discarded from the analysis or aggregated to the /u/ tokens, the segment to which it corresponds in the varieties of the two speakers who do not have /y/ in their inventory.
The only acoustic study of a variety of Basque that has a six-vowel inventory, namely Zuberoan Basque, is found in the same book (Urrutia et al., 1995a, pp. 203–234). It presents the aggregate results of two informants of (Northern) Zuberoan Basque, with no information on the number of tokens under analysis. The authors mention [ø] and specify that it is not a contrastive vowel (Urrutia et al., 1995a, p. 206), but an allophonic variant of /y/. However, we could not find a phonological context for this realization in their work, and this proposal is not found in any other work that we are aware of—only Haase (1992) proposes [ø] as an allophone of /y/ in prepausal position, but in Mixean instead of Zuberoan. The study concludes that stress does not play a significant role in the acoustic realization of Zuberoan vowels (Urrutia et al., 1995a, p. 206). The results of the two relevant studies in Urrutia et al. (1995a) are summarized in Table I.
TABLE I. F1 and F2 values (Hz) and duration values (ms) in ELN and Zuberoan (Z) (Urrutia et al., 1995a; Urrutia et al., 1995b).
Vowel | Variety | F1 | F2 | Dur. |
---|---|---|---|---|
/i/ | ELN | 348 | 2277 | 57.0 |
/i/ | Z | 353 | 2442 | 68.0 |
/e/ | ELN | 504 | 1879 | 59.1 |
/e/ | Z | 509 | 2002 | 59.1 |
/a/ | ELN | 730 | 1469 | 65.0 |
/a/ | Z | 821 | 1449 | 71.0 |
/o/ | ELN | 521 | 1058 | 64.7 |
/o/ | Z | 505 | 1024 | 76.7 |
/u/ | ELN | 383 | 1036 | 58.5 |
/u/ | Z | 390 | 949 | 82.1 |
/y/ | ELN | - | - | - |
/y/ | Z | 379 | 1812 | 80.8 |
[ø] | ELN | - | - | - |
[ø] | Z | 416 | 1755 | 78.5 |
All studies,3 with the exception of the study of Zuberoan discussed above, describe five-vowel systems with two back non-low vowels, two front non-low vowels, and a low central vowel. In most cases (all in the case of back vowels), mid vowels are mid-closed: they are closer to high vowels than low vowels. [i] is consistently the highest and most anterior vowel, and [o] tends to be less retracted than [u], although multiple studies report no meaningful differences in F2 for back vowels (Hualde et al., 2010; Salaburu, 1984, etc.).
B. Previous findings: Stops
Only a handful of studies can be found in the literature that analyze the acoustic realization of Basque stops. Two of these studies investigated Zuberoan Basque, which makes them relevant as a means of comparison for our study. Gaminde et al. (2002) analyzed the recordings of four male speakers of Zuberoan Basque (from the region of Pettarra), including a total of 302 analyzed tokens. The study was restricted to word-initial stops in a stressed syllable. The authors describe a three-way contrast involving prevoiced (“negative” voice onset time, VOT; i.e., voicing starts during the closure phase of the stop), plain voiceless (short-lag positive VOT; i.e., voicing starts shortly after the release of the stop), and voiceless aspirated (long-lag VOT; i.e., voicing starts long after the release) stops. Mounole (2004) performed a comparable study of the same dialect, also involving four male participants (from Larrain), including a total of 861 tokens. She analyzed word-initial voiceless stops and found a difference between plain voiceless and voiceless aspirated stops, with no significant differences due to stress (Mounole, 2004, p. 222).
The few other studies on Basque stops found in the literature investigated intervocalic stop lenition. They demonstrated that plain voiceless stops showed a tendency towards voicing and lenition intervocalically (Nadeu and Hualde, 2015; Saadah, 2011), which was strongest for word-final voiceless stops when preceding a word-initial vowel (Hualde et al., 2019).
The studies discussed above show that Zuberoan has a three-way stop opposition based on VOT (pre-voicing or voicing during closure, short-lag, and long-lag VOT) that is not present in most Basque varieties. VOT values increase with increased posteriority for voiceless and voiceless aspirated stops but are roughly similar for all voiced stops. Stops are measured word-initially due to intervocalic lenition, and stress does not seem to be a factor affecting VOT.
C. Previous findings: Sibilants
There are six sibilants—represented orthographically as <s, z, x, ts, tz, tx>—that have traditionally been recognized as common to all Basque dialects. Most authors (Egurtzegi, 2013; Hualde, 2003; Michelena, 2011, inter alia) describe these segments as voiceless apico-alveolar (transcribed as /, /), dorso- or lamino-alveolar (transcribed as /, /), and post-alveolar (transcribed as /ʃ, ʧ/). A number of studies (Larrasquet, 1934; N'Diaye, 1970; Txillardegi, 1982) have described the pronunciation of written <s> as retroflex (instead of apico-alveolar), usually referring to Eastern varieties of Basque, restricting the apico-alveolar realizations to varieties in contact with Spanish (N'Diaye, 1970, p. 15). Yárnoz (2002a,b) described the six sibilants in the Basque variety of Bortziri (Northern High Navarrese) as flat post-alveolar (transcribed by the author as /, /), denti-alveolar (/, /), and palatalized post-alveolar (/ɕ, tɕ/) instead, although only Jurado Noriega (2011) followed this description for other varieties. In addition, some Eastern varieties have developed voiced sibilants that are mostly found in loanwords from Gascon and French and in liaison (Michelena, 2011, p. 230). Some dialects (Bizkaian and Gipuzkoan, in particular) have developed mergers, resulting in a reduction of the size of the sibilant inventory (see Muxika-Loitzate, 2017, and Beristain, 2018b, for recent acoustic studies on sibilant merger in Bizkaian and Gipuzkoan). In Bizkaian and some neighboring varieties, the merger of alveolar sibilants /, , , / results in an apico-alveolar fricative and a lamino-alveolar affricate realization [, ] (maintaining the post-alveolar sibilants), whereas in Gipuzkoan // and // merge in a lamino-alveolar realization ([]). In some varieties in contact with French, the apico-alveolar fricative // merges with the post-alveolar fricative /ʃ/ instead (see Hualde, 2010, for a comprehensive discussion of sibilant mergers). 
Some authors have even concluded that the sibilant merger is a completed phonological process in Basque (Urrutia et al., 1991, p. 331), but this assertion is only widely accepted for western dialects (Yárnoz, 2002b, p. 14).
As in the case of the vowels, the study that is geographically most relevant for comparison with our analysis is that by Urrutia et al. (1991), in which some acoustic parameters of the sibilants of different eastern varieties of Basque are described. However, their study of ELN sibilants shares the problematic dialectal classification followed in their aforementioned study on vowels (see Sec. II A) and, most importantly, they report the lower energy cut-off frequencies of the sibilants instead of their spectral center of gravity (CoG), reducing the comparability with more recent studies. The recent studies that analyze sibilant CoGs differ from our study in various aspects: Iglesias et al. (2016) used nonce words instead of lexical items, Gandarias et al. (2014) included three tokens for each sibilant, Beristain (2018b) and Muxika-Loitzate (2017) analyzed fricative sibilants (rather than both fricatives and affricates), Hualde (2010) and Iglesias et al. (2016) reported results of a single speaker, and Gandarias et al. (2014) only provided complete CoG values from one speaker (female from Lekeitio) even though the speech of three speakers was analyzed. All of these studies analyzed speakers from central or western dialects, while Mixean is one of the most eastern varieties of Basque.
Studies4 on western varieties have consistently reported mergers (as described above), while studies on High Navarrese have reported maintenance of all six (fricative and affricate) voiceless sibilant phonemes. Voiced sibilants tend to be linked to eastern varieties (although they can also be found in some western varieties, see Gandarias et al., 2014), but no study measuring the CoG of an eastern variety can be found in the literature. In varieties with no sibilant mergers, lamino-alveolar sibilants have the highest CoGs, followed by apico-alveolars and then by post-alveolars.
D. Previous findings: Rhotics
Most, if not all of the previous studies on the realization of Basque rhotics have been performed by Gaminde and colleagues (Gaminde, 2006; Gaminde et al., 2017; Gaminde et al., 2016). Gaminde et al. (2016) studied the different realizations of intervocalic taps in 15 speakers (20–23 years old) of Bizkaian Basque. They measured duration, formants, and acoustic energy in a total of 330 tokens of read speech. They observed realizations of five allophones of the intervocalic tap in Basque—[ɾ, , , , r]—classified by whether the rhotic showed lingual contact and whether it showed fricative noise (partial frication, frication throughout the rhotic, or no frication at all). Gaminde et al. (2017) followed this work with a larger study on the realization of intervocalic trills, involving 155 young speakers (23–36 years old) from the whole Basque-speaking territory. Their results showed that the majority (50%) of canonical trills were realized with two contacts, with far less occurrence of three (8.54%) and four (0.4%) contacts. Among the trills realized with incomplete occlusions (Lindau, 1985), most showed a tap followed by an approximant (27.13%) or fricative (10.67%).
Gaminde et al. (2017) showed that speakers of the central and western Basque dialects (in contact with Spanish) tend to use the alveolar trill, while speakers of eastern Basque dialects (in contact with French) use voiced uvular rhotics [ʁ, ʀ] instead. Interestingly, they reported that all of the speakers of eastern dialects in their study consistently used uvular rhotics. All studies have reported a wide range of variation in the production of both taps (Gaminde et al., 2016) and trills (Gaminde et al., 2017).
III. METHODOLOGY
The data used for the acoustic analyses presented in the current study include audio recordings from Camino (2016), compiled after intensive fieldwork in the Mixe region during the last four decades. Given the difficulty of accessing this population, these are likely to be the only available recordings of Mixean. In addition, the advanced age of the participants makes any later recording problematic. Camino (2016) recorded more than an hour of audio data from each of 15 speakers of Mixean Basque. All speakers lived in different villages from the Mixe region. From this data set, we have selected ten audio recordings that were made between 2005 and 2015. The selected audio files were recorded in rural environments, with a portable minidisc recorder (SONY MZ-R30), at a sampling rate of 44.1 kHz. Recordings of speakers older than 85 years were excluded from the analysis to avoid age-related phonetic biases, and recordings made in the 1980s with a DAT recorder were excluded in order to obtain homogeneous data regarding the state of the language and the recording method used. The resulting corpus includes recordings from seven male speakers and three female speakers from ten different villages of the region of Mixe (Donapaleu, Uhartehiri, Sorhapürü, Arrüeta, Martxüeta, Labetze, Amendüze, Gamue, Zohota, and Arberatze), with ages ranging from 80 to 85 years [μ = 83, standard deviation (SD) = 1.7]. The length of the recordings was 5.5 min on average, with a range of 3.5 min to 8.5 min (SD = 1.7). All audio recordings were force-aligned using the WebMAUS application (Kisler et al., 2017) set to Basque (FR); the resulting Praat TextGrids were subsequently hand-corrected by either the first author, the second author, or a graduate student in phonetics. All acoustic analyses were performed in Praat (Boersma and Weenink, 2019) using custom-written functions created by the second author, following the protocols outlined below.
A. Methods: Vowels
We analyzed a total of 2221 vowel tokens, ranging between 112 and 370 tokens per speaker (SD = 72.2). Formant estimations were made using the Burg LPC method in Praat. The formant estimator was optimized for each speaker by manually adjusting the maximum formant parameter (five formant estimation) until the formant trajectories aligned consistently with visible formant bands in a broad-band spectrogram. Average F1, F2, and F3 measurements were then obtained within the middle 10% of each vowel interval (i.e., a window equal to 10% of the vowel duration, centered on the vowel midpoint). Additionally, the duration of the entire vowel interval (in ms) was measured and logged. The formant measurements were then imported into R (R Core Team, 2019), where speaker-normalized values were computed using Lobanov normalization (i.e., z-score transformation). The resulting z-scores were then converted back to the Hz scale using the average standard deviation and grand mean of all ten speakers. These normalized formant values maintain the interpretability of the Hz scale, while also retaining the speaker-specific normalized structure of the z-scores.5 For the characterization of the acoustic vowel space, the speaker-normalized F1 and F2 values were combined for each of the six vowels /i, e, a, o, u, y/ and vowel-specific z-scores were computed. Outliers were removed from each vowel category by excluding observations associated with vowel-specific F1 or F2 z-scores with absolute values greater than 3.
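The normalization step described above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study. The function name `lobanov_to_hz` and the toy values are hypothetical, and two details are assumptions made for the sketch: the sample standard deviation is used, and the grand mean is taken as the mean of the per-speaker means.

```python
from statistics import mean, stdev


def lobanov_to_hz(values_by_speaker):
    """Lobanov-normalize one formant per speaker, then rescale to Hz.

    values_by_speaker maps a speaker label to a list of formant values (Hz).
    Assumptions (not stated in the text): sample SD, and a grand mean
    computed as the mean of the per-speaker means.
    """
    stats = {spk: (mean(v), stdev(v)) for spk, v in values_by_speaker.items()}
    grand_mean = mean(m for m, _ in stats.values())
    average_sd = mean(s for _, s in stats.values())

    rescaled = {}
    for spk, values in values_by_speaker.items():
        spk_mean, spk_sd = stats[spk]
        # z-score within speaker, then map back onto a shared Hz scale
        rescaled[spk] = [
            (x - spk_mean) / spk_sd * average_sd + grand_mean for x in values
        ]
    return rescaled


# Toy F1 values (Hz) for two hypothetical speakers
toy_f1 = {"spk_a": [300.0, 500.0, 700.0], "spk_b": [400.0, 600.0, 800.0]}
normalized_f1 = lobanov_to_hz(toy_f1)
```

In the toy example the two speakers differ only by a constant offset, so their rescaled values coincide. This is the intended effect of removing speaker-specific location and scale while keeping values readable in Hz.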
B. Methods: Stops
Only stops in utterance-initial position were included in the analysis, since stop lenition occurs intervocalically (Nadeu and Hualde, 2015), even across word boundaries (Hualde et al., 2019). In this environment, the palatal stop series /ɟ, c, cʰ/ was produced rarely in the data: /ɟ/ was produced twice by one speaker and once each by three speakers; /c/ was produced four times by one speaker, three times by another speaker, and only once by a third speaker; /cʰ/ was produced only once by a single speaker. This stop series was therefore excluded from the final analysis. We analyzed a total of 717 stop consonant tokens, ranging between 41 and 114 tokens per speaker (SD = 25.2). Measurements of VOT were made for the nine stop consonants /b, d, ɡ, p, t, k, pʰ, tʰ, kʰ/. VOT was first estimated automatically using AutoVOT (Keshet et al., 2014) in Python (Python Software Foundation, 2019), trained on a subset of manually-annotated Praat TextGrids. The onsets and offsets of the estimated VOT segments were then manually verified by the second author and hand-corrected as needed. Before the final analysis, speaker-normalized z-scores were computed, and outliers were removed by excluding observations associated with z-scores with absolute values greater than 3.
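The outlier criterion used here (and for the other segment classes) can be illustrated with a short sketch. The function names and the toy VOT values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def zscores(values):
    """Return z-scores of a list of measurements (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def drop_outliers_by_speaker(vot_by_speaker, threshold=3.0):
    """Keep only tokens whose within-speaker |z| does not exceed threshold.

    Returns a dict mapping each speaker to a list of (vot_ms, z) pairs.
    """
    kept = {}
    for spk, vots in vot_by_speaker.items():
        kept[spk] = [
            (v, z) for v, z in zip(vots, zscores(vots)) if abs(z) <= threshold
        ]
    return kept


# Toy data: 19 short-lag tokens and one implausibly long VOT (ms)
toy_vot = {"spk_a": [10.0] * 19 + [200.0]}
cleaned_vot = drop_outliers_by_speaker(toy_vot)
```

With these toy values the single 200 ms token has a z-score above 3 and is dropped, while the remaining 19 tokens are retained.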
C. Methods: Sibilants
We analyzed a total of 1494 sibilant consonant tokens, ranging between 97 and 214 tokens per speaker (SD = 37.8). In the production of sibilants, the sound source is located in the front cavity, and the tight vocal tract constriction results in a weakening of the acoustic coupling between the front and back cavities; thus, the resonant frequencies are primarily associated with the length of the front cavity (Johnson, 2003, pp. 124–125). As such, in the absence of articulatory measurements (e.g., electropalatography), spectral CoG can be used as an approximation of place of articulation for sibilant consonants. Accordingly, CoG measurements for the sibilants /ʃ, ʧ, , , , , ʒ, , / were obtained in Praat. CoG measurements were calculated from the absolute spectrum after band-pass filtering the audio (from 300 Hz to 19 kHz). For fricative consonants, average CoG measurements were obtained within the middle 10% of the consonant interval (i.e., the same analysis window used for formant measurements of the vowels). For affricate consonants, the same 10% window was used for analysis; however, instead of centering the window on the midpoint of the consonant (i.e., 50% of the consonant interval) the window was centered on 75% of the consonant interval. This helped to ensure that average CoG values were obtained in the fricative portion of the affricate and excluded the closure and/or release.
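As a rough illustration of the two ingredients of this measurement, the sketch below computes an analysis window inside a segment and a spectral centre of gravity from a short stretch of samples. It is a plain-Python stand-in, not the Praat procedure used in the study. The function names are hypothetical, the band-pass filter is approximated by simply ignoring spectral bins outside 300 Hz to 19 kHz, and weighting by the magnitude spectrum (`power=1.0`) is one reading of "absolute spectrum".

```python
import cmath
import math


def analysis_window(start, end, center_fraction=0.5, width_fraction=0.1):
    """Return (t0, t1) of a window covering width_fraction of the segment,
    centered at center_fraction of its duration (0.5 for fricatives,
    0.75 for affricates in the procedure described above)."""
    duration = end - start
    center = start + center_fraction * duration
    half = width_fraction * duration / 2.0
    return center - half, center + half


def spectral_cog(samples, sample_rate, f_lo=300.0, f_hi=19000.0, power=1.0):
    """Centre of gravity (Hz) of the spectrum between f_lo and f_hi.

    Uses a direct DFT, which is slow but dependency-free; bins outside
    the band are skipped as a crude substitute for band-pass filtering.
    """
    n = len(samples)
    weighted_sum = 0.0
    weight_total = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if freq < f_lo or freq > f_hi:
            continue
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        weight = abs(coeff) ** power
        weighted_sum += freq * weight
        weight_total += weight
    return weighted_sum / weight_total


# Toy check: a 4 kHz sinusoid, 10 ms at 44.1 kHz
sample_rate = 44100
tone = [math.sin(2 * math.pi * 4000 * t / sample_rate) for t in range(441)]
cog_hz = spectral_cog(tone, sample_rate)
fricative_window = analysis_window(1.0, 2.0)
affricate_window = analysis_window(1.0, 2.0, center_fraction=0.75)
```

For the pure tone, the centre of gravity should fall very close to 4000 Hz, since nearly all spectral energy sits in a single bin. Real sibilant noise spreads energy across the band, so the CoG summarizes where that energy is concentrated.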
Although CoG measurements are not necessarily expected to vary between speakers due to the same factors that condition inter-speaker differences in formant frequencies of vowels (e.g., differences in vocal tract length between males and females), it is expected that some degree of inter-speaker variation in CoG may be observed due to differences in speaker morphology (e.g., height of the palate, formation of the teeth, etc.). Because of this, CoG values were Lobanov normalized before conversion back to Hz, in the same manner as previously described for formant values. Before the final analysis, outliers were removed by excluding observations associated with z-scores with absolute values greater than 3.
D. Methods: Rhotics
We analyzed a total of 171 rhotic tokens, ranging between 9 and 29 tokens per speaker (SD = 6.2). The number of lingual contacts made during the productions of /r/ and /ɾ/ was estimated programmatically in Praat according to the following method. The audio signal was first low-pass filtered at 2000 Hz, and an intensity/amplitude (dB) trajectory was generated from the filtered signal. Second, the intensity minima within the interval of each /r, ɾ/ token were identified, and the total number of minima within each interval was logged. We interpret each intensity minimum as a lingual contact, according to the understanding that any sort of intra-oral constriction will result in a loss of overall energy, due to the increased airflow impedance; a rapid constriction associated with a lingual tap is expected to result in a corresponding rapid loss of acoustic energy, thus producing an intensity minimum. Three environments were investigated: intervocalic rhotics, rhotics in onset clusters, and rhotics in coda position. As previously described in Sec. I B, rhotics only contrast intervocalically in Basque: they are neutralized in onset clusters as well as in coda position. Thus, separate analyses were carried out for /r/ and /ɾ/ in intervocalic position, but /r/ and /ɾ/ items were combined for the analyses of onset clusters and codas.
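The counting step can be sketched as below, assuming the low-pass filtering and intensity extraction have already been done and that the intensity contour within a rhotic interval is available as a list of dB values. The function name and the toy contours are hypothetical. The sketch counts strict local minima only and applies no minimum-depth criterion, since none is described in the text.

```python
def count_intensity_minima(contour_db):
    """Count strict local minima in an intensity contour (dB values
    sampled within one rhotic interval). Each minimum is interpreted
    as one lingual contact."""
    count = 0
    for i in range(1, len(contour_db) - 1):
        if contour_db[i - 1] > contour_db[i] < contour_db[i + 1]:
            count += 1
    return count


# Toy contours: one dip (tap-like) and two dips (two-contact trill-like)
tap_like = [62.0, 55.0, 61.0]
trill_like = [62.0, 52.0, 60.0, 50.0, 63.0]
n_tap = count_intensity_minima(tap_like)
n_trill = count_intensity_minima(trill_like)
```

A contour that only rises or only falls yields zero minima under this definition, which would correspond to a rhotic realized without a detectable lingual contact.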
IV. RESULTS
A. Results: Vowels
The speaker-normalized F1/F2 acoustic vowel space is shown in Fig. 1. The ellipses shown denote 50% of the data variation for each vowel category (i.e., data ellipses, not confidence interval ellipses). The colored vowel symbols denote the F1/F2 mean for each category; black bars connect the category means in order to help visualize the overall shape of the acoustic vowel space. The average F1, F2, F3, and duration values for the six vowel categories are given in Table II.
(Color online) Acoustic vowel space of speaker-normalized F1/F2 values (Hz) for Mixean Basque. Ellipses represent coverage of 50% of the data in each vowel category.
TABLE II. Average F1, F2, and F3 values (Hz) and duration values (ms) of vowels in Mixean Basque.
Vowel | F1 | F2 | F3 | Dur. |
---|---|---|---|---|
/i/ | 412 | 2241 | 2946 | 81 |
/e/ | 510 | 1845 | 2724 | 76 |
/a/ | 683 | 1461 | 2714 | 88 |
/o/ | 541 | 1138 | 2758 | 98 |
/u/ | 454 | 1079 | 2804 | 85 |
/y/ | 441 | 1629 | 2711 | 79 |
With regard to F1 (i.e., acoustic vowel height), a three-way distinction can be observed: /i, y, u/ are realized as high vowels with similar F1 values, /e, o/ are realized as mid vowels with similar F1 values, and /a/ is realized as a low vowel. However, it should be noted that the F1 distinction between the high vowels /i, y, u/ and the mid vowels /e, o/ is not as great as the F1 distinction between the mid vowels /e, o/ and the low vowel /a/, as observed for other Basque varieties in previous works. With regard to F2 (i.e., acoustic vowel anteriority/posteriority), the results are somewhat more varied. Among the three high vowels, there is a clear three-way distinction in which /i/ and /u/ are realized as front and back vowels, respectively, while /y/ seems to be realized as a central (rather than front) vowel. In order to test for differences among the high vowels, and thus properly characterize the acoustic realization of /y/, linear mixed effects (LME) models were constructed using the lme4 R package (Bates et al., 2015) with either F2 or F3 as the response, Phone as the fixed factor, and random intercepts by Speaker and Word. Tukey post hoc comparisons were constructed using the multcomp package (Hothorn et al., 2008), with α level compensation for multiple comparisons using Benjamini-Hochberg adjustment; significant differences are reported at an adjusted p-level of 0.05. With regard to F2, all pairwise differences are significant, and can be summarized as /i/ > /y/ > /u/. With regard to F3, all pairwise differences are significant, and can be summarized as /i/ > /u/ > /y/. In the absence of articulatory data, these acoustic results suggest that /y/ is rounded (i.e., low F3) but centralized (i.e., F2 that is midway between /i/ and /u/), and would thus more appropriately be characterized as /ʉ/ in this variety.
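For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following sketch shows how adjusted p-values are obtained from a set of raw p-values. It is a generic illustration of the procedure, not the multcomp implementation. The function name is hypothetical and the p-values are invented toy numbers, not results from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), and the
    adjusted values are made monotone by taking running minima from the
    largest p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for three pairwise comparisons (illustrative only)
toy_p = [0.01, 0.04, 0.03]
toy_adjusted = benjamini_hochberg(toy_p)
significant = [p < 0.05 for p in toy_adjusted]
```

A comparison is then reported as significant when its adjusted value falls below the chosen threshold (0.05 in the text).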
The mid-vowels /e, o/ display a clear front-back distinction (although /e/ is much more retracted in comparison to /i/), while the low vowel /a/ is realized as a central vowel along the F2 dimension. These results suggest that the Mixean Basque vowel system is characterized by two front vowels (/i/ and /e/), two central vowels (/ʉ/ and /a/), and two back vowels (/u/ and /o/).
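The Benjamini-Hochberg adjustment applied to the pairwise comparisons can be illustrated with a minimal Python sketch. This is not the analysis code used in the study, which relied on the lme4 and multcomp R packages; the function name and the raw p-values below are hypothetical and serve only to show how adjusted p-values are obtained from raw ones.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        candidate = p_values[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds that of a larger raw p.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for three pairwise contrasts (not taken from the study).
contrasts = ["i - y", "i - u", "y - u"]
raw_p = [0.001, 0.0004, 0.03]
adjusted_p = benjamini_hochberg(raw_p)

for name, p_adj in zip(contrasts, adjusted_p):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"{name}: adjusted p = {p_adj:.4f} ({verdict})")
```

With three contrasts, the smallest raw p-value is multiplied by 3, the next by 3/2, and the largest is left unchanged, after which each adjusted value is capped by the adjusted value of the next larger raw p-value.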
B. Results: Stops
VOT values for voiced stops /b, d, ɡ/, unaspirated stops /p, t, k/, and aspirated stops /pʰ, tʰ, kʰ/ are displayed in box plots in Fig. 2. In this figure and all similar box plot figures in the paper, mean values are displayed as white circles, median values are displayed as horizontal black bars, inward-facing notches display the standard error around the median (if the notches of two boxes do not overlap along the y axis, this suggests that the medians are significantly different), the boxes denote the inter-quartile range (IQR; the middle 50% of the data), and the whisker lines denote 1.5 × the IQR; outliers have been removed to aid visualization. The results suggest that the three stop categories in Mixean Basque differ in the expected direction: voiced stops are produced with voicing during the closure phase, unaspirated and aspirated stops are produced with positive VOT, and aspirated stops are produced with greater positive VOT values than unaspirated voiceless stops. However, the range of VOT values is substantially smaller in the Mixean variety than in the Zuberoan variety. Note that VOT values referring to voicing during closure are denoted by negative values, following conventional use in studies on VOT.
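The box plot elements described above can be computed as in the following Python sketch. The VOT values are hypothetical, the quartile method is one of several in common use, and the notch half-width of 1.58 × IQR / √n is an assumed convention (the one used by several plotting tools), not necessarily the exact computation behind the figures.

```python
import math
import statistics


def box_plot_summary(values, whisker_factor=1.5, notch_factor=1.58):
    """Summarize one box: mean, quartiles, whisker ends, notch, and outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Assumed notch convention: median +/- notch_factor * IQR / sqrt(n).
    half_notch = notch_factor * iqr / math.sqrt(len(data))
    return {
        "mean": statistics.fmean(data),
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),
        "notch": (median - half_notch, median + half_notch),
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical VOT values in ms for one stop category (not data from the study).
vot_ms = [8, 10, 12, 14, 15, 16, 18, 20, 22, 60]
summary = box_plot_summary(vot_ms)
print(summary)
```

In this example the 60 ms token lies beyond the upper whisker limit and is therefore listed as an outlier, which is the kind of point omitted from the figures.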
Both Gaminde et al. (2002) and Mounole (2004) report a distinction between unaspirated and aspirated voiceless stops in the Zuberoan variety, indicating a three-way voicing distinction. In our results for the Mixean variety, although the voiceless consonants that are (ostensibly) aspirated are indeed realized with overall greater positive VOT values than their unaspirated counterparts, the distinction between the two groups is not as large as has been reported for the Zuberoan variety. Recall from the introduction that stop aspiration has been lost in most Basque dialects and has been reported to show recession in the dialects that maintain it. Therefore, it is of interest for our current study to determine in an objective, data-driven manner whether a three-way contrast does exist in Mixean Basque. To this end, we performed clustering of speaker-normalized VOT values (z-scores) with the mclust R package (Fraley et al., 2019), using the Bayesian Information Criterion (BIC) based on finite Gaussian mixture modeling to determine the optimal number of clusters present in the data. The results suggest that the overall distribution is indeed best described by three clusters/groups. The proportions of items belonging to these groups are shown in Fig. 3. In this figure, the three panels correspond to the three groups identified by the Gaussian clustering, and bars are shown in each panel for each of the nine consonant categories that are suggested by the orthographic representations. A given bar displays the percentage of the total number of items of the given phone that are identified as belonging to the given cluster. The average VOT for the given consonant belonging to the cluster is displayed above each bar. 
By way of example, we take the clustering results for /ɡ/: 81% of /ɡ/ items were identified as belonging to Group 1, with an average VOT of -40 ms for these items; 12% of /ɡ/ items were identified as belonging to Group 2, with an average VOT of 15 ms for these items; and 7% of /ɡ/ items were identified as belonging to Group 3, with an average VOT of 34 ms for these items.
(Color online) VOT (ms) values of stop consonants in Mixean Basque, separated into three clusters identified by finite Gaussian mixture modeling.
These results suggest that a three-way voicing contrast is indeed present in the variety. However, the classification of the observations based on these groups/clusters does not clearly delineate categories comprised solely of the three stop consonant groups /b, d, ɡ/, /p, t, k/, and /pʰ, tʰ, kʰ/. Group 1 consists of consonants with average pre-voicing values in the range of -44 ms to -40 ms (i.e., from 40 to 44 ms of voicing during the closure). Only /b, d, ɡ/ items are included in this group, and the majority (81%–94%) of the items for these consonants are included in this group for each of the three consonants. This suggests that Group 1 represents voiced consonants, comprised solely of /b, d, ɡ/ items, and that these /b, d, ɡ/ items are nearly categorically realized with voicing during the closure (i.e., pre-voicing).
However, considerably more variation can be observed for Groups 2 and 3. Group 2 consists of consonants with average positive VOT values in the range of 14–20 ms. Interestingly, items from all nine stop consonants are realized with VOT values in this range. While the most prominent group of consonants in this cluster is indeed the unaspirated voiceless /p, t, k/ triad (82%, 81%, and 61% of the total number of items for these consonants, respectively), a small percentage of each of the voiced consonants /b, d, ɡ/, as well as a much larger percentage of the aspirated voiceless /pʰ, tʰ, kʰ/, are also produced with VOT values in this 14–20 ms range. Finally, Group 3 consists of consonants with average positive VOT values in the range of 34–42 ms. While there are no instances of voiced /b, d/ in this group, 7% of the total number of /ɡ/ items are produced with VOT values in this range, as well as a higher percentage of unaspirated voiceless /p, t, k/ (18%, 19%, and 39% of the total number of items for these consonants, respectively). However, the most prominent group of consonants in this third cluster is indeed the aspirated voiceless /pʰ, tʰ, kʰ/ triad (57%, 71%, and 81% of the total number of items for these consonants, respectively).
Overall, the VOT results for the stop consonants suggest that there is a clear categorical distinction between voiced (negative VOT values, i.e., voicing starts during the closure) and voiceless (positive VOT values, i.e., voicing starts after the release) consonants, but that the sub-distinction of aspiration vs non-aspiration among the voiceless consonants displays a substantial degree of overlapping phonetic realizations. It has previously been shown that phonological stop aspiration can vary greatly from town to town within the Zuberoan variety, including recession of this feature (Michelena, 2011). This intra-dialectal variation, if also present in Mixean Basque, may provide a possible explanation for the phonetic variation observed here.
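The clustering procedure used in this section (speaker-wise z-scoring of VOT followed by Gaussian mixture modeling with BIC-based selection of the number of clusters) can be sketched in Python as follows. This is an illustrative stand-in for the mclust analysis rather than a reimplementation of it: the data are synthetic, the speaker labels and function names are hypothetical, the mixture is fitted with a plain one-dimensional EM algorithm with unequal variances, and BIC is written in the lower-is-better form (sign conventions differ between implementations).

```python
import math
import random


def z_score_by_speaker(vot_by_speaker):
    """Pool VOT values after z-scoring them within each speaker."""
    pooled = []
    for values in vot_by_speaker.values():
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
        pooled.extend((v - mean) / sd for v in values)
    return pooled


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_density(x, weights, means, variances):
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; return weights, means, variances, log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    # Start the means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var / (k * k)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances (with a floor against collapse).
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_floor, spread)
    loglik = sum(math.log(mixture_density(x, weights, means, variances) or 1e-300) for x in xs)
    return weights, means, variances, loglik


def bic(loglik, k, n):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return n_params * math.log(n) - 2 * loglik


# Synthetic VOT values (ms) for two hypothetical speakers; not data from the study.
rng = random.Random(0)
speakers = {}
for name, offset in (("S1", 0.0), ("S2", 5.0)):
    tokens = []
    for centre in (-42.0, 16.0, 38.0):
        tokens.extend(rng.gauss(centre + offset, 5.0) for _ in range(60))
    speakers[name] = tokens

z_vot = z_score_by_speaker(speakers)
bic_by_k = {}
for k in range(1, 5):
    *_, loglik = fit_gmm_1d(z_vot, k)
    bic_by_k[k] = bic(loglik, k, len(z_vot))
best_k = min(bic_by_k, key=bic_by_k.get)
print(bic_by_k, best_k)
```

Once a model has been selected, each token can be assigned to the component with the highest responsibility, and the per-consonant proportions in each cluster can then be tabulated in the manner of Fig. 3.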
C. Results: Sibilants
Box plots of speaker-normalized CoG values for the sibilant consonants are displayed in Fig. 4. The average values among all nine consonants range from 3365 Hz (for /ʒ/) to 5088 Hz (for //). In order to test for significant differences between the phones, an LME model was constructed with CoG as the response, Phone as the fixed factor, random slopes and intercepts by Speaker, and random intercepts by Word. Tukey post hoc comparisons were constructed with α level compensation for multiple comparisons using Benjamini-Hochberg adjustment; significant differences are reported for adjusted p-level 0.05.
The group of voiced sibilants /ʒ, , / is realized with the lowest CoG values;6 these three consonants have similar CoG values, except for a marginally significant difference (p = 0.027) between /ʒ/ and // due to the slightly raised CoG for //. The group of voiceless laminal sibilants /, / is realized with the highest CoG values, with no significant difference between them. It is of interest to note that both fricative and affricate apical sibilants have similar CoG values to their palatal counterparts (i.e., no significant differences among /ʃ, tʃ, , /), suggesting a merger in the place of articulation of the apical and palatal categories; this pattern is consistent for both voiced and voiceless sibilants. Regarding CoG values for fricative and affricate counterparts, there are no significant intra-pair differences among any of the three pairs /ʃ, ʧ/, /, /, and /, /. This suggests that, in each of the three cases, the place of articulation is consistent for the plain fricative and the fricative portion of its corresponding affricate, unlike other varieties of Basque.7
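For readers unfamiliar with the measure, spectral center of gravity can be thought of as the mean frequency of a spectrum weighted by its energy. The Python sketch below illustrates this on a synthetic two-component signal. Weighting by power (rather than by magnitude) and restricting the computation to the 300 Hz to 19 kHz band as a stand-in for band-pass filtering are assumptions made for illustration; the function names are hypothetical and this is not the measurement script used in the study.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Naive DFT returning (frequencies in Hz, power) for the non-negative frequency bins."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def center_of_gravity(freqs, power, low_hz=300.0, high_hz=19000.0):
    """Power-weighted mean frequency within the analysis band (an assumed definition)."""
    band = [(f, p) for f, p in zip(freqs, power) if low_hz <= f <= high_hz]
    total = sum(p for _, p in band)
    return sum(f * p for f, p in band) / total


# Synthetic signal: a 4000 Hz component with twice the amplitude of a 6000 Hz component.
sample_rate = 44100
n_samples = 441  # gives 100 Hz bins, so both components fall exactly on a bin
signal = [
    2.0 * math.sin(2 * math.pi * 4000 * t / sample_rate)
    + 1.0 * math.sin(2 * math.pi * 6000 * t / sample_rate)
    for t in range(n_samples)
]
freqs, power = power_spectrum(signal, sample_rate)
cog = center_of_gravity(freqs, power)
print(f"CoG = {cog:.1f} Hz")
```

Because power scales with the square of amplitude, the lower component carries four times the energy of the higher one, so the center of gravity is pulled towards 4000 Hz: (4 × 4000 + 1 × 6000) / 5 = 4400 Hz. The same logic underlies the observation that additional low-frequency energy in voiced sibilants lowers their CoG.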
D. Results: Rhotics
As mentioned in Sec. I, the opposition between the two Basque rhotics (a tap and a trill) is only realized intervocalically. Rhotics neither occur word-initially nor do they contrast in onset clusters or in codas. Thus, it is important to differentiate these three contexts for their analysis, as outlined in the current section.
The results for the number of lingual contacts produced in the tap /ɾ/ and in the trill /r/ are shown in Table III. For the intervocalic tap /ɾ/, the majority of items (82.4%) are produced with a single lingual contact, as expected for the phonetic realization of a tap. A small portion of the items (10.5%) are produced without any measured lingual contact; these realizations may represent occurrences of allophones such as [, ], reported by Gaminde et al. (2016) for Bizkaian Basque. An even smaller portion of the items (7.1%) are produced with multiple contacts, suggesting that the tap /ɾ/ is sometimes (but infrequently) produced as a trill in the Mixean variety of Basque. For the intervocalic trill /r/, the proportion of items realized with a single contact (43.0%) is nearly equal to the proportion realized with two contacts (40.9%). Slightly over half of the total number of /r/ items (56.5%) are produced with multiple lingual contacts (i.e., two or more taps), and a single item (0.5% of the total data) is produced with no measured lingual contacts. The high proportion of single-contact items is surprising, given that trills are characterized by multiple articulator contacts; this result suggests a possible merger in progress of /r/ to /ɾ/ in Mixean Basque. Nevertheless, it could also be the case that rhotics form a stable non-robust opposition in Mixean, with a gradient difference and no clear boundary between categories. If so, the opposition would be conditioned by the mostly predictable distribution of the two sounds, which are only contrastive in a small subset of phonological contexts (see Currie Hall, 2009; Hualde, 2004; Renwick and Ladd, 2016; Renwick, 2014; Simonet, 2005).
Percentage of total number of taps realized in alveolar rhotics of Mixean Basque in onset clusters, intervocalic position, and syllable codas.
Taps | Intervocalic /ɾ/ | Intervocalic /r/ | Cluster /ɾ/∼/r/ | Coda /ɾ/∼/r/ |
---|---|---|---|---|
0 | 10.5 | 0.5 | 35.7 | 54.8 |
1 | 82.4 | 43.0 | 60.7 | 34.7 |
2 | 5.7 | 40.9 | 3.6 | 6.5 |
3 | 1.4 | 13.4 | - | 2.4 |
4 | - | 2.2 | - | 1.6 |
In comparison to the intervocalic environment, the occurrence of alveolar rhotic items produced without any lingual contact is much greater in the onset cluster and coda environments: 35.7% of /ɾ, r/ items are produced without any lingual contact when they occur in onset clusters, while the majority of alveolar rhotics (54.8%) are produced without any lingual contact when they occur in coda position. However, in both environments, some of the alveolar rhotic items are produced with at least one tap. In onset clusters, the majority of items (60.7%) are produced with a single tap, and a small proportion of items (3.6%) are produced with two taps; no items are produced with three or more lingual taps. In coda position, while the majority of items (54.8%) are produced without any lingual contact, a substantial proportion of alveolar rhotics are produced with one (34.7%) or multiple (10.5%) taps, including rare cases of three and four taps.
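The aggregate percentages quoted in this section follow directly from Table III, as the short Python check below shows. The dictionary simply transcribes the table; the variable and function names are illustrative.

```python
# Percentages from Table III, keyed by number of lingual contacts (taps).
TABLE_III = {
    "intervocalic tap": {0: 10.5, 1: 82.4, 2: 5.7, 3: 1.4},
    "intervocalic trill": {0: 0.5, 1: 43.0, 2: 40.9, 3: 13.4, 4: 2.2},
    "onset cluster": {0: 35.7, 1: 60.7, 2: 3.6},
    "coda": {0: 54.8, 1: 34.7, 2: 6.5, 3: 2.4, 4: 1.6},
}


def multiple_contact_share(distribution):
    """Percentage of items produced with two or more lingual contacts."""
    return round(sum(pct for taps, pct in distribution.items() if taps >= 2), 1)


for context, distribution in TABLE_III.items():
    total = round(sum(distribution.values()), 1)
    print(f"{context}: multiple contacts = {multiple_contact_share(distribution)}% "
          f"(column total = {total}%)")
```

The multiple-contact shares obtained in this way are 7.1% for the intervocalic tap, 56.5% for the intervocalic trill, 3.6% for onset clusters, and 10.5% for codas, and each column sums to 100%.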
V. DISCUSSION
Based on the previous literature, we expected to find six vowel categories in Mixean Basque: the common Basque vowels /a, e, o, i, u/, and a high front rounded vowel /y/, as found in Zuberoan. Our acoustic analysis confirms that there are six vowels in Mixean. Nevertheless, our results suggest that the Mixean vowel system is characterized by two front vowels (/i/ and /e/), two central vowels (/y/ ∼ [ʉ] and /a/), and two back vowels (/u/ and /o/). The retraction of Mixean /y/ has been auditorily described as closer to /ø/ (Haase, 1992, p. 29), although our results suggest that the sixth vowel found in Mixean is better represented as /ʉ/, a segment not found in any other variety of the Basque language. In addition, all high vowels in Mixean seem to be comparatively lower than their equivalents in other Basque dialects. The observation that Zuberoan high vowels are lower than those found, for instance, in French has been made by other authors as early as Larrasquet (1932), but our results for Mixean point to even lower high vowels than those described by Urrutia et al. (1991) for Zuberoan Basque. Thus, Mixean high vowels /i, y, u/ are actually articulated with mean formants close to the values expected of [e, ɵ, o].
Regarding Mixean stops, our study has found evidence that there is still a three-way laryngeal distinction in the language (pre-voiced, plain voiceless, and voiceless aspirated). Nevertheless, the mean VOT values that resulted from our study are less extreme than those reported by previous authors for the Zuberoan dialect. Although the finite Gaussian mixture modeling applied to our data suggested the presence of three stop series, the clusters that resulted from this classification did not consistently reflect the orthographic notation. These two observations suggest that the three-way distinction is not as strong in Mixean Basque as has been reported for Zuberoan Basque, and that distinctive aspiration is perhaps being lost in this variety, as has arguably occurred in other Basque varieties (Egurtzegi, 2018a).
Within sibilants, we did find a categorical difference between apical and laminal sibilants, but our results are not consistent with the most general description of the place of articulation of sibilant segments (namely, apico-alveolar, lamino-alveolar, and post-alveolar/alveolo-palatal). Like the other studies outlined in this paper, our study found lower CoG values for the apical sibilants than for the laminal sibilants, which is in line with the observation that apical sibilants may be articulated with a post-alveolar (or even retroflex) place of articulation, as suggested by a number of previous descriptions. However, no differences in CoG were observed between the apical and alveolo-palatal sibilants, suggesting a merger in place of articulation between the two sets of segments. This process could involve one of three scenarios: (1) a merger of apical to alveolo-palatal, as described in Hualde (2010); (2) a merger of alveolo-palatal to apical; or (3) a merger of both categories towards an intermediate place of articulation. In comparing our CoG measurements to the values reported in other studies, the CoG values for the apical sibilants are similar to those reported by Hualde (2010) for High Navarrese; however, the values for the alveolo-palatal sibilants are lower in High Navarrese than those observed in the current study. We believe that these results suggest that the alveolo-palatal sibilants have merged to apico-(post)alveolar in the Mixean variety, i.e., scenario 2 above. Finally, unlike previous reports, we did not find CoG differences between any given fricative and its affricate counterpart, suggesting that fricative and affricate counterparts are produced with the same place of articulation in Mixean Basque.
We expected to find an opposition between a tap and a trill in intervocalic position, as in other Basque dialects. The results from our study suggest that, in Mixean, this opposition is not as strong as in other Basque dialects. While the number of intervocalic productions of the phonological tap with more than one lingual contact was low (7.1%), the percentage of productions of the phonological intervocalic trill with a single tap was much higher than expected (43%). Although the majority of the intervocalic trills were actually trilled (56.5%), the unexpectedly high number of trills realized as a tap intervocalically points in the direction of an incipient merger of the two rhotics in Mixean Basque. Following the more general descriptions of Basque rhotics, we may have expected the rhotics in neutralizing contexts (i.e., onset clusters and codas) to be articulated as shorter trills (i.e., two lingual contacts). However, our study suggests that the most common realization of onset-cluster rhotics involves one tap (60.7% of all items, and 35.7% with no taps), while coda rhotics are most frequently produced with no taps (54.8% of all items, and 34.7% with one tap). These results suggest that neutralized rhotics are produced as taps or approximants in Mixean, and that coda position is the most frequent context for the realization of approximant/fricative rhotics, followed by onset clusters. Finally, regarding the place of articulation of Mixean rhotics, while some of the speakers produced a number of uvular articulations, none of them limited their rhotics to uvular segments. This observation is based on manual transcription and concomitant perception by the first author (a native speaker of Basque). This finding contrasts with observations by Gaminde et al. (2017), who consistently found uvular articulations in the speakers from eastern Basque dialects. 
However, it is worth mentioning that, while the speakers in their study were of young age, the participants in our study encompassed speakers from a much older generation, so that the spread of uvular rhotics in eastern Basque dialects can potentially be viewed as a recent innovation.
VI. CONCLUSION
This study has presented a general description of an endangered variety of Basque, namely Mixean Low Navarrese, via acoustic analyses of most segments in its phonological inventory. This paper has underlined the uneven nature of the acoustic studies on the Basque language: while studies on vocalic inventories and sibilants are fairly common, the rest of the segments of the language are understudied, and no general acoustic description of any variety (or the standard language) can be found in the literature. Important results of this study include the first proposal of a centralized rounded vowel in any Basque variety, a data-driven confirmation of the maintenance of the three-way stop distinction in Mixean, the description of a merger of the series of alveolo-palatal sibilants to the apico-alveolar series, evidence in support of an incipient merger of rhotics, and the realization of rhotics in neutralizing positions with one or even no lingual contacts. These results would have remained unknown had we focused on a more accessible dialect instead of an endangered variety.
ACKNOWLEDGMENTS
This research was supported by the Alexander von Humboldt Foundation, the Spanish Ministry of Economy and Competitiveness (FFI2016-76032-P; FFI2015-63981-C3-2), and ERC Advanced Grant No. 295573 Human interaction and the evolution of spoken accent (J. Harrington). We thank Iñaki Camino for letting us use his recordings, Rosa Franzke for helping us with the annotation process, Martxel Martínez for his help in editing the map, and José Ignacio Hualde for reading and commenting on an early version of this paper. All errors are ours.
See supplementary material at https://doi.org/10.1121/10.0000996 for a dialectal map of the Basque varieties and for a provisional segmental inventory of the Mixean variety.
Hualde (1997, p. 106) describes the accent in Low Navarrese as non-contrastive, with no clear stress in connected speech and paroxytonic in isolated words, a description that Camino (2016, pp. 190–191) confirms for Mixean.
A nearly-exhaustive list of acoustic studies of Basque vowels would also include Salaburu (1984) with five speakers from Baztan (High Navarrese); Pagola (1992), with 18 speakers of (Northern) High Navarrese (from Baztan, Bortziri, and Ultzama; and a speaker of Lapurdian from Zugarramurdi); Hualde et al. (2010) with two speakers from Goizueta (High Navarrese); Etxebarria Ayesta (1991, p. 48) with one speaker from Zeberio (Bizkaian); Gaminde (1992, 1995), with speakers from Urduliz and Gatika (Bizkaian); Etxebarria Arostegi (1995) also from Bizkaian; and Etxeberria (1990, 1991) with speakers of Zaldibia (Gipuzkoan). Etxebarria Arostegui (1991) and Iribar Ibabe and Túrrez Aguirrezabal (2001) do not specify the varieties under study.
A nearly-exhaustive list of acoustic studies involving Basque sibilants would include Yárnoz (2002b), Hualde (2010), Jurado Noriega (2011), Gaminde et al. (2013), Gandarias et al. (2014), Iglesias et al. (2016), Muxika-Loitzate (2017), Beristain (2018a,b), Txillardegi (1982), Isasi Martínez et al. (2009), and Urrutia et al. (1988, 1989, 1991). Classic articulatory descriptions of Basque sibilants include Navarro Tomás (1923, 1925) and Alonso (1923).
In comparison with the raw Hz values, the formant normalization resulted in an average absolute difference of 4.5 Hz for F1 (SD = 2.2), 9.8 Hz for F2 (SD = 5.1), and 12.5 Hz for F3 (SD = 11.4).
Given that the speech signals were band-pass filtered from 300 Hz to 19 kHz before spectral measurements were made, the lower CoG values for the voiced sibilants are not likely due directly to F0 energy. However, it is reasonable to speculate that energy associated with lower harmonics (energy that arises due to voicing) causes the overall spectral energy to shift towards lower frequencies.
A study of the contrast between fricative and affricate sibilants is beyond the purposes of the present study. Such a study would require additional acoustic metrics such as total duration or rise time (Johnson, 2003, pp. 144–145).