Kohler [Einführung in die Phonetik des Deutschen (Erich Schmidt Verlag, Berlin, Germany, 1995)] stated that German [ɐ] and [a] in unstressed syllables are merging. The present study tested this hypothesis. The contrast was found intact word-internally and word-finally. Neighborhood density enhanced its phonetic characteristics, but no effects of frequency and conditional probability were found.
Phonological descriptions of German consider the vowel [ɐ] to be an allophone derived from the phoneme sequence [vowel + r],1 which is reflected in the orthography, where [ɐ] is spelled with the letter sequence ⟨vowel + r⟩. Very likely due to this phonological view, [ɐ] and its phonetic characteristics have been mostly ignored in phonetic and graphemic research. On the one hand, Tomaschek et al.,2 using data from South German speakers,3 demonstrated that [ɐ] has significantly different phonetic characteristics depending on word class. On the other hand, Kohler1 states that [ɐ] and [a] in unstressed syllables in Standard German are in the middle of merging, while phonetic studies contrasting [ɐ] with other vowels are inconclusive. Reference 4 reports that [ɐ] is realized with a slightly higher tongue position than [a]. A similar difference is reported by Heid et al.5 References 6 and 7 report that most of their speakers do not produce the [ɐ]-[a] difference.
However, these studies suffer from some shortcomings. Kohler's data stem from only one speaker of Northern German. The findings of Heid et al.5 for [a] are confounded by stress. Given that unstressed [a] has been found to be systematically realized with a lower F1 frequency than stressed lax [a],8 this is problematic.
Another issue is that the studies cited previously5–7 use read speech. However, it is well known that orthographic encoding affects the pronunciation of texts read aloud, typically resulting in hyperarticulation.9 Accordingly, the question arises whether this is also the case for the [a] and [ɐ] contrast.
Another issue with these studies is that they focus on word-final occurrences of [ɐ] and [a]. However, as we will demonstrate, these vowels occur both word-finally and word-internally. Accordingly, the question arises whether word position affects whether [ɐ] and [a] are contrasted or not.
A final issue with all studies cited above is that they did not take aspects of lexical predictability into account. However, it has been repeatedly demonstrated that lexical predictability, as measured by a word's contextual predictability, its frequency, or its number of phonological neighbors, correlates with phonetic characteristics.10,11
Most studies that take such measures into account report that phonetic characteristics are reduced in words of high predictability and frequency (e.g., Refs. 2 and 10–14). Accordingly, the question arises how measures of predictability correlate with the phonetic characteristics of phones that are located very near to each other in the phonetic space and run the risk of being confused, as is the case for [ɐ] and [a].
2.1 Word material
The word material for the present study was taken from the KIEL corpus,15 which contains recordings from 162 speakers, 55 in a reading condition and 107 in a quasi-spontaneous condition. The KIEL corpus provides a narrow transcription. However, the annotation was based on phonological assumptions, not perception. We first extracted all words containing the letters “a” and “er,” and then classified the vowels based on the KIEL corpus' phonetic transcription. The corpus contains a total of 11,559 samples with unstressed [ɐ] and [a]. The selection of unstressed vowels was based on the corpus' stress annotation. The number of syllables in the words ranged between 1 and 7. The vowels under investigation were not part of a diphthong. More details on the words can be found in the supplementary material. We contrasted word-final with word-internal vowels, with the latter being all vowels that are not word-final, independently of whether they are located at the onset of the word or word-medially. Word-internally, we obtained 5437 [a] and 676 [ɐ] in spontaneous speech, and 2750 [a] and 756 [ɐ] in read speech (e.g., for [a]: absolut [apsolut], alternativen [altɐnatiːvən]; for [ɐ]: allerdings [alɐdɪŋs], anders [andɐs]). Word-finally, we obtained 753 [a] and 1275 [ɐ] in spontaneous speech, and 250 [a] and 1688 [ɐ] in read speech (e.g., for [a]: etwa [ɛtva], Alaska [alaska]; for [ɐ]: oder [oːdɐ], aber [aːbɐ]).
2.2 Statistical analysis
The response variables in our study were F1 and F2, extracted using PRAAT and calculated by averaging formant measurements for time points located between the 45% and 55% time points of the vowel. As predictors, we used vowel duration to control for well-known temporal effects on formant quality in relation to the number of segments, word duration, and speaking rate (e.g., Ref. 16), and condition to take recording condition into account (reading vs spontaneous). Our predictor of interest was Vowel, contrasting [a] vs [ɐ]. We probed whether there are any systematic differences between [a] and [ɐ] in the two conditions with a two-way interaction between condition and Vowel. We furthermore used the following lexical predictors: Neighborhood density, operationalized as the number of orthographic neighbors calculated using the CELEX corpus,17 Word Frequency, and the conditional probabilities of a word given the preceding word, P(N|N − 1), and given the following word, P(N|N + 1). The probabilistic predictors were calculated based on the KIEL corpus. All lexical predictors were log-transformed and z-scaled. Their distributions, along with the data and the analysis scripts, can be downloaded from our supplementary material at https://osf.io/4b52v/.21 We also tested the inclusion of the place of articulation of segments preceding and following our target vowels. Though these predictors showed significant effects, their inclusion did not affect the estimates of Vowel. This is why we do not report their results here, but their effects can be inspected in the supplementary material.21 We also tested whether the relative position of word-internal vowels had an effect on their quality, but found no significant effects.
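The two preprocessing steps described above, averaging formant samples in the 45%–55% window of the vowel and log-transforming and z-scaling the lexical predictors, can be sketched as follows. This is a minimal illustration in Python, not the analysis script from the supplementary material. The function names, the millisecond time stamps, and the example values are hypothetical, and the use of the sample standard deviation for z-scaling is an assumption.

```python
import math
import statistics


def mean_in_window(times, values, start, end, lo=0.45, hi=0.55):
    """Average the formant samples whose time stamps fall between the
    lo and hi proportions of the vowel interval [start, end]."""
    duration = end - start
    t_lo = start + lo * duration
    t_hi = start + hi * duration
    inside = [v for t, v in zip(times, values) if t_lo <= t <= t_hi]
    if not inside:
        raise ValueError("no formant samples inside the window")
    return sum(inside) / len(inside)


def log_z(xs):
    """Log-transform positive values, then centre them and scale them
    to unit sample standard deviation (assumed z-scaling convention)."""
    logs = [math.log(x) for x in xs]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [(v - mu) / sd for v in logs]


if __name__ == "__main__":
    # Hypothetical F1 track (Hz) for a vowel lasting from 0 to 100 ms.
    times_ms = [40, 46, 50, 54, 60]
    f1_hz = [600, 640, 650, 660, 700]
    print(mean_in_window(times_ms, f1_hz, start=0, end=100))  # 650.0

    # Hypothetical word frequencies; counts must be positive for the log.
    print(log_z([1, 10, 100]))
```

Only the samples at 46, 50, and 54 ms fall inside the window, so the stressed-independent midpoint estimate is their mean. Averaging over a narrow central window rather than taking a single midpoint sample reduces the influence of individual tracking errors.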
We used linear mixed-effects regression (lmer, R version 3.5.3, package lme4, Version 1.1-21; Ref. 18) for our statistical analysis. We fitted F1 and F2 once for word-internal and once for word-final position, using Speaker and Word as random intercepts. We employed a forward-fitting and backward-fitting strategy. Once a final model was found, we excluded data points whose residuals were more than 2.5 standard deviations away from the mean and refitted the regression model. We report only models with significant main effects and interactions. Visualization was performed with functions from the languageR package (Version 1.5.0).19
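The residual-based trimming step can be illustrated with a short sketch. It assumes that residuals from an already fitted model are available as a list aligned with the data rows; the model fitting itself (done with lmer in R in the study) is not reproduced here. The function name and the example residuals are hypothetical.

```python
import statistics


def trim_by_residuals(rows, residuals, k=2.5):
    """Keep only rows whose residual lies within k sample standard
    deviations of the mean residual; the model is then refitted on
    the rows that remain."""
    mu = statistics.mean(residuals)
    sd = statistics.stdev(residuals)
    return [row for row, r in zip(rows, residuals) if abs(r - mu) <= k * sd]


if __name__ == "__main__":
    # Hypothetical residuals (Hz): 19 small values and one extreme value.
    residuals = [1.0] * 10 + [-1.0] * 9 + [50.0]
    rows = list(range(len(residuals)))  # stand-ins for data rows
    kept = trim_by_residuals(rows, residuals)
    print(len(kept))  # 19
```

In this example the mean residual is 2.55 and the sample standard deviation is about 11.2, so only the residual of 50.0 lies more than 2.5 standard deviations from the mean and is dropped.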
Figure 1 visualizes the differences between the two vowels depending on the recording condition in all word positions. Word-internally, we find that [ɐ] has a significantly lower F1 than [a] in the spontaneous condition (β = −50.203, standard error, se = 5.738, t = −8.750). While there is no significant difference between spontaneous and reading condition (β = 6.089, se = 13.212, t = 0.461), their difference is increased in the reading condition, as indicated by the significant condition × vowel interaction (β = −36.588, se = 7.499, t = −4.879). Both vowels have higher F1 when the vowels are longer (β = 38.423, se = 1.034, t = 37.165) and in words with a larger number of neighbors (β = 7.056, se = 2.862, t = 2.466). No other effects have been found for F1. Concerning F2, [ɐ] has a significantly higher F2 than [a] (β = 22.484, se = 9.621, t = 2.337). Both vowels have lower F2 in relation to longer vowel duration (β = −11.732, se = 1.623, t = −7.229). No other effects have been found for F2.
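To make the interaction concrete, the reported coefficients can be combined into the [ɐ]–[a] difference in each condition. The sketch below assumes treatment coding with [a] and the spontaneous condition as reference levels; this is consistent with how the coefficients are described but is not stated explicitly, so it should be read as an assumption. The function names are hypothetical.

```python
def t_value(beta, se):
    """t statistic as the ratio of an estimate to its standard error."""
    return beta / se


def vowel_difference_reading(beta_vowel, beta_interaction):
    """[ɐ] minus [a] in the reading condition under treatment coding:
    the simple effect of Vowel plus the condition × vowel interaction."""
    return beta_vowel + beta_interaction


if __name__ == "__main__":
    # Word-internal F1 estimates reported in the text (Hz).
    beta_vowel, se_vowel = -50.203, 5.738
    beta_interaction = -36.588

    # Difference in the spontaneous condition is beta_vowel itself.
    print(round(t_value(beta_vowel, se_vowel), 2))  # -8.75

    # Difference in the reading condition adds the interaction term.
    print(round(vowel_difference_reading(beta_vowel, beta_interaction), 3))  # -86.791
```

Under this coding, word-internal [ɐ] is estimated to lie about 50 Hz below [a] in F1 in spontaneous speech and about 87 Hz below [a] in read speech.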
Word-finally, we find similar effects for F1 as word-internally. Thus, word-finally, we observe that [ɐ] has a significantly lower F1 than [a] in the spontaneous condition (β = −45.411, se = 10.243, t = −4.433). There is also no significant difference between spontaneous and reading condition (β = 11.194, se = 13.462, t = 0.832), yet the difference between [a] and [ɐ] is increased in the reading condition, as indicated by the significant condition × vowel interaction (β = −22.439, se = 7.502, t = −2.991). Both vowels have higher F1 when the vowels are longer (β = 26.386, se = 1.199, t = 22.002). No other effects have been found for F1 word-finally.
Concerning F2, we find no significant difference between [a] and [ɐ] in the spontaneous condition (β = 25.415, se = 26.244, t = 0.968) and also no significant difference between spontaneous and reading condition (β = −40.096, se = 27.511, t = −1.457). Yet, we observe a significant condition × vowel interaction (β = 34.080, se = 14.225, t = 2.396), indicating that in the reading condition, the difference is increased. Both vowels have lower F2 when the vowels are longer (β = −32.676, se = 2.246, t = −14.549). No other effects have been found for F2 word-finally.
In the present study, we investigated the contrast between [a] and [ɐ] in unstressed syllabic positions in German words, testing whether the two vowels show different phonetic characteristics depending on the recording condition (spontaneous and read speech). We find that Kohler's1 supposed merger is not complete. In both spontaneous and reading conditions, we find significant differences between [a] and [ɐ]. Moreover, we observe that the contrast is enhanced in the reading condition, potentially due to hyperarticulation in relation to orthographic encoding.9 To establish whether the contrast is fully functional in German, however, future studies have to test whether listeners are actually capable of discriminating the two vowels. While we have found that both vowels are hyperarticulated word-internally in words with greater neighborhood density, an effect already established in the previous literature,12,13 we have not found any effects of frequency and conditional probability. This finding contrasts with predictions made by theories pertaining to information theory, e.g., the Smooth Signal Redundancy Hypothesis.20
The present results have implications for the theoretical treatment of the contrast, and specifically of [ɐ]. [ɐ] has been neglected due to its allophonic status in phonology. However, our findings show that phonological status is not necessarily reflected in a contrast's phonetic characteristics. Moreover, we demonstrate that a merger has to be probed in relation to speaking register, positional aspects, and lexical aspects in order to establish whether a contrast is truly eliminated in a language, or whether the potential merger happens only in a specific context.
This research was supported by the Open Access Publication Fund of the University of Bonn and by the Deutsche Forschungsgemeinschaft (Research Unit FOR2373 “Spoken Morphology,” Project BA 3080/3-1 and BA 3080/3-2). We are also thankful to Kevin Tang and two anonymous reviewers for their valuable comments on previous versions of this paper.