Discrimination of Canadian French /y/, /u/, /ɑ/, and /e/ by native Canadian-English listeners was investigated to determine if patterns found in standard varieties of French (as explained by the Perceptual Assimilation Model) could be replicated in Canadian French. Front-rounded /y/ paired with /u/ was the focus of investigation, as well as other (control) pairs. It was found that /y/-/u/ was the most difficult to discriminate as compared to other pairs, but that listeners were sensitive to the contrast, which replicates previous findings in European French. Results are explained as a mix of instances of single-category and category-goodness assimilation patterns.

Native speakers of English have difficulty perceiving the front-rounded series of French vowels (Levy, 2009; Levy and Strange, 2008), which are often confused with /u/ (or other back-rounded vowels). However, most research on second-language (L2) miscategorizations of French front-rounded vowels has focused on European varieties (Levy, 2009; Levy and Strange, 2008; Tyler et al., 2014) or synthesized stimuli (Rochet, 1995), even though French is spoken in many regions worldwide. Here, we focus on the perception of Canadian French /y/ by native-English listeners, aiming to replicate previous results found with European varieties of French in a less-documented variety. Furthermore, native-English speakers in Canada are exposed to French in the school system, and it has been shown that speech perception abilities impact phonological and word learning in the longer term (Best and Tyler, 2007).

While L2 acquisition of “standard” phonetic systems (e.g., General American or British English, Parisian French) has been the object of many investigations across many languages, fewer studies have examined whether non-native categorization patterns are similar or different across dialects of an L2. For instance, it has been found that Japanese-native learners of English living in Alabama had just as much difficulty identifying vowels spoken by a native English speaker from Alabama as Japanese-native learners living in Ohio (Fox and Tyler, 2007). However, Escudero and Boersma (2004) found that Spanish-native learners of Scottish or British English displayed different perception patterns of /i/-/ɪ/. It thus seems that in some cases, L2 dialect can influence the way vowels are perceived by non-native listeners, but the results are not consistent across studies.

It can also seem peculiar that /y/, a front vowel, is often confused with /u/, a back vowel. For instance, based on a standard first-formant (F1) by second-formant (F2) space, it might be expected that /y/ would mainly be confused with other front vowels such as /i/ or /e/, which are closer in the acoustic space. Lip rounding has also been hypothesized to impact perception of vowel backness (Lisker and Rossi, 1992), i.e., front-rounded vowels can be perceived as more back than their non-rounded counterparts, which could then motivate the perception of French /y/ as /u/ by English listeners. Another acoustic correlate of /y/ is focalization of the second and third (F3) formants, i.e., the fact that F2 and F3 are close to one another on the spectrum, which is also consistently realized by French speakers throughout development (Ménard, 2003). Because listeners' phonetic-cue weighting can differ across languages (Shertz et al., 2019), it is difficult to predict whether English-native listeners will have difficulty discriminating Canadian-French /y/ from other vowels, as prior studies indicate, on the basis of proximity in the F1/F2 acoustic space (i.e., confusion with /e/) or of similarity in (measures related to) F3 (i.e., confusion with /u/).

The main aim of the current paper is thus to describe how native-Canadian English listeners of Canadian French discriminate front-rounded /y/ from /u/, two vowels that have been found to be confused in European varieties of French (Levy, 2009; Levy and Strange, 2008), in order to replicate previous results found with other dialects and verify hypotheses of the Perceptual Assimilation Model (PAM), which is presented in Best (1995). We also want to establish a baseline of perceptual abilities of Canadian-English listeners of Canadian French to further investigate their L2 word-learning skills in future studies.

Based on previous literature, we expect listeners to perform poorly on /y/-/u/ stimulus pairs (Levy, 2009; Levy and Strange, 2008; Tyler et al., 2014). According to PAM (Best, 1995), we expect to find single-category assimilation, where listeners would consistently assimilate both [y] and [u] to their native /u/ category and thus respond incorrectly to [y]-[u] pairs. However, we also expect to find instances of category-goodness assimilation, shown by correct discrimination of [y] and [u], because [y] tokens can be considered “poor(er)” realizations of the /u/ category and thus could be more easily discriminated. Both options are possible here, since the investigated population is (variably) exposed to L2 French in the school system (Flege, 1995). For the other vowel pairs (i.e., /y/-/e/, /y/-/ɑ/, /u/-/e/, /u/-/ɑ/, and /e/-/ɑ/), we expect good discrimination performance overall since they should be cases of two-category assimilation, according to PAM; even if [y] tokens are assimilated to the listeners' /u/ category, discrimination from tokens of [e] or [ɑ] is expected to be good. Note, however, that since [y] and [e] tokens are close in the acoustic F1/F2 space, discrimination of this pair could be poorer than expected (i.e., category-goodness assimilation) if listeners rely on distance in acoustic space to discriminate L2 vowels. We provide below a measure of sensitivity to the contrasts (d′), in addition to accuracy measures on both the different and same pairs.

Listeners were 64 native-English speakers [M age = 21.75 years, standard deviation (SD) = 5.16] who self-reported no to low proficiency in French (self-assessment out of ten: M = 2.16, SD = 1.51). An additional five individuals were tested but were not included in the analyses because they were bilingual in Mandarin, a language that has /y/ in its phonological inventory. Due to the structure of the school system in Ontario, most listeners had been minimally exposed to French, but did not use it on a regular basis. A number of listeners were bilingual in a language other than French, but none of these languages had a series of front-rounded vowels.

Stimuli were CV syllables with all possible combinations of onsets [p, t, k, b, d, or ɡ] and nuclei [y, u, e, or ɑ]. [y] was not presented following [t] or [d] onsets since these consonants are affricated [ts] and [dz] before high-front vowels in some sub-varieties of Canadian French (Martin, 2002). For recording, four native speakers of Canadian French (three males) pronounced the duplicated syllables in the carrier phrase Je dis TARGET deux fois “I say TARGET twice” (e.g., for the target [py], speakers pronounced Je dis [pypy] deux fois) three or four times. Using Praat version 6.0.43 (Boersma and Weenick, 2018), the last syllable of the duplicated token was segmented (e.g., [pypy]), and the best token was selected (i.e., the vowel had to be clearly audible) by the first author and saved as a separate .wav file. Duplicating the syllables for recording followed by segmentation in Praat was necessary because noncontrastive stress is generally realized on the last vowel of phonological words in French, and thus we could ensure that the target vowel of interest was fully and clearly realized in the stimuli. Z-scored (by talker) F1 and F2 values (Lobanov, 1971) for the recorded vowels, arranged in an acoustic vowel space, are presented in Fig. 1 (leftmost panel), along with ellipses (default “t” type in ggplot2 in R, which assumes a multivariate t-distribution). Z-scored (by-talker) F3 values were also extracted, and are presented in Fig. 1 (rightmost panel; z-scored F3 range for /ɑ/: −1.46 to 1.95 with outliers, Δ=3.46, range: −1.46 to 0.9 without outliers, Δ=2.36; for /e/: 0.05 to 2.77, Δ=2.27; for /u/: −1.52 to −0.06 with outliers, Δ=1.46, range: −1.03 to −0.06 without outliers, Δ=0.97; for /y/: 1.27 to −0.62, Δ=1.1). F3 values (in Hz) of [y] and [u] tokens were not significantly different [t(30.23)=1.61, p = 0.12].
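The by-talker z-scoring described above can be illustrated with a minimal sketch. It assumes each token is stored as a dictionary with formant values in Hz, and it uses the sample standard deviation; the function name `lobanov`, the talker labels, and the formant values are all invented for illustration and are not the recorded stimuli.

```python
from collections import defaultdict
from statistics import mean, stdev

FORMANTS = ("F1", "F2", "F3")

def lobanov(tokens):
    """Z-score each formant within talker (Lobanov-style normalization).

    tokens: list of dicts with keys 'talker', 'vowel', 'F1', 'F2', 'F3' (Hz).
    Returns copies of the tokens with added 'zF1', 'zF2', 'zF3' keys.
    """
    by_talker = defaultdict(list)
    for tok in tokens:
        by_talker[tok["talker"]].append(tok)

    normalized = []
    for toks in by_talker.values():
        # Mean and sample SD of each formant over this talker's tokens only.
        stats = {f: (mean(t[f] for t in toks), stdev(t[f] for t in toks))
                 for f in FORMANTS}
        for t in toks:
            z = dict(t)
            for f, (m, s) in stats.items():
                z["z" + f] = (t[f] - m) / s
            normalized.append(z)
    return normalized

# Hypothetical tokens (invented values, for illustration only).
tokens = [
    {"talker": "T1", "vowel": "y", "F1": 300, "F2": 1900, "F3": 2300},
    {"talker": "T1", "vowel": "u", "F1": 320, "F2": 900, "F3": 2250},
    {"talker": "T1", "vowel": "e", "F1": 400, "F2": 2100, "F3": 2800},
    {"talker": "T1", "vowel": "ɑ", "F1": 700, "F2": 1200, "F3": 2500},
]
normalized = lobanov(tokens)
```

After normalization, each talker's values for a given formant have a mean of 0 and a standard deviation of 1, which is what allows tokens from different talkers to be plotted in a common space.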

Fig. 1.

Average formant values for Canadian French speakers (one black dot per speaker for each vowel category) and individual token values (in gray) with ellipses on the leftmost panel. By-participant z-scored F3 values (lower rightmost panel).


The selected stimuli were semi-randomly arranged so that members of the pairs were never pronounced by the same talker, and so that listeners heard 12 different pairs of each vowel combination (i.e., [y]-[u], [y]-[e], [y]-[ɑ], [u]-[e], [u]-[ɑ], and [e]-[ɑ]) and 18 same pairs for each vowel (i.e., [y]-[y], [u]-[u], [e]-[e], and [ɑ]-[ɑ]), regardless of the consonant onset (e.g., same pairs could have different onsets) and speakers. The choice to use different consonantal onsets for the syllables was made to force participants to focus on vowels and to discourage any strategy that is strictly acoustic in nature. Note that this choice also prevents analyzing the impact of consonantal context on [y]-[u] discrimination, a factor that has been shown to impact /y/ perception by English listeners (Levy, 2009). The inter-stimulus interval (ISI) was 500 ms, and participants responded to a total of 144 pairs. Note that we would expect participants to incorrectly respond to same pairs as “different” if they use low-level acoustic information only to perform the task, given the relatively short ISI (Werker and Logan, 1985) and the fact that both tokens within a pair are necessarily different (i.e., pronounced by two talkers). The experiment was programmed in Experiment Builder (SR Research, 2018). On each trial, participants listened to a vowel pair and were asked to indicate on a button box if the vowels were the same: “Regardless of the voice of the speaker and the first sound of the syllable, do you think that the vowels that you are hearing rhyme?” We also compiled reaction times, but will focus the current discussion on accuracy data. The discrimination experiment lasted approximately 10 min.
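The trial counts in this design can be checked with a short sketch that enumerates the pair types from the four vowels. The variable names are illustrative; the counts (12 per different-vowel combination, 18 per same-vowel pair) are those given above.

```python
from itertools import combinations

vowels = ["y", "u", "e", "ɑ"]

# Six unordered different-vowel combinations and four same-vowel pairs.
different_pairs = list(combinations(vowels, 2))
same_pairs = [(v, v) for v in vowels]

n_different_trials = len(different_pairs) * 12  # 12 trials per combination
n_same_trials = len(same_pairs) * 18            # 18 trials per vowel
total_trials = n_different_trials + n_same_trials
```

The design is thus balanced at the level of response type: 72 different trials and 72 same trials, for the 144 pairs reported.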

Results were analyzed using R version 3.6.0 (R Core Team, 2019). All data and analysis code can be found at https://osf.io/5n9bw/. The number of correct and incorrect responses for different and same pairs was aggregated for each participant, and d′ scores were calculated using the neuropsychology package (Makowski, 2016) for each different contrast. We present both number of correct responses and d′ scores since the former is an indication of the participants' raw performance on the task, and the latter provides information about the listeners' sensitivity in a way that is bias-free (e.g., d′ can capture a situation in which listeners with a bias toward responding different for all different and same pairs might appear to have perfect discrimination of truly different pairs).
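To make the bias argument concrete, the following is a minimal sketch of a d′ computation. It is not the implementation of the neuropsychology package: it assumes the simple yes/no formula d′ = z(hit rate) − z(false-alarm rate) with a log-linear correction (0.5 added to each cell), and the response counts are hypothetical. How same trials were matched to each different contrast in the actual analysis is not specified here, so the pairing of 12 different and 18 same trials is also an assumption.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index from response counts.

    A "hit" is a "different" response to a different pair; a "false alarm"
    is a "different" response to a same pair. Adding 0.5 to each cell
    (log-linear correction, an assumption of this sketch) keeps the rates
    away from 0 and 1, where the z-transform is infinite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical listener who answers "different" on every trial:
# 12/12 correct on different pairs, but 0/18 correct on same pairs.
d_biased = d_prime(hits=12, misses=0, false_alarms=18, correct_rejections=0)

# Hypothetical listener who actually hears the contrast.
d_sensitive = d_prime(hits=11, misses=1, false_alarms=2, correct_rejections=16)
```

The first listener has perfect raw accuracy on different pairs, yet a d′ close to zero, because the "different" responses are equally frequent on same pairs. The second listener's d′ is well above zero.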

Figure 2 shows the d′ scores (panel A), rates of correct responses for different pairs (panel B), and rates of correct responses for same pairs (panel C). A repeated-measures analysis of variance (ANOVA) on d′ scores, using rstatix 0.4.0 (Kassambara, 2020) in R, showed a main effect of Vowel Pair [F(5,315)=143.7, p < 0.001]. Planned (Bonferroni-corrected) pairwise t-tests (see Table I in the supplementary material, footnote 1, for additional details) showed that d′ scores for [y]-[u] pairs (M = 0.82, SD = 0.57) were significantly lower than all other pairs [range t(63)=20.4 to −11.9, p < 0.001 for all comparisons with [y]-[u] pairs]. The d′ scores for [y]-[u] pairs were also significantly different from 0 [one-sample t-test: t(63)=11.57, p < 0.001], meaning that listeners were sensitive to the /y/-/u/ contrast, but less so than to the other contrasts. Note that d′ scores for [y]-[e] pairs (M = 1.84, SD = 0.69) were also significantly lower than [y]-[ɑ] pairs [M = 2.36, SD = 0.73; t(63)=8.33, p < 0.001], [u]-[e] pairs [M = 2.06, SD = 0.59; t(63)=3.48, p < 0.001], and [u]-[ɑ] pairs [M = 2.53, SD = 0.69; t(63)=9.67, p < 0.001], but not [e]-[ɑ] pairs (M = 2.03, SD = 0.59). We evaluate below the impact of Vowel Pair on same pairs to explain these unexpected results, because d′ score calculations integrate both different and same pairs.
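As a rough check on the one-sample test against 0, the t statistic can be recomputed from the reported summary values with t = M / (SD / √n). This sketch uses the rounded mean and SD for [y]-[u] pairs and n = 64 listeners, so it only approximates the reported t(63) = 11.57; the variable names are illustrative.

```python
from math import sqrt

n = 64         # listeners
mean_d = 0.82  # reported mean d′ for [y]-[u] pairs
sd_d = 0.57    # reported SD

standard_error = sd_d / sqrt(n)
t_stat = (mean_d - 0) / standard_error  # about 11.5, with n - 1 = 63 df
df = n - 1
```

The small gap between this value and the reported statistic is consistent with the mean and SD having been rounded to two decimals.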

Fig. 2.

d′ scores for each different-vowel pair (panel A), rate of correct responses (out of 12) for different-vowel pairs aggregated across participants (panel B), and rate of correct responses (out of 18) for same-vowel pairs aggregated across participants (panel C).


Similar analyses were conducted on rates of correct responses (out of 12) for different pairs, and showed a main effect of Vowel Pair [F(5,315)=216.2, p < 0.001]. Planned (Bonferroni-corrected) pairwise t-tests (see Table II in the supplementary material, footnote 1, for additional details) showed that rates of correct responses for [y]-[u] (M = 5.34, SD = 2.81) were significantly lower than all other pairs [range t(63)=17.6 to −14.3, p < 0.001 for all comparisons with [y]-[u] pairs]. No other pairwise comparisons were significantly different from each other (range M = 11.19 to 11.64, range SD = 0.74 to 1.47).

A third repeated-measures ANOVA was used to assess the significance of Vowel Pair on rates of correct responses (out of 18) for same pairs. We found a main effect of Vowel Pair [F(3,189)=28.5, p < 0.001]. Planned (Bonferroni-corrected) pairwise t-tests (see Table III in the supplementary material, footnote 1, for additional details) showed that rates of correct responses for [y]-[y] (M = 13.81, SD = 3.43) were significantly lower than [ɑ]-[ɑ] pairs [M = 15.58, SD = 3.24; t(63)=4.03, p < 0.001] and [u]-[u] pairs [M = 15.48, SD = 2.78; t(63)=4.88, p < 0.001], but significantly higher than [e]-[e] pairs [M = 12.3, SD = 3.01; t(63)=4.03, p = 0.012]. Rates of correct responses on [e]-[e] pairs were also significantly lower than [ɑ]-[ɑ] pairs [t(63)=7.52, p < 0.001] and [u]-[u] pairs [t(63)=7.65, p < 0.001].
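The degrees of freedom reported for the three ANOVAs follow from the design, as the sketch below shows. It assumes a one-way within-subjects design with no sphericity correction applied to the degrees of freedom; the helper name `rm_anova_df` is hypothetical.

```python
def rm_anova_df(n_levels, n_participants):
    """Degrees of freedom for a one-way repeated-measures ANOVA.

    Effect df = k - 1; error df = (k - 1) * (n - 1), assuming no
    sphericity correction (an assumption of this sketch).
    """
    effect_df = n_levels - 1
    error_df = effect_df * (n_participants - 1)
    return effect_df, error_df

# Six different-vowel pairs, 64 listeners: F(5, 315).
df_different = rm_anova_df(6, 64)

# Four same-vowel pairs, 64 listeners: F(3, 189).
df_same = rm_anova_df(4, 64)
```

Both results match the reported values, which confirms that all 64 listeners contributed to every level of Vowel Pair.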

The current data showed that Canadian-English listeners encounter difficulties when discriminating Canadian-French [y] from [u], as demonstrated by significantly lower d′ scores and rates of correct responses for different pairs than for all other vowel pairings. However, the rate of correct responses for /y/-/u/ pairs was 5.34/12 on average and d′ scores were significantly above 0. Thus, we interpret these results as a mix of single-category and category-goodness assimilation patterns, in PAM's terms, which could be due to previous exposure to L2 French in our group of participants. However, Levy and Strange (2008) found that the error rate on /y/-/u/ pairs was similar across groups of experienced and non-experienced L2 (English-native) listeners of European French. Thus, in our study and consistent with Levy and Strange (2008), English-native listeners were sensitive to the contrast between /y/ and /u/, shown by category-goodness assimilation patterns through non-zero d′ scores, but nevertheless showed poorer discrimination of /y/ and /u/ compared to other vowel pairs that better aligned with English vowel categories, as indicated by significantly lower /y/-/u/ d′ and rate of correct responses.

A closer investigation of the response patterns for same pairs revealed significantly lower performance (i.e., a higher rate of incorrect responses) for /e/-/e/ pairs than for all other vowel pairings. This also explains why d′ scores for /y/-/e/ pairs were different from other pairings. One possibility as to why listeners did not perform as well on /e/-/e/ pairs as on the other same pairs, including /y/-/y/, is the relatively short ISI (i.e., 500 ms), which could have forced the listeners to focus on stimulus-specific differences as opposed to higher-level phonological similarity for these pairs. However, it remains puzzling why listeners performed significantly worse on /e/-/e/ pairs and not on /ɑ/-/ɑ/ or /u/-/u/ pairs, for instance. Observation of F3 values suggests that /e/ vowels had a greater range of variation than the other vowel categories, which could have impacted the perception of French /e/. Further investigations using the same stimuli but varying the ISI could shed light on this methodological question, and in turn give an even better idea of the discrimination patterns of L2 vowels.

As mentioned above, it is possible that these observed results are attributable to different ways of weighting phonetic cues by L2 listeners of French for the perception of the investigated vowels. For instance, if English-native listeners rely on F3 to categorize rounded vowels in general, then it would be expected that they would misdiscriminate /y/ and /u/, yielding poor discrimination as observed here, because F3 values were similar for these vowel categories in the current stimuli (z-score for /u/ M=0.55 and for /y/ M=0.94). Similarly for French /e/, if English-native listeners strongly relied on F3 (which was found to be variable in the current stimuli, as shown by a difference in z-scored values of Δ=2.27, see Fig. 1), it is expected that they would incorrectly respond to a number of same pairs in the current study. Assuming that French-native listeners would perform better than English-native listeners on /y/-/u/ and /e/-/e/ pairs, this could be explained by the hypothesis that they would consider more than just F3, such as F2-F3 focalization (Ménard, 2003), while English-native listeners could be attuned to only F3 as a phonetic cue to perceive “roundedness.” Further research will be necessary to disentangle this question.

In conclusion, it seems that discrimination patterns of /y/ in Canadian French are similar to those in other varieties of French (Levy and Strange, 2008), although, interestingly, sensitivity to the contrast was appreciably better than chance in the current study. This is likely due to exposure to L2 French in the school system, but could also be due to the weighting on F3 in Canadian-English listeners, yielding a mix of single-category and category-goodness assimilation patterns. We also predict for future research that words that have /y/ will be relatively difficult to learn for native-English listeners. Such findings would provide greater insights into the architecture and interactions between phonetic discrimination and word learning/recognition in an L2.

This research was supported by postdoctoral fellowships from the Fonds de recherche du Québec—Société et Culture and the BrainsCAN program at Western University (Canada First Research Excellence Funds) awarded to F.D.-T. We thank Krystal Flemming and Alyssa Yantsis for their assistance with data collection.

Footnote 1. See supplementary material at https://doi.org/10.1121/10.0001180 for complete output tables of the statistical models.

1. Best, C. T. (1995). “A direct realist view of cross-language speech perception,” in Speech Perception and Linguistic Experience: Issues in Cross-Language Research, edited by W. Strange (York Press, Baltimore, MD), pp. 171–204.
2. Best, C. T., and Tyler, M. D. (2007). “Nonnative and second-language speech perception: Commonalities and complementarities,” in Language Experience in Second Language Speech Learning: In Honor of James Emil Flege, edited by O.-S. Bohn and M. J. Munro (John Benjamins, Amsterdam), pp. 13–34.
3. Boersma, P., and Weenick, D. (2018). “Praat: Doing phonetics by computer (version 6.0.43) [computer program],” http://www.praat.org.
4. Escudero, P., and Boersma, P. (2004). “Bridging the gap between L2 speech perception research and phonological theory,” Stud. Second Lang. Acquisit. 26, 551–585.
5. Flege, J. E. (1995). “Second language speech learning: Theory, findings, and problems,” in Speech Perception and Linguistic Experience: Issues in Cross-Language Research, edited by W. Strange (York Press, Baltimore, MD), pp. 233–277.
6. Fox, R. A., and Tyler, J. T. (2007). “Second language acquisition of a regional dialect of American English by native Japanese speakers,” in Language Experience in Second Language Speech Learning: In Honor of James Emil Flege, edited by O.-S. Bohn and M. J. Munro (John Benjamins, Amsterdam), pp. 117–134.
7. Kassambara, A. (2020). rstatix: Pipe-Friendly Framework for Basic Statistical Tests, https://CRAN.R-project.org/package=rstatix, R package version 0.4.0 (Last viewed April 22, 2020).
8. Levy, E. S. (2009). “On the assimilation-discrimination relationship in American English adults' French vowel learning,” J. Acoust. Soc. Am. 126, 2670–2682.
9. Levy, E. S., and Strange, W. (2008). “Perception of French vowels by American English adults with and without French language experience,” J. Phonetics 36, 141–157.
10. Lisker, L., and Rossi, M. (1992). “Auditory and visual cueing of the [±rounded] feature of vowels,” Lang. Speech 35(4), 391–417.
11. Lobanov, B. M. (1971). “Classification of Russian vowels spoken by different speakers,” J. Acoust. Soc. Am. 49(2), 606–608.
12. Makowski, D. (2016). “Package ‘neuropsychology’: An R toolbox for psychologists, neuropsychologists, and neuroscientists,” https://github.com/neuropsychology/neuropsychology.R (Last viewed April 22, 2020).
13. Martin, P. (2002). “Le système vocalique du français du Québec: De l'acoustique à la phonologie” (“The vocalic system of Quebec French: From acoustics to phonology”), La linguistique 38, 71–88.
14. Ménard, L. (2003). “Acoustic variability and adaptive articulatory strategies during vocal tract growth revealed by the rounding contrast in French,” in Proceedings of the 15th International Congress of Phonetic Sciences, edited by M. J. Solé, D. Recasens, and J. Romero, Barcelona, Spain, pp. 3169–3172.
15. R Core Team (2019). “The R project for statistical computing,” https://www.r-project.org (Last viewed April 22, 2020).
16. Rochet, B. L. (1995). “Perception and production of second-language speech sounds by adults,” in Speech Perception and Linguistic Experience: Issues in Cross-Language Research, edited by W. Strange (York Press, Baltimore, MD), pp. 379–410.
17. Shertz, J., Carbonell, K., and Lotto, A. J. (2019). “Language specificity in phonetic cue weighting: Monolingual and bilingual perception of the stop voicing contrast in English and Spanish,” Phonetica (published online).
18. SR Research (2018). “Experiment Builder (version 2.1.140),” https://www.sr-research.com/experiment-builder/ (Last viewed April 22, 2020).
19. Tyler, M. D., Best, C. T., Faber, A., and Levitt, A. G. (2014). “Perceptual assimilation and discrimination of non-native vowel contrasts,” Phonetica 71, 4–21.
20. Werker, J. F., and Logan, J. S. (1985). “Cross-language evidence for three factors in speech perception,” Percept. Psychophys. 37(1), 35–44.
