The highest frequency for which the temporal fine structure (TFS) of a sinewave can be compared across ears varies between listeners with an upper limit of about 1400 Hz for young normal-hearing adults (YNHA). In this study, binaural TFS sensitivity was investigated for 63 typically developing children, aged 5 years, 6 months to 9 years, 4 months using the temporal fine structure-adaptive frequency (TFS-AF) test of Füllgrabe, Harland, Sęk, and Moore [Int. J. Audiol. 56, 926–935 (2017)]. The test assesses the highest frequency at which an interaural phase difference (IPD) of ϕ° can be distinguished from an IPD of 0°. The values of ϕ were 30° and 180°. The starting frequency was 200 Hz. The thresholds for the children were significantly lower (worse) than the thresholds reported by Füllgrabe, Harland, Sęk, and Moore [Int. J. Audiol. 56, 926–935 (2017)] for YNHA. For both values of ϕ, the median age of the children who performed above chance level was significantly higher (p < 0.001) than that of the children who performed at chance. For the subgroup of 40 children who performed above chance for ϕ = 180°, linear regression analysis showed that the thresholds for ϕ = 180° increased (improved) significantly with increasing age (p < 0.001) with adult-like thresholds predicted to be reached at 10 years, 2 months of age. The implications for spatial release from masking are discussed.
Although many auditory abilities are present at birth, the auditory system continues to develop into the teenage years (Eisenberg et al., 2000; Litovsky, 2015; Corbin et al., 2016). One auditory ability, binaural processing, has been extensively investigated in children (Litovsky, 2005; Van Deun et al., 2009; Lovett et al., 2012) but is still not well understood. Binaural processing makes use of the differences in timing and level of the sound that arrives at the two ears and is largely responsible for the sense of spatial hearing (Ehlers et al., 2016). Spatial hearing allows the localization of sound sources and enhances the perceptual separation of speech from interfering sources. Whereas there is evidence that spatial hearing improves with increasing age (Vaillancourt et al., 2008; Van Deun et al., 2010), Litovsky (2005) found similar levels of spatial benefit for word perception in noise for children (aged 4–7 years old) and adults. For low-frequency sinewaves, adults are able to detect changes in interaural phase, which occur when the azimuth of the sound source changes. This ability depends on neural phase locking to individual cycles of the stimulus, so, it is often described as depending on sensitivity to the temporal fine structure (TFS) of sounds (Moore, 2014, 2021). The highest frequency at which an interaural phase difference (IPD) in a sinewave can be discriminated has been used as a measure of binaural TFS sensitivity for adults with both normal and impaired hearing (Brughera et al., 2013; Füllgrabe et al., 2017; Füllgrabe and Moore, 2018). For young normal-hearing adults (YNHA), the upper frequency limit is about 1400 Hz, although there is marked individual variability (Brughera et al., 2013; Füllgrabe et al., 2017). The upper frequency limit declines with increasing age above about 40 years and to a lesser extent with hearing impairment (Füllgrabe and Moore, 2017, 2018). 
To our knowledge, the upper frequency limit of binaural sensitivity to TFS has not been assessed for children. This paper presents such an assessment.
The peripheral auditory system responds to broadband sounds, such as speech or music, by filtering the sounds into narrowband signals with a wide range of center frequencies. This filtering occurs in the cochlea, and each of the narrowband signals can be conceptualised as a rapidly oscillating carrier signal (the TFS) with a relatively slowly varying envelope (ENV; Moore, 2008, 2014). The TFS information is encoded in the auditory nerve by the synchronization of action potentials to a specific phase of the TFS, a feature known as “phase locking.” There have been many studies of the role of the ENV in speech coding (Dudley, 1939), speech comprehension (Shannon et al., 1995; Loizou et al., 1999), and the neural encoding of speech and language (Ghitza, 2011; Giraud and Poeppel, 2012; Flanagan and Goswami, 2018). Although ENV cues are sufficient to provide intelligible speech in quiet (Shannon et al., 1995), the TFS information may be important for the intelligibility of speech in background sounds (Zeng et al., 2005; Hopkins and Moore, 2010).
Research into the relative importance of ENV and TFS cues in children's auditory perception has often used vocoded stimuli to degrade or manipulate the ENV and TFS cues (Eisenberg et al., 2000; Bertoncini et al., 2009; Power et al., 2016). However, the vocoding approach is problematic as the cochlear filtering of the auditory system results in TFS information being recreated from signals processed to contain only ENV cues (Søndergaard et al., 2012; Shamma and Lorenzi, 2013). Similarly, ENV information is extracted from speech processed to contain only TFS cues (Ghitza, 2001; Søndergaard et al., 2012; Shamma and Lorenzi, 2013). As a result, for broadband signals, it is difficult to estimate the relative importance of ENV and TFS cues (Moore, 2019).
It is known that speech intelligibility is usually better when the target talker is spatially separate from interfering sounds than when all sounds are spatially coincident, an effect known as spatial release from masking (SRM; Hirsh, 1950). SRM depends on a range of factors, including head-shadow effects (better-ear listening) and the use of binaural differences in TFS and ENV (Bronkhorst and Plomp, 1988). It has been suggested that binaural TFS information contributes to SRM by providing cues for the segregation of speech from background sounds, based on spatial location (Strelcyk and Dau, 2009; Swaminathan et al., 2016; Pichora-Fuller and Schneider, 1991, 1992). Swaminathan et al. (2016) assessed the contribution of binaural TFS cues to SRM using noise-vocoded speech. The noise carriers could be either the same at the two ears (partially preserving binaural TFS cues) or independent at the two ears (strongly disrupting binaural TFS cues). In the spatially separated conditions, the maskers were placed symmetrically relative to the participant's head to reduce the usefulness of long-term head-shadow differences. Swaminathan et al. (2016) found that disrupting the binaural TFS cues led to a significant reduction in SRM, suggesting an important role of binaural TFS cues in SRM.
The degree to which SRM aids speech intelligibility for adults has been quantified. Hawley et al. (2004) showed that a binaural advantage in speech reception thresholds (SRTs) of 6–7 dB was achieved by adults when the target speech was straight ahead and multiple speech or reversed-speech interferers were used. The speech-based interferers led to larger binaural advantages than speech-shaped noise or speech-modulated speech-shaped noise interferers, possibly due to more informative TFS for the former. The authors concluded that “the benefit of binaural hearing for speech intelligibility is especially pronounced when there are multiple voiced interferers at different locations from the target.” If the same is true for children, then binaural processing and the use of TFS cues would be important for children to be able to attend to important sound sources in challenging environments such as a busy classroom. However, the processing of binaural cues by children may be poorer than for adults.
In reverberant environments, such as a typical classroom (Bradley, 1986), sound arrives first by a direct path, shortly followed by reflections (echoes) that add to the total sound reaching the ears. Binaural cues derived from the leading (direct) sound dominate the perceived locations of sounds and help to perceptually suppress reverberation. This is known as the precedence effect (Blauert, 1997; Litovsky et al., 1999). Binaural TFS cues make an important contribution to the precedence effect (Litovsky et al., 1999). Litovsky and Godar (2010) showed that the lead-lag interval, or echo threshold, at which two separate sounds were heard, was longer for 4–5-year-old children (24 ms) than for adults (15 ms). Also, errors in localization were greater for the children than for the adults.
Considering the development of the auditory system, abilities such as the perception of speech in noise are known to mature over the first 10–12 years of life (Leibold and Buss, 2013; Corbin et al., 2016). However, there has been little research on binaural TFS sensitivity for children. In one study, Kane et al. (2021) measured TFS sensitivity across the lifespan using low-rate frequency modulation (FM) detection of a 500-Hz carrier with the FM either in phase or out of phase at the two ears. In the latter case, binaural TFS cues are available because the FM leads to periodic changes in interaural phase. Performance improved over the age range 5–13 years with larger effects of age for out-of-phase than in-phase FM. In another study, Lotfi et al. (2019) investigated binaural TFS sensitivity for a group of children (9–12 years old) with suspected auditory processing disorder (APD) and age-matched typically hearing controls. APD is characterized by difficulties in understanding speech in noise despite a pure-tone audiogram within the normal range (Dawes et al., 2008; Flanagan et al., 2018). Lotfi et al. (2019) assessed binaural TFS sensitivity for fixed frequencies of 250, 500, and 750 Hz using the “TFS-LF” test (temporal fine structure low frequency) developed by Hopkins and Moore (2010). This test measures the smallest detectable change in IPD from a reference IPD of 0°. They found that a subset (39%) of the participants with suspected APD had significantly higher (worse) thresholds than the control group. The control group showed similar performance to the YNHA tested by Hopkins and Moore (2010).
A different method for assessing binaural TFS sensitivity is via measurement of binaural masking level differences (MLDs). An MLD is the difference in the threshold of a signal in a background sound when the signal and background have the same interaural phase and level and when they differ in interaural phase and/or level. For example, for a pure tone presented in diotic noise, the threshold for detecting the tone is lower when the interaural phase of the tone is 180° (denoted NoSπ) than when it is 0° (denoted NoSo). The MLD for this pair of conditions is large for low frequencies, about 15 dB for frequencies near 500 Hz, and decreases with increasing frequency to 2–3 dB around 1500 Hz (Durlach and Colburn, 1978). The larger MLD at low frequencies is thought to reflect the use of TFS information, whereas the effect for frequencies above about 1500 Hz is thought to reflect the use of binaural ENV cues. Hall et al. (2004) investigated the development of the MLD for children aged 5–11 years. A brief 500-Hz tone was presented with a 50-Hz wide noise band centered at 500 Hz in NoSo and NoSπ conditions. The tone was placed either at an envelope minimum or envelope maximum. The MLDs were greater for the former than for the latter. The binaural advantage associated with the signal in the masker envelope minima increased with the age of the child. This could reflect an improvement in the ability to process binaural TFS, an improvement in temporal resolution, or both. Van Deun et al. (2009) measured MLDs for a click signal in a white noise masker; the reference condition used a click that was synchronous at the two ears, whereas the test condition used a click that was delayed at one ear relative to the other ear. With these broadband stimuli, the MLDs for 5–6-year-old children were similar to those of adults, although thresholds for the children were higher than for the adults in both conditions. 
A problem with the use of broadband stimuli, however, is that both ENV and TFS cues may contribute to the MLD, and their relative contribution is hard to determine.
The main goal of the present cross-sectional study was to obtain a relatively pure measure of binaural TFS sensitivity for typically developing young children with ages in the range 5–9 years old, all of whom were taking part in a longitudinal study of rhythm and reading development, the “Listening to Rhythm in Children” study. A test of binaural TFS sensitivity developed by Füllgrabe et al. (2017) was adopted as the basis for developing a child-friendly binaural TFS test. The test, called the temporal fine structure adaptive frequency (TFS-AF) test, assesses the highest frequency at which an IPD of ϕ° can be distinguished from an IPD of 0°. The higher this frequency, the better is the performance. The change in IPD is perceived as a change in the position of the sound within the head. The frequency limit for performing the test is highest when ϕ = 180° and decreases (worsens) when ϕ is decreased (Füllgrabe et al., 2017). Here, values of ϕ of 180° and 30° were used to create two degrees of difficulty.
To assess individual differences in monaural development, a simple auditory task of intensity discrimination was used. These data were available as part of the “Listening to Rhythm in Children” study, which included a range of psychoacoustic tasks. Hence, the stimulus characteristics were not chosen to match those of the stimuli used in the TFS-AF task. It is generally assumed that intensity discrimination does not depend on binaural processing, although thresholds for intensity discrimination are lower for diotic than for monaural presentation, indicating that information can be combined across the two ears (Jesteadt and Wier, 1977; Moore and Glasberg, 2007). For school-aged children aged 5–9 years old, intensity discrimination thresholds are negatively correlated with age (Moore and Linthicum, 2007; Buss et al., 2009). The intensity discrimination thresholds were used as a covariate in the statistical analysis to control for developmental monaural improvements, helping to isolate age-related changes in binaural processing, as assessed using the TFS-AF test.
Sixty-seven children participated in this study. Participants were required to be free of any diagnosed learning difficulties (i.e., dyslexia, dyspraxia, attention deficit hyperactivity disorder, autistic spectrum disorder, and speech and language impairments) and to speak English as their first language. All participants had normal vision or corrected-to-normal vision with spectacles. The participants received a short hearing screen, based on whether or not they could hear pure tones of various frequencies presented at 20 dB hearing level (HL), using a portable audiometer (Earscan, Micro Audiometrics, Daytona Beach, FL) equipped with a TDH-39 headset (Telephonics, Farmingdale, NY) fitted within noise-excluding earcups (Audiocups, Amplivox, Birmingham, UK). Each ear was tested separately using frequencies of 250, 500, 1000, 2000, 4000, and 8000 Hz. Three participants did not pass the screening, so their data were removed from the analysis. The remaining 64 participants were able to detect sounds with a level of 20 dB HL for both ears for all frequencies. Parental informed written consent was obtained for all participants, and the study was approved by the Psychology Research Ethics Committee of the University of Cambridge.
Standardized tests were used to confirm that participants had typical cognitive development and would be able to understand the task. The full-scale IQFS was estimated for 60 of the 64 participants (4 participants were less than 6 years old, the minimum age for the test), following the short-form IQSF approach of Aubrey and Bourdin (2018). Four subscales from the WISC-V (Wechsler Intelligence Scales for Children, fifth UK edition, Wechsler, 2016) were administered: two verbal (vocabulary and similarities) and two nonverbal (block design and matrix reasoning). For one participant, due to limited availability, the IQSF was calculated using only two subscales (nonverbal, block design; and verbal, vocabulary). In addition, the British Picture Vocabulary Scale (BPVS3; Dunn and Dunn, 2009), suitable for children aged 3–16 years old, was administered to check typical receptive (hearing) vocabulary. The BPVS is a standardized assessment with normal scores in the range 85–115. One child's IQ and BPVS scores were below the normal range, so the data for that child were removed from the analysis. The IQSF scores for the remaining 63 participants were normally distributed (Shapiro-Wilk, p > 0.05) with a mean of 106.6 (standard deviation, SD = 11.7), and the group mean BPVS score was 100.6 (SD = 9.6); both means are within the expected normal range. The mean age was 7 years, 7 months (91 months, SD = 12 months); the age range was 5 years, 6 months to 9 years, 4 months (66 to 112 months). There were 33 male and 30 female participants.
B. Stimuli and procedure
1. Binaural TFS sensitivity
To determine the range of frequencies over which there was sensitivity to binaural TFS, a version of the TFS-AF test of Füllgrabe et al. (2017) was used, adapted by S.A.F. to be child friendly. The stimuli were presented to both ears over headphones. A two-interval, two-alternative forced-choice procedure was used. In each interval, a sequence of four pure tones was presented, each 400-ms long (including 20-ms raised-cosine onset and offset ramps to reduce spectral splatter) and separated by 100 ms. In one interval, the “standard,” all four tones had an IPD = 0°. In the other interval, the “target,” tones two and four had an IPD = ϕ, whereas tones one and three had an IPD = 0°. The order of the standard and target was randomly selected for each trial. The task was to indicate the interval containing the target. For listeners sensitive to the IPD, the four tones of the standard appear at a fixed position toward the middle of the head, whereas the tones of the target stimulus appear to shift position within the head.
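As an illustration of the stimulus structure described above, the following Python sketch builds one standard and one target interval. It is not the software used in the study. The function names (`tone_with_ipd`, `make_interval`), the choice to apply the whole IPD as a phase lead in the right channel, and the use of the same onset and offset envelope at both ears are assumptions made for this example; the durations, ramps, gap, and 44.1-kHz rate are the values given in the text.

```python
import numpy as np

FS = 44100  # sampling rate in Hz, as stated in the test protocol


def tone_with_ipd(freq_hz, ipd_deg, fs=FS, dur_s=0.4, ramp_s=0.02):
    """Return an (n_samples, 2) array holding a pure tone with a given IPD.

    Assumption: the IPD is applied to the fine structure of the right
    channel only, and both channels share the same raised-cosine envelope.
    """
    n = int(round(dur_s * fs))
    t = np.arange(n) / fs
    left = np.sin(2 * np.pi * freq_hz * t)
    right = np.sin(2 * np.pi * freq_hz * t + np.deg2rad(ipd_deg))

    # 20-ms raised-cosine onset and offset ramps
    n_ramp = int(round(ramp_s * fs))
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    env = np.ones(n)
    env[:n_ramp] = ramp
    env[-n_ramp:] = ramp[::-1]

    return np.column_stack([left, right]) * env[:, None]


def make_interval(freq_hz, phi_deg, is_target, fs=FS, gap_s=0.1):
    """Four 400-ms tones separated by 100-ms silent gaps.

    Standard: all four tones have IPD = 0.
    Target: tones two and four have IPD = phi, tones one and three IPD = 0.
    """
    ipds = [0, phi_deg, 0, phi_deg] if is_target else [0, 0, 0, 0]
    gap = np.zeros((int(round(gap_s * fs)), 2))
    parts = []
    for i, ipd in enumerate(ipds):
        if i > 0:
            parts.append(gap)
        parts.append(tone_with_ipd(freq_hz, ipd, fs))
    return np.vstack(parts)


standard = make_interval(200.0, 180.0, is_target=False)
target = make_interval(200.0, 180.0, is_target=True)
```

With ϕ = 180°, the second and fourth tones of `target` are polarity-inverted at one ear relative to the other, whereas the two channels of `standard` are identical throughout.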
The child-friendly modification involved the addition of a graphical display, including two footballs spaced horizontally on the computer screen. The footballs became bigger and then smaller again, in turn, first the left, then the right, in time with interval one and interval two of a trial. The participants were asked to indicate the football (left or right) that went together with the sounds that moved or wobbled within the head. The participants responded using the left or right trigger button on a wired X-Box 360 controller (Microsoft, Redmond, WA) corresponding to the first (left) or second (right) interval. The trials were self-paced, and feedback was given in the form of on-screen tokens for each correct response. At the start of each run, there were five practice trials with feedback. There was further encouragement by way of feedback milestones after 14 and 28 correct responses.
Following Füllgrabe et al. (2017), the frequency of the tones was initially set to 200 Hz, and the frequency was adaptively adjusted using a two-up, one-down rule to estimate the 71% correct point on the psychometric function (Levitt, 1971). It has been shown that such adaptive procedures produce stable threshold estimates for school-aged children (Buss et al., 2001). The frequency was increased by a factor of 1.46 until the first reversal, then reduced by a factor of 1.21 until the second reversal, and then changed by a factor of 1.1 for subsequent reversals. The frequency was not allowed to be less than 200 Hz. If this limit was reached, the frequency remained at 200 Hz until two consecutive correct responses had occurred. A run continued until 8 reversals or 40 trials had occurred. The threshold for a given run was calculated from the geometric mean of the frequencies at the last four reversals. Occasionally, fewer than 8 reversals occurred after the limit of 40 trials because the frequency stayed at 200 Hz for many trials. In this case, the threshold was assigned a value of 200 Hz. This occurred for 2.6% of the runs. The 200-Hz starting tone had a level of 74 dB sound pressure level (SPL), and all higher frequencies were presented at approximately the same sensation level. The appropriate levels were calculated using a psychoacoustic model (Moore et al., 2016), assuming that participants had normal absolute detection thresholds. Correct functioning of the TFS-AF test as adapted for children was verified by trialling the test with three normal-hearing adults (NHA; one male, including two of the authors and one volunteer), whose ages ranged from 19 to 54 years (mean age 36 years). All had audiometric thresholds less than or equal to 20 dB HL for octave audiometric frequencies between 250 and 8000 Hz. The geometric mean thresholds for the NHA using the adapted test were 1321 Hz for ϕ = 180° and 975 Hz for ϕ = 30°. 
These thresholds are similar to the geometric mean TFS-AF thresholds for YNHA reported by Füllgrabe et al. (2017; 1330 Hz for ϕ = 180°; 887 Hz for ϕ = 30°), confirming that the TFS-AF test as adapted for children gave results consistent with those for the adult version.
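The 71% correct point targeted by the two-up, one-down rule follows from the equilibrium condition of the transformed staircase (Levitt, 1971). As a brief worked step, let p denote the probability of a correct response at a given frequency (a symbol introduced here for illustration only). The frequency moves up only after two consecutive correct responses and moves down otherwise, so the track settles at the frequency where the two moves are equally likely:

```latex
P(\text{up}) = p^{2}, \qquad
P(\text{down}) = (1 - p) + p(1 - p) = 1 - p^{2}, \qquad
p^{2} = 1 - p^{2} \;\Longrightarrow\; p = \sqrt{0.5} \approx 0.707 .
```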
Each child took part in three TFS-AF sessions on different days. This was done to avoid long sessions, which might have led to boredom and fatigue. There was an average of 3 weeks between sessions. Two runs were administered in each session, one with ϕ = 180° and one with ϕ = 30°, in counterbalanced order. Each threshold estimation took approximately 6 min.
2. Intensity discrimination
The intensity-discrimination task was presented as a child-friendly computer game based on the three-interval, two-alternative “Dinosaur” threshold estimation program originally developed by Dorothy Bishop (Oxford University). The stimuli consisted of unmodulated noise with the long-term average spectrum of speech [CCITT (ITU), 1988; Recommendation G.227]. This was created from Gaussian noise using the MATLAB (The MathWorks, Natick, MA) ccitt_filter function (Seeber, 2005). Each stimulus had a duration of 800 ms, including 50-ms linear rise and fall times. The level of the two reference noise bursts was 64 dB SPL. The level of the target noise burst ranged from 74 to 64 dB SPL in 40 steps of 0.25 dB. The participants were introduced to three cartoon dinosaurs. It was explained that each would make a sound in turn and the task was to decide which dinosaur was the loudest. An adaptive staircase procedure (Levitt, 1971) was used with an initial two-down one-up procedure followed by a three-down one-up procedure after two reversals. The initial step size was 2 dB, and the step size was halved after the fourth and sixth reversals. A run terminated after the eighth reversal or 40 trials, whichever occurred first. The threshold was estimated from one run and was calculated as the mean difference in level at the last four reversals.
C. Test protocol
The testing took place in schools. Participants were seated with the experimenter in a quiet office or library. The experiment was programmed in Presentation software (Neurobehavioural Systems Inc., Albany, CA) and run on a laptop computer (Lenovo T480 ThinkPad, Hong Kong). Participants listened over sound-isolating headphones (Sennheiser HDA300, Wedemark, Germany) via an external soundcard (DragonFly Black, AudioQuest, Irvine, CA) using a sampling rate of 44.1 kHz with 24-bit resolution. The equipment was calibrated using a Brüel & Kjær (Nærum, Denmark) artificial ear and headphone coupler. In addition, a calibration verification program was run at the start of each session.
III. RESULTS AND STATISTICAL ANALYSIS
A. TFS-AF low-scoring participants
The thresholds for some participants on the TFS-AF task were close to the starting frequency of 200 Hz and showed no improvement across runs. Such thresholds could result from random guessing, perhaps reflecting either a failure to understand the task or an inability to perform the task. A Monte Carlo simulation (n = 100 000) was conducted in MATLAB to estimate the distribution of thresholds that would be obtained if a child responded randomly. The mean expected threshold was 223 Hz (SD = 29 Hz), and the upper bound of the 95% confidence interval (CI) was 271 Hz. Based on this, we assumed that thresholds of 271 Hz or above reflected a genuine ability to perform the task, whereas thresholds below 271 Hz might have been based on random guessing. The thresholds below 271 Hz were designated Flow and those at or above 271 Hz were designated Fhigh.
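The logic of such a chance-level simulation can be sketched in Python as follows. This is a hedged reconstruction from the adaptive rule described in Sec. II B 1, not the authors' MATLAB code, and it is not expected to reproduce the reported values exactly. Several details are assumptions made for this example: a reversal is counted whenever the intended direction of the track changes (including when a downward step is blocked by the 200-Hz floor), the frequency at which the change occurs is logged as the reversal value, and the names `run_track` and `simulate_guessing` are hypothetical. The reported bound is numerically consistent with the mean plus 1.645 SD (223 + 1.645 × 29 ≈ 271 Hz), so the sketch computes the bound in that way.

```python
import random
from statistics import fmean, geometric_mean, pstdev

STEP_FACTORS = (1.46, 1.21, 1.10)  # before 1st reversal, before 2nd, thereafter


def run_track(p_correct, rng, start=200.0, floor=200.0,
              max_trials=40, max_reversals=8):
    """Simulate one adaptive run and return its threshold in Hz."""
    freq = start
    reversals = []
    direction = 0   # +1 = up, -1 = down, 0 = no step taken yet
    streak = 0      # consecutive correct responses since the last step
    for _ in range(max_trials):
        if rng.random() < p_correct:
            streak += 1
            if streak < 2:
                continue        # need two correct in a row to move up
            streak = 0
            step = +1
        else:
            streak = 0
            step = -1
        # Assumption: a change in intended direction counts as a reversal.
        if direction != 0 and step != direction:
            reversals.append(freq)
            if len(reversals) == max_reversals:
                break
        direction = step
        factor = STEP_FACTORS[min(len(reversals), 2)]
        freq = freq * factor if step > 0 else freq / factor
        freq = max(floor, freq)  # frequency is not allowed below 200 Hz
    if len(reversals) < max_reversals:
        return floor             # too few reversals: threshold set to 200 Hz
    return geometric_mean(reversals[-4:])


def simulate_guessing(n_runs=100_000, seed=1):
    """Distribution of thresholds for a listener who guesses (p = 0.5)."""
    rng = random.Random(seed)
    thresholds = [run_track(0.5, rng) for _ in range(n_runs)]
    mean = fmean(thresholds)
    sd = pstdev(thresholds)
    return mean, sd, mean + 1.645 * sd  # one-sided 95% bound (assumption)


mean_hz, sd_hz, upper_hz = simulate_guessing(n_runs=20_000)
```

Thresholds at or above the resulting upper bound would be treated as unlikely to arise from guessing alone.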
B. The effect of age on the change from Flow to Fhigh thresholds
Figure 1 shows the overall thresholds (geometric mean for runs 2 and 3; thresholds for run 1 were excluded because of evidence for practice effects, as described in Sec. III C) for all participants for each value of ϕ. From inspection of Fig. 1, the change from predominantly Flow thresholds to Fhigh thresholds occurred at about 7–8 years of age. To investigate this further, the group was split into two, based on the Flow and Fhigh thresholds. The age distributions for groups Flow and Fhigh for TFS180 and TFS30 were not normally distributed (Shapiro-Wilk, p < 0.05). An independent-samples Mann-Whitney U test for TFS180 indicated that the median age for the Flow group (6 years, 6 months) was significantly lower than the median age of the Fhigh group (8 years, 2 months), U = 213, p < 0.001, r = 0.44. For TFS30, the median age for the Flow group (7 years, 4.5 months) was significantly lower than the median age of the Fhigh group (8 years, 5 months), U = 155, p < 0.001, r = 0.57.
Performance on the TFS-AF task is highest when ϕ is large (135°–180°; Füllgrabe et al., 2017). Thus, for the 23 children with Flow thresholds for ϕ = 180°, there is no evidence that they were able to do the TFS-AF task. Therefore, subsequent analyses of the effects of age on the TFS-AF thresholds were based on the geometric mean thresholds for runs 2 and 3 for the 40 participants with TFS180 Fhigh thresholds (24 male; mean age 8 years, 0 months, SD = 10 months).
C. Assessment of practice effects for the TFS-AF task
Inspection of the data suggested that performance was often poorer for run 1 than for runs 2 or 3. To assess whether there were practice effects, the thresholds were analyzed only for the 40 participants whose geometric mean TFS180 thresholds for runs 2 and 3 were 271 Hz or above. The individual and geometric mean thresholds for the participants meeting this criterion are shown in Fig. 2 for each of the three runs of the TFS-AF task.
A Shapiro-Wilk test on the log-transformed data showed that the data were not normally distributed. Therefore, thresholds for the three runs were compared for each value of ϕ using Friedman's analysis of variance. The significance levels were Bonferroni adjusted for multiple comparisons. For TFS180, although thresholds tended to be lower for run 1 than for runs 2 and 3, the effect of run number was not significant (p = 0.066). Median thresholds for runs 1–3 were 635, 905, and 987 Hz, respectively. For TFS30, there was a significant effect of run number, χ²(2) = 9.774, p = 0.008. The median thresholds for runs 1–3 were 211, 278, and 309 Hz, respectively. Post hoc analyses showed a significant difference between the threshold values for runs 1 and 2 (z = −2.516, p = 0.036 with a small effect size r = −0.11) and between thresholds for runs 1 and 3 (z = −2.851, p = 0.013 with a small to medium effect size r = −0.32). There was no significant difference between the thresholds for run 2 and run 3 (p > 0.05).
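The Bonferroni-adjusted significance levels for the post hoc comparisons can be checked with a few lines of Python. This is only an arithmetic sketch under two assumptions that are not stated in the text: that each z statistic is referred to a standard normal distribution with a two-sided test, and that the adjustment multiplies each p-value by three (one for each pairwise comparison among the three runs). The function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p, n_comparisons):
    """Bonferroni adjustment, capped at 1."""
    return min(1.0, p * n_comparisons)


# z statistics reported for the TFS30 post hoc comparisons
z_values = {"run 1 vs run 2": -2.516, "run 1 vs run 3": -2.851}

# Assumption: three pairwise comparisons among the three runs
adjusted = {label: bonferroni(two_sided_p(z), 3) for label, z in z_values.items()}

for label, p_adj in adjusted.items():
    print(f"{label}: adjusted p = {p_adj:.3f}")
```

Under these assumptions, the adjusted values come out at about 0.036 and 0.013 for the two comparisons.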
In summary, for both values of ϕ, performance tended to improve from run 1 to run 2, but there was no significant difference between runs 2 and 3, suggesting that stable performance was reached after two runs. This is similar to the findings for YNHA (Füllgrabe et al., 2017).
D. TFS-AF main effect of phase
Figure 3 shows geometric mean thresholds across runs 2 and 3 for each of the 24 participants with TFS Fhigh thresholds for both values of ϕ (connected by solid lines), group mean thresholds (× symbols connected by a dashed line), and the geometric mean threshold for nine YNHA, taken from Füllgrabe et al. (2017; circles connected by a dotted line).
Inspection of Fig. 3 shows that average thresholds for both values of ϕ were higher for YNHA than they were for children. For both age groups, the thresholds were higher for ϕ = 180° than they were for ϕ = 30°. To test the significance of these effects, a linear mixed-effects model was fitted to the data using the lmer() function of the lme4 package (Bates et al., 2015) in R (R Core Team, 2020). The dependent variable was log(threshold frequency) with the between-subjects factor age group (child, adult) and within-subject factor ϕ (30°, 180°).
An analysis of variance using Satterthwaite's method for determining p-values was applied to the model. There was a significant main effect of phase ϕ [F(1,31) = 74.04, p < 0.001, ηp² = 0.70]. There was a significant main effect of age group [F(1,31) = 28.35, p < 0.001, ηp² = 0.48]. There was also a significant interaction between ϕ and age group [F(1,31) = 8.26, p < 0.01], reflecting the finding that the difference between the adults and children was greater for ϕ = 30° than for ϕ = 180°.
E. Intensity discrimination task
The mean intensity difference limen (DL; expressed as the change in level at threshold) for 39 participants (1 of the 40 children showing above-chance thresholds for ϕ = 180° was unavailable for measurement of the intensity DL) was 3.9 dB (SD = 2.0 dB). As expected from previous research (Jensen and Neff, 1993; Buss et al., 2009; Moore and Linthicum, 2007), a linear regression analysis showed that intensity DLs decreased with increasing age [β = −0.328, t(38) = −2.11, p = 0.041]. However, the R² value was only 0.108.
F. Development of TFS-AF sensitivity
Among children with measurable TFS180 thresholds, there was a trend for thresholds to increase (improve) with increasing age, as illustrated in Fig. 4. The correlation of the TFS180 thresholds with age was significant (Spearman's ρ = 0.393, p = 0.013). To assess whether the correlation was driven by monaural processing efficiency (as assessed using intensity DLs) or cognitive functioning (as assessed by IQ) on TFS sensitivity, partial correlations of TFS180 thresholds with age were calculated with the effect of IQ scores and intensity DLs partialled out. The partial correlation was significant (Spearman's ρ = 0.367, p = 0.026). This suggests that the relationship of the TFS-AF thresholds with age was largely independent of cognitive functioning as measured by IQ or monaural processing efficiency as measured by intensity DLs. Among children with measurable TFS30 thresholds, there was a nonsignificant trend for thresholds to increase (improve) with increasing age (Spearman's ρ = 0.154, p > 0.05).
To predict the age at which adult-like TFS180 sensitivity would be achieved, the relationship between the TFS180 thresholds and age was examined. The scatterplot of standardized predicted values against standardized residuals showed that the data met the assumption of homoscedasticity and were, therefore, suitable for linear regression modelling. The maximum Cook's D was 0.169, which is below the threshold of one, indicating that there were unlikely to be any individual influential outliers (Field, 2009, p. 217). Hence, the data for all 40 participants were included in the analysis. The fitted regression equation was
Scatterplots of the TFS180 thresholds against age, together with the fitted regression line, are shown in the upper panel of Fig. 4.
The regression analysis showed that age significantly predicted the TFS180 thresholds [β = 0.397, t(39) = 2.67, p = 0.011]. However, the R² value was only 0.158, indicating that the model accounted for only 15.8% of the variance in the TFS180 thresholds. Adding the intensity DL as a predictor in the model did not significantly improve the percentage of variance accounted for. From Eq. (1), the age at which adult-like TFS180 thresholds would be reached (1400 Hz) was estimated to be 10 years, 2 months (122 months). Note, however, that this estimated age is outside the range of the ages tested. Also, the estimated age may have been biased by omission of the data for children who did not have measurable thresholds.
Twenty-four of the TFS30 thresholds were above the 95% CI for chance performance as indicated by the squares in the lower panel of Fig. 4. The regression line fitted to these thresholds is shown in that panel. Age did not significantly predict the TFS30 thresholds. This may reflect the limited age range of the participants with above-chance thresholds and the limited number of participants. Nevertheless, the data show a developmental trend. This trend can also be seen in the whole group data in Fig. 1 (TFS30 lower panel, N = 63) for which the median age of the Flow group (7 years, 4.5 months) was significantly lower (p < 0.001) than the median age of the Fhigh group (8 years, 5 months).
A. Summary and interpretation
The TFS-AF task was administered three times on separate occasions. There was a small increase (improvement) in the thresholds between the first session and the second and third sessions, but the thresholds for the second and third sessions did not differ significantly. The improvement was most likely the result of the increasing task familiarity and is similar to the practice effects reported for YNHA (Füllgrabe et al., 2017). However, 23 of the children did not seem able to do the task, their thresholds being below the estimated 95% CI for chance performance for the easiest condition (ϕ = 180°). The median age of the TFS180 Flow group (N = 23) was 6 years, 6 months, suggesting that at or below this age, children often found the task too difficult to achieve measurable thresholds. The median age of the TFS180 Fhigh group (N = 40) was 8 years, 2 months, suggesting that at or above this age, children are likely to achieve measurable thresholds. As found for YNHA (Füllgrabe et al., 2017), thresholds were higher (better) for ϕ = 180° than for ϕ = 30°. The thresholds were significantly lower for the children than for YNHA tested by Füllgrabe et al. (2017). For the children, thresholds increased (improved) significantly with increasing age for ϕ = 180°. The interaction between ϕ and participant group (child vs adult) probably occurred because the group differences tend to be larger when the task is more difficult.
The thresholds for the intensity discrimination task might be regarded as providing a measure of monaural auditory processing. Similarly, IQ scores might be regarded as providing a measure of cognitive functioning. The scores for the TFS-AF task and age were correlated when the effects of the intensity DLs and IQ were partialled out. This makes it likely that individual differences in thresholds for the TFS-AF test partly reflected differences in age rather than differences in monaural sensitivity or cognitive ability.
B. Development of binaural TFS sensitivity
Linear regression analysis showed a significant relationship between age and TFS180 thresholds. Only one child, aged 8 years, 2 months (98 months), reached the average TFS180 threshold displayed by adults (1400 Hz). None of the children's TFS30 values approached the adult average of 1100 Hz with the oldest child (9 years, 4 months; 112 months) having the highest TFS30 threshold of 625 Hz. However, the regression analysis revealed that age was responsible for only 15.8% of the unique variance in the TFS180 values. For any given small age range, there was a considerable range of TFS-AF thresholds. The origin of these individual differences is unknown. Similar large individual differences have been observed among YNHA (Füllgrabe et al., 2017).
The individual differences in children's TFS-AF thresholds may be related to immaturities in auditory sensory processing (Hall et al., 2004), to differences in the efficiency of processing sensory information, or both. It has been suggested that for auditory tasks such as amplitude modulation detection or intensity discrimination, performance is limited by “internal noise” (Buss et al., 2006; Cabrera et al., 2019). One potential source of internal noise is related to myelination of neurons in the central auditory pathways. Myelination is crucial for fast conduction and consistent timing of action potentials, allowing a precise neural representation of IPDs at the medial superior olive in the brainstem (Long et al., 2018). However, myelination of the auditory system begins around the 26th week of gestation, and myelination of the brainstem and midbrain appears to be largely complete by 1 year of age (Su et al., 2008). Hence, it seems unlikely that changes in myelination in the brainstem or midbrain are responsible for the developmental trend in binaural TFS sensitivity found in our study.
Spatial hearing is abnormal for children using cochlear implants (Litovsky and Gordon, 2016) even when implantation is bilateral and prelingual (Smieja et al., 2020). Comparison of children with bilateral cochlear implants and typically hearing children has shown distinct group differences in cortical auditory networks (Smieja et al., 2020). Development and connectivity in the auditory cortex continue for a considerable time, with changes in the primary auditory cortex not being complete until adulthood (Su et al., 2008). Accordingly, individual differences in the development of the central auditory system may contribute to individual differences in the TFS-AF thresholds found here.
It seems reasonable to assume that the thresholds for most of the participants in our study, especially those with Flow thresholds, will continue to improve with increasing age. The predicted age for reaching adult-like thresholds was 10 years, 2 months for ϕ = 180°. This prediction is consistent with the current understanding that the auditory system continues to develop into the early teens (Eisenberg et al., 2000; Stuart, 2005; Corbin et al., 2016; Litovsky, 2015). However, it would be worthwhile to test a group of older children to more accurately characterize the age at which adult-like thresholds are achieved, because many of the children in the age range tested here did not achieve thresholds above those that would be expected by chance.
C. Relevance to SRM
The limited frequency range over which younger children are able to make use of binaural TFS cues may partially explain why they have problems understanding speech in noise, especially for spatially distributed sound sources (Leibold and Buss, 2013; Vaillancourt et al., 2008; Van Deun et al., 2009). SRM depends partly on the use of binaural TFS cues to localize and perceptually segregate target speech and interfering sounds (Swaminathan et al., 2016). When there is only a small angular separation between the target and interfering sounds, there will be a correspondingly small difference in the IPD between the target and interferers, comparable to the smaller value of ϕ used here. For ϕ = 30°, binaural TFS cues could only be used over a small frequency range for many children, and this would limit SRM. It would be worthwhile to test the hypothesis that SRM is related to TFS-AF thresholds directly by measuring both in the same study.
D. Possible application to electro-acoustic hearing
Children with severe or profound hearing loss but with some residual hearing at low frequencies are often provided with bilateral cochlear implants to provide information about medium- and high-frequency sounds together with bilateral hearing aids to provide information about low-frequency sounds. This is called electro-acoustic stimulation (EAS). The frequency range over which the hearing aids provide gain is selected by the clinician, and is sometimes based on the presence and edge frequencies of dead regions in the cochlea (Zhang et al., 2014). One might argue that the frequency range over which there is binaural sensitivity to the TFS should also be taken into account; hearing aids should be programmed to provide audibility at least up to the highest frequency at which there is binaural sensitivity to TFS. Unfortunately, the present data cannot be used for this purpose because binaural sensitivity to TFS worsens with increasing hearing loss but with large individual variability (Füllgrabe and Moore, 2018), whereas the present data were obtained using normal-hearing children. However, the method described in this paper could be used to assess binaural sensitivity to TFS for children with residual low-frequency hearing, and this information might prove to be useful in fitting EAS.
The frequency range over which children were sensitive to binaural TFS cues, as assessed using the TFS-AF test, was smaller than that for adults and tended to increase with age over the range 5 years, 6 months to 9 years, 4 months. Some children, especially younger children, were unable to perform the task. Linear regression analyses based on the thresholds for the children who could perform above chance for ϕ = 180° suggested that adult-like TFS-AF thresholds would not be reached until 10–11 years old. The correlation of TFS-AF thresholds with age remained significant when the effects of intensity discrimination and IQ were partialled out. This suggests that the thresholds for the TFS-AF test reported here do provide a measure of the children's ability to process binaural TFS. The relatively poor binaural TFS processing abilities of children may limit the SRM that occurs for them when listening to speech in the presence of spatially distributed interfering sounds.
This project was funded by the Fondation Botnar, Grant No. 6064, to U.G. The sponsor played no role in the study design, data interpretation, or writing of the report. We thank Emily Buss and two reviewers for helpful comments on earlier versions of this paper.