Recent studies have found that envelope following responses (EFRs) are a marker of age-related and noise- or ototoxic-induced cochlear synaptopathy (CS) in research animals. Whereas the cochlear injury can be well controlled in animal research studies, humans may have an unknown mixture of sensorineural hearing loss [SNHL; e.g., inner- or outer-hair-cell (OHC) damage or CS] that cannot be teased apart in a standard hearing evaluation. Hence, a direct translation of EFR markers of CS to a differential CS diagnosis in humans might be compromised by the influence of SNHL subtypes and differences in recording modalities between research animals and humans. To quantify the robustness of EFR markers for use in human studies, this study investigates the impact of methodological considerations related to electrode montage, stimulus characteristics, and presentation, as well as analysis method on human-recorded EFR markers. The main focus is on rectangularly modulated pure-tone stimuli to evoke the EFR based on a recent auditory modelling study that showed that the EFR was least affected by OHC damage and most sensitive to CS in this stimulus configuration. The outcomes of this study can help guide future clinical implementations of electroencephalography-based SNHL diagnostic tests.

Pure-tone audiometry and otoacoustic emissions (OAEs) are standard clinical tests in assessing hearing thresholds and outer-hair-cell (OHC) damage, respectively. However, these tests are expected to be insensitive to supra-threshold hearing deficits associated with auditory-nerve fiber (ANF) damage, i.e., cochlear synaptopathy (CS; Kujawa and Liberman, 2009). CS is a recently discovered type of sensorineural hearing loss (SNHL) and refers to damaged ANF synapses that innervate the inner-hair-cells (IHCs; Kujawa and Liberman, 2009; Fernandez et al., 2020). Animal research studies have shown that acoustic overexposure (Fernandez et al., 2020), ototoxic drugs (Shaheen et al., 2015), and ageing (Parthasarathy and Kujawa, 2018) can compromise the ANF integrity, and noise-induced ANF degeneration can accelerate age-related hearing dysfunction (Fernandez et al., 2015). Because CS can hide behind a normal audiogram and normal OAEs, CS is also referred to as “hidden hearing loss.”

Because the cochlea is deeply embedded within the temporal bone, CS can hitherto only be quantified through post-mortem temporal bone histology (Makary et al., 2011; Wu et al., 2019). Human temporal bone studies have shown that each IHC is contacted by 10–15 synapses with individual ANFs (Wu et al., 2019) and that the number of intact ANFs reduces with age (Makary et al., 2011; Viana et al., 2015; Wu et al., 2019). Since the discovery of CS in 2009 (Kujawa and Liberman, 2009), several studies have been initiated to develop sensitive and noninvasive diagnostic markers of synaptopathy (Shaheen et al., 2015; Mehraei et al., 2016; Valero et al., 2016; Bharadwaj et al., 2019; Vasilkov et al., 2021) and to study the consequences of CS for sound detection (Bharadwaj et al., 2015; Oxenham, 2016; Prendergast et al., 2017; Verhulst et al., 2018) and speech perception (Bharadwaj et al., 2015; Guest et al., 2018; Smith et al., 2019; Garrett, 2020; Mepani et al., 2021). Last, the potential role of CS in tinnitus and hyperacusis generation has been studied (Schaette and McAlpine, 2011; Guest et al., 2017; Paul et al., 2017; Wojtczak et al., 2017; Bramhall et al., 2019; Verhulst et al., 2022).

Animal research and computational auditory modelling studies have shown the potential of envelope following responses (EFRs) in CS diagnosis and quantification. For stimulus modulation frequencies roughly above 80 Hz, EFRs are a far-field potential that reflects how well the population of ANFs phase-locks to the stimulus envelope (Shaheen et al., 2015; Wilson et al., 2021). From a group delay perspective, modulation frequencies above 80 Hz provide EFR group delays that are consistent with peripheral and brainstem processing (Dolphin and Mountain, 1992; Purcell et al., 2004; Zhong et al., 2014; Wilson et al., 2021). Furthermore, kainic-acid-induced EFR reductions match auditory brainstem response (ABR) wave-I amplitude reductions well for modulation frequencies above 80 Hz (Wilson et al., 2021). Several rodent studies have shown that the EFR strength is proportional to the number of histologically verified synapses (Shaheen et al., 2015; Möhrle et al., 2016; Parthasarathy and Kujawa, 2018). It should be noted that these strong relations were found at high modulation rates (∼1 kHz), well above those used in human studies. Thus, computational models of the human auditory periphery can be used to assess whether EFRs with lower modulation rates can be used in the differential diagnosis of CS in humans (Paul et al., 2017; Verhulst et al., 2018; Keshishzadeh et al., 2020; Encina-Llamas et al., 2021; Vasilkov et al., 2021). Using these models, the respective and combined effects of CS and hair-cell damage on the peripheral generators of the EFR can be simulated and compared to EFR recordings in different study populations. For example, a recent study provided evidence that the EFR to a 120-Hz rectangularly amplitude-modulated (RAM) pure tone (RAM-EFR) is more sensitive to individual EFR differences than conventional sinusoidally amplitude-modulated (SAM) stimuli or the ABR amplitudes (Vasilkov et al., 2021). 
The same study used model simulations to provide evidence that stimuli with rectangular envelopes and deep modulation depths enhance the specificity of the EFR to CS. The relative contribution of OHC damage to the EFR was strongly reduced compared to other stimulus configurations as a result of its sharply rising envelope. In the current absence of direct imaging methods for CS-quantification in humans, the combination of human/animal experiments and model simulations is the only available method to further develop and improve noninvasive markers of CS. There is a need to proceed with the most promising CS-markers to study the risk-factors for CS and its functional consequences until the field develops imaging (or other) methods for direct CS-quantification in living humans.

In this study, we recorded EFRs in younger and older participants to study the effect of different stimulation and analysis configurations (i.e., three different epoch numbers, five different analysis methods, and three different carrier frequencies) on the quality and robustness of EFR markers. We assume that the older cohort is representative of a population with overall lower EFR magnitudes than the younger cohort due to their presumed age-related CS. Human temporal bone histology has shown that from the age of 45 to 50 years old, there is considerable CS with a loss of ANFs that exceeds 40% (Wu et al., 2019). We investigated which of the EFR methods was better able to separate the younger from the older cohort using independent t-tests. Based on data collected in one of our previous studies, we furthermore assessed the general test-retest reliability of EFR measures. Last, we performed model simulations to quantify the specificity and sensitivity of various EFR markers to CS when OHC damage was also introduced. Our study outcomes can help steer the translation and development of EFR markers of CS toward robust precision hearing diagnostic tools for use in clinical practices.

We examined the effect of the number of recorded epochs, five different analysis methods, and three different carrier frequencies on the EFR magnitude. We then tested how EFR recordings were affected by the adopted electrode configuration. The recording and analysis parameters were varied to assess which configuration yielded robust EFRs in younger and older study participants.

The study included 41 participants divided over 3 groups. A power analysis based on previously collected EFR data (Keshishzadeh et al., 2020) showed that groups of N = 15 are sufficient to perform group difference statistics (α = 0.05) with a power of 0.85. In that study, t-tests for EFR markers in groups of 15 subjects reached the following sensitivity and power: EFR amplitude [standard deviation (SD) = 0.03 μV, sensitivity = 0.025 μV, and power = 0.85]. We recruited 15 participants per subgroup and excluded 4 subjects in the cohort based on an inspection of the quality of the raw EFR and ABR recordings. Two participants in the young group 1 had a strong posterior auricular muscle reflex in the ABR that would also contaminate the EFR recording and, thus, were excluded. Two participants in the old group were excluded because they either had an incomplete data collection or excessive noise in the recordings, presumably due to a bad electrode contact. Afterward, EFR magnitudes of a young cohort [young group 1, mean age 23.92 years old ± 1.19 SD, minimum (min) age = 22 years old, maximum (max) age = 25 years old; N = 13, 12 females] were compared to those of an older cohort (old group, mean age 54.69 years old ± 6.0 SD, minimum age = 46 years old, maximum age = 64 years old; N = 13, 12 females) for different epoch numbers, analysis methods, and carrier frequencies. It should be noted that the impact of the removed datapoints on the statistical models was considerable. When the outliers were included in the dataset, we observed no significant influence of age, epoch number, or carrier frequency. It is important to include older participants in this analysis as they are expected to reflect the natural variation of EFR recordings in a clinical diagnostic context. 
An additional 15 young normal-hearing (NH) participants (young group 2, mean age 22.20 years old ± 1.26 SD, minimum age = 21 years old, maximum age = 25 years old; N = 15, 14 females) were recruited to study the effect of electrode configuration on the EFR magnitude. To ensure we did not include participants with a strong noise-exposure history, all of the participants completed a questionnaire prior to testing, which is based on Keppler (2015). This questionnaire was completed online or on paper and consisted of sociodemographics and questions related to subjective hearing status, present hearing-related symptoms, history of ear surgery, and known pathologies and illness. The questionnaire also contained 13 questions on noise exposure at work and/or during leisure time. Participants had to indicate their current and past attendance at activities with noise exposure, the subjective estimation of loudness, and the use of hearing protection at these activities. We did not have to exclude participants based on their questionnaire answers. Head circumference was measured (in cm) with a tape measure to study its effect on the EFR magnitude. Otoscopy and tympanometry were performed to ensure a normal outer-ear-canal and middle-ear status using a GSI TympStar (Grason-Stadler Inc., Eden Prairie, MN) tympanometer with a 226 Hz, 85 dB sound pressure level (SPL) probe tone. All of the test ears had normal otoscopic results and a “type A” tympanogram. Air-conduction thresholds were measured using an Interacoustics Equinox audiometer (Middelfart, Denmark) and the modified Hughson-Westlake procedure. The conventional octave frequency pure tones of 0.125, 0.250, 0.500, 1, 2, 4, and 8 kHz and half-octave frequencies 3 and 6 kHz were delivered via Interacoustics TDH-39 headphones. Extended high-frequency (EHF) pure tones of 10, 12.5, 14, and 16 kHz were presented over Sennheiser HDA-200 headphones (Wedemark, Germany). 
Pure-tone audiometric thresholds at 4 kHz had to be below 25 dB hearing level (HL) for participants to be included. The ear with the best audiometric threshold at 4 kHz was chosen as the test ear for monaural EFR stimulation. Figure 1 shows the mean and individual audiograms of the three groups, and Table I summarizes the mean audiometric thresholds of the participants for the three frequencies corresponding to the carrier frequencies of the tested EFRs (i.e., 2, 4, and 6 kHz). From here on, “□” stands for young group 1, “△” indicates young group 2, and “○” depicts the old group.

FIG. 1.

The individual (thin gray line) and mean (thick black line) audiograms of the three groups showing the (a) young group 1, □; (b) young group 2, △; and (c) old group, ○.

TABLE I.

Pure-tone average (PTA ± SD) thresholds (dB HL) at the three carrier frequencies and mean PTAs across two frequency regions: low- to high-frequency (0.125–6 kHz) and EHF (8–16 kHz) for the three groups. Values are given as mean (SD).

| | Young group 1 | Young group 2 | Old group |
|---|---|---|---|
| 2 kHz | 2.3 (5.25) | 1.3 (6.40) | 5.0 (3.53) |
| 4 kHz | 2.7 (4.84) | 3.0 (4.93) | 13.5 (6.89) |
| 6 kHz | 13.1 (6.63) | 10.0 (7.07) | 18.1 (7.51) |
| PTA 0.125–6 kHz | 6.1 (3.84) | 3.9 (2.56) | 8.8 (2.54) |
| EHF thresholds | 11.0 (10.67) | 6.3 (5.71) | 41.0 (15.16) |

Ethical approval was obtained from the local ethical committee, and all of the participants were informed about the experimental procedures and signed an informed consent form before the experiment. The measurements of this study took place during the COVID-19 pandemic, when wearing a face mask in the hospital was mandatory for researchers and participants.

We collected EFRs with the Universal Smart Box (Intelligent Hearing Systems, Miami, FL) using the SEPCAM software (Intelligent Hearing Systems, Miami, FL) in a quiet room with all of the inessential electrical devices turned off. During the recording, participants sat in a reclining chair and watched a silent, captioned movie of their choice to facilitate a relaxed, yet wakeful state. This wakeful state was important to avoid alpha contamination. Alpha activity appears when the eyes are closed, is a stronger potential than the EFR source, and could, therefore, affect the quality of the recordings even if the generators are spatially different (Galbraith and Arroyo, 1993; Hoormann et al., 2000; Galbraith et al., 2003). All of the stimuli were delivered monaurally over shielded ER-2 insert-ear transducers (Etymotic Research, Chicago, IL). After transducer placement, both ears were covered with earmuffs (Busters, Kontich, Belgium) to minimize noise intrusion.

1. EFR stimuli

We recorded EFRs to RAM pure-tone stimuli because previous studies (Mepani et al., 2021; Vasilkov et al., 2021) showed larger EFR magnitudes for stimulation with RAM vs conventional SAM stimuli. The RAM stimulus modulation frequency was 110 Hz, its modulation depth was 100%, and its duty cycle was 25% for pure tones with carrier frequencies of 2, 4, or 6 kHz. Figure 2 shows two modulation cycles of the adopted RAM stimuli in the time domain along with their respective spectra. A single stimulus epoch was 500 ms long, and stimuli were presented 1000 times with alternating polarity. The RAM stimuli were calibrated to have the same peak-to-peak amplitude as a 70 dB SPL SAM-tone (carrier, 4 kHz; modulation frequency, 110 Hz; modulation depth, 100%). Consequently, the calibrated RAM stimuli were presented at 68.24 dB SPL, and stimuli for different pure-tone carriers had the same peak-to-peak amplitude.
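The level relation described above can be checked numerically. The following is a minimal Python sketch (standard library only), not the authors' stimulus code: the 48-kHz sampling rate, the starting phase of the rectangular gate, and all function names are assumptions made for illustration. Analytically, a 100%-depth SAM tone with peak amplitude 1 has a mean-square value of 0.1875, whereas a 25%-duty-cycle RAM tone with the same peak has 0.125, so the RAM level is 10·log10(0.125/0.1875) ≈ −1.76 dB relative to the SAM tone.

```python
import math

def sam_tone(fc, fm, dur, fs):
    """100%-depth sinusoidally amplitude-modulated tone, peak envelope = 1."""
    n = int(round(dur * fs))
    return [0.5 * (1.0 + math.cos(2 * math.pi * fm * k / fs))
            * math.sin(2 * math.pi * fc * k / fs) for k in range(n)]

def ram_tone(fc, fm, dur, fs, duty=0.25):
    """Rectangularly amplitude-modulated tone: carrier on for `duty` of each cycle."""
    n = int(round(dur * fs))
    out = []
    for k in range(n):
        phase = (fm * k / fs) % 1.0           # position within the modulation cycle
        gate = 1.0 if phase < duty else 0.0   # 100% modulation depth
        out.append(gate * math.sin(2 * math.pi * fc * k / fs))
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def peak_to_peak(x):
    return max(x) - min(x)

FS = 48000  # assumed playback rate (not stated in the text)
sam = sam_tone(4000, 110, 0.5, FS)
ram = ram_tone(4000, 110, 0.5, FS)

# Match peak-to-peak amplitudes, then express the RAM level
# relative to a SAM tone calibrated at 70 dB SPL.
scale = peak_to_peak(sam) / peak_to_peak(ram)
ram = [scale * v for v in ram]
ram_level_db = 70.0 + 20 * math.log10(rms(ram) / rms(sam))
print(round(ram_level_db, 2))
```

Under these assumptions the printed level lands close to the 68.24 dB SPL reported above; small deviations can arise from the discrete sampling of the waveform peaks.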

FIG. 2.

(Color online) Two modulation cycles of the time-domain waveform of the RAM stimuli with 110 Hz modulation frequency and three carrier frequencies with (a) 2000 Hz, (b) 4000 Hz, and (c) 6000 Hz. The magnitude spectra for these carriers are shown in panels (d), (e), and (f), respectively. Spectral peaks represent the carrier frequency of the corresponding stimulus, and the side lobes represent the spread of spectral energy generated through applying the rectangular modulator.

2. EFR processing

The recorded EFR signals were saved in “.EEG.F” format in SEPCAM using a sampling frequency of 10 kHz and then converted to “.mat” format using a custom-made “sepcam2mat” MATLAB function. The raw recordings were filtered using an 800th-order finite-impulse-response (FIR) bandpass filter with low and high cutoff frequencies of 30 and 1500 Hz, respectively. The designed filter was applied to the electroencephalography (EEG) signals using the “filtfilt” function of MATLAB to avoid filter-induced phase shifts. After filtering, we epoched the signal between 100 and 500 ms relative to the stimulus onset (i.e., 400 ms epochs) to focus our analysis on the sustained response. We subtracted the mean of each epoch to correct for the baseline drift and then discarded the 20% of the recorded epochs with the highest peak-to-peak values to exclude the noisiest epochs (discarding an equal number of positive and negative polarity traces). Note that the fixed 20% discarding limit is a conservative method and could be further refined using online averaging or a specific signal-to-noise ratio (SNR) criterion. However, when studying individual differences, it is best to apply the same procedures in all subjects. We note that the labeling of the number of epochs in the figures refers to the recorded epochs (i.e., 1000, 800, and 600) and not to the epochs after noise reduction (i.e., 800, 640, and 480).
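The epoching and rejection steps described above can be sketched as follows. This is an illustrative Python sketch using only the standard library, not the authors' MATLAB pipeline; the filtering stage is omitted, the function names are hypothetical, and the way the polarity balance is enforced (dropping the noisiest 20% within each polarity separately) is an assumption about one straightforward way to discard equal numbers of positive- and negative-polarity traces.

```python
import random

def sustained_window(trial, fs, t_start=0.1, t_stop=0.5):
    """Cut the 100-500 ms segment (re stimulus onset) out of one recorded trial."""
    return trial[int(round(t_start * fs)):int(round(t_stop * fs))]

def reject_noisy_epochs(epochs, polarities, discard_fraction=0.2):
    """Baseline-correct epochs, then drop the noisiest ones within each polarity.

    epochs     : list of equal-length lists of samples
    polarities : list of +1 / -1 stimulus polarities, one per epoch
    Returns (sorted indices of retained epochs, retained baseline-corrected epochs).
    """
    corrected = []
    for ep in epochs:
        mean = sum(ep) / len(ep)
        corrected.append([v - mean for v in ep])   # remove baseline offset

    keep = []
    for pol in (+1, -1):
        idx = [i for i, p in enumerate(polarities) if p == pol]
        # rank this polarity's epochs by peak-to-peak amplitude, quietest first
        idx.sort(key=lambda i: max(corrected[i]) - min(corrected[i]))
        n_keep = len(idx) - int(round(discard_fraction * len(idx)))
        keep.extend(idx[:n_keep])
    keep.sort()
    return keep, [corrected[i] for i in keep]

# Demo on synthetic noise: 1000 alternating-polarity trials, one with a large artifact.
rng = random.Random(0)
fs = 200   # reduced rate to keep the demo fast (the recordings used 10 kHz)
trials = [[rng.gauss(0.0, 1.0) for _ in range(fs // 2)] for _ in range(1000)]
trials[10] = [50.0 * v for v in trials[10]]   # simulated artifact
polarities = [+1 if i % 2 == 0 else -1 for i in range(1000)]
epochs = [sustained_window(tr, fs) for tr in trials]
kept, clean = reject_noisy_epochs(epochs, polarities)
print(len(kept))
```

With 1000 recorded trials this retains 800 epochs (400 per polarity), matching the recorded versus retained epoch counts quoted in the text.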

After noise reduction, the fast Fourier transform (FFT) of the remaining epochs was computed, and the corresponding matrix of FFT epochs was denoted by X. Using a bootstrapping procedure (Zhu et al., 2013; Keshishzadeh et al., 2021), X was used to estimate the EFR spectrum (EFRSpec) and corresponding noisefloor, indicated by NF. An exemplar EFRSpec (black) and respective NF (gray) is shown in Fig. 3, and the peaks of EFRSpec at the fundamental modulation frequency and the following three harmonics are specified by Mf1, Mf2, Mf3, and Mf4. Note that these values refer to peak-to-baseline values. Due to the absence of electrical shielding in the measurement room, the 50-Hz power-line noise is also visible (the gray line in Fig. 3). To derive the EFR magnitude, we included the energy at the modulation frequency (110 Hz) and first three harmonics (220, 330, and 440 Hz) to avoid the interference of the power-line noise.
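To make the idea of a bootstrapped spectral estimate and noisefloor concrete, the sketch below shows one commonly described variant at a single frequency. It is not the procedure of the cited papers: the numbers of draws (200 and 1000), the sign inversion of every second resampled epoch to cancel the phase-locked response, the synthetic test signal, and all function names are assumptions made for illustration.

```python
import cmath
import math
import random

def dft_bin(x, freq, fs):
    """Single-frequency DFT of x, scaled so a sinusoid of amplitude A gives |X| close to A."""
    n = len(x)
    w = -2j * math.pi * freq / fs
    return 2.0 * sum(v * cmath.exp(w * k) for k, v in enumerate(x)) / n

def bootstrap_magnitude(coeffs, n_draws, flip_half, rng):
    """Mean magnitude of resampled epoch averages at one frequency.

    coeffs    : complex per-epoch spectral values at that frequency
    flip_half : if True, every second drawn epoch is sign-inverted so that the
                phase-locked response cancels and only the noise estimate remains
    """
    n = len(coeffs)
    total = 0.0
    for _ in range(n_draws):
        draw = [rng.choice(coeffs) for _ in range(n)]   # sample with replacement
        if flip_half:
            draw = [-c if i % 2 else c for i, c in enumerate(draw)]
        total += abs(sum(draw) / n)
    return total / n_draws

# Synthetic demo: a 110-Hz component of amplitude 0.1 buried in unit-variance noise,
# 40 epochs of 400 ms at 10 kHz (110 Hz falls exactly on a DFT bin).
rng = random.Random(1)
fs, n_samp, n_epochs = 10000, 4000, 40
coeffs = []
for _ in range(n_epochs):
    epoch = [0.1 * math.sin(2 * math.pi * 110 * k / fs) + rng.gauss(0.0, 1.0)
             for k in range(n_samp)]
    coeffs.append(dft_bin(epoch, 110, fs))

efr_f1 = bootstrap_magnitude(coeffs, 200, False, rng)   # response estimate at 110 Hz
nf_f1 = bootstrap_magnitude(coeffs, 1000, True, rng)    # noisefloor estimate at 110 Hz
print(efr_f1, nf_f1)
```

In this toy example the response estimate recovers roughly the embedded amplitude, while the sign-flipped estimate stays well below it; repeating the computation per frequency bin would yield spectra analogous to EFRSpec and NF.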

FIG. 3.

An example EFR spectrum (EFRSpec, black) and corresponding noisefloor (NF, gray) obtained after filtering, epoching, and bootstrapping. The spectrum corresponds to a RAM-4000 stimulus and belongs to an older subject. The peaks (peak-to-baseline) at fundamental frequency and the harmonics 2–4 are indicated by Mf1, Mf2, Mf3, and Mf4.

We defined five EFR metrics as part of our analysis. The first EFR marker was defined as the peak-to-noisefloor value at the fundamental frequency,
$$\mathrm{EFR}_{F1(\text{peak-to-noisefloor})} = M_{f_1} - \mathrm{NF}_{f_1}. \tag{1}$$
$\mathrm{NF}_{f_1}$ refers to the magnitude of NF at $f = f_1$. The EFR metric of Eq. (1) is routinely adopted in animal or research studies on EFRs (Parthasarathy and Kujawa, 2018). In the second method, we only considered the $M_{f_1}$ value, i.e., the peak-to-baseline value at the fundamental frequency ($\mathrm{EFR}_{F1(\text{peak-to-baseline})} = M_{f_1}$). In the third method, we calculated the EFR metric by adding the peak-to-noisefloor values at $f_1$ to $f_4$, akin to the method presented in Vasilkov (2021) and Keshishzadeh (2021),
$$\mathrm{EFR}_{F1+F2+F3+F4} = \sum_{n=1}^{4} \left( M_{f_n} - \mathrm{NF}_{f_n} \right). \tag{2}$$
We defined the EFR metric of the fourth and fifth methods by following the approach adopted in Mepani (2021). This method considers the sum of the spectral peak-to-noisefloor values at $f_1$ and $f_2$ or at $f_3$ and $f_4$ frequencies such that
$$\mathrm{EFR}_{F1+F2} = \sum_{n=1}^{2} \left( M_{f_n} - \mathrm{NF}_{f_n} \right), \tag{3}$$
$$\mathrm{EFR}_{F3+F4} = \sum_{n=3}^{4} \left( M_{f_n} - \mathrm{NF}_{f_n} \right). \tag{4}$$
In our analysis, we only included data-points where the spectral $M_{f_n}$ values were above the $\mathrm{NF}_{f_n}$, and this condition was met for all of the tested subjects. Among the abovementioned EFR metrics, the $\mathrm{EFR}_{F1+F2+F3+F4}$ includes the spectral energy at several harmonics and, hence, returns the largest values. At the same time, the $\mathrm{EFR}_{F1(\text{peak-to-baseline})}$ metric provides amplitudes comparable to the time-domain EFR peak-to-peak values and is, thus, best capable at capturing the energy present in the time-domain recording (Vasilkov et al., 2021).
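The five metrics defined above can be computed from the four spectral peaks and their noisefloor values in a few lines. The following Python sketch assumes that a peak-to-noisefloor value is the difference between the spectral peak and the noisefloor at the same frequency; the function name, the dictionary layout, and the example magnitudes are hypothetical and chosen only for illustration.

```python
def efr_metrics(peaks, noisefloors):
    """Compute the five EFR metrics from spectral peaks and noisefloor values.

    peaks, noisefloors : dicts keyed by harmonic number 1..4 holding the
                         peak-to-baseline magnitude and the noisefloor magnitude
                         at each frequency (same unit, e.g., microvolts)
    """
    harmonics = (1, 2, 3, 4)
    # peak-to-noisefloor value per harmonic (assumed to be a simple difference)
    p2n = {n: peaks[n] - noisefloors[n] for n in harmonics}
    return {
        "F1_peak_to_noisefloor": p2n[1],
        "F1_peak_to_baseline": peaks[1],
        "F1+F2+F3+F4": sum(p2n[n] for n in harmonics),
        "F1+F2": p2n[1] + p2n[2],
        "F3+F4": p2n[3] + p2n[4],
    }

# Hypothetical magnitudes in microvolts, for illustration only
example_peaks = {1: 0.10, 2: 0.06, 3: 0.04, 4: 0.03}
example_nf = {1: 0.02, 2: 0.015, 3: 0.01, 4: 0.01}
metrics = efr_metrics(example_peaks, example_nf)
for name, value in metrics.items():
    print(name, round(value, 3))
```

By construction, the four-harmonic metric equals the sum of the F1+F2 and F3+F4 metrics, which is why it returns the largest values whenever all peaks lie above the noisefloor.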

In the next step of our analysis, we considered the number of epochs as a variable. The recorded EEG signals should be of sufficient quality while the recording duration is kept as short as possible. To this end, we repeated the $\mathrm{EFR}_{F1(\text{peak-to-baseline})}$ analysis three times, with 1000, 800, and 600 epochs. This analysis used the RAM-4000 condition.

3. Electrode configuration

The young group 2 was tested with two different electrode configurations (Fig. 4) to study how the ground electrode position affected the SNR of the EFR markers. For the first configuration, Ambu Neuroline electrodes (Ballerup, Denmark) were placed on the high forehead (positive electrode), low forehead (ground electrode), and bilateral mastoid (negative electrodes). For the second configuration, electrodes were placed on the high forehead (Fpz; positive electrode), nose tip (ground electrode), and bilateral mastoid (negative electrodes). The Fpz-channel recording was re-referenced to the average of the mastoid electrodes. The electrode contact sites were first rubbed with an abrasive gel (NuPrep, Weaver and Co., Aurora, CO) to keep measurement impedances below 3 kΩ. In each case, participants were tested first with configuration 1 and then with configuration 2.

FIG. 4.

(Color online) The visualization of the two electrode configurations, showing test ear mastoid (−), electrode 1; contralateral mastoid (−), electrode 2; high forehead—Fpz (+), electrode 3; and low forehead or contralateral nose tip (ground), electrode 4. Figure from Shutterstock, “Anatomy online, decade3d,” black and white female head, side and front views; ID, 575197756.

Additionally, we performed the same recordings in one subject using another EEG system (Biosemi, Amsterdam, Netherlands) to further investigate the effect of the ground electrode position on the EFR markers. Because the SEPCAM software does not store the raw ground signal, we used the Biosemi system with external (flat-type active) electrodes placed on both mastoids and three external electrodes on the upper forehead, lower forehead, and nose tip (representative of Fpz, ground electrode 1, and ground electrode 2, respectively). The common-mode-sense (CMS) and driven-right-leg (DRL) pin-type active electrodes were placed on top of the head. A highly conductive gel (Signa gel, Parker Laboratories, Fairfield, NJ) was used to improve the contact between electrodes and skin.

Statistical analysis was performed using IBM SPSS Statistics 25 (SPSS Inc., Chicago, IL). Normality was assessed with the Shapiro-Wilk test and with descriptive tools such as histograms, Q-Q plots, and box-and-whisker plots. The hearing status as measured by pure-tone audiometry (including EHFs) was analyzed across the groups to check for significant differences. Three separate linear mixed effects analyses were used to evaluate the effect of the independent variables on the EFR magnitude (dependent variable): age [young (group 1) and old], number of epochs (i.e., 600, 800, or 1000), the different analysis methods (i.e., $\mathrm{EFR}_{F1(\text{peak-to-noisefloor})}$, $\mathrm{EFR}_{F1(\text{peak-to-baseline})}$, $\mathrm{EFR}_{F1+F2+F3+F4}$, $\mathrm{EFR}_{F1+F2}$, and $\mathrm{EFR}_{F3+F4}$), and the possible carrier frequencies (i.e., 2000, 4000, or 6000 Hz). Additionally, we looked at the interaction between the two independent factors, as well as the estimated marginal means and the pairwise comparisons. In addition, independent t-tests were used to check for a statistically significant relationship between the age group and EFR magnitude for the different recording and analysis parameters separately. Furthermore, two different electrode configurations were compared among the subjects of young group 2. This dataset included significant outliers; hence, the nonparametric Wilcoxon signed-rank test was used to determine whether there was a median difference between the EFR magnitude for electrode configuration 1 and electrode configuration 2. The $\mathrm{EFR}_{F1+F2+F3+F4}$ method with a 4 kHz carrier frequency and 800 epochs was used for this comparison.
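For readers unfamiliar with the paired nonparametric comparison mentioned above, the following Python sketch implements an exact two-sided Wilcoxon signed-rank test with the standard library only. It is an independent illustration, not the SPSS computation used in the study (statistical packages may instead report an asymptotic p-value); the function name and the example data are hypothetical.

```python
def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Zero differences are dropped; tied |differences| receive average ranks.
    Returns (W_plus, p_value), where W_plus is the sum of positive ranks.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks2 = [0] * n   # twice the rank, so half-integer tied ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            # sorted positions i..j share the average of ranks i+1..j+1
            ranks2[order[k]] = (i + 1) + (j + 1)
        i = j + 1

    w2_plus = sum(r for r, v in zip(ranks2, d) if v > 0)
    total2 = sum(ranks2)

    # Null distribution of 2*W_plus: count sign assignments reaching each sum
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]

    # Two-sided p-value: outcomes at least as far from the null mean as observed
    dev = abs(2 * w2_plus - total2)
    extreme = sum(c for s, c in enumerate(counts) if abs(2 * s - total2) >= dev)
    return w2_plus / 2, extreme / 2 ** n

# Hypothetical paired magnitudes for two electrode configurations
config1 = [2.0, 4.0, 6.0, 8.0, 10.0]
config2 = [1.0, 2.0, 3.0, 4.0, 5.0]
w_plus, p_value = wilcoxon_signed_rank(config1, config2)
print(w_plus, p_value)
```

Because the test ranks the paired differences rather than using their raw values, a single extreme outlier influences the statistic no more than any other largest difference, which is the reason for preferring it in a dataset with outliers.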

Figure 5 shows EFR markers recorded with 600, 800, or 1000 epochs, and the corresponding noisefloor magnitudes are indicated with small boxplots. We expect the noisefloor to decrease in proportion to $1/\sqrt{N}$, where $N$ is the number of averages, and this is reflected by a small reduction in mean noisefloor values when going from 600 to 1000 epochs. However, given the spread of individual noisefloor values, t-tests between the different conditions yielded no statistically significant differences in the noisefloor values across the different epoch conditions for each age group or between the young and old groups. This suggests that the recordings reached a stable noisefloor level even in the condition with 600 epochs. First, we performed a linear mixed effects analysis of the relationship between the EFR magnitude (dependent variable) and number of epochs (independent variable, i.e., 600, 800, or 1000 epochs). As fixed effects, we entered the factors “number of epochs” and “age” [young (group 1); old, independent variable] and the analysis indicated that age [F(1,24) = 5.9, p = 0.023] had a significant effect on the EFR magnitude, but number of epochs did not [F(2,48) = 0.04, p = 0.962]. Afterward, we analyzed the interaction between the two independent variables on the EFR magnitude. The interaction term had no significant effect [F(2,48) = 0.4, p = 0.681]. Furthermore, we calculated the estimated marginal means for all of the conditions. First, the test indicated that the young subjects presented the largest mean EFR magnitude for the three different epoch numbers. Second, when using 1000 epochs, the largest median EFR magnitude was measured for the young and old subjects. Third, registering the EFR magnitude with 800 epochs resulted in the second largest mean value among the young subjects. 
To investigate which of the conditions was best able to separate the younger from the older cohort, we performed a t-test per condition, showing the most significant difference between the age groups for the 600 and 1000 epoch conditions (p = 0.02 in both cases), followed by a p of 0.03 for the 800-epoch condition. We conclude that all of the epoch configurations were sufficient to separate the younger from the older group based on their EFR magnitude.
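As a worked illustration of the expected noisefloor scaling with the number of averages (assuming uncorrelated noise across epochs), the predicted change when going from 600 to 1000 recorded epochs is

$$\frac{\mathrm{NF}_{1000}}{\mathrm{NF}_{600}} = \sqrt{\frac{600}{1000}} \approx 0.77, \qquad 10\log_{10}\!\left(\frac{600}{1000}\right) \approx -2.2\ \mathrm{dB}.$$

The same ratio holds for the retained epochs after rejection (480 vs 800). A predicted reduction of only about 2 dB is small relative to the between-subject spread of the noisefloor values, which is consistent with the absence of significant noisefloor differences between the epoch conditions.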

FIG. 5.

(Color online) Variability of the EFR magnitude (RAM-4000) registered with a different number of alternating polarity epochs (i.e., 600, 800, or 1000). Individual data for the young group 1 (darker squares) and old group (lighter circles) subjects are shown. The numbers indicate the participant number. EFR magnitudes were calculated using the $\mathrm{EFR}_{F1+F2+F3+F4}$ method, and median EFR magnitudes are indicated by the orange horizontal lines. The smaller, colored boxplots show the corresponding EFR noisefloor magnitude at the modulation frequency and following three harmonics (220, 330, and 440 Hz).

Figure 6 depicts how the EFR magnitude changes depending on the used analysis method and shows that the $\mathrm{EFR}_{F1+F2+F3+F4}$ method yielded the largest overall responses. A linear mixed effects analysis of the relationship between EFR magnitude (dependent variable) and five different analysis methods (independent variable) was performed. As fixed effects, we entered the analysis “methods” and age [young (group 1), old, independent variable]. Table II shows that the chosen “analysis method” [F(4,96) = 55.9, p < 0.001] had a significant effect on the dependent variable EFR magnitude but age [F(1,24) = 3.8, p = 0.064] did not. The interaction term between the two independent variables on the EFR magnitude had no significant effect [F(4,96) = 1.9, p = 0.116]. Table II shows significant mean differences (p < 0.05) in EFR magnitude when considering the pairwise comparison of the analysis method. Furthermore, we studied the estimated marginal means for all of the conditions. First, the test indicated that the younger subjects generated the largest mean EFR magnitude irrespective of the analysis method. Second, for the young as well as the older group, the $\mathrm{EFR}_{F1+F2+F3+F4}$ method provided the largest mean EFR magnitude. The second, third, and fourth largest mean EFR magnitudes were found for the methods $\mathrm{EFR}_{F1+F2}$, $\mathrm{EFR}_{F1(\text{peak-to-baseline})}$, and $\mathrm{EFR}_{F1(\text{peak-to-noisefloor})}$, respectively. The analysis method, $\mathrm{EFR}_{F3+F4}$, gave the lowest mean EFR magnitudes for the younger and older subjects. To investigate which of the analysis methods was best able to separate the younger and older groups, we performed a t-test for each condition and found that only the $\mathrm{EFR}_{F1+F2+F3+F4}$ and $\mathrm{EFR}_{F3+F4}$ methods yielded significant group differences (respectively, p = 0.035 and p = 0.006). 
Within the older group, there were several subjects for which the $\mathrm{EFR}_{F3+F4}$ marker was set to the individual noisefloor, hence, we suggest the use of the $\mathrm{EFR}_{F1+F2+F3+F4}$ method for classifying CS on the basis of the EFR magnitude. Last, there were no significant differences in the size of the EFR marker when correcting for the noisefloor or not. A t-test between $\mathrm{EFR}_{F1(\text{peak-to-baseline})}$ and $\mathrm{EFR}_{F1(\text{peak-to-noisefloor})}$ magnitudes did not result in significant differences. However, we do suggest using noisefloor-corrected methods going forward because they are relative and, thus, easier to compare across subjects.

FIG. 6.

(Color online) Variability of the EFR magnitude across different analysis methods derived from the same EFR recordings in younger and older participants. The median EFR magnitudes are indicated using horizontal lines in the boxplots.

TABLE II.

The analysis method: Pairwise comparison.

EFR_{F3+F4}  EFR_{F1+F2}  p < 0.001
EFR_{F1+F2+F3+F4}
EFR_{F1(peak-to-noisefloor)}
EFR_{F1(peak-to-baseline)}
EFR_{F1(peak-to-noisefloor)}
EFR_{F1+F2+F3+F4}  EFR_{F1(peak-to-noisefloor)}  p < 0.001

Figure 7 shows how the carrier frequency affects the EFR magnitude. A linear mixed effects analysis was used to evaluate the relationship between EFR magnitude (dependent variable) and the three carrier frequencies (i.e., 2, 4, and 6 kHz; independent variable). Age [young (group 1), old] and carrier frequency were the fixed effects. The analysis showed that neither age [F(1,24) = 3.1, p = 0.092] nor “carrier frequency” [F(2,48) = 0.9, p = 0.431] had a significant effect on the EFR magnitude, and the interaction between the two independent variables was not significant either [F(2,48) = 0.9, p = 0.427]. Furthermore, the estimated marginal means analysis indicated that, among the younger subjects, the carrier frequencies of 4 and 6 kHz generated the largest mean EFR magnitudes, with the overall largest mean found for the 6-kHz carrier. This result is likely due to the outlier in the young group 1 cohort: when we removed this outlier from the statistical analysis, the largest mean EFR magnitude for the younger group was recorded with the 4-kHz carrier frequency. To investigate which of the carrier frequencies best separated the younger group from the older group, we performed a t-test for each condition and found that only the RAM-4000 yielded a significant group difference (p = 0.023).

FIG. 7.

(Color online) Individual EFR magnitudes derived from recordings with different carrier frequencies (800 epochs), i.e., 2 kHz; 4 kHz; 6 kHz. Younger subjects and older subjects are specified by squares and circles, respectively. On the y axis, one outlier value was compressed. The exact value was indicated next to the squares.

The adopted stimulus was the 4-kHz RAM, using 800 epochs analyzed with the EFR_{F1+F2+F3+F4} method. Within young group 2, ten participants showed larger EFR magnitudes in configuration 1 and five participants showed larger EFR magnitudes in configuration 2 (Fig. 8). A Wilcoxon signed-rank test showed that the median EFR magnitude was not significantly larger for electrode configuration 1 than for electrode configuration 2 (p = 0.211). Configuration 1 showed, overall, larger EFR magnitudes with a wider spread, and this spread is unlikely to be explained by SNHL differences given that the same subjects were used for both configurations. Because our recordings were performed consecutively, it is possible that test-retest differences influenced these results as well. As we will elaborate on in Sec. IV, the test-retest reliability of RAM-EFRs is expected to be on the order of 8%–10% of the EFR magnitude.

FIG. 8.

(Color online) The EFR magnitudes (RAM-4000) calculated using the EFR_{F1+F2+F3+F4} equation and 800 epochs. The EFRs were recorded with two different electrode configurations. The median EFR magnitudes across participants are shown by horizontal lines in the boxplots.

To further investigate whether the observed variability in EFR markers was reflecting a general test-retest variability or due to the different position of the ground electrode (forehead or nose), we performed an additional analysis. Theoretically, the position of the ground electrode should not bias the recordings given the formula for raw EEG signals:
$$(\mathrm{Fpz} - \mathrm{Ground}) - (\mathrm{Mastoid} - \mathrm{Ground}) = \mathrm{Fpz} - \mathrm{Mastoid}. \tag{5}$$
To verify this, we calculated the time-domain EFR for the two electrode configurations (i.e., ground electrodes 1 and 2) using the Biosemi EEG system (Fig. 9). The results in Fig. 9(a) support our hypothesis in Eq. (5) because the EFRs for both ground electrode configurations overlapped. Raw EFRs recorded from each channel are shown in Figs. 9(b) and 9(c). Even though different EEG amplifiers performed their referencing differently, i.e., Biosemi referenced to the CMS electrode (to suppress the common mode) and Intelligent Hearing System referenced to a ground electrode, it is unlikely that the difference in ground electrode position was a major factor in the observed differences in Fig. 8.
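
The cancellation of the ground term in Eq. (5) can be checked numerically. The sketch below uses random surrogate signals rather than EEG data; the two "ground" signals are hypothetical stand-ins for the two ground positions, and the function name is an assumption.

```python
import random


def differential_channel(fpz, mastoid, ground):
    """Return (Fpz - Ground) - (Mastoid - Ground), sample by sample."""
    return [(f - g) - (m - g) for f, m, g in zip(fpz, mastoid, ground)]


random.seed(0)
n = 1000
fpz = [random.gauss(0, 1) for _ in range(n)]
mastoid = [random.gauss(0, 1) for _ in range(n)]
# Two very different, hypothetical ground signals.
ground_a = [random.gauss(0, 1) for _ in range(n)]
ground_b = [5.0 + random.gauss(0, 3) for _ in range(n)]

direct = [f - m for f, m in zip(fpz, mastoid)]
via_a = differential_channel(fpz, mastoid, ground_a)
via_b = differential_channel(fpz, mastoid, ground_b)

# Largest deviation of either ground-referenced result from Fpz - Mastoid.
max_dev = max(max(abs(a - d), abs(b - d))
              for a, b, d in zip(via_a, via_b, direct))
```

Up to floating-point rounding, both ground choices give the same differential signal, which is the idealized expectation; real amplifiers can deviate from it, for example through finite common-mode rejection.
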
FIG. 9.

(Color online) EFR raw recordings for one young subject on the Biosemi amplifier, showing (a) EFRs derived from the two electrode configurations overlap, Fpz, nose; Fpz, lower forehead; (b) raw EEG waveforms for each individual channel, Fpz, lower forehead, nose; (c) raw EFRs from the mastoid; and (d) raw EFRs from the mastoid and the mastoid rereferenced to the nose electrode.

We used a computational model of the human auditory periphery (Verhulst et al., 2018; Osses and Verhulst, 2019) to simulate the experimental RAM-EFR conditions and study how different aspects of SNHL (OHC damage and CS) affected the EFR generators. The adopted model is biophysically inspired and simulates the auditory processing at different stages along the ascending auditory pathway, which enabled us to simulate the OHC-loss and CS aspects of SNHL. First, we quantified how OHC damage and CS contribute to the EFR generation for RAM stimuli with different carrier frequencies. Second, we studied how different EFR analysis methods play a role in the differential diagnosis among subjects with coexisting OHC-loss and CS. To simulate OHC-loss, we applied cochlear-gain-loss (CGL) to the cochlear model admittance, which resulted in wider cochlear filters with reduced gain (Verhulst et al., 2016; Verhulst et al., 2018). Throughout the text, the simulated CGL condition is called Flat35 and refers to a simulated flat audiogram with thresholds of 35 dB HL across frequency. Because the considered stimuli all had carrier frequencies at or above 2 kHz, simulating a flat hearing loss is not expected to yield a strongly different outcome from simulating a hearing loss profile with normal low-frequency hearing thresholds. Out of the possible CGL profiles that can be simulated with the model (see Keshishzadeh et al., 2021), we selected a single CGL profile that introduced the most severe CGL that can be simulated in the model and would, therefore, have the strongest confounding effect of OHC damage on the simulated EFRs. The NH model (without CS) had 19 ANFs per simulated CF channel: 13 high spontaneous-rate (HSR), 3 medium spontaneous-rate (MSR), and 3 low spontaneous-rate (LSR) ANFs. We applied CS to the NH model by decreasing the number of ANFs. In addition to the NH profile (i.e., 13HSR-3MSR-3LSR), we simulated four CS profiles: 13HSR-3MSR-0LSR, 13HSR-0MSR-0LSR, 7HSR-0MSR-0LSR, and 4HSR-0MSR-0LSR.
Last, the EFRs of the different CGL and CS profiles were simulated by calculating the spectrum of the summed W-I, W-III, and W-V waveforms as depicted in Fig. 10(a). In our analysis, we considered the peaks of the spectrum at the modulation frequency of the RAM stimulus (110 Hz) and the following three harmonics.
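
As a worked illustration of the CS profiles listed above, the short Python sketch below converts each profile into the percentage of ANFs removed per CF channel relative to the 19-fiber NH profile. The profile labels follow this paragraph; the variable and function names are assumptions made for the example.

```python
# ANF counts per simulated CF channel, as listed in the text.
NH = {"HSR": 13, "MSR": 3, "LSR": 3}
CS_PROFILES = {
    "13HSR-3MSR-0LSR": {"HSR": 13, "MSR": 3, "LSR": 0},
    "13HSR-0MSR-0LSR": {"HSR": 13, "MSR": 0, "LSR": 0},
    "7HSR-0MSR-0LSR": {"HSR": 7, "MSR": 0, "LSR": 0},
    "4HSR-0MSR-0LSR": {"HSR": 4, "MSR": 0, "LSR": 0},
}


def anf_loss_percent(profile, reference=NH):
    """Percentage of ANFs removed relative to the reference (NH) profile."""
    remaining = sum(profile.values()) / sum(reference.values())
    return 100.0 * (1.0 - remaining)


loss = {name: round(anf_loss_percent(p), 2) for name, p in CS_PROFILES.items()}
```

Removing only the LSR fibers takes away 3 of 19 fibers per channel, whereas the most severe profile listed here leaves 4 of 19.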

FIG. 10.

(Color online) The modelling approach. (a) A block-diagram of the computational model of the auditory periphery is shown. Reproduced with permission from Verhulst et al. (2018). 360, 55–75. Copyright 2018 Authors, licensed under a Creative Commons Attribution 4.0 International license. (b) The excitation pattern of the BM velocity for NH (darker) and Flat35 (lighter) profiles is depicted. [(c)–(e)] The excitation patterns at the level of the ANF in response to RAM stimuli with different carriers are shown. [(f)–(h)] The AM-responses of the ANFs to RAM stimuli with different carriers are depicted. The darker colors (i.e., black, red, and blue) refer to the NH profile simulations. The lighter solid and dashed lines (i.e., black, red, and blue) represent the Flat35 and CS profile simulations, respectively. [(i)–(l)] Each panel shows the simulated RAM-EFRs for the different carrier frequencies (2000, 4000, and 6000 Hz), calculated with the different methods [i.e., (i) peak of the fundamental modulation frequency, (j) sum of the harmonics, (k) sum of the fundamental and second harmonic, and (l) sum of the third and fourth harmonics].

We simulated EFRs to the same RAM stimuli that were adopted in the experiment (carrier, 2 kHz, 4 kHz, or 6 kHz; modulation frequency, 110 Hz; modulation depth, 100%; level, 68.24 dB SPL). In the first step, we simulated the basilar membrane (BM) excitation patterns in response to the RAM stimuli of different carriers. The excitation pattern was defined as the root mean square (RMS) of the simulated BM velocity at each simulated CF channel between frequencies of 113 and 12 000 Hz. Figure 10(b) illustrates the NH (darker solid lines) and hearing loss (lighter solid lines, Flat35) BM excitation patterns of each condition. The excitation pattern peak occurred at a certain frequency close to the carrier frequency of the respective RAM stimulus. Considering the BM excitation patterns of NH profiles (darker solid lines), the RAM-2000NH peak was, respectively, 1.12 and 2.28 dB higher than the peaks of RAM-4000NH and RAM-6000NH conditions. This difference was even greater when we considered the simulated RAM conditions for the Flat35 profile (lighter solid lines); the magnitude of RAM-4000Flat35 and RAM-6000Flat35 were, respectively, 2.39 and 4.71 dB lower than the peak of RAM-2000Flat35 condition. The smaller on-CF peaks of the RAM-4000 and -6000 conditions can be explained by the reduced energy delivered to higher CF channels of the cochlea due to the adopted middle-ear filter. As expected, the peaks of the BM excitation pattern degraded after introducing CGL to the model. Among the three carrier frequencies, the RAM-2000Flat35 showed the smallest peak reduction compared to its normal gain condition (RAM-2000NH), i.e., 3.92 dB. With the same applied CGL, the peaks of RAM-4000 and -6000 conditions reduced by 5.18 and 6.34 dB, respectively, compared with normal gain. Aside from the on-CF contribution, off-CF channels contributed to the response as well. 
The origin of excitation at off-CF channels can be (1) the longitudinal filter coupling and respective gain propagation along the cochlea and (2) the off-CF sidebands associated with the rectangular modulator of the stimulus (see the spectrum of the RAM stimulus in Fig. 2). Hence, although introducing CGL to the model could have increased the off-CF contribution due to wider cochlear filters, the reduced filter gain might also have weakened the energy of stimulus sidebands, resulting in a net decrease in off-CF contribution to the EFRs in the simulated Flat35 profile.
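
The excitation-pattern definition used in this step (RMS of the response in each CF channel, compared across profiles in dB) can be sketched as follows. This is not the model code: the channel signals are synthetic sinusoids, the dB reference of 1.0 is arbitrary, and all names are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def excitation_pattern_db(velocity_by_cf, ref=1.0):
    """Map each CF (Hz) to the RMS level of its channel signal in dB re `ref`."""
    return {cf: 20.0 * math.log10(rms(v) / ref)
            for cf, v in velocity_by_cf.items()}


def pattern_peak(pattern):
    """Return (CF, level) of the largest entry in an excitation pattern."""
    cf = max(pattern, key=pattern.get)
    return cf, pattern[cf]


def tone(amplitude, cycles=10, n=1000):
    """Synthetic channel signal: a sinusoid with an integer number of cycles."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


# Hypothetical three-channel "cochlea": normal gain vs. reduced gain.
normal = {1000: tone(0.2), 2000: tone(1.0), 4000: tone(0.3)}
reduced = {1000: tone(0.1), 2000: tone(0.5), 4000: tone(0.1)}

normal_cf, normal_peak = pattern_peak(excitation_pattern_db(normal))
reduced_cf, reduced_peak = pattern_peak(excitation_pattern_db(reduced))
peak_reduction_db = normal_peak - reduced_peak
```

In this toy example, halving the on-CF amplitude lowers the pattern peak by about 6 dB, which is the same kind of peak comparison as the dB differences reported for the NH and Flat35 profiles.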

In the second step, we simulated the ANF excitation patterns to the input from the BM level of the model by calculating the RMS of the ANF firing rate at each CF. For each RAM stimulus condition, the ANF excitation pattern was simulated for NH, Flat35, and the 7HSR-0MSR-0LSR CS profiles [Figs. 10(c)–10(e)]. Considering the simulated NH profiles, the off-CF contributions observed at the BM [Fig. 10(b)] are reflected also at the ANF processing level. After applying CGL (lighter solid lines), off-CF contributions reduced at the level of ANF processing because of the reduced ANF input from off-CF cochlear channels. Hence, consistent with the BM level simulations and compared to the NH profile, the Flat35 profile yielded a more frequency-specific excitation of the ANFs. Considering the ANF excitation at on-CF channels, introducing CGL increased the amplitude of the RAM-2000NH condition by 0.56 dB but reduced the amplitude of RAM-4000NH and RAM-6000NH ANF excitation by 0.78 and 2.39 dB, respectively. On the other hand, applying CS (lighter dashed lines) decreased the on-CF amplitude of ANF excitation for all of the conditions. The CS-induced amplitude reductions of ANF excitation for RAM-2000NH, RAM-4000NH, and RAM-6000NH equaled 7.38, 7.26, and 7.14 dB, respectively. However, contributions of off-CF channels to the response were not negligible for either NH or CS simulated profiles, which may confound a purely frequency-specific interpretation of the respective EFR simulation magnitudes.

In the third step, we calculated the amplitude modulation (AM) response of the ANFs to evaluate the AM-coding of the ANFs in response to RAM stimuli with different carrier frequencies. The AM-response at each CF channel was defined as the spectral magnitude of the ANF firing rate at the modulation frequency, i.e., 110 Hz. Figures 10(f)–10(h) show the AM-responses to each RAM stimulus for the NH, Flat35, and CS profiles. Considering the NH profiles, the RAM-6000NH showed the highest on-CF AM-response (68.04 dB), followed by RAM-4000NH (67.82 dB) and RAM-2000NH (64.84 dB). Applying CGL did not cause a considerable difference in the off-CF AM-response of the RAM-2000 condition, whereas it considerably reduced the off-CF channel contributions to the RAM-4000 and -6000 conditions due to the weaker off-CF channel excitation in Figs. 10(d) and 10(e). Looking at the on-CF responses, CGL increased the AM-response of the RAM-2000 condition by 2.09 dB (≈3% referenced to the NH AM-response), while it decreased the AM-response of the RAM-4000 and RAM-6000 conditions by 1.17 (≈1.72%) and 3.00 dB (≈4.41%), respectively. Introducing CS decreased the on- and off-CF AM-responses of all of the conditions. Specifically, CS caused the on-CF AM-response of RAM-2000, -4000, and -6000 to decrease by 7.60, 7.27, and 7.13 dB, respectively (i.e., 11.72%, 10.73%, and 10.48% magnitude reduction, respectively, in comparison to the NH profile). The trend toward lower RAM-EFR magnitudes observed in the old group compared with the young group (see Fig. 7) could, thus, be explained by a reduced number of ANFs with age as seen in Wu et al. (2019).
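
The percentages quoted in this paragraph appear to be the dB change divided by the NH on-CF AM-response in dB. The sketch below recomputes them from the rounded values in the text under that assumption; small deviations in the last digit are expected because the inputs are themselves rounded. The names are hypothetical.

```python
def relative_change_percent(delta_db, nh_db):
    """dB change expressed as a percentage of the NH response level in dB.

    ASSUMPTION: this ratio is how the percentages in the text were derived.
    """
    return 100.0 * delta_db / nh_db


# On-CF AM-responses (dB) of the NH profiles and CS-induced reductions (dB),
# copied from the text.
NH_ON_CF_DB = {"RAM-2000": 64.84, "RAM-4000": 67.82, "RAM-6000": 68.04}
CS_REDUCTION_DB = {"RAM-2000": 7.60, "RAM-4000": 7.27, "RAM-6000": 7.13}

cs_percent = {
    cond: round(relative_change_percent(CS_REDUCTION_DB[cond], NH_ON_CF_DB[cond]), 2)
    for cond in NH_ON_CF_DB
}
```

Under this reading, the recomputed CS reductions agree with the reported 11.72%, 10.73%, and 10.48% to within rounding.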

In the fourth step, we simulated EFRs to RAM stimuli for the NH, CS, and CGL profiles and considered four different EFR-metric calculation approaches [Figs. 10(i)–10(l)], identical to the second (F1; peak-to-baseline), third (F1 + F2 + F3 + F4), fourth (F1 + F2), and fifth (F3 + F4) methods implemented for the measured EFRs [Fig. 6, Eqs. (1)–(5)]. Note that the adopted model was deterministic and, hence, estimation of the noisefloor was not possible for the simulated EFRs. Therefore, we did not include the EFR_{F1(peak-to-noisefloor)} metric in the model simulations and reported simulated EFR metrics based on the spectral peak-to-baseline values (Table III). In the simulated CS profiles, we explicitly considered the effect of each ANF type on the EFR magnitudes in the different conditions. The simulated type of ANF loss (LSR, LSR + MSR, or LSR + MSR + HSR) is specified in Table III. The RAM-EFR magnitudes of all of the conditions decreased due to CS, and removing all of the LSR and/or MSR ANFs (CS profiles 13HSR-3MSR-0LSR and/or 13HSR-0MSR-0LSR) had the highest impact on the RAM-EFR2000 magnitude and the lowest impact on the RAM-EFR6000 magnitude. Among the different RAM stimulus carriers, RAM-EFR2000 showed the strongest vulnerability to CS. While CGL increased the magnitude of RAM-EFR2000 (the simulated Flat35 profile), it reduced the RAM-EFR4000 and RAM-EFR6000 magnitudes. According to the RAM-EFR2000 simulations in Figs. 10(i)–10(l) (gray circles), high degrees of LSR + MSR CS in the presence of CGL yielded a net increase in the magnitude of RAM-EFR2000 compared to the no-CGL condition, whereas we observed the opposite effect on the RAM-EFR4000 and RAM-EFR6000 magnitudes (lighter circles). Considering the sole effect of CGL, no considerable difference was observed when comparing the RAM-EFRs of the different analysis methods, except for the EFR_{F3+F4} (sum of the third and fourth harmonics), which showed the smallest effect of CGL on the RAM-EFR4000 condition.
Although a lower sensitivity of the metric to CGL is desired when quantifying the CS aspect of SNHL, the overall dynamic range of the EFR_{F3+F4} across the CS loss simulations was small compared to the other metrics, which makes it, overall, less suited to capture individual CS differences.

TABLE III.

The amount of simulated EFR magnitude variation for different analyses and simulated CGL and CS profiles. The magnitude reductions were referenced to the corresponding NH profile.

Type of loss | Method | RAM-EFR2000 variation referenced to NH (%) | RAM-EFR4000 variation referenced to NH (%) | RAM-EFR6000 variation referenced to NH (%)
LSR-loss (13HSR-3MSR-0LSR) | EFR_{F1(peak-to-baseline)} | −6.73 | −5.17 | −3.85
LSR-loss (13HSR-3MSR-0LSR) | EFR_{F1+F2+F3+F4} | −6.90 | −5.33 | −4.07
LSR-loss (13HSR-3MSR-0LSR) | EFR_{F1+F2} | −6.89 | −5.31 | −4.058
LSR-loss (13HSR-3MSR-0LSR) | EFR_{F3+F4} | −6.99 | −6.10 | −4.67
LSR + MSR-loss (13HSR-0MSR-0LSR) | EFR_{F1(peak-to-baseline)} | −17.77 | −16.88 | −14.81
LSR + MSR-loss (13HSR-0MSR-0LSR) | EFR_{F1+F2+F3+F4} | −17.99 | −17.09 | −15.17
LSR + MSR-loss (13HSR-0MSR-0LSR) | EFR_{F1+F2} | −18.00 | −17.08 | −15.18
LSR + MSR-loss (13HSR-0MSR-0LSR) | EFR_{F3+F4} | −17.55 | −17.31 | −14.88
LSR + MSR + HSR (7HSR-0MSR-0LSR) | EFR_{F1(peak-to-baseline)} | −55.72 | −50.07 | −54.12
LSR + MSR + HSR (7HSR-0MSR-0LSR) | EFR_{F1+F2+F3+F4} | −55.84 | −50.02 | −54.32
LSR + MSR + HSR (7HSR-0MSR-0LSR) | EFR_{F1+F2} | −55.84 | −50.04 | −54.32
LSR + MSR + HSR (7HSR-0MSR-0LSR) | EFR_{F3+F4} | −55.60 | −49.37 | −54.16
LSR + MSR + HSR (3HSR-0MSR-0LSR) | EFR_{F1(peak-to-baseline)} | −81.02 | −80.82 | −80.34
LSR + MSR + HSR (3HSR-0MSR-0LSR) | EFR_{F1+F2+F3+F4} | −81.08 | −80.87 | −80.42
LSR + MSR + HSR (3HSR-0MSR-0LSR) | EFR_{F1+F2} | −81.08 | −80.87 | −80.43
LSR + MSR + HSR (3HSR-0MSR-0LSR) | EFR_{F3+F4} | −80.97 | −80.92 | −80.36
Flat35 | EFR_{F1(peak-to-baseline)} | +16.38 | −16.80 | −39.56
Flat35 | EFR_{F1+F2+F3+F4} | +20.22 | −13.35 | −35.92
Flat35 | EFR_{F1+F2} | +19.92 | −13.45 | −35.76
Flat35 | EFR_{F3+F4} | +33.52 | −9.68 | −41.09

Among the remaining methods, the EFR_{F1+F2+F3+F4} (sum of the fundamental frequency and the following three harmonics) showed the lowest sensitivity to the applied CGL, and this sensitivity was smallest for the RAM-EFR4000 condition. Hence, the RAM-EFR4000 condition, analyzed by summing the spectral energy at the fundamental frequency of the modulator and its following three harmonics, may provide a better diagnostic metric of CS than the other carrier frequencies as it was least affected by CGL. Whereas the most severe simulated CGL reduced this EFR marker by up to 15% of its original magnitude, the most severe ANF damage pattern reduced the response by up to 85% of its original magnitude. At the same time, the RAM-EFR4000 marker is more sensitive to the loss of HSR fibers than to the loss of LSR and MSR fibers: whereas removing the latter two fiber types reduced the response by about 15%, the additional simulated loss of HSR fibers reduced the response by up to 85%. The stronger sensitivity to HSR fiber damage is due to the use of a 100% modulated stimulus that sweeps the full dynamic range of the available ANFs. Even though we did not explicitly simulate the loss of IHCs in this study, simulating a complete ANF loss is equivalent to a full IHC deafferentation and yields the same simulation outcomes as removing the IHCs. Following the trends in Fig. 10 for different degrees of CS, a full, frequency-nonspecific deafferentation (or IHC loss) is expected to abolish the EFR.
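
The method comparison above can be retraced directly from the Flat35 rows of Table III. The sketch below copies those percentages into a dictionary and finds the metric with the smallest absolute CGL effect; the data are from the table, whereas the dictionary layout and the helper function are assumptions made for illustration.

```python
# Flat35 rows of Table III: simulated EFR variation referenced to NH (%),
# keyed by analysis method and carrier frequency (Hz).
FLAT35 = {
    "F1(peak-to-baseline)": {2000: 16.38, 4000: -16.80, 6000: -39.56},
    "F1+F2+F3+F4": {2000: 20.22, 4000: -13.35, 6000: -35.92},
    "F1+F2": {2000: 19.92, 4000: -13.45, 6000: -35.76},
    "F3+F4": {2000: 33.52, 4000: -9.68, 6000: -41.09},
}


def least_affected_method(table, carrier, exclude=()):
    """Method with the smallest absolute CGL-induced variation at `carrier`."""
    candidates = {m: abs(v[carrier]) for m, v in table.items() if m not in exclude}
    return min(candidates, key=candidates.get)


overall = least_affected_method(FLAT35, 4000)
remaining = least_affected_method(FLAT35, 4000, exclude=("F3+F4",))
# Carrier at which the four-harmonic sum is least affected by CGL.
best_carrier = min(FLAT35["F1+F2+F3+F4"],
                   key=lambda cf: abs(FLAT35["F1+F2+F3+F4"][cf]))
```

The lookup returns the F3+F4 metric overall and the four-harmonic sum among the remaining metrics, with 4000 Hz as its least CGL-affected carrier, in line with the reasoning in the text.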

Last, we simulated EFRs to the widely adopted SAM tone with the same characteristics as the adopted RAM stimuli but using a 110-Hz sinusoidal modulator. The results are displayed with square markers and dashed lines in Figs. 10(i)–10(l). For all of the implemented processing methods, the magnitude of the SAM-EFR to the different carriers increased due to CGL and decreased after introducing CS. After introducing CS in the presence of CGL, the SAM-EFRs of all of the carrier frequencies, akin to RAM-EFR2000, showed a net increased magnitude with respect to the NH profile. In terms of EFR magnitude, the RAM-EFR magnitudes of all of the conditions were larger than the SAM-EFR magnitudes, which is consistent with the experimental observations in Vasilkov et al. (2021). In this regard, the overall larger magnitude of the RAM-EFR widens its diagnostic application range for listeners with SNHL and may provide a tool that captures individual differences in supra-threshold envelope coding caused by CS.

In this study, we conducted several experiments and model simulations to explore the robustness of the EFR in a clinical hearing diagnostic setting. We focused, particularly, on the RAM stimulus as a promising marker of CS based on prior simulations with a computational model of (impaired) peripheral auditory processing (Verhulst et al., 2018; Vasilkov et al., 2021).

We are aware that most of our participants were women, and this may have influenced the EFR magnitude. However, a previous human temporal bone study compared the age-related loss of IHCs, OHCs, and ANFs between males and females and found no significant effect of sex (Wu et al., 2019). We maintained the same ratio of men across our groups, which ensures that any sex differences within the groups would not have impacted the group differences. Other possible confounders relate to the head circumference or OHC damage. Figure 11 shows how the EFR_{F1+F2+F3+F4} marker relates to the head circumference [Fig. 11(a)] and to the EHF thresholds [Fig. 11(b)]. An earlier study of our research group did not find a significant effect of head circumference on the derived-band EFR magnitude (Keshishzadeh et al., 2020), and the current study confirms this, i.e., Fig. 11(a) shows that a smaller head circumference does not lead to a larger EFR magnitude (r = 0.06; p = 0.76). The individual EHF thresholds in Fig. 11(b) were calculated as the average pure-tone threshold across the 10, 12.5, 14, and 16 kHz frequencies and showed no significant correlation with the EFR magnitude (r = −0.35; p = 0.07).

FIG. 11.

(Color online) Individual EFR magnitudes derived from recordings with a 4 kHz carrier frequency (800 epochs) against the individual head circumference (a) (r = 0.06; p = 0.76). (b) Individual EFR magnitudes against the average EHF thresholds between 10 and 16 kHz (r = –0.35; p = 0.07). Younger subjects and older subjects are specified by squares and circles, respectively.

In support of this experimental observation, the model simulations showed that, out of the tested conditions, the 4-kHz RAM pure tone evoked an EFR that was least affected by OHC damage when considering mixed SNHL pathologies. Our simulations in Fig. 10 showed that the 4-kHz RAM-EFR magnitude could be reduced by 15% due to OHC damage, but it was much more affected by CS, i.e., up to 85% reduction when simulating 87% ANF loss. The influence of OHC damage on the EFR magnitude depends somewhat on the frequency shape of the simulated CGL and is smaller when simulating sloping audiograms (i.e., 8% in Vasilkov et al., 2021). Furthermore, OHC damage affects SAM-EFR markers more than RAM-EFR markers, a feature explored in Fig. 10 for different carrier frequencies. In the model, the reduced influence of OHC damage on the RAM-EFR compared to the SAM-EFR stems from the fast-rising RAM stimulus envelope that directly saturates the ANF without giving the cochlear amplifier time to apply (impaired) cochlear compression. SAM stimulus envelopes rise more slowly in time and can, therefore, be more affected by individual cochlear compression differences. The current study specifically focused on participants with normal or near-normal audiometric thresholds to ensure that the impact on the EFR magnitude primarily reflected differences related to our methodological manipulations or individual degrees of CS rather than differences in OHC function. The absent relation between EHF thresholds and EFR marker magnitude in our cohort supports this view.

We expected to see individual differences in EFR magnitude in the younger participant cohort that would reflect individual ANF profiles and more drastic EFR magnitude reductions in the old group due to their expected age-related CS (Parthasarathy and Kujawa, 2018). When considering the median EFR magnitude values, we observed an overall larger spread across the younger than the older cohort and smaller median EFRs in the old cohort. This trend is consistent with the earlier experimental observations in Vasilkov et al. (2021) and suggests an overall age-related reduction in EFR magnitude consistent with an overall decrease in ANF number with age as observed in post-mortem human temporal bone analyses (Wu et al., 2019).

Furthermore, independent t-tests indicated statistically significant differences between the age groups. For recordings with a 4-kHz carrier frequency, both the EFR_{F1+F2+F3+F4} and the EFR_{F3+F4} methods indicated a statistically significant age effect. Because the EFR_{F1+F2+F3+F4} marker yielded overall larger magnitudes and the EFR_{F3+F4} method was not sufficient to capture individual CS differences in the model simulations, we conclude that using the 4-kHz RAM stimulus with the EFR_{F1+F2+F3+F4} method to diagnose age-related CS may be a good choice. However, future research will need to provide a larger sample size to study the effect of age on the EFR magnitude more systematically.

Last, there is a possibility that age influences the generators of the EFR independent of how CS affects the EFR magnitude. The EFR could have had contributions from various stages of the ascending auditory pathway (Coffey et al., 2016; Coffey et al., 2017; Bidelman, 2018), and we aimed to minimize the contribution of central sources by focusing on modulation frequencies above 80 Hz (Purcell et al., 2004) and using an electrode montage that is commonly used for ABRs. EFRs recorded in budgerigars to pure tones with different modulation frequencies show a good resemblance to ABR wave-I amplitude changes in the same animal (Wilson et al., 2021) when the modulation frequency is above 80 Hz and offer some support for a strong peripheral contribution to the EFRs we present here. However, we can never fully rule out that age may impact the brainstem (or even central) generators of the EFR, and future studies are necessary to fully explore this possibility.

The following description of the general test-retest reliability is based on data from a previous study within our research group. This additional information regarding the general test-retest reliability of EFR measures is important to consider before interpreting individual EFR differences or effects related to the different tested conditions. We did not perform test-retest recordings in the present study but implemented an additional analysis on our previous EFR recordings (Maele et al., 2021) to obtain a reliability estimate in young NH listeners with normal audiograms. The cohort was tested four times in the days surrounding music festivals with the same recording equipment as well as stimulation and analysis methods similar to those we describe in the current study. Given the possible acute impact of noise exposure on the EFR markers, we only included the data from the day before the event (D1) and days three (D3) and five (D5) after the event, for which we observed no significant group mean differences related to the day of test.

Figure 12(a) shows individual RAM-EFR magnitudes across measurement sessions, and the distribution of the EFR data and SDs (calculated across the three sessions) is displayed in Fig. 12(b). Overall, RAM-EFRs were larger in magnitude than SAM-EFRs and, therefore, exceeded the across-session SD more strongly. The means and SDs across sessions resulted in a median test-retest sensitivity of 10% for the RAM-EFR that increased (worsened) to 20% for the SAM-EFR. From this analysis in NH listeners, we estimate that if all of the other measurement conditions remain the same, a 10% deviation of the median EFR magnitude can naturally be expected to reflect a test-retest difference. This is important when interpreting EFR results across measurement sessions, as was the case for our electrode configuration experiment in Fig. 8.
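
The test-retest sensitivity described here (across-session SD divided by the across-session mean, summarized by the median across listeners) can be sketched as below. The EFR values are invented for illustration, and the use of the sample SD (n − 1 in the denominator) is an assumption.

```python
import statistics


def retest_sensitivity_percent(session_magnitudes):
    """Across-session SD as a percentage of the across-session mean.

    ASSUMPTION: the sample standard deviation (n - 1) is used.
    """
    return (100.0 * statistics.stdev(session_magnitudes)
            / statistics.mean(session_magnitudes))


# Hypothetical EFR magnitudes for three listeners on days D1, D3, and D5.
efr_by_subject = {
    "S1": [0.9, 1.0, 1.1],
    "S2": [2.0, 2.0, 2.0],
    "S3": [0.8, 1.0, 1.2],
}

per_subject = {s: retest_sensitivity_percent(v) for s, v in efr_by_subject.items()}
median_sensitivity = statistics.median(per_subject.values())
```

With these invented values, the three listeners have sensitivities of about 10%, 0%, and 20%, so the cohort median is about 10%.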

FIG. 12.

(Color online) An analysis of NH EFR data collected in Maele et al. (2021), measured pre-event (D1), on day 3 (D3), and on day 5 (D5). The median EFR magnitudes are indicated using horizontal lines in the boxplots. (a) Individual RAM-EFR magnitudes and boxplots of the cohort distribution, (b) boxplots of the mean EFR magnitude and the corresponding SD for either RAM or SAM stimuli recorded in the same session, and (c) sensitivity (%) for RAM and SAM stimuli, defined as the ratio of the SD of the individual EFR magnitude to the individual EFR mean, are shown. SDs were calculated from the EFR mean across the three measurement sessions.

We investigated how the stopping criteria (i.e., number of epochs included in the analysis) affected the EFR recording and its noisefloor. For the range of epochs we evaluated (600–1000), the number of epochs had no significant effect on the EFR magnitude or noisefloor, which suggests that we had a stable physiological noisefloor even when recording only 600 epochs (and discarding 20% of the worst epochs before calculating the EFR magnitude). The decision on the ideal number of epochs should consider the required SNR (for a desired effect size) and the measurement aim. The noisefloor can be raised by physiological activity (electromyographic activity, cognitive function, or electrical activity) and/or by the recording environment or devices (Kraus et al., 2017). Thus, externally indistinguishable states of the participant can lead to different EFR amplitudes due to fluctuations of internal variables (Matousek and Petersén, 1983). Cortical activity is one of the main sources of noise in EFR recordings and depends strongly on task demands. The cortical activity and, thus, the amount of background noise in the EFR will be different when visual sources (watching) are presented compared to auditory sources (listening; Galbraith and Arroyo, 1993; Hoormann et al., 2000; Galbraith et al., 2003). To limit possible cortical noise sources, our participants watched a silent, captioned movie while they listened to our auditory stimuli. Nevertheless, some studies found a decrease in EFR amplitudes when participants directed attention toward visual stimuli (away from auditory stimuli) while latencies of the auditory frequency-following responses remained unchanged (Galbraith and Arroyo, 1993; Galbraith et al., 2003). The inconsistencies across studies regarding whether to present a visual stimulus during EFR recordings may be explained by differences in EFR SNR calculation, the type of acoustic stimulus presented, and the frequency of the stimulus eliciting EFRs (Holmes et al., 2018).
Our results are consistent with Hoormann et al. (2000), who observed no effect on the EFR amplitudes when changing the number of epochs. Because we tried to keep the interfering acoustic noise sources as minimal as possible, an addition of 200 epochs may have only had a limited effect on the SNR of the EFR. Consistent with this finding, Kraus et al. (2017) suggest recording four times as many epochs to halve the noisefloor value. The recordings with 600 epochs showed the largest spread of EFR magnitudes as well as the largest differences in EFR magnitude between younger and older subjects. All of the recordings (i.e., with 600, 800, or 1000 epochs) were sufficient to obtain significant differences between the two age groups. Taking these factors together, we suggest that 800 recording epochs (i.e., 640 after removing the 20% noisiest epochs) are sufficient to obtain a stable EFR magnitude and maintain a good SNR even in older listeners.
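The epoch arithmetic in this discussion can be made concrete with a short sketch. It assumes that the averaged noisefloor scales with 1/sqrt(N) for N retained epochs, which holds for uncorrelated, stationary noise and matches the rule of recording four times as many epochs to halve the noisefloor. The function names are hypothetical and are not part of the analysis pipeline of the study.

```python
import math


def retained_epochs(n_recorded, reject_fraction=0.2):
    """Number of epochs left after discarding the noisiest fraction."""
    n_rejected = round(n_recorded * reject_fraction)
    return n_recorded - n_rejected


def noise_floor_ratio(n_from, n_to):
    """Noisefloor with n_to epochs relative to n_from epochs.

    Assumption: averaging uncorrelated noise reduces it as 1/sqrt(N).
    """
    return math.sqrt(n_from / n_to)


baseline = retained_epochs(600)
for n_recorded in (600, 800, 1000):
    kept = retained_epochs(n_recorded)
    ratio = noise_floor_ratio(baseline, kept)
    change_db = 20.0 * math.log10(ratio)
    print(f"{n_recorded} recorded -> {kept} kept, "
          f"noisefloor x{ratio:.2f} ({change_db:+.2f} dB re 600 recorded)")
```

Under this assumption, moving from 600 to 800 recorded epochs lowers the noisefloor by only about 1.2 dB, which is in line with the limited effect of 200 additional epochs noted above.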

Previous research described how different carrier frequencies affected the auditory steady-state response (ASSR). The ASSR is similar to the EFR and generated by the frequency of a sinusoidal modulator imposed on a pure tone (Picton et al., 2003). One study compared ASSRs with the same spectral stimulus smearing level for different carrier frequencies (e.g., 500 and 2000 Hz) and found significantly different ASSRs (Hwang et al., 2019). Our EFR magnitudes did exhibit significant differences between the younger and older groups when changing the carrier frequency from 2 to 4 to 6 kHz. Statistically significant group differences were found when the EFR magnitude was recorded with a 4 kHz carrier. The estimated marginal means analysis indicated that carrier frequencies of 4 and 6 kHz generated the largest mean EFR magnitude among the younger subjects. The lowest EFR magnitude was found when older subjects were tested with a carrier frequency of 4 kHz. This implies that envelope processing in individuals was similar across these tested carrier frequencies and carrier frequency-related differences in EFR magnitude were generally smaller than individual EFR magnitude differences for a single EFR condition. We generally reported EFR responses with positive SNRs, even for the tested older cohort, implying that these RAM-EFRs can be used to quantify individual envelope processing differences in both cohorts. Of the three tested carrier frequencies, the EFR model simulations with the RAM-4000 stimulus [Fig. 10(j)] show that the RAM 2 and 6 kHz carriers are more affected by concomitant OHC damage than the RAM 4 kHz carrier. Most of the older listeners had decreased hearing sensitivity at 4 kHz; hence, it remains important to focus on temporal envelope encoding associated with processing above the ANF phase-locking limit.
In humans, this limit may lie around 1–2 kHz, as one of our previous studies showed that we were not able to measure reliable derived-band EFRs below 2 kHz in young NH listeners (Keshishzadeh et al., 2020). Therefore, it may be best to limit RAM-EFR markers for CS diagnosis to carrier frequencies of 2 kHz and greater. Our model simulations additionally led us to conclude that it is safer to work with the 4 kHz carrier in (older) listeners with impaired audiograms.

The position of electrodes on the scalp and the choice of reference and ground electrode position influence EFR measurements strongly (Smith et al., 1975; Stillman et al., 1978; Galbraith, 1994). We compared EFR recordings using two specific electrode configurations with a vertical two-channel montage (four electrodes, consisting of one positive, two negative, and one ground). This montage emphasizes sustained phase-locked neural activity from the generators in the brainstem (Smith et al., 1975; Stillman et al., 1978). The EFRs recorded from the human scalp can have multiple peripheral and central generators, and the contribution of these generators is related to the modulation frequency (Purcell et al., 2004; Coffey et al., 2021). Here, the RAM stimulus envelope was modulated at 110 Hz to maximally elicit peripheral contributions (Purcell et al., 2004; Parthasarathy and Bartlett, 2012; Wilson et al., 2021). However, there is an upper bound on the EFR modulation frequency that can be used to assess local temporal envelope processing. The carrier and modulation frequencies should fall within a single cochlear auditory filter if the EFR marker is to be informative about (CS-affected) envelope processing. This filter bandwidth-modulation frequency limit is species and carrier frequency specific (Kohlrausch et al., 2000). For example, using a 400 Hz modulation frequency in humans is not recommended at 4 kHz (Garrett and Verhulst, 2019), and based on Fig. 2 of Kohlrausch et al. (2000), it is safer to use 100–120 Hz modulators for carrier frequencies down to 1000 Hz in humans.

The peripheral and brainstem EFR generators are expected to show the largest contribution to the evoked potentials when the non-inverting electrode is placed on a midline position, such as the high forehead, and the two inverting electrodes are located at the mastoid. The ground electrode can be positioned on a variety of locations (Galbraith, 1994) and is required to obtain a voltage difference by subtracting the voltages of active and reference points (Teplan, 2002). In our hypothesis [see Eq. (5)], we argued that the placement of the ground electrode would not significantly affect the EFR amplitude. This hypothesis was confirmed using raw data obtained from the Biosemi EEG amplifier. Paukkunen and Sepponen (2008) conducted a simulation study to consider the effect of ground selection for EEG recordings and they used a homogeneous head model for their purpose. They recommend selecting a distant site for the ground electrode placement so that the distances between the four electrodes are more or less the same. In other words, when the ground electrode was placed too close to the active and/or reference electrode, the electrical sources coupled differently into the signals. This reduced the comparability of the signals and increased signal distortion (Paukkunen and Sepponen, 2008). On the downside, a distant ground electrode will also collect more interference from other sources inside the head, and this motivates the placement of the ground electrode at a maximally silent site to minimize the measurement noise sensitivity (Paukkunen and Sepponen, 2008). From this point of view, we would expect that configuration 2 (i.e., ground electrode on nose tip) would elicit larger EFR magnitudes. When the ground electrode is placed on the nose tip, the distance between the recording and reference electrodes is more comparable than when two reference electrodes are placed on the forehead.
In addition to the ground electrode placement, the placement of the reference electrode can affect the generated EFR. Hall et al. (1984) recommend the earlobe rather than the mastoid as the preferred location for auditory subcortical recordings because it is a noncephalic site and results in smaller bone vibration artifacts. The measurements of this study took place during the Covid pandemic and wearing a face mask in the hospital was mandatory at that time. In the context of practical feasibility, the reference electrodes were, thus, placed bilaterally on the mastoid. Taken together, we recommend placing the ground electrode at a maximally silent place (i.e., the nose tip) to avoid collecting interference from other sources inside the head. If feasible, reference electrode(s) should be placed at the earlobe(s). To ensure a certain level of symmetry between the different electrodes, the active electrode should, in this configuration, be put at the high forehead.
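The expectation that ground placement should not change the EFR amplitude can be illustrated with an idealized differential recording, in which the active and reference potentials are each measured relative to ground and then subtracted. The sketch below is an assumption-laden toy example rather than a reproduction of Eq. (5): the function name and all voltages are hypothetical, and it ignores finite common-mode rejection and the geometry-dependent coupling described by Paukkunen and Sepponen (2008).

```python
def differential_output(v_active, v_reference, v_ground):
    """Ideal differential channel: (active - ground) - (reference - ground)."""
    return (v_active - v_ground) - (v_reference - v_ground)


# Hypothetical scalp potentials in microvolts (not measured values).
v_active = 1.50
v_reference = 1.20

# Three hypothetical ground potentials, e.g., for different ground sites.
ground_potentials = (0.0, 0.4, -2.0)
outputs = [differential_output(v_active, v_reference, g)
           for g in ground_potentials]

for g, out in zip(ground_potentials, outputs):
    print(f"ground = {g:+.1f} uV -> channel output = {out:.3f} uV")
```

In this idealized case the ground potential cancels exactly, so any ground-related differences seen in practice would have to come from effects that the sketch leaves out, such as noise pickup at the ground site.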

We combined model simulations with EFR recordings in younger and older listeners to investigate methodological considerations related to stimulus repetition, analysis method, carrier frequency, and electrode configuration for RAM-EFRs. These RAM-EFRs yielded good SNRs and showed reliable responses even in older listeners, which opens clinical options for individual hearing loss diagnosis. Within this study, statistically significant differences (p < 0.05) between the age groups were found for the three different epoch numbers, measurements with a 4 kHz carrier frequency, and measurements analyzed with the $\mathrm{EFR}_{F_1+F_2+F_3+F_4}$ and the $\mathrm{EFR}_{F_3+F_4}$ methods. We conclude that the placement of the ground electrode does not significantly affect the EFR magnitude and recommend placing the ground electrode at a maximally silent place (i.e., the nose tip). For clinical applications, we want to keep the measurements as time efficient as possible so that the RAM-EFR can be easily integrated into already existing clinical protocols. At the same time, we would also like to keep in mind that (older) participants with a hearing loss might require more epochs. All three recordings were able to detect an age difference. To ensure a good SNR on the one hand and individually interpretable EFR responses on the other hand, we suggest recording 800 epochs to retain 640 epochs after applying our crude artifact rejection (20% of the recorded epochs removed). Further efforts to improve the artifact-rejection procedure may bring the number of necessary recorded epochs closer to the 600 required epochs for obtaining a good SNR. A significant difference between our five possible analysis methods was found. When considering the possible analysis methods, we conclude that the $\mathrm{EFR}_{F_1+F_2+F_3+F_4}$ method is the most suitable of the considered methods for analyzing the EFR magnitude. Based on the model simulations, this method is the least affected by possible OHC damage.
In addition, this method yields the strongest difference between the younger and older groups. Our EFR model simulations showed that RAM-4000 was the least affected by OHC loss compared to the 2 and 6 kHz RAM stimuli. Taken together, the outcomes of our theoretical-experimental study can support future studies in their development of a clinically usable and robust test protocol for EFRs.

This work was supported by the European Research Council (ERC) under the Horizon 2020 Research and Innovation Program, Grant No. 678120 RobSpear (H.V.D.B., S.K., and S.V.), ERC-PoC CochSyn (Grant No. 899858; H.V.D.B.), and Fonds Wetenschappelijk Onderzoek G068621N. Ghent University filed a patent application (PCTEP2020053192) which covers some of the ideas presented in this paper. S.V. and Viacheslav Vasilkov are inventors.

1. Bharadwaj, H. M., Mai, A. R., Simpson, J. M., Choi, I., Heinz, M. G., and Shinn-Cunningham, B. G. (2019). “Non-invasive assays of cochlear synaptopathy—Candidates and considerations,” Neuroscience 407, 53–66.
2. Bharadwaj, H. M., Masud, S., Mehraei, G., Verhulst, S., and Shinn-Cunningham, B. G. (2015). “Individual differences reveal correlates of hidden hearing deficits,” J. Neurosci. 35(5), 2161–2172.
3. Bidelman, G. M. (2018). “Subcortical sources dominate the neuroelectric auditory frequency-following response to speech,” Neuroimage 175, 56–69.
4. Bramhall, N., Beach, E. F., Epp, B., Le Prell, C. G., Lopez-Poveda, E. A., Plack, C. J., Schaette, R., Verhulst, S., and Canlon, B. (2019). “The search for noise-induced cochlear synaptopathy in humans: Mission impossible?,” Hear. Res. 377, 88–103.
5. Coffey, E. B., Arseneau-Bruneau, I., Zhang, X., Baillet, S., and Zatorre, R. J. (2021). “Oscillatory entrainment of the frequency-following response in auditory cortical and subcortical structures,” J. Neurosci. 41(18), 4073–4087.
6. Coffey, E. B., Herholz, S. C., Chepesiuk, A. M., Baillet, S., and Zatorre, R. J. (2016). “Cortical contributions to the auditory frequency-following response revealed by MEG,” Nat. Commun. 7(1), 1–11.
7. Coffey, E. B., Musacchia, G., and Zatorre, R. J. (2017). “Cortical correlates of the auditory frequency-following and onset responses: EEG and fMRI evidence,” J. Neurosci. 37(4), 830–838.
8. Dolphin, W. F., and Mountain, D. C. (1992). “The envelope following response: Scalp potentials elicited in the Mongolian gerbil using sinusoidally AM acoustic signals,” Hear. Res. 58(1), 70–78.
9. Encina-Llamas, G., Dau, T., and Epp, B. (2021). “On the use of envelope following responses to estimate peripheral level compression in the auditory system,” Sci. Rep. 11(1), 1–19.
10. Fernandez, K. A., Guo, D., Micucci, S., De Gruttola, V., Liberman, M. C., and Kujawa, S. G. (2020). “Noise-induced cochlear synaptopathy with and without sensory cell loss,” Neuroscience 427, 43–57.
11. Fernandez, K. A., Jeffers, P. W., Lall, K., Liberman, M. C., and Kujawa, S. G. (2015). “Aging after noise exposure: Acceleration of cochlear synaptopathy in ‘recovered’ ears,” J. Neurosci. 35(19), 7509–7520.
12. Galbraith, G. C. (1994). “Two-channel brain-stem frequency-following responses to pure tone and missing fundamental stimuli,” Electroencephalogr. Clin. Neurophysiol. 92(4), 321–330.
13. Galbraith, G. C., and Arroyo, C. (1993). “Selective attention and brainstem frequency-following responses,” Biol. Psychol. 37(1), 3–22.
14. Galbraith, G. C., Olfman, D. M., and Huffman, T. M. (2003). “Selective attention affects human brain stem frequency-following response,” Neuroreport 14(5), 735–738.
15. Garrett, M. (2020). “Degradation of auditory processing and perception with age: The role of near and supra-threshold sensorineural hearing deficits,” Ph.D. thesis, Carl von Ossietzky Universität Oldenburg. Medizinische Physik, Oldenburg, Germany.
16. Garrett, M., and Verhulst, S. (2019). “Applicability of subcortical EEG metrics of synaptopathy to older listeners with impaired audiograms,” Hear. Res. 380, 150–165.
17. Guest, H., Munro, K. J., Prendergast, G., Howe, S., and Plack, C. J. (2017). “Tinnitus with a normal audiogram: Relation to noise exposure but no evidence for cochlear synaptopathy,” Hear. Res. 344, 265–274.
18. Guest, H., Munro, K. J., Prendergast, G., Millman, R. E., and Plack, C. J. (2018). “Impaired speech perception in noise with a normal audiogram: No evidence for cochlear synaptopathy and no relation to lifetime noise exposure,” Hear. Res. 364, 142–151.
19. Hall, J. W., III, Morgan, S. H., Mackey–Hargadine, J., Aguilar, E. A., III, and Jahrsdoerfer, R. A. (1984). “Neuro-otologic applications of simultaneous multichannel auditory evoked response recordings,” Laryngoscope 94(7), 883–889.
20. Holmes, E., Purcell, D. W., Carlyon, R. P., Gockel, H. E., and Johnsrude, I. S. (2018). “Attentional modulation of envelope-following responses at lower (93–109 Hz) but not higher (217–233 Hz) modulation rates,” J. Assoc. Res. Otolaryngol. 19(1), 83–97.
21. Hoormann, J., Falkenstein, M., and Hohnsbein, J. (2000). “Early attention effects in human auditory-evoked potentials,” Psychophysiology 37(1), 29–42.
22. Hwang, J. H., Nam, K. W., Jang, D. P., and Kim, I. Y. (2019). “Effects of degree and symmetricity of bilateral spectral smearing, carrier frequency, and subject sex on amplitude of evoked auditory steady-state response signal,” Cogn. Neurodyn. 13(2), 151–160.
23. Keppler, H., Dhooge, I., and Vinck, B. (2015). “Hearing in young adults. Part II: The effects of recreational noise exposure,” Noise Health 17(78), 245–252.
24. Keshishzadeh, S., Garrett, M., Vasilkov, V., and Verhulst, S. (2020). “The derived-band envelope following response and its sensitivity to sensorineural hearing deficits,” Hear. Res. 392, 107979.
25. Keshishzadeh, S., Garrett, M., and Verhulst, S. (2021). “Towards personalized auditory models: Predicting individual sensorineural hearing-loss profiles from recorded human auditory physiology,” Trends Hear. 25, 233121652098840.
26. Kohlrausch, A., Fassel, R., and Dau, T. (2000). “The influence of carrier level and frequency on modulation and beat-detection thresholds for sinusoidal carriers,” J. Acoust. Soc. Am. 108(2), 723–734.
27. Kraus, N., Anderson, S., and White-Schwoch, T. (2017). “The frequency-following response: A window into human communication,” in The Frequency-Following Response, Springer Handbook of Auditory Research, Vol. 61, edited by N. Kraus, S. Anderson, T. White-Schwoch, R. Fay, and A. Popper (Springer, Cham, Switzerland), pp. 1–15.
28. Kujawa, S. G., and Liberman, M. C. (2009). “Adding insult to injury: Cochlear nerve degeneration after ‘temporary’ noise-induced hearing loss,” J. Neurosci. 29(45), 14077–14085.
29. Maele, T. V., Keshishzadeh, S., Poortere, N., Dhooge, I., Keppler, H., and Verhulst, S. (2021). “The variability in potential biomarkers for cochlear synaptopathy after recreational noise exposure,” J. Speech. Lang. Hear. Res. 64(12), 4964–4981.
30. Makary, C. A., Shin, J., Kujawa, S. G., Liberman, M. C., and Merchant, S. N. (2011). “Age-related primary cochlear neuronal degeneration in human temporal bones,” J. Assoc. Res. Otolaryngol. 12(6), 711–717.
31. Matousek, M., and Petersén, I. (1983). “A method for assessing alertness fluctuations from EEG spectra,” Electroencephalogr. Clin. Neurophysiol. 55(1), 108–113.
32. Mehraei, G., Hickox, A. E., Bharadwaj, H. M., Goldberg, H., Verhulst, S., Liberman, M. C., and Shinn-Cunningham, B. G. (2016). “Auditory brainstem response latency in noise as a marker of cochlear synaptopathy,” J. Neurosci. 36(13), 3755–3764.
33. Mepani, A. M., Verhulst, S., Hancock, K. E., Garrett, M., Vasilkov, V., Bennett, K., de Gruttola, V., Liberman, M. C., and Maison, S. F. (2021). “Envelope following responses predict speech-in-noise performance in normal-hearing listeners,” J. Neurophysiol. 125(4), 1213–1222.
34. Möhrle, D., Ni, K., Varakina, K., Bing, D., Lee, S. C., Zimmermann, U., Knipper, M., and Rüttiger, L. (2016). “Loss of auditory sensitivity from inner hair cell synaptopathy can be centrally compensated in the young but not old brain,” Neurobiol. Aging 44, 173–184.
35. Osses, A. V., and Verhulst, S. (2019). “Calibration and reference simulations for the auditory periphery model of Verhulst et al. 2018 version 1.2,” arXiv:1912.10026.
36. Oxenham, A. J. (2016). “Predicting the perceptual consequences of hidden hearing loss,” Trends Hear. 20, 233121651668676.
37. Parthasarathy, A., and Bartlett, E. (2012). “Two-channel recording of auditory-evoked potentials to detect age-related deficits in temporal processing,” Hear. Res. 289(1-2), 52–62.
38. Parthasarathy, A., and Kujawa, S. G. (2018). “Synaptopathy in the aging cochlea: Characterizing early-neural deficits in auditory temporal envelope processing,” J. Neurosci. 38(32), 7108–7119.
39. Paukkunen, A. K. O., and Sepponen, R. (2008). “The effect of ground electrode on the sensitivity, symmetricity and technical feasibility of scalp EEG recordings,” Med. Biol. Eng. Comput. 46(9), 933–938.
40. Paul, B. T., Bruce, I. C., and Roberts, L. E. (2017). “Evidence that hidden hearing loss underlies amplitude modulation encoding deficits in individuals with and without tinnitus,” Hear. Res. 344, 170–182.
41. Picton, T. W., John, M. S., Dimitrijevic, A., and Purcell, D. (2003). “Human auditory steady-state responses: Respuestas auditivas de estado estable en humanos,” Int. J. Audiol. 42(4), 177–219.
42. Prendergast, G., Millman, R. E., Guest, H., Munro, K. J., Kluk, K., Dewey, R. S., Hall, D. A., Heinz, M. G., and Plack, C. J. (2017). “Effects of noise exposure on young adults with normal audiograms II: Behavioral measures,” Hear. Res. 356, 74–86.
43. Purcell, D. W., John, S. M., Schneider, B. A., and Picton, T. W. (2004). “Human temporal auditory acuity as assessed by envelope following responses,” J. Acoust. Soc. Am. 116(6), 3581–3593.
44. Schaette, R., and McAlpine, D. (2011). “Tinnitus with a normal audiogram: Physiological evidence for hidden hearing loss and computational model,” J. Neurosci. 31(38), 13452–13457.
45. Shaheen, L. A., Valero, M. D., and Liberman, M. C. (2015). “Towards a diagnosis of cochlear neuropathy with envelope following responses,” J. Assoc. Res. Otolaryngol. 16(6), 727–745.
46. Smith, J. R., Funke, W. F., Yeo, W., and Ambuehl, R. A. (1975). “Detection of human sleep EEG waveforms,” Electroencephalogr. Clin. Neurophysiol. 38(4), 435–437.
47. Smith, S., Krizman, J., Liu, C., White-Schwoch, T., Nicol, T., and Kraus, N. (2019). “Investigating peripheral sources of speech-in-noise variability in listeners with normal audiograms,” Hear. Res. 371, 66–74.
48. Stillman, R. D., Crow, G., and Moushegian, G. (1978). “Components of the frequency-following potential in man,” Electroencephalogr. Clin. Neurophysiol. 44(4), 438–446.
49. Teplan, M. (2002). “Fundamentals of EEG measurement,” Meas. Sci. Rev. 2(2), 1–11.
50. Valero, M. D., Hancock, K. E., and Liberman, M. C. (2016). “The middle ear muscle reflex in the diagnosis of cochlear neuropathy,” Hear. Res. 332, 29–38.
51. Vasilkov, V., Garrett, M., Mauermann, M., and Verhulst, S. (2021). “Enhancing the sensitivity of the envelope-following response for cochlear synaptopathy screening in humans: The role of stimulus envelope,” Hear. Res. 400, 108132.
52. Verhulst, S., Altoè, A., and Vasilkov, V. (2018). “Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss,” Hear. Res. 360, 55–75.
53. Verhulst, S., Ernst, F., Garrett, M., and Vasilkov, V. (2018). “Suprathreshold psychoacoustics and envelope-following response relations: Normal-hearing, synaptopathy and cochlear gain loss,” Acta Acust. Acust. 104(5), 800–803.
54. Verhulst, S., Jagadeesh, A., Mauermann, M., and Ernst, F. (2016). “Individual differences in auditory brainstem response wave characteristics: Relations to different aspects of peripheral hearing loss,” Trends Hear. 20, 233121651667218.
55. Verhulst, S., Keshishzadeh, S., Taghon, B., Keppler, H., and Dhooge, I. (2022). “Low and high-frequency hearing mechanisms in ageing and tinnitus,” in International Symposium on Hearing (ISH), May, Lyon.
56. Viana, L. M., O'Malley, J. T., Burgess, B. J., Jones, D. D., Oliveira, C. A., Santos, F., Merchant, S. N., Liberman, L. D., and Liberman, M. C. (2015). “Cochlear neuropathy in human presbycusis: Confocal analysis of hidden hearing loss in post-mortem tissue,” Hear. Res. 327, 78–88.
57. Wilson, J. L., Abrams, K. S., and Henry, K. S. (2021). “Effects of kainic acid-induced auditory nerve damage on envelope-following responses in the budgerigar (Melopsittacus undulatus),” J. Assoc. Res. Otolaryngol. 22(1), 33–49.
58. Wojtczak, M., Beim, J. A., and Oxenham, A. J. (2017). “Weak middle-ear-muscle reflex in humans with noise-induced tinnitus and normal hearing may reflect cochlear synaptopathy,” eNeuro 4(6), ENEURO.0363-17.2017.
59. Wu, P. Z., Liberman, L. D., Bennett, K., de Gruttola, V., O'Malley, J. T., and Liberman, M. C. (2019). “Primary neural degeneration in the human cochlea: Evidence for hidden hearing loss in the aging ear,” Neuroscience 407, 8–20.
60. Zhong, Z., Henry, K. S., and Heinz, M. G. (2014). “Sensorineural hearing loss amplifies neural coding of envelope information in the central auditory system of chinchillas,” Hear. Res. 309, 55–62.
61. Zhu, L., Bharadwaj, H., Xia, J., and Shinn-Cunningham, B. (2013). “A comparison of spectral magnitude and phase-locking value analyses of the frequency-following response to complex tones,” J. Acoust. Soc. Am. 134(1), 384–395.