The amplitudes of distortion-product otoacoustic emissions (DPOAEs) may abruptly decrease even though the stimulus level is relatively high. These notches observed in the DPOAE input/output functions or distortion-product grams have been hypothesized to be due to destructive interference between wavelets generated by distributed sources of the nonlinear-distortion component of DPOAEs. In this paper, simulations with a smooth cochlear model and its analytical solution support the hypothesis that destructive interference between individual wavelets may lead to the amplitude notches and explain the cause for onset and offset amplitude overshoots in the DPOAE signal measured for intensity pairs in the notches.
Distortion-products (DPs) generated in the cochlea may be recorded by a microphone placed in the ear canal. DPs evoked by two pure tones of nearby frequencies are called distortion-product otoacoustic emissions (DPOAEs) (Probst et al., 1991). In a previous study (Vencovský et al., 2019), we analyzed how DPs generated over a relatively large portion of the basilar membrane (BM) affect the DPOAE signal if the level of one tone is 50 dB sound pressure level (SPL) and the level of the other tone is varied from 30 to 70 dB SPL. We showed that the DPOAE can be well approximated by a point source on the BM. Theoretical tools presented in Vencovský et al. (2019) are used in this paper to study the primary source at large stimulus levels (>60 dB SPL) and with a frequency ratio between the stimuli of f2/f1 = 1.2, which is near the “optimal” ratio for which experiments have shown an abrupt decrease of DPOAE amplitude (e.g., Johannesen and Lopez-Poveda, 2010; Martin et al., 2013; Whitehead et al., 1992). The amplitude decrease caused notches in the DPOAE input/output functions or DP-grams. An interference tone (IT) with frequency greater than f2 presented simultaneously with the stimuli in some cases enhanced the DPOAE amplitude. Therefore, a possible explanation for the pronounced notches in the DPOAE amplitude is destructive interference between the primary sources (Martin et al., 1999, 2003, 2013). The same mechanism was theoretically identified as causing the amplitude decrease of the DPOAE when the frequency ratio f2/f1 approaches unity (Shera, 2002; Shera and Guinan, 2007; Sisto et al., 2018; van Hengel, 1996). Since the analytical solution of the cochlear model presented in Vetešník and Gummer (2012) and Vencovský et al. (2019) enables the calculation of the wavelets forming the primary DPOAE source, the analytical solution is used here to test the hypothesis of destructive interference over a broad basal region as postulated by Martin and colleagues (Martin et al., 1999, 2003, 2013).
DPOAEs simulated in this paper were generated by two tones of frequencies f1 = 2 kHz and f2 = 2.4 kHz (f2/f1 = 1.2), giving a lower side-band cubic DP at fDP = 2f1 − f2 = 1.6 kHz. The tone levels were varied in 2-dB steps, with L1 ranging from 40 to 80 dB SPL and L2 ranging from 30 to 80 dB SPL. To study the onset of the DPOAE signal, the f2 tone was presented with 10-ms delay after the onset of the f1 tone and turned off before the offset of the f1 tone. The onset and offset of the f1 and f2 tones were, respectively, shaped with 4- and 10-ms raised-cosine ramps (similar to Vencovský et al., 2019). The duration of the f2 tone was 60 ms. To obtain a cubic DPOAE recording without the evoking signals and other DP signals, the starting phases of the stimuli were rotated in successive ensembles (Whitehead et al., 1996). In the first ensemble, the starting phases were 0°. In each subsequent ensemble, the phase of the f1 tone was advanced by 90° and that of the f2 tone by 180°, which leaves the phase of the DP at 2f1 − f2 unchanged. To suppress the basal sources of DPOAEs, an IT was added with level LIT = 65 dB SPL and frequency fIT = 3050 Hz, which is about 1/3 octave above f2 (the same octave range used in Martin et al., 2013). The IT was added simultaneously with f1 and was shaped with 4-ms raised-cosine ramps. The phase of the IT was rotated by 45° between successive ensembles, which eliminated the IT from the recorded signal and yielded an otoacoustic emission (OAE) signal in which the DPs at 1.4 and 1.8 kHz (those closest to fDP) were about 52 dB smaller than the DP at fDP. Signals from eight ensembles were averaged for the three-tone stimulus. Without the IT, only four ensembles needed to be averaged.
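The phase-rotation averaging described above can be illustrated with a minimal sketch. It does not use the cochlear model: a memoryless cubic nonlinearity (`x - 0.1 * x**3`) stands in for the cochlea, all tones have unit amplitude, and the sampling rate and 20-ms analysis window are assumptions chosen so that every tone completes whole cycles. The function names are hypothetical.

```python
import math

FS = 48_000                      # sampling rate in Hz (assumed for this sketch)
N = 960                          # 20-ms window; all tones complete whole cycles
F1, F2, F_IT = 2000.0, 2400.0, 3050.0
F_DP = 2 * F1 - F2               # 1600 Hz

def ensemble(k, with_it):
    """Output of a toy cubic nonlinearity for the k-th phase-rotated ensemble."""
    p1 = math.radians(90 * k)    # f1 phase advances by 90 degrees per ensemble
    p2 = math.radians(180 * k)   # f2 phase advances by 180 degrees per ensemble
    p_it = math.radians(45 * k)  # IT phase advances by 45 degrees per ensemble
    out = []
    for n in range(N):
        t = n / FS
        x = math.cos(2 * math.pi * F1 * t + p1) + math.cos(2 * math.pi * F2 * t + p2)
        if with_it:
            x += math.cos(2 * math.pi * F_IT * t + p_it)
        out.append(x - 0.1 * x ** 3)   # stand-in nonlinearity, not the cochlear model
    return out

def average(n_ensembles, with_it):
    """Sample-by-sample mean over the phase-rotated ensembles."""
    total = [0.0] * N
    for k in range(n_ensembles):
        for n, v in enumerate(ensemble(k, with_it)):
            total[n] += v
    return [v / n_ensembles for v in total]

def tone_amplitude(signal, freq):
    """Amplitude of the sinusoidal component at freq (single-bin DFT)."""
    re = sum(v * math.cos(2 * math.pi * freq * n / FS) for n, v in enumerate(signal))
    im = sum(v * math.sin(2 * math.pi * freq * n / FS) for n, v in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

avg_with_it = average(8, with_it=True)     # eight ensembles for the three-tone case
avg_without_it = average(4, with_it=False) # four ensembles suffice without the IT
```

In this toy setting the component at 2f1 − f2 keeps the same phase in every ensemble (2 × 90° − 180° = 0°) and therefore survives averaging, whereas the components at f1, f2, and fIT rotate through full circles and cancel.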
2.2 Cochlear model and analytical solution
DPOAEs were simulated by the two-dimensional¹ box model described in Vencovský et al. (2019), which was designed to simulate the human cochlea. The cochlear model is coupled to a middle-ear model (Vencovský et al., 2019), allowing numerical simulation of OAEs as pressure changes in the external auditory canal. To analyze the primary source of the DPOAE, an analytical solution of the model equations described in Vencovský et al. (2019) was used. The box model did not contain inhomogeneities (roughness), in order to simulate only the nonlinear-distortion component of DPOAEs (also called the primary component).
In the model, the BM displacement ξ(x, t) (positive for displacement toward scala vestibuli) is governed by the equation of motion, Eq. (1), in which every segment of the BM (from the base x = 0 to the apex x = L) is represented by its mass m(x), stiffness k(x), damping h(x), and shearing resistance ∂x s(x) ∂x (∂t and ∂x are partial derivatives with respect to t and x, respectively). The right-hand side of Eq. (1) contains forces accounting for: (1) the fluid coupling between the stapes displacement σ(t) (positive for displacement into scala vestibuli) and the BM displacement, which is modeled by the Green's function GS(x); (2) fluid coupling between the individual BM segments modeled by the Green's function G(x, x̄); and (3) active feedback force U(x, t), which amplifies the BM vibrations and is assumed to be based on electromechanical force produced by the outer hair cells. U(x, t) is moderated by the oscillations of another resonance system, tuned to about 0.5 oct below the characteristic frequency (CF) of the BM, assumed to be formed by the tectorial membrane connected to outer hair cell stereocilia (Nobili and Mammano, 1996). The force U(x, t) is limited at larger vibrations of the BM by a Boltzmann function (Vencovský et al., 2019) simulating mechanoelectrical transduction in the stereocilia of the outer hair cells. The Boltzmann function causes the simulated BM displacement to be nonlinear at stimulus levels greater than about 30 dB SPL (Vencovský et al., 2019). The same model implementation as in Vencovský et al. (2019) was used: Matlab implementation with time-domain equations solved numerically by the Dormand-Prince method, which is a member of the Runge-Kutta family of ODE solvers. The sampling frequency was 600 kHz and the model was discretized in the spatial domain with 800 equal-length segments. Testing with up to 1200 segments, we have shown that 800 segments are sufficient to avoid reflection components resulting from the discretization (Vencovský et al., 2019).
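For orientation, an equation of motion containing exactly the terms named above could be written as follows. This is a sketch assembled from the verbal description, not a transcription of Eq. (1): the signs, the placement of the time derivative in the shearing term, and the integration variable x̄ are assumptions, and Vencovský et al. (2019) should be consulted for the authoritative form.

```latex
% Assumed form; signs and term arrangement are NOT taken from the source.
m(x)\,\partial_t^2 \xi(x,t)
  + h(x)\,\partial_t \xi(x,t)
  - \partial_x\!\left[ s(x)\,\partial_x \partial_t \xi(x,t) \right]
  + k(x)\,\xi(x,t)
  = -\,G_S(x)\,\partial_t^2 \sigma(t)
    - \int_0^L G(x,\bar{x})\,\partial_t^2 \xi(\bar{x},t)\,\mathrm{d}\bar{x}
    + U(x,t)
```

The left-hand side collects the local mechanics of a BM segment (mass, damping, shearing resistance, stiffness); the right-hand side collects the stapes coupling, the fluid coupling between segments, and the active feedback force.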
If the evoking stimuli are at frequencies which are multiples of a fundamental frequency f0 (Nobili and Mammano, 1996), such that f1 = n1f0, f2 = n2f0, and fDP = nDPf0, where nDP = 2n1 − n2, then the analytical solution of the model [Eq. (1)] may be approximated by Eq. (2), a complex constant multiplied by an integral along the BM over the product of the forward-going base wave and the nonlinear force at fDP.
The reader is referred to Vencovský et al. (2019) and Vetešník and Gummer (2012) for details. Briefly, the constant multiplying the integral in Eq. (2) is complex and does not depend on the stimulus level; the integration limits x1 and x2 define the range of positions along the BM where the primary DPOAE sources are located [the BM part where the traveling waves (TWs) of the evoking stimuli overlap (Vencovský et al., 2019)]; the base wave in the integrand is the solution of the homogeneous equation associated with Eq. (1) and represents the forward-going (base) wave; and the remaining factor of the integrand is the nonlinear force. The amplitude and phase of the forward-going base wave can be approximated by Eq. (3).
Eq. (3) contains a complex constant determined by the boundary condition at the apex. In this paper, the forward-going base wave is approximated numerically by the model solution for the stapedial input at 10 dB SPL (Vencovský et al., 2019). The nonlinear force is the component of the Fourier decomposition of the nonlinear part of U(x, t) at fDP = 2f1 − f2, described in Vencovský et al. (2019). The nonlinear part of U(x, t), denoted by UNL(x, t), was calculated numerically in the time domain according to Eq. (4).
In Eq. (4), η(x, t) is the shearing displacement between the reticular lamina and tectorial membrane, u(x) is a function which sets the desired cochlear gain in different BM segments, and S[aη(x, t)] is the Boltzmann function; a = 1 m⁻¹ is a constant. To obtain the nonlinear force at fDP, UNL(x, t) was filtered by a window-based finite impulse response (FIR) bandpass filter centered at fDP. The filter of order 800 with coefficients computed using a Hamming window was applied as a zero-phase filter by the function filtfilt in Matlab. The 6-dB bandwidth of the filter was 140 Hz and the signal was downsampled to 102.4 kHz before filtering (Vencovský et al., 2019). The amplitude and phase of the filtered signal were calculated with the Hilbert transform.
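How a saturating nonlinearity produces a Fourier component at fDP = 2f1 − f2 when both stimuli are multiples of a fundamental f0 can be sketched as follows. Here f0 = 400 Hz (so n1 = 5, n2 = 6, nDP = 4) is an assumed choice consistent with the stimulus frequencies, and the symmetric first-order Boltzmann function, the equal input amplitudes, and the function names are stand-ins for illustration, not the model's S, η, or u(x).

```python
import cmath
import math

F0 = 400.0                 # assumed fundamental frequency in Hz
N1, N2 = 5, 6              # f1 = 5 * f0 = 2000 Hz, f2 = 6 * f0 = 2400 Hz
N_DP = 2 * N1 - N2         # 4, i.e., fDP = 1600 Hz
SAMPLES = 1024             # samples per period of f0

def boltzmann(x):
    """Symmetric first-order Boltzmann function (illustrative stand-in)."""
    return 1.0 / (1.0 + math.exp(-x)) - 0.5

def fourier_component(n_harm, amplitude):
    """Complex amplitude of harmonic n_harm of f0 in the Boltzmann output."""
    acc = 0j
    for i in range(SAMPLES):
        theta = 2 * math.pi * i / SAMPLES      # phase of f0 over one period
        x = amplitude * (math.cos(N1 * theta) + math.cos(N2 * theta))
        acc += boltzmann(x) * cmath.exp(-1j * n_harm * theta)
    return 2 * acc / SAMPLES

dp_small = abs(fourier_component(N_DP, 0.1))   # nearly cubic regime
dp_large = abs(fourier_component(N_DP, 1.0))   # saturating regime
```

For small inputs the stand-in function behaves like x/4 − x³/48, so the component at 2f1 − f2 is close to A³/64 for input amplitude A; at larger inputs the growth falls below this cubic law, which is the compressive behavior a Boltzmann-type limiter is meant to capture.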
3.1 Simulated DPOAEs
Figure 1 examines the effect of the IT on the steady-state and transient responses of the DPOAEs. Figure 1(A) depicts the DPOAE steady-state level for f1 = 2 kHz and f2 = 2.4 kHz, and various combinations of L1 and L2. Figure 1(B) depicts the DPOAE steady-state level for the same stimuli as in Fig. 1(A) but with a simultaneously presented IT with a frequency of 3050 Hz and level of 65 dB SPL. The main difference between the contours in Figs. 1(A) and 1(B) is that the pronounced notch at higher levels [Fig. 1(A)] disappears as a result of adding the IT [Fig. 1(B)]. The IT also affects the optimal L1 and L2 levels, i.e., the L1 which leads to the largest DPOAE level for a given L2. The IT causes the largest DPOAE level to be reached at smaller L1 values than in the absence of the IT. The red cross in Figs. 1(A) and 1(B) indicates the level condition which is studied in the remainder of this paper, i.e., the arbitrarily chosen intensity pair at L1 = 72 dB SPL and L2 = 66 dB SPL in the region of the notch [Fig. 1(A)].
Figure 1(C) depicts the envelope of the DPOAE signal for L1 = 72 dB SPL and L2 = 66 dB SPL. The signal was filtered with the bandpass FIR filter described in Sec. 2.2. If the IT is present (orange line), the envelope rises monotonically at the onset and falls monotonically at the offset. In the absence of the IT (black line), the envelope instead shows pronounced amplitude overshoots (complexities) at the signal onset and offset. The same effect of the IT on the envelope was observed experimentally by Martin et al. (2013).
3.2 Analysis of the primary source
Figure 2 examines the effect of the IT on the steady-state TWs, nonlinear force,² and DPOAE. The stimuli are f1 = 2 kHz, f2 = 2.4 kHz, L1 = 72 dB SPL, and L2 = 66 dB SPL. Responses in the absence of the IT are depicted by black solid lines and in the presence of the IT, with fIT = 3050 Hz and LIT = 65 dB SPL, by orange dashed lines. Figure 2(A) depicts TWs (amplitude and phase) for f1, f2, fDP, and fIT. The f2 TW is slightly suppressed by the IT (at most 1 dB at the f2 place); the f1 TW is practically unaffected.
Figure 2(B) depicts the amplitude and phase of the nonlinear force calculated with the model [Eq. (4)]. The IT strongly suppresses the nonlinear force at the BM place near the peak of the IT TW (x ≈ 1 cm); the suppression extends over a region of approximately 0.32 cm (0.64 oct) basal from the f2 TW peak. As a result, with the IT present, the principal peak of the force amplitude is located at the peak of the f2 TW (x ≈ 1.15 cm).
Figure 2(C) depicts the amplitude and phase of the cumulative integral over the product of the nonlinear force and the base wave [Eq. (2)]. The cumulative integral illuminates interference phenomena between adjacent wavelets at the fDP frequency; the cumulative integration starts at the BM base and proceeds toward the apex. For the most apically located points of the cumulative integral (x > 1.65 cm), where the amplitude and phase do not change with increasing x, the integral is proportional to the numerically calculated DPOAE. The oscillations of the cumulative integral result from interference between DP wavelets and occur around these apical, asymptotically constant values; their beginning is indicated by the arrows in the amplitude panel. With the IT present, Figs. 2(B) and 2(C) indicate that mainly the region near the peak of the nonlinear force (x ≈ 1.15 cm) determines the final DPOAE amplitude and phase, as evidenced by the abrupt amplitude increase [orange arrow in Fig. 2(C)]. The cumulative integral in the more apically placed BM segments (x > 1.15 cm) presents oscillatory behavior (both amplitude and phase) due to mutual interference of DP wavelets. The oscillations near x = 1.6 cm are due to mutual cancellation among wavelets generated near the maximum of the fDP TW; therefore, they do not contribute to the DPOAE signal. In the absence of the IT, the amplitude of the cumulative integral [Fig. 2(C)] first increases for x increasing from 0 to about x′ [Fig. 2(B)] and then decreases for x increasing up to about x″ [Fig. 2(B)]. The wavelets between the base and x′, i.e., Eq. (2) with the integration interval between 0 and x′, yield an amplitude of 105 a.u. and a phase of 91°. The wavelets between x′ and x″ yield an amplitude of 113 a.u. and a phase of −86°, which is about the same amplitude as in the basal region but of nearly opposite phase, implying destructive interference. In general, in the absence of the IT, the oscillations begin [black arrow in Fig. 2(C)] near the basal edge of the maximum region of the nonlinear force.
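The cancellation between the two regions can be checked by adding their contributions as phasors. The sketch below uses only the amplitudes and phases quoted above (105 a.u. at 91° and 113 a.u. at −86°); the variable and function names are hypothetical, and contributions apical to x″ are ignored.

```python
import cmath
import math

def phasor(amplitude, phase_deg):
    """Complex number with the given amplitude and phase in degrees."""
    return cmath.rect(amplitude, math.radians(phase_deg))

basal = phasor(105.0, 91.0)     # wavelets between the base and x'
apical = phasor(113.0, -86.0)   # wavelets between x' and x''

total = basal + apical          # net contribution of both regions
total_amplitude = abs(total)
total_phase_deg = math.degrees(cmath.phase(total))

# Level of the sum relative to the basal contribution alone, in dB
reduction_db = 20 * math.log10(total_amplitude / abs(basal))
```

Because the two phases differ by 177°, the sum comes out close to 10 a.u., roughly 20 dB below either contribution on its own, which is the sense in which the two regions interfere destructively.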
Figure 3 examines the transient TW, nonlinear force, and DPOAE responses in the region of an amplitude notch. The responses are analyzed at four instants: 13, 17, 19, and 30 ms after the onset of the f1 tone. The onset of the f2 tone is at 10 ms. The positions of the sampling instants in the simulated DPOAE signal are indicated in Fig. 3(D). The last analyzed time instant (30 ms) is located in the steady-state part of the DPOAE signal. Figure 3(A) depicts the amplitude and phase of the simulated TWs for the four sampling instants. There is a gradual buildup of the TW amplitudes approximately independent of place.
Figure 3(B) depicts the amplitude and phase of the nonlinear force. Notice that the largest peak of the nonlinear force amplitude (x ≈ 1.05 cm) slightly broadens with increasing time and almost divides into two peaks in the steady-state part of the DPOAE signal.
The cumulative integral is depicted in Fig. 3(C). We can see that for the first sampling instant (13 ms), the cumulative integral reaches most of the final DPOAE amplitude (blue arrow) in the region of the peak of the nonlinear force. The more apically located portions of the nonlinear force only cause relatively small oscillations in the amplitude and phase of the cumulative integral. In other words, in contrast to Martin et al. (2013), we do not see any evidence for basal DP wavelets building up faster than wavelets near the f2 TW peak. After reaching the amplitude maximum in the onset of the DPOAE signal (for time samples at about 17 ms), the phase of the cumulative integral also rolls off in the more apical parts (at x between 1.1 and 1.6 cm), which is a consequence of destructive interference between wavelets leading to relatively small steady-state DPOAE amplitudes. The crosses in Fig. 3(D) depict relative DPOAE amplitudes calculated from the cumulative integral. The black solid line represents the numerically calculated pressure in the basal-most BM segment filtered with the same bandpass FIR filter as described in Sec. 2.2 for the nonlinear force.
In conclusion, this temporal analysis suggests that these transient response overshoots are due to mutual interference of DP wavelets.
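Why a broadening source region can first raise and then lower the net emission may be seen in a toy calculation. The sketch below integrates wavelets with a Gaussian amplitude profile of width sigma and a phase that rotates linearly with position; the Gaussian shape, the fixed peak amplitude, the phase slope of 10 rad/cm, and the function name are all assumptions for illustration and are not taken from the model.

```python
import cmath
import math

def net_amplitude(sigma_cm, k_rad_per_cm, dx=0.001, half_span_cm=3.0):
    """|sum of exp(-x^2 / (2 sigma^2)) * exp(1j * k * x) * dx| over the span."""
    steps = int(round(2 * half_span_cm / dx))
    total = 0j
    for i in range(steps + 1):
        x = -half_span_cm + i * dx
        envelope = math.exp(-x * x / (2 * sigma_cm ** 2))    # source strength
        total += envelope * cmath.exp(1j * k_rad_per_cm * x)  # rotating phase
    return abs(total) * dx

K = 10.0                                  # assumed phase slope in rad/cm
widths = [0.05, 0.1, 0.2, 0.4]            # assumed source widths in cm
amplitudes = [net_amplitude(s, K) for s in widths]
```

In this toy case the net amplitude follows sigma·sqrt(2π)·exp(−K²·sigma²/2): it grows while the source is narrow compared with 1/K and shrinks once the region spans enough phase rotation for wavelets to cancel, which mirrors the overshoot followed by a smaller steady-state value.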
The nonlinear component of DPOAEs simulated by a human cochlear model displayed amplitude notches for the optimal frequency ratio (f2/f1 = 1.2). An IT presented at frequencies above the f2 tone enhanced the DPOAE amplitude in the region of the notches due to suppression of the primary sources of DPOAEs located basally from the peak of the f2 TW. This model-based finding agrees with conclusions of Martin and colleagues based on experimental data in rabbits (Martin et al., 1999, 2013) and humans (Martin et al., 2003). The notch region in our simulations occurred for L1 ≈ L2 + 6 dB with L2 > 60 dB SPL, but experimental data in rabbits showed pronounced notches near L1 ≈ 60–70 dB SPL or L2 ≈ 60–70 dB SPL (Fahey et al., 2008; Whitehead et al., 1992). Fahey et al. (2008) successfully simulated notches in experimental data by a single isolated nonlinearity, which may also display notches (Lukashkin and Russell, 1999). If this isolated nonlinearity explanation were valid in the present simulations, the amplitude of the nonlinear force should have decreased to cause the DPOAE notch. The model suggests the contrary: the IT suppresses the nonlinear force basal to the f2 place [Fig. 2(B)] and thereby causes enhancement of DPOAE amplitude. Differences between the simulations for humans and experimental data in small mammals could be due to narrower tuning of cochlear filters in humans, which was also suggested to be the cause of less pronounced coherent reflection components of DPOAEs in small mammals (see the general discussion in Shera, 2002).
Pronounced reflection components in humans may also cause notches due to destructive interference with nonlinear distortion components (e.g., see Fig. 3 in Zelle et al., 2015). Johannesen and Lopez-Poveda (2010) and Zelle et al. (2017) took precautions to avoid such secondary sources in their experiments. In their data, a notch was shown for L2 ≈ 60 dB SPL and L1 > 60 dB SPL, which is roughly equivalent to the intensity conditions in our simulations; see Figs. 1 and 2 in Johannesen and Lopez-Poveda (2010), and Fig. 5 in Zelle et al. (2017). In contrast, Meinke et al. (2005) showed a shallow notch near L2 ≈ 70–75 dB SPL and L1 ≈ 55 dB SPL.
Near a notch of DPOAE amplitude, the DPOAE signal contains amplitude overshoots (complexities) in the onset and offset (Fig. 1). In the model, these complexities are due to the spatial extent of the DP primary sources, i.e., the spatial extent of the principal peak of the nonlinear force. The peak broadens for increasing time instants in the onset of the DPOAE signal [Fig. 3(B)]. A broader extent leads to more destructive interference due to the slow phase rotation of the wavelets generated along the BM length. Martin et al. (2013) (their Figs. 4B and 4C) showed that the DPOAE onset is delayed by about 1.5 ms due to the IT, but the present simulations indicate equal time onsets. Martin et al. (2013) interpreted the time delay as an indication of the basal source building up before the source near f2. Two-tone suppression between the f1 and f2 TWs can cause complexities and affect onset latency, but such suppression was shown to be most pronounced if the level of the eliciting tone is larger than the level of the other tone (Vencovský et al., 2019). In the present simulations, the f1 TW was unaffected by the increasing amplitude of the f2 TW [Fig. 3(A)]. Therefore, the complexities were not due to two-tone suppression. For equal-level stimuli at high levels (e.g., 75 dB SPL in Martin et al., 2013), the simulated f2 TW suppressed the f1 TW by only about 2 dB and there were negligible onset and offset overshoots (data not shown).
Another source of basal DPOAEs might lie within the organ of Corti whose parts have been shown to vibrate nonlinearly even at frequencies much lower than the CF (Cooper et al., 2018). The present model does not take these observations into account.
In conclusion, the analysis supports the hypothesis that notches in DP-grams or DPOAE input/output functions can be caused by destructive interference among wavelets generated within the primary source region. In the onset and offset of a DPOAE signal, the spatial extent of the generation region is narrower and, consequently, destructive interference is less than in the steady-state, resulting in a smaller steady-state response or equivalently amplitude overshoots (complexities) in the signal envelope.
Supported by the Grant Agency of the Czech Technical University in Prague (Grant No. SGS17/190/OHK3/3T/13), by the German Research Council (Grant Nos. DFG Da 487/3-1,2 and Gu 194/12-1), and by European Regional Development Fund-Project “Centre for Advanced Applied Science” (Grant No. 1183 CZ.02.1.01/0.0/0.0/16_019/0000778). Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated. V.V. is supported by the International Mobility of Researchers in CTU (Grant No. CZ.02.2.69/0.0/0.0/16_027/0008465).
¹ The pressure field is calculated for a rectangular two-dimensional space with one dimension along the BM length and the other dimension given by the height of the cochlear duct (Vetešník and Gummer, 2012).
² For the steady-state condition, the amplitude and phase of the TWs and the nonlinear force were obtained by fast Fourier transform of the model results.