The amplitudes of distortion-product otoacoustic emissions (DPOAEs) may abruptly decrease even though the stimulus level is relatively high. These notches observed in the DPOAE input/output functions or distortion-product grams have been hypothesized to be due to destructive interference between wavelets generated by distributed sources of the nonlinear-distortion component of DPOAEs. In this paper, simulations with a smooth cochlear model and its analytical solution support the hypothesis that destructive interference between individual wavelets may lead to the amplitude notches and explain the cause for onset and offset amplitude overshoots in the DPOAE signal measured for intensity pairs in the notches.

## 1. Introduction

Distortion-products (DPs) generated in the cochlea may be recorded by a microphone placed in the ear canal. DPs evoked by two pure tones of nearby frequencies are called distortion-product otoacoustic emissions (DPOAEs) (Probst *et al.*, 1991). In a previous study (Vencovský *et al.*, 2019), we analyzed how DPs generated over a relatively large portion of the basilar membrane (BM) affect the DPOAE signal if the level of one tone is 50 dB sound pressure level (SPL) and the level of the other tone is varied from 30 to 70 dB SPL. We showed that the DPOAE can be well approximated by a point source on the BM. Theoretical tools presented in Vencovský *et al.* (2019) are used in this paper to study the primary source at large stimulus levels (>60 dB SPL) and with a frequency ratio between the stimuli of *f*_{2}/*f*_{1} = 1.2, which is near the “optimal” ratio for which experiments have shown an abrupt decrease of DPOAE amplitude (e.g., Johannesen and Lopez-Poveda, 2010; Martin *et al.*, 2013; Whitehead *et al.*, 1992). The amplitude decrease caused notches in the DPOAE input/output functions or DP-grams. An interference tone (IT) with frequency greater than *f*_{2} presented simultaneously with the stimuli in some cases enhanced the DPOAE amplitude. Therefore, a possible explanation for the pronounced notches in the DPOAE amplitude is destructive interference between the primary sources (Martin *et al.*, 1999, 2003, 2013). The same mechanism was theoretically identified as causing the amplitude decrease of the DPOAE when the frequency ratio *f*_{2}/*f*_{1} approaches unity (Shera, 2002; Shera and Guinan, 2007; Sisto *et al.*, 2018; van Hengel, 1996). 
Since the analytical solution of the cochlear model presented in Vetešník and Gummer (2012) and Vencovský *et al.* (2019) enables the calculation of the wavelets forming the primary DPOAE source, the analytical solution is used here to test the hypothesis of destructive interference over a broad basal region as postulated by Martin and colleagues (Martin *et al.*, 1999, 2003, 2013).

## 2. Methods

### 2.1 Stimuli

DPOAEs simulated in this paper were generated by two tones of frequencies *f*_{1} = 2 kHz and *f*_{2} = 2.4 kHz (*f*_{2}/*f*_{1} = 1.2), giving a lower side-band cubic DP at *f*_{DP} = 2*f*_{1} − *f*_{2} = 1.6 kHz. The levels *L*_{1} ranged from 40 to 80 dB SPL and *L*_{2} from 30 to 80 dB SPL, in 2-dB steps. To study the onset of the DPOAE signal, the *f*_{2} tone was presented with a 10-ms delay after the onset of the *f*_{1} tone and turned off before the offset of the *f*_{1} tone. The onset and offset of the *f*_{1} and *f*_{2} tones were, respectively, shaped with 4- and 10-ms raised-cosine ramps (similar to Vencovský *et al.*, 2019). The duration of the *f*_{2} tone was 60 ms. To obtain a cubic DPOAE recording without the evoking signals and other DP signals, the starting phases of the stimuli were rotated in successive ensembles (Whitehead *et al.*, 1996). In the first ensemble, the starting phases were 0°. In each subsequent ensemble, the phase of the *f*_{1} tone was changed by 90° and that of the *f*_{2} tone by 180°. To suppress the basal sources of DPOAEs, an IT was added with level *L*_{IT} = 65 dB SPL and frequency *f*_{IT} = 3050 Hz, which is about 1/3 octave above *f*_{2} (the same octave range used in Martin *et al.*, 2013). The IT was added simultaneously with *f*_{1} and was shaped with 4-ms raised-cosine ramps. The phase of the IT was rotated by 45° in successive ensembles, which eliminated the IT from the recorded signal and yielded an otoacoustic emission (OAE) signal in which the DPs at 1.4 and 1.8 kHz (those closest to *f*_{DP}) were about 52 dB smaller than the DP at *f*_{DP}. Signals from eight ensembles were averaged for the three-tone stimulus. Without the IT, only four ensembles needed to be averaged.
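The cancellation achieved by this phase rotation can be checked with a short numerical sketch. The code below is an illustration only and not the implementation used for the simulations: the toy amplitudes, the 48-kHz sampling rate, the 50-ms window, and all function names are assumptions, and each DP is assumed to inherit the phase of its generating combination (e.g., 2*φ*_{1} − *φ*_{2} for the DP at 2*f*_{1} − *f*_{2}).

```python
import cmath
import math

F1, F2 = 2000.0, 2400.0   # primary frequencies from Sec. 2.1 (Hz)
FS, N = 48000, 2400       # assumed sampling rate (Hz) and a 50-ms window


def ensemble(k):
    """Synthetic ear-canal record for ensemble k (toy amplitudes, no cochlea)."""
    p1 = math.radians(90.0 * k)    # f1 starting phase advances by 90 degrees
    p2 = math.radians(180.0 * k)   # f2 starting phase advances by 180 degrees
    record = []
    for n in range(N):
        t = n / FS
        record.append(
            math.cos(2 * math.pi * F1 * t + p1)
            + math.cos(2 * math.pi * F2 * t + p2)
            # assumed DP phases: 2*p1 - p2 (lower cubic DP), 2*p2 - p1 (upper)
            + 0.01 * math.cos(2 * math.pi * (2 * F1 - F2) * t + 2 * p1 - p2)
            + 0.01 * math.cos(2 * math.pi * (2 * F2 - F1) * t + 2 * p2 - p1)
        )
    return record


def averaged(n_ensembles=4):
    """Sample-by-sample average over the phase-rotated ensembles."""
    records = [ensemble(k) for k in range(n_ensembles)]
    return [sum(column) / n_ensembles for column in zip(*records)]


def tone_amplitude(x, f):
    """Amplitude of the component at f (single-bin discrete Fourier sum)."""
    acc = sum(v * cmath.exp(-2j * math.pi * f * n / FS) for n, v in enumerate(x))
    return 2 * abs(acc) / len(x)


avg = averaged()
for f in (1600.0, 2000.0, 2400.0, 2800.0):
    print(f, tone_amplitude(avg, f))
```

In this sketch only the component at 2*f*_{1} − *f*_{2} = 1.6 kHz survives the four-ensemble average, because its phase 2(90°*k*) − 180°*k* is the same in every ensemble, whereas the phases of the primaries and of the upper cubic DP step through a full cycle and sum to zero. A tone rotated in 45° steps needs eight ensembles for the same reason, which is consistent with the eight ensembles averaged for the three-tone stimulus.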

### 2.2 Cochlear model and analytical solution

DPOAEs were simulated by the two-dimensional^{1} box model described in Vencovský *et al.* (2019), which was designed to simulate the human cochlea. The cochlea model is coupled to a middle-ear model (Vencovský *et al.*, 2019), allowing numerical simulation of OAEs as pressure changes in the external auditory canal. To analyze the primary source of the DPOAE, an analytical solution of the model equations described in Vencovský *et al.* (2019) was used. The box model did not contain inhomogeneities (roughness), in order to simulate only the nonlinear-distortion component of DPOAEs (also called the primary component).

In the model, the BM displacement *ξ*(*x*, *t*) (positive for displacement toward *scala vestibuli*) is given by

where every segment of the BM (from the base *x* = 0 to the apex *x* = *L*) is represented by its mass *m*(*x*), stiffness *k*(*x*), damping *h*(*x*), and shearing resistance *∂*_{x}*s*(*x*)*∂*_{x} (*∂*_{t} and *∂*_{x} are partial derivatives with respect to *t* and *x*, respectively). The right-hand side of Eq. (1) contains forces accounting for: (1) the fluid coupling between the stapes displacement *σ*(*t*) (positive for displacement into *scala vestibuli*) and the BM displacement, which is modeled by the Green's function *G*_{S}(*x*); (2) fluid coupling between the individual BM segments modeled by the Green's function *G*(*x*, $\bar{x}$); and (3) active feedback force *U*(*x*, *t*), which amplifies the BM vibrations and is assumed to be based on electromechanical force produced by the outer hair cells. *U*(*x*, *t*) is moderated by the oscillations of another resonance system, tuned to about 0.5 oct below the characteristic frequency (CF) of the BM, assumed to be formed by the tectorial membrane connected to outer hair cell stereocilia (Nobili and Mammano, 1996). The force *U*(*x*, *t*) is limited at larger vibrations of the BM by a Boltzmann function (Vencovský *et al.*, 2019) simulating mechanoelectrical transduction in the stereocilia of the outer hair cells. The Boltzmann function causes the simulated BM displacement to be nonlinear at stimulus levels greater than about 30 dB SPL (Vencovský *et al.*, 2019). The same model implementation as in Vencovský *et al.* (2019) was used: a Matlab implementation with time-domain equations solved numerically by the Dormand-Prince method, which is a member of the Runge-Kutta family of ODE solvers. The sampling frequency was 600 kHz and the model was discretized in the spatial domain with 800 equal-length segments. Testing with up to 1200 segments, we have shown that 800 segments are sufficient to avoid reflection components resulting from the discretization (Vencovský *et al.*, 2019).
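How a saturating transduction function produces a cubic DP can be illustrated with a toy calculation. The sketch below is not the cochlear model: it passes two tones through a memoryless, symmetric first-order Boltzmann function (an assumed stand-in for the function and parameters given in Vencovský *et al.*, 2019) and measures the output component at 2*f*_{1} − *f*_{2}. The sampling rate, window length, drive amplitudes, and function names are likewise assumptions.

```python
import cmath
import math

FS, N = 48000, 2400       # assumed sampling rate (Hz) and a 50-ms window
F1, F2 = 2000.0, 2400.0   # primary frequencies from Sec. 2.1 (Hz)


def boltzmann(y):
    """Symmetric first-order Boltzmann function (assumed stand-in)."""
    return 1.0 / (1.0 + math.exp(-y)) - 0.5


def component(x, f):
    """Amplitude of the component at f (single-bin discrete Fourier sum)."""
    acc = sum(v * cmath.exp(-2j * math.pi * f * n / FS) for n, v in enumerate(x))
    return 2 * abs(acc) / len(x)


def dp_amplitude(a1, a2, nonlinear=True):
    """Amplitude at 2*F1 - F2 for a two-tone drive with amplitudes a1 and a2."""
    drive = [
        a1 * math.cos(2 * math.pi * F1 * n / FS)
        + a2 * math.cos(2 * math.pi * F2 * n / FS)
        for n in range(N)
    ]
    output = [boltzmann(v) for v in drive] if nonlinear else drive
    return component(output, 2 * F1 - F2)


print(dp_amplitude(0.1, 0.1))                   # saturating element: nonzero DP
print(dp_amplitude(0.1, 0.1, nonlinear=False))  # linear element: no DP
```

For small drive amplitudes the Boltzmann function behaves like *y*/4 − *y*³/48, so the DP amplitude in this toy case is close to *a*_{1}²*a*_{2}/64. A single memoryless element of this kind is the "isolated nonlinearity" picture; the model differs in that many such sources are distributed along the BM and their wavelets can interfere.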

If the evoking stimuli are at frequencies which are multiples of a fundamental frequency *f*_{0} (Nobili and Mammano, 1996), such that *f*_{1} = *n*_{1}*f*_{0}, *f*_{2} = *n*_{2}*f*_{0}, and *f*_{DP} = *n*_{DP}*f*_{0}, where *n*_{DP} = 2*n*_{1} − *n*_{2}, then the analytical solution of the model [Eq. (1)] may be approximated by

The reader is referred to Vencovský *et al.* (2019) and Vetešník and Gummer (2012) for details. Briefly, $\hat{K}$ is a complex constant which does not depend on the stimulus level, *x*_{1} and *x*_{2} define the range of positions along the BM where the primary DPOAE sources are located [the BM part where the traveling waves (TWs) of evoking stimuli overlap (Vencovský *et al.*, 2019)], $\hat{\xi}_{n_{\mathrm{DP}}}^{(2)}(x)$ is the solution of the homogeneous equation associated with Eq. (1) and represents the forward-going (base) wave, and $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$ is the nonlinear force. The forward-going base wave with amplitude $A_{n_{\mathrm{DP}}}^{F}(x)$ and phase $\phi_{n_{\mathrm{DP}}}^{F}(x)$ can be approximated by

where $\hat{b}$ is the complex constant determined by the boundary condition at the apex. In this paper, the forward-going base wave is approximated numerically by the model solution for the stapedial input at 10 dB SPL (Vencovský *et al.*, 2019). The nonlinear force $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$ is the component of the Fourier decomposition of the nonlinear part of *U*(*x*, *t*) at *f*_{DP} = 2*f*_{1} − *f*_{2}, described in Vencovský *et al.* (2019). The nonlinear part of *U*(*x*, *t*), denoted by *U*^{NL}(*x*, *t*), was calculated numerically in the time domain as

where *η*(*x*, *t*) is the shearing displacement between the reticular lamina and tectorial membrane, *u*(*x*) is a function which sets the desired cochlear gain in different BM segments, and *S*[*aη*(*x*, *t*)] is the Boltzmann function; *a* = 1 m^{−1} is a constant. To obtain $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$, $U^{\mathrm{NL}}(x,t)$ was filtered by a window-based finite impulse response (FIR) bandpass filter centered at *f*_{DP}. The filter of order 800 with coefficients computed using a Hamming window was applied as a zero-phase filter by the function *filtfilt* in Matlab. The 6-dB bandwidth of the filter was 140 Hz and the signal was downsampled to 102.4 kHz before filtering (Vencovský *et al.*, 2019). $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$ was calculated with the Hilbert transform.
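For orientation, the structure of the approximation in Eq. (2) implied by the definitions in this section can be sketched as follows. This is an assumed form assembled from the verbal description (a level-independent constant multiplying the integral of the product of the nonlinear force and the base wave over the source region); the left-hand symbol $\hat{p}_{\mathrm{DP}}$ and the name $I(x)$ for the cumulative integral are introduced here for illustration only, and Vencovský *et al.* (2019) remains the authoritative statement.

```latex
% Assumed structure of Eq. (2): DPOAE as a weighted sum of wavelets
\hat{p}_{\mathrm{DP}} \;\approx\; \hat{K} \int_{x_1}^{x_2}
    \hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)\,
    \hat{\xi}_{n_{\mathrm{DP}}}^{(2)}(x)\,\mathrm{d}x ,
\qquad
% Cumulative integral from the base (x = 0) up to position x
I(x) \;=\; \int_{0}^{x}
    \hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(\bar{x})\,
    \hat{\xi}_{n_{\mathrm{DP}}}^{(2)}(\bar{x})\,\mathrm{d}\bar{x} .
```

In this reading, each position contributes a wavelet whose phase is the sum of the phases of the nonlinear force and the base wave, so the magnitude of $I(x)$ grows where neighboring wavelets are in phase and shrinks where they are in antiphase.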

## 3. Results

### 3.1 Simulated DPOAEs

Figure 1 examines the effect of the IT on the steady-state and transient responses of the DPOAEs. Figure 1(A) depicts the DPOAE steady-state level for *f*_{1} = 2 kHz and *f*_{2} = 2.4 kHz, and various combinations of *L*_{1} and *L*_{2}. Figure 1(B) depicts the DPOAE steady-state level for the same stimuli as in Fig. 1(A) but with a simultaneously presented IT with a frequency of 3050 Hz and level of 65 dB SPL. The main difference between the contours in Figs. 1(A) and 1(B) is that the pronounced notch at higher levels [Fig. 1(A)] disappears as a result of adding the IT [Fig. 1(B)]. The IT also affects the optimal *L*_{1} and *L*_{2} levels, i.e., the *L*_{1} which leads to the largest DPOAE level for a given *L*_{2}. The IT causes the largest DPOAE level to be reached at smaller *L*_{1} values than in the absence of the IT. The red cross in Figs. 1(A) and 1(B) indicates the level condition which is studied in the remainder of this paper, i.e., the arbitrarily chosen intensity pair at *L*_{1} = 72 dB SPL and *L*_{2} = 66 dB SPL in the region of the notch [Fig. 1(A)].

Figure 1(C) depicts the envelope of the DPOAE signal for *L*_{1} = 72 dB SPL and *L*_{2} = 66 dB SPL. The signal was filtered with the bandpass FIR filter described in Sec. 2.2. The onset and offset of the DPOAE signal are monotonically increasing if the IT is present (orange line), which contrasts with the case in the absence of the IT (black line) presenting pronounced amplitude overshoots (complexities) in the signal onset and offset. The same effect of the IT on the envelope was observed experimentally by Martin *et al.* (2013).

### 3.2 Analysis of the primary source

Figure 2 examines the effect of the IT on the steady-state TWs, nonlinear force,^{2} and DPOAE. The stimuli are *f*_{1} = 2 kHz, *f*_{2} = 2.4 kHz, *L*_{1} = 72 dB SPL, and *L*_{2} = 66 dB SPL. Responses in the absence of the IT are depicted by black solid lines and in the presence of the IT, with *f*_{IT} = 3050 Hz and *L*_{IT} = 65 dB SPL, by orange dashed lines. Figure 2(A) depicts TWs (amplitude and phase) for *f*_{1}, *f*_{2}, *f*_{DP}, and *f*_{IT}. The *f*_{2} TW is slightly suppressed by the IT (at most 1 dB at the *f*_{2} place); the *f*_{1} TW is practically unaffected.

Figure 2(B) depicts the amplitude and phase of the nonlinear force $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$ calculated with the model [Eq. (4)]. The IT strongly suppresses the nonlinear force at the BM place near the peak of the IT TW (*x* ≈ 1 cm); the suppression extends over a region of approximately 0.32 cm (0.64 oct) basal from the *f*_{2} TW peak. With the IT present, the principal peak of the force amplitude is located at the peak of the *f*_{2} TW (*x* ≈ 1.15 cm).

Figure 2(C) depicts the amplitude and phase of the cumulative integral over the product of the nonlinear force $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$ and the base wave $\hat{\xi}_{n_{\mathrm{DP}}}^{(2)}(x)$ [Eq. (2)]. The cumulative integral illuminates interference phenomena between adjacent wavelets at the *f*_{DP} frequency; the cumulative integration starts at the BM base and proceeds toward the apex. For the most apically located points of the cumulative integral (*x* > 1.65 cm), where the amplitude and phase do not change with increasing *x*, the integral is proportional to the numerically calculated DPOAE. Oscillations of the cumulative integral, which result from interference between DP wavelets, occur around these apical, asymptotically constant values; their beginning is indicated by the arrows in the amplitude panel. In the presence of the IT, Figs. 2(B) and 2(C) indicate that mainly the region near the peak of the nonlinear force (*x* ≈ 1.15 cm) determines the final DPOAE amplitude and phase, as evidenced by the abrupt amplitude increase [orange arrow in Fig. 2(C)]. The cumulative integral in the more apically placed BM segments (*x* > 1.15 cm) presents oscillatory behavior (both amplitude and phase) due to mutual interference of DP wavelets. The oscillations near *x* = 1.6 cm are due to mutual cancellation among wavelets generated near the maximum of the *f*_{DP} TW; therefore, they do not contribute to the DPOAE signal. In the absence of the IT, the amplitude of the cumulative integral [Fig. 2(C)] first increases for *x* increasing from 0 to about $x = x' = 1.05$ cm [Fig. 2(B)] and then decreases for *x* increasing up to about $x = x'' = 1.23$ cm [Fig. 2(B)]. The wavelets between the base and *x*′, i.e., Eq. (2) with the integration interval between 0 and *x*′, yield an amplitude of 105 a.u. and a phase of 91°. The wavelets between *x*′ and *x*″ yield an amplitude of 113 a.u. and a phase of −86°, which is about the same amplitude as in the basal region but of opposite phase, implying destructive interference.
In general, in the absence of the IT, the oscillations begin [black arrow in Fig. 2(C)] near the basal edge of the maximum region of the nonlinear force.
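The degree of cancellation implied by the two quoted contributions can be worked out by adding them as phasors. The sketch below uses only the amplitudes and phases stated above (105 a.u. at 91° and 113 a.u. at −86°); the helper name and the choice to express the result relative to the basal contribution are assumptions made for illustration.

```python
import cmath
import math


def phasor(amplitude, phase_deg):
    """Complex phasor from an amplitude (a.u.) and a phase in degrees."""
    return cmath.rect(amplitude, math.radians(phase_deg))


# Values quoted in the text for the case without the IT.
basal = phasor(105.0, 91.0)     # wavelets between the base and x'
apical = phasor(113.0, -86.0)   # wavelets between x' and x''

resultant = basal + apical
phase_gap_deg = 91.0 - (-86.0)  # 177 degrees, i.e., nearly antiphase
drop_db = 20.0 * math.log10(abs(resultant) / abs(basal))

print(f"phase difference: {phase_gap_deg:.0f} deg")
print(f"resultant amplitude: {abs(resultant):.1f} a.u.")
print(f"change relative to the basal part alone: {drop_db:.1f} dB")
```

The two contributions differ in phase by 177° and sum to about 9.8 a.u., roughly 20 dB below either contribution alone. This covers only the interval from the base to *x*″; wavelets generated apical to *x*″ add further, smaller contributions to the final DPOAE.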

Figure 3 examines the transient TW, nonlinear force, and DPOAE responses in the region of an amplitude notch. The responses are analyzed at 13, 17, 19, and 30 ms after the onset of the *f*_{1} tone. The onset of the *f*_{2} tone is at 10 ms. The positions of the sampling instants in the simulated DPOAE signal are indicated in Fig. 3(D). The last analyzed time instant (30 ms) is located in the steady-state part of the DPOAE signal. Figure 3(A) depicts the amplitude and phase of the simulated TWs for the four sampling instants. There is a gradual buildup of the TW amplitudes approximately independent of place.

Figure 3(B) depicts the amplitude and phase of the nonlinear force $\hat{U}_{n_{\mathrm{DP}}}^{\mathrm{NL}}(x)$. Notice that the largest peak of the nonlinear force amplitude (*x* ≈ 1.05 cm) slightly broadens with increasing time and almost divides into two peaks in the steady-state part of the DPOAE signal.

The cumulative integral is depicted in Fig. 3(C). We can see that for the first sampling instant (13 ms), the cumulative integral reaches most of the final DPOAE amplitude (blue arrow) in the region of the peak of the nonlinear force. The more apically located portions of the nonlinear force only cause relatively small oscillations in the amplitude and phase of the cumulative integral. In other words, in contrast to Martin *et al.* (2013), we do not see any evidence for basal DP wavelets building up faster than wavelets near the *f*_{2} TW peak. After reaching the amplitude maximum in the onset of the DPOAE signal (for time samples at about 17 ms), the phase of the cumulative integral also rolls off in the more apical parts (at *x* between 1.1 and 1.6 cm), which is a consequence of destructive interference between wavelets leading to relatively small steady-state DPOAE amplitudes. The crosses in Fig. 3(D) depict relative DPOAE amplitudes calculated from the cumulative integral. The black solid line represents the numerically calculated pressure in the basal-most BM segment filtered with the same bandpass FIR filter as described in Sec. 2.2 for the nonlinear force.

In conclusion, this temporal analysis suggests that these transient response overshoots are due to mutual interference of DP wavelets.

## 4. Discussion

The nonlinear component of DPOAEs simulated by a human cochlear model displayed amplitude notches for the optimal frequency ratio (*f*_{2}/*f*_{1} = 1.2). An IT presented at frequencies above the *f*_{2} tone enhanced the DPOAE amplitude in the region of the notches due to suppression of the primary sources of DPOAEs located basally from the peak of the *f*_{2} TW. This model-based finding agrees with conclusions of Martin and colleagues based on experimental data in rabbits (Martin *et al.*, 1999, 2013) and humans (Martin *et al.*, 2003). The notch region in our simulations occurred for *L*_{1} ≈ *L*_{2} + 6 dB with *L*_{2} > 60 dB SPL, but experimental data in rabbits showed pronounced notches near *L*_{1} ≈ 60–70 dB SPL or *L*_{2} ≈ 60–70 dB SPL (Fahey *et al.*, 2008; Whitehead *et al.*, 1992). Fahey *et al.* (2008) successfully simulated notches in experimental data by a single isolated nonlinearity, which may also display notches (Lukashkin and Russell, 1999). If this isolated nonlinearity explanation were valid in the present simulations, the amplitude of the nonlinear force should have decreased to cause the DPOAE notch. The model suggests the contrary: the IT suppresses the nonlinear force basal to the *f*_{2} place [Fig. 2(B)] and thereby causes enhancement of DPOAE amplitude. Differences between the simulations for humans and experimental data in small mammals could be due to narrower tuning of cochlear filters in humans, which was also suggested to be the cause of less pronounced coherent reflection components of DPOAEs in small mammals (see the general discussion in Shera, 2002).

Pronounced reflection components in humans may also cause notches due to destructive interference with nonlinear distortion components (e.g., see Fig. 3 in Zelle *et al.*, 2015). Johannesen and Lopez-Poveda (2010) and Zelle *et al.* (2017) took precautions to avoid such secondary sources in their experiments. In their data, a notch was shown for *L*_{2} ≈ 60 dB SPL and *L*_{1} > 60 dB SPL, which is roughly equivalent to the intensity conditions in our simulations; see Figs. 1 and 2 in Johannesen and Lopez-Poveda (2010), and Fig. 5 in Zelle *et al.* (2017). In contrast, Meinke *et al.* (2005) showed a shallow notch near *L*_{2} ≈ 70–75 dB SPL and *L*_{1} ≈ 55 dB SPL.

Near a notch of DPOAE amplitude, the DPOAE signal contains amplitude overshoots (complexities) in the onset and offset (Fig. 1). In the model, these complexities are due to the spatial extent of the DP primary sources, i.e., the spatial extent of the principal peak of the nonlinear force. The peak broadens for increasing time instants in the onset of the DPOAE signal [Fig. 3(B)]. A broader extent leads to more destructive interference due to the slow phase rotation of the wavelets generated along the BM length. Martin *et al.* (2013) (their Figs. 4B and 4C) showed that the DPOAE onset is delayed by about 1.5 ms due to the IT, but the present simulations indicate equal onset times. Martin *et al.* (2013) interpreted the time delay as an indication of the basal source building up before the source near *f*_{2}. Two-tone suppression between the *f*_{1} and *f*_{2} TWs can cause complexities and affect onset latency, but such suppression was shown to be most pronounced if the level of the eliciting tone is larger than the level of the other tone (Vencovský *et al.*, 2019). In the present simulations, the *f*_{1} TW was unaffected by the increasing amplitude of the *f*_{2} TW [Fig. 3(A)]. Therefore, the complexities were not due to two-tone suppression. For high, equal-level stimuli (e.g., 75 dB SPL in Martin *et al.*, 2013), the simulated *f*_{2} TW suppressed the *f*_{1} TW by only about 2 dB and there were negligible onset and offset overshoots (data not shown).
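The link between source extent and cancellation can be made concrete with a toy calculation. The sketch below is not derived from the model: it assumes a Gaussian source-strength profile centered at *x* = 1.15 cm and a constant phase slope of the wavelets along the BM (15 rad/cm, a hypothetical value), and compares the magnitude of the summed wavelets with the sum of their magnitudes. All names and parameter values are assumptions.

```python
import cmath
import math


def coherence(width_cm, phase_slope_rad_per_cm, x0=1.15, n=4001, span=1.0):
    """|sum of wavelets| / sum of |wavelets| (1 means fully in phase).

    The source strength is an assumed Gaussian of standard deviation
    width_cm centered at x0; the wavelet phase is assumed linear in x.
    """
    dx = 2.0 * span / (n - 1)
    total = 0j
    magnitude_sum = 0.0
    for i in range(n):
        x = x0 - span + i * dx
        amp = math.exp(-0.5 * ((x - x0) / width_cm) ** 2)
        total += amp * cmath.exp(1j * phase_slope_rad_per_cm * x) * dx
        magnitude_sum += amp * dx
    return abs(total) / magnitude_sum


narrow = coherence(0.05, 15.0)   # compact source, as early in the onset
broad = coherence(0.10, 15.0)    # broader source, as in the steady state
print(f"narrow source: {narrow:.2f}, broad source: {broad:.2f}")
```

Under these assumptions the ratio follows exp(−*κ*²*w*²/2) for phase slope *κ* and width *w*, so doubling the width of the source region reduces the coherent sum substantially even though the total source strength increases. This is the qualitative mechanism invoked above for the overshoot: the narrower onset source suffers less cancellation than the broader steady-state source.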

Another source of basal DPOAEs might lie within the organ of Corti whose parts have been shown to vibrate nonlinearly even at frequencies much lower than the CF (Cooper *et al.*, 2018). The present model does not take these observations into account.

In conclusion, the analysis supports the hypothesis that notches in DP-grams or DPOAE input/output functions can be caused by destructive interference among wavelets generated within the primary source region. In the onset and offset of a DPOAE signal, the spatial extent of the generation region is narrower and, consequently, destructive interference is less than in the steady-state, resulting in a smaller steady-state response or equivalently amplitude overshoots (complexities) in the signal envelope.

## Acknowledgments

Supported by the Grant Agency of the Czech Technical University in Prague (Grant No. SGS17/190/OHK3/3T/13), by the German Research Council (Grant Nos. DFG Da 487/3-1,2 and Gu 194/12-1), and by European Regional Development Fund-Project “Centre for Advanced Applied Science” (Grant No. 1183 CZ.02.1.01/0.0/0.0/16_019/0000778). Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated. V.V. is supported by the International Mobility of Researchers in CTU (Grant No. CZ.02.2.69/0.0/0.0/16_027/0008465).

^{1} The pressure field is calculated for a rectangular two-dimensional space with one dimension along the BM length and the other dimension given by the height of the cochlear duct (Vetešník and Gummer, 2012).

^{2} For the steady-state condition, the amplitude and phase of the TWs and the nonlinear force were obtained by fast Fourier transform of the model results.
