This paper presents an efficient method to compute the numerical solutions of transmission-line (TL) cochlear models, and its application on the model of Verhulst *et al.* The stability region of the model is extended by adopting a variable step numerical method to solve the system of ordinary differential equations that describes it, and by adopting an adaptive scheme to take in account variations in the system status within each numerical step. The presented method leads to improve simulations numerical accuracy and large computational savings, leading to employ TL models for more extensive simulations than currently possible.

## 1. Introduction

The cochlea is an active and nonlinear system, turning accurate simulation of its response to a sound stimulus evaluated in terms of basilar-membrane (BM) displacement and velocity into a non-trivial computational problem.

One common approach is to discretize the space along the BM length and describe the system in terms of coupled mass-spring-damper elements. The resulting system is called a transmission-line (TL) cochlear model. In the case of nonlinear active cochlear models, the numerical solution can be computed by solving a differential equations system using numerical methods in the time domain. In particular, a popular strategy is the method proposed by Diependaal *et al.* (1987) which consists in discretizing the partial differential equation describing the system's behavior in space, such that a system of ordinary differential equations (ODEs) in the time variable is obtained. The numerical solution can be found using an explicit numerical method to integrate the ODEs in sequential time steps, imposing linear behavior during each integration step, as in the models proposed by Van Hengel (1996), Epp *et al.* (2010), and Verhulst *et al.* (2012).

To make the computation efficient, the adoption of a variable step-size integration method has been proposed by Diependaal *et al.* (1987). However, if the stability of the model relies on the system's status update rate (which corresponds to the simulation sample-rate), variable stepsize methods might not effectively speed up simulations since the maximum integration time step allowed is bound by stability requirements. This aspect is particularly true for functional cochlear models, such as those proposed by Zweig (1991), Shera (2001), and Verhulst *et al.* (2012), where active processes in the cochlea are represented by the action of coupled, unstable oscillators, and where stability is guaranteed by feedback-delay lines. Additionally, the imposition of linearity within each integration step introduces a strong dependency between the sample rate and accuracy: The higher the sample rate, the lower the error introduced by the linear approximation within each integration step.

There are existing mathematical descriptions of TL models equivalent to the one proposed by Diependaal *et al.* (1987), with the state space model being one of them. The state space model, i.e., the formulation of the cochlear dynamics through a set of coupled, first-order differential equations, can be applied to a large family of cochlear models (Elliott *et al.*, 2007), and it has been used by Bertaccini and Sisto (2011) in conjunction with a hybrid solver for fast numerical simulations. The connection between the method proposed by Diependaal *et al.* (1987) and the state space model has recently been discussed by Rapson *et al.* (2012).

This study presents an efficient and straightforward implementation of the time domain cochlear model by Verhulst *et al.* (2012), which in turn is based on a modified version of the method proposed by Diependaal *et al.* (1987). The new implementation reduces the minimum sample rate required for a stable simulation, improving numerical accuracy via the adoption of an adaptive method that updates the model parameters within each internal step of a Runge-Kutta explicit integration method. This sample rate reduction leads to (1) an effective speed up of simulations, and (2) a more efficient memory usage. It is now possible to store the displacement and velocity vectors for 1000 BM sections at once for arbitrary lengthy simulations, and to run several parallel simulations efficiently on a multi-core processor. This while the current implementation (Verhulst *et al.*, 2012) only allows for simulations that are 250 ms long for 20 BM sections at the same time.

## 2. Numerical solution of the cochlear model

The time-domain, one-dimensional (1-D) cochlear model described by Verhulst *et al.* (2012) represents the cochlea as an array of coupled oscillators embedded in a fluid-filled tube, assuming that the pressure is uniformly distributed in the direction perpendicular to the BM. The BM is discretized in N sections along its length, and their center angular frequency, *ω _{n}*, is determined using the Greenwood map (Greenwood, 1961). The resulting model is the TL system shown in Fig. 1(a), where the input,

*p*, is the sound pressure on the stapes, and

_{in}*p*represents the pressure on the

_{n}*n*th BM section.

The series admittance, $Ysn$, and the parallel admittance, $Ypn$, can be expressed as a function of the complex Laplace transform variable, *s*, as follows:

with $Ms0$ and $Mp0$ being constants. The variables *δ _{n}*(

*t*),

*ρ*(

_{n}*t*), and

*ψ*(

_{n}*t*) control the instantaneous nonlinearities according to the model proposed by Shera (2001), as explained in details by Verhulst

*et al.*(2012). Setting

*δ*(

_{n}*t*) < 0 allows for an active model, and the term $\rho n(t)e\u2212\psi n(t)s$ represents a feedback term, which guarantees the stability of the system (Zweig, 1991). Assuming local symmetry in the variation of the transfer function between neighboring points along the BM, it is possible to reduce the number of independent variables in the transfer function from two to one by replacing s with $j\omega /\omega n$, $\omega \u2208R$ (Zweig, 1991).

By introducing the variables **g** = [*g*_{0}, *g*_{1},…, *g _{N}*] and

**q**= [

*q*

_{0},

*q*

_{1},…,

*q*] (see Diependaal

_{N}*et al.*, 1987), it is then possible to compute the values of velocity,

*v*, and displacement,

*y*, at the time

*t*of each BM section by solving the following system of ODEs over time:

^{1}

where

and **Aq** = g, with **A** being a tridiagonal matrix describing the cochlear macro-mechanics of the model, and *τ _{n}*(

*t*) =

*ψ*(

_{n}*t*)/

*ω*the time delay of the stabilizing feedback term.

_{n}## 3. Constant step-size method

In the implementation proposed by Verhulst *et al.* (2012), the numerical values for the velocity and displacement at the discrete time instants are obtained by solving Eq. (3) over time using the constant step-size, fourth-order, Runge-Kutta explicit method (RK4). In particular, to compute the values for *y _{n}* and

*v*at the time instant

_{n}*t*

_{i}_{+1}the values for

*δ*,

_{n}*ω*,

_{n}*ρ*,

_{n}*τ*, and

_{n}*y*(

_{n}*t*−

*τ*) are computed for the time instant

_{n}*t*. Accordingly,

_{i}**g**and

**q**are evaluated using the RK4 method at the time instants

*t*,

_{i}*t*+

_{i}*T*/2, and

_{s}*t*+

_{i}*T*=

_{s}*t*

_{i}_{+1}, where

*T*is the model sampling period. Since

_{s}*τ*(

_{n}*t*) is generally not a multiple of

*T*, the value for

_{s}*y*(

_{n}*t*−

*τ*(

_{n}*t*)) is estimated using linear interpolation. The fact that

*y*(

_{n}*t*−

*τ*(

_{n}*t*)) is fixed within each integration step has the effect of setting a theoretical lower bound on the minimum sample rate,

*F*, which guarantees stability.

_{s}To perform a simplified stability analysis of this system, *δ*, *ρ*, and *ψ* are considered to be constant over time and along the BM. Equation (2) thus becomes

The solution for the resulting system is stable when the real part of the *Y _{p}* poles is negative. As discussed by Zweig (1991),

*Y*has infinite pairs of complex conjugated poles. In particular, there are two conjugated poles,

_{p}*s*

_{0}and $s0*$, in proximity of

*s*=

*i*, and an infinite number of conjugated poles,

*s*, $sn*$, for which Re(

_{n}*s*) < Re(

_{n}*s*

_{0}). This makes it possible to rewrite the stability condition as Re(

*s*

_{0}) < 0. In Eq. (5), the term

*e*

^{−ψs}corresponds to a delay of

*τ*=

*ψ*/

*ω*seconds. Since the term

_{n}*y*(

_{n}*t*−

*τ*) is updated at sample rate, within the integration step from

*t*to

_{i}*t*

_{i}_{+1}it can be written as

*y*(

_{n}*t*−

*τ*− (

*t*−

*t*)) =

_{i}*y*(

_{n}*t*−

*τ*+

*ϵ*(

*t*)). It follows that

*ψ*varies from

*τω*to (

_{n}*τ*+

*T*)

_{s}*ω*within each integration step, with the consequence of a variation in the values assumed by

_{n}*s*

_{0}, which determines the stability of the system. The variation in the

*s*

_{0}value depends on the center frequency of the BM section being considered and on the sampling period,

*T*, resulting in largest variations for the cochlear section with the highest center frequency. We can thus focus our stability analysis on the section with the highest center frequency in the model (i.e., 20.4 kHz).

_{s}Figure 1(b) shows the real part of *s*_{0} for the cochlear section with center frequency at 20.4 kHz as a function of the time error *ϵ*(*t*/*T _{s}*) in samples for different

*F*. It can be seen that the system is ideally stable when

_{s}*F*= 300 kHz, since

_{s}*s*

_{0}remains negative during the integration step in this case. However, since a real situation is far more complex than the simplified representation in Eq. (5), due to the introduction of level-dependent nonlinearities, discretization, linear interpolation and RK4, the minimum

*F*empirically found for a stable model solution is 400 kHz. This value is adopted in the implementations by Van Hengel (1996), Epp

_{s}*et al.*(2010), and Verhulst

*et al.*(2012).

## 4. Adaptive method

To achieve a faster numerical solution for the model while keeping the system stable, several aspects need to be considered. The principal lower bound for the sample rate of the system is imposed by the variation in the value of the principal pole, *s*_{0}, within each integration step. Additionally, there are at least three other factors that set a lower bound for the sample rate of the system. The first is that numerical precision and stability of the RK4 method both depend on the selected integration step. Second, a certain oversampling is needed to (1) ensure that the estimation of the feedback term in Eq. (4) remains accurate, and (2) reduce the effect of aliasing introduced by non-linearities. The third factor is that controlling nonlinearities at sample rate can cause sharp changes in the transfer function Eq. (2), affecting the stability of the model (Elliott *et al.*, 2007).

The technique proposed here reduces the number of intermediate steps required by the integration method by using the variable step-size Runge-Kutta RK4(5) solver by Dormand and Prince (1980), as implemented in the Python library integrate.ode, which is included in the SciPy package. It extends the stability region of the RK4, requiring only two additional evaluations of the system per integration step, and it dynamically adjusts the step-size to minimize the number of iterations required to obtain a solution that guarantees that the estimated error remains within a certain tolerance. In our implementation, the maximum integration step allowed was set equal to the selected sampling period, to obtain the displacement and velocity values of the BM sections at the selected rate. This is also necessary for practical reasons, e.g., to efficiently compute the feedback delay value in Eq. (4).

A large reduction of the overall sample rate is necessary to speed up the model calculations while introducing extra evaluation steps into the ODE solver. To allow for this reduction, the feedback terms in Eq. (4) and the input signal, *p _{in}*, must be estimated for each internal RK4(5) step to ensure the stability of the system. Because a reduction in the sample rate implies a reduction in the accuracy of the linear interpolation used to estimate those, the Catmull-Rom cubic spline interpolation method (Catmull and Rom, 1974) was selected to replace linear interpolation. The Catmull-Rom cubic spline is a local interpolating cubic spline that is particularly attractive because it provides smoothness and continuities at a relatively low computational cost (DeRose and Barsky, 1988).

To achieve the best possible accuracy in the simulations, the computation of the parameters that control the instantaneous locations of the poles of *Y _{p}*, and therefore the cochlear model non-linearities,

*δ*,

*ρ*, and

*ψ*, should also be evaluated for each step of the RK4(5) method. Unfortunately, computing the abovementioned parameters is time consuming, as it requires about the same number of operations as those needed to compute

**g**and

**q**. In addition, updating the time delay in Eq. (2) has the consequence of increasing the time to compute the feedback terms in Eq. (4), since the data points on which the interpolation is performed must be changed accordingly.

In our approach, we decided to compromise by deciding whether or not to update the values of the poles depending on the variation in the status of the system. This can be done by computing the variation of the principal pole, *s*_{0}, for each section: As soon as the variation in the location of the principal poles for one section is above a certain tolerance, *ϵ _{s}*, the values for

*δ*,

*ψ*, and

*ρ*are updated for all sections. This approach saves computational time when the system behavior is nearly linear (e.g., when the velocity values are close to zero), and makes it possible to react rapidly to small changes in the status of the system, thereby ensuring smooth transitions in the transfer function

*Y*. Furthermore, this method is independent of the mechanism used to modify the locations of the poles and can be used to compute efficiently the numerical solutions for different nonlinear models.

_{p}## 5. Results

The performance of this model relies on four parameters: The relative and absolute tolerance per integration step, *ϵ _{r}* and

*ϵ*in the RK4(5) method, the tolerance on the principal pole variation,

_{a}*ϵ*, and the system

_{s}*F*. To allow for a direct comparison, the accuracy of the model for different values of the above parameters was evaluated through an estimated error that represents the numerical difference between the velocity solutions of the considered case and a reference case, using 100 ms of frozen white noise at 60 dB sound pressure level (SPL) as an input signal. The outcome of the variable step-size method with a sampling frequency of 800 kHz,

_{s}*ϵ*= 10

_{r}^{−6},

*ϵ*= 10

_{a}^{−16}, and

*ϵ*= 0 (in order to force the displacement of the poles for every step) was chosen as the reference. To present a meaningful comparison, two different accuracy measures are used. (1) The

_{s}*relative*accuracy of every BM section is determined by the root-mean-square (RMS) signal to noise ratio

and (2) the *absolute* accuracy is calculated as the average SNR across the BM sections: SNR_{av} = $1N\u2211nSNR(n)$, where $\upsilon \xaf$ represents the BM velocity solution for the reference model with *F _{s}* = 800 kHz,

*T*the duration of the input signal, and

*t*the time instants for which the numerical solution is computed in the considered case.

_{i}Figure 1(c) plots the estimated SNR against the center frequency of the BM sections for the constant step-size method using *F _{s}* = 400 kHz and for the adaptive method using

*ϵ*= 10

_{r}^{−2},

*ϵ*= 10

_{a}^{−13},

*ϵ*= 10

_{s}^{−2}and different

*F*. The above parameters were determined empirically to give meaningful results in terms of the trade-off between accuracy and efficiency. The adaptive method produced more accurate results under all conditions for sections with center frequencies above approximatively 500 Hz. At lower frequencies, the SNR mainly depends on

_{s}*F*, and it is relatively high in all cases. This result was expected, since the magnitude of the pole displacement effect discussed in Sec. 3 is directly proportional to the frequency of the BM section. Furthermore, the values of the parameters that control the nonlinearity vary slowly in this region, so similar outcomes in terms of accuracy between the two methods were expected.

_{s}Table 1 summarizes the outcomes of the computations for the adaptive and constant stepsize methods, where the CPU time indicates the time needed by the processor to solve and store the results for *N* = 1000 BM sections. Both methods were implemented in Python and C, and tested on a MacBook Pro 11.3 computer armed with a 2.6 GHz, quad-core Intel^{®} Core I7 processor. The table shows that in all considered cases the adaptive method introduces a lower error than the constant step-size method with *F _{s}* = 400 kHz. In particular, it is possible to solve the model in about the same time using the two methods, with the adaptive one increasing the average SNR of 25 dB. Furthermore, it is possible to speed up the simulations by a factor of 1.5 while achieving a 16 dB error reduction. Surprisingly, the number of steps required to compute the numerical solution is higher with

*F*= 80 kHz than with

_{s}*F*= 100 kHz. This is probably because the RK4(5) estimated error is frequently above the tolerance due to the reduced accuracy of the Catmull-Rom interpolation for BM sections with a high center frequency.

_{s}Method . | F (kHz)
. _{s} | CPU Time (s) . | Integration steps (× 10^{3})
. | SNR_{av} (dB)
. |
---|---|---|---|---|

Constant step-size | 400 | 35.05 | 160 | 20.2 |

Adaptive | 400 | 76.03 | 280 | 57.8 |

Adaptive | 200 | 35.60 | 140 | 47.0 |

Adaptive | 100 | 19.3 | 75 | 36.2 |

Adaptive | 80 | 25.32 | 104 | 32.2 |

Adaptive | 50 | Numerically Unstable |

Method . | F (kHz)
. _{s} | CPU Time (s) . | Integration steps (× 10^{3})
. | SNR_{av} (dB)
. |
---|---|---|---|---|

Constant step-size | 400 | 35.05 | 160 | 20.2 |

Adaptive | 400 | 76.03 | 280 | 57.8 |

Adaptive | 200 | 35.60 | 140 | 47.0 |

Adaptive | 100 | 19.3 | 75 | 36.2 |

Adaptive | 80 | 25.32 | 104 | 32.2 |

Adaptive | 50 | Numerically Unstable |

Adopting the value *F _{s}* = 100 kHz leads to a drastic reduction of the memory space required per simulation, e.g., by reducing of a factor 4 the space needed to store the displacement values to compute the feedback term in Eq. (4). This allows to store the velocity and displacement values for all BM sections at once efficiently, and to run several parallel simulations on a multicore processor, reducing the time per simulation of a factor proportional to the number of available CPUs.

The time required for a single simulation with the proposed method depends on the stimulus type and level, since both the RK4(5) step-size and the method to update the nonlinear parameters are adaptive. In Fig. 1(d) the reduction of time for a simulation of the adaptive method respect to constant-step size one, is plotted as a function of the stimulus level for 100 ms tone bursts at 1, 4, and 8 kHz, and 80 *μ*s condensation clicks (80 ms response), adopting *F _{s}* = 100 kHz. The speedup factor for the tone bursts is independent on the stimulus frequency. It decreases with the SPL almost linearly from 30 to 60 dB and remains almost constant around the value 1.3 up to 100 dB. The proposed method is slower (speedup of about 0.9) than the constant-step size method for stimulus levels beyond 100 dB. In fact, for very high stimulation levels, BM sections distant from the stimulus frequency location result strongly excited, as shown by the rms level of the velocity of the BM sections in Fig. 1(e) for the 1 kHz tone burst, triggering frequently the update of the nonlinear parameters and requiring small integration step-size in order to guarantee that the estimated errors remain within the prescribed tolerance interval. Note that it is possible to speed up the simulations while introducing similar numerical errors as with the constant-step size method by relaxing

*ϵ*,

_{r}*ϵ*, and

_{a}*ϵ*.

_{s}The speedup factors for the click responses are always larger than 2.4, in fact, the responses of most cochlear sections are shorter than the stimulus duration, as shown in Fig. 1(f) for the 1 kHz BM section, allowing the adaptive method to set a large step-size for most of the simulation time.

## 6. Conclusions

An adaptive method to solve numerically 1-D nonlinear and active TL cochlear model was presented. The method applied to the model proposed by Verhulst *et al.* (2012) was found to result in improvements in both numerical accuracy and computational efficiency compared to the previously used constant step-size approach. The decrease in the minimum sample rate required for stability allows for a drastic reduction in the amount of memory space required by the CPU for a simulation, allowing to run an arbitrary lengthy simulation while storing the displacement and velocity values for all cochlear sections at once, and to efficiently run several parallel simulations on a multi-core processor.

This new method makes it possible to employ TL cochlear models for more extensive simulations than currently possible, for example, as peripheral stage in complex functional auditory models (e.g., Santurette *et al.*, 2012; Verhulst *et al.*, 2013; Takanen *et al.*, 2014). An open source implementation of the method proposed here, along with a MATLAB interface, is available as part of the Auditory Model Toolbox (Søndergaard and Majdak, 2013).

## Acknowledgments

The authors thank Professor Christopher A. Shera for the helpful discussions when developing the model, and Piotr Majdak for his valuable help in integrating it into the Auditory Model Toolbox. The work of the first and third authors was supported by the Academy of Finland, the Aalto ELEC doctoral school and by the European Research Council under the European Community's Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement No. 240453. The work of the second author was supported by the DFG Cluster of Excellence EXC 1077/1 “Hearing4all.”

^{1}

For a complete mathematical derivation, see Verhulst (2010): Appendix B.