A fast numerical time-domain solution for a one-dimensional cochlear transmission-line model is proposed for real-time applications. The three-dimensional solver developed by Murakami [J. Acoust. Soc. Am. 150(4), 2589–2599 (2021)] was modified to obtain a solution for the one-dimensional model, which allows cochlear responses to be calculated accurately and quickly. The present solution can solve the model in real time under coarse-grid conditions. Under fine-grid conditions, the computation time is still significantly longer than the duration of the signal. Nevertheless, calculations under fine-grid conditions, which previously required much more computation time, can now be performed, and this capability is essential for applications.
1. Introduction
In our auditory system, the cochlea functions as a natural time–frequency analyzer, segregating sounds into constituent frequencies and analyzing their temporal evolution. Researchers have developed time–frequency analyzers similar to the cochlea, successfully emulating cochlear processing to electronically analyze sound signals (Baby et al., 2021; Lyon and Mead, 1988; Mandal et al., 2009; Xu et al., 2018; Zwicker, 1986). This approach can potentially improve audio signal-processing systems in two principal domains: preprocessing and hearing aids. The preprocessing aspect could see significant improvements in frequency separation and temporal analysis. The analyzer enhances the performance of preprocessing stages in audio systems where signals are typically readied for further analysis or manipulation. In hearing aids, a bio-inspired analyzer can be used to improve fitting and potentially aid in diagnosing hearing impairments. Mimicking the natural processing of the cochlea can improve the effectiveness of hearing aids. A crucial criterion for this analyzer is real-time operation, ensuring that sound signals are processed promptly without significant delays. In essence, this bio-inspired analyzer aims to harness cochlear strengths for improved audio processing across diverse applications, with real-time functionality serving as a key determinant of its efficacy.
Cochlear models fall into two main categories: transmission-line (TL) models and filter bank models. TL models are grounded in the physical principles of the cochlea, treating it as an equivalent electrical circuit whose components are governed by specific equations. This approach encapsulates the processes involved in the propagation of sound waves through the cochlea and their subsequent conversion into electrical signals, which the brain perceives as sound. Conversely, filter bank models concentrate on the function or output of the cochlea, sidestepping the detailed inner mechanics. These models employ signal processing techniques to mimic how the cochlea filters sounds and segregates them into various frequencies, providing insights into the cochlear ability to achieve frequency selectivity without delving into the underlying causes. Comparisons of cochlear models from both categories reveal similar frequency selectivities but slightly different temporal features (Osses Vecchi et al., 2022; Saremi et al., 2016). Given these differences in temporal characteristics, no single model of the cochlea can be regarded as definitive. While both model types capture certain aspects of cochlear function well, neither fully captures its complexity.
The strength of the TL model is that it is based on the actual physical properties of the cochlea, making it particularly useful for comprehending and simulating hearing loss. Because it is rooted in the physics of the cochlea, the TL model is capable of incorporating the underlying mechanisms associated with otoacoustic emission (OAE) (Liu and Neely, 2010; Verhulst et al., 2012; Wen and Meaud, 2022) and hearing loss (Verhulst et al., 2018). However, solving the TL model incurs high computational costs (Diependaal et al., 1987). To address this challenge, attempts have been made toward hardware implementations (Lyon and Mead, 1988; Mandal et al., 2009; Xu et al., 2018; Zwicker, 1986) and the use of neural network-based estimations (Baby et al., 2021). Despite these advancements, challenges persist in terms of model construction efficiency and the complexity of reconstructing these models, especially when considering variations in model parameters that correspond to different degrees of hearing loss. Addressing these challenges often requires adjustments to the model equations and computational configurations to accurately reflect the condition being modeled.
Murakami (2021) proposed a fast Fourier transform (FFT)-based solution for rapidly solving the three-dimensional (3D) TL model in the time domain. This software-based approach is adaptable to a range of micromechanical models, offering flexibility in terms of altering parameters and reconstructing models. Building upon this concept, the current study modifies the fast-solving approach so that it applies to the simpler one-dimensional (1D) TL model instead of the 3D TL model. The method proposed in this study is both accurate and fast.
2. Methods
2.1 FFT-based solution of a cochlear transmission-line model
Diependaal et al. (1987) used a matrix product to solve the Poisson equation in Eq. (2). For the 3D model, Murakami (2021) replaced this matrix product with an FFT-based Poisson solver (Schumann and Sweet, 1988). This FFT-based algorithm uses a discrete cosine transform (DCT) whose type depends on the boundary conditions. In this study, we modified the FFT-based method from the 3D model to the 1D model.
The cochlear model includes the mixed boundary conditions described in Eqs. (3) and (4). DCTs are categorized into types I to IV, with the choice depending on the boundary conditions (Schumann and Sweet, 1988). Under mixed boundary conditions, the proposed method employs DCT-III.
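The following minimal sketch illustrates why mixed boundary conditions lead to a cosine basis with half-integer wavenumbers; it is not the implementation used in this study. It assumes a homogeneous Neumann condition at one end, a homogeneous Dirichlet condition at the other, a uniform grid, and a plain second-difference operator. All function names are hypothetical. The cosine sums are written as O(N^2) loops for readability; an FFT-based DCT routine would replace them in practice.

```python
import math


def to_cosine_coeffs(values):
    """DCT-III-type analysis: c_k with values[j] = sum_k c_k*cos(pi*(k+0.5)*j/n)."""
    n = len(values)
    coeffs = []
    for k in range(n):
        theta = math.pi * (k + 0.5) / n
        acc = 0.5 * values[0]  # the j = 0 sample carries half weight
        for j in range(1, n):
            acc += values[j] * math.cos(theta * j)
        coeffs.append(2.0 * acc / n)
    return coeffs


def from_cosine_coeffs(coeffs):
    """DCT-II-type synthesis: the inverse of to_cosine_coeffs."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * math.cos(math.pi * (k + 0.5) * j / n) for k in range(n))
        for j in range(n)
    ]


def second_difference(p):
    """Discrete operator with Neumann at j = 0 and Dirichlet just past j = n - 1."""
    n = len(p)
    out = []
    for j in range(n):
        left = p[j - 1] if j > 0 else p[1]      # mirror ghost node: p[-1] = p[1]
        right = p[j + 1] if j < n - 1 else 0.0  # boundary value: p[n] = 0
        out.append(left - 2.0 * p[j] + right)
    return out


def solve_poisson_mixed(rhs, dx):
    """Solve second_difference(p) / dx**2 = rhs by dividing by the eigenvalues."""
    n = len(rhs)
    coeffs = to_cosine_coeffs(rhs)
    for k in range(n):
        eigenvalue = (2.0 * math.cos(math.pi * (k + 0.5) / n) - 2.0) / dx**2
        coeffs[k] /= eigenvalue
    return from_cosine_coeffs(coeffs)
```

Each basis vector cos(π(k + 1/2)j/N) has zero slope at j = 0 and vanishes at j = N, so it satisfies both boundary conditions and is an eigenvector of the second-difference operator. The solve therefore reduces to one analysis transform, a pointwise division, and one synthesis transform.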
2.2 Specific example
2.3 Numerical procedures
The proposed FFT-based method is a modification of the time-domain solution for the cochlear TL model proposed by Diependaal et al. (1987). It discretizes the model using the finite difference approach described in Sec. 2.1, whereas Diependaal et al. (1987) employed the finite element approach. However, no modification to the numerical procedure was required to solve the cochlear TL model in the time domain.
The numerical procedure is outlined as follows:
The proposed method uses the FFT in step (iii), whereas the original direct method uses a matrix product in that step. This step entails the highest computational cost of all steps. By replacing the matrix operation with an FFT, the computational complexity is reduced from N^2 to N log N operations. This reduction is pivotal for the efficiency of the proposed method. In addition, the direct method incurs a high computational cost in the preprocessing step, which calculates the inverse matrix using the Gaussian elimination method.
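As a rough illustration of this scaling, the sketch below compares leading-order operation counts for the segment numbers used in this study. It ignores constant factors, memory traffic, and library-specific optimizations, so the ratios only indicate the trend and do not predict measured speedups. The function names are hypothetical.

```python
import math


def matvec_ops(n):
    """Leading-order operation count of a dense N x N matrix-vector product."""
    return n * n


def fft_ops(n):
    """Leading-order operation count of an FFT-type transform, N log2 N."""
    return n * math.log2(n)


def op_count_ratio(n):
    """Ratio of the two counts, which simplifies to N / log2 N."""
    return matvec_ops(n) / fft_ops(n)


for n in (256, 512, 1024):
    print(f"N = {n:4d}: N^2 / (N log2 N) = {op_count_ratio(n):.1f}")
```

Under these assumptions the ratio grows from 32 at N = 256 to about 102 at N = 1024, which is why the advantage of the FFT-based step becomes more pronounced on finer grids.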
The computations were performed on a Linux-based personal computer with an Intel Core i9-12900 central processing unit and 64 GB of RAM. The programs were written in the C++ programming language. In step (iii) of the FFT-based algorithm, the FFTW_REDFT10 and FFTW_REDFT01 transforms of the FFTW3 library were used. In step (iii) of the original direct algorithm, the DGEMV routine in OpenBLAS performs the matrix product. In step (v) of the algorithm, we employed the fourth-order Runge–Kutta method for solving ordinary differential equations with a constant time step Δt = 10 μs.
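For readers unfamiliar with the time-stepping scheme, the following is a generic sketch of the classical fourth-order Runge–Kutta method with a constant step of 10 μs. The test system, an undamped 1 kHz oscillator, is an assumption chosen only because its exact solution is known; it is not the cochlear micromechanical model, and the function names are hypothetical.

```python
import math


def rk4_step(f, t, y, dt):
    """Advance the state list y by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [
        yi + dt / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def make_oscillator(omega):
    """Right-hand side of x'' = -omega^2 x written as a first-order system."""
    def rhs(t, y):
        x, v = y
        return [v, -omega * omega * x]
    return rhs


def integrate(f, y0, dt, n_steps):
    """Apply n_steps constant-size RK4 steps starting from t = 0."""
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, dt)
        t += dt
    return y


# One full period of a 1 kHz oscillation takes 100 steps of 10 microseconds.
x_end, v_end = integrate(make_oscillator(2 * math.pi * 1000.0), [1.0, 0.0], 10e-6, 100)
```

After one period the displacement should return close to its initial value of 1, which gives a quick check that the step size resolves a 1 kHz component adequately.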
The values of the parameters were set to fit human data (Greenwood, 1990) based on the values of the original model proposed by Neely and Kim (1986), which was used to reproduce the auditory nerve response in cats.
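For reference, the human frequency-position map of Greenwood (1990) is commonly written as f = A(10^(a x) − k), with A = 165.4 Hz, a = 2.1, and k = 0.88, where x is the fractional distance from the apex. The sketch below only evaluates this target map. How the model parameters were adjusted against it is not described here, and the function name is hypothetical.

```python
def greenwood_frequency_hz(x_from_apex, a_hz=165.4, alpha=2.1, k=0.88):
    """Characteristic frequency at fractional distance x_from_apex (0 = apex, 1 = base).

    The default constants are the commonly quoted human values of Greenwood (1990).
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("x_from_apex must lie in [0, 1]")
    return a_hz * (10.0 ** (alpha * x_from_apex) - k)


for x in (0.0, 0.5, 1.0):
    print(f"x = {x:.1f}: {greenwood_frequency_hz(x):.0f} Hz")
```

With these constants the map spans roughly 20 Hz at the apex to about 20.7 kHz at the base.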
3. Results
3.1 Accuracy
The basilar membrane (BM) responses for pure tones are used to evaluate frequency selectivity in cochlear mechanics. Figure 1 illustrates that the BM responses calculated using the FFT-based method and the direct method used by Diependaal et al. (1987) are similar. As an essential feature, the cochlear model exhibits a peak at a BM location that depends on the input tone frequency. Moreover, at low input levels, the active pressure p_a improves the frequency selectivity and sensitivity of the BM model. This result matches the original simulation conducted by Neely and Kim (1986). Decreased frequency selectivity and sensitivity are observed when the input levels are increased. This phenomenon, known as compressive nonlinearity, has been observed in experimental measurements (Cooper et al., 2018; Fallah et al., 2021).
Input level (dB) | N = 256 | N = 512 | N = 1024 |
---|---|---|---|
0 | 8.20 × 10^-12 | 5.11 × 10^-11 | 5.93 × 10^-10 |
40 | 3.39 × 10^-11 | 4.38 × 10^-11 | 5.20 × 10^-10 |
80 | 4.94 × 10^-12 | 8.91 × 10^-12 | 5.71 × 10^-11 |
3.2 Computational cost
Computational speed is a crucial criterion for evaluating the practicality of the FFT-based solution, which targets real-time computation. As illustrated in Fig. 2, the FFT-based solver is faster and more scalable than the direct solver. Under all conditions, the elapsed times of the FFT-based solution were shorter than those of the direct solution. The elapsed time is shorter than the input signal duration when the number of segments N is 256. The elapsed times increase monotonically with N, approximately on the order of N log N for the FFT-based solution and N^2 for the direct solution. This difference arises from step (iii) of the algorithm, which uses either the FFT or the matrix product depending on the solution. Thus, the FFT-based method can handle settings with a large number of segments.
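The real-time criterion used above can be expressed as the ratio of elapsed time to signal duration. The sketch below shows one way to measure it, assuming the constant time step of 10 μs from Sec. 2.3. The `step` callable is a hypothetical stand-in for one iteration of a solver, and the function names are illustrative only.

```python
import time


def signal_duration_s(n_steps, dt_s=10e-6):
    """Duration of the simulated signal covered by n_steps time steps."""
    return n_steps * dt_s


def real_time_factor(step, n_steps, dt_s=10e-6):
    """Elapsed wall-clock time divided by simulated signal duration.

    A value below 1 means the loop runs faster than real time.
    """
    start = time.perf_counter()
    for _ in range(n_steps):
        step()
    elapsed_s = time.perf_counter() - start
    return elapsed_s / signal_duration_s(n_steps, dt_s)


# Placeholder workload; a real measurement would call the solver's time step.
factor = real_time_factor(lambda: sum(range(100)), n_steps=10_000)
print(f"real-time factor: {factor:.2f}")
```

With a 10 μs step, one second of signal requires 100,000 steps, so real-time operation leaves an average budget of 10 μs of wall-clock time per step.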
4. Discussion
The primary contribution of this study is a fast time-domain solution for the 1D cochlear model that enables real-time computation under coarse-grid conditions. This solution is an extension of the 3D model solution introduced by Murakami (2021). The improved efficiency is primarily due to replacing the matrix multiplication used by Diependaal et al. (1987) with FFT computations, as described in Sec. 2.1. While the FFT has a computational complexity of N log N, matrix multiplication scales as N^2. Theoretically, the FFT-based method produces outputs equivalent to those of the direct method, which is confirmed in Fig. 1 and Table 1. The FFT-based approach is faster and operates in real time under coarse-grid conditions, as depicted in Fig. 2. However, under normal and fine-grid conditions, the computation time still exceeds the duration of the input signal. Thus, the method has not yet achieved the ideal of real-time computation with time to spare.
The equation for the TL model is the same for all cochlear models. It is derived from Newton's second law and the law of conservation of mass for the fluid flowing through the cochlear duct, as shown in Eq. (1), by assuming an incompressible fluid with no viscosity. The cochlear model is constructed by combining this macro fluid model with a microcochlear model, as shown in Sec. 2.2. The microcochlear model can be a 1DoF oscillating system (Verhulst et al., 2012), the 2DoF model used here, or a more complicated model that includes an electrophysiological mechanism (Liu and Neely, 2010). It is important to emphasize that the proposed method can calculate any cochlear model whose macro model has the TL form shown in Eq. (1).
The software-based method proposed here allows for easy modification of the cochlear model configuration. This straightforward modification is a significant advantage compared to hardware (Lyon and Mead, 1988; Mandal et al., 2009; Xu et al., 2018; Zwicker, 1986) and neural network-based methods (Baby et al., 2021), which can be computationally expensive to retrain for different configurations. Another strong point is the ability to predict OAEs. Predicting OAEs can provide valuable information about the internal state of the cochlea (Liu and Neely, 2010; Verhulst et al., 2012; Wen and Meaud, 2022). Combining these two aspects leads to the possibility of constructing personalized cochlear models.
Supplementary Material
See the supplementary material for the source code for the computer program required for the simulation.
Acknowledgments
This study was supported by Japan Society for the Promotion of Science Kakenhi Grant No. 21K17765.
Author Declarations
Conflict of Interest
The author declares no conflict of interest.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.