We propose a four-stage model to represent auditory response magnitude over time and across frequency channels. The first stage performs spectral estimation by computing power at each of the output frequency channels. Frequency is scaled in basilar membrane distance, and the input is passed through a window whose duration is inversely proportional to the frequency. The next stage has been described earlier [R. V. Shannon, J. Acoust. Soc. Am. Suppl. 1 65, S56 (1979)]; it portrays frequency analysis in the cochlea. This stage includes mechanisms of cochlear filtering (by means of two filter banks, one having sharp and the other having broad filters), nonlinear compression of the input power, and lateral suppression. The third stage models temporal adaptation in the auditory nerve. The final stage is a temporal integrator. Speech sounds analyzed by the model acquire several interesting characteristics: (i) the dynamic range of the input is greatly reduced (<15 dB), (ii) bandpass information (e.g., in formants, fricatives, etc.) is represented as a spectral edge at the low side of the bandpass region, (iii) bursts (especially high-frequency plosives) acquire temporal sharpness, and (iv) individual glottal pulses are clearly visible only at high frequencies. [Work supported by Institut National de la Recherche Scientifique, the Veterans Administration, and grants from the N.I.H.]
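The four stages described above can be sketched in code. The sketch below is a minimal illustration, not the authors' implementation: the channel spacing, window length (about four cycles per channel), compression exponent, suppression weight, and the adaptation and integration time constants are all assumed values chosen only to make the pipeline concrete, and stage 2 here collapses the two filter banks into a single per-channel power estimate.

```python
import numpy as np

def four_stage_model(signal, fs, n_channels=16, f_lo=100.0, f_hi=8000.0,
                     hop_s=0.005):
    """Illustrative sketch of the four-stage auditory model.
    All numeric parameters are assumptions, not the original values."""
    freqs = np.geomspace(f_lo, f_hi, n_channels)   # log spacing as a stand-in
                                                   # for basilar-membrane distance
    hop = int(fs * hop_s)
    longest = int(fs * 4.0 / f_lo)
    n_frames = max((len(signal) - longest) // hop, 1)
    E = np.zeros((n_frames, n_channels))
    # Stage 1: spectral estimation; window duration inversely proportional
    # to channel frequency (~4 cycles per window, an assumed constant).
    for i, f in enumerate(freqs):
        win = max(int(fs * 4.0 / f), 8)
        w = np.hanning(win)
        k = np.argmin(np.abs(np.fft.rfftfreq(win, 1.0 / fs) - f))
        for m in range(n_frames):
            seg = signal[m * hop:m * hop + win]
            if len(seg) < win:
                seg = np.pad(seg, (0, win - len(seg)))
            E[m, i] = np.abs(np.fft.rfft(seg * w)[k]) ** 2
    # Stage 2: nonlinear compression of input power, then lateral
    # suppression by the neighbouring channels.
    C = (E + 1e-12) ** 0.3                         # power-law exponent assumed
    S = C - 0.25 * (np.roll(C, 1, 1) + np.roll(C, -1, 1))
    S[:, 0], S[:, -1] = C[:, 0], C[:, -1]          # edge channels: no neighbours
    S = np.maximum(S, 0.0)
    # Stage 3: temporal adaptation -- subtract a slow running average so
    # onsets are emphasized and sustained responses decay.
    a = np.exp(-hop_s / 0.120)                     # 120-ms constant (assumed)
    avg = np.zeros(n_channels)
    A = np.empty_like(S)
    for m in range(n_frames):
        A[m] = np.maximum(S[m] - 0.7 * avg, 0.0)
        avg = a * avg + (1.0 - a) * S[m]
    # Stage 4: temporal integration -- leaky low-pass along time.
    b = np.exp(-hop_s / 0.010)                     # 10-ms constant (assumed)
    acc = np.zeros(n_channels)
    out = np.empty_like(A)
    for m in range(n_frames):
        acc = b * acc + (1.0 - b) * A[m]
        out[m] = acc
    return freqs, out
```

Run on a pure tone, the output concentrates in the channel nearest the tone frequency, with an onset emphasis from the adaptation stage and reduced dynamic range from the compressive nonlinearity.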
