Detection thresholds for spectral and temporal modulations are measured using broadband spectra with sinusoidally rippled profiles that drift up or down the log-frequency axis at constant velocities. Spectro-temporal modulation transfer functions (MTFs) are derived as a function of ripple peak density (Ω cycles/octave) and drifting velocity (ω Hz). The MTFs are low-pass in both dimensions, with 50% bandwidths of about 16 Hz and 2 cycles/octave. The data replicate, as special cases, previously measured purely temporal MTFs (Ω=0) [Viemeister, J. Acoust. Soc. Am. 66, 1364–1380 (1979)] and purely spectral MTFs (ω=0) [Green, in Auditory Frequency Selectivity (Plenum, Cambridge, 1986), pp. 351–359]. A computational auditory model is presented that exhibits spectro-temporal MTFs consistent with the salient trends in the data. The model is used to demonstrate the potential relevance of these MTFs to the assessment of speech intelligibility in noisy and reverberant conditions.
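As a rough illustration of the stimulus described above, the sketch below computes the envelope of a moving ripple on a log-frequency axis. It is a minimal sketch, not the authors' stimulus code: the sinusoidal form 1 + depth·sin(2π(ωt + Ωx) + phase), the function names, and the default depth, bandwidth, and tone spacing are all assumptions introduced here for illustration.

```python
import math


def ripple_envelope(x_oct, t_sec, omega_hz, Omega_cyc_oct, depth=0.9, phase=0.0):
    """Envelope of a moving ripple at one point (assumed sinusoidal form).

    x_oct         : position on the log-frequency axis, in octaves above the
                    lowest component (hypothetical convention)
    t_sec         : time in seconds
    omega_hz      : drifting velocity (the abstract's omega, in Hz)
    Omega_cyc_oct : ripple peak density (the abstract's Omega, in cycles/octave)
    depth         : modulation depth between 0 and 1 (assumed parameter)
    """
    arg = 2.0 * math.pi * (omega_hz * t_sec + Omega_cyc_oct * x_oct) + phase
    return 1.0 + depth * math.sin(arg)


def ripple_profile(t_sec, omega_hz, Omega_cyc_oct,
                   n_octaves=5.0, tones_per_octave=20, depth=0.9):
    """Sample the envelope across the spectrum at a single instant.

    Returns (positions in octaves, envelope values). The 5-octave span and
    20 tones per octave are illustrative defaults, not values from the paper.
    """
    n = int(n_octaves * tones_per_octave)
    xs = [i / tones_per_octave for i in range(n + 1)]
    env = [ripple_envelope(x, t_sec, omega_hz, Omega_cyc_oct, depth) for x in xs]
    return xs, env


if __name__ == "__main__":
    # Profile of a 1 cycle/octave ripple drifting at 4 Hz, sampled at t = 0.
    xs, env = ripple_profile(0.0, omega_hz=4.0, Omega_cyc_oct=1.0)
    print(len(xs), min(env), max(env))
```

Under this assumed form, setting `Omega_cyc_oct=0` gives a purely temporal modulation that is the same at every frequency, and setting `omega_hz=0` gives a stationary spectral ripple, which correspond to the two special cases named in the abstract. With both parameters positive, the phase ωt + Ωx stays constant only if x decreases over time, so the peaks drift toward lower frequencies; flipping the sign of either parameter reverses the direction.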

1.
Amagai, S., Dooling, R., Shamma, S., Kidd, T., and Lohr, B. (1999). “Perception of rippled spectra in the parakeet and zebra finches,” J. Acoust. Soc. Am. (in press).
2.
Arai, T., Pavel, M., Hermansky, H., and Avendano, C. (1996). “Intelligibility of speech with filtered time trajectories of spectral envelopes,” Proc. ICSLP, pp. 2490–2492.
3.
Clarey, J., Barone, P., and Imig, T. (1992). “Physiology of thalamus and cortex,” in The Mammalian Auditory Pathway: Neurophysiology, edited by D. Webster, A. Popper, and R. Fay (Springer Verlag, New York), pp. 232–334.
4.
De-Valois, R., and De-Valois, K. (1990). Spatial Vision (Oxford U. P., New York).
5.
deCharms, R., Blake, D., and Merzenich, M. (1998). “Optimizing sound features for cortical neurons,” Science 280, 1439.
6.
Dong, D., and Atick, J. (1995). “Statistics of natural time-varying images,” Network Comput. Neural Syst. 6, 345–358.
7.
Drullman, R., Festen, J., and Plomp, R. (1994). “Effect of envelope smearing on speech reception,” J. Acoust. Soc. Am. 95, 1053–1064.
8.
Duda, R., and Hart, P. (1973). Pattern Classification (Wiley–Interscience, New York).
9.
Green, D. (1986). “‘Frequency’ and the detection of spectral shape change,” in Auditory Frequency Selectivity, edited by B. C. J. Moore and R. Patterson (Plenum, Cambridge), pp. 351–359.
10.
Greenberg, S., and Kingsbury, B. (1997). “The modulation spectrogram: In pursuit of an invariant representation of speech,” ICASSP-97, pp. 1647–1650.
11.
Greenberg, S., Hollenback, J., and Ellis, D. (1996). “Insights into spoken language gleaned from phonetic transcription of the switchboard corpus,” in ICSLP-96 Proc. 4th Int. Conf. Spoken Lang. (IEEE, New York), pp. S32–S35.
12.
Haykin, S. (1996). Adaptive Filter Theory (Prentice–Hall, Englewood Cliffs, NJ).
13.
Hermansky, H., and Morgan, N. (1994). “RASTA processing of speech,” IEEE Trans. Speech Audio Process. 2(4), 578–589.
14.
Hillier, D. (1991). “Auditory processing of sinusoidal spectral envelopes,” Ph.D. thesis, The Washington University and Severn Institute.
15.
Houtgast, T., Steeneken, H., and Plomp, R. (1980). “Predicting speech intelligibility in rooms from the modulation transfer function: General room acoustics,” Acustica 46, 60–72.
16.
Kelly, D. H. (1961). “Visual responses to time-dependent stimuli,” J. Opt. Soc. Am. 51, 422–429.
17.
Kowalski, N., Depireux, D., and Shamma, S. (1996). “Analysis of dynamic spectra in ferret primary auditory cortex: Characteristics of single unit responses to moving ripple spectra,” J. Neurophysiol. 76(5), 3503–3523.
18.
Kryter, K. (1962). “Methods for the calculation and use of the articulation index,” J. Acoust. Soc. Am. 34(11), 1689–2147.
19.
Levitt, W. (1971). “Transformed up–down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477.
20.
Shamma, S. (1988). “The acoustic features of speech sounds in a model of auditory processing: Vowels and voiceless fricatives,” J. Phonetics 16, 77–91.
21.
Shamma, S., Chadwick, R., Wilbur, J., Morrish, K., and Rinzel, J. (1986). “A biophysical model of cochlear processing: Intensity dependence of pure tone responses,” J. Acoust. Soc. Am. 80, 133–145.
22.
Shannon, R., Zeng, F.-G., Wygonski, J., Kamath, V., and Ekelid, M. (1995). “Speech recognition with primarily temporal cues,” Science 270, 303–304.
23.
Simon, J., Depireux, D. A., and Shamma, S. A. (1998). “Representation of complex spectra in auditory cortex,” in Psychophysical and Physiological Advances in Hearing. Proceedings of the 11th International Symposium on Hearing, edited by A. R. Palmer, A. Ress, A. Q. Summerfield, and R. Meddis (Whurr, London), pp. 513–520.
24.
van Zanten, G., and Senten, C. (1983). “Spectro-temporal modulation transfer functions (STMTF) for various types of temporal modulation and a peak distance of 200 Hz,” J. Acoust. Soc. Am. 74, 52–62.
25.
Viemeister, N. (1979). “Temporal modulation transfer functions based upon modulation thresholds,” J. Acoust. Soc. Am. 66, 1364–1380.
26.
Wang, K., and Shamma, S. A. (1994). “Self-normalization and noise-robustness in early auditory representations,” IEEE Trans. Speech Audio Process. 2, 421–435.
27.
Wang, K., and Shamma, S. (1995). “Representation of spectral profiles in primary auditory cortex,” IEEE Trans. Speech Audio Process. 3, 382–395.
28.
Yang, X., Wang, K., and Shamma, S. A. (1992). “Auditory representations of acoustic signals,” IEEE Trans. Inf. Theory (Special Issue on Wavelet Transforms and Multiresolution Signal Analysis) 38, 824–839.
29.
Yost, W., and Moore, M. (1987). “Temporal changes in a complex spectral profile,” J. Acoust. Soc. Am. 81, 1896–1905.