Two experiments investigated the effects of critical bandwidth and frequency region on the use of temporal envelope cues for speech. In both experiments, spectral details were reduced using vocoder processing. In experiment 1, consonant identification scores were measured in a condition for which the cutoff frequency of the envelope extractor was half the critical bandwidth (HCB) of the auditory filters centered on each analysis band. Results showed that performance was similar to that obtained in conditions for which the envelope cutoff was set to 160 Hz or above. Experiment 2 evaluated the impact of setting the cutoff frequency of the envelope extractor to values of 4, 8, and 16 Hz or to HCB in one or two contiguous bands of an eight-band vocoder. The cutoff was set to 16 Hz for all the other bands. Overall, consonant identification was not affected by removing envelope fluctuations above 4 Hz in the low- and high-frequency bands. In contrast, speech intelligibility decreased as the cutoff frequency was decreased from 16 to 4 Hz in the mid-frequency region. The behavioral results were fairly consistent with a physical analysis of the stimuli, suggesting that clearly measurable envelope fluctuations cannot be attenuated without affecting speech intelligibility.
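To make the cutoff assignment in experiment 2 concrete, the following is a minimal sketch of how per-band envelope cutoffs might be laid out for an eight-band vocoder. It rests on several assumptions that the abstract does not confirm: the critical bandwidth is approximated here by the equivalent rectangular bandwidth of Glasberg and Moore (1990), the band edges are log-spaced over a hypothetical 100 to 8000 Hz range, and the function names (`erb_hz`, `band_edges`, `envelope_cutoffs`) are illustrative only.

```python
import math


def erb_hz(center_hz: float) -> float:
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990).

    Used here as a stand-in for the critical bandwidth; this choice is an
    assumption, not something stated in the abstract.
    """
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def band_edges(low_hz: float, high_hz: float, n_bands: int) -> list:
    """Log-spaced band edges (assumed; the study's actual edges are not given)."""
    ratio = (high_hz / low_hz) ** (1.0 / n_bands)
    return [low_hz * ratio ** i for i in range(n_bands + 1)]


def envelope_cutoffs(edges, test_bands, test_cutoff_hz=None, default_cutoff_hz=16.0):
    """Return one envelope cutoff (Hz) per analysis band.

    Bands listed in test_bands get test_cutoff_hz, or half the ERB at the
    band's geometric center (the HCB condition) when test_cutoff_hz is None.
    All other bands keep default_cutoff_hz (16 Hz in experiment 2).
    """
    cutoffs = []
    for i in range(len(edges) - 1):
        center = math.sqrt(edges[i] * edges[i + 1])
        if i in test_bands:
            hcb = erb_hz(center) / 2.0
            cutoffs.append(hcb if test_cutoff_hz is None else test_cutoff_hz)
        else:
            cutoffs.append(default_cutoff_hz)
    return cutoffs


edges = band_edges(100.0, 8000.0, 8)  # hypothetical frequency range
# Two contiguous mid-frequency bands limited to 4 Hz, the rest at 16 Hz.
print(envelope_cutoffs(edges, test_bands={3, 4}, test_cutoff_hz=4.0))
# Same two bands in the HCB condition.
print(envelope_cutoffs(edges, test_bands={3, 4}))
```

Under these assumptions, the HCB cutoff grows with band center frequency, so it limits envelope fluctuations far less in high-frequency bands than in low-frequency ones.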
