Vocoder simulation studies have suggested that the type of carrier signal affects the intelligibility of vocoded speech. The present work assessed how carrier signal type interacts with additional signal processing, namely single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoded speech. In Experiment 1, Mandarin sentences corrupted by speech-spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before being tone vocoded (TV) or noise vocoded (NV). In Experiment 2, the dynamic ranges of multiband envelope waveforms were compressed by scaling the mean-removed envelope waveforms with a compression factor before TV or NV processing. With normal-hearing listeners, TV Mandarin sentences yielded higher intelligibility scores than NV sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs 2TB). Envelope dynamic range compression degraded NV speech more than TV speech. These findings suggest that the carrier signal type used in vocoding interacts with the envelope distortion introduced by signal processing.
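The envelope compression in Experiment 2 can be illustrated with a minimal sketch. The abstract states only that the mean-removed envelope is scaled by a compression factor; adding the mean back after scaling, flooring the result at zero, and the names `compress_envelope` and `dynamic_range_db` are assumptions made here for illustration and may differ from the processing actually used in the study.

```python
import math


def compress_envelope(envelope, factor):
    """Scale the mean-removed envelope by `factor`, then add the mean back.

    Assumption: the mean is restored after scaling, so factor = 1.0 leaves
    the envelope unchanged and factor = 0.0 flattens it to its mean.
    Values are floored at zero because an envelope is non-negative.
    """
    if not envelope:
        return []
    mean = sum(envelope) / len(envelope)
    return [max(0.0, mean + factor * (x - mean)) for x in envelope]


def dynamic_range_db(envelope, floor=1e-6):
    """Peak-to-trough ratio in dB; `floor` (an assumed value) avoids log(0)."""
    peak = max(max(envelope), floor)
    trough = max(min(envelope), floor)
    return 20.0 * math.log10(peak / trough)


# One channel's envelope, as a short list of hypothetical samples.
env = [0.0, 1.0, 2.0, 1.0]
print(compress_envelope(env, 0.5))  # [0.5, 1.0, 1.5, 1.0]
```

In this sketch a smaller factor pulls every sample toward the channel mean, so peaks and troughs move closer together and the dynamic range reported by `dynamic_range_db` shrinks.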
