The neural mechanisms underlying the ability of human listeners to recognize speech in the presence of background noise are still imperfectly understood. However, there is mounting evidence that the medial olivocochlear system plays an important role, via efferents that exert a suppressive effect on the response of the basilar membrane. The current paper presents a computer modeling study that investigates the possible role of this activity on speech intelligibility in noise. A model of auditory efferent processing [Ferry, R. T., and Meddis, R. (2007). J. Acoust. Soc. Am. 122, 3519–3526] is used to provide acoustic features for a statistical automatic speech recognition system, thus allowing the effects of efferent activity on speech intelligibility to be quantified. Performance of the “basic” model (without efferent activity) on a connected digit recognition task is good when the speech is uncorrupted by noise but falls when noise is present. However, recognition performance is much improved when efferent activity is applied. Furthermore, optimal performance is obtained when the amount of efferent activity is proportional to the noise level. The results obtained are consistent with the suggestion that efferent suppression causes a “release from adaptation” in the auditory-nerve response to noisy speech, which enhances its intelligibility.
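The idea that the amount of efferent activity should scale with the noise level can be illustrated with a minimal sketch. It is not the cited Ferry and Meddis model: it only maps a noise level in dB to an attenuation in dB and scales a list of samples by the resulting gain. The function names, the `slope` value, and the clipping limits are all hypothetical and chosen purely for illustration.

```python
def efferent_attenuation_db(noise_level_db, slope=0.5, floor_db=0.0, ceiling_db=30.0):
    """Hypothetical rule: attenuation grows in proportion to the noise level.

    slope, floor_db and ceiling_db are illustrative assumptions,
    not values taken from the paper.
    """
    return min(max(slope * noise_level_db, floor_db), ceiling_db)


def apply_attenuation(samples, attenuation_db):
    """Scale samples by the linear gain that corresponds to attenuation_db."""
    gain = 10.0 ** (-attenuation_db / 20.0)  # dB attenuation -> amplitude gain
    return [gain * s for s in samples]


if __name__ == "__main__":
    # Louder noise leads to more suppression under this assumed rule.
    for noise_db in (20.0, 40.0, 60.0):
        att = efferent_attenuation_db(noise_db)
        print(noise_db, att, apply_attenuation([1.0, -0.5], att))
```

In a front end for a recognizer, a rule of this kind would be applied before feature extraction, with the noise level estimated from the input. How that estimate is obtained is outside the scope of this sketch.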

1. Cooke, M. (2006). “A glimpsing model of speech perception in noise,” J. Acoust. Soc. Am. 119, 1562–1573.
2. Cooper, N. P., and Guinan, J. J. (2006). “Medial olivocochlear efferent effects on basilar membrane response to sound,” in Auditory Mechanisms: Processes and Models, edited by A. L. Nuttall, T. Ren, P. G. Gillespie, K. Grosch, and E. de Boer (World Scientific, Singapore), pp. 86–92.
3. Cui, X., and Alwan, A. (2005). “Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR,” IEEE Trans. Speech Audio Process. 13, 1161–1172.
4. Dallos, P. (1992). “The active cochlea,” J. Neurosci. 12, 4575–4585.
5. Dewson, J. H. (1968). “Efferent olivocochlear bundle: Some relationships to stimulus discrimination in noise,” J. Neurophysiol. 31, 122–130.
6. Dolan, D., and Nuttall, A. (1988). “Masked cochlear whole-nerve response intensity functions altered by electrical-stimulation of the crossed olivocochlear bundle,” J. Acoust. Soc. Am. 83, 1081–1086.
7. Dolan, D. F., Guo, M. H., and Nuttall, A. L. (1997). “Frequency-dependent enhancement of basilar membrane velocity during olivocochlear bundle stimulation,” J. Acoust. Soc. Am., 3587–3596.
8. Ferry, R. T. (2008). “Auditory processing and the medial olivocochlear efferent system,” Ph.D. thesis, University of Essex, Colchester, UK.
9. Ferry, R. T., and Meddis, R. (2007). “A computer model of medial efferent suppression in the mammalian auditory system,” J. Acoust. Soc. Am. 122, 3519–3526.
10. Gales, M., and Young, S. (2008). “The application of hidden Markov models in speech recognition,” Foundations and Trends in Signal Processing 1, 195–304.
11. Ghitza, O. (2007). “Using auditory feedback and rhythmicity for diphone discrimination of degraded speech,” in Proceedings of the International Conference on Phonetic Sciences (ICPhS), Saarbrucken, Germany, pp. 163–168.
12. Ghitza, O., Messing, D., Delhorne, L., Braida, L., Bruckert, E., and Shondhi, M. (2007). “Towards predicting consonant confusions of degraded speech,” in Hearing – From Sensory Processing to Perception, edited by B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey (Springer, Berlin), pp. 541–550.
13. Giraud, A., Garnier, S., Micheyl, C., Lina, G., Chays, A., and Chery-Croze, S. (1997). “Auditory efferents involved in speech-in-noise intelligibility,” NeuroReport 8, 1779–1783.
14. Guinan, J. J. (1996). “Physiology of olivocochlear efferents,” in The Cochlea, edited by P. Dallos, A. N. Popper, and R. R. Fay (Springer-Verlag, Berlin), pp. 435–502.
15. Guinan, J. J. (2006). “Olivocochlear efferents: Anatomy, physiology, function, and the measurement of efferent effects in humans,” Ear Hear. 27, 589–607.
16. Guinan, J. J., and Gifford, M. L. (1988). “Effects of electrical stimulation of efferent olivocochlear neurons on cat auditory-nerve fibers. III. Tuning curves and thresholds at CF,” Hear. Res. 37, 29–45.
17. Guinan, J. J., and Stankovic, K. M. (1996). “Medial efferent inhibition produces the largest equivalent attenuations at moderate to high sound levels in cat auditory-nerve fibers,” J. Acoust. Soc. Am. 100, 1680–1690.
18. Hermansky, H., and Morgan, N. (1994). “Rasta processing of speech,” IEEE Trans. Speech Audio Process. 2, 578–589.
19. Hienz, R., Stiles, P., and May, B. (1998). “Effects of bilateral olivocochlear lesions on vowel formant discrimination in cats,” Hear. Res. 116, 10–20.
20. Holmberg, M., Gelbart, D., and Hemmert, W. (2007). “Speech encoding in a model of peripheral auditory processing: Quantitative assessment by means of automatic speech recognition,” Speech Commun. 49, 917–932.
21. Hopkins, K., Moore, B. C. J., and Stone, M. A. (2008). “Effects of moderate cochlear hearing loss on the ability to benefit from temporal fine structure information in speech,” J. Acoust. Soc. Am. 123, 1140–1153.
22. Huber, A., Linder, T., Ferrazzini, M., Schmid, S., Dillier, N., Stoeckli, S., and Fisch, U. (2001). “Intraoperative assessment of stapes movement,” Ann. Otol. Rhinol. Laryngol. 110, 31–35.
23. Jankowski, C. R. J., Vo, H.-D., and Lippmann, R. P. (1995). “A comparison of signal processing front ends for automatic word recognition,” IEEE Trans. Speech Audio Process. 3, 286–293.
24. Kim, S., Frisina, R. D., and Frisina, D. R. (2006). “Effects of age on speech understanding in normal hearing listeners: Relationship between the auditory efferent system and speech intelligibility in noise,” Speech Commun. 48, 855–862.
25. Kumar, U., and Vanaja, C. (2004). “Functioning of olivocochlear bundle and speech perception in noise,” Ear Hear. 25, 142–146.
26. Liberman, M. C. (1988). “Response properties of cochlear efferent neurons: Monaural vs. binaural stimulation and the effects of noise,” J. Neurophysiol. 60, 1779–1798.
27. Liberman, M. C., and Guinan, J. J. (1998). “Feedback control of the auditory periphery: Anti-masking effects of middle ear muscles vs. olivocochlear efferents,” J. Commun. Disord. 31, 471–482.
28. Lippmann, R. P. (1997). “Speech recognition by machines and humans,” Speech Commun. 22, 1–16.
29. Liu, R., Stern, R., Huang, X., and Acero, A. (1993). “Efficient cepstral normalization for robust speech recognition,” in Proceedings of ARPA Speech and Natural Language Workshop (Princeton, NJ), pp. 69–74.
30. Lopez-Poveda, E. A., and Meddis, R. (2001). “A human nonlinear cochlear filterbank,” J. Acoust. Soc. Am. 110, 3107–3118.
31. May, B. J., and McQuone, S. J. (1995). “Effects of bilateral olivocochlear lesions on pure-tone intensity discrimination in cats,” Aud. Neurosci. 1, 385–400.
32. Meddis, R. (2006). “Auditory-nerve first-spike latency and auditory absolute threshold: A computer model,” J. Acoust. Soc. Am. 119, 406–417.
33. Meddis, R., O’Mard, L., and Lopez-Poveda, E. (2001). “A computational algorithm for computing nonlinear auditory frequency selectivity,” J. Acoust. Soc. Am. 109, 2852–2861.
34. Messing, D. P., Delhorne, L., Bruckert, E., Braida, L. D., and Ghitza, O. (2009). “A non-linear efferent-inspired model of the auditory system; matching human confusions in stationary noise,” Speech Commun. 51, 668–683.
35. Miller, G. A. (1947). “The masking of speech,” Psychol. Bull. 44, 105–129.
36. Miller, G. A., and Licklider, J. (1950). “The intelligibility of interrupted speech,” J. Acoust. Soc. Am. 22, 167–173.
37. Oppenheim, A. V., Schafer, R. W., and Buck, J. R. (1999). Discrete-Time Signal Processing (Pearson Education, Upper Saddle River, NJ).
38. Pearce, D., and Hirsch, H.-G. (2000). “The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions,” in Proceedings of the International Conference on Spoken Language Processing, Vol. IV, pp. 29–32.
39. Rabiner, L. R. (1989). “A tutorial on hidden Markov models and selected applications,” Proc. IEEE 77, 257–286.
40. Russell, I. J., and Murugasu, E. (1997). “Medial efferent inhibition suppresses basilar membrane responses to near characteristic frequency tones of moderate to high intensities,” J. Acoust. Soc. Am. 102, 1734–1738.
41. Sachs, M. B., May, B. J., Prell, G. S. L., and Hienz, R. D. (2006). “Adequacy of auditory-nerve rate representations of vowels: Comparison with behavioural measures in cat,” in Listening to Speech: An Auditory Perspective, edited by S. Greenberg and W. A. Ainsworth (Lawrence Erlbaum Associates, Hillsdale, NJ), pp. 115–127.
42. Scharf, B., Magnan, J., and Chays, A. (1997). “On the role of the olivocochlear bundle in hearing: 16 case studies,” Hear. Res. 103, 101–122.
43. Sumner, C., Lopez-Poveda, E., O’Mard, L., and Meddis, R. (2002). “A revised model of the inner-hair cell and auditory-nerve complex,” J. Acoust. Soc. Am. 111, 2178–2189.
44. Sumner, C., Lopez-Poveda, E., O’Mard, L., and Meddis, R. (2003a). “Adaptation in a revised inner-hair cell model,” J. Acoust. Soc. Am. 113, 893–901.
45. Sumner, C., O’Mard, L., Lopez-Poveda, E., and Meddis, R. (2003b). “A nonlinear filterbank model of the guinea-pig cochlear nerve: Rate responses,” J. Acoust. Soc. Am. 113, 3264–3274.
46. Wagner, W., Frey, K., Heppelmann, G., Plontke, S. K., and Zenner, H.-P. (2008). “Speech-in-noise intelligibility does not correlate with efferent olivocochlear reflex in humans with normal hearing,” Acta Oto-Laryngol. 128, 53–60.
47. Wiederhold, M. L., and Kiang, N. Y. S. (1970). “Effects of electric stimulation of the crossed olivocochlear bundle on single auditory-nerve-fibers in the cat,” J. Acoust. Soc. Am. 48, 950–965.
48. Winslow, R. L., Barta, P. E., and Sachs, M. B. (1987). “Rate coding in the auditory nerve,” in Auditory Processing of Complex Sounds, edited by W. A. Yost and C. S. Watson (Lawrence Erlbaum Associates, Hillsdale, NJ), pp. 212–224.
49. Winslow, R. L., and Sachs, M. B. (1988). “Single-tone intensity discrimination based on auditory-nerve rate responses in backgrounds of quiet, noise, and with stimulation of the crossed olivocochlear bundle,” Hear. Res. 35, 165–190.
50. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., and Woodland, P. (2009). The Hidden Markov Model Toolkit (HTK), Cambridge University Engineering Department, http://htk.eng.cam.ac.uk/ (Last viewed 10/06/09).