The potential contribution of the peripheral auditory efferent system to our understanding of speech in a background of competing noise was studied using a computer model of the auditory periphery and assessed using an automatic speech recognition system. A previous study had shown that a fixed efferent attenuation applied to all channels of a multi-channel model could improve the recognition of connected digit triplets in noise [G. J. Brown, R. T. Ferry, and R. Meddis, J. Acoust. Soc. Am. 127, 943–954 (2010)]. In the current study an anatomically justified feedback loop was used to automatically regulate separate attenuation values for each auditory channel. This arrangement resulted in a further enhancement of speech recognition over fixed-attenuation conditions. Comparisons between multi-talker babble and pink noise interference conditions suggest that the benefit originates from the model’s ability to modify the amount of suppression in each channel separately according to the spectral shape of the interfering sounds.
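As a rough illustration of the per-channel feedback idea described in the abstract, the following Python sketch regulates a separate attenuation value for each channel from that channel's own attenuated output level. It is not the model used in the study: the function names, the threshold, gain, ceiling and smoothing values, and the first-order smoothing rule are all assumptions chosen only to show the principle.

```python
def update_attenuation(levels_db, prev_atten_db, threshold_db=20.0,
                       gain=0.5, max_atten_db=30.0, smoothing=0.9):
    """One step of a hypothetical per-channel efferent feedback loop.

    levels_db     : input level in each channel (dB), one value per channel
    prev_atten_db : attenuation currently applied to each channel (dB)
    Returns the new attenuation (dB) for each channel.
    """
    new_atten = []
    for level, prev in zip(levels_db, prev_atten_db):
        # The reflex is driven by the channel output after attenuation,
        # which is what closes the loop.
        drive = max(0.0, (level - prev) - threshold_db)
        # Target attenuation grows with the drive, up to a ceiling.
        target = min(max_atten_db, gain * drive)
        # First-order smoothing stands in for the slow reflex time course.
        new_atten.append(smoothing * prev + (1.0 - smoothing) * target)
    return new_atten


def run_loop(frames_db, n_channels, **params):
    """Run the loop over a sequence of frames, starting from zero attenuation."""
    atten = [0.0] * n_channels
    history = []
    for frame in frames_db:
        atten = update_attenuation(frame, atten, **params)
        history.append(atten)
    return history


if __name__ == "__main__":
    # Two channels: one dominated by intense noise, one nearly silent.
    frames = [[60.0, 10.0]] * 50
    final = run_loop(frames, n_channels=2)[-1]
    print(final)
```

In this toy setting the noisy channel settles on a substantial attenuation while the quiet channel stays unattenuated, which is the kind of channel-specific behaviour a single fixed attenuation cannot provide.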

1. Boll, S. F. (1979). “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust. Speech, Signal Process. 27, 113–120.
2. Backus, B. C., and Guinan, J. J., Jr. (2006). “Time-course of the human medial olivocochlear reflex,” J. Acoust. Soc. Am. 119, 2889–2904.
3. Ballestero, J., Zorrilla de San Martin, J., Goutman, J., Elgoyhen, A. B., Fuchs, P. A., and Katz, E. (2011). “Short-term synaptic plasticity regulates the level of olivocochlear inhibition to auditory hair cells,” J. Neurosci. 31, 14763–14774.
5. Brown, G. J., Ferry, R., and Meddis, R. (2010). “A computer model of auditory efferent suppression: Implications for the coding of speech in noise,” J. Acoust. Soc. Am. 127, 943–954.
4. Brown, G. J., Jürgens, T., Meddis, R., Robertson, M., and Clark, N. R. (2011). “The representation of speech in a nonlinear auditory model: Time-domain analysis of simulated auditory-nerve firing patterns,” Proceedings of Interspeech, Italy, pp. 2453–2456.
6. Brown, M. C. (1989). “Morphology and response properties of single olivocochlear fibers in the guinea pig,” Hear. Res. 40, 93–110.
7. Cooper, N. P., and Guinan, J. J., Jr. (2003). “Separate mechanical processes underlie fast and slow effects of medial olivocochlear efferent activity,” J. Physiol. 548, 307–312.
8. Delgutte, B., and Kiang, N. Y. (1984). “Speech coding in the auditory nerve: V. Vowels in background noise,” J. Acoust. Soc. Am. 75, 908–918.
9. Ferry, R., and Meddis, R. (2007). “A computer model of medial efferent suppression in the mammalian auditory system,” J. Acoust. Soc. Am. 122, 3519–3526.
10. Geisler, C. D., and Gamble, T. (1989). “Responses of ‘high-spontaneous’ auditory-nerve fibers to consonant-vowel syllables in noise,” J. Acoust. Soc. Am. 85, 1639–1652.
11. Goldstein, J. (1990). “Modeling rapid waveform compression on the basilar membrane as multiple-bandpass-nonlinearity filtering,” Hear. Res. 49, 39–60.
12. Guinan, J. J., Jr. (2006). “Olivocochlear efferents: Anatomy, physiology, function, and the measurement of efferent effects in humans,” Ear Hear. 27, 589–607.
13. Guinan, J. J., Jr. (2010). “Cochlear efferent innervation and function,” Curr. Opin. Otolaryngol. Head Neck Surg. 18, 447–453.
14. Guinan, J. J., Jr., and Gifford, L. (1988). “Effects of electrical stimulation of efferent olivocochlear neurons on cat auditory-nerve fibers. III. Tuning curves and thresholds,” Hear. Res. 37, 29–46.
15. Holmberg, M., Gelbart, D., and Hemmert, W. (2007). “Speech encoding in a model of peripheral auditory processing: Quantitative assessment by means of automatic speech recognition,” Speech Commun. 49, 917–932.
16. Huber, A., Linder, T., Ferrazzini, M., Schmid, S., Dillier, N., Stoeckli, S., and Fisch, U. (2001). “Intraoperative assessment of stapes movement,” Ann. Otol. Rhinol. Laryngol. 110, 31–35.
17. Jankowski, C. R. J., Vo, H.-D., and Lippmann, R. P. (1995). “A comparison of signal processing front ends for automatic word recognition,” IEEE Trans. Speech Audio Proc. 3, 286–293.
18. Kim, D.-S., Lee, S.-Y., and Kil, R.-M. (1999). “Auditory processing of speech signals for robust speech recognition in real-world noisy environments,” IEEE Trans. Speech Audio Proc. 7, 55–69.
19. Lee, C. (2010). “Closed-loop auditory-based representation for robust speech recognition,” M.Sc. thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, pp. 1–96.
20. Lee, C., Glass, J., and Ghitza, O. (2011). “An efferent-inspired auditory model front-end for speech recognition,” Proceedings of Interspeech, Florence, Italy, pp. 49–52.
21. Leonard, R. G. (1984). “A database for speaker-independent digit recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), San Diego, pp. 328–331.
23. Liberman, M. C. (1988). “Response properties of cochlear efferent neurons: Monaural versus binaural stimulation and the effects of noise,” J. Neurophys. 60, 1779–1789.
22. Liberman, M. C., and Brown, M. C. (1986). “Physiology and anatomy of single olivocochlear neurons in the cat,” Hear. Res. 24, 17–36.
24. Lilaonitkul, W., and Guinan, J. J., Jr. (2009). “Reflex control of the human inner ear: A half-octave offset in medial efferent feedback that is consistent with an efferent role in the control of masking,” J. Neurophys. 101, 1394–1406.
25. Lopez-Poveda, E. A., and Meddis, R. (2001). “A human nonlinear cochlear filterbank,” J. Acoust. Soc. Am. 110, 3107–3118.
27. Meddis, R. (2006). “Auditory-nerve first-spike latency and auditory absolute threshold: A computer model,” J. Acoust. Soc. Am. 119, 406–417.
26. Meddis, R., O’Mard, L., and Lopez-Poveda, E. (2001). “A computational algorithm for computing nonlinear auditory frequency selectivity,” J. Acoust. Soc. Am. 109, 2852–2861.
28. Messing, D. P., Delhorne, L., Bruckert, E., Braida, L., and Ghitza, O. (2008). “Consonant discrimination of degraded speech using an efferent-inspired closed-loop cochlear model,” Proceedings of Interspeech, Brisbane, Australia, pp. 1052–1055.
29. Messing, D. P., Delhorne, L., Bruckert, E., Braida, L., and Ghitza, O. (2009). “A non-linear efferent-inspired model of the auditory system; matching human confusions in stationary noise,” Speech Commun. 51, 668–683.
30. Robertson, M., Brown, G. J., Lecluyse, W., Panda, M., and Tan, C. M. (2010). “A speech in noise test based on spoken digits: Comparison of normal and impaired listeners using a computer model,” Proceedings of Interspeech, Makuhari, Japan, pp. 2470–2473.
31. Russell, I. J., and Murugasu, E. (1997). “Medial efferent inhibition suppresses basilar membrane responses to near characteristic frequency tones of moderate to high intensities,” J. Acoust. Soc. Am. 102, 1734–1738.
32. Sachs, M. B., May, B. J., Prell, G. S. L., and Heinz, R. D. (2006). “Adequacy of auditory-nerve rate representations of vowels: Comparison with behavioural measures in cat,” in Listening to Speech: An Auditory Perspective, edited by S. Greenberg and W. A. Ainsworth (Lawrence Erlbaum Associates, Hillsdale, NJ), pp. 115–127.
33. Sheikhzadeh, H., and Deng, L. (1998). “Speech analysis and recognition using interval statistics generated from a composite auditory model,” IEEE Trans. Speech Audio Proc. 6, 90–94.
34. Sumner, C., Lopez-Poveda, E., O’Mard, L., and Meddis, R. (2002). “A revised model of the inner-hair cell and auditory-nerve complex,” J. Acoust. Soc. Am. 111, 2178–2189.
35. Sumner, C., Lopez-Poveda, E., O’Mard, L., and Meddis, R. (2003a). “Adaptation in a revised inner-hair cell model,” J. Acoust. Soc. Am. 113, 893–901.
36. Sumner, C., O’Mard, L., Lopez-Poveda, E., and Meddis, R. (2003b). “A nonlinear filterbank model of the guinea-pig cochlear nerve: Rate responses,” J. Acoust. Soc. Am. 113, 3264–3274.
37. Winslow, R. L., Barta, P. E., and Sachs, M. B. (1987). “Rate coding in the auditory nerve,” in Auditory Processing of Complex Sounds, edited by W. A. Yost and C. S. Watson (Lawrence Erlbaum Associates, Hillsdale, NJ), pp. 212–224.
38. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., and Woodland, P. (2009). “The Hidden Markov Model Toolkit (HTK),” Engineering Department, University of Cambridge, Cambridge, UK, http://htk.eng.cam.ac.uk/ (last viewed 08/05/12).