Speakers adjust their voice when talking in noise, producing what is known as Lombard speech. These acoustic adjustments facilitate speech comprehension in noise relative to plain speech (i.e., speech produced in quiet). However, exactly which characteristics of Lombard speech drive this intelligibility benefit in noise remains unclear. This study assessed the contribution of enhanced amplitude modulations to the Lombard speech intelligibility benefit by demonstrating that (1) native speakers of Dutch in the Nijmegen Corpus of Lombard Speech produce more pronounced amplitude modulations in noise than in quiet; (2) more pronounced amplitude modulations correlate positively with intelligibility in a speech-in-noise perception experiment; and (3) transplanting the amplitude modulations from Lombard speech onto plain speech improves intelligibility. Together, these findings suggest that enhanced amplitude modulations in Lombard speech contribute to intelligibility in noise. Results are discussed in light of recent neurobiological models of speech perception, in which neural oscillators phase-lock to the amplitude modulations in speech and thereby guide speech processing.
