With the rise of hearables and the advantages of using in-ear microphones with intra-aural devices, access to a database of in-ear speech recorded in adverse conditions is essential. Speech captured inside the occluded ear is limited in frequency bandwidth and has amplified low-frequency content. In addition, occluding the ear canal affects speech production, especially in noisy environments. These changes to speech production have a detrimental effect on speech-based algorithms. Yet, to the authors' knowledge, no existing speech database accounts for these changes. This paper presents a speech-in-ear database of speech captured inside an occluded ear, in noise and in quiet. The database is bilingual (French and English) and is intended to aid researchers in developing algorithms for intra-aural devices that use in-ear microphones.

