Acquiring dynamic articulatory data is of major importance for studying speech production. A single technique is often not enough to cover the whole vocal tract at a sufficient sampling rate. Ultrasound (US) imaging has been proposed as a good acquisition technique for the tongue surface because it offers good temporal sampling, does not alter speech production, is cheap, and is widely available. However, it cannot be used alone, and this paper describes a multimodal acquisition system which uses electromagnetography sensors to locate the US probe. The paper focuses in particular on the calibration of the US modality, which is the key point of the system. This approach enables US data to be merged with other data. The use of the system is illustrated by an experiment that measures the minimal tongue-to-palate distance in order to evaluate and design Magnetic Resonance Imaging protocols well suited to the acquisition of three-dimensional images of the vocal tract. Compared with the manual registration of acquisition modalities that is often used when acquiring articulatory data, the approach presented relies on automatic techniques that are well founded from geometrical and mathematical points of view.
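As an illustration of the quantity measured in the experiment, the minimal tongue-to-palate distance can be sketched as a closest-pair search between two sampled contours. This is only a minimal sketch under stated assumptions, not the method used in the paper: it assumes both contours are already expressed in one common coordinate frame (which is what the calibration and registration are meant to provide), it compares sample points only, and the function name and the contour values are hypothetical.

```python
import math


def min_tongue_palate_distance(tongue, palate):
    """Return (distance, tongue_point, palate_point) for the closest pair of samples.

    Assumption: both contours are lists of (x, y) points in millimetres,
    already registered in the same coordinate frame.
    """
    if not tongue or not palate:
        raise ValueError("both contours need at least one point")
    best = None
    for t in tongue:
        for p in palate:
            d = math.dist(t, p)  # Euclidean distance between two samples
            if best is None or d < best[0]:
                best = (d, t, p)
    return best


# Hypothetical midsagittal contours (illustrative values, not measured data).
tongue = [(0.0, 0.0), (10.0, 4.0), (20.0, 6.0), (30.0, 3.0)]
palate = [(0.0, 12.0), (10.0, 14.0), (20.0, 11.0), (30.0, 13.0)]

distance, tongue_pt, palate_pt = min_tongue_palate_distance(tongue, palate)
print(f"{distance:.1f} mm between {tongue_pt} and {palate_pt}")
```

The brute-force search is quadratic in the number of samples, which is acceptable for contours of a few hundred points. A denser sampling or a point-to-segment distance would give a finer estimate between samples.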
