USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have so far also been collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community.
