Although it is an indispensable tool for both researchers and clinicians, traditional endoscopic imaging of the human vocal folds cannot capture their inferior-superior motion. A three-dimensional reconstruction technique using stereo high-speed video imaging of the vocal folds is explored in an effort to estimate the inferior-superior motion of the medial-most edge of the vocal folds under normal muscle activation in vivo. Traditional stereo-matching algorithms from the field of computer vision are considered and modified to suit the specific challenges of the in vivo application. Inferior-superior motion of the medial vocal fold surface of three healthy speakers is reconstructed over one glottal cycle. The inferior-superior amplitude of the mucosal wave is found to be approximately 13 mm for normal modal voice, reducing to approximately 3 mm for strained falsetto voice, with uncertainty estimated at σ ≈ 2 mm and σ ≈ 1 mm, respectively. Sources of error and their relative effects on the estimation of the inferior-superior motion are considered, and recommendations are made to improve the technique.
