Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for their relevance to prosodic boundary detection. The measurements considered here fall into five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Their distributions, together with statistical analysis, show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus yield equal-error detection rates of around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements and, to a lesser degree, pitch measurements are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.
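
To make the reported figure concrete, the following is a minimal sketch of how an equal-error detection rate could be computed from detector scores. It assumes, as an interpretation not confirmed by the abstract, that the figure is the detection rate at the threshold where the miss rate and the false-alarm rate are closest to equal. The function name, the score convention (higher means more likely an accent or boundary), and the toy data are all hypothetical.

```python
import numpy as np


def equal_error_detection_rate(scores, labels):
    """Detection rate at the threshold where miss and false-alarm rates are closest.

    Hypothetical helper, not the paper's procedure.
    scores: detector outputs, higher = more likely a target (accent or boundary).
    labels: truthy for targets, falsy for non-targets; both classes must be present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    if labels.all() or not labels.any():
        raise ValueError("need at least one target and one non-target")

    best_gap, best_rate = np.inf, 0.0
    # Try every observed score as a decision threshold.
    for threshold in np.unique(scores):
        predicted = scores >= threshold
        miss = np.mean(~predicted[labels])         # targets that were not detected
        false_alarm = np.mean(predicted[~labels])  # non-targets flagged as targets
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap = gap
            # Report 1 minus the average of the two (nearly equal) error rates.
            best_rate = 1.0 - 0.5 * (miss + false_alarm)
    return float(best_rate)


# Toy data for illustration only; these are not measurements from the corpus.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_rate = equal_error_detection_rate(toy_scores, toy_labels)
```

Under this reading, a rate of around 70% corresponds to miss and false-alarm rates of roughly 30% each at the chosen operating point.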
