This study presents a method for analyzing speech rhythm using empirical mode decomposition of the speech amplitude envelope, which allows for extraction and quantification of syllabic- and supra-syllabic time-scale components of the envelope. The method of empirical mode decomposition of a vocalic energy amplitude envelope is illustrated in detail, and several types of rhythm metrics derived from this method are presented. Spontaneous speech extracted from the Buckeye Corpus is used to assess the effect of utterance length on metrics, and it is shown how metrics representing variability in the supra-syllabic time-scale components of the envelope can be used to identify stretches of speech with targeted rhythmic characteristics. Furthermore, the envelope-based metrics are used to characterize cross-linguistic differences in speech rhythm in the UC San Diego Speech Lab corpus of English, German, Greek, Italian, Korean, and Spanish speech elicited in read sentences, read passages, and spontaneous speech. The envelope-based metrics exhibit significant effects of language and elicitation method that argue for a nuanced view of cross-linguistic rhythm patterns.
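As a rough illustration of the decomposition step described above, the following is a minimal sketch of empirical mode decomposition applied to a synthetic stand-in for an amplitude envelope. It is not the authors' implementation. The function names (`emd`, `_extrema`, `_envelope`), the 100 Hz envelope sampling rate, the 5 Hz and 1.5 Hz test components, the stopping tolerance, and the per-component power summary are all assumptions made for this example. For brevity the upper and lower envelopes are built by linear interpolation with `numpy.interp`, whereas standard EMD typically uses cubic splines, and the signal ends are handled crudely.

```python
import numpy as np

def _extrema(x):
    """Return indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def _envelope(idx, x, n):
    """Linearly interpolate through the given extrema (simplification:
    standard EMD uses cubic splines); end samples are pinned to the signal."""
    pts = np.concatenate(([0], idx, [n - 1]))
    return np.interp(np.arange(n), pts, x[pts])

def emd(x, max_imfs=5, max_sift=50, tol=0.05):
    """Decompose x into intrinsic mode functions (IMFs) plus a residual.

    Hypothetical helper for illustration only. The sum of the returned
    IMFs and the residual reconstructs x.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    residual = x.copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = _extrema(residual)
        if len(maxima) + len(minima) < 3:
            break  # too few extrema left: treat the remainder as the trend
        h = residual.copy()
        for _ in range(max_sift):
            maxima, minima = _extrema(h)
            if len(maxima) < 1 or len(minima) < 1:
                break
            # Sifting: subtract the mean of the upper and lower envelopes.
            mean_env = 0.5 * (_envelope(maxima, h, n) + _envelope(minima, h, n))
            # Simple stopping rule (assumed): relative energy of the mean envelope.
            sd = np.sum(mean_env ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h - mean_env
            if sd < tol:
                break
        imfs.append(h)
        residual = residual - h
    return np.array(imfs), residual

# Synthetic "envelope": a faster (syllable-like) and a slower
# (supra-syllabic-like) oscillation. All values are assumed.
fs = 100                                  # envelope sampling rate in Hz
t = np.arange(0, 4, 1 / fs)               # 4 s of signal
envelope = np.sin(2 * np.pi * 5 * t) + 0.8 * np.sin(2 * np.pi * 1.5 * t)

imfs, residual = emd(envelope)

# Illustrative summary only: mean power of each extracted component.
powers = [float(np.mean(imf ** 2)) for imf in imfs]
print(len(imfs), powers)
```

In a sketch like this, the earlier components carry the faster oscillations and the later ones the slower oscillations, which is the property that lets syllabic and supra-syllabic time scales be examined separately. With real speech, the input would be an amplitude envelope computed from the recording rather than a sum of sinusoids.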
July 2013
July 11 2013
Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages
Sam Tilsen a)
Department of Linguistics, Cornell University, 203 Morrill Hall, Ithaca, New York 14853-4701
Amalia Arvaniti b)
Department of Linguistics, University of California, San Diego, 9500 Gilman Drive, Number 0108, La Jolla, California 92093-0108
a) Author to whom correspondence should be addressed. Electronic mail: [email protected]
b) Current address: Department of English Language and Linguistics, University of Kent, Cornwallis North West, Canterbury, Kent CT2 7NF, UK.
J. Acoust. Soc. Am. 134, 628–639 (2013)
Article history
Received: May 30 2012
Accepted: April 29 2013
Citation
Sam Tilsen, Amalia Arvaniti; Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages. J. Acoust. Soc. Am. 1 July 2013; 134 (1): 628–639. https://doi.org/10.1121/1.4807565