Cluster analysis of f0 contours has recently become a popular method in phonetic research. It provides an automated way of categorising f0 contours, which gives new insights into the (phonological) categories of intonation that vary across languages. As cluster analysis can be performed in many different ways, it is important to understand the extent to which these analyses capture human perception of f0. This study focuses on how f0 contours and the differences among them are represented numerically, a crucial methodological choice that precedes cluster analysis. These representations are then compared to the way in which f0 contour differences are perceived by human listeners of two different languages. To this end, four time-series contour representations (equivalent rectangular bandwidth, standardisation, octave-median rescaling, first derivative) and three distance measures (Euclidean distance, i.e., the L2 norm; Pearson correlation; and dynamic time warping) were tested. The perceived differences were obtained from listeners of German and Papuan Malay, two typologically different languages. Results show that computed contour differences reflect human perception moderately, with dynamic time warping applied to the first derivative of the contour performing best and showing minimal differences between the languages.
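To make the four representations and three distance measures named above concrete, the following is a minimal Python sketch, not the study's implementation. Several details are assumptions that the abstract does not specify: the ERB conversion uses the Glasberg and Moore (1990) ERB-rate formula, octave-median rescaling is taken as the base-2 logarithm of f0 divided by the contour median, the Pearson correlation is turned into a distance as 1 - r, and dynamic time warping uses an absolute-difference local cost without a window constraint. All function names are hypothetical.

```python
import numpy as np


def hz_to_erb(f0_hz):
    """ERB-rate scale (assumed formula: Glasberg & Moore 1990)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f0_hz, dtype=float))


def standardise(f0_hz):
    """Z-score the contour: subtract its mean, divide by its standard deviation."""
    f0 = np.asarray(f0_hz, dtype=float)
    return (f0 - f0.mean()) / f0.std()


def octave_median(f0_hz):
    """Octaves relative to the contour median (assumed definition)."""
    f0 = np.asarray(f0_hz, dtype=float)
    return np.log2(f0 / np.median(f0))


def first_derivative(f0_hz):
    """Difference between successive samples (one sample shorter than the input)."""
    return np.diff(np.asarray(f0_hz, dtype=float))


def euclidean(a, b):
    """L2 norm of the pointwise difference; contours must have equal length."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))


def pearson_distance(a, b):
    """1 - r, so identical shapes give 0 and mirrored shapes give 2 (assumed mapping)."""
    return float(1.0 - np.corrcoef(a, b)[0, 1])


def dtw(a, b):
    """Classic dynamic time warping cost with absolute-difference local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


# Toy contours in Hz (invented for illustration, not data from the study).
rise = np.linspace(150.0, 250.0, 20)
fall = rise[::-1]
raised = rise + 20.0

# Distance between two contours under the combination the abstract reports as best.
d_best = dtw(first_derivative(rise), first_derivative(fall))
```

The choice of representation determines which differences survive: in this sketch, the first derivative discards a constant offset in Hz, and octave-median rescaling discards a constant multiplicative factor, so two contours that differ only in overall pitch level can end up at distance zero.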
July 2023
July 06 2023
Intonation contour similarity: f0 representations and distance measures compared to human perception in two languages
Constantijn Kaland a)
Institute of Linguistics, University of Cologne, Cologne, Germany
a) Electronic mail: ckaland@uni-koeln.de
J. Acoust. Soc. Am. 154, 95–107 (2023)
Article history
Received: February 24 2023
Accepted: June 07 2023
Citation
Constantijn Kaland; Intonation contour similarity: f0 representations and distance measures compared to human perception in two languages. J. Acoust. Soc. Am. 1 July 2023; 154 (1): 95–107. https://doi.org/10.1121/10.0019850