Automatic speech understanding systems are beginning to attain a level of sophistication where commercial applications are within reach. However, if humans and machines are ever going to communicate in a natural way, it is of vital importance that language modeling go beyond the sentence level. A profound understanding of discourse structure is required, and to this end, knowledge concerning how prosody interacts with other linguistic phenomena is needed. Not only will better prosodic modeling of discourse lead to better speech recognition/understanding, it will also yield more natural‐sounding speech synthesis. This paper reports on a dialogue/prosody project at Telia Research, Sweden. A Wizard‐of‐Oz simulation of a computerized reservation system was used to collect realistic speech data. Fifty subjects were each given three tasks that entailed reserving flights, train journeys, rental cars, and hotel rooms. To avoid linguistic influence on the subjects’ utterances, the tasks were given as maps and icons. A ToBI‐style analysis was applied, adapted to meet language‐specific requirements. The dialogues were analyzed with regard to phrase boundaries, tones, disfluencies, syntax (functions/categories), new versus given information, and pitch range. This paper describes our observations concerning the interaction between prosodic, syntactic, and higher‐level linguistic phenomena, such as discourse structure.
November 1997
Meeting abstract.
November 01 1997
Interaction between prosody and discourse structure in a simulated man–machine dialogue
Robert Eklund
Telia Res. AB, Spoken Lang. Processing, S‐136 80 Haninge, Sweden
J. Acoust. Soc. Am. 102, 3202 (1997)
Citation
Robert Eklund; Interaction between prosody and discourse structure in a simulated man–machine dialogue. J. Acoust. Soc. Am. 1 November 1997; 102 (5_Supplement): 3202. https://doi.org/10.1121/1.420926