October 01 2020
Visual scanning of a talking face when evaluating segmental and prosodic information
Xizi Deng (Linguist, Simon Fraser Univ., 830 Duthie Ave., Burnaby, BC V5A 2P8, Canada, xizi_deng@sfu.ca); Henny Yeung; Yue Wang (Linguist, Simon Fraser Univ., Burnaby, BC, Canada)
J. Acoust. Soc. Am. 148, 2765 (2020). Meeting abstract; no PDF available.
Citation: Xizi Deng, Henny Yeung, Yue Wang; Visual scanning of a talking face when evaluating segmental and prosodic information. J. Acoust. Soc. Am. 1 October 2020; 148 (4_Supplement): 2765. https://doi.org/10.1121/1.5147695

Prior work has shown that the mouth area can yield articulatory features of speech segments as well as durational information (Navarra et al., 2010), while pitch and speech amplitude are cued by the eyebrows and other head movements (Hamarneh et al., 2019). It has also been reported that adults look more at the mouth when evaluating speech information in a non-native language (Barenholtz et al., 2016). In the present study, we ask how listeners' visual scanning of a talking face is affected by task demands that specifically target prosodic and segmental information, a question prior work has not examined. Twenty-five native English speakers heard two audio sentences in English (the native language) or Mandarin (the non-native language) that could differ in segmental information, prosodic information, or both, and then saw a silent video of a talking face. Their task was to judge whether the video matched the first or the second audio sentence (or whether the two sentences were the same). The results show that although looking was generally weighted towards the mouth, reflecting task demands, increased looking to the mouth predicted correct responses only on Mandarin trials. This effect was more pronounced in the Prosody and Both conditions relative to the Segment condition (p < 0.05). These results suggest a link between mouth-looking and the extraction of speech-relevant information at both prosodic and segmental levels, but only under high cognitive load.