We describe transcription and forced alignment of the Digital Archive of Southern Speech (DASS), a project that will provide a large corpus of historical, semi-spontaneous Southern speech for acoustic analysis. The 372 hours of recordings (64 interviews) are a subset of the Linguistic Atlas of the Gulf States, an extensive dialect study of 1121 speakers conducted across eight southern U.S. states from 1968 to 1983. Full DASS interviews are transcribed orthographically by hand, following in-house guidelines that ensure consistency across files and transcribers. Separate codes mark interviewee speech, interviewer speech, non-speech, overlapping speech, and unintelligible speech. Transcriber output is converted to Praat TextGrids using LaBB-CAT, a tool for maintaining large speech corpora. TextGrids containing only the interviewee’s speech are then generated and submitted to forced alignment with DARLA, which accommodates the levels of variation and noise in the DASS files with a high degree of success. Toward acoustic analysis, we evaluate three methods for vowel formant extraction: the native output of DARLA, a local implementation of FAVE-Extract, and a Praat-based extractor that incorporates separate formant tracks for different regions of the vowel space. We present this workflow of transcription and analysis to benefit other projects of similar size and scope.
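The step of producing interviewee-only TextGrids can be illustrated with a minimal Python sketch. It is not the project's pipeline (the abstract states that LaBB-CAT performs the conversion); it only shows the general idea of keeping one speaker's turns and padding the remaining time with empty intervals, since a Praat interval tier must cover the whole time span without gaps. The `Segment` class, the speaker labels `"interviewee"` and `"interviewer"`, and the sample utterances are all hypothetical, and segments for the kept speaker are assumed not to overlap one another.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # hypothetical speaker code, e.g. "interviewee"
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # orthographic transcription


def interviewee_intervals(segments, total_duration, speaker="interviewee"):
    """Return contiguous (start, end, text) intervals covering the file.

    Only the chosen speaker's text is kept. All other stretches become
    empty intervals. Assumes the kept segments do not overlap.
    """
    kept = sorted((s for s in segments if s.speaker == speaker),
                  key=lambda s: s.start)
    intervals = []
    cursor = 0.0
    for seg in kept:
        if seg.start > cursor:
            intervals.append((cursor, seg.start, ""))  # fill the gap
        intervals.append((seg.start, seg.end, seg.text))
        cursor = seg.end
    if cursor < total_duration:
        intervals.append((cursor, total_duration, ""))  # trailing silence
    return intervals


def to_textgrid(intervals, tier_name="interviewee"):
    """Serialize one interval tier in Praat's long TextGrid text format."""
    xmax = intervals[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        'xmin = 0',
        f'xmax = {xmax}',
        'tiers? <exists>',
        'size = 1',
        'item []:',
        '    item [1]:',
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        '        xmin = 0',
        f'        xmax = {xmax}',
        f'        intervals: size = {len(intervals)}',
    ]
    for i, (start, end, text) in enumerate(intervals, 1):
        escaped = text.replace('"', '""')  # Praat doubles embedded quotes
        lines += [
            f'        intervals [{i}]:',
            f'            xmin = {start}',
            f'            xmax = {end}',
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


# Made-up example turns, for illustration only.
segments = [
    Segment("interviewer", 0.0, 1.5, "where were you born"),
    Segment("interviewee", 1.5, 3.0, "right here in the county"),
    Segment("interviewer", 3.0, 4.0, "and your parents"),
    Segment("interviewee", 4.2, 5.0, "them too"),
]
intervals = interviewee_intervals(segments, total_duration=6.0)
textgrid_text = to_textgrid(intervals)
```

Writing `textgrid_text` to a `.TextGrid` file would give a single-tier file in which interviewer turns appear as empty intervals, which is the kind of input a forced aligner can use to restrict alignment to the interviewee's speech.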
May 2017
Meeting abstract. No PDF available.
May 01 2017
Transcription and forced alignment of the Digital Archive of Southern Speech
Margaret E. Renwick
Linguist Program, Univ. of Georgia, 240 Gilbert Hall, Athens, GA 30602, mrenwick@uga.edu
Michael Olsen
Linguist Program, Univ. of Georgia, 240 Gilbert Hall, Athens, GA 30602, mrenwick@uga.edu
Rachel M. Olsen
Linguist Program, Univ. of Georgia, 240 Gilbert Hall, Athens, GA 30602, mrenwick@uga.edu
Joseph A. Stanley
Linguist Program, Univ. of Georgia, 240 Gilbert Hall, Athens, GA 30602, mrenwick@uga.edu
J. Acoust. Soc. Am. 141, 3981 (2017)
Citation
Margaret E. Renwick, Michael Olsen, Rachel M. Olsen, Joseph A. Stanley; Transcription and forced alignment of the Digital Archive of Southern Speech. J. Acoust. Soc. Am. 1 May 2017; 141 (5_Supplement): 3981. https://doi.org/10.1121/1.4989090