Automatic labeling methods have been developed that allow speech recognition systems to be trained and tested on very large data bases (up to 140 000 word tokens). But are these automatic labeling methods accurate enough to collect statistics of direct value to speech science? Despite the success of automatic methods in effectively training speech recognition systems, the alignment routines still make a small percentage of gross phone alignment errors, in sharp contrast to human spectrogram or waveform experts. Automatic training procedures seem, nonetheless, to be tolerant of this small percentage of misaligned sounds, provided they are given a sufficiently large number of correctly aligned instances of each sound. In this paper, a new method is proposed to directly address the following classification problem: is a particular putative phone alignment correct, or is it an alignment error? When this classification method is applied to a large data base previously labeled by a conventional automatic routine, acoustic‐phonetic statistics for each sound may be obtained from all instances of that sound that are classified as correctly labeled.
November 1982
August 12 2005
Alignment classification method to facilitate automatic acoustic‐phonetic statistics collection
Janet M. Baker
DRAGON Systems, Inc., 173 Highland Street, West Newton, MA 02165
J. Acoust. Soc. Am. 72, S32 (1982)
Citation
Janet M. Baker; Alignment classification method to facilitate automatic acoustic‐phonetic statistics collection. J. Acoust. Soc. Am. 1 November 1982; 72 (S1): S32. https://doi.org/10.1121/1.2019829