This paper discusses the formulation, development, and analysis of a segment-based approach to the automatic language identification (LID) problem. The system uses phonotactic, acoustic-phonetic, and prosodic information within a unified probabilistic framework. The framework provides a mechanism for combining these information sources within one system and allows their relative contributions to be determined empirically. The system has been evaluated on the Oregon Graduate Institute (OGI) multi-language telephone speech corpus, and the results are competitive with other current LID systems. The results also indicate that, while the phonotactic information of a spoken utterance is the most useful information for LID, acoustic-phonetic and prosodic information can increase a system’s accuracy, especially when the utterance is short.
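As a rough illustration of how scores from several information sources might be combined in a single probabilistic decision, the sketch below sums weighted per-language log-likelihoods and picks the highest-scoring language. It is only a minimal sketch under stated assumptions: the function name `identify_language`, the source labels, the weights, and all numeric scores are hypothetical and are not taken from the paper, whose actual framework may differ.

```python
def identify_language(scores, weights):
    """Pick the language with the highest weighted sum of log-likelihoods.

    scores  : {source_name: {language: log_likelihood}}  (hypothetical values)
    weights : {source_name: float}  (hypothetical per-source weights)
    """
    # Assume every source scores the same set of languages.
    languages = list(next(iter(scores.values())).keys())
    combined = {
        lang: sum(weights[src] * scores[src][lang] for src in scores)
        for lang in languages
    }
    best = max(combined, key=combined.get)
    return best, combined


# Made-up log-likelihoods for one utterance, for illustration only.
example_scores = {
    "phonotactic": {"english": -10.0, "spanish": -12.0},
    "acoustic_phonetic": {"english": -5.0, "spanish": -4.0},
    "prosodic": {"english": -3.0, "spanish": -3.5},
}

# Equal weights: all three sources contribute to the decision.
all_sources = {"phonotactic": 1.0, "acoustic_phonetic": 1.0, "prosodic": 1.0}
best_all, combined_all = identify_language(example_scores, all_sources)

# Zeroing out two weights isolates the contribution of a single source.
acoustic_only = {"phonotactic": 0.0, "acoustic_phonetic": 1.0, "prosodic": 0.0}
best_acoustic, _ = identify_language(example_scores, acoustic_only)
```

Setting individual weights to zero, as in the second call, is one simple way to compare how much each source contributes on its own versus in combination.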
April 1997
April 01 1997
Segment-based automatic language identification
Timothy J. Hazen
Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Victor W. Zue
Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
J. Acoust. Soc. Am. 101, 2323–2331 (1997)
Article history
Received: April 15 1996
Accepted: November 25 1996
Citation
Timothy J. Hazen, Victor W. Zue; Segment-based automatic language identification. J. Acoust. Soc. Am. 1 April 1997; 101 (4): 2323–2331. https://doi.org/10.1121/1.418211