Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.
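The abstract does not spell out how the GMM-based extrapolation is computed. As a rough illustration only, one common way to map degraded features to estimates of clean features is to fit a joint Gaussian mixture over stacked (degraded, clean) feature vectors and take the minimum mean-square error (conditional mean) estimate. The sketch below assumes that formulation; the function name `gmm_extrapolate`, its parameters, and the toy mixture values are hypothetical and are not taken from the paper.

```python
import numpy as np

def gmm_extrapolate(y, weights, means, covs, dy):
    """MMSE estimate of clean features x from degraded features y.

    Assumes (hypothetically) a joint GMM over z = [y, x], where the first
    dy dimensions of each mean/covariance belong to the degraded features.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    log_post, cond_means = [], []
    for w, mu, S in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        S = np.asarray(S, dtype=float)
        mu_y, mu_x = mu[:dy], mu[dy:]
        S_yy, S_xy = S[:dy, :dy], S[dy:, :dy]
        diff = y - mu_y
        sol = np.linalg.solve(S_yy, diff)          # S_yy^{-1} (y - mu_y)
        _, logdet = np.linalg.slogdet(S_yy)
        # Log of w_k * N(y; mu_y, S_yy), the unnormalized component posterior.
        log_post.append(np.log(w) - 0.5 * (diff @ sol + logdet + dy * np.log(2 * np.pi)))
        # Conditional mean of x given y for this component.
        cond_means.append(mu_x + S_xy @ sol)
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())       # stabilized softmax
    post /= post.sum()
    return post @ np.array(cond_means)             # posterior-weighted average

# Toy example: one degraded dimension, one clean dimension, two components.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [10.0, 5.0]]
covs = [np.eye(2), np.eye(2)]
estimate = gmm_extrapolate([10.0], weights, means, covs, dy=1)
```

In this sketch the estimate is a posterior-weighted blend of per-component linear regressions, so a degraded observation near one component's mean is pulled toward that component's clean-feature mean. In practice the mixture parameters would be trained on paired clean and channel-degraded data, which is outside the scope of this illustration.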
February 2012
February 14 2012
Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech
Jung-Won Lee a)
Department of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-dong, Seodaemun-gu, Seoul, Korea 120-749
Jeung-Yoon Choi
Department of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-dong, Seodaemun-gu, Seoul, Korea 120-749
Hong-Goo Kang
Department of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-dong, Seodaemun-gu, Seoul, Korea 120-749
a) Author to whom correspondence should be addressed. Electronic mail: jaesuk2002@dsp.yonsei.ac.kr
J. Acoust. Soc. Am. 131, 1536–1546 (2012)
Article history
Received: August 25 2010
Accepted: December 05 2011
Citation
Jung-Won Lee, Jeung-Yoon Choi, Hong-Goo Kang; Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech. J. Acoust. Soc. Am. 1 February 2012; 131 (2): 1536–1546. https://doi.org/10.1121/1.3672706