Mel frequency cepstral coefficients (MFCC) are the most widely used speech features in automatic speech recognition systems, primarily because the coefficients fit well with the assumptions used in hidden Markov models and because of the superior noise robustness of MFCC over alternative feature sets such as linear prediction-based coefficients. The authors have recently introduced human factor cepstral coefficients (HFCC), a modification of MFCC that uses the known relationship between center frequency and critical bandwidth from human psychoacoustics to decouple filter bandwidth from filter spacing. In this work, the authors introduce a variation of HFCC called HFCC-E in which filter bandwidth is linearly scaled in order to investigate the effects of wider filter bandwidth on noise robustness. Experimental results show an increase in signal-to-noise ratio of 7 dB over traditional MFCC algorithms when filter bandwidth increases in HFCC-E. An important attribute of both HFCC and HFCC-E is that the algorithms only differ from MFCC in the filter bank coefficients: increased noise robustness using wider filters is achieved with no additional computational cost.
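The decoupling described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: center frequencies are spaced equally on the common 2595·log10(1 + f/700) mel scale, bandwidth is taken from the Moore and Glasberg equivalent rectangular bandwidth (ERB) approximation, and the linear scaling is a single multiplier. The names `filter_bank_layout` and `e_factor` are hypothetical, and the endpoint handling is simplified; this is not the authors' implementation.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595*log10(1 + f/700) warping."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def erb_hz(fc_hz):
    """Moore and Glasberg ERB approximation (assumed here); input and output in Hz."""
    f_khz = fc_hz / 1000.0
    return 6.23 * f_khz ** 2 + 93.39 * f_khz + 28.52


def filter_bank_layout(n_filters, f_low_hz, f_high_hz, e_factor=1.0):
    """Return (center_hz, bandwidth_hz) pairs for n_filters >= 2.

    Spacing depends only on the mel warping; bandwidth depends only on the
    center frequency and the hypothetical linear scale e_factor.
    """
    mel_low, mel_high = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (mel_high - mel_low) / (n_filters - 1)
    centers = [mel_to_hz(mel_low + i * step) for i in range(n_filters)]
    return [(fc, e_factor * erb_hz(fc)) for fc in centers]


if __name__ == "__main__":
    for fc, bw in filter_bank_layout(5, 200.0, 3000.0, e_factor=2.0):
        print(f"center = {fc:7.1f} Hz, bandwidth = {bw:6.1f} Hz")
```

Changing `e_factor` widens or narrows every filter while leaving the center frequencies untouched, which is the property the abstract attributes to HFCC-E: only the filter bank coefficients change, so the rest of the cepstral pipeline and its cost stay the same.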
September 2004
September 07 2004
Exploiting independent filter bandwidth of human factor cepstral coefficients in automatic speech recognition
Mark D. Skowronski; John G. Harris
Computational Neuro-Engineering Laboratory, Electrical and Computer Engineering, University of Florida, Gainesville, Florida 32611
J. Acoust. Soc. Am. 116, 1774–1780 (2004)
Article history
Received: September 29 2003
Accepted: June 08 2004
Citation
Mark D. Skowronski, John G. Harris; Exploiting independent filter bandwidth of human factor cepstral coefficients in automatic speech recognition. J. Acoust. Soc. Am. 1 September 2004; 116 (3): 1774–1780. https://doi.org/10.1121/1.1777872