In this paper, we introduce a voice activity detection (VAD) algorithm that performs voice/nonvoice (V/NV) classification using a fundamental frequency (F0) estimator called YIN. Although current speech recognition technology has achieved high performance, it is insufficient for applications that require high reliability, such as voice control of powered wheelchairs for handicapped persons. The proposed VAD, which rejects nonvoice input in preprocessing, helps realize a highly reliable system. Previous V/NV classification algorithms have generally adopted statistical analyses of F0, the zero‐crossing rate, and the energy of short‐time segments. A cepstrum‐based F0 extractor that combines these methods has also been proposed [S. Ahmadi and S. S. Andreas, IEEE Trans. SAP. 7, 333–339 (1999)]. The proposed V/NV classification uses the ratio of the reliable fundamental frequency contour to the whole input interval. To evaluate the proposed method, we used 1360 voice commands and 736 noises from powered‐wheelchair control in a real environment. The results indicate a recall rate of 97.4% when the lowest threshold is selected for noise classification, with a precision of 97.3% in VAD.
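The ratio-based decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a YIN-style estimator has already produced one F0 value and one aperiodicity value per frame, and that a frame counts as "reliable" when its aperiodicity is low and its F0 lies in a plausible speech range. The function names, the aperiodicity limit, the F0 range, and the decision threshold are all hypothetical, since the abstract does not specify them.

```python
from typing import Sequence


def reliable_f0_ratio(
    f0_hz: Sequence[float],
    aperiodicity: Sequence[float],
    max_aperiodicity: float = 0.2,   # assumed reliability limit, not from the abstract
    f0_min_hz: float = 60.0,         # assumed lower bound of speech F0
    f0_max_hz: float = 400.0,        # assumed upper bound of speech F0
) -> float:
    """Fraction of frames in the input interval that carry a reliable F0."""
    if len(f0_hz) != len(aperiodicity):
        raise ValueError("f0_hz and aperiodicity must have the same length")
    if len(f0_hz) == 0:
        return 0.0
    # A frame is counted as reliable when it is periodic enough and its
    # F0 estimate falls inside the assumed speech range.
    reliable = sum(
        1
        for f0, ap in zip(f0_hz, aperiodicity)
        if ap <= max_aperiodicity and f0_min_hz <= f0 <= f0_max_hz
    )
    return reliable / len(f0_hz)


def is_voice(
    f0_hz: Sequence[float],
    aperiodicity: Sequence[float],
    min_ratio: float = 0.3,          # assumed decision threshold
) -> bool:
    """Classify the whole input interval as voice when enough frames are reliable."""
    return reliable_f0_ratio(f0_hz, aperiodicity) >= min_ratio


# Hypothetical per-frame values for a short voiced command and a noise burst.
command_f0 = [0.0, 120.0, 122.0, 125.0, 0.0]
command_ap = [0.9, 0.05, 0.08, 0.1, 0.8]
noise_f0 = [0.0, 950.0, 0.0, 80.0]
noise_ap = [0.9, 0.6, 0.95, 0.7]

command_is_voice = is_voice(command_f0, command_ap)
noise_is_voice = is_voice(noise_f0, noise_ap)
```

In this sketch the command interval has three reliable frames out of five and is accepted, while the noise interval has none and is rejected. Raising or lowering the assumed `min_ratio` trades recall against precision, which is the kind of threshold choice the reported results refer to.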
November 2006
Meeting abstract. No PDF available.
November 01 2006
Voice activity detection with voice/nonvoice classification using reliable fundamental frequency
Soo‐young Suk
Information Technol. Res. Inst., AIST, Central 2, Umezono, Tsukuba, Japan
Hiroaki Kojima
Information Technol. Res. Inst., AIST, Central 2, Umezono, Tsukuba, Japan
Hyun‐Yeol Chung
Yeungnam Univ.
J. Acoust. Soc. Am. 120, 3216 (2006)
Citation
Soo‐young Suk, Hiroaki Kojima, Hyun‐Yeol Chung; Voice activity detection with voice/nonvoice classification using reliable fundamental frequency. J. Acoust. Soc. Am. 1 November 2006; 120 (5_Supplement): 3216. https://doi.org/10.1121/1.4788157