Technology barriers still prevent speech processing systems that operate in extremely noisy conditions from meeting the demands of modern applications. These systems often require a noise reduction system working in combination with a precise voice activity detector (VAD). This paper presents statistical likelihood ratio tests formulated in terms of the integrated bispectrum of the noisy signal. The integrated bispectrum is defined as a cross spectrum between the signal and its square, and is therefore a function of a single frequency variable. It inherits the ability of higher order statistics to detect signals in noise, with additional advantages: (i) its computation as a cross spectrum leads to significant computational savings, and (ii) the variance of the estimator is of the same order as that of the power spectrum estimator. The proposed approach incorporates contextual information into the decision rule, a strategy that has been reported to yield significant benefits in robust speech recognition applications. The proposed VAD is compared to the G.729, adaptive multirate, and advanced front-end standards, as well as to recently reported algorithms, and shows a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
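The definition of the integrated bispectrum as a cross spectrum between the signal and its square can be illustrated with a short sketch. It is not the authors' implementation: the function name `integrated_bispectrum`, the frame length, the hop size, the Hann window, the mean removal, and the conjugation convention E{Y(ω)X*(ω)} are all assumptions chosen for illustration, and the estimator is a plain averaged cross-periodogram.

```python
import numpy as np

def integrated_bispectrum(x, frame_len=256, hop=128):
    """Averaged cross-periodogram between y = x^2 - mean(x^2) and x.

    Hypothetical helper: frame_len, hop, and the Hann window are
    illustrative choices, not values taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                 # remove the DC component
    y = x ** 2 - np.mean(x ** 2)     # zero-mean squared signal
    n_frames = 1 + (len(x) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    acc = np.zeros(frame_len // 2 + 1, dtype=complex)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        X = np.fft.rfft(window * x[seg])
        Y = np.fft.rfft(window * y[seg])
        acc += Y * np.conj(X)        # cross spectrum of y and x
    # Normalize by frame count and window energy.
    return acc / (n_frames * np.sum(window ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gaussian = rng.standard_normal(32768)      # symmetric: third-order moment is zero
    skewed = rng.exponential(1.0, 32768)       # asymmetric: nonzero third-order moment
    print(np.mean(np.abs(integrated_bispectrum(gaussian))))
    print(np.mean(np.abs(integrated_bispectrum(skewed))))
```

The result depends on a single frequency variable and costs about the same as a cross-spectrum estimate, which is the computational saving the abstract refers to. For symmetric Gaussian noise the estimate should stay close to zero, while a skewed signal gives a clearly larger magnitude.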
May 01 2007
Statistical voice activity detection based on integrated bispectrum likelihood ratio tests for robust speech recognition
J. Ramírez a)
Department of Signal Theory, Networking and Communications, University of Granada, Granada, Spain
J. M. Górriz
Department of Signal Theory, Networking and Communications, University of Granada, Granada, Spain
J. C. Segura
Department of Signal Theory, Networking and Communications, University of Granada, Granada, Spain
a) Electronic mail: [email protected]
J. Acoust. Soc. Am. 121, 2946–2958 (2007)
Article history
Received: November 20 2006
Accepted: February 13 2007
Citation
J. Ramírez, J. M. Górriz, J. C. Segura; Statistical voice activity detection based on integrated bispectrum likelihood ratio tests for robust speech recognition. J. Acoust. Soc. Am. 1 May 2007; 121 (5): 2946–2958. https://doi.org/10.1121/1.2714915