This work presents a unified treatment of direction-of-arrival (DOA) estimation, speech activity detection, and noise suppression for a moving speaker, using a linear microphone array. The approach is based on the generalized likelihood ratio test applied to far-field, wideband moving sources (W-GLRT). It is shown that, under certain distributional assumptions, the W-GLRT provides a framework for evaluating DOA measurements against spurious DOAs, for probabilistic speech activity detection, and for speech enhancement. Regarding speech enhancement, we demonstrate the direct connection of the W-GLRT with enhancement based on subspace methods. In addition, through the concept of the directive a priori SNR, we demonstrate its indirect connection with minimum mean square error spectral amplitude (MMSE_SA) and log-spectral amplitude (MMSE_LSA) gain modification. The efficiency of the approach is illustrated on a moving speaker in the presence of either additive white Gaussian noise or babble noise in the acoustic field at very low SNRs.
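To make the far-field, wideband array model concrete, the following is a minimal sketch of a conventional steered-response-power DOA scan on a uniform linear array. It is not the W-GLRT statistic of the paper; it only illustrates how a far-field source at angle theta produces frequency-dependent phase shifts across the microphones, and how summing beamformer power over frequency bins yields a wideband DOA estimate. The function names, the 8-microphone geometry, the 0.05 m spacing, the frequency bins, and the speed of sound are all assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(theta_deg, freq_hz, n_mics, spacing_m):
    """Far-field phase shifts across a uniform linear array at one frequency."""
    # Delay between adjacent microphones for a plane wave from angle theta
    # (theta measured from broadside).
    delay = spacing_m * math.sin(math.radians(theta_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(n_mics)]


def steered_power(theta_deg, snapshots, freqs_hz, spacing_m):
    """Beamformer output power at angle theta, summed over frequency bins."""
    total = 0.0
    for x, f in zip(snapshots, freqs_hz):
        a = steering_vector(theta_deg, f, len(x), spacing_m)
        y = sum(ai.conjugate() * xi for ai, xi in zip(a, x))
        total += abs(y) ** 2
    return total


def estimate_doa(snapshots, freqs_hz, spacing_m, grid_deg=range(-90, 91)):
    """Return the grid angle (in degrees) that maximizes the steered power."""
    return max(grid_deg, key=lambda th: steered_power(th, snapshots, freqs_hz, spacing_m))


# Hypothetical noise-free test case: one source at 30 degrees, 8 microphones.
N_MICS = 8
SPACING_M = 0.05  # below half a wavelength at the highest bin, so no spatial aliasing
freqs = [500.0, 1000.0, 2000.0, 3000.0]
true_doa = 30
snapshots = [steering_vector(true_doa, f, N_MICS, SPACING_M) for f in freqs]

estimated = estimate_doa(snapshots, freqs, SPACING_M)
print(estimated)
```

In this sketch each frequency bin contributes one array snapshot. With noisy data one would average over several snapshots per bin, and for a moving speaker the scan would be repeated frame by frame.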
October 2004
October 06 2004
Speech activity detection and enhancement of a moving speaker based on the wideband generalized likelihood ratio and microphone arrays
Ilyas Potamitis
Wire Communications Laboratory, Electrical and Computer Engineering Department, University of Patras, Sofocleous-Adiparou 1, 265 00 Rion, Patras, Greece
Eran Fishler
Electrical Engineering Department, Princeton University, Princeton, New Jersey
J. Acoust. Soc. Am. 116, 2406–2415 (2004)
Article history
Received: August 15 2003
Accepted: June 11 2004
Citation
Ilyas Potamitis, Eran Fishler; Speech activity detection and enhancement of a moving speaker based on the wideband generalized likelihood ratio and microphone arrays. J. Acoust. Soc. Am. 1 October 2004; 116 (4): 2406–2415. https://doi.org/10.1121/1.1781622