This work presents a unified treatment of direction-of-arrival (DOA) estimation, speech activity detection, and noise suppression for a moving speaker recorded with a linear microphone array. The approach is based on the generalized likelihood ratio test applied to far-field, wideband moving sources (W-GLRT). It is shown that, under certain distributional assumptions, the W-GLRT provides a framework for evaluating DOA measurements against spurious DOAs, for probabilistic speech activity detection, and for speech enhancement. Regarding speech enhancement, we demonstrate the direct connection of the W-GLRT with enhancement based on subspace methods. In addition, through the concept of a directive a priori SNR, we demonstrate its indirect connection with minimum mean square error spectral (MMSE_SA) and log-spectral (MMSE_LSA) gain modification. The efficiency of the approach is illustrated for a moving speaker when either additive white Gaussian noise or babble noise is present in the acoustic field at very low SNRs.
