Time-delay estimation (TDE), which measures the relative time delay between different receivers, is a fundamental approach for identifying, localizing, and tracking radiating sources. The generalized cross-correlation method is the most popular TDE method and is well explained in a landmark paper by Knapp and Carter [(1976). IEEE Trans. Acoust. Speech Signal Process. 24(4), 320–327]. Adaptive eigenvalue decomposition (EVD)-based algorithms have also been developed to improve TDE performance, especially in reverberant environments. This paper extends the adaptive EVD algorithm to exploit the sparsity of the transfer channels between the source and the receivers. Two estimation algorithms are proposed, based on log-sum and lp-norm penalized minor component analysis by excitatory and inhibitory learning rules (MCA EXIN). Simulations with uncorrelated noise, correlated noise, and reverberation at several signal-to-noise ratios show the improved estimation performance in noise and reverberation.

Time delay estimation (TDE), which estimates the relative time difference of arrival (TDOA) among spatially separated sensors, has played an important role in radar, sonar, and seismology for localizing radiating sources. Nowadays, TDE on a microphone array is applied to localize and track acoustic sources in a room for applications such as automatic camera tracking in video conferencing and microphone array beam steering for suppressing noise and reverberation in various communication and voice processing systems. TDE with only two receivers is also actively used in practical applications such as humanoid robotics (Murray et al., 2004; Ferreira et al., 2009; Trifa et al., 2007) and hearing aids (May et al., 2011). More recent works using just two receivers include studies by Cobos et al. (2011), Zhang and Rao (2010), and Cobos and Lopez (2010).

Among the many TDE methods, the generalized cross-correlation (GCC) method proposed by Knapp and Carter (1976) is the most popular. The delay estimate is obtained as the time lag that maximizes the cross-correlation between filtered versions of the received signals. Since then, many new ideas have been proposed to mitigate noise and reverberation. Benesty (2000) developed an adaptive eigenvalue decomposition (EVD) algorithm for the blind estimation of the two acoustic impulse responses from the source to the two microphones. The adaptive EVD algorithm for TDE performs better in reverberant environments than GCC-based methods. Doclo and Moonen (2003) extended the adaptive EVD algorithm for TDE to the spatiotemporally colored noise case using an adaptive generalized eigenvalue decomposition (GEVD) algorithm.
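As an illustration of the GCC idea described above, the following is a minimal sketch of a phase-transform (PHAT) weighted cross-correlation, assuming NumPy. The function name `gcc_phat`, the zero-padding choice, and the `max_lag` parameter are illustrative assumptions and are not taken from Knapp and Carter (1976).

```python
import numpy as np

def gcc_phat(x1, x2, max_lag=None, eps=1e-12):
    """Return the integer lag d (in samples) for which x2[k] ~ x1[k - d]."""
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)              # cross-power spectrum
    cross = cross / (np.abs(cross) + eps) # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation
    if max_lag is None:
        max_lag = n // 2
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

# Usage: a white sequence and a copy delayed by 10 samples.
rng = np.random.default_rng(0)
a = rng.standard_normal(4096)
b = np.concatenate((np.zeros(10), a[:-10]))
estimated_lag = gcc_phat(a, b, max_lag=50)
```

The PHAT weighting discards the magnitude of the cross-power spectrum, which sharpens the correlation peak for broadband sources; other GCC weightings would only change the line that normalizes `cross`.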

In practice, we often encounter sparse impulse responses, in which a small percentage of the components have a significant magnitude while the rest are zero or small (Hansler, 2006). Sparse impulse responses arise in many applications: network and acoustic echo cancellation, feedback cancellation in hearing aids, blind identification of acoustic impulse responses for time delay estimation, and source localization. Much recent work on sparse channel estimation uses the l1-norm penalty (Donoho, 2006; Li et al., 2017; Berger et al., 2010). These algorithms achieve good performance in sparse channel estimation applications (Bajwa et al., 2010; Taheri and Vorobyov, 2011; Lim and Pang, 2016b).

In this paper, we propose a new TDE algorithm that improves the adaptive EVD-based algorithm of Benesty (2000). The proposed algorithm exploits the sparsity of the propagation channel. To deal with sparsity, we reformulate the objective function of Benesty (2000) in Rayleigh quotient form and add the log-sum penalty. We also reformulate the objective function with the lp-norm penalty. In the simulations, we evaluate the two proposed TDE algorithms by comparing them with two existing algorithms, GCC and the adaptive EVD-based TDE. This paper is organized as follows. Section II discusses two models for the TDE problem. In Sec. III, the adaptive EVD algorithm for TDE, as previously proposed in Benesty (2000), is summarized. In Secs. IV and V, the two proposed algorithms are developed. Section VI gives simulation results and comparisons of the different algorithms.

This section presents two models often used for the TDE problem. The “ideal model” is described first, followed by the “real model,” which more accurately describes a real acoustic environment (Benesty, 2000).

A simple and widely used signal model for the classical TDE problem is as follows. Let $x_i(k)$, $i = 1, 2$, denote the ith receiver signal; then

$$ x_i(k) = \alpha_i\, s(k - \tau_i) + n_i(k), \quad i = 1, 2, \tag{1} $$

where $\alpha_i$ is the attenuation factor due to propagation effects, $\tau_i$ is the propagation time from the unknown source $s(k)$ to receiver $i$, and $n_i(k)$ is an additive noise signal at the ith receiver. It is assumed that $s(k)$, $n_1(k)$, and $n_2(k)$ are zero-mean, uncorrelated, and stationary Gaussian random processes. The relative delay between the two received signals 1 and 2 is defined as

$$ \tau_{12} = \tau_1 - \tau_2. \tag{2} $$

In a real acoustic environment we must consider the reverberation of the room; therefore, the ideal model no longer holds. A more complicated but more complete model for the received signals $x_i(k)$, $i = 1, 2$, can be expressed as follows:

$$ x_i(k) = g_i * s(k) + n_i(k), \quad i = 1, 2, \tag{3} $$

where $*$ denotes convolution and $g_i$ is the channel impulse response between the source $s(k)$ and the ith receiver. Moreover, $n_1(k)$ and $n_2(k)$ might be correlated, which is the case when the noise is directional, e.g., from a ceiling fan or an overhead projector (Benesty, 2000; Rubo et al., 2011; Netsch and Stachurski, 2014). For example, $n_1(k)$ is correlated with $n_2(k)$ when both originate from the same noise source. In this case, Eq. (3) can be rewritten as

(4)

where $\tau_{\mathrm{signal}}$ is the time delay between the signals at receiver 1 and receiver 2, and $\tau_{\mathrm{noise}}$ is the time delay between the noises at receiver 1 and receiver 2.

Benesty (2000) utilized the following relationship between the received signals 1 and 2:

$$ \mathbf{x}_1^T(n)\,\mathbf{g}_2 = \mathbf{x}_2^T(n)\,\mathbf{g}_1, \tag{5} $$

where $\mathbf{x}_i(n) = [x_i(n), x_i(n-1), \ldots, x_i(n-M+1)]^T$, $i = 1, 2$, is the vector of signal samples at the microphone outputs, “T” denotes the transpose of a vector or a matrix, and the impulse response vectors of length $M$ are defined as $\mathbf{g}_i = [g_{i,0}, g_{i,1}, \ldots, g_{i,M-1}]^T$, $i = 1, 2$.

This linear relation follows from the fact that, when noise is neglected, $x_i = s * g_i$, $i = 1, 2$; thus $x_1 * g_2 = s * g_1 * g_2 = x_2 * g_1$. The covariance matrix of the two microphone signals is

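$$ \mathbf{R} = \begin{bmatrix} \mathbf{R}_{x_1 x_1} & \mathbf{R}_{x_1 x_2} \\ \mathbf{R}_{x_2 x_1} & \mathbf{R}_{x_2 x_2} \end{bmatrix}, \tag{6} $$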

where $\mathbf{R}_{x_i x_j} = E\{\mathbf{x}_i(n)\,\mathbf{x}_j^T(n)\}$, $i, j = 1, 2$. Consider the $2M \times 1$ vector $\mathbf{w} = [\mathbf{g}_2^T, -\mathbf{g}_1^T]^T$.

From Eqs. (5) and (6), it can be seen that $\mathbf{R}\mathbf{w} = \mathbf{0}$, which means that the vector $\mathbf{w}$ (containing the two impulse responses) is an eigenvector of the covariance matrix $\mathbf{R}$ corresponding to the eigenvalue 0. Moreover, if the two impulse responses $\mathbf{g}_1$ and $\mathbf{g}_2$ have no common zeros and the autocorrelation matrix of the source signal $s(n)$ is full rank, which is assumed in the rest of this paper, the covariance matrix $\mathbf{R}$ has only one eigenvalue equal to zero (Benesty, 2000).
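The null-space property can be checked numerically. The sketch below, assuming NumPy, builds noise-free microphone signals from two random impulse responses and verifies that the stacked vector with the second block negated, [g2; −g1], is annihilated by the sample covariance matrix. The response length, the random responses, and the helper `frames` are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 8                                    # impulse response length (illustrative)
g1 = rng.standard_normal(M)
g2 = rng.standard_normal(M)
s = rng.standard_normal(5000)            # white source: full-rank autocorrelation

x1 = np.convolve(s, g1)[:len(s)]         # noise-free microphone signals
x2 = np.convolve(s, g2)[:len(s)]

def frames(x, M):
    """Rows are [x(n), x(n-1), ..., x(n-M+1)] for n = M-1, ..., len(x)-1."""
    return np.stack([x[n - M + 1:n + 1][::-1] for n in range(M - 1, len(x))])

Z = np.hstack((frames(x1, M), frames(x2, M)))  # rows are z(n)^T = [x1(n)^T, x2(n)^T]
R = Z.T @ Z / len(Z)                           # sample covariance matrix

w = np.concatenate((g2, -g1))                  # stacked responses, second block negated
w /= np.linalg.norm(w)

residual = np.linalg.norm(R @ w)               # numerically zero
eigvals, eigvecs = np.linalg.eigh(R)           # eigenvalues in ascending order
alignment = abs(eigvecs[:, 0] @ w)             # close to 1: w is the minor eigenvector
```

With additive noise the smallest eigenvalue is no longer zero, which is why the algorithms discussed next search for the eigenvector of the minimum eigenvalue rather than an exact null vector.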

In practice, accurate estimation of the vector $\mathbf{w}$ is not trivial because of the nature of speech, the length of the impulse responses, the background noise, etc. However, for this application, we only need an efficient way to detect the direct paths of the two impulse responses. In the following, it is explained how this can be done. Benesty (2000) derived the objective function for the optimal vector $\mathbf{w}_{\mathrm{opt}}$ as the minimization of $\mathbf{w}^T\mathbf{R}\mathbf{w}$ with respect to $\mathbf{w}$, subject to $\|\mathbf{w}\|^2 = \mathbf{w}^T\mathbf{w} = 1$.

Benesty (2000) proposed a simple algorithm that iteratively estimates the eigenvector (here $\mathbf{w}$) corresponding to the minimum eigenvalue of $\mathbf{R}$, using an algorithm similar to the Frost algorithm, which is a simple constrained least-mean-square (LMS) algorithm.
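A minimal sketch of such a norm-constrained LMS iteration is given below, assuming NumPy. It follows the commonly cited simplified form (an LMS step on the error e(n) = wᵀz(n) followed by renormalization); the single-tap initialization at M/2, the function name, and the peak-picking rule for the delay are assumptions of this sketch rather than details confirmed by the text above.

```python
import numpy as np

def adaptive_evd_tde(x1, x2, M=30, mu=0.001):
    """Track the minor eigenvector w ~ [g2; -g1]; return (delay of x2 vs x1, w)."""
    w = np.zeros(2 * M)
    w[M // 2] = 1.0                       # assumed start: one direct-path tap in g2
    for n in range(M - 1, len(x1)):
        z = np.concatenate((x1[n - M + 1:n + 1][::-1],
                            x2[n - M + 1:n + 1][::-1]))
        e = w @ z                         # error signal e(n) = w^T z(n)
        w = w - mu * e * z                # LMS step that reduces w^T R w
        w /= np.linalg.norm(w)            # re-impose the unit-norm constraint
    g2_hat, g1_hat = w[:M], -w[M:]
    # The direct paths are taken as the largest-magnitude taps of each response.
    delay = int(np.argmax(np.abs(g2_hat)) - np.argmax(np.abs(g1_hat)))
    return delay, w

# Usage: free-field case where x2 is x1 delayed by 10 samples.
rng = np.random.default_rng(0)
src = rng.standard_normal(20000)
mic1 = src
mic2 = np.concatenate((np.zeros(10), src[:-10]))
delay_hat, w_hat = adaptive_evd_tde(mic1, mic2)
```

Placing the initial tap in the middle of the first block lets the estimate represent both positive and negative delays within the filter length.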

The minimization problem derived by Benesty (2000) can be formulated with the Lagrange multiplier method

(7)

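where, with $\mathbf{z} = [\mathbf{x}_1^T, \mathbf{x}_2^T]^T$,

$$ \mathbf{R} = E\{\mathbf{z}\mathbf{z}^T\} = \begin{bmatrix} E\{\mathbf{x}_1\mathbf{x}_1^T\} & E\{\mathbf{x}_1\mathbf{x}_2^T\} \\ E\{\mathbf{x}_2\mathbf{x}_1^T\} & E\{\mathbf{x}_2\mathbf{x}_2^T\} \end{bmatrix} = \begin{bmatrix} \mathbf{R}_{x_1 x_1} & \mathbf{R}_{x_1 x_2} \\ \mathbf{R}_{x_2 x_1} & \mathbf{R}_{x_2 x_2} \end{bmatrix}. $$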

The above optimization problem can also be reformulated with the Rayleigh quotient (Cirrincione and Cirrincione, 2010; Cichocki and Amari, 2002)

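$$ \min_{\mathbf{w}} \; \frac{\mathbf{w}^T \mathbf{R}\, \mathbf{w}}{\mathbf{w}^T \mathbf{w}}. \tag{8} $$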

In general, the impulse response $g_i$ is sparse, because a large fraction of its energy is concentrated in a small fraction of its duration (Hansler, 2006). We can exploit this sparsity to solve Eq. (8) with improved accuracy.

In Lim and Pang (2016a), the authors proposed a time-recursive MCA EXIN with log-sum regularization (the so-called RZA-TLS EXIN) to estimate the coefficients of a sparse system,

(9)

where $w_i(k)$ is the ith component of $\mathbf{w}(k)$. Because the impulse response $g_i$ is sparse, the augmented impulse response vector $\mathbf{w}$ is also sparse. Therefore, for time delay estimation we can replace the objective function of Eq. (8) with that of Eq. (9).

For the update equation from Eq. (9), the steepest descent method and subgradient concept were used in Lim and Pang (2016a). The update equation is as follows:

(10)

where subg(·) is the sub-gradient of the convex function g(·) (Chen et al., 2009) and η=μα.

(11)

where $\mu$ is the learning rate and $\varepsilon = 1/\varepsilon'$, with $\varepsilon'$ the constant inside the log-sum penalty. The third term in Eq. (11) acts as a reweighted zero attractor, which takes effect only on taps whose magnitudes are comparable to $1/\varepsilon$; little shrinkage is exerted on the taps for which $|w_i(n)| \gg 1/\varepsilon$ (Chen et al., 2009).
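The selective behavior of a reweighted zero attractor can be illustrated with a short NumPy sketch. The element-wise form sgn(w_i)/(1 + ε|w_i|) used here is the one from the reweighted zero-attracting LMS of Chen et al. (2009); that the attractor in Eq. (11) has exactly this form, and the value ε = 10, are assumptions of the sketch.

```python
import numpy as np

def reweighted_zero_attractor(w, eps):
    """Element-wise shrinkage direction sgn(w_i) / (1 + eps * |w_i|)."""
    return np.sign(w) / (1.0 + eps * np.abs(w))

eps = 10.0                                   # so 1/eps = 0.1 (illustrative value)
taps = np.array([0.0, 0.01, 0.1, 1.0, -5.0])
pull = reweighted_zero_attractor(taps, eps)
# Taps well below 1/eps are pulled toward zero with almost full strength,
# the tap equal to 1/eps with half strength, and large taps are barely shrunk.
```

In an update equation this vector would be scaled by a small constant and subtracted from the coefficient vector at each step, so that inactive taps are driven to zero while dominant taps are left nearly unbiased.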

The instantaneous approximation algorithm applied to Eq. (11) yields

(12)

The log-sum penalty term in Eq. (9) acts as an approximation of the l0-norm penalty. We consider another penalty function that is closer to the l0-norm penalty: the lp-norm with $0 < p < 1$. The lp-norm has been introduced to LMS-based sparse channel estimation in Taheri and Vorobyov (2011), Taheri and Vorobyov (2014), and Wu and Tong (2013). In this case, the cost function becomes

(13)

where $\|\cdot\|_{l_p}$ stands for the lp-norm of a vector and $\gamma_p$ is the weighting constant. By using the steepest descent method and the subgradient concept as in Eq. (10), the update equation is as follows:

(14)

where $\mu_p = \mu\gamma_p$. Furthermore, we can set an upper bound on the last term in Eq. (14) in order to cope with an element of $\mathbf{w}(k)$ approaching zero in a sparse channel impulse response. Then the update equation is modified as

(15)

where εp is a value for bounding the last term in Eq. (15) (Taheri and Vorobyov, 2011; Taheri and Vorobyov, 2014).
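The effect of the bound can be seen in a small NumPy sketch. It uses the gradient of the p-norm-like penalty Σ|w_i|^p, which is p·sgn(w_i)|w_i|^(p−1) and grows without limit as a tap approaches zero, and the bounded variant p·sgn(w_i)/(ε_p + |w_i|^(1−p)). The exact expression in Eq. (15) is not reproduced here and may differ, for example by a norm-dependent scale factor, so this form is an assumption.

```python
import numpy as np

def lp_attractor(w, p=0.5, eps_p=0.5):
    """Bounded penalty gradient p * sgn(w_i) / (eps_p + |w_i|^(1 - p))."""
    return p * np.sign(w) / (eps_p + np.abs(w) ** (1.0 - p))

taps = np.array([0.0, 1e-6, 0.01, 1.0, -4.0])
pull_p = lp_attractor(taps)
# The magnitude never exceeds p / eps_p (here 1.0), even for taps near zero,
# whereas the unbounded gradient p * |w_i|^(p - 1) would diverge there.
```

The values p = 1/2 and ε_p = 0.5 used as defaults match the parameter choices reported for the simulations later in the paper.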

In this section, we compare the performance of different time delay estimation algorithms in three scenarios. The algorithms considered are GCC (Knapp and Carter, 1976) and the adaptive eigenvalue decomposition algorithm (Benesty, 2000), as well as the proposed log-sum penalized MCA EXIN-based and lp-norm penalized MCA EXIN-based time delay estimation algorithms.

In the first experiment, we consider two receiving sensors. In most practical applications, the source signals of interest are correlated. We therefore generated the source signal as a first-order autoregressive (AR) process, viz., $s_0(k) = 0.7\,s_0(k-1) + w(k)$, where $w(k)$ is a white Gaussian process (So, 2001). Assuming that the signals propagate in free space, the first signal is $x_1(k) = s_0(k)$ and the second signal $x_2(k)$ is a time-delayed version of it with a delay of ten time steps, i.e., $x_2(k) = x_1(k-10)$. The two signals $x_1(k)$ and $x_2(k)$ are then contaminated by two real white Gaussian noises $n_1(k)$ and $n_2(k)$, respectively. We verified that the two noise sequences are mutually uncorrelated.

The signal sequences were scaled to obtain the desired signal-to-noise ratio (SNR) and added to the noise sequences as in Eq. (1) to form the sensor outputs $x_1(k)$ and $x_2(k)$. SNRs from approximately 20 dB down to −10 dB were considered, where SNR $= \sigma_x^2/\sigma_n^2$.
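The data generation for this experiment can be sketched as follows, assuming NumPy. The function name, the sequence length, and the use of sample variances for the scaling are illustrative assumptions; the AR coefficient 0.7, the ten-sample delay, and the definition of SNR as a variance ratio are taken from the description above.

```python
import numpy as np

def make_sensor_pair(n_samples, delay, snr_db, rng):
    """Return noisy (x1, x2) where the clean x2(k) equals the clean x1(k - delay)."""
    n = n_samples + delay
    drive = rng.standard_normal(n)            # white Gaussian driving process
    s0 = np.empty(n)
    s0[0] = drive[0]
    for k in range(1, n):                     # first-order AR source
        s0[k] = 0.7 * s0[k - 1] + drive[k]
    x1_clean = s0[delay:]                     # x1(k) = s0(k + delay)
    x2_clean = s0[:n_samples]                 # hence x2(k) = x1(k - delay)
    noise = rng.standard_normal((2, n_samples))   # mutually independent noises
    # Scale the signal so that var(signal) / var(noise) matches the target SNR.
    gain = np.sqrt(10.0 ** (snr_db / 10.0) * noise.var() / x1_clean.var())
    return gain * x1_clean + noise[0], gain * x2_clean + noise[1]

rng = np.random.default_rng(1)
x1, x2 = make_sensor_pair(20000, delay=10, snr_db=20.0, rng=rng)
```

Because the AR source is colored, its autocorrelation decays slowly around the true lag, which is one reason the low-SNR cases are harder than they would be for a white source.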

The sequences $x_1(k)$ and $x_2(k)$ were processed using the traditional GCC, the adaptive EVD-based method, and the two proposed algorithms. We set the step size to 0.001 and the filter length to 30 in the adaptive EVD-based algorithm and the proposed algorithms. For the lp-norm penalized algorithm, $p$ is chosen to be 1/2 with $\epsilon_p = 0.5$ and $\rho_p = 10^{-4}$. The parameters of the log-sum penalized algorithm are set to $\alpha = 0.01$ and $\epsilon = 0.9$. For GCC, we apply the Phase Transform–Generalized Cross Correlation (PHAT-GCC) algorithm with a fast Fourier transform size of 2048 samples at a 16 kHz sampling rate.

We used the root-mean-squared delay error (RMSD) as the performance measure. All results are averages over 500 independent trials. Figure 1 shows the RMSD of the four algorithms. When SNR  −5 dB, the proposed methods outperform the adaptive EVD method and the GCC. In particular, the lp-norm penalized algorithm is superior to the others, which indicates that the lp-norm penalty handles the sparsity better than the log-sum penalty. From the first experiment, we conclude that applying a sparsity penalty to the delay estimation algorithm improves its performance.
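For reference, the error measure can be computed as in the following sketch, assuming NumPy. The function name and the example estimates are hypothetical and are not results from the paper.

```python
import numpy as np

def rmsd(estimates, true_delay):
    """Root-mean-squared delay error over a set of independent trials."""
    err = np.asarray(estimates, dtype=float) - true_delay
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical estimates from four trials with a true delay of 10 samples.
example_rmsd = rmsd([10, 10, 12, 8], true_delay=10)
```

A single gross outlier, such as a trial that locks onto a wrong correlation peak, dominates this measure, so RMSD curves mostly reflect how often an algorithm fails rather than its fine-scale accuracy.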

FIG. 1.

RMSD comparison in the uncorrelated additive noise environments (-○-: log-sum penalized algorithm, --: lp-norm penalized algorithm, -×-: GCC (Knapp and Carter, 1976), --: adaptive EVD method (Benesty, 2000)).

In the second experiment, the case of correlated noise was considered. This case frequently occurs when the noise is directional, such as from a ceiling fan or an overhead projector. We generated the correlated noise as $n_1(k) = n_2(k-5)$, where $n_2(k)$ is real white Gaussian noise. The other algorithm parameters are the same as in the first experiment. The RMSD was again computed from 500 independent trials. Figure 2 shows that the two proposed estimators outperform the GCC and the adaptive EVD method. From these results, we conclude that adding the sparsity penalty term improves the performance of the adaptive EVD method.
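The correlated noise pair described above can be generated as in this short sketch, assuming NumPy; the function name and sequence length are illustrative.

```python
import numpy as np

def correlated_noise_pair(n_samples, noise_delay, rng):
    """Return (n1, n2) with n1(k) = n2(k - noise_delay) and n2 white Gaussian."""
    base = rng.standard_normal(n_samples + noise_delay)
    n2 = base[noise_delay:]          # n2(k) = base(k + noise_delay)
    n1 = base[:n_samples]            # n1(k) = base(k) = n2(k - noise_delay)
    return n1, n2

rng = np.random.default_rng(2)
n1, n2 = correlated_noise_pair(10000, noise_delay=5, rng=rng)
```

Such a noise pair produces its own cross-correlation peak at the noise lag, which competes with the peak at the signal lag and makes this scenario harder for correlation-based estimators.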

FIG. 2.

RMSD comparison in the correlated additive noise environments (-○-: log-sum penalized algorithm, : lp-norm penalized algorithm, -×-: GCC (Knapp and Carter, 1976), --: adaptive EVD method (Benesty, 2000)).

In the third experiment, we simulate realistic reverberation conditions. We simulated a room with dimensions [5 m, 4 m, 2 m] and two different reverberation times, T60 = 250 ms and 1 s. The reverberation time T60 can be expressed as a function of the absorption coefficient γ of the walls, according to Eyring's formula (Doclo and Moonen, 2003; Everest, 2001)

$$ T_{60} = \frac{0.163\, V}{-S \ln(1 - \gamma)}, \tag{16} $$

with V the volume of the room and S the total surface of the room. As shown in Fig. 3, the room contains a microphone array with two omnidirectional microphones at positions [1 m, 1 m, 1 m] and [1.5 m, 1 m, 1 m], and a sound source at position [2 m, 2 m, 1.7 m]. Microphone 2 is set as the reference. The received signals $x_1(k)$ and $x_2(k)$ are versions of the $s_0(k)$ of experiment 1 filtered by simulated acoustic impulse responses, which were constructed by the image method (Lehmann and Johansson, 2008; Lehmann, 2008; Jarrett et al., 2011; De Sena et al., 2015) with a filter length $L = 1000$. The exact time delay between the signal components is −12.18 samples, which was obtained by a simple geometrical calculation for the sampling frequency $f_s = 16$ kHz. Noise was generated by considering uncorrelated white noise sources equally distributed over all directions. We performed simulations using the adaptive eigenvalue decomposition algorithm (adaptive EVD), GCC, and the two proposed algorithms for two SNRs (10 dB and 0 dB). In this experiment, we performed 100 independent trials.
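The geometrical delay and the wall absorption implied by a given reverberation time can be worked out as in the sketch below, assuming NumPy. The speed of sound of 340 m/s is not stated in the text; it is assumed here because it reproduces the quoted delay. The Eyring constant 0.163 and the sign convention (arrival at microphone 2 minus arrival at microphone 1) are likewise assumptions.

```python
import numpy as np

c = 340.0                                # assumed speed of sound in m/s
fs = 16000.0                             # sampling frequency in Hz
mic1 = np.array([1.0, 1.0, 1.0])
mic2 = np.array([1.5, 1.0, 1.0])         # reference microphone
src = np.array([2.0, 2.0, 1.7])

d1 = np.linalg.norm(src - mic1)          # source-to-microphone distances in m
d2 = np.linalg.norm(src - mic2)
delay_samples = (d2 - d1) / c * fs       # about -12.18 samples under these assumptions

def eyring_absorption(t60, dims, k=0.163):
    """Invert T60 = k V / (-S ln(1 - gamma)) for the absorption coefficient gamma."""
    lx, ly, lz = dims
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 1.0 - np.exp(-k * volume / (surface * t60))

gamma_250ms = eyring_absorption(0.25, (5.0, 4.0, 2.0))
gamma_1s = eyring_absorption(1.0, (5.0, 4.0, 2.0))
```

The true delay is fractional, so integer-lag estimators can at best return −12 samples; the longer reverberation time corresponds to the smaller absorption coefficient, i.e., more reflective walls.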

FIG. 3.

Chamber floor plan with the position of two microphones and one sound source.

Figures 4 and 5 show histograms of the TDE results with a pair of omnidirectional microphones for T60 = 250 ms and 1 s. Comparison results for SNR = 10 dB are shown in Figs. 4(a)–4(h), and comparison results for SNR = 0 dB are shown in Figs. 5(a)–5(h). The inverted triangle in the figures points to the true delay. In all cases, the two proposed algorithms maintain accurate estimates. The lp-norm penalized algorithm performs better than the log-sum penalized algorithm (labeled l1 MCA EXIN in the figures), especially in the low SNR case. From these results, we conclude that the two proposed algorithms are more accurate than GCC and the adaptive EVD method, and that they improve the robustness of the adaptive EVD algorithm in reverberant environments.

FIG. 4.

(Color online) Comparison of TDE in SNR = 10 dB, (a) proposed l1 MCA EXIN in T60 = 250 ms, (b) proposed lp MCA EXIN in T60 = 250 ms, (c) GCC in T60 = 250 ms, (d) adaptive EVD in T60 = 250 ms, (e) proposed l1 MCA EXIN in T60 = 1 s, (f) proposed lp MCA EXIN in T60 = 1 s, (g) GCC in T60 = 1 s, (h) adaptive EVD in T60 = 1 s.

FIG. 5.

(Color online) Comparison of TDE in SNR = 0 dB, (a) proposed l1 MCA EXIN in T60 = 250 ms, (b) proposed lp MCA EXIN in T60 = 250 ms, (c) GCC in T60 = 250 ms, (d) adaptive EVD in T60 = 250 ms, (e) proposed l1 MCA EXIN in T60 = 1 s, (f) proposed lp MCA EXIN in T60 = 1 s, (g) GCC in T60 = 1 s, (h) adaptive EVD in T60 = 1 s.

This paper proposed two new algorithms for time delay estimation. The algorithms estimate the time delay from the eigenvector of the minimum eigenvalue, which is estimated by log-sum penalized MCA EXIN and lp-norm penalized MCA EXIN, respectively. In the simulations, the two proposed algorithms were more robust than GCC and the adaptive EVD method in reverberant environments as well as in uncorrelated and correlated noise environments. These results provide a starting point, and further research is needed to confirm the proposed algorithms by testing them in real reverberant facilities.

This work was supported by the Agency for Defense Development (ADD) in Korea (UD160015DD).

1. Bajwa, W. U. Z., Haupt, J. D., Sayeed, A. M., and Nowak, R. D. (2010). “Compressed channel sensing: A new approach to estimating sparse multipath channels,” Proc. IEEE 98(6), 1058–1076.
2. Benesty, J. (2000). “Adaptive eigenvalue decomposition algorithm for passive acoustic source localization,” J. Acoust. Soc. Am. 107(1), 384–391.
3. Berger, C. R., Wang, Z., Huang, J., and Zhou, S. (2010). “Application of compressive sensing to sparse channel estimation,” IEEE Commun. Mag. 48(11), 164–174.
4. Chen, Y., Gu, Y., and Hero, A. O. (2009). “Sparse LMS for system identification,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, April 19–24, Taipei, Taiwan, pp. 3125–3128.
5. Cichocki, A., and Amari, S.-I. (2002). Adaptive Blind Signal and Image Processing (John Wiley & Sons, Ltd., New York).
6. Cirrincione, G., and Cirrincione, M. (2010). Neural-Based Orthogonal Data Fitting: The EXIN Neural Networks (John Wiley & Sons, Inc., New York).
7. Cobos, M., and Lopez, J. J. (2010). “Two-microphone separation of speech mixtures based on interclass variance maximization,” J. Acoust. Soc. Am. 127(3), 1661–1672.
8. Cobos, M., Lopez, J. J., and Martinez, D. (2011). “Two-microphone multi-speaker localization based on a Laplacian mixture model,” Digital Signal Process. 21(1), 66–76.
9. De Sena, E., Antonello, N., Moonen, M., and Van Waterschoot, T. (2015). “On the modeling of rectangular geometries in room acoustic simulations,” IEEE/ACM Trans. Audio, Speech Lang. Process. 23(4), 774–786.
10. Doclo, S., and Moonen, M. (2003). “Robust adaptive time delay estimation for speaker localization in noisy and reverberant acoustic environments,” EURASIP J. Appl. Signal Process. 2003, 1110–1124.
11. Donoho, D. L. (2006). “Compressed sensing,” IEEE Trans. Inf. Theory 52(4), 1289–1306.
12. Everest, F. (2001). The Master Handbook of Acoustics, 4th ed. (McGraw-Hill, New York).
13. Ferreira, J. F., Pinho, C., and Dias, J. (2009). “Implementation and calibration of a Bayesian binaural system for 3D localisation,” in Proceedings of the IEEE International Conference on Robotics and Biomimetics ROBIO 2008, February 22–25, Bangkok, Thailand, pp. 1722–1727.
14. Hansler, E. (2006). Topics in Acoustic Echo and Noise Control: Selected Methods for the Cancellation of Acoustical Echoes, the Reduction of Background Noise, and Speech Processing (Springer-Verlag, Berlin).
15. Jarrett, D. P., Habets, E. A., Thomas, M. R., and Naylor, P. A. (2011). “Simulating room impulse responses for spherical microphone arrays,” in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 22–27, Prague, Czech Republic, pp. 129–132.
16. Knapp, C., and Carter, G. (1976). “The generalized correlation method for estimation of time delay,” IEEE Trans. Acoust. Speech Signal Process. 24(4), 320–327.
17. Lehmann, E. (2008). “Image-source method: matlab code implementation,” http://www.eric-lehmann.com/ (Last viewed February 21, 2018).
18. Lehmann, E. A., and Johansson, A. M. (2008). “Prediction of energy decay in room impulse responses simulated with an image-source model,” J. Acoust. Soc. Am. 124(1), 269–277.
19. Li, Y., Wang, Y., and Jiang, T. (2017). “Sparse least mean mixed-norm adaptive filtering algorithms for sparse channel estimation applications,” Int. J. Commun. Syst. 30(8), 1–14.
20. Lim, J., and Pang, H.-S. (2016a). “Reweighted l1 regularized TLS linear neuron for the sparse system identification,” Neurocomputing 173, 1972–1975.
21. Lim, J.-S., and Pang, H.-S. (2016b). “l1-regularized recursive total least squares based sparse system identification for the error-in-variables,” SpringerPlus 5(1), 1460.
22. May, T., van de Par, S., and Kohlrausch, A. (2011). “A probabilistic model for robust localization based on a binaural auditory front-end,” IEEE Trans. Audio Speech Lang. Process. 19(1), 1–13.
23. Murray, J. C., Erwin, H., and Wermter, S. (2004). “Robotics sound-source localization and tracking using interaural time difference and cross-correlation,” in Proceedings of the AI Workshop on NeuroBotics, September 20, Ulm, Germany, pp. 89–97.
24. Netsch, L., and Stachurski, J. (2014). “Robust low-resource sound localization in correlated noise,” in Proceedings of the Fifteenth Annual Conference of the International Speech Communication Association, September 14–18, Singapore, pp. 2218–2222.
25. Rubo, Z., Guanqun, L., and Xueyao, L. (2011). “A time-delay estimation method against correlated noise,” Proc. Eng. 23, 445–450.
26. So, H. (2001). “On time delay estimation using an FIR filter,” Signal Process. 81(8), 1777–1782.
27. Taheri, O., and Vorobyov, S. A. (2011). “Sparse channel estimation with lp-norm and reweighted l1-norm penalized least mean squares,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22–27, Prague, Czech Republic, pp. 2864–2867.
28. Taheri, O., and Vorobyov, S. A. (2014). “Reweighted l1-norm penalized LMS for sparse channel estimation and its analysis,” Signal Process. 104, 70–79.
29. Trifa, V. M., Koene, A., Morén, J., and Cheng, G. (2007). “Real-time acoustic source localization in noisy environments for human-robot multimodal interaction,” in Proceedings of the 16th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2007, August 26–29, Jeju, South Korea, pp. 393–398.
30. Wu, F., and Tong, F. (2013). “Gradient optimization p-norm-like constraint LMS algorithm for sparse system estimation,” Signal Process. 93(4), 967–971.
31. Zhang, W., and Rao, B. D. (2010). “A two microphone-based approach for source localization of multiple speech sources,” IEEE Trans. Audio Speech Lang. Process. 18(8), 1913–1928.