Reconstructing the acoustic source distribution via imposing a sparsity constraint on a continuum, the atomic norm minimization (ANM) based grid-free compressive beamforming can eliminate the basis mismatch of conventional grid-based compressive beamforming. However, it works well only for sufficiently separated sources, which prohibits high resolution. The drawback arises because it uses an atomic norm to measure the source sparsity, while the atomic norm is not a direct sparse metric and its minimization is equivalent to the sparsity constraint only when the sources are sufficiently separated. This paper devotes itself to overcoming the drawback for the two-dimensional ANM based grid-free compressive beamforming. First, a sparse metric that can promote sparsity to a greater extent than the atomic norm is proposed. Then, using this metric a minimization problem is formulated and the majorization−minimization (MM) solving algorithm is introduced. MM iteratively conducts atomic norm minimization with a sound reweighting strategy, and therefore the developed method can be termed as iterative reweighted atomic norm minimization (IRANM). Both simulations and experiments demonstrate that whether a standard uniform rectangular array or a non-uniform array constituted by a small number of microphones is utilized, IRANM can overcome the drawback and thus enhance the resolution.

Compressive sensing1–3 based beamforming, or simply “compressive beamforming” for short, is a powerful approach to estimate the direction-of-arrivals (DOAs) and quantify the strengths of acoustic sources. Reconstructing the source distribution from noisy measurements of the wavefield with an array of microphones via solving an underdetermined system of equations with a sparsity constraint, the method can offer clear imaging even with a small number of microphones.4–7 

Conventional compressive beamforming sets up a discrete grid of look directions and assumes all the DOAs to lie on the grid. Imposing the sparsity constraint is minimizing the 1-norm of the vector composed by the source strengths in all the look directions. When the DOAs do not coincide with the look directions, the reconstruction is inaccurate. The problem is referred to as basis mismatch6,8 and often occurs in practical applications. In 2015, Xenaki and Gerstoft9 developed a grid-free compressive beamforming strategy for one-dimensional DOA estimation with linear microphone arrays. It treats the DOA as a continuum, introduces the atomic norm of source strength and substitutes it for the 1-norm in conventional compressive beamforming. In 2017, the authors in this paper developed a grid-free compressive beamforming strategy for two-dimensional DOA estimation with planar microphone arrays.10 In our strategy, the DOA is also treated as a continuum and the atomic norm of microphone pressure induced by acoustic sources is defined. Minimizing the atomic norm can effectively denoise the measured pressure and therefore obtain the pressure from acoustic sources. Utilizing the obtained pressure, the source distribution is reconstructed through matrix enhancement and matrix pencil (MEMP) method.11 The atomic norm minimization (ANM) based grid-free compressive beamforming can fundamentally eliminate the basis mismatch, but requires the sources to meet a separation condition.9,10,12 This can be ascribed to the fact that the atomic norm is not a direct sparse metric, and its minimization is equivalent to the sparsity constraint only when the sources are sufficiently separated. For sources that do not meet the separation condition, the grid-free compressive beamforming provides wrong reconstructions. The smaller the minimum separation of the sources that can be identified accurately is, the higher the resolution is. The separation condition prohibits high resolution. It is of significance for the resolution enhancement and function perfection of the grid-free compressive beamforming to explore an effective method to overcome the drawback. Considering the broad application space of planar microphone arrays, this paper is devoted to solving this issue for the two-dimensional grid-free compressive beamforming. An iterative reweighted atomic norm minimization (IRANM) method is presented.

Iterative reweighted minimization is not a fresh term. In the field of one-dimensional frequency estimation with compressive sensing, the iterative reweighted 1-norm minimization,13 the iterative reweighted 2-norm minimization14,15 and the iterative reweighted atomic norm minimization16 have been developed. All these methods improve the performance via enhancing the sparsity constraint. The method presented in this paper is also inspired by the idea. First, we propose a novel sparse metric to substitute for the atomic norm. The metric can promote sparsity to a greater extent than the atomic norm. Then, we solve the minimization problem of the metric through majorization−minimization (MM) algorithm.17,18 Finally, we interpret the method as IRANM. Both simulations and experiments demonstrate that IRANM can break the separation condition limit and therefore enhance the resolution.

Throughout this paper, and denote the sets of real and complex numbers, respectively. Bold lowercase letters are reserved for vectors and bold uppercase letters for matrices. The measured pressure is marked by . The superscripts T, and H denote the transpose, the conjugate and the Hermitian operator, respectively, on vectors and matrices. The symbol denotes the Kronecker product. The symbol || denotes the modulus of a scalar, the determinant of a matrix or the cardinality of a set. The generalized inequality X¯0 denotes that the matrix X is positive semidefinite. The 2-norm of a vector xn is defined as x2=i=1n|xi|2. The symbol tr(X) denotes the trace of X. The symbol diag(x) denotes a square matrix with off-diagonal elements being zeros and diagonal being x.

Figure 1 depicts the measurement model with a rectangular array, where the symbols • represent microphones, a=0,1,,A1 and b=0,1,,B1 are the microphone indices in x and y dimensions, respectively, and Δx and Δy are the microphone spacings. The product AB is the total number of microphones. For the ith source, the DOA is denoted by (θi,ϕi) and the strength by si. θi denotes elevation angle, which is the angle between the positive z axis and the line segment from the (0,0)th microphone, namely, the origin, to the source. ϕi denotes azimuth angle, which is the angle from the positive x axis to the orthogonal projection of the line segment on the xy plane. Sources in front of the array are taken into consideration, so 0θi90 and 0ϕi360. si equals the pressure induced by the ith source at the (0,0)th microphone. Under the assumption of plane wave, the pressure induced by sources at the (a,b)th microphone can be described by

(1)

where k is the total number of sources, j=1 is the imaginary unit, t1isinθicosϕiΔx/λ and t2isinθisinϕiΔy/λ with λ being the wavelength.

FIG. 1.

Measurement model.

FIG. 1.

Measurement model.

Close modal

After constructing vectors p=[p0,0,p0,1,,p0,B1,p1,0,p1,1,,p1,B1,,pA1,0,pA1,1,,pA1,B1]TAB, c(t1i)=[1,ej2πt1i,,ej2πt1i(A1)]TA, c(t2i)=[1,ej2πt2i,,ej2πt2i(B1)]TB, and d(t1i,t2i)=c(t1i)c(t2i)AB, we have

(2)

In the presence of additive noise nAB, the measured pressure pAB is expressed as

(3)

The noise is generated as independent and identically distributed complex Gaussian in the subsequent simulations. The array signal-to-noise ratio (SNR) is defined as SNR=20log10(p2/n2). The noise 2-norm can be obtained by n2=p210SNR/20.

The first step of the grid-free compressive beamforming is to denoise the measured pressure p and therefore obtain the pressure p from acoustic sources. This is achieved by imposing a sparsity constraint on the source distribution. Under a grid-free setting, t1sinθcosϕΔx/λ and t2sinθsinϕΔy/λ are continuous functions of θ and ϕ. d(t1,t2) denotes atom. The set of all the d(t1,t2) is called atomic set and denoted as A. The atomic 0-norm of p is a direct metric of source sparsity. Its definition is

(4)

where inf denotes the infimum. The reconstruction problem of p can be written as

(5)

where ε is an upper bound for the noise norm. Usually, let ε=n2. The atomic 0-norm exploits sparsity to the greatest extent possible. However, it is non-convex and the minimization is non-deterministic polynomial-time hard. Consequently, computationally feasible alternatives are encouraged. Reference 10 introduces the atomic norm of p, denoted as pA, as a convex relaxation of pA,0. The definition of pA is

(6)

Correspondingly, Eq. (5) becomes

(7)

Equation (7) is a convex optimization problem,19 which can be cast as a positive semidefinite programming and solved by the SDPT3 solver in CVX toolbox.20 

To deduce the positive semidefinite programming, a twofold Toeplitz matrix and its Vandermonde decomposition are required. The Vandermonde decomposition of a onefold Toeplitz matrix is a classical result discovered by Carathéodory and Fejér21 in the 1910s and rediscovered by Pisarenko22 in the 1970s. Recently, some scholars23,24 also study on the Vandermonde decomposition of a multifold Toeplitz matrix. To help understand, we begin with a onefold Toeplitz matrix. Denote the onefold Toeplitz operator by T(). For a given vector uA, T(u) forms a A×A Hermitian Toeplitz matrix with u being its first column. For example, the covariance matrix E(ppH)=i|si|2d(ti)d(ti)H of a linear array with A microphones in the case of uncorrelated sources is such a onefold Toeplitz matrix, where E denotes the expectation, d(ti)=[1,ej2πti,,ej2πti(A1)]T and tisinθiΔx/λ. Correspondingly, u=[E(p0p0),E(p1p0),,E(pA1p0)]T=i|si|2d(ti) is the vector composed by the cross-spectra of all the microphone pressures with reference to the pressure at the 0th microphone. i|si|2d(ti)d(ti)H is a Vandermonde decomposition of E(ppH). Next, we focus on the twofold Toeplitz matrix. Denote the twofold Toeplitz operator by Tb(). For a given vector u=[u0,0,u0,1,,u0,B1,u1,(B1),u1,(B2),,u1,0,,u1,B2,u1,B1,,uA1,(B1),uA1,(B2),,uA1,0,,uA1,B2,uA1,B1]TNu(Nu=(A1)(2B1)+B), Tb(u) forms a Hermitian A×A block Toeplitz matrix with each block being a B×B Toeplitz matrix. Denote the blocks in the first column of Tb(u) by Ta(0aA1). [u0,0,u0,1,,u0,B1]TB is the first column of T0. [ua,(B1),ua,(B2),,ua,0,ua,B2,ua,B1]T2B1(1aA1) contains the elements of the first row (from right to left) and the first column (from top to bottom) of Ta. For example, the covariance matrix E(ppH)=i|si|2d(t1i,t2i)d(t1i,t2i)H of a planar array in the case of uncorrelated sources is such a twofold Toeplitz matrix. Correspondingly, u=[E(p0,0p0,0),E(p0,1p0,0),,E(p0,B1p0,0),E(p1,0p0,B1),E(p1,0p0,B2),,E(p1,0p0,0),,E(p1,B2p0,0),E(p1,B1p0,0),,E(pA1,0p0,B1),E(pA1,0p0,B2),,E(pA1,0p0,0),,E(pA1,B2p0,0),E(pA1,B1p0,0)]T=i|si|2[1,ej2πt2i,,ej2πt2i(B1),ej2π(t1it2i(B1)),ej2π(t1it2i(B2)),,ej2πt1i,,ej2π(t1i+t2i(B2)),ej2π(t1i+t2i(B1)),,ej2π(t1i(A1)t2i(B1)),ej2π(t1i(A1)t2i(B2)),,ej2πt1i(A1),,ej2π(t1i(A1)+t2i(B2)),ej2π(t1i(A1)+t2i(B1))]T is the vector composed by the cross-spectra of all the microphone pressures with reference to the pressure at the (0,0)th microphone and the ones of the pressures at the (1aA1,0)th microphones with reference to the pressures at the (0,1bB1)th microphones. i|si|2d(t1i,t2i)d(t1i,t2i)H is a Vandermonde decomposition of E(ppH).

Denote

(8)

and

(9)

If Tb(û) admits a Vandermonde decomposition, i.e.,

(10)

where V=[d(t11,t21),d(t12,t22),,d(t1r,t2r)], Σ=diag([σ1,σ2,,σr]) with σi(i=1,2,,r) being real and positive values, and r is the rank of Tb(û), then pΤ=pA. The pT in this proposition differs from the one in Ref. 10, but they both equal pA. Two methods are given to prove the proposition, which can be seen in  Appendix A. Based on the proposition, the atomic norm minimization in Eq. (7) can be cast as the following positive semidefinite programming

(11)

The second step of the grid-free compressive beamforming is to estimate the DOAs and quantify the strengths of acoustic sources. This is achieved by using MEMP method to process the reconstructed pressure p̂. The detailed procedure of MEMP can be seen in Refs. 10 and 11. In addition, the grid-free compressive beamforming is also applicable to the non-uniform arrays constructed by randomly choosing microphones from a standard uniform rectangular array.10 For the non-uniform arrays, pp in Eqs. (5), (7), and (11) is changed to pΩpΩ, where Ω denotes the set of the indices of the chosen microphones, pΩ denotes the vector of the pressures measured by the chosen microphones, and pΩ denotes the vector of the pressures induced by sources at the chosen microphones. The grid-free compressive beamforming is not applicable to the arrays with randomly distributed microphones.

The above grid-free compressive beamforming can identify sources reliably and accurately only when the following separation condition is met:10 

(12)

where Δmin denotes the minimum separation of sources. The separation condition prohibits high resolution. In this section, we develop an IRANM method to overcome the drawback via establishing a novel sparse metric to substitute for the atomic norm.

Denote

(13)

where κ>0 is a regularization parameter that avoids the first term being when Tb(u) is rank deficient, and IAB×AB is an identity matrix. We establish the following sparse metric

(14)

which is the minimum value of the objective function in Eq. (13). As with pA,0 and pA, Mκ(p) is also a function of p under a given κ. When Tb(û) admits a Vandermonde decomposition, Mκ(p) has the following properties:16 (1) Mκ(p) is on the order of (pA,0/2ABAB/2)lnκ1 as κ approaches 0 if pA,0<AB, i.e., limκ0Mκ(p)/((pA,0/2ABAB/2)lnκ1)=1; (2) as κ approaches 0, the smallest ABpA,0 eigenvalues of Tb(û) are either 0 or approach 0, i.e., only pA,0 eigenvalues are relatively large; (3) Mκ(p)(AB/2)lnκ is on the order of pAκ1/2 as κ approaches +, i.e., limκ+(Mκ(p)(AB/2)lnκ)/(pAκ1/2)=1. Property (1) shows that as κ approaches 0, minimizing Mκ(p) is equivalent to minimizing pA,0. Property (3) shows that as κ approaches +, minimizing Mκ(p) is equivalent to minimizing pA. This means Mκ(p) serves as a link between pA,0 and pA, and can enhance sparsity compared with pA. Therefore, substituting Mκ(p) for pA to measure the source sparsity is expected to overcome the above drawback.

Substituting Mκ(p) for pA, the reconstruction problem of p can be rewritten as

(15)

Simultaneous Eqs. (13)–(15) yield

(16)

In Eq. (13), ln|Tb(u)+κI| is a concave function of u, while pHTb(u)1p is a convex function of u. An effective algorithm to minimize such a concave + convex function is the MM.17,18 MM solves the optimal variable and the optimal value of the objective function iteratively, whose procedure is shown pictorially in Fig. 2. Two steps are involved in each iteration. In the first majorization step, we find a surrogate function to locally approximate the objective function. The surrogate function should be greater than or equal to the objective function, and the equality holds at the current optimal variable. Then in the minimization step, we minimize the surrogate function to obtain a new optimal variable. Obviously, MM drives the objective function downward until a local optimum is reached.

FIG. 2.

The MM procedure.

FIG. 2.

The MM procedure.

Close modal

When using MM algorithm to solve the problem in Eq. (13), let κ also be updated dynamically. Denote by ûl the optimal variable, κl the regularization parameter, and Wl(Tb(ûl)+κlI)1AB×AB the weighting matrix, determined by the lth iteration. When conducting the (l+1)th iteration, the objective function is

(17)

and the surrogate function is constructed as

(18)

Because ln|Tb(ûl)+κlI|+tr(WlTb(uûl)) is the tangent plane of ln|Tb(u)+κlI| at u=ûl,

(19)

This means the surrogate function meets the requirement of MM algorithm. Ignoring the constant independent of u in h(u|ûl), the minimization problem in the (l+1)th iteration can be written as

(20)

Based on the above analysis, the problem in Eq. (16) also needs to be solved iteratively. In the (l+1)th iteration,

(21)

where the three equalities are due to Eq. (16), the relationship between the surrogate function and the objective function and Eq. (20) in turn. Equation (21) can be written as

(22)

Because [Tb(u)ppHv]¯0 is equivalent to Tb(u)¯0 and vpHTb(u)1p according to the Schur complement condition,19 Eq. (22) can be further written as

(23)

Equation (23) is a disciplined convex optimization problem19 and can be solved using the SDPT3 solver in the CVX toolbox.20 

With reference to the atomic set A, we define a weighted atomic set

(24)

where dw(t1,t2) is the weighted atom and w(t1,t2)0 is a weighting function. For p, with reference to its atomic norm shown by Eq. (6), we define its weighted atomic norm as

(25)

According to the definition, w(t1,t2) specifies preference of the atom d(t1,t2). To be specific, the atom d(t1,t2) is more likely selected if w(t1,t2) is larger. When w(t1,t2)1, pAw=pA.

Denote by wl(t1,t2) and pAwl the weighting function and the weighted atomic norm corresponding to the results of the lth iteration. If

(26)

and Tb(ûl+1) admits a Vandermonde decomposition, then

(27)

The proof of the proposition can be seen in  Appendix B. Based on the proposition, Eq. (21) can be written as

(28)

Obviously, p is reconstructed via minimizing its weighted atomic norm iteratively, and the weighting function is updated in each iteration. Therefore, the method can be interpreted as IRANM.

Initialize û0=0 and κ0=1, then W0=I. The first iteration in Eq. (23) coincides with the ANM in Eq. (11). To enhance sparsity, let κ decrease gradually during the iteration. The specific strategy is

(29)

where λmax(Tb(ûl)) is the largest eigenvalue of Tb(ûl). We terminate the iteration if the relative change of the solution p̂ at two consecutive iterations, namely, p̂lp̂l12/p̂l12, is less than 10−3 or the maximum number of iterations, set to 20, is reached.

As with the grid-free compressive beamforming in Ref. 10, IRANM is also applicable to the non-uniform arrays constructed by randomly choosing microphones from a standard uniform rectangular array. This is achieved by changing pp in Eqs. (15), (16), (21)–(23), and (28) to pΩpΩ.

To summarize the IRANM, we provide an algorithm in Table I for the implementation in matlab.

TABLE I.

MATLAB code for IRANM.

Given A, B, Ω, pΩ|Ω|(|Ω|AB), ε
Initialize κ=1, W=IAB×AB and p0=0AB.
Let p0Ω=pΩ|Ω|.
Solve IRANM with CVX (Ref. 20)
1: cvx_solver sdpt3 
2: for j = 1: 20 
3:  cvx_begin sdp quiet 
4:  variable u ((A-1)*(2*B-1)+B, 1) complex 
5:  variable Tb (A*B, A*B) Hermitian 
6:  variable p (A*B, 1) complex 
7:  variable v 
8:  minimize 1/(2*sqrt(A*B))*(trace(W*Tb)+v
9:  for j1 = 1: A 
10:   Tb((j1-1)*B+1: j1*B, (j1-1)*B+1: j1*B) == toeplitz(u(1: B), [u(1); conj(u(2: B))]); 
11:  end 
12:  for j1 = 2: A 
13:   for j2 = 1: j1-1 
14:    Tb((j1-1)*B+1: j1*B, (j2-1)*B+1: j2*B) == toeplitz(u([B+(2*B-1)*(j1-j2): (2*B-1)*(j1-j2+1)]-(B-1)), u([B+(2*B-1)*(j1-j2): −1: 1+(2*B-1)*(j1-j2)]-(B-1))); 
15:   end 
16:  end 
17:  subject to 
18:  [Tb, p; p', v] >= 0; 
19:  norm(pΩ-p(Ω)) <= ε; 
20:  cvx_end % Eq. (23) 
21:  if norm(p-p0) < 10(−3)*norm(p0
22:   break; 
23:  end 
24:  p0 = p; 
25:  if j == 1 
26:   κ = min(κ/2, max(eig(Tb))/10); 
27:  elseif j < = 10 
28:   κ = κ/2; 
29:  end % Eq. (29) 
30:  W = (Tb+κ*eye(A*B, A*B))(−1); 
31: end 
Given A, B, Ω, pΩ|Ω|(|Ω|AB), ε
Initialize κ=1, W=IAB×AB and p0=0AB.
Let p0Ω=pΩ|Ω|.
Solve IRANM with CVX (Ref. 20)
1: cvx_solver sdpt3 
2: for j = 1: 20 
3:  cvx_begin sdp quiet 
4:  variable u ((A-1)*(2*B-1)+B, 1) complex 
5:  variable Tb (A*B, A*B) Hermitian 
6:  variable p (A*B, 1) complex 
7:  variable v 
8:  minimize 1/(2*sqrt(A*B))*(trace(W*Tb)+v
9:  for j1 = 1: A 
10:   Tb((j1-1)*B+1: j1*B, (j1-1)*B+1: j1*B) == toeplitz(u(1: B), [u(1); conj(u(2: B))]); 
11:  end 
12:  for j1 = 2: A 
13:   for j2 = 1: j1-1 
14:    Tb((j1-1)*B+1: j1*B, (j2-1)*B+1: j2*B) == toeplitz(u([B+(2*B-1)*(j1-j2): (2*B-1)*(j1-j2+1)]-(B-1)), u([B+(2*B-1)*(j1-j2): −1: 1+(2*B-1)*(j1-j2)]-(B-1))); 
15:   end 
16:  end 
17:  subject to 
18:  [Tb, p; p', v] >= 0; 
19:  norm(pΩ-p(Ω)) <= ε; 
20:  cvx_end % Eq. (23) 
21:  if norm(p-p0) < 10(−3)*norm(p0
22:   break; 
23:  end 
24:  p0 = p; 
25:  if j == 1 
26:   κ = min(κ/2, max(eig(Tb))/10); 
27:  elseif j < = 10 
28:   κ = κ/2; 
29:  end % Eq. (29) 
30:  W = (Tb+κ*eye(A*B, A*B))(−1); 
31: end 

In this section, we examine the performance of IRANM with an illustrative example and the accuracy analysis. The microphone array is set as follows: A=B=8 and Δx=Δy=0.035m. SNR is taken as 20 dB.

We assume six sources in the illustrative example. Their DOAs are (60°, 180°), (60°, 190°), (50°, 90°), (40°, 90°), (30°, 200°), and (70°, 300°), and strengths are 100, 97, 97, 94, 94, and 90 dB (referring to 2 × 10–5 Pa), in turn.

We consider 4900 Hz (the upper limit frequency determined by the condition max{Δx/λ,Δy/λ}0.510) first. Calculated as 0.062, the minimum separation of these sources is less than 0.125 (1/AB), which means the separation condition shown by the inequality (12) is not met. Figure 3 gives the results when the 64 microphones are all utilized. Figure 3(a) shows that IRANM reduces the number of the relatively large eigenvalues of Tb(û) and finally produces a number of 6, which coincides with Property (2) of Mκ(p) in Sec. IV A. Figures 3(b)–3(d) present the source distributions reconstructed by IRANM + MEMP. In Fig. 3(b), the result after one iteration, namely, the result of ANM, deviates from the true source distribution severely. As a consequence, the sources are identified wrongly. This demonstrates that the ANM based grid-free compressive beamforming is burdened with the separation condition limit. It can be seen from Figs. 3(c) and 3(d) that the sources have been identified accurately after two iterations, demonstrating that IRANM can break the limit and thus enhance the resolution. Figures 3(e)–3(g) present the weighting functions used in the first three iterations. The weighting function used in the current iteration is obtained based on Tb(û) and κ determined by the previous iteration, and provides preference to the atoms around the sources identified by the previous iteration. Figure 4 gives the results when only 30 microphones are utilized. It exhibits a phenomenon similar to that of Fig. 3. The sources are identified accurately after three iterations. This demonstrates that IRANM is also applicable to the non-uniform arrays constructed by randomly choosing microphones from a standard uniform rectangular array. Then, we change the frequency to conduct simulations. Similar performance improvement can be seen. Figure 5 compares the source distributions reconstructed by ANM + MEMP and IRANM + MEMP at 3000 and 4000 Hz as examples. The non-uniform array used in Figs. 4 and 5(e)–5(h) is depicted in Fig. 6.

FIG. 3.

(Color online) An illustrative example at 4900 Hz. A standard uniform rectangular array with 64 microphones is utilized. (a) Eigenvalues of Tb(û). Reconstructions (*) of IRANM + MEMP for sources (○) after (b) one, (c) two and (d) three iterations. Weighting functions used in the (e) first, (f) second, and (g) third iterations. In (b)–(d), the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

FIG. 3.

(Color online) An illustrative example at 4900 Hz. A standard uniform rectangular array with 64 microphones is utilized. (a) Eigenvalues of Tb(û). Reconstructions (*) of IRANM + MEMP for sources (○) after (b) one, (c) two and (d) three iterations. Weighting functions used in the (e) first, (f) second, and (g) third iterations. In (b)–(d), the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

Close modal
FIG. 4.

(Color online) An illustrative example at 4900 Hz. A non-uniform array with 30 microphones is utilized. (a) Eigenvalues of Tb(û). Reconstructions (*) of IRANM + MEMP for sources (○) after (b) one, (c) two, (d) three, and (e) four iterations. Weighting functions used in the (f) first, (g) second, (h) third, and (i) fourth iterations. In (b)–(e), the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

FIG. 4.

(Color online) An illustrative example at 4900 Hz. A non-uniform array with 30 microphones is utilized. (a) Eigenvalues of Tb(û). Reconstructions (*) of IRANM + MEMP for sources (○) after (b) one, (c) two, (d) three, and (e) four iterations. Weighting functions used in the (f) first, (g) second, (h) third, and (i) fourth iterations. In (b)–(e), the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

Close modal
FIG. 5.

(Color online) Comparison of the source distributions (*) reconstructed by (a), (c), (e), (g) ANM + MEMP and (b), (d), (f), (h) IRANM+MEMP at (a), (b), (e), (f) 3000 Hz, and (c), (d), (g), (h) 4000 Hz. The actual source distribution is indicated by the symbols ○. In (a)–(d), a standard uniform rectangular array with 64 microphones is utilized. In (e)–(h), a non-uniform array with 30 microphones is utilized. In each map, the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

FIG. 5.

(Color online) Comparison of the source distributions (*) reconstructed by (a), (c), (e), (g) ANM + MEMP and (b), (d), (f), (h) IRANM+MEMP at (a), (b), (e), (f) 3000 Hz, and (c), (d), (g), (h) 4000 Hz. The actual source distribution is indicated by the symbols ○. In (a)–(d), a standard uniform rectangular array with 64 microphones is utilized. In (e)–(h), a non-uniform array with 30 microphones is utilized. In each map, the reconstructed and the actual outputs are scaled to dB by referring to their respective maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top.

Close modal
FIG. 6.

Non-uniform array used in Figs. 4 and 5(e)–5(h).

FIG. 6.

Non-uniform array used in Figs. 4 and 5(e)–5(h).

Close modal

Both ANM and IRANM reconstruct the microphone pressure p induced by acoustic sources. Define a normalized 2-norm error p̂p2/p2 to measure the reconstruction accuracy. We compare the changing relations of the errors of ANM and IRANM with the minimum separation Δmin. For each Δmin, we obtain the average error over 50 runs. In each run, we randomly generate two sources that are separated by Δmin. Figure 7 shows the curves of p̂p2/p2 vs Δmin. Figures 7(a) and 7(b) correspond to the standard uniform rectangular array with 64 microphones and the non-uniform array with 30 microphones, respectively. In both Figs. 7(a) and 7(b), limited by the separation condition, ANM has distinctly larger errors when Δmin<1/AB compared with Δmin1/AB. In contrast to that, the errors of IRANM do not change much under all the Δmin. Besides, the error of IRANM is much smaller than the one of ANM. These phenomena demonstrate that whether the standard uniform rectangular array or the non-uniform array is utilized, IRANM can not only break the separation condition limit but also reconstruct p more accurately than ANM.

FIG. 7.

Curves of p̂p2/p2 vs Δmin when (a) the standard uniform rectangular array with 64 microphones and (b) the non-uniform array with 30 microphones are utilized.

FIG. 7.

Curves of p̂p2/p2 vs Δmin when (a) the standard uniform rectangular array with 64 microphones and (b) the non-uniform array with 30 microphones are utilized.

Close modal

MEMP processes the reconstructed pressure p̂ to estimate the DOAs and quantify the strengths of acoustic sources. Define an average 2-norm error θ̂θ22+ϕ̂ϕ22/2k and a normalized 2-norm error ŝs2/s2 to measure the DOA estimation accuracy and the strength quantification accuracy, where θ̂ and θ are the vectors of the estimated and the real elevation angles, respectively, ϕ̂ and ϕ are the vectors of the estimated and the real azimuth angles, respectively, and ŝ and s are the vectors of the quantified and the real strengths, respectively. Figure 8 shows the curves of these errors vs Δmin when using MEMP to process p̂ reconstructed by ANM and IRANM. Obviously, whether the standard uniform rectangular array or the non-uniform array is utilized, IRANM + MEMP enjoys the higher DOA estimation and strength quantification accuracy than ANM + MEMP, especially when Δmin<1/AB.

FIG. 8.

Curves of (a), (c) θ̂θ22+ϕ̂ϕ22/2k and (b), (d) ŝs2/s2 vs Δmin when (a), (b) the standard uniform rectangular array with 64 microphones and (c), (d) the non-uniform array with 30 microphones are utilized.

FIG. 8.

Curves of (a), (c) θ̂θ22+ϕ̂ϕ22/2k and (b), (d) ŝs2/s2 vs Δmin when (a), (b) the standard uniform rectangular array with 64 microphones and (c), (d) the non-uniform array with 30 microphones are utilized.

Close modal

To validate correctness of the simulation conclusion and effectiveness of the developed method in practical applications, an experimental measurement is performed on three small loudspeakers in a semi-anechoic room with a rectangular array. Figure 9 depicts the configuration. The array includes 64 Brüel&Kjær type 4958 microphones, A=B=8 and Δx=Δy=0.035m. Three mirror sources exist due to the reflection effect of the ground. Hence, there are six sources in total, whose Cartesian coordinates are (2.24, 0, 5) m, (2.24, −2.2, 5) m, (−1.24, 0, 5) m, (−1.24, −2.2, 5) m, (−2.24, 0, 5) m, and (−2.24, −2.2, 5) m, elevation angles are 24.13°, 32.13°, 13.93°, 26.80°, 24.13°, and 32.13°, and azimuth angles are 0°, 315.52°, 180°, 240.59°, 180°, and 224.48°, in turn. Pressure signals captured by microphones are acquired simultaneously by Brüel&Kjær PULSE Type 3560D Data Acquisition System and then transferred to Brüel&Kjær PULSE LABSHOP Software where their Fourier spectra are achieved. The sample frequency is 16 384 Hz, and a single snapshot with a length of 1 s and 214 samples is used. Eventually, ANM + MEMP and IRANM + MEMP are applied to map the sources.

FIG. 9.

(Color online) Experimental configuration.

FIG. 9.

(Color online) Experimental configuration.

Close modal

Figure 10 shows the acoustic source maps at 3000 and 4000 Hz, where the separation condition shown by the inequality (12) is not met. To obtain Figs. 10(a)–10(d), the 64 microphones are all utilized, while to obtain Figs. 10(e)–10(h), only 30 randomly chosen microphones are utilized. As shown in Figs. 10(a), 10(c), 10(e), and 10(g), the source distribution reconstructed by ANM + MEMP deviates from the true one severely. In contrast, as shown in Figs. 10(b), 10(d), 10(f), and 10(h), the source distribution reconstructed by IRANM + MEMP coincides with the true one well. As a conclusion, the ANM based grid-free compressive beamforming is burdened with the separation condition limit, while the developed IRANM method can break the limit and thus enhance the resolution. The experimental conclusion agrees with the simulation one, demonstrating that the conclusion is correct and the developed method is effective in practical applications.

FIG. 10.

(Color online) Experimental acoustic source maps at (a), (b), (e), (f) 3000 Hz and (c), (d), (g), (h) 4000 Hz. Reconstructions (*) of (a), (c), (e), (g) ANM+MEMP and (b), (d), (f), (h) IRANM + MEMP when utilizing (a)–(d) a standard uniform rectangular array with 64 microphones and (e)–(h) a non-uniform array constructed by randomly choosing 30 microphones from the standard uniform rectangular array. In each map, the reconstruction is scaled to dB by referring to its maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top. The symbols ○ indicate the actual DOAs but do not contain the strength information.

FIG. 10.

(Color online) Experimental acoustic source maps at (a), (b), (e), (f) 3000 Hz and (c), (d), (g), (h) 4000 Hz. Reconstructions (*) of (a), (c), (e), (g) ANM+MEMP and (b), (d), (f), (h) IRANM + MEMP when utilizing (a)–(d) a standard uniform rectangular array with 64 microphones and (e)–(h) a non-uniform array constructed by randomly choosing 30 microphones from the standard uniform rectangular array. In each map, the reconstruction is scaled to dB by referring to its maximum, and meanwhile, the reconstructed maximum referring to 2 × 10−5 Pa is labeled on the top. The symbols ○ indicate the actual DOAs but do not contain the strength information.

Close modal

The ANM based grid-free compressive beamforming can identify acoustic sources reliably and accurately only when the sources are sufficiently separated, which prohibits high resolution. This paper focuses on overcoming the drawback and thus enhancing the resolution for the two-dimensional cases with planar microphone arrays. First, we propose a novel sparse metric to substitute for the atomic norm, which promotes sparsity to a greater extent. Then, we introduce MM algorithm to solve the minimization problem of the novel metric, which helps to obtain the microphone pressure induced by sources. When a sound reweighting strategy is considered, the method can be interpreted as IRANM. Finally, using the obtained pressure, we reconstruct the source distribution through MEMP, which is just the same as in the ANM based grid-free compressive beamforming.

Some interesting conclusions have been drawn based on simulations and experiments. First, IRANM can overcome the drawback and thus enhance the resolution. Second, in terms of reconstructing the microphone pressure induced by sources, estimating the DOAs and quantifying the source strengths, IRANM enjoys higher accuracy. Third, the above conclusions stand up whether a standard uniform rectangular array or a non-uniform array constituted by a small number of microphones is utilized.

This work was supported by the National Natural Science Foundation of China, under Grant No. 11704040, the Fundamental Research Funds for the Central Universities, under Grant No. 106112017CDJQJ338810 and the Graduate Scientific Research and Innovation Foundation of Chongqing, China, under Grant No. CYB17034.

A. Method one

Let p=isid(t1i,t2i), Tb(u)=(1/AB)i|si|d(t1i,t2i)d(t1i,t2i)H and v=ABi|si|, then

(A1)

where φi is the phase of si. This indicates

(A2)

Since the inequality (A2) holds for any decomposition of p, we conclude that pTpA. Moreover, we have Tb(û)¯0 and Tb(û)¯ppH/v̂ from [Tb(û)ppHv̂]¯0 according to the Schur complement condition.19 If Eq. (10) holds, p falls within the column space of Tb(û), namely, that p=Vs̃ where s̃=[s1,s2,,sr]Tr. Introduce a vector qAB such that VHq=[ejφ1,ejφ2,,ejφr]T, then

(A3)

This implies

(A4)

where the rightmost is due to the definition of the atomic norm shown by Eq. (6). Ultimately, pT=pA.

B. Method two

If Tb(u) admits a Vandermonde decomposition, p falls within the column space of Tb(u), namely, that p=Vs̃ where s̃=[s1,s2,,sr]Tr. Then,

(A5)

According to the Schur complement condition,19 we have

(A6)

from the inequality (A5). Hence,

(A7)

where the four equalities are due to the Schur complement condition, Eqs. (10) and (A6), the inequality of arithmetic and geometric means, and Eq. (6) in turn.

(B1)

where the first equality is due to the Vandermonde decomposition of Tb(ûl+1), the second equality is based on the fact that

(B2)

and Eq. (A6), the third equality comes from the inequality of arithmetic and geometric means, and the fourth equality follows from Eq. (26) and the definition of the weighted atomic norm shown by Eq. (25).

1.
M.
Elad
,
Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing
(
Springer
,
New York
,
2010
), pp.
1
361
.
2.
S.
Foucart
and
H.
Rauhut
,
A Mathematical Introduction to Compressive Sensing
(
Springer
,
New York
,
2013
), pp.
1
587
.
3.
H.
Boche
,
R.
Calderbank
,
G.
Kutyniok
, and
J.
Vybíral
,
Compressed Sensing and Its Applications
(
Springer
,
New York
,
2015
), pp.
1
468
.
4.
G. F.
Edelmann
and
C. F.
Gaumond
, “
Beamforming using compressive sensing
,”
J. Acoust. Soc. Am.
130
(
4
),
EL232
EL237
(
2011
).
5.
S.
Zhong
,
Q.
Wei
, and
X.
Huang
, “
Compressive sensing beamforming based on covariance for acoustic imaging with noisy measurements
,”
J. Acoust. Soc. Am.
134
(
5
),
EL445
EL451
(
2013
).
6.
A.
Xenaki
,
P.
Gerstoft
, and
K.
Mosegaard
, “
Compressive beamforming
,”
J. Acoust. Soc. Am.
136
(
1
),
260
271
(
2014
).
7.
P.
Gerstoft
,
A.
Xenaki
, and
C. F.
Mecklenbräuker
, “
Multiple and single snapshot compressive beamforming
,”
J. Acoust. Soc. Am.
138
(
4
),
2003
2014
(
2015
).
8.
Y.
Chi
,
L. L.
Scharf
,
A.
Pezeshki
, and
A. R.
Calderbank
, “
Sensitivity to basis mismatch in compressed sensing
,”
IEEE Trans. Signal Process.
59
(
5
),
2182
2195
(
2011
).
9.
A.
Xenaki
and
P.
Gerstoft
, “
Grid-free compressive beamforming
,”
J. Acoust. Soc. Am.
137
(
4
),
1923
1935
(
2015
).
10.
Y.
Yang
,
Z.
Chu
,
Z.
Xu
, and
G.
Ping
, “
Two-dimensional grid-free compressive beamforming
,”
J. Acoust. Soc. Am.
142
(
2
),
618
629
(
2017
).
11.
Y.
Hua
, “
Estimating two-dimensional frequencies by matrix enhancement and matrix pencil
,”
IEEE Trans. Signal Process.
40
(
9
),
2267
2280
(
1992
).
12.
Z.
Yang
,
J.
Li
,
P.
Stoica
, and
L.
Xie
, “
Sparse methods for direction-of-arrival estimation
,” arXiv:1609.09596 (
2017
).
13.
E. J.
Candès
,
M. B.
Wakin
, and
S. P.
Boyd
, “
Enhancing sparsity by reweighted 1 minimization
,”
J. Fourier Anal. Appl.
14
(
5–6
),
877
905
(
2008
).
14.
J.
Fang
,
J.
Li
,
Y.
Shen
,
H.
Li
, and
S.
Li
, “
Super-resolution compressed sensing: An iterative reweighted algorithm for joint parameter learning and sparse signal recovery
,”
IEEE Signal Process. Lett.
21
(
6
),
761
765
(
2014
).
15.
J.
Fang
,
F.
Wang
,
Y.
Shen
,
H.
Li
, and
R. S.
Blum
, “
Super-resolution compressed sensing for line spectral estimation: An iterative reweighted approach
,”
IEEE Trans. Signal Process.
64
(
18
),
4649
4662
(
2016
).
16.
Z.
Yang
and
L.
Xie
, “
Enhancing sparsity and resolution via reweighted atomic norm minimization
,”
IEEE Trans. Signal Process.
64
(
4
),
995
1006
(
2016
).
17.
D. R.
Hunter
and
K.
Lange
, “
A tutorial on MM algorithms
,”
Am. Stat.
58
(
1
),
30
37
(
2004
).
18.
Y.
Sun
,
P.
Babu
, and
D. P.
Palomar
, “
Majorization-minimization algorithms in signal processing, communications, and machine learning
,”
IEEE Trans. Signal Process.
65
(
3
),
794
816
(
2017
).
19.
S.
Boyd
and
L.
Vandenberghe
,
Convex Optimization
(
Cambridge University Press
,
Cambridge
,
2004
), pp.
1
684
.
20.
M.
Grant
and
S.
Boyd
, “
CVX: matlab software for disciplined convex programming, version 2.1
,” available at http://cvxr.com/cvx (Last viewed November
2017
).
21.
C.
Carathéodory
and
L.
Fejér
, “
Über den Zusammenhang der Extremen von harmonischen Funktionen mit ihren Koeffizienten und über den Picard-Landauschen Satz
” (“About the correlation between the extremum and the coefficient of harmonic function and about Picard-Landau theory”),
Rend. Circ. Mat. Palermo
32
(
1
),
218
239
(
1911
) (in German).
22.
V. F.
Pisarenko
, “
The retrieval of harmonics from a covariance function
,”
Geophys. J. R. Astron. Soc.
33
(
3
),
347
366
(
1973
).
23.
Y.
Chi
and
Y.
Chen
, “
Compressive two-dimensional harmonic retrieval via atomic norm minimization
,”
IEEE Trans. Signal Process.
63
(
4
),
1030
1042
(
2015
).
24.
Z.
Yang
,
L.
Xie
, and
P.
Stoica
, “
Vandermonde decomposition of multilevel Toeplitz matrices with application to multidimensional super-resolution
,”
IEEE Trans. Inf. Theory
62
(
6
),
3685
3701
(
2016
).