Noise-mapping is an effective sound visualization tool for the identification of urban noise hotspots, which is crucial to taking targeted measures to tackle environmental noise pollution. This paper develops a high-resolution wideband acoustic source mapping methodology using a portable microphone array, where the joint localization and power spectrum estimation of individual sources sparsely distributed over a large region are achieved by tomographic imaging with the multi-frequency delay-and-sum beamforming power outputs from multiple array positions. Exploiting the fact that a wideband source has a common spatial signal-support across the frequency spectrum, two-dimensional tomographic maps are produced by applying compressive sensing techniques including group least absolute shrinkage selection operator formulation and sparse Bayesian learning to promote group sparsity over multiple frequency bands. The high-resolution mapping is demonstrated with experimental data recorded with a microphone array mounted atop an electric vehicle driven along a road while playing audio clips from a loudspeaker positioned within the adjacent open field.

Environmental noise has become a major problem in metropolitan areas, causing serious health problems including heart diseases, hearing loss, hypertension, and sleep disturbance.1,2 The localization, measurement, and monitoring of primary noise sources are of great importance to the implementation of effective noise mitigation policies for creating livable residential areas. Present noise-monitoring systems involve average sound-pressure-level (SPL) measurements at particular times and locations, which may be repeated only a few times a year due to the time and cost associated with the deployment and maintenance of fixed monitoring stations. Therefore, the current technology is incapable of providing sufficient information to tackle noise pollution on a city-wide scale.

Acoustic cameras are commercial imaging devices developed for acoustic testing and noise-source identification with arrays consisting of tens to hundreds of microphones, using acoustic measurement methods such as beamforming,3 sound intensity mapping,4 and near-field acoustic holography.5 Delay-and-sum (DAS) beamforming is the most popular approach among these techniques, but also has some well-known downsides, including spurious sources resulting from the large-sidelobe leakage, and blurring at low frequencies due to the beamwidth varying with frequency.6 Weighting algorithms may be used to obtain a fixed mainlobe across the frequency spectrum.7,8 Methods based on the design of specialized array geometries for sidelobe elimination have also been proposed.9,10 Adaptive beamforming methods may achieve improved resolution with reduced sidelobes provided that the number of snapshots is sufficiently large.11–13 Deconvolution methods for sidelobe suppression have also been developed.14,15 WB-RELAX and WB-CLEAN use a two-step optimization approach for near-field acoustic imaging.14 DAMAS estimates the source locations and power-levels jointly in the near-field based on the known steering vectors.15 

The class of acoustic imaging methods using the compressive sensing (CS) framework16–21 achieve superior resolution compared to classical methods.6,12,13 The recovery of localized sources is formulated as an underdetermined linear inverse problem with a sparsity constraint imposed as a regularization term in the optimization cost function. SC-DAMAS is an extension of DAMAS for high-resolution near-field imaging,16 which adapts the least absolute shrinkage selection operator (LASSO)22 formulation with an additional nonnegativity constraint for power-levels. The CS-based direction-of-arrival (DOA) methods for source localization in the far-field enforce a group sparsity constraint over multiple measurement vectors (MMV, or snapshots), as stationary sources should share a common spatial sparsity profile across snapshots.17–19 The ℓ1-SVD17 applies the singular value decomposition on the measurement matrix to decompose it into signal and noise subspaces, significantly reducing the complexity for sparse reconstruction when there are only a few sources. The compressive-beamforming approach18 directly uses the LASSO and mixed ℓ1,2-norm minimization for single- and multiple-snapshot DOA estimation, respectively. The recently proposed multi-frequency DOA estimation approach based on multiple-snapshot sparse Bayesian learning (SBL) estimates the hyperparameters, the unknown angular signal variances and noise variance, via evidence (type-II likelihood) maximization using iterative fixed-point updates.19–21 

Noise maps demonstrate the long-term average SPL distribution over a region, and are extensively used by urban planners to develop targeted noise abatement strategies. To reduce the dependence on time- and labor-intensive noise-level measurements, city-wide noise maps are generally created with noise-prediction software using outdoor sound propagation models, but the accuracy of maps still relies on the input data. An alternative large-region acoustic source mapping framework recently developed by our group23 enables the time- and cost-effective collection of the acoustic data at as many locations as possible with a portable microphone array mounted on a vehicle as it drives by and allows the generation of tomographic acoustic maps of the neighborhood traversed by the vehicle periodically throughout the day. At a given frequency, the far-field delay-and-sum beamforming power outputs from different array positions are stacked together for tomographic reconstruction, and the two-dimensional high-resolution maps representing the locations and single-frequency short-term average SPLs of primary noise sources in the acoustic field are generated by solving the sparse narrowband imaging problem via compressive sensing. Rather than separately processing the acoustic data from each array position for DOA estimation, the proposed approach significantly extends the sensing capabilities of single-position measurement schemes, which can only provide a limited aperture for large regions.

This paper addresses the problem of drive-by group-sparse tomographic mapping of spatially fixed wideband acoustic sources based on the multi-frequency delay-and-sum beamforming power outputs computed at multiple array positions. Inspired by the MMV-based sparse DOA estimation methods,18–20 the sparsity constraint is imposed on the groups of image pixels across the frequency spectrum corresponding to the same location on a two-dimensional map, alleviating the effects of poor resolution at lower frequencies and preventing the occurrence of ghost sources caused by spatial aliasing at higher frequencies. The multi-frequency acoustic maps are generated and the power spectrums of individual sources are estimated with group LASSO via block coordinate descent (BCD)24–26 and SBL19,20,27 techniques. The proposed methodology is tested with the drive-by acoustic data collected for the previous work.23 The acoustic mapping results demonstrate the promise of this approach for low-cost and time-saving city-wide noise monitoring.

Consider the acoustic wave-field produced by stationary wideband sources sparsely distributed over a large region. Recording the acoustic field with a portable M-channel microphone array at multiple positions, each microphone channel data is split into L time segments, and a length-N fast Fourier transform (FFT) is applied to each N-sample segment, which results in N narrow frequency bands. At a frequency ffn (nth frequency band), the far-field steering vector describing the propagation delays from a possible source at a particular angle θi to the microphones is given by

(1)

where d m , θ i is the relative Euclidean distance of microphone m from the source arriving at the array origin from the angular direction θi and c is the speed of sound.

At array position k, the linear additive-noise frequency-domain signal model is defined for the lth time-segment as6 

(2)

where z f , l ( k ) M is the complex-valued M-channel microphone output, A f = [ a f , θ 0 , , a f , θ I 1 ] M × I is the far-field steering matrix, which is fixed at any measurement location. The complex source amplitudes s f , l ( k ) = [ s f , l , θ 0 ( k ) , , s f , l , θ I 1 ( k ) ] T I are zero-mean and mutually uncorrelated across the direction of arrivals. The angular source covariance matrix given by Σ s f ( k ) = E [ s f , l ( k ) s f , l ( k ) H ] = diag { c f , θ 0 ( k ) , , c f , θ I 1 ( k ) } I × I , where the diagonal element c f , θ i ( k ) yields the total signal power-level along the angular direction θ i . The zero-mean additive-noise vector e f , l ( k ) is mutually uncorrelated with the signal, and the noise covariance matrix is given by Σ e f ( k ) = E [ e f , l ( k ) e f , l ( k ) H ] = υ f ( k ) I M .

Assuming that the zero-mean and mutually uncorrelated acoustic sources are located in the far-field such that the array length D is much smaller than the source distances to the array (rD), resulting in the acoustic waves arriving approximately parallel at the microphone array, the total signal power-level c f , θ i ( k ) may also be characterized by the line integral

(3)

where ( r , θ ) is the polar coordinate system and | X ( f , r , θ i ) | 2 refers to the power-level at frequency f and position ( r , θ i ) relative to the array center.23 This can be regarded as a typical tomographic measurement formulation, as it describes the line-integration taken over potentially multiple sources positioned along θi with distance-squared power attenuation, yielding the total power-level at this specific angular direction. Considering a finite region discretized by using a two-dimensional pixel array, Eq. (3) can be approximated as a linear equation by dividing the given angular path θ i into equal-length segments with the grid spacing Δ r and assigning weights to the pixels in close proximity,

(4)

where J is the total number of pixels and xf,j denotes the value of pixel j and the j-th element of the image coefficient vector x f = [ x f , 0 , , x f , J 1 ] T J representing the discretized power levels over the region of interest in the vectorized form. The weighting vector for the path θ i at array position k is defined as g θ i ( k ) = [ w i , 0 ( k ) Δ r / r i , 0 2 , , w i , J 1 ( k ) Δ r / r i , J 1 2 ] T J , where the weighting factor w θ i , j ( k ) > 0 only if pixel j is in the close proximity of a segment along the specific path, and inversely proportional to the Euclidean distance between the pixel j and the closest segment with the distance of r i , j to the center of the microphone array. Repeating this procedure for all the directions across the angular grid, we obtain

(5)

where G ( k ) = [ g θ 0 ( k ) , , g θ I 1 ( k ) ] T denotes the linear matrix operator that relates the image field x f to the total angular power-level vector c f ( k ) = [ c f , θ 0 ( k ) , , c f , θ I 1 ( k ) ] T I of the signals arriving at the microphone array at position k.

The beamforming output power at angle θi and frequency f is given by

(6)

where w f , θ i is the beamformer weighting vector, and the data covariance matrix Σ z f ( k ) = A f Σ s f ( k ) A f H + Σ e f ( k ) . In practice, Σ z f ( k ) is approximated by the sample data covariance matrix, i.e., Σ z f ( k ) Σ l = 0 L 1 z f , l ( k ) z f , l ( k ) H / L for a sufficiently large number of time segments (L ≫ 1). Given the far-field delay-and-sum (DAS) beamformer weighting vector w f , θ i = a f , θ i / M , the DAS-beamformer output power at θi is

(7)

where b f , θ i = ( 1 / M 2 ) [ | a f , θ i H a f , θ 0 | 2 , , | a f , θ i H a f , θ I 1 | 2 ] T I is the DAS power beam-pattern, which acts as a frequency-dependent blurring filter causing some leakage from the sidelobes, and υ f ( k ) = υ f ( k ) / M is the average measurement noise power. Defining B f = [ b f , θ 0 , , b f , θ I 1 ] T I × I as the angular power beam-pattern matrix at frequency f, H f ( k ) = B f G ( k ) as the imaging matrix, and y f ( k ) = [ y f , θ 0 ( k ) , , y f , θ I 1 ( k ) ] T I as the angular output power vector, the DAS-beamforming output power of an M-channel microphone array at array position k can be reformulated as

(8)

In tomography, a multi-dimensional signal is reconstructed from lower-dimensional measurements collected by multiple sensors from different directions.29 Therefore, it should be feasible to recover the unknown two-dimensional (2-D) power-level distribution produced by the acoustic sources located within a large region provided that the acoustic data are recorded at multiple array positions. Let H f = [ H f ( 0 ) T , , H f ( K 1 ) T ] T K · I × J be the concatenated imaging matrix, y f = [ y f ( 0 ) T , , y f ( K 1 ) T ] T K · I be the measurement vector, and v f = [ υ f ( 0 ) 1 I T , , υ f ( K 1 ) 1 I T ] T K · I be the additive noise vector. At frequency f, we then arrive at the set of linear equations

(9)

There are no analytical solutions for the resulting set of undetermined linear inverse problems (K · IJ) over the frequencies f = 0,…, F – 1. Instead, these ill-posed problems may be solved by regularization to impose prior information as a penalty term in the cost function.

Sparse signal representation methods may be considered to recover the unknown image field if it is known a priori that the localized acoustic noise-sources are positioned sparsely in the field such that x f has only few nonzero elements. Since the wideband acoustic sources sparsely distributed over a large region should also share the same spatial signal-support across the frequency spectrum, the sparsity constraint may be jointly imposed on groups of variables that refer to the same pixel location over multiple frequencies.

Let x j = [ x 0 , j , , x F 1 , j ] T F be the jth frequency-pixel group and H j F · K · I × F be the corresponding jth block-diagonal submatrix defined as

(10)

Then, the additive signal-noise model for the joint linear inverse problem may be given by

where x F · J is the unknown stacked field vector to be estimated, y F · K · I is the concatenated measurement vector, H F · K · I × F · J is the reordered and concatenated imaging matrix, and v F · K · I is the additive measurement noise vector. In this paper, we use two techniques including group LASSO24–26 and SBL19,20,27 for wideband tomographic imaging of localized acoustic noise sources by enforcing group sparsity.

In the Bayesian stochastic framework, if noise is assumed to have a Gaussian distribution, i.e., v N ( 0 , σ 2 I ) , then the data likelihood p ( y | x ) is also Gaussian,

(11)

If the prior p ( x ) is assumed to have a Laplacian-like distribution that promotes group sparsity,

(12)

where x 1 , 2 = j = 0 J 1 x j 2 is the ℓ1,2-norm, then using Bayes' theorem p ( x | y ) = p ( y | x ) p ( x ) / p ( y ) , the maximum a posteriori (MAP) estimate is given by

(13)

as the marginal distribution of the data p ( y ) does not depend on x . Introducing an additional constraint x 0 to enforce nonnegativity and setting μ = σ2/η yields the nonnegative group LASSO estimate26 

(14)

The BCD algorithm24–26 is used to solve the nonnegative group LASSO problem, in which each block x j is updated in a cyclic manner at every iteration as summarized in Table I.

TABLE I.

Group LASSO via BCD for wideband compressive beamforming tomography.

Given: μ > 0 , J , τ  
y f · K · I f o r f = 0 , , F 1  
h f , j · K · I f o r f = 0 , , F 1 ; j = 0 , , J 1  
Initialize: x ̂ j ( 0 ) = 0 f o r j = 0 , , J 1 , t = 1 
whilet ≤ τ 
x j x ̂ j ( t 1 ) f o r j = 0 , , J 1  
forj = 0 to J −1 
  forf = 0 to F − 1 
    ζ j [ f ] = h f , j T ( y f i j h f , i x i [ f ] )  
  end 
  if | | ζ j | | 2 μ then 
    x j 0  
  else 
   Find γ > 0 such that Q ( γ ) = 0 (Newton's method): 
     Q ( γ ) = 1 ( f , ζ j [ f ] > 0 ( ζ j [ f ] ) 2 ( γ | | h f , j | | 2 2 + μ ) 2 ) 1 / 2  
   forf = 0 to F − 1 
     x j [ f ] max { 0 , ζ j [ f ] / ( | | h f , j | | 2 2 + γ 1 μ ) }  
   end 
  end 
 end 
x ̂ j ( t ) x j f o r j = 0 , , J 1 , t t + 1  
end 
x ̂ j x ̂ j ( τ ) f o r j = 0 , , J 1  
X ̂ G L = [ x ̂ 0 x ̂ J 1 ] T  
Moore-Penrose pseudoinverse solution: 
x ̂ 1 , 2 = [ | | x ̂ 0 | | 2 , , | | x ̂ J 1 | | 2 ] T  
J = { j | J largest peaks in x ̂ 1 , 2 } = { j 0 , , j J 1 }  
H f , J = [ h f , j 0 h f , j J 1 ] f o r f = 0 , , F 1  
X ̂ J = [ H 0 , J + y 0 H F 1 , J + y F 1 ]  
Output: X ̂ G L , J , X ̂ J
Given: μ > 0 , J , τ  
y f · K · I f o r f = 0 , , F 1  
h f , j · K · I f o r f = 0 , , F 1 ; j = 0 , , J 1  
Initialize: x ̂ j ( 0 ) = 0 f o r j = 0 , , J 1 , t = 1 
whilet ≤ τ 
x j x ̂ j ( t 1 ) f o r j = 0 , , J 1  
forj = 0 to J −1 
  forf = 0 to F − 1 
    ζ j [ f ] = h f , j T ( y f i j h f , i x i [ f ] )  
  end 
  if | | ζ j | | 2 μ then 
    x j 0  
  else 
   Find γ > 0 such that Q ( γ ) = 0 (Newton's method): 
     Q ( γ ) = 1 ( f , ζ j [ f ] > 0 ( ζ j [ f ] ) 2 ( γ | | h f , j | | 2 2 + μ ) 2 ) 1 / 2  
   forf = 0 to F − 1 
     x j [ f ] max { 0 , ζ j [ f ] / ( | | h f , j | | 2 2 + γ 1 μ ) }  
   end 
  end 
 end 
x ̂ j ( t ) x j f o r j = 0 , , J 1 , t t + 1  
end 
x ̂ j x ̂ j ( τ ) f o r j = 0 , , J 1  
X ̂ G L = [ x ̂ 0 x ̂ J 1 ] T  
Moore-Penrose pseudoinverse solution: 
x ̂ 1 , 2 = [ | | x ̂ 0 | | 2 , , | | x ̂ J 1 | | 2 ] T  
J = { j | J largest peaks in x ̂ 1 , 2 } = { j 0 , , j J 1 }  
H f , J = [ h f , j 0 h f , j J 1 ] f o r f = 0 , , F 1  
X ̂ J = [ H 0 , J + y 0 H F 1 , J + y F 1 ]  
Output: X ̂ G L , J , X ̂ J

At iteration t, the BCD algorithm solves the optimization sub-problem for x j below at the jth sub-iteration, while fixing all the variables in other groups (ij),

(15)

where the residual vector r j = y i j H i x i .

Let L ( x j , μ ) = 1 2 | | r j H j x j | | 2 2 + μ | | x j | | 2 and ζ j = H j T r j . The subderivative of L ( x j , μ ) with respect to x j is then

(16)

where x j is the subdifferential operator for non-differentiable functions, and the subgradient

(17)

This immediately implies that x j = 0 if and only if | | ζ j | | 2 μ . If | | ζ j | | 2 > μ , then there exists a unique γ > 0 such that the optimum solution ( x j 0 ) yields | | x j | | 2 = γ (Refs. 24 and 26)

(18)

where H j T H j = diag { h 0 , j | | 2 2 , , h F 1 , j | | 2 2 } is a diagonal matrix with the ℓ2-norm-squared of the column vectors corresponding to the frequency-dependent group variables in x j on its diagonal. Hence, the nonnegativity of x j is ensured by exploiting the fact that x j 0 only if ζ j 0 . Each element of x j is then given by

(19)

where also using this block-diagonal structure,

(20)

The optimum value for γ can be obtained by applying Newton's method to the function Q ( γ ) = | | x j | | 2 2 / γ 2 = 1 ,

(21)

where only the frequency indices corresponding to the positive-valued elements of ζ j are considered as a result of the nonnegativity constraint. Alternatively, the Newton's method may be applied to Q ( γ ) = 1 Q ( γ ) 1 / 2 find the root γ, as it has been shown in practice to achieve better results.26 

The group LASSO estimate matrix is obtained when the BCD algorithm reaches iteration t = τ, where τ is the maximum allowable number of iterations used as a stopping criterion,

(22)

which is a row-sparse matrix with each column f corresponding to the estimate at the frequency f.

Since the noise or signal-level information is generally unavailable in advance, a value for the regularization parameter μ that makes a reasonable trade-off between the sparsity of the solution and agreement with the physical model is chosen manually, and the development of an empirical parameter-tuning methodology is left as future work.

LASSO-type techniques produce μ-dependent biased estimates, since the ℓ1,2-norm regularization shrinks the elements of x more toward zero with increasing μ. Using a similar approach discussed in Ref. 18, the unbiased source power-level estimates may be determined using the Moore-Penrose pseudoinverse solution. Let x ̂ 1 , 2 = [ | | x ̂ 0 | | 2 , , | | x ̂ J 1 | | 2 ] T be the ℓ1,2-norm estimate and J = { j | J largest peaks in x ̂ 1 , 2 } = { j 0 , , j K 1 } be the pixel set corresponding to the J largest peaks in x ̂ 1 , 2 . Then, the unbiased power-level estimates for the J sources at a frequency f are computed from

(23)

where the truncated matrix H f , J = [ h f , j 0 h f , j N 1 ] , and the pseudoinverse operator H f , J + = ( H f , J T H f , J ) 1 H f , J T .

In the SBL framework,27 each frequency-pixel group x j may be assigned an F-dimensional Gaussian prior with an unknown variance γj, yielding a multivariate Gaussian distribution across pixels

(24)

where γ = [ γ 0 , , γ J 1 ] T . The data likelihood distribution p ( y | x ; σ 2 ) is also assumed to have a multivariate Gaussian distribution with unknown noise variances σ f 2 across frequencies,

(25)

where σ 2 = [ σ 0 2 , , σ F 1 2 ] T .

The posterior distribution p ( x | y ; γ , σ 2 ) can be found via Bayes rule

(26)

where p ( y ; γ , σ 2 ) = p ( y | x ; σ ) p ( x ; γ ) d x is the fixed evidence term conditioned on ( γ , σ 2 ). This yields that given the prior and data likelihood with Gaussian distributions, p ( x | y ; γ , σ 2 ) is also Gaussian with the posterior mean μ x and covariance Σ x (Refs. 19, 20, and 27)

(27)

where Γ = diag { γ 0 I F , , γ j I F } is the prior covariance matrix, and Σ y 1 = diag { Σ y 0 1 , , Σ y F 1 1 } is the inverse of the block-diagonal data covariance matrix Σ y . Each submatrix Σ y f is computed from Eq. (9),

(28)

The posterior mean is also the MAP estimate given ( γ , σ 2 ),

(29)

Therefore, the group sparsity is controlled by the elements of γ , since x ̂ j MAP = 0 whenever γj = 0 with probability 1. For non-Gaussian signals, the MAP estimate becomes the linear minimum mean square error estimate, which minimizes E [ x x ̂ ] . Besides, since the MAP estimate does not guarantee nonnegativity for the power levels, it can be projected onto the nonnegative subspace by simply zeroing the negative estimates, i.e., x ̂ j [ f ] = max { 0 , γ j h f , j T Σ y f 1 y f } . Alternatively, the Gaussian prior assumption may be replaced by a one-sided probability distribution such as the exponential distribution to directly formulate SBL as a nonnegative least squares problem, which is left as future work.

The multi-frequency SBL algorithm for wideband compressive beamforming is summarized in Table II. The SBL methodology developed in Refs. 19, 20, and 21 for the estimation of the hyperparameters ( γ ̂ , σ 2 ̂ ) based on the maximization of the evidence p ( y ; γ , σ 2 ) = N ( 0 , Σ y ) is also used here, which is also known as the type-II maximum likelihood,

(30)

An approximate solution for this non-convex cost function can be found via fixed-point updates based on the derivative of Eq. (30) iteratively taken with respect to the frequency-pixel group variances γj followed by the estimation of the noise variance σ f 2 for the data likelihood across frequencies.19–21 

TABLE II.

Multi-frequency SBL for wideband compressive beamforming tomography.

Given: y f · K · I f o r f = 0 , , F 1 , J , τ  
h f , j · K · I f o r f = 0 , , F 1 ; j = 0 , , J 1  
Initialize: γ ( 0 ) = 1 J , σ 2 ( 0 ) = 0.1 · 1 F , t = 1 
whilet ≤ τ 
forf = 0 to F – 1 
   Σ y f = σ f 2 ( t 1 ) I K · I + j = 0 J 1 γ j ( t 1 ) h f , j h f , j T  
end 
forj = 0 to J – 1 
   γ j ( t ) γ j ( t 1 ) ( f = 0 F 1 ( h f , j T Σ y f 1 y f ) 2 f = 0 F 1 h f , j T Σ y f 1 h f , j ) 1 / 2  
end 
J = { j | J largest peaks in γ ( t ) } = { j 0 , , j J 1 }  
forf = 0 to F – 1 
   H f , J = [ h f , j 0 h f , j J 1 ]  
   S y f = diag { | | y f , 0 | | 2 2 , , | | y f , K · I | | 2 2 }  
   σ f 2 ( t ) 1 J J tr ( ( I F H f , J H f , J + ) S y f )  
end 
tt + 1 
end 
γ ̂ γ ( τ ) , σ 2 ̂ σ 2 ( τ )  
Σ y f = σ ̂ f 2 I + j = 0 J 1 γ ̂ j h f , j h f , j T f o r f = 0 , , F 1  
forj = 0 to J – 1 
x ̂ j [ f ] max { 0 , γ ̂ j h f , j T Σ y f 1 y f } f o r f = 0 , , F 1  
end 
X ̂ S B L = [ x ̂ 0 x ̂ J 1 ] T  
Output: X ̂ S B L , J
Given: y f · K · I f o r f = 0 , , F 1 , J , τ  
h f , j · K · I f o r f = 0 , , F 1 ; j = 0 , , J 1  
Initialize: γ ( 0 ) = 1 J , σ 2 ( 0 ) = 0.1 · 1 F , t = 1 
whilet ≤ τ 
forf = 0 to F – 1 
   Σ y f = σ f 2 ( t 1 ) I K · I + j = 0 J 1 γ j ( t 1 ) h f , j h f , j T  
end 
forj = 0 to J – 1 
   γ j ( t ) γ j ( t 1 ) ( f = 0 F 1 ( h f , j T Σ y f 1 y f ) 2 f = 0 F 1 h f , j T Σ y f 1 h f , j ) 1 / 2  
end 
J = { j | J largest peaks in γ ( t ) } = { j 0 , , j J 1 }  
forf = 0 to F – 1 
   H f , J = [ h f , j 0 h f , j J 1 ]  
   S y f = diag { | | y f , 0 | | 2 2 , , | | y f , K · I | | 2 2 }  
   σ f 2 ( t ) 1 J J tr ( ( I F H f , J H f , J + ) S y f )  
end 
tt + 1 
end 
γ ̂ γ ( τ ) , σ 2 ̂ σ 2 ( τ )  
Σ y f = σ ̂ f 2 I + j = 0 J 1 γ ̂ j h f , j h f , j T f o r f = 0 , , F 1  
forj = 0 to J – 1 
x ̂ j [ f ] max { 0 , γ ̂ j h f , j T Σ y f 1 y f } f o r f = 0 , , F 1  
end 
X ̂ S B L = [ x ̂ 0 x ̂ J 1 ] T  
Output: X ̂ S B L , J

At iteration t, using ( Σ y 1 / γ j ( t ) ) = Σ y 1 ( Σ y / γ j ( t ) ) Σ y 1 and ( Σ y / γ j ( t ) ) = h f , j h f , j T , we have

(31)

and

Then, ( / γ j ( t ) ) ( y T Σ y 1 y + ln det Σ y ) = 0 yields the fixed-point update at iteration t,

(32)

The noise variance is updated at each frequency iteratively using the estimator19,20

(33)

where J = { j | J largest peaks in γ ̂ } = { j 0 , , j J 1 } is the set of pixels corresponding to the J largest peaks in γ ̂ , and S y f denotes the data sample covariance matrix. This estimation rule is derived using the fact that the optimal solution ( γ ̂ , σ 2 ̂ ) should satisfy the necessary condition H f , J T ( S y f Σ y f ) H f , J = 0 .19,20 However, since there is only one single snapshot at each frequency, we take an empirical approach by replacing the sample data covariance matrix in Eq. (33) with S y f = diag { | | y f , 0 | | 2 2 , , | | y f , K I | | 2 2 } . Although this update usually underestimates the noise level, our empirical observation reveals that this estimate is sufficient enough for the fast convergence of the multi-frequency SBL algorithm.

The acoustic data recorded for our previous work23 was also used here to test the proposed wideband acoustic imaging algorithm. As illustrated in Fig. 1, an electric vehicle (EV) with a rooftop-mounted array with omnidirectional free-field microphones was driven along a straight road, while various pre-recorded noise audio clips were played from a loudspeaker mounted on a tripod elevated 1.25 m higher than the microphone array, and positioned at different distances from the road in an adjacent field. The microphone array consisted of two concentric 12-channel circular subarrays with the inner and outer diameters of 0.5 and 1 m, respectively, on which microphones were separated by 30° from each other. During the “stop-and-collect” experiments, the EV was stopped every 10 m for recordings at seven positions between 0 and 60 m. At each array position, pre-recorded environmental noise audio clips including jackhammer, pile driver and other construction-type noise-sources retrieved from Ref. 30, an online sound repository, were played back-to-back from the loudspeaker with durations ranging from 7 to 20 s. In “drive-by” experiments, the acoustic data were recorded continuously as the EV was driven along the same path at nearly constant velocities, during which longer-duration versions of noise recordings were played from the loudspeaker on the adjacent field, and the timing was achieved by two people shouting at 0 -m and 60 -m marks. In order to achieve near-omnidirectional sound propagation, the loudspeaker was rotated manually toward the EV during all experiments. Please refer to Ref. 23 for further details regarding the experimental setup and data collection.

FIG. 1.

(Color online) The experimental setup for acoustic noise-source tomographic mapping experiments.

FIG. 1.

(Color online) The experimental setup for acoustic noise-source tomographic mapping experiments.

Close modal

An area of 20 × 40 m2 inside the open-field was used for acoustic mapping with the image resolution of 0.5 m/pixel in both x- and y-directions, resulting in an image size of 41 × 81. The sampling rate was 25.6 kHz and the FFT-length was chosen to be N = 512 (frequency-bin bandwidth: 50 Hz). A Hann window with an overlap of 50% was applied to each time segment before the FFT. For stop-and-collect experiments, five array positions between the 10-m and 50-m marks were used for single-source mapping, and the sample covariance matrix was computed over the entire recording duration (7–20 s). In drive-by experiments, microphone channels were split into 0.5-s-long time blocks for sample covariance matrix computation, each corresponding to an array position estimated via the average EV velocity to be equally separated by 2.5 m along the same road portion (10–50 m), leading to a total of 17 measurement positions. The DAS-beamforming power outputs were initially computed for the angular range between 0° and 180° with 1° angular resolution over the frequency range of interest, and then further downsampled to 5° resolution to reduce the computation time for map generation. A reference power-level for each noise source at a frequency band was determined by finding the DOAs for the five “stop-and-collect” measurements (10–50 m) via geometry, and then averaging over the beamforming power outputs along the DOAs: x f avg = k = 0 4 y f , θ s r c ( k ) r k 2 / 5 , where r k 2 is distance of the loudspeaker from the array at position k used to account for the distance-squared power attenuation.

The DAS-beamforming suffers poor resolution at low frequencies. This corresponds to extremely low-rank (or highly coherent) imaging matrices, causing significant degradation in imaging performance if very low frequency components are used. On the contrary, spatial aliasing occurs due to array spacing at high frequencies, which may lead to weak ghost sources in tomographic maps. Therefore, the operating frequency range was reasonably selected as 500–4000 Hz based on the array design, resulting in a total of 71 frequency bands (with a bandwidth of 50 Hz), so as to recover a sparse common spatial support over the frequency spectrum by exploiting the superior resolution at higher frequencies while preventing spatial aliasing by using lower frequency bands.

Group LASSO was run for τ = 500 iterations, whereas τ = 100 for multi-frequency SBL at all cases, as the group LASSO requires more iterations due to the cyclic update at each iteration slowing down the algorithm convergence. The cyclic update also may be computationally very costly given the problem size, but it was accelerated by computing the vector products in Eq. (20) in advance at the cost of a large storage requirement. Newton's method, however, generally was able to find the root within five iterations, so it did not have a notable effect on the overall LASSO computational cost. Furthermore, both algorithms implicitly assume the number of sources to be known beforehand for the tuning of the regularization parameter in group LASSO, and the estimation of the noise variance in SBL. However, a slightly incorrect number had a relatively limited impact on the SBL performance.

In environmental noise measurements, A-weighting is typically used as a correction value based on a standardized weighting filter designed with respect to the human ear sensitivity over frequency.28 For each case, an image was reconstructed by averaging over the frequency range using A-weighting, in which the value of pixel j in the averaged image was computed by x j a v g = ( 1 / F ) f = 0 F 1 A w ( f ) x f , j , where Aw(f) is the frequency-dependent A-weighting function, followed by the dB-SPL conversion (with the reference pressure level of 20 μPa). An additional threshold of 50 dB was also applied to the images for illustration purposes. A source location was estimated from a reconstructed image by determining the position of the pixel with the largest value among the pixels in close proximity. For the power spectrum estimation, the SPL (re 20 μPa) at a frequency band f was computed by the summation of the multiple pixel values representing the same noise-source in the image version of the estimate x ̂ f and then conversion of the total value into the A-weighted dB-SPL [denoted as “dB(A)”]. The Moore-Penrose pseudoinverse solution given in Eq. (23) was also computed using the estimated source locations at each case for both imaging algorithms (denoted as “P-INV”). The noise-sources used in the experiments were fixed in space, but in fact, many of them were time-varying, meaning that the estimated noise level via the stationary model in general corresponded to the short-term average SPLs. The relative residual error defined as Rel . Res . = y f y ̂ f 2 / y f 2 was considered for quantitative performance evaluation, referring to the consistency between the physical DAS-beamforming output power vector y f and the output power y ̂ f = H f x ̂ f estimated via the forward model.

Figure 2 shows the acoustic single-source maps for the “jackhammer I” noise-source positioned at (30, 15 m) averaged over the A-weighted images reconstructed over the frequency range of 500–4000 Hz using the stop-and-collect data and drive-by measurements recorded at the average velocities of 5.6, 10.1, and 17.5 km/h. The source-location and power-spectrum estimates for all cases are presented in Table III and Fig. 3, respectively. As anticipated, the constant-speed assumption for the drive-by experiments was often violated, resulting in the relative residual error being higher in all drive-by cases than the stop-and collect results. The fluctuations generally corresponded to the deep notches at the estimated SPLs, particularly at lower frequencies, since by definition, the relative residual error increases with the weakening signal power-level. However, the vehicle movement and varying velocity only caused modest artifacts in the acoustic maps and errors in the location estimates of less than 2 m, since an accurate DAS-beamformer output power computation requiring only a small number of snapshots (0.5 s in “drive-by” compared to 20 s in “stop-and collect”) made it possible to use many more array positions for tomographic mapping, reducing the estimation errors in the drive-by trials.

FIG. 2.

(Color online) The resulting images averaged over the frequency range of 500–4000 Hz using A-weighting for the wideband tomographic acoustic mapping of the “jackhammer I” noise-source played from a loudspeaker positioned at (30 m, 15 m) in the open field.

FIG. 2.

(Color online) The resulting images averaged over the frequency range of 500–4000 Hz using A-weighting for the wideband tomographic acoustic mapping of the “jackhammer I” noise-source played from a loudspeaker positioned at (30 m, 15 m) in the open field.

Close modal
TABLE III.

The source-location estimates for the imaging results given in Fig. 2.

Method Stop-and-collect V1 = 5.6 km/h V2 = 10.1 km/h V3 = 17.5 km/h
Group LASSO  (29.5 m, 15.0 m)  (30.0 m, 14.0 m)  (30.5 m, 15.0 m)  (30.5 m, 16.0 m) 
Multi-freq. SBL  (29.5 m, 15.0 m)  (30.0 m, 14.5 m)  (30.5 m, 15.0 m)  (30.5 m, 16.0 m) 
Method Stop-and-collect V1 = 5.6 km/h V2 = 10.1 km/h V3 = 17.5 km/h
Group LASSO  (29.5 m, 15.0 m)  (30.0 m, 14.0 m)  (30.5 m, 15.0 m)  (30.5 m, 16.0 m) 
Multi-freq. SBL  (29.5 m, 15.0 m)  (30.0 m, 14.5 m)  (30.5 m, 15.0 m)  (30.5 m, 16.0 m) 
FIG. 3.

(Color online) The power-spectrum estimates for the imaging results given in Fig. 1 (Avg. SPL: Average SPL, P-INV: SPL computed via the Moore-Penrose pseudoinverse), and the corresponding relative residual error results (bottom row).

FIG. 3.

(Color online) The power-spectrum estimates for the imaging results given in Fig. 1 (Avg. SPL: Average SPL, P-INV: SPL computed via the Moore-Penrose pseudoinverse), and the corresponding relative residual error results (bottom row).

Close modal

The multi-frequency SBL generates maps much sharper than the group LASSO, benefiting from the hyperparameter estimation procedure. On the other hand, SBL may suffer from lack of an additional smoothness prior, as seen in the drive-by case of 10.1 km/h, where a strong short-duration, intermittent source (a human shout very close to the EV driving by around the 30-m mark) recorded at a few array positions led to the worst performance, and the multi-frequency SBL produced more background artifacts in the map than the group LASSO. At the same time, the effect of the intermittent source was mitigated by the nature of tomographic imaging, which implicitly averaged over the measurements collected at multiple array positions. The same also held for the destructive/constructive ground reflections, which were significant at lower frequencies while gradually disappearing with increasing frequency. Although the imaging model did not take acoustic reflections into account, their impact was limited as a result of this averaging feature. As expected, group LASSO produced biased estimates, with values being smaller than the pseudoinverse solution, whereas there was a close match for SBL (except for the worst case of 10.1 km/h).

The Doppler effect was not notably observed because of operating at relatively slow velocities, as the maximum shift in frequency was computed to be less than 100 Hz at the highest frequency, not causing a significant variation in the estimated SPLs as a result of the wideband source having a relatively smooth power spectrum at high frequencies. The slow-velocity driving also reduced the impact of wind noise on the mapping results, but additional trials at average velocities higher than 20 km/h showed that wind noise may significantly degrade the data quality, and hence require the use of well-designed wind screens for mitigation. Furthermore, the current imaging model does not account for the ground absorptions, which may have a significant effect over the frequency spectrum depending on the field characteristics, resulting in the measured/estimated SPLs to be lower than the actual power-level of the noise-source.31 The collection of more drive-by measurements around the field of interest may be a solution for this issue, as the losses due to ground absorptions over the power spectrum may be assessed based on these measurements and used as prior information.

The proposed technique was also tested for the multiple-source scenario, which was realized by mixing stop-and-collect microphone recordings of different noise-sources before the DAS beamforming, at the cost of increasing background noise. Horizontally shifted noise-sources were created by mixing the recordings from the five measurement locations between 0 and 40 m and 20 and 60 m marks. The acoustic multiple-source maps for the two cases are shown in Fig. 4 along with the source-location and power-spectrum estimates given in Table IV, and Figs. 5 and 6, respectively.

FIG. 4.

(Color online) The resulting images averaged over the frequency range of 500–4000 Hz using A-weighting for wideband tomographic acoustic multiple-source mapping.

FIG. 4.

(Color online) The resulting images averaged over the frequency range of 500–4000 Hz using A-weighting for wideband tomographic acoustic multiple-source mapping.

Close modal
TABLE IV.

The source-location estimates for the imaging results given Fig. 4.

Case I Actual location Group LASSO Multi-frequency SBL
Pile driver  (20.0 m, 15.0 m)  (19.0 m, 15.0 m)  (19.0 m, 15.0 m) 
Jackhammer II  (30.0 m, 15.0 m)  (29.5 m, 15.5 m)  (29.5 m, 15.0 m) 
Jackhammer III  (40.0 m, 15.0 m)  (39.5 m, 15.0 m)  (39.5 m, 15.0 m) 
Case II  Actual Loc.  Group LASSO  Multi-freq. SBL 
Construction site I  (20.0 m, 10.0 m)  (20.0 m, 9.5 m)  (20.0 m, 10.0 m) 
Construction site II  (30.0 m, 15.0 m)  (29.5 m, 15.5 m)  (29.5 m, 15.0 m) 
White noise  (40.0 m, 10.0 m)  (40.0 m, 9.5 m)  (40.0 m, 10.0 m) 
Case I Actual location Group LASSO Multi-frequency SBL
Pile driver  (20.0 m, 15.0 m)  (19.0 m, 15.0 m)  (19.0 m, 15.0 m) 
Jackhammer II  (30.0 m, 15.0 m)  (29.5 m, 15.5 m)  (29.5 m, 15.0 m) 
Jackhammer III  (40.0 m, 15.0 m)  (39.5 m, 15.0 m)  (39.5 m, 15.0 m) 
Case II  Actual Loc.  Group LASSO  Multi-freq. SBL 
Construction site I  (20.0 m, 10.0 m)  (20.0 m, 9.5 m)  (20.0 m, 10.0 m) 
Construction site II  (30.0 m, 15.0 m)  (29.5 m, 15.5 m)  (29.5 m, 15.0 m) 
White noise  (40.0 m, 10.0 m)  (40.0 m, 9.5 m)  (40.0 m, 10.0 m) 
FIG. 5.

(Color online) The power-spectrum estimates for case I in Fig. 4 and the resulting relative residual error.

FIG. 5.

(Color online) The power-spectrum estimates for case I in Fig. 4 and the resulting relative residual error.

Close modal
FIG. 6.

(Color online) The power-spectrum estimates for case II in Fig. 4 and the resulting relative residual error.

FIG. 6.

(Color online) The power-spectrum estimates for case II in Fig. 4 and the resulting relative residual error.

Close modal

There were only five measurement locations separated by 10 m, which may be considered as a type of limited-data tomography problem, particularly degrading the power-spectrum estimation performance and causing visible near-field artifacts in the map backgrounds for the second case, whereas the performance was much better in the first case, since there was a direct line of sight was between the sources and microphone array at every measurement position. The multi-frequency SBL outperformed group LASSO in both cases, showing its robustness to the tripled background noise, as it produced much cleaner images and estimated SPLs that were much closer to the average values across the entire frequency spectrum. LASSO was very prone to poor resolution at lower frequencies, resulting in a very large bias in the estimated SPLs. The effect of poor resolution and limited angular views was also observed in the second case for the pseudo-norm solution, as one of the sources was masked by the primary source, failing to recover the SPL at 950 Hz. On the other hand, the multi-frequency SBL yielded higher relative residual errors because of generating much sparser maps, which led to a larger model mismatch. Overall, both imaging algorithms were able to estimate the source locations and spectrum with high accuracy.

This paper presents a novel group-sparse tomographic imaging approach for large-scale wideband acoustic source mapping with a portable microphone array mounted on a vehicle, exploiting the common spatial sparsity support shared across the frequency spectrum. This methodology enables the precise localization of primary wideband acoustic sources over a large area and the individual detailed identification of each source through power spectrum estimation, achieved by creating 2-D beamforming-based high-resolution acoustic noise-maps over multiple frequencies. This novel monitoring scheme may overcome some of the main limitations of acoustic monitoring, since it allows acoustic recordings at as many array positions as possible while the vehicle drives by, reducing the cost and time associated with data acquisition by replacing fixed terminals with mobile sensors. With this less time- and labor-intensive approach, it becomes very feasible to install microphone arrays onto vehicles to regularly create noise maps of the city neighbourhoods traversed by the vehicles during the day and night. The evaluation of these periodically generated maps may assist urban planners in gaining a better understanding of the noise-scape in different urban settings and developing location-specific mitigation solutions. The unprecedented scale of detailed frequency-based information may be used in crucial tasks including sound classification and noise profiling at different time periods and locations, which may also potentially be used in noise-prediction computer models to improve the input data quality.

Future work includes the experimental validation of the drive-by acoustic imaging methodology with real acoustic noise sources, and the further improvement of the imaging model to tackle correlated noise-source scenarios and to take other major outdoor sound propagation factors into account including acoustic reflections, ground and atmospheric absorptions, and variations in wind and temperature conditions. The more detailed nonstationary characteristics of the acoustic noise-sources may also be achieved by developing a more generalized dynamic imaging model.

This research is supported in part by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister's Office under the Land and Liveability National Innovation Challenge (L2 NIC) Research Programme (L2 NIC Award No. L2NICCFP1-2013-7) and in part by the research grant for the Human-Centered Cyber-physical Systems Programme at the Advanced Digital Sciences Center from Singapore's Agency for Science, Technology and Research (A*STAR).

1.
WHO Regional Office for Europe
, Burden of Disease from Environmental Noise (
2011
). This is available online at http://www.euro.who.int/__data/assets/pdf_file/0008/136466/e94888.pdf (Last viewed July 31, 2017).
2.
E.
Murphy
, “
What to do about environmental noise?
,”
Acoust. Today
13
(
2
),
18
25
(
2017
).
3.
R. P.
Dougherty
, “
Beamforming in acoustic testing
,” in
Aeroacoustic Measurements
, edited by
T. J.
Mueller
(
Springer
,
New York
,
2002
), pp.
63
97
.
4.
F. J.
Fahy
, “
Acoustic measurement of acoustic intensity using the cross-spectral density of two microphone signals
,”
J. Acoust. Soc. Am.
62
(
4
),
1057
1059
(
1977
).
5.
J. D.
Maynard
,
E. G.
Williams
, and
Y.
Lee
, “
Nearfield acoustic holography: I. Theory of generalized holography and the development of NAH
,”
J. Acoust. Soc. Am.
78
(
4
),
1395
1413
(
1985
).
6.
B. D.
Van Veen
and
K. M.
Buckley
, “
Beamforming: A versatile approach to spatial filtering
,”
IEEE ASSP Mag.
5
(
2
),
4
24
(
1998
).
7.
T. F.
Brooks
,
M. A.
Marcolini
, and
D. S.
Pope
, “
A directional array approach for the measurement of rotor noise source distributions with controlled spatial resolution
,”
J. Sound Vib.
112
,
192
197
(
1987
).
8.
J.
Li
,
Y.
Xie
,
P.
Stoica
,
X.
Zheng
, and
J.
Ward
, “
Beampattern synthesis via a matrix approach for signal power estimation
,”
IEEE Trans. Sign. Process.
55
,
5643
5657
(
2007
).
9.
J. R.
Underbrink
, “
Aeroacoustic phased array testing in low speed wind tunnels
,” in
Aeroacoustic Measurements
, edited by
T. J.
Mueller
(
Springer
,
New York
,
2002
), pp.
98
217
.
10.
E. J.
Arcondoulis
,
C. J.
Doolan
,
A. C.
Zander
, and
L. A.
Brooks
, “
Design and calibration of a small aeroacoustic beamformer
,” in
20th International Congress on Acoustics
, Sydney, New South Wales, Australia,
2010
.
11.
J.
Capon
, “
High-resolution frequency-wavenumber spectrum analysis
,”
Proc. IEEE
57
,
1408
1418
(
1969
).
12.
R. P.
Dougherty
and
R. W.
Stoker
, “
Sidelobe suppression for phased array aeroacoustic measurements
,” in
4th AIAA/CEAS Aeroacoustics Conference
, AIAA, Toulouse, France (1998), pp.
98
2242
.
13.
R. A.
Gramann
and
J.
Mocio
, “
Aero-acoustic measurements in wind tunnels using conventional and adaptive beamforming methods
,” AIAA Paper 93-4341 (
1993
).
14.
Z.
Wang
,
J.
Li
,
P.
Stoica
,
T.
Nishida
, and
M.
Sheplak
, “
Wideband RELAX and wideband CLEAN for aeroacoustic imaging
,”
J. Acoust. Soc. Am.
115
,
757
767
(
2004
).
15.
T. F.
Brooks
and
W. M.
Humphreys
, “
A deconvolution approach for the mapping of acoustic sources (DAMAS) determined from phased microphone arrays
,”
J. Sound Vib.
294
(
4
),
856
879
(
2006
).
16.
T.
Yardibi
,
J.
Li
,
P.
Stoica
, and
L. N.
Cattafesta
 III
, “
Sparsity constrained deconvolution approaches for acoustic source mapping
,”
J. Acoust. Soc. Am.
123
(
5
),
2631
2642
(
2008
).
17.
D. M.
Malioutov
,
M.
Cetin
, and
A. S.
Willsky
, “
A sparse signal reconstruction perspective for source localization with sensor arrays
,”
IEEE Trans. Sign. Process.
53
,
3010
3022
(
2005
).
18.
P.
Gerstoft
,
A.
Xenaki
, and
C. F.
Mecklenbräuker
, “
Multiple and single snapshot compressive beamforming
,”
J. Acoust. Soc. Am.
138
(
4
),
2003
2014
(
2015
).
19.
P.
Gerstoft
,
C. F.
Mecklenbräuker
,
A.
Xenaki
, and
S.
Nannuru
, “
Multisnapshot sparse Bayesian learning for DOA
,”
IEEE Sign. Process. Lett.
23
(
10
),
1469
1473
(
2016
).
20.
K. L.
Gemba
,
S.
Nannuru
,
P.
Gerstoft
, and
W. S.
Hodgkiss
, “
Multi-frequency sparse Bayesian learning for robust matched field processing
,”
J. Acoust. Soc. Am.
141
(
5
),
3411
3420
(
2017
).
21.
S.
Nannuru
,
K. L.
Gemba
,
P.
Gerstoft
,
W. S.
Hodgkiss
, and
C. F.
Mecklenbräuker
, “
Multi-frequency sparse Bayesian learning with uncertainty models
,” arXiv:1704.00436 (
2017
).
22.
R.
Tibshirani
, “
Regression shrinkage and selection via the LASSO
,”
J. R. Stat. Soc. Ser. B Methodol.
58
,
267
288
(
1996
).
23.
C.
Tuna
,
S.
Zhao
,
T. N. T.
Nguyen
, and
D. L.
Jones
, “
Drive-by large-region acoustic noise-source mapping via sparse beamforming tomography
,”
J. Acoust. Soc. Am.
140
(
1
),
2530
2541
(
2016
).
24.
A. T.
Puig
,
A.
Wiesel
, and
A. O.
Hero
, “
A multidimensional shrinkage-thresholding operator
,” in
The 2009 IEEE/SP 15th Workshop on Statistical Signal Processing
, Cardiff, UK (
2009
).
25.
A.
Rakotomamonjy
, “
Surveying and comparing simultaneous sparse approximation (or group-LASSO) algorithms
,”
Sign. Process.
91
,
1505
1526
(
2011
).
26.
Z.
Qin
,
K.
Scheinberg
, and
D.
Goldfrab
, “
Efficient block-coordinate descent algorithms for the group LASSO
,”
Math. Prog. Comput.
5
,
143
169
(
2013
).
27.
D. P.
Wipf
and
B. D.
Rao
, “
An empirical Bayesian strategy for solving the simultaneous sparse approximation problem
,”
IEEE Trans. Sign. Process.
55
(
7
),
3704
3716
(
2007
).
28.
L. E.
Kinsler
,
A. R.
Frey
,
A. B.
Coppens
, and
J. V.
Sanders
,
Fundamentals of Acoustics
, 4th ed. (
Wiley
,
Hoboken, NJ
,
2000
).
29.
C. W.
Groetsch
,
Inverse Problems in the Mathematical Sciences
(
Vieweg
,
Braunschweig, Germany
,
1993
).
30.
Freesound.org
, a collaborative database of Creative Commons licensed sounds, https://www.freesound.org/ (Last viewed July 31, 2017).
31.
J. E.
Piercy
,
T. F. W.
Embleton
, and
L. C.
Sutherland
, “
Review of noise propagation in the atmosphere
,”
J. Acoust. Soc. Am.
61
(
6
),
1403
1418
(
1977
).