Acoustic imaging can be performed using a spherical microphone array (SMA) and conventional beamforming (CBF) or spherical harmonic beamforming (SHB). At low frequencies, the mainlobe width depends on the SMA radius for CBF and on the order of the spherical harmonics expansion for SHB, which is related to the number of microphones. In this letter, Kriging is used to virtually increase the SMA radius and/or the number of microphones. Numerical and experimental investigations show the effectiveness of Kriging to reduce the mainlobe width and thus improve the acoustic images obtained with a SMA and CBF or SHB.

## 1. Introduction

Acoustic imaging can be performed using a planar or a spherical microphone array (SMA) (Chiariotti et al., 2019). Planar microphone arrays are commonly used for pass-by or wind tunnel measurements (Merino-Martínez et al., 2019; Padois and Berry, 2017), while spherical microphone arrays are more suitable for workplace or in-vehicle measurements (Heilmann et al., 2008; Noël et al., 2006) because they can capture the acoustic waves coming from all directions. While the main algorithm used with a planar microphone array is beamforming in the frequency or time domain (Padois et al., 2021), it is also possible to transform the sound field into the spherical harmonic domain using a SMA (Rafaely, 2019). This latter algorithm is known as spherical harmonic beamforming (SHB) (Petersen, 2004).

For beamforming, the SMA is commonly open, which means that the microphones are installed either on a wire-frame (Heilmann et al., 2008; Merimaa, 2002) or at the extremity of rods (Noël et al., 2006; Padois et al., 2017). In this case, the formulation of conventional beamforming in the frequency domain (CBF) is the same as for planar microphone arrays and is based on free-field propagation, assuming no interaction between the acoustic waves and the SMA. Alternatively, time-domain beamforming based on the generalized cross correlation can be used with the same hypothesis of free-field propagation (Padois et al., 2016; Quaegebeur et al., 2016). It must be noted that CBF has been successfully used with a rigid SMA even though the scattering effects were not taken into account (Rouard et al., 2022). For SHB, the microphones are commonly flush-mounted over an acoustically rigid sphere (Haddad and Hald, 2008). In this case, the acoustic waves may interfere with the sphere, leading to scattering effects which can be accounted for in the SHB algorithm (Petersen, 2004; Rafaely, 2019). Although less common, SHB has also been used with an open SMA, but this leads to numerical instabilities at some specific frequencies due to division by the zeros of the Bessel function (Balmages and Rafaely, 2007; Rouard et al., 2022).

The limitations of CBF and SHB are different in nature. For CBF, the size of the mainlobe (which indicates the source position on the acoustic image) is related to the frequency of the source signal and to the SMA radius. At low frequencies, a larger mainlobe is thus expected. At high frequencies, the mainlobe width decreases but at the expense of a higher side lobe level (Padois et al., 2021). Therefore, a larger SMA is required for imaging the low frequencies, which increases the microphone spacing and in turn decreases the spatial aliasing frequency. One solution is to increase the number of microphones (Heilmann et al., 2008). Conversely, the mainlobe provided by SHB is not related to the frequency but to the order $N$ of the spherical harmonics expansion (Haddad and Hald, 2008; Petersen, 2004). For a given order, the mainlobe width is constant over a frequency range. A higher order provides a narrower mainlobe. However, the order is limited by the number of microphones $Q$, usually estimated by $Q > (N + 1)^2$ (Battista et al., 2018; Chiariotti et al., 2019). Therefore, the number of microphones increases rapidly with the order of the spherical harmonics expansion. Moreover, SHB requires the orthogonality of the spherical harmonics. The positions of the microphones on the sphere therefore cannot be chosen arbitrarily and must follow specific geometrical patterns, such as a t-design (Lecomte et al., 2016; Rouard et al., 2022).
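To make the microphone-count constraint concrete, the short sketch below computes the largest order $N$ that satisfies the strict inequality $Q > (N + 1)^2$. The function name `max_order_from_count` is hypothetical, and the result is only a count-based upper bound: it says nothing about whether a given microphone layout preserves the orthogonality of the spherical harmonics.

```python
import math


def max_order_from_count(num_mics: int) -> int:
    """Largest N such that (N + 1)**2 < num_mics (strict inequality)."""
    if num_mics < 2:
        raise ValueError("at least 2 microphones are needed for order 0")
    # isqrt(Q - 1) is the largest integer m with m**2 <= Q - 1, i.e. m**2 < Q.
    return math.isqrt(num_mics - 1) - 1


for q in (24, 60):
    n = max_order_from_count(q)
    print(f"Q = {q}: N <= {n}, using {(n + 1) ** 2} harmonics")
```

For $Q = 24$ the bound gives $N = 3$, which matches the order quoted later for the original array. For $Q = 60$ the count alone would allow $N = 6$, whereas the text uses $N = 5$ with the t-design, which suggests that the sampling scheme can impose a stricter limit than the count.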

Algorithms have been developed to improve the acoustic image obtained with a rigid SMA. They are based on deconvolution techniques initially developed for planar microphone arrays, such as CLEAN-SC or DAMAS (Chu et al., 2015, 2019). However, the application is usually limited to high frequencies (>1000 Hz). To improve the acoustic image at low frequencies, Tiana-Roig *et al.* proposed to virtually increase the SMA radius (Tiana-Roig et al., 2014). First, the acoustic pressure is measured with a rigid SMA; then acoustic holography is used to predict the acoustic pressure on a larger virtual rigid SMA; finally, SHB is applied to this virtual SMA. Although the results are promising, this technique shares the limitations of SHB described above.

In this letter, we propose to virtually modify the SMA before performing CBF or SHB using Kriging (Grana et al., 2021). This technique is commonly used in geostatistics to predict an unknown value at a virtual point from a weighted average of the values measured at multiple sampled points. The objective of this letter is therefore to investigate whether Kriging is a good candidate to virtually modify the SMA, either by increasing the SMA radius or by increasing the number of microphones. This letter focuses on an open SMA with few microphones. In Sec. 2, the formulations of CBF and SHB are recalled, as well as a criterion to characterize the quality of the acoustic image. Kriging and its application to acoustic imaging are described in Sec. 3. The methodology and the results are presented in Secs. 4 and 5, respectively. Finally, Sec. 6 draws some conclusions and outlines future work.

## 2. Acoustic imaging

In the following, $\omega$ is the angular frequency and $(\cdot)^*$ denotes the Hermitian transpose. Matrix and vector terms are denoted in bold. In the case of a SMA, the source position is searched over a spherical scan zone with coordinates $(\theta_l, \varphi_l)$ including $L$ scan points. The main difference between CBF and SHB is the formulation of the steering matrix $\mathbf{w}(\omega)$ and the cross-spectral matrix $\mathbf{C}(\omega)$.

### 2.1 CBF

Here, $r_{lq}$ represents the distance between scan point $l$ and microphone $q$, and $c_0$ is the sound speed. The cross-spectral matrix $\mathbf{C}(\omega)$ is given by the product of the Fourier transform of the microphone signals $\mathbf{p}(\omega)$ and its Hermitian transpose.
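The following is a minimal sketch of frequency-domain CBF under free-field propagation, written only from the quantities named in this section. The steering term used here (a pure propagation phase $e^{-j\omega r_{lq}/c_0}$ normalised by $Q$) is an assumption, since several normalisations exist; the function name `cbf_map` and the default sound speed are likewise hypothetical.

```python
import numpy as np


def cbf_map(p, mic_xyz, scan_xyz, freq_hz, c0=343.0):
    """Beamformer output power at each scan point for one frequency.

    p        : (Q,) complex microphone spectra at freq_hz
    mic_xyz  : (Q, 3) microphone positions in metres
    scan_xyz : (L, 3) scan point positions in metres
    """
    omega = 2.0 * np.pi * freq_hz
    # r[l, q]: distance between scan point l and microphone q
    r = np.linalg.norm(scan_xyz[:, None, :] - mic_xyz[None, :, :], axis=2)
    # Assumed steering term: free-field propagation phase, normalised by Q
    w = np.exp(-1j * omega * r / c0) / mic_xyz.shape[0]
    # Cross-spectral matrix from a single snapshot: C = p p^H
    C = np.outer(p, np.conj(p))
    # Quadratic form w_l^H C w_l evaluated for every scan point l
    return np.real(np.einsum("lq,qk,lk->l", np.conj(w), C, w))
```

With unit-amplitude phase-only microphone spectra, this output equals 1 at the scan point that coincides with the source and is smaller elsewhere, which is the mainlobe discussed in the text. In practice the cross-spectral matrix would be averaged over several snapshots rather than built from one.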

### 2.2 SHB

Here, $\alpha_q$ is a weight which depends on the microphone geometry on the sphere, $p_q(\omega)$ is the sound pressure measured by microphone $q$, and $Y_n^{m*}(\theta_q, \varphi_q)$ is an element of the spherical harmonic matrix.

In addition, $k$ is the wavenumber, $d_n = 4\pi/(N + 1)^2$ is a weight parameter giving the maximum directivity (Rafaely, 2019), and $b_n$ is expressed in terms of $j_n$, the Bessel function of the first kind, and $h_n^{(2)}$, the Hankel function of the second kind. The dimension of the modal steering matrix $\mathbf{w}_{SH}(\omega)$ is $[(N + 1)^2 \times L]$.

### 2.3 Acoustic image quality criterion

In the case of a single source, an acoustic image exhibits a mainlobe, which has the highest level, and secondary lobes with lower levels, called side lobes. The quality of an acoustic image is considered good when the mainlobe is narrow and the side lobe level is low. Beyond visual inspection of the acoustic images, it is important to define criteria for quantitative comparison. A criterion based on the area of the mainlobe has been selected (Padois et al., 2021). The objective is to enclose in an ellipse the values of the mainlobe that are higher than –3 dB with respect to the peak value. The minor and major axes of the ellipse give its area, which is then normalized by the image area (360° × 180°). The result is expressed in %. For instance, a value of 10% means that the ellipse, i.e., the mainlobe, covers 10% of the acoustic image. Therefore, the smaller the ratio, the better the acoustic image. In the following, this criterion is denoted ellipse area ratio (EAR).
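As an illustration of this criterion, the sketch below estimates an EAR from a map of levels in dB. The text does not say how the ellipse is fitted, so this version assumes that its axes are simply the azimuth and elevation extents of the region above –3 dB. It also assumes a single connected mainlobe that does not wrap around the azimuth boundary; side lobes exceeding –3 dB would be included in the extent. The function name `ellipse_area_ratio` is hypothetical.

```python
import numpy as np


def ellipse_area_ratio(level_db, az_deg, el_deg, threshold_db=-3.0):
    """EAR in percent for a map of shape (len(el_deg), len(az_deg))."""
    rel = level_db - level_db.max()           # levels relative to the peak
    el_idx, az_idx = np.nonzero(rel >= threshold_db)
    # Assumed ellipse axes: extents of the -3 dB region in each direction
    width = az_deg[az_idx].max() - az_deg[az_idx].min()
    height = el_deg[el_idx].max() - el_deg[el_idx].min()
    area = np.pi * (width / 2.0) * (height / 2.0)
    return 100.0 * area / (360.0 * 180.0)     # normalised by the image area


# Synthetic lobe whose -3 dB contour is an ellipse with semi-axes 40 and 20 deg
az = np.arange(-180.0, 180.0, 2.0)
el = np.arange(0.0, 180.0, 2.0)
AZ, EL = np.meshgrid(az, el)
level = -3.0 * ((AZ / 40.0) ** 2 + ((EL - 90.0) / 20.0) ** 2)
print(f"EAR = {ellipse_area_ratio(level, az, el):.2f} %")
```

A fitted ellipse (for example from a least-squares contour fit) would be more robust for tilted or irregular mainlobes; the axis-aligned extent is used here only to keep the example short.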

## 3. Spatial interpolation based on Kriging

Here, $\lambda_q$ is an element of the weight vector $\boldsymbol{\lambda}$. $\mathbf{K}$ and $\tilde{\mathbf{K}}$ correspond, respectively, to the covariance matrix between the original microphone pairs
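The weighted-average idea behind Kriging can be sketched as follows. This is a simple-Kriging variant under stated assumptions, not necessarily the formulation used in this letter: the covariance is taken to be an exponential function of the distance between positions, the length scale `length` is a hypothetical parameter, and the same real weights are applied to the complex microphone spectra. The function names are also hypothetical.

```python
import numpy as np


def exp_cov(a, b, length):
    """Assumed covariance model: exp(-distance / length) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return np.exp(-d / length)


def kriging_predict(p, mic_xyz, virt_xyz, length=0.3):
    """Predict spectra at virtual microphones from the measured spectra p.

    p        : (Q,) complex spectra at the original microphones
    mic_xyz  : (Q, 3) original microphone positions
    virt_xyz : (QK, 3) virtual microphone positions
    """
    K = exp_cov(mic_xyz, mic_xyz, length)         # original-to-original
    K_tilde = exp_cov(mic_xyz, virt_xyz, length)  # original-to-virtual
    lam = np.linalg.solve(K, K_tilde)             # one weight column per virtual mic
    return lam.T @ p                              # weighted average of measurements
```

A useful sanity check is that predicting at the original microphone positions returns the measured values, since the weight matrix then reduces to the identity. Ordinary Kriging would additionally constrain each weight column to sum to one.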

## 4. Methodology

The original open SMA with a radius $R$ and $Q$ randomly distributed microphones is depicted in Fig. 1 (left side). Kriging can be used to increase the original SMA radius to the virtual SMA radius $R_K$ and/or to increase the number of microphones to the number of virtual microphones $Q_K$ (Fig. 1, right side). With CBF, both the original SMA radius and the number of microphones are increased with Kriging. With SHB, only the number of microphones is virtually increased in order to increase the order of the spherical harmonics. For all configurations, the scan zone is a grid with 180 × 90 points distributed along the azimuth $\varphi_l \in [-180^\circ, 180^\circ]$ and the elevation $\theta_l \in [0^\circ, 180^\circ]$.
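A scan zone of this kind could be built as sketched below. Several details are assumptions: the elevation $\theta$ is treated as the polar angle measured from the $+z$ axis (so that $\theta = 90^\circ$ lies in the horizontal plane), both interval endpoints are included in the grid, and the scan sphere radius is a free parameter. The function names are hypothetical.

```python
import numpy as np


def sph_to_cart(az_deg, el_deg, radius):
    """Convert azimuth and polar elevation (degrees) to Cartesian coordinates."""
    az = np.deg2rad(az_deg)
    el = np.deg2rad(el_deg)
    x = radius * np.sin(el) * np.cos(az)
    y = radius * np.sin(el) * np.sin(az)
    z = radius * np.cos(el)
    return np.stack([x, y, z], axis=-1)


def scan_grid(n_az=180, n_el=90, radius=1.0):
    """Return (n_az * n_el, 3) scan points on a sphere of the given radius."""
    az = np.linspace(-180.0, 180.0, n_az)   # azimuth samples, endpoints included
    el = np.linspace(0.0, 180.0, n_el)      # elevation samples, endpoints included
    AZ, EL = np.meshgrid(az, el)            # arrays of shape (n_el, n_az)
    return sph_to_cart(AZ.ravel(), EL.ravel(), radius)
```

With these conventions, the direction $\varphi = 0^\circ$, $\theta = 90^\circ$ points along the $+x$ axis.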

### 4.1 Numerical data

The microphone signals captured by the open SMA are derived from the convolution of a source signal and a room impulse response (Jarrett et al., 2012). The room dimensions are 18 m × 12 m × 5 m and the SMA is installed at the center of the room. In order to assess the performance of Kriging with CBF and SHB, the reflection coefficient of the room walls is set to 0.01, which approximates anechoic conditions. The source is installed 4 m away from the SMA and generates a sine wave at a frequency *f* = 500 Hz. The source position is $\varphi = 0^\circ$ and $\theta = 90^\circ$. No noise is added to the microphone signals and the sampling frequency is 65 536 Hz. The original SMA has a radius of $R = 0.2$ m and $Q = 24$ microphones with a random distribution (Fig. 1), which should provide an order $N = 3$ for the spherical harmonics. Larger original SMA radii are also considered for CBF in order to compare with the results obtained by Kriging.
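Because the wall reflection coefficient approximates anechoic conditions, a rough stand-in for this simulation is a free-field point source. The sketch below is not the room impulse response generator cited above: it simply delays and attenuates a 500 Hz sine by the source-to-microphone distance, then reads the complex amplitude at the tone frequency from an FFT. The function name `tone_spectrum`, the one-second duration, and the sound speed are assumptions.

```python
import numpy as np


def tone_spectrum(mic_xyz, src_xyz, freq_hz=500.0, fs=65536,
                  duration_s=1.0, c0=343.0):
    """Complex amplitude at freq_hz for each microphone (free-field sine)."""
    t = np.arange(int(fs * duration_s)) / fs
    r = np.linalg.norm(mic_xyz - src_xyz, axis=1)        # (Q,) distances
    # Delayed, 1/r-attenuated sine wave at each microphone
    signals = np.sin(2.0 * np.pi * freq_hz
                     * (t[None, :] - r[:, None] / c0)) / r[:, None]
    spectrum = np.fft.rfft(signals, axis=1)
    # freq_hz * duration_s is assumed to be an integer, so the tone sits on one bin
    k = int(round(freq_hz * duration_s))
    return 2.0 * spectrum[:, k] / t.size
```

The resulting vector can serve as the microphone spectra $\mathbf{p}(\omega)$ at 500 Hz; the phase difference between two microphones then equals $-\omega$ times their path-length difference divided by the sound speed.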

### 4.2 Experimental data

Experiments were carried out in a hemi-anechoic room where a source was installed 2 m away from a 3D-printed open SMA. The $Q = 24$ microphones follow the numerical geometry and the original SMA radius is $R = 0.2$ m. The source signal was generated with the Audacity software and was the sum of a sine wave at frequency *f* = 500 Hz (amplitude 0.8) and white noise (amplitude 0.5). The source signal was transferred to a USB audio card (ESI Gigaport HD+), then to the source power amplifier (LMS™ Q-AMP), and finally to the source (LMS™ Q-MHF). The acoustic signals were captured with Brüel & Kjær 4935 1/4″ microphones and GRAS 40PH 1/4″ microphones and were recorded with a SIEMENS Simcenter Scadas Mobile at a sampling frequency of 51 200 Hz.

## 5. Acoustic imaging results

### 5.1 CBF with Kriging

The acoustic images obtained with CBF are shown in Fig. 2. Numerical data are considered first: the original SMA radius (with $Q = 24$ microphones) is increased from $R = 0.2$ m to $R = 0.5$ m [Figs. 2(a)–2(d)]. With a small SMA, the acoustic image exhibits a large mainlobe surrounded by two side lobes with low levels (EAR = 10.8%). This SMA is unlikely to be able to separate two closely spaced sources. When the original SMA radius increases from $R = 0.3$ m up to $R = 0.5$ m, the mainlobe area decreases (from EAR = 4.6% down to 1.6%). Side lobes with a higher level appear at random positions due to the random geometry of the original SMA and the low number of microphones. As expected, the larger the SMA, the better the acoustic image. However, a SMA with a radius of $R = 0.5$ m would not be easily handled in a workplace or a cabin environment.

Now, Kriging is used to virtually increase the radius to $R_K$ and the number of microphones to $Q_K$. The original SMA has a radius of $R = 0.2$ m and $Q = 24$ microphones. Four different virtual radii are investigated, $R_K = [0.2; 0.3; 0.4; 0.5]$ m, with $Q_K = 60$ microphones following a t-design geometry [Figs. 2(e)–2(h)]. The choice of $Q_K = 60$ microphones is explained in Sec. 5.2. When the virtual SMA radius is equal to the original SMA radius ($R_K = R$), the acoustic image is similar to the one obtained with the original SMA (EAR = 11.2%) but without side lobes, thanks to the higher number of microphones. When $R_K = 0.3$ m, the mainlobe area decreases as expected: the EAR criterion is divided by 2 (EAR = 5.3%), which is close to the result obtained with the original SMA with $R = 0.3$ m. Again, no side lobes appear. When $R_K = 0.4$ m, the EAR criterion decreases (EAR = 3.0%) and is similar to the one obtained with the original SMA of the same radius (EAR = 2.5%). Symmetrical side lobes appear with a lower level than with the original SMA. When $R_K = 0.5$ m, the side lobe level becomes too high even though the EAR value is the lowest (EAR = 1.8%). For $R_K > 0.5$ m (not shown in the figure), the source is no longer localized. Therefore, the prediction of the microphone signals over the virtual SMA seems limited to the radius $R_K = 0.4$ m $= 2R$ in this case. Kriging thus divides the EAR value by 3.6 (from 10.8% to 3.0%).

The acoustic images obtained experimentally with CBF and Kriging are presented in Figs. 2(i)–2(l). When the virtual SMA radius is equal to the original SMA radius (with $Q_K = 60$), the acoustic image exhibits a large mainlobe with an EAR criterion of 12.8%. The mainlobe is enlarged at the bottom due to the image distortion from spherical view to planar view (a point at a pole of a sphere is represented by a line on a 2D planar image). With $R_K = 0.3$ m, the mainlobe area decreases, with an EAR criterion similar to that of the numerical data (EAR = 5.7%). When $R_K = 0.4$ m, a small mainlobe with few side lobes is obtained, and the EAR value is close to the numerical one (3.1% compared with 3.0%). Although the mainlobe is smaller with $R_K = 0.5$ m, the side lobe level again prevents an efficient localization. In conclusion, Kriging divides the area of the mainlobe by about 4. However, the radius of the SMA cannot be increased indefinitely; the limit reached here is $R_K = 2R$, which requires further study.

### 5.2 SHB with Kriging

The acoustic images obtained numerically with SHB are shown in Fig. 3. The original SMA radius is $R = 0.2$ m and the number of microphones is $Q = 24$ (which should allow one to use $N = 3$). The spherical harmonic order ranges from $N = 2$ to $N = 5$ [Figs. 3(a)–3(d)]. With the original SMA and $N = 2$, the acoustic image exhibits a mainlobe that is off-center with respect to the source position, with two side lobes (EAR = 5.9%). This poor result is due to the original SMA geometry, which does not fully respect the orthogonality of the spherical harmonics. Increasing the order $N$ highlights this fact: the source is no longer localized. Kriging allows for virtually increasing the number of microphones over the SMA. The virtual SMA includes $Q_K = 60$ microphones whose positions are constrained to follow a t-design. Therefore, the maximal spherical harmonic order should now be $N = 5$. The acoustic images obtained with SHB and Kriging are shown in Figs. 3(e)–3(h). With $N = 2$, the acoustic image exhibits a mainlobe centered on the source position with two symmetrical side lobes (EAR = 6.0%). Increasing the order from $N = 2$ to $N = 5$ decreases the EAR value from 6% to 2%, with a clear decrease in side lobes in comparison to Figs. 3(a)–3(d).

The acoustic images obtained experimentally with SHB are presented in Fig. 4. Again, the original SMA is not able to localize the source position when the order increases [Figs. 4(a)–4(d)]. Only the order $N = 2$ provides a mainlobe, but still with a shift with respect to the exact source position. With Kriging, the source position is localized regardless of the order considered [Figs. 4(e)–4(h)]. The EAR values decrease from 6.2% to 2.0%, which is similar to the numerical results. However, the side lobe level becomes higher when the order increases (compared with the numerical results). The side lobe below the source could be attributed to a floor reflection in the hemi-anechoic room, as it does not exist in the numerical results [Figs. 4(g) and 4(h)]. However, this side lobe does not appear in the CBF acoustic images. In conclusion, Kriging allows for virtually increasing the number of microphones and therefore the order of the spherical harmonics, which leads to a smaller mainlobe.

## 6. Conclusion

In this letter, an open SMA with few microphones is considered with CBF and SHB in order to perform acoustic imaging. A spatial interpolation method based on Kriging is used to virtually increase the SMA radius or the number of microphones. With CBF, the SMA radius is doubled, which divides the mainlobe area by almost 4 and allows a better source localization at low frequencies. However, for larger SMA radii the side lobe level becomes too high. More work is required to understand this limitation. With SHB, Kriging allows for positioning the virtual microphones over the sphere in order to respect the orthogonality of the spherical harmonics. Moreover, the number of microphones can be increased, which allows for increasing the spherical harmonic expansion order and therefore leads to a smaller mainlobe. The combination of Kriging and acoustic imaging seems promising and paves the way for designing small SMAs with a low number of microphones while enhancing the performance for low-frequency source signals. In future work, Kriging could be applied to planar or linear arrays in order to increase the number of microphones and therefore the frequency range of the microphone array.

## Acknowledgments

This research was supported by the Institut de recherche Robert-Sauvé en Santé et en Sécurité du Travail (Grant No. 2018-0026) and Mitacs (Grant No. IT25827).

## REFERENCES

*Seismic Reservoir Modeling: Theory, Examples, and Algorithms*

*Springer Topics in Signal Processing*