Matched-field processing is applied to source localization and detection of sound sources in the ocean. The source spectrum is included in the set of unknown parameters and is estimated in the localization/detection process. Bayesian broadband (multi-tonal) incoherent and coherent processors are developed, integrating the source spectrum estimation using a Gibbs sampler and are first evaluated in source localization via point estimates and probability density functions obtained from synthetic signals. The coherent performance is superior to the incoherent one both in terms of source location estimates and density spread. The two processors are also applied to real data from the Hudson Canyon experiment. Subsequently, using Receiver Operating Characteristic (ROC) curves, the two processors are evaluated and compared in the task of joint detection and localization. The coherent detector/localization processor is superior to the incoherent one, especially as the number of frequencies increases. Joint detection and localization performance is evaluated with Localization-ROC curves.
I. INTRODUCTION
Matched-field processing (MFP)1–13 has been used extensively for source localization in the ocean. It is based on full-field calculations on a set of spatially separated receivers (replicas) for multiple candidate values of the unknown source location parameters and their comparison through a measure of correlation to real pressure fields received at the same phones. The simplest MFP scheme is the Bartlett or linear processor that evaluates squared moduli of inner products between normalized replica fields and acoustic data. The processor computes an ambiguity surface with the obtained values vs range and depth. The Bartlett MFP estimates are equivalent to Maximum Likelihood (ML) estimates in a Gaussian noise environment.14
In source localization and detection problems, sound is typically measured at a number of frequencies. One of the important aspects is how to combine this broadband information for optimum extraction of information. In realistic cases the source spectrum is unknown. Using the approach of Ref. 14, a broadband (multi-tonal in our case) Bartlett MFP approach multiplies narrowband ambiguity surfaces. This process is termed incoherent MFP. Coherent processing has been proposed,6,7,15–21 taking into account coherence among the acoustic fields at different frequencies. Coherent MFP often provides better estimates than incoherent processing; however, this largely depends on whether source information (amplitude and phase) is available. In our work, we treat the source spectrum as an unknown; we estimate it in addition to the source range and depth and noise variance and we compare broadband coherent and incoherent processors using this approach [in Ref. 22 it was shown that narrowband processing with the source spectrum probability density function (PDF) estimation is superior to an estimation where an ML estimate for the source spectrum is used14]. The relative performance of the two processors is a focal point of this work. Estimating PDFs of the field spectrum (that includes phase) at particular frequencies, a task we undertake here, is a novel element that can facilitate deconvolution for source identification with uncertainty quantification.
In essence the two processors are Bayesian estimators that provide estimates of source location, source spectrum, and noise variance by maximizing posterior PDFs (the first implementation of Bayesian MFP was presented in Ref. 4 with a broadband extension derived in Ref. 23; more Bayesian localization work was shown in Refs. 24 and 25). The implementation in our work is done using Gibbs Sampling. Sequences of samples drawn from conditional PDFs provide the joint PDF of all unknowns and marginal PDFs for all parameters separately.22,26
Although MFP has been widely applied to the problem of source localization, it has hardly been employed in the problem of signal detection, with some results presented in Ref. 27; coherent MFP has not been evaluated in detection at all. It is an important task, however, to establish whether a sound source is present while also localizing it, and matched-field methods can prove useful in this. In addition to source localization, an additional contribution of this work is detection with the two new processors and then joint detection and localization. We compare the two processors using Receiver Operating Characteristic (ROC) and Localization Receiver Operating Characteristic (LROC) curves and we show that the new coherent processor performs better than the incoherent processor by a substantial amount for the case of four frequencies (the improvement in performance is present but not so pronounced for the case of two frequencies).
The paper is organized as follows: Section II presents a brief summary of narrowband matched-field processing and how it is implemented in Ref. 22. Section III develops Bayesian broadband incoherent and coherent processors using Gibbs Sampling, extending the method of Sec. II. Section IV presents localization results with synthetic data and Sec. V presents results from real data. Section VI discusses coherent and incoherent detection and Sec. VII presents joint detection and localization results for the coherent and incoherent processors. Conclusions are presented in Sec. VIII.
II. SINGLE FREQUENCY MFP
Let a sound source transmit a signal at frequency f that is received at L vertically separated hydrophones. The L-dimensional complex received signal can be written as
$$X = \mu\, G(r, z_s, \mathbf{z}) + W, \tag{1}$$
where $G(r, z_s, \mathbf{z})$ is the solution of the Helmholtz equation, called the replica vector, μ is the source spectrum, and $W$ is zero-mean, spatially white complex Gaussian noise with covariance matrix Σ for both real and imaginary parts, where $\Sigma = \sigma^{2} I$. The remaining parameters are source range r, source depth zs, and the vector of hydrophone depths $\mathbf{z}$. We assume here that the propagation medium and the phone depths are known exactly. We consider a single vector observation $X$.
The assumption is that we have spatially independent white Gaussian noise; this is because we are considering only sensor noise. This is an approximation as there are other noise sources such as the sea surface, shipping, and biological sounds. A combination of noise factors and the spacing between sensors could cause a colored noise environment. In such a case, the data could be pre-whitened through multiplication by a matrix containing noise covariance structure if some prior information is available27 or a non-diagonal matrix Σ could be employed in the modeling of the additive noise.
For source localization, the linear-Bartlett MFP approach3 relies on the calculation of the ambiguity surface P(r, zs) for range r and source depth zs values on a grid, where
$$P(r, z_s) = \frac{|G^{*} X|^{2}}{\|G\|^{2}}; \tag{2}$$
“*” stands for conjugate transpose. For simplicity, we omit the arguments r and zs of G. Maximizing P provides the estimates for range and depth. This is also derived in Ref. 14. Spectrum μ and variance σ² do not appear in Eq. (2). The equation is obtained by using Maximum Likelihood estimates (MLEs) computed through the following Gaussian density:
$$l(r, z_s, \mu, \sigma^{2} \mid X) = \frac{1}{(2\pi\sigma^{2})^{L}} \exp\!\left(-\frac{\|X - \mu G\|^{2}}{2\sigma^{2}}\right), \tag{3}$$
where $l(r, z_s, \mu, \sigma^{2} \mid X)$ is the likelihood of the unknown parameters.
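The grid search behind the Bartlett processor can be sketched in a few lines of Python. This is only an illustration: `toy_replica` is a hypothetical stand-in for a real propagation model (an ideal isovelocity waveguide with assumed sound speed, water depth, and mode count), the first phone depth is assumed, and the data are noise free. The surface is taken as the squared inner product of data and replica divided by the squared replica norm.

```python
import numpy as np

def toy_replica(r, zs, z, f, c=1500.0, depth=75.0, n_modes=8):
    """Hypothetical replica field: normal-mode sum in an ideal isovelocity
    waveguide (pressure-release surface, rigid bottom). Illustration only."""
    k = 2.0 * np.pi * f / c
    m = np.arange(1, n_modes + 1)
    kz = (m - 0.5) * np.pi / depth                 # vertical wavenumbers
    kr = np.sqrt((k**2 - kz**2).astype(complex))   # horizontal wavenumbers
    modes_rx = np.sin(np.outer(z, kz))             # mode shapes at the phones, (L, M)
    modes_src = np.sin(kz * zs)                    # mode shapes at the source, (M,)
    return (modes_rx * modes_src * np.exp(1j * kr * r) / np.sqrt(kr * r)).sum(axis=1)

def bartlett_surface(x, ranges, depths, z, f):
    """Evaluate |G* X|^2 / ||G||^2 on a (depth, range) grid."""
    surface = np.empty((len(depths), len(ranges)))
    for i, zs in enumerate(depths):
        for j, r in enumerate(ranges):
            g = toy_replica(r, zs, z, f)
            # np.vdot conjugates its first argument, giving G* X
            surface[i, j] = np.abs(np.vdot(g, x)) ** 2 / np.vdot(g, g).real
    return surface

z = 10.0 + 2.5 * np.arange(24)        # 24 phones, 2.5 m spacing (first depth assumed)
ranges = np.arange(1000, 3001, 50)    # candidate ranges in m
depths = np.arange(4, 73, 4)          # candidate source depths in m
# Noise-free data from a source at 2 km range and 36 m depth, arbitrary spectrum
x = 0.8 * np.exp(0.75j) * toy_replica(2000, 36, z, 175.0)

surface = bartlett_surface(x, ranges, depths, z, 175.0)
i_max, j_max = np.unravel_index(np.argmax(surface), surface.shape)
print(depths[i_max], ranges[j_max])
```

Because the surface depends on the data only through the normalized inner product, the unknown complex spectrum (0.8 exp(0.75j) in this sketch) does not affect the location of the maximum.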
It was shown in Ref. 22 that better localization results are obtained from the posterior PDF calculated using a Bayesian process,
where is a constant with respect to the unknown parameters. We integrate over μ and , rather than using MLEs, before estimating r and zs; p(r), p(z), , and are prior distributions on range, source depth, source spectrum, and variance, respectively. Range is sought in interval and source depth is sought in interval . Here
That is, priors for r, zs, μ, and () are considered uniform.
Combining the likelihood and priors using Bayes' theorem we get,
where K is a constant. To obtain the posterior PDF for r and zs, we integrate over μ and σ²,
Maximizing the density of Eq. (10) provides the Maximum a Posteriori (MAP) estimates for range and depth.
To compute the posterior PDF of Eq. (10), in Ref. 22 we used a Gibbs Sampler.
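To implement the sampler, we identified the conditional densities of each parameter on all others. For the source spectrum μ we fix parameters r, zs, and σ² in Eq. (9) and we obtain
$$p(\mu \mid r, z_s, \sigma^{2}, X) \propto \exp\!\left(-\frac{\|X - \mu G\|^{2}}{2\sigma^{2}}\right), \tag{11}$$
where the proportionality factor is a constant. This is recognized as a Gaussian distribution with a mean equal to $G^{*}X/\|G\|^{2}$ and variance $\sigma^{2}/\|G\|^{2}$.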
Fixing r, zs, and μ we get an inverse χ² distribution for σ²:
If the noise is colored, covariance matrix Σ is not diagonal as previously mentioned and off-diagonal elements will be estimated as well with the sampler.
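The Gaussian form of the conditional density for μ follows from completing the square in the exponent of the likelihood. The short derivation below is a sketch under two assumptions: the likelihood is proportional to exp(−‖X − μG‖²/(2σ²)), that is, the noise has variance σ² in each of its real and imaginary parts, and the prior on μ is uniform. An overbar denotes the complex conjugate of a scalar.

```latex
\begin{aligned}
\|X - \mu G\|^{2}
  &= \|X\|^{2} - \mu\, X^{*}G - \bar{\mu}\, G^{*}X + |\mu|^{2}\,\|G\|^{2} \\
  &= \|G\|^{2}\,\left|\mu - \frac{G^{*}X}{\|G\|^{2}}\right|^{2}
     + \|X\|^{2} - \frac{|G^{*}X|^{2}}{\|G\|^{2}}, \\[4pt]
p(\mu \mid r, z_s, \sigma^{2}, X)
  &\propto \exp\!\left(-\frac{\|G\|^{2}}{2\sigma^{2}}
     \left|\mu - \frac{G^{*}X}{\|G\|^{2}}\right|^{2}\right).
\end{aligned}
```

Under these assumptions the real and imaginary parts of μ are independent Gaussians centered on the real and imaginary parts of G*X/‖G‖², each with variance σ²/‖G‖². The term that does not depend on μ is ‖X‖² minus the Bartlett value |G*X|²/‖G‖², which is how the Bartlett surface enters the remaining densities.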
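We cannot obtain analytically the conditional density of r and zs given μ, σ², and X; thus we calculate it on a grid. The density that is evaluated on the grid is
$$p(r, z_s \mid \mu, \sigma^{2}, X) = K \exp\!\left(-\frac{\|X - \mu G(r, z_s)\|^{2}}{2\sigma^{2}}\right), \tag{13}$$
where K is a constant.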
The process is iterative. We start with initial values for μ and σ² and then sample from the gridded conditional density of r and zs to obtain values for r and zs. We continue by using these new values of r and zs and the initial value for σ² and drawing a sample for μ from the density of Eq. (11). Using this value of μ we then draw a sample for σ² from the density of Eq. (12). After repeating the process for many iterations and omitting results from the initial “burn-in” iterations, the obtained sample of values converges to the joint density of Eq. (9). Samples for individual parameters provide estimates for the marginal densities. Modes of those provide the MAP estimates of the unknown parameters, including source range and depth.
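The iteration described above can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the candidate locations are flattened into a single grid index, the replicas are random complex vectors rather than acoustic fields, the noise has variance σ² per real and imaginary part, and the prior on σ² is proportional to 1/σ², which gives an inverse-gamma (scaled inverse χ²) conditional with shape L. The function name `gibbs_mfp` and all parameter values are hypothetical.

```python
import numpy as np

def gibbs_mfp(x, replicas, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler sketch for (grid index, mu, sigma2) from one snapshot x.

    replicas: (n_grid, L) complex array, one row per candidate location.
    Assumes uniform priors on location and mu, and p(sigma2) ~ 1/sigma2.
    """
    rng = np.random.default_rng(seed)
    n_grid, n_phones = replicas.shape
    g_norm2 = np.sum(np.abs(replicas) ** 2, axis=1)
    mu = 1.0 + 0.0j                                  # initial spectrum
    sigma2 = np.vdot(x, x).real / (2 * n_phones)     # initial variance
    idx_s = np.empty(n_iter, dtype=int)
    mu_s = np.empty(n_iter, dtype=complex)
    s2_s = np.empty(n_iter)
    for t in range(n_iter):
        # 1) location | mu, sigma2: conditional evaluated on the grid
        resid = np.sum(np.abs(x - mu * replicas) ** 2, axis=1)
        log_w = -resid / (2.0 * sigma2)
        w = np.exp(log_w - log_w.max())              # stabilized weights
        idx = rng.choice(n_grid, p=w / w.sum())
        g = replicas[idx]
        # 2) mu | location, sigma2: complex Gaussian
        mean = np.vdot(g, x) / g_norm2[idx]
        std = np.sqrt(sigma2 / g_norm2[idx])
        mu = mean + std * (rng.standard_normal() + 1j * rng.standard_normal())
        # 3) sigma2 | location, mu: inverse gamma(shape L, scale S/2)
        s = np.sum(np.abs(x - mu * g) ** 2)
        sigma2 = 0.5 * s / rng.gamma(shape=n_phones)
        idx_s[t], mu_s[t], s2_s[t] = idx, mu, sigma2
    return idx_s[burn_in:], mu_s[burn_in:], s2_s[burn_in:]

# Synthetic check with hypothetical random replicas (not an acoustic model)
rng = np.random.default_rng(1)
n_phones, n_grid, true_idx = 24, 200, 57
replicas = rng.standard_normal((n_grid, n_phones)) + 1j * rng.standard_normal((n_grid, n_phones))
mu_true = 0.8 * np.exp(0.75j)
noise = 0.1 * (rng.standard_normal(n_phones) + 1j * rng.standard_normal(n_phones))
x = mu_true * replicas[true_idx] + noise

idx_s, mu_s, s2_s = gibbs_mfp(x, replicas)
map_idx = np.bincount(idx_s).argmax()    # mode of the location samples
print(map_idx, np.mean(mu_s))
```

The retained samples of the grid index approximate the marginal posterior over location, and the retained samples of μ and σ² approximate their marginals. With real replicas the grid index would be mapped back to a (range, depth) pair.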
III. BROADBAND MFP
A. Incoherent processing
Sound data are typically available at a number of frequencies and there has been much discussion as to how MFP should be implemented for broadband (multi-tonal) data when the source spectrum is unknown. Assume that we have transmission at two frequencies,
$$X_i = \mu_i\, G_i(r, z_s, \mathbf{z}) + W_i, \tag{14}$$
where i = 1, 2. Noise Wi is distributed similarly to W. The noise variance is assumed to be the same for both frequencies.
In Ref. 14 the ambiguity surface for broadband sound is derived as
PB stands for broadband ambiguity surface. Derivation of this PB is based on using ML estimates for source spectra and variance as before.
We can extend the method described in Sec. II to the two-frequency case. Then
In our work we implement MFP using the density of Eq. (16) and estimating μ1, μ2, and σ² along with r and zs. This estimation process is done using the Gibbs Sampler described above after the derivation of the conditional densities. The conditional densities are the same as before with the conditionals for μi being Gaussian with mean equal to $G_i^{*}X_i/\|G_i\|^{2}$ and variance $\sigma^{2}/\|G_i\|^{2}$.
The conditional density for σ² is
The process is straightforwardly extended to N frequencies where the exponent of variance changes appropriately and we have N terms in the summation within the exponential.
We term this approach incoherent MFP, because it does not take into account coherence among frequencies.
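For comparison, the conventional broadband surface that multiplies narrowband Bartlett surfaces can be sketched as below. The exact normalization used in the referenced broadband formula is not reproduced here; this sketch assumes each narrowband surface is divided by the data energy so that it lies between 0 and 1, which does not move the maximum. The replicas are hypothetical random vectors and the function names are illustrative.

```python
import numpy as np

def normalized_bartlett(x, replicas):
    """Narrowband values |G* X|^2 / (||G||^2 ||X||^2) for every grid replica."""
    num = np.abs(replicas.conj() @ x) ** 2
    den = np.sum(np.abs(replicas) ** 2, axis=1) * np.vdot(x, x).real
    return num / den

def broadband_product(data, replica_sets):
    """Multiply the narrowband surfaces across frequencies."""
    surface = np.ones(replica_sets[0].shape[0])
    for x, replicas in zip(data, replica_sets):
        surface *= normalized_bartlett(x, replicas)
    return surface

# Two frequencies with different, unknown complex spectra (hypothetical values)
rng = np.random.default_rng(0)
n_grid, n_phones, true_idx = 100, 24, 42
replica_sets = [
    rng.standard_normal((n_grid, n_phones)) + 1j * rng.standard_normal((n_grid, n_phones))
    for _ in range(2)
]
spectra = [0.8 * np.exp(0.75j), 1.3 * np.exp(-2.0j)]
data = [
    mu * g[true_idx] + 0.05 * (rng.standard_normal(n_phones) + 1j * rng.standard_normal(n_phones))
    for mu, g in zip(spectra, replica_sets)
]

surface = broadband_product(data, replica_sets)
print(np.argmax(surface))
```

Each narrowband factor is insensitive to the amplitude and phase of its own spectrum, so the product never uses the phase relationship between frequencies. That is the sense in which the processing is incoherent.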
B. Coherent processing
In Ref. 16 coherent MFP was implemented. In addition to spatial coherence across phones, it considered field coherence across frequencies. Toward that goal supervectors were generated for both data and replicas.
We consider source localization for N frequencies with received data X1, …, XN. Then the data supervector is the vertically stacked collection
The replica super vector is
Conventional MFP using these vectors relies on calculating the following ambiguity surface:
In a realistic case and passive sonar processing, the spectrum is typically unknown and the lack of knowledge of relative phases of the frequency domain data Xi presents a difficulty. Namely, if the data vectors (and corresponding replica vectors) are not adjusted for phase, data and replicas will not match. To circumvent this problem, in Ref. 16 all data and replica vectors within the supervectors are modified so that the phase at the first phone for each frequency is zero and the norm of each subvector is one. The subtraction of the phase at the first phones removes the unknown phase of the source and the normalization removes the impact of the unknown amplitudes. With the new scaled supervectors, the ambiguity surface becomes
This processor was shown in Ref. 17 to be superior to the incoherent processor of Eq. (15). However, it is susceptible to a poor Signal-to-Noise Ratio (SNR) at the first phone, which is used for the phase removal. To bypass this problem, it was proposed in Ref. 19 to estimate the source spectrum phases instead of subtracting the phase at the first phone from receptions at all other phones with good results.
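The scaling step described above (zero phase at the first phone, unit norm for each subvector) can be sketched as follows. The function names are hypothetical, the replicas are random vectors rather than acoustic fields, and the division by N² is an assumed normalization that bounds the output by one.

```python
import numpy as np

def scale_subvector(v):
    """Remove the phase at the first phone and normalize to unit norm."""
    return v * np.exp(-1j * np.angle(v[0])) / np.linalg.norm(v)

def scaled_supervector(vectors):
    """Stack the scaled subvectors for all frequencies into one supervector."""
    return np.concatenate([scale_subvector(v) for v in vectors])

def coherent_bartlett(data, replicas):
    """Squared inner product of scaled supervectors, divided by N^2."""
    xs = scaled_supervector(data)
    gs = scaled_supervector(replicas)
    return np.abs(np.vdot(gs, xs)) ** 2 / len(data) ** 2

rng = np.random.default_rng(0)
n_phones = 24
replicas = [rng.standard_normal(n_phones) + 1j * rng.standard_normal(n_phones) for _ in range(2)]
wrong = [rng.standard_normal(n_phones) + 1j * rng.standard_normal(n_phones) for _ in range(2)]
spectra = [0.8 * np.exp(0.75j), 1.3 * np.exp(-2.0j)]   # unknown to the processor
data = [mu * g for mu, g in zip(spectra, replicas)]    # noise-free data

matched = coherent_bartlett(data, replicas)
mismatched = coherent_bartlett(data, wrong)
print(matched, mismatched)
```

In the noise-free case the scaled data subvectors equal the scaled replica subvectors regardless of the source spectra, so the matched value reaches one. The sketch also makes the weakness visible: every subvector is rotated by the phase measured at a single phone, so noise at that phone perturbs the whole supervector.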
In a similar way we implement here a coherent processor using full PDFs as in Secs. II and III. We again assume for simplicity just two frequencies, N = 2, and we form supervectors and , where
The dimension of and is 2L. Then
where the noise supervector is distributed as W but the covariance matrix now has a dimension of 2L × 2L. The joint density of all unknowns is
The conditional density for μi, i = 1, 2, is
where Gi and Xi are the original replica and data vectors for the ith frequency.
The conditional density for σ² is
The joint density that we estimate with the Gibbs Sampler is
where KC is a constant.
The method can be extended to N frequencies in a straightforward manner.
As stated previously, the issue of source phase is of great importance in the coherent processor implementation. Phases are considered as unknowns in the Gibbs Sampling process through the consideration of the complex spectra as unknown parameters. At the end of the sampling process, we obtain a joint PDF of all unknowns including range, depth, spectra, and variance [Eq. (28)]. Samples for just the spectra provide marginal PDFs for those, and, as a consequence, marginal PDFs of the phases. Samples for just range and depth form a two-dimensional PDF after integrating over the spectra and variance. Thus, estimated PDFs for range and depth (the new “ambiguity” surfaces) are the results of integration over all possible phase values.
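Turning samples of a complex spectrum into a phase PDF amounts to taking the argument of each sample and forming a histogram. The sketch below uses synthetic samples in place of Gibbs output. The sample values, the bin count, and the function name `phase_pdf` are assumptions made for illustration.

```python
import numpy as np

def phase_pdf(mu_samples, n_bins=72):
    """Histogram estimate of the phase PDF from complex spectrum samples."""
    phases = np.angle(mu_samples)                     # values in (-pi, pi]
    density, edges = np.histogram(phases, bins=n_bins,
                                  range=(-np.pi, np.pi), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density

# Synthetic stand-in for sampler output: spectrum with a phase of 0.75 rad
rng = np.random.default_rng(0)
mu_samples = 0.8 * np.exp(0.75j) + 0.05 * (
    rng.standard_normal(5000) + 1j * rng.standard_normal(5000)
)

centers, density = phase_pdf(mu_samples)
phase_map = centers[np.argmax(density)]               # mode of the phase PDF
print(phase_map)
```

The resolution of the mode is limited by the bin width (about 0.087 rad for 72 bins). Phases close to ±π wrap around, so a PDF concentrated there would need a shifted histogram range.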
IV. LOCALIZATION RESULTS WITH SYNTHETIC DATA
We consider the shallow water environment of Refs. 16 and 28 shown in Fig. 1. This is the environment of the Hudson Canyon Experiment, which was conducted off the coast of New Jersey in an area with relatively flat bathymetry. For the simulations, the true source location was at 2 km in range and 36 m in depth. There were 24 receiving phones with 2.5 m spacing. We generated 2000 realizations of signal plus noise for two frequencies: 175 and 375 Hz. We ran the Gibbs Sampler for 2000 iterations for all realizations; to confirm convergence we monitored the modes of the posterior PDFs. We removed the results of the first 500 iterations, considering them burn-in samples. We then estimated source location coordinates for each realization by maximizing the marginal PDFs for range and depth. The probability of correct localization (PCL) was computed by counting how many times the processors estimated the correct location within 200 m in range and 4 m in source depth, that is, within 10% of the true location. The results for eight SNRs are shown in Table I and Fig. 2. As a benchmark we also compute results for conventional MFPs, where ambiguity surfaces are computed for each frequency and are multiplied to provide a broadband surface [Eq. (15)]. The superiority of the coherent processor is evident. The new incoherent processor and the conventional processor have very similar performances. The developed incoherent processor offers a small advantage for SNRs up to 14 dB. Interestingly the conventional processor outperforms the new incoherent processor for 15 dB. Beyond that SNR, all probabilities are very close to 1.
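The PCL computation described above reduces to counting estimates that fall inside the tolerance window. A minimal sketch, assuming the tolerances are inclusive and using made-up estimates rather than results from the paper:

```python
import numpy as np

def probability_correct_localization(r_est, z_est, r_true, z_true,
                                     r_tol=200.0, z_tol=4.0):
    """Fraction of realizations localized within the range and depth tolerances."""
    r_ok = np.abs(np.asarray(r_est, dtype=float) - r_true) <= r_tol
    z_ok = np.abs(np.asarray(z_est, dtype=float) - z_true) <= z_tol
    return np.mean(r_ok & z_ok)

# Four hypothetical estimates for a source at 2000 m range and 36 m depth
r_est = [2000.0, 2150.0, 2300.0, 1900.0]
z_est = [36.0, 38.0, 36.0, 44.0]
pcl = probability_correct_localization(r_est, z_est, 2000.0, 36.0)
print(pcl)
```

In this toy case the first two estimates fall inside both tolerances, the third misses in range, and the fourth misses in depth, so half of the realizations count as correct.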
(Color online) The Hudson Canyon environment; the sound speed profiles in the water column, c(z), and in the sediment are considered known from measurements and ground truth information.
PCL vs SNR.
SNR | Coherent PCL | Incoherent PCL | Conv. PCL |
---|---|---|---|
16 dB | 0.99 | 0.98 | 0.99 |
15 dB | 0.97 | 0.84 | 0.93 |
14 dB | 0.92 | 0.73 | 0.72 |
13 dB | 0.87 | 0.72 | 0.70 |
12 dB | 0.71 | 0.59 | 0.55 |
11 dB | 0.55 | 0.46 | 0.44 |
10 dB | 0.48 | 0.40 | 0.32 |
9 dB | 0.29 | 0.24 | 0.23 |
(Color online) PCL vs SNR for the coherent, incoherent, and conventional estimators.
In Fig. 3 we show PDFs of source range and depth for a SNR of 14 dB for one realization. Both approaches estimate the source location correctly. However, the spread of the PDF for the coherent processor is smaller than the one for the incoherent processor. That shows that there is a reduced variance in the coherent estimation.
(Color online) PDFs for one realization for a SNR of 14 dB: (a) coherent PDF and (b) incoherent PDF.
Figure 4 illustrates the phase PDFs for a frequency of 175 Hz for two cases with a different phase in the source spectrum. The correct phase for the first case is 0.75 radians; for the second one it is 0. The PDF shown in Fig. 4(a) has significant probability density around 0.75 and is peaked at 0.7 (a small bias exists between the MAP estimate and the true value). For the second case [Fig. 4(b)], the PDF mode is at 0, the true value.
Source localization was also performed with conventional incoherent MFPs as mentioned above. The PCL for 10 dB was 0.32 in contrast to the 0.40 rate for the new incoherent processor and 0.48 for the new coherent processor. Figure 5(a) shows an ambiguity surface for the conventional processor, where the location estimate is highly ambiguous with the surface exhibiting multiple sidelobes. Figure 5(b) illustrates the PDF for range and depth for the same realization obtained via the Gibbs Sampler for incoherent estimation. There is a unique mode at the correct location.
(Color online) (a) Ambiguity surface for the conventional processor and (b) PDF for the new incoherent processor.
V. HUDSON CANYON LOCALIZATION RESULTS
Source localization was also performed with real data collected during the Hudson Canyon Experiment. Data were collected at two sets of frequencies, 50, 175, 375, and 425 Hz and 75, 275, 525, and 600 Hz. Ten snapshots per source location were recorded at 20 locations. Because of a high SNR and accurate knowledge of the propagation medium, all processors [with the exception of the incoherent Minimum Variance Distortionless Response (MVDR)]16 successfully estimated the source 18 times out of 20. However, it is interesting to observe the difference between the new coherent processor and the incoherent processor when using a single data snapshot, in which case the SNR is significantly lower. For the data that we consider here, the correct source range was 3.48 km and the depth was 36 m and the data were collected at 50, 175, 375, and 425 Hz. Figure 6 shows the surfaces for the coherent and incoherent processors. The coherent processor calculates a PDF with a mode at 3.42 km in range and 36 m in depth, very close to the true values. The incoherent mode is at a source range and depth of 2.75 km and 55 m, respectively. The incoherent PDF has a secondary mode at the correct location but the density there is smaller than the one around the main mode. The results indicate the superiority of the coherent processor.
(Color online) PDFs for Hudson Canyon data: (a) coherent PDF and (b) incoherent PDF.
VI. DETECTION AND LOCALIZATION
In Sec. III, we discussed how we can localize a broadband source including source spectrum and noise variance in the estimation process. In addition to localization, we are also interested in detecting the sound-emitting source. We consider two hypotheses: H1 when a signal is present and H0 when there is no signal. We will evaluate the above processors by calculating probabilities of detection and probabilities of false alarm and plotting ROC curves. Combining detection and localization, we will continue our comparison using Localization-ROC (LROC) curves.
When a signal is present, we have
$$X = \mu\, G + W. \tag{29}$$
When there is no signal,
$$X = W. \tag{30}$$
The most commonly used detector is the likelihood ratio detector. For narrowband data, we formulate the likelihood ratio using the likelihood function of Eq. (3),
where the source spectrum and variance are replaced by their ML estimates. For multiple frequencies, in the presence of signals we have
$$X_i = \mu_i\, G_i + W_i, \tag{32}$$
and for noise only
$$X_i = W_i; \tag{33}$$
here i = 1, 2.
For the incoherent processor
Equivalently we can select sufficient statistic where
We compare the statistic to a threshold β: when it is larger than β we conclude that hypothesis H1 is true: a signal is present. Otherwise it is concluded that H0 is true, that is, we receive only noise.
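For the narrowband case, a likelihood ratio with ML estimates inserted under both hypotheses can be written in closed form, and a sketch makes the role of the threshold concrete. This is an illustration under stated assumptions rather than the statistic used in the paper: the ratio is also maximized over a location grid, the replicas are hypothetical random vectors, and the threshold value is arbitrary. With ML estimates of the spectrum and variance, the ratio equals (‖X‖²/(‖X‖² − P))^L, where P is the Bartlett value, so it is a monotonic function of the normalized Bartlett maximum.

```python
import numpy as np

def glr_statistic(x, replicas):
    """Return (log generalized likelihood ratio, normalized Bartlett maximum)."""
    n_phones = x.size
    x_norm2 = np.vdot(x, x).real
    bartlett = np.abs(replicas.conj() @ x) ** 2 / np.sum(np.abs(replicas) ** 2, axis=1)
    b = bartlett.max() / x_norm2              # normalized Bartlett, between 0 and 1
    log_glr = -n_phones * np.log1p(-b)        # L * log(||X||^2 / (||X||^2 - P))
    return log_glr, b

rng = np.random.default_rng(3)
n_grid, n_phones = 200, 24
replicas = rng.standard_normal((n_grid, n_phones)) + 1j * rng.standard_normal((n_grid, n_phones))
noise = 0.3 * (rng.standard_normal(n_phones) + 1j * rng.standard_normal(n_phones))
x_h1 = 0.8 * np.exp(0.75j) * replicas[57] + noise    # signal present
x_h0 = noise                                         # noise only

t1, b1 = glr_statistic(x_h1, replicas)
t0, b0 = glr_statistic(x_h0, replicas)
beta = 0.5 * (t1 + t0)                               # hypothetical threshold
print(t1 > beta, t0 > beta)
```

Because the log ratio increases with the normalized Bartlett maximum, thresholding either quantity gives the same decisions, which is the usual reason for working with a simpler sufficient statistic.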
For the coherent processor for hypothesis H1 we have
and for H0
For coherent likelihood-ratio detection, the statistic becomes
VII. JOINT DETECTION AND LOCALIZATION RESULTS
We consider again the shallow water environment of Fig. 1. We generated 2000 realizations of signal plus noise and 2000 realizations of just noise for different SNRs. As in the localization task, we ran the Gibbs Sampler for 2000 iterations for all realizations. With a Monte Carlo process we calculated probabilities of detection and false alarm for detection and probabilities of correct localization and false alarm for localization.
We performed detection using the statistics of Eqs. (35) and (38) and formed ROC curves. For a SNR of 17 dB the ROC curves for both processors coincided with the upper left corner, showing perfect detection. We then decreased the SNR and computed the ROC curves of Fig. 7(a) for a SNR of 14 dB. Coherent results are shown with a solid line; the dashed line illustrates incoherent detection. Curves for both processors show a high probability of detection for a small probability of false alarm. Figure 7(b) zooms on the upper left corner of the ROC curve showing that the coherent processor attains a higher probability of detection for the same false alarm albeit not by a large amount. For a probability of false alarm of 0.003 the probability of detection for the incoherent processor is 0.93; for the coherent processor for the same probability of false alarm the probability of detection is 0.96.
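An empirical ROC curve of this kind is obtained by sweeping the threshold over the Monte Carlo values of the detection statistic. The sketch below uses Gaussian stand-ins for the statistics under the two hypotheses. The distributions and the function name `roc_curve` are assumptions, and the numbers have no connection to the results reported here.

```python
import numpy as np

def roc_curve(stat_h1, stat_h0):
    """Empirical probabilities of false alarm and detection over all thresholds."""
    thresholds = np.sort(np.concatenate([stat_h1, stat_h0]))[::-1]
    p_fa = np.array([(stat_h0 > b).mean() for b in thresholds])
    p_d = np.array([(stat_h1 > b).mean() for b in thresholds])
    return p_fa, p_d

# Stand-in statistics for 2000 signal-plus-noise and 2000 noise-only realizations
rng = np.random.default_rng(0)
stat_h1 = rng.normal(3.0, 1.0, 2000)
stat_h0 = rng.normal(0.0, 1.0, 2000)

p_fa, p_d = roc_curve(stat_h1, stat_h0)
# Area under the curve by the trapezoidal rule
auc = np.sum(np.diff(p_fa) * 0.5 * (p_d[1:] + p_d[:-1]))
print(auc)
```

With 2000 noise-only realizations the smallest nonzero false alarm probability that can be resolved is 1/2000, which bounds how far into the upper left corner an empirical curve can be read.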
(Color online) (a) ROC curves for a SNR of 14 dB; (b) zoom on the upper left corner of (a).
Figure 8 illustrates the PDFs of the detection statistics for the two processors. The overlap between the PDFs for hypotheses H1 and H0 is slightly larger for the incoherent method than for the coherent approach, showing the advantage of coherent processing.
(Color online) PDFs for the detection statistic for hypotheses H1 and H0: (a) coherent results and (b) incoherent results.
Figure 9 shows the ROC curves for 13 dB [(a) actual ROC curve and (b) zoom]. Looking at the curve of Fig. 9(b), we observe that for a probability of false alarm of 0.003 (the probability of false alarm at the bottom left corner), the coherent probability of detection is 0.93, while the corresponding incoherent probability is lower at 0.84.
(Color online) (a) ROC curves for a SNR of 13 dB; (b) zoom on the upper left corner of (a).
Figure 10 shows LROC curves for a SNR of 14 dB. The coherent PCL reached 92% while the incoherent probability of localization reached only 73%. Of particular interest are the results for detection and localization for four frequencies. We carried out simulations for frequencies of 75, 275, 525, and 600 Hz, one of the sets in which sound was transmitted in the Hudson Canyon experiment. Figure 11 shows (a) the ROC curves and (b) the LROC curves for the two processors for a SNR of 9 dB. Now the superiority of the coherent detector over the incoherent detector is much more pronounced. For a probability of false alarm of 0.02 the probability of detection of the coherent processor reaches 0.98. For the same probability of false alarm, the probability of detection of the incoherent processor is only 0.81. The superiority is also evident in the LROC curves. The PCL reaches 0.54 for the coherent processor and 0.43 for the incoherent processor.
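An LROC curve can be formed in the same way as a ROC curve, except that a signal realization counts only when it is both detected and localized correctly. The sketch below assumes that definition and uses a small made-up data set. The function name `lroc_curve` is hypothetical.

```python
import numpy as np

def lroc_curve(stat_h1, correct_h1, stat_h0):
    """Probability of detection with correct localization vs false alarm."""
    thresholds = np.sort(np.concatenate([stat_h1, stat_h0]))[::-1]
    p_fa = np.array([(stat_h0 > b).mean() for b in thresholds])
    p_dl = np.array([((stat_h1 > b) & correct_h1).mean() for b in thresholds])
    return p_fa, p_dl

# Toy data: detection statistics and a flag for correct localization
stat_h1 = np.array([5.0, 4.0, 3.0, 2.0])             # signal-plus-noise realizations
correct_h1 = np.array([True, True, False, True])     # localized within tolerance?
stat_h0 = np.array([1.0, 0.5, 2.5, 0.0])             # noise-only realizations

p_fa, p_dl = lroc_curve(stat_h1, correct_h1, stat_h0)
print(p_dl.max(), correct_h1.mean())
```

As the threshold is lowered every signal realization is eventually detected, so the curve saturates at the fraction of correctly localized realizations, that is, at the PCL rather than at one.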
(Color online) (a) ROC and (b) LROC curves for a SNR of 9 dB for four frequencies.
VIII. CONCLUSIONS
In this work, we develop Bayesian coherent and incoherent matched-field processors for broadband localization and detection that incorporate source spectrum estimation in the process. The estimation is performed using a Gibbs Sampler that computes PDFs of the unknown parameters that we then maximize. Bayesian coherent localization is clearly superior to the corresponding incoherent estimation, which is superior to conventional Bartlett processing for most SNRs. We then perform detection with the two proposed processors. With ROC curves and PDF calculation for the detection statistics we show that coherent processing is superior to incoherent processing in detection; the advantage of using the coherent processor in detection is limited for two frequencies but it is significant for four frequencies. Performing joint detection and localization demonstrates a significant advantage of the coherent processor. Results are also presented with real data showing a coherent PDF with a mode at the correct location. Incoherent processing provided erroneous estimates.
The coherent processor is superior to conventional processing by a considerable amount. It is, however, more computationally demanding although not prohibitively so. The 2000 Gibbs sampling iterations performed here were computed very efficiently because of the simple form of the conditional densities. The incoherent processor also required 2000 iterations and offered little improvement over conventional processing (it offered an advantage for several SNRs that we considered but was inferior at 15 dB). Thus, the additional computational load it requires is not justified.
Summarizing, the new coherent method outperforms other processors in terms of localization accuracy and decreased uncertainty as illustrated via a comparison of PDFs, ambiguity surfaces, and their modes. The approach is also superior to the incoherent processor in terms of detection (investigated to date only in a limited fashion in terms of MFP), especially in the four-frequency case. Additionally, both coherent and incoherent methods proposed here provide estimates of the source spectra and their uncertainty as shown in Fig. 4, which means that they also provide deconvolution results and corresponding uncertainty that can be used in source identification.
Being Bayesian in nature, the new work provides complete densities in addition to point estimates, enabling the understanding of uncertainty in the estimation process. In the future the proposed Gibbs sampling coherent approach can be considered for geoacoustic inversion.26
ACKNOWLEDGMENTS
This work was supported by the Office of Naval Research through Grant Nos. N000141612485 and N000141812125.