In sonar array processing, a challenging problem is the estimation of the data covariance matrix in the presence of moving targets in the water column, since the time interval of data local stationarity is limited. This work describes an eigenvector-based method for proper data segmentation into intervals that exhibit local stationarity, providing data-driven higher bounds for the number of snapshots available for computation of time-varying sample covariance matrices. Application of the test is illustrated with simulated data in a horizontal array for the detection of a quiet source in the presence of a loud interferer.
I. Introduction
Approximation of the true population covariance matrix corresponding to a time-dependent process by means of the sample covariance can be a challenging problem in highly dynamic environments with large-aperture arrays.1 On one hand, a large number of data snapshots M is required for the sample covariance to be a consistent estimator. On the other hand, snapshots must be obtained from a limited time interval where the data is assumed to be locally stationary. For this reason, tests for stationarity that require very few samples are relevant. In this work, a statistical test for data stationarity based on the relation between consecutive sample eigenvectors is considered. The proposed test is sensitive to non-stationarity caused by time-dependent variations in source position.
Figure 1(a) shows the simulated scenario considered in this work, consisting of an N-element horizontal line array (HLA) in the presence of uncorrelated acoustic sources located at time-varying azimuths in the water column. This configuration results in a population covariance with spiked2 structure, so called due to a few eigenvalues associated to sources in the water column resulting in peaks in the eigenspectra. Data segmentation according to local stationarity is illustrated in Fig. 1(b), in which the continuous time axis is discretized as t0, t1, etc. In this example, from to and from to the azimuths of sources 1 and 2 are constant or slowly-varying (i.e., stationary). In contrast, between and the azimuth of both sources varies significantly. The goal of this work is to define an eigenvector-based statistical test that requires very few snapshots to identify segments of stationary data, based on changes of direction of the corresponding population eigenvectors, as illustrated in Fig. 1(c) and developed in Sec. III.
Random matrix theory (RMT) has been applied to study the sample eigenvalues and eigenvectors of spiked covariances3–5 for constant as . Research on this field has provided (among other results) methods for correcting sample eigenvalue bias4 and estimating the effective number of identifiable sources,6 as well as tools for performance analysis of array processing techniques.7–9 More recently, research on the high dimension, low sample size (HDLSS) regime10 (in which constant as ) has described the relationship between sample and population eigenvectors of spiked covariances as a joint probability density function (pdf). The work presented here makes use of expressions from HDLSS to derive (Sec. III) a full statistical description of the expected behavior of consecutive signal sample eigenvectors (i.e., those eigenvectors related to dominant sources in the water column) under the hypothesis of stationarity. The obtained expression provides a quantitative metric to decide bounds of local stationarity, as illustrated in Sec. IV with a numerical example.
II. Signal model
Let be an N × 1 data snapshot vector11 received at time at an N-element HLA. Each entry of the noise vector is normally distributed with power , while represents Q uncorrelated moving sources with power . Source amplitudes are complex-valued stochastic variables , where N(0,1) indicates a random realization from a normal distribution and . For each source, the time-dependent replica vector (assuming sources in the far field) is
where is the wavenumber, is the frequency in rad/s, is the speed of sound in the water, is the azimuth angle of the qth source at time , and d is the array inter-element spacing. The population (i.e., clairvoyant) covariance for mutually uncorrelated sources is11
where is a matrix with source replicas as columns and is a Q × Q diagonal matrix of the source powers. The covariance can also be written in terms of population eigenvalues and eigenvectors as
where is the effective number of sources determined by the actual number of sources located at distinct azimuths. Equation (3) is partitioned into eigenpairs containing signal information versus those associated to the background noise. At time an estimate of is obtained by averaging the outer product of snapshots as
where and are the sample eigenvalues and eigenvectors, respectively. The number of snapshots M available for estimation of signal statistics is determined by the largest time interval over which can be considered locally stationary. Testing whether the underlying population covariance structure has changed from that at time to that at time can be accomplished by monitoring quantities such as the (non-observable) population eigenvalues and eigenvectors at consecutive times, based on the variability of the (observable) sample eigenvalues and eigenvectors. For the simulated scenario in Fig. 1(a) the population eigenvector configuration is strongly affected by target movement. In Sec. III a statistically justified metric to decide between hypothesis : versus : based on statistics of the sample eigenvectors is described.
III. Probability distribution of consecutive sample eigenvectors
Consider the signal eigenspace spanned by the eigenvectors and at in Fig. 1(c), and how it relates to the eigenspace at a later time: In the case of signal stationarity the population eigenvectors do not change (e.g., and in this example), although the direction of the sample eigenvectors varies from realization to realization. On the other hand if the signal is not stationary due to source movement then the population eigenvectors at point to different directions in relation to those at . Since the true eigenvectors are not observable quantities, the decision of whether or not the signal eigenspace has changed must be made based on the observable (yet inaccurate) sample eigenvectors (, and in this example). This section investigates the pdf of (i.e., the squared cosine of the angle between consecutive sample eigenvectors) under the hypothesis. Previous work in RMT (Refs. 4 and 5) has shown that for dominant eigenvalues, the corresponding sample eigenvectors are constrained to lie on a cone centered on the population eigenvectors, as illustrated in Fig. 1(c). Due to this constraint, a relation can be expected between the sample eigenvectors at and under the condition of data stationarity. To obtain a full probability distribution for under hypothesis , each sample eigenvector is written as the sum of projections parallel and transverse to the signal subspace:10
Under hypothesis , for all q so . For simplicity, the case (i.e., one dominant interferer) is considered, so
According to RMT, for the transverse component is a vector with direction uniformly distributed on an N − 1 dimensional sphere, making . For HDLSS, the asymptotic regimes10 for eigenvector convergence depend on the magnitude of compared to . In this paper we focus on the HDLSS regime for which (i.e., loud interferers if is large), for which is still valid since the RMT condition is met for (as expected for usual background noise levels). Notice that from Eq. (5) with . Finally,
where and are independent random variables and * indicates the complex conjugate operator. It has been shown10 that (or ) is distributed as
where is the Chi-square distribution with M degrees of freedom. Then, the pdf of under hypothesis is given by12
The integral in Eq. (9) can be computed numerically for any value of by discretizing the integration variable . For snapshots, the test provides a quantitative expression to distinguish between sample eigenvector variability due to true changes on the underlying signal statistics (i.e., non-stationarity) versus variability due to the lack of snapshots. Application of Eq. (9) to data segmentation is illustrated in Sec. IV with an example.
To conclude this section, it is important to comment on the extension of the proposed stationarity metric to the case of multiple interferers. For the stationarity metric is a function of the scalar products where . Similar to the procedure described in this section, expressing each sample eigenvector according to Eq. (5) allows defining the metric function in terms of random variables (i.e., population vs sample eigenvector), the statistics of which are described in the literature10 and can be used to obtain the first and second moments of the stationarity metric.
IV. Numerical example
This section considers the case of a fast maneuvering quiet source (received signal strength of = 5 dB) initially located at azimuth in the presence of a slow and loud interferer ( = 18 dB) initially at . For this simulation = 1500 m/s, f = 500 Hz, N = 120 sensors, d = 1.5 m, and the background noise is set to = 0 dB. Tracking of the source azimuth is accomplished by applying minimum variance distortionless response beamforming to the data covariance matrix. Ideally, if the clairvoyant covariance is known,11
where is the replica vector for a plane wave impinging the array with angle θ. The beamformer output corresponding to Eq. (10) is shown in Fig. 2(a) and it provides a benchmark for beamformer simulations in which must be estimated. The results are plotted with time in the vertical axis and azimuth in the horizontal axis, showing the time-varying azimuth for the fast source and the loud interferer. In realistic situations, the covariance matrix must be estimated from limited samples and the beamformer becomes
where is a modified sample covariance matrix8,13 (note the indicator) that retains the dominant eigenvectors. The ϵ loading is used to stabilize the inverse . The results of corresponding to M = 5 snapshots per covariance are shown in Fig. 2(b). Due to the lack of snapshot support the power of the interferer is underestimated at all times and the quiet source is not detected at some time instances. Figure 2(c) shows the beamformer output using improved weights, obtained from sample covariance matrices with increased snapshot support determined by data segmentation based on Eq. (9). For example, between 0 and 25 s the interferer azimuth remains constant, so the dominant eigenvalue/eigenvector ( and , respectively) can be estimated with all the data snapshots for this time interval as opposed to just M = 5 snapshots. A similar situation is observed between 50 and 65 s, where the variation in azimuth for the interferer is slow enough to consider the data as stationary within this interval. The importance of Eq. (9) is that it allows objective data segmentation based on the observable sample eigenvectors. This is shown in Fig. 2(d), where the theoretical is shown in the background indicating the probability of random realizations of under hypothesis H0. In this panel, Monte Carlo realizations of at each time step are over-plotted as a solid white line. Segmentation bounds indicated as horizontal dashed lines are placed whenever or , selected to exclude values of z with low probability. Notice that the theoretical was computed assuming knowledge of the (unobservable) ratio . While estimates of and based on sample eigenvectors for have been suggested in the literature,4 the impact of using such estimates in is subject of future research.
In general, slow variations of the interferer azimuth cause to remain around the value where peaks. Deviations from this peak can be interpreted as statistical evidence to select over , thereby providing a data-driven segmentation approach. The advantage of setting M according to the segmentation in Fig. 2(d) can be quantified by comparing the scalar product between the first (and dominant) population eigenvector to the sample eigenvectors (obtained with M = 5) and (obtained with M ≥ 5, according to stationarity bounds). This is shown in Fig. 2(e), which demonstrates the improvement achieved with the segmentation procedure on estimating the dominant eigenvector: In all cases, , approaching the ideal value of 1 for time intervals with slow or no interferer motion.
A comparison of Figs. 2(b) and 2(c) highlights the advantage of using the improved weights in the beamformer: For Fig. 2(c), the power of both sources is estimated more accurately compared to Fig. 2(b), and in some cases it also improves the detection of the quiet source, such as at times 23.8 and 104.72 s, for example. This comparison is shown in Fig. 3, in which the improvement in detecting the quiet source is evident.
V. Summary
An eigenvector-based test for data stationarity, suitable for snapshot-starved experimental scenarios with large arrays in the presence of maneuvering targets, has been described in this work. The test gives an ability to estimate from the data the number of snapshots that should be used to compute a sample covariance matrix where stationarity is ensured. This suggests applications to beamforming techniques with a variable number of snapshots according to the dynamics of dominant sources in the water column. Application of the test for improvement of the detection levels of targets and interferers using Dominant Mode Rejection (DMR) beamforming was illustrated with an example that includes a fast-maneuvering target in the presence of a loud interferer.
Acknowledgment
The authors gratefully acknowledge the support of the U.S. Office of Naval Research, ONR Undersea Signal Processing.