In sonar array processing, a challenging problem is the estimation of the data covariance matrix in the presence of moving targets in the water column, since the time interval of data local stationarity is limited. This work describes an eigenvector-based method for proper data segmentation into intervals that exhibit local stationarity, providing data-driven higher bounds for the number of snapshots available for computation of time-varying sample covariance matrices. Application of the test is illustrated with simulated data in a horizontal array for the detection of a quiet source in the presence of a loud interferer.

Approximation of the true population covariance matrix R corresponding to a time-dependent process by means of the sample covariance R̂ can be a challenging problem in highly dynamic environments with large-aperture arrays.1 On one hand, a large number of data snapshots M is required for the sample covariance to be a consistent estimator. On the other hand, snapshots must be obtained from a limited time interval where the data is assumed to be locally stationary. For this reason, tests for stationarity that require very few samples are relevant. In this work, a statistical test for data stationarity based on the relation between consecutive sample eigenvectors is considered. The proposed test is sensitive to non-stationarity caused by time-dependent variations in source position.

Figure 1(a) shows the simulated scenario considered in this work, consisting of an N-element horizontal line array (HLA) in the presence of uncorrelated acoustic sources located at time-varying azimuths in the water column. This configuration results in a population covariance with spiked2 structure, so called due to a few eigenvalues associated to sources in the water column resulting in peaks in the eigenspectra. Data segmentation according to local stationarity is illustrated in Fig. 1(b), in which the continuous time axis is discretized as t0, t1, etc. In this example, from ta to ta+2 and from tb to tb+3 the azimuths of sources 1 and 2 are constant or slowly-varying (i.e., stationary). In contrast, between ta+2 and tb the azimuth of both sources varies significantly. The goal of this work is to define an eigenvector-based statistical test that requires very few snapshots to identify segments of stationary data, based on changes of direction of the corresponding population eigenvectors, as illustrated in Fig. 1(c) and developed in Sec. III.

FIG. 1.

(a) Top-down view of the simulated scenario in this work, consisting of an N-element HLA in the presence of moving sources in the far field (drawing not to scale). (b) Illustration of time segmentation according to data stationarity. In this example, the azimuth of sources 1 and 2 does not change significantly between ta and ta+2 and between tb and tb+3, thereby defining two intervals with local data stationarity. (c) Relation between population and sample eigenvectors: At times ta and ta+2, the sample eigenvectors (û1a, û2a) lie on a cone centered on the corresponding population eigenvectors (u1a and u2a). At time tb the population eigenvectors vary in relation to those at ta due to source movement.

FIG. 1.

(a) Top-down view of the simulated scenario in this work, consisting of an N-element HLA in the presence of moving sources in the far field (drawing not to scale). (b) Illustration of time segmentation according to data stationarity. In this example, the azimuth of sources 1 and 2 does not change significantly between ta and ta+2 and between tb and tb+3, thereby defining two intervals with local data stationarity. (c) Relation between population and sample eigenvectors: At times ta and ta+2, the sample eigenvectors (û1a, û2a) lie on a cone centered on the corresponding population eigenvectors (u1a and u2a). At time tb the population eigenvectors vary in relation to those at ta due to source movement.

Close modal

Random matrix theory (RMT) has been applied to study the sample eigenvalues and eigenvectors of spiked covariances3–5 for N/Mconstant as N,M. Research on this field has provided (among other results) methods for correcting sample eigenvalue bias4 and estimating the effective number of identifiable sources,6 as well as tools for performance analysis of array processing techniques.7–9 More recently, research on the high dimension, low sample size (HDLSS) regime10 (in which Mconstant as N) has described the relationship between sample and population eigenvectors of spiked covariances as a joint probability density function (pdf). The work presented here makes use of expressions from HDLSS to derive (Sec. III) a full statistical description of the expected behavior of consecutive signal sample eigenvectors (i.e., those eigenvectors related to dominant sources in the water column) under the hypothesis of stationarity. The obtained expression provides a quantitative metric to decide bounds of local stationarity, as illustrated in Sec. IV with a numerical example.

Let ya=sa+wa be an N × 1 data snapshot vector11 received at time ta at an N-element HLA. Each entry of the noise vector wa is normally distributed with power σw2, while sa=Σq=1Qvqaξq represents Q uncorrelated moving sources with power σq2. Source amplitudes are complex-valued stochastic variables ξqσq(N(0,1)+jN(0,1))/2, where N(0,1) indicates a random realization from a normal distribution and j=1. For each source, the time-dependent replica vector (assuming sources in the far field) is

(1)

where k=ω/cw is the wavenumber, ω is the frequency in rad/s, cw is the speed of sound in the water, θqaθq(ta) is the azimuth angle of the qth source at time ta, and d is the array inter-element spacing. The population (i.e., clairvoyant) covariance for mutually uncorrelated sources is11 

(2)

where Va=[v1av2avQa] is a matrix with source replicas as columns and Ψ=diag(σ12σ22σQ2) is a Q × Q diagonal matrix of the source powers. The covariance can also be written in terms of population eigenvalues λia and eigenvectors uia as

(3)

where Q̂Q is the effective number of sources determined by the actual number of sources located at distinct azimuths. Equation (3) is partitioned into eigenpairs containing signal information versus those associated to the background noise. At time ta an estimate of Ra is obtained by averaging the outer product of M snapshots as

(4)

where λ̂na and ûna are the sample eigenvalues and eigenvectors, respectively. The number of snapshots M available for estimation of signal statistics is determined by the largest time interval over which ym can be considered locally stationary. Testing whether the underlying population covariance structure has changed from that at time ta to that at time tb can be accomplished by monitoring quantities such as the (non-observable) population eigenvalues and eigenvectors at consecutive times, based on the variability of the (observable) sample eigenvalues and eigenvectors. For the simulated scenario in Fig. 1(a) the population eigenvector configuration is strongly affected by target movement. In Sec. III a statistically justified metric to decide between hypothesis H0: Ra=Rb versus H1: RaRb based on statistics of the sample eigenvectors is described.

Consider the signal eigenspace spanned by the eigenvectors u1a and u2a at ta in Fig. 1(c), and how it relates to the eigenspace at a later time: In the case of signal stationarity the population eigenvectors do not change (e.g., u1a=u1a+2 and u2a=u2a+2 in this example), although the direction of the sample eigenvectors varies from realization to realization. On the other hand if the signal is not stationary due to source movement then the population eigenvectors at tb point to different directions in relation to those at ta. Since the true eigenvectors are not observable quantities, the decision of whether or not the signal eigenspace has changed must be made based on the observable (yet inaccurate) sample eigenvectors (û1a,û2a,û1b, and û2b in this example). This section investigates the pdf of |ûnaHûnb|2 (i.e., the squared cosine of the angle between consecutive sample eigenvectors) under the H0 hypothesis. Previous work in RMT (Refs. 4 and 5) has shown that for dominant eigenvalues, the corresponding sample eigenvectors are constrained to lie on a cone centered on the population eigenvectors, as illustrated in Fig. 1(c). Due to this constraint, a relation can be expected between the sample eigenvectors at ta and tb under the condition of data stationarity. To obtain a full probability distribution for |ûnaHûnb|2 under hypothesis H0, each sample eigenvector is written as the sum of projections parallel (p̂na) and transverse (ĝna) to the signal subspace:10 

(5)

Under hypothesis H0, uqa=uqb for all q so p̂naHĝnb=0. For simplicity, the case Q̂=1 (i.e., one dominant interferer) is considered, so

(6)

According to RMT, for λ1a>σw2(1+N/M) the transverse component ĝ1a is a vector with direction uniformly distributed on an N − 1 dimensional sphere, making |ĝ1aHĝ1b|0. For HDLSS, the asymptotic regimes10 for eigenvector convergence depend on the magnitude of λ1a compared to N. In this paper we focus on the HDLSS regime for which λ1aN (i.e., loud interferers if N is large), for which |ĝ1aHĝ1b|0 is still valid since the RMT condition N>σw2(1+N/M) is met for σw2/N1 (as expected for usual background noise levels). Notice that from Eq. (5) with u1a=u1b,p̂1aHp̂1b=((û1aHu1a)u1a)H((û1bHu1a)u1a). Finally,

(7)

where Xa=|û1aHu1a|2 and Xb=|û1bHu1a|2 are independent random variables and * indicates the complex conjugate operator. It has been shown10 that Xa (or Xb) is distributed as

(8)

where χM2(·) is the Chi-square distribution with M degrees of freedom. Then, the pdf of z=|p̂1aHp̂1b|2 under hypothesis H0 is given by12 

(9)

The integral in Eq. (9) can be computed numerically for any value of z by discretizing the integration variable xa. For MN snapshots, the test provides a quantitative expression to distinguish between sample eigenvector variability due to true changes on the underlying signal statistics (i.e., non-stationarity) versus variability due to the lack of snapshots. Application of Eq. (9) to data segmentation is illustrated in Sec. IV with an example.

To conclude this section, it is important to comment on the extension of the proposed stationarity metric to the case of multiple interferers. For Q̂>1 the stationarity metric is a function of the scalar products |ûnaHûmb|2 where m,n=1,,Q̂. Similar to the procedure described in this section, expressing each sample eigenvector according to Eq. (5) allows defining the metric function in terms of random variables |unaHûmb|2 (i.e., population vs sample eigenvector), the statistics of which are described in the literature10 and can be used to obtain the first and second moments of the stationarity metric.

This section considers the case of a fast maneuvering quiet source (received signal strength of σ22 = 5 dB) initially located at θ2=0° azimuth in the presence of a slow and loud interferer (σ12 = 18 dB) initially at θ1=3.5°. For this simulation cw = 1500 m/s, f = 500 Hz, N = 120 sensors, d = 1.5 m, and the background noise is set to σw2 = 0 dB. Tracking of the source azimuth is accomplished by applying minimum variance distortionless response beamforming to the data covariance matrix. Ideally, if the clairvoyant covariance is known,11 

(10)

where v(θ) is the replica vector for a plane wave impinging the array with angle θ. The beamformer output corresponding to Eq. (10) is shown in Fig. 2(a) and it provides a benchmark for beamformer simulations in which Ra must be estimated. The results are plotted with time in the vertical axis and azimuth in the horizontal axis, showing the time-varying azimuth for the fast source and the loud interferer. In realistic situations, the covariance matrix must be estimated from limited samples and the beamformer becomes

(11)

where R̃a=Σq=1Q̂λ̃qaũqaũqaH+ϵI is a modified sample covariance matrix8,13 (note the ̃ indicator) that retains the dominant eigenvectors. The ϵ loading is used to stabilize the inverse R̃a1. The results of BFMa(θ) corresponding to M = 5 snapshots per covariance are shown in Fig. 2(b). Due to the lack of snapshot support the power of the interferer is underestimated at all times and the quiet source is not detected at some time instances. Figure 2(c) shows the beamformer output using improved weights, obtained from sample covariance matrices with increased snapshot support determined by data segmentation based on Eq. (9). For example, between 0 and 25 s the interferer azimuth remains constant, so the dominant eigenvalue/eigenvector (λ̃1a and ũ1a, respectively) can be estimated with all the data snapshots for this time interval as opposed to just M = 5 snapshots. A similar situation is observed between 50 and 65 s, where the variation in azimuth for the interferer is slow enough to consider the data as stationary within this interval. The importance of Eq. (9) is that it allows objective data segmentation based on the observable sample eigenvectors. This is shown in Fig. 2(d), where the theoretical PZ(z|H0) is shown in the background indicating the probability of random realizations of |û1aHû1b|2 under hypothesis H0. In this panel, Monte Carlo realizations of |û1aHû1b|2 at each time step are over-plotted as a solid white line. Segmentation bounds indicated as horizontal dashed lines are placed whenever |û1aHû1b|2<0.32 or |û1aHû1b|2>0.67, selected to exclude values of z with low probability. Notice that the theoretical PZ(z|H0) was computed assuming knowledge of the (unobservable) ratio σw2/λ1a. While estimates of λ1a and σw2 based on sample eigenvectors for NM have been suggested in the literature,4 the impact of using such estimates in PZ(z|H0) is subject of future research.

FIG. 2.

(Color online) Example of data stationarity detection for a fast maneuvering target in the presence of a loud interferer: (a) BFCa [Eq. (10)] corresponding to the clairvoyant covariance. (b) BFMa [Eq. (11)] using the sample covariance matrix obtained from M = 5 at each time step. (c) Improved beamforming results, with weights computed from M5 snapshots. (d) PZ(z|H0), indicating the pdf of |û1aHû1b|2 assuming hypothesis H0. The white overplot lines indicate realizations of |û1aHû1b|2, while the horizontal-dashed lines provide the boundsfor data segmentation based on stationarity. (e) Scalar products |u1aHũ1a| and |u1aHû1a|, showing improvement on estimating the dominant eigenvector.

FIG. 2.

(Color online) Example of data stationarity detection for a fast maneuvering target in the presence of a loud interferer: (a) BFCa [Eq. (10)] corresponding to the clairvoyant covariance. (b) BFMa [Eq. (11)] using the sample covariance matrix obtained from M = 5 at each time step. (c) Improved beamforming results, with weights computed from M5 snapshots. (d) PZ(z|H0), indicating the pdf of |û1aHû1b|2 assuming hypothesis H0. The white overplot lines indicate realizations of |û1aHû1b|2, while the horizontal-dashed lines provide the boundsfor data segmentation based on stationarity. (e) Scalar products |u1aHũ1a| and |u1aHû1a|, showing improvement on estimating the dominant eigenvector.

Close modal

In general, slow variations of the interferer azimuth cause z=|û1aHû1b|2 to remain around the value where PZ(z|H0) peaks. Deviations from this peak can be interpreted as statistical evidence to select H1 over H0, thereby providing a data-driven segmentation approach. The advantage of setting M according to the segmentation in Fig. 2(d) can be quantified by comparing the scalar product between the first (and dominant) population eigenvector to the sample eigenvectors û1a (obtained with M = 5) and ũ1a (obtained with M ≥ 5, according to stationarity bounds). This is shown in Fig. 2(e), which demonstrates the improvement achieved with the segmentation procedure on estimating the dominant eigenvector: In all cases, |u1aHũ1a|>|u1aHû1a|, approaching the ideal value of 1 for time intervals with slow or no interferer motion.

A comparison of Figs. 2(b) and 2(c) highlights the advantage of using the improved weights in the beamformer: For Fig. 2(c), the power of both sources is estimated more accurately compared to Fig. 2(b), and in some cases it also improves the detection of the quiet source, such as at times 23.8 and 104.72 s, for example. This comparison is shown in Fig. 3, in which the improvement in detecting the quiet source is evident.

FIG. 3.

Detailed view of the results from Figs. 2(b) and 2(c) at selected times: (a) ta = 23.8 s; (b) ta = 104.72 s. The improved DMR beamformer (M5 snapshots) allow better discrimination of both the loud interferer and the quiet source.

FIG. 3.

Detailed view of the results from Figs. 2(b) and 2(c) at selected times: (a) ta = 23.8 s; (b) ta = 104.72 s. The improved DMR beamformer (M5 snapshots) allow better discrimination of both the loud interferer and the quiet source.

Close modal

An eigenvector-based test for data stationarity, suitable for snapshot-starved experimental scenarios with large arrays in the presence of maneuvering targets, has been described in this work. The test gives an ability to estimate from the data the number of snapshots that should be used to compute a sample covariance matrix where stationarity is ensured. This suggests applications to beamforming techniques with a variable number of snapshots according to the dynamics of dominant sources in the water column. Application of the test for improvement of the detection levels of targets and interferers using Dominant Mode Rejection (DMR) beamforming was illustrated with an example that includes a fast-maneuvering target in the presence of a loud interferer.

The authors gratefully acknowledge the support of the U.S. Office of Naval Research, ONR Undersea Signal Processing.

1.
H.
Cox
, “
Adaptive beamforming in non-stationary environments
,” in
Conference Record of the 36th Asilomar Conference
(Pacific Grove, CA,
2002
), pp.
431
438
.
2.
I. M.
Johnstone
, “
On the distribution of the largest eigenvalue in principal component analysis
,”
Ann. Stat.
29
,
295
327
(
2001
).
3.
Z.
Bai
and
J. W.
Silverstein
,
Spectral Analysis of Large Dimensional Random Matrices
, 2nd ed. (
Springer
,
New York
,
2010
).
4.
S.
Lee
,
F.
Zou
, and
F. A.
Wright
, “
Convergence and prediction of principal component scores in high-dimensional settings
,”
Ann. Stat.
38
,
3605
3629
(
2010
).
5.
F.
Benaych-Georges
and
R. R.
Nadakuditi
, “
The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices
,”
Elsevier Adv. Math.
227
,
494
521
(
2011
).
6.
R. R.
Nadakuditi
and
A.
Edelman
, “
Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples
,”
IEEE Trans. Signal Process.
56
,
2625
2638
(
2008
).
7.
M.
Pajovic
,
J. C.
Preisig
, and
A. B.
Baggeroer
, “
MSE of diagonally loaded capon beamformer-based power spectrum estimator in snapshot deficient regime
,” in
Conference Record of the 46th Asilomar Conference
(Pacific Grove, CA,
2012
), pp.
207
211
.
8.
K. E.
Wage
and
J. R.
Buck
, “
Performance analysis of dominant mode rejection beamforming
,” in
Proceedings of the 20th International Congress in Acoustics
, Sydney, Australia (
2010
), pp.
1
6
.
9.
X.
Mestre
and
M. A.
Lagunas
, “
Modified subspace algorithms for DoA estimation with large arrays
,”
IEEE Trans. Signal Process.
56
,
598
614
(
2008
).
10.
S.
Jung
,
A.
Sen
, and
J. S.
Marron
, “
Boundary behavior in high dimension, low sample size asymptotics of PCA
,”
J. Multivar. Anal.
109
,
190
203
(
2012
).
11.
H. L.
Van Trees
,
Optimum Array Processing
(
Wiley-Interscience
,
New York
,
2002
).
12.
V. K.
Rohatgi
,
An Introduction to Probability Theory and Mathematical Statistics
(
Wiley Series in Probability and Mathematical Statistics
,
New York
,
1976
).
13.
D. A.
Abraham
and
N. L.
Owsley
, “
Beamforming with dominant mode rejection
,” in
Proceedings of OCEANS '90. Engineering in the Ocean Environment
, pp.
470
475
.