A previous analysis of 1977 passive acoustic recordings in the Indian Ocean focused on sound pressure levels (SPLs) and showed that SPLs were slightly depth dependent and highly influenced by shipping activities [Wagstaff and Aitkenhead, IEEE J. Ocean. Eng. 30(2), 295–302 (2005)]. Consequently, SPL alone does not provide a consistent, comprehensive metric for comparison among sites or with contemporary recordings in the same region. Therefore, a source separation analysis was devised and applied to identify the major sound source contributions at three Indian Ocean locations. Shipping noise was a major contributor at all sites, while the site with the most diverse set of sources was in the central Arabian Sea.

Concern over rising ocean sound levels was sparked by observations made in the Northeast Pacific Ocean, in which a 3 dB/decade rise in low frequency ambient ocean sound was documented.1 The few well-documented datasets of ocean sound recordings prior to 1980 provided information to assess long term ambient sound trends and to understand the changes in the sound source contributions driving the sound level dynamics.2–4 These observations raised the question of whether the rise is a regional phenomenon or indicative of a more global effect. More recent studies sought to expand the current state of knowledge by assessing ocean sound trends in other parts of the world.5–8 These contemporary studies examined ocean sound trends over the past 20 years, as access to quality recordings prior to the 1990s in areas outside the Northeast Pacific has been a limiting factor. Recent trends do not indicate that low frequency sound levels are increasing globally,8 but access to more historical acoustic recordings is needed to better interpret changes beyond the most recent 20-year window.

The objective of this study was to assess the quality and characteristics of acoustic recordings made in the Northwest Indian Ocean in 1977 for future comparison to recordings made over the past decade. Acoustic recordings were made during the U.S. Naval Bearing Stake exercise in the Northwest Indian Ocean as part of the U.S. Navy Long Range Acoustic Propagation Project (LRAPP). The purpose of the Bearing Stake exercise was to acquire acoustic environmental data to support assessment of sonar performance. Ambient sound recordings were made at three sites across the Arabian and Somali Basins, in addition to other measures of propagation losses, coherence, surface shipping distributions, bottom losses, bathymetry, sound speed structure, physical oceanography, and meteorology.9 With regard to these sites, the authors were aware of (1) the acoustic analysis performed by Wagstaff and Aitkenhead,9 which addressed ambient sound pressure level, depth dependence, and directionality, and (2) the work conducted by Sagers and Knobles10 that addressed the statistical inference of the sound-speed depth profile of a seabed in the Gulf of Oman.

In the present study, the Bearing Stake recordings were revisited to provide a new understanding of the ambient sound. SPL did not provide a consistent metric for comparing the Bearing Stake soundscapes because of poor signal to noise ratio (SNR) and high electrical noise in some channels. Therefore, source identification was pursued to better understand the source contributions to the overall soundscape. New knowledge resulting from the source identification process will support future comparison with contemporary datasets in this region to gain a better understanding of the ambient sound changes over the past 50 years.

Recordings in the Bearing Stake exercise were made at three locations in the Northwestern Indian Ocean during the first quarter of 1977; recordings were made at three depths in the water column at each site. Clusters of hydrophones were deployed at depths of 400 m, mid-water column, and near-bottom. A vertical acoustic data capsule (VAC) unit was used to record the underwater sound. Each cluster consisted of four hydrophones recording at 27 dB gain steps to cover a wide dynamic range. The recorded data were multiplexed in a channel order from 1 to 14 at each site. The first 12 channels were distributed equally over the three targeted depth regions with four phones at each depth. Channel 13 was a single hydrophone at the deepest depth, and channel 14 was a time code.

In this study, three sites were analyzed based on knowledge gained from Ref. 9 and documentation of the Bearing Stake recordings provided by Applied Research Laboratories, University of Texas (ARL UT) at Austin11 (Table 1). Site 3 was the first deployment of the Bearing Stake exercise, with a total of six recording days. Recordings at sites 1B and 4 followed site 3 and spanned 7 and 13 days, respectively. The original sound pressure time series data were requested from ARL UT by the authors, and they are not available for public use without Office of Naval Research approval.

Table 1.

Parameters of the three sites in Bearing Stake (Ref. 11).

Site No.   Coordinates         Water depth (m)   Recording period   Hydrophone depth (m)
1B         23°38′N, 61°09′E    3351              Feb 16–Feb 22      1685
3          17°18′N, 65°24′E    3583              Feb 02–Feb 07      1919
4          04°45′N, 53°09′E    5106              Mar 11–Mar 24      1916

Statistical analysis and characterization of the low frequency (<1000 Hz) ocean ambient sound was performed on acoustic recordings at the three Bearing Stake sites. The depth selected for analysis was the depth within the SOFAR channel (Table 1 shows only the selected depth) to maximize the acoustic coverage represented by the measurements and allow future comparison with contemporary datasets in the same region. The selection of channels with the clearest signal display among the three sites was a critical task for analysis purposes, because the large dynamic range across the hydrophones in each cluster at the same depth resulted in different SNR levels between the signals of interest and the ambient sound. Further, low signal level and electrical noise were documented10,11 in some channels within each cluster.

Spectrograms of each channel were visually reviewed to select one channel per site for source comparison purposes. The channel that displayed the clearest signal and the least system noise was selected. Channels 7, 6, and 6 were selected for sites 3, 1B, and 4, respectively. Because of gain instability, sound levels alone did not provide a reliable metric for assessing the ambient sound differences across sites; therefore, source contribution was examined.

The SPL variability among the three sites was the result of constantly changing source contributions. A frequency correlation method was used to identify the most salient contribution differences among the three Bearing Stake sites. Frequency correlation metrics were calculated based on the time-frequency representation of the raw data8 (Fig. 1 top row). The raw data were transformed from the time domain to the frequency domain with a 10 s FFT and 0.1 Hz frequency resolution. The correlation is Pearson's pairwise linear correlation coefficient12 between each pair of frequency measurements in the time-frequency matrix. The absolute difference correlation matrices (Fig. 1 bottom row) were computed by taking the absolute difference between two frequency correlation matrices to highlight areas of greatest dissimilarity between the sites. The 2–20 Hz band was identified as a spectral region of large difference between site 3 and sites 1B and 4 and was thought to be driven by a unique source mechanism present at site 3. Sites 1B and 4 were comparatively similar.
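The frequency correlation step can be illustrated with a minimal sketch. It assumes each site is represented by a spectrogram-like array with one row per frequency bin and one column per time frame; the function names, array sizes, and synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np

def frequency_correlation(spec):
    """Pearson correlation between every pair of frequency bins.

    spec: 2-D array (n_freq, n_time); each row is the time history of one bin.
    Returns an (n_freq, n_freq) correlation matrix.
    """
    return np.corrcoef(spec)  # np.corrcoef treats rows as variables

def correlation_difference(spec_a, spec_b):
    """Absolute difference between two frequency correlation matrices."""
    return np.abs(frequency_correlation(spec_a) - frequency_correlation(spec_b))

# Synthetic stand-ins for two sites (hypothetical data, not the recordings).
rng = np.random.default_rng(0)
n_freq, n_time = 50, 400
site_x = rng.normal(size=(n_freq, n_time))
site_y = rng.normal(size=(n_freq, n_time))

# Give site_x a shared fluctuation in bins 2..19, mimicking a source that
# drives several neighbouring frequency bins together.
site_x[2:20] += 3.0 * rng.normal(size=n_time)

diff = correlation_difference(site_x, site_y)
in_band = diff[2:20, 2:20][~np.eye(18, dtype=bool)].mean()
out_band = diff[25:, 25:][~np.eye(25, dtype=bool)].mean()
print(f"mean |difference| in band: {in_band:.2f}, out of band: {out_band:.2f}")
```

In this toy case the band that contains the shared component stands out in the difference matrix, which is the kind of contrast the 2–20 Hz region shows in Fig. 1.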

Fig. 1.

(Color online) Frequency correlation matrices of sites 3, 1B, and 4 (top row) and frequency correlation difference matrices (bottom row).

A source separation method that takes advantage of the time-frequency representation provided by the short-time Fourier transform (STFT) was applied to the data at each site. A normalized cross correlation was applied to determine the segments that were generated by the same source.
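As a rough illustration of how a normalized cross-correlation can score the similarity of two time-frequency segments, the sketch below computes the peak of a globally normalized 2-D cross-correlation with zero-padded FFTs. This is a simplification chosen for brevity: the fast normalized cross-correlation cited in the text normalizes locally at each lag, and the function name and test matrices here are hypothetical.

```python
import numpy as np

def max_normalized_xcorr2(a, b):
    """Peak of a globally normalized 2-D cross-correlation of a and b.

    Both inputs are mean-removed and the full linear cross-correlation is
    divided by the product of their Euclidean norms, so the result is at
    most 1 and equals 1 when the two inputs are identical.
    """
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    if denom == 0:
        return 0.0
    # Zero-pad to the full output size so circular correlation equals linear.
    shape = (a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1)
    fa = np.fft.rfft2(a0, s=shape)
    fb = np.fft.rfft2(b0, s=shape)
    xcorr = np.fft.irfft2(fa * np.conj(fb), s=shape)
    return float(xcorr.max() / denom)

# Hypothetical magnitude "spectrograms" (frequency bins x time frames).
rng = np.random.default_rng(0)
seg = rng.random((20, 30))
same_peak = max_normalized_xcorr2(seg, seg)
shifted_peak = max_normalized_xcorr2(seg, np.roll(seg, 5, axis=1))
unrelated_peak = max_normalized_xcorr2(seg, rng.random((20, 30)))
print(same_peak, shifted_peak, unrelated_peak)
```

The full correlation output has roughly twice the dimensions of each input, and a time-shifted copy of a segment still scores well above an unrelated one, which is the property the clustering relies on.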

For each site, the sound pressure time series data, x(n), were divided into M segments, xm(n), in which the mth segment is 60 s long (144 000 samples). One minute was collected at 12-minute intervals, totaling 5 minutes each hour. In this study, a total of 12, 12.5, and 22.5 hours were processed for sites 3, 1B, and 4, respectively. For each segment, the mean, x̄m(n), was subtracted, and each zero-mean segment was then transformed to a time-frequency representation using the short-time Fourier transform (STFT). The STFT matrix of the mth segment is given as Xm(f) = [X1(f) X2(f) … Xk(f)], where the ith element of the STFT matrix13 is computed as follows:

X_i(f) = \sum_{n=-\infty}^{\infty} x_m(n)\, w(n - iG)\, e^{-j 2\pi f n},   (1)

where xm(n) is the mth sound pressure time series segment, w(n) is the Hann window function, and G is the hop size between successive DFTs. The magnitude of the STFT matrix of the mth segment is given as xm(f) = |Xm(f)|. In this study, a DFT length of 2048 points, a Hann window of 1024 samples, and an overlap of 512 samples were used. For each site, all the STFT matrices were concatenated into one large matrix, denoted here by QU [Fig. 2(a)]. The matrix QU was updated after each iteration with the remaining segments, which is why the total number of segments in that matrix is denoted by the variable M̌ [Fig. 2(a)].
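The segment transform described above can be sketched with NumPy alone. The 2400 Hz sampling rate is inferred from 144 000 samples per 60 s segment; the hop size is assumed to be the window length minus the overlap (512 samples); the function name and the synthetic test tone are hypothetical.

```python
import numpy as np

FS = 2400            # Hz, inferred from 144 000 samples per 60 s segment
N_FFT = 2048         # DFT length
WIN_LEN = 1024       # Hann window length
OVERLAP = 512        # overlapping samples between successive windows
HOP = WIN_LEN - OVERLAP  # assumed hop size G

def stft_magnitude(x, n_fft=N_FFT, win_len=WIN_LEN, hop=HOP):
    """Magnitude of the STFT of one zero-mean segment.

    Returns an array of shape (n_fft // 2 + 1, n_frames): one column per
    windowed frame, one row per non-negative frequency bin.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the segment mean
    w = np.hanning(win_len)               # Hann window
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + win_len] * w for i in range(n_frames)], axis=1
    )
    # Each frame is zero-padded from win_len to n_fft by rfft.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=0))

# Synthetic 60 s segment: a 50 Hz tone with a constant offset (hypothetical).
t = np.arange(60 * FS) / FS
segment = np.sin(2 * np.pi * 50.0 * t) + 0.5
mag = stft_magnitude(segment)
freqs = np.fft.rfftfreq(N_FFT, d=1.0 / FS)
peak_hz = freqs[mag.mean(axis=1).argmax()]
print(mag.shape, peak_hz)
```

With these settings, a 60 s segment yields 280 frames of 1025 frequency bins spaced about 1.17 Hz apart.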

Fig. 2.

(a) The flowchart of the sound source separation process. (b) Illustration of abrupt changes in Cm for the case of ambient soundscape. (c) Illustration of abrupt changes in Cm for the case of sound source.

The normalized 2-D cross-correlation14 was computed between the mth STFT matrix, xm(f), and the entire matrix, QU, to examine whether that segment was similar to any other segment. The result is another large matrix, Q̂U, in which each segment has twice the dimensions of the corresponding segment in QU. The maximum value of each cross-correlation matrix in Q̂U was collected into an array, Cm, and sorted in ascending order. Changes in the slope of Cm were used as a metric to determine whether a given segment contained a salient or transient sound source or was representative of the ambient soundscape. In each iteration, two change points, λ1 and λ2, were computed and served as the test metric. They were found by minimizing a cost function [Eq. (1) in Ref. 15] based on the error sum of squares (SSE) between the Cm values and the least-squares linear fit through those values. If both λ1 and λ2 were detected in the lower half of Cm, the mth segment was classified as ambient soundscape [Figs. 2(a), 2(b)] and was excluded from QU. If at least one change point was detected in the upper half of Cm, K segments were grouped as a sound source cluster [Figs. 2(a), 2(c)]: the jth cluster, SClusterj, held all segments after the change point λ2, and those K segments were excluded from QU. The process continued until all possible sound source clusters and ambient segments were separated.
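The decision rule on the sorted correlation maxima can be sketched as follows. This is only an illustration under stated assumptions: it uses an exhaustive search for exactly two change points that minimize the summed SSE of per-piece linear fits, rather than the penalized changepoint method of Ref. 15, and the function names, minimum piece length, and synthetic Cm curves are hypothetical.

```python
import numpy as np

def linear_sse(y):
    """Residual sum of squares of a least-squares line through y."""
    n = len(y)
    if n < 3:
        return 0.0  # a line passes exactly through one or two points
    t = np.arange(n)
    coeffs = np.polyfit(t, y, 1)
    return float(((y - np.polyval(coeffs, t)) ** 2).sum())

def two_changepoints(c, min_len=3):
    """Exhaustively find the two split indices minimizing total linear-fit SSE."""
    c = np.asarray(c, dtype=float)
    n = len(c)
    best_cost, best_l1, best_l2 = np.inf, None, None
    for l1 in range(min_len, n - 2 * min_len + 1):
        left = linear_sse(c[:l1])
        for l2 in range(l1 + min_len, n - min_len + 1):
            cost = left + linear_sse(c[l1:l2]) + linear_sse(c[l2:])
            if cost < best_cost:
                best_cost, best_l1, best_l2 = cost, l1, l2
    return best_l1, best_l2

def classify_segment(c_sorted):
    """Apply the lower-half / upper-half rule to a sorted array of maxima.

    Returns ("ambient", empty array) when both change points fall in the
    lower half, else ("source", indices after the second change point).
    """
    l1, l2 = two_changepoints(c_sorted)
    half = len(c_sorted) // 2
    if l1 < half and l2 < half:
        return "ambient", np.array([], dtype=int)
    return "source", np.arange(l2, len(c_sorted))

# Hypothetical sorted correlation maxima for the two cases in Figs. 2(b), 2(c).
idx = np.arange(40)
ambient_like = np.concatenate(
    [0.05 + 0.05 * idx[:4], 0.30 + 0.01 * idx[:6], 0.50 + 0.001 * idx[:30]]
)
source_like = np.concatenate(
    [0.10 + 0.002 * idx[:20], 0.40 + 0.01 * idx[:10], 0.80 + 0.005 * idx[:10]]
)
print(classify_segment(ambient_like)[0], classify_segment(source_like)[0])
```

In the source-like case the segments after the second change point form the candidate cluster; in the ambient-like case only the reference segment would be removed before the next iteration.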

The characteristics of the separated sound sources were distinct enough for the normalized 2-D cross-correlation to separate them robustly (Fig. 3), except for the ship noise cluster, which combined source and ambient characteristics. Therefore, an additional processing stage was needed to separate ship noise from the ambient sound. The segments of the ship noise cluster (represented by the sound pressure time series segments) were transformed to Welch's power spectral density (PSD)16 [Fig. 2(a)] and compared to an empirical threshold (θ = 135 dB): the kth segment was considered ambient if its maximum PSD was below 135 dB; otherwise, the segment was identified as containing ship noise. The threshold was set high enough to ensure that only high-SNR ship noise was separated from the ambient soundscape. A lower threshold (<135 dB) increased the false positive rate: it produced ship noise detections that were not supported by manual review of randomly selected segments across all three sites.
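The threshold stage can be sketched with a small NumPy implementation of Welch's method. Several details are assumptions made for illustration: the 2400 Hz sampling rate (inferred from the segment length), the window length and overlap, the reference unit (pressure in µPa, so levels are in dB re 1 µPa²/Hz), and the synthetic signals. The text gives only the 135 dB value and the maximum-PSD rule.

```python
import numpy as np

FS = 2400          # Hz, inferred from 144 000 samples per 60 s segment
THETA_DB = 135.0   # empirical threshold from the text

def welch_psd_db(x, fs=FS, nperseg=1024, overlap=512):
    """One-sided Welch PSD in dB (Hann window, averaged periodograms).

    If x is pressure in micropascals, the result is in dB re 1 uPa^2/Hz.
    nperseg and overlap are illustrative choices, not values from the study.
    """
    x = np.asarray(x, dtype=float)
    w = np.hanning(nperseg)
    hop = nperseg - overlap
    n_frames = 1 + (len(x) - nperseg) // hop
    scale = fs * (w ** 2).sum()
    psd = np.zeros(nperseg // 2 + 1)
    for i in range(n_frames):
        frame = x[i * hop: i * hop + nperseg]
        frame = frame - frame.mean()          # remove the frame mean
        psd += np.abs(np.fft.rfft(frame * w)) ** 2 / scale
    psd /= n_frames
    psd[1:-1] *= 2.0                          # fold negative frequencies
    return 10.0 * np.log10(psd + 1e-30)       # small floor avoids log(0)

def label_segment(x, theta_db=THETA_DB):
    """Label a segment 'ship' if its maximum PSD reaches the threshold."""
    return "ship" if welch_psd_db(x).max() >= theta_db else "ambient"

# Hypothetical signals: white noise near 90 dB, with and without a strong tone.
rng = np.random.default_rng(1)
n = 60 * FS
sigma = np.sqrt(10 ** (90 / 10) * FS / 2)     # white noise at ~90 dB/Hz
ambient = sigma * rng.standard_normal(n)
t = np.arange(n) / FS
ship = ambient + 1e8 * np.sin(2 * np.pi * 60.0 * t)
print(label_segment(ambient), label_segment(ship))
```

Raising or lowering `THETA_DB` in a sketch like this trades missed ship segments against false positives, which is the trade-off the empirical threshold was chosen to balance.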

Fig. 3.

(Color online) The spectrogram of ambient environment and four captured sources of sites 3, 1B, and 4. (a) Source A, (b) source B, (c) source C, and (d) source D.

The separated sources were grouped into four categories based on unique amplitude and frequency characteristics (Fig. 3). The frequency content varied from several Hz to 500 Hz, with different regions of energy concentration. The source categories were labeled A to D, and the ambient environment was labeled N (Table 2). The proportion of each sound source and of the ambient soundscape without any identified sources is shown in Table 2. Identifiable sound sources were detected at site 3 more often than at any other site. The most prevalent source identified at site 3 was ship noise (source C), detected 34% of the time, followed closely by source A, detected 33% of the time. Source A is consistent with the broadband signal of particles or plankton colliding with the hydrophone in the current.17 Site 1B contained sources B and C along with a majority contribution of ambient sound (>60%). Source B is consistent with the signal of seismic impulses and associated multipath arrivals produced by distant seismic surveys.18 Site 1B had the least diverse set of sources of the three sites, with a significant contribution from source C. The characteristics of signals in source D have not been linked directly to any known source by the authors, but the constant 4 s pulse rate suggests a human-generated signal. Source D was detected only at site 4, with a minimal 2% proportion. At all three sites, ship noise and the ambient environment were the dominant parts of the soundscape (Table 2). The frequency correlation analysis in Sec. 2.2 showed a spectral region of large difference (2–20 Hz band) between site 3 and the other sites. That result is supported by the identification of source A, which was detected only at site 3 and is likely driving the observed spectral difference in the correlation matrices of Fig. 1.

Table 2.

A comparison of the sound source contributions across sites.

Source category   Site source contribution (%)     Overall source contribution (%)
                  Site 3    Site 1B    Site 4
A                 33        -          -            8.38
B                                                   3.22
C                 34        37         35           35.2
D                 -         -          2            0.85
N                 28        62         60           52.35

The Bearing Stake exercise measured the acoustic environment of the Northwestern Indian Ocean during the first four months of 1977. Observations of increasing sound levels (1) in the NE Pacific Ocean over a period that included the 1970s and (2) in the Northern Indian Ocean over the past two decades5 motivated the current study, for which the only known and accessible recordings in the Northern Indian Ocean prior to 1980 were from the Bearing Stake LRAPP exercise. The high and differing dynamic range settings of the hydrophones made it difficult to use SPLs to determine which channel was the most appropriate for spectral decomposition of the source contributions to the overall soundscape.

Due to the limited recording duration, the statistical analysis focused on finding the sound source contributions at each site to assess the diversity of regional sources. The applied sound source separation algorithm takes advantage of the short-time Fourier transform to represent the data segments in both the time and frequency domains, while the normalized cross-correlation between a reference segment and the entire dataset was used as a source detector by scanning for relevant segments. The algorithm enabled separation of the segments, whether they contained amplitude or frequency modulated sources, through a threshold that detects abrupt changes in the segments' correlation coefficients. The significance of this study lies in gaining new knowledge from historical recordings that are relatively short. The results provide a benchmark for comparison with contemporary studies in the same region and hence facilitate greater understanding of the ambient sound changes over the past 50 years. The unique source category detected at site 3 may indicate stronger currents and/or more water column particles at this site than at sites 1B and 4. Present-day soundscapes from this area8 show a large contribution from baleen whales, which the analysis of the 1977 recordings did not reflect. It is possible that the military activity suppressed baleen whale vocal production during the 1977 recording period, as a reduction in baleen whale vocalization has been observed in response to elevated sound levels and anthropogenic source exposure.19,20 This possibility must be addressed when comparing the 1977 sound levels and soundscapes to contemporary datasets.

Thank you to the Applied Research Laboratories, University of Texas at Austin for facilitating the transfer of the Bearing Stake data. This study was supported under a grant from the Office of Naval Research Award No. N000141812063.

1. R. K. Andrew, B. M. Howe, J. A. Mercer, and M. A. Dzieciuch, “Ocean ambient sound: Comparing the 1960s with the 1990s for a receiver off the California coast,” Acoust. Res. Lett. Online 3(2), 65–70 (2002).
2. N. R. Chapman and A. Price, “Low frequency deep ocean ambient noise trend in the Northeast Pacific Ocean,” J. Acoust. Soc. Am. 129(5), EL161–EL165 (2011).
3. M. A. McDonald, J. A. Hildebrand, and S. M. Wiggins, “Increases in deep ocean ambient noise in the Northeast Pacific west of San Nicolas Island, California,” J. Acoust. Soc. Am. 120(2), 711–718 (2006).
4. M. A. McDonald, J. A. Hildebrand, S. M. Wiggins, and D. Ross, “A 50 year comparison of ambient ocean noise near San Clemente Island: A bathymetrically complex coastal region off Southern California,” J. Acoust. Soc. Am. 124(4), 1985–1992 (2008).
5. J. L. Miksis-Olds, D. L. Bradley, and X. Maggie Niu, “Decadal trends in Indian Ocean ambient sound,” J. Acoust. Soc. Am. 134(5), 3464–3475 (2013).
6. A. Širović, S. M. Wiggins, and E. M. Oleson, “Ocean noise in the tropical and subtropical Pacific Ocean,” J. Acoust. Soc. Am. 134(4), 2681–2689 (2013).
7. M. Van der Schaar, M. A. Ainslie, S. P. Robinson, M. K. Prior, and M. André, “Changes in 63 Hz third-octave band sound levels over 42 months recorded at four deep-ocean observatories,” J. Marine Syst. 130, 4–11 (2014).
8. J. L. Miksis-Olds and S. M. Nichols, “Is low frequency ocean sound increasing globally?,” J. Acoust. Soc. Am. 139(1), 501–511 (2016).
9. R. A. Wagstaff and J. W. Aitkenhead, “Ambient noise measurements in the northwest Indian Ocean,” IEEE J. Ocean. Eng. 30(2), 295–302 (2005).
10. J. D. Sagers and D. P. Knobles, “Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin,” J. Acoust. Soc. Am. 135(6), 3327–3337 (2014).
11. J. A. Shooter, T. E. De Mary, D. P. Knobles, and R. E. Keenan, “BEARING STAKE Exercise Acoustic Data Recovery,” September 2012, ARL-TL-EV-12-97 delivered to the Oceanographic and Atmospheric Master Library, Stennis Space Center, MS (April 2019).
12. J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference: Revised and Expanded (CRC Press, Boca Raton, FL, 2014).
13. J. Allen, “Short term spectral analysis, synthesis, and modification by discrete Fourier transform,” IEEE Trans. Acoust. Speech Sign. Process. 25(3), 235–238 (1977).
14. J. P. Lewis, “Fast normalized cross-correlation,” in Proceedings of Vision Interface (1995), pp. 120–123.
15. R. Killick, P. Fearnhead, and I. A. Eckley, “Optimal detection of changepoints with a linear computational cost,” J. Am. Stat. Assoc. 107(500), 1590–1598 (2012).
16. P. Stoica and R. Moses, Spectral Analysis of Signals (Prentice-Hall, Upper Saddle River, NJ, 2005).
17. T. Geay, P. Belleudy, C. Gervaise, H. Habersack, J. Aigner, A. Kreisler, and H. Seitz, “Passive acoustic monitoring of bed load discharge in a large gravel bed river,” J. Geophys. Res. Earth Surf. 122, 528–545 (2017).
18. S. B. Martin, M.-N. R. Matthews, J. T. MacDonnell, and K. Bröker, “Characteristics of seismic survey pulses and the ambient soundscape in Baffin Bay and Melville Bay, West Greenland,” J. Acoust. Soc. Am. 142(6), 3331–3346 (2017).
19. M. Castellote, C. W. Clark, and M. O. Lammers, “Acoustic and behavioural changes by fin whales (Balaenoptera physalus) in response to shipping and airgun noise,” Biol. Conserv. 147, 115–122 (2012).
20. S. Cerchio, S. Strindberg, T. Collins, C. Bennett, and H. Rosenbaum, “Seismic surveys negatively affect humpback whale singing activity off northern Angola,” PLoS One 9(3), 1–11 (2014).