Changes in the Arctic environment with regard to declining sea ice are expected to alter the ambient sound field, affecting both the sound generating processes and the sound propagation. This paper presents acoustic recordings collected on the 150-m isobath on the Chukchi Shelf over a yearlong period spanning October 2016 to October 2017. The analysis uses sections of recordings approximately 12 min long collected six times daily. The measurements were collected on a vertical line array spanning the lower 110 m of the water column. The 25th percentile level is used to characterize the spectral shape of the background sound between 40 Hz and 4 kHz. The ambient sound data are analyzed using k-means clustering to quantify the occurrence of six spectral shapes over the yearlong experiment. Each cluster type is associated with a different sound generation process based on the correlations with environmental observations. The cluster observed most frequently was associated with wind-generated sound based on a correlation of sound level with wind speed as well as occurrence during the open water season. The cluster with the smallest number of observations was attributed to wind effects on frazil ice forming in open leads during the ice-covered season.
Ambient sound in the Arctic is highly variable in space and time and shows strong seasonal variations that are correlated with annual changes in sea ice. Historically, the prevailing sound in ice-covered regions mainly consisted of contributions from ice dynamics (ridging, shearing, etc.) or thermal cracking (Carey and Evans, 2011). These processes can create some of the loudest underwater environments, but under calm wind conditions, ice-covered environments can also be some of the quietest places in the world's oceans. For instance, under-ice ambient sound can be up to 30 dB higher or 20 dB lower than the same location with open water (Hutt, 2012). Furthermore, the ambient sound level in ice-covered environments is highly non-Gaussian as a result of numerous loud transient ice noise events (Ozanich et al., 2017). Near the marginal ice zone (MIZ), where the ocean swell and sea-state influence environmental conditions, ambient sound levels can be 10 dB greater than in the surrounding area (Geyer et al., 2016; Johannessen et al., 2003). Sound sources common to lower latitudes are also observed in the Arctic, but the generation and propagation of these sounds are also influenced by sea ice. For example, wind is a dominant source of ambient sound in temperate oceans, but the presence of sea ice serves to partially insulate the ocean from the atmosphere, thereby altering the effect of wind-generated sound (Kinda et al., 2013; Roth et al., 2012). As a second example, sound from distant shipping and other anthropogenic sources is far less prevalent in the ice-covered Arctic (Carey and Evans, 2011). As a final example, the observation of some marine mammal vocalizations is correlated with species-specific migratory patterns, which often follow the ice edge (Chou et al., 2020).
The Arctic has experienced changes in atmospheric conditions, seasonal sea ice coverage (Wood et al., 2015), and thermohaline structure (Timmermans et al., 2018) over the last several decades. The level and character of ambient sound in the Arctic are expected to change as the ice coverage and thickness decrease. There is already evidence of an increase in anthropogenic activities from industries such as oil and gas, fishing, shipping, tourism, and defense (Reeves et al., 2014). The once impassable Northwest Passage has recently seen an advance of commercial traffic, and routine vessel transits are expected by midcentury (Miller and Ruiz, 2014). The increased ambient sound levels from industrialization in the Arctic are affecting marine mammal vocalizations (Fournet et al., 2021). Additionally, some marine mammal species are shifting their migratory patterns, influenced by the longer ice-free season (Hauser et al., 2017). Furthermore, the thinner and more mobile ice of the modern Arctic responds more readily to winds and is more easily broken up, resulting in increased contributions from wind-generated sound (Kinda et al., 2013). The younger ice may also be less capable of withstanding ice sheet shearing that produces sustained tonal features (Chen et al., 2019).
These changes in the sound generating processes are accompanied by changes in acoustic propagation. For example, declines in sea ice extent are associated with long-term trends of thinning sea ice, a lengthening of the summer open water season, and a shift from primarily perennial multi-year ice to seasonal first-year ice (Frey et al., 2015; Krishfield et al., 2014). The implications for acoustic propagation include lower surface loss since sound incurs less loss from open water than from the rough ice interface (Ballard et al., 2020). Similarly, surface loss is expected to be lower for first-year ice compared to multi-year ice, which is generally rougher as it contains significantly larger ice keels (Strub-Klein and Sudom, 2012; Timco and Burden, 1997) and has a rugged underside caused by uneven melt rates during previous summers (Wadhams and Toberg, 2012). Furthermore, oceanographic changes also have effects on acoustic propagation. For example, in the Beaufort Sea, a doubling in the Beaufort Gyre halocline heat content has been observed over the past three decades (Timmermans et al., 2018). This water mass forms the upper boundary of a subsurface acoustic duct and has resulted in a more efficient channeling of sound within the duct (Freitag et al., 2015). Modeling results have investigated the changing strength of the duct and its effects on acoustic propagation (Ballard et al., 2020; Duda et al., 2019; Sagers et al., 2015). Together, these changes are contributing to a significantly different acoustic propagation environment compared to that of previous decades and add to the complexities of the evolving ambient sound field in the Arctic.
This study examines a yearlong record of ambient sound data collected on the Chukchi Shelf between 2016 and 2017 as part of the Shallow-Water Canada Basin Acoustic Propagation Experiment (SW CANAPE) (Badiey et al., 2019; Ballard et al., 2020; Collins et al., 2019). Following previous studies of Arctic ambient sound, this work is focused on the “background” sound level as defined by Roth et al. (2012) or, similarly, as the sound that originates from a myriad of indistinguishable sources as described by Kinda et al. (2013). In this work, the 25th percentile level is used to characterize the spectral shape of the background sound. Previous studies have demonstrated the spectral shape of the background ambient sound level changes seasonally (Roth et al., 2012; Sagers and Ballard, 2018a) and can be associated with sound generation processes (Kinda et al., 2013; Miksis-Olds et al., 2013).
This study investigates the application of an unsupervised machine learning approach to cluster the spectral shape of the background ambient sound. In this work, the spectral shape is defined as the normalized 25th percentile level calculated from 24-min-long recordings collected every 4 h. Compared to previous studies that examine seasonal or monthly averages (Miksis-Olds et al., 2013; Roth et al., 2012; Sagers and Ballard, 2018a), this approach facilitates an examination of the data at higher temporal resolution, making it possible to observe the acoustic response to shorter duration environmental processes. However, the application of clustering to ambient sound data is challenged by the continuous nature of the environmental processes that influence it. For example, wind-generated sound is a non-binary process in which sound level is a function of wind speed. Additionally, there are often multiple simultaneous ambient sound generating processes that contribute to the measured sound field. Likewise, the slowly varying changes to the sound propagation environment also affect the measurements. All of these factors lead to spectra that are shaped by multiple and sometimes simultaneous processes. The resulting spectra can be smoothly varying and may be difficult to separate into distinct groups. Nevertheless, this work seeks to answer the question of whether an unsupervised clustering approach produces clusters that are correlated with dominant environmental processes. To this end, the spectral shapes for each of the 2216 acoustic SW CANAPE recordings are sorted using k-means clustering, and the clustered observations are then compared with environmental measurements to infer the dominant sound generation processes associated with each cluster.
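The k-means algorithm is considered one of the simplest and most classical methods for data clustering (Aggarwal and Reddy, 2014). Given a data set $\{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ consisting of N observations, the goal is to partition the data set into K clusters with K < N (Bishop, 2006). The prototype of each cluster, which represents the cluster center, is given by $\boldsymbol{\mu}_k$ with $k = 1, \ldots, K$. The k-means algorithm solves the task of assigning observations to clusters, such that the sum of the square distances of each observation to its closest prototype is minimized.
Following Bishop (2006), this task is accomplished through the minimization of an objective function,
$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \lVert \mathbf{x}_n - \boldsymbol{\mu}_k \rVert^2, \tag{1}$$
where $\lVert \mathbf{x}_n - \boldsymbol{\mu}_k \rVert$ is the Euclidean distance and $r_{nk}$ describes the assignment of the nth observation to the kth cluster according to
$$r_{nk} = \begin{cases} 1 & \text{if } k = \arg\min_j \lVert \mathbf{x}_n - \boldsymbol{\mu}_j \rVert^2, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
The cluster prototypes are found by
$$\boldsymbol{\mu}_k = \frac{\sum_{n} r_{nk} \, \mathbf{x}_n}{\sum_{n} r_{nk}}. \tag{3}$$
Hence, $\boldsymbol{\mu}_k$ is equal to the mean of the observations assigned to cluster k.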
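The alternating assignment and update steps described above can be illustrated with a short sketch. This is a minimal version of Lloyd's algorithm written for illustration only; the function name `kmeans`, the random initialization from the observations, and the stopping rule are assumptions, not details taken from the analysis in this paper.

```python
import numpy as np


def kmeans(x, n_clusters, n_iter=100, seed=0):
    """Minimal Lloyd's-algorithm sketch of k-means.

    x is an (N, D) array of observations; returns (prototypes, labels, J).
    """
    rng = np.random.default_rng(seed)
    # Initialize prototypes with distinct observations chosen at random
    # (an illustrative choice; other initializations are common).
    mu = x[rng.choice(len(x), size=n_clusters, replace=False)].astype(float)
    labels = np.full(len(x), -1)
    for _ in range(n_iter):
        # Assignment step: each observation goes to its closest prototype.
        d2 = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments unchanged, so the prototypes are too
        labels = new_labels
        # Update step: each prototype becomes the mean of its observations.
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members) > 0:  # keep the old prototype if a cluster empties
                mu[k] = members.mean(axis=0)
    # Objective: sum of squared distances to the assigned prototypes.
    J = ((x - mu[labels]) ** 2).sum()
    return mu, labels, J
```

Because the result depends on the initial prototypes, a call such as `kmeans(x, n_clusters=6, seed=s)` would typically be repeated for several values of `s`, keeping the run with the smallest objective `J`.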
Determining the optimal number of clusters in a data set is a fundamental issue in applying k-means clustering, which requires the user to specify the number of clusters to be generated. The elbow method is a heuristic approach that consists of plotting an optimizing criterion as a function of the number of clusters and picking the elbow of the curve as the optimal number of clusters. In this case, the optimizing criterion is the average distance from the observations within a cluster to the cluster center
SW CANAPE took place over a yearlong period beginning in October 2016 and involved 22 moorings, which included acoustic sources and receiver arrays as well as arrays of oceanographic sensors (Badiey et al., 2019; Ballard et al., 2020; Collins et al., 2019). The SW CANAPE moorings were located on the Chukchi Shelf and upper slope between the 100 and 700 m isobaths. The location of the central mooring, known as the persistent acoustic observation system (PECOS), is shown in Fig. 1.
SW CANAPE was conducted in concert with the Canada Basin Acoustic Propagation Experiment (CANAPE), which took place in the Canada Basin. The principal components of CANAPE were six transceiver moorings and a distributed vertical line array (DVLA) that were deployed for a yearlong period beginning in September 2016 (Ballard et al., 2020). This analysis uses ice draft measurements collected at the location of the nearest CANAPE mooring to PECOS, deployed by Scripps Institution of Oceanography (SIO) and denoted as SIO 4 in Fig. 1. The distance from SIO 4 to PECOS was 238 km.
This analysis also utilizes atmospheric measurements, including air temperature and wind speed, obtained from the National Weather Service Barrow Weather Station (BWS). The location of BWS, approximately 175 km from PECOS, is also shown in Fig. 1.
A. Acoustic measurements
The acoustic data were recorded by PECOS located on the 150 m isobath on the Chukchi Shelf at 72.7105° N, 159.0100° W (Ballard et al., 2020; Sagers and Ballard, 2018a). PECOS has an atomic clock time base and 8192 Hz sample rate. For SW CANAPE, PECOS was configured as an L-shaped array, and the measurements analyzed in this paper were recorded on the vertical line array (VLA), which consisted of 18 hydrophones equally spaced 5.8 m apart with the deepest hydrophone 3.5 m above the seafloor. Between October 20, 2016, and October 23, 2017, PECOS collected 24-min-long recordings every 4 h. In addition to recording ambient sound, PECOS also recorded signals broadcast by moored sources on the Chukchi Shelf (Collins et al., 2019) and in the Canada Basin (Ballard et al., 2020). To prevent these signals from influencing the results, this analysis uses portions of the recording periods that did not include broadband signals from the moored SW CANAPE sources (700 Hz–4 kHz) and applies a frequency domain filter to the time periods occupied by receptions of the deep-water CANAPE signals (200–300 Hz). Additionally, the PECOS recordings were contaminated by electronic noise that resulted from corrosion of the submersible connectors between the individual hydrophones and the array cable. The electronic noise events were broadband and short duration (generally <50 ms); they were removed from the data using an algorithm tuned to detect their unique spectral signature.
For each of the 2216 recordings, the power spectral density was computed using 32-ms Fourier transforms from the portion of each recording that did not include signals from moored sources or electronic noise. To reduce the depth-dependency of the acoustic field and to fill data gaps resulting from the removal of electronic noise, the median level over the hydrophones on the VLA is considered. Spectral statistics of selected percentile levels for each of the 2216 SW CANAPE recordings are shown in Fig. 2. The black bar above the plots indicates the time period when sea ice concentration was greater than 70%, which roughly marks the ice-covered season. The spectral levels are consistently higher during the open water periods: the median 25th percentile level at 1 kHz is 43 dB during the open water period compared to 22 dB when ice cover is present. This is a well-known phenomenon, resulting both from the insulating effect of sea ice between the ocean and atmosphere, which decreases the amount of wind-generated sound that enters the ocean, and the increased propagation loss due to absorption and scattering of sound by the ice canopy (Carey and Evans, 2011).
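The processing chain described above (short Fourier transforms, a percentile over time, and a median over hydrophones) can be sketched as follows. This is an illustrative sketch only: the function name `percentile_spectrum`, the Hann window, the non-overlapping segments, and the order of operations (percentile per hydrophone first, then the median across hydrophones) are assumptions, and the removal of moored-source signals and electronic noise is not included.

```python
import numpy as np


def percentile_spectrum(p, fs, nfft, q=25.0):
    """Return (freqs, level_db) for a (channels, samples) pressure array.

    Assumes p is already calibrated (e.g., in uPa) and free of
    contaminated segments; both are outside the scope of this sketch.
    """
    n_ch, n_samp = p.shape
    n_seg = n_samp // nfft
    # Split each channel into non-overlapping segments of nfft samples.
    seg = p[:, : n_seg * nfft].reshape(n_ch, n_seg, nfft)
    win = np.hanning(nfft)
    spec = np.fft.rfft(seg * win, axis=2)
    # One-sided power spectral density estimate for each segment
    # (the DC and Nyquist bins are not treated specially here).
    psd = 2.0 * np.abs(spec) ** 2 / (fs * np.sum(win**2))
    psd_db = 10.0 * np.log10(psd + np.finfo(float).tiny)
    # Percentile over time for each hydrophone, then median over hydrophones.
    level_db = np.median(np.percentile(psd_db, q, axis=1), axis=0)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, level_db
```

With the 8192 Hz sample rate, `nfft=256` corresponds to segments of about 31 ms, close to the transform length quoted above. Passing `q=50.0` or `q=75.0` gives other percentile levels from the same spectra.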
Additional observations can also be noted but are not straightforward to explain from the yearlong time series alone. For example, a decrease in high frequency sound occurs during the ice-covered season and displays minimum values in April and May [see annotations in Fig. 2(a)]. There are also differing spectral shapes, including some with more low frequency energy [see annotations in Fig. 2(b)] and some with more high frequency energy [see annotations in Fig. 2(c)]. There is also high variability in the sound level from one recording to the next. In Sec. IV, k-means clustering is applied to sort the data according to their spectral shape and to understand each cluster in the context of the environmental measurements.
B. Environmental measurements
1. Sea ice measurements
Sea ice thickness was measured by an upward looking sonar (ULS) on SIO 4. The ULS sampled every 2 s with a beam footprint of approximately 2 m (Krishfield et al., 2014). The ULS measurements are missing for periods when currents on the mooring line pulled the ULS away from the ice canopy, resulting in data gaps in November and April. A gradual increase in ice thickness is observed between November and April, at which time the ice reaches its maximum daily mean thickness. At this time, the thickness of the undeformed sea ice was approximately 1.3 m, calculated from the median value of the ice draft. During this period, ice keels with drafts greater than 15 m were routinely observed in the measurements. At the beginning of July, the ice starts to melt, and the mean ice draft rapidly decreases.
The growth of the thickness of the ice cover, which takes place over several months, starts at the higher latitudes first. Similarly, the ice breakup occurs at lower latitudes first but takes place over a shorter time period. Hence, there is a small offset in the timing of the ice thickness measurements due to the difference in the location of the ice draft and acoustic measurements. In this work, the ULS measurements are complemented by measurements of sea ice concentration at the PECOS site, which were determined from satellite data (Fetterer et al., 2015). Sea ice concentration describes the relative amount of area covered by ice but does not provide an indication of sea ice thickness or roughness.
2. Atmospheric measurements
The atmospheric measurements were also collected a significant distance away from the acoustic measurements. Since the air temperature measurements are intended to mark the seasonal changes in the region that influence the timing of sea ice freeze up and melt, the physical separation of the measurement locations is not expected to significantly affect the results. Conversely, the ambient sound levels can be strongly affected by wind events that can last for periods of hours to days and are more spatially localized. For this reason, the spatial dependence of wind speed is considered by comparing measurements from BWS to a wind speed calculated by the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR) 40-year global reanalysis (Kalnay et al., 1996). The data are provided on a 1.9° grid with a temporal resolution of 6 h.
A comparison of wind speed measurements from BWS to data from the NCEP/NCAR reanalysis is shown in Fig. 3. The BWS measurements are biased toward higher values with a median wind speed of 5.0 m/s over the yearlong measurement period compared to the NCEP/NCAR reanalysis at the locations of Barrow and PECOS with median wind speeds of 3.5 and 3.6 m/s, respectively. From the yearlong time series of wind speed shown in Fig. 3(a), it is evident that measurements from BWS and the data from the NCEP/NCAR reanalysis both show elevated wind speeds at approximately the same times. However, there are some differences that can be observed from examining shorter time scales. For example, the inset to Fig. 3(a), which shows a week of data from the beginning of October 2017, shows that the NCEP/NCAR reanalysis data indicate a wind event that begins on October 2 persists for almost a day longer at the PECOS site compared to Barrow. It is more difficult to make a direct comparison to the BWS measurement due to the amplitude bias in the wind speed measurements.
To gain additional insight into which data set is best suited for understanding the acoustic measurements from SW CANAPE, the 25th percentile level at 1 kHz from each of the PECOS recordings is plotted as a function of the wind speed, both from the measurements collected at BWS and from the NCEP/NCAR reanalysis at the locations of Barrow and PECOS. For this analysis, only data from the open water period were considered, since the ice cover can mitigate the effects of wind speed on ambient sound. The wind speed data from the NCEP/NCAR reanalysis at the PECOS location provided the strongest dependence of the acoustic sound levels on wind speed [slope of m = 1.48 dB/(m/s)] and the highest correlation coefficient ( ). The BWS measurements provided a similar correlation to the acoustic data [m = 1.43 dB/(m/s), ]. The NCEP/NCAR reanalysis data from the Barrow location showed the weakest dependence of the acoustic sound levels on wind speed [slope of m = 0.99 dB/(m/s)] and a significantly smaller correlation coefficient ( ). The similarity in the correlation to the acoustic levels shared by the measurements from BWS and the NCEP/NCAR reanalysis data at the PECOS location seems to indicate both wind speed data sets have important qualities. The BWS measurements were notably different from the NCEP/NCAR reanalysis data at both locations in that they included more extreme values. The importance of spatial dependence of the wind speed is evidenced by the decreased correlation in the NCEP/NCAR reanalysis at Barrow compared to the PECOS site.
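A slope in dB/(m/s) and a correlation coefficient of the kind quoted above can be obtained from paired wind speed and sound level samples as in the following sketch. The paper does not state how the fit was performed, so the ordinary least-squares line and the Pearson correlation coefficient used here are assumptions, as is the function name `level_vs_wind`.

```python
import numpy as np


def level_vs_wind(wind_speed, level_db):
    """Least-squares slope [dB/(m/s)], intercept [dB], and Pearson rho."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    level_db = np.asarray(level_db, dtype=float)
    # Straight-line fit of sound level against wind speed.
    slope, intercept = np.polyfit(wind_speed, level_db, deg=1)
    # Pearson correlation coefficient between the two series.
    rho = np.corrcoef(wind_speed, level_db)[0, 1]
    return slope, intercept, rho
```

The wind speed series would first need to be sampled at the times of the acoustic recordings, for example with `np.interp`, since the reanalysis data have a 6 h resolution while the recordings were made every 4 h.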
As described in Sec. I, the 25th percentile level is used to characterize the spectral shape of the background sound level. Working with the low percentiles of ambient sound effectively filters out the loud transient events (Kinda et al., 2013), allowing the analysis to focus on the “background” sound level (Roth et al., 2012). Studies of the background sound level are complementary to investigations of transient sound events that use algorithms to detect and classify marine mammal vocalizations (Chou et al., 2020; Hannay et al., 2013) and other sound events (Chen and Schmidt, 2020; Stamoulis and Dyer, 2000). Although the choice of the 25th percentile level is somewhat arbitrary, the results are robust: nearly identical clusters were formed using the 1st, 5th, 10th, 25th, and 50th percentile levels. Differences in the cluster centroids and distribution of the observations were found when the clustering was applied to the 75th and higher percentile levels.
The k-means clustering algorithm was applied to the 25th percentile level in decibels, the data shown in Fig. 2(b). The decibel scale is advantageous as it reduces the variance in the data and causes the clustering to operate on relevant features. An alternative choice, to cluster the data in units of power (μPa²), was ultimately unfruitful as the results were dominated by high amplitude values at low frequencies. Furthermore, since clustering algorithms use distance-based measurements to determine the similarity between data points, an important pre-processing step is to standardize the values of all variables to have a mean of zero. Indeed, when the normalization step was not included in the procedure, the clusters were primarily dependent on sound level and not on the spectral shape. Hence, the data were normalized by removing the mean, which was computed as the average over frequency in decibels.
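The normalization step can be illustrated with a few lines of code. The sketch below assumes the spectra are stored as an array with one row per recording and one column per frequency bin; the function name `normalize_spectra` and the array layout are hypothetical.

```python
import numpy as np


def normalize_spectra(level_db):
    """Remove the mean over frequency (in dB) from each spectrum.

    level_db is an (N, F) array: N recordings by F frequency bins.
    Returns the zero-mean spectral shapes and the removed mean levels.
    """
    level_db = np.asarray(level_db, dtype=float)
    mean_db = level_db.mean(axis=1, keepdims=True)
    return level_db - mean_db, mean_db[:, 0]
```

Two spectra that differ only by a constant offset in decibels, such as `[60, 50, 40]` and `[40, 30, 20]`, map to the same shape `[10, 0, -10]`, so their Euclidean distance becomes zero and the clustering responds to spectral shape rather than overall level.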
As described in Sec. II A, the number of clusters was chosen according to the elbow method. Figure 4 plots the average of the square root of the distance calculated according to Eq. (4). The distance metric was calculated 20 times for each value of K, and the error bars represent the standard deviation of the distances calculated for the iterations. For data that form well-separated clusters, there should be a distinct elbow in the curve that occurs at the point where the optimizing criterion has stopped rapidly decreasing and after which it reaches a plateau. As shown in Fig. 4, for the ambient sound data considered in this work, it is difficult to unambiguously identify the elbow. The lack of a distinct elbow may be attributed to the continuous nature of the sound generation processes, the simultaneous influence of multiple sound generating processes, and the slowly varying sound generation processes and propagation conditions. Nevertheless, the beginning of the plateau, marking the end point of the optimizing criterion's rapid decrease, was identified as K = 6.
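The elbow procedure with repeated runs can be sketched as below. The criterion used here, the mean Euclidean distance from each observation to its assigned prototype, is an assumption standing in for the paper's distance metric, and the compact k-means routine, the function names, and the random initialization are likewise illustrative rather than the implementation used for Fig. 4.

```python
import numpy as np


def mean_distance_after_kmeans(x, n_clusters, rng, n_iter=100):
    """Run one k-means fit; return the mean observation-to-prototype distance."""
    # Random initialization from the observations (illustrative choice).
    mu = x[rng.choice(len(x), size=n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        d = np.linalg.norm(x[:, None, :] - mu[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Each prototype becomes the mean of its members (kept if empty).
        new_mu = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_mu, mu):
            break
        mu = new_mu
    d = np.linalg.norm(x[:, None, :] - mu[None, :, :], axis=2)
    return d.min(axis=1).mean()


def elbow_curve(x, k_values, n_runs=20, seed=0):
    """Mean and standard deviation of the criterion over repeated runs."""
    rng = np.random.default_rng(seed)
    runs = np.array([
        [mean_distance_after_kmeans(x, k, rng) for _ in range(n_runs)]
        for k in k_values
    ])
    return runs.mean(axis=1), runs.std(axis=1)
```

Plotting the returned mean against `k_values`, with the standard deviation as error bars, gives a curve of the type described above. For data without well-separated groups the curve tends to decrease smoothly, which is why the choice of the elbow can be ambiguous.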
The cluster prototypes and their characteristics are depicted in Fig. 5. The clusters are organized according to the number of observations they contain as shown in Fig. 5(a). The mean spectral shapes, which serve as the prototype of each cluster, are shown in Fig. 5(b). Cluster 1 has the flattest spectral shape while the other clusters all have spectral shapes that contain more low frequency content. Additionally, the prototype for cluster 6 has a spectral minimum near 500 Hz.
Table I lists the mean distance between cluster prototypes and the mean distance from the observations within each cluster to the cluster prototype, which are measures of the cluster separations and spreads, respectively. The data from Table I are graphically represented in Fig. 6. In the figure, one of the clusters is centered in each plot, with the other clusters offset from the center according to the mean distance between cluster prototypes as listed in the first six rows of Table I. The solid-line circle around each cluster is the mean distance from observations within the cluster to its respective prototype as listed in the bottom row of Table I. The dashed-line circles are the mean plus or minus one standard deviation of the distance from observations within the cluster to its respective prototype.
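TABLE I. Mean distance between cluster prototypes and mean distance from observations to prototypes, by cluster.
| Cluster | 1 | 2 | 3 | 4 | 5 | 6 |
| Mean distance between cluster prototypes (dB) |
| Mean distance from observations to prototypes (dB) |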
As shown in Fig. 6(a), cluster 1 has the smallest spread, as indicated by the smallest circles surrounding its position at the origin. A second observation is that cluster 1 has the most overlap with cluster 2 as the outer dashed-line circles (representing the mean plus one standard deviation of the distance from observations within the cluster to its respective prototype) overlay one another. Making a comparison of all the plots in Fig. 6, it is evident that cluster 2 has the most overlap with the other clusters: the outer dashed-line circles (representing the mean plus one standard deviation) of cluster 2 overlap those of every other cluster except cluster 6, and the solid-line circles (representing the mean distance) of cluster 2 and cluster 4 overlap one another. Cluster 6 is the most well-separated cluster, with none of its circles overlapping those of any other cluster.
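The separation and spread measures discussed above, and the circle-overlap comparison used to read Fig. 6, can be sketched as follows. Plain Euclidean distances are assumed here; the exact scaling behind the values in Table I is not stated in the text, and the function names are hypothetical. Every cluster is assumed to contain at least one observation.

```python
import numpy as np


def separation_and_spread(x, labels, prototypes):
    """Return (prototype distance matrix, mean spread, std of spread)."""
    diff = prototypes[:, None, :] - prototypes[None, :, :]
    separation = np.linalg.norm(diff, axis=2)  # (K, K), zeros on the diagonal
    # Distance from every observation to the prototype of its own cluster.
    d_own = np.linalg.norm(x - prototypes[labels], axis=1)
    n_clusters = len(prototypes)
    spread_mean = np.array([d_own[labels == k].mean() for k in range(n_clusters)])
    spread_std = np.array([d_own[labels == k].std() for k in range(n_clusters)])
    return separation, spread_mean, spread_std


def circles_overlap(separation, radius):
    """True where circles of the given radii, centered on prototypes, intersect."""
    return separation < (radius[:, None] + radius[None, :])
```

Calling `circles_overlap(separation, spread_mean)` corresponds to comparing the solid-line circles, while `circles_overlap(separation, spread_mean + spread_std)` corresponds to comparing the outer dashed-line circles. The diagonal of the result is trivially true and can be ignored.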
B. Spectral exponent
The spectral exponent n characterizes the exponential decay with frequency of the ambient sound (Buckingham and Chen, 1988; Kinda et al., 2013; Milne, 1972; Sagen, 1998). The spectral exponent was calculated as the best exponential fit to the 25th percentile level data for frequencies between 50 and 500 Hz. These frequencies were chosen since the data have a relatively constant slope within these bounds. Histograms of the spectral exponent for each cluster are shown in Fig. 7. Although the spectral exponent was not explicitly included as a parameter over which clustering was performed, the information is intrinsically included in the spectral shapes. Hence, the uniqueness of the distributions observed in Fig. 7 is not unexpected.
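One common way to estimate a spectral exponent is sketched below. It assumes the conventional power-law form, in which the spectral density falls off as f^(-n), so that the level in decibels is a straight line in log10(f) with slope -10n. This functional form, the least-squares fit, and the function name `spectral_exponent` are assumptions and may differ in detail from the fit used for Fig. 7.

```python
import numpy as np


def spectral_exponent(freqs, level_db, f_lo=50.0, f_hi=500.0):
    """Estimate n assuming the spectral density falls off as f**(-n)."""
    freqs = np.asarray(freqs, dtype=float)
    level_db = np.asarray(level_db, dtype=float)
    # Restrict the fit to the band where the slope is roughly constant.
    band = (freqs >= f_lo) & (freqs <= f_hi)
    # In decibels the power law is a straight line in log10(f):
    # level_db = c - 10 * n * log10(f).
    slope, _ = np.polyfit(np.log10(freqs[band]), level_db[band], deg=1)
    return -slope / 10.0
```

Because the fit depends only on the slope, the result is the same whether it is applied to the measured levels or to the mean-removed spectral shapes.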
Cluster 1 is characterized by the smallest spectral exponents, while clusters 3, 4, and 6 all have mean spectral exponents greater than 2.5. Sagen (1998) calculated spectral exponents from sonobuoy data near the MIZ, and the analysis included ambient sound data from open water, diffuse ice, compact ice, and interior ice pack conditions. The results generally showed the smallest spectral exponents were associated with open water conditions, while larger exponents were found for compact ice and interior ice pack conditions.
C. Sound level and wind speed
The sound level calculated from the 25th percentile level at 1 kHz is chosen in this analysis since contributions from wind-generated sound peak in this frequency range (Wenz, 1962). The histograms of the sound level data are shown in Figs. 8(a)–8(f) for each cluster. Since the data were normalized before clustering, information about sound level was not used to create the clusters. Investigating the sound level distribution for each cluster may provide insight about the environmental process(es) that is the primary contributor to each cluster. Cluster 1 contains the loudest observations, with an average 25th percentile sound level of 58 dB at 1 kHz. Clusters 2, 5, and 6 all have moderately high 25th percentile sound levels at 1 kHz with an average close to 45 dB. Clusters 3 and 4 are made up of the quietest observations with average 25th percentile sound level at 1 kHz of 32 and 34 dB, respectively.
The 25th percentile level at 1 kHz is compared to wind speed measurements from BWS in Figs. 8(g)–8(l) and wind speed data from the NCEP/NCAR reanalysis at the PECOS location in Figs. 8(m)–8(r). Cluster 6 has the strongest relationship to wind speed with a slope of 1.20 dB/(m/s) (ρ = 0.67) using measurements from BWS and a slope of 0.88 dB/(m/s) (ρ = 0.40) using data from the NCEP/NCAR reanalysis. The higher correlation coefficient associated with the BWS measurements suggests the high amplitude sound events are better modeled by the more extreme values of wind speed contained within the BWS data set. Cluster 1 has the next strongest correlation with wind speed, with both the BWS measurement and the NCEP/NCAR reanalysis data providing nearly equivalent relationships, with slopes of 0.83 dB/(m/s) (ρ = 0.42) and 0.85 dB/(m/s) (ρ = 0.46), respectively. The remaining clusters all have weaker relations between sound level and wind speed with smaller correlation coefficients. The data suggest that despite having widely different sound level distributions, the environmental process(es) that strongly contributes to clusters 1 and 6 is well correlated to wind speed.
D. Temporal occurrence
The temporal occurrence of the clusters over the yearlong SW CANAPE is compared with the presence and thickness of the ice cover, the air temperature, and the wind speed in Fig. 9. The wind speed data, including measurements from BWS and data from the NCEP/NCAR reanalysis at the location of PECOS, have been low-pass filtered to observe sustained wind forcing on the annual scale. From Fig. 9(e), it is evident that cluster 1 prevails during the open water season. The correlation of its sound level with wind speed in the absence of ice cover is an indication that the primary environmental process contributing to this cluster is wind-generated sound. Intermittent observations of cluster 1 during the ice-covered season are hypothesized to be caused by wind-generated sound entering the undersea environment through nearby open leads in the ice cover. Around June 15, 2017, the air temperature surpassed 0 °C, the sea ice concentration dropped from 100% to 70%, and an increasing rate of occurrences of cluster 1 is observed. An example showing 10 s of data classified as the cluster 1 spectral type is shown in Fig. 10(a). Cluster 1 has the shallowest spectral exponent (see Fig. 7), consistent with other observations of ambient sound measurements in open water (Miksis-Olds et al., 2013; Roth et al., 2012; Sagen, 1998; Sagers and Ballard, 2018a).
Cluster 2 observations occur most frequently during the ice formation and ice melt periods that are associated with the MIZ. During the ice formation period, the most observations of cluster 2 occur during the onset of the ice-covered season in mid-November and during a period of high sustained wind centered around January 1. During the melt period, cluster 2 serves as a transition between cluster 3 (the dominant spectral shape in April and May) and cluster 1 (the dominant spectral shape in August and September). Examination of the spectrograms within this cluster reveals they contain ice sounds (knocking, fracturing, shearing) as well as biological sounds (numerous echolocation clicks). Ice-generated sound in the MIZ is strongly influenced by ocean swell as well as by mesoscale ice-edge eddies (Johannessen et al., 2003). The average sound level for cluster 2 is less than that of cluster 1 (see Fig. 8), possibly because the ice-generated sound is spatially distributed and, therefore, experiences more propagation loss than the wind-generated sound whose source is located directly over the array. Additionally, the sound level of short duration ice events may not be well-characterized by the 25th percentile level. Nevertheless, cluster 2 has the second highest mean level, and its distribution has a long tail biased toward loud events. The spectral shape of cluster 2 recordings is characterized by an intermediate spectral exponent. An example of a broadband ice event from a recording from cluster 2 is shown in Fig. 10(b). Although inspection of spectrograms from cluster 2 recordings suggests that the predominant sound generation process is associated with broadband impulsive events, cluster 2 is centrally located in the cluster space (see Fig. 6), and its cluster prototype has the shortest distance to several other clusters (see Table I). Hence, observations with greater distance from the cluster 2 prototype are also likely influenced by sound generation processes associated with neighboring clusters.
Clusters 3 and 4 both occur primarily during the ice-covered season, and together these clusters contain the quietest measurements. These clusters also have the largest spectral exponents (see Fig. 7), an observation consistent with previous measurements under pack ice (Sagen, 1998). Compared to cluster 3, the cluster 4 prototype contains more low frequency sound. The relative differences in the spectral content of the clusters can also be observed in the spectrograms shown in Figs. 10(c) and 10(d), which show data from consecutive recordings (measurements collected 4 h apart). Cluster 3 occurs most often later in the ice-covered season, and it is the dominant spectral shape when the sea ice is thickest. Its occurrence tapers off in the spring after the air temperature surpasses 0 °C. Cluster 4 dominates the early part of the ice-covered season when the ice cover is thinnest. Since thin ice is more mobile, the soundscape can be expected to contain more contributions from sound generated by ice knocking and wind. This interpretation is reinforced by the relatively short distance between the cluster 4 and cluster 2 prototypes (see Table I). The observations associated with clusters 3 and 4 during the open water season were made during periods of low wind (mean wind speed of 2.9 ± 1.5 m/s), and these observations also have lower spectral exponents (1.8 ± 0.2), more consistent with open water periods.
Cluster 5 is observed throughout the year. Examining selected spectrograms from within this cluster suggests that the ambient sound is made up of seemingly continuous marine mammal vocalizations. An example of one such spectrogram is shown in Fig. 10(e). Many marine mammals use the Chukchi Sea as seasonal habitat, with bowhead and beluga whale vocalizations mainly coinciding with the spring and fall migration events, walrus vocalizations occurring mainly in the summer months, and bearded seal vocalizations contributing to the soundscape between January and July (Hannay et al., 2013). The spectral shape of cluster 5 data is characterized by an intermediate spectral exponent, and the 1 kHz sound level was moderate and poorly correlated with wind speed. However, noting that the cluster 5 prototype is nearest to that of cluster 4 and considering the spread of observations within cluster 5 (see Table I and Fig. 6), it is likely that the sound generation process described for cluster 4 (sound generated by wind and ice movement) also applies to some observations from cluster 5.
Cluster 6 is observed exclusively during the ice-covered season. It is the cluster with the smallest number of observations, and it has a unique spectral shape characterized by a spectral minimum at 500 Hz [see Fig. 5(b)]. An example spectrogram of cluster 6 data is shown in Fig. 10(f). Of all the clusters, the 25th percentile sound level at 1 kHz for cluster 6 showed the strongest correlation with wind speed [see Fig. 8(f)]. Comparing the wind data from Fig. 9(d) to the occurrences of cluster 6 in Fig. 9(j) shows that periods of sustained wind are correlated with the observations of cluster 6. These factors lead to the conclusion that the sound generation process for cluster 6 is related to wind-generated sound entering the ocean through leads in the ice cover near the acoustic observations. It is hypothesized that periods of sustained wind cause leads in the ice to form. Similar spectral shapes were observed in ambient sound recordings by Kinda et al. (2015) and labeled as high frequency broadband (HFBB) transients. The HFBB transients were described as quasi-continuous events that generally persisted for the entire 7-min recording period prescribed for the experiment. Kinda et al. (2015) attributed the source of this sound to wind effects on the frazil ice that forms in open leads. After mid-April, cluster 6 is rarely observed, possibly because the pack ice has frozen more densely by this late in the season and is no longer broken up by the wind.
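The correlation between the 1 kHz level and wind speed that supports this attribution can be illustrated with a short calculation. The sketch below is not the authors' processing code: it assumes a Pearson correlation coefficient computed against the base-10 logarithm of wind speed, a common convention for wind-generated ocean sound that is not confirmed by this paper, and the function names `pearson_r` and `level_wind_correlation` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def level_wind_correlation(levels_db, wind_speeds_ms):
    """Correlate sound level (dB) with log10 of wind speed (m/s).

    Assumption: the level is compared with log10(wind speed); the paper
    does not state which form of the wind speed was used.
    """
    log_wind = [math.log10(u) for u in wind_speeds_ms]
    return pearson_r(levels_db, log_wind)
```

Applied separately to the observations assigned to each cluster, a calculation of this kind gives one coefficient per cluster, which is the form of comparison that singles out cluster 6.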
The application of k-means clustering to ambient sound data from SW CANAPE revealed new relationships between ambient sound and environmental forcing and reinforced known connections. For example, cluster 1 was observed most frequently during the open water season, and its source was attributed to wind-generated sound, a result consistent with previous studies. A new observation was related to the timing of the transition from the ice-covered ambient sound to that of the open water condition. It was found to occur when the air temperature surpassed 0 °C, approximately 1 month before the sea ice concentration dropped to 0%. The influence of the MIZ was observed in cluster 2, which roughly coincided with periods of sea ice concentration between 70% and 95%. The cluster 2 spectral shape was observed more often during the melt season, when the sea ice concentration was within this range for an extended period. Furthermore, the spectral shape of the ambient sound was shown to evolve over the ice-covered season, transitioning from cluster 4 to cluster 3 and shifting to contain less low frequency content later in the season. Cluster 5 was influenced by the multitude of marine mammal vocalizations present in the data set. Finally, cluster 6 showed the strongest correlation with wind speed, and its source was attributed to wind-generated sound associated with frazil ice formation in open leads during the ice-covered season.
K-means clustering is one of the simplest unsupervised machine learning algorithms. Its ease of application makes it an attractive method for the statistical analysis of ambient sound models and databases. Specifically, the spectral clustering approach established in this study could be used to form a basis for empirical ambient sound modeling. These results can be used to predict the most likely spectral shape as well as the expected distribution of the background ambient sound level for a particular time of year. This approach has advantages over reporting monthly statistics, which cannot account for forcing events with shorter time scales, such as wind forcing or ice cover retreat. Furthermore, possible correlations with environmental observations provide insight into the sound generation processes and the propagation environment.
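As a minimal sketch of how cluster assignments might feed such an empirical model, the example below tabulates, for each month, the most frequently assigned cluster, its fraction of that month's observations, and the median level of those observations. The record layout `(month, cluster_label, level_db)` and the function name `monthly_cluster_summary` are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter, defaultdict
from statistics import median


def monthly_cluster_summary(records):
    """Summarize cluster occurrence by month.

    records: iterable of (month, cluster_label, level_db) tuples, where
    level_db is a background sound level (hypothetical layout).
    Returns {month: {"most_likely_cluster", "occurrence_fraction",
    "median_level_db"}}.
    """
    counts = defaultdict(Counter)   # month -> Counter of cluster labels
    levels = defaultdict(list)      # (month, label) -> list of levels
    for month, label, level_db in records:
        counts[month][label] += 1
        levels[(month, label)].append(level_db)

    summary = {}
    for month, counter in counts.items():
        # Ties are resolved in favor of the label encountered first.
        label, n_obs = counter.most_common(1)[0]
        summary[month] = {
            "most_likely_cluster": label,
            "occurrence_fraction": n_obs / sum(counter.values()),
            "median_level_db": median(levels[(month, label)]),
        }
    return summary
```

A fuller model would retain the whole occurrence distribution and level percentiles per cluster rather than a single most likely label, so that months shared between several spectral shapes are represented.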
Additionally, the results of this study could be applied to infer the occurrence of sound events and monitor environmental processes. Similar to using acoustic methods to monitor wind speed and rainfall in remote parts of the temperate oceans (Nystuen, 2001) or to sense melting glacier ice near marine-terminating glaciers (Pettit et al., 2012), the techniques described in this report can be applied to monitor seasonal changes in the cryosphere. For example, occurrences of cluster 2 were correlated with ice processes, and occurrences of cluster 6 were associated with wind effects on frazil ice forming in open leads in the ice cover. Cluster distances and spreads can be examined to gain further insight into the accuracy of assigning a sound generation process to a particular observation. Applying this approach to a longer, multi-year data set could reveal interannual changes in the soundscape and the undersea environment.
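One way to use cluster distances when judging an individual assignment is sketched below. It assumes Euclidean distance between an observed spectrum and each cluster prototype, and it introduces an illustrative ambiguity ratio (distance to the nearest prototype divided by distance to the second nearest) that is not a quantity defined in this paper; the function name `assign_with_ambiguity` is likewise hypothetical.

```python
import math


def assign_with_ambiguity(spectrum, prototypes):
    """Assign a spectrum to its nearest prototype and rate the ambiguity.

    spectrum: sequence of spectral levels.
    prototypes: dict mapping cluster label -> prototype spectrum of the
    same length (at least two prototypes are required).
    Returns (label, distance, ratio), where ratio = d_nearest / d_second.
    A ratio near 0 indicates a clear assignment; a ratio near 1 indicates
    an observation lying between two clusters.
    """
    if len(prototypes) < 2:
        raise ValueError("at least two prototypes are required")
    ranked = sorted(
        (math.dist(spectrum, proto), label)
        for label, proto in prototypes.items()
    )
    (d_nearest, label), (d_second, _) = ranked[0], ranked[1]
    # Two coincident prototypes give d_second == 0; treat as fully ambiguous.
    ratio = d_nearest / d_second if d_second > 0 else 1.0
    return label, d_nearest, ratio
```

Observations with a ratio close to 1 are the ones most likely to be influenced by the sound generation process of a neighboring cluster, so a process attribution for them should be treated with more caution.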
This work was supported by the Office of Naval Research under Grant Nos. N00014-15-1-2144, N00014-18-1-2401, N00014-19-1-2721, and N00024-17-D-6421. NCEP reanalysis data were provided by the National Oceanic and Atmospheric Administration (NOAA)/Office of Oceanic and Atmospheric Research (OAR)/Earth System Research Laboratory (ESRL) Physical Sciences Laboratory (PSL) (Boulder, CO) from their website (NOAA Physical Sciences Laboratory, 2014).