Minke whales were acoustically detected, localized, and tracked on the U.S. Navy's Pacific Missile Range Facility from 2012 to 2017. Animal source levels (SLs) were estimated by adding transmission loss estimates to measured received levels of 42 159 individual minke whale boings. Minke whales off Hawaii exhibited the Lombard effect in that they increased their boing call intensity in increased background noise. Minke whales also decreased the variance of the boing call SL in higher background noise levels. Although the whales partially compensated for increasing background noise, they were unable or unwilling to increase their SLs by the same amount as the background noise. As oceans become louder, this reduction in communication space could negatively impact the health of minke whale populations. The findings in this study also have important implications for acoustic animal density studies, which may use SL to estimate probability of detection.
I. INTRODUCTION
North Pacific minke whales (Balaenoptera acutorostrata) produce boing calls. These calls were first described by Wenz (1964), and later by Thompson and Friedl (1982), but it took almost 40 years before Rankin and Barlow (2005) were able to localize the boing calls to a minke whale source. These calls have been recorded between the southwest coast of North America and Hawaii (Rankin and Barlow, 2005; Wenz, 1964), off Hawaii (Martin et al., 2013; Oswald et al., 2011; Rankin and Barlow, 2005; Thompson and Friedl, 1982; Wenz, 1964), in the Southern California Bight (Kerosky et al., 2013), in the Chukchi Sea (Delarue et al., 2012), and near the Mariana Trench (Nieukirk et al., 2016; Norris et al., 2017). Boing calls from the central North Pacific region have a peak frequency of about 1.4 kHz (Oswald et al., 2011; Thompson and Friedl, 1982) and a duration ranging from 1 to 4.5 s (Oswald et al., 2011; Rankin and Barlow, 2005; Thompson and Friedl, 1982; Wenz, 1964). The call consists of an initial pulse followed by an amplitude- and frequency-modulated component that decreases in amplitude over its duration (Oswald et al., 2011; Wenz, 1964). Rankin and Barlow (2005) noticed that boing calls had different characteristics depending on where they were recorded in the region between the southwest coast of North America and Hawaii. Central boings, recorded off Hawaii and other areas west of 135°W, had a pulse repetition rate in the amplitude-modulated component of 114–118 pulses per second, while eastern boings, recorded east of 138°W, had a pulse repetition rate of 91–93 pulses per second (Rankin and Barlow, 2005). In addition to the pulse repetition rates being slower, the average duration was longer in the eastern boing than the central boing (Rankin and Barlow, 2005; Wenz, 1964). Thompson and Friedl (1982) observed that the inter-call intervals of boings seemed to change based on whether another minke whale was producing boings in the area. 
They reported a median inter-call interval of 6 min for individual boing series and 30 s when two separate boing series were recorded at the same time (Thompson and Friedl, 1982). Insufficient evidence exists to conclude the purpose of minke whale boing calls, but long-duration recordings can show patterns over time and environmental conditions, which may help to improve understanding of the function of the call and the subset of the population that is vocally active.
Although it is known that minke whales have a wide distribution across the North Pacific, much is still unknown about the minke whale population structure in the eastern North Pacific. Like most mysticetes, minke whales are thought to annually migrate between higher latitude feeding areas and lower latitude wintering areas. In a photo-identification study, almost all minke whales photographed off Vancouver Island and Central British Columbia had scars from tropical and subtropical cookiecutter sharks, and minke whales photographed in multiple years gained new scars in between years (Towers et al., 2013). Minke whales have only been sighted off Vancouver and British Columbia from spring until fall (Towers et al., 2013), and minke whale boing calls have been recorded off Hawaii from fall to spring (Oswald et al., 2011; Thompson and Friedl, 1982). Minke whale boing calls detected in the northeastern Chukchi Sea in fall 2009 fit the criteria of central boings, but those detected in fall 2011 were closer to the description of eastern boings (Delarue et al., 2012). Although it is currently unknown where the population of minke whales that winters off Hawaii spends the rest of the year, these indications support the hypothesis that they may travel north to more temperate or polar waters and might feed in the Chukchi Sea in the summer.
Much is unknown about the minke whale population size, especially the number of individuals that spend the winter in Hawaii, although past studies have used both visual and acoustic methods to estimate their abundance. Since minke whales are small baleen whales, usually seen alone, and produce inconspicuous blows and other surface cues, they are difficult to sight during visual surveys (Zerbini et al., 2006). No total abundance estimate exists for minke whales in Hawaiian waters since they are rarely seen during summer and fall visual surveys (Carretta et al., 2014). Martin et al. (2015) estimated minimum density values of 3.64 and 2.77 acoustically active minke whales per hour in a 3780 km² search area around the U.S. Navy's Pacific Missile Range Facility (PMRF) before Navy training exercises in February 2011 and 2012, respectively. The proportion of calling animals and the overall calling rate are needed before these minimum density estimates can be converted into animal abundance for the region. In addition, the source level (SL) is necessary to understand the probability of detection in varying noise conditions. These values are difficult to obtain and may change with time, location, season, age, sex, and behavioral state. Visual surveys have been used to estimate seasonal abundance of minke whales in other parts of the eastern North Pacific. Abundance estimates have been calculated off the west coast of the continental United States (Barlow, 2016), British Columbia (Williams and Thomas, 2007), the Alaskan Peninsula and Aleutian Islands (Zerbini et al., 2006), and in the Bering Sea (Friday et al., 2013). Minke whales have also been sighted as far north as the eastern Chukchi Sea in the summer (Clarke et al., 2013; Clarke et al., 2017a,b), but sightings were not converted into abundance in those surveys.
Most of these abundance estimates are minimum estimates of minke whale abundance since many of the analyses used an assumption that all whales along the trackline were detected (e.g., Zerbini et al., 2006). In addition, the probability of visually detecting a minke whale is small so the confidence intervals of these abundance estimates are large.
Most marine mammals, including minke whales, have evolved to rely on acoustic calls as their primary form of communication since sound travels efficiently underwater. However, shipping, drilling, military exercises, and other anthropogenic activities have increased the background noise level (NL) in the ocean, which reduces communication range for marine mammals unless they can adapt. Many animals from diverse taxa compensate for noise by increasing the amplitude of their vocalizations (Brumm and Zollinger, 2011). This phenomenon was first described in humans in 1911 by Etienne Lombard and later termed the Lombard effect (Brumm and Zollinger, 2011; Lombard, 1911). This increase in amplitude is sometimes also accompanied by increases in frequency, word or call length, and changes in calling rate (Brumm and Zollinger, 2011). Many experiments have shown evidence for the Lombard effect in birds. For example, the budgerigar or parakeet (Melopsittacus undulatus) increased its call intensity as white background noise within the bandwidth of its calls increased (Manabe et al., 1998). Bats change their echolocation behavior in response to noise. Greater horseshoe bats (Rhinolophus ferrumequinum) increased their echolocation amplitude when the noise bandwidth overlapped with their call bandwidth and increased their call frequency in most tested noise conditions (Hage et al., 2013). Some frogs also show evidence for the Lombard effect. Male túngara frogs (Physalaemus pustulosus) increased their call amplitude, calling rate, and call complexity in tank experiments when noise overlapped with their call frequency (Halfwerk et al., 2016). Several species of marine mammals also seem to change their vocalizations in response to noise. Bottlenose dolphins (Tursiops truncatus) off Florida increased the apparent output level of their whistles in increased background noise (Kragh et al., 2019). 
Both whistle type and NL seemed to influence the output level of the whistles (Kragh et al., 2019). Southern resident killer whales (Orcinus orca) near the San Juan Islands in Washington increased call duration and the SL of calls as boat noise increased in both short- and long-term studies (Foote et al., 2004; Holt et al., 2009). As NLs increased, right whales, especially North Atlantic right whales (Eubalaena glacialis), increased the amplitude and frequency of their calls and decreased their calling rate over short time periods and increased the frequency and duration of their calls over long time periods (Parks et al., 2007; Parks et al., 2010). Humpback whales (Megaptera novaeangliae) increased the SL of their social calls in increased background noise along their migration route and in a high-latitude feeding area (Dunlop, 2016; Dunlop et al., 2014; Fournet et al., 2018). Physiological limitations may inhibit animals from fully compensating for background noise, resulting in a reduced communication range. In addition, noise has been linked to increased stress in marine mammals (Rolland et al., 2012) and decreased reproductive success in songbirds (Habib et al., 2007).
Rigorous studies of SL for minke whale boing calls have not been published. Thompson and Friedl (1982) estimated that the boing call SL was a minimum of 150 dB re 1 μPa at 1 m based on received calls from a single whale estimated to be a certain minimum distance from a hydrophone recorder. It is unknown whether and how the minke boing call SL varies over time, behavioral state, and environmental conditions. SLs are important, not only for understanding the behavior of an animal, but also for determining the detection range during passive acoustic monitoring for marine mammals and estimating the density or abundance of vocal species. When determining density estimates from acoustic recordings, assumptions are made about the SL of the calls and/or the probability of detecting a call. If SL does change as a function of noise, current methods in acoustic density estimation could be inaccurate. Acoustic density estimation is especially relevant for minke whales since they are a cryptic species and difficult to monitor by traditional visual methods. The purpose of this study was to calculate the SL of minke whale boing calls and compare these SLs over a range of background NLs.
II. METHODS
A. Study area and data description
The U.S. Navy's PMRF is located off the northwest coast of the island of Kauai in the Hawaiian Islands. Since 2002, an array of time-synchronized hydrophones from the PMRF underwater range has been recorded approximately two days a month, in addition to recordings associated with U.S. Navy mid-frequency sonar training events. Over the years, the number of hydrophones in the array and the sampling rate have changed, but from August 2012 to July 2017 the array configuration remained the same, containing 47 broadband hydrophones with a 96 kHz sampling rate. In 2014, opportunistic recordings spanning several weeks were added, recording at a 6 kHz sampling rate. Of the 47 hydrophones, 14 offshore hydrophones, at depths of 3150–4700 m and covering a rectangular-shaped grid approximately 20 km to the east/west and 60 km to the north/south, were selected for localization purposes (Fig. 1). All data recorded at 96 kHz were down-sampled to 6 kHz before processing for consistency.
Five years of data were chosen for this study, spanning August 2012–July 2017. All seasons were included, although minke whale boing calls were only recorded in fall, winter, and spring. Figure 2 shows the recording effort (in h) for each month, along with the number of acoustic localizations of minke whale calls within 1–10 km of the center hydrophones of each array for each month.
Several assumptions are used throughout this study. Both the animal source and receiving hydrophones were treated as omni-directional, and therefore the minke whale sound directivity was assumed to be zero. The hydrophones on PMRF were designed and tested to be omni-directional, so the omni-directionality of the receivers is supported. The SL and NL measurements presented were limited to the 1250–1600 Hz band, which represents the main components of the minke whale boing. The whales were assumed to be near the surface, whereas NLs were recorded on the bottom hydrophones and are therefore only a proxy for the noise experienced by the whale. The sound propagation models used for this study were checked for sensitivity to variability in sound speed profiles (SSPs), sediment type and thickness, and calling depth. Nevertheless, they cannot fully characterize the complexities of transmission loss (TL) and do not account for sea-surface roughness and internal waves, for example. The validity of these assumptions is discussed in Sec. IV.
B. Detection, localization, and tracking of minke whale signals
The process for obtaining whale locations can be divided into three steps: detection and feature extraction, cross correlation of those features to obtain time differences of arrival (TDOAs) of the call at each hydrophone, and TDOA-based localization. These steps are outlined in detail in other publications using calls from humpback whales (Helble et al., 2015) and Bryde's whales (Balaenoptera edeni; Helble et al., 2016), and therefore are only summarized in this paper.
The generalized power-law (GPL) detector (Helble et al., 2012) was used to detect minke whale boings. The GPL detector marked the start and end time of each minke boing and used a spectral “templating” procedure that subtracted the underlying noise at each frequency band, leaving only the spectral contents of the signal. These templates were later used in the cross-correlation process to obtain TDOAs.
The 14 hydrophones used to localize the calling minke whales were divided into 4 subarrays (A,B,C,D), each containing five hydrophones as shown in Fig. 1. The TDOAs were computed between the center hydrophone of each subarray and the nearest four corner hydrophones. A localization was valid if the call was detected on the center hydrophone and any three of the four corner hydrophones. The maximum allowable time delay between the center hydrophone and each adjacent hydrophone in the subarray was limited to the direct path propagation time between them. The subarray configuration was chosen so that a direct path solution on the four hydrophone pairs always existed across the monitored area.
To calculate the location of the minke whales producing the calls, minor modifications were made to the methods outlined in Sec. II of Helble et al. (2015). The minke whale boing call primary frequency components ranged from 1250 to 1600 Hz, so these frequencies were used as the template bounds. As with the localization methods used for Bryde's whales (Helble et al., 2016), single minke whale boing call templates were cross correlated to estimate the TDOA of the call between pairs of hydrophones. Helble et al. (2015) showed that for single tonal humpback calls, the timing delay errors were on the order of 40 ms, resulting in localization standard deviations of less than 60 m. Unlike the humpback tonal calls, the initial pulse of the minke whale boing vocalization creates a prominent peak when cross correlated, so it is expected that the timing delay errors are no worse than those for humpback tonal calls. Localization accuracy is important for modeling the TL between the whale location and the recording hydrophone, and is discussed in Sec. II D.
A semiautomatic tracker, previously described by Klay et al. (2015), was utilized to spatially and temporally associate localized minke whale calls into individual tracks. Localized calls out to 20 km from the center hydrophone were considered for tracks. These calls were recursively examined so that the elapsed time and distance between calls fit reasonable assumptions of minke whale swimming and calling behavior. A minke whale track required a minimum of 12 localized calls and had a maximum of 3 km and 40 min between successive localizations. The number of counted tracks gave a rough estimate of how many individual whale encounters made up the localized calls used in this study. Although the metrics chosen were somewhat subjective, they did not make a difference in the results of this study and are reported for reproducibility only.
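The association criteria quoted above (at least 12 localized calls, and no more than 3 km and 40 min between successive localizations) can be illustrated with a minimal greedy sketch. This is not the semiautomatic tracker of Klay et al. (2015); the function name `build_tracks`, the nearest-track tie-break, and the flat (x, y) coordinates in meters are assumptions made only for illustration.

```python
import math

MAX_GAP_S = 40 * 60      # at most 40 min between successive localizations
MAX_STEP_M = 3000.0      # at most 3 km between successive localizations
MIN_CALLS = 12           # a track needs at least 12 localized calls


def build_tracks(calls):
    """Greedily group (time_s, x_m, y_m) localizations into tracks.

    Assumption: a call joins the open track whose most recent call is
    nearest in space among the tracks that satisfy both limits; if no
    track qualifies, the call starts a new track.
    """
    tracks = []
    for t, x, y in sorted(calls):          # process calls in time order
        best, best_d = None, None
        for track in tracks:
            t0, x0, y0 = track[-1]
            d = math.hypot(x - x0, y - y0)
            if t - t0 <= MAX_GAP_S and d <= MAX_STEP_M:
                if best is None or d < best_d:
                    best, best_d = track, d
        if best is None:
            tracks.append([(t, x, y)])
        else:
            best.append((t, x, y))
    # Keep only groups long enough to count as a track.
    return [trk for trk in tracks if len(trk) >= MIN_CALLS]
```

In this sketch, a short burst of calls far from any existing track is held as its own group and then discarded by the minimum-length rule.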
All tracks were validated by an analyst to be minke whale boings by plotting the locations of the calls, the inter-call intervals, and the corresponding spectrograms for a subset of calls along the track.
C. Received level and NL estimation
To estimate the sound pressure spectral level of the received minke whale boing calls and background noise, the spectral density was calculated and integrated over the frequency bandwidth of interest. The spectral level, or mean square received level (RL), is

$$\mathrm{RL} = \frac{f_s}{n_{\mathrm{FFT}}} \sum_{i=1}^{n} P(f_i),$$

where $f_s$ is the sampling frequency, $n_{\mathrm{FFT}}$ is the number of samples used in each fast Fourier transform (FFT) window, and $P(f_i)$ is the spectral density, which is summed over $n$ frequency bins. Spectral density is calculated by

$$P(f_i) = \frac{2 \sum_{j=1}^{n_T} \left| X_j(f_i) \right|^2}{n_T \, (f_s / n_{\mathrm{FFT}}) \sum_{k} w_k^2},$$

where the factor of 2 accounts for the energy at negative frequencies, and $n_T$ is the number of time segments incoherently averaged to obtain the spectral density estimate. Within the summation, the quantity $X_j(f_i)$ is the fast Fourier transformed complex value in the frequency bin corresponding to $f_i$. In the denominator, the sum of $w_k^2$ is the sum of the square of all the points in the window applied to each of the $j$ time series segments before Fourier transforming. Dividing by the ratio of the sampling frequency ($f_s$) and the FFT length ($n_{\mathrm{FFT}}$) normalizes by the bin width. An $n_{\mathrm{FFT}}$ of 2048, an overlap of 87.5%, a sampling frequency $f_s$ of 6 kHz, and a Hamming window were used in this analysis.
To estimate the NL, the spectral density $P(f_i)$ was summed from $f_1$ = 1250 Hz to $f_n$ = 1600 Hz, a band chosen to cover the dominant frequencies of the minke whale boing call and to match the frequencies used for the call templates. The number of time segments $n_T$ was chosen to be 60, which corresponds to approximately 2.5 s of noise. The noise sample was constructed using the time period just before and after the call with a 1 s buffer so that any residual minke boing signal not detected would not be included in the noise sample. The noise samples taken before and after the call did not differ significantly, indicating that there was no signal present in the noise measurements.
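As a quick check on the analysis settings quoted above, the arithmetic below works out the FFT bin width, the hop between overlapping windows, the number of bins in the 1250–1600 Hz band, and the time spanned by 60 segments. Treating the band edges as inclusive of whole bins only is an assumption, as are the variable names.

```python
import math

FS_HZ = 6000.0            # sampling frequency after down-sampling
NFFT = 2048               # samples per FFT window
OVERLAP = 0.875           # 87.5% overlap between successive windows
N_SEGMENTS = 60           # time segments averaged for one noise estimate
BAND_HZ = (1250.0, 1600.0)

bin_width_hz = FS_HZ / NFFT                      # about 2.93 Hz per bin
hop_samples = round(NFFT * (1.0 - OVERLAP))      # 256 samples between windows
hop_s = hop_samples / FS_HZ                      # about 0.043 s

# Time advanced by 60 hops (about 2.56 s) and the full extent of the
# first-to-last window (about 2.86 s).
span_hops_s = N_SEGMENTS * hop_s
span_full_s = ((N_SEGMENTS - 1) * hop_samples + NFFT) / FS_HZ

# Assumption: only bins whose center frequency lies inside the band count.
first_bin = math.ceil(BAND_HZ[0] / bin_width_hz)
last_bin = math.floor(BAND_HZ[1] / bin_width_hz)
n_bins = last_bin - first_bin + 1
```

Counting hops, 60 segments advance about 2.56 s, which is consistent with the "approximately 2.5 s" of noise stated in the text.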
To estimate the RL of minke whale boings, the spectral density $P(f_i)$ was again summed from $f_1$ = 1250 Hz to $f_n$ = 1600 Hz over the duration of the signal as determined by the GPL detector. A minke whale boing spectrogram contains both the signal from the whale call and the background noise. Two methods were explored to determine the most accurate way of removing noise from the estimated RL. In the first method, the call templates as described in Sec. II B of Helble et al. (2015) were used in place of the Fourier-transformed values $X_j(f_i)$. The call templates were designed to contain only the spectral contributions of the minke whale boings with the background NL removed. For the second method, the full time series, including both the signal and the noise, was fast Fourier transformed and used for $X_j(f_i)$ to calculate the spectral level. The NL adjacent to the call (as described above) was subtracted separately to estimate the RL.
Both the noise measurements and minke whale boing call RLs were converted into decibel units, using $\mathrm{RL}_{\mathrm{dB}} = 10 \log_{10}(\mathrm{RL})$, where RL is in units of μPa², and $\mathrm{RL}_{\mathrm{dB}}$ is in units of dB re 1 μPa. This calculation yields the root mean square (RMS) RL, which is used for the remainder of this paper.
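The second noise-removal method, in which the adjacent noise estimate is subtracted from the signal-plus-noise level, has to be carried out on mean-square values rather than on decibels. A minimal sketch follows, assuming levels are band levels in dB re 1 μPa; the function names and the choice to return `None` when the noise estimate meets or exceeds the total are illustrative assumptions.

```python
import math


def to_db(mean_square_upa2):
    """Convert a mean-square pressure (uPa^2) to a level in dB re 1 uPa."""
    return 10.0 * math.log10(mean_square_upa2)


def from_db(level_db):
    """Convert a level in dB re 1 uPa back to mean-square pressure (uPa^2)."""
    return 10.0 ** (level_db / 10.0)


def subtract_noise_db(total_db, noise_db):
    """Estimate the call RL by removing adjacent noise power.

    total_db: level of the call window (signal plus noise).
    noise_db: level of the noise measured next to the call.
    Returns None when the noise estimate is not below the total, since
    no positive signal power remains (an assumed convention).
    """
    signal = from_db(total_db) - from_db(noise_db)
    if signal <= 0.0:
        return None
    return to_db(signal)
```

For example, when the call window is about 3 dB above the adjacent noise, the signal and noise powers are equal, so the corrected RL equals the NL.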
Monte Carlo simulations were used to estimate the accuracy of the minke whale boing RL measurements over all likely NLs. Thirty high quality audio recordings of minke whales on PMRF were selected. The signals were then reduced in amplitude and added to 100 h of randomly selected ocean noise recorded at PMRF over all likely signal-to-noise ratios (SNRs), defined as

$$\mathrm{SNR} = \mathrm{RL}_{\mathrm{dB}} - \mathrm{NL},$$

where NL and $\mathrm{RL}_{\mathrm{dB}}$ are both in units of dB re 1 μPa. Next, the GPL detector was used to detect the calls inserted into the noise, and the start time and end time were estimated. The results from the two calculation methods were compared against the known RLs and NLs.
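The step of reducing a clean call in amplitude and adding it to recorded noise at a chosen SNR can be sketched as follows. This sketch measures SNR over the full bandwidth of the samples, whereas the levels in the text are restricted to the 1250–1600 Hz band; that simplification, and the names `mean_square` and `mix_at_snr`, are assumptions.

```python
import math


def mean_square(samples):
    """Mean-square value of a sequence of pressure samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale `signal` so its level sits `snr_db` above `noise`, then add them.

    Returns the mixed samples and the amplitude gain applied to the signal.
    Both inputs are assumed to have the same length and sampling rate.
    """
    target_ratio = 10.0 ** (snr_db / 10.0)       # signal power / noise power
    gain = math.sqrt(target_ratio * mean_square(noise) / mean_square(signal))
    mixed = [gain * s + n for s, n in zip(signal, noise)]
    return mixed, gain
```

Because the gain is known, the true RL of each inserted call is known exactly, which is what allows the measured RLs to be compared against the known values.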
D. TL estimation
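In order to estimate the SL of minke whale boing calls, both the RL and TL between the source and receiver must be estimated, as described by

$$\mathrm{SL}_{\mathrm{dB}} = \mathrm{RL}_{\mathrm{dB}} + \mathrm{TL}_{\mathrm{dB}},$$

where $\mathrm{TL}_{\mathrm{dB}}$ is in units of dB (RL at a range $r$ relative to RL at 1 m from the source), $\mathrm{RL}_{\mathrm{dB}}$ is in units of dB re 1 μPa, and $\mathrm{SL}_{\mathrm{dB}}$ is in units of dB re 1 μPa at 1 m. This equation assumes an omni-directional source and receiver.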
TL was estimated using two methods. The first method was to use the range dependent acoustic model (RAM) to estimate the TL between each whale call location and each hydrophone location where the call was recorded. The Peregrine software developed by Oasis was utilized for this task. Peregrine is a C-language interface to the split-step Pade parabolic equation acoustic propagation code Seahawk (Heaney and Campbell, 2016; Heaney et al., 2017), which is based on RAM (Collins, 1995). Peregrine is a general purpose acoustic propagation model suited to the N × 2D modeling for this project, where the two-dimensional (2D) model is based on range and depth, but Peregrine adds azimuthal dependence at N radials (Heaney and Campbell, 2016; Heaney et al., 2017). The TL over the minke whale boing call bandwidth was calculated by incoherently averaging the TL between the source and receiver in the 1250–1600 Hz band in 5 Hz increments. The TL between the whale location and the hydrophone was approximated by interpolating values derived from TL radials in 60 deg increments for each hydrophone. This azimuthal interpolation was justified because the variability of TL as a function of azimuth was less than 1 dB for ranges and hydrophone locations included in this study. TL was calculated for whale depths between 5 and 100 m, covering the likely depth range at which the animals vocalize. The Peregrine model can import environmental variables that may affect TL. Environmental inputs were interpolated from a variety of four-dimensional [4D; three-dimensional (3D) space plus time] ocean models and bathymetry databases as they were needed in the calculations. Bathymetry information was collected from the National Oceanic and Atmospheric Administration (NOAA) National Geophysical Data Center U.S. Coastal Relief Model (NOAA National Geophysical Data Center, 2011) with 3 arc-second resolution. 
Historical seasonal SSPs were derived from the 2018 World Ocean Atlas (Locarnini et al., 2018; Zweng et al., 2018). Within Peregrine, the sediment is treated as an acoustically thick halfspace (implemented as 20 wavelengths at the given frequency, containing an exponential absorptive sponge along the bottom of the sediment layer). Various sediment grain sizes on the Krumbein phi (ϕ) scale (Krumbein and Sloss, 1951; Wentworth, 1922) were chosen as inputs to the model. A sensitivity study was conducted over all likely bottom compositions and SSPs, and TL was calculated for the most likely conditions with bounding values recorded for combinations of SSPs and bottom types that resulted in the highest and lowest TL values.
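Two post-processing steps described above, the incoherent average of modeled TL across the 1250–1600 Hz band and the interpolation between radials spaced 60 deg apart, can be sketched independently of the propagation model itself. The sketch does not call Peregrine; it assumes TL values per frequency and per radial are already available, that the first radial points to 0 deg, and that interpolation between radials is linear in dB, none of which is specified in the text.

```python
import math


def incoherent_mean_tl_db(tl_db_by_freq):
    """Average TL across frequency in the intensity domain, not in dB."""
    mean_intensity = sum(10.0 ** (-tl / 10.0) for tl in tl_db_by_freq)
    mean_intensity /= len(tl_db_by_freq)
    return -10.0 * math.log10(mean_intensity)


def interp_tl_azimuth(radial_tl_db, bearing_deg, step_deg=60.0):
    """Interpolate TL between the two radials that bracket `bearing_deg`.

    radial_tl_db[k] is the TL along the radial at k * step_deg degrees,
    and the list wraps around at 360 degrees.
    """
    n = len(radial_tl_db)
    pos = (bearing_deg % 360.0) / step_deg
    lo = int(pos) % n
    hi = (lo + 1) % n
    frac = pos - int(pos)
    return (1.0 - frac) * radial_tl_db[lo] + frac * radial_tl_db[hi]


# Frequencies from 1250 to 1600 Hz in 5 Hz increments (71 values).
FREQS_HZ = [1250.0 + 5.0 * i for i in range(71)]
```

Averaging in the intensity domain weights the frequencies with the least loss most heavily, so the band-averaged TL is never larger than the arithmetic mean of the dB values.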
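The second method for estimating TL was to use the geometrical spreading and attenuation loss equations described by Urick (1967). For slant ranges from the source to the hydrophone greater than the seafloor depth at the source location, the SL was estimated from the RL by

$$\mathrm{SL}_{\mathrm{dB}} = \mathrm{RL}_{\mathrm{dB}} + 20 \log_{10}(r_T) + 10 \log_{10}\!\left(\frac{r}{r_T}\right) + \frac{\alpha}{1000}\, r,$$

where $\mathrm{SL}_{\mathrm{dB}}$ is the SL (dB re 1 μPa at 1 m), $\mathrm{RL}_{\mathrm{dB}}$ is the RL (dB re 1 μPa), $r_T$ is the transition range in m at which geometrical spreading transitions from spherical to cylindrical, $\alpha$ is the attenuation loss coefficient in dB/km, and $r$ is the slant range from the whale to the hydrophone in m (Urick, 1967). At slant ranges less than $r_T$, SL was calculated using spherical spreading only

$$\mathrm{SL}_{\mathrm{dB}} = \mathrm{RL}_{\mathrm{dB}} + 20 \log_{10}(r) + \frac{\alpha}{1000}\, r.$$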
The attenuation loss coefficient α is primarily influenced by frequency dependent absorption for the relatively short ranges and deep water used in this study. A median frequency of 1400 Hz was used to calculate α by means of the method described by Ainslie and McColm (1998) and found to be . Since attenuation is minimal compared to geometrical spreading, the “geometrical spreading and attenuation loss equation” is shortened to “geometrical spreading equation” for the remainder of this paper. The transition range, rT, was hypothesized to be one water depth since the whales are thought to vocalize near the surface, and the hydrophones are raised just above the seafloor. To confirm, a variety of values was tested for rT, and the binned average SLs were plotted as a function of range. Additionally, TL as a function of range with various rT values was compared against the Peregrine model.
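The geometrical spreading calculation can be sketched as a small function: spherical spreading out to the transition range, cylindrical spreading beyond it, plus absorption. The attenuation coefficient is left as a caller-supplied argument because its value is not reproduced here; applying absorption inside the transition range as well (which keeps TL continuous at the transition) and the function names are assumptions.

```python
import math


def spreading_tl_db(slant_range_m, transition_range_m, alpha_db_per_km):
    """Transmission loss (dB) from geometrical spreading plus absorption.

    slant_range_m: slant range r from whale to hydrophone, in meters.
    transition_range_m: range r_T where spreading changes from spherical
        to cylindrical, in meters.
    alpha_db_per_km: absorption coefficient supplied by the caller.
    """
    r = slant_range_m
    absorption = alpha_db_per_km * r / 1000.0
    if r <= transition_range_m:
        return 20.0 * math.log10(r) + absorption            # spherical only
    return (20.0 * math.log10(transition_range_m)           # spherical to r_T
            + 10.0 * math.log10(r / transition_range_m)     # cylindrical beyond
            + absorption)


def source_level_db(rl_db, slant_range_m, transition_range_m, alpha_db_per_km):
    """SL (dB re 1 uPa at 1 m) as the measured RL plus the estimated TL."""
    return rl_db + spreading_tl_db(slant_range_m, transition_range_m,
                                   alpha_db_per_km)
```

Setting the transition range to one water depth, as hypothesized in the text, makes most calls at horizontal ranges of a few kilometers fall in the cylindrical branch.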
Both methods of estimating TL (the Peregrine model and the geometrical spreading equation) were used to calculate SL by adding TL estimates to the measured RLs of minke whale boing calls. Assuming call SLs are independent of range, correcting the call RLs by the estimated TL should produce similar average SLs over all distances between the animal location and the receiving hydrophones. One caveat to this assumption is that low SL calls at farther ranges could be masked from the detector, limiting detections to those with higher SLs and therefore causing SLs to trend higher at farther distances (discussed in more detail in Sec. II E).
Minke whale boing RLs from ranges of 0 to 20 km were measured in this study and used to validate the propagation model. The RL values were binned into 10 m horizontal range increments, and the average values were plotted as a function of range. The TL was calculated for each of the calls using both Peregrine and the geometrical spreading equation, and the average values were also binned in 10 m increments and plotted as a function of range. The TL from horizontal ranges of 0 to 2.5 km was not available from Peregrine due to the inherent limitations of the parabolic equation when modeling high-angle propagation from a source near the surface to a receiver near the bottom in the deep ocean. The TLs calculated from Peregrine and spreading equations were compared over the ranges available.
E. Probability of detection and localization
Both the detection and localization processes for any passive acoustic marine mammal monitoring system will be influenced by the acoustic environment, including background noise. Therefore, particular care must be taken when trying to assess the influence of the environment on the marine mammal. Masking, or the addition of natural or artificial sound to the signal of interest that may cover up or otherwise change the detectability of the sound, is a primary concern when measuring changes in vocal behavior.
The potential influence of masking on the detection of minke whale calls was modeled by first estimating the probability of detection at the center hydrophone of each subarray with simulated animal source locations randomly distributed at ranges of 0–20 km from the center hydrophone over all measured SL and NL conditions. The probability of localization was subsequently estimated by combining the detection probabilities over the required number of adjacent hydrophones (at least three) to allow for a valid localization. The estimated probability of detection, $\hat{P}_D$, within a given area surrounding a hydrophone was calculated by

$$\hat{P}_D = \int_{0}^{2\pi} \int_{w_1}^{w_2} g(r, \theta)\, \rho(r, \theta)\, r \, dr \, d\theta,$$

where $\rho(r, \theta)$ is the probability density function (PDF) of whale calling locations in the horizontal plane, and $g(r, \theta)$ is the detection function (Buckland et al., 2001). For the purpose of assessing masking, a homogeneous random distribution of animals over the whole area of detection, $\pi (w_2^2 - w_1^2)$, was assumed, and so $\rho(r, \theta) = 1 / [\pi (w_2^2 - w_1^2)]$. The accuracy of estimating $\hat{P}_D$ relies on characterizing the range, azimuth, and depth dependent detection function, $g(r, \theta, z)$, in accordance with the detector used. In this paper, the animal was assumed to be near the surface so that $g$ was taken as a function of range, $r$, and azimuth, $\theta$, only. The detection function measures the probability of detection from a radial distance from the recording hydrophone ($w_1$) out to a radial distance ($w_2$) over all azimuths. Normally, $w_1$ would be set to zero, with the animal location directly above the hydrophone, but was included here as a variable for reasons explained subsequently. The azimuthal dependence was added to the standard equation to emphasize the complexity caused by bathymetry.
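The probability of detection described above is the detection function weighted by the caller density and integrated over the annulus between the inner and outer radii. With a homogeneous caller distribution, this reduces to an area-weighted average of the detection function, which the sketch below evaluates with a midpoint rule. The grid sizes and the example detection function are illustrative assumptions, not values from this study.

```python
import math


def detection_probability(g, w1_m, w2_m, n_r=200, n_theta=72):
    """Area-weighted mean of g(r, theta) over the annulus w1 <= r <= w2.

    g: callable returning the probability of detecting a call made at
       range r (m) and azimuth theta (rad) from the hydrophone.
    Callers are assumed to be uniformly distributed over the annulus.
    """
    area = math.pi * (w2_m ** 2 - w1_m ** 2)
    dr = (w2_m - w1_m) / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = w1_m + (i + 0.5) * dr              # midpoint of the range cell
        for k in range(n_theta):
            theta = (k + 0.5) * dtheta         # midpoint of the azimuth cell
            total += g(r, theta) * r * dr * dtheta
    return total / area


def step_detector(r, theta):
    """Hypothetical detection function: certain detection inside 5 km."""
    return 1.0 if r < 5000.0 else 0.0
```

The factor of r in the sum reflects that, under a uniform distribution, more callers are expected at larger ranges, so detection performance at the outer edge dominates the result.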
Since the detectability of a call using the GPL detector was dependent on the SL, TL, and NL such that $g(r, \theta) = g(\mathrm{SL}, \mathrm{TL}(r, \theta), \mathrm{NL})$, Monte Carlo simulations were used to characterize the detector performance for the full range of expected TL, SL, and NL. First, the noise was removed from ten minke whale boing signals recorded near the hydrophone with high SNR following the procedure outlined in Helble et al. (2012). Next, 100 h of PMRF noise samples covering the fall, winter, and spring months were collected. The amplitudes of the minke signals were adjusted so that the SLs ranged from 150 to 180 dB re 1 μPa at 1 m in 0.5 dB increments. These signals were then reduced in amplitude according to the modeled TL for each location $(r, \theta)$, with TL calculated as described in Sec. II D. The reduced signal was then added to noise, the combined audio was processed with the GPL detector, and each call was recorded as either detected or missed. Using this technique, $g(r, \theta)$ was estimated for each hydrophone over all likely combinations of SL and NL. It is important to note that only the amplitudes of the signals were reduced by the expected TL, and any distortions (such as multipath) that may affect the detectability of the call were not simulated. However, for the ranges chosen for $w_1$ and $w_2$, the signal was minimally distorted by the environment since $w_2$ was limited to distances of primarily direct-path propagation.
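The structure of this simulation, stepping SL through a grid, removing the modeled TL, and scoring each trial against noise as detected or missed, can be sketched without the detector itself. The threshold rule `snr_detector` below is a hypothetical stand-in for the GPL detector, which operates on audio rather than on levels, and the 5 dB threshold is an arbitrary illustrative value.

```python
def sl_grid(start_db=150.0, stop_db=180.0, step_db=0.5):
    """Source levels from start to stop (inclusive) in fixed dB steps."""
    n = int(round((stop_db - start_db) / step_db)) + 1
    return [start_db + i * step_db for i in range(n)]


def snr_detector(rl_db, nl_db, threshold_db=5.0):
    """Hypothetical stand-in detector: detect when RL exceeds NL by a margin."""
    return rl_db - nl_db > threshold_db


def detection_fraction(sl_db, tl_db, noise_levels_db, detector=snr_detector):
    """Fraction of noise samples in which a call of the given SL is detected.

    The received level is the source level minus the modeled transmission
    loss for the simulated location.
    """
    rl_db = sl_db - tl_db
    hits = sum(1 for nl in noise_levels_db if detector(rl_db, nl))
    return hits / len(noise_levels_db)
```

Repeating `detection_fraction` over every SL in `sl_grid()` and every simulated location yields a lookup of detection probability against SL, TL, and NL.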
With $g(r, \theta)$ characterized over all likely SLs and NLs, the estimated probability of localization, $\hat{P}_L$, within a given area surrounding a hydrophone could be calculated. In order for a call to be localized, the call must be detected on the center hydrophone in the subarray and any three of the four surrounding hydrophones in that subarray. Probability of localization functions for each of the center hydrophones were created by multiplying $g(r, \theta)$ at the center hydrophone with the highest three of four probabilities from the adjacent hydrophones, where $r$ and $\theta$ from each adjacent hydrophone differ in order to reference the same position as defined by the center hydrophone. The resulting probability of localization function, $g_L(r, \theta)$, at the center hydrophone is inherently lower than the detection function, $g(r, \theta)$, due to the requirement of the call being detected on the center hydrophone and at least three of the adjacent hydrophones. The probability of localization for each of the four subarrays had similar performance since all the hydrophones were at similar depths and bathymetries.
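The combination rule described above, the center-hydrophone detection probability multiplied by the three highest of the four adjacent-hydrophone probabilities, can be written in a few lines. The function name is an assumption, and the inputs are taken to be detection probabilities already evaluated for the same whale position.

```python
def localization_probability(p_center, p_adjacent):
    """Probability of localization for one simulated whale position.

    p_center: detection probability at the center hydrophone.
    p_adjacent: detection probabilities at the four adjacent hydrophones,
        each evaluated for the same position.
    """
    top_three = sorted(p_adjacent, reverse=True)[:3]
    p = p_center
    for q in top_three:
        p *= q
    return p
```

Because every factor is at most 1, the result can never exceed the center-hydrophone detection probability, consistent with the localization function being inherently lower than the detection function.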
In order to maintain a high probability of detection and localization throughout the study area at observed NLs, the maximum allowable radius (w2) from the center hydrophone was set to 10 km for tracked calls included in the SL analysis. The minimum radius (w1) for tracked calls was set to 1 km to avoid uncertainties with directionality of calls produced by minke whales and depth of the calling whale, which would have a greater impact on TL at closer ranges.
F. SL estimation
SLs were estimated by adding the measured RL of each minke whale boing to the expected TL for the animal's position. For each boing, the RL was measured on the master hydrophone and at least three of the four surrounding hydrophones. Therefore, four or five independent measurements of the RL (and thus SL) were available for each call. The average SL and NL across the contributing hydrophones were recorded for each boing, as well as the standard deviation of the values.
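As a minimal sketch of this step, the per-hydrophone SL can be formed as RL + TL and then summarized across hydrophones. The function name and the example levels are hypothetical, and averaging directly in decibels is an assumption; the text does not state whether averaging was done in dB or in linear units.

```python
import statistics

def estimate_source_level(rl_db, tl_db):
    """Return (mean SL, standard deviation of SL) across hydrophones.

    Each hydrophone contributes SL = RL + TL. Averaging is done directly in
    dB, which is an assumption of this sketch.
    """
    if len(rl_db) != len(tl_db):
        raise ValueError("one TL estimate is needed per RL measurement")
    if len(rl_db) < 4:
        raise ValueError("a localized call needs at least four hydrophones")
    per_hydrophone_sl = [rl + tl for rl, tl in zip(rl_db, tl_db)]
    return statistics.mean(per_hydrophone_sl), statistics.stdev(per_hydrophone_sl)

# Hypothetical RL and TL values (dB) for one boing on four hydrophones.
mean_sl, std_sl = estimate_source_level([88.0, 86.5, 85.0, 84.0],
                                        [76.0, 78.0, 79.5, 80.0])
```

A small standard deviation across hydrophones indicates that the TL estimates are mutually consistent for that call position.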
A generalized additive model (GAM) fitted with the “mgcv” package in R (Wood, 2017) was used to model the relationship between minke whale SL and ocean NL. A Gaussian distribution for the error terms and an identity link function were used, with a smoothing term using cubic regression splines with 5 knots (k = 5) to capture nonlinearities in the relationship between the predictor and response variable. The exact number of knots is not critical but was chosen conservatively with the intention of producing biologically meaningful results. To ensure that the number of knots was not over-specified, the effective degrees of freedom were used as a guide (Wood, 2017).
Masked calls create a detection problem that can potentially bias results. Calls made but not detected due to masking cause the mean call SL in a given noise band to be overestimated. If more calls were masked at higher NLs than at lower levels, then the bias in the mean would increase with NL, artificially inducing or overstating any Lombard effect. The impact of masking was minimized in this study by limiting the localization range to 10 km from the center hydrophone, as demonstrated by the probability of detection and localization calculations. However, it was impossible to ensure that no calls were masked, and therefore the sensitivity to masking was investigated by simulating a range of heavier (left) tailed SL distributions.
SLs were also analyzed in 5 dB NL bins, and the average SL and variance in each bin were calculated. These bins were used to produce histograms of SLs to examine how the shape and character of the distribution changes as a function of noise. The histograms were fit to the data using nonparametric kernel smoothing distributions evaluated at 100 evenly spaced points covering the range of data for each NL bin.
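The binning and smoothing steps might look like the following sketch. The function names and example data are hypothetical, and the Gaussian kernel with a normal-reference bandwidth is an assumption; the text specifies only a nonparametric kernel smoother evaluated at 100 evenly spaced points.

```python
import math
import statistics

def bin_source_levels(nl_db, sl_db, edges=(65, 70, 75, 80, 85, 90)):
    """Group SLs into NL bins and return count, mean, and variance per bin."""
    bins = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        values = [sl for nl, sl in zip(nl_db, sl_db) if lo <= nl < hi]
        if len(values) >= 2:  # variance needs at least two calls
            bins[(lo, hi)] = {
                "n": len(values),
                "mean": statistics.mean(values),
                "variance": statistics.variance(values),
                "values": values,
            }
    return bins

def kernel_density(values, n_points=100):
    """Gaussian kernel density at n_points evenly spaced over the data range."""
    n = len(values)
    # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch).
    bandwidth = 1.06 * statistics.stdev(values) * n ** (-0.2)
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    density = [
        sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
        for x in grid
    ]
    return grid, density

# Hypothetical NL and SL values (dB) for eight calls.
nl = [66, 68, 72, 73, 77, 78, 83, 84]
sl = [160, 166, 162, 166, 164, 166, 166, 168]
bins = bin_source_levels(nl, sl)
grid, density = kernel_density(bins[(65, 70)]["values"])
```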
III. RESULTS
Between 2012 and 2017, 42 159 minke whale boing calls were recorded and tracked at ranges between 1 and 10 km from the center hydrophone of each subarray during opportunistic recordings. SL estimates were derived from these calls by adding the measured RLs and the TL estimates. To verify the TL model, a greater search area was used and 211 184 minke whale boing RLs were measured at ranges between 0 and 20 km. The number of localized and tracked calls per month is shown in Fig. 2. The hydrophones each recorded a total time of approximately 639 days. An example spectrogram and time series of a minke whale boing call is shown in Fig. 3. These calls formed 1261 tracks through PMRF.
A. RL and NL measurements
Two methods (GPL templating and spectrogram noise subtraction) for measuring the RLs were compared over a range of SNRs. Figure 4 shows the measurement accuracy for RL estimates as a function of SNR of the detected call. Both techniques produce RL estimates within 0.5 dB of the true RL at SNRs between -1 and 30 dB. However, the GPL detector more accurately measured the RLs, especially at low SNRs when compared with subtracting the adjacent band-limited noise from the band-limited spectrogram. Therefore, the GPL template measurement results were used for estimating minke whale boing call RLs for the rest of this study. In order to ensure accurate RL estimates, calls with measured SNR less than -5 dB were excluded from the analysis. This cutoff value was subsequently carried forward and accounted for in the probability of detection and localization analysis.
The RLs and associated NLs for 42 159 boing calls were measured using the GPL templating technique. The 25th, 50th, and 75th percentiles of RL measurements averaged across the hydrophones were 84, 87, and 90 dB, respectively. NLs associated with these calls had 25th, 50th, and 75th percentiles of 75, 79, and 82 dB, respectively. The measured RLs were added to the TL estimates from Sec. III B to calculate the SLs described in Sec. III D.
B. TL estimation
The TL was calculated using both the Peregrine model and the geometrical spreading equation for 211 184 individually measured minke whale boings from distances that range between 0 and 20 km from the measuring hydrophone. The average TL can be seen as a function of range in the lower portion of Fig. 5 for Peregrine (purple) and the geometrical spreading equation (black), using 10 m bins. TL estimates from Peregrine could not be reliably estimated for ranges less than about 2.5 km due to limitations inherent with the RAM model formulation at very high propagation angles and are therefore not shown in Fig. 5. For any given range, the modeled TL will vary slightly for each boing because the hydrophone depths vary between 3150 and 4700 m, and the bathymetry along the path is unique to the animal's position. However, plotting the TL average as a function of range helps to illustrate the differences between the two models. The Peregrine TL model used a single seafloor sediment grain size value, an assumed animal depth of 30 m, and historical SSPs selected from data that most closely matched the date of the call. Within 10 km, changes in the sediment grain size and SSP had negligible effects on the TL. Beyond 10 km, changing the sediment grain size over all likely values (up to 8) and all likely SSPs showed 3 dB or less variation. The sediment grain size was chosen from TL experiments using sonobuoys in the region, the details of which are not presented because sediment grain size did not affect TL estimates from Peregrine within 10 km. Changing the assumed animal depth between 5 and 100 m also had negligible effects on TL. The only adjustable parameter for the geometrical spreading model is the transition range, rT, which was set to the water depth. Choosing values of rT greater than or less than the water depth resulted in less agreement between the two models.
Navy surface assets on the range with known SLs and similar frequency ranges were also used to verify rT as the predicted SLs from measurements closely matched the known SLs. The largest discrepancies between the geometrical spreading equation and Peregrine occurred at the closest ranges (2.6 dB at a range of 2.5 km). These differences are likely attributable to more complex surface-bottom interactions that are not accounted for with the simpler geometrical spreading model. The close agreement between the models does not necessarily make them both correct, but agreement between the two is reassuring.
The average minke whale boing RL values as described in Sec. III A were added to the TL estimates from the two models, allowing the SL to be computed as a function of range and shown in the upper portion of Fig. 5 for Peregrine (magenta) and the geometrical spreading equation (black). The average SL derived from the geometrical spreading equation fluctuated by less than 2.9 dB from 1 to 20 km with no appreciable slope (0.029 dB/km using a linear fit). The average SL as a function of range derived from the Peregrine model varied by 5.9 dB or less with a slightly positive slope (0.15 dB/km using a linear fit). In order to eliminate the possibility that masking could bias the SL estimates upward at farther ranges, the same process was repeated using only calls that occurred in noise backgrounds of 70 dB or less (minimal to no masking expected), which resulted in no appreciable change in Fig. 5. The few calls that were produced within 1 km horizontal range from the center hydrophone were omitted from the analysis because the localization and depth uncertainty of the whale resulted in proportionally more uncertainty in the SL than those at farther ranges. Furthermore, the maximum horizontal range was limited to 10 km in order to reduce uncertainty in the TL and also to minimize the effects of masking (discussed further in Sec. III C). The red dotted vertical lines in Fig. 5 illustrate the horizontal range over which SL estimates were confined.
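The range-slope check described above amounts to an ordinary least-squares line through SL versus range. The sketch below uses a hypothetical function name and made-up range-binned SL averages; it is not the study's data.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical range-binned average SLs (dB at 1 m) versus range (km).
ranges_km = [1, 2, 4, 6, 8, 10]
avg_sl_db = [165.8, 166.1, 165.9, 166.2, 166.0, 166.3]
slope_db_per_km, intercept_db = linear_fit(ranges_km, avg_sl_db)
```

A slope near zero suggests that the TL model neither over- nor under-corrects with range, provided masking is not removing quiet calls at the farther ranges.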
Although both models are suitable for estimating TL in the study area, the geometrical spreading equation was chosen as the preferred model for the remainder of the study as it predicted no appreciable slope for SL as a function of range, had a faster computation time, and could be computed at closer ranges.
C. Probability of detection and localization
Probability of detection and localization were calculated for all background NLs and minke whale boing call SLs to determine what effect masking had on the observed SL results. Figure 6 illustrates the probability of detection and localization for subarray D for two example NLs, NL = 70 dB and NL = 85 dB, using a distribution of call SLs with a mean of 164 dB RMS at 1 m and variance of 14 dB. This SL distribution was estimated from measured minke boing RLs limited to 10 km in range during very low NLs of 65–70 dB, for which no masking was expected. The high NL maps therefore represent the worst-case scenario for masking since the SL distribution used assumes the minke whales do not change their SL as NL increases. In this scenario, the probability of localization was 99.8% for NL = 70 dB and 74.1% for NL = 85 dB, assuming a random spatial distribution of calls. Probabilities of localization were determined for all likely combinations of SL and NL for each subarray with the range from the center hydrophone limited to w1 = 1 km and w2 = 10 km. The values for the four subarrays were averaged and, assuming random spatial distribution of animals on the range, provide the average probability of localization for all SLs and NLs over the observed data on PMRF. The resulting “masking zone” can be seen in Fig. 7, where areas of black background indicate a probability of localization of 0%, and areas of white background represent higher probabilities of localization.
D. Minke whale boing SLs
Minke whale boing calls were estimated to have a median RMS SL of 166 dB at 1 m measured over the 1250–1600 Hz bandwidth and averaged over all NLs. The 25th and 75th percentiles of the RMS SLs were 163 and 168 dB at 1 m, respectively (Table I). Since each call was recorded at least four separate times (center hydrophone and three or four surrounding hydrophones), the SL was estimated separately and averaged across each hydrophone and the standard deviation calculated. The mean standard deviation of the SL across hydrophones was 0.66 dB. However, the measured increase in SL across NLs was much greater than the standard deviation across hydrophones. Figure 7 (upper) shows the minke whale SL estimates for all 42 159 localized calls, illustrating that SLs increased as background NLs increased. If a linear fit is used (black line), RMS SL increased on average 0.24 dB per 1 dB increased NL (95% confidence interval, 0.23–0.25 dB/dB). The vast majority of SLs are well above the masking zone (shown in black).
Noise bin | Mean SL (dB) | Median SL (dB) | Variance | n |
---|---|---|---|---|
65–70 dB | 163 | 164 | 14 | 2449 |
70–75 dB | 164 | 164 | 13 | 8062 |
75–80 dB | 165 | 165 | 13 | 14 917 |
80–85 dB | 167 | 167 | 10 | 13 537 |
85–90 dB | 168 | 168 | 9 | 2572 |
ALL | 166 | 166 | 14 | 42 159 |
A GAM was also used to quantify the relationship between SL and NL (Fig. 7, red line). The GAM explained only 10% of the variability in minke whale SLs (deviance explained), and the range of predicted values was substantially smaller than the range of observed source values. Since a single explanatory variable was not expected to explain a large proportion of the variance in call SLs, this result was not surprising. Residual analysis plots indicated symmetrically distributed residuals that were approximately normal except for a tendency for the model to overpredict SLs at low levels of ocean noise. There was no discernible evidence of heteroskedasticity or unmodeled relationships between residuals and either observed or fitted values of the dependent variable.
Although Figs. 7 and 8 suggest that masking only occurs well into the tail of the SL distributions, and thus is unlikely to be a serious problem, masking was investigated by simulating a range of heavier (left) tailed SL distributions. Observations in the masked region of the SL distributions were simulated using a two-parameter functional form f(x), where f(x) is the number of calls simulated in a given SL interval (in dB RMS at 1 m), b is a constant controlling the rate of decay in the tail (b = 1 for a triangular distribution), and a was chosen so that the tail distribution generated the detected number of calls just above the masked region and reached a value of zero at SLs of 150 dB RMS at 1 m. Several values of b were used, all of which generated substantially heavier tails than observed [between two (b = 4) and five (b = 1) times as many observations in the masked region of the SL distribution as observed]. GAMs were fitted to each of the reweighted datasets as described previously. The GAM fits for the original and reweighted datasets can be seen in red in both the upper and lower plots of Fig. 7. The solid red line represents the GAM fit if no points were missed due to masking, while the dashed red line represents b = 2. The most plausible values lie between these lines, as b = 1 produces an unlikely elbow in the distribution.
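The tail simulation can be sketched under a loudly stated assumption: the exact functional form is not reproduced above, so the code below uses a power-law tail, f(x) = a(x − 150)^b, purely for illustration. This form was picked because it is linear (triangular) for b = 1, reaches zero at 150 dB, and decays faster as b grows; it should not be read as the form used in the study. The function name, bin width, and example counts are likewise hypothetical.

```python
def simulate_masked_tail(n_boundary, sl_boundary_db, b,
                         sl_zero_db=150.0, bin_width_db=1.0):
    """Simulated call counts per SL bin below the masking boundary.

    Assumes f(x) = a * (x - sl_zero_db) ** b (an illustrative form only),
    with a set so that f(sl_boundary_db) equals the detected call count
    just above the masked region.
    """
    a = n_boundary / (sl_boundary_db - sl_zero_db) ** b
    counts = {}
    x = sl_boundary_db - bin_width_db
    while x > sl_zero_db:
        counts[x] = a * (x - sl_zero_db) ** b
        x -= bin_width_db
    return counts

# Hypothetical boundary: 200 detected calls in the bin at 158 dB.
tail_b1 = simulate_masked_tail(n_boundary=200, sl_boundary_db=158.0, b=1)
tail_b4 = simulate_masked_tail(n_boundary=200, sl_boundary_db=158.0, b=4)
total_b1 = sum(tail_b1.values())
total_b4 = sum(tail_b4.values())
```

Under this assumed form, smaller b adds more simulated calls to the masked region, which agrees with the ordering reported in the text (b = 1 gives the heaviest tail).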
The lower portion of Fig. 7 shows the slope of the SLs as a function of NLs, which reflects the sensitivity response of the whales to increasing background noise. The GAM suggests minke whales were most responsive to (i.e., greatest slope at) background NLs greater than or equal to 80 dB RMS. The linear fit suggests minke whales increase their RMS SLs by 0.24 dB per 1 dB increase in background noise in the 1250–1600 Hz band. Masking plays a proportionally larger role above 85 dB RMS, and so the uncertainty of the GAM fit increases. Regardless, it appears the whales have a decreased sensitivity at the highest NLs, suggesting they are unable or unwilling to increase their SLs further. There are also proportionally very few calls above 85 dB RMS, which may suggest the whales reduce calling altogether in these high noise conditions.
SLs grouped in 5 dB NL bins are shown in Fig. 8 and Table I. The median RMS SL of boing calls produced in each NL bin was significantly greater than the SL of calls produced in the NL bin centered 5 dB lower (one-sided Wilcoxon rank sum test; all four comparisons). In addition to the median SL increasing for calls produced in each increasing NL bin, the variance of the SL distribution significantly decreased when comparing subsequent NL bins for 75–80 dB, 80–85 dB, and 85–90 dB (one-sided Ansari-Bradley test). The dashed portions of the histograms in Fig. 8 indicate where masking may artificially suppress the number of calls, which can be seen in the tail of the distributions for the higher NLs.
IV. DISCUSSION
These results indicate that minke whales off Hawaii exhibit the Lombard effect in that they increased their boing call intensity in increased background noise. In addition to increasing the average boing call SL, minke whales also decreased the variance of the boing call SL in higher background NLs, suggesting that minke whales were more precise with their call intensity in increased NLs.
Studying the Lombard effect in a natural population of whales is inherently difficult as automated detection and localization systems can produce different results depending on changes in background NLs. The deep bathymetry of the study area has both advantages and disadvantages. Deep water means that propagation between the source and receiver is primarily direct-path over the range considered, which minimizes the impact that uncertainties about sediment type, sediment thickness, SSP, and whale calling depth have on TL (and therefore SL) estimates. Unfortunately, in high noise conditions, the deep hydrophone placement also means that low SL calls, even calls emitted directly above the receiver (one water depth), will be masked from detection. Therefore, determining the true distribution of SLs in high noise conditions is difficult. However, the multitude of sensors available on PMRF allows detection and localization of the vast majority of calls within the ranges and in the noise conditions considered. Because the TL was well characterized at these ranges and the detector performance and estimated RL could be characterized as a function of SNR, the probability of detection and localization and the resulting masking zones in Fig. 7 could be determined with high levels of certainty. Additionally, the symmetrical shapes of the distributions of call SLs as shown in Figs. 7 and 8 indicate that very few calls were likely masked.
The close agreement between the SLs estimated from Peregrine and the geometrical spreading equation, combined with no appreciable slope as a function of range, suggests that the models are able to accurately estimate the TL over the ranges in this study. During this study, the Peregrine model had run time durations of many seconds for each TL range-depth slice as compared to instant computation of simple geometrical spreading. For the locations in this study, Peregrine also did not appear to have any advantage for estimating TL compared with the simpler geometrical spreading equation. Further, the geometrical spreading equation allowed for computation of TL even at very short ranges for this deep ocean location, which is not available from Peregrine.
The Lombard effect demonstrated by the minke whales showed a maximum response of 0.34 dB increase in SL per 1 dB increase in background NL when the noise was 82 dB, and an average response of 0.24 dB per 1 dB increase in background NL for the full range of noise encountered. Minke whales, therefore, did not fully compensate for the increase in background NL, and as such the minke whale communication space decreased with increasing noise. No data are available for the hearing sensitivity of minke whales, but for this hypothetical example an assumption is made that minke whales need approximately 0 dB of SNR in the 1250–1600 Hz band to effectively transmit information to a conspecific through their boing call (assuming that a greater SNR is needed for information transmission than for simple detection). Using this 0 dB SNR assumption, then their calls would have a maximum allowable TL of 96 dB in NLs of 67.5 dB, and 78 dB in NLs of 87.5 dB, assuming that whales call at the median SL for the associated noise bin (164 dB and 168 dB, respectively). Using the geometrical spreading equation, a minke whale calling in 67.5 dB of background noise would be able to communicate to a range of approximately 114 km, while one calling in 87.5 dB of background noise would be able to communicate to 19 km. Although the detection range in low NLs would be highly dependent on bottom type, bathymetry, and SSP over these long ranges, it is still likely that an increase of 20 dB in NL would result in minke whales losing 1 order of magnitude of effective communication range (distance) and 2 orders of magnitude of communication space defined as area (distance-squared). Sound production in marine mammals has been associated with a variety of behaviors, including social interaction, group cohesion, feeding, and mating, all of which may be negatively impacted by reduced communication space (Erbe et al., 2016). 
However, quantifying these effects in marine mammal species has been difficult (Clark et al., 2009; Erbe et al., 2016). Since the exact function of the minke whale boing is poorly understood, it is difficult to determine the impact reduced communication space has on the species. For example, if the boing is mainly used for interaction of conspecifics within a 10 km distance, then the impacts of increasing noise are less than if the boing is used to locate and coordinate with conspecifics at greater distances.
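The order-of-magnitude statements about communication range and communication space can be checked with simple arithmetic on the two ranges quoted above (114 km and 19 km). The function name is hypothetical; the inputs are the values given in the text.

```python
import math

def communication_space_loss(range_quiet_km, range_loud_km):
    """Ratios of communication range and area, and their orders of magnitude."""
    range_ratio = range_quiet_km / range_loud_km
    area_ratio = range_ratio ** 2  # communication space scales with distance squared
    return {
        "range_ratio": range_ratio,
        "area_ratio": area_ratio,
        "orders_range": math.log10(range_ratio),
        "orders_area": math.log10(area_ratio),
    }

loss = communication_space_loss(114.0, 19.0)
```

The range shrinks by a factor of 6 and the area by a factor of 36, or roughly 0.8 and 1.6 orders of magnitude, which round to the approximate figures of 1 and 2 orders of magnitude given in the discussion.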
The Lombard response for minke whale boing calls is more subtle than that reported for other baleen whales and more similar to values reported for bottlenose dolphins and terrestrial animals. Bottlenose dolphins increased their apparent output level 0.1–0.3 dB per 1 dB increase in background NLs (Kragh et al., 2019). Frogs, birds, and terrestrial mammals have shown similar Lombard responses in that they did not increase their SLs to compensate for the increase in background NL (e.g., Hage et al., 2013; Halfwerk et al., 2016; Manabe et al., 1998). Studies to date about other species of whales, however, report that they increase their SLs approximately 1 dB per 1 dB increase in background noise. Humpback whale responses varied between 0.81 dB and 1.5 dB increase in SL per 1 dB increase in background NLs (Dunlop, 2016; Dunlop et al., 2014; Fournet et al., 2018). Right whales and killer whales also responded at near 1 dB per 1 dB background NL increase (Holt et al., 2009; Parks et al., 2010). However, many of these studies used small sample sizes (approximately 10³ vocalizations) compared to this study. In addition, behavioral response may differ depending on whether the noise source is local or distant and whether it originates from a point source or a diffuse source. It is important to note that when comparing responses, the frequency range and character of the noise are inherently different in every study. Overall very few studies have examined the response of baleen whales to background noise, and so more work is needed to get a clearer picture of baleen whale response to noise.
With bottom-mounted hydrophones, it is impossible to know the exact noise environment that the whale experiences along its transit track. The seafloor sensors used for this study are only a proxy for the noise experienced by the whale. As most natural sources of noise in this study originate at the sea surface interface, it is likely that the NL experienced by the whale is greater than what is measured on the seafloor with noise potentially increasing at a faster rate at the surface than at the bottom as wind and waves increase. Therefore, the observable Lombard effect presented in this study is likely an upper-bound response.
The Lombard effect observed in whales has important implications for passive acoustic animal density studies. Inherent to the acoustic marine mammal density estimation equation described by Buckland et al. (2001) is the probability of detection of a call. This value is often determined from propagation modeling or distance sampling methods. If an animal is able to completely compensate for changes in background noise, then the probability of detection for a study area would be resilient to changes in background noise. However, if the whales respond to the noise but do not completely compensate for the changes as in this study, determining the probability of detection becomes a complex task. If not properly accounted for, population estimates could be skewed by 1 order of magnitude, considering relatively small changes in background NLs can change the detection range by large values.
In this study, it was assumed that the minke whale source was omni-directional. If minke whales produce directional calls, then the RLs recorded on bottom-mounted hydrophones and estimated SLs may be less than the levels if the sounds were recorded on-axis. Receivers at closer ranges and in the same plane as the whale are necessary to measure the true directionality of the call.
The SL and NL measurements were limited to the bandwidth corresponding with the main components of the minke whale boing. In previous laboratory studies with other taxa, individuals responded the most to noise in the same band as their calls (e.g., Hage et al., 2013; Halfwerk et al., 2016; Manabe et al., 1998). Examining the RLs of minke whale calls in wider bands revealed negligible differences supporting the bandwidth selection.
Minke whales were assumed to call near the surface. Tags on the Antarctic minke whale (Balaenoptera bonaerensis) have shown maximum dive depths of approximately 100 m during feeding dives (Friedlaender et al., 2014). Although no tagging data are available for minke whales in Hawaii, preliminary 3D acoustic localization as described by Henderson et al. (2018) revealed depths of less than 100 m. Because the hydrophones were in deep water, estimated SLs did not change significantly for modeled source depths between 5 and 100 m.
V. CONCLUSIONS
The results of this study show that baleen whales may not all be able to compensate for changes in the background NLs in their environment. Therefore, baleen whales may experience a drastic decrease in their communication space even with natural events that increase NLs. These results from minke whale boings show that the communication range is greater than what would be predicted ignoring Lombard but is still 1 order of magnitude less over a 20 dB increase in NL. These effects of natural events will help to contextualize effects of anthropogenic noise sources.
Many marine mammal populations are regularly assessed, but some cetacean species, such as small, solitary species like the minke whale, are difficult to detect visually so much is unknown about their population sizes. Passive acoustic monitoring has been suggested as a way to estimate population size if a conversion can be made from the number of calls to the number of whales (e.g., Marques et al., 2009). SL is often needed to estimate probability of detection, especially in single hydrophone studies (e.g., Helble et al., 2013). This research shows that SL as a function of background NL may be required to improve the accuracy of this calculation. When localization is possible, the study area can be restricted to keep the probability of detection and localization close to 1 over all noise conditions at the cost of reducing the sample size.
In order to evaluate whether cetacean population sizes can be measured using passive acoustics, more work must be done assessing the stability of species-specific call SLs. Species may be more likely to respond to increasing NLs depending on the function of their call, the seasonal cycle, and the source of the background noise or other disturbance. Errors in SL estimations can occur throughout the process. The estimated RLs need to be checked to ensure that background noise is correctly being subtracted from the call RL. TL assumptions should be verified. One method of verifying TL is to plot SL as a function of range (over ranges where masking is not expected) to ensure that no trend exists. Detection range limits should be carefully selected so that the majority of calls produced within these ranges will be detected and, ideally, localized. Choosing a detection radius that is too large will result in more uncertainties in the number of calls produced, especially at high NLs. Finally, the effects of masking need to be carefully considered. If masking is not corrected for, it may bias the trend of SL as a function of NL to appear greater than it really is. Accurate estimation of SLs across time, locations, and environmental conditions is often required before passive acoustics can be used to calculate density or abundance of marine mammals.
ACKNOWLEDGMENTS
This paper was presented at the Fifth International Meeting on The Effects of Noise on Aquatic Life held in Den Haag, July 2019. Work was supported by the Office of Naval Research (Code 322, Marine Mammals and Biology), Commander, U.S. Pacific Fleet (Code N465JR), and the Naval Facilities Engineering Command Living Marine Resources Program. The authors would like to thank Glenn Ierley, Eva-Marie Nosal, and Aaron Thode, who provided helpful feedback that improved this manuscript.