Dolphin echolocation clicks measured far off-axis contain two time-separated components. Whether these components overlap and appear as a single signal on axis has received little attention. Here, the scaled reassigned spectrogram analysis was used to examine if bottlenose dolphin (Tursiops truncatus) clicks measured near- or on-axis of the echolocation beam contained overlapping components. Across click trains, the number of overlapping components varied spatially within the echolocation beam. Two overlapping components were found to predominantly occur in the upper portion of the beam, whereas the lower portion of the beam predominantly contained a single component. When components overlapped, the trailing component generally had a higher center frequency and arrived less than 5 μs after the leading component. The spatial relationship of components was consistent with previous findings of two vertically distinct beam lobes with separated frequency content. The two components in the upper portion of the beam possibly result from a single transient click propagating through a geometrically dispersive medium; specifically, the slower sound speed of the dolphin melon's core slightly delays the more directional, high frequency energy of the click, whereas the less directional, lower frequency energy propagates through more peripheral but higher sound speed portions of the melon.

Measurements of beluga (Delphinapterus leucas) and bottlenose dolphin (Tursiops truncatus) echolocation signals at extreme off-axis angles have been demonstrated to contain two transient components (Au et al., 2012a,b; Finneran et al., 2014; Lammers and Castellote, 2009). At 90° to the longitudinal axis of the animal these components were separated by approximately 250–300 μs and 130 μs, respectively, for the two species. Time differences decreased as the measurement angle decreased toward the main response axis (MRA) of the beam until the measured signal close to the MRA appeared to consist of just one signal. For the case with the bottlenose dolphin, the second of the two signal components measured at extreme off-axis angles had a higher center frequency than the first. Using the timing differences between the off-axis components, various hypotheses have been proposed as to the origin of the odontocete echolocation click.

Lammers and Castellote (2009) concluded from their off-axis click measurements that two active sources combine to generate the echolocation click, a proposition that has been supported by Cranford (2011) and Cranford et al. (2011). Conversely, Au et al. (2012b) interpreted their off-axis click measurements as an indication of one active source generating one signal, but with two measurable signal components in the far-field arising from the signal propagation path. Au et al. (2012b) hypothesized that the first component in the 90° position relative to the beam axis followed an acoustic propagation path with tissues acting as a low-pass filter. The second, higher frequency component was a reflected signal that followed a slightly longer acoustic propagation path, but which did not experience low-pass filtering.

Finneran et al. (2014) later compared the timing of the two off-axis components with a simple point reflector model. They found that the best-fit location for the hypothesized reflector was 17–19 cm forward and 13–14 cm below or above the sound source. Since the position 13–14 cm above the sound source would be located in the water over the animal, they assumed the location of the reflector was below the source at a location that coincided with the premaxillary bones. The premaxillary bones are large and do not intuitively conform to a point reflector, as assumed in their model, and it is debatable if they are the reflective source responsible for the second component.

The previously mentioned studies primarily analyzed off-axis signals to arrive at their conclusions. Fewer investigations have focused on the potential for these same components to overlap on-axis, and if they do, whether they can be distinguished in the time and frequency domains. Information obtained from such studies would benefit our understanding of echolocation click generation. Capus et al. (2007) used fractional Fourier transforms, a signal processing method designed to detect and enhance the time-frequency sweeps in signals, on on-axis echolocation clicks. They found that the majority of clicks analyzed had energy distributed into a high and low frequency band with the high frequency band typically delayed by 5–20 μs. The presence of spatially and spectrally separated frequency components across the beam was also shown by Starkhammar et al. (2011). However, the time separation of the spatially and spectrally separated components was not analyzed in that study, and it is not possible to say whether the components were related to the off-axis components measured in other studies.

Further in-depth studies of the detailed time and frequency characteristics of individual clicks, conducted on the time scale of single clicks, are lacking, possibly due to a deficiency in reliable signal processing and time-frequency representation methods suitable for transient signals. There exists a plethora of methods suitable for long harmonic signals, e.g., Auger and Flandrin (1995); Daubechies et al. (2011); Khan and Sandsten (2016); Reinhold and Sandsten (2017); Stanković et al. (2014), but these are not suitable for short transients with few periods (e.g., <5 periods). However, a reassigned spectrogram algorithm designed for time and frequency localization of Gaussian envelope harmonic pulses was developed by Hansson-Sandsten and Brynolfsson (2015) [with theoretical evaluation in Brynolfsson and Sandsten (2018)]. Further, a component detection and estimation algorithm, called the scaled reassigned spectrogram for transient signals (ReSTS), was developed by Reinhold et al. (2018). The ReSTS separates spurious noise components from true Gaussian envelope components and has been thoroughly tested and benchmarked (Reinhold et al., 2018). Since harmonic pulses with a Gaussian envelope have been proposed as a model for dolphin clicks (Au, 1993), the ReSTS might also be suitable for automatic component detection and estimation in bottlenose dolphin echolocation clicks (Au, 1993; Brynolfsson et al., 2019).

The study presented here uses ReSTS to resolve the signals captured inside the dolphin echolocation beam (hydrophone array coverage ±26° horizontally and ±11.6° vertically, not at extreme off-axis angles) to investigate if the two components found at the extreme off-axis angles are also present and overlapping in time near the MRA of the echolocation beam. Transient components and how they vary spatially and temporally across the echolocation beam are described. An alternative explanation of the formation of the two previously described components is presented and supported by a three-dimensional (3-D) finite element (FE) model. Results are used to interpret how the acoustic field created by the echolocation click is affected by the fatty melon and shed light on the timing and origin of the temporally, spatially, and spectrally separated peak components previously identified (Capus et al., 2007; Starkhammar et al., 2011).

The data analyzed here are from recordings of the echolocation click trains of a male bottlenose dolphin that previously have been described in depth (Moore et al., 2008; Starkhammar et al., 2011). Briefly, the clicks were recorded with a curved, diamond shaped 29-channel hydrophone array while the dolphin performed an on-axis target detection task from a bite plate station (Fig. 1). The curvature of the array kept the hydrophones approximately equidistant (1.2 m) from the animal's phonic lips (the source of the echolocation click). Targets were presented 9.4 m from the bite plate station. The animal's position on the bite plate during trials was monitored by video; trials where the animal was not correctly positioned were not included in this study. Each hydrophone was coupled to an analog in-line filter amplifier (Reson VP 1000, Reson Inc., Slangerup, Denmark) set for 20 dB of gain and bandpass filtered from 10 to 200 kHz. For each hydrophone, the echolocation clicks were digitized with an analog to digital converter at a 312.5 kHz sample rate (per channel) and with a 16 bit resolution.

FIG. 1.

(Color online) Experimental setup. (a) The position of the curved hydrophone array relative to the dolphin's position on the bite plate. (b) The hydrophone array configuration and channel numbering. The horizontal 6.5° and the vertical 2.9° angles indicate the hydrophone spacing relative to the presumed location of the sound source.


1. Scaled reassignment of echolocation click spectrograms

Measurement data were analyzed using the scaled reassigned spectrogram for transient signals (ReSTS) as developed by Reinhold et al. (2018). This method was chosen since it was developed for the purpose of detecting and counting overlaying transient signals. The method assumes that the dolphin echolocation clicks can be modeled by Gaussian envelope harmonic signals (also called Gabor functions), which is a common assumption (Au, 1993; Brynolfsson et al., 2019). Thus the signals are modeled as

$$x(t)=\sum_{k=1}^{K} a_k\, x_G(t-t_k)\, e^{i2\pi f_k t}\, e^{i2\pi\phi_k}, \tag{1}$$

with $a_k$ as the amplitude, $t_k$ and $f_k = \omega_k/2\pi$ the time and frequency centers, and $\phi_k \in [0, 1)$ the phase shift. The unit energy Gaussian envelope is defined as

$$x_G(t)=\sigma^{-1/2}\,\pi^{-1/4}\, e^{-t^{2}/(2\sigma^{2})}, \tag{2}$$

where the length of the envelope is determined by the scaling parameter σ. The traditional windowed spectrogram is

$$S_x^{h}(t,f)=\left|\int x(s)\, h^{*}(s-t)\, e^{-i2\pi f s}\, ds\right|^{2}, \tag{3}$$

where integrals run from −∞ to ∞, and the scaled reassigned spectrogram is defined as

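$$\mathrm{ReS}_x^{h}(t,\omega)=\iint S_x^{h}(s,\xi)\,\delta\big(t-\hat{t}_x(s,\xi),\,\omega-\hat{\omega}_x(s,\xi)\big)\, ds\, d\xi, \tag{4}$$

using the two-dimensional Dirac impulse, $\iint f(t,\omega)\,\delta(t-t_0,\omega-\omega_0)\,dt\,d\omega = f(t_0,\omega_0)$. The reassignment coordinates are calculated according to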

$$\hat{t}_x(t,\omega)=t+2\,\Re\!\left(\frac{F_x^{th}(t,\omega)}{F_x^{h}(t,\omega)}\right),\qquad \hat{\omega}_x(t,\omega)=\omega-2\,\Im\!\left(\frac{F_x^{dh/dt}(t,\omega)}{F_x^{h}(t,\omega)}\right), \tag{5}$$

where $\Re$ and $\Im$ represent the real and imaginary parts, and $F_x^{h}$, $F_x^{th}$, and $F_x^{dh/dt}$ are short-time Fourier transforms. Choosing the window $h(t) = x_G(t)$, i.e., a Gaussian window of the same length as the Gaussian signal envelope, gives optimal time-frequency localization of the signal components (Brynolfsson and Sandsten, 2018; Hansson-Sandsten and Brynolfsson, 2015; Reinhold et al., 2018).
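The signal model of Eqs. (1) and (2) and the matched-window spectrogram of Eq. (3) can be illustrated with a short numerical sketch. This is not the ReSTS implementation of Reinhold et al. (2018); it only evaluates the ordinary spectrogram at a single time-frequency point by a Riemann sum. The function names, the component values, and the integration step are hypothetical, and the sign conventions of the complex exponentials (positive in the signal, negative in the transform kernel) are assumed.

```python
import cmath
import math


def gaussian_envelope(t, sigma):
    """Unit-energy Gaussian envelope x_G(t), Eq. (2)."""
    return sigma ** -0.5 * math.pi ** -0.25 * math.exp(-t * t / (2.0 * sigma * sigma))


def gabor_signal(t, components, sigma):
    """Sum of Gaussian-envelope harmonic components, Eq. (1).

    components is a list of (a_k, t_k, f_k, phi_k) tuples (hypothetical layout).
    """
    total = 0j
    for a, tk, fk, phik in components:
        total += (a * gaussian_envelope(t - tk, sigma)
                  * cmath.exp(2j * math.pi * fk * t)
                  * cmath.exp(2j * math.pi * phik))
    return total


def spectrogram_point(components, sigma, t, f, dt, n=2001):
    """Riemann-sum approximation of Eq. (3) at one (t, f) point.

    Uses the matched window h = x_G; the sum covers n samples centered on t.
    """
    half = n // 2
    acc = 0j
    for i in range(-half, half + 1):
        s = t + i * dt
        acc += (gabor_signal(s, components, sigma)
                * gaussian_envelope(s - t, sigma)      # real window, so h* = h
                * cmath.exp(-2j * math.pi * f * s)) * dt
    return abs(acc) ** 2


# Hypothetical example values (not taken from the recordings).
sigma = 6.4e-6                                   # envelope scaling in seconds
dt = 0.1e-6                                      # integration step in seconds
one_component = [(1.0, 50e-6, 50e3, 0.0)]        # (a, t_k, f_k, phi_k)

# Numerical energy of the envelope; should be close to 1 by construction.
energy = sum(gaussian_envelope(i * dt, sigma) ** 2 * dt for i in range(-1000, 1001))

# Spectrogram at the component center and at a point 50 kHz above it.
s_center = spectrogram_point(one_component, sigma, 50e-6, 50e3, dt)
s_offset = spectrogram_point(one_component, sigma, 50e-6, 100e3, dt)
```

For a single unit-amplitude component and a matched window, the spectrogram value at the component center approaches 1 and falls off smoothly away from it, which is the smearing that reassignment is intended to counteract.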

The advantage of the scaled reassigned spectrogram over the traditional spectrogram is visualized by Fig. 2, which shows an echolocation click, sampled simultaneously by the vertically spaced hydrophone channels (ch) 9 and 21 with a sample rate of 312.5 kHz [see Fig. 1(b) for hydrophone placement]. The two signals have different characteristics; the one recorded at a vertically higher position within the echolocation beam (ch 9) is bi-modal (the spectrum contains two local peak frequencies), and the other, recorded at a vertically lower position (ch 21), is uni-modal (the spectrum contains one local peak frequency). The left column of the subplots [(a), (c), (e), and (g)] in Fig. 2 contains data obtained on ch 9 and the right column [(b), (d), (f), and (h)] contains data obtained on ch 21. Figures 2(a) and 2(b) show time-waveforms of the recorded echolocation click expressed in measured volts. The peak-to-peak sound pressure level of the signal in Fig. 2(a) is 192 dB re 1 μPa and in Fig. 2(b) 190 dB re 1 μPa.

FIG. 2.

(Color online) Click component detection using the ReSTS algorithm for the 11th click in a series of 18 clicks recorded simultaneously on hydrophone channel (ch) 9 (left column of subplots) and ch 21 (right column of subplots) during a target detection task. (a),(b) Measured time-waveform of the click (note undersampling) recorded on ch 9 and ch 21. (c),(d) Spectrogram of the click recorded on ch 9 and ch 21 with overlaid red circles marking the components detected using the ReSTS method. As reference, the margin plots show the spectra of the signals and share the frequency axis of the spectrograms. (Colormap ranges from low relative energy in black to high in white.) (e),(f) The ReSTS of the click recorded on ch 9 and ch 21 with overlaid red circles marking the detected components. (Colormap ranges from low relative energy in white to high in black.) (g) Classification algorithm results where two peaks (within one echolocation click recorded on ch 9) are identified as separate Gaussian components and the rest of the peaks are classified as noise. (h) Classification algorithm results where one peak (within one echolocation click recorded on ch 21) is identified as a Gaussian component and the rest of the peaks are classified as noise.


Figures 2(c) and 2(d) show the Gaussian windowed spectrograms of the click (σ = 2, window length = 8 samples, overlap = 7 samples). Although the traditional spectrogram and the corresponding frequency marginal in Fig. 2(c) suggest that the signal from ch 9 contains more than one transient component, reliable identification of the components would not be possible as they overlap heavily in time and frequency. The scaled reassigned spectrogram localizes the signal energy around the time-frequency centers of individual components, thus making reliable identification of the two click components possible [Fig. 2(e)]. The spectrogram and frequency marginal [Fig. 2(d)] and the scaled reassigned spectrogram [Fig. 2(f)] of the (uni-modal) signal from ch 21 indicate that only one transient component is present.

Note that the scaled reassignments in Figs. 2(e) and 2(f) are computed with the same parameters as for the Gaussian windowed spectrograms in Figs. 2(c) and 2(d). Opposite colormaps were deliberately chosen for Figs. 2(c) and 2(d) and 2(e) and 2(f) to enhance visibility for each of the two method outputs.

2. Component classification using the ReSTS method

The signal energy of the scaled reassigned spectrogram is more localized compared to the traditional spectrogram. For the recorded clicks, this results in a time-frequency plane with several local maxima, or peaks. The ReSTS algorithm automatically identifies these peaks, determines which peaks correspond to click components (and thereby the number of click components in each signal), and finds the time and frequency centers of each individual click component (Reinhold et al., 2018). This process allows for automatic and objective classification of overlapping click components.

The output of the ReSTS algorithm is the number of estimated click components $\hat{K}$, as well as the time locations, $\hat{t}_1, \ldots, \hat{t}_{\hat{K}}$, and the corresponding frequency locations, $\hat{f}_1, \ldots, \hat{f}_{\hat{K}}$, of the estimated click components. This is shown in Fig. 2 by the red circles on the time-frequency plots. The circles mark the detected click components for both the bi-modal and uni-modal signal at their respective time and frequency center location. The first peak amplitudes shown in Figs. 2(g) and 2(h) are related to how much signal energy each click component contains; however, noise present in the signals also forms peaks after reassignment. The peaks that correspond to noise have lower amplitudes, even if the signal-to-noise ratio is low (Reinhold et al., 2018), which is utilized by the ReSTS algorithm when detecting which peaks correspond to click components.

To use the ReSTS method, the user decides an upper limit for the maximum number of components for the signal, Kmax, and approximate resolutions, which have been shown to be δt = 2σ in time and δf = 1/(πσ) in frequency (Reinhold et al., 2018). The σ to be used is the same scaling parameter used for the Gaussian window of the spectrogram, Eq. (2), which should match the scaling of the click components for best performance. If unknown, the parameter can be estimated through the full-width half-maximum (FWHM) relationship for Gaussian pulses, $\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma$, or if preferred, methods using the local Rényi entropy (Brynolfsson and Sandsten, 2018; Hansson-Sandsten and Brynolfsson, 2015). For the signal in Fig. 2 the time and frequency resolutions are 7 μs and 13 kHz, respectively.

The ReSTS algorithm will detect click components that are separated by at least δt = 2σ in time, even if the components have entirely overlapping frequencies, and it will detect click components that are separated by at least δf= 1/(πσ) in frequency, even if the components overlap entirely in time. Thus only if two click components are both closer in time than δt = 2σ and closer in frequency than δf = 1/(πσ), will they not be detected as two components but one single component.
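The resolution rule described above can be written as a small check. The following is a minimal sketch with hypothetical function names and a hypothetical σ; it encodes only the stated limits δt = 2σ and δf = 1/(πσ) and the FWHM relationship, not the detection algorithm itself.

```python
import math


def sigma_from_fwhm(fwhm):
    """Invert FWHM = 2*sqrt(2*ln 2)*sigma for a Gaussian pulse."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def resolutions(sigma):
    """Approximate resolutions (delta_t, delta_f) = (2*sigma, 1/(pi*sigma))."""
    return 2.0 * sigma, 1.0 / (math.pi * sigma)


def expected_as_two(dt, df, sigma):
    """True if components separated by dt (s) and df (Hz) are expected to be
    reported as two components: separation in either dimension suffices."""
    delta_t, delta_f = resolutions(sigma)
    return abs(dt) >= delta_t or abs(df) >= delta_f


# Hypothetical scaling parameter, chosen only for illustration.
sigma = 10e-6  # seconds
delta_t, delta_f = resolutions(sigma)

same_time_far_freq = expected_as_two(0.0, 40e3, sigma)      # separated in frequency only
same_freq_far_time = expected_as_two(25e-6, 0.0, sigma)     # separated in time only
close_in_both = expected_as_two(5e-6, 10e3, sigma)          # closer than both limits
```

Only the last case, where the pair is closer than both limits at once, is expected to merge into a single detected component.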

3. Statistical analysis method

The occurrence of overlaid transients across the hydrophone array was statistically assessed by the nonparametric Wilcoxon rank-sum test. The test compared how the number of overlaying transients varied across the hydrophone array and across sequential click numbers during target detection task trials, and was thus used to determine which hydrophones recorded the signals that were most interesting to analyze.

The Wilcoxon rank-sum test was used to test if two sets of data were sampled from continuous distributions with equal population means, against the alternative that they are not. Hence, it was used to test if the occurrences of overlaid transients varied similarly from click-to-click between hydrophone channel numbers and during click trains. The Wilcoxon rank-sum test was chosen because it does not require normal distributions of the number of detected overlaying click components across the hydrophone array. Instead, it tests the null-hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample.
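To make the test concrete, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the large-sample normal approximation. It is not the routine used in the study: the function name and the example counts are hypothetical, ties receive average ranks, and no tie correction is applied to the variance.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Ties get average ranks; the variance is not tie-corrected.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0   # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    # Rank sum of the first sample and its moments under the null hypothesis.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided tail probability
    return z, p


# Hypothetical per-train counts of 2C clicks on two channels (illustration only).
counts_channel_a = [1, 2, 3, 4, 5, 6, 7, 8]
counts_channel_b = [11, 12, 13, 14, 15, 16, 17, 18]

z_diff, p_diff = rank_sum_test(counts_channel_a, counts_channel_b)
z_same, p_same = rank_sum_test(counts_channel_a, counts_channel_a)
```

A small p-value leads to rejecting the null hypothesis for that channel pair, while channels for which it cannot be rejected are the ones marked as similar to the reference channel.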

A simple acoustic 3-D FE model was created to visualize the acoustic field as the transient signal travels through head tissues and into the water. The melon was roughly modeled as a prolate spheroid in which sound velocity changes linearly from the center axis along the radial direction perpendicular to the most elongated radii, Fig. 3(a). The core sound velocity was set to 1000 m/s and the edge velocity to 1440 m/s. The speed of sound in the surrounding water was set to 1500 m/s. The source was modeled as a point source with the Gaussian windowed transient signal s(t) [Eq. (6)] output presented in Fig. 3(b):

$$s(t)=A\, e^{-\left[(t-3.5\sigma)^{2}/(1\sigma^{2})\right]}\sin(2\pi t f_0), \tag{6}$$

where A = 0.01, σ = 12.5 μs and f0 = 60 kHz.
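The source pulse of Eq. (6) can be generated numerically as follows. This is a sketch that assumes the exponent carries a negative sign (a decaying Gaussian centered at 3.5σ); the sampling rate and function names are hypothetical and are not taken from the FE model.

```python
import math

A = 0.01          # amplitude, from the text
SIGMA = 12.5e-6   # scaling parameter in seconds, from the text
F0 = 60e3         # carrier frequency in Hz, from the text


def envelope(t, a=A, sigma=SIGMA):
    """Gaussian envelope of Eq. (6), centered at 3.5*sigma (negative exponent assumed)."""
    return a * math.exp(-((t - 3.5 * sigma) ** 2) / (1.0 * sigma ** 2))


def source_pulse(t, a=A, sigma=SIGMA, f0=F0):
    """Gaussian windowed transient s(t) of Eq. (6)."""
    return envelope(t, a, sigma) * math.sin(2.0 * math.pi * t * f0)


# Hypothetical sampling rate, used only to tabulate the pulse over 0 .. 7*sigma.
FS = 5e6
samples = [source_pulse(n / FS) for n in range(int(7 * SIGMA * FS) + 1)]
peak = max(abs(v) for v in samples)
```

The 3.5σ offset places the envelope maximum far enough from t = 0 that the pulse starts from a value close to zero.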

FIG. 3.

(a) Two-dimensional sketch of the acoustic three-dimensional finite element model. (b) Gaussian point source pulse.


The sound pressure over time was sampled at the measurement point Mp. The lower of the sound velocities was deliberately set to 1000 m/s, which is lower than the previously measured lowest core sound velocities (1265 m/s) (Norris and Harvey, 1974), to better demonstrate the effect of the sound velocity gradient on signal propagation.

The 3-D model is intended to show how a sound velocity gradient in a spheroid medium splits a single input pulse into two temporally separated pulses. The model does not illustrate the physical properties of a real dolphin melon. A video of the simulation can be found as supplementary material.
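A back-of-the-envelope calculation indicates the size of delay that such a velocity gradient can produce. The sketch below assumes straight propagation paths of equal length through the core and along the edge, which ignores refraction and is far simpler than the FE model; the path length and function names are hypothetical.

```python
def radial_sound_speed(r, radius, c_core=1000.0, c_edge=1440.0):
    """Linear sound-speed profile (m/s) from the core (r = 0) to the edge (r = radius)."""
    return c_core + (c_edge - c_core) * (r / radius)


def straight_path_delay(length, c_slow, c_fast):
    """Extra travel time (s) over a straight path of the given length (m)
    at speed c_slow compared with speed c_fast."""
    return length / c_slow - length / c_fast


# Hypothetical path length through the modeled melon (illustration only).
path_length = 0.10  # meters

# Model velocities from the text: 1000 m/s in the core, 1440 m/s at the edge.
delay_model = straight_path_delay(path_length, 1000.0, 1440.0)

# Same path with the lowest measured core velocity cited in the text (1265 m/s).
delay_measured_core = straight_path_delay(path_length, 1265.0, 1440.0)

# Sound speed halfway between the core and the edge of the spheroid.
c_mid = radial_sound_speed(0.5, 1.0)
```

Under these assumptions the exaggerated model velocity gives a clearly larger delay than the measured core velocity, which is the stated reason for choosing it in the demonstration.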

The results of the Wilcoxon rank-sum test are visualized in Figs. 4 and 5. The test is used to see if the number of one component (1C) clicks, uni-modal signals as seen in Fig. 2(b), and two overlaying component (2C) clicks, bi-modal signals as seen in Fig. 2(a), vary in a similar way for all clicks, across different channels. To investigate along the vertical axis, ch 5 and ch 25 were analyzed. These were chosen because they were positioned in the center of the upper and lower part of the hydrophone array, respectively, in positions where previous research by Starkhammar et al. (2011) reported spatially separated frequency components. It is shown that, for both ch 5 and ch 25, only the neighboring channels on the hydrophone array, marked by green in Fig. 4, could have the same continuous distributions describing the number of 1C and 2C clicks during the click trains.

FIG. 4.

(Color online) Statistical Wilcoxon rank-sum test results of (a) testing all channels against ch 5 and (b) testing all channels against ch 25. Two null hypotheses were tested for both channels, the first being that the counts of occurrences of one component clicks, within the click trains, come from the same continuous distribution (equal population mean), the second being that the counts of occurrences of two component clicks, within the click trains, come from the same continuous distribution. Green indicates channels where both (logical AND) null hypotheses could not be rejected.

FIG. 5.

(Color online) Statistical Wilcoxon rank-sum test results of testing all channels against ch 13. The null hypothesis of the test is that the counts of occurrences of two component clicks, within the click trains, come from the same continuous distribution. Green indicates channels where the null hypothesis could not be rejected.


Along the horizontal axis, the distribution of the number of 2C clicks could be similar for more than just the neighboring hydrophones. The results of the Wilcoxon rank-sum test on ch 13, shown in Fig. 5, where hydrophones on the left and right side of the array are marked green, indicate that the hypothesis of similar distributions cannot be rejected. (Given the low peak-to-peak amplitude in the horizontally far spaced hydrophones, the test on ch 13 was only done for 2C clicks whereas the tests on ch 5 and ch 25 were done for both 1C and 2C clicks, emphasized by the different tones of green in the figures.)

The Wilcoxon rank-sum test thus shows an asymmetry along the vertical dimension of the array in how components vary over the course of the click series, prompting a more rigorous investigation of channels along the vertical center line. The test also shows symmetry along the horizontal dimension, suggesting that investigation of additional channels along the horizontal axis would yield similar results. These results are in line with previous findings of two vertically separated components described in Starkhammar et al. (2011).

Following the results of the Wilcoxon rank-sum test, ch 1, 9, 21, and 29 along the vertical center column of the hydrophone array were further investigated by detecting the number of components in clicks recorded by these channels. Figure 6 shows the number of clicks containing 1, 2, 3, or ≥4 signal components detected from the recordings of the bottlenose dolphin solving an on-axis target detection task. The results in Fig. 6 are derived from 50 click trains. The number of clicks in each train varied from click train to click train. The left column of Fig. 6 contains the count of clicks with one to four or more components and the right column contains the percentage of the clicks with one to four or more components as a function of click position in the click trains.
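The counts and percentages described for Fig. 6 amount to a tally per click position. A minimal sketch follows, assuming each detection is available as a (click number, number of components) pair with at least one component; the data layout, function names, and example values are hypothetical.

```python
from collections import Counter, defaultdict


def tally_components(detections):
    """Count clicks with 1, 2, 3, or >=4 components for each click number.

    detections is an iterable of (click_number, n_components) pairs.
    """
    counts = defaultdict(Counter)
    for click_number, n_components in detections:
        label = str(n_components) if n_components < 4 else ">=4"
        counts[click_number][label] += 1
    return counts


def percentages(counter):
    """Convert the counts for one click number into percentages."""
    total = sum(counter.values())
    return {label: 100.0 * n / total for label, n in counter.items()}


# Hypothetical detections for one channel pooled over several click trains.
detections = [(1, 1), (1, 1), (1, 2), (1, 5), (2, 2), (2, 2)]
counts = tally_components(detections)
share_click_1 = percentages(counts[1])
```

Because the number of clicks differs between trains, the percentage view normalizes each click position by how many trains actually reached it.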

FIG. 6.

(Color online) Number of clicks containing 1, 2, 3, or ≥4 components measured on ch 1, ch 9, ch 21, and ch 29. The channel numbers are stated over each graph. The left column of plots contains the absolute occurrence count and the right column contains the percentage of the four click composition alternatives within each click number. The relative mean MRA amplitude over click numbers is plotted as a thick red line on top of the histograms as a reference. The relative mean amplitude over click numbers of each channel is plotted as a thin red line on top of the histograms. Mean values are calculated over trials.


The echolocation beam during the ramp-up period (first four clicks in the beginning of the click train) mainly consists of 1C clicks, whereas 2C clicks dominate the upper part of the beam (ch 1 and ch 9) for click numbers 7–14. The number of occurrences of overlaid transients decreases gradually in the upper part of the beam across click numbers 14–25. The lower part of the beam (ch 21 and ch 29) shows mainly 1C clicks during the entire click sequence. The last part of the click train is dominated by 1C clicks across the entire measured vertical beamwidth. This dynamic behavior between 1C and 2C clicks in the upper part of the beam is indicative of some sort of on/off functionality of the second component within the dolphin click train. These results are consistent with previous research (Starkhammar et al., 2011) where two separated beam lobes were described by frequency dependent beam patterns. However, the ReSTS method provides information on center times and center frequencies of the overlaying components, making it possible to analyze the order in which the two components arrive at the hydrophone positions, their relative time separation, and their center frequencies.

For 2C clicks, the ReSTS gives two time centers t1, t2 and two frequency centers f1, f2, where t1, f1 correspond to the component with the stronger peak amplitude and t2, f2 the component with the weaker peak amplitude. Similarly, for 1C clicks, the time-frequency center t1, f1 of that component is obtained. Figures 7, 8, and 9 show histograms of the center frequencies for 1C clicks (a) and for 2C clicks (b), recorded on ch 5, ch 15, and ch 25 on the vertical center column of the hydrophone array. These results indicate that the center frequencies of the component in the 1C clicks are quite similar, around 45–60 kHz, to the center frequency of one of the components in the 2C clicks from ch 5, ch 15, and ch 25. The 2C clicks, from ch 5, ch 15, and ch 25 also have a second component with a center frequency around 90–105 kHz. The separation in frequency centers of the two components in the 2C clicks means that the components, relative to each other, can be defined as the high frequency (HF) and the low frequency (LF) components.

FIG. 7.

(Color online) (a) Histogram of center frequencies for one component clicks recorded on ch 5. (b) Histogram of center frequencies for all two component clicks recorded on ch 5. (The color of the bar of the weaker component is semi-transparent.) The gray bar corresponds to the sum of the blue and green bar.

FIG. 8.

(Color online) (a) Histogram of center frequencies for one component clicks recorded on ch 15. (b) Histogram of center frequencies for all two component clicks recorded on ch 15. (The color of the bar of the weaker component is semi-transparent. The gray bar corresponds to the sum of the blue and green bar.)

FIG. 9.

(Color online) (a) Histogram of center frequencies for one component clicks recorded on ch 25. (b) Histogram of center frequencies for all two component clicks recorded on ch 25. (The color of the bar of the weaker component is semi-transparent.) The gray bar corresponds to the sum of the blue and green bar.


Across all measured clicks, the mean and median time separation (Δt) of the components within the clicks were smaller at the upper vertical channels (ch 1, ch 5, and ch 9) compared to the lower vertical channels (ch 21, ch 25, and ch 29), as seen in Table I. This is true both when Δt is calculated as the difference of center times between the stronger and weaker component, and as the difference in times between the LF and HF component. This suggests either that the weaker component traveled a longer distance when measured in the lower part of the array or that the entire acoustic wave packet passed through a dispersive medium in which higher frequencies traveled slower than lower frequencies. However, the number of 2C clicks was small in the lower half of the array near the vertical midline and represented only a small portion of the total number of observed 2C clicks. There is also a difference between the mean and median Δt for these three lower channels, suggesting some inconsistencies; the time differences observed in that region should therefore be interpreted cautiously.
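The two ways of defining Δt can be sketched as follows, using the sign convention given in the caption of Table I (a positive weaker–stronger value means the stronger component arrives first; a positive HF–LF value means the LF component arrives first). The dictionary layout, function name, and example values are hypothetical.

```python
import statistics


def time_differences(c1, c2):
    """Return (weaker-minus-stronger, HF-minus-LF) time differences in seconds.

    Each component is a dict with center time 't' (s), center frequency 'f' (Hz),
    and peak amplitude 'amp' in the scaled reassigned spectrogram.
    """
    stronger, weaker = (c1, c2) if c1["amp"] >= c2["amp"] else (c2, c1)
    lf, hf = (c1, c2) if c1["f"] <= c2["f"] else (c2, c1)
    return weaker["t"] - stronger["t"], hf["t"] - lf["t"]


# Hypothetical 2C clicks (illustration only, not measured data).
clicks = [
    ({"t": 0.0, "f": 50e3, "amp": 1.0}, {"t": 3.2e-6, "f": 100e3, "amp": 0.5}),
    ({"t": 0.0, "f": 50e3, "amp": 0.4}, {"t": 3.2e-6, "f": 100e3, "amp": 1.0}),
    ({"t": 0.0, "f": 55e3, "amp": 1.0}, {"t": 6.4e-6, "f": 95e3, "amp": 0.7}),
]

pairs = [time_differences(a, b) for a, b in clicks]
mean_ws = statistics.mean(p[0] for p in pairs)
median_ws = statistics.median(p[0] for p in pairs)
mean_hl = statistics.mean(p[1] for p in pairs)
median_hl = statistics.median(p[1] for p in pairs)
```

In the second hypothetical click the HF component is the stronger one, so the weaker–stronger difference changes sign while the HF–LF difference does not; this is why the two definitions can give different means and medians for the same channel.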

TABLE I.

Time differences (Δt) in μs between components in two-component (2C) clicks for the channels on the vertical center column of the hydrophone array and the two channels to the very far left and right of the array center. The two components have different center frequencies (given by ReSTS), thus relative to each other they can be referred to as the high frequency (HF) and low frequency (LF) components. The components also have different peak amplitudes in the scaled reassigned spectrogram, thus they can also be classified as relatively stronger or weaker. The mean and median Δt are calculated separately using both ways of defining the two components in each click. (Δt > 0 in columns 5 and 6 indicates that the stronger component hits the array first, whereas Δt > 0 in columns 7 and 8 indicates that the LF component hits the array first.)

| Ch | No. of 2C click occurrences | Mean abs. Δt | Median abs. Δt | Weaker–stronger mean Δt | Weaker–stronger median Δt | HF–LF mean Δt | HF–LF median Δt |
|---|---|---|---|---|---|---|---|
| 1 | 512 | 6.5 | 3.2 | 3.3 | 3.2 | 5.8 | 3.2 |
| 5 | 502 | 7.6 | 3.2 | 0.5 | 0.0 | 5.1 | 3.2 |
| 9 | 459 | 7.4 | 3.2 | 5.4 | 3.2 | 7.3 | 3.2 |
| 12 | 453 | 25.0 | 22.4 | −0.9 | −19.2 | 22.5 | 22.4 |
| 15 | 383 | 5.1 | 3.2 | 4.2 | 0.0 | 4.7 | 0.0 |
| 18 | 174 | 22.4 | 28.8 | 21.0 | 28.8 | 8.6 | 9.6 |
| 21 | 129 | 13.7 | 3.2 | 12.7 | 3.2 | 10.5 | 3.2 |
| 25 | 71 | 52.8 | 9.6 | 19.8 | 6.4 | 45.4 | 3.2 |
| 29 | 132 | 52.3 | 16.0 | 34.4 | 16.0 | 35.5 | 0.0 |

If previous suggestions of alternative pathways for the weaker component were correct, a larger time difference would be expected in the upper part of the beam, at ch 1 and ch 5, than in the more central part of the array, at ch 15. As seen in Table I, however, the time differences are roughly the same for the channels on the upper vertical midline, at ch 1, ch 5, ch 9, and ch 15.

The horizontally spaced ch 12 and ch 18 show a large time separation with reversed sign between the stronger and weaker components. This could suggest that the signals are generated in some horizontally spaced manner, e.g., with two horizontally spaced sources. However, the sign was not reversed when considering the time separation of the HF and LF components. At ch 12, the HF component arrived first in only 4% of the cases, regardless of its amplitude relative to the LF component. The HF component was, however, the stronger one in 58% of the cases, which explains the negative time differences in Table I. The same calculation for ch 18 shows that here, too, the HF component tended to arrive later than the LF component, although a little less often. The HF component arrived first in 24% of the cases, explaining the smaller mean and median Δt compared to ch 12, but not indicating a substantial difference between ch 12 and ch 18.
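The sign behavior described above can be illustrated with a minimal sketch. All click values below are invented for illustration and are not data from the study; the function names are hypothetical. The sketch only shows how the weaker–stronger Δt can turn negative while the HF–LF Δt stays positive when the later-arriving HF component is often the stronger one.

```python
from statistics import mean

# Hypothetical two-component clicks; every number below is invented for
# illustration and is NOT data from the study.
# Each component is (center time in microseconds, peak amplitude).
clicks = [
    {"lf": (0.0, 1.0), "hf": (22.4, 1.5)},  # HF arrives later and is stronger
    {"lf": (0.0, 1.0), "hf": (22.4, 1.3)},  # HF arrives later and is stronger
    {"lf": (0.0, 1.0), "hf": (22.4, 0.8)},  # HF arrives later but is weaker
]


def dt_hf_minus_lf(click):
    """HF-LF time difference: positive when the LF component arrives first."""
    return click["hf"][0] - click["lf"][0]


def dt_weaker_minus_stronger(click):
    """Weaker-stronger time difference: positive when the stronger arrives first."""
    lf, hf = click["lf"], click["hf"]
    stronger, weaker = (hf, lf) if hf[1] > lf[1] else (lf, hf)
    return weaker[0] - stronger[0]


hf_lf = [dt_hf_minus_lf(c) for c in clicks]
weak_strong = [dt_weaker_minus_stronger(c) for c in clicks]

print("HF-LF:", hf_lf, "mean:", mean(hf_lf))
print("weaker-stronger:", weak_strong, "mean:", mean(weak_strong))
```

In this toy case the HF–LF differences are all positive, whereas the weaker–stronger differences are mostly negative, so a reversed sign under the amplitude-based definition does not by itself imply a reversed arrival order.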

The time differences between components in the upper part of the echolocation beam, where 2C clicks are more frequently observed, are around 5 μs. If this time difference is viewed as a time delay due to a longer acoustic path, 5 μs corresponds to an acoustic path that is approximately 7.0 mm longer (assuming a mean sound velocity of 1400 m/s in the tissue). However, longer acoustic paths are not the only possible explanation for a time delay between components with different frequencies. Therefore, three hypotheses are proposed which might explain the time differences: (A) reflections off the air sacs or bone located to the side of or behind the sound source, (B) signal separation due to dispersive media, and (C) generation of two components by two separate sources.
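The conversion from time delay to extra path length is a simple product of sound speed and delay. A minimal sketch, assuming the 1400 m/s mean tissue sound velocity stated above (the function name is hypothetical):

```python
def extra_path_length_mm(delay_us, sound_speed_m_per_s=1400.0):
    """Extra acoustic path (mm) that would produce the given delay (microseconds)."""
    delay_s = delay_us * 1e-6
    return sound_speed_m_per_s * delay_s * 1e3  # metres -> millimetres


path_mm = extra_path_length_mm(5.0)
print(f"Extra path for a 5 us delay: {path_mm:.1f} mm")
```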

Considering hypothesis A, assume that the two click components originate from the same source and that one of the components has reached the hydrophone array via a reflector. Air sacs and bone have frequently been suggested as potential reflectors of sound, possibly contributing to the formation of the echolocation beam. For the case where a reflector is assumed to be positioned laterally to the source, the two acoustic paths can be roughly approximated as a right triangle, with the reflector positioned at 90° from the 1 m long line between the sound source and the point of measurement. The trigonometric relationship results in a distance of approximately 7.0 mm between the reflector and the source. However, air sacs and bones are positioned at multiple places and predominantly several cm further away from the right pair of phonic lips that are believed to generate the sound (Dormer, 1979; Aroyan et al., 1992; Madsen et al., 2010; Madsen et al., 2013; Cranford et al., 2014).

If instead a reflection off structures between the phonic lips and the skull is considered (located behind the source and in front of the skull), the reflective structure(s) must be situated just 3.5 mm from the acoustic source in order to be responsible for the 5 μs time separation. Hence, this cannot be a viable explanation, since the distances to known structures with potential reflectivity, like bone or air sacs, are on the order of cm rather than mm. In addition, the first of the two components would still have to propagate through a different type of tissue that could filter out the higher frequencies, while the reflected component would either have had to take a different route, where the lower frequencies were filtered out, or have been reflected off such a small reflector that only the higher frequencies were reflected. Since the time difference between these two components is so small, and both the HF and LF components are similar in signal level, the explanation is not convincing. In addition, if such structures were responsible for one of the signal components, it is rather surprising that no more than one reflected component is observed.
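The two reflector distances quoted for hypothesis A follow from elementary geometry. The sketch below assumes a 7.0 mm extra path (5 μs at 1400 m/s) and a 1 m source-to-hydrophone range, as in the text; the function names are hypothetical. For a lateral reflector at distance d, the reflected path is d + sqrt(R² + d²) against a direct path R; for a reflector directly behind the source, the extra path is simply 2d.

```python
import math


def lateral_reflector_offset_m(extra_path_m, range_m=1.0):
    """Distance d of a reflector at 90 degrees to the source-receiver line.

    Solves d + sqrt(R**2 + d**2) = R + e for d, which gives
    d = e * (2*R + e) / (2 * (R + e)).
    """
    e, r = extra_path_m, range_m
    return e * (2.0 * r + e) / (2.0 * (r + e))


def rear_reflector_offset_m(extra_path_m):
    """Distance of a reflector directly behind the source (extra path = 2*d)."""
    return extra_path_m / 2.0


extra = 7.0e-3  # 7.0 mm extra path, from a 5 us delay at an assumed 1400 m/s
lateral_mm = lateral_reflector_offset_m(extra) * 1e3
rear_mm = rear_reflector_offset_m(extra) * 1e3

# Check that the lateral solution reproduces the assumed extra path.
d = lateral_mm * 1e-3
check = d + math.sqrt(1.0 + d * d) - 1.0

print(f"lateral reflector: {lateral_mm:.2f} mm, rear reflector: {rear_mm:.2f} mm")
```

Both distances come out in the millimeter range, which is the basis for the argument that known reflective structures, located centimeters away, are too distant.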

Hypothesis B considers signal separation due to dispersive media. Previous research has shown that the speed of sound in the melon is not uniform throughout, but is higher in the outer parts of the melon and lower in the core (Norris and Harvey, 1974). Layers in the melon with different velocities will act as a geometrically dispersive medium for the signal and will gradually split it up into separate components. It is possible that the HF component, which generally (in 76% of the cases) arrives later at the hydrophone array than the LF component, is delayed by the lower sound velocity in the melon core. Higher frequencies are more directional and are potentially constrained within the direct propagation path through the core, whereas lower frequencies are less directional and demonstrate greater volumetric spreading. If the LF component travels faster along the sides of the melon, where the speed of sound is higher, this could potentially explain the appearance of two components in clicks recorded in the far field. The results in Figs. 10 and 11 also show that the upper part of the beam contains a delayed HF component, suggesting that the upper part of the beam might have experienced more separation due to geometrical dispersion than the lower part of the beam. As two components are most often not recorded in the lower part of the beam, it seems that the higher frequencies are constrained to the upper part of the beam.

FIG. 10.

(Color online) Number of times the LF component arrived at ch 1, ch 9, ch 21, and ch 29 earlier than the HF component and vice versa, during 50 trials in cases where two-component clicks were detected.

FIG. 11.

(Color online) Number of times the LF component had higher amplitude than the HF component and vice versa for ch 1, ch 9, ch 21, and ch 29, during 50 trials in cases where two-component clicks were detected.

Figure 10 shows that the HF component occasionally (although rarely) reaches the hydrophones first, but only at the center and lower part of the beam. As seen in Fig. 11, this is also the region where the LF component is more often the stronger of the two components. If the LF and HF components are similar in amplitude, summation of random noise might cause one or the other to be the stronger, although this would not be often or consistently observed.

The dolphin in this study was also involved in a task where targets were detected off of the main response axis while the animal was in a fixed position. (Data from off-axis targets were not included in this study.) It is known that the dolphin steered its beam and varied its width in the performance of the target detection task. One possible mechanism that might contribute to echolocation beamwidth control and steering is the muscular manipulation of the melon. Changing the shape of the melon has been observed in several echolocating toothed whales (Moore et al., 2008; Wisniewska et al., 2015). This process cannot be accounted for in this study but it potentially explains some of the variability in the observed relationship between the two click components.

Acoustically, it would be advantageous for the animal to be able to manipulate the timing and strength of the two observed click components in order to increase precision while analyzing the returning echo. If the time separation is detectable to the animal it may be an important information carrier. Finneran et al. (2018a) recently determined that dolphins are capable of temporal resolutions on the order of 1–2 μs when listening to echoes that were alternately jittered in time. If the same degree of resolution can be applied to the timing delays between components within the outgoing click, then the dolphin might be able to capitalize on the small differences in echo returns and increase discriminatory potential.

The velocity difference needed along the acoustic path for the two click components to be separated by 5 μs is approximately 26 m/s, according to the following calculation:

$$\Delta t = t_1 - t_2 = \frac{s}{v_1} - \frac{s}{v_2}. \tag{7}$$

If, in this example, it is assumed that the sound wave has traveled a distance of approximately s = 40 cm in geometrically dispersive media (this could be any type of tissue and water, and 40 cm is a reasonable length for a dolphin melon) and that the highest of the sound velocities in the melon is v2 = 1440 m/s (Norris and Harvey, 1974), then t2 = 278 μs. Using Δt = 5 μs gives t1 = 283 μs and v1 = 1414 m/s. Since Δv = v2 − v1, we arrive at the very feasible estimate of Δv = 26 m/s.

It is possible that the acoustic path through a sound velocity gradient is shorter than 40 cm. However, even halving the estimated acoustic path gives a velocity difference of just 52 m/s, which is still feasible. In fact, much larger sound velocity differences have been measured post-mortem (Δv > 240 m/s) (Norris and Harvey, 1974), which would cause an even larger time separation of the components. These results are further discussed in Sec. III D along with the results of the acoustic FE model.
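The estimate from Eq. (7) can be reproduced numerically. This is a minimal sketch under the assumptions stated in the text (v2 = 1440 m/s, Δt = 5 μs, path lengths of 40 cm and 20 cm); the function name is hypothetical. Without intermediate rounding, the 40 cm case gives a velocity difference of about 25 to 26 m/s and the halved path gives roughly 50 m/s, of the same order as the value quoted above.

```python
def velocity_difference(path_m, fast_speed_m_per_s, delay_s):
    """Velocity difference v2 - v1 from Eq. (7): delay = s/v1 - s/v2."""
    t_fast = path_m / fast_speed_m_per_s   # t2, travel time at the higher speed
    t_slow = t_fast + delay_s              # t1 = t2 + delay
    slow_speed = path_m / t_slow           # v1
    return fast_speed_m_per_s - slow_speed


dv_40cm = velocity_difference(0.40, 1440.0, 5e-6)
dv_20cm = velocity_difference(0.20, 1440.0, 5e-6)

print(f"40 cm path: dv = {dv_40cm:.1f} m/s")
print(f"20 cm path: dv = {dv_20cm:.1f} m/s")
```

The required velocity difference scales roughly inversely with path length, so shorter paths through the gradient demand proportionally larger differences in sound speed.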

The final hypothesis (C) is not supported by the findings in this paper, since the time differences of the components across the entire cross section of the array do not correlate well with the presence of a second source. If the reversed sign of the time differences in Table I were an indication of two horizontally spaced sound sources, then the HF component should have arrived later at one of the two most widely horizontally separated hydrophones (ch 12 or ch 18) and earlier at the other. This is not the case here, which is similar to previous findings (Au et al., 2012b).

The animal could be using two sources to generate echolocation signals, e.g., in more challenging echolocation tasks. One animal's ability to activate both sets of phonic lips during echolocation has been shown previously in Cranford (2011), but there is currently mounting evidence that the right pair of phonic lips is the predominant, if not the only, source used for the production of echolocation clicks (Finneran et al., 2018b; Madsen et al., 2010; Madsen et al., 2013). However, the sound velocity gradient in the melon, together with the higher sound velocity in water, is a more convincing explanation for the formation of the two signal components reported in this study.

The results from the 3D model of the ellipsoid with an internal sound velocity gradient (Fig. 3) are presented in Fig. 12. They show that a sound velocity gradient and a rough estimation of the shape of the melon give rise to two signal components. The time separation is detectable both visually in the time waveform and in the output of the ReSTS algorithm. The propagation of the transient pulse through the symmetry plane of the rotationally symmetric 3D model of the ellipsoid can also be seen in Mm. 1. The intention of the model is to show how a sound velocity gradient in an elliptically bound medium can split a single pulse into two temporally separated pulses (without modelling an actual dolphin melon). Note that the velocity differences used in the model are exaggerated to enhance the temporal separation effect and make the separation visually clear to the reader.

FIG. 12.

(Color online) (a) Time signal and envelope of the modeled acoustic pressure output as measured at the measurement point Mp in the model. (b) Time-frequency representation of the acoustic pressure as measured at measurement point Mp where t1 and t2 are the center times and f1 and f2 are the center frequencies of the detected components. Numbered red circles at (t1, f1) and (t2, f2) indicate detections of (partially) overlapping signal components.

Mm. 1.

Propagation of a transient pulse through the modelled 3D rotationally symmetric ellipsoid, viewed in the symmetry plane. This is a file of type “gif” (3.32 MB).

The simple FE model exemplifies the separation of a pulse traveling through an elliptically bound medium where the sound velocity changes perpendicularly to the direction of propagation of the acoustic wave. The resulting time difference between the two emerging click components is 29 μs. The center frequencies of the first and second components are 67 and 74 kHz, respectively. In this case, we only modeled a transient point source with uniform frequency spreading. However, it follows directly from this example that a greater difference in frequency between the two separated pulses would be seen with a source where the higher frequencies are more directional than the lower frequencies. Such a source would increase the complexity of the FE model with little benefit for the reader and was not considered necessary to exemplify this phenomenon.

The concept of signal separation due to a sound velocity gradient might at least partially explain the larger time separation observed in off-axis echolocation click measurements. Since the sound source is located just centimeters from the surrounding water, which has a sound velocity of approximately 1500 m/s, it is possible that a significantly lower sound velocity in tissue and blubber would provide two alternative acoustic paths, with similar signal-separating effects occurring off-axis. This would mean that the first of the two recorded transients entered the water at an early stage and the second traveled through the significantly “slower” tissue laterally to the source. However, this hypothesis requires a tissue sound velocity of approximately 1000 m/s for the sound pulse to be delayed by 150 μs at a 90° angle to the main response axis, which is an unrealistically low sound velocity for the melon. If the second pulse has experienced a less direct acoustic path in the tissue, the required sound velocity could be significantly higher. Hence, this hypothesis should be evaluated in future models with computerized tomography (CT) based geometries and fresh measurements of relevant tissue properties.

Detailed investigation of echolocation clicks from a bottlenose dolphin performing an on-axis echolocation target detection task and recorded with a 29 channel hydrophone array shows that the elevated cross section of the echolocation beam has a region where clicks, during very specific parts of the click trains, contain separate but partially overlaid transients with different frequency content and separated by less than 5 μs.

Comparisons of click properties recorded across the entire array indicate that the second, higher frequency component potentially arises as a result of the geometrically dispersive nature of the beam-shaping soft tissues rather than from reflections off structures in the head of the animal or from two separate sound sources. Reflections off head structures would imply exotic frequency filters in the tissue and the presence of more than just one distinct reflection. Therefore, signal separation caused by geometric dispersion is put forward as the more parsimonious explanation.

The research reported in this paper was supported by the Swedish Research Council, the eSSENCE strategic research programme, and the Office of Naval Research, USA.

1. Aroyan, J. L., Cranford, T. W., Kent, J., and Norris, K. S. (1992). “Computer modeling of acoustic beam formation in Delphinus delphis,” J. Acoust. Soc. Am. 92(5), 2539–2545.
2. Au, W. (1993). The Sonar of Dolphins (Springer-Verlag, New York).
3. Au, W. W. L., Branstetter, B., Moore, P. W., and Finneran, J. J. (2012a). “The biosonar field around an Atlantic bottlenose dolphin (Tursiops truncatus),” J. Acoust. Soc. Am. 131(1), 569–576.
4. Au, W. W. L., Branstetter, B., Moore, P. W., and Finneran, J. J. (2012b). “Dolphin biosonar signals measured at extreme off-axis angles: Insights to sound propagation in the head,” J. Acoust. Soc. Am. 132(2), 1199–1206.
5. Auger, F., and Flandrin, P. (1995). “Improving the readability of time-frequency and time-scale representations by the reassignment method,” IEEE Trans. Signal Process. 43, 1068–1089.
6. Brynolfsson, J., Reinhold, I., Starkhammar, J., and Sandsten, M. (2019). “The matched reassignment applied to echolocation data,” in Proceedings of the ICASSP (IEEE, Brighton, UK).
7. Brynolfsson, J., and Sandsten, M. (2018). “Parameter estimation of oscillating Gaussian functions using the scaled reassigned spectrogram,” Signal Process. 150, 20–32.
8. Capus, C., Pailhas, Y., Brown, K., Lane, D., Moore, P., and Houser, D. (2007). “Bio-inspired wideband sonar signals based on observations of the bottlenose dolphin (Tursiops truncatus),” J. Acoust. Soc. Am. 121(1), 594–604.
9. Cranford, T. W. (2011). “Biosonar sources in odontocetes: Considering structure and function,” J. Exp. Biol. 214(8), 1403–1404.
10. Cranford, T. W., Elsberry, W. R., Bonn, W. G. V., Jeffress, J. A., Chaplin, M. S., Blackwood, D. J., Carder, D. A., Kamolnick, T., Todd, M. A., and Ridgway, S. H. (2011). “Observation and analysis of sonar signal generation in the bottlenose dolphin (Tursiops truncatus): Evidence for two sonar sources,” J. Exp. Mar. Biol. Ecol. 407(1), 81–96.
11. Cranford, T. W., Trijoulet, V., Smith, C. R., and Krysl, P. (2014). “Validation of a vibroacoustic finite element model using bottlenose dolphin simulations: The dolphin biosonar beam is focused in stages,” Bioacoustics 23(2), 161–194.
12. Daubechies, I., Lu, J., and Wu, H.-T. (2011). “Synchrosqueezed wavelet transforms: An empirical mode decomposition-like tool,” Appl. Comput. Harmonic Anal. 30(2), 243–261.
13. Dormer, K. J. (1979). “Mechanism of sound production and air recycling in delphinids: Cineradiographic evidence,” J. Acoust. Soc. Am. 65(1), 229–239.
14. Finneran, J. J., Branstetter, B. K., Houser, D. S., Moore, P. W., Mulsow, J., Martin, C., and Perisho, S. (2014). “High-resolution measurement of a bottlenose dolphin's (Tursiops truncatus) biosonar transmission beam pattern in the horizontal plane,” J. Acoust. Soc. Am. 136(4), 2025–2038.
15. Finneran, J. J., Jones, R., Mulsow, J., Houser, D. S., and Moore, P. W. (2018a). “Jittered echo-delay resolution in bottlenose dolphins (Tursiops truncatus),” J. Comp. Physiol. A 205, 125–137.
16. Finneran, J. J., Mulsow, J., Jones, R., Houser, D. S., Accomando, A. W., and Ridgway, S. H. (2018b). “Non-auditory, electrophysiological potentials preceding dolphin biosonar click production,” J. Comp. Physiol. A 204(3), 271–283.
17. Hansson-Sandsten, M., and Brynolfsson, J. (2015). “The scaled reassigned spectrogram with perfect localization for estimation of Gaussian functions,” IEEE Signal Process. Lett. 22(1), 100–104.
18. Khan, N. A., and Sandsten, M. (2016). “Time-frequency image enhancement based on interference suppression in Wigner–Ville distribution,” Signal Process. 127, 80–85.
19. Lammers, M., and Castellote, M. (2009). “The beluga whale produces two pulses to form its sonar signal,” Biol. Lett. 5, 297–301.
20. Madsen, P. T., Lammers, M., Wisniewska, D., and Beedholm, K. (2013). “Nasal sound production in echolocating delphinids (Tursiops truncatus and Pseudorca crassidens) is dynamic, but unilateral: Clicking on the right side and whistling on the left side,” J. Exp. Biol. 216(21), 4091–4102.
21. Madsen, P. T., Wisniewska, D., and Beedholm, K. (2010). “Single source sound production and dynamic beam formation in echolocating harbour porpoises (Phocoena phocoena),” J. Exp. Biol. 213(18), 3105–3110.
22. Moore, P. W., Dankiewicz, L. A., and Houser, D. S. (2008). “Beamwidth control and angular target detection in an echolocating bottlenose dolphin (Tursiops truncatus),” J. Acoust. Soc. Am. 124(5), 3324–3332.
23. Norris, K. S., and Harvey, G. W. (1974). “Sound transmission in the porpoise head,” J. Acoust. Soc. Am. 56(2), 659–664.
24. Reinhold, I., and Sandsten, M. (2017). “Optimal time-frequency distributions using a novel signal adaptive method for automatic component detection,” Signal Process. 133, 250–259.
25. Reinhold, I., Sandsten, M., and Starkhammar, J. (2018). “Objective detection and time-frequency localization of components within transient signals,” J. Acoust. Soc. Am. 143(4), 2368–2378.
26. Stanković, L., Djurović, I., Stanković, S., Simeunović, M., Djukanović, S., and Daković, M. (2014). “Instantaneous frequency in time-frequency analysis: Enhanced concepts and performance of estimation algorithms,” Digital Signal Process. 35, 1–13.
27. Starkhammar, J., Moore, P., Talmadge, L., and Houser, D. (2011). “Frequency-dependent variation in the two-dimensional beam pattern of an echolocating dolphin,” Biol. Lett. 7(6), 836–839.
28. Wisniewska, D. M., Ratcliffe, J. M., Beedholm, K., Christensen, C. B., Johnson, M., Koblitz, J. C., Wahlberg, M., and Madsen, P. T. (2015). “Range-dependent flexibility in the acoustic field of view of echolocating porpoises (Phocoena phocoena),” eLife 4, e05651.