Three bottlenose dolphins (Tursiops truncatus) participated in simulated cylinder wall thickness discrimination tasks utilizing electronic “phantom” echoes. The first experiment resulted in psychometric functions (percent correct vs wall thickness difference) similar to those produced by a dolphin performing the task with physical cylinders. In the second experiment, a wide range of cylinder echoes was simulated, with the time separation between echo highlights covering a range from <30 to >300 μs. Dolphin performance and a model of the dolphin auditory periphery suggest that the dolphins used high-frequency spectral profiles of the echoes for discrimination and that the utility of spectral cues degraded when the time separation between echo highlights approached and exceeded the dolphin's temporal integration time of ∼264 μs.
Odontocetes use echolocation to navigate, detect, and capture prey, and to presumably avoid predators and coordinate group behavior (Au, 1988; Benoit-Bird and Au, 2009; Madsen et al., 2004). These tasks often occur in complex, dynamic, and cluttered environments (Jensen et al., 2013). Under strict laboratory controls, echolocating odontocetes (e.g., bottlenose dolphins) have demonstrated the ability to discriminate and recognize targets that differ in size, shape, and material composition (Au and Pawloski, 1992; Nachtigall, 1980; Pack and Herman, 1995). Echoic shape recognition requires knowledge of the spatial relationships between an object's features that must be obtained through returned echoes (Altes, 1978; Altes et al., 2003; Pack et al., 2002). Given that sound speed in water is approximately 1500 m/s, echoes from multiple reflective components of an object will arrive at the dolphin with extremely small time delays related to the spatial separation of reflective sources. The dolphin's binaural auditory system likely resolves the spatial separation of these reflections based on their delays and binaural stimulus differences in order to help construct a spatial representation of the object. How the dolphin achieves this is a topic of speculation and debate (Altes et al., 2003; Branstetter and Mercado, 2006; Harley et al., 2003; Pack et al., 2002). However, by studying the dolphin's echolocation capabilities with simplified echolocation tasks, the mechanisms and limitations that engender complex representations can be elucidated.
A. Discrimination of cylinder echoes
Bottlenose dolphins (Tursiops truncatus) have the ability to discriminate between hollow cylinders whose wall thicknesses vary by fractions of millimeters (Au and Pawloski, 1992). Hollow, water-filled, metal cylinder echoes are dominated by two specular reflections, called echo “highlights”: the first, a reflection from the front wall and the second, a reflection from the back, inside wall. The time separation τ between these two highlights can be expressed as
$$\tau = \frac{2(D - 2W)}{c_o} + \frac{2W}{c_1}, \tag{1}$$
where W is the cylinder wall thickness (the quantity that varied), D is the outer diameter (3.81 cm in this study), and co and c1 are the sound velocities in sea water and in the aluminum cylinder wall, equal to 1520 and 5150 m/s, respectively (Au and Pawloski, 1992). Because cylinder echoes are dominated by only two highlights, these echoes are one of the simplest forms of a complex echo. In Au and Pawloski (1992), a bottlenose dolphin discriminated between a standard hollow aluminum cylinder and comparison cylinders with identical dimensions, except for differences in wall thickness (ΔW). The dolphin's 75% correct discrimination thresholds were 0.23 and 0.27 mm for thinner and thicker cylinders, respectively, which corresponded to a time separation difference (Δτ) of approximately 0.6 μs. The time separation difference between two cylinders can be expressed as

$$\Delta\tau = \tau_s - \tau_c, \tag{2}$$
where τs and τc are the time separations between echo highlights for the standard cylinder and comparison cylinder, respectively.
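As a worked check of the highlight timing described above, the following minimal sketch computes τ and Δτ for the standard cylinder. It assumes the second highlight travels twice through the front wall (thickness W, speed c1) and twice across the water-filled interior (width D − 2W, speed co), and that Δτ is taken as standard minus comparison; the function names are illustrative and are not taken from the original study.

```python
# Sketch of the highlight-separation arithmetic; names are illustrative only.
C_WATER = 1520.0         # m/s, sound speed in sea water (c_o)
C_ALUMINUM = 5150.0      # m/s, sound speed in the aluminum wall (c_1)
OUTER_DIAMETER = 0.0381  # m, outer diameter D


def highlight_separation(wall_thickness_m, diameter_m=OUTER_DIAMETER,
                         c_water=C_WATER, c_wall=C_ALUMINUM):
    """Time (s) between the front-wall and inside-back-wall reflections.

    Assumed path: two passes through the front wall plus two passes
    across the water-filled interior of width D - 2W.
    """
    water_time = 2.0 * (diameter_m - 2.0 * wall_thickness_m) / c_water
    wall_time = 2.0 * wall_thickness_m / c_wall
    return water_time + wall_time


def separation_difference(standard_wall_m, comparison_wall_m):
    """Delta tau (s), assumed sign convention: standard minus comparison."""
    return (highlight_separation(standard_wall_m)
            - highlight_separation(comparison_wall_m))


tau_standard = highlight_separation(6.35e-3)
delta_tau = separation_difference(6.35e-3, 6.35e-3 + 0.27e-3)
print(f"tau_s = {tau_standard * 1e6:.1f} us")       # tau_s = 35.9 us
print(f"delta tau = {delta_tau * 1e6:.2f} us")      # delta tau = 0.61 us
```

Under these assumptions the standard cylinder (W = 6.35 mm) gives τ ≈ 35.9 μs, and a 0.27 mm thicker wall changes τ by about 0.6 μs, in line with the values quoted in the text.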
A similar experiment was conducted with a false killer whale (Pseudorca crassidens), where apparently, the animal was so proficient at discriminating cylinder wall thickness differences that cylinders could not be machined with enough precision to frustrate the animal's near perfect discrimination abilities (Gisiner, 2008). Sixteen years later, the experiment was repeated with the same animal (Kina). The whale's performance was significantly worse, presumably due to an acquired high-frequency hearing loss (Kloepper et al., 2010).
B. Spectral and temporal resolution
The dolphin's ability to discriminate between ensonified targets depends on its hearing abilities; more specifically, on the spectral and temporal resolution of its auditory system. Temporal resolution and temporal integration are terms that are often used interchangeably. However, here temporal resolution will be defined as the ability to follow rapid changes over time, such as amplitude fluctuations in the temporal envelope of a waveform. Temporal integration will be defined as the process or mechanism responsible for limiting temporal resolution. Thus, temporal integration is typically described as a low-pass filter (Viemeister, 1979). Because the time separation between echo highlights in Au and Pawloski (1992) was well below the dolphin's temporal integration time of 264 μs (Au et al., 1988; Moore et al., 1984; Supin and Popov, 1995), the use of time-domain cues for discrimination is unsupported. However, auditory models suggest spectral profile cues should be salient and sufficient to perform the discrimination task (Au, 1994; Branstetter et al., 2007). Dolphins have excellent frequency discrimination capabilities, with relative difference limens below 1% for all frequencies between 1 and 140 kHz (Herman and Arbeit, 1972; Thompson and Herman, 1975). Auditory peripheral filter shapes have been measured (Lemonds, 1999; Finneran et al., 2002) and modeled as both ROEX filters and gammatone filter banks (Branstetter et al., 2007), with a constant-Q value of 11.3 providing good fits to the behavioral data. Simulations of the cylinder wall thickness discrimination task have been performed by three different models with slight variations in processing stages; however, all of the models share auditory filter banks with similar properties, and all of the models suggest the dolphin likely performs the task using spectral profile cues (Au, 1994; Au and Nachtigall, 1995; Branstetter et al., 2007). Figure 1 shows the output of the model used in Branstetter et al. (2007), where small time differences between echo highlights resulted in significant differences in their time-frequency representations.
In the current study, two experiments were conducted. In Experiment I, a cylinder wall thickness discrimination task was performed. However, instead of using physical cylinders, a phantom echo generator (PEG) (Finneran et al., 2010) was used to simulate cylinder echoes. The PEG system is advantageous in that echoes from cylinders of almost any diameter and wall thickness can be quickly and easily simulated. The goals of Experiment I were to: (1) determine if simplified, simulated echoes from the PEG would produce a similar performance pattern (i.e., psychometric functions) as obtained from the dolphin in the original Au and Pawloski (1992) experiment, and (2) to determine if there are individual differences in performance between three dolphins with different high-frequency hearing capabilities. In Experiment II, the discrimination task was expanded to include simulated cylinders where the standard cylinders varied, such that the time separations between the first and second highlights were greater than and less than the dolphin's integration time of 264 μs (Moore et al., 1984). When the time separation between echo highlights is less than the dolphin's integration time, the auditory model from Branstetter et al. (2007) predicts there should be peaks and valleys in the spectral profile that can be used for discrimination. However, as the time separation between echo highlights exceeds the dolphin's integration time, these peaks and valleys should diminish and, according to this model, produce a decrement in discrimination performance. Experiment II tested this model prediction.
II. EXPERIMENT I. CYLINDER WALL THICKNESS DISCRIMINATION WITH PHANTOM ECHOES
Three adult male bottlenose dolphins participated in this experiment. Age and upper-frequency limit of hearing are presented for each dolphin in Table I. Individual audiograms can be found in the Supplemental Material.1 All dolphins had prior experience with psychoacoustic tasks. They were housed in 9 m × 9 m or 9 m × 18 m floating netted enclosures (pens) located in San Diego Bay, CA. The study followed a protocol approved by the Institutional Animal Care and Use Committee at the Naval Information Warfare Center Pacific, and all applicable U.S. Department of Defense guidelines for the care of laboratory animals.
| Dolphin | Age (years) | High-frequency cutoff (kHz) |
B. Change discrimination task
The dolphins performed an echolocation, echo-change discrimination task using a go / no-go response procedure (Finneran et al., 2010). During each trial, the dolphin would swim, station on an underwater bite plate, and begin echolocating and receiving phantom echoes (Fig. 2). The bite plate and PEG station were constructed from polyvinyl chloride (PVC). The receiving hydrophone and underwater sound projector [both Reson TC 4013 piezoelectric transducers, (Teledyne Reson, Denmark)] were located at distances of 1 m and 20 cm, respectively, from the tip of the dolphin's rostrum (Fig. 2). The on-axis sonar beam of the bottlenose dolphin is elevated above the horizontal plane by approximately 5 degrees (Au et al., 1986). As a result, the receiver hydrophone was positioned 14 cm above the plane of the bite plate to place it more directly on the main transmit axis of the biosonar beam. The echo projector was approximately 2.5 cm below the plane of the bite plate to facilitate hearing through the jaw (Brill et al., 1988). Each trial began with the phantom echo presentation of the standard target [an aluminum cylinder with an outside diameter (D) of 3.81 cm and a wall thickness (W) of 6.35 mm]. After a random interval between 3 and 6 s (rectangular distribution), the standard target would either change to a comparison target (change trial) or remain the same (catch trial). If the dolphin detected a change, the dolphin was trained to respond by producing a whistle or burst-pulse sound within a 2-s response window that began after the onset of the change (see Fig. 3). If the dolphin did not detect a change, the dolphin was trained to continue echolocating until receiving feedback from the trainer (see below). A third “listening” hydrophone (Reson TC 4013) was situated next to the PEG receiver hydrophone, which allowed a human operator to monitor each trial and the animal's acoustic response via a speaker. 
The custom software that controlled the experiment recorded and displayed a time-domain waveform of each trial in real time. This allowed the human operator to also visually identify the dolphin's acoustic response. The acoustic responses and visual display were monitored in real time. All acoustic responses were within the human hearing range and no ambiguous responses were observed, i.e., there were no trials where a dolphin's response needed to be qualified from further examination of an acoustic recording. If the dolphin responded correctly to the change (hit), he would receive a whistle “bridge” that indicated his response was correct, which prompted the dolphin to return to his trainer's station for fish reinforcement. If the dolphin failed to respond during the response interval (miss), the dolphin received a “hand slap” on the water's surface, which prompted the dolphin to return to the trainer's station where he would not receive fish reinforcement. Any vocal response outside of the response window was logged as an out-of-window response. A single catch trial, where the target did not change to a comparison target but remained the same, was randomly inserted into each block of five trials (20% catch trials). A vocal response at any time during a catch trial resulted in a false alarm (FA). False alarms and out-of-window responses resulted in a “hand slap” on the water's surface, which again prompted the dolphin to return to the trainer's station where he would not receive fish reinforcement. If the animal did not produce a whistle or burst-pulse vocalization during a catch trial, the animal received a whistle bridge followed by fish reinforcement. The duration of each catch trial was randomly determined by the same parameters as a change trial (i.e., 3–6 s followed by a 2-s response window). A total of 25 trials were conducted for each research session, where 20 trials were “change” trials and five trials were “catch” trials. 
The animals typically participated in two sessions/day. In any given session, if the false alarm rate exceeded 20% or the number of out-of-window responses exceeded five (20%), the data from the entire session were excluded from analysis. The use of an objective criterion for excluding data is necessary and standard practice (Branstetter et al., 2003; Branstetter et al., 2013) for preventing inclusion of data when an animal may be having a subjective “off day.” An “off day” can be broadly defined as atypical behavior where the animal is subjectively judged to be unmotivated, inattentive, producing a vocal response during every trial, choosing not to participate, etc. The exclusion of data where the animal's FA rate was greater than 20% also ensures a relatively consistent response bias across the included data. The difficulty of each change discrimination was manipulated by varying the wall thickness (W) of the comparison cylinder, which was the independent variable for the experiment. The dolphin's performance (dependent variable) was measured by calculating the percent correct discriminations (i.e., the number of correct trials divided by the number of total trials, multiplied by 100) for each wall thickness difference (ΔW). Each full psychometric function, termed a replication, was complete once ten trials had been collected for every ΔW condition. For simplicity, the first full psychometric function was considered replication 1, even though, technically speaking, there were no data that preceded it.
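The session exclusion rule and the percent correct measure can be sketched as follows. This is only an illustration of the stated criteria: it assumes the false alarm rate is the number of false alarms divided by the number of catch trials in the session, and the function names and example counts are hypothetical.

```python
# Sketch of the session-level bookkeeping; counts below are hypothetical.
MAX_FALSE_ALARM_RATE = 0.20   # sessions above this rate are excluded
MAX_OUT_OF_WINDOW = 5         # sessions with more than five are excluded


def session_is_valid(n_false_alarms, n_catch_trials, n_out_of_window):
    """Return True if the session meets both inclusion criteria.

    Assumption: false alarm rate = false alarms / catch trials.
    """
    false_alarm_rate = n_false_alarms / n_catch_trials
    return (false_alarm_rate <= MAX_FALSE_ALARM_RATE
            and n_out_of_window <= MAX_OUT_OF_WINDOW)


def percent_correct(n_correct, n_trials):
    """Percent correct for one wall thickness difference condition."""
    return 100.0 * n_correct / n_trials


# One false alarm in five catch trials (20%) does not exceed the criterion.
print(session_is_valid(n_false_alarms=1, n_catch_trials=5, n_out_of_window=2))
# Two false alarms in five catch trials (40%) excludes the session.
print(session_is_valid(n_false_alarms=2, n_catch_trials=5, n_out_of_window=0))
# Seven hits out of the ten trials that make up one replication.
print(percent_correct(7, 10))
```

With five catch trials per session, a single false alarm sits exactly at the 20% limit, so under this reading a session is rejected only when two or more false alarms occur.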
Data for this experiment were collected as part of a larger study investigating the potential to “enhance” target detection and target discrimination performance during echolocation. Although there was no evidence that these experimental manipulations had any effect on the animal's performance, a summary of the experiment can be found in the supplementary material.1
C. Target transfer functions and the PEG
Previous experiments were limited by the number of physical cylinders available (e.g., nine cylinders; Au and Pawloski, 1992). Physical cylinders require time and money to manufacture and each cylinder must be machined within a fraction of a millimeter precision. One advantage of using simulated targets with the PEG system is that a transfer function for a cylinder with almost any dimensions can be quickly generated and implemented. To mathematically generate an impulse response of an aluminum cylinder, the time separation τ between two highlights is first calculated from Eq. (1), where W from the standard cylinder was equal to 6.35 mm, D was equal to 3.81 cm, and co and c1 are the sound velocities in sea water and the aluminum cylinder wall, which were equal to 1520 and 5150 m/s, respectively (Au and Pawloski, 1992). The outer diameter value of 3.81 cm was taken from Au (1993), p. 196. However, the outer diameter reported in Au and Pawloski (1992) was 3.785 cm. This discrepancy was not detected until data collection was near completion. Therefore, 3.81 cm was used for generating transfer functions and for computer modeling. Ten recorded echoes from Au (1994) were randomly selected (from a pool of 50) and the time waveforms averaged coherently, based upon the time alignment of the maximum positive value of each click's time domain waveform. Details of how the recorded cylinder echoes were acquired can be found in Au (1994). The amplitude of the second highlight relative to the first highlight was calculated by dividing the average peak amplitude of the second highlight by the average peak amplitude of the first highlight, which was equal to 0.51. The impulse response h(t) of the cylinder was then defined as
$$h(t) = \delta(t) + 0.51\,\delta(t - \tau). \tag{3}$$
The impulse response function consists of two discrete impulses (see Fig. 4): one with unity amplitude at time t = 0, the other with an amplitude of 0.51 at time t = τ, with τ varied to simulate cylinders with different wall thickness [see Eq. (1)]. Discrete impulse responses were generated in matlab R2007a. The PEG hardware had a sampling rate of 1 MHz, which resulted in a sampling (time) resolution of 1 μs. To achieve time separations (τ) between the first and second highlights at values other than integer multiples of 1 μs (i.e., fractional delays), the discrete impulse response was first generated at a sampling rate of 10 MHz, which allowed for a resolution of 100 ns (i.e., τ was rounded to the nearest 0.1 μs). The discrete impulse response was then decimated to 1 MHz following anti-aliasing, low-pass filtering (20th order Kaiser window with a beta factor of 2) using the matlab “resample” function (matlab 2007a). Transfer functions were calculated by taking a 512-point Fast Fourier Transform (FFT) of the impulse response. Transfer functions were then saved on a desktop computer, which the PEG system used for generating echoes in real time. Figure 4 provides example impulse responses and resulting echoes generated using the fractional delay method outlined above, where the time separation difference between the two waveforms was Δτ = 0.5 μs. Although the impulse response, initially defined at 10 MHz, contains only two non-zero values [Eq. (3)], low-pass filtering and decimation to 1 MHz results in an impulse response with many positive and negative values surrounding time τ. Note that this method is different from that used in previous phantom echo studies (Finneran et al., 2010), where recorded echoes from physical targets were used to calculate transfer functions, instead of mathematically modeled echoes as in the case of the current study.
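The fractional-delay procedure can be sketched in NumPy as follows. This is not the original MATLAB code: the anti-aliasing filter is a Kaiser-windowed sinc whose length (20 output samples on each side of the peak) is an assumption, since the source states only a 20th-order Kaiser design with a beta factor of 2, and the output is rescaled by the decimation factor so that the first highlight keeps roughly unit amplitude. All names are illustrative.

```python
import numpy as np

FS_DESIGN = 10_000_000   # Hz, rate at which the two impulses are placed
FS_PEG = 1_000_000       # Hz, PEG hardware rate
FACTOR = FS_DESIGN // FS_PEG
HIGHLIGHT_RATIO = 0.51   # second-highlight amplitude relative to the first
N_FFT = 512
HALF_WIDTH = 20          # assumed filter half-length, in 1 MHz samples


def antialias_taps(beta=2.0):
    """Kaiser-windowed sinc low-pass with cutoff at the 1 MHz Nyquist."""
    n_taps = 2 * HALF_WIDTH * FACTOR + 1
    n = np.arange(n_taps) - (n_taps - 1) / 2
    cutoff = 0.5 / FACTOR                      # cycles/sample at FS_DESIGN
    taps = 2 * cutoff * np.sinc(2 * cutoff * n) * np.kaiser(n_taps, beta)
    return taps / taps.sum()                   # unit gain at DC


def cylinder_impulse_response(tau_s):
    """Two-highlight impulse response at 1 MHz, delayed by HALF_WIDTH samples."""
    h_design = np.zeros(N_FFT * FACTOR)
    h_design[0] = 1.0                                   # first highlight
    h_design[int(round(tau_s * FS_DESIGN))] += HIGHLIGHT_RATIO  # nearest 0.1 us
    filtered = np.convolve(h_design, antialias_taps())
    # Keep every FACTOR-th sample; rescale to compensate for the decimation.
    return FACTOR * filtered[::FACTOR][:N_FFT]


h = cylinder_impulse_response(35.9e-6)
transfer_function = np.fft.fft(h, N_FFT)
print(int(np.argmax(h)), round(float(h.sum()), 2))
```

Because 35.9 μs is not an integer number of 1 μs samples, the second highlight is spread over several neighboring samples after decimation, which is the behavior the text describes for the impulse response surrounding time τ. The fixed filter delay (20 samples here) only shifts the whole echo in time.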
The PEG hardware and software was based on a TMS320C6713 floating point digital signal processor (Texas Instruments, Dallas, TX) with an analog input/output (I/O) daughtercard (AED109, Signalware Corp., Colorado Springs, CO), and was similar to systems used in previous echolocation experiments (Finneran et al., 2019; Finneran et al., 2010; Kloepper and Branstetter, 2019). During each trial, the dolphin's outgoing click was acquired by the Reson TC4013 click receiver, amplified by a custom pre-amplifier (6-dB gain, 5–200 kHz band-pass filter), and digitized (1 MS/s, AED109). If the instantaneous amplitude from the receiving hydrophone signal exceeded a threshold, the click was extracted and the FFT of the click was multiplied by a target transfer function. The threshold level was calibrated so the PEG system would only be triggered by dolphin sonar clicks and would not be triggered by ambient noise (e.g., snapping shrimp) in San Diego Bay (Branstetter et al., 2012). The inverse-FFT of this signal was then computed. The resulting echo waveform was scaled in amplitude and delayed simulating the two-way acoustic propagation (spherical propagation) of a target at a 10 m range. The echo waveform was then converted to analog (1 MHz sampling rate), amplified [22 dB, Hafler PRO2000 (Tempe, AZ)] and projected back to the dolphin via the echo projector (Fig. 2). A comparison between the physical cylinder echoes, mathematically-simulated cylinder echoes, and acoustically-transmitted PEG cylinder echoes can be found in Fig. 5. The recording of the PEG cylinder echo [Fig. 5(J)] was accomplished by substituting a real dolphin incident signal with a recorded dolphin incident signal from the dolphin WHP and using that as the input to the PEG system. 
The projected PEG echoes were recorded in San Diego Bay, at the position where the dolphin's lower jaw would be located under the bite plate (i.e., the dolphin was not present), by a Reson TC4013 hydrophone (Teledyne Reson, Denmark), coupled to a Reson VP1000 preamplifier (Teledyne Reson, Denmark), and digitized by a National Instruments USB 6251 DAQ device (National Instruments, Austin, TX) at a sampling rate of 500 kHz. Because of the ambient noise in San Diego Bay, recordings were made by coherently averaging 1048 PEG echoes (again based on time alignment of the maximum positive value from the time domain waveform) using custom Labview software (National Instruments, Austin, TX). Although the echoes in the current study [Figs. 5(J) and 5(K)] appear different than echoes from an actual cylinder [Figs. 5(B) and 5(C)], this difference is primarily due to differences in WHP's incident signals. The incident signal used in Au, 1994 [Fig. 5(A)] simulated a dolphin click recorded in Kaneohe Bay, where the ambient noise levels are significantly higher than San Diego Bay (Au et al., 1985). Odontocete click sound pressure levels and peak frequencies have been shown to be positively correlated with ambient noise levels (Au et al., 1985; Thomas and Turl, 1990). WHP's incident signals recorded in San Diego Bay have lower-frequency components compared to clicks recorded in Kaneohe Bay.
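The frequency-domain echo generation described for the PEG can be sketched as below. This is an offline illustration, not the real-time DSP implementation: the trigger is a simple peak-amplitude test, and the amplitude scaling (inverse square of range relative to an assumed 1 m reference, for two-way spherical spreading) is an assumption because the source does not give the scaling formula. The synthetic click and all names are hypothetical.

```python
import numpy as np

FS = 1_000_000         # Hz, PEG sampling rate
N_FFT = 512
SOUND_SPEED = 1520.0   # m/s, sea water
TARGET_RANGE = 10.0    # m, simulated target range


def phantom_echo(click, transfer_function, trigger_level,
                 target_range_m=TARGET_RANGE, reference_range_m=1.0):
    """Return (delay in samples, echo waveform), or None if not triggered."""
    click = np.asarray(click, dtype=float)
    if np.max(np.abs(click)) <= trigger_level:
        return None                       # amplitude did not exceed threshold
    spectrum = np.fft.fft(click, N_FFT)
    echo = np.real(np.fft.ifft(spectrum * transfer_function))
    # Assumed two-way spherical spreading relative to the reference range.
    gain = (reference_range_m / target_range_m) ** 2
    delay_samples = int(round(2.0 * target_range_m / SOUND_SPEED * FS))
    return delay_samples, gain * echo


# Hypothetical click: a short Gaussian-windowed tone burst near 100 kHz.
n = np.arange(64)
click = np.exp(-0.5 * ((n - 32) / 6.0) ** 2) * np.cos(2 * np.pi * 0.1 * (n - 32))

# Two-highlight target with the second highlight on an integer sample.
h = np.zeros(N_FFT)
h[0] = 1.0
h[36] = 0.51
delay_samples, echo = phantom_echo(click, np.fft.fft(h), trigger_level=0.1)
print(delay_samples)   # 13158 samples, about 13.2 ms of two-way travel
```

Multiplying the FFTs is a circular convolution, so the click plus the impulse response must fit inside the 512-point frame, as it does here.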
D. Results and discussion
The number of trials completed and rejected sessions for each dolphin can be found in Table II. Rejected sessions where the FA rate exceeded 20% can be used to calculate the proportion of trials excluded from analysis. Since each session contained 25 trials, the proportion of rejected data (rejected sessions × 25/total number of trials for the experiment) for each dolphin was 0.00, 0.08, and 0.02 for IND, SPA, and WHP, respectively. In the current go-no-go procedure, 20% of the trials were catch trials. Although this low percentage of catch trials may seem to have the potential to produce a liberal response bias, the proportion of rejected data, where the response bias was above 20% was low. The performance for each dolphin is plotted in Fig. 6. Despite apparent individual differences between the dolphins, the general psychometric functions had similar shapes where percent correct decreased as the wall thickness difference between the standard and comparison targets decreased. Data from Au and Pawloski (1992) are included for comparison purposes (participant Tt-8). Although the data from the Au and Pawloski experiment fit within the distribution of the current data, the procedure was different in that the animal was echolocating on physical cylinders that were presented simultaneously in a two-alternative, forced choice task. In that study, data points on the psychometric function below 50% correct could not be measured. With the current go/no-go procedure, data points on the psychometric function could be measured from 0% to 100% correct. Another difference between the two studies is that the peak frequencies of the dolphins' incident signals in the current study were lower than that of the dolphin in Au and Pawloski (1992). Because the incident signals were lower in frequency in the current study, so were the resulting PEG echoes (see Fig. 5). 
Nevertheless, the PEG apparently generated echoes similar enough to the physical echoes to produce closely matching performance. To our knowledge, this is the first time a PEG system has been used with dolphins in a target discrimination task where the transfer functions were mathematically generated to mimic physical targets (as opposed to transfer functions calculated from ensonified physical targets), the independent variable (ΔW) varied on a continuous scale (rather than categorical), and performance matched well with the physical analog task (i.e., participant Tt-8 echolocating on physical cylinders).
| Dolphin | Total trials | Total replications | Trials included | Replications included | Rejected sessions |
A few possibilities may explain the individual differences in discrimination performance. These include individual differences related to (1) incident signals of each animal, (2) hearing abilities, and (3) general aptitude. The average incident signal of each dolphin is shown in Fig. 7. Signals were produced by coherently averaging all recorded clicks from a randomly selected trial. The dolphins do not have any a priori knowledge of what type of trial they receive (change or catch) or what type of comparison target they may be presented with, so it is impossible for the dolphins to adjust their incident signal for specific targets. No stark differences seem to exist in the waveforms or spectral densities between the dolphins, suggesting that individual incident signals were unlikely related to performance. There was a general relationship between each animal's audiogram high-frequency cutoff and the animal's discrimination threshold (i.e., 50% correct from each animal's psychometric function) where better hearing at higher frequencies was a predictor of lower discrimination thresholds (i.e., better performance). This topic will be elaborated on in Sec. IV in order to include data from Experiment II. Variables related to general aptitude (e.g., expertise, motivation, attention) were not measured and their contribution to target discrimination performance could not be assessed.
III. EXPERIMENT II. TEMPORAL INTEGRATION AND DISCRIMINATION OF TWO-HIGHLIGHT ECHOES
The same dolphin subjects that participated in Experiment I also participated in Experiment II, and the echolocation change discrimination task was identical to Experiment I. In Experiment I, thresholds (where the dolphin can discriminate between the two cylinders at the 50% correct level) were reported as cylinder wall thickness differences (ΔW) in mm to be consistent with Au and Pawloski (1992). In Experiment II, the actual physical dimensions of the cylinders that are simulated with the PEG are disregarded and instead, only the acoustic dimensions (i.e., time separations between the first and second highlights) are considered. Here, thresholds are reported as time separation differences (Δτ) defined as
$$\Delta\tau = \tau_s - \tau_c, \tag{4}$$
where τs is the time separation between echo highlights for the standard cylinder, and τc is the time separation between highlights for the comparison cylinder. In Experiment I, only one standard phantom cylinder was used, where τs was always 35.9 μs. In Experiment II, several standard cylinders were used including τs equal to 100, 200, and 300 μs. The dolphin WHP performed the task with additional standard cylinders, where τs was 250, 280, 350, and 400 μs. Thresholds were calculated by transforming percent correct values into proportion correct (i.e., percent correct divided by 100), and then fitting logistic functions to each animal's proportion correct data (i.e., proportion correct as a function of time separation difference) and taking the 0.5 proportion correct of each function as the threshold. Logistic functions were in the form of
where a and b were fitting parameters and p was the proportion correct, which was bound between 0 and 1.
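The threshold estimation can be illustrated with a small sketch. The parameterization used here, p = 1/(1 + exp(−(a + bΔτ))), is an assumption chosen for illustration and may differ from the form used in the study; the fit is done by linear regression on logit-transformed proportions, a simple stand-in for a nonlinear least-squares fit. The data points are synthetic.

```python
import numpy as np


def fit_logistic(delta_tau_us, proportion_correct, eps=0.01):
    """Fit p = 1 / (1 + exp(-(a + b * x))) via the logit transform.

    Proportions are clipped to [eps, 1 - eps] so the logit stays finite.
    """
    x = np.asarray(delta_tau_us, dtype=float)
    p = np.clip(np.asarray(proportion_correct, dtype=float), eps, 1.0 - eps)
    logit = np.log(p / (1.0 - p))
    b, a = np.polyfit(x, logit, 1)   # slope first, then intercept
    return a, b


def threshold(a, b, p=0.5):
    """Time separation difference at which the fitted function equals p."""
    return (np.log(p / (1.0 - p)) - a) / b


# Synthetic psychometric data generated from a = -3, b = 2 (per microsecond).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
p_obs = 1.0 / (1.0 + np.exp(-(-3.0 + 2.0 * x)))
a_hat, b_hat = fit_logistic(x, p_obs)
print(round(float(threshold(a_hat, b_hat)), 3))   # 1.5
```

For real data with proportions of exactly 0 or 1, the clipping step biases the fit, so a maximum-likelihood or nonlinear least-squares fit would be preferable.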
B. Model of the dolphin auditory periphery
A model of the dolphin auditory periphery (Branstetter et al., 2007) was used to examine discrimination cues potentially available to the dolphin during the task. Although a summary of the relevant processing stages of the model is provided here, a complete description of the model can be found in Branstetter et al. (2007). The model has three primary stages. The first stage is a gammatone filter bank (Slaney, 1993) where the frequency response of the filters has been fit to ROEX filters derived from bottlenose dolphin notched-noise masking data (Lemonds, 1999). The gammatone filter bank is a time-domain implementation of the ROEX filter and approximates the basilar membrane response to the stimulus waveform. The second stage is a non-linear, half-wave rectifier which simulates a simplified inner hair cell response. The third stage consists of low-pass filtering with a time-domain, exponential-decay function corresponding to the duration of the dolphin's temporal integration time of 264 μs (Moore et al., 1984). The low-pass filtering simulates the relative sluggishness of the 8th nerve (and other auditory neurons with refractory periods) measured in mammals and is often called a leaky integrator (Berg, 2007; Viemeister, 1979). The output of the model Gi,j is a two-dimensional (2-D) matrix where i and j represent the frequency and time dimensions, respectively, and can be visualized as a spectrogram (Fig. 8). However, unlike a standard spectrogram, where the frequency spacing is constant and the temporal and spectral resolution of each frequency bin is equally influenced by the length and shape of the analysis window (see Fig. 4 in Branstetter et al., 2007), the frequency spacing of the hearing model is based on equal overlap of auditory filters and the spectral and temporal resolution is frequency dependent. 
Post-processing of the hearing model output can be implemented to represent discrimination cues related to the spectral profile, temporal envelope, or overall energy. The spectral profile (s) is a vector that can be represented by summing across time, within each frequency channel,
$$s_i = \sum_j G_{i,j}. \tag{6}$$
The output (s) resembles the spectral density of the stimulus waveform; however, the spectral resolution of the dolphin's auditory system is preserved [Figs. 8(A)–8(D), panel 3]. The temporal envelope (env) can be represented by summing across each frequency channel,
$$\mathrm{env}_j = \sum_i G_{i,j}. \tag{7}$$
In Fig. 8, each lettered panel (A–D) represents model outputs for τs of 35.9, 100, 200, and 300 μs, respectively. Within each panel, the numbered plots show: (1) the time-domain echo, (2) the model output (Gi,j), (3) the spectral profile (s), and (4) the temporal envelope (env).
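The three model stages and the two post-processing sums can be sketched in NumPy as below. This is a simplified illustration, not the Branstetter et al. (2007) implementation: the fourth-order gammatone with bandwidth parameter fc/Q, the logarithmically spaced center frequencies, and the reading of 264 μs as the time constant of the exponential decay are all assumptions, and every name is illustrative.

```python
import numpy as np

FS = 1_000_000             # Hz
Q = 11.3                   # constant-Q value quoted in the text
INTEGRATION_TIME = 264e-6  # s, assumed time constant of the leaky integrator


def gammatone_ir(fc, order=4, duration=1e-3):
    """Gammatone impulse response, normalized to unit gain at fc."""
    t = np.arange(int(duration * FS)) / FS
    b = fc / Q                                   # assumed bandwidth parameter
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    gain_at_fc = np.abs(np.sum(g * np.exp(-2j * np.pi * fc * t)))
    return g / gain_at_fc


def leaky_integrator_ir(n_time_constants=5):
    """Exponential-decay low-pass kernel with unit DC gain."""
    t = np.arange(int(n_time_constants * INTEGRATION_TIME * FS)) / FS
    kernel = np.exp(-t / INTEGRATION_TIME)
    return kernel / kernel.sum()


def auditory_model(signal, center_freqs):
    """Return G[i, j]: frequency channel i by time sample j."""
    kernel = leaky_integrator_ir()
    rows = []
    for fc in center_freqs:
        basilar = np.convolve(signal, gammatone_ir(fc))   # stage 1: filter bank
        hair_cell = np.maximum(basilar, 0.0)              # stage 2: half-wave rectifier
        rows.append(np.convolve(hair_cell, kernel))       # stage 3: leaky integrator
    return np.array(rows)


def spectral_profile(G):
    return G.sum(axis=1)    # sum across time within each frequency channel


def temporal_envelope(G):
    return G.sum(axis=0)    # sum across frequency channels at each time


# Hypothetical two-highlight echo built from a synthetic click near 100 kHz.
n = np.arange(64)
click = np.exp(-0.5 * ((n - 32) / 5.0) ** 2) * np.cos(2 * np.pi * 0.1 * (n - 32))
echo = np.zeros(256)
echo[:64] += click
echo[36:100] += 0.51 * click          # second highlight 36 us after the first

center_freqs = np.geomspace(20e3, 150e3, 32)   # assumed channel spacing
G = auditory_model(echo, center_freqs)
s = spectral_profile(G)
env = temporal_envelope(G)
print(G.shape, s.shape, env.shape)
```

With constant-Q filters, logarithmic spacing gives each pair of neighboring channels the same relative overlap, which is one simple way to approximate the equal-overlap spacing described above.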
C. Results and discussion
Table III displays the number of trials and replications for each dolphin and condition. Data from Experiment I, which were reported as cylinder wall thickness differences (ΔW) in mm, were converted into Δτ values and included in the analysis. Figure 9 presents psychometric functions for each dolphin for each experimental condition. Thresholds (p = 0.5, where p is the proportion correct) for each function are plotted in Fig. 10. A general decrease in performance was observed as τs increased from 35.9 to 300 μs. Although an attempt was made to collect data where τs = 350 and 400 μs, the dolphin WHP could not reliably perform the task. For IND and SPA, the decrease in performance was not monotonic. The source of this variability remains speculative but could be related to the availability of multiple discrimination cues (Branstetter et al., 2007), order effects related to the sequence in which the data were collected (see supplementary material1), or simply variability in each animal's performance. The results are consistent with hearing model predictions where spectral cues are most salient when both echo highlights are within the dolphin's integration time. Figure 11 displays spectral profile differences of the hearing model (Branstetter et al., 2007) between standard cylinders and comparison cylinders, where Δτ is a constant and equal to 2 μs. Spectral differences (Δs) were calculated by

$$\Delta s = s_s - s_c, \tag{8}$$
where ss and sc were the spectral profiles for the standard and comparison cylinders, respectively, and were calculated from Eq. (6). Spectral differences decrease with increased τs despite Δτ being a constant 2 μs in each condition. In order to produce large spectral peaks and valleys, both echo highlights must be within the dolphin's integration window [Figs. 11(A) and 11(B)]. When the time separation between echo highlights approaches the integration time [Fig. 11(C)] and then exceeds it [Fig. 11(D)], spectral differences are minimized.
|Dolphin|Standard target (μs)|Total trials|Total replications|Trials included|Replications included|Rejected sessions|
There appeared to be a relationship between the high-frequency cutoff of each dolphin's audiogram and its discrimination thresholds (Fig. 12). A linear regression with high-frequency cutoff as the predictor of discrimination threshold was not statistically significant [F(1,22) = 3.30, p = 0.083]. However, when only the τs conditions of 35.9 and 100 μs were considered [Fig. 12(A)] because, in these conditions, high-frequency spectral profile differences were available [Figs. 12(A) and 12(B)], there was a significant relationship between discrimination threshold and high-frequency cutoff [F(1,10) = 23.15, p < 0.001], with dolphins that had better high-frequency hearing performing better on this task.
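To put the two reported F statistics on a common scale, the sketch below converts each one into the proportion of variance explained. It assumes an ordinary least-squares regression with a single predictor and an intercept, for which R² = F / (F + df_error); the function name is hypothetical, and only the F values and degrees of freedom come from the text.

```python
def r_squared_from_f(f_stat, df_error):
    """R^2 implied by F(1, df_error) in a one-predictor least-squares regression."""
    return f_stat / (f_stat + df_error)


# F statistics and error degrees of freedom as reported in the text.
reported = {
    "all tau_s conditions": (3.30, 22),
    "tau_s = 35.9 and 100 us only": (23.15, 10),
}

for label, (f_stat, df_error) in reported.items():
    # Assumption: one slope and one intercept are estimated, so n = df_error + 2.
    n_points = df_error + 2
    r2 = r_squared_from_f(f_stat, df_error)
    print(f"{label}: n = {n_points}, R^2 = {r2:.3f}")
```

Under these assumptions, high-frequency cutoff accounts for roughly 13% of the variance in thresholds across all conditions and roughly 70% when only the two shortest τs conditions are included.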
IV. GENERAL DISCUSSION
Simulated cylinder echoes produced by the PEG system resulted in discrimination performance similar to that reported in the original Au and Pawloski (1992) study. These similarities exist despite the current study's use of simplified, mathematically constructed echoes that contain only two echo highlights. Real cylinder echoes are more complex and contain highlights related to the front of the cylinder, central and square path reflections from the inside back wall, as well as circumferential waves (Au and Pawloski, 1992). Despite these differences, the psychometric functions of the dolphins in the current study are remarkably similar to the psychometric function from the dolphin Tt-8 in the Au and Pawloski (1992) study. One of the advantages of using simulated echoes with the PEG system is that a much larger number of echo exemplars could be presented, allowing an investigation into the potential auditory processing mechanisms that govern this task. This study suggests that the dolphin's temporal integration window of 264 μs places constraints on how echo highlights interact to produce the spectral cues used for discrimination. First, the relatively wide auditory filters at higher frequencies trade frequency resolution for fine temporal resolution. This fine temporal resolution is likely maintained at the level of the inner hair cell of the cochlea. In humans and other mammals, mechanotransduction by inner hair cells is not limited by refractory periods but is instead phase-locked to the acoustic pressure waveform (Sumner et al., 2002), even at very high frequencies. However, as the acoustic phase information is transmitted to the 8th nerve, refractory periods limit phase-locking to relatively low-frequency stimuli, essentially functioning as a low-pass filter (Dynes and Delgutte, 1992; Köppl, 1997). This low-pass filter characteristic is represented in the model of the dolphin's auditory periphery by the 264 μs exponential decay window.
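The low-pass behavior of an exponential decay window can be illustrated with a short calculation. The sketch below is not the model's implementation: it assumes the window has the form exp(-t/T) with T = 264 μs as the e-folding time constant (the text does not specify how the 264 μs is defined), and the function names are hypothetical. It reports how much of the first highlight's windowed response remains when the second highlight arrives, and the -3 dB corner frequency of the equivalent one-pole low-pass filter.

```python
import math

INTEGRATION_TIME_US = 264.0  # value quoted in the text; treated here as a time constant


def residual_fraction(separation_us, time_constant_us=INTEGRATION_TIME_US):
    """Fraction of the first highlight's windowed response left when the second arrives."""
    return math.exp(-separation_us / time_constant_us)


def corner_frequency_hz(time_constant_us=INTEGRATION_TIME_US):
    """-3 dB corner of a one-pole low-pass filter with impulse response exp(-t/T)."""
    return 1.0 / (2.0 * math.pi * time_constant_us * 1e-6)


# Standard highlight separations used in Experiment II.
for tau_s in (35.9, 100.0, 200.0, 300.0):
    print(f"tau_s = {tau_s:5.1f} us: residual = {residual_fraction(tau_s):.2f}")

print(f"corner frequency: {corner_frequency_hz():.0f} Hz")
```

Under this assumption the overlap between the two highlights falls steadily as τs grows, which is the qualitative trend the model relies on; the exact values depend on the window definition.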
When two echo highlights are separated in time by less than 264 μs, the model predicts the dolphin will not perceive two discrete echo highlights, but instead, a single echo [see, e.g., Fig. 8(A4)] with the frequency spacing of the peaks and valleys in the spectrum roughly corresponding to the reciprocal of the time separation between the highlights [Fig. 8(A3)]. Extremely small changes in the time separation between echo highlights can result in relatively large changes in the spectral profile of the echo [see Fig. 11(A), for a separation of 2 μs]. For example, when the time separation between highlights for the standard cylinder was 35.9 μs, the dolphin WHP's discrimination threshold (averaged between larger and smaller PEG cylinder echoes) was 283 ns. For comparison purposes, this is considerably smaller than the 500 ns difference displayed in Fig. 4. The most parsimonious interpretation of this result is that the dolphin was unlikely to be capable of perceiving the 283 ns differences in the time domain, but instead perceived the relative shift in the spectral profile between the standard and comparison echoes. As the time separation between echo highlights increases beyond the integration time, the dolphin's auditory system begins to resolve the two echo highlights, both of which have similar (almost identical) spectral profiles [see Fig. 8(D)]. Spectral cues previously used for discrimination with smaller time separations are degraded, which is reflected in the dolphin's performance decrement [Fig. 11(D)]. A video representation of how spectral cues degrade with increased time separation between highlights can be found in Mm. 1.
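The reciprocal relationship between highlight separation and spectral ripple can be checked with an idealized two-impulse echo. The sketch below is only an illustration under simplifying assumptions: each highlight is treated as an impulse, the relative amplitude of the second highlight (0.8) is a made-up value, and the click spectrum and auditory filtering are ignored. The function names are hypothetical; the 35.9 μs separation and 283 ns threshold are taken from the text.

```python
import cmath
import math


def two_highlight_magnitude(freq_hz, separation_s, second_gain=0.8):
    """|1 + g*exp(-j*2*pi*f*tau)|: spectrum magnitude of two impulses tau apart."""
    return abs(1.0 + second_gain * cmath.exp(-2j * math.pi * freq_hz * separation_s))


def ripple_spacing_hz(separation_s):
    """Spacing between adjacent spectral peaks (or valleys): 1 / tau."""
    return 1.0 / separation_s


def valley_frequency_hz(separation_s, index):
    """Frequency of the index-th valley (index = 0, 1, ...): (2*index + 1) / (2*tau)."""
    return (2 * index + 1) / (2.0 * separation_s)


standard_s = 35.9e-6                 # standard highlight separation from the text
comparison_s = standard_s + 283e-9   # standard plus WHP's reported threshold

print(f"ripple spacing: {ripple_spacing_hz(standard_s) / 1e3:.1f} kHz")
for index in range(4):
    f_std = valley_frequency_hz(standard_s, index)
    f_cmp = valley_frequency_hz(comparison_s, index)
    print(f"valley {index}: {f_std / 1e3:.2f} kHz -> {f_cmp / 1e3:.2f} kHz "
          f"(shift {abs(f_std - f_cmp):.0f} Hz)")
```

In this idealization the shift of each valley is proportional to its frequency, so a sub-microsecond change in separation moves the high-frequency valleys the most, in keeping with the interpretation that high-frequency spectral cues carry the discrimination.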
The hypothesis that the dolphins' performances reflect the use of diminishing spectral cues is further supported by a non-constant Weber's fraction (Gulick et al., 1989; p. 229). Weber's law of just-noticeable differences predicts that if the dolphin is performing the discrimination by attending to time domain cues, thresholds should increase proportionally to τs, the time separation in μs between echo highlights for the standard cylinder,
$\tau_D / \tau_s = k,$
where τD is the discrimination threshold in μs and k is a constant. No such constant emerged for any of the three animals, suggesting that the cue(s) the dolphins used to discriminate the echoes were not related to the time domain and were not equally available in each τs condition.
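As a worked illustration of what a constant Weber fraction would imply (threshold divided by τs equal to the same constant in every condition), the sketch below takes the one threshold quoted in this section, 283 ns for WHP at τs = 35.9 μs, and scales it to the other standard separations. The printed values are hypothetical predictions under Weber's law, not measured thresholds, and the function names are illustrative.

```python
WHP_THRESHOLD_US = 0.283    # 283 ns threshold reported for WHP
REFERENCE_TAU_S_US = 35.9   # standard highlight separation for that threshold


def weber_fraction(threshold_us, tau_s_us):
    """Ratio of discrimination threshold to standard separation."""
    return threshold_us / tau_s_us


def predicted_threshold_us(k, tau_s_us):
    """Threshold expected at tau_s if the Weber fraction k stayed constant."""
    return k * tau_s_us


k = weber_fraction(WHP_THRESHOLD_US, REFERENCE_TAU_S_US)
print(f"Weber fraction at 35.9 us: {k:.5f}")
for tau_s in (100.0, 200.0, 300.0):
    # Hypothetical prediction only; compare against the measured thresholds in Fig. 10.
    print(f"tau_s = {tau_s:.0f} us: predicted threshold = "
          f"{predicted_threshold_us(k, tau_s):.2f} us")
```

Measured thresholds that depart from this proportional scaling are what the text describes as a non-constant Weber fraction.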
The dolphin WHP in Experiment II could not reliably discriminate between echoes when τs > 300 μs, despite having conditions where Δτ was larger than 40 μs. This result is puzzling since recent studies have demonstrated that dolphins can detect an echo “jitter” in the time domain as small as 1–2 μs (Finneran et al., 2019). In the current study, when τs is greater than the integration time constant of 264 μs, the dolphin should perceive the echo highlights as two distinct echoes. If the dolphin were to ignore the first highlight and attend only to the second (which shifts in time during the change discrimination task), the task would be similar to a jitter task. However, in the jitter task from Finneran et al., the echo jitters on a click-to-click basis. In the current study, the second highlight shifts in time only once per trial during the initial change, likely making this task more difficult. Another possible explanation is that the dolphins in the current study were reinforced for using spectral cues (during training) and when the spectral cues were no longer available, they did not reliably switch to using a potential time-domain cue. This hypothesis could be tested by first training a dolphin to attend to time domain cues, such as a large time domain jitter in the second highlight, and then progressively decreasing the jitter.
High-frequency hearing was a good predictor of discrimination thresholds in conditions where high-frequency spectral differences between echoes were available (Fig. 12). Of course, a linear regression based on a total of three dolphin subjects (where each dolphin performs replications) should be met with skepticism. However, these data are not unique. The false killer whale (Kina) that performed the cylinder wall thickness discrimination task (Gisiner, 2008) had broadband hearing at the time of data collection. Sixteen years later, Kina was retested in the same task and showed a decrease in performance, presumably related to presbycusis acquired between the two studies (Kloepper et al., 2010). The two studies with Kina, in conjunction with the behavioral data and modeling results from the current study, argue for a dependence on high-frequency hearing for discrimination in this task. This suggests that as echolocating odontocetes age and their upper-frequency hearing sensitivity becomes progressively compromised, the signal-to-noise ratio of echoes will decrease, the availability of high-frequency spectral information will degrade, and their ability to make fine distinctions between targets will diminish.
The model from Branstetter et al. (2007) was useful for visualizing the output of the dolphin auditory periphery and making an accurate prediction that spectral cues would diminish with larger time separations between echo highlights (see Mm. 1). Of course, this model, like all models, is only a simplified approximation of a much more sophisticated system. The current model is a passive listening model and does not consider how the incident signal may affect the perception of the returned echo (which may be required for jitter discrimination). The current model is also inconsistent with coherent processing models (Saillant et al., 1993) since the temporal fine structure of the echo is smeared by the temporal integration window, leaving only the temporal envelope. Tasks such as echoic navigation, shape recognition, or simply catching a moving fish require constructing a cognitive spatial map from multiple complex echoes in a dynamic environment. The current model, which captures the temporal and spectral limitations of the dolphin's auditory periphery, might serve as a front-end processor to a more sophisticated model.
The dolphin's ability to discriminate between synthetic cylinder echoes, composed of two highlights, appears to depend on high-frequency spectral cues. When echo highlights occur within the dolphin's integration time constant, they combine to produce peaks and valleys in the spectral profile used for discrimination. However, when the time separation between echo highlights is greater than the integration time constant, the dolphin will perceive two discrete echoes, where the second is an attenuated version of the first. In this scenario, spectral cues are not sufficient to discriminate between the echoes. High-frequency hearing loss further impairs discrimination because animals become incapable of capitalizing on spectral peaks and valleys in the frequency range where hearing loss has occurred.
We would like to thank the trainers, interns, and staff of the National Marine Mammal Foundation and the Navy Marine Mammal Program for support during training and data collection. Special thanks go to Annie Finneran and Kelly Govenar for their help in data collection and Hannah Bates, Carrie Espinoza, Megan Graves, and Jess Haynsworth for training support. We would like to thank Whitlow Au for the inspiration for pursuing these experiments and for supplying the recorded cylinder echoes from Au (1994). This publication is dedicated to Whitlow Au, who was a pioneer, a colleague, a mentor, and a friend. This project was sponsored by the Targeted Neural Plasticity Program of Defense Advanced Research Projects Agency (DARPA). This is contribution #251 from the National Marine Mammal Foundation.
See supplementary material at https://doi.org/10.1121/10.0001626 for individual audiograms for each dolphin and a summary of the peripheral nerve stimulation procedure.