This Letter provides an advanced time-frequency analysis method for quickly and precisely estimating dispersion curves of low-frequency impulsive sounds propagated in shallow water. First, the modal chirplet transform is derived based on an approximated variation law of the modal instantaneous frequency, which can omit the modal filtering step of warping-based methods. Then an extension modal chirplet transform (EMCT) is developed to further improve its potential for automation, together with a trick for fast computation. EMCT is applied to a recorded signal to demonstrate that the method can achieve high time-frequency concentration for modal components in a significantly decreased run time.

Dispersion is a very common physical phenomenon in underwater acoustics. In dispersive environments, a signal received at a hydrophone can be decomposed to several modal components. Since these components are closely related to the source/receiver configuration and the propagation medium, one can use them to infer details of the source location and the ocean environment through suitable inversion schemes, where the dispersion curve is probably the most commonly used characteristic of modal components.

Considering that dispersion curves are theoretically separated on the time-frequency (TF) plane, time-frequency analysis (TFA) is naturally suitable for dispersion curve estimation. However, traditional TFA methods, such as short-time Fourier transform (STFT) and wavelet transform (WT) (Chen et al., 2003; Potty et al., 2000), usually perform with thick instantaneous frequency (IF) trajectories and thus have insufficient resolution for modal components in short range measurements. There are two ways to distinguish each modal component in such case. The first way is to reduce the thickness of modal IF trajectories by parameterized TFA methods, such as matching pursuit algorithm (Chen et al., 2003), Pekeris-STFT (Le Touzé et al., 2009), and β-powergram (Bonnel et al., 2013). However, these advanced TFA methods have found very limited applications since they were proposed. As pointed out in Bonnel et al. (2020), this may be caused by their complexity of understanding for the end-users. The second way is to conduct modal filtering through warping transform before applying traditional TFA methods, such as STFT (Bonnel et al., 2020) or Wigner–Ville distribution (WVD) (Niu et al., 2014). Unfortunately, manual operations are inevitable in modal filtering. In addition, since traditional TFA methods are still utilized, this scheme actually provides no improvement in resolution, i.e., the thickness of modal IF trajectories, which indicates the signal-to-noise ratio to some extent. Thus, this scheme may not be robust in estimating the IF trajectories of weak energy modes. In short, at present a relatively simple method with fewer manual operations (i.e., the potential for automation) is still desired in dispersion curve estimation.

The remainder of this Letter explores a new advanced TFA method of dispersion curve estimation. Since the proposed modal chirplet transform (MCT) has a similar framework as STFT, its programming can be easily realized based on the existing STFT code. This also makes MCT very competitive in computing efficiency. In addition, the modal filtering step can be omitted when applying MCT, which avoids a lot of manual operations. To reduce the number of manual parameters and further improve the potential for automation, an extension modal chirplet transform (EMCT) is developed, together with a trick for fast computation. The analysis of an experimental signal reveals that the proposed EMCT takes full account of both the TF concentration for modal components and the computational efficiency.

Details of the modal propagation and the warping transform can be found in Bonnel et al. (2020) and its references. The following descriptions assume that the reader has basic knowledge of them.

The variation law of the modal IF provides prior knowledge for MCT and EMCT to redistribute the signal energy and then to produce thin IF trajectories for modal components. Here, the ideal waveguide is utilized to derive an approximation of the variation law with a closed form. Since the variation law is also followed by the warping transform, the law's ability to describe the dispersion curves of experimental signals is clearly implied by the robustness of the warping transform in practical applications.

For an impulse signal that is transmitted in the ideal waveguide at time t = 0, the signal received at a hydrophone is given by

y(t)=m=1Mam(t)exp(jΦm(t)),
(1)

where am(t) and Φm(t) are the amplitude and phase of mode m, respectively.

Based on the range r and the speed of sound c, the modal phase Φm(t) can be described as

Φm(t)=2πfcmt2tr2t>tr,
(2)

where fcm denotes the cutoff frequency of mode m and tr=r/c denotes the earliest arrival time of the impulse signal. Thus, the corresponding modal IF is

fm(t)=12πΦm(t)t=fcmtt2tr2t>tr.
(3)

For passive applications such as marine mammal localization (Bonnel et al., 2020), since the transmitting time is unknown, only the relative arrival time tr0 can be estimated from the received signal. As interpreted in Fig. 1, t0 represents the time difference between the absolute arrival time tr and the relative arrival time tr0. In such case, the corresponding modal phase and modal IF are given by

Φm(t)={fcm(t+t0)2(t0+tr0)2t>tr00ttr0,
(4)
fm(t)={fcmt+t0(t+t0)2(t0+tr0)2t>tr00ttr0.
(5)
Fig. 1.

Interpretation of the unknown time delay t0 and the relative arrival time tr0.

Fig. 1.

Interpretation of the unknown time delay t0 and the relative arrival time tr0.

Close modal

Based on the nonlinear kernel function fm(t) [Eq. (5)] or equivalently Φm(t) [Eq. (4)], the definition of MCT is given by

MCT(t,f)=+y(τ)ΦR(τ)ΦS(τ,t)w(τt)exp(j2πfτ)dτ
(6)

with

{ΦR(τ)=exp(jΦm(τ))ΦS(τ,t)=exp(j2πfm(t)τ).

In Eq. (6), y(τ) represents the received signal; ΦR(τ) and ΦS(τ,t) are the rotation operator and the shift operator (Yang et al., 2014), respectively; w(τt) represents a window centered at time t; and exp(j2πfτ) is the basis function of the Fourier transform. Note that ΦR(τ)ΦS(τ,t)w(τt) can be regarded as a special window function that varies with the short overlapping windows. Apart from the variation of window function, the framework of MCT is the same as that of STFT.

As explained in Figs. 2(a) and 2(b), ΦR(τ) and ΦS(τ,t) are responsible for the TF concentration around (t, f). Assume that the black curve in Fig. 2(a) is the target mode y(τ)=am(τ)exp(jΦm(τ)), i.e., mode m. The red block around (t, f) is the target zone whose TF concentration is waiting for improvement. Since ΦR(τ) is equivalent to the conjugate of the phase of y(τ) in theory, taking y×ΦR obtains the amplitude of y(τ), i.e., am(τ), whose IF0. As a consequence, by the rotation operator ΦR, the red line in Fig. 2(b) moves to the position of the blue line. Furthermore, taking y×ΦR×ΦS obtains am(τ)exp(j2πfm(t)τ), whose IFfm(τ)f. Thus, by the shift operator ΦR, the blue line in Fig. 2(b) moves to the position of the black line. As indicated by the red line and the black line, the energy distributed around f is all concentrated to a single frequency f, which means TF concentration at (t, f) is improved. In practical applications, since the kernel function fm(t) is only an approximation of the real modal IF, the rotation explained above will not be performed perfectly. In such case, the TF concentration of modal components is still able to be improved (at least partially). For interference and noise, due to the strong mismatch to the kernel function fm(t), the rotation will spread their TF concentration, i.e., to a certain extent, the rotation acts with an inhibitory effect.

Fig. 2.

(a) The target mode whose TF energy is waiting for concentrating. (b) How rotation operator ΦR and shift operator ΦS concentrate the signal energy. (c) Illustration for understanding the frequency-band partition of EMCT.

Fig. 2.

(a) The target mode whose TF energy is waiting for concentrating. (b) How rotation operator ΦR and shift operator ΦS concentrate the signal energy. (c) Illustration for understanding the frequency-band partition of EMCT.

Close modal

The implementation of MCT needs the advance provision of three parameters for the kernel function fm(t). The estimation of these parameters follows the strategies listed below:

  • tr0: The relative arrival time is a critical parameter for MCT. For the time-domain waveform, the amplitude increases quickly after tr0, while for the spectrogram (e.g., STFT), tr0 is closely followed by the time of arrival of high frequencies. According to these two characteristics, one can fix tr0.

  • t0: The unknown time delay is weakly sensitive in applications. Thus, any reasonable and possible value in experiments can be taken as an estimate [e.g., as shown in Figs. 3(b) and 3(e), t0=20 s performs a good result, though the real value is about 3.2 s].

  • fcm: When the first two parameters, tr0 and t0, are provided, the cutoff frequency can be estimated through the warping transform. In addition, it also can be determined by the equation fm(t)=f [i.e., Eq. (5)], where (t, f) can be any point of the target mode in theory. However, points dotted around the middle position of the dispersion curve are more proper in practice. Since Eq. (5) can be regarded as the base of warping transform in essence, these two ways will give the same estimate for fcm.

Fig. 3.

(a) Time series of the CSS signal; the vertical dashed line identifies the relative arrival time tr0=0.055 s. (b) MCT (mode 4). (c) The influence of the half bandwidth Δf upon the run time; the y axis is normalized by the longest run time (Δf=0) 11.88 s, and the x axis is scaled by the frequency resolution of the discrete Fourier transform fs/20480.95 Hz. (d) STFT. (e) EMCT (Δf=0). (f) EMCT (Δf=15). (g) Pekeris-STFT. (h) β-powergram. (i) WVD (mode 4). The black dotted curves superimposed on (b), (e), and (f) are the dispersion curves predicted after an inversion (Bonnel et al., 2018).

Fig. 3.

(a) Time series of the CSS signal; the vertical dashed line identifies the relative arrival time tr0=0.055 s. (b) MCT (mode 4). (c) The influence of the half bandwidth Δf upon the run time; the y axis is normalized by the longest run time (Δf=0) 11.88 s, and the x axis is scaled by the frequency resolution of the discrete Fourier transform fs/20480.95 Hz. (d) STFT. (e) EMCT (Δf=0). (f) EMCT (Δf=15). (g) Pekeris-STFT. (h) β-powergram. (i) WVD (mode 4). The black dotted curves superimposed on (b), (e), and (f) are the dispersion curves predicted after an inversion (Bonnel et al., 2018).

Close modal

In MCT, since each fcm is applied for only one modal component, a repetitive procedure is inevitable to extract all the dispersion curves. This inconvenience will evidently take a lot of time, especially when coping with large amounts of data. Therefore, EMCT is proposed to optimize the computation of MCT.

Apparently, when tr0 and t0 are provided in advance, the corresponding cutoff frequency fcm at a certain point (t, f) on the TF plane can be uniquely determined from the equation fm(t)=f. Against that backdrop, EMCT is given by

EMCT(t,f)=|MCT(t,f)|fm(t)=f.
(7)

Although Eq. (7) achieves a more automatic computation than MCT, it takes too much computational time as a cost. Note that adjacent points on the TF plane have almost the same fcm. Thus, in practical applications, fcm of the nearby points can be approximated by that of the center point to decrease the run time, i.e.,

EMCT(t,f)=|MCT(t,f)|fm(t)=fcfor|ffc|Δf,
(8)

where the strategy of frequency-band partition is utilized to determine the range of the approximation. As illustrated in Fig. 2(c), fc and Δf in Eq. (8) represent the center frequency and the half width of each frequency band, respectively. It is apparent that frequencies in the same band at a certain time t will share a common fcm, which is determined by the equation fm(t)=fc. If the bandwidth is equal to the frequency range of Fourier transform, EMCT degenerates into MCT.

In essence, EMCT is a splice of a certain number of MCT results that are computed with different fcm. Through the splice, EMCT avoids the laborious step of manually estimating the fcm for each modal component. By applying the strategy of frequency-band partition, EMCT achieves a remarkable acceleration in computation with almost no sacrifice in the TF concentration [as exhibited in Figs. 3(c) and 3(f)]. In addition, an interesting fact is that EMCT could be equivalent to the method of Chen et al. (2003) at Δf=0 [i.e., Eq. (7)], though they are proposed based on different schemes.

The aim of this section is to validate the performance of the two proposed methods. In this section, a detailed processing for estimating dispersion curves of a combustive sound source (CSS) signal has been documented. For convenience, the data as well as the computing programs used below (including MCT, EMCT, and other existing methods that are mentioned in Sec. 1) are all offered in the supplementary material.1

A CSS signal that is recorded at a range of about 4.8 km is employed in the performance validation of the two proposed methods. Nominally, this CSS signal contains as many as 18 modes (modes 10–13 are missing due to their weak modal energy) and thus is ideally suitable for a performance test of dispersion curve estimation. This CSS signal is available online in the supplementary material of Bonnel et al. (2020). More experimental details about this CSS signal are described in Bonnel et al. (2018).

The CSS source is generally considered as impulsive, but in fact it is preceded by a weak precursor and followed by several bubble pulses. To avoid the influence of bubble pulses and show the spectrogram more clearly, here the source deconvolution step performed in Bonnel et al. (2020) is reproduced with code provided in the same reference. In addition, the sampling frequency is decreased to fs=1953.2 Hz to reduce unnecessary computational burdens. Figure 3(a) is the time series of this CSS signal after source deconvolution. In Fig. 3(a), the relative arrival time tr0=0.055 s is identified by a vertical dashed line where the amplitude increases suddenly. The corresponding STFT spectrogram is shown in Fig. 3(d). The spectrogram provides an overview of TF structures for the CSS signal and, in particular, an intuitive indication for estimating tr0. However, due to the insufficient resolution for modal components, dispersion curves are highly contaminated by unknown interference components. Therefore, high-order modes are impossible to identify without further effort.

Figure 3(b) shows the MCT spectrogram of mode 4 with t0=20 s (intentionally larger than the real value of about 3.2 s to illustrate the insensitivity of t0) and the corresponding fcm = 15 Hz. In Fig. 3(b), the shape of mode 4 is shown clearly. However, as an inherent deficiency of MCT, the shapes of other modes suffer great damage due to the mismatch between the kernel function fm(t) and the modal IF. The EMCT spectrogram (Δf=0) of the CSS signal is depicted in Fig. 3(e), where all nominal 18 modes have been successfully extracted. [Modes 19 and 20 may exist pursuant to Fig. 3(e); however, more efforts are needed to confirm this. Proving their existence is beyond the scope of this Letter and hence will not be discussed here.] For validation purposes, dispersion curves that are predicted after an inversion (Bonnel et al., 2018) are superimposed as black dotted curves in Fig. 3(e). It is apparent that the EMCT results are almost identical to the predicted curves.

The influence of the half bandwidth Δf upon the run time of EMCT is shown in Fig. 3(c), where each value is an average of 20 repetitions. In Fig. 3(c), the y axis is normalized by the longest run time (Δf=0) of 11.88 s. In addition, the x axis is scaled by the frequency resolution of the discrete Fourier transform (fs/20480.95 Hz) for a convenient description in the discrete frequency domain. Obviously, by increasing the half bandwidth, the computational efficiency is remarkably improved. For example, when Δf=15, the run time is reduced to only 0.44 s (about 1/27 of that of unaccelerated EMCT). The corresponding spectrogram is exhibited in Fig. 3(f), where the dispersion curves are distinguished as clearly as those in Fig. 3(e). Figures 3(c) and 3(f) demonstrate that the frequency-band partition is an available and efficient strategy to improve the computational efficiency without sacrificing too much TF concentration.

Figures 3(g) and 3(h) are spectrograms of Pekeris-STFT and β-powergram, respectively. Clearly, Pekeris-STFT makes a huge improvement to TF concentration, and modal components are separated completely in Fig. 3(g). The β-powergram also reveals almost all nominal modal components; however, it gives a lower resolution compared with Pekeris-STFT or EMCT. It should be noted that, due to the difference in computational framework, the strategy of frequency-band partition is unsuitable for Pekeris-STFT or β-powergram. This makes EMCT the first approach that considers both TF concentration and computational efficiency.

Figure 3(i) is a WVD spectrogram of mode 4. Although some inner-interference terms exist, the shape of mode 4 is clearly revealed. Nevertheless, the good performance of WVD depends on a well-done modal separation, which is a manual procedure. As shown in Fig. 3(b), the application of MCT can avoid the modal filtering step. Therefore, the proposed MCT is more convenient than WVD in dispersion curve estimation.

This Letter focuses on the estimation of dispersion curves of low-frequency impulsive sounds propagated in shallow water. Based on an approximated variation law of the modal IF derived from the ideal waveguide, an advanced TFA method named MCT and its extended version EMCT are developed to quickly and accurately extract dispersion curves without requiring prior information about the source location and the propagation environments. Processing of a special multi-mode signal reveals that the proposed EMCT is very effective in distinguishing modal components and can be made very efficient in computation by applying the strategy of frequency-band partition. Therefore, the proposed EMCT becomes the first approach that considers both TF concentration and computational efficiency. Abundant supplementary material that contain all the methods mentioned in the Introduction has been packaged to facilitate the estimation of dispersion curves for related research and applications.

This research was supported by National Natural Science Foundation of China Grant Nos. 11974286 and 11904290, Fundamental Research Funds for the Central Universities Grant No. 3102019HHZY030019, and the State Key Laboratory of Acoustics, Chinese Academy of Sciences (No. SKLA201906).

1

See supplementary material at https://www.scitation.org/doi/suppl/10.1121/10.0003367 for (a) the CSS signal obtained after source deconvolution and its dispersion curves predicted after an inversion in Bonnel et al. (2018), (b) the attached TFA programs (including MCT, EMCT, β-powergram, Pekeris-STFT, warping transform, inverse warping transform, WVD, WT, STFT, and inverse STFT), and (c) the example about the use of TFA programs.

1.
Bonnel
,
J.
,
Le Touzé
,
G.
,
Nicolas
,
B.
, and
Mars
,
J. I.
(
2013
). “
Physics-based time-frequency representations for underwater acoustics: Power class utilization with waveguide-invariant approximation
,”
IEEE Signal Process. Mag.
30
(
6
),
120
129
.
2.
Bonnel
,
J.
,
Lin
,
Y.-T.
,
Eleftherakis
,
D.
,
Goff
,
J. A.
,
Dosso
,
S.
,
Chapman
,
R.
,
Miller
,
J. H.
, and
Potty
,
G. R.
(
2018
). “
Geoacoustic inversion on the New England mud patch using warping and dispersion curves of high-order modes
,”
J. Acoust. Soc. Am.
143
(
5
),
EL405
EL411
.
3.
Bonnel
,
J.
,
Thode
,
A.
,
Wright
,
D.
, and
Chapman
,
R.
(
2020
). “
Nonlinear time-warping made simple: A step-by-step tutorial on underwater acoustic modal separation with a single hydrophone
,”
J. Acoust. Soc. Am.
147
(
3
),
1897
1926
.
4.
Chen
,
C.-S.
,
Miller
,
J.
,
Bourdreaux-Bartels
,
G.
,
Potty
,
G.
, and
Lazauski
,
C.
(
2003
). “
Time-frequency representations for wideband acoustic signals in shallow water
,” in
Oceans 2003. Celebrating the past… Teaming toward the Future (IEEE Cat. No. 03CH37492) Proceedings
, Vol.
5
, pp.
SP2903
SP2907
.
5.
Le Touzé
,
G.
,
Nicolas
,
B.
,
Mars
,
J. I.
, and
Lacoume
,
J.
(
2009
). “
Matched representations and filters for guided waves
,”
IEEE Trans. Signal Process.
57
(
5
),
1783
1795
.
6.
Niu
,
H.
,
He
,
L.
,
Li
,
Z.
,
Zhang
,
R.
, and
Nan
,
M.
(
2014
). “
Inversion for the environment parameters in shallow water using warping transforms
,”
Acta Acust.
39
(
1
),
1
10
.
7.
Potty
,
G. R.
,
Miller
,
J. H.
,
Lynch
,
J. F.
, and
Smith
,
K. B.
(
2000
). “
Tomographic inversion for sediment parameters in shallow water
,”
J. Acoust. Soc. Am.
108
(
3
),
973
986
.
8.
Yang
,
Y.
,
Peng
,
Z. K.
,
Dong
,
X. J.
,
Zhang
,
W. M.
, and
Meng
,
G.
(
2014
). “
General parameterized time-frequency transform
,”
IEEE Trans. Signal Process.
62
(
11
),
2751
2764
.

Supplementary Material