An extended chirplet transform method, termed the Doppler chirplet transform, is proposed to estimate the velocity of a discrete tone source in uniform linear motion. This method directly uses the relation between the observed instantaneous frequency and the source velocity as the kernel of the chirplet transform. It is tested on a set of 30-s truck noise recordings and, from a statistical perspective, on simulated data. The results show that the Doppler chirplet transform produces estimates of the source velocity that are similarly accurate to those of the polynomial chirplet transform [Xu, Yang, and Yu, J. Acoust. Soc. Am. 137(4), EL320–EL326 (2015)] at a significantly shorter run time.
1. Introduction
The power spectrum of the noise produced by a propeller-driven aircraft or a vehicle consists of several spectral lines (discrete tones). When such an aircraft or vehicle travels along a straight line at a constant speed, the frequencies of these tones observed by a stationary sensor change with time due to the Doppler effect. Therefore, the velocity of the moving source can be estimated from the physical relation between the instantaneous frequency (IF) of the tone and the velocity of the sound source, termed the IF-to-velocity relation in this letter.
This velocity estimation problem has been addressed in many studies.1–6 Ferguson and Quinn1 introduced a two-step framework: (1) extract the IF estimates of the received signal by a time-frequency analysis (TFA) method; (2) fit the IF estimates with the IF-to-velocity relation under the nonlinear least squares (NLS) criterion to obtain the estimate of the velocity. Therein the short-time Fourier transform (STFT) and the Wigner-Ville distribution (WVD) are applied to extract the IFs of the received noise of a flying propeller-driven aircraft. Reid et al.2 used the same scheme to estimate the velocity, but with the polynomial WVD as the IF estimator.
The accuracy of the IF estimates largely determines the precision of the velocity estimation. Xu et al.5 used the polynomial chirplet transform7 (PCT) instead of the STFT to produce a high-quality time-frequency distribution (TFD) under the two-step framework. The PCT is a parameterized TFA method developed from the conventional chirplet transform8 by replacing the chirplet kernel with a polynomial kernel. Such an adaptation makes the PCT a very efficient TFA method, from the perspective of energy concentration, for signals with a highly nonlinear IF trajectory. However, the order of the polynomial must be large enough for a good polynomial approximation of the nonlinear IF trajectory. This means that a large number of unknown polynomial coefficients need to be estimated, resulting in a sharp increase in the computational complexity.
By directly using the IF-to-velocity relation as the kernel function, an extended chirplet transform, named the Doppler chirplet transform (Doppler-CT), is proposed. This method iteratively updates a set of motion parameters, and consequently the kernel of the chirplet transform, to obtain a TFD with high energy concentration. Since there are only four unknown motion parameters to determine, a significant reduction in the computational complexity can be achieved compared to the PCT.
2. The Doppler-CT method
Consider the case where a pure tone source of frequency f0 travels along a straight line at a constant speed v and passes by a fixed sensor, as shown in Fig. 1. The sound speed is denoted by c. The time when the source passes the closest point of approach (CPA) is denoted by τc, and the distance between the source and the sensor at that moment is denoted by dc. Under these circumstances the IF-to-velocity relation of the signal received by the sensor is given by6
where the subscript D represents the set of the unknown but constant motion parameters, including the source frequency f0, the source velocity v, the CPA time τc, and the CPA distance dc.
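As a concrete illustration, the relation can be evaluated numerically. The sketch below assumes the standard form of the Doppler relation for this geometry, IF_D(τ) = f0 c²/(c² − v²) · [1 − v²(τ − τc)/sqrt(dc²(c² − v²) + v²c²(τ − τc)²)], which is taken here to correspond to Eq. (1). The function name `doppler_if` is illustrative and not from the original work; the parameter values are the true values reported for the truck recording in Sec. 3.

```python
import math

def doppler_if(tau, f0, v, tau_c, d_c, c=347.0):
    """Observed instantaneous frequency (Hz) at sensor time tau (s).

    Assumed form of the IF-to-velocity relation for a tone source in
    uniform linear motion; c is the sound speed in m/s.
    """
    dt = tau - tau_c
    root = math.sqrt(d_c**2 * (c**2 - v**2) + (v * c * dt) ** 2)
    return f0 * c**2 / (c**2 - v**2) * (1.0 - v**2 * dt / root)

# Motion parameters reported as true values for the truck recording.
truck = dict(f0=118.66, v=6.07, tau_c=15.77, d_c=36.5)
for tau in (0.0, 15.77, 30.0):
    print(f"tau = {tau:5.2f} s -> IF = {doppler_if(tau, **truck):.2f} Hz")
```

Under this form the observed frequency tends to f0·c/(c − v) long before the CPA and to f0·c/(c + v) long after it, roughly 120.8 Hz and 116.6 Hz for these parameters.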
Fig. 1. A source radiating a signal of constant frequency f0 travels in a straight line at a constant speed v. The slant range between the sensor and the source at time τ is denoted by r. The distance from the sensor to the closest point of approach of the source is denoted by dc.
The conventional chirplet transform provides a useful tool for analyzing a tone with a linearly time-variant frequency. However, it does not apply when the frequency of the tone changes in a nonlinear pattern such as that given by Eq. (1). For such a case, the PCT provides a possible solution, leading to a TFD of high energy concentration. In the PCT, the kernel of the chirplet transform is replaced by a polynomial, which is able to approximate any nonlinearly changing frequency fairly well as long as the order of the polynomial is adequately high. In this letter, an alternative solution named the Doppler-CT is presented, in which the chirplet kernel is replaced directly with the IF-to-velocity relation. The TFD of the Doppler-CT of a signal s(t) for some time instant τ and angular frequency ω is given by
where
are the nonlinear frequency rotating operator and the frequency shifting operator, respectively. The term wσ(t) represents a time window that is herein given by a Gaussian function with a standard deviation σ, i.e.,
In practice, for a signal s(t) of duration T and a time window wσ(t) of length Tw, the range for the time instant τ in Eq. (2) is limited to [Tw/2, T – Tw/2] to mitigate the truncation effects.
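For orientation, one common way to write a parameterized chirplet transform of this kind is sketched below. It is an assumption based on the general form used for polynomial chirplet transforms, with the polynomial kernel replaced by the relation of Eq. (1), and it may differ in normalization or sign convention from the expressions used in this letter. Here $z(t)$ denotes the analytic form of $s(t)$ and $\mathrm{IF}_D$ the relation of Eq. (1); both symbols are introduced for this sketch.

```latex
\begin{align}
\mathrm{TFD}(\tau,\omega;D,\sigma) &= \int_{-\infty}^{\infty} z(t)\,\Phi_D^{R}(t)\,\Phi_D^{M}(t,\tau)\,w_\sigma(t-\tau)\,e^{-j\omega t}\,dt,\\
\Phi_D^{R}(t) &= \exp\!\left(-j\,2\pi\int_{0}^{t}\mathrm{IF}_D(u)\,du\right),\\
\Phi_D^{M}(t,\tau) &= \exp\!\left(j\,2\pi\,\mathrm{IF}_D(\tau)\,t\right),\\
w_\sigma(t) &= \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{t^{2}}{2\sigma^{2}}\right).
\end{align}
```

In this form the rotating operator removes the frequency modulation implied by D, and the shifting operator moves the demodulated component to $\mathrm{IF}_D(\tau)$. The spectrum at time τ therefore peaks near the true IF even when D is only approximately right, and the peak sharpens as D approaches the true parameters.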
The Doppler-CT updates the motion parameter set D of the source iteratively, and therefore adapts the kernel of the chirplet transform, to produce a TFD with improved energy concentration. An outline of the Doppler-CT can be summarized as follows.
(1) Roughly estimate the frequency range of the tone of interest from the spectrum, then apply a band pass filter to suppress the adjacent tones.
(2) Initialize the unknown constant parameters of the set D and set the iteration number k = 1.
(3) In the kth iteration, substitute the parameter set Dk–1 into Eq. (2) to produce TFDk.
(4) Slice TFDk to obtain a spectrum for each time instant τ and locate the peak of the spectrum, which gives the IF of the tone at that time. Join these IF estimates to form the IF curve IFk(τ).
(5) Estimate the parameter set Dk by fitting IFk(τ) with the IF-to-velocity relation of Eq. (1) under the NLS criterion,
$$D_k = \arg\min_{D} \sum_{\tau} \left[\mathrm{IF}_k(\tau) - \mathrm{IF}_D(\tau)\right]^2, \tag{5}$$
where the Gauss-Newton algorithm is used to find the optimum parameter set.
(6) Set k = k + 1 and repeat steps (3) through (6) until the termination criterion of Eq. (6) is satisfied for some small positive value δ or a maximum number of iterations is reached.
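The fitting in step (5) can be sketched as follows. This is a minimal damped Gauss-Newton fit, assuming the standard form of the Doppler relation for Eq. (1), a forward-difference Jacobian, and a step-halving safeguard. These choices, the function names, and the starting values are illustrative assumptions rather than details taken from this letter.

```python
import numpy as np

def doppler_if(tau, f0, v, tau_c, d_c, c=347.0):
    """Assumed form of the IF-to-velocity relation (Hz); tau in seconds."""
    dt = tau - tau_c
    root = np.sqrt(d_c**2 * (c**2 - v**2) + (v * c * dt) ** 2)
    return f0 * c**2 / (c**2 - v**2) * (1.0 - v**2 * dt / root)

def fit_motion_parameters(tau, if_obs, start, c=347.0, max_iter=50, tol=1e-12):
    """Damped Gauss-Newton fit of (f0, v, tau_c, d_c) to an IF curve."""
    p = np.asarray(start, dtype=float)
    resid = if_obs - doppler_if(tau, *p, c=c)
    cost = resid @ resid
    for _ in range(max_iter):
        # Forward-difference Jacobian of the model with respect to p.
        base = doppler_if(tau, *p, c=c)
        jac = np.empty((tau.size, p.size))
        for j in range(p.size):
            h = 1e-6 * max(1.0, abs(p[j]))
            q = p.copy()
            q[j] += h
            jac[:, j] = (doppler_if(tau, *q, c=c) - base) / h
        # Gauss-Newton step: least-squares solution of jac @ step = resid.
        step = np.linalg.lstsq(jac, resid, rcond=None)[0]
        # Halve the step until the squared error decreases.
        scale, improved = 1.0, False
        while scale > 1e-4:
            trial = p + scale * step
            trial_resid = if_obs - doppler_if(tau, *trial, c=c)
            trial_cost = trial_resid @ trial_resid
            if trial_cost < cost:
                improved = True
                break
            scale *= 0.5
        if not improved:
            break
        p, resid, cost = trial, trial_resid, trial_cost
        if cost < tol:
            break
    return p

# Synthetic, noise-free IF curve built from the reported true truck values.
tau = np.linspace(0.0, 30.0, 301)
true_params = (118.66, 6.07, 15.77, 36.5)
if_curve = doppler_if(tau, *true_params)
estimate = fit_motion_parameters(tau, if_curve, start=(118.0, 5.0, 15.0, 30.0))
print(np.round(estimate, 3))
```

With noisy IF estimates the residual does not vanish, so the squared-error threshold `tol` would normally be replaced by a test on the change in the parameters between iterations.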
It is worth mentioning that in step (4) the spectrum for each time instant has a unique peak, because only one tone is left in the frequency band of interest. The location of the peak is the frequency at which the energy of the tone concentrates at that time instant, i.e., the IF of the tone. In practice, a simple search over the discrete frequency grid for this peak yields the IF estimate.
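Steps (3) and (4) can be sketched on a synthetic tone as follows. The code assumes the standard form of the Doppler relation for Eq. (1), a kernel built from a rotating and a shifting operator as in polynomial chirplet transforms, and a cumulative sum as the approximation of the phase integral. All names and numerical values are illustrative. A real recording would first need band pass filtering and conversion to an analytic signal (for example by a Hilbert transform), which is skipped here by synthesizing a complex tone directly.

```python
import numpy as np

def doppler_if(tau, f0, v, tau_c, d_c, c=347.0):
    """Assumed form of the IF-to-velocity relation (Hz); tau in seconds."""
    dt = tau - tau_c
    root = np.sqrt(d_c**2 * (c**2 - v**2) + (v * c * dt) ** 2)
    return f0 * c**2 / (c**2 - v**2) * (1.0 - v**2 * dt / root)

def doppler_ct(z, fs, params, freqs, win_len, sigma, hop, c=347.0):
    """Magnitude TFD of the analytic signal z on the grid freqs (Hz)."""
    t = np.arange(z.size) / fs
    if_d = doppler_if(t, *params, c=c)
    # Rotating operator: remove the frequency modulation implied by params.
    rotated = z * np.exp(-2j * np.pi * np.cumsum(if_d) / fs)
    half = win_len // 2
    k = np.arange(-half, half + 1)
    window = np.exp(-0.5 * (k / fs / sigma) ** 2)  # Gaussian, std sigma (s)
    centers = np.arange(half, z.size - half, hop)  # skip truncated edges
    tfd = np.empty((freqs.size, centers.size))
    for col, m in enumerate(centers):
        idx = m + k
        # Shifting operator: move the demodulated tone to IF_D(tau).
        shifted = rotated[idx] * np.exp(2j * np.pi * if_d[m] * t[idx])
        basis = np.exp(-2j * np.pi * np.outer(freqs, t[idx]))
        tfd[:, col] = np.abs(basis @ (shifted * window))
    return t[centers], tfd

def extract_if(freqs, tfd):
    """Step (4): pick the peak frequency of each time slice."""
    return freqs[np.argmax(tfd, axis=0)]

# Synthetic analytic tone; all values below are illustrative only.
fs = 400.0
t = np.arange(1600) / fs
params = (50.0, 30.0, 2.0, 20.0)  # f0 (Hz), v (m/s), tau_c (s), d_c (m)
z = np.exp(2j * np.pi * np.cumsum(doppler_if(t, *params)) / fs)
freqs = np.arange(40.0, 60.0, 0.25)
taus, tfd = doppler_ct(z, fs, params, freqs, win_len=128, sigma=0.08, hop=20)
if_curve = extract_if(freqs, tfd)
print(np.max(np.abs(if_curve - doppler_if(taus, *params))))
```

Because the peak is searched on a discrete grid, the IF estimates are quantized to the grid spacing (0.25 Hz here), so a finer grid or peak interpolation would be needed for more precise estimates.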
3. Example
The 30-s truck noise data9 are used to validate the efficiency of the Doppler-CT. These data were measured in an experiment where a truck traveled along a straight path at a constant speed of 6.07 m/s. The speed of sound was 347 m/s. A microphone was set up nearby to record the truck noise with a sample rate of 12 000 Hz. To reduce the computational burden, herein the data are downsampled to 1200 Hz.
The truck passby noise consists of multiple tones. In this letter, only the tone at around 118 Hz is chosen for the test. To suppress the other tones, a finite impulse response (FIR) band pass filter is used. The passband of the filter ranges from 115 to 122 Hz with a maximum passband ripple of 0.1 dB, and the lower and upper stopband cutoff frequencies are, respectively, 105 and 132 Hz with a minimum stopband attenuation of 60 dB. The time window length Tw, which largely regulates the tradeoff between the time and the frequency resolutions of the TFD, is chosen as 8192 from a small set of candidate values after several tries. Moreover, the initial values of the unknown parameters with which the Doppler-CT begins are also chosen empirically.
Figures 2(a)–2(c) illustrate, respectively, the TFDs produced by the STFT, the PCT, and the Doppler-CT. For a fair comparison, all three methods are implemented in MATLAB and run on a Windows 10 platform configured with an Intel i5-8250U CPU and 8 GB of RAM. The Doppler-CT generates a TFD with visibly higher energy concentration than the STFT. For the PCT, increasing the order of the polynomial kernel leads to improved performance. When the order is raised to 10, the PCT is able to render a TFD with energy concentration similar to that of the Doppler-CT. Although neither the Doppler-CT nor the PCT necessarily needs five iterations to reach convergence, both are set to run five iterations for a comparison of run time.
Fig. 2. (Color online) TFDs for the truck noise: (a) generated by the STFT, (b) generated by the 10th order PCT, (c) generated by the Doppler-CT, (d) IFs extracted from (a), (b), and (c).
Figure 2(d) shows the IF estimates rendered by the three methods. The true IFs are also illustrated for comparison purposes. The estimates given by all three methods are very close to the true IFs. A closer comparison shows that the IF estimation performance can be slightly improved by using the Doppler-CT or the PCT rather than the STFT.
The final estimates of the motion parameters are tabulated in Table 1. The Doppler-CT offers slightly more accurate estimates than the STFT at the cost of a sharp increase in run time (123 s versus 7.3 s). Compared to the PCT, however, the Doppler-CT achieves comparable estimation performance while running significantly faster (123 s versus 571 s).
Table 1. Motion parameters estimated by the proposed and the existing methods.
| Method | f0 (Hz) | τc (s) | v (m/s) | dc (m) | Run time (s) |
|---|---|---|---|---|---|
| True value (Ref. 10) | 118.66 | 15.77 | 6.07 | 36.5 | — |
| STFT | 118.700 | 15.506 | 6.122 | 39.060 | 7.3 |
| 10th order PCT | 118.672 | 15.645 | 6.133 | 37.883 | 571 |
| Doppler-CT | 118.674 | 15.632 | 6.119 | 37.663 | 123 |
4. Numerical simulations
Monte Carlo simulations are carried out to further evaluate the performance of the Doppler-CT from a statistical perspective. The simulations are run on the same computing platform that was used to process the truck noise data. All the statistics reported below are obtained from 300 independent trials.
First, we examine the robustness of the Doppler-CT against noise. The experiment settings for the truck passby noise measurements are largely reproduced. Figures 3(a)–3(d) show, respectively, the root-mean-square errors (RMSEs) of the estimates rendered by the Doppler-CT for the four motion parameters with respect to the signal-to-noise ratio (SNR) of the tone. The RMSEs for the STFT and the PCT are also shown for comparison purposes. It is obvious from these figures that the Doppler-CT has better performance than the STFT against noise especially for the cases where the SNR is lower than 2 dB.
Fig. 3. (Color online) Comparison of the STFT, the PCT, and the Doppler-CT on the RMSE of the estimates for (a) source frequency f0, (b) source velocity v, (c) CPA time τc, (d) CPA distance dc with respect to different SNRs, and (e) comparison between the PCT and the Doppler-CT on the convergence rate.
Then the Doppler-CT and the PCT are compared on the convergence rate, which is measured in terms of the error expressed by Eq. (6). Figure 3(e) shows the averaged results with respect to the iteration number for SNR = 6 dB. On most occasions, fewer than five iterations are needed for the two methods to reach convergence. This is also why the iteration number was set to 5 in the previous truck noise experiment.
The statistics of the run time for the three methods are provided in Table 2 for three scenarios that differ from one another in either the SNR or the source velocity. A clear advantage of the Doppler-CT over the PCT in run time can be observed from these simulations: the mean run time drops from about 572 s to about 119 s, a reduction of nearly 80% on average. This is also consistent with the result of the truck noise experiment (cf. Table 1).
Table 2. Comparison of the three methods on the run time (s). Std is short for “standard deviation.”
| Method | Mean (SNR 6 dB, v 6.07 m/s) | Std (SNR 6 dB, v 6.07 m/s) | Mean (SNR 0 dB, v 6.07 m/s) | Std (SNR 0 dB, v 6.07 m/s) | Mean (SNR 6 dB, v 12 m/s) | Std (SNR 6 dB, v 12 m/s) |
|---|---|---|---|---|---|---|
| STFT | 7.1 | 0.5 | 7.1 | 0.4 | 7.0 | 0.1 |
| PCT (5 iterations) | 572.0 | 13.3 | 571.4 | 14.6 | 571.3 | 20.3 |
| Doppler-CT (5 iterations) | 122.0 | 1.0 | 118.1 | 1.5 | 118.2 | 1.1 |
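The relative saving in run time can be checked directly from the mean values in Table 2. The short calculation below uses only those tabulated means; the variable names are illustrative.

```python
# Mean run times (s) from Table 2, one entry per scenario.
pct = [572.0, 571.4, 571.3]
doppler_ct = [122.0, 118.1, 118.2]

# Fraction of the PCT run time saved by the Doppler-CT in each scenario.
reductions = [1.0 - d / p for d, p in zip(doppler_ct, pct)]
mean_reduction = sum(reductions) / len(reductions)
print([f"{100 * r:.1f}%" for r in reductions], f"mean {100 * mean_reduction:.1f}%")
```

For these values each scenario gives a reduction of a little under 80%.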
5. Conclusion
An extended chirplet transform named the Doppler-CT is proposed to estimate the velocity of a source of discrete tones in uniform linear motion. The Doppler-CT directly uses the nonlinear IF-to-velocity relation of a tone as the kernel of the chirplet transform. It updates the kernel iteratively in order to produce a TFD with improved energy concentration. Comparisons of the Doppler-CT with the STFT and the PCT are carried out on truck passby noise data and on simulated data. The results show that the Doppler-CT is more robust to noise than the STFT at the cost of a sharply increased computational burden. In addition, it provides a good substitute for the PCT with a significant advantage in computational efficiency.
Acknowledgments
This research was supported by the National Natural Science Foundation of China under Grant No. 51679204 and the Defense Industrial Technology Development Program under Grant No. JCKY2016607C009.