A multitask convolutional neural network (CNN) is trained to localize the instantaneous position of a motorboat throughout its transit past a wide-aperture linear array of hydrophones located 1 m above the sea floor in water 20 m deep. A cepstrogram database for each hydrophone and a cross-correlogram database for each pair of adjacent hydrophones are compiled for multiple motorboat transits. Cepstrum-based and correlation-based feature vectors (along with ground-truth source bearing and range data) form the inputs to train three CNNs so that they can predict the instantaneous source range and bearing for other “unseen” motorboat transits. It is shown that CNNs operating on multi-sensor cepstrum-based feature maps are able to predict the instantaneous range and bearing of a transiting motorboat, even when the source is near an endfire direction. Likewise, CNNs operating on multi-sensor generalized cross-correlation-based feature maps are able to predict the range and bearing of a transiting motorboat in the presence of interfering multipath arrivals. When compared with the cepstrum-only CNN, the cross-correlation-only CNN, and the conventional model-based method of passive ranging by wavefront curvature, the combined cepstrum and cross-correlation CNN is shown to provide superior source localization performance in a multipath underwater acoustic environment.
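To illustrate the two feature types named above, the following minimal sketch computes a cepstrum for one simulated hydrophone and a generalized cross-correlation for a simulated hydrophone pair. It is not the authors' implementation: the PHAT weighting, the white-noise source, the sample delays, and all function and variable names are assumptions chosen for illustration.

```python
import numpy as np

def real_cepstrum(x):
    """Inverse FFT of the log magnitude spectrum of x."""
    log_mag = np.log(np.abs(np.fft.rfft(x)) + 1e-12)  # small offset avoids log(0)
    return np.fft.irfft(log_mag, n=len(x))

def gcc_phat(x, y):
    """Generalized cross-correlation with PHAT weighting (an assumed choice).

    Returns (lags, cc); a peak at a positive lag means y lags x.
    """
    n = len(x) + len(y)  # zero-pad so the correlation is linear, not circular
    cross = np.fft.rfft(y, n=n) * np.conj(np.fft.rfft(x, n=n))
    cross /= np.abs(cross) + 1e-12  # discard magnitude, keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_lag = n // 2
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))  # negative lags first
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

def delay(x, d):
    """Delay x by d samples, zero-filling the start."""
    return np.concatenate((np.zeros(d), x[:len(x) - d]))

# Hypothetical scenario: a broadband source with one multipath arrival.
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
multipath_delay = 40  # samples between direct and reflected arrivals (assumed)
sensor_delay = 25     # samples between the two hydrophones (assumed)

hydrophone_a = source + 0.6 * delay(source, multipath_delay)
hydrophone_b = delay(hydrophone_a, sensor_delay)

# Cepstrum: the multipath arrival appears as a peak at its delay (quefrency).
ceps = real_cepstrum(hydrophone_a)
est_multipath = int(np.argmax(ceps[1:len(ceps) // 2])) + 1

# Cross-correlation: the peak lag estimates the inter-sensor time delay.
lags, cc = gcc_phat(hydrophone_a, hydrophone_b)
est_sensor = int(lags[np.argmax(cc)])

print(est_multipath, est_sensor)
```

Repeating these computations over successive time frames and stacking the results would give arrays analogous to a cepstrogram (per hydrophone) and a cross-correlogram (per hydrophone pair), the kind of feature maps a CNN could take as input.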
