Predicting acoustic transmission loss in the SOFAR channel is challenging for classical methods, which rely on complex algorithms and computationally intensive calculations. To address these challenges, a deep learning-based underwater acoustic transmission loss prediction method is proposed. By properly training a U-net-type convolutional neural network, the method provides an accurate mapping between ray trajectories and the transmission loss over the problem domain. Verifications are performed in a SOFAR channel with Munk's sound speed profile. The results suggest that the method has the potential to serve as a fast prediction model without sacrificing accuracy.

Predicting low-frequency acoustic transmission loss (TL) in the SOFAR channel is an important research field in acoustics. Low-frequency TL plays a crucial role in various applications, such as early warning of undersea earthquakes,1 underwater sound source localization,2 and monitoring of marine mammals.3 As sound travels through the complex environment of the deep ocean, it encounters various factors, such as a varying seafloor and a stratified medium, that affect its propagation and intensity. Therefore, the prediction of low-frequency underwater acoustic TL has long been a challenge.

Ray-based models are commonly used to calculate TL in the SOFAR channel by providing a simplified representation of sound waves traveling through water. Based on the traced rays, the sound field can be calculated by solving the eikonal equation and the transport equations. Ray-based methods can handle range-dependent environments and are well adapted to long-range propagation. However, due to the high-frequency approximation, classical ray methods are usually not considered suitable for low-frequency problems.4 Here, "low frequency" does not refer to a fixed value or range, but varies with the environment. For example, the user manual of BELLHOP, which uses ray theory as its core algorithm, includes a calculation example at 50 Hz in a SOFAR channel with a range of 100 km and a depth of 5 km. It notes that 50 Hz is usually considered a low frequency in such an environment and that ray methods cannot give accurate results for this problem due to errors in the shadow zone.5

Wave-based models are another important class of methods for calculating TL in the SOFAR channel. The normal mode (NM) method is one of the basic wave-based methods. In the NM method, the sound pressure is expressed by summing a set of modal functions weighted in accordance with the source depth.6 The NM method calculates the sound field with high accuracy; however, it is ineffective for range-dependent ocean environments. The parabolic equation (PE) method is a suitable and popular wave-theory technique for solving range-dependent propagation problems.7 Early PEs usually have inherent phase errors, which limit their applicability to a certain range of angles around the main propagation direction. However, the very-wide-angle PE implementation based on Padé approximants, proposed in subsequent research, has nearly eliminated the small-angle limitations.8 This high-angle capability comes at additional computational cost.

To improve the performance of TL calculations in complex ocean environments, many extensions to the classical methods have been proposed, such as the Gaussian beam tracing method,9 the coupled mode method,10 and hybrid methods.11 These methods have improved the accuracy of underwater sound field simulation in various respects. However, they have also introduced issues such as long computation times.

In recent years, deep learning techniques have achieved remarkable progress in various scientific research fields.12,13 Deep learning architectures based on neural networks are capable of extracting valuable patterns and insights that would be challenging or time-consuming to obtain with traditional methods. Deep learning has been successfully applied in underwater acoustics, for example, to source localization,14 source depth estimation,15 and dim frequency line detection.16 It has also been increasingly used to model ocean acoustic propagation. A deep convolutional recurrent autoencoder network has been presented for data-driven learning of complex underwater wave scattering and interference.17 Deep learning methods have also been used to predict modal horizontal wavenumbers and group velocities,18 and to predict far-field acoustic propagation from near-field data.19

To rapidly and accurately predict the acoustic TL in SOFAR channels, we develop a convolutional neural network-based method for predicting a low-frequency underwater acoustic TL map from ray trajectories. In this method, a U-net-type neural network is trained with ray trajectories as input to predict TL at low frequencies. Compared to the conventional ray-based method, solving the transport equation is replaced by the deep learning model, which avoids the high-frequency approximation made in constructing the transport equation. In addition, since ray trajectories can be easily determined even in complex environments, the proposed method can predict the TL conveniently and accurately.

Consider a TL prediction problem in a SOFAR channel under a cylindrically symmetric two-dimensional scenario, where the maximum range and depth of the environment are denoted as R and Z, respectively. For a simple harmonic point source located at range r = 0 and depth zs, the Helmholtz equation for the sound pressure at a field point x can be expressed as follows:20
$$\frac{1}{r}\frac{\partial}{\partial r}\!\left(r\,\frac{\partial p(\mathbf{x})}{\partial r}\right)+\rho(z)\frac{\partial}{\partial z}\!\left(\frac{1}{\rho(z)}\frac{\partial p(\mathbf{x})}{\partial z}\right)+\left(\frac{\omega}{c(z)}\right)^{2}p(\mathbf{x})=-S_{\omega}\,\delta(z-z_{s}), \tag{1}$$
where ρ(z) is the water density, c(z) is the sound speed at depth z, ω = 2πf is the circular frequency, and $S_\omega$ is the source intensity at ω. By solving this equation, the TL at x can be calculated by
$$\mathrm{TL}(\mathbf{x})=-20\log_{10}\left|\frac{p(\mathbf{x})}{p_{0}}\right|, \tag{2}$$
where $p_0$ is the sound pressure at a distance of 1 m from the sound source.
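As a quick numerical sketch of Eq. (2) (the helper name and sample pressure value are illustrative, not from the paper):

```python
import math

def transmission_loss(p, p0=1.0):
    """TL in dB re 1 m: TL = -20 log10(|p| / |p0|)."""
    return -20.0 * math.log10(abs(p) / abs(p0))

# A pressure amplitude 1000x below the 1 m reference gives 60 dB of TL.
tl = transmission_loss(0.001)
```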
In complex environments, conventional ray models lack accuracy in the low-frequency range when determining TL. On the other hand, wave-theory methods are usually computationally intensive.10,21 To address this problem, a ray-trajectory-guided underwater acoustic TL prediction method based on a deep neural network is proposed. We transform the problem of calculating TL at individual points into solving for a TL map on a grid D defined on the SOFAR channel. By successfully training a neural network $g_\Theta$ with parameter set Θ, the TL map on D at a frequency f can be obtained from a ray-trace-related input $U^D$ as follows:
$$T^{D,f}_{\mathrm{pred}}=g_{\Theta}\!\left(U^{D}\right). \tag{3}$$

To calculate ray trajectories in a given SOFAR channel, a grid $D_0:\{m_z \times m_r\}$ is defined on the 2D plane illustrated in Fig. 1(a). Assume $N_r$ rays are emitted from the sound source at equal angular intervals within the angular range $\theta = [-\theta_0, \theta_0]$.
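The equal-interval ray fan can be sketched as follows (the function name is hypothetical; Nr = 30 and θ0 = 30° match the settings used later in the paper):

```python
def launch_angles(n_rays, theta0_deg):
    """Equally spaced launch angles spanning [-theta0, theta0] in degrees."""
    if n_rays == 1:
        return [0.0]
    step = 2.0 * theta0_deg / (n_rays - 1)
    return [-theta0_deg + i * step for i in range(n_rays)]

angles = launch_angles(30, 30.0)  # 30 rays from -30 deg to +30 deg
```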

Fig. 1.

Schematic of generation of ray trajectories. (a) Calculating plane is discretized into a grid, and source emits rays within an angular range. (b) Calculated ray trajectories on a fine grid D 0. (c) Original rays are downsampled on a coarse grid D. (d) Downsampled ray trajectories.


For each ray, its trajectory on grid $D_0$ can be calculated according to Snell's law.4 After all rays are traced, a set of ray trajectories $U^{D_0}$ is obtained, as shown in Fig. 1(b). For the cell corresponding to depth index i ($0 < i \le m_z$) and range index j ($0 < j \le m_r$), the ijth component of $U^{D_0}$ is the number of rays that pass through that cell.

Unlike the conventional ray method for calculating TL, only the propagation paths of the rays are computed, without considering their intensities. To calculate the ray trajectories with high spatial resolution, $D_0$ contains a large number of cells, usually too many to be used directly as an input to the deep neural network. To feed the ray trajectories into the network at an appropriate scale, we further downsample them onto a coarse grid $D:\{n_z \times n_r\}$, with $n_z < m_z$ and $n_r < m_r$, as shown in Fig. 1(c):
$$U^{D}=M\!\left(U^{D_{0}}\right), \tag{4}$$
where $M(\cdot)$ is the downsampling operator. In this processing, for the cell on grid $D$ corresponding to depth index k and range index l, $U^{D}_{k,l}$ is the sum of the entries of $U^{D_0}$ over the cells that fit inside the klth cell of $D$. After processing all cells on $D$, a downsampled set of ray trajectories $U^D$ is obtained, as illustrated in Fig. 1(d). This collection of downsampled ray trajectories, after the scale processing introduced in Sec. 2.4, is used as the input to the neural network.
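A minimal sketch of the downsampling operator M(·), assuming the coarse cells tile the fine grid evenly (the paper's {200 × 4000} → {128 × 256} mapping does not divide evenly, so this is illustrative only; names are hypothetical):

```python
def downsample_counts(u_fine, factor_z, factor_r):
    """Sum-pool a fine grid of per-cell ray counts onto a coarse grid."""
    mz, mr = len(u_fine), len(u_fine[0])
    nz, nr = mz // factor_z, mr // factor_r
    u_coarse = [[0] * nr for _ in range(nz)]
    for i in range(mz):
        for j in range(mr):
            u_coarse[i // factor_z][j // factor_r] += u_fine[i][j]
    return u_coarse

# A 4 x 4 fine grid of ray counts pooled onto a 2 x 2 coarse grid.
fine = [[1, 0, 2, 0],
        [0, 1, 0, 0],
        [3, 0, 1, 1],
        [0, 0, 0, 2]]
coarse = downsample_counts(fine, 2, 2)  # [[2, 2], [3, 4]]
```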

A U-net is used as the neural network architecture in this research. The U-net is a convolutional neural network architecture commonly used for image segmentation tasks. Proposed in 2015, it has become widely adopted in the field of image analysis.22 In recent years, it has also been introduced in sound field prediction23 and has achieved promising results.

The architecture of the U-net used in this research is illustrated in Fig. 2(a). It consists of an encoder path and a decoder path, which are connected through skip connections. The encoder path gradually downsamples the input ray trajectory, extracting high-level features. In each convolutional layer, convolution is performed as illustrated in Fig. 2(b). "Same" padding is used in the convolutions, and the rectified linear unit (ReLU) is used as the activation function to avoid the vanishing gradient problem. Following each convolutional layer, max pooling with 2 × 2 filters and a stride of (2, 2) is performed, which reduces the spatial dimensions of the feature maps. The decoder path performs upsampling operations to progressively recover the spatial resolution determined by D and finally generates the scaled predicted TL map.
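The spatial sizes in Fig. 2(a) can be traced with a small bookkeeping sketch (it tracks only the 2 × 2 pooling and upsampling steps, not the convolutions; the helper name is illustrative):

```python
def unet_spatial_sizes(h, w, depth=4):
    """Feature-map sizes down the encoder (2x2 max pooling) and back up
    the decoder (2x2 upsampling) for a U-net with 'same' convolutions."""
    encoder = [(h, w)]
    for _ in range(depth):
        h, w = h // 2, w // 2
        encoder.append((h, w))
    decoder = []
    for _ in range(depth):
        h, w = h * 2, w * 2
        decoder.append((h, w))
    return encoder, decoder

enc, dec = unet_spatial_sizes(128, 256)
# encoder: (128, 256) -> (64, 128) -> (32, 64) -> (16, 32) -> (8, 16)
# decoder: (16, 32) -> (32, 64) -> (64, 128) -> (128, 256)
```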

Fig. 2.

Schematic diagram of the U-Net architecture used in this paper. (a) Architecture of the U-Net used in this paper. Blue numbers represent the width × height. Black numbers denote the length of the input and output volumes. (b) Schematic of a convolution layer.


The skip connections between the encoder and decoder paths allow the network to preserve and integrate both local and global information. They help in recovering fine details by bypassing the low-level feature maps directly to the decoder path.24 

The reference data for the neural network consist of TL maps produced in the same environment as the ray trajectory calculation. The data can be calculated by selecting an appropriate method for the specified environment. The TL in dB at frequency f is referred to as the ground truth data $T^{D,f}_{\mathrm{GT}}$ in the training.

Due to the significant difference in magnitude between the downsampled ray trajectory $U^D$ and the ground truth TL map $T^{D,f}_{\mathrm{GT}}$, scaled versions of the data are used. To make the scaled data recoverable to their real values, the scaling is given by
$$\tilde{U}^{D}=U^{D}/N_{r},\qquad \tilde{T}^{D,f}_{\mathrm{GT}}=T^{D,f}_{\mathrm{GT}}/\beta, \tag{5}$$
where $\tilde{U}^{D}$ and $\tilde{T}^{D,f}_{\mathrm{GT}}$ denote the scaled ray trajectory and ground truth TL map used by the network, $N_r$ is the number of rays in the ray trajectory calculation, and $\beta$ is a parameter that scales the ground truth data. In this research, $\beta$ is set to 200, which is usually larger than the maximum TL magnitude in a general environment. In this way, all data are scaled to the range [0, 1] while maintaining the global data structure. As shown in Fig. 2(a), the network outputs the scaled result $\tilde{T}^{D,f}_{\mathrm{pred}}$. Finally, results $T^{D,f}_{\mathrm{pred}}$ on their real scale are obtained through an inverse scaling of the network output.
A hybrid loss function is defined in this research. First, since predicting the TL map at a given frequency is quite similar to an image processing task, the commonly used structural similarity index measure (SSIM)25 from the image processing field is used to construct the loss function. SSIM compares the local patterns of luminance, contrast, and structure in two images A and B as follows:
$$\mathrm{SSIM}(A,B)=\frac{(2\mu_{A}\mu_{B}+c_{1})(2\sigma_{AB}+c_{2})}{(\mu_{A}^{2}+\mu_{B}^{2}+c_{1})(\sigma_{A}^{2}+\sigma_{B}^{2}+c_{2})}, \tag{6}$$
where μ is the mean of the corresponding matrix entries, σ² is the estimate of the variance of the entries, and $\sigma_{AB}$ is the covariance estimate between the entries of A and B; $c_1$ and $c_2$ are two constants used to stabilize the division.
Then, SSIM loss can be expressed by the following:
$$L_{\mathrm{SSIM}}=1-\mathrm{SSIM}\!\left(\tilde{T}^{D,f}_{\mathrm{GT}},\tilde{T}^{D,f}_{\mathrm{pred}}\right). \tag{7}$$
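A simplified, single-window version of Eqs. (6) and (7) (the SSIM of Ref. 25 is computed over local sliding windows; the constants c1 = 1e-4 and c2 = 9e-4 are common defaults for data in [0, 1] and are assumptions here):

```python
def ssim(a, b, c1=1e-4, c2=9e-4):
    """Global SSIM of two equal-length lists of values (single window)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((x - mu_b) ** 2 for x in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def ssim_loss(gt, pred):
    """L_SSIM = 1 - SSIM, cf. Eq. (7)."""
    return 1.0 - ssim(gt, pred)

loss = ssim_loss([0.2, 0.4, 0.6], [0.2, 0.4, 0.6])  # identical maps -> 0
```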
Furthermore, to improve the prediction accuracy of the network on local information, the concept of the image gradient (IG)26 is also used to construct the loss function. The IG is the spatial rate of change of an image, describing the variation trend of local pixels. The loss function based on the IG is defined as follows:
$$L_{\mathrm{IG}}=\mathrm{mean}\!\left[\left(G_{z\_\mathrm{pred}}-G_{z\_\mathrm{GT}}\right)^{2}+\left(G_{r\_\mathrm{pred}}-G_{r\_\mathrm{GT}}\right)^{2}\right], \tag{8}$$
where Gz_pred, Gz_GT, Gr_pred, and Gr_GT denote the IG matrices of the predicted and ground truth TL maps in the z and r directions, and mean[·] is the average operation on the tensor.
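A sketch of the IG loss, assuming forward finite differences for the gradients (the paper does not specify the gradient operator, and since the z- and r-direction gradient maps have different shapes under this choice, the mean here is taken over both sets of squared differences together; names are illustrative):

```python
def image_gradients(img):
    """Forward-difference gradients of a 2D map in the z (row) and r (column) directions."""
    rows, cols = len(img), len(img[0])
    gz = [[img[i + 1][j] - img[i][j] for j in range(cols)] for i in range(rows - 1)]
    gr = [[img[i][j + 1] - img[i][j] for j in range(cols - 1)] for i in range(rows)]
    return gz, gr

def ig_loss(pred, gt):
    """Mean squared difference of the gradient maps, cf. Eq. (8)."""
    gz_p, gr_p = image_gradients(pred)
    gz_g, gr_g = image_gradients(gt)
    flat = lambda m: [v for row in m for v in row]
    sq = [(a - b) ** 2 for a, b in zip(flat(gz_p) + flat(gr_p),
                                       flat(gz_g) + flat(gr_g))]
    return sum(sq) / len(sq)
```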
The total loss function is defined by
$$L=\alpha L_{\mathrm{IG}}+(1-\alpha)L_{\mathrm{SSIM}}, \tag{9}$$
where α is a hyper-parameter, which is set to 0.8 in our study.

To evaluate the performance of the proposed method on predicting the TL map, we train the network using the data generated from different sound source depths and then test the network using the data from new source depths that were never used in the training. The training and test data are simulated in a SOFAR channel with a continental slope as illustrated in Fig. 3(a). In such a range-dependent environment, ray-based methods lack sufficient accuracy at low frequencies, while wave-based methods usually require complex and time-consuming algorithms to perform the calculations.

Fig. 3.

Environment and training loss. (a) Calculating environment with a continental slope. (b) Munk's sound speed profile. (c) Training loss curves of the model at 10 Hz. (d) Training loss curves of the model at 50 Hz.


The maximum range R is 100 km, and the maximum depth Z is 5 km. The maximum range and height of the slope are 100 km and 1 km, respectively. The ocean surface is assumed to be a pressure-release boundary, and the seafloor is an acousto-elastic half-space with a sound speed of 1550 m/s and a density of 1 g/cm³. We consider an environment with a depth-dependent sound speed following Munk's sound speed profile,27 as shown in Fig. 3(b). The ray trajectories are calculated using the BELLHOP code,28 and the ground truth TL maps are calculated using the RAM code,29 which obtains results with the PE method.

In this paper, the networks are trained independently at individual frequencies from 10 to 50 Hz with an interval of 5 Hz; thus, nine network models are obtained. This strategy increases the repetitive work in training. However, for tasks with a specific frequency of interest, it can effectively reduce the complexity of the network.

Based on the aforementioned training strategy, nine training sets are built at the specified frequencies. Each training set consists of two parts, namely, the ray trajectories and the ground truth TL maps. Original ray trajectories $U^{D_0}$ are computed on a grid of size $D_0:\{m_z \times m_r\} = \{200 \times 4000\}$. Then, $U^{D_0}$ is downsampled to $U^D$ on a grid of size $D:\{n_z \times n_r\} = \{128 \times 256\}$. Note that the grids are generated on the rectangular plane shown in Fig. 3(a), which covers the slope area. Calculations were performed for 541 source depths ranging from 300 to 3000 m with a constant interval of 5 m. For each source depth, $N_r = 30$ rays are emitted from the source within an angular range of θ = [−30°, 30°]. Ground truth TL maps are also computed on grid D at the same source depths to calculate the loss. Since ray trajectories are frequency-independent, they are the same in all training sets.

In the training, the dataset $\Omega^{f}_{\mathrm{train}}$ is randomly divided into two sets with a ratio of 3:1, used as the training set and validation set, respectively. The training is performed with the ADAM optimizer. The batch size is set to 2, and the learning rate is 0.0001. The hyper-parameters of the neural networks in this paper are listed in Table 1.

Table 1.

Parameters of the neural network and the training.

Layer name | Input size | Hyper-parameters | Output size
(C: filter size, (stride), filter number; M: filter size, (stride); D: filter size)
C1–M1 | 128 × 256 × 1   | C1: 3 × 3 × 1, (1, 1), 64; M1: 2 × 2, (2, 2)    | 64 × 128 × 64
C2–M2 | 64 × 128 × 64   | C2: 3 × 3 × 64, (1, 1), 128; M2: 2 × 2, (2, 2)  | 32 × 64 × 128
C3–M3 | 32 × 64 × 128   | C3: 3 × 3 × 128, (1, 1), 256; M3: 2 × 2, (2, 2) | 16 × 32 × 256
C4–M4 | 16 × 32 × 256   | C4: 3 × 3 × 256, (1, 1), 512; M4: 2 × 2, (2, 2) | 8 × 16 × 512
D1–C5 | 8 × 16 × 512    | D1: 2 × 2; C5: 3 × 3 × 1024, (1, 1), 512        | 16 × 32 × 512
D2–C6 | 16 × 32 × 512   | D2: 2 × 2; C6: 3 × 3 × 768, (1, 1), 256         | 32 × 64 × 256
D3–C7 | 32 × 64 × 256   | D3: 2 × 2; C7: 3 × 3 × 384, (1, 1), 128         | 64 × 128 × 128
D4–C8 | 64 × 128 × 128  | D4: 2 × 2; C8: 3 × 3 × 192, (1, 1), 64          | 128 × 256 × 64
C9    | 128 × 256 × 64  | C9: 1 × 1 × 64, (1, 1), 1                       | 128 × 256 × 1

As mentioned above, a total of nine network models are trained from 10 to 50 Hz to verify the effectiveness of the method. At each frequency, after 300 epochs of training, the loss curves converge to a stable value, as shown by the examples in Figs. 3(c) and 3(d).

Test data are calculated in the same environment as the training data. At frequency f, ray trajectories at 100 random source depths ranging from 300 to 3000 m are computed as the test inputs. TL maps calculated by RAM under the same conditions are taken as the ground truth data. The proposed method is also compared with the classical ray method, which likewise uses the ray trajectories to calculate the TL; the ray-method results are obtained with BELLHOP. Two examples of TL maps obtained with the different methods are shown in Fig. 4(a).

Fig. 4.

Comparisons of TL maps and error analysis. (a) Two examples of TL maps of ground truth, the proposed method, and the ray method. (b) MAE, MSSIM, and their 95% confidence intervals of the proposed method and the ray method.

We use mean absolute error (MAE) and mean SSIM (MSSIM)23 to measure the performance of the proposed method. MAE and MSSIM at each frequency are defined as follows:
$$\mathrm{MAE}^{f}=\frac{1}{N_{\mathrm{sd}}}\sum_{z_{s}\in\Gamma}\frac{1}{N_{D}}\left\|T^{D,f,z_{s}}_{\mathrm{pred}}-T^{D,f,z_{s}}_{\mathrm{GT}}\right\|_{1},\qquad \mathrm{MSSIM}^{f}=\frac{1}{N_{\mathrm{sd}}}\sum_{z_{s}\in\Gamma}\mathrm{SSIM}\!\left(T^{D,f,z_{s}}_{\mathrm{pred}},T^{D,f,z_{s}}_{\mathrm{GT}}\right), \tag{10}$$
where Γ is the set of testing source depths and $z_s \in \Gamma$ is a source depth; both $T^{D,f,z_s}_{\mathrm{pred}}$ and $T^{D,f,z_s}_{\mathrm{GT}}$ are matrices of size 128 × 256; $\|\cdot\|_1$ denotes the L1 norm; $N_D$ = 128 × 256 is the number of points on grid D; and $N_{\mathrm{sd}}$ = 100 is the number of testing source depths at each frequency.
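The MAE half of Eq. (10) can be sketched as follows (names are illustrative; the MSSIM half would reuse an SSIM routine in the same loop over source depths):

```python
def mae_map(pred, gt):
    """Mean absolute error between one predicted and one ground-truth TL map."""
    n = sum(len(row) for row in gt)
    err = sum(abs(p - g) for rp, rg in zip(pred, gt) for p, g in zip(rp, rg))
    return err / n

def mae_over_depths(preds, gts):
    """Average the per-map MAE over the set of testing source depths (Gamma)."""
    return sum(mae_map(p, g) for p, g in zip(preds, gts)) / len(gts)

m = mae_map([[60.0, 62.0]], [[61.0, 64.0]])  # (1 + 2) / 2 = 1.5
```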

The MAE, MSSIM, and their 95% confidence intervals from 10 to 50 Hz with an interval of 5 Hz are illustrated in Fig. 4(b).

From Fig. 4(a), good similarity is observed between the predicted and true data. This indicates that the proposed method is capable of predicting the TL maps for a given source depth, frequency, and sound speed profile from the corresponding ray trajectories. Figure 4(b) illustrates that the proposed method predicts the TL maps at low error levels. The error tends to increase slightly with frequency, because the distribution of TL at higher frequencies exhibits higher complexity. In addition, the 95% confidence intervals of both MAE and MSSIM are similar across frequencies, which demonstrates that the network is stable in predicting the TL maps for different source depths.

Figure 4 also demonstrates that the ray method has obviously larger errors than the proposed method. The ray method lacks sufficient accuracy in areas that rays do not travel through, which shows that the proposed method performs well in the low-frequency range.

To efficiently predict the TL in SOFAR channels, a deep learning-based method is proposed and examined in this research. The method provides an accurate mapping between ray trajectories and the TL using convolutional neural networks in an image-processing-like framework. Ray trajectories contain rich information about the wave propagation and are usually easy to obtain, and thus provide a solid data foundation for predicting the TL. A U-net-type network is used, and a hybrid loss function combining SSIM and IG is designed. By successfully training the network, the model achieves generalized learning of the underlying physics of underwater acoustic transmission from ray trajectories and then effectively and efficiently predicts the low-frequency TL. The tests in a SOFAR channel with a continental slope show that training converges quickly on a small amount of training data. The method also offers promising prospects for use in more complex environments, where its computational efficiency can be further exploited.

This work was supported by the National Natural Science Foundation of China (12074317).

The authors declare that there are no conflicts of interest regarding the publication of this paper.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1. J. Lecoulant, C. Guennou, L. Guillon, and J.-Y. Royer, "Three-dimensional modeling of earthquake generated acoustic waves in the ocean in simplified configurations," J. Acoust. Soc. Am. 146(3), 2113–2123 (2019).
2. W. Liu, Y. Yang, M. Xu, L. Lü, Z. Liu, and Y. Shi, "Source localization in the deep ocean using a convolutional neural network," J. Acoust. Soc. Am. 147(4), EL314–EL319 (2020).
3. M. Zhong, M. Torterotot, T. A. Branch, K. M. Stafford, J.-Y. Royer, R. Dodhia, and J. L. Ferres, "Detecting, classifying, and counting blue whale calls with Siamese neural networks," J. Acoust. Soc. Am. 149(5), 3086–3094 (2021).
4. F. B. Jensen, W. A. Kuperman, M. B. Porter, H. Schmidt, and A. Tolstoy, Computational Ocean Acoustics (Springer, New York, 2011).
5. M. B. Porter, "The BELLHOP manual and user's guide: Preliminary draft," http://oalib.hlsresearch.com/Rays/HLS-2010-1.pdf.
6. E. K. Westwood, C. T. Tindle, and N. R. Chapman, "A normal mode model for acousto-elastic ocean environments," J. Acoust. Soc. Am. 100(6), 3631–3645 (1996).
7. F. D. Tappert, The Parabolic Approximation Method (Springer, New York, 1977).
8. F. Sturm, "Numerical study of broadband sound pulse propagation in three-dimensional oceanic waveguides," J. Acoust. Soc. Am. 117(3), 1058–1079 (2005).
9. M. B. Porter and H. P. Bucker, "Gaussian beam tracing for computing ocean acoustic fields," J. Acoust. Soc. Am. 82, 1349–1359 (1987).
10. B. J. DeCourcy and T. F. Duda, "A coupled mode model for omnidirectional three-dimensional underwater sound propagation," J. Acoust. Soc. Am. 148(1), 51–62 (2020).
11. T. He, B. Wang, S. Mo, and E. Fang, "Predicting range-dependent underwater sound propagation from structural sources in shallow water using coupled finite element/equivalent source computations," Ocean Eng. 272, 113904 (2023).
12. Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature 521(7553), 436–444 (2015).
13. M. J. Bianco, P. Gerstoft, J. Traer, E. Ozanich, M. A. Roch, S. Gannot, and C.-A. Deledalle, "Machine learning in acoustics: Theory and applications," J. Acoust. Soc. Am. 146(5), 3590–3628 (2019).
14. Y. Liu, H. Niu, S. Yang, and Z. Li, "Multiple source localization using learning-based sparse estimation in deep ocean," J. Acoust. Soc. Am. 150(5), 3773–3786 (2021).
15. S. Yoon, H. Yang, and W. Seong, "Deep learning-based high-frequency source depth estimation using a single sensor," J. Acoust. Soc. Am. 149(3), 1454–1465 (2021).
16. Y. N. Han, Y. Y. Li, Q. Y. Liu, and Y. L. Ma, "DeepLofargram: A deep learning based fluctuating dim frequency line detection and recovery," J. Acoust. Soc. Am. 148(4), 2182–2194 (2020).
17. W. Mallik, R. Jaiman, and J. Jelovica, "Deep neural network for learning wave scattering and interference of underwater acoustics," Phys. Fluids 36, 017137 (2024).
18. A. Varon, J. Mars, and J. Bonnel, "Approximation of modal wavenumbers and group speeds in an oceanic waveguide using a neural network," JASA Express Lett. 3, 066003 (2023).
19. W. Mallik, R. K. Jaiman, and J. Jelovica, "Predicting transmission loss in underwater acoustics using convolutional recurrent autoencoder network," J. Acoust. Soc. Am. 152(3), 1627–1638 (2022).
20. H. Tu, Y. Wang, Q. Lan, W. Liu, W. Xiao, and S. Ma, "A Chebyshev-Tau spectral method for normal modes of underwater sound propagation with a layered marine environment," J. Sound Vib. 492, 115784 (2021).
21. H. Tu, Y. Wang, C. Yang, X. Wang, S. Ma, W. Xiao, and W. Liu, "A novel algorithm to solve for an underwater line source sound field based on coupled modes and a spectral method," J. Comput. Phys. 468, 111478 (2022).
22. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, New York, 2015), pp. 234–241.
23. F. Lluís, P. Martínez-Nuevo, M. Bo Møller, and S. Ewan Shepstone, "Sound field reconstruction in rooms: Inpainting meets super-resolution," J. Acoust. Soc. Am. 148(2), 649–659 (2020).
24. P. Qian, W. Gan, H. Niu, G. Ji, Z. Li, and G. Li, "A feature-compressed multi-task learning U-Net for shallow-water source localization in the presence of internal waves," Appl. Acoust. 211, 109530 (2023).
25. W. Zhou, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process. 13(4), 600–612 (2004).
26. Q. Song, R. Xiong, D. Liu, Z. Xiong, F. Wu, and W. Gao, "Fast image super-resolution via local adaptive gradient field sharpening transform," IEEE Trans. Image Process. 27(4), 1966–1980 (2018).
27. W. H. Munk, "Sound channel in an exponentially stratified ocean with applications to SOFAR," J. Acoust. Soc. Am. 55, 220–226 (1974).