A Learned Born Series for Highly-Scattering Media

A new method for solving the wave equation is presented, called the learned Born series (LBS), which is derived from a convergent Born series but whose components are found through training. The LBS is shown to be significantly more accurate than the convergent Born series for the same number of iterations in the presence of high-contrast scatterers, while maintaining comparable computational complexity. The LBS is able to generate a reasonable prediction of the global pressure field with a small number of iterations, and the errors decrease as the number of learned iterations increases.


Introduction
Computational solvers for the Helmholtz equation with heterogeneous material properties are used in many fields that require single-frequency wave simulations, including optics, seismology, acoustics, and electromagnetics. One approach is to compute the terms of the Born series [1] through the recursive application of the known Green's operator for a homogeneous medium. However, in its conventional form the Born series converges only for low-contrast scattering potentials, a major limitation in many practical applications. As an example, the application that has motivated this work is the propagation of ultrasound waves through the skull, which has a sound speed about double that of the surrounding soft tissue. A convergent Born series (CBS) was proposed by Kleinman et al. based on a generalised overrelaxation method [2,3]. More recently, Osnabrugge et al. introduced another CBS through the use of a preconditioner based on the scattering potential [4], which has since found many applications [5,6,7,8]. However, for strongly scattering media many iterations are needed, as the convergence rate is limited by the range of spatial wavenumbers in the problem. To overcome the slow convergence rate, we present a learned Born series (LBS), in which a generalised preconditioner and the modified Green's operator are found through training, for a given number of unrolled iterations [9]. We show that the convergence rate in the presence of high-contrast scatterers is considerably improved.

Learned Born Series
We aim to solve the Helmholtz differential equation (∇² + k²)u = −s, where k = ω/c₀ is the heterogeneous wavenumber, ω is the angular frequency, and c₀ is the spatially-varying sound speed. To solve this equation, following Osnabrugge et al. [4], we start by defining the scattering potential v = k² − κ² − iε, where κ², ε ∈ ℝ⁺ are positive scalars, and rewrite the differential equation as

(∇² + κ² + iε)u = −s − vu.    (1)

We denote with G the Green's operator associated with the operator L = ∇² + κ² + iε, that is

G = F⁻¹ [1/(p² − κ² − iε)] F,    (2)

where F is the Fourier transform operator and p² = p_x² + p_y² is the squared magnitude of the Fourier-transformed coordinates. Applying the Green's operator to both sides of Eq. (1) leads to u = Gs + Gvu. Repeated substitution of the right-hand side into u leads to the classical form of the Born series

u = Gs + GvGs + GvGvGs + … = Σ_{k=0}^∞ (Gv)^k Gs,    (3)

under the assumption that the series converges. As mentioned in Sec. 1, this assumption is satisfied only for weakly scattering media. Osnabrugge et al. [4] proposed the preconditioner q = iv/ε, leading to the CBS

u = Σ_{k=0}^∞ M^k qGs,   M = qGv − q + 1,    (4)

which is convergent for scattering media with arbitrarily strong sound speed contrasts for an appropriately chosen ε. Despite the convergence guarantees of Eq. (4), the convergence rate is inversely proportional to the maximum wavenumber difference between any two points in the domain. This means that for media with high sound speed contrast, a large number of iterations is necessary.
With the aim of reducing the number of iterations required to solve problems in highly-scattering media, we modify the CBS as follows. First, note that the series in Eq. (4) can be rewritten as the iteration

u_{n+1} = u_n + w_{n+1},   w_{n+1} = M w_n,    (5)

with u₀ = w₀ = qGs. To develop a learned version of Eq. (4), we transform the linear operator M into a non-linear one of the form

M(u) = A G(Bu + s) + Cu.    (6)

Here, A is the learned version of the preconditioner q, B is the learned version of the scattering potential v, and C is now a general additive term (that is not enforced to be 1 − A). Each of A, B, C is the output of a neural network dependent on the input sound speed and the spatial location maps, i.e.,

A = A_θ(c, x),   B = B_θ(c, x),   C = C_θ(c, x),    (7)

where, to simplify notation, we use θ to denote the learnable parameters for each operator at each iteration, without implying that they are the same set. The learned fields (A, B, C) are returned by three fully connected neural networks (FCNNs) acting channelwise on the (c, x) map (also known as 1×1 convolutions), with two hidden layers. The size of the hidden layers is equal to the number of channels. Similar to convolutional networks, we generalise Eq. (6) to c channels by associating the action of each (A, B, C) field with a matrix product; for example, if u ∈ ℝ^c = (u₁, u₂, …, u_c), then (Au)ᵢ = Σⱼ Aᵢⱼuⱼ. Note that the values of (A, B, C) change at each pixel location.
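To make the channelwise construction concrete, the following NumPy sketch shows a two-hidden-layer MLP applied independently at every pixel (a stack of 1×1 convolutions) producing a per-pixel c × c complex matrix field, together with its pixelwise action (Au)ᵢ = Σⱼ Aᵢⱼuⱼ. The shapes, the tanh-based GeLU approximation, and the real/imaginary packing are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GeLU activation [10]
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def channelwise_mlp(feats, W1, W2, W3):
    """Two-hidden-layer MLP applied at every pixel independently.
    feats: (H, W, f_in) -> (H, W, f_out); equivalent to 1x1 convolutions."""
    h = gelu(feats @ W1)
    h = gelu(h @ W2)
    return h @ W3

rng = np.random.default_rng(0)
H = W = 32
c = 4                                    # number of channels
feats = rng.normal(size=(H, W, 3))       # per-pixel features, e.g. (sound speed, x, y)

# Hidden size equals the channel count; output packs real and imaginary parts
W1 = rng.normal(size=(3, c)) * 0.5
W2 = rng.normal(size=(c, c)) * 0.5
W3 = rng.normal(size=(c, 2 * c * c)) * 0.1
raw = channelwise_mlp(feats, W1, W2, W3)
A = raw[..., : c * c].reshape(H, W, c, c) + 1j * raw[..., c * c :].reshape(H, W, c, c)

# Pixelwise matrix action: (A u)_i = sum_j A_ij u_j at every location
u = rng.normal(size=(H, W, c)) + 0j
Au = np.einsum('hwij,hwj->hwi', A, u)
```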
We also make the wavenumber shift of the Green's function, κ² + iε, learnable. To avoid dividing by zero during training, instead of directly learning κ² and ε, we parameterise the Green's operator as

G = F⁻¹ [1/(α(p² − κ²) − i)] F,    (8)

where α and κ² are learnable parameters. This is equivalent to the original Green's operator up to a scaling factor (that is absorbed by the A term), since 1/(α(p² − κ²) − i) = ε/(p² − κ² − iε) with α = 1/ε, while the fixed −i keeps the denominator bounded away from zero.
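The stability of this reparameterisation is easy to verify numerically. The sketch below, with illustrative grid and parameter values, checks that the kernel matches the CBS kernel up to the scale factor ε, and that its magnitude is bounded for any real α:

```python
import numpy as np

def green_hat_stable(p2, kappa2, alpha):
    """Reparameterised Green's kernel in Fourier space: 1 / (alpha (p^2 - kappa^2) - i).

    With alpha = 1/eps this equals eps / (p^2 - kappa2 - i eps), i.e. the CBS
    kernel up to the real scale eps (absorbed by the learned A field). The fixed
    -i keeps |denominator| >= 1 for any real alpha, so division by zero cannot
    occur while learning alpha and kappa2.
    """
    return 1.0 / (alpha * (p2 - kappa2) - 1j)

p2 = (2.0 * np.pi * np.fft.fftfreq(128, d=0.1)) ** 2
g = green_hat_stable(p2, kappa2=1.0, alpha=4.0)
```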
Since the input to the Green's operator is a c-channel field, the Green's function is a mapping ℝ^c → ℝ^c, and thus α, κ² are c × c matrices, where c is the number of channels. Lastly, we apply a projected non-linearity of the form σ̃(u) = D₁σ(D₂u), where D₁, D₂ are two dense layers acting channelwise and σ is a non-linear function, in our case a GeLU [10]. The final learned iteration is therefore

u_{n+1} = σ_n(M_n(u_n)),    (9)

where we have used the subscript n to highlight that σ and M have different parameters at each iteration. We set u₀ = 0. We note the similarity between Eq. (9) and the Fourier Neural Operator (FNO) framework [11], where each layer is set up as the sum of a Fourier filter and a local convolution operator,

u_{n+1} = σ(Ku_n + Cu_n),    (10)

where K is a learnable Fourier filter and C is a learnable local convolution filter. Similarly to the FNO, we pad the input by a fixed amount to accommodate the artifacts generated by the periodicity assumed by the discrete Fourier transform, and afterwards unpad the output. A block diagram of the network is given in Fig. 1a.
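Putting the pieces together, one unrolled stage of Eq. (9) can be sketched as follows. This NumPy illustration rests on stated assumptions: the form M(u) = A G(Bu + s) + Cu (so that the source enters each stage, consistent with u₀ = 0), the nonlinearity applied separately to real and imaginary parts (in the spirit of treating complex components as channels), and all parameter shapes are illustrative rather than the trained architecture.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GeLU activation [10]
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def lbs_stage(u, s, A, B, C, g_hat, D1, D2, pad=4):
    """One unrolled learned Born iteration, u_{n+1} = sigma(M(u_n)).

    u, s : (H, W, c) complex fields; A, B, C : (H, W, c, c) per-pixel matrices;
    g_hat : (H + 2 pad, W + 2 pad) Fourier-space Green's kernel on the padded grid;
    D1, D2 : (c, c) dense layers of the projected nonlinearity sigma(u) = D1 gelu(D2 u).
    """
    f = np.einsum('hwij,hwj->hwi', B, u) + s            # B u + s, per pixel
    f = np.pad(f, ((pad, pad), (pad, pad), (0, 0)))     # absorb FFT wrap-around
    Gf = np.fft.ifft2(g_hat[..., None] * np.fft.fft2(f, axes=(0, 1)), axes=(0, 1))
    Gf = Gf[pad:-pad, pad:-pad]                         # unpad the output
    m = np.einsum('hwij,hwj->hwi', A, Gf) + np.einsum('hwij,hwj->hwi', C, u)
    z = m @ D2.T                                        # projected nonlinearity:
    z = gelu(z.real) + 1j * gelu(z.imag)                # gelu on each component,
    return z @ D1.T                                     # then the outer dense layer
```

Unrolling n such stages, each with its own parameters, and starting from u₀ = 0 gives the full forward pass of the network.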

Training and Results
A dataset was constructed from 2000 MNIST [12] samples upsampled to 128×128 pixels, which were rescaled and used as sound speed distributions. The simulations were performed in normalised units with a fixed background sound speed. The dataset was split into train, validation and test subsets of 80%, 10% and 10%. For each sound speed map, a reference solution was generated using the Helmholtz solver in j-Wave [13] using GMRES, with a fixed source position. The loss function was the mean squared error against the reference solution. All networks were trained with a batch size of 16 and a learning rate of 3 × 10⁻³; the network corresponding to the epoch with the lowest validation loss was saved, although we did not observe any over-fitting during training. The network was trained for 6, 12 and 24 unrolled iterations, with c = 8. We also trained an extremely light version with c = 2 (matching the number of components of a complex field, as in the original CBS) with linear activation functions.
The simulated pressure field for a representative example from the test set, using a network trained with 12 iterations, is shown in Fig. 1b. Even with a very small number of iterations, the LBS is able to generate a reasonable prediction of the global pressure field. The errors against j-Wave for the whole test set are shown in Fig. 1d. As expected, the errors decrease with the number of learned iterations. Compared to the CBS run for the same number of iterations, the error is significantly reduced. This is because the pseudo-propagation speed of the CBS is limited by the strong contrast, as shown in Fig. 1c. The outliers correspond to sound speed maps that exhibit resonant modes, e.g. "0" digits for which the field exhibited a mode-like structure similar to whispering gallery modes [14]. The error for the CBS was also proportionally higher for these examples. To match the error of the LBS on the test set using 6, 12, and 24 iterations, the CBS required on average 13.1, 8.6, and 5.0 times more iterations, respectively (see Fig. 1e). Even when the number of channels is reduced to 2 (equivalent to the complexity of the CBS), the convergence of the LBS is still much faster than that of the CBS (see the yellow boxplot in Fig. 1d).

Summary and Discussion
We present a learned Born series, inspired by the convergent Born series of [4], that significantly improves convergence rates for single-frequency wave simulations with strong sound speed contrast. While for channel counts c > 2 the LBS requires more FFTs at each layer than the CBS, these FFTs can be computed in parallel, which still makes the method more efficient in practice.
In the future, it would be interesting to explore several potential extensions of this work. One possibility is to train on sub-sampled problems that are otherwise unsolvable with classical methods, in order to further reduce the computational complexity of the LBS. This approach could have applications in areas such as transcranial treatment planning and full waveform inversion. Another interesting direction would be to apply the method to other PDEs for which a scattering potential can be defined, such as the Schrödinger equation.
One potential way to improve the LBS would be to substitute the fixed padding with a learnable "PML-like" layer, instead of feeding in the spatial grid x. This could lead to an architecture that is resolution- and grid-size-independent, making it a neural operator as described in [15]. It would also be interesting to train the method as an iterative solver, as described in [14]. Lastly, it would be interesting to explore whether the LBS can be extended to heterogeneous density: neither the method presented in this paper nor the CBS currently supports it. Adding this capability would open up new applications where the density of the modelled medium is non-uniform, such as ultrasound in the presence of bone, or geophysics.
Figure 1: Results on the test set. (a) Block diagram for the learned Born series (LBS) with n iterations. (b) Example result against the reference. (c) Solution for 6, 12 and 24 iterations, with the LBS shown on top and the CBS at the bottom. (d) Maximum error over the test set for different numbers of iterations. (e) Number of iterations required by the CBS relative to the number of LBS stages, for a similar error, for the different numbers of stages.