Predicting the Electronic Density Response of Condensed-Phase Systems to Electric Field Perturbations

We present a local and transferable machine learning approach capable of predicting the real-space density response of both molecules and periodic systems to homogeneous external electric fields. The new method, SALTER, builds on the symmetry-adapted Gaussian process regression framework SALTED. SALTER requires only a small, but necessary, modification of the descriptors used to represent the atomic environments. We assess the performance of the method on isolated water molecules, bulk water and a naphthalene crystal. Root mean square errors of the predicted density response lie at or below 10% with barely more than 100 training structures. Derived quantities, such as polarizability tensors and even Raman spectra computed from those tensors, show good agreement with those calculated directly from quantum mechanical methods. SALTER therefore performs excellently when predicting derived quantities, while retaining all of the information contained in the full electronic response. The method is thus capable of learning vector fields in a chemical context and serves as a landmark for further developments.

However, this approach comes at the cost of losing access to other properties directly related to the electronic density.
We have recently developed SALTED, a machine learning method which directly predicts the real-space electronic density of molecular and condensed-phase systems.24,25 SALTED may be applied in tandem with machine-learned potentials, providing access to every ground-state electronic-structure property which would be available in a traditional AIMD calculation.
In this Communication, we present SALTER (Symmetry-Adapted Learning of Three-dimensional Electron Responses). SALTER builds on SALTED to predict the static real-space response of the electron density of a molecule or material to a homogeneous electric field.35,36,37 The computational burden of calculating this electronic response is considerable: the most commonly employed method, density-functional perturbation theory (DFPT),38 scales similarly to the underlying DFT calculation with respect to system size, but with a 4-10 times larger prefactor.37,39 As such, this calculation is a bottleneck in assessing the dielectric properties of systems along long molecular dynamics trajectories and for large system sizes.
As in our previous work,24,25 SALTER is based on an atom-centred expansion of the density response, with the coefficients of the expansion predicted using the symmetry-adapted Gaussian process regression (SA-GPR) approach.40 This produces a method which is local, transferable, and capable of treating molecular and periodic systems on the same footing.24 The key modification introduced in this work is the inclusion of a "dummy" atom in the representation of each atomic environment, which encodes information about the direction of the applied perturbation in a simple way, while retaining all of the symmetry properties of the representation. This modification is significantly simpler than formally including the applied field in the descriptor, which would require taking the tensor product of each symmetry-adapted kernel with a rank-1 tensor describing the applied field.
We start by briefly presenting the method. We follow the same general approach outlined in our previous work learning the electron density ρ,24,25 and highlight the key differences to the previous method. For a more detailed account we refer the reader to Refs. 24 and 25. The density response ρ^(1)_β to a field applied along a Cartesian axis β can be expanded using a set of auxiliary basis functions φ, such that

ρ^(1)_β(r) ≡ dρ(r)/dε_β ≈ Σ_{iσU} c_{iσβ} φ_{iσ}(r − R_i − T(U)).   (1)

Here R_i is the position of atom i, and the basis function φ_{iσ} is centred on atom i and may be written as the product of a radial part R_n(r) and a real spherical harmonic Y_{λμ}(θ, φ); for brevity we use a composite index σ ≡ (anλμ), where a labels the atomic species. T(U) is a translation vector to a point removed from the central reference unit cell by an integer multiple U = (U_x, U_y, U_z) of the lattice vectors, present only when the system under consideration is periodic. The optimal coefficients for this expansion can be found by minimising the loss function

l(c_β) = ∫ |ρ^(1)_β(r) − ρ^(1),QM_β(r)|² dr,   (2)

which yields

c^RI_β = S^(-1) w_β.   (3)

Here S is the overlap matrix of the periodic or non-periodic basis functions, and w_β is a vector of the projections of the self-consistent density response ρ^(1),QM_β(r) onto the basis,

w_{iσβ} = Σ_U ∫ φ_{iσ}(r − R_i − T(U)) ρ^(1),QM_β(r) dr.   (4)

These optimal coefficients c^RI_β serve as the references against which the accuracy of the coefficients predicted by the ML model, c^ML_β, is evaluated. In order to predict c^ML_β, we approximate them as a linear combination of the regression weights b_{jσβ} associated with M reference environments and a covariant kernel k_{σβ}(A_i, M_j) which describes the similarity between the reference environments {M_j} and the atomic environments which comprise the target structure {A_i}:

c^ML_{iσβ} = Σ_{j=1}^{M} k_{σβ}(A_i, M_j) b_{jσβ}.   (5)

Note that here c^ML_{iσβ} and b_{jσβ} are vectors of length 2λ + 1, and k_{σβ}(A_i, M_j) is a square matrix of the same dimension. Defining and minimizing a loss function analogous to Eq.
(2) allows us to determine the regression weights b_β:

( Σ_A K_A^T S_A K_A + η K_MM ) b_β = Σ_A K_A^T w_{Aβ}.   (6)

Here the index A runs over each of the N structures in the training set, K_A is the kernel matrix coupling the environments of structure A to the reference set, the matrix K_MM couples the set of reference atomic environments to one another, and we introduce the hyperparameter η to control the regularization of the minimization. In all the examples presented here, we use a regularization parameter of η = 10^-8, finding only a weak dependence of the observed error on η. Eq. (6) is solved iteratively using the conjugate gradient algorithm; for a detailed description of the implementation the reader is referred to Ref. 25.

FIG. 1. An illustration of the modified λ-SOAP descriptors and kernels. Environments A_i are defined by a cutoff radius around a central atom i, labelled with a blue dot. The usual SOAP environments n_i (see Eq. (8)) are constructed from a Gaussian function centred on every atom within the cutoff radius. Our modified environments n_{iβ} (see Eq. (9)) add a Gaussian corresponding to an additional "dummy" atom, shown here in green, which encodes the direction of the applied field in the descriptor. The λ-SOAP kernels are then built from the symmetry-adapted overlap of these modified environments, as in Eq. (7).
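As an illustration, the regularized linear system of Eq. (6) can be solved matrix-free with conjugate gradients. The sketch below assumes that per-structure kernel blocks K_A, overlap matrices S_A and projection vectors w_A have already been assembled; all names are illustrative and do not correspond to the actual SALTED implementation.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def solve_weights(K_blocks, S_blocks, w_blocks, K_MM, eta=1e-8):
    """Solve (sum_A K_A^T S_A K_A + eta * K_MM) b = sum_A K_A^T w_A, i.e.
    Eq. (6), without ever forming the full normal matrix."""
    M = K_MM.shape[0]
    rhs = sum(K.T @ w for K, w in zip(K_blocks, w_blocks))

    def matvec(b):
        # Accumulate the action of the regularized normal matrix on b
        out = eta * (K_MM @ b)
        for K, S in zip(K_blocks, S_blocks):
            out += K.T @ (S @ (K @ b))
        return out

    op = LinearOperator((M, M), matvec=matvec)
    b, info = cg(op, rhs)
    if info != 0:
        raise RuntimeError("conjugate gradient did not converge")
    return b
```

Because each matrix-vector product streams over the structures one at a time, the memory footprint stays linear in the number of reference environments, which is the design choice behind the iterative solver mentioned above.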
The key difference between SALTER and SALTED lies in the similarity kernel k_{σβ}(A_i, M_j). In Refs. 24 and 25, kernels are constructed from the λ-SOAP representation of the atomic environments A_i and M_j.40 This captures the symmetry-covariant transformations of the spherical harmonics under rotation; the coefficients must also possess this symmetry, since they expand auxiliary basis functions whose angular parts are spherical harmonics. As such, they are independent of the laboratory frame. However, in the present case the similarity kernel k_{σβ}(A_i, M_j) must retain all of the properties exploited in previous work, while also including the laboratory-frame direction along which the perturbing field is applied.
Our solution is simple. An additional "dummy" atom of atomic number zero is added to every atomic environment on the β axis, differentiating this axis from the others. The kernels are then calculated using λ-SOAP representations of these modified atomic environments in the normal way. The dummy atom is thus naturally included in the kernels for all values of λ, ensuring that the kernels retain all of the required symmetry properties while including (symmetry-adapted) information about the laboratory-frame direction of the applied field. This idea is illustrated in Fig. 1, and defined mathematically below.
The unmodified λ-SOAP kernel is defined as

k^λ_{μμ'}(A_i, M_j) = ∫ dR D^λ_{μμ'}(R) | ∫ dr n_i(r) n_j(Rr) |²,   (7)

where the first integral is performed over all rotations of order λ, and D^λ(R) is the Wigner matrix corresponding to rotation R. The atomic environment around atom i is described as a sum of Gaussian functions of width ν centred at the position of every atom R_j which lies within some radius r_c of the central atom:

n_i(r) = Σ_j exp(−|r − R_j|² / 2ν²) Θ(r_c − |R_j|),   (8)

where the positions R_j are measured relative to the central atom and Θ(x) is the Heaviside step function. To include information about the direction of an applied field, we include a "dummy" atom in the atomic environment, adding an additional term to Eq. (8):

n_{iβ}(r) = n_i(r) + exp(−|r − x r_c r̂_β|² / 2ζ²).   (9)

Here x is a parameter between 0 and 1 specifying the position of the dummy atom and r̂_β is a unit vector in the Cartesian direction β. Note that the width ζ of the Gaussian associated with the dummy atom is not required to be the same as that associated with the physical atoms; in general ζ ≠ ν. The λ-SOAP kernel k_{σβ} which accounts for the direction of the applied field is then obtained by replacing n_i(r) with n_{iβ}(r) in Eq. (7).
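The modified environment of Eq. (9) can be sketched directly; the snippet below is a minimal illustration, assuming the dummy atom sits at a fraction x of the cutoff radius along the field axis and that all positions are given relative to the central atom (function names are illustrative, not part of any λ-SOAP library).

```python
import numpy as np

def environment_density(r, neighbors, nu, r_c):
    """n_i(r) of Eq. (8): one width-nu Gaussian per neighbor inside the cutoff."""
    n = 0.0
    for R in neighbors:
        if np.linalg.norm(R) < r_c:  # Heaviside cutoff Theta(r_c - |R|)
            n += np.exp(-np.sum((r - R) ** 2) / (2.0 * nu ** 2))
    return n

def environment_density_field(r, neighbors, nu, r_c, beta, x=0.3, zeta=1.0):
    """n_ibeta(r) of Eq. (9): add one width-zeta Gaussian for the dummy atom."""
    r_hat = np.zeros(3)
    r_hat[beta] = 1.0          # unit vector along the field direction beta
    dummy = x * r_c * r_hat    # assumed dummy-atom placement
    return environment_density(r, neighbors, nu, r_c) + \
        np.exp(-np.sum((r - dummy) ** 2) / (2.0 * zeta ** 2))
```

Since the dummy atom enters the environment exactly like a physical atom, any existing λ-SOAP kernel code can consume the modified density unchanged.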
Throughout this work, we define the % RMSE across the datasets as

% RMSE = 100 × [ Σ_{Aβ} (c^ML_{Aβ} − c^RI_{Aβ})^T S_A (c^ML_{Aβ} − c^RI_{Aβ}) / Σ_{Aβ} (c^RI_{Aβ} − c̄)^T (w^(1)_{Aβ} − w̄^(1)) ]^(1/2).   (10)

Here the sums run over all of the structures A in the training set, and the denominator is the standard deviation of the density response across the training set, evaluated through the inner products (c^RI_{Aβ} − c̄)^T (w^(1)_{Aβ} − w̄^(1)); c̄ and w̄^(1) are the mean values of the vectors of coefficients and density projections in the data set.
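A minimal sketch of this error measure follows, under the simplifying assumption that every structure shares one auxiliary basis, so all coefficient vectors have the same length; it exploits the identity ∫ ρ_f ρ_g dr = c_f^T S c_g = c_f^T w_g to evaluate the norms (the function name is illustrative).

```python
import numpy as np

def percent_rmse(c_ml, c_ri, w_ri, S):
    """% RMSE of Eq. (10). Per structure (and field direction): predicted
    coefficients c_ml, reference coefficients c_ri, reference projections
    w_ri, and overlap matrices S acting as the metric."""
    # Numerator: basis-metric norm of the prediction error
    num = sum((cm - cr) @ (Sa @ (cm - cr)) for cm, cr, Sa in zip(c_ml, c_ri, S))
    # Denominator: variance of the reference response about its mean
    c_bar = np.mean(c_ri, axis=0)
    w_bar = np.mean(w_ri, axis=0)
    den = sum((cr - c_bar) @ (wr - w_bar) for cr, wr in zip(c_ri, w_ri))
    return 100.0 * np.sqrt(num / den)
```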
In all of the following we used the FHI-aims software package41 to obtain the training data S and w_β, using density functional theory42,43 employing the LDA functional.42 In all cases "light" basis sets were used, and the auxiliary basis functions introduced in Eq. (1) are constructed by taking all possible products of these basis functions and then eliminating linear dependencies from the resulting basis set.44 The molecular dynamics simulations were performed using the i-PI code.45 We note that the "light" settings used are not the most accurate for the prediction of these properties, but our purpose here is simply to test the quality of the ML model, which can be achieved at this level of accuracy. Increasing the model accuracy only requires calculating new training data, which, as we shall see below, is not very costly. The training and reference data are available from the repositories listed in the SM.
First, we tackle the simple example of an isolated water molecule. We learn the density response to an electric field for a dataset of 1000 distorted configurations taken from Ref. 40, using both the usual kernels k_σ and the modified kernels k_{σβ}. Each molecule in the dataset is aligned to a reference molecule, as shown in Fig. 2. We use the SOAP parameters r_c = 4.0 Å and ν = 0.3 Å, as is common for water,40 and chose ζ = ν and x = 0.9 to define the dummy atom. The choice of these parameters, discussed in more detail below, makes little difference to the results in this simple example. The dataset was split into a training set of 250 and a test set of 750 structures; M = 300 reference environments were used.
The accuracy of SALTER density responses for this simple dataset is summarised in the table in Fig. 2. The first row shows that while aligning the molecules is sufficient to allow reasonably accurate learning of the density response to a field applied along the y axis, the learning entirely fails for fields applied along the x or z axis. This can be explained by symmetry: since the molecules lie in the xy plane, the SOAP descriptors are invariant to reflections in that plane. As a result, there is no differentiation between a field applied in the +z and −z direction, resulting in errors approaching 100%. By contrast, once the dummy atom is introduced, this ambiguity is removed; the direction along which the field is applied is clearly defined. As a result, we see similar errors regardless of the field direction, around 1% in every case. Indeed, we also see an improvement in the learning of the response to a field in the y direction when using the modified kernels, since the applied field direction is made explicit rather than simply being inferred from the aligned geometries.
This simple example also displays a feature of SALTER. Since the molecules are aligned, the Cartesian axes are non-equivalent. As a result, a separate machine-learning model must be trained for each non-equivalent direction. We argue that this additional cost is offset by the conceptual simplicity of the approach and the computational simplicity of calculating the descriptors, and that this drawback only applies in the specific case of non-equivalent axes. We also stress that it is not necessary to align systems for this approach to succeed; the alignment here serves a purely pedagogical purpose.
Having established that our approach works in principle, we turn to a more realistic example: a cubic water cell of side 9.67 Å containing 32 water molecules. Our dataset consists of 500 configurations, divided into a training set of 400 structures and a test set of 100 structures; this is the same dataset as used in Ref. 25. Because there are now physical atoms distributed throughout each spherical atomic environment, it is not clear a priori where the dummy atom should be placed within the environment, or what the optimal values for x and ζ in Eq. (9) would be. We performed a search of the parameter space using a subset of 50 structures from the full training set, finding the lowest errors when x = 0.3 and ζ = 1.0 Å. Full details of this optimisation are provided in the Supplementary Material.
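Such a parameter search can be sketched as a plain grid scan. Below, `train` and `evaluate` are hypothetical placeholders for a SALTER training run on the 50-structure subset and a % RMSE evaluation on held-out data; they do not correspond to the actual SALTED API, and the grids shown are illustrative.

```python
import itertools
import numpy as np

def optimise_dummy_atom(train, evaluate,
                        x_grid=(0.1, 0.3, 0.5, 0.7, 0.9),
                        zeta_grid=(0.1, 0.3, 0.5, 1.0)):
    """Scan the dummy-atom parameters (x, zeta) of Eq. (9) and return the
    pair with the lowest validation error."""
    best_params, best_err = None, np.inf
    for x, zeta in itertools.product(x_grid, zeta_grid):
        model = train(x=x, zeta=zeta)   # hypothetical training call
        err = evaluate(model)           # hypothetical validation error
        if err < best_err:
            best_params, best_err = (x, zeta), err
    return best_params, best_err
```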
The learning curves for the density response to a field applied along each Cartesian axis are shown in Fig. 3. Learning improves monotonically with increasing training set size, with the error reaching a plateau of around 12% for N ≥ 100. We used M = 3000 reference environments for these calculations (see convergence with respect to M in the SM). As the Cartesian directions are all equivalent, the machine learning model trained on the density response to a field applied along the x axis is also used to predict the response to a field along the y and z axes; the errors observed are very similar regardless of the field direction, as would be expected by symmetry.
The learning curve for this dataset saturates at a relatively large % RMSE: around three times larger than the error in the predicted densities we reported in our previous work.25 The reason for this could be a deficiency of the descriptor or a lack of quality in the training data. Our descriptors are short-ranged: any correlations in the response of the density to a field which are longer-ranged than the 4 Å cutoff radius of the atomic environment will be lost, limiting the accuracy of predictions. Alternatively, more accurate predictions may be possible if the training data were obtained using a larger basis set than the "light" basis sets of FHI-aims, since this both allows a more accurate expansion of the density response and reduces the error in the training data, which in turn reduces the noise in the machine-learning model.
We then tested the performance of SALTER on a quantity derived from the electronic response, namely the dielectric susceptibility tensor. This quantity is related to the polarizability of individual molecules, and its elements are defined by37

α_{βγ} = − ∫ r_γ ρ^(1)_β(r) dr.   (11)

Since we are considering a periodic system, when evaluating Eq. (11) we exploit the commutation relation between the unperturbed Hamiltonian and the position operator to avoid an ill-defined result.37,38 We denote the tensor defined by the density response found by a self-consistent DFPT calculation as α^QM, while α^ML is that derived from the predicted density response; the error in α^ML relative to α^QM provides a measure of the accuracy of the predicted density responses. Table I shows the errors in different elements of α^ML derived from the density response predicted by the machine learning model with N = 400, M = 3000. In all cases errors are around 5%; this compares favourably to previous direct machine-learning predictions of α^ML, which typically find errors > 10% for models trained on a few hundred structures, reducing to 5% only when trained on a few thousand structures.20,46,47 Therefore, for a given accuracy, this indirect method of predicting α^ML reduces the computational effort required to generate training data by an order of magnitude relative to a direct prediction of α^ML. The smaller errors in α^ML than in ρ^(1),ML can be understood from Eq. (11): if errors in the density response accumulate in regions of space which do not contribute significantly to the integral, one would expect to see lower errors in α^ML than in the density response from which it is derived. Indeed, we discussed a similar dependence of the error in properties derived from the electron density on the spatial distribution of errors in the electron density itself in Ref. 25.
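For a finite system, where the bare position operator is well defined, Eq. (11) reduces to a real-space quadrature; a minimal sketch on a uniform grid follows (the periodic case instead requires the DFPT commutator trick mentioned above, which is omitted here). Array layouts and names are illustrative.

```python
import numpy as np

def polarizability_from_response(rho1, points, dV):
    """Evaluate alpha[beta, gamma] = -int r_gamma rho1_beta(r) d^3r as a
    Riemann sum. rho1[beta] is the density response to a field along axis
    beta, flattened over the grid; points is (N, 3); dV is the voxel volume."""
    alpha = np.empty((3, 3))
    for beta in range(3):
        for gamma in range(3):
            alpha[beta, gamma] = -np.sum(points[:, gamma] * rho1[beta]) * dV
    return alpha
```

For a Gaussian model response ρ^(1)_β(r) = −r_β exp(−r²), this quadrature reproduces the analytic result α = (π^(3/2)/2) I to grid accuracy, which makes it a convenient sanity check for predicted responses.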
Finally, we show the performance of SALTER on a molecular crystal and on a quantity which is "twice removed" from the target prediction of the model, namely the vibrational Raman spectrum. We take the P21/a naphthalene crystal at 80 K. We used five 10 ps NVE ab initio molecular dynamics trajectories of a 2 × 2 × 1 crystal supercell (8 molecules) using the PBE functional48 with many-body dispersion corrections.49,50 To train the machine learning model, we selected 100 structures from each of these trajectories, dividing these structures into a training set of 400 configurations and a test set of the remaining 100. The learning curves for the density response to a field applied along each Cartesian direction are shown in Fig. 4. These curves are converged with respect to M, as demonstrated in the Supplementary Material, and we see no significant difference in performance when predicting the response to a field applied along each Cartesian axis. We note that the dataset is relatively homogeneous: a single training structure is sufficient to produce an ML model accurate to within 12%, and no improvement is seen when the training set size is increased beyond 100.
We then used the predicted density responses to calculate α^ML for each structure in the test set. Curiously, as shown in Table II, the errors are significantly larger than those found in bulk water (Table I), despite the % RMSE of the density response itself being significantly lower for naphthalene. Furthermore, the errors in the components of α^ML derived from the response to a field applied in the z direction are the largest, despite the errors in this response being the lowest (see the crystal cell orientation in the inset of Fig. 4). This counter-intuitive result is again due to the fact that the errors in α^ML depend not simply on the magnitude of the error in the density response but also on its spatial distribution. Specifically, for a given magnitude of error, the greater the distance between the regions of positive and negative error, the greater the error in the dielectric susceptibility derived from the density response; this is what we observe along the z direction in this crystal. This is illustrated in more detail in the SM.
We compared a Raman spectrum obtained using α^QM with one obtained indirectly using SALTER predictions of ρ^(1),ML to obtain α^ML. The anharmonic Raman spectrum of a disordered naphthalene sample (the powder spectrum) was obtained through ensemble-averaged time-correlation functions, as detailed in Ref. 27. The spectra are shown in Fig. 5. Good agreement is seen for the polymorph-sensitive low-frequency modes, with small discrepancies in the peak intensities appearing at higher frequencies. Therefore, by applying SALTER, we obtain Raman spectra of comparable quality to those obtained using ab initio methods and to those obtained from a direct learning of α^QM,20 while dramatically reducing the simulation cost and retaining access to all the information included in the real-space distribution of ρ^(1)(r).
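The time-correlation route from a polarizability trajectory to a spectrum can be sketched as follows. This is a minimal illustration, assuming the spectrum is approximated by the Fourier transform of the autocorrelation of the isotropic polarizability fluctuations; the actual workflow of Ref. 27 additionally involves quantum correction factors and the anisotropic contribution, both omitted here.

```python
import numpy as np

def raman_spectrum(alpha_t, dt):
    """alpha_t: (T, 3, 3) trajectory of polarizability tensors; dt: timestep.
    Returns frequencies and (unnormalized) intensities."""
    T = alpha_t.shape[0]
    iso = np.trace(alpha_t, axis1=1, axis2=2) / 3.0   # isotropic invariant
    d_iso = iso - iso.mean()                          # fluctuations only
    # Autocorrelation via the Wiener-Khinchin theorem; zero padding to 2T
    # removes the circular wrap-around, and the division makes it unbiased
    f = np.fft.rfft(d_iso, n=2 * T)
    acf = np.fft.irfft(f * np.conj(f))[:T] / np.arange(T, 0, -1)
    freq = np.fft.rfftfreq(T, d=dt)
    intensity = np.abs(np.fft.rfft(acf))
    return freq, intensity
```

A trajectory whose polarizability oscillates at a single frequency produces a spectrum peaked at that frequency, which is a convenient check before applying the pipeline to α^ML from a full MD run.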
In conclusion, we have presented a conceptually simple modification to the widely used λ-SOAP descriptors which allows us to use machine learning to predict a vector field; specifically, the response of the electron density to a static electric field. By adding to each atomic environment a "dummy" atom which indicates the direction of the applied field, we are able to retain all of the beneficial properties of the hierarchy of λ-SOAP kernels, which allowed us to predict the scalar electron density field in previous work,24,25 while including (symmetry-adapted) information about the laboratory-frame direction of the applied field. As a result, we expect the method developed here to be straightforward to generalise to the prediction of other vector fields. To the best of our knowledge, this is the first reported machine learning model capable of predicting vector fields in a chemical context.
We applied SALTER to two condensed-phase systems: liquid water and the P21/a naphthalene crystal. As expected for a disordered system, a single machine-learning model successfully predicted the density response of water to a field applied along each of the Cartesian axes; we derived the dielectric susceptibility of water from these predictions, finding values within 5% of the reference DFPT calculations. For naphthalene, we constructed machine learning models of similar accuracy regardless of the direction of the applied field. We found a counter-intuitive relationship between the errors in the density response, which were smallest when the field was applied along the z axis, and the errors in the corresponding dielectric susceptibilities, which were largest for the yz and zz components of the tensor. This apparent contradiction can be explained by analysing the spatial distribution of the error in the predicted density response: for a fixed error magnitude ε, the greater the spatial separation in the γ direction between the regions of positive and negative error, the larger the error in the derived polarizability and dielectric susceptibility. Since the long axis of naphthalene lies approximately along the z axis, the separation between these regions is largest when a field is applied along the z axis, and we therefore see much larger errors in the corresponding components of the dielectric susceptibility, despite the magnitude of the error in the density response being lowest for this field direction. Nevertheless, the Raman powder spectrum obtained from these indirectly predicted dielectric susceptibilities was very similar to that obtained using DFPT, and was obtained at a fraction of the computational cost. In general, this method produces accurate derived quantities while retaining the full information contained in the electron density response to an applied field.

FIG. 2. Above: A visualisation of the 1000-water-molecule dataset in which all structures are overlaid. Each molecule lies in the xy plane, with the O atom at the origin; the positions of the H atoms are varied around the equilibrium geometry, which is highlighted in blue and whose centre of mass lies on the y axis. Below: The % RMSE across the test set of 750 structures, using λ-SOAP kernels built from atomic environments which exclude (first row) and include (second row) a dummy atom.

FIG. 3. Learning curves for the density response to a field along each Cartesian direction for the bulk water dataset, M = 3000.

FIG. 4. Learning curves for the density response to a field along each Cartesian direction for the naphthalene test set, M = 2000.

Component   xx     yy     zz     xy     xz     yz
% RMSE      9.13   14.40  19.74  9.38   12.81  20.11

FIG. 5. The powder Raman spectrum of the P21/a naphthalene crystal at 80 K, calculated from time-correlation functions of dielectric susceptibilities obtained ab initio (QM) and via an indirect SALTER prediction.

FIG. 2. Learning curves for the predicted density response to an applied field along the x axis for the full bulk water dataset, converging with respect to M.

TABLE I. % RMSE in each component of the dielectric susceptibility derived from the density responses predicted by a machine learning model trained using N = 400, M = 3000 for the bulk water test set.

TABLE II. % RMSE in each component of the dielectric susceptibility derived from the density responses predicted by machine learning models trained using N = 400, M = 3000 for the naphthalene test set.