Medium-range interactions occur in a wide range of systems, including charged-particle systems with varying screening lengths. We generalize the Ewald method to charged systems described by interactions involving an arbitrary dielectric response function $\u03f5(\mathbf{k})$. We provide an error estimate and optimize the generalization to find the break-even parameters that separate a neighbor list-only algorithm from the particle-particle particle-mesh algorithm. We examine the implications of different choices of the screening length for the computational cost of computing the dynamic structure factor. We then use our new method in molecular dynamics simulations to compute the dynamic structure factor for a model plasma system and examine the wave-dispersion properties of this system.

## I. INTRODUCTION

The interactions employed in molecular dynamics (MD) simulations are typically considered to be either short- or long-range. In *d* dimensions, interactions of the form 1/*r*^{n} are considered short range if $n>d$. Short-range interactions are typically treated with a fast neighbor search within a cutoff radius *r*_{c}; the interaction is truncated thereafter, and a tail correction can be added.^{1,2} Conversely, if $n\u2264d$, then the interactions are considered long range; such interactions are considerably more difficult to treat. In periodic systems, most methods for handling long-range interactions are based on the Ewald sum,^{3,4} which decomposes the Coulomb potential into two sums that are rapidly convergent. This decomposition can be implemented through a wide variety of algorithms, of which a naive implementation yields $O(N2)$ scaling; a simple optimization reduces this scaling to $O(N3/2)$.^{5} Faster, grid-based methods have been developed that allow the fast Fourier transform (FFT) to be used,^{6} resulting in $O(NlogN)$ scaling; among these methods are the particle-mesh Ewald (PME),^{7} smoothed particle-mesh Ewald (SPME),^{8} and particle-particle particle-mesh (PPPM) methods.^{6,9–12} Alternatively, non-Ewald methods^{13} have also been developed, such as the Wolf summation method^{14} and fast multipole methods,^{15} and hybrid methods^{16,17} have been developed, as well.

However, this categorization of interactions as either short range or long range reflects the mathematical issue of force-sum convergence. In fact, many interactions can be considered to be medium range when their convergence is guaranteed yet either that there are a prohibitive number of particles in the neighbor radius or there are unacceptable truncation errors. For example, power-law potentials of the form $\u223c1/r2$ (charge-dipole), $\u223c1/r3$ (dipole-dipole), and 1/*r*^{6} (dispersion) fall into this category,^{18,19} as well as Sutton-Chen^{20} potentials relevant for metallic glasses.^{21} Isele-Holder *et al.*^{22} have extended the PPPM algorithm to interactions other than 1/*r* in their work on 1/*r*^{6} dispersion interactions. Similarly, prior work on screened Coulomb interactions includes extending the Ewald summation to Yukawa (exponentially screened) interactions.^{23,24}

Here, we examine a particular class of medium-range interactions that form with linearly screened Coulomb potentials, where the screening is specified through an arbitrary dielectric response function (DRF) $\u03f5(\mathbf{k})$. The Ewald sum is formulated for this screened interaction and implemented with both a direct force approach (linked-cell list (LCL)^{26} alone) and the PPPM method. An error analysis is carried out to examine the crossover between the direct force calculation and the PPPM implementation as a function of the range of the interaction and the particle number.

This paper is organized as follows. In Sec. II, we describe a class of interactions for screened charged systems obtained through the Fourier representation of the dielectric properties of background species. Several choices for the screening function are given in terms of the dielectric response function $\u03f5(\mathbf{k})$, and their ranges are discussed. In Sec. III, the Ewald decomposition is developed for charged particle systems interacting through an effective interaction involving $\u03f5(\mathbf{k})$.

Sec. IV includes error estimates and timing studies for the Yukawa dielectric response function.

Finally, we apply our results to compute the dynamic structure factor for a screened Coulomb system and discuss the implications of the choice of $\u03f5(\mathbf{k})$ and the computational costs that result from the temperature and density dependence of the effective interaction.

## II. LINEARLY SCREENED COULOMB SYSTEMS

Consider *N* point particles with charges {*Q*_{i}} (e.g., ions), which can be partial, in the presence of a background charge density −*en*_{b} (e.g., electrons). The electrostatic forces in such a system can be calculated from the Poisson equation

where $\Phi (\mathbf{r})$ is the total electrostatic potential, $\delta (\mathbf{r})$ is the delta function, and {**r**_{i}(*t*)} are the particle coordinates at time *t*. Once this equation is solved, the corresponding force on the $ith$ particle is calculated to be $\mathbf{F}i=\u2212Qi\u2207\Phi (\mathbf{r}=\mathbf{r}i)$. Expression (1) can be equivalently represented in Fourier space as

By assuming that the background density evolves instantaneously with the point particles, which is a valid assumption for a very light species, such as electrons, and approximating the structure of the background density as a linear response to the potential fluctuations induced by the presence of the point particles, we can express the background density in terms of a response function $\chi (\mathbf{k})$ as

We can write the electrostatic potential as

where the reciprocal of the DRF can be expressed in terms of the response function as

Finally, the force on the $ith$ particle can be represented in terms of the pair interactions as

where the effective interaction *u*_{ij}(**r**) is given in Fourier space as

When the background does not respond, $\chi (\mathbf{k})=0$ in (3). Such a completely uniform background corresponds to a dielectric response of unity in (5), and inverting (7) recovers the usual Coulomb interaction *u*_{ij}(*r*) = *Q*_{i}*Q*_{j}/*r*. Clearly, in the limit $\u03f5\u21921$, Ewald-class methods are needed. To continue further, we must make a specific choice of the DRF; we begin with the most widely used form

where $\lambda s$ is the screening length associated with the background fluctuations. The associated pair interaction is given in real space by

This interaction decays exponentially and is therefore “short” range. However, as $\lambda s\u21920$, the number of particles in a neighbor-sphere diverges. It is therefore convenient to characterize the strength of the screening in terms of the dimensionless quantity

where $ai=(3/4\pi ni)1/3$ is the ion-sphere radius, which is a measure of the interparticle distance. The $\kappa =0$ case is the pure Coulomb (one-component plasma) limit; thus, despite the exponential decay for finite *κ*, (9) has a long-range interaction as one of its limits. This suggests that there may be a finite range of *κ* for which the interaction (9) is medium range in the sense that there is no clear, optimal choice of algorithm.

To provide a concrete example, we can examine this choice in the context of the well-known Thomas-Fermi (TF) model,^{27,28} for which

where *T* is the background temperature in energy units, *E*_{F} is the Fermi energy of the electron background in energy units, *e* is the elementary charge, and *n*_{b} is the background number density. Let us suppose that, for example, an acceptable error is given by a choice of cutoff radius $rc=5\lambda s$, which yields the factor $e\u22125\u223c0.674\u22c510\u22122$ in (9). Allowing physical conditions (*T* and *n*_{b}) to vary, we can examine the number of particles in the neighbor cell. This is shown in Fig. 1, where contours show the logarithm of the number of particles in such a sphere as a function of temperature and density. Condensed systems (e.g., ionic liquids, liquid metals, and electrolytic solutions) tend to lie near the lower right region of the graph. Laser-produced plasmas originate as cold solids, in that same region, but can be rapidly heated to the very high temperatures of the upper right region. Roughly speaking, the lower right half of the diagram corresponds to $\u223c103$ particles in the neighbor cell for this choice of error. In Fig. 2, we show the $\Gamma \u2212\kappa $ phase diagram. The coupling parameter Γ is given by

where *Z* is the degree of ionization and *T*_{i} is the ionic temperature. This diagram reflects the regimes of different charged particle systems of interest such as liquid metals,^{29} ultracold neutral plasmas,^{30} dusty plasmas,^{31} warm dense matter,^{32} and conditions at the National Ignition Facility (NIF).^{33} We also show the regions corresponding to different liquid-like properties based on the properties of their velocity autocorrelation function.^{27} Note that these disparate systems cover a wide range of $\kappa $ values.

Of course, the long-wavelength form of (8) is not unique, and interactions other than (9) are possible.^{34} Moreover, the method is applicable to a system of partial charges as well. For this reason, we will retain the generality of an arbitrary $\u03f5(\mathbf{k})$ throughout the Ewald development in Sec. III. In fact, medium range interactions of different nature occur in biological and chemical systems where the linear screening approach is not applicable.^{35–37}

## III. EWALD DECOMPOSITION FOR ARBITRARY DIELECTRIC RESPONSE FUNCTIONS

To proceed with the development of the Ewald composition of (4), we note that the central quantity is the potential-like quantity

As is common, we implement the Ewald decomposition by adding and subtracting a screening-charge distribution of the form $\zeta \rho (\mathbf{r})$, where $\zeta $ is a constant to be determined, and $\rho (\mathbf{r})$ is chosen to have the Gaussian form

where $\alpha $ is the Ewald splitting parameter. Fourier-space representation of the decomposition takes the form

where the real-space contribution

can be highly short range (because there are effectively two sources of cutoff). Analogously,

is short range in Fourier-space and hence can also be computed efficiently; this is the subject of Sec. IV. In the absence of screening, $\u03f5(k)\u21921$, and we recover the Ewald decomposition for the standard Coulomb interaction.^{4}

The real-space representation of $\varphi R(\mathbf{r})$ is given by the inverse Fourier transform

where an isotropic dielectric function $\u03f5(k)$ is assumed in the second line. We evaluate this first with the long-wavelength ($k\u21920$) form of $\u03f5(k)$ given by

where *k*_{s} is the magnitude of the inverse screening vector given by $ks=1/\lambda s$, where $\lambda s$ is the long-wavelength screening length associated with properties of the background (i.e., classical, quantum, and multispecies). This form yields what is generically referred to as the Yukawa interaction. In what follows, we will be using $ks\u22121$ in place of $\lambda s$. Substituting $\u03f5Y\u22121(k)$ in (20) yields

with $\zeta eks2/4\alpha 2=1$ (see Appendix A). Alternative derivations of $\varphi RY(r)$ have been given previously.^{23–25}

Gradient corrections to the Yukawa result can be captured by the form

where leading-order corrections to the Yukawa form appear as *k*^{4} terms. An example of this type of DRF is the exact gradient-corrected screening (EGS) potential,^{34} in which *k*^{4} terms arise from either electronic correlations or Heisenberg uncertainty (see Appendix A). Interestingly, the form of the potential is qualitatively different for $\nu <1$ and $\nu >1$, with the latter corresponding to an oscillatory decay.^{34} Hence, the Ewald decomposition for the two cases will be different; in this paper, we will restrict our discussion to the $\nu <1$ case. Substituting $\u03f5EGS\u22121(k)$ in (20) yields

where

Because $k\u2212<k+$ (for $\nu <1$), the optimal choice is $\zeta =e\u2212k\u22122/4\alpha 2$. We have given the detailed derivation in Appendix A.

The real-space part of the Ewald decomposition of the screened Coulomb interaction between two point charges for an arbitrary DRF is, therefore, given by

This completes the formulation of the real-space contribution, which will be common to all Ewald-based methods. In Sec. IV, we first turn to the Fourier-space contribution, which we will treat with the PPPM algorithm for the particle-mesh contribution. Next, we provide an error analysis and a time-per-time-step comparison of the force calculations using PPPM and LCL algorithms for the Yukawa interaction. For the remainder of this paper, when we discuss LCL for a Yukawa interaction, this should be understood to refer to a force calculation using the LCL algorithm for a plain Yukawa interaction that is not subject to Ewald decomposition.

## IV. ERROR ANALYSIS AND TIMING STUDIES

In this section, we begin with a brief explanation of the PPPM algorithm, followed by an error analysis for the computed force for the Yukawa DRF, with errors due to the truncation and discretization that are inherent to the algorithm. Then, we compare the execution times of the PPPM and LCL algorithms for the Yukawa potential to distinguish between the two algorithms in terms of computational efficiency for a certain level of error in the force calculation. The results of this section and Sec. V were obtained using the codes we implemented.^{38}

The particle-particle (PP) part of the PPPM algorithm refers to the real-space part of the Ewald decomposition, which can be highly short range; therefore, interactions beyond a certain cutoff radius can be neglected. If the number of particles within a cutoff sphere *Nc* is small compared to the total number of particles *N* in a simulation box, then the total number of interactions $O(NNc)$ tends to $O(N)$ as *Nc* becomes much smaller than *N* ($Nc\u226aN$). Such an $O(N)$ scaling is achieved by the LCL algorithm,^{6} in which the simulation box is divided into a number of cells with dimensions proportional to the cutoff radius. The particle-mesh (PM) portion of the PPPM algorithm refers to the Fourier-space part of the interaction that is computed on a grid using one of the two algorithms in Ref. 9; these algorithms differ in the manner in which forces are computed on a grid. Both versions of the PPPM algorithm employ the highly efficient FFT, which scales as $O(Mlog\u2061M)$, where *M* is the number of mesh points in the Fourier-space grid.

Errors are associated with computing the energy and forces in real-space and Fourier-space. Because we are interested in MD simulations in which the trajectories of a system of particles evolve primarily because of the forces acting on the particles, we restrict our error analysis to the errors in the forces. We chose to quantify the error in a force using the root-mean-square (RMS) error.^{9,10} The RMS errors in the real-space and Fourier-space components of the forces, which are acting on particles in a simulation box, have a common form given by

where $P=R,F$ refers to real-space or Fourier-space, $\mathbf{F}i,Papprox.$ is the approximate force acting on particle *i*, $\mathbf{F}i,Pexact$ is the actual force (or numerically exact force) acting on particle *i*, and $\xi $ is used to distinguish between computed RMS errors and analytical estimates of RMS errors. Moreover, because the computational costs associated with real-space and Fourier-space force calculations are determined by the levels of error that are allowed, it is essential to understand how the errors vary with respect to the parameters. For this purpose, we have derived analytical estimates for the errors in the real-space and Fourier-space components of the forces for a system of randomly distributed particles. The error estimates were verified by comparing with the RMS errors computed for a system of particles placed at random positions within a simulation box. For the remainder of this paper, we use *a*_{i} as the unit of distance and the elementary charge *e* as the unit of charge; thus, the RMS force error is in units of $e2/ai2$.

The error in the real-space component of the force is due to the truncation of particle interactions beyond a certain cutoff radius. For a system of *N* charged point particles interacting pair-wise with an arbitrary short-range interaction expressed as $us(\mathbf{r}i\u2212\mathbf{r}j)=QiQj\varphi s(\mathbf{r}i\u2212\mathbf{r}j)$, the estimate of the RMS force error due to a cutoff radius *r*_{c} can be derived by adapting the error analysis in Ref. 39, which considers the real-space portion of the Ewald decomposition for the Coulomb potential. With the assumption that beyond *r*_{c} particles are randomly distributed, the estimate of the RMS force error due to *u*_{s}(**r**) is given by

where $\mathbf{f}s(\mathbf{r})=\u2212\u2207\mathbf{r}\varphi s(\mathbf{r})$, $Q=\u2211i=1NQi2$, and *V* is the volume enclosing the particles.

Substituting $\mathbf{f}RY(\mathbf{r})=\u2212\u2207\mathbf{r}\varphi RY(\mathbf{r})$ in (29) and making several approximations, we get the following estimate of the RMS force error for the real-space portion of the Ewald decomposition for the Yukawa potential:

(see Appendix B for details). Note that as $\kappa \u21920$ (Coulomb limit), we recover the corresponding error estimate for the Coulomb case.^{39} Figs. 3(a)–3(c) show very good agreement between the error estimates and the computed RMS error in the real-space force component for a system of $5\xd7103$ unit charges with $\kappa $ = 0.1, 1, and 2 and varying values of $\alpha $. The choice of these $\kappa $ values is consistent with the $\kappa $ range of interest as shown in Fig. 2. The particles were randomly distributed in a cube of edge length *L* given by $L=(4\pi N/3)1/3$ in units of *a*_{i}.

In the Fourier-space part of the PPPM algorithm, the force calculation for a system of charged point particles with arbitrary spatial distribution is mapped to a real-space grid by assigning particle positions to the real-space grid points using B-splines of some order *p*. Calculations are then performed in Fourier-space starting with the FFT of the charge density. This is followed by computing the potential energy and forces on the Fourier-space grid following the steps described in Ref. 9. Forces on the Fourier-space grid are then transformed to real-space by an inverse FFT. Next, the forces are interpolated from the real-space grid to the actual particle positions using B-splines of the same order, in a manner similar to the earlier step of assigning particle positions to the real-space grid. FFT involves an inherent truncation in the maximum allowable length of Fourier-space vectors, which is one cause for error in the Fourier-space component of the forces. In addition, errors occur because of aliasing, which arises because calculations are being made on a grid, and because of the interpolation of particle positions to the real-space grid and of forces computed on the real-space grid to actual particle positions. Critically, the PPPM algorithm employs an optimal Green function^{6} that is derived by minimizing the error in the forces for a random distribution of particles. The optimal Green function can be interpreted as a weighted average of Fourier modes of the actual Green function.

For the version of the PPPM algorithm in which forces are computed on a grid,^{9} the optimal Green function corresponding to an arbitrary Green function is given by^{9}

where **n** refers to the grid vector denoted by the triplet of grid indices (*n*_{x}, *n*_{y}, *n*_{z}), $G^\mathbf{k}\mathbf{n}$ is the actual Green function evaluated at the Fourier-space vector **k**_{n} corresponding to grid vector **n**, **m** refers to the triplet of grid indices (*m*_{x}, *m*_{y}, *m*_{z}) that contribute to aliasing, and $U^\mathbf{k}\mathbf{n}$ is the Fourier transform of the B-spline of order *p*. $U^\mathbf{k}\mathbf{n}$ is given by

where *M*_{x}, *M*_{y}, and *M*_{z} refer to the number of grid points along the respective directions in Fourier-space. The triplet of grid sizes (*h*_{x}, *h*_{y}, *h*_{z}) are then given by (*L*_{x}/*M*_{x}, *L*_{y}/*M*_{y}, *L*_{z}/*M*_{z}). The estimate of the optimal RMS error in the Fourier-space force is given by^{9}

where

Figs. 3(d)–3(f) show very good agreement between the RMS error estimates for the Fourier-space force with order of B-splines ranging from 2 to 7 and the corresponding RMS error computed using the same system of $5\xd7103$ unit charges that was employed for the real-space error analysis. Because the simulation volume is a cube, the numbers of grid points *M* along each direction in Fourier-space is equal.

Eq. (33) is not of a form that explicitly reveals the influence of the parameters *h*, *α* and *p* on the error incurred in the Fourier-space force calculation. Therefore, (33) requires further simplification, which is possible if the arbitrary Green function decays sufficiently rapidly in Fourier-space that the alias contributions can be neglected. Following the approach in Ref. 10, (33) is approximated to

where $AF$ is given by

where $\beta (p,m)$ denotes an integral given by

and $Cm(p)$ are coefficients listed in Table I of Ref. 10 (see Appendix B for further detail about the simplification of (33)). For the Yukawa interaction, substituting $G^k=4\pi (k2+\kappa 2)e\u2212(k2+\kappa 2)/4\alpha 2$ results in a modified form for $\beta (p,m)$ given by

where $\psi (p,m,\kappa /\alpha )$ denotes an integral of the form

Finally, a simplified form of the RMS error estimate for the Fourier-space component of the force for the Yukawa interaction is given by

where

Fig. 4 shows a good agreement between $\Delta FFY(approx.)$ and the full error estimate $\Delta FFY$ for $\kappa $ = 0.1 and 1 for a system of 10^{3} unit charges. For $\kappa =1$ as $\alpha $ decreases below unity, $\Delta FFY(approx.)$ deviates from $\Delta FFY$. Further, we found that for $\kappa >1$, $\Delta FFY(approx.)$ is not a good approximation of $\Delta FFY$ for the $\alpha $ values of interest. Nevertheless, the simplified form of $\Delta FFY(approx.)$ is useful because it reveals that the error approximately varies as the product of the grid size and the Ewald splitting parameter ($h\alpha $) raised to the power given by the order of B-spline (*p*). Further, finding an optimal set of parameters requires considerable effort using (33), whereas using a simplified estimator can expedite the process of finding the optimal parameters. An alternative approach for estimating the errors is described in Ref. 54.

The total RMS error in the force computed using the PPPM algorithm is defined as

To compare the computational costs of the PPPM and LCL algorithms, the execution times for the two algorithms must be compared for the same RMS force errors. To this end, an estimate of the RMS force error for Yukawa interaction is needed. Substituting $\mathbf{f}Y(\mathbf{r})=\u2212\u2207\mathbf{r}\varphi Y(\mathbf{r})$ in (29), where $\varphi Y(\mathbf{r})=e\u2212\kappa r/r$, results in the following estimate of the RMS force error for Yukawa interaction:

This equation, in turn, implies that

(see Appendix B for further detail). As shown in Fig. 5, the RMS error estimate in (43) has been verified by comparison with the RMS error computed using 10^{5} unit charges placed at random positions in a box.

Based on the error analysis for the Yukawa interaction, Fig. 6 shows that for $\kappa =0.1$ and an RMS force error of 10^{−8} ($e2/ai2$), the minimum image convention $rc\u2264L/2$ is not satisfied below a system size of 10^{7} unit charges. Therefore, for a $\kappa $ value and particle numbers for which LCL would be less efficient, PPPM serves as an excellent alternative. As $\kappa $ increasesfor approximately the same error of 10^{−8} ($e2/ai2$), the threshold above which the particle number satisfies the minimum image convention decreases; for example, for $\kappa =0.5$, the minimum image convention is satisfied beyond 10^{5} unit charges, as shown in Fig. 6. Importantly, for some conditions where the LCL algorithm is applicable, we find the PPPM algorithm to be computationally advantageous. This observation is based on the comparison of the time per time step of PPPM and LCL algorithms for a fixed RMS force error.The timing results for a range of $\kappa $ and *N* (with unit charges) are provided in Table I. For example, with 10^{6} unit charges and an RMS force error of approximately 10^{−5} ($e2/ai2$), we find that the PPPM algorithm was approximately 25 and 130 times faster than the LCL algorithm for $\kappa =0.5$ and 0.3, respectively (as shown in Table I). As $\kappa $ approaches unity, we find the LCL and PPPM algorithms to be comparable in performance, as summarized by the entries in Table I for $\kappa =1$. As $\kappa $ increases beyond unity, we find that the PPPM algorithm loses its computational advantage over the LCL algorithm, as can be seen in the entries in Table I for $\kappa =2$. We should point out, however, that execution times are architecture dependent and our conclusions are based on that fact. All calculations used a 3.5 GHz Intel Xeon(R) processor and 32 GB RAM.

$\kappa $ . | N
. | $\alpha $ $(ai\u22121)$ . | r_{c} (a_{i})
. | M . | $rcD$ (a_{i})
. | $TR$ (s) . | $TF$ (s) . | $TPPPM$ (s) . | $TLCL$ (s) . | $TLCL/TPPPM$ . |
---|---|---|---|---|---|---|---|---|---|---|

0.3 | 1 × 10^{5} | 0.625 | 5.2 | 128 | 35.48 | 2.61 | 1.92 | 4.53 | 324.2 | 71.56 |

1 × 10^{6} | 0.579 | 5.6 | 256 | 35.62 | 25.65 | 26.81 | 52.46 | 7138.24 | 136 | |

0.5 | 1 × 10^{5} | 0.614 | 5.3 | 128 | 22.24 | 1.97 | 1.36 | 3.33 | 133.6 | 40.1 |

1 × 10^{6} | 0.569 | 5.7 | 256 | 22.33 | 24.79 | 23.16 | 47.95 | 1173.05 | 24.5 | |

1 | 1 × 10^{5} | 0.61 | 5.33 | 128 | 12.08 | 1.89 | 1.35 | 3.24 | 11.7 | 3.6 |

1 × 10^{6} | 0.57 | 5.8 | 256 | 12.34 | 26.1 | 21.2 | 47.3 | 119.9 | 2.53 | |

1.5 | 1 × 10^{5} | 0.684 | 4.6 | 128 | 7.99 | 1.28 | 1.22 | 2.5 | 3.07 | 1.23 |

1 × 10^{6} | 0.635 | 5.2 | 256 | 8.38 | 18.66 | 26.34 | 45 | 34.76 | 0.77 | |

2 | 1 × 10^{5} | 0.684 | 4.5 | 128 | 6.4 | 1.2 | 2.17 | 3.37 | 1.83 | 0.54 |

1 × 10^{6} | 0.683 | 4.7 | 256 | 6.47 | 13.2 | 22.8 | 36 | 16.3 | 0.45 |

$\kappa $ . | N
. | $\alpha $ $(ai\u22121)$ . | r_{c} (a_{i})
. | M . | $rcD$ (a_{i})
. | $TR$ (s) . | $TF$ (s) . | $TPPPM$ (s) . | $TLCL$ (s) . | $TLCL/TPPPM$ . |
---|---|---|---|---|---|---|---|---|---|---|

0.3 | 1 × 10^{5} | 0.625 | 5.2 | 128 | 35.48 | 2.61 | 1.92 | 4.53 | 324.2 | 71.56 |

1 × 10^{6} | 0.579 | 5.6 | 256 | 35.62 | 25.65 | 26.81 | 52.46 | 7138.24 | 136 | |

0.5 | 1 × 10^{5} | 0.614 | 5.3 | 128 | 22.24 | 1.97 | 1.36 | 3.33 | 133.6 | 40.1 |

1 × 10^{6} | 0.569 | 5.7 | 256 | 22.33 | 24.79 | 23.16 | 47.95 | 1173.05 | 24.5 | |

1 | 1 × 10^{5} | 0.61 | 5.33 | 128 | 12.08 | 1.89 | 1.35 | 3.24 | 11.7 | 3.6 |

1 × 10^{6} | 0.57 | 5.8 | 256 | 12.34 | 26.1 | 21.2 | 47.3 | 119.9 | 2.53 | |

1.5 | 1 × 10^{5} | 0.684 | 4.6 | 128 | 7.99 | 1.28 | 1.22 | 2.5 | 3.07 | 1.23 |

1 × 10^{6} | 0.635 | 5.2 | 256 | 8.38 | 18.66 | 26.34 | 45 | 34.76 | 0.77 | |

2 | 1 × 10^{5} | 0.684 | 4.5 | 128 | 6.4 | 1.2 | 2.17 | 3.37 | 1.83 | 0.54 |

1 × 10^{6} | 0.683 | 4.7 | 256 | 6.47 | 13.2 | 22.8 | 36 | 16.3 | 0.45 |

## V. APPLICATION: DYNAMIC STRUCTURE FACTOR

In this section, we apply our results to the computation of the dynamic structure factor $S(k,\omega )$. Our interest in $S(k,\omega )$ falls into two categories. Theoretically, $S(k,\omega )$ contains all of the many-body processes associated with thermal density fluctuations, including collective modes and transport.^{29,40–42} All of these processes are obviously impacted by the choice of $\u03f5(k)$, and we wish to use MD for model validation. The ionic $S(k,\omega )$ is also connected directly with neutron scattering^{43} and, indirectly, to X-ray Thomson Scattering (XRTS).^{44} In the context of warm dense matter physics, for example, recent XRTS studies have developed a modified Navier-Stokes equation (MNSE) to model the ionic $S(k,\omega )$.^{29}

Within the framework of the Yukawa model, we are interested in density and temperature conditions for which MD is required to validate approximate theoretical models and where LCL would be computationally less efficient than PPPM for force calculations. Consider the case $\Gamma =2$ and $\kappa =0.1$, for 10^{4} unit charges the PPPM algorithm can have a force error of approximately 10^{−6} ($e2/ai2$) for an optimal choice of $\alpha $, *r*_{c}, *h*, and *p*. For the same error and particle number, LCL would be less efficient because the minimum-image convention is not satisfied. This case corresponds to an effective coupling of $\Gamma eff=\Gamma e\u2212\kappa \u22430.74$, which is of an order that is relevant to systems of recent interest, such as warm dense matter.^{44}

### A. Numerical computation of the dynamic structure factor from MD

The dynamic structure factor for an *N*-particle system is defined as

Here, the brackets $\u27e8\cdots \u27e9$ refer to the stationary time average

where *T* is the length of the run. We note that for a system in equilibrium the stationary time average is equivalent to the ensemble average as $T\u2192\u221e$. The Fourier-transformed densities *n*(**k**,*t*) are obtained from the definition of the density,

to yield

which, through the change of variables $y=t+\tau $, can be written as

We use the usual definition of the temporal Fourier transform

For numerical purposes, the infinite domain must be approximated as finite, and as we are calculating stationary processes, we can take this domain to be [0, *T*] without loss of generality for sufficiently large *T*. This yields the approximate transform

Therefore, the final form for $S(\mathbf{k},\omega )$ is given by

as $n(\u2212\mathbf{k},\u2212\omega )$ = $n*(\mathbf{k},\omega )$. The Fourier-transformed density $n(\mathbf{k},\omega )$ is given by

Particle positions from MD are available only at certain times in [0, *T*]; hence, $n(\mathbf{k},\omega )$ is computed using

where *N*_{t} is the number of steps after the equilibration phase of MD, and $\Delta t$ is the step size for time integration. Particle positions are stored at intervals corresponding to this step size. Long runs are preferred because the noise in $S(\mathbf{k},\omega )$ decreases as *T* increases. The temporal Fourier transform is computed by FFT. $S(\mathbf{k},\omega )$ computed from MD data tends to be noisy, and the extent of the noise increases as *k* increases. To determine the location of the peak of $S(\mathbf{k},\omega )$ and its width, the noisy curves are usually smoothed by averaging over data from different MD runs that differ in their initial conditions. In Sec. V B, we present a potentially more efficient smoothing approach that employs kernel smoothing.^{49,50}

### B. Kernel smoothing

Kernel smoothing refers to weighted averaging of noisy observations to obtain a smooth function.^{49,50} To smooth our data using kernel smoothing, we use a kernel regression technique^{50} defined by

where *y*(*x*_{i}) is the observation of *y* at *x*_{i}, *B* is the number of observations, and $K\sigma (x\u223c,xi)$ is the smoothing kernel. This yields a weighted average to give a smooth function $y\u223c(x\u223c)$ evaluated at $x\u223c$. We use the Gaussian kernel

where $\sigma $ is the kernel width. For $\Gamma =50$ and $\kappa =1$, Figs. 7 and 8 show the smooth dynamic structure factor curves (red curves) obtained by kernel smoothing of noisy dynamic structure factor curves obtained from MD simulations (grey curves) with *σ* = 0.01 and *σ* = 0.005, respectively. MD simulations of 10^{3} unit charges were performed using the PPPM algorithm, with an RMS force error of approximately 10^{−4} ($e2/ai2$). The parameters used were $\alpha $=1, *r*_{c}=3, *M*=64, and *p* = 2.

We have defined a simple way of quantifying the smoothing error as follows:

where $y\u223c2(x\u223c)$ is defined as

More rigorous ways of quantifying the error in smoothing can be found in Refs. 49 and 50. Figs. 7 and 8 show the error in smoothing (blue band) about the smooth $S(q,\Omega )$ curves (red curves). Because the kernel width influences the extent of smoothing and the corresponding error in the smoothing, the width must be chosen carefully such that the smoothed data are consistent with the physically expected result.

### C. Comparison of the theoretical and MD dispersion relations

In this section, we compare dispersion relations obtained using PPPM-MD with those obtained using different theoretical models. For convenience, *k* and *ω* are expressed as dimensionless quantities *q* = *ka*_{i} and $\Omega =\omega /\omega p$, where $\omega p=(4\pi niQi2/M)1/2$ is the ion plasma frequency for ionic mass *M* and ionic number density *n*_{i}.

One simple dispersion relation obtained from a static structure factor *S*(*q*)^{29} is given by

In the random-phase approximation (RPA) for a Yukawa system, *S*(*q*) takes the approximate form

More accurate theoretical forms of *S*(*q*) are discussed in Refs. 29 and 46. Fig. 9(a) shows the comparison of a dispersion relation obtained using PPPM-MD and dispersion relations obtained using different theoretical approximations. The purpose of this comparison is to show that the MD result agrees well with the theoretical dispersion relations in the low-*q* limit, where the theory is considered to work well.^{29} As *q* increases, the MD curve begins to deviate from the theoretical curves; this occurs because of the lack of required many-body effects in those theoretical models. A very accurate *S*(*q*) can be obtained from hypernetted chain (HNC) equations with a bridge function^{48} (denoted by *S*^{HNC}(*q*)). Although particle correlations are captured in $\Omega (q)$ by *S*^{HNC}(*q*) (solid blue curve), this model does not incorporate viscoelastic effects,^{46} and hence it underestimates $\Omega (q)$ compared to MD. Fig. 9(b) shows the full-width-half-maximum (FWHM) of a smooth $S(q,\Omega )$ constructed by kernel smoothing of $S(q,\Omega )$ obtained from MD data using a Gaussian kernel of widths $\sigma =$0.01 (blue dots) and $\sigma =$ 0.005 (green dots). A smaller width of $\sigma =$0.005 results in a curve that approaches zero (as expected from theory) more accurately than does a curve constructed using a width of $\sigma =$ 0.01.

More rigorous theoretical models for computing the dispersion relation are the quasilocalized charge approximation (QLCA),^{45} modified Navier-Stokes (MNSE), ^{29} andviscoelastic-dynamic density functional theory (VE-DDFT).^{46} Using QLCA, the dispersion relation takes the form

where $K(x,y)$ is given by

The dispersion relation from the Navier-Stokes equation modified to incorporate low-frequency corrections results in the dispersion relation for MNSE is given by

where $\omega E$ is the Einstein frequency, and $\eta \u223c$ is the dimensionless viscosity given by $\eta \u223c=\eta /3\omega EMniai2$, where $\eta $ is the shear viscosity. Values of $\eta \u223c$ can be found in Ref. 47. VE-DDFT incorporates viscoelastic effects that are not incorporated in QLCA and MNSE formulations. The dispersion relation calculated using VE-DDFT is given by

where $\tau $ is given by

where $\gamma i$ is the adiabatic index, $\mu $ is the compressibility, $Ec(\Gamma ,\kappa )$ is the correlation energy in units of temperature, and $\eta \xaf$ is the normalized viscosity given by $\eta \xaf=(43\eta +\epsilon )/(Mni\omega pai2)$, where $\epsilon $ is the bulk viscosity.

$S(q,\Omega )$ curves from MD were obtained by averaging over data from 20 different PPPM-MD simulations for $\Gamma =2$ and $\kappa =0.1$ using 10^{4} unit charges and an RMS force error of 10^{−6} ($e2/ai2$). The parameters used were $\alpha $ = 0.46, *r*_{c} = 7.8, *M* = 64, and *p* = 6. Each MD run was 800 $\omega p\u22121$ long with a time step $\Delta t=0.05\omega p\u22121$ and frequency resolution $\Delta \omega \u22430.008\omega p$. $S(q,\Omega )$ curves shown in Fig. 10(a) were obtained by smoothing the noisy $S(q,\Omega )$ curves from MD simulations using Gaussian kernels of width $\sigma =0.01$. Figs. 10(b) and 10(c) show the corresponding dispersion relations obtained using the smoothed $S(q,\Omega )$ curves and comparisons with the theoretical dispersion relations obtained for QLCA, MNSE and VE-DDFT, with *S*(*q*) given by *S*^{HNC}(*q*). We can see that for values of *q* greater than approximately 0.3, the MD result increasingly deviates from the theoretical curves, while the VE-DDFT curve deviates from the MNSE and QLCA curves for *q* greater than approximately 0.4. Further, the QLCA and MNSE curves also deviate from each other for *q* greater than approximately 0.7. Hence, the differences between the dispersion relations for the theoretical models and their differences from MD results clearly demonstrate the need for model validation with accurate and computationally efficient MD simulations, which can be performed using the PPPM algorithm as discussed in this paper.

## VI. CONCLUSIONS

We have generalized the Ewald decomposition to screened Coulomb interactions with arbitrary DRFs. The resulting effective interactions span from short to long range, depending on the physics contained in the DRF. This extension allows us to identify parameters for which the decomposition offers a computational advantage over efficient neighbor-list-only algorithms (e.g., LCL-only), thereby providing an operational definition of “medium range.”

We have demonstrated an elegant way of performing the Ewald decomposition in Fourier-space and applied the decomposition to two choices of the DRF, Yukawa and EGS interactions, that span a wide range of temperatures and densities of interest. However, this Ewald decomposition is quite general and can be applied to non-Coulomb interactions, such as those that employ electron-ion pseudopotentials.^{51}

For the Yukawa interaction, we implemented a standard PPPM^{6,9} algorithm. We verified our implementation with an extensive analysis of force errors that are inherent to the PPPM algorithm. The analysis involved deriving, for a random distribution of particles, analytical error estimates that agreed very well with the computed error. The error estimates provide insight into the effects of truncation and discretization on the errors in the computed forces, and this information is needed when choosing optimal parameters for calculating forces efficiently. The error analysis was also extended to the plain Yukawa interaction, which allows for a cutoff radius because of its exponentially decaying form and, hence, can be implemented using the LCL algorithm. For the same force errors computed using the PPPM and LCL algorithms, we compared the time required for one step of a force calculation for a range of $\kappa $ values and particle numbers.

We found that the PPPM algorithm offers a significant computational advantage over the LCL algorithm for a medium-range Yukawa potential, which corresponds to screening lengths exceeding the ion-sphere radius ($\kappa <1$). For screening lengths less than the ion-sphere radius ($\kappa >1$), the computational advantage of PPPM over LCL was found to diminish (see Table I). As screening lengths decreased further to values less than half of the ion-sphere radius ($\kappa \u22652$), the LCL algorithm became more advantageous than the PPPM algorithm.

We then applied our method to the computation of the dynamic structure factor for a Yukawa system. By varying the screening length and the system size, we found cases for which the PPPM algorithm is computationally advantageous over the LCL algorithm. One such case is $\kappa =0.1$ and 10^{4} unit charges, corresponding to an RMS force error of approximately 10^{−6} ($e2/ai2$), which precludes the use of the LCL algorithm because the minimum image convention is not satisfied. The computed dynamic structure factor tends to be noisy and, therefore, must be smoothed, which is usually accomplished by averaging over the results of MD runs that vary in their initial conditions. We have presented an alternative and potentially more efficient way of smoothing $S(k,\omega )$ data using kernel smoothing. With Gaussian smoothing kernels, the dispersion relation obtained from the smoothed dynamic structure factor curves was compared with theoretical dispersion relations obtained using the QLCA, MNSE, and VE-DDFT models, as shown in Fig. 10. The differences between the dispersion relations obtained using MD and those obtained using theoretical models, as shown in Fig. 10, highlight the need for model validation with accurate and computationally efficient MD simulations, which are possible using the PPPM algorithm for a wide range of temperatures and densities of interest.

We note that a number of techniques, such as analytical differentiation,^{52} interlacing,^{53} oversampling,^{54} and others, could be used to design PPPM algorithms. The particular choice of the technique has consequences for the form of the optimal Green function, the errors, and the computational costs. Any of these techniques could be applied in a straightforward way to arbitrary dielectric response functions.

## ACKNOWLEDGMENTS

We would like to thank John Verboncoeur of Michigan State University for his support and useful discussions. The work of Dharuman and Murillo was supported by the Air Force Office of Scientific Research (Grant No. FA9550-12-1-0344); Stanton and Glosli were supported under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory Contract No. DE-AC52-07NA27344.

### APPENDIX A: EWALD DECOMPOSITION FOR YUKAWA AND EGS POTENTIALS

In this section, we provide the details of Ewald decomposition for Coulomb interactions screened with Yukawa and EGS DRFs.

In general, the Ewald decomposition of a Coulomb interaction linearly screened through an isotropic response takes the form

where $\upsilon (k)=4\pi /k2$. The long-range component $\varphi F(k)$ can be evaluated in Fourier-space; however, the remaining short-range component $\varphi R(k)$ must be inverted to real-space as

#### 1. Yukawa dielectric function

We first consider the Yukawa dielectric function of the form

where $ks=\lambda s\u22121$ is the magnitude of the inverse screening vector associated with the background charge distribution. Using the known result

and $erfc(z)$ being the usual complimentary error function, we can evaluate the above expression as

Lastly, the charge fraction $\zeta $ can be tuned to make this result as short range as possible. Noting that

the large-*r* behavior of (A7) is given by

Thus, $\zeta =e\u2212ks2/4\alpha 2$ is the optimal choice for the charge fraction, which results in the final form of the real-space interaction

#### 2. EGS dielectric function

We next turn to the next-order gradient correction to the Yukawa interaction using the EGS dielectric function

where expressions for *ν* and further generalizations can be found in Ref. 34; however, the analysis is applicable to any dielectric function that can be expressed in this form. To simplify the problem, this dielectric function can be written in terms of partial fractions as

where we have introduced the coefficients and new inverse lengths

We have now reduced the calculation to evaluating two Yukawa-type dielectric functions to yield

Note that this result is generic even for the case in which $\nu >1$, where the coefficients are complex. The large-*r* behavior for this interaction is given by

and there is no longer a choice in $\zeta $ that eliminates the leading-order term. However, given that $k\u2212<k+$ (for $\nu <1$), the optimal choice is $\zeta =e\u2212k\u22122/4\alpha 2$.

#### 3. Optimizing the charge fraction $\zeta $

In general, there will not be an analytic expression for the optimal $\zeta $ in terms of $\alpha $ and the physical parameters, and a numerical investigation will be necessary. An approximate expression can be obtained by noting that the large-*r* limit of $\varphi R(r)$ will be dominated by the low-*k* behavior of the integrand. By temporarily inserting the approximate dielectric function

the resulting optimal choice of charge fraction becomes $\zeta app=e\u2212ks2/4\alpha 2$. For example, *k*_{0} = *k*_{s} for the EGS dielectric function; this becomes increasingly valid for small *ν*, when the system is relatively hot and/or dense.

In the particular case of the EGS dielectric function, there is an alternative approach to accelerate the convergence of the real-space interaction. By introducing two screening clouds in the Ewald decomposition, the real- and Fourier-space interactions, respectively, become

The above analysis can then be repeated to obtain

which now has the limiting behavior

This leading-order term can hence be eliminated by solving the system of equations

to obtain the optimal charge fractions

Note that this approach results in two Ewald parameters to optimize, which could prove to be advantageous.

### APPENDIX B: ERROR ANALYSIS

In this section, we provide details of the derivation of the RMS error estimate for forces computed using the PPPM algorithm. We begin with the analysis for the real-space portion of the algorithm. As introduced in Section IV, for a system of point charges with an arbitrary pair-wise short-range interaction, the estimate of the RMS force error is given by (29). For the real-space part of the Ewald decomposition for Yukawa interaction, (29) takes the form

where $\mathbf{f}RY(\mathbf{r})$ is given by

because of the radial nature of $\varphi RY(r)$. To simplify the analysis, we make use of the symmetry in $\varphi RY(r)$ with respect to $\kappa $ and express $\varphi RY(r)$ as a sum of two terms,

where

This implies that

where

For a sufficiently large distance ($r\u226b1$), the complementary error function $erfc(\alpha r+\kappa 2\alpha )$ can be approximated by the first term of the asymptotic expansion^{39} given by

Using this approximation in (B6) gives

For $r\u226b1$, the second and third terms on the right-hand side can be neglected compared to the first term, resulting in

which implies that $\u2212\u2202VRY(r,\kappa )\u2202r=\u2212\u2202VRY(r,\u2212\kappa )\u2202r$. This, in turn, yields

Using the first term of the asymptotic expansion,^{39} the integral on the right-hand side is approximated as

resulting in the simplified expression for $\Delta FRY$ given by

We note that the approximation, which involves truncation of the asymptotic expansion that led to the above error estimate, is not valid for $\kappa /\alpha \u2192\u221e$ as $\alpha \u21920$. Hence, this error estimate will not tend to the error estimate for the plain Yukawa interaction that is derived below.

Similar error estimates can be derived for the plain Yukawa interaction because it decays exponentially. For the plain Yukawa interaction

we have

Using (29), the error estimate for the Yukawa interaction is given by

and the integral in this equation can be approximated using the first term of an asymptotic expansion^{39} to yield

For $1rc\u226a\kappa $, we could further approximate (B17) to obtain

We now proceed to simplifying the estimate of the RMS error in the Fourier-space force computed using the PPPM algorithm. As introduced in Section IV, the RMS error estimate for the Fourier-space force is given by

where

If the arbitrary Green function (denoted by $G^\mathbf{k}\mathbf{n}$) rapidly decays in Fourier-space such that alias contributions to the Green function can be neglected (**m** denotes the alias indices), then (B20) can be simplified further, beginning with the approximation given by

This implies that

which can be rewritten as

with the assumption that the maximum length of **n** (denoted by $\mathbf{n}max$) is sufficiently large such that $G^\mathbf{k}\mathbf{n}2kn2\u22430$ for $\mathbf{n}=\mathbf{n}max$. For an isotropic system, following the derivation in Ref. 10, we find that

where the coefficients $Cm(p)$ are listed in Table I of Ref. 10. Therefore, (B23) is approximated to a form given by

When the box dimensions are large ($L\u226b1$, for a cube of edge length *L*),

which implies that

If the Green function is symmetric, then the integral *I* can be reduced further to the following 1D integral:

where

The integral *I* can then be expressed as

where

Therefore, for an arbitrary Green function,

which yields the following approximated estimate of the RMS error for an arbitrary Green function:

The Green function for the Ewald decomposition for Yukawa interaction is given by

The integral $\beta (m,p)$ for this case takes the form

which results in (38).

## REFERENCES

A slightly more accurate fit for $\lambda TF$ is discussed in Ref. 27.

The codes are written in high performance Python using the optimizing compiler Numba which compiles Python code to machine code giving similar performance to C and Fortran. More details about Numba can be found here: http://numba.pydata.org. In the future the code will be publicly available under the name Sarkas.