Bosonic exchange symmetry leads to fascinating quantum phenomena, from exciton condensation in quantum materials to the superfluidity of liquid 4He. Unfortunately, path integral molecular dynamics (PIMD) simulations of bosons are computationally prohibitive beyond particles, due to a cubic scaling with the system size. We present an algorithm that reduces the complexity from cubic to quadratic, allowing the first simulations of thousands of bosons using PIMD. Our method is orders of magnitude faster, with a speedup that scales linearly with the number of particles and the number of imaginary time slices (beads). Simulations that would have otherwise taken decades can now be done in days. In practice, the new algorithm eliminates most of the added computational cost of including bosonic exchange effects, making them almost as accessible as PIMD simulations of distinguishable particles.
I. INTRODUCTION
Path integral molecular dynamics (PIMD) simulations are a powerful computational tool for studying quantum condensed phases at thermal equilibrium.1,2 They are also the basis of several prominent methods for approximating quantum transport properties.3 For distinguishable particles, PIMD samples the partition function of the quantum system through classical molecular dynamics (MD) simulations of “ring polymers,” formed by connecting P copies of each particle (“beads”) through harmonic springs.4 State-of-the-art PIMD simulations can infer properties of quantum solids and liquids with up to particles,5–18 employing empirical force fields or interaction potentials that are based on density functional theory.19–22
Exchange symmetry between indistinguishable particles is central to quantum phenomena such as Bose–Einstein condensation and superfluidity. However, enforcing it poses a formidable challenge to PIMD. It requires summing over all permutations of the identical particles in the quantum partition function, leading to a different set of ring polymers for each permutation where the rings of exchanged particles are connected.23 As a result, a straightforward implementation of PIMD for indistinguishable particles scales exponentially, , where N is the number of particles. For that reason, while efficient sampling of permutations is possible in Path Integral Monte Carlo (PIMC),23–32 exchange effects were not included in PIMD simulations until recently, except for very few particles.33–35
In previous work, we showed36 that bosonic PIMD simulations can be performed with only cubic scaling, . Based on a recurrence relation for the ring polymer potential and forces, the algorithm allowed the first PIMD simulations of moderately large bosonic systems; the largest application to date consists of N = 128 atoms.37 Fermionic systems whose sign problem is not too severe were also simulated using a related approach.38–43 Nevertheless, even for bosonic systems, the cubic scaling hindered applications to systems with more than particles.
In this paper, we present an algorithm that reduces the scaling of bosonic PIMD from cubic to quadratic, . Our new algorithm generates the exact same potential, forces, and trajectories as the previous algorithm, only more efficiently. The outcome is that bosonic PIMD simulations are times faster than before. This significant improvement allows us to perform simulations of thousands of bosonic particles, which were previously beyond reach using PIMD. For example, a simulation of N = 1372 4He atoms in the superfluid liquid phase takes days, but would require over 14 years with the previous algorithm. We therefore dramatically reduce the added computational overhead of bosonic exchange in PIMD simulations in comparison to distinguishable particles.
In Sec. II A, we first describe the theory behind the original bosonic PIMD algorithm, and then explain how to reduce the scaling from cubic to quadratic in the evaluation of the ring polymer potential and forces. In Sec. II B, we verify the improved scaling of the algorithm numerically, and present the observed speedup thanks to our method in simulations of of ultracold trapped atoms and superfluid liquid 4He atoms. Section III presents our conclusions while Sec. IV provides additional computational details.
II. RESULTS AND DISCUSSION
A. Theory
This section explains the new algorithm and how it achieves the reduced scaling. We start by reviewing bosonic PIMD, and the original, cubic-scaling algorithm.
1. Background: Cubic scaling bosonic PIMD
Equation (1) is the isomorphism4 between the partition function of bosonic quantum particles and that of a system of classical ring polymers, where are the positions of the P polymer beads that correspond to particle ℓ. It is exact in the limit P → ∞, and should be interpreted as follows: (1) The summation is over all the permutations σ of the N particles. (2) Beads of the same index associated with different particles interact through a scaled potential , which is invariant under particle permutations. We do not consider the potential further since it is the same for bosonic and distinguishable particles. (3) Eσ = Eσ(R1, …, RN) is the spring energy of the ring polymer configuration that corresponds to σ, as we now explain.
Ring polymer configurations for all permutations over three particles. The first and last beads are marked in green (solid) and red (dashed), respectively. Each permutation is given in two notations: the two-line notation, , and the cycle notation, where each cycle is a sequence (ℓ, σ(ℓ), σ(σ(ℓ)), …), which must close due to the properties of permutations, and each element appears in exactly one cycle (see the supplementary material). Permutations that contribute to the potential V[1,N] are enclosed in a square.
Ring polymer configurations for all permutations over three particles. The first and last beads are marked in green (solid) and red (dashed), respectively. Each permutation is given in two notations: the two-line notation, , and the cycle notation, where each cycle is a sequence (ℓ, σ(ℓ), σ(σ(ℓ)), …), which must close due to the properties of permutations, and each element appears in exactly one cycle (see the supplementary material). Permutations that contribute to the potential V[1,N] are enclosed in a square.
Equation (1) provides an expression for the bosonic partition function that can be sampled, at least in principle, through simulations of a classical system with ring polymer potential . However, the number of permutations N! grows exponentially, making this naive approach intractable.
The recurrence relation defining V[1,N] allows to compute both the potential and the derived forces in time. To evaluate the potential, first the cycle energies are evaluated, each of which costs due to the double sum in Eq. (4), resulting in overall. The forces due to the potential, i.e., for all ℓ and j, are also computed in time using a recurrence relation that is obtained from Eq. (3) by taking the derivative. See the supplementary material of Ref. 36 for the full complexity analysis.
In this paper, we improve on this complexity and show that the potential and forces can be computed in time, leading to significantly faster bosonic PIMD simulations. As in the previous method, the algorithm first computes the potential and then the forces on all beads. We achieve a faster evaluation of the potential by a recursive formula also for the cycle energies E[N−k+1,N], and a faster evaluation of the forces by writing a probabilistic expression for them.
2. Reducing the potential to
(a) The interior springs of particle 1 are the same in all the cycles in which the particle participates. (b) The cycle energy E[1,3] can be obtained from E[2,3] by removing the term corresponding to the spring closing the cycle from 3 to 2 (red, dotted), connecting particle 1 to the beginning of 2 (purple, solid), adding the interior springs of particle 1 (gold, solid), and closing the cycle from 3 to 1 (purple, solid).
(a) The interior springs of particle 1 are the same in all the cycles in which the particle participates. (b) The cycle energy E[1,3] can be obtained from E[2,3] by removing the term corresponding to the spring closing the cycle from 3 to 2 (red, dotted), connecting particle 1 to the beginning of 2 (purple, solid), adding the interior springs of particle 1 (gold, solid), and closing the cycle from 3 to 1 (purple, solid).
With this procedure, all the cycle energies are computed in time: (1) summing over the P − 1 interior springs of N particles [Eq. (6)] takes time. (2) Forming the single-particle cycles of Eq. (7) takes additional , and (3) evaluating each cycle energy E[u,v] from E[u+1,v] [Eq. (5)] is , with such cycles.
Once the cycle energies are known, evaluating V[1,N] takes only, since it amounts to evaluating all the N potentials V[1,v], each of which costs due to the sum in Eq. (3). We provide a pseudocode of the potential evaluation algorithm in Algorithm 1 of the supplementary material.
3. Reducing the force to
Reducing the time required to compute the potential does not directly reduce the scaling of the evaluation of the forces. In the original algorithm, the recurrence for the force is once the cycle energies are known. However, it needs to be evaluated separately per bead, which amounts to overall. The techniques we use above cannot improve this further. To overcome this, we need to rewrite the forces in a new, probabilistic form.
a. Potential as a sum over representative permutations.
The reason that Eq. (8) transforms permutations using G is that V[1,N] does not include all permutations. For example, Eq. (9) does not include terms for the permutations (132) and (13)(2) (see Fig. 1). However, Eq. (8) proves that V[1,N] indeed samples the correct quantum partition function, since G[σ] preserves the topology of σ by construction.
It might seem counter-intuitive to rewrite the recursive potential as a sum over representative permutations, because their number also grows exponentially (as shown in the supplementary material). However, we never use this expression to evaluate the potential, but rather to derive an efficient expression for the force, as we now explain.
b. Force as an expectation value.
In itself, this expression is still not amenable to efficient computation due to the sum over the permutations. However, it can be greatly simplified by observing that, in any permutation, the force on a bead depends only on its immediate neighbors. When taking the expectation value, we can group together all permutations that have the same force exerted on the bead. The calculation then splits to two cases: the case of interior beads (j = 2, …, P − 1), and the case of exterior beads (j = 1, P).
c. Force on interior beads.
d. Force on exterior beads.
Equations (13) and (14) imply that once the connection probabilities are known for all ℓ, ℓ′, the force on all exterior beads can be computed in time: there are 2N of them, and each is computed in time due to the sum in Eqs. (13) and (14). Our final goal then is to precompute the connection probabilities efficiently. This is a mighty challenge, because the connection probability sums over an exponential number of permutations (as shown in the supplementary material).
e. Connection probabilities.
We provide a pseudocode of the force evaluation algorithm in Algorithm 1 of the supplementary material.
B. Benchmarks and applications
This section presents our numerical results for the improved performance of the algorithm.
1. Scaling
The improved scaling of our algorithm, quadratic rather than cubic, is shown numerically in Fig. 3(a) for our Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)46 implementation. Here, we measure the time per simulation step as a function of the number of particles N on a log-log scale, for a system of trapped cold non-interacting bosons with P = 32 beads. The slope of the linear fit is 1.6 for our new algorithm (blue), compared to 2.5 for the original algorithm (green). In our i-PI47 implementation, the slopes are 1.9 and 3.2 respectively (see the supplementary material).
(a) The scaling of the original and current algorithms with the number of particles for a system of N non-interacting cold trapped bosons on a log-log scale. Blue circles and green squares represent the current and original algorithms, respectively. Black triangles represent simulations of distinguishable particles. Lines represent a linear fit to the simulation data. (b) The speedup gained by using the proposed algorithm, defined as the ratio between time per step in the original and current implementations.
(a) The scaling of the original and current algorithms with the number of particles for a system of N non-interacting cold trapped bosons on a log-log scale. Blue circles and green squares represent the current and original algorithms, respectively. Black triangles represent simulations of distinguishable particles. Lines represent a linear fit to the simulation data. (b) The speedup gained by using the proposed algorithm, defined as the ratio between time per step in the original and current implementations.
2. Speedup
The reduced scaling of the current algorithm implies a speedup of over the original algorithm. Figure 3(b) shows the speedup (the ratio of the time per step between the original and current algorithms) in the same system as in panel (a). The new algorithm achieves acceleration by orders of magnitude, ranging from ×21 faster simulations of N = 32 bosons to ×944 speedup for N = 1024. The speedup is close to linear with the number of particles (R2 = 0.993), with a slope of . We note that the cost of evaluating pairwise interactions, if present, is the same in the current and original algorithm (and in ordinary PIMD). This reduces the speedup in such systems, but it remains significant; we find a linear speedup with slope 0.8 in liquid 4He, ×6 for N = 32, and ×1069 for N = 1372 (see the supplementary material).
3. Simulating particles
Such speedups allow us to simulate systems far beyond what was previously possible in bosonic PIMD. Figure 4 shows a simulation of superfluid liquid 4He, where we compute the pair correlation of N = 1372 atoms, with P = 96 beads. The pair correlation is in very good agreement with the PIMC results of Ceperley23 (see the supplementary material). This simulation takes days in our algorithm, but would take longer than 14 years with the original algorithm (see Sec. IV). Similarly, simulations of N = 1600 trapped cold bosons, with and without Gaussian repulsive interaction (see the supplementary material), take less than days, whereas the previous algorithm is estimated to require over 20 years to complete on the same hardware (see Sec. IV).
The pair correlation of N = 1372 4He atoms in the superfluid liquid phase at temperature 1.379 K with P = 96 beads. The simulation lasts days, instead of years with the previous algorithm.
The pair correlation of N = 1372 4He atoms in the superfluid liquid phase at temperature 1.379 K with P = 96 beads. The simulation lasts days, instead of years with the previous algorithm.
4. Bosonic vs ordinary PIMD
The new algorithm eliminates most of the added computational cost of bosonic exchange, and brings the scale of bosonic PIMD to be largely on par with ordinary PIMD. Figure 3(a) compares the new algorithm to the PIMD algorithm in LAMMPS for distinguishable particles (black). For 64 particles, there is only very little difference (37%) in the time per simulation step. For 1024 particle, bosonic simulations are merely times slower than for distinguishable particles. In the original algorithm, bosonic simulations of 1024 particles would have been times more expensive than simulations of distinguishable particles.
III. CONCLUSIONS
This paper presents an algorithm for bosonic PIMD with improved computational complexity, reducing the cost from cubic to quadratic with the number of particles. Our improved algorithm relies on two observations: (1) that the interior beads of each ring polymer are not affected by exchange interaction, and (2) that the bosonic ring polymer force can be evaluated efficiently as the expectation value of the force over a Boltzmann distribution of particle permutations.
Using our method, simulations of condensed phase bosonic systems with thousands of particles can be done in days instead of years, allowing the first PIMD simulations of thousands of interacting particles including bosonic exchange symmetry.
We believe that our algorithm is a significant step towards including bosonic exchange effects in PIMD simulations as routinely as performing standard simulations for distinguishable particles. The new algorithm will enable studies of systems such as exotic phases of superfluids under confinement using PIMD. Our insights might also be used to accelerate other algorithms where a similar recursion has been identified.48–50 Finally, it would be exciting to see whether approximate methods for obtaining time correlation functions based on PIMD, such as Matsubara dynamics, Ring Polymer Molecular Dynamics or Centroid Molecular Dynamics,3 could be devised also for bosonic systems.
IV. METHODS
A. Simulation details
We implemented our algorithm in development branches of LAMMPS46 and i-PI.47 Unless otherwise stated, the results are obtained using the implementation in LAMMPS. We validated the proposed algorithm by reproducing results obtained before using the original method36 on several bosonic systems with and without interaction, and extending some of them to larger system sizes, as detailed in the supplementary material.
For the trapped bosons presented in Fig. 3, we use P = 36, a time step of 1 fs, and average the time over 1000 simulation steps. For the superfluid liquid 4He presented in Fig. 4, the simulation was performed with the interaction potential of Aziz et al.,51 at a temperature of 1.379 K and a density of 0.021 82 Å−3, in periodic boundary conditions in all spatial dimensions, and for 2.5 × 106 steps of 0.5 fs each, where the first 20% were used as equilibration. For the two-dimensional density of trapped bosons with and without Gaussian repulsive interaction (whose results are displayed in the supplementary material) we use P = 36, appropriate for the temperature of 11.6 K (βℏω = 3), and a time step of 1 fs. The simulations were performed for 3 × 106 steps with the first 20% discarded as equilibration. The repulsive potential is the same as described in Ref. 36, with g = 3.
B. Estimating the run-time of the original algorithm
To estimate the time required for the implementation of the original algorithm to complete the simulation of N = 1600 trapped cold bosons with a Gaussian repulsive interaction, we simulated 1000 steps of the implementations of both the current and original algorithm, and extrapolated to 3 × 106 steps. The result was in agreement with the actual time required to perform the simulation in the new method, and actually slightly shorter, rendering our prediction conservative. Specifically, with the new method, 1000 steps lasted seconds for g = 0 and seconds for g = 3, which is extrapolated to days for g = 0 and 5 days for g = 3; our actual simulation was slightly slower, days for g = 0 and 7 days for g = 3, which is likely due to the presence of other CPU-heavy processes on the same machine. In contrast, with the original method, 1000 steps took h for g = 0 and h for g = 3, which is extrapolated to years and years, respectively, for 3 × 106 steps.
We estimated the time to simulate the system of superfluid 4He in a similar manner. With the current algorithm, simulating 1000 steps on N = 1372 and P = 64 took s, which is extrapolated to days. In contrast, with the original algorithm, 1000 steps required h with P = 64, which is extrapolated to years. This is a conservative estimate for the simulation with the original algorithm and P = 96, because increasing the number of beads increases the required time. We did not perform this estimation based on the time to perform 1000 steps with P = 96 because the original implementation uses a simple numerical stability procedure when evaluating sums of exponentials (see the supplementary material), which is not sufficiently stable for large P and N.
C. Definition of G[σ]
In the context of rewriting the potential as a sum over representative permutations [Eq. (8)], for a general N, the representative permutation G[σ] is defined by transforming the cycle notation of σ, in two steps. First, the cycles are sorted in ascending order according to the largest element in each cycle. For example, σ = (13)(2) is rewritten as (2)(13), because the highest element in the cycle (13) is larger than the highest element in (2). Second, the elements are replaced by the numbers 1, …, N consecutively while maintaining the same length and order of cycles. In the same example, (2)(13) turns into (1)(23), and overall G[(13)(2)] = (1)(23).
D. Expressions for connection probabilities
In the context of evaluating the forces [Eqs. (13) and (14)], we describe how to compute the connection probabilities based on the potentials defined in Eq. (15).
Consider the connection probability . For ℓ′ > ℓ, the probability is nonzero only when particle ℓ is not the particle with the highest index in the ring, due to the properties of G[σ] (see Fig. 1). In that case, ℓ is connected to the particle ℓ + 1. We prove in the supplementary material that the expression for this probability is the expression in Eq. (16) above.
The intuition underlying the proof is that V[ℓ+1,N] accounts for all the permutations over the particles ℓ + 1, …, N, similarly to how V[1,ℓ] accounts for the permutations over 1, …, ℓ. Therefore, in Eq. (16), we compute the complement of the probability that ℓ, ℓ + 1 do not belong to the same cycle. Similarly, in Eq. (17), we account for all configurations that include a ring polymer that starts with particle ℓ′ and ends with particle ℓ, and all possible permutations over the other particles.
SUPPLEMENTARY MATERIAL
The supplementary material contains the pseudocode of the algorithm, additional simulation details, and proofs of theorems.
ACKNOWLEDGMENTS
Barak Hirshberg acknowledges support by the USA-Israel Binational Science Foundation (Grant No. 2020083) and the Israel Science Foundation (Grants Nos. 1037/22 and 1312/22). Yotam Feldman was supported by the Ratner Center Fellowship, as well as Schmidt Science Fellows, in partnership with the Rhodes Trust.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Yotam M. Y. Feldman: Conceptualization (equal); Formal analysis (lead); Software (lead); Writing – original draft (equal); Writing – review & editing (equal). Barak Hirshberg: Conceptualization (equal); Funding acquisition (lead); Supervision (lead); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are openly available on GitHub, at https://github.com/Hirshberg-Lab/quadratic-bosonic-pimd-jcp23-data