The chemical potential of a component in a solution is defined as the free energy change as the amount of that component changes. Computing this fundamental thermodynamic property from atomistic simulations is notoriously difficult because of the convergence issues involved in free energy methods and finite size effects. This Communication presents the so-called S0 method, which can be used to obtain chemical potentials from static structure factors computed from equilibrium molecular dynamics simulations under the isothermal–isobaric ensemble. This new method is demonstrated on the systems of binary Lennard-Jones particles, urea–water mixtures, a NaCl aqueous solution, and a high-pressure carbon–hydrogen mixture.
INTRODUCTION
The chemical potential of a solute as a function of its concentration is pivotal for understanding many important physical and biological processes, such as osmosis, the solvation of organic molecules, the behavior of electrolytes, precipitation of crystals from solutions, and phase equilibria of mixtures.
Despite their massive importance, computing the chemical potential of solutions from atomistic simulations is notoriously difficult and has not become a routine task. This is because the existing methods1–9 apply with many caveats and are often restricted to a subset of systems. For example, methods based on Monte Carlo particle insertion and removal10,11 can have numerical convergence issues5 and do not work for dense fluids. Thermodynamic integration or overlapping distribution method from the real solution to a reference fluid1,2 involves running multiple simulations with different coupling constants as well as along different thermodynamic paths, may have singularity problems at the end points of integration,7,8 and is difficult to use for molecules with complex topology. The direct coexistence6 can have long equilibration time, and it only works at conditions close to the solubility limit. Kirkwood–Buff integrals12,13 suffer from severe finite size effects.13,14
Here, we propose a generic, easy-to-use method (the S0 method) for computing chemical potentials of solutes at different concentrations in solutions. This method uses the thermodynamic connection between composition fluctuations and the derivative of chemical potential with respect to concentration. In practice, only the static structure factors computed from equilibrium isothermal–isobaric (NPT) molecular dynamics (MD) simulations at different solute concentrations are needed. We benchmark the method against the systems of binary Lennard-Jones particles, urea–water, NaCl aqueous solution, and a high-pressure carbon–hydrogen mixture.
THEORY
Consider a large binary particle reservoir with fixed numbers of type A and type B particles in the NPT ensemble (illustrated in Fig. 1). The Gibbs free energy for this reservoir can be expressed as
which indicates that each particle type has a constant chemical potential.
An illustration of the isothermal–isobaric (NPT) and the grand canonical (μVT) ensembles. The whole NPT ensemble is used to compute the structure factors.
An illustration of the isothermal–isobaric (NPT) and the grand canonical (μVT) ensembles. The whole NPT ensemble is used to compute the structure factors.
Inside the NPT ensemble, a fixed volume V, which is large but much smaller than the reservoir, with permeable boundary can be regarded a grand canonical ensemble (μVT). The corresponding grand potential is
where Q is the canonical partition function of the V-region with the numbers of the two types of particles nA and nB. By taking the derivatives of Ω with respect to the chemical potentials, one gets the particle number fluctuations ΔnA = nA − ⟨nA⟩ and ΔnB = nB − ⟨nB⟩,
where ⟨…⟩ indicates the expectation value of an observable in the grand canonical ensemble. As derived in the seminal paper of Kirkwood–Buff,12 such equilibrium fluctuations can be used obtain the derivatives of μ at constant (P, T) conditions, e.g.,
where cA = ⟨nA⟩/V equals the concentration of type A particles in the NPT reservoir. Reference 12 then relates the particle number fluctuations to the Kirkwood–Buff integrals of radial distribution functions. Instead, here we take a different route and evaluate ⟨ΔnAΔnB⟩ using static structure factors detailed below.
The instantaneous density field of particle number inside the NPT ensemble is
where is the position of atom i of type A at time t, and the average number density is
where is the position of atom i of type A at time t, and for isotropic systems, . Note that ⟨…⟩NPT is the NPT ensemble average, while ⟨…⟩ without subscript indicates μVT average. To consider the two-body correlations between the density at different points in space, r′ and r″,
These density correlation functions from the NPT ensemble encode the particle fluctuations inside the μVT ensemble of volume V because
One can do a Fourier expansion of the instantaneous density field in space r inside the μVT ensemble with volume V, e.g.,
As the density field is a real function, , where the latter is the complex conjugate.
The static structure factor between two types of particles (A–A, A–B, or B–B) is defined as
It is easy to verify that SAB(k) is the Fourier expansion of , i.e.,
Combining Eq. (10) with (13), the structure factor is related to the particle number fluctuations via
Furthermore, the Kirkwood–Buff integral (KBI)12 between components A and B is related to the structure factor by
Importantly, although the Fourier expansions above are performed inside the volume V of the μVT ensemble, for isotropic liquid with translational invariance, SAB(k) should be the same in all parts of the larger NPT ensemble—including the whole volume of the NPT ensemble. To obtain SAB(k) from NPT simulations of finite fluid systems with periodic boundary conditions, one can only let one dimension of the simulation box fluctuate with barostat, while only collecting S(k) for k along the plane of the two fixed dimensions. Even better, one can also perform the Fourier expansion on the scaled coordinates and obtain the static structure factors using
where and l(t) is the instantaneous dimension of the supercell. The scaling procedure is rigorous at the thermodynamic limit, where the NPT and the NVT ensembles are equivalent.
To determine limk→0SAB(k) from MD simulations of finite system sizes, one can compute S(k) at small k under the NPT ensemble and extrapolate to the k → 0 case using the Ornstein–Zernike form,15
Finally, to compute chemical potentials, one can run multiple equilibrium NPT simulations with different concentrations and then obtain using numerical integration with respect to ln cA,
referenced to the chemical potential at a standard molar concentration . One can conveniently select this reference to be the pure state. Strictly speaking, Eq. (19) provides the relative chemical potential instead of the absolute value, but only the former is a physical observable. The first two terms on the right-hand side of Eq. (19) give the ideal-mixture chemical potential μid and the third term is the excess chemical potential μex = kBT ln(γA) where γA is the activity coefficient of solute A. The ratio is related to the activity coefficient by .
In addition, although one can evaluate μA(cA) and μB(cB) separately using Eq. (19) from the same simulations, one can obtain μA from μB employing the Gibbs–Duhem equation under constant P,T conditions,16
To automatically satisfy the Gibbs–Duhem equation with the S0 method, one can use
where χA and χB = 1 − χA are the molar fraction of A and B, respectively.
It is worth discussing finite size effects in the current approach and, in particular, the difference with the KBI method. Typically, when using the KBI, one either starts from the pair correlation functions or collects nA and nB inside a fixed volume V during NPT simulations.14,17 However, the open boundary of V imposes very large finite size effects,14 and even for μVT systems with hundreds of thousands of atoms, a significant fraction is found lying on the boundary.18 Without finite size corrections, the KBI approach is hardly applicable.14,17 In contrast, the computation of S(k) in an NPT ensemble avoids such boundary effects. In addition, the effects coming from using a finite wavelength (2π/k) are partly corrected by the physically inspired extrapolation [Eq. (18)].
BINARY LENNARD-JONES SYSTEM
A binary mixture (A, B) of Lennard-Jones (LJ) fluids was simulated using the same potential as Ref. 14: a purely repulsive 6–12 LJ potential truncated and shifted with cutoff radius 21/6σ, and with σAA = σBB = σAB = 1, ϵAA = 1.2, ϵBB = 1.0, and . Constant temperature and pressure were enforced through a stochastic velocity rescaling thermostat and Nose–Hoover barostat. Three system sizes, i.e., total number of atoms, 4000, 23 328, and 108 000, were considered in the range of mole fractions of type A particles, χA = 0.05, …, 1.0. Simulations were carried out using LAMMPS19 with a time step of 10−3. The total simulation time was 106 simulation steps for the two larger systems and 107 for N = 4000. To compute S(k), we collected a snapshot per 2000 steps of the trajectory.
Figure 2(a) shows SAB(k) computed from NPT simulations at different system sizes for a selected molar fraction of particles. The dashed curves are fits to Eq. (18) with a maximum cutoff in the wavevector kcut = 2π × 0.2σ−1, although we found the fitted values for S0 are insensitive to this choice as long as the corresponding wavelengths are larger than the atomic spacing. Although the SAB(k) values at small k-vectors are scattered, due to the many SAB(k) values available, the fitted values of have extremely small statistical errors, as shown in the legend of Fig. 2(a). Even with a small size of 4000 particles, the estimate for is converged. Such insensitivity to the system size is again confirmed from Fig. 2(b), which shows the extrapolated values of at different molar fractions. As a benchmark, in Fig. 2(c), we compare the excess chemical potential and , with the reference and set to the concentrations of pure A and pure B [Eq. (19)], computed using the S0 method and Widom particle insertion.20 Widom insertion simulations were performed with the same system size (23 328 atoms), 4 × 106 time step, and one particle insertion per time step. The magnitudes of μex are quite small, and they only differ slightly between A and B. μex values for both A and B are larger when χA is higher because type A particles have a stronger repulsive core. However, the μex values and their trend, which are smaller than the statistical errors of the Widom particle insertion method, can be captured by the S0 method.
Analysis of the chemical potentials of the binary LJ system using the S0 method. (a) SAB(k) computed from NPT simulations with T = 1.2 and P = 2 using different system sizes N. The dashed curves are the corresponding fits using Eq. (18). (b) , , and at different molar fractions of A, computed using three system sizes. (c) The excess chemical potentials of particle A and B at different χA, computed using the S0 method and the Widom particle insertion.
Analysis of the chemical potentials of the binary LJ system using the S0 method. (a) SAB(k) computed from NPT simulations with T = 1.2 and P = 2 using different system sizes N. The dashed curves are the corresponding fits using Eq. (18). (b) , , and at different molar fractions of A, computed using three system sizes. (c) The excess chemical potentials of particle A and B at different χA, computed using the S0 method and the Widom particle insertion.
UREA IN WATER
We analyzed the MD trajectories of urea/water mixtures from Ref. 21, simulated using the Kirkwood–Buff derived force field22 and SPC/E water at 1 atm pressure and 300 K temperature. Four urea molar concentrations were considered (2, 4, 6, and 8M) with the total number of urea and water molecules, i.e., system size, equal to 13 000–16 000. A comparison of the derivative of the activity coefficient with previous results is shown in Fig. 3. The S0 results agree closely with those of Ref. 14, which uses a KBI with finite size corrections, and with Exp. 1.22,23
Comparison of the derivative of the activity coefficient with the previous simulation results14,21 and experiments (Exp. 1,22,23 Exp. 224).
NaCl AQUEOUS SOLUTION
Simulations of NaCl water solutions at different molar concentrations (0.1–9.3M) were performed using LAMMPS19 at 298.15 K and 1 bar. The JC/SPC/E6 force field was used, with the Lennard-Jones interactions truncated at 1 nm, long-range Coulomb interactions treated using a particle–particle particle-mesh solver, and the constraints for the rigid water molecules enforced by SHAKE. A fixed amount of 32 640 water molecules together with 128–10 880 NaCl ion pairs were used. The time step was 2 fs, the total number of steps were 2 000 000, and the Nose–Hoover thermostat and barostat were used.
Figure 4 shows the excess chemical potentials for NaCl ion pairs and water at different salt molality m (mol NaCl/kg water) calculated using the S0 method, which was integrated using Eq. (19), and we found that employing Eq. (21) rendered fully consistent values. and are compared with previous results computed using Osmotic Ensemble Monte Carlo,16 the Bennett acceptance ratio method,25 and thermodynamic integration.26 Note that the value of depends on the standard chemical potential reference, and each study handles this differently (we set ), so only the change as a function of m is physically meaningful. All four studies compared in Fig. 4 are fairly consistent, while the results of the S0 method agree particularly well with those of Ref. 25.
The excess chemical potentials of NaCl ion pairs (upper panel) and water molecules (lower panel) as a function of molality, compared with previous simulation results,4,25,26 with all employing the JC/SPC/E6 force field.
HIGH-PRESSURE CARBON–HYDROGEN MIXTURE
The implications for this C–H mixture in the context of planetary science will be discussed in Ref. 27; here, we focus on how adding hydrogen to liquid carbon changes the chemical potential of carbon atoms. We used a machine learning potential developed in Ref. 27 and performed NPT simulations at different NC/NH ratios at T = 5000 K and P = 100 GPa. The system sizes, i.e., total number of atoms, lie between 11 232 and 82 944.
In Fig. 5, we show the derivative of the activity coefficient and excess chemical potential of carbon at different molar fractions. The S0 method predicts values that are in good agreement with the results obtained by coexistence simulations,27 but it can be used for the regime of low carbon concentration, which becomes prohibitive using the coexistence approach. Interestingly, shows a nonmonotonic behavior with a minimum at NH/NC = 2 and maxima at NH/NC = 1 and 4. These magic numbers are probably related to the chemical bonds between C and H.
The derivative of the activity coefficient (upper panel), and the excess chemical potential (lower panel) at different molar fractions of C, at T = 5000 K and P = 100 GPa. The values are referenced to the bulk liquid carbon, and they are compared with values from coexistence simulations.27
The derivative of the activity coefficient (upper panel), and the excess chemical potential (lower panel) at different molar fractions of C, at T = 5000 K and P = 100 GPa. The values are referenced to the bulk liquid carbon, and they are compared with values from coexistence simulations.27
To conclude, we present the S0 method to compute the chemical potential of mixtures just from equilibrium MD simulations at the NPT ensemble by simply computing static structure factors. We demonstrate the generality and robustness of the S0 method on diverse systems, including a model LJ system, organic molecule in water, aqueous electrolyte, and a high-pressure solution. In principle, the S0 method is also applicable to larger molecules such as polymers or large-molecule solvents. Extension to multicomponent systems is also straightforward: To compute the chemical potential of a certain species, one can simply treat the remaining particles as the second species. Compared with the previous methods, such as particle insertion10,11 and thermodynamic integration,1,2 the S0 method is more generally applicable and works particularly well for dense fluids with complex interactions and high solute concentrations. Extension to solid solutions may be possible by following the derivation of Ref. 28. We envisage the S0 method will largely simplify the computation of the chemical potential of complex solutions and make them routine tasks.
SUPPLEMENTARY MATERIAL
The supplementary material contains a detailed derivation of the statistical mechanical framework and additional information of the molecular dynamics simulations.
ACKNOWLEDGMENTS
I thank Daan Frenkel for providing feedback on an early draft and for stimulating discussions, Debashish Mukherji and Robinson Cortes-Huerto for sharing the trajectories for urea–water mixtures, and Aleks Reinhardt for useful suggestions on the manuscript.
AUTHOR DECLARATIONS
Conflict of Interest
The author have no conflicts to disclose.
Author Contributions
Bingqing Cheng: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
All PYTHON scripts and simulation input files generated for the study are in the SI repository https://github.com/BingqingCheng/S0, Ref. 29.