We present an accurate and efficient method to obtain Kirkwood-Buff (KB) integrals in the thermodynamic limit from small-sized molecular dynamics simulations. By introducing finite size effects into integral equations of statistical mechanics, we derive an analytical expression connecting the KB integrals of the bulk system with the fluctuations of the number of molecules in the corresponding closed system. We validate the method by calculating the activity coefficients of aqueous urea mixtures and the KB integrals of Lennard-Jones fluids. Moreover, our results demonstrate how to identify simulation conditions under which computer simulations reach the thermodynamic limit.

One of the most significant achievements of statistical mechanics is the connection between macroscopic thermodynamic quantities and the microscopic components and interactions of a physical system. This relation is routinely employed in computer simulations, e.g., molecular dynamics (MD), where structural, nanoscale features of a system can be easily measured and related to macroscopic emergent properties. However, this protocol is still a source of major concern.1 In particular, the physical systems under examination in experiments are open, that is, they exchange energy and matter with the environment or, at least, are composed by an Avogadro number of molecules. By contrast, MD simulations are usually constrained to work under canonical (microcanonical) conditions with fixed temperature (energy), volume V0, and number of particles N0. Finite number of particles and small volumes used to compute physical observables thus introduce various kinds of finite size effects, whose impact needs to be quantified if one aims at extracting thermodynamic quantities from simulations.

In this study, we include finite size effects into statistical mechanics integral equations, namely the Kirkwood-Buff2 equations, to derive accurate relations between thermodynamic properties and structural quantities obtained from nanoscale MD simulations. We use simple scaling arguments to keep track of two effects originating from the finite size of the simulated system: ensemble effects due to the fixed number of particles and boundary effects due to the small size of the subdomain used to compute physical observables.3 The derived analytical expression is used to obtain Kirkwood-Buff (KB) integrals in the thermodynamic limit. We validate this approach with a mixture of Lennard-Jones particles and employ it to study the solvation thermodynamics of prototypical urea/water liquid mixtures. In the latter case, we compare the results with previous computational studies4 and available experimental data.5–7 

We focus our attention on the Kirkwood-Buff (KB) theory,2 as it is arguably the most successful framework to investigate the properties of complex liquid mixtures. KB theory relates the local structure of a system to density fluctuations in the grand canonical ensemble, which are in turn related to equilibrium thermodynamic quantities such as the compressibility, the partial molar volumes, and the derivatives of the chemical potentials.8 Despite the fact that KB theory was formulated more than sixty years ago, it enjoys renewed interest in computational soft-matter and statistical physics communities. Recent works, in fact, have shown promising applications related to solvation of biomolecules9 and potential uses to compute multicomponent diffusion in liquids,10 and to study complex phenomena such as self-assembly of proteins.11 

For a multicomponent fluid of species i, j in equilibrium at temperature T, the Kirkwood-Buff integral (KBI) is defined as

(1)

with δij the Kronecker delta. The superscript (o) indicates that this definition holds for an open system, i.e., a system described by the grand canonical ensemble. This means that we compute fluctuations of the number of particles in a subdomain of volume V embedded in a reservoir whose size goes to infinity. Thus, 〈Ni〉 is the average number of i-particles inside V, or ρi = 〈Ni〉/V. Moreover, g i j o ( r 12 ) is the multicomponent radial distribution function (RDF) of the infinite system, with r12 = r2r1.

To evaluate the integral on the right hand side (r.h.s.) of Eq. (1), we assume that the fluid is homogeneous, and we use translational invariance that implies the limit V → ∞. Therefore, the transformation r2r = r2r1 gives G i j , the KB integrals of the bulk system

(2)

Eq. (2) is strictly valid in the thermodynamic limit. However, it is extensively used in computer simulations despite the fact that one is usually constrained to study closed systems (c), i.e., systems described by the canonical or microcanonical ensembles. More precisely, in computer simulations one calculates the quantity

(3)

where g i j c ( r ) is the multicomponent RDF of the finite system and R is a cutoff distance.

To justify the use of Eq. (3), it is always assumed that g i j o ( r ) g i j c ( r ) , which implies that the volume of the simulation box is large enough to locally reproduce the open system, or grand canonical, thermodynamics. In addition, it is assumed that Rζ, with ζ being the structural correlation length of the liquid. This implies that g i j c ( r R ) = 1 , that is, structural correlations between points separated by a distance R vanish and the integral of Eq. (3) can be safely truncated.

We thus face two different finite sizes simultaneously at play: the finite—and often too small—number of particles that constitute the simulated system (ensemble); and the extension of the subregion in which some physical observable is computed (boundary effects), which is at risk of being smaller than or too close to the system’s correlation length. It is therefore only fair to expect that results obtained using Eq. (3) might not be fully reliable. As a matter of fact, the minimal errors introduced by using Eq. (3) yield dramatic explicit finite size effects that become apparent when comparing standard MD with semi-grand canonical simulations.12 

FIG. 1.

Simulation snapshot of a liquid mixture of urea and water indicating the total volume of the simulation setup, V0, and one volume used to compute the fluctuations of the number of molecules, V.

FIG. 1.

Simulation snapshot of a liquid mixture of urea and water indicating the total volume of the simulation setup, V0, and one volume used to compute the fluctuations of the number of molecules, V.

Close modal

Here, instead of using Eq. (3), we explicitly include both finite size effects into an expression, analogous to Eq. (1), defined for a finite system. Let us recall that in computer simulations one considers systems with total fixed number of particles N0 and volume V0 (see Fig. 1). Thus we define13 

(4)

or, to rephrase it, we define a quantity Gij(V, V0) that is evaluated by computing fluctuations of the number of particles in finite subdomains of volume V inside a simulation box of volume V0. The average number of i-particles 〈Ni〉′ ≡ 〈NiV,V0 depends on both subdomain and simulation box volumes, and it is therefore conceptually different from the average used in Eq. (1). Moreover, the integral on the r.h.s. of Eq. (4) should be evaluated for the RDF of the finite system with volume V0 by using a finite integration domain V.

As a first step, we introduce a correction to ensemble effects accounting for the fact that g i j c ( r ) g i j o ( r ) holds only in the thermodynamic limit. This implies that the asymptotic behaviour of the tail of g i j c ( r ) should be included to account for small differences at large values of r. We propose to use an expression for the multicomponent pair correlation functions of the form

(5)

based on the asymptotic limit g i j c ( | r | ζ ) = 1 ( δ i j / ρ i + G i j ) / V 0 discussed in Ref. 8. As expected, when the total volume V0 → ∞ we recover g i j c ( r ) = g i j o ( r ) . We include Eq. (5) in the integral on the r.h.s. of Eq. (4) and obtain

(6)

In Eq. (5) we neglect O(1/V0) contributions that include derivatives of g i j o ( r ) with respect to ρi, as well as O ( 1 / V 0 2 ) terms, because their contribution to ensemble size effects becomes negligible upon integration.

Second, for the boundary effects, we correct for the finite domain in which we compute the integral on the r.h.s of Eq. (6), which indeed includes the RDF of the open system. However, the volume V is finite and the transformation r2r = r2r1 cannot be applied directly since the system is no longer translationally invariant. Alternatively, we take into account the different integration domains defined by the volumes V and V0,13 and write

(7)

with S the surface area of the subdomain V and α i j a proportionality constant. The meaning of Eq. (7) can be understood as follows. Both integrals, IVV and IVV0, are equal to zero for r12 > ζ. Nevertheless, for values of r12 close to the boundary of the subdomain V, i.e., r1 inside V and r2 outside the subdomain with r12 < ζ, there are contributions to IVV0 that are missing in IVV. Consequently, the difference IVV0IVV must be proportional to the surface area of the subdomain V, so that V V 0 V d r 1 d r 2 [ g i j o ( r 12 ) 1 ] S .14 Replacing Eq. (7) in Eq. (6), we obtain

(8)

where for convenience we use S / V 1 / V 1 3 . The constant αij depends only on intensive thermodynamic system properties such as density and temperature.

To compute the integral on the r.h.s. of Eq. (8), we assume that Vζ < V < V0 with Vζ = 4πζ3/3 and implicitly neglect finite size effects due to oscillations of the RDF beyond the volume V. Accordingly, we assume that periodic boundary conditions are applied such that the system becomes translationally invariant. We use the coordinate transformation r2r = r2r1 to obtain an integral equal to the one in Eq. (2), since for |r| > ζ, g i j o = 1 . Finally, we rewrite Eq. (8) as

(9)

It is worth noting that Eq. (9) gives both the correct thermodynamic limit and asymptotic behaviour8 of Gij(V, V0), provided V > Vζ. The former is obtained in the limit (V, V0) → ∞ that gives G i j ( ( V , V 0 ) ) = G i j . The latter holds when V = V0, and gives Gij(V = V0, V0) = − δij/ρi.

The two finite size corrections discussed above can be easily identified in Eq. (9): (i) the two V/V0 terms (ensemble effects) appear from using an asymptotic correction to the pair correlation functions, Eq. (5), which includes explicitly the impact of the finite size of the simulation box V0. This correction is similar to the one deduced from the Taylor expansion in terms of N0 applied to the RDF of closed single-component systems.3 (ii) The 1/V1/3 term (boundary effect) results from considering finite integration domains explicitly. Using a different approach, namely arguments based on the thermodynamics of small systems,15 a correction of this form was proposed recently for single component systems16 and extended for multicomponent systems.17 However, the results obtained using this method neglect the finite size of the simulation box, because the underlying theory is only valid for finite systems embedded in a reservoir of infinite volume.16 

Finally, we emphasise that Eq. (9) allows us to evaluate G i j , the infinite system KBI, by computing Gij(V, V0). In practice, we perform a single NVT MD simulation at fixed V0 and, for the production trajectory, compute the fluctuations of the number of particles as a function of the volume V of the subdomain. We recall that extrapolations of thermodynamic properties from finite size computer simulations have been applied in the past to study single-component idealised systems. Examples include the calculation of the compressibility of the Ising lattice gas18 and hard disk fluids,19 and the investigation of the gas-liquid transition in two-dimensional Lennard-Jones fluids,20 as well as the calculation of the elastic constants of model solids.21 

We define λ V / V 0 1 3 and rewrite Eq. (9) as

(10)

This expression depends on λ and the volume of the simulation box V0. Since in MD simulations V0 is fixed, the effect of the size of the simulation box on Eq. (10) reduces to a global shift that leaves the general form of the equation unchanged. This implies that results for different values of V0 superpose and it is possible to calculate G i j , and therefore the open system RDF given the order of approximation implicit in Eq. (5), even from relatively small MD simulations.

FIG. 2.

Simulation results for a binary CA = 0.3 (a) and CA = 0.5 (b) LJ fluid at T = 1.2 and constant pressure P = 9.8. GAA(λ) (red circles), GBB(λ) (blue squares), and GAB(λ) (green triangles). There is a linear regime for λ ≤ 0.3, indicated by vertical lines, where the slope of the line is equal to G i j . Black lines were obtained by fitting Eq. (10) to the data in this linear regime, however the superposition is evident for the whole interval 0 < λ < 1.

FIG. 2.

Simulation results for a binary CA = 0.3 (a) and CA = 0.5 (b) LJ fluid at T = 1.2 and constant pressure P = 9.8. GAA(λ) (red circles), GBB(λ) (blue squares), and GAB(λ) (green triangles). There is a linear regime for λ ≤ 0.3, indicated by vertical lines, where the slope of the line is equal to G i j . Black lines were obtained by fitting Eq. (10) to the data in this linear regime, however the superposition is evident for the whole interval 0 < λ < 1.

Close modal

To validate Eq. (10), we first consider a binary mixture (A, B) of Lennard-Jones (LJ) fluids. We use a purely repulsive 6-12 LJ potential truncated and shifted with cutoff radius 21/6σ. The potential parameters are chosen as σ = σAA = σBB = σAB = 1, and ϵAA = 1.2, ϵBB = 1.0 with ϵAB = (ϵAA + ϵBB)/2. All the results are expressed in LJ units with mass m = 1, time σ(m/ϵBB)1/2, temperature ϵBB/kBT, and pressure σ3/ϵBB. Simulations were carried out using ESPResSo++22 with a time step of δt = 10−3. Constant temperature was enforced through a Langevin thermostat with damping coefficient γ = 1.0. Two system sizes, N0 = 23 328 and 108 000, were considered in the range of mole fractions of A-molecules CA = 0.1, …, 1.0. The number density has been adjusted around ρ = 0.864, thus V0 ≈ (30σ)3 and (50σ)3, such that pressure remains nearly constant at P = 9.8. We performed equilibration runs of 64 ⋅ 106 MD steps (16 ⋅ 106 for N0 = 108 000 systems) and production runs of 2 ⋅ 106 MD steps. To compute Gij(λ), we selected 800 frames per trajectory and for each frame identified 1000 randomly positioned subdomains with volumes ranging from (2σ)3 < V < V0.

Fig. 2 shows results corresponding to CA = 0.3 (a) and CA = 0.5 (b) mole fractions. In both cases a linear regime for λ ≤ 0.3, indicated by vertical lines, appears. In this interval we fit a line with slope G i j and intersection αij. By replacing these values in Eq. (10) we obtain curves that superpose remarkably well on the simulation data over the entire range.

In this linear regime, λ3 ≈ 0 thus G i j ( V , V 0 ) G i j , provided V > Vζ. This suggests that the region defined by λ ≤ 0.3 corresponds to the regime where the canonical and grand canonical ensembles coincide, i.e., the thermodynamic limit. Moreover, this also implies g i j c ( r ) g i j o ( r ) . In practical terms, if we seek an accurate value of KBIs using the standard expression (3) for G i j R , i.e., obtaining g i j c ( r ) from simulations and integrating up to a cutoff R, for a LJ system with correlation length ζ = 10σ, we would need to run a simulation with, at least, V0 = (50σ)3. In other words, the volume of the subdomain V > Vζ should be at most 3% of the volume V0 in order to get G i j R G i j .12 

FIG. 3.

G A B R for LJ mixtures with CA = 0.3. Data points correspond to results obtained using Eq. (3) for two box sizes: V0 = (30σ)3 (blue line) and V0 = (50σ)3 (red line). The horizontal line indicates the value obtained using the extrapolation proposed in Eq. (10) with V0 = (30σ)3.

FIG. 3.

G A B R for LJ mixtures with CA = 0.3. Data points correspond to results obtained using Eq. (3) for two box sizes: V0 = (30σ)3 (blue line) and V0 = (50σ)3 (red line). The horizontal line indicates the value obtained using the extrapolation proposed in Eq. (10) with V0 = (30σ)3.

Close modal

To test this argument, we compare the results obtained using Eq. (10) with the quantity G i j R obtained from Eq. (3). Fig. 3 shows G A B R obtained with V0 = 50σ (red line) and V0 = 30σ (blue line). The horizontal line indicates the value of G A A obtained from Eq. (10) and Fig. 2. Clearly, upon increasing the size of the simulation box, G A A R converges to the G A A calculated with our method.

Finally, we use Eq. (10) to study the solvation thermodynamics of urea/water mixtures. In a system with cosolvent U (urea) and solvent W (water) we have, at pressure P, temperature T, and density ρU, the derivative of urea activity coefficient defined as

(11)

with γU and kBTlnγU the urea activity coefficient and chemical potential, respectively.8 

Urea/water mixtures were simulated23 in GROMACS,24 using the Kirkwood-Buff derived force field6 and SPC/E water.25 Four urea molar concentrations were considered (2.0, 3.99, 6.00, and 8.02 M). NPT simulations were performed with pressure and temperature enforced with Berendsen barostat (0.5 ps coupling) and thermostat (0.1 ps coupling) to ensure 1 atm pressure and 300 K temperature, respectively.

We compute γUU and report the results in Fig. 4. We compare our data with two experiments, Exp. 15,6 (solid line) and Exp. 27 (dashed line), and with simulation results using Eq. (3).4 The latter simulation results (blue squares) tend to follow the behaviour observed in Exp. 1. This is not surprising since the force field6 used in the simulations was parameterised to reproduce the activity coefficients reported in Exp. 1. Nevertheless, by using our method with the same simulation sizes and trajectories of Ref. 4, we obtain results in decidedly better agreement with Exp. 1. This suggests that our method is more accurate than using Eq. (3) and can be profitably employed to investigate complex molecular fluids.

FIG. 4.

γUU for different urea molar concentrations CU. Data from two experiments, namely Exp. 15,6 and Exp. 2,7 and simulations4 are compared to results obtained with the method outlined in the text.

FIG. 4.

γUU for different urea molar concentrations CU. Data from two experiments, namely Exp. 15,6 and Exp. 2,7 and simulations4 are compared to results obtained with the method outlined in the text.

Close modal

In conclusion, by explicitly including ensemble and boundary finite size effects into integral equations of statistical mechanics, we presented here a method to extract thermodynamic properties of a physical system from small-sized atomistic simulations. In particular, the combination of small-sized simulation setups and the analytical relation established here allows us to determine quantitatively the activity coefficients of aqueous urea mixtures. We also identified the thermodynamic limit for MD simulations based on the ratio between the volume of the domain in which a physical observable is computed and the volume of the simulation box. Finally, we envisage a beneficial usage of this method to study solvation properties of other complex mixtures of biological interest and to characterise the phase diagram of molecular fluids.

R.C.H. thanks Debashish Mukherji for many stimulating discussions and for kindly sharing with us his trajectories for urea water mixtures. We thank Roberto Menichetti, Aoife Fogarty, and Cristina Greco for a critical reading of the manuscript. R.C.H. gratefully acknowledges the Alexander von Humboldt Foundation for financial support.

1.
P. V.
Coveney
and
S.
Wan
,
Phys. Chem. Chem. Phys.
(published online).
2.
J. G.
Kirkwood
and
F. P.
Buff
,
J. Chem. Phys.
19
,
774
(
1951
).
3.
J. L.
Lebowitz
and
J. K.
Percus
,
Phys. Rev.
122
,
1675
(
1961
);
J. J.
Salacuse
,
A. R.
Denton
, and
P. A.
Egelstaff
,
Phys. Rev. E
53
,
2382
(
1996
);
F. L.
Román
,
J. A.
White
, and
S.
Velasco
,
J. Chem. Phys.
107
,
4635
(
1997
);
F. L.
Román
,
A.
González
,
J. A.
White
, and
S.
Velasco
,
Am. J. Phys.
67
,
1149
(
1999
);
D.
Villamaina
and
E.
Trizac
,
Eur. J. Phys.
35
,
035011
(
2014
).
4.
T. E.
de Oliveira
,
P. A.
Netz
,
K.
Kremer
,
C.
Junghans
, and
D.
Mukherji
,
J. Chem. Phys.
144
,
174106
(
2016
).
5.
R. H.
Stokes
,
Aust. J. Chem.
20
,
2087
(
1967
).
6.
S.
Weerasinghe
and
P. E.
Smith
,
J. Phys. Chem. B
107
,
3891
(
2003
).
7.
O.
Miyawaki
,
A.
Saito
,
T.
Matsuo
, and
K.
Nakamura
,
Biosci. Biotechnol. Biochem.
61
,
466
469
(
1997
).
8.
A.
Ben-Naim
,
Molecular Theory of Solutions
(
Oxford University Press
,
2006
).
9.
V.
Pierce
,
M.
Kang
,
M.
Aburi
,
S.
Weerasinghe
, and
P. E.
Smith
,
Cell Biochem. Biophys.
50
,
1
(
2008
).
10.
S.
Kjelstrup
,
S. K.
Schnell
,
T. J. H.
Vlugt
,
J.-M.
Simon
,
A.
Bardow
,
D.
Bedeaux
, and
T.
Trinh
,
Adv. Nat. Sci.: Nanosci. Nanotechnol.
5
,
023002
(
2014
);
X.
Liu
,
S. K.
Schnell
,
J. M.
Simon
,
P.
Krüeger
,
D.
Bedeaux
,
S.
Kjelstrup
,
A.
Bardow
, and
T. J. H.
Vlugt
,
Int. J. Thermophys.
34
,
1169
(
2013
).
11.
A.
Ben-Naim
,
J. Chem. Phys.
138
,
224906
(
2013
).
12.
D.
Mukherji
and
K.
Kremer
,
Macromolecules
46
,
9158
(
2013
).
13.
F.
Román
,
J.
White
,
A.
González
, and
S.
Velasco
, “
Ensemble effects in small systems
,” in
Theory and Simulation of Hard-Sphere Fluids and Related Systems
(
Springer Berlin Heidelberg
,
Berlin, Heidelberg
,
2008
), pp.
343
381
.
14.
M.
Rovere
,
D. W.
Heermann
, and
K.
Binder
,
J. Phys.: Condens. Matter
2
,
7009
(
1990
).
15.
T. L.
Hill
,
Thermodynamics of Small Systems
(
Dover
,
1963
);
J. L.
Lebowitz
and
J. K.
Percus
,
Phys. Rev.
124
,
1673
(
1961
).
16.
S. K.
Schnell
,
T. J.
Vlugt
,
J.-M.
Simon
,
D.
Bedeaux
, and
S.
Kjelstrup
,
Chem. Phys. Lett.
504
,
199
(
2011
).
17.
P.
Krüger
,
S. K.
Schnell
,
D.
Bedeaux
,
S.
Kjelstrup
,
T. J. H.
Vlugt
, and
J.-M.
Simon
,
J. Phys. Chem. Lett.
4
,
235
(
2013
);
[PubMed]
P.
Ganguly
and
N. F. A.
van der Vegt
,
J. Chem. Theory Comput.
9
,
1347
(
2013
).
[PubMed]
18.
19.
F. L.
Román
,
J. A.
White
, and
S.
Velasco
,
EPL
42
,
371
(
1998
);
20.
M.
Rovere
,
D. W.
Hermann
, and
K.
Binder
,
EPL
6
,
585
(
1988
);
M.
Rovere
,
P.
Nielaba
, and
K.
Binder
,
Z. Phys. B
90
,
215
(
1993
).
21.
S.
Sengupta
,
P.
Nielaba
,
M.
Rao
, and
K.
Binder
,
Phys. Rev. E
61
,
1072
(
2000
).
22.
J. D.
Halverson
,
T.
Brandes
,
O.
Lenz
,
A.
Arnold
,
S.
Bevc
,
V.
Starchenko
,
K.
Kremer
,
T.
Stuehn
, and
D.
Reith
,
Comput. Phys. Commun.
184
,
1129
(
2013
).
23.
D.
Mukherji
,
N. F. A.
van der Vegt
, and
K.
Kremer
,
J. Chem. Theory Comput.
8
,
3536
(
2012
).
24.
S.
Pronk
,
S.
Páll
,
R.
Schulz
,
P.
Larsson
,
P.
Bjelkmar
,
R.
Apostolov
,
M. R.
Shirts
,
J. C.
Smith
,
P. M.
Kasson
,
D.
van der Spoel
,
B.
Hess
, and
E.
Lindahl
,
Bioinformatics
29
,
845
(
2013
).
25.
H. J. C.
Berendsen
,
J. R.
Grigera
, and
T. P.
Straatsma
,
J. Phys. Chem.
91
,
6269
(
1987
);
L. X.
Dang
and
B. M.
Pettitt
,
J. Phys. Chem.
91
,
3349
(
1987
);
Y.
Wu
,
H. L.
Tepper
, and
G. A.
Voth
,
J. Chem. Phys.
124
,
024503
(
2006
).
[PubMed]