Hybrid quantum mechanics/molecular mechanics (QM/MM) models are successful at describing the properties and reactivity of biological macromolecules. Combining ab initio QM/MM methods and periodic boundary conditions (PBC) is currently the optimal approach for modeling chemical processes in an infinite environment, but frequently, these models are too time-consuming for general applicability to biological systems in a solution. Here, we define a simple and efficient electrostatic embedding QM/MM model in PBC, combining the benefits of electrostatic potential fitted atomic charges and particle-mesh Ewald sums, which can efficiently treat systems of an arbitrary size at a reasonable computational cost. To illustrate this, we apply our scheme to extract the lowest singlet excitation energies from a model for Arabidopsis thaliana cryptochrome 1 containing circa 93 000 atoms, accurately reproducing the experimental absorption maximum.

Embedding methods in quantum chemistry allow a substantial reduction in the overall computational cost by treating a small subsystem of atoms with an accurate theoretical method while treating the rest of the system with a cheaper and often less accurate approach.1 In such embedding schemes, the total energy is computed as the sum of energies of the constituent subsystems and some interaction terms between each fragment.2 One of the most popular embedding methods for treating biological macromolecules is quantum mechanics/molecular mechanics (QM/MM),3 in which the energy is expressed as

E=EQM+EMM+Eint,
(1)

where EQM is the energy of the (small) QM subsystem, EMM is the energy of the (large) MM subsystem, and Eint is the interaction term between them. Usually, the interaction is electrostatic, complemented with other pairwise atom–atom interactions. The majority of ab initio QM/MM methods have been formulated employing an electrostatic Coulomb interaction between the QM and MM subsystems, the complete macromolecular system being in the gas phase.4 The accuracy of such QM embeddings depends on the locality of the property of interest, as well as on parameters such as the size of the QM subsystem or the constraints imposed on the MM atom positions.5 There exist several attempts in the literature to formulate an ab initio QM/MM method for macromolecules surrounded by an extended environment (solvent, membrane, etc.), either using non-periodic continuum models6 or periodic boundary conditions (PBC) employing the Ewald summation technique.7–22 Most QM/MM PBC formulations rely on atomic point charges for efficiently representing the long-range QM–QM interactions such as Mulliken,7,8,11,12,15,19 ChElPG,14,16,21 electrostatic potential (ESP),17 or other types.20,22 Such methods mainly use Ewald pair potentials7,14,16,17,21 or standard Ewald8,10,13,20 or exploit the efficiency of the particle-mesh Ewald (PME) method.23–25 The PME, which is a state-of-the-art algorithm for efficiently calculating long-range interactions in large MM systems, has mainly been implemented for semi-empirical QM/MM methods.11,12,15,19 We are aware of only two recent articles reporting its use in ab initio QM/MM methods.18,22

FIG. 1.

Schematic representation of the electrostatic embedding QM/MM PBC method described in this communication for a chromophore (QM) in water (MM). In the original cell (center), the QM subsystem is represented by a quantum charge density. In the replica cells, the QM atoms are represented as ESPF point charges (green circles). The MM atoms are represented with blue and red point charges. All point charges are used to polarize the QM density in the original cell.

FIG. 1.

Schematic representation of the electrostatic embedding QM/MM PBC method described in this communication for a chromophore (QM) in water (MM). In the original cell (center), the QM subsystem is represented by a quantum charge density. In the replica cells, the QM atoms are represented as ESPF point charges (green circles). The MM atoms are represented with blue and red point charges. All point charges are used to polarize the QM density in the original cell.

Close modal

Here, we combine the advantages of electrostatic potential fitted (ESPF) charges4,26 and PME potentials to formulate an efficient ab initio electrostatic embedding QM/MM + PBC method, defining a unified consistent embedding energy from an interaction Hamiltonian. Our formulation takes full computational advantage of the PME; it reduces the number of integrals to be computed and can be applied to any ab initio self-consistent field method. The definition of the ESPF QM/MM + PBC interaction (see Fig. 1) and the interaction Hamiltonian is based on the pairwise electrostatic interaction energy between the QM and MM subsystems (see the detailed derivation in the supplementary material). The interaction energy in terms of potentials is defined as

Eint=ANQMqAΦAMM+12ANQMqAΦAQM,
(2)

where the first term accounts for the QM–MM interactions and the second term the QM–QM interactions, with the 1/2 factor to avoid double counting. NQM is the number of QM atoms, and qA = ZAQA are their partial charges, defined as the difference between the atomic charge ZA and the electronic charge population QA = μνPμνQA,μν, obtained as the contraction between the quantum density matrix P and an atomic charge operator matrix QA to be defined later on. The notation ΦA is a short-hand notation for the external potential felt at atom position rA, that is, ΦA = Φ(rA). We can obtain an interaction operator by deriving the interaction energy with respect to the density matrix, leading to

hμνint=ANQMQA,μνΦAMM+ΦAQM.
(3)

Up to this point, this is a general formulation for a QM/MM embedding when using charge operators, including QM–QM interactions. The different QM/MM models are then distinguished by defining the MM energy, the interaction energy, and the external electrostatic potential from a classical electrostatic interaction. For example, in QM/MM models of pairwise Coulomb interactions without PBC, the QM/MM procedure is defined given an external electrostatic potential

ΦA=ΦAMM=j=1NMMqjrAj.
(4)

Here, rAj = rArj is the distance between QM particle A and MM particle j. In this case, ΦAQM=0, and therefore, the interaction energy and operator matrix elements can be computed given the potential generated by MM atoms on QM centers.

The use of PBC allows one to account for the long-range interactions and to build a model of the macromolecule interactions with the solvent. However, the introduction of Coulomb interaction with replicas results in a slow and conditionally convergent interaction energy.27 The Ewald sums guarantee a faster convergence of the energy sums over the replica cells. A general Ewald sum formulation in terms of potentials for N interacting charges can be written as

EEw=12αNqαΦαshort+Φαlong+Φαself.
(5)

Here, Φα has to be understood as the electrostatic potential calculated at the position of atom α in the original cell, that is, Φα = Φ(rα0), where rα0 = rα + 0L, L being the length of the cubic unitary cell. The first term is computed in real space and recovers the short-range part of the electrostatic interaction. The short-range potential is given by

Φαshort=n=0γ=1Nqγrαγnerfc(βrαγn),
(6)

where rαγn = rαrγ + nL. The first summation over n vectors runs over the original box and all the replicas, whereas the sum over γ runs over all the N particles in the system. The prime in the second sum means that we are excluding those terms for which α = γ when n = 0. Finally, the complementary error function, erfc(x) = 1 − erf(x), ensures the range separation and the convergence control through parameter β. The second term in Eq. (5) corresponds to the long-range energy. The corresponding long-range potential, computed in the reciprocal space, is defined as

Φαlong=1πVm0eπ2m2β2m2Re2πimrαS(m),
(7)

where R[z] is the real part of z. The summation runs over all the reciprocal space vectors m, using the so-called structure factors S(m), which are defined as

S(m)=γ=1Nqγe2πimrγ.
(8)

The long-range energy includes a spurious interaction of charges with themselves in the original cell, which is removed by subtracting the self-interaction energy, corresponding to the third term in Eq. (5), using the self-interaction potential

Φαself=2βπqα.
(9)

Applying the general Ewald sums formula in the QM/MM framework defined in Eqs. (2) and (3) requires new expressions for the MM energy, the interaction energy, and potential. The purely MM electrostatic energy can be easily obtained by restricting indices α and γ in Eqs. (5)(9) to MM atoms only. The Ewald interaction energy can be obtained by extracting the terms defined in Eq. (2) from Eq. (5), which results in

Eint,Ew=ANQMqAΦAshort,MM+ΦAlong,MM+12ANQMqAΦAshort,QM+ΦAlong,QM+ΦAself,QM,
(10)

recalling that the first term includes the interaction of the QM charges with the MM charges, while the second term includes the interaction of the QM charges in the original cell with the QM charges in the replicas. The Ewald potentials in Eq. (10) need to be defined for such a definition of the interaction. Note that, in contrast to the non-PBC case, electrostatic potentials now include both an MM and a QM part. The MM potential contains short- and long-range contributions, which are defined as

ΦAshort,MM=n=0i=1NMMqirAinerfcβrAin,ΦAlong,MM=1πVm0eπ2m2β2m2Re2πimrASMM(m).
(11)

Here, SMM(m) refers to the structure factor [Eq. (8)], in which the index α is restricted to MM charges. The self-correction is not necessary in the MM potential, since the long-range term involves particles of different nature: one QM, an A, and all the MM ones. The QM potential is given by

ΦAshort,QM=n0B=1NQMqBrABnerfcβrABn,ΦAlong,QM=1πVm0eπ2m2β2m2Re2πimrASQM(m)BANQMqBrAB0erfβrAB0,ΦAself,QM=2βπqA,
(12)

where SQM(m) refers to the structure factor [Eq. (8)], in which the index α is restricted to QM charges. In the short-range potential, the sum n runs only over the replicas since the QM–QM interactions in the original cell are already included in the QM calculation. Similarly, the long-range potential needs to be corrected by subtracting the long-range QM–QM interactions in the original cell.7 

While Ewald summation is useful for handling long-range interactions, its original formulation scales like O(N2), becoming computationally unfeasible for large MM systems. Different approaches have been proposed for reducing the algorithmic complexity.28–31 Here, we make use of the Smooth Particle Mesh Ewald (SPME) method,23–25 featuring a reduced complexity of O(N log N). The main idea behind SPME consists in approximating the structure factor [Eq. (8)] by interpolating the complex exponentials on a numerical grid of B-splines.25 In practice, the QM atomic charges are projected along with the MM atomic charges in the PME grid, from which one can construct the total long-range potential, ΦAlong, which corresponds to the sum of the long-range QM and MM contributions to the potential, besides the correction due to the long-range QM–QM interactions in the original cell, which can be added a posteriori as a correction term with the same expression as in Eq. (12). Thus, no modification of the original SPME algorithm is required.

The electrostatic potentials [Eqs. (11) and (12)] can be used to construct the interaction operator [Eq. (3)] once the charge operator is defined. Here, we define the charge operator via the ESPF method, which allows an efficient mapping of the quantum electron density to a classical point charge description compatible with the most common force fields.32 In the ESPF procedure, the charge operator matrix elements are fitted to QM-only electrostatic integrals computed on a numerical grid constructed around the molecule.26 For obtaining the charge operators used in Eq. (3), a system of linear equations,

AQA,μνrkrA=d3rχμ*(r)1|rrk|χν(r),
(13)

has to be solved. Here, rk are the point coordinates of a Lebedev atom-centered grid defined around the molecule and χ are the atomic orbitals. The charge operator is then corrected to enforce the total charge conservation of the QM subsystem.4 

The presented methodology, combining SPME and ESPF charges, represents a consistent formulation between the energy and the Hamiltonian and an efficient method for computing the ground and excited state energies. To test this, we extract the lowest singlet excitation energies of 43 snapshots of Arabidopsis thaliana cryptochrome 1 (see Fig. 2; computational details are in the supplementary material). On the one hand, the time-dependent density functional theory (TDDFT) has been used on top of the QM/MM PBC ground state Kohn–Sham reference. In this model, we consider that the external potential is fixed in the excited state calculations, and therefore, the only excitation process occurs in the original cell, while the replicas remain in the ground state. No response terms have been added to the TDDFT equations. This approximate model, for which the average excitation energy is around 412 ± 26 nm, is in excellent accordance with the experimental absorption maximum of 420 nm.33 The blueshift can be attributed to the lack of vibronic effects, which are known to be important in the absorption spectra of isoalloxazine.34 On the other hand, we have implemented a restricted open-shell Kohn–Sham (ROKS) model to extract the lowest energy excited state directly from the SCF.35 In this model, the chromophores in both the original cell and the replicas are excited. In this case, the average excitation energy is 518 ± 35 nm, thus underestimated by around 100 nm with respect to the experiments. Of course, this model is limited not only by the fact that excited states are represented by a single configuration, but also by the probably unrealistic situation that the photoexcited protein is surrounded by simultaneously excited proteins in the replicas.

FIG. 2.

(Top) ESPF QM/MM PBC model of Arabidopsis thaliana cryptochrome 1 in a water box. The isoalloxazine, D396, and W400 (62 atoms in total) are treated at the QM level, with the link atoms shown in purple (bottom, left). Excitation energies for several snapshots computed with restricted open-shell Kohn–Sham and time-dependent density functional theories. The mean values (dashed lines) are compared to the experimental values from Ref. 33.

FIG. 2.

(Top) ESPF QM/MM PBC model of Arabidopsis thaliana cryptochrome 1 in a water box. The isoalloxazine, D396, and W400 (62 atoms in total) are treated at the QM level, with the link atoms shown in purple (bottom, left). Excitation energies for several snapshots computed with restricted open-shell Kohn–Sham and time-dependent density functional theories. The mean values (dashed lines) are compared to the experimental values from Ref. 33.

Close modal

In conclusion, we have presented an efficient QM/MM formulation in periodic boundary conditions based on electrostatic potential fitted charges and smooth particle-mesh Ewald sums. It is worth noting that the QM/MM framework defined by Eqs. (2) and (3) is general and extra energy terms are easily included as potential sources (see, for example, the derivation of surface-dipole and the non-neutral cell potentials in the supplementary material). The method scales approximately like O(αNMM1.3) (α = 4 · 10−6, see the supplementary material for further details), opening up the route for a general application of QM/MD simulations in large-sized periodic systems. This will require the computation of analytic energy first derivatives, which we plan to develop in the future.

See the supplementary material for the derivation of Eqs. (2) and (3), the derivation of surface-dipole and non-neutral cells potentials, the description of the QM/MM PBC algorithm, the scaling of the method, and the computational details.

We acknowledge the support from “Agence Nationle de la Recherche” through the project MAPPLE (Grant No. ANR-22-CE29-0014-01). Centre de Calcul Intensif d’Aix-Marseille is acknowledged for granting access to its high-performance computing resources.

The authors have no conflicts to disclose.

Simone Bonfrate: Methodology (equal); Software (equal); Validation (lead); Writing – original draft (equal); Writing – review & editing (equal). Nicolas Ferré: Conceptualization (equal); Methodology (equal); Supervision (equal); Writing – review & editing (equal). Miquel Huix-Rotllant: Conceptualization (equal); Methodology (equal); Software (lead); Supervision (equal); Validation (supporting); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
L. O.
Jones
,
M. A.
Mosquera
,
G. C.
Schatz
, and
M. A.
Ratner
,
J. Am. Chem. Soc.
142
,
3281
(
2020
).
2.
D. G.
Fedorov
,
T.
Nagata
, and
K.
Kitaura
,
Phys. Chem. Chem. Phys.
14
,
7562
(
2012
).
3.
Q.
Cui
,
T.
Pal
, and
L.
Xie
,
J. Phys. Chem. B
125
,
689
(
2021
).
4.
M.
Huix-Rotllant
and
N.
Ferré
,
J. Chem. Theory Comput.
17
,
538
(
2021
).
5.
S.
Gómez
,
T.
Giovannini
, and
C.
Cappelli
, “
Multiple facets of modeling electronic absorption spectra of systems in solution
,”
ACS Phys. Chem. Au
(in press) (
2022
).
6.
E.
Falbo
,
M.
Fusè
,
F.
Lazzari
,
G.
Mancini
, and
V.
Barone
,
J. Chem. Theory Comput.
18
,
6203
6216
(
2022
).
7.
K.
Nam
,
J.
Gao
, and
D. M.
York
,
J. Chem. Theory Comput.
1
,
2
(
2005
).
8.
D.
Riccardi
,
P.
Schaefer
, and
Q.
Cui
,
J. Phys. Chem. B
109
,
17715
17733
(
2005
).
9.
T.
Laino
,
F.
Mohamed
,
A.
Laio
, and
M.
Parrinello
,
J. Chem. Theory Comput.
1
,
1176
(
2005
).
10.
T.
Laino
,
F.
Mohamed
,
A.
Laio
, and
M.
Parrinello
,
J. Chem. Theory Comput.
2
,
1370
(
2006
).
11.
G. d. M.
Seabra
,
R. C.
Walker
,
M.
Elstner
,
D. A.
Case
, and
A. E.
Roitberg
,
J. Phys. Chem. A
111
,
5655
(
2007
).
12.
R. C.
Walker
,
M. F.
Crowley
, and
D. A.
Case
,
J. Comput. Chem.
29
,
1019
(
2008
).
13.
C. F.
Sanz-Navarro
,
R.
Grima
,
A.
García
,
E. A.
Bea
,
A.
Soba
,
J. M.
Cela
, and
P.
Ordejón
,
Theor. Chem. Acc.
128
,
825
(
2011
).
14.
Z. C.
Holden
,
R. M.
Richard
, and
J. M.
Herbert
,
J. Chem. Phys.
139
,
244108
(
2013
).
15.
K.
Nam
,
J. Chem. Theory Comput.
10
,
4175
(
2014
).
16.
Z. C.
Holden
,
R. M.
Richard
, and
J. M.
Herbert
,
J. Chem. Phys.
142
,
059901
(
2015
).
17.
T.
Vasilevskaya
and
W.
Thiel
,
J. Chem. Theory Comput.
12
,
3561
(
2016
).
18.
T. J.
Giese
and
D. M.
York
,
J. Chem. Theory Comput.
12
,
2611
(
2016
).
19.
H.
Nishizawa
and
H.
Okumura
,
J. Comput. Chem.
37
,
2701
(
2016
).
20.
Y.
Kawashima
,
K.
Ishimura
, and
M.
Shiga
,
J. Chem. Phys.
150
,
124103
(
2019
).
21.
Z. C.
Holden
,
B.
Rana
, and
J. M.
Herbert
,
J. Chem. Phys.
150
,
144115
(
2019
).
22.
J. P.
Pederson
and
J. G.
McDaniel
,
J. Chem. Phys.
156
,
174105
(
2022
).
23.
T.
Darden
,
D.
York
, and
L.
Pedersen
,
J. Chem. Phys.
98
,
10089
(
1993
).
24.
U.
Essmann
,
L.
Perera
,
M. L.
Berkowitz
,
T.
Darden
,
H.
Lee
, and
L. G.
Pedersen
,
J. Chem. Phys.
103
,
8577
(
1995
).
25.
C.
Sagui
,
L. G.
Pedersen
, and
T. A.
Darden
,
J. Chem. Phys.
120
,
73
(
2004
).
26.
N.
Ferré
and
J. G.
Ángyán
,
Chem. Phys. Lett.
356
,
331
(
2002
).
27.
28.
A. Y.
Toukmaji
and
J. A.
Board
, Jr.
,
Comput. Phys. Commun.
95
,
73
(
1996
).
29.
D. R.
Wheeler
and
J.
Newman
,
Chem. Phys. Lett.
366
,
537
(
2002
).
30.
Y.
Shan
,
J. L.
Klepeis
,
M. P.
Eastwood
,
R. O.
Dror
, and
D. E.
Shaw
,
J. Chem. Phys.
122
,
054101
(
2005
).
31.
F.
Nestler
,
M.
Pippig
, and
D.
Potts
,
J. Comput. Phys.
285
,
280
(
2015
).
32.
F.
Melaccio
,
M.
Olivucci
,
R.
Lindh
, and
N.
Ferré
,
Int. J. Quantum Chem.
111
,
3339
(
2011
).
33.
M.
Ahmad
,
N.
Grancher
,
M.
Heil
,
R. C.
Black
,
B.
Giovani
,
P.
Galland
, and
D.
Lardemer
,
Plant Physiol.
129
,
774
(
2002
).
34.
K.
Schwinn
,
N.
Ferré
, and
M.
Huix-Rotllant
,
J. Chem. Theory Comput.
16
,
3816
(
2020
).
35.
T.
Kowalczyk
,
T.
Tsuchimochi
,
P.-T.
Chen
,
L.
Top
, and
T.
Van Voorhis
,
J. Chem. Phys.
138
,
164101
(
2013
).

Supplementary Material