Many-body dispersion (MBD) is a powerful framework to treat van der Waals (vdW) dispersion interactions in density-functional theory and related atomistic modeling methods. Several independent implementations of MBD with varying degree of functionality exist across a number of electronic structure codes, which both limits the current users of those codes and complicates dissemination of new variants of MBD. Here, we develop and document libMBD, a library implementation of MBD that is functionally complete, efficient, easy to integrate with any electronic structure code, and already integrated in FHI-aims, DFTB+, VASP, Q-Chem, CASTEP, and Quantum ESPRESSO. libMBD is written in modern Fortran with bindings to C and Python, uses MPI/ScaLAPACK for parallelization, and implements MBD for both finite and periodic systems, with analytical gradients with respect to all input parameters. The computational cost has asymptotic cubic scaling with system size, and evaluation of gradients only changes the prefactor of the scaling law, with libMBD exhibiting strong scaling up to 256 processor cores. Other MBD properties beyond energy and gradients can be calculated with libMBD, such as the charge-density polarization, first-order Coulomb correction, the dielectric function, or the order-by-order expansion of the energy in the dipole interaction. Calculations on supramolecular complexes with MBD-corrected electronic structure methods and a meta-review of previous applications of MBD demonstrate the broad applicability of the libMBD package to treat vdW interactions.

## I. INTRODUCTION

Van der Waals (vdW) dispersion interactions stem from electron correlation induced by the long-range part of the Coulomb force. Since they are attractive under normal circumstances, their importance grows with the characteristic length scale in an atomistic system, making them an often decisive force in large molecular complexes, molecular crystals, nanostructured materials, surface phenomena, and soft matter in general. At the same time, the workhorse electronic structure method for simulating molecules and materials—Kohn–Sham density-functional theory (KS-DFT)—usually neglects vdW interactions in the popular semilocal and hybrid approximations. As a result, a large number of approaches to mitigate this issue have been developed, and some of them are now used routinely when performing DFT calculations.^{1–3}

One such popular approach is the many-body dispersion (MBD) method, which is based on a model Hamiltonian of charged harmonic oscillators that capture the long-range electrodynamic response of an atomic system and yield its vdW energy.^{4} Its essential offering is that the coarse-graining of the full electronic structure to oscillators makes it efficient, while full many-body treatment of the electronic fluctuations provides consistent description of beyond-pairwise vdW interactions.^{5} Both these characteristics make the MBD framework especially apt for large-scale modeling of molecular systems, for which the demand is bound to only increase in the future.^{6} Given the existence of MBD integrations with even more efficient substitutes of DFT based on density-functional tight binding (DFTB)^{7} and machine learning,^{8–11} as well as the fact that MBD is an approach that is still evolving and being improved and extended (see below), the situation calls for an efficient, flexible, and reusable implementation of MBD that can be easily deployed in any given electronic structure code. Here, we develop and document libMBD, a library package that answers this call. This puts MBD alongside other popular approaches to treat vdW dispersion interactions in DFT such as DFT-D4^{12} or the vdW-DF family,^{3} which are already represented by standalone programs^{13} or libraries.^{14}

libMBD^{15} is written in modern Fortran with additional bindings available for C and Python. It uses MPI^{16}/ScaLAPACK^{17} parallelization, and implements MBD for both finite and periodic systems, with analytical gradients of the energy with respect to all input parameters, which enables both structure optimization as well as self-consistent vdW energy with respect to the electron density.^{18} libMBD is designed as a framework for any method based on the MBD Hamiltonian, and has a built-in support for the original MBD formulation,^{4} the range-separated self-consistently screened variant (MBD@rsSCS),^{19} the metal-surface parametrizations (MBD^{surf}),^{20,21} as well as the recent hybrid of MBD with nonlocal functionals (MBD-NL).^{22} For convenience, an efficient implementation of the predecessor of MBD, the pairwise TS model,^{23} is included as well. Finally, libMBD offers not only the vdW energy and its gradients, but can calculate also other quantities derived from the MBD Hamiltonian, namely the eigenstate charge fluctuations and induced charge-density polarization,^{24} first-order Coulomb corrections,^{25} the dielectric function,^{26} and the order-by-order expansion of the energy in the dipole interaction.^{27}

The popularity of MBD as a method is revealed through its independent implementations in numerous electronic structure codes, namely FHI-aims,^{28,29} VASP,^{30,31} Q-Chem,^{32,33} Quantum ESPRESSO,^{34,35} CASTEP,^{36,37} and DFT-D4.^{12,13} The original standalone reference implementation of MBD is available for download via Internet Archive.^{38} While some of these implementations introduced methodological innovations—such as the one in VASP, which introduced a reciprocal-space formulation,^{39} and the one in Quantum ESPRESSO, which introduced analytical gradients^{40}—all of them lack both in computational efficiency and in capability compared to libMBD. The single exception is the linear-scaling stochastic evaluation of MBD,^{41} which is more efficient for very large systems, but for the moment lacks analytical gradients. Furthermore, none of these MBD implementations have been designed as a library and their reuse across different electronic structure codes is problematic in theory, and nonexistent in practice. In contrast, libMBD requires only the quantities that directly depend on the electron density to be evaluated in the host program (since this implementation is usually code-specific), and has been already integrated into several such codes, including FHI-aims, VASP, Q-Chem, Quantum ESPRESSO, DFTB+,^{42} and ASE.^{43,44}

The flexibility of MBD as a general approach can be seen in the ever-growing list of methods and models that are based on it, and for this reason libMBD has been designed from the start as a framework that is easy to extend and modify. Besides the methods listed above that are already incorporated in libMBD, there are the fractionally ionic variant (MBD/FI)^{45} and its successor “universal” MBD (uMBD),^{46} the Wannier-function parameterization (vdW-WF2),^{47} beyond-dipole extension,^{48} anisotropic extension,^{49} extension to excited states,^{50} and the MBD variant of DFT-D4.^{12} If implemented in or with libMBD, these methods and models or any future ones would become instantly available across all the electronic structure codes to which libMBD has already been or will be integrated, and could take advantage of parallelization, analytical gradients, and both finite and periodic formulations. Next to computational efficiency, this is the biggest offering of libMBD.

The rest of the paper is structured as follows: Sec. II reviews all the MBD methods implemented in libMBD in terms of equations, as well as the physics of integrating MBD with DFT and its substitutes. Section III documents the structure of the code and the public interface served to the electronic structure codes, its scaling properties, numerical convergence, functionality, and existing interfaces. Section IV describes a number of example MBD calculations and a meta-review of important past applications of MBD. The paper is concluded in Sec. V.

## II. THEORY

This section reviews all mathematical equations that define most of the functionality of libMBD. It is structured such that equations in Secs. II A–II G are implemented in libMBD, while equations in Sec. II H must be implemented in the codes with which libMBD is integrated, since their implementation is inherently code-specific.

### A. Many-body dispersion

**R**

_{i}, with static polarizabilities

*α*

_{0,i}and frequencies

*ω*

_{i}. In most variants, the oscillators represent atoms, but both finer fragments, such as Wannier functions,

^{47}and coarser fragments, such as whole molecules,

^{51}are also possible. As a result of the coarse-graining, only the long-range part of the interaction between oscillators is expected to represent faithfully the original electronic system, justifying a dipole approximation. The MBD Hamiltonian for a finite system of

*N*oscillators then reads

*m*

_{i}, and

**T**

^{lr}is a damped dipole interaction tensor, parametrized through some single-oscillator properties

*κ*_{i}. The mass is in principle the third parameter (after static polarizability and frequency) required to fully specify a charged quantum oscillator, but it has no effect on the energy under the dipole approximation. Other sets of three independent oscillator parameters can be used as well, for example the

*C*

_{6}dispersion coefficient or charge

*q*, but these are always connected through simple relations, such as

**T**for large distances,

**T**

^{lr}, which necessarily depends on the effective range (degree of non-locality) of the electronic structure method to which MBD is coupled.

**Q**are negative. This can result from a failure of the dipole approximation, which has two possible underlying causes in practice. First, the local polarizability model might be flawed and overestimating the local response, which was commonly occurring with the original parametrization based on Hirshfeld volumes for ionic compounds. This issue can be resolved by using better polarizability models (Sec. II H). Second, the strong assumption about the locality of the charge fluctuations and the resulting aptness of the dipole approximation is inappropriate in metallic systems, in which the delocalized charge fluctuations of the conducting electrons cannot be adequately described by a local oscillator model. Here, only an effective treatment applicable to hybrid organic–inorganic interfaces is possible,

^{21,22}although a recent extension of MBD suggests that a more general solution might be viable.

^{52}

### B. Periodic boundary conditions

**q**from the first Brillouin zone (FBZ) as a good quantum number, and the energy per periodic unit cell can be calculated as an integral over the FBZ, evaluated in practice as a sum over a

**q**-point mesh,

*E*

_{MBD}(

**q**) is obtained from Eqs. (5)–(7) by making $\omega \u0303$,

**T**

^{lr}, and

**C**formally dependent on

**q**and considering

**n**, and skipping the

**n**=

**0**,

*i*=

*j*terms, is evaluated efficiently with Ewald summation,

^{53}

**m**indexes the cells of the reciprocal lattice,

**G**

_{m}is a reciprocal-lattice vector,

*R*

_{c}and

*k*

_{c}are the real-space and reciprocal-space cutoffs, respectively,

*γ*is the Ewald range-separation parameter (see Sec. III C), Ω is the unit-cell volume, and

**T**

^{erfc}is the short-range part of the bare dipole tensor,

^{54,55}which does not contribute to the energy (in practice the

**q**-point mesh avoids the Γ-point) or most other physical observables, and is stated here only for completeness.

### C. Self-consistent screening

**E**

^{ext}, a polarizable system partially (or fully) screens the field through induced polarization that is self-consistent with the total field. For charged harmonic oscillators interacting via some dipole interaction tensor

**T**

^{scs}, this reads

*i*-th oscillator induced by an external field on the

*j*-th oscillator.

*C*

_{6}coefficients

*C*

_{6}coefficients and oscillator frequencies.

### D. Analytical gradients

Analytical formulas for the gradients of the MBD energy with respect to all Hamiltonian input parameters can be obtained straightforwardly by application of the derivative chain rule. The two steps related to matrix diagonalization in Eq. (6) and matrix inversion and contraction in Eqs. (14) and (15), which are nontrivial, and for which naive application of the chain rule could result in inefficient evaluation of the gradients are explicitly stated below. When implemented, these formulas can be straightforwardly parallelized and result in a computational cost for the gradients that scales with the same power of system size as the energy evaluation.

*X*

_{i}is

*X*

_{i}is

### E. MBD properties

*α*

_{0,i}and

*ω*

_{i}, are needed to get the MBD energy, a third parameter, such as

*m*

_{i}or

*q*

_{i}, is needed to evaluate the wave function in real space, in order to convert from

*ξ*_{i}to

**r**

_{i}. The wave function of the non-interacting oscillators is simply

^{24}

^{,}

^{25}

**T**

^{lr},

^{27}

*α*T^{lr}can be replaced with the Hermitian

*α*^{1/2}

**T**

^{lr}

*α*^{1/2}since both are equivalent under the trace operator.

*x*

_{k}of

**(i**

*α**u*)

**T**

^{lr}can be heuristically renormalized to obtain a real-valued MBD energy even when the exact diagonalization gives a complex-valued energy due to the dipole collapse.

^{45}In particular, one replaces

*ɛ*_{M}of a periodic system of charged oscillators, which can be calculated via SCS from $\alpha \u0304$ as

### F. Pairwise dispersion

*f*is a damping function (

*f*→ 1 when

*R*

_{ij}→ ∞) and is typically specified directly rather than through

**T**

^{lr}. This

*ansatz*has been used in numerous vdW models, among them the predecessor of MBD, the TS method.

^{23}

^{57}

### G. MBD methods

The general MBD framework discussed up to this point is concretized into an MBD method by specifying at least two aspects. First, how are the MBD oscillators parametrized (*α*_{0}, *ω*). Second, how is **T**^{lr} chosen to smoothly integrate the long-range correlation from MBD with a given method that accounts for the short-range correlation, typically KS-DFT or its substitute. While independent to a certain degree, these two choices are ultimately related in practice.

#### 1. TS

^{23}each atom is mapped to a single oscillator parametrized through scaling of reference free-atom vdW parameters based on atomic volumes,

*V*

_{i}is some measure of a volume of an atom in a molecule or material and $Viref$ is the same measure for a corresponding free atom. Possible choices for the volume measure are discussed in Sec. II H.

*R*

^{vdW}is used as the damping function in Eq. (28),

*s*

_{R}adjusts the onset of the vdW interaction and is optimized separately for each individual short-range correlation model, and

*a*= 20.

#### 2. MBD@rsSCS

^{19}is the many-body extension of the TS method, in which the oscillators interact through the dipole tensor derived from the full Coulomb interaction of two oscillator (Gaussian) charge densities with widths $\sigma i2$,

^{58}

**T**

^{GG}is range-separated, and the short-range part is used in SCS to refine the TS oscillator parameters from Eq. (30), while the long-range part is used in the MBD Hamiltonian,

*g*is a damping function of the same form as

*f*in Eq. (31), but with

*a*= 6 and

*s*

_{R}usually denoted as

*β*.

#### 3. MBD-NL

^{22}removes the need for SCS, and the MBD Hamiltonian is parametrized directly with

*ω*

^{df}) for an atom in a molecule or material and the corresponding free atom (see Sec. II H). MBD-NL uses the same parametrization of

**T**

^{lr}in Eq. (35) as MBD@rsSCS, but with vdW radii in Eq. (31) replaced with

### H. Integration with short-range models

^{59}Second, the short-range model is used to parametrize the MBD Hamiltonian, that is, to obtain the vdW parameters of each oscillator. How this is done depends on a particular model and the various options are described below. In the context of libMBD, the following equations must be necessarily evaluated in the code that implements the short-range model and couples to libMBD.

#### 1. Hirshfeld volumes

**r**= 0. A straightforward extension is to use not only neutral free atoms, but also free ions

^{60}for the density partitioning, which improves description of ionic compounds. This was further extended to use not only ion densities, but also ionic reference vdW parameters by using a piecewise linear dependence of atomic polarizabilities on the charge.

^{45}This results in atomic

*α*

_{i}(i

*u*) that cannot be expressed through a single oscillator with Eq. (17), requiring either direct use of Eq. (24) or calculation of effective oscillator frequencies via Eq. (16).

#### 2. MBD-NL

^{61}is coarse-grained via Hirshfeld partitioning,

*g*,

^{22}[Eq. (7)] which is a function of the local ionization potential

*I*and the iso-orbital indicator

*χ*, which in turn are functions of the electron density, its gradient, and the kinetic energy density, quantities readily available in any code that implements meta-GGA functionals.

#### 3. Charge population analysis

*et al.*

^{7}found that a correlation between effective atomic polarizability of an atom in a molecule can be established based not only on the effective volume of an atom, but also on measures derived from the density matrix represented in a local atomic orbital basis set, $|\psi a\u3009=\u2211\nu c\nu a|\varphi \nu \u3009$, where single-electron states |

*ψ*

_{a}⟩ are expanded in atomic orbital basis functions |

*ϕ*

_{ν}⟩. When scaling free-atom vdW parameters [Eq. (30)] with the degree of hybridization between atoms as measured by the sum over the on-site component of the Mulliken charges (or equivalently the atom-projected trace over the density matrix) and the atomic charge

*Z*

_{i},

#### 4. Machine-learned interatomic potentials

Several works have shown that the effective atoms-in-molecules volume scaling ratios as a function of the atomic positions can be accurately represented by machine-learning regression.^{8–10} Once a continuous representation of the scaling ratio is constructed, an MBD description of the long-range correlation energy can be straightforwardly coupled to short-range force fields or atomistic machine-learned potentials. In cases where these potentials are directly trained on semilocal approximations to DFT, the choice of the range-separation parameter remains identical to the optimal choice for the underlying density-functional approximation.

## III. IMPLEMENTATION

### A. Code structure and API

libMBD implements all equations in Secs. II A–II G in Fortran 2008. To enable straightforward and rapid development of new variants and extensions of MBD, libMBD has been designed from the start to maintain a close correspondence between code and physics (Fig. 1). Equations are documented in source code with LATEX right next to the individual procedures that implement them and rendered in auto-generated documentation (https://libmbd.github.io). Gradients of the energy are computed in a forward mode, where each individual function that computes a value from its arguments also optionally returns the partial derivatives of the output value with respect to some (or all) of its input arguments. Parallelization is available either by trivial distribution of the **q**-point mesh [Eq. (8)] or by distribution of atom-indexed matrices (**T**, **Q**, $\alpha \u0304$) via BLACS.^{62} Linear-algebra operations are handled via LAPACK,^{63} ScaLAPACK,^{17} or ELPA^{64,65} interfaced through ELSI.^{66,67} libMBD can be used natively from Fortran, as a plain binary library via C API made by the iso_c_binding Fortran module, or from Python via a C extension that internally binds to the C API. The Fortran API (Listing 1) is based on two derived types, mbd_input_t and mbd_calc_t, where the former is a plain data structure of all input parameters used to initialize the latter, which is an opaque handle that provides access to computation and results. The Python API (Listing 2) is based on the MBDGeom class that represents a given geometry and provides access to computation via its methods.

libMBD’s Fortran API minimal example.

use mbd, only: mbd_input_t, mbd_calc_t |

use iso_fortran_env, only: real64 |

type(mbd_input_t) :: inp |

type(mbd_calc_t) :: calc |

real(real64) :: energy, gradients(3, 2) |

integer :: code |

character(200) :: origin, msg |

inp%atom_types = ['Ar', 'Ar'] |

inp%coords = reshape([0d0, 0d0, 0d0, 0d0, 0d0, 7.5d0], [3, 2]) |

inp%method = 'mbd-rsscs' |

inp%xc = 'pbe' |

call calc%init(inp) |

call calc%get_exception(code, origin, msg) |

if (code > 0) then |

print *, msg |

stop 1 |

end if |

call calc%update_vdw_params_from_ratios([0.98d0, 0.98d0]) |

call calc%evaluate_vdw_method(energy) |

call calc%get_gradients(gradients) |

call calc%destroy() |

use mbd, only: mbd_input_t, mbd_calc_t |

use iso_fortran_env, only: real64 |

type(mbd_input_t) :: inp |

type(mbd_calc_t) :: calc |

real(real64) :: energy, gradients(3, 2) |

integer :: code |

character(200) :: origin, msg |

inp%atom_types = ['Ar', 'Ar'] |

inp%coords = reshape([0d0, 0d0, 0d0, 0d0, 0d0, 7.5d0], [3, 2]) |

inp%method = 'mbd-rsscs' |

inp%xc = 'pbe' |

call calc%init(inp) |

call calc%get_exception(code, origin, msg) |

if (code > 0) then |

print *, msg |

stop 1 |

end if |

call calc%update_vdw_params_from_ratios([0.98d0, 0.98d0]) |

call calc%evaluate_vdw_method(energy) |

call calc%get_gradients(gradients) |

call calc%destroy() |

libMBD’s Python API minimal example.

from pymbd.fortran import MBDGeom |

geom = MBDGeom([(0, 0, 0), (0, 0, 7.5)]) |

energy = geom.mbd_energy_species( |

[’Ar’, ’Ar’], [0.98, 0.98], beta=0.83 |

) |

from pymbd.fortran import MBDGeom |

geom = MBDGeom([(0, 0, 0), (0, 0, 7.5)]) |

energy = geom.mbd_energy_species( |

[’Ar’, ’Ar’], [0.98, 0.98], beta=0.83 |

) |

80% of the code base of libMBD is covered by tests, which include both unit tests of analytical gradients of individual procedures as well as regression tests of the total energies and gradients. libMBD is built with CMake.^{68}

### B. Scalability

For libMBD to be practical, it needs to keep up with the codes that implement the corresponding short-range models (DFT and its substitutes) in terms of scalability to large systems, and scalability to a large number of processors. libMBD uses only process parallelization through the MPI and BLACS, and no thread parallelization.

Tested on various supercells of crystalline urethane (unit cell with 26 atoms), with or without periodic boundary conditions, the MBD method and its implementation in libMBD scale with the number of atoms, *N*, as follows (Fig. 2). For small *N*, the computational cost is dominated by the construction of the various matrices and as such grows quadratically with *N*. At around a few hundred atoms (finite system) or a few thousand atoms (periodic system), the matrix operations that scale cubically with *N* (multiplication, diagonalization, inversion) start to dominate. The MBD analytical gradients as implemented in Ref. 40 add one extra factor of *N* to the scaling (making it quartic) due to explicit evaluation of the gradient of $\alpha \u0304$ [see Eqs. (44) and (45) therein]. libMBD avoids this by calculating only the gradient of the fused evaluation of $\alpha \u0304$ and its contraction to atoms [Eq. (19)], and as a result calculation of the gradients changes only the prefactor in the scaling behavior.

Tested on a 5 × 5 × 5 supercell of urethane (3250 atoms), libMBD exhibits good strong scaling with the number of processor cores *n* (Fig. 3). The construction of the various matrices (**T**, **Q**, $\alpha \u0304$) is trivially parallelizable and has perfect strong scaling up to *n* = 4096. The evaluation of the MBD gradients requires only very little inter-process communication and shows similarly good behavior. The distributed matrix inversion (SCS) and diagonalization (MBD) as provided by ScaLAPACK and ELPA exhibits strong scaling up to *n* = 256 on our particular setup, at which point the scaling plateaus. The evaluation of SCS gradients requires only matrix multiplications and contractions, but its current simple implementation in libMBD requires an unnecessary amount of inter-process communication, making the strong scaling plateau already at *n* = 64 for finite systems. Note that this slight inefficiency still allows the calculation of the MBD energy and gradients in ∼100 s at *n* = 256, and furthermore it affects only the MBD@rsSCS method, not the newer MBD-NL method which does not use SCS. Nevertheless if better strong scaling is required in the future, the inter-process communication in the evaluation of the SCS gradients can be massively optimized.

### C. Convergence and default parameters

There are a few critical parameters that determine the numerical accuracy of the calculated MBD energy with respect to the exact energy as defined in Secs. II A–II G. The SCS is evaluated and integrated on an imaginary-frequency grid [Eq. (16)], for which libMBD uses a Gauss–Legendre quadrature transformed from [−1, 1] to [0, ∞] with *x* → 0.6 · (1 + *x*)/(1 − *x*). The MBD energy converges exponentially with the grid size, and using the default 15 grid points, it is converged to relative error of 10^{−8} [Fig. 4(a)]. The frequency grid size is the only parameter for calculations on finite systems. For periodic systems, there are additional parameters related to converging the infinite lattice sums involved.

The MBD energy per unit cell of a periodic system is calculated by integration over the FBZ [Eq. (8)]. libMBD uses a uniform quadrature grid in the reciprocal space, offset from the origin since the MBD energy has a removable singularity at the Γ-point (see Sec. II B). The convergence of the MBD energy with the **q**-point grid size is polynomial and system-dependent [see Fig. 4(b) for example convergence on urethane], and as such the grid needs to be specified by the user.

*γ*, and the real- and reciprocal-space cutoffs

*R*

_{c},

*k*

_{c}. libMBD uses the following default prescription for the parameters,

^{−10}[Fig. 4(c)]. The range-separation parameter does not affect accuracy, but through its determination of the cutoffs affects computational cost. The default value in libMBD ensures an optimal balance between the costs of the real- and reciprocal-space parts [Fig. 4(d)].

### D. Functionality

libMBD implements all equations in Secs. II A–II G, with all physical quantities accessible via the Python bindings (Table I). Namely, this includes the calculation of MBD and TS energies, evaluation of SCS, evaluation of the MBD energy via the imaginary-frequency integration [Eq. (24)], access to MBD eigenstates, and evaluation of the MBD polarization density and Coulomb correction. All energy evaluations are implemented for both finite and periodic systems, parallelized, and with available analytical gradients (except for the imaginary-frequency integration). Evaluation of MBD properties such as the Coulomb correction or density polarization is currently implemented only for finite systems and is not parallelized.

. | Python bindings . | Periodic . | Parallel . | Gradientsa . |
---|---|---|---|---|

MBD energy | ✓ | ✓ | ✓ | ✓ |

SCS | ✓ | ✓ | ✓ | ✓ |

TS energy | ✓ | ✓ | ✓ | ✓ |

MBD energy via ∫du | ✓ | ✓ | ✓ | |

MBD eigenstates | ✓ | ✓ | ✓ | |

MBD pol. density | ✓ | |||

MBD Coulomb corr. | ✓ |

. | Python bindings . | Periodic . | Parallel . | Gradientsa . |
---|---|---|---|---|

MBD energy | ✓ | ✓ | ✓ | ✓ |

SCS | ✓ | ✓ | ✓ | ✓ |

TS energy | ✓ | ✓ | ✓ | ✓ |

MBD energy via ∫du | ✓ | ✓ | ✓ | |

MBD eigenstates | ✓ | ✓ | ✓ | |

MBD pol. density | ✓ | |||

MBD Coulomb corr. | ✓ |

^{a}

Includes forces, stress tensor, and energy gradients with respect to the vdW parameters (for self-consistence).

### E. Existing interfaces

libMBD was originally developed alongside FHI-aims^{28,29} and was tightly integrated into it. Due to specific build requirements of FHI-aims, libMBD is currently distributed with FHI-aims in a lightly pre-processed form. The current API of libMBD was then based on the preexisting framework for interfacing external libraries in DFTB+,^{42,69} which was the first third-party program in which libMBD was integrated. As a result, the DFTB+ integration is very lightweight and can serve as a model for interfaces with other programs. libMBD has been since integrated into Quantum ESPRESSO,^{34,35} VASP,^{31} and Q-Chem^{32,33} without having to change anything in the existing source code, demonstrating its universality.

While libMBD is typically integrated into a larger electronic structure program as part of a DFT or similar calculation, this is not the only use case. Especially when developing new methods or when investigating unexpected behavior of existing methods, it is practical to be able to evaluate MBD separately from DFT, and to be able to do so as flexibly as possible. For this reason, libMBD was from the start developed together with a Python interface called pyMBD. Beyond using pyMBD directly (Listing 2), it can also serve as a bridge between libMBD and other Python programs. One such existing example is the interface to ASE.^{9,43}

## IV. APPLICATIONS

To demonstrate the functionality of libMBD, here we report on novel calculations done on supramolecular complexes and review selected previous applications of MBD.

### A. Interaction energies

We apply the MBD scheme via the libMBD interface within the FHI-aims^{29} software package to benchmark the interaction energies of structures in the L7 dataset^{70} against existing data in the literature. We compare the interaction energies of the L7 complexes as calculated with MBD against higher-level theories, in particular diffusion Monte Carlo (DMC)^{71,72} data from Refs. 73 and 74 and local natural orbital coupled cluster with single, double, and perturbative triple excitations [LNO-CCSD(T)]^{75,76} data from Ref. 73. Both DMC and LNO-CCSD(T) have been shown to accurately predict intermolecular interactions of small organic molecules. We also compare MBD-calculated interaction energies against DFT-D3^{77} data from Ref. 78 and DFT-D4 data from Refs. 12 and 79. Interaction energies were calculated as the difference between the total energy of the structure and the sum of the monomer energies.

DFT calculations were performed with the FHI-aims software package, using the PBE^{80} and HSE06^{81,82} density-functional approximations. The standard screening parameter of 0.11 /*a*_{0}^{−1} was used for HSE06, and the numeric atomic orbitals were represented using a “tight” basis set.^{29,83} The total energy, sum of eigenvalues, and charge density criteria were set to 1 × 10^{−6} eV, 1 × 10^{−2} eV, and 1 × 10^{−5} eV/*a*_{0}^{−3}, respectively. Input and output files for all DFT+MBD calculations are available as a dataset in the NOMAD electronic structure data repository.^{84}

The L7 dataset^{70} was chosen as it contains non-covalent complexes considerably larger than other datasets such as S22^{85} or S66.^{86,87} The L7 structures [Figs. 5(a)–5(g)] comprise intermolecular complexes between 48 and 112 atoms in size and are mostly dispersion-dominated, which makes them not only a good representation of biochemical structures but also provides a suitable test case to assess the accuracy of MBD for medium to large complexes involving π–π stacking, electrostatic interactions, and hydrogen bonding.^{73} The L7 dataset contains a parallel-displaced π–π-stacked coronene dimer (C2C2PD), a π–π-stacked coronene⋯adenine dimer (C3A), a π–π-stacked circumcoronene⋯Watson–Crick hydrogen-bonded guanine-cytosine dimer (C3GC), an octodecane dimer in stacked parallel conformation (CBH), a π–π-stacked Watson–Crick hydrogen-bonded guanine-cytosine dimer (GCGC), a stacked guanine trimer (GGG), and a phenylalanine residues trimer in mixed hydrogen-bonded-stacked conformation (PHE). In addition, a larger system of a C_{60} buckyball inside a [6]-cycloparaphenyleneacetylene ring (C_{60}@[6]CPPA), symmetrized to the *D*_{3d} point group and comprising 132 atoms, was considered [Fig. 5(h)]. The C_{60}@[6]CPPA complex has a large polarizability^{88} which gives rise to considerable dispersion interactions, and the confinement between the ring and the buckyball can result in non-trivial long-range repulsive interactions.^{25,89}

DFT+MBD performs fairly well for the supramolecular complexes with respect to the higher-level LNO-CCSD(T) and DMC methods (Table II). For the majority of L7 complexes (C3A, C3GC, GCGC, GGG, and PHE), DFT+MBD is within chemical accuracy (±1 kcal mol^{−1}) of LNO-CCSD(T) and/or DMC. For complexes where the difference between dispersion-corrected DFT and higher-level interaction energies is greater than 1 kcal mol^{−1}, both PBE+MBD and HSE06+MBD generally outperform PBE+D3 with respect to the higher-level methods, as can be seen for both C2C2PD and CBH.

Complex . | LNO-CCSD(T)a^{,}b
. | FN-DMCa^{,}c
. | DMCd . | PBE+D3e . | PBE0+D4a . | PBE+MBD . | PBE0+MBDa . | HSE06+MBD . |
---|---|---|---|---|---|---|---|---|

C2C2PD | −20.6(6) | −18.1(8) | −17.5(7) | −15.78 | −21.15 | −16.57 | −17.04 | −17.34 |

C3A | −16.5(8) | −15.0(1.0) | −16.6(9) | −13.58 | −16.46 | −13.43 | −13.86 | −14.05 |

C3GC | −28.7(1.0) | −24.2(1.3) | −25.1(9) | −22.87 | −27.60 | −22.80 | −23.48 | −23.88 |

CBH | −11.0(2) | −11.4(8) | −10.9(8) | −14.16 | −11.68 | −13.95 | −12.70 | −12.95 |

GCGC | −13.6(4) | −12.4(8) | −10.6(6) | −11.75 | −15.54 | −11.74 | −11.61 | −11.78 |

GGG | −2.1(2) | −1.5(6) | −2.0(4) | −1.39 | −2.10 | −1.66 | −1.27 | −1.31 |

PHE | −25.4(2) | −26.5(1.3) | −24.9(0.6) | −25.00 | −26.96 | −26.56 | −25.83 | −27.65 |

C_{60}@[6]CPPA | −41.7(1.7) | −31.1(1.4) | ⋯ | ⋯ | −39.47 | −29.59 | −32.14 | −33.89 |

Complex . | LNO-CCSD(T)a^{,}b
. | FN-DMCa^{,}c
. | DMCd . | PBE+D3e . | PBE0+D4a . | PBE+MBD . | PBE0+MBDa . | HSE06+MBD . |
---|---|---|---|---|---|---|---|---|

C2C2PD | −20.6(6) | −18.1(8) | −17.5(7) | −15.78 | −21.15 | −16.57 | −17.04 | −17.34 |

C3A | −16.5(8) | −15.0(1.0) | −16.6(9) | −13.58 | −16.46 | −13.43 | −13.86 | −14.05 |

C3GC | −28.7(1.0) | −24.2(1.3) | −25.1(9) | −22.87 | −27.60 | −22.80 | −23.48 | −23.88 |

CBH | −11.0(2) | −11.4(8) | −10.9(8) | −14.16 | −11.68 | −13.95 | −12.70 | −12.95 |

GCGC | −13.6(4) | −12.4(8) | −10.6(6) | −11.75 | −15.54 | −11.74 | −11.61 | −11.78 |

GGG | −2.1(2) | −1.5(6) | −2.0(4) | −1.39 | −2.10 | −1.66 | −1.27 | −1.31 |

PHE | −25.4(2) | −26.5(1.3) | −24.9(0.6) | −25.00 | −26.96 | −26.56 | −25.83 | −27.65 |

C_{60}@[6]CPPA | −41.7(1.7) | −31.1(1.4) | ⋯ | ⋯ | −39.47 | −29.59 | −32.14 | −33.89 |

^{a}

Data from Ref. 73.

^{b}

The indicated errors are extrapolated from the convergence of basis sets and local approximations.

^{c}

The indicated errors account for the stochastic uncertainty of the estimation and identify a 95% confidence interval (i.e. two standard deviations).

^{d}

Data from Ref. 74.

^{e}

Data from Ref. 78.

LNO-CCSD(T) and FN-DMC are in close agreement for five of the L7 structures (Table II). However, their interaction energies differ by 1.1 kcal mol^{−1} for C2C2PD, 2.2 kcal mol^{−1} for C3GC, and 7.6 kcal mol^{−1} for the C_{60}@[6]CPPA complex. Interestingly, PBE0+D4 is in close agreement with LNO-CCSD(T) across all complexes, including the three aforementioned structures, with a mean absolute deviation of 1.1 kcal mol^{−1}. In contrast, all the DFT+MBD approaches (PBE+MBD, PBE0+MBD, and HSE06+MBD) are much closer to FN-DMC, especially for the C_{60}@[6]CPPA complex. The differences between LNO-CCSD(T) and DFT+MBD, and between FN-DMC and DFT+D for the C2C2PD, C3GC and C_{60}@[6]CPPA complexes makes it difficult to evaluate a reliable reference interaction energy for these three complexes. This disparity should motivate the continued development of dispersion-correction methods in order to better characterize the interactions within dispersion-dominated complexes.

### B. MBD properties

The MBD method is based on an effective Hamiltonian and as such provides access to more than just energies (Sec. II E). The coupled zero-point oscillations of the electron density and the density polarization have been used to reinterpret *π*–*π* stacking.^{24,90} Both can be readily calculated with libMBD using the pyMBD interface.

The C3A dimer from the L7 dataset exhibits typical behavior of these quantities in a *π*–*π* complex. The low-energy zero-point MBD oscillations display long-range order spanning the whole complex, and corresponding to simple electrostatic patterns such as a dipole moment across a whole molecule interacting with an opposite dipole moment on the other molecule [Fig. 6(a)]. The vdW interaction in general polarizes the electron density out-of-plane in *π*–*π* complexes. The C3A dimer demonstrates that while the coupled oscillations are completely delocalized even in the case of differently sized interaction partners, the density polarization is mostly localized in the overlap region between the two molecules [Fig. 6(b)].

### C. Previous applications of MBD

The MBD scheme has been used to identify chemical and thermodynamic trends across a broad range of materials and complexes, and has often succeeded even when other vdW-inclusive methods have failed.^{91,92}

On many occasions, MBD has been able to accurately predict the thermodynamic stability of polymorphs often exhibited by molecular crystals, which are typically governed by non-covalent interactions.^{91,92} The ability to correctly identify the energetic rankings of such vdW-bound systems has applications in numerous fields such as pharmacy and organic electronics.^{91,93} One well-studied case is oxalic acid,^{91} for which many vdW-inclusive methods do not correctly predict the relative stability of its two polymorphs, whereas PBE+MBD agrees with experimental results.^{94} Another example is coumarin, where the inclusion of beyond-pairwise interactions significantly improves the predicted energy ranking of crystalline polymorphs compared to the pairwise TS method.^{95} Furthermore, the explicit accounting for many-body interactions elucidates the relative prevalence of a particular aspirin polymorph (“form 1”) over the other in nature, whereas most other electronic structure methods (both with and without vdW corrections) predict both polymorphs to be energetically degenerate.^{96}

MBD has also been shown to accurately calculate interaction energies for supramolecular host–guest complexes.^{91} In particular, DMC calculations show that the binding energies of the C_{70}-fullerene to [10]- and [11]-cycloparaphenylene are degenerate (within DMC uncertainty).^{24} Only the explicit inclusion of many-body interactions alongside DFT correctly predicts this degeneracy, whereas pairwise or two- and three-body vdW models show a clear preference for the 10-membered ring.^{24,97}

Other applications that the MBD approach has been used for include layered materials^{39} and molecular switches.^{98} In the former case, the heat of adsorption of toluene on graphene, as calculated using PBE+MBD, was found to be in good agreement with experiment.^{39} Furthermore, the balance between Pauli repulsion, chemical binding, electrostatic interactions, and MBD forces was observed to be critical for achieving molecular switches between the two bistable configurations of gold hexamers on single-walled carbon nanotubes, while the many-body effects were controlled by the former two interactions.^{98}

Many-body dispersion effects have also been shown to be important for the description of hybrid organic–inorganic interfaces where MBD clearly outperforms additive pairwise dispersion schemes. For example, MBD was shown to improve the adsorption energetics and geometries of adsorbates such as xenon, graphene, 3,4,9,10-perylene-tetracarboxylic dianhydride (PTCDA), and 7,7,8,8-tetracyanoquinodimethane (TCNQ) on silver surfaces, as they reduce the overbinding typically found in pairwise additive dispersion-correction schemes.^{99,100} Recently, PBE+MBD-NL provided an accurate prediction of the energetic ranking and the adsorption height of different adsorption phases of TCNQ on Ag(100) when compared to x-ray standing wave measurements.^{101} MBD also permits changes in the anisotropic polarizability tensor, the description of adsorbate vibrations and adsorbate-surface interaction screening to be captured.^{99} Furthermore, MBD interactions were shown to be a key factor behind the stabilization of a PTCDA molecule being brought to a standing configuration on Ag(111) surfaces using a scanning probe microscopy tip.^{102} At a temperature of 5 K, the MBD-calculated lifetime of the standing configuration was calculated to be 1.7 × 10^{5} s, which was consistent with experimental data; however, the vdW^{surf}-calculated lifetime was evaluated to be only 0.1 s.^{102} In addition, the MBD-calculated energy barrier heights were around 170% larger than the vdW^{surf}-calculated barrier heights.^{102} This is because the screening of long-range vdW interactions via non-additive many-body interactions reduces the molecule-surface attraction, which ultimately stabilizes the molecule in the upright geometry due to higher-order contributions to the vdW energy contained in MBD that counteract the pairwise contributions within vdW^{surf}.

## V. CONCLUSIONS

libMBD is a mature, efficient, yet flexible library that offers all the functionality one might need to model vdW interactions in molecules and materials with MBD. It can be used to enhance any electronic structure code that can interface with Fortran, C, or Python external libraries with a powerful implementation of the MBD and TS methods, provided the said code can calculate the Hirshfeld volumes or evaluate the VV polarizability functional (for MBD-NL). With the recent advent of machine learning models for Hirshfeld volumes and atomic polarizabilities in general, it is even possible to calculate MBD energies with libMBD without running a single electronic structure calculation. libMBD is hosted and developed on GitHub under the Mozilla Public License 2.0, is distributed with the ESL Bundle,^{103,104} and can be easily installed from conda-forge^{105} and PyPI.^{106} We hope that libMBD either standalone or embedded in a growing number of third-party programs will equip researchers with a new computational tool to study molecules and materials.

## ACKNOWLEDGMENTS

We thank R. A. DiStasio Jr., A. Ambrosetti, and T. Markovich for countless discussions about MBD, V. Gobre for writing the original FHI-aims implementation of MBD, and F. Knoop for correcting mistakes in the equations in the early version of the manuscript. Funding is acknowledged from the German Ministry for Education and Research (Berlin Institute for the Foundations of Learning and Data, BIFOLD), the EPSRC Centre for Doctoral Training in Diamond Science and Technology (Grant No. EP/L015315/1), the Research Development Fund of the University of Warwick, the UKRI Future Leaders Fellowship programme (Grant No. MR/S016023/1), the European Research Council (ERC-AdG Grant No. FITMOL), and the Luxembourg National Research Fund (FNR-CORE Grant BroadApp No. C20/MS/14769845). We are grateful for computing resources from the Scientific Computing Research Technology Platform of the University of Warwick, the EPSRC-funded High-End Computing Materials Chemistry Consortium (Grant No. EP/R029431/1) for access to the ARCHER2 UK National Supercomputing Service (https://www.archer2.ac.uk), the EPSRC-funded UKCP Consortium (Grant No. EP/P022561/1), and the partially EPSRC-funded UK Materials and Molecular Modelling Hub (Grant Nos. EP/P020194 and EP/T022213) for access to Young.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Jan Hermann**: Conceptualization (equal); Software (equal); Writing – original draft (equal). **Martin Stöhr**: Software (supporting); Writing – original draft (supporting). **Szabolcs Góger**: Software (supporting). **Shayantan Chaudhuri**: Data curation (supporting); Visualization (supporting); Writing – original draft (supporting). **Bálint Aradi**: Conceptualization (supporting); Methodology (supporting). **Reinhard J. Maurer**: Conceptualization (equal); Funding acquisition (equal); Software (supporting); Supervision (equal); Writing – review & editing (equal). **Alexandre Tkatchenko**: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are openly available in NOMAD, at http://doi.org/10.17172/NOMAD/2023.06.16-2.