Semi-empirical quantum chemical approaches are known to compromise accuracy for the feasibility of calculations on huge molecules. However, the need for ultrafast calculations in interactive quantum mechanical studies, high-throughput virtual screening, and data-driven machine learning has shifted the emphasis toward calculation runtimes recently. This comes with new constraints for the software implementation as many fast calculations would suffer from a large overhead of the manual setup and other procedures that are comparatively fast when studying a single molecular structure, but which become prohibitively slow for high-throughput demands. In this work, we discuss the effect of various well-established semi-empirical approximations on calculation speed and relate this to data transfer rates from the raw-data source computer to the results of the visualization front end. For the former, we consider desktop computers, local high performance computing, and remote cloud services in order to elucidate the effect on interactive calculations, for web and cloud interfaces in local applications, and in world-wide interactive virtual sessions. The models discussed in this work have been implemented into our open-source software SCINE Sparrow.

## I. INTRODUCTION

Semi-empirical methods have originated as feasible quantum chemical approaches at times when computer hardware was struggling with advanced *ab initio* methods (and even with Hartree–Fock calculations). The advancement of modern computer hardware in the past three decades has not rendered semi-empirical approaches superfluous, but instead pushed the limits of molecular size that can be considered in a single electronic structure calculation (not requiring any embedding techniques, which would introduce further approximations of different flavor into the first-principles calculations). In recent years, several research groups have successfully carried out semi-empirical calculations for systems exceeding thousands of atoms. For example, Hennemann and Clark reported a calculation on no fewer than 100 000 atoms with their EMPIRE code.^{1} Dral presented results for carbon nanotubes and carbon peapods with about 1000 atoms.^{2} Thiel and co-workers developed a computer program accelerated by graphical processing units (GPUs), able to carry out calculations on systems with more than 1000 heavy atoms (e.g., a water cluster consisting of 1800 H_{2}O molecules).^{3} A similar approach was taken by Maia *et al.*, who extended the MOPAC program to take advantage of GPUs. With the resulting software, these authors could also successfully carry out calculations for systems with more than 1000 atoms.^{4} Recently, Tretiak and co-workers have described PYSEQM, a computer program for Born–Oppenheimer molecular dynamics based on semi-empirical methods; they presented results for a phenylene–ethynylene dendrimer consisting of 884 atoms.^{5} Stewart demonstrated a calculation on a protein with 14 566 atoms.^{6} In addition, the DFTB+ software package, implementing density functional tight binding methods, can be applied to systems with 1000 atoms and more.^{7}

Due to their short runtimes, semi-empirical methods allow one to conduct high-throughput virtual screening campaigns^{8,9} on very large molecule sets in comparatively little time. For example, Halls and Tasaki screened more than 7000 compounds for the suitability as lithium ion battery additives with the PM3 model.^{10} Jensen and co-workers presented a workflow based on the PM6 model to predict pK_{a} values, which is fast enough to be used in screening studies with as few as one hundred processors.^{11} Zwijnenburg and co-workers proposed a high-throughput screening approach to assess optoelectronic properties of conjugated polymers, which is based on the xTB model.^{12,13} Wen *et al.* assessed the detonation velocity of about 100 000 high-energy materials with the PM7 model.^{14}

In many cases, semi-empirical methods are employed in high-throughput screening studies as part of more complex workflows. For example, structures can be preoptimized with a fast semi-empirical calculation and then subsequently refined with a more accurate albeit computationally more costly method, or compound structures can be entirely determined with semi-empirical methods, while some desired property is evaluated with a more accurate electronic structure model (see, e.g., Refs. 15–23).

The rise of data-driven machine learning approaches in computational chemistry also gives a substantial impetus for the application of semi-empirical methods both for data generation and for developing new hybrid machine learning/semi-empirical methods. Data-driven approaches are hungry for data, and fast semi-empirical methods allow for generating vast amounts of data within a reasonable amount of time (e.g., PM6 was used to calculate optimized molecular geometries and electronic properties of 221 × 10^{6} molecules,^{24} and earlier, PM7 was used to preoptimize structures of 134 thousand molecules in the QM9 dataset^{23}). Pure machine learning models are faster than semi-empirical methods but suffer from poorer transferability. The combination of machine learning with physics-based semi-empirical methods creates more transferable hybrid methods that can target higher accuracy approaches (mostly, density functional theory,^{25–31} but also a few methods,^{32,33} such as AIQM1,^{33} targeting coupled cluster or G4MP2 accuracy; see the recent overview in Ref. 34). For instance, AIQM1 turned out to be computationally faster and at the same time more accurate than many standard density functional theory methods for various applications and even helped to revise experimental data.^{33,35}

An important area of application for ultrafast semi-empirical methods is interactive quantum mechanics.^{36–39} Behind this approach is haptic quantum chemistry,^{40} which exploits the tactile human sense so that a person can manipulate a molecule of interest in real time (e.g., try to abstract a proton), while the structure is continuously relaxed. For a realistic experience, it is important that this relaxation is carried out in a physically meaningful way, i.e., according to the laws of quantum mechanics. Furthermore, it needs to be fast enough so as not to break the interactivity. For a fluent visual feedback, about 30–60 calculations need to finish sequentially per second, while direct haptic (i.e., tactile) feedback requires, in principle, up to 1000 single-point calculations per second, which can, however, be reduced to a smaller number through activation of a mediator potential.^{41} Semi-empirical methods are generally fast compared to full-fledge electronic structure models, which makes them therefore ideally suited for interactive applications.

Last but not least, we note that very fast quantum chemical calculations are also a key ingredient to *ab initio* molecular dynamics,^{42} where long trajectories with femtosecond time step resolution demand fast electronic structure programs to make thousands of single-point calculations viable.

Approaches to reduce the computational complexity of a given quantum chemical method, in particular linear-scaling approaches, obviously play an important role in the acceleration of quantum chemical calculations.^{43–46} However, it is important to realize that such approaches are primarily targeting atomistic systems of increasing size. For small- and medium-sized molecules, the overall calculation time might not be significantly reduced with respect to a canonical approach. This is an important observation because especially for interactive applications, the absolute calculation time is decisive rather than the relative scaling of a method. It is for this reason that our focus here will be on the actual, overall computation time of the methods discussed.

In this work, we report the implementation of a particular class of semi-empirical methods, the so-called orthogonalization-corrected methods (OMx),^{47–53} and AIQM1, an artificial intelligence-enhanced method based on OMx, into our software package SCINE Sparrow.^{54,55} In addition, we investigate runtimes of various semi-empirical methods implemented in Sparrow and relate them to transmission times of various computer networks. In particular, for interactive applications, these transmission times are an important factor to consider. This allows us to investigate the suitability of different combinations of semi-empirical methods and computing environments for high-throughput screening and interactive settings for various molecule sizes.

This work is organized as follows: In Sec. II, we review the central approximations for the various classes of semi-empirical methods available in Sparrow. Particular emphasis is placed on OMx. Then, we detail the computational methodology adopted in this work and present and analyze runtimes measured and relate them to the constraints set by interactive quantum mechanics and by high-throughput computational campaigns in Sec. III. Finally, we draw conclusions in Sec. IV.

## II. THEORY

In this section, we give a brief overview of the various semi-empirical methods implemented in Sparrow, which are summarized in Table I. For a more general introduction on semi-empirical methods, we refer the reader to the recent review in Ref. 56.

Method . | Parameterized elements . | Analytical gradients . | References . |
---|---|---|---|

Density functional tight binding | |||

DFTB0 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 57 and 58 |

DFTB2 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 59 |

DFTB3 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 60 |

Neglect of diatomic differential overlap (MNDO-type) | |||

MNDO(/d) | H–Ca, Zn–Sr, In–Ba, Hg–Bi | Yes | 61 and 62 |

AM1 | H–Be, C–Ca | Yes | 63 |

RM1 | H, C–F, P–Cl, Br, I | Yes | 64 |

PM3 | H–Ca, Zn–Sr, Cd–Ba, Hg–Bi | Yes | 65 and 66 |

PM6 | H–Bi except Ce–Lu^{b} | Yes | 67 |

Neglect of diatomic differential overlap | |||

(orthogonalization-corrected and machine learning-enhanced) | |||

OM2 | H, C, N, O, F | No | 49, 50, and 52 |

OM3 | H, C, N, O, F | No | 51 and 52 |

AIQM1 | H, C, N, O | No | 33 |

Method . | Parameterized elements . | Analytical gradients . | References . |
---|---|---|---|

Density functional tight binding | |||

DFTB0 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 57 and 58 |

DFTB2 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 59 |

DFTB3 | H, B–F, Si–Cl, Br, I, Na, Mg, K–Ti, Fe–Ni, Zn^{a} | Yes | 60 |

Neglect of diatomic differential overlap (MNDO-type) | |||

MNDO(/d) | H–Ca, Zn–Sr, In–Ba, Hg–Bi | Yes | 61 and 62 |

AM1 | H–Be, C–Ca | Yes | 63 |

RM1 | H, C–F, P–Cl, Br, I | Yes | 64 |

PM3 | H–Ca, Zn–Sr, Cd–Ba, Hg–Bi | Yes | 65 and 66 |

PM6 | H–Bi except Ce–Lu^{b} | Yes | 67 |

Neglect of diatomic differential overlap | |||

(orthogonalization-corrected and machine learning-enhanced) | |||

OM2 | H, C, N, O, F | No | 49, 50, and 52 |

OM3 | H, C, N, O, F | No | 51 and 52 |

AIQM1 | H, C, N, O | No | 33 |

^{a}

Depending on the actual parameter set employed, parameters for some elements will usually not be available in a given calculation (see Sec. II A 2).

^{b}

For some pairs of elements, some parameters are not available.

### A. Semi-empirical models and their intricate parameter dependence

#### 1. Neglect of diatomic differential overlap

The basic Hartree–Fock approach toward the solution of the nonrelativistic Schrödinger equation formally scales with the fourth power of the number of single-particle basis functions. This scaling was a severe bottleneck for the limited computational resources available in the 1960s and 1970s, which is why semi-empirical methods were developed by introducing rather drastic approximations.

An important class of semi-empirical models is based on the neglect of diatomic differential overlap (NDDO) that makes three key approximations. First, the number of explicitly considered electrons is reduced to a minimum. For example, for carbon, only four electrons are considered explicitly; the two 1*s*-type core electrons are treated implicitly. In fact, the number of valence electrons considered explicitly is never larger than 12 for any element.

Second, the number of basis functions is reduced to the absolute minimum by relying on minimal basis sets. For example, for carbon, only an *s*-type and three *p*-type functions are used.

Third, the NDDO approximation itself is adopted to approximate the electron-repulsion integrals (ERIs),

Here, *ϕ*_{μ} is the *μ*th orthogonalized basis function and $\chi \mu I$ is the *μ*th atomic basis function centered on atom I. Any ERI for which basis functions *μ* and *ν* and *λ* and *σ* are not located on the same atom are neglected, hence lowering the scaling from the fourth to the second power in the number of basis functions.

During the past decades, a range of models has been developed based on these three core approximations. Notable examples are MNDO,^{61} MNDO/d,^{62} AM1,^{63} RM1,^{64} PM3,^{65,66} PM6,^{67} and PM7^{68} and the family of orthogonalization-corrected methods described below. For a more detailed introduction into the NDDO approximation and the multitude of models applying it, we refer the reader to the review in Ref. 55. Sparrow implements MNDO, AM1, RM1, PM3, and PM6.^{69} Note that extensions to PM6, such as PM6-DH,^{70} PM6-DH2,^{71} PM6-DH+,^{72} and PM6-D3H4,^{73} which aim at improving the description of noncovalent interactions, are currently not available in Sparrow. For MNDO, parameters are supplied for elements H–Ca, Zn–Sr, In–Ba, and Hg–Bi. For AM1, the parameters cover elements H–Be and C–Ca, while RM1 has parameters for H, C–F, P–Cl, Br, and I. PM3 features parameters for elements H–Ca, Zn–Sr, Cd–Ba, and Hg–Bi. PM6 covers all elements up to Bi except the lanthanides Ce–Lu. Note, however, that not every possible pairwise combination of these elements is parameterized. If parameters are missing for a certain element or element pair, SCINE Sparrow produces an error message, pointing out which parameters are exactly missing. Arbitrary, user-defined parameters can also be supplied.

#### 2. Density functional tight binding

A different approach is taken by the so-called density functional tight binding (DFTB) family of models.^{74} It is based on a Taylor series expansion of the electronic energy around a reference energy based on a superposition of atomic electron densities. With this ansatz, the electronic energy *E*^{DFTB} is obtained as

Here, *E*_{rep} models the repulsion energy, whereas the second term represents the binding energy. The latter is based on a linear combination of atomic orbitals and determined from an eigenvalue problem similar to the one appearing in the standard Hartree–Fock model. In its simplest form (sometimes called DFTB0, non-self-consistent DFTB, or non-SCC DFTB), only these first two terms are considered.^{57,58} Both terms are heavily parameterized, and the eigenvalue problem required for the second term does not have to be solved in an iterative fashion (hence the term “non-self-consistent DFTB”).

The third term in Eq. (2) takes into account charge transfer, allowing for a better description of polarized systems. The charge fluctuations on atom A, Δ*q*_{A}, are determined in a Mulliken population analysis, and *γ*_{AB} is a parameterized function of the internuclear distance between atoms A and B. When this third term is also considered, the resulting method is called DFTB2 or self-consistent-charge DFTB (SCC-DFTB).^{59}

Finally, the DFTB3 method^{60} takes into account density fluctuations up to third order by incorporating also the fourth term on the right-hand side of Eq. (2). Γ_{AB} is also a parameterized function depending on the internuclear distance. Note that the function *γ*_{AB} is not exactly the same in DFTB3 compared to DFTB2, but contains an additional scaling factor.

Besides these three original DFTB approaches, another class of semi-empirical models based on the tight-binding idea was developed by Grimme and co-workers, namely, GFN0-xTB,^{75} GFN1-xTB,^{76} and GFN2-xTB.^{77} These models were developed with a focus on the general applicability for almost all elements of the Periodic Table, which has been a key feature for their fast and widespread acceptance.

Sparrow implements DFTB0, DFTB2, and DFTB3.^{69} For these methods, the parameter sets 3ob-1-1 (covering elements C, H, N, and O), 3ob-2-1 (C, H, N, O, P, and S), 3ob-3-1 (covering C, H, N, O, P, S, F, Cl, Br, I, Na, K, Ca, Mg, and Zn), mio-1-1 (C, H, N, O, and S), mio-1-1 (containing C, H, N, O, P, and S), borg-0-1 (H, B), pbc-0-3 (C, H, N, O, Si, F, and Fe), trans3d-0-1 (C, H, N, O, Sc, Ti, Fe, Co, and Ni), and znorg-0-1 (covering C, H, N, O, S, and Zn) are supplied. Other, user-defined parameter sets can also be supplied.

#### 3. Orthogonalization-corrected methods

For the present work, we have implemented in Sparrow orthogonalization-corrected methods (OMx): OM2,^{49,50} OM3,^{51} ODM2*,^{53} ODM3*,^{53} and AIQM1.^{33} Other OMx variants not implemented in Sparrow are OM1,^{47,48} ODM2,^{53} and ODM3.^{53} Different from most traditional NDDO-based semi-empirical molecular orbital methods, such as MNDO,^{61} AM1,^{63} PM3,^{65,66} PM6,^{67} and PM7,^{68} OMx models include orthogonalization corrections, *V*^{ORT}, in the core Hamiltonian, *H*^{core}, which is part of the Fock matrix,

In addition, the attractive penetration integrals, *V*^{PI}, and the repulsive effective core potentials, *V*^{ECP}, are also incorporated into the one-center core Hamiltonian matrix. Here, we follow a common notation to distinguish different centers of basis functions: subscripts *μ* and *υ* denote basis functions centered on the same atom A, while subscript *λ* denotes basis functions at another atom B, which is different from A, and C represents an atom, which is neither A nor B.

The first term *U*_{μμ} in Eq. (3) is the one-center one-electron energy of atom A and is obtained by parameterization. $V\mu \upsilon ,Bs$ is the semi-empirical core–electron attraction and calculated as in MNDO-type models,

where *Z*_{B} is the core charge of atom B and the semi-empirical two-center electron ERIs are obtained by multiplying the analytic ERIs $\mu A\nu A,sBsBs$ with a Klopman–Ohno scaling factor *f*_{KO},

The introduction of *f*_{KO} ensures that the semi-empirical two-center ERIs smoothly transition to the corresponding values in the one-center limit. It is given by

where the integrals in the numerator are the MNDO-type ERIs and calculated from semi-empirical multipole–multipole interactions. The core–core repulsion energy is also scaled with the *f*_{KO} factor,

where *R*_{AB} is the distance between atoms A and B.

Another key feature of OMx methods is that the minimal valence contracted Gaussian-type orbitals (GTOs) instead of Slater-type orbitals (STOs) are employed in integral calculations, and a scaling factor *ζ* is used to adjust the exponents of the primitive Gaussian functions. That is, an STO-3G contraction is used for hydrogen and ECP-3G is used for all other elements. Then, the (analytic) ERIs—and also the overlap integrals below—can be evaluated analogously to the full “*ab initio*” ones with the Libint library^{78} in Sparrow.

In the following, we provide a detailed description of *V*^{ORT}, *V*^{PI}, and *V*^{ECP} because these three terms are the key ingredients in OMx models (beyond MNDO-type models) and achieve systematic improvements. Moreover, their description is scattered over multiple resources in the original literature, which may hamper the implementation of OMx models.

**Orthogonalization corrections** *V*^{ORT}**.** The inclusion of the orthogonalization correction *V*^{ORT} retrieves the effect of a transformation from a nonorthogonal (“atomic-orbital”) basis {*χ*} to an orthogonal (“molecular-orbital”) basis {*ϕ*} by expanding the Löwdin orthogonalization^{79} to second order in terms of overlap integrals,

where $S\u2032\chi =S\chi \u22121$.

Based on Eq. (10), OMx methods adopt different parametric formulas in the matrix-element calculation of $H\varphi $. For simplicity, we will neglect the superscripts *χ* and *ϕ* in the following.

The very first OMx method OM1 only introduces the one-center orthogonalization corrections $V\mu \nu ,BORT$ defined as

where *F*_{1} and *F*_{2} are element-dependent parameters controlling the magnitudes of the one-center orthogonalization corrections. *S*_{μρ} and *β*_{ρν} are the overlap and resonance integrals, respectively.

The resonance integrals represent the two-center one-electron integrals $\varphi \mu A|T+VA+VB|\varphi \mu B$ containing kinetic energy and nuclear–electron attraction terms arising from orbitals $\varphi \mu A$ and $\varphi \mu B$ on atoms A and B. Due to the severe effect of symmetric orthogonalization,^{80} they are not evaluated analytically, but by an empirical formula in attempt to recover the orthogonalization effects through parameterization. The empirical formula is different compared to MNDO-based methods and given by^{47,48}

where *α*^{X} and *β*^{X} are element- and orbital-type-dependent parameters. The resonance integrals are first calculated in a local diatomic coordinate system and then transformed into the global coordinate system.

For *s*-, *p*_{x}-, *p*_{y}-, and *p*_{z}-type basis functions, the rotation matrices for this transformation from the local to the global coordinate system have the following form:

where the angle *b* is defined as the angle between vectors ** z** and

*z*^{loc}and the angle

*a*is the angle between vector

**and the projection of**

*x*

*z*^{loc}into the

*xy*plane. The details of the construction of vectors

*x*^{loc},

*y*^{loc}, and

*z*^{loc}can be found in the supporting information of Ref. 55.

Considering the integral elements of matrix ** I** in the local coordinate systems of atoms A and B, their corresponding values in the global coordinate system can be calculated by

where *n*_{A} and *n*_{B} are the number of basis functions on atoms A and B, respectively.

The matrix element *H*_{μμ,X} is restricted to an atom pair A–X and is calculated by

It should be noted that the second sum in Eq. (11) is implemented by calculating each contribution in a local coordinate system. Hence, a local to global coordinate transformation is required.

In contrast to OM1, OM2 additionally includes two-center orthogonalization corrections $V\mu \lambda ,CORT$ in the core Hamiltonian in Eq. (4),

where $G1AB$ and $G2AB$ are defined as the arithmetic means of element-dependent parameters *G*,

All terms in Eq. (16) are directly calculated in the global coordinate system. To ensure rotational invariance, *H*_{μμ,X} needs to be averaged when *μ* is a *p*-type basis function,

Differing from OM2, OM3 neglects the second-order terms in one- and two-center orthogonalization corrections, i.e., both $F2X$ and $G2X$ are zeros in OM3.

**Penetration integrals** *V*^{PI}**.** The penetration integrals $V\mu \nu ,BPI$ are defined as the difference between the actual core–electron attraction integrals, which are obtained from the analytic core–electron attraction integrals $V\mu \nu ,Ba$ scaled by *f*_{KO}, and their corresponding semi-empirical analogs,

In our Sparrow implementation, $V\mu \nu ,Ba$ are evaluated by the integral engine of Libint. Note that there is a slight mistake in the description of the OMx methods in Ref. 55: For the OMx methods, the first two terms of the second sum of Eq. (77) of Ref. 55 always cancel each other. Accordingly, Eq. (80) of Ref. 55 is always fulfilled, not only for s-type orbitals.

**Effective core potentials** *V*^{PI}**.** To account for core–valence interactions, ECP contributions are incorporated into OMx methods. In OM2 and OM3, the following empirical expression is used to calculate $V\mu \nu ,BECP$:

where *F*_{αα} is treated as an atomic parameter and *G*_{μα} is represented by an expression analogous to the resonance integrals,

Two further atomic parameters, *α*_{α} and *β*_{α}, have been introduced here.

The ODM2 and ODM3 models employ the same electronic structure model as OM2 and OM3. Compared with OM2 and OM3, however, Grimme’s D3 dispersion corrections^{81} are included as an integral part in ODMx. In addition to these dispersion corrections, a much more robust parameterization procedure and a wide range of training sets (including noncovalent interactions and electronic excitation energies) were used in the parameterization of ODMx models. Moreover, the calculation of enthalpies of formation in ODMx is analogous to *ab initio* methods by explicitly adding zero-point vibrational energy and thermal corrections, while other NDDO-based methods derive enthalpies of formation directly from self-consistent field (SCF) total electronic energies. In Sparrow, we have not implemented the two- and three-body D3 dispersion corrections yet, and hence, our ODMx implementation does not contain any dispersion corrections. We denote the resulting methods as ODM2* and ODM3*, which is consistent with notation introduced previously.^{33} The Hamiltonians of the ODMx* models are then the same as in the OMx methods, but the ODMx* models have different parameters trained on a more diverse dataset in addition to the aforementioned difference in the treatment of SCF total electronic energies.

Until now, OMx models have only been parameterized for second-period elements H, C, N, O, and F. Moreover, it must be emphasized that there are some slight differences in physical constants used for MNDO-type (constants taken from CODATA2014^{82}) and for OMx methods in Sparrow (Table II) to ensure that the results are consistent with the original implementation of OMx methods.

. | MNDO-type methods . | OMx . |
---|---|---|

Bohr radius | 0.529 177 210 67 × 10^{−10} m | 0.529 167 × 10^{−10} m |

Hartree/eV | 3.674 932 248 × 10^{−2} | 1/27.21 $\u22483.6751194414\u22c510\u22122$ |

eV/Hartree | 1/3.674 932 248 × 10^{−2} $\u224827.21138602$ | 27.21 |

. | MNDO-type methods . | OMx . |
---|---|---|

Bohr radius | 0.529 177 210 67 × 10^{−10} m | 0.529 167 × 10^{−10} m |

Hartree/eV | 3.674 932 248 × 10^{−2} | 1/27.21 $\u22483.6751194414\u22c510\u22122$ |

eV/Hartree | 1/3.674 932 248 × 10^{−2} $\u224827.21138602$ | 27.21 |

#### 4. AIQM1

The AIQM1 method is an artificial-intelligence-enhanced quantum mechanical method, which can be used out of the box for very fast quantum chemical calculations with an accuracy as that of the “gold-standard” coupled cluster approach for ground-state energies of neutral, closed-shell species and with reasonable accuracy for other properties, charge, and spin states.^{33} The energy of AIQM1 is composed of three parts: the baseline energy is obtained with the ODM2* Hamiltonian; then, there is an ANI-type neural network (NN) correction^{83–85} and a D4 dispersion correction,^{86,87}

Since NN corrections are added on top of a prediction with a semi-empirical model, AIQM1 can be considered as a universal Δ-learning model,^{25} which is improved by transfer learning: first, NN corrections were trained on 4.6 × 10^{6} data points measuring the difference between density functional theory and ODM2* energies and forces. Then, NNs were partially refitted to a smaller 0.5 × 10^{6} data points set of coupled cluster and ODM2* (+dispersion correction) energy differences. The NN correction *E*_{NN} is calculated by using the locality approximation introduced by Behler and Parrinello,^{88} i.e., it is the sum of atomic energy contributions *E*_{i},

The NN part of AIQM1 consists of an ensemble of eight ANI-type NNs,^{89} and the architecture of each NN in AIQM1 is similar to those of ANI-1x^{90} and ANI-1ccx^{84} except for two modifications: the previous CELU (continuously differentiable exponential linear unit) activation function was changed to GELU (Gaussian error linear unit) to ensure that higher derivatives can be obtained with this NN; the angular cutoff was increased from 3.5 to 4 Å.

Importantly, an ensemble of eight NNs not only improves the overall accuracy of an approach but also enables convenient uncertainty quantification (UQ) of AIQM1 predictions by evaluating standard deviations in NN corrections (NN SDs).^{35} Such mechanism of UQ is lacking for any other traditional semi-empirical method not based on machine learning. We found that if NN SDs do not exceed 0.41 kcal/mol, AIQM1 enthalpies of formation are generally reaching chemical accuracy (errors lower than 1 kcal/mol).^{35}

The dispersion energy *E*_{disp} is obtained from a sum over all pairs AB of atoms A and B,

The scaling factors *s*_{6} and *s*_{8} are empirical parameters. The dispersion coefficients $CnAB$ (*n* = 6, 8) are obtained from tabulated values,^{86} the Mulliken charge, and the coordination numbers of atoms A and B. The damping function $fdamp(n)(RAB)$ is a parameterized function of the internuclear distance *R*_{AB}.^{91} The D4 dispersion corrections used in AIQM1 include Axilrod–Teller–Muto three-body contributions,

where the triple–dipole constant $C9ABC$ is approximated as the geometric mean of the *C*_{6} coefficients of atom pairs AB, AC, and BC and *θ*_{A}, *θ*_{B}, and *θ*_{C} are the angles of the triangle formed by *R*_{AB}, *R*_{BC}, and *R*_{CA}.

### B. Constraints on interactivity and high throughput

Interactive and high-throughput calculations have the common requirement that calculation times should be as short as possible. There are a variety of ways to accelerate quantum chemical calculations (see, e.g., Ref. 36). On the one hand, one can speed up the calculation for a given system size by relying on highly optimized computer programs, by parallelizing time-consuming parts of the calculation, and by taking advantage of sophisticated algorithms (cf. the multitude of methods available to accelerate SCF convergence^{93–102}). On the other hand, one can try to reduce the system size (i.e., by reducing or simplifying the Fock matrix while keeping the molecule identical). This can be achieved, for example, with standard integral screening approaches^{103–105} and by using small basis sets. For interactive quantum chemistry, ultra-fast semi-empirical methods, which make use of (almost) all these approaches, are currently state of the art.

For interactive calculations, two more aspects are decisive for a successful application, namely, the latency and bandwidth available for the transmission of the results to operators. Latency is a general term describing the time interval between the cause and effect of some physical change in a system. It is ultimately due to the limited velocity with which any physical interaction can propagate. Here, we use it to describe the time it takes for the results of a finished calculation to be transmitted to a user. On a standard local Ethernet network, typical latencies are below 1 ms, but on the internet, latencies on the order of 10 ms and more may occur if packets are routed via several devices (such as routers or switches). Bandwidth refers to the amount of data, which can be transmitted to the user per unit time. On a local network, bandwidths of 1 Gb/s can be easily achieved. For a large-scale network, such as the internet, the practically available bandwidth is typically lower, often on the order of 100 Mb/s.

It is straightforward to see that the bandwidth is not a problem for interactive calculations. For a basic interactive exploration of a chemical system, the energy and the nuclear gradient (i.e., the forces acting on all atoms) must be calculated and transmitted. While the energy is a single floating-point number, the nuclear gradient consists of 3*N* such numbers, with *N* being the number of atoms present. Even for a comparatively large molecule of 100 atoms, transmitting 1000 nuclear gradients per second (which would be fast enough even for un-mediated haptic quantum chemistry^{41}) requires about 18.3 Mb/s. This is an order of magnitude less than what is realistically available.

The latency can be a much more problematic parameter. For visual applications, the calculation times must be between 10 and 100 ms, while for haptic feedback, they have to be as low as 1 ms if no mediator potential is activated. Even the small latencies of about 0.1–1 ms achievable on a local area Ethernet network introduce a significant delay for haptic applications. Hence, in this situation, it is important either that the calculations are carried out on the same machine on which the results are displayed or that a special, low-latency network, such as InfiniBand, is employed. For purely visual feedback, however, even the comparatively high latencies on the internet will usually be unproblematic. Only in cases in which the latency exceeds about 50 ms, this will lead to a noticeable delay for the user.

### C. Key design principles and capabilities of Sparrow

SCINE Sparrow was originally designed with interactive quantum chemistry as its primary application case. Hence, the considerations of Sec. II B heavily affected its design. Sparrow is written in C++, which is a compiled language that typically results in very fast machine-executable code, since contemporary compilers can take advantage of sophisticated optimization procedures. All linear algebra operations are handled by the library Eigen3, which contains highly optimized routines for these operations. Time-intensive parts of the code are parallelized with OpenMP. While this is beneficial for large molecules, the overhead introduced for small molecules means that single-threaded calculations are often faster. In addition, note that for all OMx methods and AIQM1, only the calculation of the nucleus–nucleus repulsion and the formation of the Coulomb and exchange matrices from the two-electron integrals are currently parallelized, so there is little (or even no) speedup to be expected for these methods.

In addition, Sparrow contains a range of SCF accelerators such as the direct inversion of the iterative subspace (DIIS) method and extended DIIS and a combination of both. Moreover, we proposed two schemes to extrapolate the initial density matrix from that of a previously converged calculation.^{101} In interactive calculations, subsequent structures are typically very similar so that such extrapolation schemes work very well, allowing the SCF iterations to converge in fewer steps (see also below).

For all semi-empirical models implemented, the ground-state electronic energy can be calculated in both spin-restricted and unrestricted formalisms. For the MNDO-type NDDO and DFTB-type methods, analytical nuclear gradients are available; for these methods, Hessian matrices can be calculated with a seminumerical approach by taking derivatives of the analytic gradients. In addition, excited states can be calculated with the TD-DFTB formalism for DFTB2 and DFTB3. More complex calculations, such as structure optimizations, can be done, e.g., with SCINE ReaDuct,^{106} exploiting Sparrow as a backend raw data generator. Currently, Sparrow does not feature any implicit solvation model. Sparrow can be employed as standalone command-line binary or as a library to be accessed from other C++ and Python programs (see also the Appendix).

To leverage the full power of interactive quantum chemistry, having at one’s disposal an efficient way to carry out quantum chemical calculations is, however, not enough. Rather, it is important to combine the human chemical intuition provided by the user with the power of automated calculations, making sure that no important chemical process is undetected. For this, we introduced the concept of molecular propensity,^{107} which denotes the tendency of a system to change its electronic structure such that the underlying potential energy surface needs to be modified (e.g., by addition or removal of a proton or electron or by adopting an electronically excited state). One can check whether such processes become important by carrying out in the background automated calculations for, e.g., a different spin multiplicity, a different electronic state, or with an additional proton. As soon as the new energy comes close (or is lower) than the energy of the system currently explored by the user, the user can be made aware of this.

## III. RESULTS

### A. Runtimes of different models

In this section, we report runtimes of our new implementation of OM2, OM3, and AIQM1 methods and of the already existing methods PM6 and DFTB3 in Sparrow for the calculation of the ground state electronic energy for linear alkanes ranging from methane to hectane (C_{100}H_{202}). All timings have been obtained on a computer with two Intel Xeon E5-2667 v2 processors (except otherwise stated), repeating every runtime measurement five times. Computer systems usually run several hundred operating-system-related processes already in an “idle” state, which perturb the runtime measurements. To make sure that no particularly large or small perturbation skews the average timing, the largest and smallest timing values were discarded. Afterward, the arithmetic mean of the remaining three values was taken. In all cases, the standard deviation was found to be at least one order of magnitude smaller than the corresponding average timing, indicating that all timings show a comparatively high precision. For these runtime measurements, Sparrow was compiled with version 10.1.0 of the GCC C++ compiler,^{108} linking it against Intel Math Kernel Library 2021.4.0^{109} (via Eigen 3.4.0^{110}) and Boost library 1.73.0.^{111} The development version of Sparrow used for the present work is available on Zenodo.^{112}

In the top part of Fig. 1, we show the timings for PM6, DFTB3, OM2, OM3, and AIQM1 as obtained for methane, ethane, butane, hexane, octane, decane, icosane, tetracontane, hexacontane, octacontane, and hectane, measured on a single central processing unit (CPU) core. As one can see, the individual models are associated with different runtimes. For small systems, DFTB3 is consistently the fastest method even though OM2 and OM3 need only slightly more time to finish the calculation. Starting with hexane, however, for both OM2 and OM3, the time needed to calculate the electronic energy increases rather dramatically, showing a much steeper scaling with system size than DFTB3. For C_{20}H_{42} and larger systems, the calculation time of OM2 is about ten times larger than that of DFTB3. For these systems, OM3 is about twice as fast as OM2.

AIQM1 shows an identical scaling behavior as OM2 and almost identical timings for the large systems. However, for small systems, namely, methane to decane, the computation time for AIQM1 is significantly larger than that of any other method. What is more, for the systems up to hexane, this time is roughly constant, suggesting that for small systems, a constant (i.e., system size independent) effect dominates. This is most likely the initialization routine, combined with the fact that for some parts of the algorithm, external binaries are invoked (the spawning of a new process is rather time demanding). PM6 shows a similar, almost constant runtime for small systems up to decane, before increasing with a similar scaling, such as DFTB3. For the largest systems, PM6 needs only slightly more time than DFTB3.

It is interesting to investigate the effect of activating additional CPU cores when calculating the electronic energy. The most time-consuming parts of Sparrow are parallelized with OpenMP. However, the benefits of this parallelization are only expected to manifest themselves for very large systems as for small systems, the overhead imposed by the parallelization adversely dominates the runtime: compare the top of Fig. 1 to its bottom part, which shows the same timings obtained on 16 CPU cores.

Up to icosane, all models require more time on 16 cores than on a single one. Starting with tetracontane, DFTB3 and PM6 show a slight speedup, whereas the other models still do not benefit from the additional CPU cores. However, as we have noted above, the current implementation of OM2, OM3, and AIQM1 has only very little parallelization, so this is expected. One must keep in mind that most modern CPUs do adjust their operating frequency depending on how many CPU cores are utilized. The Intel Xeon E5-2667 v2 processor chosen for this work features a maximum peak frequency of 4.00 GHz. This is the frequency we can expect to be in operation when Sparrow runs on a single CPU core. When all cores are utilized, however, the CPU is expected to throttle the frequency to 3.30 GHz, implying a performance drop of about 17.5% for a single core. Therefore, the maximum speedup expected when utilizing all 16 cores—considering only the CPU frequency—is a factor of 13.2. Obviously, the speedup of a factor of only 1.17 as seen for DFTB3 is clearly below this theoretical threshold, implying that the current parallelization of Sparrow has ample room for improvement.

### B. Speedup attainable on higher-performance hardware

It is instructive to investigate the effect different CPU models have on the calculation times. All timings reported in Sec. III A have been obtained on a machine with two Intel Xeon E5-2667 v2 processors. This old CPU type released in the second half of 2013 features a base frequency of 3.30 GHz and a maximum peak frequency of 4.00 GHz. For comparison, we repeated the timings on a more recent Intel Xeon W-1390P processor released in 2021, featuring a base frequency of 3.50 GHz and a maximum peak frequency of 5.30 GHz. Considering only the peak frequencies, one would expect any Sparrow calculation to run about 1.33 times faster on the more modern CPU.

A comparison of the runtimes of DFTB3 and AIQM1 for the series of linear alkanes studied is represented in Fig. 2. Here, all calculations were run on a single CPU core. As one can see, for all system sizes, calculations with AIQM1 are significantly faster, namely, about a factor of 1.8 for the smallest systems and a factor of roughly 1.6 for hectane. This is significantly larger than the theoretical speedup expected based on a comparison of the CPU frequencies alone. This highlights that a multitude of factors affect the calculation runtime besides the processor frequency, such as memory and disk speeds. For our test machine with the more modern CPU, also the rest of the hardware is newer and more performant compared to their counterparts in the older machine. For example, the main memory in the older machine runs at an internal clock frequency of 233 MHz, achieving a data transfer rate of about 15 GB/s. In the newer machine, the memory works at 400 MHz, and the total data transfer rate is almost 26 GB/s. All these factors combine for a significant total speedup.

While the runtimes of DFTB3 show a similar speedup for the large systems, most curiously, however, this speedup is reversed for the small systems. For methane and ethane, the total time needed to calculate the electronic energy is about a factor of three larger on the more modern computer. Without a more detailed drill-down analysis, it is difficult to unambiguously determine the cause for this reversal. Obviously, the CPU speed as the dominant factor for the computing time is more important the larger the systems are. For small systems, the overhead imposed by initialization routines takes a higher percentage of the total runtime than for large systems. During the initialization, Sparrow reads in parameters from disk, among other data. One may speculate that this reading from disk is, for some reason, slower on the newer machine compared to the older one.

### C. Comparison to different computer programs

A comparison of the runtime of Sparrow to that of other programs implementing the same methods is obviously of great interest. In Fig. 3, we present a runtime comparison of PM6, DFTB3, and AIQM1 as implemented in Sparrow, MOPAC 2016 (version 19.090L),^{113} DFTB+ 22.2,^{7} and MLatom^{114–116} (interfaced to the MNDO program^{117} providing the ODM2* part, TorchANI providing the NN part, and dftd4 providing the D4 part), respectively. Care has been taken to set numerical thresholds such that convergence settings were as similar as possible for the sake of comparability.

As one can see, the PM6 implementation in MOPAC is consistently faster than the one of Sparrow. Depending on the exact system studied, MOPAC is about a factor of 4 (tetracontane) to 10 (methane and hectane) faster than SCINE Sparrow. For DFTB3, Sparrow’s implementation is somewhat faster for the smaller systems (up to a factor of about 2 for methane) than that provided by DFTB+. However, starting with tetracontane, DFTB+ begins to be faster than Sparrow; for hectane, the implementation of DFTB+ is about a factor of 7 faster. For AIQM1, Sparrow’s implementation is also somewhat faster for the smaller systems (up to a factor of about 5 for butane) than that provided by MLatom interfaced to the MNDO program. This is because the file reading and writing process is the most time-consuming part in MLatom for the smaller systems. For larger systems, where calculations of the ODM2* part with the MNDO program dominate, we find that MLatom’s AIQM1 calculations are much faster than by Sparrow and even faster than PM6 and DFTB3 by Sparrow.

Despite being slower in some cases, we emphasize that the strong focus on interactive quantum chemistry of Sparrow has motivated the development of ways not only to reduce the computation time of an individual single point calculation (for which we reported the timings above) but also to accelerate an entire series of such calculations (see also Sec. III E). Naturally, it can be expected that future developments are also likely to reduce the single-point timings.

### D. Implications for high-throughput virtual screening

We can relate the runtimes discussed above to high-throughput virtual screening and interactive quantum chemistry. For this purpose, we first define three different molecule sizes and the typical time required to calculate their ground state electronic energy with Sparrow; cf. Table III.

Method . | Small (butane-like) . | Medium (icosane-like) . | Large (hectane-like) . |
---|---|---|---|

PM6 | 76 | 100 | 5 300 |

DFTB3 | 6 | 30 | 4 800 |

OM2 | 17 | 290 | 61 800 |

OM3 | 13 | 180 | 26 100 |

AIQM1 | 224 | 500 | 63 100 |

Method . | Small (butane-like) . | Medium (icosane-like) . | Large (hectane-like) . |
---|---|---|---|

PM6 | 76 | 100 | 5 300 |

DFTB3 | 6 | 30 | 4 800 |

OM2 | 17 | 290 | 61 800 |

OM3 | 13 | 180 | 26 100 |

AIQM1 | 224 | 500 | 63 100 |

For small, butane-like molecules, calculations typically need less than 100 ms with the exception of AIQM1. For molecules of medium size (comparable to icosane), runtimes vary between 30 and 500 ms depending on the method chosen. Large, hectane-like molecules imply runtimes between about 5 s and 1 min. With these runtimes, we can estimate the total amount of molecules one can screen within 10 000 CPU hours, amounting to a small-scale high-throughput virtual screening campaign employing 100 CPU cores for about four days (cf. Table IV).

Method . | Small (butane-like) . | Medium (icosane-like) . | Large (hectane-like) . |
---|---|---|---|

PM6 | 473 684 210 | 360 000 000 | 6 792 452 |

DFTB3 | 6 000 000 000 | 1 200 000 000 | 7 500 000 |

OM2 | 2 117 647 058 | 124 137 931 | 582 524 |

OM3 | 2 769 230 769 | 200 000 000 | 1 379 310 |

AIQM1 | 160 714 285 | 72 000 000 | 570 522 |

Method . | Small (butane-like) . | Medium (icosane-like) . | Large (hectane-like) . |
---|---|---|---|

PM6 | 473 684 210 | 360 000 000 | 6 792 452 |

DFTB3 | 6 000 000 000 | 1 200 000 000 | 7 500 000 |

OM2 | 2 117 647 058 | 124 137 931 | 582 524 |

OM3 | 2 769 230 769 | 200 000 000 | 1 379 310 |

AIQM1 | 160 714 285 | 72 000 000 | 570 522 |

As one can see, very efficient models, such as PM6 and DFTB3, can screen several millions of large (hectane-like) molecules in this time frame, while the more involved methods OM2 and AIQM1 can screen more than 500 000 large molecules. For small systems, the throughput rate increases, reaching no fewer than 6 × 10^{9} small, butane-like molecules with DFTB3 within only 10 000 CPU hours.

### E. Implications for interactive quantum chemistry

Interactive quantum mechanics places a different focus on automated high-throughput calculations: A maximum runtime must not be exceeded, which eventually limits the molecular size that can be studied. For interactive quantum chemistry, we have seen above that calculation times should not exceed about 34 ms for a fluent visual feedback, and for haptic feedback, this limit is even lower (depending on additional tricks such as the mediator potential^{41}). Comparing this to the runtimes reported above, we see that the maximum molecule size that can currently be treated interactively is between decane and icosane for the fastest method DFTB3 when only visual feedback is needed (allowing for up to 10 ms of latency; see above). However, we have to keep in mind that when a human person is interactively manipulating the atomic coordinates of a molecule, these change in a rather smooth, continuous way. The consequence is that successive structures occurring in an interactive quantum chemistry application are rather similar to each other. Consequently, this can be exploited to accelerate the calculations. In Sparrow, it is possible to use the density matrix of the previous calculation as the starting point for the next calculation.^{101} If two structures are rather similar to one another, their density matrices will also be similar, making such a starting guess a good approximation. To show this, we demonstrate a series of calculations on a molecular trajectory representing a keto–enol tautomerism of a long-chained molecule. A few snapshots along this trajectory are represented in Fig. 4 (here, we employ the trajectory called “T1” from Ref. 118). When calculating all 200 frames of this trajectory with the DFTB3 method implemented in Sparrow independently of each other, the total runtime is 4944.6 ms. When re-using the density matrix of the previous structure, the runtime drops to 3434.3 ms, i.e., it is reduced by a factor of 1.4. This is roughly consistent with the decrease of the average number of SCF cycles needed, being 11.7 when the density matrix is not re-used and 7.5 otherwise. A similar speedup by a factor of 1.5 is also observed for the PM6 model. This increase in the calculation rate directly translates into larger molecules being amenable to an interactive manipulation. In practice, molecules containing up to thirty heavy atoms are straightforward to deal with in an interactive setting on custom present-day hardware.

We note that for OM3, the speedup observed when injecting the previous density matrix is only a factor of 1.07 even though also in this case, the average number of SCF cycles required for convergence is also reduced by about a factor of 1.5, namely, from 14.8 to 10.1. This is evidence for two factors decreasing the computational efficiency of OM3. First, the total number of SCF cycles needed is higher for OM3 compared to DFTB3 or PM6. Second, for OM3, some part of the algorithm outside the SCF routine needs comparatively much time. Clearly, this indicates potential for future optimizations.

For a fluent haptic feedback, the runtimes reported in this work suggest that only very small molecules can be treated. Indeed, the high feedback frequency needed poses a problem also for very efficient semi-empirical methods on contemporary hardware. Therefore, besides a number of algorithmic improvements tailoring the methods in Sparrow to interactive quantum chemistry, also other completely complementary approaches have to be developed. Within the current SCINE framework, the most important of these is the so-called mediator potential, which is a surrogate function modeling essentially a harmonic potential to which the displacement of all atomic nuclei is subjected. The parameters of this potential are obtained from a calculation with Sparrow. Afterward, this simple function can be evaluated extremely quickly and restrains the interactive manipulation to the region of the potential energy surface for which a calculation with Sparrow has already been carried out. In the background, semi-empirical calculations are continuously carried out such that the surrogate function can be continuously updated, enabling a fluent manipulation of the system. We refer the interested reader to Ref. 41 for more details on this approach.

## IV. CONCLUSIONS AND OUTLOOK

In this work, we presented the open-source free-of-charge SCINE Sparrow (https://github.com/qcscine/sparrow), a C++ program implementing a wide range of semi-empirical electronic structure models. We reported a new implementation of the orthogonalization-corrected methods OM2 and OM3 and AIQM1, which is an artificial intelligence-enhanced approach based on the OMx model. We discussed key design principles of Sparrow, highlighting how it is especially tailored toward applications in interactive quantum chemistry.

SCINE Sparrow offers a unique collection of semi-empirical approaches broadly ranging from traditional MNDO-type to more advanced orthogonalization-corrected to DFTB-type to state-of-the-art machine learning-improved AIQM1 methods. Sparrow is used as backend with MLatom^{115,116} to enable AIQM1 calculations that can be performed on the MLatom@XACS cloud^{119} in a web browser without the need for any software installation, further increasing the accessibility of quantum chemistry.

We then reported the runtimes of a range of semi-empirical models currently implemented in Sparrow for molecules of different sizes, demonstrating the suitability of semi-empirical methods, in general, and their implementation in Sparrow for high-throughput virtual screening, *ab initio* molecular dynamics simulations, and interactive quantum chemistry. For screening purposes, semi-empirical methods allow one to screen a huge number of molecules with only modest computational resources. For interactive applications, semi-empirical methods are currently the only way to simulate molecular systems quantum mechanically and on the fly, i.e., without precomputing the potential energy surface.

The runtimes presented in this work also highlight potential directions for the further improvement of Sparrow. For example, the newly implemented methods OM2, OM3, and AIQM1 show a very steep scaling with molecule size. While the time needed by OM2 and OM3 for small molecules is among the shortest of all methods implemented in Sparrow, these methods are about an order of magnitude larger compared to PM6 or DFTB3 for large systems. However, we note that further improvement of the software may increase the efficiency of the new approaches by an order of magnitude. Furthermore, analytical nuclear gradients are currently not available for OM2, OM3 and AIQM1 in Sparrow. Future work in Sparrow will address these points.

## ACKNOWLEDGMENTS

This work was generously supported by the Swiss National Science Foundation (SNSF) through Project No. 200021_182400. P.O.D. acknowledges funding from the National Natural Science Foundation of China (Grant No. 22003051), the Fundamental Research Funds for the Central Universities (Grant No. 20720210092), and the Lab project of the State Key Laboratory of Physical Chemistry of Solid Surfaces.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Francesco Bosia**: Software (equal). **Peikun Zheng**: Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Writing – original draft (equal). **Alain Vaucher**: Software (equal). **Thomas Weymuth**: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Project administration (equal); Software (equal); Validation (lead); Visualization (equal); Writing – original draft (equal); Writing – review & editing (supporting). **Pavlo O. Dral**: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Software (equal); Supervision (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). **Markus Reiher**: Conceptualization (lead); Formal analysis (equal); Funding acquisition (lead); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – original draft (supporting); Writing – review & editing (lead).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding authors upon request.

### APPENDIX: RUNNING SCINE Sparrow

Sparrow can be used either as a standalone binary via the command line or as a library via C++ or Python. When used as a standalone binary, the command

sparrow -h

provides a list of all available options. For example, a simple calculation of the electronic energy of a structure in the file structure.xyz with the MNDO method can be achieved with

sparrow -x structure.xyz -M MNDO.

By default, a spin multiplicity of 1 and a charge of 0 are assumed; the calculation is carried out in the spin-restricted formalism. Suppose that we want to specify a charge of +1 and a spin multiplicity of 2. For this case, the following options have to be specified:

sparrow -x structure.xyz -M MNDO -c 1 -s 2.

The following example calculates the nuclear gradients and the Hessian matrix with the AM1 method:

sparrow -x structure.xyz -M AM1 -G -H.

The following code snippet illustrates the usage of Sparrow via Python; here, the electronic energy and the nuclear gradients are calculated with the OM3 method:

import scine_utilities as su

import scine_sparrow

manager = su.core.ModuleManager()

calculator = manager.get(’calculator’, ’OM3’)

calculator.structure = su.io.read(’structure.xyz’)[0]

calculator.set_required_properties([su.Property.Energy, su.Property.Gradients])

results = calculator.calculate()

print(results.energy)

print(results.gradients).

Further examples and instructions concerning the usage of Sparrow are given in the manual, provided together with the source code of Sparrow.^{54,112}