MOLCAS/OpenMolcas is an ab initio electronic structure program providing a large set of computational methods from Hartree–Fock and density functional theory to various implementations of multiconfigurational theory. This article provides a comprehensive overview of the main features of the code, specifically reviewing the use of the code in previously reported chemical applications as well as more recent applications including the calculation of magnetic properties from optimized density matrix renormalization group wave functions.
Modern quantum chemistry is impossible without a versatile computational software, which includes calculation of integrals, optimization of wave functions, and computation of properties. Not surprisingly, computational codes in the field are grouping into large packages, which simplifies the development process and their usage.
The list of quantum chemistry computer programs, maintained at Wikipedia,1 contains almost one hundred different computational codes with a large overlap in the functionality. Even in a narrow field of codes, computing the electronic structure of the ground and excited states of molecules, a researcher can choose between a large set of either well-established or newly developed codes. In addition to functionality from the scientific point of view, the codes are very different with respect to performance, the need of hardware resources, documentation, and user friendliness.
As a general purpose package for quantum chemical calculations, MOLCAS has made a long journey from a collection of homemade codes to professional software with distributed development, automatic verification, user support, etc. Recently, the vast majority of codes in MOLCAS have been released as open source—the OpenMolcas project. This package is user-friendly and ready-to-use, and is also a developers’ platform. While OpenMolcas is a free-of-charge package, with web based community support, the MOLCAS package is a licensed distribution of a refurbished version of the OpenMolcas package, with some additional external utilities. The MOLCAS package is user-oriented and offers explicit support in installation, performance enhancement, and basic calculations. For all practical purposes, both packages offer the same or similar computational tools. In what follows, we will make the distinction between MOLCAS and OpenMolcas when so is called for, elsewise we will refer to both distributions as [Open]Molcas.
There are several reviews describing new and recent features of MOLCAS2–4 and OpenMolcas,5 as well as the development infrastructure of the code,6 external software interfaced to MOLCAS, and graphical user interface programs.7,8 For a more comprehensive list of features, computed properties, implementation details, and theoretical description, we would suggest the reader to consult these papers, together with the manual.9
The purpose of the current paper is to highlight not necessarily the newest but the most commonly used features of [Open]Molcas with focus on the general outcome for computational chemistry. In order to reduce the size of the paper, the computational details (including input examples) are presented in the supplementary material. Thus, we encourage the reader interested in a deeper understanding of the possibilities of the [Open]Molcas package to consult the supplementary material section of this paper.
II. PROGRAM OVERVIEW
From the programming point of view, [Open]Molcas is a package that is composed of a large number of individual computational units, which run together driven by an input parser. The modularity simplifies the distributed development and allows us to have a better control over computational resources, especially memory. As a drawback of the module structure of [Open]Molcas, there is a need to organize communication between executable codes, which includes handling of return codes, transferring the data, and keeping the parallel infrastructure. Currently, there are two implementations of the input parser, written in Perl/C (molcas.exe in MOLCAS) and in Python (pymolcas in OpenMolcas). Both parsers use Enhanced MOLCAS Input Language (EMIL), described in detail in the supplementary material.
The main computer language used in [Open]Molcas is Fortran. The majority of the code is compatible with the Fortran 77 standard, but some newer contributions are written in Fortran 90 or later. All system-dependent routines, such as handling of memory, I/O, and parallelization, are written in C language. Not surprisingly, the main computational tasks are related to linear algebra problems. The code has an interface to standard BLAS/LAPACK libraries, providing the best performance for production calculations, but at the same time, there is a possibility to control the usage of mathematical libraries for development purposes.
Despite the general trend of reducing the variety of hardware architecture and software, we continue to support [Open]Molcas not only on the mainstream setup (x86_64 CPU with Linux OS and GNU or Intel compiler) but also on marginal hardware and compilers. The complete list of supported platforms, compilers, and libraries is provided in the supplementary material. This policy restricts the usage of some, usually nonstandard or recently standardized, features of compilers, but at the same time, it helps reduce the number of bugs in the code.
Since MOLCAS 6.0, a robust verification system is used to maintain the stability of the code. An application developer includes a call to a library function, specifying as parameters a label, a value, and an allowed threshold for any computed data. The verification procedure can use this call either to store the computed data or to verify it against the reference. The verification can be performed either locally or at remote installations of [Open]Molcas (with the use of a special selection of compilers and compilation flags of libraries).
A breakthrough in performance originated from the implementation of what is known as “Cholesky infrastructure.”8,10,11 In practice, every quantum chemistry method implemented in OpenMolcas is formulated directly in terms of the so-called Cholesky vectors instead of the traditional formulation based on two-electron integrals. Besides the massive savings in storage, significantly reduced scaling is achieved in the evaluation of most tensor quantities used within the algorithms for mean-field and correlation energy methods.
[Open]Molcas computes the optimized wave function and the corresponding energies for a large set of approximate Hamiltonians, including Hartree–Fock (HF), Kohn–Sham Density Functional Theory (DFT), multiconfigurational methods [Complete Active Space Self-Consistent Field (CASSCF), Restricted Active Space Self-Consistent Field (RASSCF), Generalized Active Space Self-Consistent Field (GASSCF), Density Matrix Renormalization Group (DMRG)], perturbation theory of the second order (MP2, CASPT2, RASPT2), valence bond theory (CASVB), and coupled cluster theory [CCSD and CCSD(T)], and Multireference Configuration Interaction (MRCI). [Open]Molcas is capable of optimizing the geometry of a molecule (using either analytical or numerical gradients). Most of the methods allow us to compute the electronic structure of a molecule in a medium (solution and solids). Recent development of [Open]Molcas is also focused on computing various properties.
It is difficult to present strict and accurate limitations of the codes in [Open]Molcas, but on the current hardware, it is possible to study a molecule with more than a hundred atoms, with about two thousand basis functions. The limits in terms of the active space selection depend on the method and the hardware configuration. A user should be aware that multiconfigurational methods require a very large amount of memory: 2 GB for a small calculation and 16–32 GB for a large calculation, but not extreme calculation. Most of the modules are parallelized; however, a production run will require even more memory if parallelization is used.12
Recent development is also focused on creating a data interface with other computational and visualization codes. In some cases, the data exchange is customized (interface to other computational codes, e.g., QCMaquis and Columbus), but the work in progress also includes a general HDF5-formatted interface and XML/json formatted output.
Sections II A–II F present the main computational codes in [Open]Molcas. The flowchart of modules, as well as a short programming description of codes, is presented in the supplementary material. In order to distinguish, say, the Self-Consistent Field (SCF) method and the scf program, the latter is printed in small capital letters.
A. Integrals, gradients, and second derivatives
The original MOLCAS package has since its inception utilized Gaussian basis sets to expand the one-particle space. While the original MOLCAS 1.0 used MOLECULE,13 subsequent versions have used the seward,14 alaska,15 and mckinley16 programs to compute integrals and up to second order derivatives of those. The hallmark of these modules is the efficient computation of integrals in the Gaussian basis of generally contracted basis functions. Moreover, the integral codes are implemented for the efficient use of real spherical harmonics of arbitrary order. The two-electron integrals and associated derivatives are computed using the Rys–Gauss quadrature, while one-electron integrals are computed using the Hermite–Gauss quadrature. As any standard package, OpenMolcas supports a large array of one-electron operators and also some that are unique. Here, it is worth to mention integrals over the exact semi-relativistic operator of the interaction between light and matter.17 The two-electron integrals are also implemented in an integral-direct fashion in the scf module.18 Recently, the conventional integral generation to disc has been less used as this has been replaced with an efficient implementation of Cholesky decomposition (CD) and so-called density-fitting techniques, a method that reduces computational times one order of magnitude or more. This matter will to some extent be discussed in Subsection II A 1.
1. Cholesky infrastructure
Most quantum chemistry software needs to cope with the computational bottlenecks arising from the evaluation and storage of the two-electron integrals. A strategy that has become popular over the years with the names of “Density Fitting” (DF) or “Resolution of the Identity” (RI) uses the following approximate tensor decomposition in terms of 3-center and 2-center integrals:
where sets of so-called auxiliary basis functions P, Q are designed for each atomic basis set through data-fitting of specific energy contributions—e.g., second-order Møller–Plesset (MP2) correlation energy—and then used to calculate the two-electron integrals (μν|λσ) over atomic orbitals. Despite the need to compute the matrix inversion over the auxiliary functions [(P|Q)]−1, significant computational advantages come from employing Eq. (1) due to the fact that the number of auxiliary functions is only about 3–5 times the total size of the atomic basis set, thus potentially orders of magnitude smaller than the leading dimension of the full integral matrix. Therefore, the number of 2- and 3-center integrals is much smaller than the number of otherwise needed two-electron (and up to 4-center) integrals.
On the other hand, it is now a well-established fact that such generation of the auxiliary basis set can be made without data-fitting or bias toward a specific quantum chemistry method, if one exploits the onset of numerical linear dependence in the product basis of the atomic orbitals.19 This type of ab initio DF originated in the earlier idea of employing the scalar product of vectors from the Cholesky decomposition (CD) of the integral matrix in order to approximate the integrals,
with computational advantages arising from the fact that the length of each CD vector is only about the same as the size of a standard auxiliary basis set used in Eq. (1). The main appeal of standard CD over RI approximations is that one can, at the price of increasing the CD vector length, effectively control the accuracy of the integral representation by means of the threshold used for the incomplete matrix decomposition. This is a very important property, especially in the context of highly accurate methods, as it helps preserve the systematic improvability of such quantum chemical models. The use of ab initio DF combines the accuracy of the CD representation with the computational ease of Eq. (1) by requiring only CD of each atomic subblock (acCD) of the integral matrix to define the auxiliary basis.20 Once generated on the fly, such an auxiliary basis set is used in Eq. (1), and in this way, one can also compute analytical derivatives for the two-electron integrals, as needed for potential energy surface (PES) exploration at any level of theory.21,22 The option to perform acCD based calculations is available in [Open]Molcas by means of the keyword RICD in gateway and seward. Additional features of the Cholesky infrastructure deal with exploiting a fully localized reformulation of Eq. (1) for use in connection with linear-scaling correlation methods, especially nonvariational ones, as any local DF two-electron integral approximation may cause variational collapse.23,24
B. The rasscf program
In a “Multi-Configurational Self-Consistent Field” (MCSCF) calculation, also called “Multi-Configuration Hartree–Fock” (MCHF), the electronic wave function is composed of a number of different configurations, with different occupation numbers. Orbitals that are doubly occupied in all the configurations are variously called “core” or “inactive,” while those with varying occupation are called “active.” Moreover, the configuration functions, as well as the orbital functions, are optimized so as to make the energy of the MCSCF state optimal, i.e., minimized or stationary.
General MCSCF optimization is usually very hard to converge to a solution and is also marred by multiple local minima. In contrast, the CASSCF, i.e., “complete” active space, procedure25 is usually well-behaved numerically. The term “complete” implies that all configurations that are formed by distributing electrons in all possible ways among the active orbitals are used in the Configuration Interaction (CI). Use of a fast and robust CI solver and selecting the energy eigenfunction then makes the energy a function of only the “active space,” rather than of the individual orbitals. This makes the optimization procedure simpler and, in most cases, rather robust.
However, the number of configurations, i.e., the size of the CI eigenvalue problem, will grow in a very inconvenient way beyond a dozen or so active orbitals. One way out is to restrict the active space a bit by disallowing some of the configurations. By requiring certain orbital subspaces to have a restricted range of occupation numbers, we get a drastic reduction in the number of configurations and also a more difficult optimization problem. This kind of restriction is called “generalized” active space, i.e., the acronym is GASSCF.26,27 A less problematic, and also less powerful, method is the “restricted” active space, RASSCF.28 Here, we can conceptually start with a big CAS, but then restrict it by requiring the orbitals in one subspace, the RAS1 space, to be fully occupied with just a few exceptions, i.e., the number of “holes” in RAS1 is limited. Similarly, there is a RAS3 space, which is empty except for a maximum allowed number of electrons. The remaining active orbitals belong to the RAS2 space.
The rasscf program originally computed only CASSCF wave functions but was later expanded by allowing RASSCF and later also GASSCF. However, it has also become possible to do similar calculations with a huge number of active orbitals, when the optimization is performed by relaxation procedures called “density matrix renormalization group” methods, which grew out of methods used in lattice quantum dynamics. Some DMRG codes have now been interfaced as external modules, which can be optionally linked to OpenMolcas.
Multireference calculations can in principle be applied to arbitrary kinds of electronic systems, regardless of charge, spin, and point group symmetry, and with molecules with any conformation and level of excitation. However, they can require large resources in terms of basis sets, many active orbitals, and large CI expansions. The calculations outlined above will usually require also a calculation of residual dynamic correlation.
CASSCF cannot be used to optimize more than 12–16 active orbitals. Many more are possible using RASSCF or GASSCF. Even then, the optimization procedure yields a wave function that is optimized with the restriction that the virtual orbitals have zero occupation in the wave function. Such a wave function includes non-dynamic correlation but lacks much dynamic correlation, which can also differ considerably between different states/geometries.
Thus, the CASSCF (as well as RASSCF and GASSCF) calculation will usually be followed by a perturbative calculation of the missing dynamic correlation. In fact, for CASSCF, the accuracy is usually comparable to that of a Hartree–Fock (HF), when the state in case is well suited for such calculations. The difference is that CASSCF is able to give results of uniform quality, also for radicals, broken bonds, excited states, etc. RASSCF and GASSCF are much better in allowing at least some non-dynamic correlation, but often not enough. One procedure to go further will be briefly described in what follows.
C. The caspt2 program
The caspt2 program solves the Raleigh–Schrödinger perturbation equation
where Ψ0, the zeroth-order wave function, is a CASSCF or, more generally, a RASSCF wave function. Ψ1, the first-order perturbation to the wave function, is expressed as a large number—typically 1.0 · 105–1.0 · 106—of excitation amplitudes. The zeroth-order Hamiltonian is
Here, and are projectors onto the reference function and the interacting space, respectively, and is an effective one-electron spin-summed operator. Its action on a multiconfigurational wave function is not diagonal, and the equation is solved by an iterative Preconditioned Conjugate Gradient (PCG) method. The interacting space is usually huge. It is subdivided into eight symmetry-blocked parts, further partitioned into symmetry block by index permutations and point group symmetry, and each such block represents the action of some linear combinations of two-electron operators acting on the reference CASSCF or RASSCF state.
The linear combinations, and the blocking, are used to transform the perturbation equation into a form where it can be solved by efficient PCG iterations. Thus, the diagonal blocks become themselves diagonal matrices, and so a very efficient solver is obtained for the pre-conditioning, which also gives a starting diagonal solution. This is then refined by typically 6–20 PCG iterations.
The result is a first-order perturbation wave function Ψ1. This is then used to express the final energy in the form of a Hylleraas functional, i.e., the energy has an error that scales with the square of errors in Ψ1.
Developments include, e.g., multi-state CASPT2 (MS-CASPT2), extended MS-CASPT2 (XMS-CASPT2)—which uses a state-average and ensures invariance under unitary rotations of the reference states—and extended dynamically weighted CASPT2 (XDW-CASPT2),29 a method that interpolates between the former two and retains advantages from both.
The caspt2 program can accept CASSCF or RASSCF reference wave functions, DMRG wave functions, if an optional DMRG package has been linked with OpenMolcas.
D. The rassi program
For multi-configurational methods, a common problem is to compute overlaps and matrix elements of various operators over a set of wave functions for different states. The orbitals are usually optimized for the various states. If these are not too dissimilar and have the same spin and the same point group representation, they may be treated together, giving an ensemble optimization (state averaged CASSCF). If any interstate properties are sought for, we have the following problem: How to compute matrix elements over CI wave functions that use different orbital bases?
This was the original reason for developing the state interaction programs. The original one, cassi, took pairwise states from a set of states and could compute their overlaps and Hamiltonian matrix elements, as well as matrix elements of property integrals such as multipole moments. Using the wave functions as basis functions, it could then compute orthonormal states and non-interacting linear combinations of these, and the energy eigenvalues and proper dipole transition probabilities, for example.30
This was shown to be efficient and practical, provided that the state functions had been computed using a common set of Atomic Orbital (AO) basis functions and for CASSCF wave functions. It was primarily used for the purposes indicated above and also, e.g., for transforming to diabatic wave function components for crossing or nearly crossing states. Afterward, of course a large number of other disparate uses were found. RASSCF wave functions can be used in a similar way.31 States that differ in spin and/or number of electrons may be treated, leading to so-called Spin–Orbit RASSI32 (SO-RASSI) where the spin-free RASSI wave functions are combined with spin functions, and using so-called “Atomic Mean-Field Integral” (AMFI) spin–orbital Hamiltonian integrals from seward, properties end energies involving individual spin components can be computed. Using different numbers of electrons, so-called Dyson amplitudes can be obtained, yielding e.g., probabilities for core-hole dynamics. rassi can also compute so-called “binatural orbitals,”33 which describe pairs of states. The implementation is new and has not yet been much used.
In conclusion, the rassi program is a fairly versatile tool for handling multiconfigurational wave functions in many different ways. In the future, it should also have the capability to combine states having different basis functions or different numbers of inactive/active orbitals.
E. External programs for DMRG-based calculations
For a general description of DMRG and the interfaces available for performing these calculations with [Open]Molcas, the reader is advised to refer to Ref. 5.
1. DMRG-based electronic structure with QCMaquis
QCMaquis34–36 is a stand-alone software module that provides algorithms utilizing the DMRG approach37,38 in quantum chemistry.39 Targeting large active orbital spaces beyond restrictions of traditional CAS approaches in [Open]Molcas calls for either specialized approaches such as the ones discussed in Sec. II F or new wave function parametrizations that allow one to circumvent the exponential growth of basis states with system size in a CASCI formulation. The optimization of matrix-product state (MPS) ground- and excited-state wave functions by the module QCMaquis, which stands for Quantum Chemical Modern Algorithms for Quantum Interacting Systems, exploits a genuine matrix-product operator formalism of the (relativistic) quantum chemical Hamiltonian.34,35,40
By means of a recently established Fortran-to-C++ interface, QCMaquis is modularly integrated into the current CASSCF solver of OpenMolcas (under the alias dmrgscf). Hence, DMRG-SCF calculations with QCMaquis41 allow for—in analogy to the traditional CASSCF approach in OpenMolcas—(i) the inclusion of equilibrium and non-equilibrium solvent models,42 (ii) the computation of state-specific43 as well as state-averaged44 ground- and excited gradients, and (iii) the approximation of dynamical electron correlation in post-DMRG-SCF calculation second-order multi-reference perturbation theory;45 (iv) multi-configurational pair-density functional theory46 can be employed for this purpose; and (v) moreover, the capabilities of the RASSI approach (see Sec. II D) to handle non-orthogonal CI-type wave functions as input states are matched with the MPS state-interaction (MPSSI) approach47 of QCMaquis. The latter is accessed in OpenMolcas with the mpssi program. This capability is decisive for the calculation of transition dipole moments as well as SO coupling matrix elements between states optimized separately as MPS-type wave functions. In Ref. 47, we showed with the example of actinide complexes that DMRG-SCF in combination with the MPSSI approach yields magnetic properties such as ground- and excited-state g-tensors in full agreement with data obtained from corresponding CASSCF/RASSI calculations. For an example that combines for the first time the DMRG-SCF/PT2/MPSSI functionality with the SINGLE_ANISO approach, the reader is referred to Sec. III B 6. In addition, by taking advantage of the same implementation framework as the rassi module, the mpssi module facilitates the computation of Spin Orbit (SO)-coupled wave functions that are suitable for a subsequent ground- and excited-state bonding analysis, as described in Sec. III D 4, and/or that can serve as input functions to the single_aniso module.4
The QCMaquis–OpenMolcas interface can target excited states within either a state-specific (SS)34 or a state-average (SA) formulation of DMRG. The SA formulation is realized based on the multi-state matrix product state idea (MS-MPS)44,48 that generalizes the MPS concept to multiple electronic states. MS-MPS relies on a common tensor to encode all target electronic states that therefore can be optimized simultaneously with a lower computational cost and a higher stability in the presence of nearly degenerate electronic states, e.g., in the vicinity of a conical intersection. We presented an extension of the Lagrangian-based SA-CASSCF linear response theory49 to MS-MPS wave functions that allows us to calculate SA-DMRG-SCF energy gradients and non-adiabatic couplings with QCMaquis. In future work, we plan to further improve the efficiency of the SA-DMRG-SCF gradient calculation by exploiting the structure of an MS-MPS to efficiently evaluate the contractions required for the calculations of one- and two-body reduced density matrices for a given MS-MPS.
The QCMaquis–OpenMolcas interface contains an implementation of the second-order n-electron valence state perturbation theory (NEVPT2),50–52 which employs an MPS reference wave function and exploits the Cholesky decomposition of two-electron integrals (CD-DMRG-NEVPT2).45 Quasi-degenerate NEVPT2 (QD-NEVPT2),53 i.e., a genuine multi-state formulation of NEVPT2, is also supported. Prominent examples of a successful application of DMRG-NEVPT2 include the calculation of spin-state energetics5,45 as well as dissociation energies54 of several large transition metal complexes. The main computational bottleneck present in nevpt2 calculations, especially with large active orbital spaces, is the evaluation of the four-particle reduced density matrix (4-RDM), which scales as the eighth power with the number of active orbitals N. Although DMRG supports active spaces of up to about N = 100 orbitals, the prohibitive scaling of the 4-RDM limits the current implementation in OpenMolcas to N ≤ 22 orbitals. An interface for a massively parallel computation of 4-RDM matrix elements on a cluster is provided, and in a future release, efficiency improvements in 4-RDM element computation will be implemented, raising this limit to N ≤ 30 active orbitals.
Although the present implementation of QCMaquis exploits an efficient, OpenMP-based shared-memory parallelization scheme, MPS optimizations for very large active orbital spaces (N > 60) can suffer from severe, memory-bound limitations for a given number of renormalized block states, the bond dimension m. To address this issue adequately, future releases of QCMaquis will benefit from a state-of-the-art hybrid OpenMP-MPI parallelization framework that takes advantage of the shared-memory functionalities across multiple compute nodes provided by the MPI-3 standard. Developments along these lines are currently ongoing.
2. DMRG with [Open]Molcas–CheMPS2 interface
[Open]Molcas–CheMPS2 is a simple yet robust DMRG interface between [Open]Molcas and the CheMPS2 library,55 allowing one to perform very efficient DMRG-CASSCF and (especially) DMRG-CASPT2 calculations. To the best of our knowledge, DMRG-CASPT2 is only available in three codes: [Open]Molcas, CheMPS2,56 and orz.57 As compared to the latter two, the uniqueness and strength of the implementation in [Open]Molcas is that the Cholesky decomposition technique is fully supported. Thus, calculations of large molecules with several thousand basis functions are feasible.58 DMRG-CASPT2 in OpenMolcas has been employed to tackle difficult problems involving mono-59–61 and di-nuclear transition metal complexes.58,62–64 In [Open]Molcas, the DMRG wave function and the RDMs required for CASPT2 calculations [2-, 3-RDMs and the generalized Fock matrix contracted with the 4-RDM (F.4-RDM)] are calculated using the CheMPS2 library. In the current implementation, the computational cost of F.4-RDM is significantly reduced by employing the pseudocanonical orbital basis. This strategy, however, is more suitable for highly symmetric or compact molecules and limits the active space to ∼30 orbitals. For general molecules and calculations requiring a larger number of active orbitals, a cumulant approximation of F.4-RDM65 [DMRG-cu(4)-CASPT2] has been implemented61 and will be available in a future release.
F. Selection of the active orbital space
Multiconfigurational approaches require a selection of the active space.66 Traditionally, this procedure is described as a set of recipes,67 but this set appears ever expanding and following these rules requires some skills from the user side. In Sec. II F 1, we describe the recent updates of the tools, which can simplify this uneasy step in multiconfigurational calculations.
1. From small-basis active orbital selection to large-basis CAS/RAS with EXPBAS
Manual identification of active spaces is usually simpler in a small basis set, largely due to the fact that the virtual orbitals become more localized so that “chemical” interpretation is easier. In addition, since the computational time for any quantum chemical method scales with the size of the basis set, using a smaller basis allows for a faster diagnosis of the stability of the chosen active space. [Open]Molcas contains a module, called expbas, which expands a wave function from a smaller basis set into a larger one, giving the user the ability to first find an active space in a small basis—either manually or through the AutoCAS algorithm described next, before choosing a larger, more expensive basis, for production calculations.
An example of how to use expbas on a model [Co(H2O) (OH)3]-system is given in the supplementary material. This system serves as a minimal model for the interaction between a water molecule adsorbed on a Co2O3-surface. In the supplementary material, we describe how to first select an as small as possible ANO-type basis set for generating an initial guess for an active space consisting of the water molecule’s valence orbitals as well as the Co 3d- and 3d′/4d-orbitals. Furthermore, we provide input examples for how to expand the wave function from the smaller basis into a larger one using expbas and finish off with a small comparison between the expbas-strategy for identifying an active space compared to starting with guess orbitals generated by guessorb.
2. AutoCAS facilitates automated orbital selection
Since choosing an appropriate active orbital space for a problem at hand is a non-trivial task, it ultimately calls for a high degree of automation—already to overcome expertism and potential anthropogenic bias. To this end, QCMaquis and OpenMolcas can be steered by the graphical user interface AutoCAS, which provides a fully automated active orbital space selection protocol for multiconfigurational wave functions.68,69 We developed this protocol70,71 to cope with valence properties, for which it inspects the whole valence orbital space of a molecule in an approximate but comparatively fast DMRG calculation to identify the strongly statically correlated orbitals. We note that the protocol will also work for other properties (e.g., Rydberg excited states) if relevant orbitals are also considered in this fast exploratory DMRG calculation.
The AutoCAS selection algorithm based on orbital entanglement measures replaces empirical selection guidelines by physical quantities that are calculated from an approximate wave function—a procedure that eliminates arbitrariness and enhances reproducibility. The selection threshold70 of the protocol is defined such that only the most strongly statically correlated orbitals are selected for the active space, which yields compact active spaces. This guarantees a separation of the static and dynamic electron correlation, where the latter can be accounted for by subsequent CASPT2 or NEVPT2 calculations within OpenMolcas. For instance, the AutoCAS selection algorithm reliably identifies relevant double-shell orbitals of 3d-transition metals,72 confirming that suitable zeroth-order DMRG-SCF or CASSCF wave functions can be constructed from the automatically selected active spaces.
In cases where the active space is constructed only from valence orbitals, we recommend an automated active space selection in a minimal basis with AutoCAS and subsequent expansion of these pre-optimized active orbitals to the final large basis set with expbas.
III. TYPICAL PROBLEMS SOLVED BY [OPEN]MOLCAS
In this section, several cases of the use of [Open]Molcas package in typical quantum chemical applications are described. In supplementary material, the user can find corresponding inputs.
A. Precise wave function calculations
1. RASSCF/RASPT2 electronic structure calculation
An incomplete time line illustrates the evolution from the early times when only CASPT2 was available for dynamic correlation of CASSCF wave functions. This combination was then used extensively for molecules with lighter elements up to transition metal compounds. Spin–orbit interaction could already be treated,73 e.g., in the study of EPR g-tensors,74 and by the Douglas–Kroll–Hess scalar relativistic integrals by Ref. 75, also calculations involving heavier elements including up to actinides became possible.76–78 Using RASSCF, larger active spaces were possible, but technical complications did not allow a proper RASPT2 program to be developed. Ultimately, an approximate form of RASPT2 started being used experimentally,79 and it was accepted into Molcas. A benchmark study by Sauri et al.80 showed the general usability of the approach.
2. Electronic structure and energetic properties calculated with DMRG-CASPT2
To demonstrate typical problems that can be solved by DMRG-CASPT2 in [Open]Molcas, we extend a previous work60 by studying the electronic structure and energetic properties of O2 adducts of three transition metal macrocycles, i.e., 3FeIIP, 2CoIIP (P = porphin), and 2[CoII(corrin)(Im)]+ (Im = imidazole) (Fig. 1). The latter complex, serving as a model for vitamin B12r,84 is of particular interest considering its potential application in oxygen reduction reaction (ORR).85 The active space consisting of 19 active orbitals is computationally prohibited with conventional CASSCF-CASPT2 but can be easily treated with DMRG-CASPT2. The standard binding enthalpies predicted by DMRG-CASPT2 (including DFT thermal correction) are in excellent agreement with experimental data.82,83 The binding enthalpy of O2 to 2[CoII(corrin)(Im)]+ (9.3 kcal/mol) is similar to experimental values measured for five-coordinate CoN4 complexes (7.8 kcal/mol to 13.1 kcal/mol).86 This high binding enthalpy explains the capability of vitamin B12 catalyzing the four-electron ORR.85
3. Strategies for generating excited states
Excited states (ESs) are often calculated using the state-average formalism, which avoids separate optimizations of each state while ensuring a balanced description of all states. In case the excited state belongs to a different irreducible representation, these states can be optimized separately by taking advantage of point-group symmetry. States with different spin multiplicity are always optimized separately. [Open]Molcas makes it possible to further control the position of holes using restricted active space wave functions.26,28,79 Together with energy-penalty terms or core-valence separation (CVS), this makes it possible to target single and double hole states, even in deep core orbitals.87,88
The exploitation of the RASSCF/RASPT2 protocol benefiting from parallel execution, point-group symmetry, and integral calculation speed up via Cholesky decomposition has facilitated the calculation of the manifold of higher lying ES with an accuracy that allows a direct comparison with experiments.5
4. State interactions with rassi
A main point with most multiconfigurational calculations is that the orbitals are optimized for the state of interest. However, in calculations involving several or many excited states, it is frequently difficult, sometimes almost impossible, to optimize orbitals specifically for each state. In most cases, orbitals are optimized for a set of states, which is a stable procedure if this set contains all the lowest states of any particular symmetry. If the molecule has a higher symmetry (e.g., C3), which causes degeneracy of any states (like states with open e orbitals), it is also necessary to make sure that the degenerate components do not fall into different IRREPS of the actual symmetry used for the calculation.
In any case, the result is then subsets of states with their own specific set of optimized orbitals, and it is usually necessary to compute various matrix elements that couple wave functions with more or less different orbitals. For transition amplitudes, it is necessary that the two states are orthonormal and non-interacting, which is not automatic if the states of a multiconfigurational calculation are obtained with different orbitals.
In MRCI, the states are obtained from a large set of configurations, all described using one common set of orbitals, and the problem disappears. In order to get the advantage in MCSCF of having orbitals more or less adapted to each state, the problem can be addressed efficiently by the RASSI procedure. Its principle is the ability for any two states to compute transition density one- and two-electron matrices in the common set of AO basis functions, and these are then combined with AO integrals to form any matrix elements needed.
This ability has also been extended for special calculations, e.g., the Dyson amplitudes to model ionization processes and the capability to include spin–orbit coupling in spin-free RASSCF calculations. For higher accuracy of interactions with UV and x-ray photons, OpenMolcas can compute transition probabilities using not only the dipole (length or velocity) approximation but also using higher order or exponential forms17,89 of the electron–photon interaction. rassi will simply combine the transition densities with different molecular integrals provided by seward.
5. Computing excitation energies with large active spaces
As an example of what is described in Secs. III A 3 and III A 4, we present here the strategy applied to compute the vertical excitation energies of bacteriochlorophyll a (BChl) molecules. BChls, together with carotenoids, represent the main chromophores in many light harvesting complexes, such as the much studied LH1 and LH2 systems.90 BChls are porphyrin-like molecules, which present an extended π conjugated system surrounding a central magnesium atom.
The complete π–π* system of a BChl unit extends over 24 molecular orbitals, containing 26 electrons. One empty 3p magnesium atomic orbital is also, often, considered as part of the orbitals available to accept electronic density upon photo-excitation, bringing the total number of potentially active orbitals to 25. A complete active space31 treatment of such a system represents a daunting task, since it would require the handling of 3 863 302 870 000 configuration state functions (CSFs). Instead, the pursued strategy was that of employing a restricted active space28 methodology, followed by a second order perturbation theory correction.79 However, the choice of which orbitals to include in each RAS subspace is not straightforward.
The first step was that of employing a simple “singles and doubles” calculation, i.e., to include 13 fully occupied orbitals in RAS1, 0 orbitals in RAS2, 12 empty orbitals in RAS3, and allowing two holes in RAS1 and 2 excitations in RAS3, together with a state averaging over 2 electronic states (ground and first electronically excited). Subsequently, the orbitals of the obtained trial wave function (product of a linear combination of only 12 403 CSFs) were ordered by their average occupation. After orbital re-ordering, the final wave function was computed including 11 orbitals in RAS1, four orbitals in RAS2, 10 orbitals in RAS3, three holes and three excitations in RAS1 and RAS3, respectively (for a total of 10 203 265 CSFs), and state averaging over 2 states.
The energies of the ground and excited states were re-evaluated at the multi-state RASPT2 (MS-RASPT2)80 level of theory, employing a technique allowing the splitting of the calculation of each root correction and the corresponding Hamiltonian matrix elements into as many separate calculation as the number of roots.4 The so-obtained excitation energies, transition dipole moments, and oscillator strengths were in line with what was expected from experimental data, while the obtained configuration interaction coefficients demonstrated the need for a multi-reference method for the accurate treatment of such large systems.91 Example inputs are provided in Sec. S5 of the supplementary material. The same methodology was successfully applied also to BChls of a different light harvesting system, namely LH3.92 Furthermore, the effects on the excitation energies due to neighboring amino acidic residues were evaluated by including also one amino acid moiety in the calculation.93 While such calculation did not increase the number of CSFs, it increased the number of basis functions (from the starting 852 for BChl using ANO-RCC double zeta)94 up to 1075 after addition of a tyrosine moiety. Finally, a similar procedure was successfully employed to compute the excitation energy to the second excited state, which however required the calculation of five states so as to resolve excited state mixing.95
6. Lanthanide-activated luminescent materials
Inorganic phosphors made of solids optically activated with transition metal and, especially, lanthanide ions are the subject of intense scientific and technological research because of their key role in a broad range of solid-state devices with high societal demand, such as energy-efficient solid-state lighting devices, displays, ultraviolet to infrared solid-state lasers, scintillating detectors for medical imaging, security and high-energy physics calorimetry, remote pressure and temperature measurement systems, solar cell enhancers, energy storage phosphors, persistent luminescence, and quantum information processing.96
All these applications are based on the rich photon engineering that can be built on the complex manifolds of excited states of the lanthanide ions in the solid hosts. Dense sets of tens to hundreds of excited states of different natures separated with gaps, and their different radiative and non-radiative transition probabilities, offer endless practical single- and multi-photon absorption and emission possibilities. Knowledge of the manifolds from experiments alone is limited and information from ab initio calculations is increasingly demanded.
In these complex manifolds of heavy elements in solids, states with partial filling of shells of molecular orbitals of several kinds co-exist, such as 4fn, 4fn−15d, and 4fn−16s configurations, together with impurity-trapped-exciton configurations (ITE) 4fn−1ϕITE and all sort of charge transfer states, such as ligand-to-metal (LMCT), inter-valence (IVCT), metal-to-metal (MMCT), or compensator-to-metal (CMCT) states. Hence, the multireference multiconfigurational wave function techniques of [Open]Molcas and its scalar and spin–orbit coupling relativistic Hamiltonians are the right instruments for the theoretical tackling of these manifolds, provided they are used together with the ab initio model potential (AIMP) embedded tools that take properly into account the quantum mechanical (QM) interactions between the active center and the solid host.97
Recent applications involving halide, oxide, and sulfide hosts, and the complex manifolds of lanthanides such as Pr, Eu, Tm, or Yb, together with the archetype R1 line of Cr3+, are found in Refs. 98–102.
B. Magnetic properties of mono- and poly-nuclear compounds
In recent years, the [Open]Molcas package has been successfully employed for describing magnetic properties of mono- and polynuclear compounds containing transition metals and/or lanthanides and actinides. In this section, the main functions allowing these studies are reviewed alongside with some results. Practical examples on how to run such calculations are given in the supplementary material. In all these studies—with the exception of the last example, which rested on DMRG-SCF/MPSSI/SINGLE_ANISO calculations47—the computational methodology employed was based on the RASSCF/RASPT2/RASSI/SINGLE_ANISO + POLY_ANISO calculations. A more detailed description of the capabilities of [Open]Molcas for magnetic properties can be found in Refs. 4 and 5.
1. Ab initio computation of the parameters of spin Hamiltonians for any dimension of the pseudospin
Experimental data obtained in electron paramagnetic resonance (EPR) measurements are commonly interpreted using a spin Hamiltonian formalism.103 In the limit of weak effects of the spin–orbit coupling effects, the spin Hamiltonian may be written as
where is the effective spin (pseudospin) defined in the basis of the low-lying states. The first term in Eq. (5) is the Zeeman spitting, where B represents the applied field, while g is a 3 × 3 tensor describing the evolution of the energy states in the applied field. The second term is the Zero-Field Splitting (ZFS) (for ), where D is a 3 × 3 tensor describing the splitting of the levels in the absence of the applied magnetic field. The last term in Eq. (5) is the hyperfine interaction, where is the spin of the nucleus and A is the parameter describing the interaction between the nuclear spin and the ground electronic spin. It is common to extract parameters for Eq. (5) by fitting experimental EPR spectra. It is noted here that the general form of spin Hamiltonian equations may include tensorial terms up to rank for Zeeman splitting and for zero-field splitting operators.
A comparison between the experimentally fitted parameters of the spin Hamiltonians and the ab initio extracted parameters represents an active research area. In MOLCAS, two implementations exist for the computation of the g-tensor. One implementation is done inside the rassi module (EPRG), allowing the computation of the g-tensor for a Kramers doublet.74 Another implementation, inside the single_aniso module (MLTP), is more general, allowing the computation of the g-tensor for any size of the pseudospin (e.g., for triplet and quadruplet). The details of this implementation are given in Ref. 104. The single_aniso implementation allows the extraction of the parameters of the zero-field splitting (i.e., the D-tensor) for any size of the pseudospin, as well as the parameters of higher-rank tensorial operators of Zeeman splitting and zero-field splitting.104 In addition, the sign of the product of the main values gXgYgZ of the g-tensor is computed.105 It was shown106 that this sign determines the direction of precession of the magnetic moment around the applied field.
2. Ab initio crystal field for lanthanides and transition metal compounds
The crystal field of lanthanide ions in complexes or in crystals is one of the main characteristics defining the structure of luminescence bands and magnetic properties.107 Since [Open]Molcas allows direct computation of the electronic structure of lanthanide compounds by a RASSCF/RASPT2/RASSI computational scheme with satisfying accuracy, a methodology for the first-principle derivation of crystal–field parameters for lanthanides was implemented inside single_aniso. Two schemes were implemented: (i) J-multiplet specific crystal field and (ii) L-term specific crystal field. This implementation is based on the following steps: (i) performing a rigorous ab initio calculation on a desired lanthanide/transition metal compound; (ii) diagonalization of the (Z-component of the) magnetic moment in the basis of the eigenstates of the free-ion multiplet J or free ion term L; (iii) putting in correspondence ab initio states and the eigenstates of the magnetic moment; (iv) transformation of the (diagonal) RASSI energy matrix to a new basis, where the Z component of the magnetic moment is diagonal and each of the states is characterized with a definite projection |MJ⟩/|ML⟩ on the quantization axis; and (v) applying the irreducible tensor operator method to find all crystal field parameters.
The crystal field Hamiltonian is represented by
where are the parameters extracted from the performed ab initio calculation and represent the irreducible tensor operator of rank k and projection q, which is written in the basis of 2J + 1 or 2L + 1 states. The extracted parameters allow us to recompute the complete energy matrix ĤCF via Eq. (6), which gives after diagonalization, the original RASSI or CASSCF energies, eigenstates, and their properties as they were initially obtained in the performed ab initio calculation.
For both methods described above, the extracted parameters are given in several operator representations: extended Stevens operators,108 Chibotaru-ITO,104 Iwahara,109 etc. In each case, a statistical analysis of the role played by each parameter is given such that the importance of each parameter for the total crystal field splitting is quantified.
3. Blocking barriers of mono- and poly-nuclear single-molecule magnets (SMMs)
Single molecule magnets (SMMs) are paramagnetic molecules that are able to preserve their intrinsic magnetization for a long time, after the applied magnetic field is switched off. The performance of the single-molecule magnets is indicated by the temperature at which the magnetic hysteresis is closing (blocking temperature) and the height of the magnetization blocking barrier. The magnetization blocking phenomenon is directly related to the axiality of the crystal field (i.e., the dominance of the axial crystal field parameters over non-axial ones, at least for the ground multiplet or term) and magnetic axiality of the ground and excited doublet states (i.e., the dominance of the main value gZ over the other two components, gX and gY). The requirements and routes of how to obtain performant molecular magnets have been previously discussed, e.g., Refs. 110 and 111, while it is an active research area.112–119
In the case of mono-nuclear single-molecule magnets, the direct application of the ab initio computational scheme based on RASSCF/CASPT2/RASSI/SINGLE_ANISO allows for a direct evaluation of the crystal field for lanthanides and transition metal compounds107 and of the g-tensor in the ground and excited states. The computational accuracy of this method proved sufficient to understand the origin of the slow magnetic relaxation, the subtle differences between similar compounds, etc.
In particular, the performed ab initio calculations allow for direct evaluation of the blocking barrier for single-molecule magnets. As an example, Fig. 2 shows the structure and blocking barrier of today’s record holder for the best molecular magnet, computed with MOLCAS,120 while this research area benefited a lot from this computational strategy.
4. Thermodynamic magnetic functions: Susceptibility, magnetization, and torque
Thermodynamic functions such as field- and temperature-dependent molar magnetization vector ; molar magnetic susceptibility tensor χαβ; powder averaged susceptibility and powder magnetization; field-, temperature-, and direction-dependent magnetization torque function ; and field- and temperature-dependent magnetic susceptibility are available inside the single_aniso module.
From the computation point of view, the entire single_aniso module is much faster than other calculations discussed here, usually done within a few minutes in most of the cases. Only the computation of the powder magnetization is the most time consuming as the Zeeman Hamiltonian has to be built and diagonalized independently for a large number of field directions and field strength points.
5. Semi-ab initio description of magnetic exchange interaction and the simulation of magnetic properties of polynuclear compounds
Direct ab initio determination of the energy spectra and properties of polynuclear compounds containing lanthanides and/or transition metals with unpaired spins is still a challenge for the current computational methods. The difficulties arise due to several factors. First, the active space of the CASSCF method must comprise all frontier orbital shells (e.g., all sets of d and f shells) containing unpaired electrons. The second issue is related to the number of states required to be optimized for a decent description of the spin–orbit interaction. Finally, the numerical accuracy of the dynamical correlation methods, such as MRCI or CASPT2, would need to provide very accurate energy differences, within 1 cm−1 or lower, if possible. Obviously, it is not yet possible to perform such calculations routinely for any compound of interest. To this end, in order to tackle the electronic structure and magnetic behavior of polynuclear compounds, a semi-ab initio approach was proposed and implemented in poly_aniso.121,122 In brief, the method consists of the following steps:
For a compound consisting of N magnetic metal sites, build N mononuclear fragments, with each fragment containing unpaired electrons only on one metal site. This is achieved by computational substitution of all other (magnetic) metal sites by their diamagnetic equivalents [e.g., (Dy3+ Lu3+)]. Ideally, the ligand framework is left unaltered in all fragments.
Perform ab initio calculation for each fragment at the desired level of theory (e.g., CASSCF/CASPT2/RASSI/SINGLE_ANISO). For each fragment, the local spin–orbit spectra and spin and magnetic moments are obtained.
The total magnetic exchange and magnetic dipole–dipole interaction Hamiltonians are built and diagonalized in the basis of the products of the local ab initio wave functions obtained for the individual metal sites at the previous step. The magnetic exchange interactions between metal fragments are included in an effective (phenomenological) way using the Lines approximation.123 Magnetic dipole–dipole interaction is added exactly, given that all local magnetic moments are available.
Compute the magnetic properties of the entire polynuclear compound using the obtained exchange (coupled) eigenstates.
Extract (fit) the parameters of the inter-site exchange interaction from the direct comparisons between computed and measured magnetic data.
The algorithm was implemented in the poly_aniso software and is available as a module in the MOLCAS package and also as a stand-alone program. Figure 3 shows a particular example of application of the above described computational approach for the investigation of the magnetic properties of Dy3 triangles.122
Recently, the projection of exchange interaction on products of irreducible tensor operators was implemented in the poly_aniso. This function allows a full derivation of all parameters of the multipolar magnetic exchange, in a way similar to the extraction of the crystal field. The feature was first reported in Ref. 5.
6. Magnetic properties of an actinide complex calculated with MPSSI
In this section, we will discuss the first application of the MPSSI approach47 described in Sec. II E 1 that makes a combined use of both the AutoCAS functionality described in Sec. II F 2 in order to rationalize the selected active orbital space and the SINGLE_ANISO approach described above to calculate magnetic properties. In particular, we consider the magnetic properties of a hexakisamidouranium(V) complex, [U(NH2)6]–1, which serves as a model complex for a unique, pseudo-octahedral hexakisamidouranium(V) complex synthesized and experimentally characterized by Meyer et al.124
The active orbital space consisting of seven electrons distributed in 10 active orbitals, CAS(7,10), is sufficiently small to facilitate a direct comparison of the DMRG-SCF/MPSSI data with the corresponding results obtained from CASSCF/CASSI calculations. Dynamical correlation effects in the MPSSI (CASSI) calculations were taken into account by standard multi-state CASPT2 calculations for the seven lowest-lying electronic states of [U(NH2)6]–1, which also define the ensemble of states for the preceding state-averaged DMRG-SCF (CASSCF) calculations. Accordingly, the number of renormalized block states m that determines the accuracy of the DMRG-SCF calculations was chosen such as to exactly reproduce the corresponding CASSCF calculations for each of the active orbital spaces considered in this example.
Figure 4 illustrates the state-averaged orbital entanglement diagram obtained by means of the AutoCAS program based on DMRG-SCF calculations with CAS(7,10). Here, orbital Nos. 4–10 correspond to predominantly 5f orbitals of the U central ion while orbital Nos. 1–3 exhibit mainly U 6p character with little contributions from each of the N center of the (NH2)–1 amido ligands. Interestingly, orbital No. 8 (of U 5fδ character), which displays a seemingly small orbital entanglement, will be automatically included by the automatic selection protocol because it is singly occupied in the electronic ground state. The latter finding is in perfect agreement with conclusions drawn from scalar relativistic DFT calculations for the same model complex.124 The situation is, however, less clear for orbital Nos. 1–3. Inclusion of the latter in the zeroth-order wave function [corresponding to CAS(7,10)] leads to lowering of the MS-CASPT2 excitation energies for the two sets of lowest-lying three-fold degenerate spin-free states of ∼10% and 5%, respectively. More importantly, the isotropic ground-state g-factor of this highly symmetric molecular complex changes from gxx = gyy = gzz = −1.06 to −1.10 in going from a CAS(1,7) to a CAS(7,10) zeroth-order DMRG-SCF (or equivalently CASSCF) wave function. The latter g-value of −1.10 obtained with our DMRG-SCF/MS-PT2/MPSSI approach is in very good agreement with the measured isotropic |g|-factor of 1.12 for the hexakisamidouranium(V) complex of Ref. 124. Moreover, the calculated DMRG-SCF/MS-PT2/MPSSI effective magnetic moment μeff = 0.95 μB (within a temperature range of 5–35 K) of the present hexakisamidouranium(V) model complex [U(NH2)6]–1 agrees reasonably well with the corresponding effective magnetic moment μeff = 1.16 μB experimentally determined in SQUID measurements within the same temperature range for the larger hexakisamidouranium(V) complex of Ref. 124.
C. Beyond single point calculations: Geometry optimization, potential energy surfaces, and molecular dynamics
Quantum chemistry calculations are rarely reduced to computing the energy or other properties of a system at some predefined structure. More often than not, the user would like to optimize the structure, find other significant points on the potential energy surface (saddle points, crossing points… ), obtain plausible reaction mechanisms, simulate the change of structure with time, or otherwise explore the relation between the structure and properties. [Open]Molcas, like many other quantum chemistry packages, includes some tools to allow and facilitate these tasks.
1. Energy derivatives
The key property for exploring potential energy surfaces is the energy gradient or the derivative of the energy with respect to the nuclear coordinates. In [Open]Molcas, it is possible to compute analytical gradients for many wave function methods, including SCF (HF or DFT), MP2,125 CASSCF (state-specific or state-averaged49) MC-PDFT,126 and DMRG-SCF, also when using Cholesky-decomposed two-electron integrals in the acCD flavor.22,125,127–129 For state-averaged CASSCF, analytical non-adiabatic couplings are available too.21 The advantage of analytical gradients is that their computational cost is usually of the order of magnitude of an energy calculation. Even in cases where [Open]Molcas does not currently support analytical gradients (notably CASPT2), numerical differentiation is supported and works transparently for the user.
Similarly, the second derivatives of the energy are available analytically for some variational methods (HF and state-specific CASSCF) and can be computed numerically for the rest. Second derivatives provide a means to obtain vibrational frequencies and thermochemical properties or characterize critical points on a potential energy surface.
2. Geometry optimization
Geometry optimizations are performed by using nonredundant internal coordinates to represent the molecular structure, thus eliminating translational and rotational degrees of freedom whenever the energy is invariant to them. By default, the internal coordinates are generated automatically based on empirical force constants130 such that the most relevant coordinates are prioritized and the user is not required to choose them. In addition, arbitrary constraints can be specified, and they will be satisfied at the converged geometry.131 These constraints can be any combination of “primitive” constraints, including interatomic distances, angles, dihedral constraints to hypersphere or hyperplane surface, or treating fragments as rigid bodies. Moreover, numerical energy differentiation can be made aware of constraints, reducing the number of energy calculations required.132
Judicious use of constraints permits the exploration of specific regions of a potential energy surface.131 One can, for instance, perform a “relaxed scan” by setting the values of some coordinates and optimizing all the other degrees of freedom. Saddle points (transition states) can be located starting from any structure by specifying a guiding constraint that forces the optimization toward a region where a saddle point is presumed to exist; once the right curvature of the surface is detected, the constraint is automatically released and a standard saddle-point optimization proceeds. Minimum energy path (MEP) optimization and intrinsic reaction coordinate analysis are performed automatically as a series of constrained optimizations. An additional energy difference constraint allows the location of minimum energy crossing points or conical intersections, and in combination with other constraints, other points or paths in the intersection seam can be optimized.21
3. Molecular dynamics
An alternative way to explore the potential energy surface (PES) is to run molecular dynamics (MD) simulations. In this type of simulation, the molecular system of interest is evolving in time and its kinetic energy is explicitly taken into account. This is in contrast to a geometry optimization or other static calculations such as the minimum energy path (MEP) where the kinetic energy is zero. As a result, MD allows us to overcome small barriers and reaches regions of the PES, which are otherwise not accessible. This is accomplished by the dynamix module in [Open]Molcas that solves Newton’s equations of motion to propagate the system in time. In order to start an MD simulation, the user needs to provide the initial conditions, which are the nuclear coordinates and velocities. Based on the selected electronic structure method and basis set, forces are computed that are required to extract the acceleration to solve Newton’s equations of motion. The dynamix module works with any electronic structure method in [Open]Molcas. However, it has been more widely used to study ultrafast photochemical or chemiluminescent processes that are non-radiative and are completed within a few hundred femtoseconds (10−15 s).133–141 Photochemical processes start in the electronic excited state and then pass non-radiatively through a conical intersection (Fig. 5) to the ground state, while chemiluminescent processes behave the other way around. The non-radiative transition is modeled by a trajectory surface hopping algorithm142 that works for multiconfigurational methods in [Open]Molcas. The simulation of these events provides a molecular level insight into the rearrangements responsible for the excited state relaxation.
Furthermore, the effect of external forces can also be included.136 The MD can be combined with other modules in [Open]Molcas such as espf143 to run QM/MM molecular dynamics in complex environments. One example is a study of the photoisomerization in the proteorhodopsin protein (Fig. 5) by Borin et al.144 Using the input file (see the supplementary material), a QM/MM trajectory was calculated at the CASSCF level of theory. Starting from the Franck–Condon point, the trajectory evolved toward the conical intersection within 227 fs. While this conformational change did not require much space in the tight protein binding pocket, the completion of the trans-to-cis isomerization was hindered specifically by the interaction of the C14 of the retinal chromophore and the amino acid Tyr200. It was found that this is due to a twist around C13=C14 and C15=N in opposite directions. Hence, it was deduced that tyrosine 200 is a critical amino acid for a successful isomerization.
In addition to the native dynamix module, [Open]Molcas is interfaced to several standalone molecular dynamics programs. One option is SHARC,145 which provides the possibility to include the effect of an external electric field and transitions between the states of different spin multiplicity.146 Another option is COBRAMM,147 which, with its combined quantum mechanical/molecular mechanical (QM/MM) framework and surface hopping scheme, allows us to simulate ground and excited state reactions in vacuum, explicit solvent environments, or complex macromolecular systems (e.g., biopolymers), including non-adiabatic molecular dynamics and the underlying transient spectral signatures. A more sophisticated option is explicitly considering the nuclear wave function and simulating quantum molecular dynamics. This approach automatically includes pure quantum effects such as tunneling. In this case, the interface of [Open]Molcas with Quantics148 is used to apply the direct dynamics variational multiconfigurational Gaussian (DD-vMCG) method.149
4. Automated QM/MM modeling
A specialized protocol for the automated construction of hybrid quantum mechanical/molecular mechanical (QM/MM) models, which uses OpenMolcas as the electronic structure calculation engine, has been introduced. This is the Automatic Rhodopsin Modeling (a-ARM) protocol, a command-line oriented computational tool, designed for the congruous and reproducible generation of monomeric, gas-phase, and globally uncharged QM/MM models of rhodopsins based on electrostatic embedding and the hydrogen-link-atom frontier between the QM and MM subsystems.150,151 Although a-ARM currently only constructs rhodopsin-like models, it provides a template for future development and generation of an automatic QM/MM building strategy for other, more general systems. Furthermore, the tool is already much in use for research as rhodopsins represent one case of a widespread family of biological photoreceptors capable of carrying out different biological functions. In fact, members of the rhodopsin family are found in many diverse organisms and, thus, constitute an exceptionally widespread class of light-responsive proteins, driving fundamental functions in vertebrates, invertebrates, and microorganisms.152–154 a-ARM has been shown to be able to generate models suitable for the prediction of trends in photochemical properties [i.e., maximum absorption () and emission () wavelengths] of wild type rhodopsin-like photoreceptors and their variants, with an error bar of 3.0 kcal/mol (0.13 eV).150,151,155–158
The current version of the protocol151 comprehends two phases, called input file generator and QM/MM model generator, respectively (see Fig. S15 of the supplementary material). The first phase (Fig. S15-A) allows the automatic preparation of a 3D structure in PDB format, which contains information on the protein structure, including the retinal (RET) chromophore and excluding membrane lipids and ions; assigned protonation states of all the ionizable residues; optimized positions of Cl−/Na+ external counterions needed to neutralize both inner and outer protein surfaces; and a file containing the list of amino acid residues forming the cavity hosting the RET protonated Schiff base (rPSB). Such a structure is a suitable input to be used in the subsequent QM/MM model generator phase. Given the options selected in the first phase, a-ARM is sub-divided into a-ARMdefault and a-ARMcustomized. The former refers to a fully automatic input generation, which uses default parameters as suggested by the code (i.e., chain, pH, protonation states, and cavity), whereas the latter allows the customization of some of such parameters. The novelty of these two approaches is that, regardless of the user or the computational facilities, reproducible inputs, and consequently reproducible QM/MM models, are guaranteed when the same parameters are imposed.
a-ARM generates QM/MM models of rhodopsin proteins (Fig. S15-B of the supplementary material) using some of the [Open]Molcas features described above, e.g., the speed-up due to the Cholesky infrastructure. Actually, a-ARM employs a particular QM/MM approach only available on the [Open]Molcas/TINKER interface (espf module). In the framework of the ESPF method,143 the QM part of the chromophore directly interacts with the MM electrostatic potential (usually generated by permanent and sometimes induced atomic multipoles) through one-electron operators whose expectation values represent the QM charge distribution of the chromophore. It is noteworthy that the ESPF method scales with the size of the QM subsystem only, making it particularly efficient for very large molecular systems such as rhodopsin proteins. As described in Ref. 150, the chromophore nonbonding and bonding interactions are modeled based on AMBER94 rules, where a customized set of AMBER94 parameters is used for the van der Waals parameters for RET. Partial charges are computed as AMBER-like RESP charges, by using a different set of parameterized charges for each conformation of the RET.
Briefly, in order to obtain 10 independent QM/MM models, the a-ARM input is treated starting with N = 10 different random seeds that provide 10 independent sets of initial velocities, by the following actions (see Fig. S15 of the supplementary material): optimization of crystallographic water molecules and addition of hydrogen atoms to polar residues, using DOWSER;159 molecular mechanics energy minimization and simulated annealing/molecular dynamics relaxation, employing GROMACS;160 and geometry optimization and single point energy calculation at the CASSCF(12,12)/AMBER and CASPT2(12,12)/6-31G(d) [also MS-] levels, respectively,161 using the [Open]Molcas4/TINKER162 interface. For the output 10 replicas of the final equilibrated gas-phase and globally uncharged monomer QM/MM a-ARM model, the first vertical excitation energies () between the ground state (S0) and the first singlet excited state (S1) are calculated. The final a-ARM result is the average of the 10 values. A detailed explanation of the a-ARM protocol workflow is provided in Refs. 150 and 151.
The validation of the a-ARM protocol for the prediction of trends on () was performed by using a benchmark set of 44 rhodopsins (25 of wild type and 19 mutants).151 The a-ARMdefault approach proved to be capable of reproducing the values for 79% (35/44), with an error lower than 3.0 kcal/mol (0.13 eV), whereas the other 21% were successfully obtained with the a-ARMcustomized approach (i.e., changing the protonation states) (see Fig. 5 in Ref. 151). Figure 6 reports the current state of the art for a-ARM-obtained QM/MM models of rhodopsins. As shown, the a-ARM protocol demonstrates a very good agreement for the prediction of trends in excitation energies for rhodopsins, which structure was obtained from either x-ray crystallography or comparative modeling, as well as for rhodopsin variants. Some of these a-ARM QM/MM models have been used as preliminary inputs for more sophisticated calculations such as constant-pH dynamics,163 the construction of fully relaxed QM/MM models embedded in a biological membrane, population analysis, and one/two photon absorption spectrum simulation.158 Finally, a-ARM is available as a user-friendly web interface, called Web-ARM (web-arm.org),164 which provides the same features as the a-ARM protocol to the general audience.
D. From states to spectra
Electronically excited states play important roles in many chemical reactions and spectroscopic techniques. Theoretical simulations of spectra make it possible to directly compare with experiments and to predict their outcomes. Spectra can either be generated through response theory on the ground state or by explicit calculations of excited states. The latter approach is particularly useful for high-level wave function methods such as CASPT2 and/or multiphoton processes where response equations are not yet available. In this section, the unique opportunities to study excited state properties with the rassi module are outlined,31,32 followed by recent examples from transient non-linear spectroscopy and x-ray scattering.
1. Multichromophoric systems and Frenkel excitonic model
Multichromophoric complexes are conveniently studied by reconstructing the electronic Hamiltonian of the complete system in terms of independent but interacting units. A general theoretical framework to describe such systems is the so-called Frenkel excitonic model, for which more details are reported in the supplementary material of the present paper. In brief, one first performs quantum chemistry computations on single sites, i.e., subregions of the entire system, and then computes site–site coupling terms to reconstruct the multichromophoric manifold of electronic states. A MOLCAS application of this computational model to light harvesting complexes, with site–site coupling terms computed at the transition density level (beyond the dipole–dipole approximation), has been demonstrated in recent years.165
2. Transient non-linear spectral signatures
Transient spectroscopy is a versatile tool for resolving molecular dynamics on both the ground and the excited state. It groups an ample body of techniques that combine laser pulses ranging from the IR to the x-ray that allow us to follow dynamical events from the femtosecond to the hour time scale. Its simulation requires to describe the electronic structure of the system, its coupling to the nuclear degrees of freedom, and non-adiabatic effects. Regarding the electronic structure calculations, a bottleneck is associated with the computation of the manifold of excited states and the transitions between them at different molecular geometries. These are the physical observables that determine the energy position and intensity of the recorded signals and reveal the system dynamics in the experimental measurements. Novel unique functionalities in OpenMolcas aim at facilitating the computation of these observables. Recently, we exploited some of these features in simulations of transient spectroscopy on a number of conjugated and aromatic compounds.165–171
a. UV–Vis transient spectroscopy.
The simulation of UV–Vis signals in transient absorption spectroscopy requires to compute the manifold of higher-lying bound valence excited states.172 Despite not being directly involved in the photoinduced dynamics, they serve as state specific probes (i.e., “spectator” states) that allow following the system evolution in real time. The electronic structure was computed at few selected geometries (excited state minima and conical intersections) within the RASSCF/RASPT2 framework. This strategy allows us to expand the active space beyond the full π-valence space, a requirement of paramount importance for obtaining quantitative results when dealing with higher lying excited states.172,173 The point-group symmetry (C2h) and the parallelized Cholesky decomposition facilitated the simultaneous treatment of tens of excited states in the state-averaging formalism. Transition dipole moments, RASPT2 energies, and (numerical) gradients were then used to simulate the system time-resolved non-linear spectroscopy via the Spectron code.174,175 The resulting spectra [Fig. 7(a)] show excellent agreement with state-of-the-art experiments, revealing photoinduced absorption (PA) features of the “spectator” states whose intensity beatings are an indicator of the vibrational dynamics.166 On the basis of these simulations, we could show that near-UV irradiation of azobenzene involves a radiative-to-kinetic energy transfer into CNN in-plane bending modes, triggering an ultrafast nonproductive decay channel that explains the experimentally observed reduced quantum yield of the trans → cis isomerization.
b. X-ray based transient spectroscopy.
The photoinduced dynamics of a variety of systems can also be followed employing x-ray pulses. In NEXAFS (near edge x-ray absorption fine structure) spectroscopy, these are sent on an already excited sample with a controlled time delay, inducing a transition from the core orbitals to empty valence orbitals. These core-to-valence transitions may be used to reveal the system dynamics in a state-specific way.176 To this aim, we explored the azobenzene nπ* state dynamics by computing the N1s → valence transitions at selected geometries along the minimum energy path connecting the Franck–Condon point and the lowest energy conical intersection with the ground state. We made use of the recently implemented projection technique called highly excited states (HEXSs), which allows the selective target of core-hole configurations and the straightforward calculation of the transition dipole moments between valence and core-excited states (via the rassi program).171
The simulated spectra, shown in Fig. 7(b), demonstrate the sensitivity of NEXAFS to the key molecular modes involved in the azobenzene isomerization on the nπ* surface: eventually, we showed that some specific pre-K-edge features are particularly suitable for tracking coherent ultrafast CI-mediated non-adiabatic dynamics, as they show molecular mode specific changes, and appear in a background-free spectral window.
3. X-ray scattering in transition-metal complexes
Coordination complexes of first-row transition metals participate in a wide range of processes, such as electron transfer and spin crossover, and can act as efficient catalysts. X-ray spectroscopy is widely used to study these complexes due to its elemental specificity and sensitivity to the electronic structure. An excellent example of the strength of the RASSI approach is modeling of the two-photon resonant inelastic x-ray scattering (RIXS) process. When performed at the metal L-edge, a 2p → 3d excitation is followed by the decay of a 3d electron refilling the 2p hole; the energy difference between incident and emitted photons, i.e., the energy loss, gives the valence-excitation energies of the final state. The two-dimensionality of RIXS grants access to intense metal-centered transitions and spectrally separable metal–ligand excitations, which makes it a powerful probe of metal–ligand interactions.
RIXS spectra are affected by both 2p and 3d spin–orbit coupling, as well as correlation between 2p and 3d electrons and strong relaxation due to the core-hole, which makes the RASSI approach highly suitable.178–180 Furthermore, in response theory, the RIXS process is a relativistic two-photon process, but with RASSI, the spectrum can be generated by combining individual transition dipole moments between initial, intermediate, and final states. The same procedure also works for valence excited species, effectively simulating three-photon optical pump-x-ray probe processes, as illustrated in Fig. 8. RASPT2 simulations have been used for orbital-specific mapping of the femtosecond ligand exchange dynamics of Fe(CO)5 in solution.181 Another example is the prediction of the fingerprints of electronic, spin, and structural dynamics in photoexcited ferricyanide.177 By including terms beyond the electric dipole approximation, metal K pre-edge (1s → 3d) RIXS can also be described.182
4. X-ray absorption spectra of actinide complexes and excited state bonding analyses
The Natural Bond Orbital (NBO) toolkit183,184 is widely popular in all branches of chemistry and used, for instance, for chemical bonding analyses within electronic structure calculations. NBO is most often used in conjunction with DFT calculations, but it can accommodate the one-particle (spin-) density matrices from post-HF calculations as well. This includes the complex multi-reference spin–orbit (SO) states generated by the rassi module of [Open]Molcas. Elsewhere,4,185,186 we described a modification of the RASSI property code to generate, from the SO states, various types of one-particle (transition) density matrices in the AO basis that are associated with different types of one-electron property operators [real spin free (type 1), imaginary and/or spin dependent (types 2–4)]. The type 1 density matrices, for example, can be diagonalized in order to obtain spin-independent natural orbitals (NOs) and their occupations for the SO states, which facilitates the analysis of SO effects on metal ligand bond orders and such.187 An interface has been developed recently in a Python code project named eXatomic188 to create NBO inputs from the SO-RASSI type 1 density matrices. Initial applications of this functionality have focused on the chemical bonding in actinide (An) complexes in the context of core-excitation spectroscopy.189,190
X-ray absorption near-edge structure (XANES) experiments on actinide complexes are pursued to obtain insight into the metal–ligand bonding or metal oxidation state. The interpretation of XANES spectra in terms of An–ligand chemical bonding should be accompanied by theoretical calculations. The accurate modeling of XANES of An complexes requires the calculation of multireference correlated valence and core-hole wave functions that account for the core-hole interactions with the valence shell and the SO coupling (SOC). The chemical bonding information from the complex spin–orbit coupled core-hole states probed by the XANES experiments can then be obtained in terms of real NOs of the type mentioned above or localized combinations such as those generated by the NBO algorithms. A compact description for the different electronic states may be best achieved by describing each state with its own optimal set of orbitals. This leads to the concept of orbital relaxation, for instance, upon an electronic excitation, which may alternatively be viewed as the impact of the differential electron correlation between the two states.
RASSCF XANES calculations with [Open]Molcas have been reported for a variety of transition metal complexes178,179,191–193 (see also Sec. III D 3). The applicability of the RASSCF/RASSI protocol in modeling XANES spectra of An complexes has only recently been demonstrated.189,190 Computational details are available in these references and the supplementary material of the present article. Valence and different groups of core-hole states are ideally converged in separate, state-averaged RASSCF calculations by either exploiting symmetry and/or the HEXS keyword. The rassi module then produces spin–orbit coupled manifolds and transition dipoles (and other multipoles) needed to generate the XANES spectrum.
The eXatomic [Open]Molcas–NBO interface was used in Ref. 189 to rationalize the measured high-resolution An M4,5 XANES spectra of (An = U, Np, and Pu) molecules194 in terms of the An–ligand bonding. The spectra (see Fig. 3 of Ref. 189) feature a main peak assigned to transitions from An 3d mainly to non-bonding (ϕ, δ) 5f orbitals and a satellite at high energy due to excitations from 3d into ligand–An 5f orbitals. In the ground states, the show increasing 5f–σ covalent bonding when going from U to Np to Pu. A concomitant stronger splitting is then expected to lead to an increasing off-set of the satellites from the main peaks when going from U to Pu. However, the experiments and the OpenMolcas calculations show the opposite trend. NBO analyses of the dipole-intense SO core-excited states, using the eXatomic [Open]Molcas–NBO interface, revealed a strong σ/σ* orbital relaxation upon excitation into the orbital such that the σ An–ligand covalency is much reduced in the core-excited state compared to the GS. These results highlight the problem of interpreting XANES and other types of absorption spectra in terms of ground-state orbitals only. The usage of eXatomic is illustrated in the supplementary material.
E. Final considerations for large calculations
Implementation of multiconfigurational methods requires large computational resources. In the supplementary material, we collected known limitations of the different modules in [Open]Molcas and estimated the time required for a typical run.
1. Cost of calculating spectra
Compared to ground-state calculations, the inclusion of large numbers of excited states leads to a significant increase in cost, roughly proportional to the number of states, even with recent improvements in the CI algorithm.88 As an example, for x-ray calculations of iron(III) hexacyanide with a ten-orbital valence space, each combination of irreducible representation and spin multiplicity takes around 300 core hours. More than two thirds of the time is spent on the PT2 step, while the cost of the RASSI calculation is almost negligible.192 The “packaging” of states into independent, task-farmable RASSCF+PT2 calculations therefore constitutes a trivial but efficient parallelization of the simulation protocol, unavailable for methods that, unlike RASSI, rely on a shared state averaging for all the states.
2. Techniques for cost-effective calculations
In order to mitigate the costs of computing energies and properties by means of the highly correlated methods available in [Open]Molcas, a number of techniques have been implemented.195–198 In particular, the use of the so-called frozen natural orbital (FNO) approximation195,196,199 is a very robust technique for reducing the computational costs of any correlated calculation (e.g., CC methods or the PT2 step of the various multiconfigurational methods). With the help of FNO, the number of virtual orbitals required to compute the dynamical correlation energy for ground and excited states can be reduced considerably—and controllably—leading to a massive reduction in the number of wave function amplitudes to be determined. In practice, this means that a calculation for which the cost is dominated by, for example, the PT2 step can be effectively sped up, often by up to one order of magnitude, without sacrificing the accuracy of the computed excitation energies. The keyword FNOC implements this option within caspt2, whereas FNOM in the mbpt2 input is used when trying to benefit from this cost-effective alternative in a subsequent coupled-cluster calculation (chcc and cht3 modules).
IV. CONCLUSIONS AND OUTLOOK ON FUTURE DEVELOPMENT
The change of the development model in MOLCAS code (from a code made by a closed community to an opensource project) unleashed a rapid development of new features and simplified the integration with other computational codes.
The [Open]Molcas open-source code is in many aspects a legacy code. For it to adapt to future soft- and hardware environments, it needs substantial rewriting and modifications. In particular, the use of a multitude of different files and old algorithms makes efficient parallelization impossible.
Near-future extensions of the capabilities of the QCMaquis module to further enable DMRG-related techniques are broad. Some work will focus on exploiting the current capabilities of [Open]Molcas to implement efficiently the time-dependent CASSCF (TD-CASSCF) algorithm.200–202 Thanks to the implementation of the time-dependent DMRG algorithm within the QCMaquis package,203 active spaces with about 20–30 orbitals could be targeted. This would pave the route toward ab initio simulations of time-resolved spectra of complex molecules, including metal complexes. Moreover, we plan to extend the traditional SA-RASSCF-based algorithms designed to simulate x-ray absorption spectra192 to DMRG within the QCMaquis setting. In this way, the L-edge absorption spectra of large metal complexes could be simulated by combining the SA formulation of DMRG44 to optimize a large number of excited states and the MPSSI algorithm47 to properly account for spin–orbit effects. Further developments of QCMaquis in OpenMolcas will also concentrate on the improvement of (parallelized) traditional multiconfigurational methods within the DMRG framework including (i) DMRG-SCF, (ii) DMRG-NEVPT2 and DMRG-CASPT2, and (iii) DMRG-MC-srDFT.204 It is also planned to streamline and clear the use of external libraries such that it will be possible to enhance the performance of QCMaquis by a factor of more than ten.
New approaches for analyzing the most important orbitals or even most important individual excitations will be developed and polished. The possibility to verify an active space selection by the usage of DMRG codes in combination with graphical user interfaces (e.g., AutoCAS) will convert multiconfigurational calculations into automated routine and will allow us to integrate these calculations into existing automated workflows.
Development of the dynamix module will include an improvement of the sampling procedure to generate initial conditions for molecular dynamics simulations. Furthermore, the trajectory surface hopping algorithm will be extended to be used with the CASPT2 method.
The [Open]Molcas environment is constantly evolving, as demonstrated by the varied possible features here described. A common denominator, and an excellence of the program suite, is definitely its capacity of treating complex molecular systems at the highest possible level of theory, both as a stand-alone and as interfaced to other computational codes. Furthermore, constant development ensures the possibility of employing [Open]Molcas to reproduce, rationalize, and predict experimental data obtained employing the latest spectroscopical techniques.
The supplementary material contains the following sections and illustrates the short description of features described in the article and also contains ready to use examples: S1. Overview of main computational codes: (i) short description of computational modules, (ii) flowchart, (iii) description of Molcas input language (EMIL); S2. Frenkel excitonic Hamiltonian; S3. Computation of BChl excitation energies; S4. X-ray absorption calculations; S5. ARM protocol; S6. Examples of DMRG calculations; S7. Tools to select active space; S8. Excited states of lanthanides in solids; S9. Example of calculation of magnetic properties; S10. Molecular dynamics calculations; S11. RASSCF/RASPT2/RASSI methodology for heavy elements; S12. SINGLE_ANISO calculations; and S13. Example of applying the semi-ab initio approach: fragmentation + POLY_ANISO.
All authors contributed equally to this work.
The data that supports the findings of this study are available within the article and its supplementary material.
F.A. acknowledges financial support from the EU-H2020 research and innovation programme under Grant Agreement No. 654360 within the framework of the NFFA-Europe Transnational Access Activity. Part of this work was performed, thanks to computer resources provided by CINECA, under Project No. HPC-EUROPA3 (Grant No. INFRAIA-2016-1-730897), with the support of the EC Research Innovation Action of the H2020 Programme. D.-C.S. and J.A. acknowledge support from the U.S. Department of Energy, Office of Basic Energy Sciences, Heavy Element Chemistry program, under Grant No. DE-SC0001136. S.B. acknowledges support from the Swiss National Science Foundation (Grant No. P2SKP2_184034). A.B. is grateful for support from ETH Zurich (ETH Fellowship No. FEL-49 18-1). M.R. acknowledges support from the Swiss National Science Foundation (Project No. 200021_182400). L.D.V., L.P.-G., and M.Ol. acknowledge a MIUR (Ministero dell’Istruzione, dell’Università e della Ricerca) grant “Dipartimento di Eccellenza 2018-2022.” M.Ol. acknowledges NSF Grant No. CHE-CLP-1710191. M.D. and M.L. acknowledges support from the Olle Engkvist Foundation. E.D.L. and V.V. acknowledge computational resources provided by SNIC through LUNARC and NSC. T.B.P. acknowledges support from the Research Council of Norway through its Centres of Excellence scheme, Project No. 262695, and through Research Grant No. 240698. K.P. acknowledges financial support from KU Leuven through Grant No. C14/15/052. L.S. acknowledges financial support from Ministerio de Economía y Competitividad, Spain (Dirección General de Investigación y Gestión del Plan Nacional de I+D+i, Grant No. MAT2017-83553-P). J.S.-M. acknowledges support from the EU-H2020 Marie Curie Actions (AttoDNA, FP8-MSCA-IF, Grant No. 747662). I.S. gratefully acknowledges funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant No. 678169 PhotoMutant). L.U. and X.G. gratefully acknowledge scientific Grant Nos. R-143-000-A80-114 and R-143-000-A65-133 from the National University of Singapore. Computational resources of the NSCC (ASPIRE-1, Grant No. 11001278) were used for this study.