While bonding molecular orbitals exhibit constructive interference relative to atomic orbitals, antibonding orbitals show destructive interference. When full localization of occupied orbitals into bonds is possible, bonding and antibonding orbitals exist in 1:1 correspondence with each other. Antibonding orbitals play an important role in chemistry because they are frontier orbitals that determine orbital interactions, as well as much of the response of the bonding orbital to perturbations. In this work, we present an efficient method to construct antibonding orbitals by finding the orbital that yields the maximum opposite spin pair correlation amplitude in second order perturbation theory (AB2) and compare it with other techniques with increasing basis set size. We conclude the AB2 antibonding orbitals are a more robust alternative to the Sano orbitals as initial guesses for valence bond calculations due to having a useful basis set limit. The AB2 orbitals are also useful for efficiently constructing an active space, and they work as good initial guesses for valence excited states. In addition, when combined with the localized occupied orbitals, and relocalized, the result is a set of molecule-adapted minimal basis functions that is built without any reference to atomic orbitals of the free atom. As examples, they are applied to the population analysis of halogenated methane derivatives, H–Be–Cl, and SF6, where they show some advantages relative to good alternative methods.
I. INTRODUCTION
Virtual orbitals are important in chemistry as they play a central role in molecular orbital theory. From a computational standpoint, orbital mixing between occupied orbitals and virtuals determines the optimal occupied orbitals in mean-field Hartree–Fock theory1–3 and Kohn–Sham density functional theory.4–7 In wavefunction theory, electron correlation is typically described by amplitudes, such as the pair correlations describing the simultaneous promotion of two electrons from occupied to virtual orbitals. The virtual orbitals span the unoccupied space, and the choice of representation is important. Canonical virtual orbitals are delocalized levels that are appropriate for electron attachment. Localized virtuals, such as the redundant non-orthogonal basis of atomic orbitals projected into the virtual space,8,9 permit the development of efficient local correlation methods because the amplitude tensors describing correlation become sparse.10 Other prescriptions for localized orthogonal virtuals exist,11–13 as well as proposals to form sets of virtuals that are specifically optimized for correlations that involve a given occupied level, as will be discussed below.
The virtual orbitals span the entire unoccupied space, which can be contrasted with the intuitive notion of antibonding orbitals that exist in 1:1 correspondence with bonding orbitals. The 1:1 correspondence is evident from constructive and destructive interference of a pair of 1s-type functions on two hydrogen atoms in H2,
Antibonding orbitals themselves play a central role in describing chemical reactivity14–19 of one molecule with another through donor–acceptor interactions between a high-lying occupied orbital of one species and a low-lying antibonding orbital of the other. Frontier orbital theory is constructed on these ideas. Antibonding orbitals also play an important role in describing strong electron correlations. A simple example is the stretching of the H–H bond, which leads, in a minimal basis, to a strong increase in the amplitude for excitation, which breaks the bond.
While the antibonding orbitals are intuitive,16–18,20 it is nonetheless not routine to extract them from modern quantum chemistry calculations performed in extended basis sets, which return canonical orbitals. By contrast, in a minimal basis description of hydrocarbons, the space of antibonding orbitals is naturally spanned by the canonical virtual orbitals. In larger basis sets, however, different methods have been developed to extract the antibonding orbitals, often by relying on projection back onto some chosen minimal basis,13,21–25 typically a tabulated one for a specific free-atom Hartree–Fock energy eigenstate. For example, Schmidt et al. found antibonding orbitals by performing an SVD (singular value decomposition) of the overlap between the virtual orbitals and a minimal basis to produce valence virtual orbitals.13 Some methods have been developed to produce a minimal basis specifically adapted to a molecular environment,26,27 but those are nonlinear optimization procedures that are often iterative and costly. One famous method that does not rely on a reference minimal basis is the Natural Bond Orbital (NBO) procedure,14,15 where the density matrix coupling between multiple atom-tagged orbitals is utilized to produce bonding and antibonding orbitals. However, atom tagging of basis functions plays a critical role in the NBO procedure—in fact, the standard NBO method is specific to atom-centered orbital (AO) basis calculations. Few methods cut the umbilical cord to the minimal basis in producing antibonding orbitals. Aside from the Sano antibonding orbitals28 (discussed below), Foster and Boys29 suggested oscillator orbitals which are virtual orbitals with the maximum dipole from localized occupied orbitals.
Local correlation has been intensively studied,8,9,25,30–38 leading to the conclusion that dynamic correlation can be well approximated using domains of localized virtual orbitals that are in the same spatial region as a localized occupied orbital.8,9 This reduces the fourth rank tensor of pair correlation amplitudes to an asymptotically linear number of significant elements. Nevertheless, all virtual orbitals are required for post-SCF methods, such as coupled cluster theory, that recover dynamic correlation, rather than just the much smaller set of valence virtual orbitals. By contrast, static or strong correlation resides mostly in the valence virtuals (i.e., the antibonding orbitals). Thus, complete active space (CAS) methods that seek to describe strong correlation require only a description of the valence virtuals. Methods in this class include CASSCF,2,39–42 spin-coupled valence bond (VB),43–46 and approximations, such as generalized valence bond (GVB)47 and coupled cluster valence bond (CCVB).48–50 CAS, GVB, and CCVB methods, thus, need an initial guess for the antibonding orbitals. We do note that the orbitals associated with key amplitudes for strong correlation are not necessarily spatially localized.51–54
One method used to obtain initial guess antibonding orbitals is the so-called Sano procedure.28 In brief, after localizing a set of occupied orbitals using standard methods,12,55–57 the Sano procedure finds the virtual orbital that has maximum exchange interaction with each given localized occupied orbital. The idea of maximizing exchange is very old58,59 and comes from its predecessor, the modified virtual orbitals60,61 (note that modified virtual orbitals have been since used to refer to any noncanonical set of virtual orbitals62). The resulting orbitals are symmetrically orthogonalized to yield a set of valence antibonding orbitals. This method has worked quite well for GVB-PP and CCVB calculations in moderately sized basis sets.49–51,63,64 In this work, we will show that the Sano procedure shows undesirable behavior with increase in the size of the AO basis set. This motivates the need for a better behaved alternative. We suggest that an antibonding orbital that gives the largest first order perturbation amplitude for exciting an electron pair from a given bonding orbital is a suitable alternative. A range of numerical results confirm this to be the case. These antibonding orbitals can be viewed as a specific instance of orbital specific virtuals.30–32
II. THEORY
A. Defining the set of antibonding orbitals
Solving the mean field Hartree–Fock (HF) equation self-consistently gives the lowest energy, single Slater determinant electronic wavefunction. To solve the many-body problem, one needs to include the missing correlation energy.65 Second order Møller–Plesset (MP2) perturbation theory66,67 offers a useful and computationally inexpensive approximation to treat the correlation, yielding the following expression in the case of restricted HF orbitals:
where
This expression folds together contributions from the correlation of two electrons of opposite spin (OS), with amplitudes
together with the contribution of correlations of electrons with the same spin. The two-electron repulsion integrals (ERIs) over spatial orbitals describing the interaction of each occupied with each virtual are
Let us collect the ERIs associated with occupied orbital i into the symmetric matrix Ki, where
Here, Ki is positive semi-definite; thus, the eigenvector belonging to its largest eigenvalue will correspond to the virtual level with the strongest exchange interaction with occupied level i. That is the Sano prescription28 for finding the antibonding orbital associated with i.
We can likewise define a matrix of second order pair correlation amplitudes, Ti, associated with a given occupied orbital,
This matrix is negative semi-definite since the denominators are negative for the ground state determinant. We can, therefore, find the largest OS pair-correlation amplitude as the lowest eigenvalue, of Ti, and the associated virtual orbital is the eigenvector, with expansion coefficients in the original virtual basis,
Upon repeating for each occupied level, most naturally in a localized representation, and using similar arguments to Kapuy’s zeroth order in the Fock and second order in correlation approximation,68,69 we suggest that this is an appropriate non-iterative way to find a set of antibonding orbitals in 1:1 correspondence with the bonding orbitals. This approach may be contrasted with Sano’s suggestion to obtain the virtual orbital with maximum repulsion from the bonding orbital by solving the eigenvalue problem for each orbital using Ki rather than Ti. The inclusion of orbital denominators in Eq. (9) provides a clear physical meaning of the antibonding orbital as having strongest pair correlation amplitude with its parent bonding orbital. As will be demonstrated numerically later, this property also dramatically improves basis set convergence relative to the Sano definition.
We will refer to these virtual orbitals as “second order antibonding” (AB2) MOs to emphasize their second order origins and their 1:1 correspondence with bonding MOs. In terms of existing literature, the AB2s are directly related to the “orbital-specific virtual” (OSV) orbitals30,31 that are sometimes used to evaluate the correlation energy. Each AB2 orbital is the most important OSV for a given localized bonding orbital. Of course, the reason for selecting the amplitudes associated with MP2 is computational efficiency. The exact limit of this procedure would be to diagonalize the corresponding exact (i.e., from Full CI) doubles amplitudes, , via Eq. (9).
A closely related alternative that has some advantages over Eq. (9) is to define the space of valence antibonding orbitals from the virtual–virtual block of the MP2 one-particle density matrix,70,71
Upon diagonalizing, the (M − O) eigenvectors with largest occupation numbers span the valence antibonding orbital space and, together with the occupied space, complete the span of a molecule-adapted minimal basis. Localization of these valence virtual orbitals will then yield an alternative to the localized virtuals above. The advantage of this approach is for cases where there is no simple 1:1 mapping between bonding and antibonding orbitals, as discussed more later.
The virtual orbitals obtained this way are the valence subset of the “frozen natural orbitals” (FNO),70,71 and we emphasize that they are not generally localized in contrast to the AB2 MOs. They are close to the virtual natural orbitals associated with PMP2 as defined by the gradient of the MP2 energy,72–74 with the caveat that only the virtual–virtual block is diagonalized.
B. Population analysis using the effective minimal basis
Finding a suitable set of antibonding orbitals provides the missing part of the valence space not spanned by the occupied orbitals. Thus, the union of the occupied space and the space of antibonding orbitals spans the space of an effective minimal basis. It is well accepted that full valence CASSCF wavefunction is spanned by an effective minimal basis within the molecule for this reason.39,40 Accordingly, localizing the union of the occupied orbitals with the antibonding orbitals reveals a set of molecule-adapted atomic orbitals (MAOs),75,76
For a given pair of well-localized bonding and antibonding orbitals (say σ and σ*), this procedure amounts to inverting Eq. (2) to discover the corresponding MAOs even though we may be using a very extended basis, or even a non-atom-centered basis, such as plane waves or a real-space grid, to perform the calculations.
The resulting MAOs, χ, are thus expressed in terms of the AOs, ω, as χ = ωCMAO. The MAOs are orthogonal, and they typically localize onto atoms. The MAOs exactly span the space of the occupied orbitals, and they can be used for population analysis among other things.26,40,77–83 Let us denote p as an MAO label for χp, which is centered at . Using A, B as atom labels and given that the density matrix in the MAO basis is , one can make a population analysis as follows:
where QA and ZA are the atomic charge and the nuclear charge, respectively. Such a population analysis has no dependence on atom-tagging of the underlying basis and does not rely upon a reference minimal basis. Therefore, it generalizes nicely to plane wave basis and real-space methods. If the orbitals are unrestricted, we construct antibonding pairs and the MAOs for the alpha and beta spin spaces independently.
This approach to generating an MAO representation does have some limitations. First, it assumes that there is a 1:1 mapping between bonding and antibonding orbitals. One class of exceptions can be found in electron deficient molecules (e.g., LiH will not recover 2p-like orbitals on Li and BH3 will not recover a 2pz orbital on B). Such species can be said to have “virtual lone pairs,” whose identification is a problem that we shall not address here. A second class of exceptions can be found in species such as cyclopentadiene anion, where there are three semi-localized π occupied orbitals, but the valence space only admits two antibonding orbitals. Third, in symmetric systems with multiple Lewis structures (e.g., C6H6), the MAOs will derive from localized bonding and antibonding orbitals corresponding to a single Lewis structure and may not reflect the indistinguishability of the atoms. Broadly, we can say that this MAO approach is readily applicable to neutral molecules with a single dominant Lewis structure.
C. Implementation details
Computational efficiency is very important for quantum chemistry in order to treat molecules that are as large as possible for given computational resources (e.g., computer speed and memory size). Our AB2 implementation uses exact four-center integrals in a basis of Gaussian-type atomic orbitals (other alternatives such as using auxiliary basis expansions can also be readily implemented). Each step with its computational complexity is shown in Fig. 1. Note that for the figure and the discussion here, we use O, V, and N for the number of bonding orbitals, virtual orbitals, and AO basis functions, respectively. We start by making a pseudo-density for each bonding orbital, i. To generate the two-electron integrals (μν|λσ), Q-Chem84 only generates significant μν (i.e., AO basis) pairs to some target numerical cutoff, yielding a total that we term as (NN)cut. (NN)cut scales quadratically [i.e., (NN)cut ≈ N2] for small systems but approaches linear scaling [i.e., (NN)cut ∝ N] in the limit of large system size. The integrals are made and contracted on-the-fly with the bonding orbitals’ pseudo-densities to make bonding-specific exchange integrals with computational effort scaling as . The matrices are then transformed into the virtual space as in Eq. (7) with computational cost scaling as . Asymptotically, this is the dominant step in this method unless more careful thresholding is considered.85 Then, we divide by the appropriate denominator to get in Eq. (8) [with effort]. Finally, we diagonalize Ti for each bonding orbital to get the AB2 antibonding orbitals as in Eq. (9) with effort. Note that the last step can in principle be made since we are only solving for the eigenvector with the largest amplitude in each matrix. We can contrast this procedure with the modified FNO approach, which has a dominant computational step that scales as the fifth power of molecule size: constructing Pab in Eq. (10) with complexity of .
A chart illustrating the mathematical steps needed to construct AB2 orbitals with the appropriate computational complexity for each step indicated. Here, O, V, N, and (NN)cut refer to the number of occupied orbitals, virtual orbitals, AO basis functions, and significant AO pairs, respectively.
A chart illustrating the mathematical steps needed to construct AB2 orbitals with the appropriate computational complexity for each step indicated. Here, O, V, N, and (NN)cut refer to the number of occupied orbitals, virtual orbitals, AO basis functions, and significant AO pairs, respectively.
One reason for the efficiency of the AB2 approach compared to FNO comes from focusing on the bonding orbitals one at a time rather than the whole occupied space at once. It is then important to start by localizing the occupied space, which is known to be a cubic scaling iterative procedure for, e.g., the Boys and Pipek–Mezey localization measures.86,87 Then, one must also distinguish between localized orbitals with different character: specifically core, bonding, and nonbonding, e.g., lone pairs. Our implementation uses an automatic bonding detection option that runs before AB2. The detection process is simply determined by Pipek’s delocalization measure88 on Mulliken charges, where measures amounting to 1 indicate an orbital localized on an atom (core or nonbonding) and measures around 2 correspond to orbitals split between two atoms.
D. Computational details
All methods discussed here were implemented in a developer version of Q-Chem 5.84 The geometries used for molecular calculations were optimized at the ωB97X-D/def2-TZVPD level of theory. All geometries are included in the supplementary material.
III. RESULTS AND DISCUSSION
We will compare different approaches to generating effective antibonding orbitals: In particular, we are interested in whether the second order antibonding (AB2) MOs significantly improve upon the Sano antibonding orbitals, as measured by usage-relevant metrics obtained from a set of numerical experiments. We will first examine orbital plots, orbital energies, and orbital variances. We then test the applicability of Sano and AB2 MOs to several valence correlation methods: coupled cluster valence bond (CCVB),49,50 complete active space configuration interaction (CASCI),89–92 and complete active space self-consistent field (CASSCF).39–42 Next, we look into their uses for describing valence excited states. For basis set, we are using the Dunning basis set family93 and Ahlrichs.94 These are available in Q-Chem 5.3 with an automated detection of bonding orbitals.
A. Orbitals, orbital energy, and orbital variance
We start by looking at the σ* orbital of H2, as shown in Fig. 2, evaluated by the Sano procedure, the AB2 approach, and CAS(2,2) (performed as one-pair perfect pairing). It is visually clear that the Sano σ* orbital is contracting as the basis set is improved. Figure 3 displays (1) the orbital energy (diagonal matrix element of the Fock operator) and the variance (⟨r2⟩ − ⟨r⟩2) of the σ bonding orbital and (2) the Sano and AB2 models of the antibonding orbital. The variance confirms that the size of the Sano σ* orbital contracts with basis size, while its orbital energy increases (reflecting increasing electron confinement) unsatisfactorily. By contrast, the behavior of the AB2 orbital is very close to that of the bonding orbital, with pleasing stability in both energy and variance as the basis set is converged toward completeness. The stark difference is due to Sano orbitals including high energy orbitals to maximize the exchange interaction whereas AB2 biases against those higher energy orbitals with the denominator penalty in Eq. (8).
Comparison of σ* orbitals predicted by Sano, AB2, and CCVB in H2 with increasing size of the basis set. For this problem, CCVB is identical with (2,2) CASSCF. Orbitals were plotted with ten contour isovalues logarithmically spaced [0.1,10], five for each phase.
Comparison of σ* orbitals predicted by Sano, AB2, and CCVB in H2 with increasing size of the basis set. For this problem, CCVB is identical with (2,2) CASSCF. Orbitals were plotted with ten contour isovalues logarithmically spaced [0.1,10], five for each phase.
Comparison of orbital energy (diagonal matrix element of the Fock operator) for σ, Sano, and AB2 orbitals for H2 with increasing size of the basis set. Bottom graph compares the variance.
Comparison of orbital energy (diagonal matrix element of the Fock operator) for σ, Sano, and AB2 orbitals for H2 with increasing size of the basis set. Bottom graph compares the variance.
Next, we look into C2H4, where the localization scheme of Boys produces mixed σ − π orbitals (sometimes called banana bonds), while Pipek–Mezey predicts separate σ and π orbitals. We will, therefore, use Pipek–Mezey orbitals whenever we encounter π orbitals. Inspecting the σ C–C bond in C2H4 in Fig. 4 shows that the shape of both the occupied Pipek–Mezey and converged CCVB bonding orbitals does not change much upon increasing the size of the basis set. By contrast, when looking at σ* in Fig. 4, we see even poorer behavior of the Sano C–C antibonding orbital as a function of basis set size than we did for H2. This is confirmed in Fig. 5 where we compare the orbital energy and the orbital variance of the bonding and the antibonding C–C σ orbital in C2H4. The Sano σ* orbital does not converge with the size of the basis set, with the variance decreasing and the energy increasing. By contrast, the AB2 σ* orbital converges rapidly both in terms of energy and variance for similar reasons to before.
Comparison of σ orbitals predicted by Pipek–Mezey localization with those found by converging CCVB (top row) and σ* orbitals predicted by Sano, AB2, and CCVB (bottom row) in C2H4 with increasing size of the basis set. Orbitals were plotted with ten contour isovalues logarithmically spaced [0.1,10], five for each phase.
Comparison of σ orbitals predicted by Pipek–Mezey localization with those found by converging CCVB (top row) and σ* orbitals predicted by Sano, AB2, and CCVB (bottom row) in C2H4 with increasing size of the basis set. Orbitals were plotted with ten contour isovalues logarithmically spaced [0.1,10], five for each phase.
Comparison of orbital energy (diagonal matrix element of the Fock operator) for the C–C σ orbital with the σ* (left) and π orbital with the π* (right) predicted by Sano and AB2 in C2H4 with increase in the size of the basis set. It can be seen that Sano orbitals do not converge with increase in the basis set cardinality whereas AB2 converges much quicker especially for the σ orbital. Bottom graphs compare the spatial variance for the same orbitals where Sano contracts orbitals further with increase in the basis set size and less so for the π orbital.
Comparison of orbital energy (diagonal matrix element of the Fock operator) for the C–C σ orbital with the σ* (left) and π orbital with the π* (right) predicted by Sano and AB2 in C2H4 with increase in the size of the basis set. It can be seen that Sano orbitals do not converge with increase in the basis set cardinality whereas AB2 converges much quicker especially for the σ orbital. Bottom graphs compare the spatial variance for the same orbitals where Sano contracts orbitals further with increase in the basis set size and less so for the π orbital.
In Fig. 5, we compare the orbital energy and the orbital variance of the bonding and the antibonding orbitals for the C–C π in C2H4. The shortcomings of Sano seem to be much less severe in π* orbitals. We believe this is due to the diffuse nature of the π orbitals making the maximum exchange, thus spatial locality, sufficient to describe the π*. However, we can still see that the orbital energy and variance do not converge for Sano while they do for AB2, and they converge to drastically different orbital energy and orbital variance.
The quantitative advantage of the AB2 antibonding orbitals relative to the Sano orbitals seen so far can also become qualitative advantages in systems with more complex electronic structure. One such example is Cu2, which, considering that the valence state of Cu can be taken as 3d104s1, is isoelectronic to H2. The σ orbital (HOMO) of Cu2 is shown in the upper panel of Fig. 6, along with the optimized correlating orbital from CCVB, as well as the Sano and AB2 antibonding orbitals. Maximizing exchange results in a Sano antibonding orbital that resembles an empty π-bond between the two metals. By contrast, the AB2 and CCVB orbitals look qualitatively identical.
Comparison of the shape of the orbitals in Cu2 where the σ bond is used to produce Sano and AB2 antibonding orbitals. While the AB2 method produces very similar orbitals to CCVB, the Sano approach fails to give a qualitatively correct antibonding orbital.
Comparison of the shape of the orbitals in Cu2 where the σ bond is used to produce Sano and AB2 antibonding orbitals. While the AB2 method produces very similar orbitals to CCVB, the Sano approach fails to give a qualitatively correct antibonding orbital.
B. CCVB iterations
The CCVB method is a simple low-scaling approximation49–51 to exponentially scaling spin-coupled valence bond theory that can separate a system of 2n electrons into fragments with spin purity, provided that UHF can also reach the dissociation limit. One price to be paid for these advantages is a challenging orbital optimization problem: The CCVB orbitals have no invariances to rotations within the active space, in contrast to CASSCF. Hence, a good initial guess is very important. Sano orbitals28 have been commonly used as a starting guess for valence bond methods,51,95 such as CCVB, due to their resemblance to antibonding orbitals. For simple alkanes, we examine how many iterations are needed to converge a CCVB calculation with Sano and compare with AB2 shown in Fig. 7 with increase in the molecule size and the basis set size (using the Dunning cc-pVXZ sequence of basis sets93). Since the double-zeta basis set does not involve many high energy orbitals, both methods converge at almost at the same speed. Upon increasing the size of the basis set, overly contracted Sano orbitals deviate more from the optimal antibonding orbitals and, therefore, require far more iterations to converge. For this reason, we recommend using AB2 orbitals as a starting guess for valence bond methods instead of the Sano orbitals.
Number of iterations needed to converge CCVB calculations on alkanes of increasing size, with increasing ζ of the basis set. This shows a relatively constant number of iterations needed for AB2 regardless of system size, while the number of iterations rise unfavorably for the Sano guess in large basis sets. Geometric direct minimization (GDM)95,96 is used to determine the steps.
Number of iterations needed to converge CCVB calculations on alkanes of increasing size, with increasing ζ of the basis set. This shows a relatively constant number of iterations needed for AB2 regardless of system size, while the number of iterations rise unfavorably for the Sano guess in large basis sets. Geometric direct minimization (GDM)95,96 is used to determine the steps.
C. CAS methods
The relative fraction of correlation energy recovered using AB2, Sano, FNO, or other choices for antibonding orbitals to complete an active space can help us discern which ones are most appropriate to use for configuration interaction with fixed orbitals as well as for a CASSCF initial guess. As a simple example, we stretch the C–C bond in C2H4 while keeping the geometry of the methylene groups fixed at those of the equilibrium ground state geometry of ethene. Looking at Fig. 8, we see that canonical virtual orbitals capture less and less correlation as the def2 basis set is improved from SVP to TZVPP to QZVPP. We also observe that the gap between Sano and AB2 orbitals increases with increase in the size of the basis set. Finally, we can see that FNO and AB2 orbitals perform almost identically and are the best choices, with AB2 having lower computational costs. Nonetheless, it is encouraging that both follow the CASSCF energies closely.
CAS-CI (12e,12o) for C2H4 using canonical, Sano, AB2, and frozen natural orbitals in three different basis sets. Restricted HF (RHF) and unrestricted HF (UHF) curves without any correlation correction are shown for comparison.
CAS-CI (12e,12o) for C2H4 using canonical, Sano, AB2, and frozen natural orbitals in three different basis sets. Restricted HF (RHF) and unrestricted HF (UHF) curves without any correlation correction are shown for comparison.
Since the AB2 and FNO orbitals seem to capture quite a lot of the static correlation, we sought to compare them to CASSCF orbitals. In Fig. 9, we are comparing the smallest singular value of the overlap matrix between the CASSCF orbitals and those of canonical, Sano, AB2, and FNO at the optimized geometry of C2H4. Once again, the canonical orbitals become dramatically worse with increasing basis set size. Both Sano and FNO become very slightly worse with increase in the size of the basis set, that is, by increasing zeta, while AB2 seems to be nearly basis set-independent.
The smallest singular value from the overlap of CASSCF (12e,12o) orbitals with those from Sano, AB2, FNO, and canonical orbitals. Canonical orbitals with the lowest energy and FNOs with the highest occupancy were selected. Canonical orbitals differ strongly from optimized CASSCF orbitals while AB2 orbitals have the highest agreement.
The smallest singular value from the overlap of CASSCF (12e,12o) orbitals with those from Sano, AB2, FNO, and canonical orbitals. Canonical orbitals with the lowest energy and FNOs with the highest occupancy were selected. Canonical orbitals differ strongly from optimized CASSCF orbitals while AB2 orbitals have the highest agreement.
D. Excited states
Since the AB2 orbitals seem to be good guesses for GVB methods and yield orbitals close to converged CASSCF orbitals, this led us to believe that they could also provide a good description of valence excited states. State-specific methods, such as orbital-optimized DFT (OO-DFT),97 need a suitable starting guess, as convergence is typically to the nearest stationary point;98 so, we used Sano and AB2 guesses for the π → π* excitation in methanal (H2CO). For our purposes, we employed the square gradient minimization method,98 which looks for saddle points in the orbital Hilbert space to converge restricted open-shell Kohn–Sham (ROKS).97,99,100 In Fig. 10, we compare the overlap of the π* orbital from converged singlet open-shell HF calculations with Sano and AB2 orbitals. For this excitation, AB2 orbitals overlap the optimized orbital by at least 0.9 and vary minimally with the size of the basis set. We note here that aside from the double-zeta case, converging the excited state starting from the Sano orbital sometimes lands on a Rydberg excited state, while AB2 landed on the correct π* state in all cases.
The overlap of the converged ROKS-HF antibonding orbital with the Sano and AB2 initial guesses in H2CO for the π → π* excitation. The π* orbital is well described by AB2 regardless of basis set size.
The overlap of the converged ROKS-HF antibonding orbital with the Sano and AB2 initial guesses in H2CO for the π → π* excitation. The π* orbital is well described by AB2 regardless of basis set size.
E. Population analysis
Antibonding orbitals belong to the valence space and contribute to making a minimal basis that can be used to gain insight into chemistry, for instance via population analysis to assign effective charges on each atom. The population analysis we present here is constructed from the union of the occupied space and the antibonding orbitals without dependence on the basis set used. To study our atomic charge predictions and compare it to some other methods in the literature, we look into fluoro- and chloro-substituted methanes, which have been studied theoretically101–103 and experimentally.104,105 These simple systems are nonetheless interesting because they manifest the effect of substituting electron withdrawing halogen atoms of different sizes and electronegativities for hydrogen in methane. How consistent or inconsistent are different atomic population analysis schemes as descriptors of these chemical substitutions?
In Fig. 11, we examine the effect of progressive substitution of hydrogen by chlorine and fluorine in the methane molecule on the computed net charge at the C atom. We consider some commonly used methods, specifically charges on electrostatic potential grid (ChElPG),106 iterative Hirshfeld (Iter-Hirsh),107,108 intrinsic atomic orbital (IAO),109,110 and the method presented in this work, molecular atomic orbital (MAO). Most obviously, the charge transferred upon halogen substitution will depend strongly on the electronegativity difference between X and H. Furthermore, while halogens are more electronegative than hydrogen (or carbon), the electron donating capacity of C is not unlimited, and so we expect the first halogen substituted to pull away a greater fraction of an electron from C compared to the next, and so forth. Such a change will also have some dependence on the X vs H electronegativity difference. With these preambles aside, atomic charges are not observables; therefore, no single answer should be viewed as strictly correct. Nevertheless, we can examine the results of each population analysis for signs of incorrectness relative to physical intuition.
The charge on the carbon atom for successive chlorination and fluorination of methane predicted using four different population analysis methods (see text for the names). The triangle, square, hexagon, and octagon correspond to charges using def2-SV(P), def2-SVPD, def2-TZVPD, and def2-QZVPD, respectively.
The charge on the carbon atom for successive chlorination and fluorination of methane predicted using four different population analysis methods (see text for the names). The triangle, square, hexagon, and octagon correspond to charges using def2-SV(P), def2-SVPD, def2-TZVPD, and def2-QZVPD, respectively.
For instance, while all methods agree that the C–H bonds of CH4 are polarized Cδ−Hδ+, and all likewise agree that the C–F bonds of CF4 are polarized Cδ+Fδ−, different methods predict different polarities for the C–Cl bond in CCl4. Perhaps the most counterintuitive result is that the population on C becomes more negative via ChElPG upon going from CHCl3 to CCl4 despite the higher electronegativity of Cl vs H. At the other extreme, iterative Hirshfeld suggests that the change in C population with successive halogenation is linear, as if the electron donating capacity of C does not saturate. This is especially striking for chlorination, and we suspect, as unreasonable as the ChElPG result for CCl4. By contrast, we find no obvious fault with the MAO values or with the IAO values for these interesting test cases, although we prefer the slight negative charge for Cl in CCl4 predicted by MAO (recalling the electronegativity of chlorine is 3.5 vs carbon at 2.5 on the Pauling scale) relative to the slight positive charge predicted by IAO.
Finally, we examine an unusual linear molecule, which is the result of insertion of Be into HCl, yielding H–Be–Cl.111,112 While H–Cl is polarized as Hδ+Clδ−, Be has lower electronegativity than H, and so there will be substantial charge transfer. Indeed, the ionic limit would be H−Be2+Cl−. What then is the actual charge distribution when we consider the covalent character of the molecular orbitals? The calculated populations are shown in Fig. 12, and it is immediately evident that predicted charges on Be vary widely. The least polar picture comes from ChElPG and MAO, with q(Be) ≈ +0.5, while the IAO scheme suggests q(Be) ≈ +1.35. How should we understand this dramatic difference and suggest which might be more correct?
The charges on each atom in the BeHCl molecule predicted by the four methods mentioned in the text. The triangle, square, hexagon, and octagon correspond to charges using def2-SV(P), def2-SVPD, def2-TZVPD, and def2-QZVPD, respectively.
The charges on each atom in the BeHCl molecule predicted by the four methods mentioned in the text. The triangle, square, hexagon, and octagon correspond to charges using def2-SV(P), def2-SVPD, def2-TZVPD, and def2-QZVPD, respectively.
From the MAO perspective, there are two σ bonds involving Be, one with H and one with Cl. Each is made from sp hybrid orbitals on Be, meaning that the p orbitals of Be are at play in this σ bonding, as shown in Fig. 13. Both these bonds are polarized away from Be as expected. The origin of the much larger IAO charge can now be understood. The IAO reference minimal basis set, known as “MINAO,”109 does not include 2p orbitals for Be; therefore, we are instead seeing essentially only the Be(2s) charge via the IAO approach! Iterative Hirshfeld evidently struggles with a similar issue, leading to similar overestimation of the Be charge. Overall, this case nicely illustrates the advantages of the MAO population scheme that is based entirely on the system at hand, rather than some reference atomic orbitals or states.
The union of the Boys localized bonding orbitals and their AB2 counterparts in BeHCl forms a complete valence space. Performing a Boys localization on this set of valence orbitals leads to molecule-adapted atomic orbitals that are intuitive and centered on atoms, which can then be used for population analysis.
The union of the Boys localized bonding orbitals and their AB2 counterparts in BeHCl forms a complete valence space. Performing a Boys localization on this set of valence orbitals leads to molecule-adapted atomic orbitals that are intuitive and centered on atoms, which can then be used for population analysis.
Next, we study the hypervalent molecule, SF6, which has Oh symmetry and whose chemical bonding has long been of interest.113 While empty 3d functions on sulfur are needed to form six equivalent sp3d2 hybrids, the energetic cost of promoting electrons to the 3d shell is too high for d-orbital participation in the bonding to be chemically important.46,114–117 Rather, the bonding may be thought of as resonance between Lewis structures with four covalent S–F bonds, and two F− anions, with a formal charge of +2 on S.118 Using Boys localization produces six equivalent σSF orbitals, as shown at the left of Fig. 14. As expected, these σSF bonds are strongly polarized toward the more electronegative fluorine atom. The AB2 antibonding orbitals, also shown on the left of Fig. 14, are fascinating because contrary to simple chemical expectations, they are not strongly polarized toward the sulfur atom.
The union of the localized σSF bonding orbitals and the corresponding AB2 antibonding orbitals in SF6 forms a complete valence space describing the SF bonds (and excluding the F lone pairs). Localizing the set of valence orbitals leads to six molecule-adapted atomic orbitals on S (one is illustrated), showing no visual signs of d orbital participation.
The union of the localized σSF bonding orbitals and the corresponding AB2 antibonding orbitals in SF6 forms a complete valence space describing the SF bonds (and excluding the F lone pairs). Localizing the set of valence orbitals leads to six molecule-adapted atomic orbitals on S (one is illustrated), showing no visual signs of d orbital participation.
Localizing the union of bonding and antibonding sets produces atom-centered MAOs, where the sulfur valence orbitals are a set of six equivalent orbitals that are strongly polarized toward the fluorine atoms, as shown in the third image of Fig. 14. These six functions are linearly independent (though non-orthogonal) and combine with the conventional hybrid orbital on each F to form the strongly polarized σSF orbitals. This hypervalent bonding problem is treated very naturally by the MAO analysis, while a minimal basis is insufficient to describe the occupied space since one in principle needs a set of six sp orbitals on sulfur. As measures of polarity, the calculated charges on S are +2.9 (IAO), +2.1 (Iter-Hirsh), and +1.6 (MAO) in the def2-QZVPPD basis set. The IAO charge may overestimate the polarity due to its minimal basis on S, while the MAO charge surely underestimates it because the MAO orbital assigned to S is actually quite strongly polarized toward F. So, while the MAOs themselves provide interesting chemical insight, they cannot resolve the intrinsic limitations of population analysis.
There are some limitations associated with reducing the union of the localized occupied orbitals and the AB2 antibonding orbitals to a set of MAOs that should be mentioned. First, some conjugated π systems, such as benzene and , present a multiple minimum solution problem for orbital localization methods. Since our method relies on the localization procedure heavily, we expect there will be inconsistencies in these systems. For example, in benzene, there are different sets of solutions for the localized π orbitals, nominally corresponding to the two different Kekule structures. Using the Boys localized orbitals yields populations that reflect D6H symmetry, while the Pipek–Mezey scheme gives alternating charges on successive carbons going around the ring. There is a second class of molecules that are inaccessible in our method. These are anions where the natural valence minimal basis is too small to provide an antibonding orbital for each bonding orbital. One such example is C5H5−, the cyclopentadienyl anion. Forming the set of AB2 valence orbitals and taking the union with the occupied space leads to a set of orbitals that cannot be localized to atoms. Broadly, we can say that neutral species with a single Lewis structure are well-handled by the approach described here as well as some more complex bonding situations like SF6 discussed above.
IV. CONCLUSION
We presented a relatively cheap, non-iterative procedure to produce a set of antibonding orbitals that vary minimally with the size of the atomic orbital basis set from which they are constructed. Specifically, antibonding second order (AB2) orbitals show far less variation with basis than the Sano orbitals, which are sometimes used as valence antibonding orbitals. We showed that the use of AB2 rather than Sano orbitals as initial guesses provides improved convergence for valence bond methods (specifically CCVB) as well as for CASSCF. The AB2 orbitals were successfully used as guesses for state-specific ROKS calculations of excited states, where they better resemble the converged orbitals than does the corresponding Sano orbital guess. We have shown how these AB2 orbitals can be used with the localized occupied orbitals to construct an effective minimal basis that can be used for population analysis among other things. Population analysis on the substituted fluoromethane and chloromethane sequence shows the method is stable and consistent with other common methods that accord with chemical intuition. For the insertion of Be into HCl, the resulting charges show some advantages. Overall, the AB2 antibonding orbitals are relatively efficient to compute and quite useful for a variety of applications in quantum chemistry.
SUPPLEMENTARY MATERIAL
See the supplementary material for data used to generate the plots as well as the molecular geometries.
ACKNOWLEDGMENTS
The authors would like to acknowledge support from the U.S. National Science Foundation through Grant No. CHE-CHE-1955643, with additional support from Q-Chem, Inc., through an NIH SBIR subcontract (Grant No. 2R44GM121126-02). We would like to thank Professor Farnaz Heidar-Zadeh and Dr. Susi Lehtola for very interesting discussions about localization and population analysis.
AUTHOR DECLARATIONS
Conflict of Interest
Martin Head-Gordon is a part-owner of Q-Chem, Inc., which produces the software into which the developments reported here were implemented.
DATA AVAILABILITY
The data that support the findings of this study are available within the article and its supplementary material.