Interpolating a density matrix from a set of known density matrices is not a trivial task. This is because a linear combination of density matrices does not necessarily correspond to another density matrix. In this Communication, density matrices are examined as objects of a Grassmann manifold. Although this manifold is not a vector space, its tangent space is a vector space. As a result, one can map the density matrices on this manifold to their corresponding vectors in the tangent space and then perform interpolations on that tangent space. The resulting interpolated vector can be mapped back to the Grassmann manifold, which can then be utilized (1) as an optimal initial guess for a self-consistent field (SCF) calculation or (2) to derive energy directly without time-consuming SCF iterations. Such a promising approach is denoted as Grassmann interpolation (G-Int). The hydrogen molecule has been used to illustrate that the described interpolated method in this work preserves the essential attributes of a density matrix. For phosphorus mononitride and ferrocene, it was demonstrated numerically that reference points for the definition of the corresponding tangent spaces can be chosen arbitrarily. In addition, the interpolated density matrices provide a superior and essentially converged initial guess for an SCF calculation to make the SCF procedure itself unnecessary. Finally, this accurate, efficient, robust, and systematically improved G-Int strategy has been used for the first time to generate highly accurate potential energy surfaces with fine details for the difficult case, ferrocene.

Density matrices are one of the most useful constructs in electronic structure theory.^{1–5} This quantity plays a central role in the self-consistent field (SCF) calculation including both Hartree–Fock (HF) theory and Kohn–Sham density functional theory (KS-DFT).^{6,7} In these methods, the quality of the initial density matrix can determine whether the SCF iterative procedure converges, the number of SCF cycles required to reach convergence, and the internal stability of SCF solutions.^{8} In addition, any desired molecular property can be predicted by using a converged density matrix.

In the SCF approach,^{6,9,10} one deals with the general eigenvalue problem,

where **F** is the Fock matrix, **C** is the molecular orbital (MO) coefficient matrix, **S** is the overlap matrix, and **E** is the MO energy matrix. The **C** in Eq. (1) has the property **C**^{T}**SC** = **1**, where **1** is a unit matrix. The matrix **C**^{T} denotes the transpose of **C**. The density matrix **P** is then defined as

where the **C ^{occ}** is a submatrix of

**C**and has columns that correspond to the occupied MOs.

**P**is an

*N*

_{basis}×

*N*

_{basis}matrix, where

*N*

_{basis}is the number of basis functions. The Fock matrix

**F**depends on

**P**, which is why Eq. (1) is solved in a self-consistent fashion. The density matrix

**P**has the following properties:

**P**=

**P**,

^{T}**PSP**=

**P**, and Tr[

**PS**] =

*N*

_{elec}, where

*N*

_{elec}is the number of electrons in the investigated system.

The MOs, which are columns in **C**, are not orthonormal. It is convenient to enforce orthonormality among the MOs.^{9,11} To do this, one has to redefine the MO coefficients as

in the symmetric orthogonalization procedure. A new density matrix can then be defined as

In comparison to **P**, this new density matrix $P\u0304$ has the following properties: $P\u0304=P\u0304T$, $P\u03042=P\u0304$, and $Tr[P\u0304]=Nelec$.

These properties of the new density matrix $P\u0304$ are isomorphic to the Grassmann manifold $MGr$.^{12} Hence, one can exploit the properties of $MGr$ when working on $P\u0304$. We note that the collection of $P\u0304$ in $MGr$ does not obey a closure property. That is, a linear combination of $P\u0304$ is not necessarily an element of $MGr$ where the mathematical details are shown in Sec. A of the supplementary material. As a result, given a set of known $P\u0304r$ at certain geometries **r**, it is not trivial to interpolate a density matrix at a desired geometry, say, **r**_{k}.

When one studies the potential energy surface (PES) of a molecular system, it is often the case that SCF calculations are needed to be performed at several geometries. For instance, one has to perform either a rigid or relaxed scan along a dihedral angle of a molecule to investigate the relative stability of the conformational isomers.^{13–15} Moreover, one may need to build a PES to study chemical reactions^{16,17} or to investigate vibrational spectra.^{18,19} In some cases, one needs to evaluate potential energies that are “off the grid.” Such a task is often performed using interpolation. There are also cases when one needs to interpolate other molecular properties, such as dipole moment. Although one can perform separate interpolations for each of these molecular properties, it would be good to have a method that can interpolate a density matrix. By just knowing the density matrix for a given geometry, one can obtain any molecular property, in principle.

Recently, Polack and co-workers proposed a method to interpolate density matrices in an indirect manner using the properties of the Grassmann manifold.^{20} Instead of working on the $MGr$ manifold, one has to map the known density matrices to the tangent space of $MGr$ at a given reference point **r**_{0}. The tangent space, which has the structure of a vector space, is then denoted as $Tr0MGr$. Such mapping is expressed as

Since $Tr0MGr$ is a vector space, the objects of this space is closed under addition and scalar multiplication.^{21} As a result, the interpolation can then be easily performed in $Tr0MGr$. The tenets of the method are as follows: First, the density matrices are sampled at a given set of nuclear geometries ${P\u0304r}$. Then, one selects a point **r**_{0} among this set to be used to define $Tr0MGr$. Afterward, $P\u0304r\u21a6\Gamma r$ is achieved using the Grassmann logarithm,^{22,23}

where $logr0Gr(P\u0304r0)=0$. The matrices **U**_{ℓ}, **Σ**** _{ℓ}**, and $V\u2113T$ are obtained from the singular value decomposition (SVD)

^{24}of the matrix

**L**, which is defined as

where $C\u0304r0occ$ are the set of occupied MO at point **r**_{0}. From here, the set of **Γ**_{r} can then be used to interpolate $\Gamma rk$. In this work, the Lagrange^{25} interpolation method is used. Once $\Gamma rk$ is known, it can then be mapped back to the $MGr$ manifold to obtain $P\u0304rk$. Such mapping can be achieved using the Grassmann exponential,^{22,23}

where $P\u0304r0=expr0Gr(0)$. The occupied MO coefficients $C\u0304occrk$ for the unknown geometry **r**_{k} are obtained from the SVD of $\Gamma rk=Ue\Sigma eVe$ and $C\u0304r0occ$. $C\u0304occrk$ can be calculated using the following equation:

Based on these developments, $C\u0304occrk$ can be mapped back to $Prk$, which can then be used either as an initial SCF guess to accelerate quantum chemistry calculations or to predict SCF energies directly without SCF iterations. We shall denote this approach as Grassmann interpolation (G-Int), which automatically ensures the correct geometrical structure, properties, and physical requirements of the generated density matrix. Polack and co-workers^{20} demonstrated that the quality of the density matrix depends on the interpolation order. A higher-order Lagrange interpolating polynomial gave a density matrix that is closer to the converged SCF solution. In this Communication, we extended their study in investigating the following aspects:

Guarantee the physical properties of the generated density matrix.

Sensitivity of the interpolation with respect to the choice for

**r**._{0}Utility of the interpolated density matrix as an initial guess for difficult to convergence SCF cases.

Construct the PES for the first time using the G-Int strategy without SCF iterations.

Here, we limit our cases to closed-shell systems where all electronic structure calculations were performed at the restricted HF and DFT. In these methods, the *α* density matrix **P**^{α} and *β* density matrix **P**^{β} are the same. Hence, only interpolations for the **P**^{α} were performed in this work. The total density matrix^{9} **P** is then taken as 2**P**^{α}. Similarly, $P\u0304$ can also be expressed as $2P\u0304\alpha $. The total number of electrons *N*_{elec} can then be expressed as $2Nelec\alpha $, where the number of *α* electrons can be obtained from $Nelec\alpha =Tr[P\alpha S]=Tr[P\u0304\alpha ]$. All electronic structure calculations in this work were performed using a locally modified version of Q-Chem^{26} interfaced with an external Python code for G-Int.

We first demonstrate that the method outlined above would yield **P**_{interp}, which possesses following properties: (1) **P**_{interp} is symmetric and (2) Tr[**P**_{interp}**S**] = *N*_{elec}. For the sake of illustration purposes and in order to make these properties numerically tractable, we chose the H_{2} molecule with two electrons as a case study. A restricted Hartree–Fock calculation with the 3-21G basis set will be used in this case. For the H_{2} molecule, this basis set has a total of four basis functions only. We then sample this H–H bond length at the 0.50 Å ≤ *R* ≤ 1.50 Å interval with a step size of *dR* = 0.10 Å. The resulting grid has a total of 11 points. We then use these grid points to interpolate the density matrix at *R* = 0.7348 Å, which is the equilibrium bond distance of H_{2} at the said level of theory and basis set.

We now examine the properties of the interpolated density matrix $Pinterp\alpha $. Table I shows the $Pinterp\alpha $, which in this case is a 4 × 4 matrix. We note that this $Pinterp\alpha $ is symmetric. $Pinterp\alpha S$ is also shown in Table I. It is easy to show that $Tr[Pinterp\alpha S]$ is equal to 1, which is the number of *α* electrons. Then, *N*_{elec} is $2Tr[Pinterp\alpha ]=2$, which is the total number of electrons for H_{2}. It is also easy to show that $P\u0304interp\alpha $ also satisfies the necessary properties for $P\u0304$, as shown in Sec. B of the supplementary material.

$Pinterp\alpha $ . | |||
---|---|---|---|

0.084 479 13 | 0.090 257 74 | 0.084 479 13 | 0.090 257 74 |

0.090 257 74 | 0.096 431 63 | 0.090 257 74 | 0.096 431 63 |

0.084 479 13 | 0.090 257 74 | 0.084 479 13 | 0.090 257 74 |

0.090 257 74 | 0.096 431 63 | 0.090 257 74 | 0.096 431 63 |

$Pinterp\alpha S$ | |||

0.220 900 49 | 0.261 230 58 | 0.220 900 49 | 0.261 230 58 |

0.236 010 72 | 0.279 099 51 | 0.236 010 72 | 0.279 099 51 |

0.220 900 49 | 0.261 230 58 | 0.220 900 49 | 0.261 230 58 |

0.236 010 72 | 0.279 099 51 | 0.236 010 72 | 0.279 099 51 |

$Pinterp\alpha $ . | |||
---|---|---|---|

0.084 479 13 | 0.090 257 74 | 0.084 479 13 | 0.090 257 74 |

0.090 257 74 | 0.096 431 63 | 0.090 257 74 | 0.096 431 63 |

0.084 479 13 | 0.090 257 74 | 0.084 479 13 | 0.090 257 74 |

0.090 257 74 | 0.096 431 63 | 0.090 257 74 | 0.096 431 63 |

$Pinterp\alpha S$ | |||

0.220 900 49 | 0.261 230 58 | 0.220 900 49 | 0.261 230 58 |

0.236 010 72 | 0.279 099 51 | 0.236 010 72 | 0.279 099 51 |

0.220 900 49 | 0.261 230 58 | 0.220 900 49 | 0.261 230 58 |

0.236 010 72 | 0.279 099 51 | 0.236 010 72 | 0.279 099 51 |

Having established that the above described interpolation method preserves the properties of a density matrix, we now address the second aspect. We consider the diatomic phosphorus mononitride (PN). This species is the first phosphorus containing compound identified in the outer space.^{27} Similar to H_{2}, the PN molecule has only one internal degrees of freedom, which is the P–N bond length *R*. We conducted geometry optimization for the ground singlet state at the B3LYP/aug-cc-pVTZ level. The SG-1 quadrature grid^{28} was used for all DFT calculations in this work. The B3LYP/aug-cc-pVTZ equilibrium P–N bond length is 1.488 Å, which is close to the reported experimental bond length (1.491 Å).^{29} The density matrices for this molecule were scanned in the range 0.80 Å ≤ *R* ≤ 3.40 Å with a step size of *dR* = 0.20 Å. This sampling corresponds to a total of 14 points. From these 14 points, we will interpolate the density matrix at *R* = 1.488 Å using different reference points **r**_{0} for the definition of the tangent space $Tr0MGr$. In each of these interpolations, all of the 14 points were used, which then corresponds to a 13th order-Lagrange polynomial interpolation.

To assess the interpolation error, we performed an SCF calculation at *R* = 1.488 Å. The converged density matrix $Pconv\alpha $ is then compared with the interpolated density matrix $Pinterp\alpha $. The interpolation error is then measured by taking the Frobenius norm^{30} error, which is defined as

with $\Delta P\alpha =Pinterp\alpha \u2212Pconv\alpha $, which is the difference between the interpolated $Pinterp\alpha $ and converged $Pconv\alpha $ density matrices.

Table II shows the Frobenius norm error for several $Pinterp\alpha $, which were interpolated using different tangent spaces. These tangent spaces were defined by different reference points. Notice that the Frobenius norm error is comparable (∼10^{−4}) across different reference points. In other words, the reference point at which the tangent space is defined does not affect the quality of the interpolated density matrix for a given set of sampled points.

References point r_{0} (Å)
. | Frobenius Norm error ||ΔP^{α}||_{F}
. | DIIS error . | $EinitialSCF(Eh)$ . | $EinitialSCF\u2212EconvSCF(Eh)$ . |
---|---|---|---|---|

0.8 | 4.16 × 10^{−4} | 8.45 × 10^{−6} | −396.117 573 902 3 | 1.20 × 10^{−8} |

1.4 | 4.18 × 10^{−4} | 7.49 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

2.0 | 3.92 × 10^{−4} | 7.61 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

2.6 | 3.75 × 10^{−4} | 7.67 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

3.2 | 3.67 × 10^{−4} | 7.65 × 10^{−6} | −396.117 573 904 0 | 1.03 × 10^{−8} |

References point r_{0} (Å)
. | Frobenius Norm error ||ΔP^{α}||_{F}
. | DIIS error . | $EinitialSCF(Eh)$ . | $EinitialSCF\u2212EconvSCF(Eh)$ . |
---|---|---|---|---|

0.8 | 4.16 × 10^{−4} | 8.45 × 10^{−6} | −396.117 573 902 3 | 1.20 × 10^{−8} |

1.4 | 4.18 × 10^{−4} | 7.49 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

2.0 | 3.92 × 10^{−4} | 7.61 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

2.6 | 3.75 × 10^{−4} | 7.67 × 10^{−6} | −396.117 573 903 9 | 1.04 × 10^{−8} |

3.2 | 3.67 × 10^{−4} | 7.65 × 10^{−6} | −396.117 573 904 0 | 1.03 × 10^{−8} |

We now examine the quality of these $Pinterp\alpha $ when used as an initial guess for an SCF calculation. We performed the SCF calculations using the popular direct inversion in the iterative subspace (DIIS) algorithm.^{31–33} The SCF convergence criterion was set when the maximum variation of the density matrix is less than 1.0× 10^{−8}. Table II also shows the initial DIIS errors associated with $Pinterp\alpha $. Across all the reference points, the DIIS error corresponding to these $Pinterp\alpha $ is ∼10^{−6}. Compare with the conventional superposition of atomic densities (SAD)^{34} guess, which has an initial DIIS error of 3.67 × 10^{−2}, these interpolated density matrices have a superior quality over SAD. The DIIS errors of the G-Int density matrix (∼10^{−6}) are four-orders of magnitude smaller than that of the SAD initial guess. As for other conventional initial guesses, the initial DIIS errors for core Hamiltonian (Core) and generalized Wolfsberg–Helmholtz (GWH) initial guesses are 1.17 × 10^{−1} and 8.08 × 10^{−2}, respectively.

The next question is how good is the total energy, which is one of the most commonly sought molecular properties, when such a high quality G-Int **P**_{interp} is used to evaluate the total energy. In both HF theory and KS-DFT, this property can be directly obtained from the density matrix through a single Fock build. Table II shows the $EinitialSCF$ energies, which corresponds to **P**_{interp}. These $EinitialSCF$ are numerically similar up to the eight decimal place regardless of the reference point’s location. When compared with the converged SCF energy $EconvSCF$, these $EinitialSCF$ energies are very close to the converged result with errors within 1.20 × 10^{−8} *E*_{h}, which is similar to the 1.0 × 10^{−8} SCF convergence criterion. Such an excellent agreement suggests that the interpolated density matrix can provide a sufficient accuracy in predicting the energies at the unknown geometry to the extent that one can, in principle, bypass SCF iterations. When these **P**_{interp} were subjected as initial guesses to SCF calculations, all of them converged at the sixth cycle. In each of these cycles, the SCF energy only varies after the seventh decimal place. The slow SCF convergence with the oscillation behavior of this almost convergent density matrix may be due to the DIIS algorithm. It is expected that the diagonalization-free procedure^{35} may be an ideal algorithm to further accelerate the SCF convergence based on this highly accurate density matrix and reduce the SCF iterations. Meanwhile, SCF calculations using SAD, Core, and GWH as initial guesses took 12, 15, and 14 SCF cycles to reach convergence, as shown in Table III. Thus, the G-Int strategy reduces the numbers of SCF cycles by at least 50% for PN.

Initial Guess . | PN . | Ferrocene (eclipsed) . |
---|---|---|

SAD | 12 | 60 |

Core | 15 | 61 |

GWH | 14 | 59 |

G-Int | 6 | 7 |

Initial Guess . | PN . | Ferrocene (eclipsed) . |
---|---|---|

SAD | 12 | 60 |

Core | 15 | 61 |

GWH | 14 | 59 |

G-Int | 6 | 7 |

One should note that the quality of **P**_{interp} depends on the sampling used for the interpolation. In the above sampling scheme, the points *R* = 1.40 Å and *R* = 1.60 Å are the two closest points to the unknown point (*R* = 1.488 Å). As shown in Sec. C of the supplementary material, the accuracy of the **P**_{interp} can be adjusted based on the sampling. For instance, we can sample on a narrower region with a finer grid and do the interpolation from there. As an illustration, we repeated the sampling in the range 1.45 Å ≤ *R* ≤ 1.55 Å with a step size of *dR* = 0.01 Å. In this case, the tangent space was defined using the *R* = 1.45 Å point. The corresponding ||**ΔP**^{α}||_{F} for $Pinterp\alpha $ at *R* = 1.488 Å went down to 3.25 × 10^{−9}, which implies that **P**_{interp} is numerically the same to **P**_{conv}. When used as an initial guess for an SCF calculation, this **P**_{interp} convergences after two SCF cycles. Such highly accurate interpolation occurs since a density matrix was sampled at *R* = 1.490 Å, which is close to the unknown point at *R* = 1.488 Å.

In the end, the accuracy of **P**_{interp} would depend on the goal of the researcher. If the goal is to interpolate the density matrix to be used as an initial guess for an SCF calculation, then the **P**_{interp} should be of moderate accuracy. However, if the goal is to totally bypass the SCF calculation, then one needs to obtain a highly accurate **P**_{interp}. For the PN case studied in this work, a moderately accurate **P**_{interp} shown in Table II is accurate enough to derive energy (error ∼1.0 × 10^{−8} *E*_{h}) directly without SCF iterations.

We now demonstrate our third aspect that given a set of known **P** for a difficult system, which requires numerous SCF cycles to reach convergence, the G-Int approach can be used to generate **P**_{interp}, which can then be used as an initial guess to significantly reduce the required number of SCF convergence cycles. Hence, this **P**_{interp} would be a superior initial guess than the conventional SAD, Core, and GWH guesses. For this case, we examine the metal sandwich Fe(C_{5}H_{5})_{2} ferrocene complex, which has two conformational isomers with respect to the orientation of the two cyclopentadiene moieties.^{36–39} These are the eclipsed $D5h$ and staggered $D5d$ conformations.^{40,41} Coriani and co-workers^{42} conducted high-level *ab initio* simulations on ferrocene at the level of the coupled cluster with single, double, and perturbative triple excitations and indicated that the eclipsed conformer has a lower energy than the staggered conformer. When it comes to DFT methods, previous works have shown that the B3LYP functional can provide predicted structural parameters for ferrocene that agrees reasonably well with experimental values.^{42} Given these, we first optimized the ground singlet state structure of the ferrocene’s eclipsed conformer using the B3LYP method with the LANL08^{43,44} effective core potential and basis set for the Fe atom and the 6-311G(d) basis set for the C and H atoms. Performing an SCF calculation for the eclipsed conformational isomer at the B3LYP level with the above mentioned basis sets requires dozens of SCF cycles to reach convergence, as shown in Table III. Note that for the eclipsed ferrocene, the SAD, Core, and GWH initial guesses require about 60 SCF cycles to reach convergence. Figure 1 shows the behavior of the DIIS errors for SAD, Core, and GWH with respect to the number of SCF cycle where their DIIS errors don’t obviously reduce until about the 40th SCF cycle.

From the optimized geometry, the dihedral angle between the two cyclopentadienes were scanned in the range −36.00° ≤ ** θ** ≤ 36.00° with a step size of

**= 8.00°. This grid corresponds to a total of 10 points. From here, $Pinterp\alpha $ was interpolated at**

*θ***= 0.00° using a ninth-order Lagrange interpolation. The definition of the dihedral angle can be seen in Sec. D of the supplementary material.**

*θ*Table IV shows the Forbenius norm error for $Pinterp\alpha $, which was obtained from different tangent spaces defined at various reference points. Similar to the phosphorus mononitride’s case, the Forbenius norm error is also not sensitive with respect to the reference point. Across all reference points, the ||**ΔP**^{α}||_{F} is about (∼10^{−4}), which is also similar to the PN’s case. When the corresponding **P**_{interp} are used as an initial guess for an SCF calculation, the initial DIIS error is already small (∼10^{−6}), which indicates that these **P**_{interp} are somewhat close to **P**_{conv}. In addition, $EinitialSCF$ for these **P**_{interp} are very similar as well. All of the tabulated $EinitialSCF$ agree up to the ninth decimal place. A comparison of the $EinitialSCF$ with $EconvSCF$ shows that these $EinitialSCF$ do agree within 4.46 × 10^{−8} *E*_{h}—making the SCF calculation at ** θ** = 0.00° unnecessary. In addition, it takes seven SCF cycles to converge the SCF calculation of ferrocene at

**= 0.00° when**

*θ***P**

_{interp}is used as an initial guess. As shown in Fig. 1, the G-Int approach not only has a lower initial DIIS error but also converges faster than other conventional initial SCF guesses. Again, the SCF oscillation behavior of this highly accurate density matrix is probably due to the DIIS algorithm. Nevertheless, the G-Int strategy reduces the numbers of SCF cycles by about 90% for the difficult case, ferrocene.

References point θ_{0}(°)
. | Frobenius norm error ||ΔP^{α}||_{F}
. | DIIS error . | $EinitialSCF(Eh)$ . | $EinitialSCF\u2212EconvSCF(Eh)$ . |
---|---|---|---|---|

−36.00 | 8.18 × 10^{−4} | 2.27 × 10^{−6} | −510.616 967 730 7 | 4.42 × 10^{−8} |

−20.00 | 8.21 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 4 | 4.45 × 10^{−8} |

−4.00 | 8.22 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 3 | 4.46 × 10^{−8} |

12.00 | 8.21 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 3 | 4.46 × 10^{−8} |

28.00 | 8.20 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 5 | 4.44 × 10^{−8} |

References point θ_{0}(°)
. | Frobenius norm error ||ΔP^{α}||_{F}
. | DIIS error . | $EinitialSCF(Eh)$ . | $EinitialSCF\u2212EconvSCF(Eh)$ . |
---|---|---|---|---|

−36.00 | 8.18 × 10^{−4} | 2.27 × 10^{−6} | −510.616 967 730 7 | 4.42 × 10^{−8} |

−20.00 | 8.21 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 4 | 4.45 × 10^{−8} |

−4.00 | 8.22 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 3 | 4.46 × 10^{−8} |

12.00 | 8.21 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 3 | 4.46 × 10^{−8} |

28.00 | 8.20 × 10^{−4} | 2.28 × 10^{−6} | −510.616 967 730 5 | 4.44 × 10^{−8} |

Finally, we will construct the PES for the first time using the G-Int strategy without SCF iterations. Usually, one has sampled a molecular property as a function of a given coordinate. There are instances when one needs to refine the PES. If the SCF converges pretty fast, one can scan more points along a given coordinate. However, if scanning these extra points can be time consuming, one can do an interpolation. Earlier, we have shown that although the ||**ΔP**^{α}||_{F} for the ferrocene system is about (∼10^{−4}), nevertheless they provide a highly accurate energy with an error ∼10^{−8} *E*_{h}. As a result, one can use these interpolated density matrices to refine the earlier scanned potential curve. Figure 2 shows such refinement. The blue circle markers correspond to the potential scan in the −36.00° ≤ ** θ** ≤ 36.00° with a step size of

**= 8.00°. From these sampled points, the density matrix that corresponds to**

*θ***= −36.00° was used as the reference point in defining the tangent space. We then used these blue markers as data points to interpolate the density matrices in the −32.00° ≤**

*θ***≤ 32.00° with a step size of**

*θ***= 8.00°. From these**

*θ***P**

_{interp}, we evaluated their corresponding energies, which correspond to $EinitialSCF$ in the earlier discussions. These $EinitialSCF$ are plotted as red circle markers in Fig. 2. To see how good these $EinitialSCF$ are in capturing the actual energies $(EconvSCF)$, we performed an actual scan in −36.00° ≤

**≤ 36.00° with a step size of**

*θ***= 4.00°, which is shown as a black curve in Fig. 2. We note that the corresponding $EinitialSCF$ shows only 1.35 × 10**

*θ*^{−6}

*E*

_{h}mean absolute error and 5.94 × 10

^{−6}

*E*

_{h}maximum error for those nine red interpolated points generated by G-Int in Fig. 2. In addition, they accurately captured the behavior of the actual potential energy curve, including the minimum (well) and maximum (barrier) energies, which were not included in the sampling. The accurate and reliable PES built by G-Int is due to the right physics captured by

**P**

_{interp}. Although one can also perform interpolation directly from the energy data, the minimum and maximum points may be missed. In addition, one advantage of having the

**P**

_{interp}based on G-Int is that one can also, in principle, evaluate any molecular property.

To conclude this Communication, the utility of the Grassmann manifold’s tangent space for density matrix interpolation, G-Int, was presented. A first proof-of-concept application to PN and ferrocene demonstrates that this first-principle based G-Int scheme provides very accurate density matrices, which can be used to provide the subtle details and correct physical shape of PESs without time-consuming SCF iterations. This strategy can greatly reduce the number of time-consuming *ab initio* calculations but is still able to construct robust and accurate *ab initio*-based PESs, which are essential for predicting and understanding the spectroscopy and dynamical behavior.^{45,46} This promising approach opens a door for many different applications, such as the evaluation of gradients and higher-order derivatives at a particular geometry for free. The gradients can be used to study the dynamics of a system at a given PES, while the higher-order derivatives can be used to study vibrational spectroscopy. Alternatively, rather than performing an interpolation, one can also perform extrapolations on the tangent space, $Tr0MGr$.^{47} It should be pointed out that the application of the Grassmann manifold concept to electronic structure theory has received attention recently.^{20,47–51} Our group is currently working on exploring the algorithms for both interpolation and extrapolation schemes as well as the extension of these schemes to higher dimensions. The feasibility of using this method to solve other quantum chemistry problems is also currently being explored by our group.

See the supplementary material for the mathematical details of density matrices, the necessary properties of $P\u0304interp$ in H_{2}, the dependence of $P\u0304interp$ on the sampling in PN and ferrocene, the Cartesian coordinates for the optimized eclipsed structure of ferrocene, and the dihedral angle defined for the scan of ferrocene.

This work was supported by start-up funds from the Virginia Commonwealth University College of Humanities and Sciences. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 using NERSC Award No. BES-ERCAP0020838. High Performance Computing resources provided by the High Performance Research Computing (HPRC) Core Facility at Virginia Commonwealth University (https://chipc.vcu.edu) were also used for conducting the research reported in this work.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Jake A. Tan**: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). **Ka Un Lao**: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material.