For an electronic system, given a mean field method and a distribution of orbital occupation numbers that are close to the natural occupations of the correlated system, we provide formal evidence and computational support to the hypothesis that the entropy (or more precisely −*σS*, where *σ* is a parameter and *S* is the entropy) of such a distribution is a good approximation to the correlation energy. Underpinning the formal evidence are mild assumptions: the correlation energy is strictly a functional of the occupation numbers, and the occupation numbers derive from an invertible distribution. Computational support centers around employing different mean field methods and occupation number distributions (Fermi–Dirac, Gaussian, and linear), for which our claims are verified for a series of pilot calculations involving bond breaking and chemical reactions. This work establishes a formal footing for those methods employing entropy as a measure of electronic correlation energy (e.g., i-DMFT [Wang and Baerends, Phys. Rev. Lett. **128**, 013001 (2022)] and TAO-DFT [J.-D. Chai, J. Chem. Phys. **136**, 154104 (2012)]) and sets the stage for the widespread use of entropy functionals for approximating the (static) electronic correlation.

Accounting for electronic correlation in electronic structure methods has been an active topic of research for decades. Particularly difficult is accounting for the strong, static correlation that arises when stretching and breaking covalent bonds in molecules and materials. Multireference methods,^{1–3} reduced density matrix functional theory,^{4–9} and methods based on range separation^{10–12} have been perhaps the most active lines of work to account for static correlation. Methods that account for electronic correlation are typically built on top of a reference mean field method. In quantum chemistry, the Hartree–Fock (HF) method is generally adopted employing a single Slater determinant as the reference wavefunction, which features aufbau occupations of the HF orbitals. However, the eigenvalues of the one-electron reduced density matrix (1-rdm) of the correlated system generally do not follow aufbau occupations. In addition, the natural orbitals (the eigenfunctions of the 1-rdm) are close to the HF orbitals when the occupation numbers are large but deviate significantly for small occupations in a system-dependent way.^{5,13}

*E*

_{NN}is the nuclear–nuclear repulsion. For example, non-aufbau Hartree–Fock (nHF) maintains the usual HF energy expression with every occurrence of the 1-rdm replaced by the non-aufbau 1-rdm and similarly for approximate DFT methods (such as LDA, GGA, and hybrid DFT methods).

We remark that the definition of correlation energy in Eq. (1) differs from the definition of correlation energy found in quantum chemistry textbooks (see, for example, Szabo and Ostlund^{14}). This is because the classical (Hartree) Coulomb and exchange energies (and correlation energy in the case of nDFT) evaluated with a non-aufbau 1-rdm and an aufbau 1-rdm differ. Such a discrepancy motivated Hammond and Mazziotti^{15} to introduce the concept of “connected energy” and Wang and Baerends^{9} to name the same energy difference “cumulant energy.” Although we do not dispute their choice, we will not adopt the same nomenclature in this work.

*ɛ*

_{i}, and single-particle eigenfunctions (orbitals), $\varphi i$. By occupying the orbitals according to a distribution function,

*ρ*

_{σ}, which relates the single-particle energies to the occupations (

*σ*, here, is a measure of the width of the distribution),

*μ*, is chosen such that the number of electrons is always set to

*N*, i.e., $Tr[\gamma \u0302]=N$. It is instructive to notice that the following derivative holds:

Several authors^{9,16,17} showed that the Fermi–Dirac (FD) distribution applied to nMF methods can generate occupations that are in good agreement with natural occupations from correlated wavefunction methods for such processes as bond breaking^{9,16} and systems displaying a diradical character.^{18} In addition, computational evidence shows that the Shannon entropy of the occupations derived from the FD distribution is a good approximation to the correlation energy when it is added on top of the energy of an nMF of choice. Examples are available for nHF^{9} and semilocal and hybrid DFT methods^{16,19} (nDFT).

A lingering question arises: why should the entropy be a suitable approximation to the correlation energy? In their development of the i-DMFT method, Wang *et al.*^{9,20} sought entropy-like expressions as approximations for the correlation energy (which, as mentioned before, they call cumulant energy), drawing inspiration from the studies of Collins.^{21} However, they do not offer a formal justification for this approach. Chai^{16,22} attempted to justify the use of entropy functionals, but the argument provided^{16} is flawed as it corrects the noninteracting kinetic energy instead of the exchange–correlation functional. TAO-DFT has recently been formulated on formal grounds;^{23} however, the reliance on the FD entropy, as for the i-DMFT method, has remained *ad hoc*. In this context, we also note alternative approaches to extend mean field methods to incorporate static correlations from non-idempotent 1-rdm’s, such as the ones from the Mazziotti group,^{24} which, in limiting cases, relate to the i-DMFT and TAO-DFT methods.^{25}

Below, we provide a formal justification for using an entropy-like functional as an approximation for the correlation energy of electronic systems. We start from the following initial conditions:

$EnMF[\gamma \u0302]\u2261EnMF[{ni},{\varphi i}]$ is a reference nMF energy functional, which is a functional of the MF occupation numbers, {

*n*_{i}}, and the MF orbitals, {*ϕ*_{i}}. See Eq. (4).The correlation energy is defined as the difference between the nMF energy and the true energy,

*E*. See Eq. (1).

The occupation numbers, {

*n*_{i}}, derive from an invertible distribution*ρ*_{σ}having width*σ*that relates each*n*_{i}to the difference between the corresponding orbital energy,*ɛ*_{i}, and the Fermi energy,*μ*. See Eq. (3).We note that this is a standard treatment of finite temperature systems. However, it is an approximation in the context of electron correlation.

- The correlation energy is exclusively a functional of the nMF occupation numbers,This is perhaps the most severe of the approximations introduced in this work, and it is shared with the i-DMFT work of Wang and Baerends.(6)$Ec\u2261Ec[{ni}].$
^{9}

^{9}exploiting the hermiticity of the Lagrange multipliers, $\lambda ij=\lambda ji*$ implies that

*λ*

_{ij}be diagonal without loss of generality,

*λ*

_{ij}=

*n*

_{i}

*ɛ*

_{i}

*δ*

_{ij}. In addition, and most importantly, Eq. (8) shows that $\u2202Ec\u2202ni=\mu \u2212\epsilon i$.

We wish to prove the following theorem.

*Let the electronic energy be given as the sum of two terms. The first is the energy of a non-aufbau mean field method,* *E*_{nMF}[{*n*_{i}}, {*ϕ*_{i}}]*, employing an invertible occupation number distribution,* *ρ*_{σ}*. The second is a pure functional of the occupation numbers,* *E*_{c}[{*n*_{i}}]*. We establish that* *E*_{c} = −*σS*_{ρ}*, where* *σ* *represents the width of the distribution and* *S*_{ρ} *denotes the entropy associated with the distribution* *ρ*_{σ}.

*ρ*

_{σ}, to find an expression for

*ɛ*

_{i}−

*μ*exclusively in terms of the corresponding occupation number

*n*

_{i}. Namely,

where $\rho \sigma \u22121(ni)$ is the inverse function of *ρ*_{σ}(*ɛ*_{i} − *μ*). Such an inverse typically exists for physically behaved distributions. For example, it exists for FD, G, and L distributions employed here and listed in Table I as long as *σ* > 0.

X
. | ρ_{σ}(ɛ_{i} − μ)
. | Entropy density (s_{i})
. |
---|---|---|

FD | $ni=e\epsilon i\u2212\mu \sigma +1\u22121$ | −n_{i} ln n_{i} − (1 − n_{i}) ln(1 − n_{i}) |

G | $ni=12Erfc\epsilon i\u2212\mu \sigma $ | $12\pi e\u2212Erf\u22121(1\u22122ni)2$ |

L | $ni=1,\epsilon i\u2212\mu <\u2212\sigma ,1\u2212121+\epsilon i\u2212\mu \sigma 2,\u2212\sigma \u2264\epsilon i\u2212\mu \u22640,121\u2212\epsilon i\u2212\mu \sigma 2,0<\epsilon i\u2212\mu \u2264\sigma ,0,\sigma <\epsilon i\u2212\mu $ | $ni\u2212223ni3/2,ni\u22640.5,(1\u2212ni)\u2212223(1\u2212ni)3/2,ni>0.5$ |

X
. | ρ_{σ}(ɛ_{i} − μ)
. | Entropy density (s_{i})
. |
---|---|---|

FD | $ni=e\epsilon i\u2212\mu \sigma +1\u22121$ | −n_{i} ln n_{i} − (1 − n_{i}) ln(1 − n_{i}) |

G | $ni=12Erfc\epsilon i\u2212\mu \sigma $ | $12\pi e\u2212Erf\u22121(1\u22122ni)2$ |

L | $ni=1,\epsilon i\u2212\mu <\u2212\sigma ,1\u2212121+\epsilon i\u2212\mu \sigma 2,\u2212\sigma \u2264\epsilon i\u2212\mu \u22640,121\u2212\epsilon i\u2212\mu \sigma 2,0<\epsilon i\u2212\mu \u2264\sigma ,0,\sigma <\epsilon i\u2212\mu $ | $ni\u2212223ni3/2,ni\u22640.5,(1\u2212ni)\u2212223(1\u2212ni)3/2,ni>0.5$ |

*E*

_{c},

*s*

_{ρ}(

*n*

_{i}), and the correlation energy density,

*e*

_{c}(

*n*

_{i}) = −

*σs*

_{ρ}(

*n*

_{i}). The entropy density is related to the total entropy by

*S*

_{ρ}=

*∑*

_{i}

*s*

_{ρ}(

*n*

_{i}), and similarly, the correlation energy density is related to the total correlation energy,

*E*

_{c}=

*∑*

_{i}

*e*

_{c}(

*n*

_{i}). We also note that, in practice, the argument of the distribution function is the dimensionless $\epsilon i\u2212\mu \sigma $. This is the reason why, upon integration, the factor of

*σ*is recovered in Eq. (11) in all cases.

To evaluate the integral above, the lower limit of integration must be chosen such that the occupations correspond to a state of null electron correlation, i.e., aufbau occupations. Meanwhile, the upper limit is determined by the occupations obtained from the distribution *ρ*_{σ} or {*n*_{i}}.

Equation (11) yields the entropy energy contribution for the distribution *ρ*_{σ}, which, it follows, is also equal to the correlation energy functional concluding the proof of this theorem.

The use of the FD distribution and its associated entropy has proven to yield accurate energy curves along the bond-breaking coordinate for numerous molecules. This approach has been employed with both mainstream non-aufbau DFT methods^{16,19} and nHF^{9} as nMF methods. Chai and co-workers showed that this method predicts improved reaction barrier heights, thermochemistry,^{22} and other properties for various classes of molecules and materials.

A reasonable approach would involve exploring the collection of distributions such that 0 ≤ *n*_{i} ≤ 1 and *∑*_{i}*n*_{i} = *N* to identify the one that most accurately represents the correlation energy across a wide range of processes and molecular systems. The computational demands of such an undertaking are significant. Thus, we defer this task to future work. In Table I, we present the expressions for three commonly used distributions and their respective entropy density expressions, denoted by *e*_{c}(*n*_{i}) = −*σs*_{ρ}(*n*_{i}); see Eq. (11).

We proceed to compare the three distributions, examining whether they yield significantly different occupation numbers and correlation energies. Figure 1 displays the three distributions alongside their corresponding correlation energy densities. In order to facilitate a clear comparison, we select *σ* values for each distribution that yield occupation numbers most similar to the FD distribution: *σ*_{G} = 2.4*σ*_{FD} and *σ*_{L} = 4.1*σ*_{FD}. The resulting occupation numbers and correlation energies exhibit a high degree of similarity. Nevertheless, Fig. 1 clearly reveals slight disparities in both the occupation numbers (∼4% for linear and 2% for Gaussian) and entropy densities (around 4 mHa for linear and 3 mHa for Gaussian). Considering that *σ*_{FD} = 1 Ha, we can foresee that these differences, although minor in comparison with the errors caused by MF methods in regimes characterized by a strong static correlation,^{10,26} will have a noticeable, however, mild impact on the computed correlation energies.

The calculations presented in this work are performed using PySCF.^{27} In order to ensure full reproducibility of our results, we have made the Jupyter notebooks available on Zenodo.^{28} These notebooks can be utilized to reproduce all calculations and figures described in this Communication. The list of benchmark methods employed, along with the corresponding basis set, nMF method used, and *σ* values, is given in Table II. For the Diels–Alder reaction, B3LYP/6-31G* was employed to predict the reaction coordinate via nudged elastic band calculations, and the transition structure was obtained from Ref. 29 (also accessible on Zenodo using the aforementioned link).

System . | Basis set . | Benchmark . | nMF . | σ_{FD}
. |
---|---|---|---|---|

H_{2} | cc-pVDZ | Full-CI | nHF | 0.095 |

nPBE | 0.029 | |||

nPBE0 | 0.047 | |||

Ethane | cc-pVDZ | CAS(6,6) | nHF | 0.093 |

Benzene | 6-31G* | CAS(6,6) | nHF | 0.096 |

Diels–Alder | 6-31G* | CAS(6,6) | nHF | 0.070 |

Retinal | cc-pVDZ | CAS(15,22)+NEVPT2 | nHF | 0.065 |

System . | Basis set . | Benchmark . | nMF . | σ_{FD}
. |
---|---|---|---|---|

H_{2} | cc-pVDZ | Full-CI | nHF | 0.095 |

nPBE | 0.029 | |||

nPBE0 | 0.047 | |||

Ethane | cc-pVDZ | CAS(6,6) | nHF | 0.093 |

Benzene | 6-31G* | CAS(6,6) | nHF | 0.096 |

Diels–Alder | 6-31G* | CAS(6,6) | nHF | 0.070 |

Retinal | cc-pVDZ | CAS(15,22)+NEVPT2 | nHF | 0.065 |

While Theorem 1 provides no link between the nMF occupation numbers and the natural occupation numbers, we have determined that it is advantageous to select a *σ* that allows for close matching of the natural occupations and the MF occupations. We noted that the distribution width ratios determined for the data presented in Fig. 1 are generally optimal. Thus, it is enough to find an optimal *σ*_{FD} to then infer on the optimal *σ*_{L} or *σ*_{G}. We tested the single bond stretching process for H_{2}, benzene (C–H bond), and ethane (C–C bond). Figure 2 shows the bond energy curves. The optimal *σ*_{FD} used in this work is reported in Table II. While we detect some (expected) differences in the results from the three distributions, they all substantially improve the nHF energies. We can conclude that the correlation energy functional is able to capture the static correlation in the bond breaking of molecules.

In their work, Wang and Baerends^{9} employed the i-DMFT correlation energy functional given by *E*_{c} = −*σS* + *b*, where *b* represents a system-dependent shift added to the nHF energy. We observe that the total electronic energies of the systems investigated in this study exhibit a shift compared to the reference energies. Therefore, incorporating an energy shift, *b*, on a case-by-case basis, could enhance the quality of the results. However, we have been unable to identify a theorem similar to our Theorem 1 that could provide an explanation or justification for employing such an energy shift. Speculatively, based on the ongoing debate and understanding of the i-DMFT method, relaxing approximation (a2), i.e., introducing a functional dependence of *E*_{c} on the nMF orbitals as well, *E*_{c}[{*n*_{i}}, {*ϕ*_{i}}], may address this issue. It is evident that the true correlation energy must encompass dynamic correlation effects as well, which are unlikely to be captured by the entropy functional alone. In addition, in a comment on the i-DMFT paper,^{30} Ding *et al.* showed that the 1-rdm predicted by i-DMFT does not quantitatively agree with those obtained from accurate wavefunction methods (such as MCSCF).

These considerations suggest two potential avenues for improvement: (1) exploring alternative nMF methods that offer more favorable error cancelation, as nHF may not be the optimal or “best” choice of the nMF method; and (2) acknowledging that the correlation energy should also exhibit a dependence on the nMF orbitals. While we will analyze point (2) (orbital dependence) later, to address point (1), we note that Theorem 1 is not specific to nHF. It can be applied to any nMF method provided that the nMF energy lies above the energy of the electronic ground state. This realization opens an avenue for improving the accuracy of the total energies as well as the electron densities by searching for the “best” nMF methods. In fact, the TAO-DFT method^{16,19,22} effectively utilizes Eq. (11) in conjunction with DFT methods. Chai showed^{22} that changing the amount of HF exchange in the DFT functional in TAO-DFT can improve the description of reaction barrier heights. Problematic is the fact that often approximate xc functionals lead to electronic energies below the variational energy minimum.

In Fig. 3, we compare the nMF dependence of the energy functional (*E*_{nMF} + *E*_{c}, where *E*_{c} is the one from Eq. (11) evaluated with an nMF-specific *σ*) and the electron density against full-CI (FCI) for H_{2}. This figure shows that the 1-rdm’s obtained with nDFT methods, such as nPBE0, are closer to FCI than nHF; the nHF 1-rdm error is slashed by 50%. The total electronic energy is also closer to FCI, reducing the shift (which is captured by the −*b* constant in i-DMFT) from 27 mHa for nHF to barely 3 mHa for nPBE0. A contrast is HF (aufbau occupations), which provides the most deviated densities and energies. These results provide us with barely an indication for improving the i-DMFT method, ridding it of the −*b* constant in compliance with Theorem 1. We should expect the performance of nPBE0 to be system dependent but generally to outperform nHF for bond dissociations. Thus, further inspection in the dependence on the reference nMF method is needed and will be tackled in a future study.

So far, we have only considered bond dissociation curves. More interesting are barrier heights of chemical reactions. In Fig. 4, we show the energy profile for the prototypical Diels–Alder reaction between 1,3-butadiene and ethene computed by HF (*E*_{HF}), nHF (*E*_{nHF}), and *E*_{nHF} + *E*_{c}. The reaction coordinate has been determined by nudged elastic band calculations carried out with B3LYP. The results are quite convincing in that the barrier height is overestimated by HF, underestimated by B3LYP, and well reproduced by *E*_{nHF} + *E*_{c}. Specifically, the overestimation by HF is about 20 kcal/mol and the underestimation by B3LYP is of similar size. This result shows that the good performance of *E*_{nHF} + *E*_{c} is not solely relegated to bond stretching, but it ventures into the prediction of barrier heights and, potentially, many other processes where static correlation is important.

An example of such processes is the cis–trans isomerization of alkenes, where the reaction coordinates is a dihedral angle that twists a double bond. In doing so, the double bond is strained and eventually broken when the dihedral angle reaches 90°. A particularly important class of molecules where cis–trans isomerization is important is retinal derivatives. In Fig. 5, we showcase the 9-*trans* to 9-*cis* isomerization of all-trans retinal. B3LYP, HF, nHF, and nHF+*E*_{c} energies are compared against reference NEVPT2 [using a large CAS(15,22)] along the rotation of angle *α* for unrelaxed geometries.

The trans–cis isomerization reaction barrier in Fig. 5 delivers a picture consistent with the behavior for Diels–Alder reactions as well as for single-bond dissociation. In addition, we see that appropriate *σ*_{FD} depends on the particular bond being stretched with a value nearing 0.09 Ha for single *σ* bonds and 0.06–0.07 Ha for single *π* bonds.

In conclusion, in this work, we formalize and provide computational support to the hypothesis that the entropy is a good approximation to the correlation energy. We show that if the total energy functional is approximated by the sum of the energy of a non-aufbau mean field method, *E*_{nMF}, and a correlation energy functional purely dependent on the occupation numbers, *E*_{c}[{*n*_{i}}], the correlation energy is given by −*σS*_{ρ}, where *S*_{ρ} is the entropy associated with the occupation number distribution employed. This formal result validates the i-DMFT method^{9} (which uses non-aufbau HF as the nMF) and TAO-DFT^{16} (which uses approximate DFT methods as the nMF). The computational support of this result shows that the excellent performance of i-DMFT (which uses the Fermi–Dirac distribution) can be matched with other reasonable occupation number distributions, such as Gaussian and linear. The entropy used in the evaluation of the correlation energy is the one associated with the chosen distribution, showing that there is nothing special about the Shannon entropy, and entropy expressions derived from other distributions (such as Gaussian and linear) are just as good in approximating the correlation energy as the Shannon entropy. The computational work includes considering several bond dissociation systems as well as the barrier height of a Diels–Alder reaction where the static correlation is known to negatively affect the results of mean field methods. In addition to our results, we can also leverage the good results from the i-DMFT and TAO-DFT methods to provide further computational evidence that such a strategy is effective.

A question, however, lingers: why should the occupation number distributions considered here (FD, G, and L) reproduce so well the natural occupations along bond breaking, double bond twisting, and certain chemical reactions? We have not provided an answer to this question, which, to the best of our knowledge, remains open.

Despite the satisfactory results, it is imperative that the method be improved in future work. This can be achieved in several ways. As hinted from Fig. 3, potential research avenues include choosing judiciously the non-aufbau mean field method to be paired with *E*_{c}. For H_{2}, we notice a clear improvement for the 1-rdm when using nPBE0 compared to nHF. In addition to considering improvements from the mean field method, concurrent improvements to the entropy functional itself, *E*_{c} = −*σS*_{ρ}, should be formulated. Wang and Baerends proposed making *σ* a density (or 1-rdm) functional.^{31} While this approach is, indeed, feasible (potentially by relating *σ* to the energy gap^{19}), we also recognize that the entropy functional itself will likely need to be modified when incorporating orbital dependence.

To clarify this aspect, an approach can be derived from the method of spectral decomposition of correlation energy by Savin and co-workers.^{32,33} They formally showed that *E*_{c} = *∑*_{i}*ɛ*_{i}*D*_{i}, where $Di=\u222b01\u22121\lambda 2(ni\lambda \u2212ni0)d\lambda $, with the integration path *λ* following the Hamiltonian $H\u0302\lambda =\lambda H\u03021+(1\u2212\lambda )H\u03020$, in which $H\u03020$ represents a mean field N-electron Hamiltonian and $H\u03021$ denotes the Coulomb molecular Hamiltonian. The correlation energy is defined for a given value of *λ* as $Ec\lambda =\u27e8\Psi \lambda |H\u0302\lambda |\Psi \lambda \u27e9\u2212\u27e8\Psi 0|H\u0302\lambda |\Psi 0\u27e9$ (where $H\u0302\lambda \Psi \lambda =E\lambda \Psi \lambda $), and the diagonals of the natural occupations expressed in the mean field basis are denoted by $ni\lambda $. It is evident that *λ* and $ni\lambda $ are interrelated and can be used interchangeably as the primary integration variable, yielding $Di=\u222bni0ni1f(n)(n\u2212ni0)dn$, where formally $f(n)=d1\lambda dn$. Notably, by realizing that $\u2202Ec\lambda \u2202ni\lambda =\epsilon if(ni\lambda )ni\lambda \u2212ni0\lambda $, we observe that the entropy functional can be retrieved by approximating $ni\lambda =\rho (\lambda \sigma )(\epsilon i\u2212\mu )$. In other words, to recover the entropy as *E*_{c}, the occupations along the connection path *λ* follow a distribution function with the scaled width, *λσ*. However, in reality, the width scaling may not be simply linear and may, instead, be an orbital functional. Thus, as one might envision, when including orbital dependence in the definition of the correlation energy, the actual correlation energy functional bears some resemblance to the entropy, but, in general, it will differ. This analysis may guide future developments of correlation energy functionals.

This material is based upon the work supported by the U.S. National Science Foundation under Grant Nos. CHE-2154760, OAC-2117429, and OAC-1931473. M.P. acknowledges discussions with Andreas Savin.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Jessica A. Martinez B**: Data curation (equal); Investigation (equal); Software (equal). **Xuecheng Shao**: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Software (equal); Supervision (equal). **Kaili Jiang**: Investigation (equal); Software (equal). **Michele Pavanello**: Conceptualization (equal); Supervision (equal); Writing – original draft (equal).

## DATA AVAILABILITY

The data that support the findings of this study are openly available in Zenodo at https://doi.org/10.5281/zenodo.8240064, Ref. 28.

## REFERENCES

*Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory*

_{2}molecule

*ab initio*, density functional, CASSCF, CASPT2, and CBS-QB3 methods for the prediction of activation barriers, reaction energetics, and transition state geometries