Since its formal inception in 1964–1965, Kohn-Sham density-functional theory (KS-DFT) has become the most popular electronic structure method in computational physics and chemistry. Its popularity stems from its beautifully simple conceptual framework and computational elegance. The rise of KS-DFT in chemical physics began in earnest in the mid 1980s, when crucial developments in its exchange-correlation term gave the theory predictive power competitive with well-developed wave-function methods. Today KS-DFT finds itself under increasing pressure to deliver higher and higher accuracy and to adapt to ever more challenging problems. If we are not mindful, however, these pressures may submerge the theory in the wave-function sea. KS-DFT might be lost. I am hopeful the Kohn-Sham philosophical, theoretical, and computational framework can be preserved. This Perspective outlines the history, basic concepts, and present status of KS-DFT in chemical physics, and offers suggestions for its future development.
I. INTRODUCTION
Density-functional theory (DFT) is a subtle, seductive, provocative business. Its basic premise, that all the intricate motions and pair correlations in a many-electron system are somehow contained in the total electron density alone, is so compelling it can drive one mad. I attended my first DFT conference in 1983, a two week NATO workshop in Alcabideche, Portugal.1 The intensity of the debates and the level of scientific excitement at the meeting were impressive. I was hooked. It was about ten years before DFT made its big splash in computational chemistry. The field was new.2 We were still finding our way.
The theoretical foundations had been laid by Hohenberg, Kohn, and Sham in 1964–19653,4 but the formative years were ∼1980–2010. An analysis of Web of Science citation data undertaken at Tulane University reveals that DFT was the most active field in physics during this thirty-year period.5 Of the top three most cited physicists,6 the first (Perdew: 65 757 citations) and third (Becke: 62 581 citations) were density-functional theorists. The second was Richard Smalley (63 354). The top three most cited physics papers,7 and eight of the top ten, were in the field of DFT. The Tulane analysis covers citations during, and to papers published within, the 1980–2010 window. The Hohenberg-Kohn-Sham (HKS) papers are therefore not included. Total citation counts for notable DFT papers, including the HKS papers, are shown in Table I, from Web of Science and Google Scholar as of January 1, 2014. The numbers speak for themselves. DFT is huge.
I will focus in this Perspective on the basic engine of Kohn-Sham DFT, the ground-state exchange-correlation functional, as it pertains to problems in chemical physics. I have been involved in its development since the “muffin-tin Xα” days of the late 1970s to the present and hopefully into the future. The narrative will have a historical and somewhat personal slant, but I intend for this article to be a teaching tool as well. Of course DFT is a much larger subject, and thankfully the recent Perspective of Burke8 provides a broader overview. Also the Perspective of Klimes and Michaelides9 on dispersion interactions in DFT is an excellent accompaniment.
The contributions to this special issue reflect the many colors of the modern DFT tapestry. It is all very good and increasingly powerful stuff. I wonder, however, if the heart and soul of Kohn-Sham DFT may be slipping from our grasp. Although it is not strictly so defined, I think KS-DFT is about occupied orbitals only (I hesitate to suggest the acronym “OOO”). Yet virtual orbitals are more and more being used: perturbation theory, random phase approximation, etc. My personal research philosophy has always been occupied orbitals only. It focuses the mind. It defines one's path. It is what gave KS-DFT its popularity and a share of the 1998 Chemistry Nobel prize to Walter Kohn. Let us not give up on the original spirit of Kohn-Sham DFT. Here is my take on the first fifty years. We use atomic units throughout, where ℏ = me = e = 4πɛ0 = 1.
Paper . | Web of Science . | Google Scholar . |
---|---|---|
LYP | 43 123 | 49 703 |
B3PW91 | 42 642 | 52 028 |
PBE | 30 575 | 37 771 |
B88 | 24 766 | 28 529 |
KS | 21 670 | 31 251 |
HK | 15 222 | 27 317 |
Paper . | Web of Science . | Google Scholar . |
---|---|---|
LYP | 43 123 | 49 703 |
B3PW91 | 42 642 | 52 028 |
PBE | 30 575 | 37 771 |
B88 | 24 766 | 28 529 |
KS | 21 670 | 31 251 |
HK | 15 222 | 27 317 |
II. IN THE BEGINNING
The intuitive origins of density-functional theory predate the seminal 1964–1965 Hohenberg-Kohn-Sham papers3,4 by many decades. Thomas,10 Fermi,11 and Dirac12 imagined that the kinetic and exchange energies of systems of many electrons could be locally modeled by their uniform electron gas energy densities. The result was an approximate theory of electronic structure (“TFD”) depending only on the total electronic density . Though wonderfully simple, TFD fails qualitatively because it is unable to self-consistently reproduce atomic shell structure. Even with accurate input densities from other sources, TFD energies have errors of around 10%, too large for computational purposes. Furthermore, Teller deduced13 that Thomas-Fermi theory cannot bind molecules. TFD is useful for rough estimates of atomic properties only.
HF theory, while immensely more useful than TFD, is still not accurate enough for energy predictions in chemistry. Bond energies are significantly underestimated. Some molecules, F2, for example, are not even bound at the Hartree-Fock level. Thus post-HF methods, adding to Hartree-Fock numerous other determinants involving excited or “virtual” orbitals, are generally required for viable chemical computations. Post-HF technology is well developed (see, e.g., Refs. 14 and 15 for overviews) and capable of very high accuracy, but the development and computational costs are severe. Simply put, it is complicated, and the computer-time scaling with system size N is several orders larger than for Hartree-Fock depending on the method (i.e., the formal scaling16 of HF is N4, while post-HF methods scale like N5 and higher as their sophistication increases). I will refer to HF and post-HF methods as wave-function methods, and the theory as wave-function theory (WFT). The more common terminology, ab initio theory, is regrettable as it denigrates density-functional theory. As we shall see in Secs. III and IV, DFT is as “ab initio” as WFT.
Slater anticipated much of what was later to transpire in Kohn-Sham DFT. His last book,18 The Self-Consistent Field for Molecules and Solids, is an excellent DFT introduction, though restricted to exchange only. Slater's insights on the nature of the “hole,” in compact and extended systems, inspired me and many others to enter DFT. Particularly fascinating was that HFS bond energies were superior to HF bond energies.19 There was no clear underbinding or overbinding trend for HFS, and the errors were still unacceptably large, but this early victory over Hartree-Fock foreshadowed things to come.
III. THE BIRTH OF DENSITY-FUNCTIONAL THEORY
While intriguing in its simplicity and early successes, the Hartree-Fock-Slater method is a model, not a theory, of electronic structure. Where is dynamical correlation? Where is the correlation contribution to kinetic energy? HFS offers no answers to these questions. In their seminal 1964–1965 papers,3,4 Hohenberg, Kohn, and Sham founded the rigorous theory that finally legitimized the intuitive leaps of Thomas, Fermi, Dirac, and Slater. Thus, 1964 is widely accepted as the birth year of modern DFT.
That the functionals T(ρ) and Vee(ρ) are known to exist does not imply that we can write them down! The requirement of v-representability (or the looser Fermion representability in the approach of Levy) on the densities is an additional difficulty. Nonrepresentable variational densities will collapse to the same shell-structureless densities obtained in Thomas-Fermi-Dirac theory. A year later, Kohn and Sham addressed both of these problems.4
Kohn-Sham DFT is astounding in its simplicity. It is operationally an independent-particle theory, simpler even than Hartree-Fock. Yet it delivers, in principle, the exact density [through Eq. (14)] and exact total energy [through Eq. (18)] of any interacting, correlated electronic system. Everything hinges on the functional EXC(ρ) and its functional derivative. We are assured of its existence, but no explicit expression is known. In 1965, the quest for the holy grail of electronic structure theory began.
. | Exact . | LDA . | B86 . | B86b . | B88 . | PW86 . | revPW86 . | PBE . | BR . |
---|---|---|---|---|---|---|---|---|---|
He | −0.998 | −0.862 | −1.003 | −0.999 | −1.001 | −1.009 | −1.022 | −0.990 | −1.015 |
Ne | −12.01 | −10.97 | −12.09 | −12.08 | −12.06 | −12.15 | −12.29 | −11.99 | −12.12 |
Ar | −30.09 | −27.81 | −30.12 | −30.12 | −30.09 | −30.23 | −30.54 | −29.93 | −30.03 |
Kr | −93.68 | −88.54 | −93.71 | −93.76 | −93.77 | −93.73 | −94.57 | −93.32 | −92.77 |
Xe | −178.9 | −170.5 | −178.8 | −178.9 | −179.0 | −178.5 | −179.9 | −178.2 | −176.3 |
. | Exact . | LDA . | B86 . | B86b . | B88 . | PW86 . | revPW86 . | PBE . | BR . |
---|---|---|---|---|---|---|---|---|---|
He | −0.998 | −0.862 | −1.003 | −0.999 | −1.001 | −1.009 | −1.022 | −0.990 | −1.015 |
Ne | −12.01 | −10.97 | −12.09 | −12.08 | −12.06 | −12.15 | −12.29 | −11.99 | −12.12 |
Ar | −30.09 | −27.81 | −30.12 | −30.12 | −30.09 | −30.23 | −30.54 | −29.93 | −30.03 |
Kr | −93.68 | −88.54 | −93.71 | −93.76 | −93.77 | −93.73 | −94.57 | −93.32 | −92.77 |
Xe | −178.9 | −170.5 | −178.8 | −178.9 | −179.0 | −178.5 | −179.9 | −178.2 | −176.3 |
IV. THE EXCHANGE-CORRELATION HOLE
For two decades after its birth, KS-DFT was relatively unknown in quantum chemistry. Slater's HFS work24 predated the Hohenberg-Kohn-Sham papers, and, in combination with the “muffin tin” (or “scattered wave”) numerical implementation of Johnson,25 had acquired some momentum. HFS implementations using standard atom-centered basis sets,26,27 numerical atom-centered basis sets,28 and high-precision numerical grids19 were also under development. By the early 1980s, several reliable molecular HFS computer codes were up and running, but all this was largely ignored by the greater quantum chemistry community. The accuracy of the HFS model was no match for their well-established wave-function methods.
Solid-state physicists, on the other hand, made key advances in KS-DFT in its early decades. Reference 2 surveys the progress and literature of the period. Most importantly, a rigorous formula for the exchange-correlation energy EXC in terms of quantum-mechanical pair density and an exchange-correlation “hole” was established in Refs. 29–33 (see also Refs. 34 and 35 for simpler derivations). It is called the “adiabatic connection” formula and its central role in KS-DFT cannot be overstated. Hohenberg, Kohn, and Sham proved the existence of EXC but gave no prescription for finding it. The adiabatic connection provides the prescription. We outline the derivation in the paragraphs below.
These concepts are tremendously important. They will inform the development of exchange-correlation functionals throughout Secs. V–X. An immediate benefit is an understanding of how the local density approximation, Eq. (21), works as well as it does in atoms and molecules. Atoms and molecules, especially of small nuclear charge, do not resemble uniform electron gases even locally. However, the normalization condition, Eq. (29), on the exchange-correlation hole is universal. The same constraint applies to the exact hole, at any in an atom or molecule, as applies to the UEG model hole that replaces it in the LDA. The success of the LDA is therefore no surprise at all.
V. GRADIENT APPROXIMATIONS: THE ASCENDENCY OF DFT
Let us introduce the acronym DFA at this point for “density-functional approximation.” If you attend DFT meetings, you will know that Mel Levy often needs to remind us that DFT is exact. The failures we report at meetings and in papers are not failures of DFT, but failures of DFAs.
The most fundamental process in chemistry is the making and breaking of bonds. Computation of bond energies is therefore one of the greatest concerns of quantum chemists. It was not until the 1980s that software implementations of HFS (and hence KS-DFT) were up to the task of computing accurate molecular energies,26–28 but by the end of the decade robust programs were in place36–39 and dissociation energies were under test. The energy required to dissociate a molecule entirely to free atoms, i.e., break all its bonds, is called “atomization” energy. The accurate benchmarking of DFAs on atomization energies has been the driving force of DFA development for the past thirty years.
For consistency in our DFA assessments, we will use the G2/97 atomization-energy test set of Pople and co-workers40 throughout this article. G2/97 is a compilation of 148 atomization energies of common organic and inorganic molecules, and radicals, from diatomics to molecules of about a dozen atoms. The source is experimental data with better than 1 kcal/mol precision. One kcal/mol (4.2 kJ/mol or 0.043 eV) is often referred to as “chemical accuracy” and is the target accuracy of quantum chemical methods. Our G2/97 assessments will be tabulated in Table IV, with the mean error (ME) and mean absolute error (MAE) listed for each method. All computations are performed with our grid-based NUMOL code41–44 and are “post-LDA” [all energies computed using orbitals from the local (spin) density approximation, with the spin-dependent UEG exchange-correlation energy parametrization of Perdew and Wang23]. All entries in Table II through Table IV have been specifically (re)computed for this Perspective.
Hartree-Fock theory is assessed first. We see, as expected, massive underbinding statistics, with ME = −158.9 and MAE = 158.9 kcal/mol. Hartree-Fock-Slater, essentially the exchange-only LDA [see Eq. (38) below] is a huge improvement, with ME = −23.6 and MAE = 35.0 kcal/mol. But there is no clear underbinding or overbinding trend. The LDA for exchange-correlation has a massive overbinding trend, as observed, e.g., for diatomic molecules by Becke45 in the mid 1980s. The G2/97 errors are ME = 82.9 and MAE = 82.9 kcal/mol. This large overbinding tendency is actually good news. It means something profound is awaiting discovery.
Does B88 capture, in addition, the exact −1/r asymptotic functional derivative? Engel et al.64 have shown that, unfortunately, no GGA can give the exact asymptotic exchange energy density and the exact functional derivative. The functional derivative of B88 falls off like −1/r2.
Despite the dramatic rise of exchange GGAs through the 1980s, quantum chemists, and physicists, were reluctant to adopt them. Ziegler65 was their sole advocate during this time, applying GGAs to challenging problems in inorganic and organometallic chemistry. But everything was about to change. At the International Congress of Quantum Chemistry in Menton, France, 1991, I reported GGA atomization-energy benchmarks on Pople's older “G1” test set. Pople was present, and we had a memorable conversation that week, the first of what would be many. It was the boost that DFT needed.
Within a year, Pople and co-workers66 combined B88 exchange with the “LYP” correlation functional of Lee, Yang, and Parr67 (based on earlier work of Colle and Salvetti68), as modified by Miehlich et al.69 to create “BLYP.” This was quickly followed70 by tests of a larger variety of functionals using the modest but efficacious 6-31G* basis set. Their software platform was the ubiquitous GAUSSIAN program, requiring only minor modifications of its Hartree-Fock technology71 and the implementation of numerical integration grids41 to incorporate DFAs. With the consequent release of GAUSSIAN 92/DFT, density-functional theory was widely launched into the chemistry community. This was a turning point. Virtually any chemist in the world could now do DFT computations.
The LO bound is a divisive issue in DFT. Atomic and molecular exchange energies are well within this bound74 by about a factor of two. Any reasonable exchange functional will satisfy Eq. (48) for any real chemical or physical density. Just examine the data in Table II. Proponents of the LO bound therefore apply it locally, even though the energies in Eq. (48) are total energies. This is okay, if you understand that the local LO bound is a manufactured bound; a sufficient but not necessary constraint in functional design. Unfortunately not everyone does. The DFT literature is permeated by absurd statements that, e.g., PW86, B86b, B88, etc. “violate” the (local) LO bound. Those who make these statements ignore the inconvenient fact that the exact exchange energy density also “violates” the (local) LO bound, in the tail of any finite system [see Eq. (46)]. It is the local Lieb-Oxford bound that “violates” exact exchange!
All matter in our terrestrial world is composed of atoms, few in number and well defined. Atomic exchange energies are computable with arbitrarily high precision. Atomic correlation energies are accurately known up to Ar75–77 and are known with reasonable accuracy beyond.78 My philosophy of DFA design, and that of many others, is that atoms are the logical calibration targets. This philosophy is often labeled “empirical.” Yet atoms are no less sacred than uniform or nearly uniform electron gases, or the sufficient-but-not-necessary local Lieb-Oxford bound. The “nonempiricists” work with selected theoretical/mathematical constraints and, in the end, assess their functionals on the same data as the “empiricists.” Both groups are really doing the same thing, except that the nonempiricists need more iterations to get there.
There is considerable literature on gradient-corrected correlation DFAs as well. This is a much more difficult problem than the exchange problem. Correlation functionals do not have the simple dimensionality of exchange functionals because, unlike exchange, correlation involves the interelectronic coupling strength to all orders. Even the uniform electron gas is a challenge, with quantum Monte Carlo simulations the only systematic route to the “right” answer. UEG parametrizations are available,21–23 as are correlation GGAs,54,55,59,79,80 predominantly from the Perdew group. The 1996 PBE correlation GGA59 is our current favourite. “PW91”80 was our pre-1996 favourite. Fortunately EC is of much less importance than EX, so minimal discussion of EC will be offered here. The next two paragraphs will suffice.
In low-Z atoms the correlation LDA makes a huge relative error, a factor of roughly two (too large), compared to the ∼10% error of the exchange LDA. Stoll et al.57,58 offered an enlightening explanation. About half the UEG correlation energy is between parallel-spin electrons, essentially smoothing out long-range oscillations in the UEG exchange hole. In finite systems, especially low-Z atoms, this parallel-spin UEG effect is inoperative. Stoll et al.57,58 suggested that the correlation LDA for opposite spins only be used in chemical applications. Thus, DFAs for EC must attenuate the full correlation LDA by about a half, or more, in light atoms. Indeed, the correlation energy of a hydrogen atom is zero.
. | Exact . | LDA . | Stoll . | PBE . | B88c(BR) . |
---|---|---|---|---|---|
H | 0 | −0.022 | 0 | −0.006 | 0 |
He | −0.042 | −0.111 | −0.057 | −0.041 | −0.041 |
Ne | −0.391 | −0.740 | −0.380 | −0.346 | −0.362 |
Ar | −0.726 | −1.423 | −0.730 | −0.703 | −0.728 |
. | Exact . | LDA . | Stoll . | PBE . | B88c(BR) . |
---|---|---|---|---|---|
H | 0 | −0.022 | 0 | −0.006 | 0 |
He | −0.042 | −0.111 | −0.057 | −0.041 | −0.041 |
Ne | −0.391 | −0.740 | −0.380 | −0.346 | −0.362 |
Ar | −0.726 | −1.423 | −0.730 | −0.703 | −0.728 |
VI. OTHER STROKES
All GGA and meta-GGA functionals, by definition, are corrections to the LDA. They all revert to the uniform electron gas at zero density gradient. Might other contexts be better, or at least equally well, suited to chemistry? In 1989, Becke and Roussel81 introduced an exchange hole model exact for any hydrogenic atom, but not exact for the UEG. It misses the UEG exchange energy by 2%–3%.82
At first glance, the BR model might seem concocted and unnecessarily complicated. It depends on all of ρ, ∇ρ, ∇2ρ, and τ. Notice, however, that a normalized off-center exponential is precisely the exchange hole in a hydrogenic atom. Thus the BR hole is exact in two respects. It replicates the second-order Taylor expansion in r12 of the exact exchange hole in general, and it fully replicates the exact exchange hole in any hydrogenic atom. Formulas for A, a, b, and may be found in Ref. 81, but are not expressible in closed analytical form. Imposition of the BR constraints yields a 1D nonlinear equation that must be solved by, e.g., the Newton-Raphson method. This is easily done. Becke-Roussel atomic exchange energies are about an order of magnitude better than the exchange LDA, but not as accurate as B88 (Table II).
To discard the uniform electron gas as a reference system, or not, is a personal choice. After a conference talk some years ago on the functionals OPTX83 and OLYP84 that do not respect the UEG limit, Nicholas Handy verbalized his own choice as follows: “I don't know about you chaps in America, but at Cambridge we have never been able to synthesize jellium!” It should be added, though, that the performance of OPTX/OLYP deteriorates for heavy atoms because the LDA does become exact, in a relative-error sense, in the infinite Z limit. It was the view of the Handy group that this deterioration resides in the cores and is of little chemical importance. The same would apply to the BR functional.
BR has been successful in myriad subsequent applications. It has been spliced into85 the correlation functional B88c34 [called B88c(BR) in Tables III and IV]. It has been generalized, in a logical and nonempirical manner, to include paramagnetic current density86 (“BRj”). It was shown by Becke87 and by Johnson, Dickson, and Becke88 that BRj solves a long-standing fundamental problem with the LDA and GGAs in open-shell atomic states. The LDA and GGAs give large energy discrepancies between atomic open-shell Slater determinants that should be degenerate! BRj fixes this. In the “reverse” direction, BR has become the foundation of our latest work on exchange-correlation DFAs based on exact exchange (see Sec. VIII). Last, but not least, BR has been used to “density functionalize” the dispersion model of Becke and Johnson (see Sec. IX).
Meanwhile the group of Perdew, after honing the GGA functional class to its PBE form,59 worked on meta-GGAs. Employing their favored strategy, satisfaction of UEG and other theoretical constraints, they obtained the “TPSS” functional89–91 and the “revTPSS” functional,92 among others. Both these functionals give the correct exchange energy of the hydrogen atom in addition to the UEG, and their MAEs on the G2/97 atomization energies are ∼5 kcal/mol, slightly better than typical GGAs.
The functional derivative of τ-dependent DFAs is problematic, because the functional τ(ρ) is not known. It was the BR functional that prompted Neumann, Nobes, and Handy93 to circumvent the problem through differentiation with respect to orbital expansion coefficients instead of functional differentiation with respect to the density. This is the trick that made Hartree-Fock theory practical long ago. Formally it is not Kohn-Sham theory, but in practice should be very close. The same strategy was employed by Adamo, Ernzerhof, and Scuseria94 to implement meta-GGAs in the GAUSSIAN program. Alternatively, Arbuznikov and Kaupp95 have implemented an “optimized effective potential” (OEP) for meta-GGAs that is equivalent to the Kohn-Sham potential. Total DFT energies are rather insensitive to the orbitals, self-consistent, non-self-consistent, or OEP, but properties such as nuclear shielding constants are much less so.95 In a recent paper on a meta-GGA relativistic kinetic-energy approximation, Becke96 has derived a variational orbital equation applicable to meta-GGAs which, again, is not the Kohn-Sham equation but should generate orbitals that are very close.
. | G2/97(ME) . | G2/97(MAE) . | HAT(ME) . | HAT(MAE) . |
---|---|---|---|---|
HF | −158.9 | 158.9 | 22.4 | 22.4 |
HFS | −23.6 | 35.0 | −12.8 | 13.0 |
LDA | 82.9 | 82.9 | −17.8 | 17.8 |
B86+PBE | 5.4 | 8.0 | −7.9 | 7.9 |
B86b+PBE | 11.5 | 12.6 | −8.7 | 8.7 |
B88+PBE | 5.5 | 8.1 | −7.6 | 7.6 |
PW86+PBE | 5.7 | 8.7 | −7.9 | 7.9 |
revPW86+PBE | 3.1 | 7.7 | −7.6 | 7.6 |
PBE+PBE | 16.3 | 16.9 | −9.5 | 9.5 |
BR+B88c(BR) | −5.9 | 9.0 | −5.6 | 5.7 |
B3PBE | 1.1 | 3.2 | −3.7 | 3.8 |
PBE0 | 1.9 | 4.4 | −3.6 | 3.6 |
B05 | 0.9 | 2.6 | 0.2 | 1.2 |
B13 | 0.8 | 3.8 | 0.9 | 1.8 |
. | G2/97(ME) . | G2/97(MAE) . | HAT(ME) . | HAT(MAE) . |
---|---|---|---|---|
HF | −158.9 | 158.9 | 22.4 | 22.4 |
HFS | −23.6 | 35.0 | −12.8 | 13.0 |
LDA | 82.9 | 82.9 | −17.8 | 17.8 |
B86+PBE | 5.4 | 8.0 | −7.9 | 7.9 |
B86b+PBE | 11.5 | 12.6 | −8.7 | 8.7 |
B88+PBE | 5.5 | 8.1 | −7.6 | 7.6 |
PW86+PBE | 5.7 | 8.7 | −7.9 | 7.9 |
revPW86+PBE | 3.1 | 7.7 | −7.6 | 7.6 |
PBE+PBE | 16.3 | 16.9 | −9.5 | 9.5 |
BR+B88c(BR) | −5.9 | 9.0 | −5.6 | 5.7 |
B3PBE | 1.1 | 3.2 | −3.7 | 3.8 |
PBE0 | 1.9 | 4.4 | −3.6 | 3.6 |
B05 | 0.9 | 2.6 | 0.2 | 1.2 |
B13 | 0.8 | 3.8 | 0.9 | 1.8 |
VII. HYBRID FUNCTIONALS
In 1993, Becke97 observed that GGAs, while dramatically reducing the massive overbinding tendency of the LDA, show a small overbinding tendency still. This can be understood from the adiabatic connection formulas, Eqs. (27) and (28). EXC depends on the coupling-strength averaged exchange-correlation hole. The λ = 0 exact exchange hole is relatively delocalized (“nonlocal”) in multicenter systems. In H2, for example, hX(1, 2) is spread equally over both centers regardless of the electron's position [see Eq. (30)]. Electron correlation then localizes the hole as it evolves from λ = 0 to λ = 1. GGA holes, however, are inherently localized at all λ, including λ = 0, and are therefore too compact. This explains why exchange-correlation GGAs, despite their sophistication, are slightly overbinding. The delocalized character of the exchange-correlation hole at λ = 0 can only be captured by replacing a small amount of DFA exchange by exact exchange!
Frisch and co-workers98 reworked B3PW91 using the LYP67 correlation DFA instead of PW91. The reworked functional employs the same three parameters as in Eq. (52) and is known as “B3LYP.” For the past two decades, B3LYP has been the most popular exchange-correlation DFA in computational chemistry.8
Perdew, Ernzerhof, and Burke99 argued that 25% exact exchange was preferable to the 20% in B3PW91. Combined with the PBE exchange-correlation GGA,59 and fixing b = 0.75 and c = 1, the corresponding functional is called “PBE0.”100 The slightly higher exact-exchange fraction, a = 0.25, in PBE0 compensates for the distinctive overbinding tendency (Table IV) of the pure PBE+PBE GGA.
These functionals are called “hybrid” functionals for obvious reasons. They mix GGA exchange with explicitly nonlocal exact exchange. Also, their implementation in the GAUSSIAN program, and in other programs, was a hybrid of technologies. The exact exchange part was implemented with well-developed Hartree-Fock techniques (differentiation with respect to orbital expansion coefficients) and the rest was KS-DFT.71 As such the part is not Kohn-Sham exact exchange.101 The orbitals are nevertheless close to true Kohn-Sham orbitals. Mixing of BR-type functionals85 and meta-GGAs with exact exchange has also been widely tested. An excellent review of GGA and meta-GGA hybrids can be found in Ref. 102. It appears that meta-GGAs need less exact exchange than GGAs. The kinetic energy density in meta-GGAs can crudely sense exact exchange nonlocality, as Van Voorhis and Scuseria103 and Becke104 have pointed out.
The parameter fitting in B3PW91 crosses a philosophical line. All the functionals in Tables II and III were determined by whatever means (fitting to atoms or fitting to theoretical constraints) once and for all time. They were not fit to molecular data. With B3PW91, fitting to molecular data was initiated. Many wish this door had never been opened. Unlike the comfortably small number of noble-gas atoms employed in the fitting of B86, B86b, or B88, there are uncountably many molecules! But there was little choice. The exact-exchange mixing fraction cannot be determined from atomic data. A few years later Becke went further,105,106 introducing a flexible form for GGA-hybrid functionals (“B97”) and for second-order GGA-hybrids (“B98”) that allowed systematic fitting of even more parameters. He stopped at ten, beyond which symptoms of overfitting set in. Some have gone further.103,107 Some have, arguably, gone too far108 (“20, 30, or even 50” parameters108).
In an essay entitled “Obituary: Density Functional Theory (1927-1993)” in 2001,109 Gill pronounced that hybrid functionals marked the death of DFT. It is a purist view that has some merit. However, the heart of Kohn-Sham DFT is not necessarily that EXC is explicitly density dependent, but that only the occupied orbitals are used, and that EXC is invariant with respect to unitary orbital transformations. Hybrid functionals, meta-GGA and BR-type functionals, and the much more complicated nonlocal functionals in Secs. VIII–X, preserve these. It will take several more years, until Sec. IX of this article, before Kohn-Sham DFT finds itself in peril.
VIII. NONLOCALITY
After the heady decades of the 1980s and 1990s, it was easy to be complacent about the “standard” DFT approximations. As DFT applications multiplied, however, the list of failures of the standard GGA and hybrid functionals grew: overstabilization of molecular radicals, poor treatment of charge transfer processes, and, most troubling of all, the inability to account for dispersion interactions (to be discussed separately in Sec. IX). All of these are truly nonlocal effects, extending beyond intraatomic to interatomic distances. They cannot be treated by local functionals depending on ρ, or even by functionals depending additionally on ∇ρ, ∇2ρ, and/or τ. All such functionals are “local” as they use information only at each integration point . Truly nonlocal effects must be sensed by explicitly sampling, at each integration point, all other points .
It has been known for some time110 that local DFAs give a much too low barrier for the simplest hydrogen-atom-transfer (HAT) reaction in chemistry, H2 + H → H + H2. The transition state is a linear H3 arrangement with an unpaired electron in an orbital concentrated on the end hydrogens. Because the electron is unpaired, electron correlation cannot localize this highly delocalized hole. It is delocalized for all λ. Therefore local DFAs, with their inherently localized exchange-correlation holes, seriously overstabilize the transition state. A valuable benchmark set of HAT reaction barriers, compiled by Lynch and Truhlar,111 underscores these problems. Table IV lists MEs and MAEs of HAT barriers in addition to G2/97 statistics. All the local DFAs, and also the hybrids derived from them, significantly underestimate the HAT barriers.
What about locality/nonlocality of the orbitals? Local DFAs prefer delocalized self-consistent orbitals relative to exact exchange. Exact exchange must localize the orbitals in order to localize the hole and hence lower the energy. The localized holes underlying local DFAs, on the other hand, obviate the need to localize the orbitals. In homonuclear dissociations of closed-shell molecules, this is a good thing. The Coulson-Fischer point in diatomic molecules (the internuclear separation at which spin-restricted orbitals break symmetry in order to lower energy) for local DFAs is much further out than for Hartree-Fock. In heteronuclear dissociations, however, local DFA orbitals are too delocalized. In charge-transfer complexes, for example, too much delocalization leads to too much transferred charge and too much stabilization. An early computational study of charge-transfer complexes by Ruiz et al.112 foretold this. A much more recent study113 of charge transfer in the TTF/TCNQ complex, as a function of exact-exchange mixing fraction, nicely illustrates (see Fig. 5 in Ref. 113) that local DFAs transfer too much charge.
By far the most embarrassing failure of local DFAs is the energy curve of the simplest bond in chemistry, , plotted in Figure 1. Near the equilibrium internuclear separation, standard DFAs perform well, giving a good bond energy. As is stretched, however, a typical local-DFA energy curve (e.g., BLYP in Fig. 1) falls increasingly below the exact curve, displays an artifactual maximum, and dives to an erroneous asymptote almost as deep as the bond minimum itself! This is the worst possible case of a highly delocalized hole whose extent cannot be sensed by local DFAs. Local DFA holes, in this worst case, are far too compact and local DFAs overstabilize stretched horribly.
All the failures of local DFAs reside in the exchange part. Also, most of the parameters in DFAs, whether “empirical” or “nonempirical,” are related to the exchange part. An overwhelming proliferation of DFAs has diluted the DFT literature in recent years. The flexibility of meta-GGAs, in particular, invites endless reworking and endless refitting. The “Minnesota” functionals are fit to almost a thousand data points, in over fifty different sets, by dozens of parameters.108 The Perdew group has also published a bewildering number of meta-GGAs. Broad applicability to solid-state and surface physics123 and to metallic and non-covalent in addition to covalent bonds124 is their objective. Good intentions notwithstanding, things are spinning out of control. How, and when, will it stop?
If “100%” exact exchange were used in DFAs instead of GGA, meta-GGA, LRC, or other exchange approximations, nonlocality errors would be eliminated. Contentious parameter fitting would, for the most part, be eliminated as well. Furthermore, exact exchange is the only way to ensure that any one-electron system is properly treated. However, exact exchange must be coupled with entirely new kinds of correlation functionals. Local electron-gas-like functionals describe short-range dynamical correlations only, and will not suffice. To be partnered with exact exchange, correlation functionals must be as nonlocal as exact exchange itself, in order to describe multicenter nondynamical (or “static”) correlation effects.
However, B05 is a very complicated functional. It depends on ρ, ∇ρ, ∇2ρ, τ, and the Slater potential, Eq. (4). It is that communicates the nonlocality of the exchange hole to the B05 nondynamical correlation model, allowing the coupling of with . Is the effort of evaluating this complicated functional worth it? With no reparametrization whatsoever, B05 gives the best HAT reaction barriers126 in Table IV by far. B05 appears to treat delocalized radical systems very well, as was the hope for this new class of functionals. Also, B05 is exact for any one-electron system. The highly embarrassing problem is no problem for B05. It is noteworthy too that, beginning with B05, absolutely nothing from the uniform electron gas is used in our functionals. Becke-Roussel and B88c ideas are used throughout.
Exact-exchange-based functionals have also been studied by Mori-Sanchez, Cohen, and Yang127,128 (MCY2). MCY2 is based on modeling of the adiabatic connection and contains three empirical parameters. It has the advantage over B05 that the Slater potential is not required. Therefore, MCY2 is similar in cost to standard hybrid functionals.
Perdew and co-workers have pursued the construction of exact-exchange-based functionals as well. The first of these is from Perdew et al.129,130 (PSTS) and contains five empirical parameters. Like B05, PSTS requires the Slater potential. More recent work by Odashima, Capelle, and others131,132 is “nonempirical.” Their results are exploratory so far, with G2/97 MAEs exceeding 10 kcal/mol. Perdew has called this class of functionals “hyper GGAs,” an unsuitable name because they do not resemble GGAs at all. Rather, exact exchange replaces GGA exchange altogether, and the full nonlocality of the exact exchange energy density is input to the correlation part. “Exact-exchange-based” (EXX-based), “true correlation,” or “pure correlation” are better names.
B05, MCY2, and PSTS have been assessed by Liu et al.,133 using an efficient RI (resolution of the identity) implementation of Proynov et al.134–136 for self-consistent computations. They find that B05, MCY2, and PSTS do not perform quite as well as heavily parametrized meta-GGAs or LRC GGAs on standard thermochemical tests. The EXX-based functionals are significantly better, however, in dissociation curves, including and the equally problematic . Also, B05 and MCY2 give the correct singlet ground state for the NO dimer, a well-known stringent test of nondynamical correlation, and a reasonable binding energy. PSTS gives the singlet ground state but too much binding. All the meta-GGAs and LRC GGAs qualitatively fail the NO-dimer test.133
With these explicitly nonlocal, exact-exchange-based functionals looking promising, it is tempting to proclaim that “universal” DFAs are in sight. But there is one more nonlocality to consider.
IX. DISPERSION INTERACTIONS: THE FLOODGATES OPEN
The great promise of DFT is the simulation of complex systems of very large size: biological systems, materials, surfaces, and interfaces. In areas such as these, van der Waals (vdW) interactions play a crucial role. Standard DFAs treat hydrogen-bonding interactions fairly well,137 as their origin is largely electrostatic. Dispersion interactions, on the other hand, are highly problematic for conventional functionals. I have heard it said that “chemistry of the 20th century was about intramolecular interactions; chemistry of the 21st century will be about intermolecular interactions” (Mark Ratner in a 2004 seminar). It is absolutely critical that DFAs be able to treat dispersion interactions.
Kristyan and Pulay published an early warning in 1994,138 stating the more-or-less obvious fact (but it needed to be said) that local DFAs cannot give an asymptotic dispersion interaction of London form, −C6/R6. The asymptotic interaction energy of local DFAs falls off exponentially instead. Undeterred, a few studies of conventional DFAs and noble-gas vdW curves appeared in the DFT literature.139–142 Some functionals, including the LDA, bound noble-gas dimers. Some did not. Agreement with experimental data, when binding was observed, was haphazard at best. It was all very confusing.
A few papers were enlightening. Lacks and Gordon143 suggested that pure exchange DFAs should be assessed on their ability to reproduce exact-exchange noble-gas curves, which are repulsive. Their best picks were PW86 and B86b, whose enhancement factors fX(s) both have the same asymptotic behavior, s2/5. The important influence of asymptotic fX(s) behavior on noble-gas curves was highlighted by Zhang, Pan, and Yang144 as well. But the fact remained that no known correlation functional could generate the London −C6/R6 dispersion energy.
Strategies to add dispersion in a less empirical way were also investigated. Moller-Plesset (MP2) perturbation theory and Goerling-Levy second-order Kohn-Sham perturbation theory148 account for dispersion. Indeed MP2 tends to overestimate it. Several groups therefore introduced double hybrid functionals combining GGAs, exact exchange, and second-order perturbation theory.149–151 Goerigk and Grimme have reviewed these methods in Ref. 152. Double hybrid DFAs are, in my opinion, undesirable. Their formal computer-time scaling is an order higher than Hartree-Fock or KS-DFT. They are philosophically undesirable as well, from the “occupied-orbitals-only” point of view.
The group of Tsuneda and Hirao162,163 have combined the Andersson-Langreth-Lundqvist153 underpinnings of vdW-DF with the LRC functional of Ref. 117 to build a general Gaussian-type-orbitals methodology going beyond just vdW interactions. Also in the vdW-DF family and designed for general chemical applications, are the functionals of Vydrov and Van Voorhis.164
Outside the vdW-DF family, many researchers continued to work on corrections of the London, Eq. (56), form. A nonempirical way to obtain interatomic dispersion coefficients, employing only occupied Kohn-Sham orbitals, was needed. Becke and Johnson developed a nonempirical approach in a series of papers beginning in 2005.165,166 The concept is simple. The exchange hole, hX(1, 2), of an electron at in a nonuniform system is generally not centered on the electron. Hence the electron plus its hole is a neutral entity with a nonzero -dependent “exchange-hole dipole moment,” , easily calculable from Eq. (30). This position-dependent XDM moment, if taken in each of two distinct atoms, generates a dipole-dipole interaction which, when inserted into second-order perturbation theory, generates a dispersion energy.167
The initial XDM work of Johnson and Becke employed exact (HF) exchange, the B88c(BR) dynamical correlation DFA,34,85 and the above dispersion correction. The subsequent goal of Kannemann and Becke170–172 was to find an optimum exchange-correlation GGA with which to partner XDM. There is enormous variability between exchange DFAs and their noble-gas interaction curves; everything from massive over-repulsion to massive artifactual binding. Kannemann and Becke concluded,170 as did Lacks and Gordon,143 that PW86 is the optimum exchange GGA for reproducing exact-exchange (Hartree-Fock) noble-gas curves. Murray et al.62 concluded the same in consideration of intermolecular interaction curves. They therefore revisited Perdew's original model61 and reparametrized it62 in the light of more recent knowledge (“revPW86” in Tables II and IV), and inserted revPW86 into vdW-DF2.158 The choice of GGA for dynamical correlation is not critical. Kannemann and Becke chose PBE.59 The PW86+PBE+XDM combination was tested with Gaussian-type-orbitals in Ref. 172, and has been fully developed into an efficient and powerful general-purpose chemical tool in Ref. 173.
Following quickly after XDM, other nonempirical dispersion models appeared: Tkatchenko and Scheffler174 (“TS”), Sato and Nakai175 (“LRD”), and Grimme et al.176,177 [“D3” and “D3(BJ)”], each based on a different underlying strategy. TS174 employs a volume ratio trick [Eq. (7) in Ref. 174] to determine atom-in-molecule dispersion coefficients and vdW radii from free-atom values. This trick is from Becke and Johnson [Ref. 167 (1st paper) and Ref. 169]. The LRD (local response dispersion) method of Sato and Nakai175 is related to the vdW-DF family but avoids the double integration in Eq. (57). The D3 model of Grimme et al.176 uses precomputed reference data for dispersion coefficients, and the concept of “fractional coordination number” to mimic the environment of an atom in a molecule. D3(BJ)177 is a refinement of D3 that incorporates the rational damping, Eq. (58), of Johnson and Becke.169 Grimme's models are garnering popularity among chemists, as they are the simplest of the above to implement.
All of this was a dramatic reversal of fortune. Circa 2000, with nothing in sight but empirical dispersion corrections, the future of DFT was bleak. By 2010, everything had changed for the better.
Applications of DFT dispersion methods are rapidly growing9 and theoretical progress continues unabated. The thermochemistry benchmark sets40 that were indispensible for DFA development in the past have been joined by vdW benchmark sets. The S66 set of Rezac, Riley, and Hobza178 for biochemical systems, the set of supramolecular complexes of Risthaus and Grimme,179 and the molecular solids set of Otero-de-la-Roza and Johnson180 are a few examples. Three-center dispersion interactions, in addition to two-center interactions, are being studied.176,181–183 Screening effects are being studied as well.184 I think the jury is still out on the importance of three-center interactions and screening. The balance between these latter effects, higher-order C8 and C10 terms, damping functions, and the exchange-correlation DFAs themselves, is incredibly delicate.
The dispersion problem has also stimulated great interest in the “random phase approximation” (RPA) of the physicists185 and its application to molecular systems.186 RPA invokes the virtual orbitals and, like all WFT methods that use the virtuals, naturally incorporates dispersion. The RPA literature is complicated, confusing (at least to me!), and growing fast. Early explorations of RPA in a DFT context were made by Langreth and Perdew.30 Recent flavors of the subject may be found in Refs. 187–191. The floodgates are open now. The virtual orbitals are pouring in. Double hybrids were just the first step. The ultimate conclusion of combining DFT and WFT is what Bartlett calls192 “ab initio DFT,” a mathematical and theoretical framework that exploits the best of both worlds and seeks a systematic route to improving both. Operationally though, it is not in the occupied-orbitals-only spirit of Kohn and Sham.
Gill's pronouncement109 of the death of DFT was perhaps premature, but not unfounded. The dispersion problem has, in large part, plunged DFT into an identity crisis. Have we forgotten why Kohn-Sham theory is popular in the first place? If employment of the virtual orbitals increases, I am afraid the beauty and simplicity of the Kohn-Sham vision will be lost.
X. INTO THE FUTURE
With dispersion interactions in place, the future looks bright. But we cannot, quite yet, forget the basics. The focus of ground-state DFA developers has shifted to what I hope is the last frontier, “strong” correlation. Strong correlation refers to substantial mixing of configurations in systems with a small HOMO-LUMO gap. Chemical problems in which the gap approaches zero include dissociation of homonuclear bonds and “avoided-crossing” reactions. Systems with a large number of low-lying virtual states are especially troublesome. The chromium dimer, Cr2, is a nightmare test of strong-correlation methods in quantum chemistry.193,194 We will visit this nightmare below.
Like B05, the five (or six) parameters in B13 are “fine tuning” linear prefactors with logical values rooted in the adiabatic connection. All of , and have value ∼0.6, the prefactor in has value ∼0.4, and it is no accident that these add up to ∼1. B13 is essentially nonempirical.
Systems at “avoided crossings” have a zero HOMO-LUMO gap and are challenging strong-correlation problems. An excellent discussion of the energy surface of the prototypical H4 system, with an avoided-crossing transition state at the square geometry, can be found in Ref. 202. Fractional occupancies202,203 are implicated in this and other avoided-crossing problems.204 It is not possible to represent the D4h density of the open-shell square H4 structure, or structures near it, without fractional occupancies.202,204 The B13 barrier of 141.2 kcal/mol in this H4 system (unpublished) is in fair agreement with the high-level WFT barrier202 of 147.6 kcal/mol. A few other B13 avoided-crossing barriers can be found in the very recent Ref. 204.
Like B05, B13 is a functional of ρ, ∇ρ, ∇2ρ, τ, and the Slater potential , Eq. (4). Direct computation of the Slater potential on all grid points is costly, and differentiating B05 (or B13) with respect to orbital expansion coefficients for the purpose of self-consistent computations is daunting. Yet both of these tasks have been accomplished for B05134–136,205 and are therefore in hand for B13. Self-consistent B05/B13 computations are a few times lengthier than, e.g., B3LYP, but are not intractable thanks to RI (resolution of the identity) bases. If self-consistency is not desired, then post-LDA B05/B13 computations are a fast alternative.133–136
The distributed Slater potential may also be computed in an indirect manner by a literal implementation of Slater's original orbital-averaged HF-potential concept (Sec. II). A simple reorganization of the HF equations is all that is required.206–208 Thus, post-Hartree-Fock implementations of EXX-based functionals like B05, B13, or PSTS are conceivable at essentially no cost beyond HF itself (plus, perhaps, an additional HFS computation to correct for basis-set artifacts207,208). Post-Hartree-Fock DFT is not new. The 1992 implementation of BLYP by Gill et al.66 was post-HF, followed by Oliphant and Bartlett209 and others. Numerous DFT researchers had used HF orbitals long before then. Post-Hartree-Fock DFT has advantages over self-consistent DFT. As Verma, Perera, and Bartlett210 have reminded us, HF requires no numerical integration grids and therefore has high speed and orbital precision (i.e., no grid noise). However if forces are required, perturbations of the HF orbitals need to be computed and some advantage is lost.210 Overall, though, post-Hartree-Fock DFT chemistry and self-consistent DFT chemistry are of similar cost. Future prospects for nonlocal EXX-based DFAs employing the Slater potential, whether implemented post-HF, post-LDA, or self-consistently, are exciting.
Looking back thirty years to the genesis of DFA growth, I am amazed at where we have come. Our goal back then was to replace all things WFT, including Hartree-Fock, with simpler density-functional approximations. I did not imagine, until 2003, that I would ever espouse the use of exact exchange. Yet this is my position, today, on how to proceed into the future. We have come “full circle” back to exact exchange or, if you prefer, Hartree-Fock. Figure 3 plots the path. Post-Hartree-Fock WFT, circa 1980, involved (still does) configuration mixing. Post-Hartree-Fock DFT, circa 2010, is a very different thing. Pure, nonlocal correlation functionals are DFT at its best. Everything we have learned about GGAs, meta-GGAs, BR, etc., these past thirty years is in these functionals. To understand nonlocality, you have to understand locality first. Although we have come full circle ultimately, the thirty intervening years were well spent.
XI. SUMMARY
DFT detractors complain that the linchpin of KS-DFT, the exchange-correlation functional and its underlying hole, will never be expressible in closed analytical form and is not amenable to systematic improvement. DFA development has, in fact, been systematic, as the history in this Perspective has hopefully shown. We started with the LDA, which samples only the density at each grid (integration) point. GGAs sample density gradient information in addition to the density. Remarkably, the gradient helps us capture the physics of the exchange energy density out to the asymptotes of finite systems (B88). Next we included kinetic energy density, τ, to ensure zero correlation energy in one-electron systems and to exploit its crude knowledge of nonlocality. The BR functional includes ∇2ρ as well, exactly sampling the exchange-hole Taylor expansion as far as second order in r12. The full multicenter nonlocality of the exact exchange hole is partially incorporated into hybrid functionals. Recent “100%” EXX-based functionals, and functionals including dispersion and even virtual orbitals, are truly nonlocal. We have progressed from the LDA to full nonlocality in a rational and systematic manner, climbing the rungs of “Jacob's DFT ladder” (an image, popularized by Perdew,211 recounting the above advances as successive rungs in Jacob's biblical ladder) as we reach for DFT heaven.
But DFT heaven is probably unattainable. DFAs, local or nonlocal, will never be exact. Users are willing to pay this price for simplicity, efficacy, and speed. As we strive to make DFAs more robust, will their accuracy and speed continue to satisfy users? Will Kohn-Sham DFT, in its pristine occupied-orbitals-only form, survive the pressure? Or will DFT and WFT transform each other and merge into a single entity. The next fifty years will be as interesting as the first.
ACKNOWLEDGMENTS
I am very grateful to the Killam Trust of Dalhousie University for my appointment as Killam Chair since 2006. Research support has been provided by the Natural Sciences and Engineering Research Council (NSERC) of Canada, and computing support has been provided by the Atlantic Computational Excellence Network (ACEnet).