We report a new implementation of the density functional embedding theory (DFET) in the VASP code, using the projector-augmented-wave (PAW) formalism. Newly developed algorithms allow us to efficiently perform optimized effective potential optimizations within PAW. The new algorithm generates robust and physically correct embedding potentials, as we verified using several test systems including a covalently bound molecule, a metal surface, and bulk semiconductors. We show that with the resulting embedding potential, embedded cluster models can reproduce the electronic structure of point defects in bulk semiconductors, thereby demonstrating the validity of DFET in semiconductors for the first time. Compared to our previous version, the new implementation of DFET within VASP affords use of all features of VASP (e.g., a systematic PAW library, a wide selection of functionals, a more flexible choice of U correction formalisms, and faster computational speed) with DFET. Furthermore, our results are fairly robust with respect to both plane-wave and Gaussian type orbital basis sets in the embedded cluster calculations. This suggests that the density functional embedding method is potentially an accurate and efficient way to study properties of isolated defects in semiconductors.
I. INTRODUCTION
Point defects universally exist in all types of solid state materials and play important roles in engineering material properties. The presence of defects significantly affects the chemical, electrical, optical, and mechanical properties of the material.1 Understanding the physics of point defects is thus crucial for designing new materials for various applications. In particular, defects have been shown to be one of the key factors determining the performance of thin film solar cells:2 on the one hand, certain levels of intrinsic defects or extrinsic dopants are the source of free carriers, increasing the conductivity of the semiconductor; on the other hand, wrong types of defects also serve as non-radiative recombination centers and introduce immobile trap states. It is thus of paramount importance to understand defect states in order to ultimately control them. The dearth of experimental techniques to probe bulk atomic-scale defects motivates use of first-principles quantum mechanics calculations to do so.
Density functional theory (DFT) based on plane-wave (PW) basis sets is currently the state-of-the-art in the material science community for describing periodic extended systems. Although being successful in general, DFT suffers from approximate treatment of electron exchange-correlation (XC), resulting in self-interaction error3 and the related underestimation of band gaps. In quantum chemistry, more accurate correlated wavefunction (CW) methods (Møller-Plesset (MP) perturbation theory, configuration interaction (CI), coupled cluster (CC), etc.) are available, but these methods are not straightforward to implement in a periodic setting and much more expensive within the PW formalism.4 This is especially true for the simulation of isolated defects, which usually requires large supercells to avoid interactions between periodic images. The poor scaling behavior of the PW-CW methods significantly limits their applicability in these scenarios. This problem becomes even more severe for charged defects, which requires careful consideration using periodic boundary conditions (PBCs). For example, in some solar cell materials, the charges carried by the defects create local electrostatic potential fluctuations, which can be a major reason for low open-circuit voltages.5 To study such charged defects, besides large supercells, specially designed corrections6–9 are also needed to diminish the long-range electrostatic interactions between periodic images.
A promising alternative to eschew these difficulties associated with the periodic PW formalism is the use of embedded cluster calculations. In this approach, we exploit the locality of the defect and construct a finite cluster model, representing the critical region of interest. The influence of the surrounding parts of the system (the environment) is represented by an embedding potential. The environment and its interaction with the cluster can be computed using a lower level method such as a DFT approximation, while the cluster can be studied using high-level embedded correlated wavefunction (ECW) methods. Various approaches to constructing the embedding potential have been proposed, which have been summarized elsewhere.10–12 In this work, we focus on the density functional embedding scheme developed by Huang et al.13 In this scheme, one can define a unique embedding potential for both subsystems, which exactly reproduces the electron density of the total system by summation of the electron densities of the two embedded subsystems. By solving an optimized effective potential (OEP) problem, we obtain this unique embedding potential at the DFT level (see below for details). We can now conduct ECW calculations on the embedded cluster without PBCs, using the well-developed formalisms of quantum chemistry. The density functional embedding scheme has been successfully applied to studying absorbates (such as H2 or O2) on metal surfaces, including Al and Au.14–16 While it has been shown that the embedded metal cluster can decently reproduce bulk metal properties, the accuracy of the density functional embedding method beyond metallic systems has been tested sparingly.17 Several other embedding schemes have been applied to systems featuring more covalent bonding character, in both molecules and condensed matter.10–12,18–22 We therefore would also like to extend the scope of the density functional embedding theory (DFET) method beyond metals. In this paper, we will demonstrate some preliminary test cases illustrating the potential capability of the density functional embedding method for the study of semiconductor defect systems.
While performing PW-DFT calculations on semiconductors, there are two primary ways to handle the large wavefunction amplitudes near the core region that otherwise would require very large kinetic energy cutoffs using PWs: pseudopotentials (PPs) and the projector-augmented-wave (PAW)23,24 method. The density functional embedding method was implemented previously in the ABINIT25–27 program, in conjunction with norm-conserving PPs.28 Meanwhile, another highly parallelized program that is widely utilized in the material science community is VASP.29–32 VASP provides a wide selection of XC functionals, ranging from local density approximation (LDA)33 to hybrid functionals such as Heyd-Scuseria-Ernzerhof 06 (HSE06).34 Compared to ABINIT, VASP also provides a simpler U-J correction formalism developed by Dudarev et al.,35 which is often employed to alleviate the self-interaction errors for transition metal elements in semiconductor and ionic materials. More importantly, VASP offers a robust, well-tested library of PAW potentials that covers the entire periodic table. The PAW formalism is more accurate than PPs, as it (i) offers an exact transformation between the original, oscillating wavefunction and the pseudized, slowly varying pseudo-wavefunction used in the computation and (ii) explicitly treats the all-electron wavefunctions, albeit within the frozen-core approximation. The implementation of PAW in VASP offers an efficient way to conduct frozen-core, all-electron calculations with reasonably low kinetic energy cutoffs (typically even lower than the norm-conserving PP calculations). Considering these advantages, we would like to extend our implementation of the density functional embedding scheme to VASP. This extension should enable us to conduct embedding calculations with the PAW-DFT formalism beyond the current PP setup. As we will show, the PAW formalism introduces additional complications into the embedding scheme, which require a series of modifications of the embedding algorithm. In the following, we describe the algorithm developed in the new implementation and we test the robustness of the new algorithm on various types of systems, including covalently bound molecules, metals, and most importantly, semiconductors including defects.
II. THEORY AND IMPLEMENTATION DETAILS
A. Density functional embedding theory
In the following, we briefly review the algorithm and the numerical details of the density functional embedding method here. Interested readers can find a detailed treatment in Ref. 13 for more complete theoretical derivations.
Consider a total system partitioned into two subsystems, labeled as system A and system B, respectively. The essential idea of the DFET is to find a unique embedding potential Vemb for both systems A and B, such that
The , represent the ground-state electron densities of the two subsystems obtained from separate, converged self-consistent field (SCF) calculations in the presence of Vemb and nref represents the ground-state electron density of the total system, computed without embedding potential. In this work, we always use the same embedding potential for different spin states, and consistently we always perform non-spin-polarized calculations to maintain the spin symmetry. In practice, in order to obtain Vemb, we need to maximize the following extended Wu-Yang (WY) functional36 with respect to Vemb:
Here, EK[nK], K = A, B is the energy functional of system K, including its interaction with . Given that
it can be proven that the derivative of the WY functional with respect to Vemb is simply13
Equation (4) gives the gradient used in the optimization procedure: when W reaches its maximum, the gradient vanishes and condition (1) is naturally satisfied. The entire procedure is effectively solving for an OEP, which, in our case, turns to be the optimal unique embedding potential.
Analytically, Eq. (3) is guaranteed by the variational principle. In the current implementation in ABINIT, all the functions (densities, potentials, etc.) are expressed on one uniform three-dimensional grid and all integrations are performed as summations over this grid. In this circumstance, Eq. (3) always holds numerically, even using a finite grid size. However, as we discuss below, this is no longer true for dual grid schemes.
B. Derivation of the embedding potential within the PAW formalism
One of the key steps of the density functional embedding procedure is to perform SCF-DFT calculations for both subsystems embedded in an external potential Vemb, from which both densities (nK) and energies (EK[nK]) are obtained. Therefore, we need to consider how to conduct PAW-DFT calculations in conjunction with the presence of an arbitrary external potential. Arbitrary in this context refers to the typical length scale of potential fluctuations: while including a slowly varying potential (on the length scale of interatomic distances) is quite straightforward, rapid potential fluctuations close to the ionic cores affect the PAW projection and thus need to be treated carefully. In this section, we briefly review the PAW formalism and discuss the necessary modifications introduced by the arbitrary external potential. The readers are referred to previous papers23,24 for a more detailed description of the PAW method.
Using Blöchl’s nomenclature,18 the exact all-electron (AE) wavefunction of the n-th band is expressed as a soft nodeless pseudo (PS)-wavefunction via Eq. (5),
where and are the onsite AE and PS basis sets, respectively. These two sets of basis functions are identical beyond the core radius Raug but differ inside the core region. are the projector functions that are orthonormal to the PS basis: . The projection introduced by Eq. (5) damps the sharp oscillations of the wavefunction within the core radius Raug such that the resulting soft PS wavefunction can be expanded using considerably fewer plane waves. Usually, the frozen-core approximation is employed, i.e., the AE wavefunctions of the core electrons are fixed during the SCF iterations. In contrast to norm-conserving PPs, the exact AE wavefunctions for both valence and core electrons are explicitly represented and expanded using the ansatz given by Eq. (5). Therefore, the PAW scheme can effectively be considered as a frozen-core, all-electron level of theory.
In correspondence with the decomposition of the wavefunction, the total density of the system is decomposed into three parts,
The first term is the PS density calculated directly from PS wavefunctions on the uniform PW grid. The second (n1) and third () terms are the one-center densities (denoted by the superscript 1) defined within the augmentation sphere, associated with the AE and PS basis functions, respectively. Due to the rapidly oscillating character of the one-center terms, both densities are computed on dense radial grids centered on the ion positions.
Within the PAW ansatz, the Kohn-Sham (KS) Hamiltonian of the system can be written as
Here, is the Hartree and XC potential due to the soft PS valence electron density , the PS core electron density plus the nuclear charges , and the compensation charge . All these terms are soft (meaning they have no sharp oscillations) so they can be evaluated and stored on a coarse, uniform grid. The compensation charge and its angularly decomposed components are introduced to compensate the different multipole moments of the AE and PS densities within the core region.24 The term in Eq. (10) is the one-center counterpart of , which is still soft but evaluated on much finer, ion-centered radial grids. The sharp one-center potential corresponds to the oscillating components of both valence and nuclear-core charges (n1 and nZc). Naturally, these strongly varying terms are computed and stored on the fine, ion-centered radial grids.
When we introduce an embedding potential in the system, it enters all three potential expressions, resulting in extra terms (compare to Eqs. (43), (45), and (46) of Ref. 24),
where and nc are the PS and AE core electron densities and and nZc are the core electron densities plus the nuclear charges. It is straightforward to perform the addition in Eq. (11), as both the original soft potential and the embedding potential Vemb are stored on the same uniform grid. However, in Eqs. (12) and (13), the one-center potentials ( and ) are expressed on an angularly decomposed radial grid. Therefore, in order to incorporate the embedding potential, an extra step is needed to transform Vemb from the uniform grid to radial grid, using the following projection:
In this equation, SLM are the real spherical harmonics, is the LM-component of the embedding potential, in which L and M denote the angular momentum of the corresponding channel. r and are the length and direction of the vector , and is the ion position. Then, the angularly decomposed is added in Eqs. (12) and (13) to account for the effects of the embedding potential on the one-center terms in PAW.
The dual grid transformation in Eq. (14) has profound consequences for the evaluation of the energy derivatives with respect to the embedding potential. According to Ref. 24, the total KS energy is
Here, the first term is the summation of the orbital eigenvalues, which double counts the electron-electron interactions. In order to correct for this error, we have to include the double counting terms, which are denoted with subscript “dc.” The total energy also includes the ion-ion interaction term, which is represented by . Considering the variational principle and using the definition of (Eq. (27) in Ref. 24), the derivative of the total energy is
in which the onsite density matrix ρij is defined as
Here, the double counting terms (denoted with subscript “dc”) and the ion-ion interaction term vanish as they do not contain Vemb explicitly. We only consider the valence bands for the summation over n since we only aim to match the valence electron densities, as usual within the frozen-core approximation. The first two terms of Eq. (16) are the soft charges given by the PS wavefunction, which are easy to compute. However, the exact evaluation of the third term is less straightforward, as Vemb enters and in an indirect way. In consistent with the radial grid used in the VASP code, we insert Eqs. (9), (10), (12), and (13) into Eq. (16) and evaluate all the integrals using radial coordinates. In this way, we reach the following expression for the third term in Eq. (16):
where i = (l, m) and j = (l′, m′) denote the angular momenta of the projectors ( and ). is the integration of the product of three real spherical harmonics (), which can be easily calculated via linear combinations of Clebsch-Gordan coefficients. L, M are the angular indices for the angular components of the embedding potential and Qlml′m′ and are the depletion charges,
The definition of the compensation function gL(r) is given by Eq. (61) in Ref. 24. In Eq. (18), the most critical step is evaluating , which defines how the values on the radial grid vary with respect to the values on the uniform grid. The speed of computing this term depends on the projection algorithm we use for Eq. (14). We will compare different projection algorithms in more detail in Secs. II C and II D after we clarify the modifications we need to make to the OEP procedure.
As can be proven analytically, the energy derivative given by Eq. (16) should be identical to the AE density , consistent with Eq. (3). However, at a typical PW grid density, the energy derivative significantly differs from the AE density given on the uniform grid. In Fig. 1, we demonstrate this difference using the Cl2 molecule as an example (for more computational details, see Sec. III). It is clear that the AE density exactly corresponds to the energy derivative in the interstitial region but shows much stronger variations in the core region.
Comparison of different densities for the Cl2 dimer system. The black solid curve is the soft electron density () and the red dashed curve is the AE density for the valence electrons (). The blue dotted curve is the exact energy derivative (). All data are plotted along the bond axis of the Cl2 molecule. The two Cl atoms are located at 4 Å and 6 Å, respectively.
Comparison of different densities for the Cl2 dimer system. The black solid curve is the soft electron density () and the red dashed curve is the AE density for the valence electrons (). The blue dotted curve is the exact energy derivative (). All data are plotted along the bond axis of the Cl2 molecule. The two Cl atoms are located at 4 Å and 6 Å, respectively.
The reason for this discrepancy is directly related to the dual grid scheme adopted in the PAW algorithm. Since the radial grid is much finer than the uniform grid, the transformation from the uniform grid to the radial grid must use an interpolation. Consequently, the change of a potential value at one uniform grid point effectively leads to changes of potential values at multiple radial grid points. In other words, even though the derivative should have rigorously a δ-function form (), it is slightly nonlocal in practice. Therefore, the resulting energy change is no longer determined by the density value on one particular point (), but rather a smeared average of the densities of the neighboring region, explaining why the energy derivative grid is always softer than the AE density grid.
As a consequence, Eq. (3) breaks down and the WY functional given by Eq. (2) is no longer appropriate, as the simple formula for the derivative given by Eq. (4) no longer holds. In practice, the difference between the densities and the derivatives is so large that no meaningful optimizations can be done if we use densities as an approximate derivative. To resolve this problem, we slightly modify the WY functional,
With the associated derivative,
Basically, we replace the densities with the energy derivatives. Instead of matching the densities, we match the exact energy derivatives, which is physically similar since the exact energy derivative is effectively a smeared average of the density. We will show in Sec. III that this modification indeed does not change the essential physics of the original density functional embedding method. To conduct the optimization, we have to compute the exact energy derivatives for both the total system and the subsystems in each iteration, using Eqs. (16) and (18).
C. Reciprocal-space projection algorithm
As we discussed in Sec. II B, we need to transform the embedding potential from the uniform PW grid to the PAW radial grids (Eq. (14)). This projection can be performed using a reciprocal-space algorithm, which starts with the Fourier interpolation of the embedding potential,
Then, we utilize37
to expand the phase factor in spherical harmonics and spherical Bessel functions, obtaining
Here, we define as the direction of . Comparing it with Eq. (14) yields
Here, , and represents the n-th uniform grid point in real space, with n = (i, j, k) ranging from (1,1,1) to (Nx, Ny, Nz). is the coordinate of the ion position. N is the total number of uniform grid points (N = NxNyNz), and is the k-th point in reciprocal space. With this projection formalism, we can easily derive an expression for
The reciprocal-space projection algorithm is a very robust algorithm that we found well-behaved in all scenarios. However, the derivative given by Eq. (26) is extraordinarily expensive to compute. The summation in Eq. (26) must be conducted for every radial grid point r and every uniform grid point near every ion . The summation over the entire reciprocal space () contains millions of terms, making the treatment of realistic application cases challenging. Nevertheless, the robustness of the reciprocal-space projection algorithm makes it a perfect benchmark method for more approximate projection algorithms. Therefore, we used this algorithm to compute the exact derivatives in Fig. 1.
D. Real-space projection algorithm
The intrinsic reason for the heavy computational cost of the reciprocal-space projection algorithm is the nonlocality of the Fourier interpolation given by Eq. (22). Basically, the potential value at an arbitrary position depends on the potential values on all the reciprocal-space grid points (), which are again determined by the values on all the real-space grid points (). In other words, we need the information of the entire potential grid before we can find the interpolated potential value at a particular position. This is counter-intuitive, as should only be affected by the values on the grid points that are spatially close to . To overcome the nonlocality associated with the Fourier transformation, we need a projection algorithm based on a real-space interpolation scheme.
The algorithm we adopt is a modified version of Misner’s decomposition method.38 The basic idea is to first expand the potential using both angular and radial basis sets,
Then, we reassemble the radial expansion to obtain ,
Here, RK(r) are orthonormal radial basis functions (vide infra) and VLMK are expansion coefficients computed via integrations on a uniform grid,
To conduct the integration in Eq. (29), we introduce a secondary uniform grid for each ionic core, which is only defined within the augmentation sphere Ω defined by Raug. The introduction of the secondary grid is the first major improvement we made to Misner’s original method. It is necessary as the primary PW grid is usually not fine enough for the evaluation of VLMK. In this work, we always adopt a comparatively dense dimension of 50 × 50 × 50 for the secondary grid. To interpolate the potential values on the secondary grid () from the primary grid (), we utilize the uniform cubic B-spline scheme39 generalized to 3D space. Basically, we use the localized 1D cubic B-spline functions defined for each primary grid point as the fundamental basis set,
We assume that our 3D continuous potential can be expanded using these basis functions,
Once we have the expansion coefficients fijk, we can compute the potential values on the secondary grid using Eq. (31). The coefficients fijk are chosen such that the continuous interpolated matches all the values on the primary grid . In practice, we need to construct a series of 1D δ-grids for each of the dimensions to obtain fijk,
Following the standard 1D interpolation procedure,39 we interpolate the i-th 1D δ-grid in the x direction by solving the following equation for the 1D interpolation coefficients :
Note that due to permutation symmetry, Eq. (33) needs to be solved only once for all the i and i′. The resulting coefficients are very localized, given that vanishes quickly with respect to increasing . Similarly, we can obtain the interpolation coefficients and for y and z directions, respectively. Then, we construct the interpolation coefficients for the 3D δ-grids,
The final coefficients fijk for are computed using Eq. (35), recognizing that the grid can be written as a linear combination of the 3D δ-grids, ,
With this procedure, it is also straightforward to evaluate the term by exploiting the linearity of the interpolation scheme. When we consider for a particular uniform grid point n = (i, j, k), we only need to replace the term in Eq. (29) with the δ-grid defined in Eq. (34) and insert it into Eq. (28). The resulting radial function from Eq. (28) is exactly
Formally, it is not obvious why this procedure would be more efficient in evaluating derivatives, as the second summation in Eq. (36) still involves 50 × 50 × 50 terms, i.e., the size of the secondary grid. However, noticing that the interpolated function for the 3D δ-grids () is strictly localized around the primary grid point , the summation in Eq. (36) can be substantially reduced. With simple distance cutoffs, most of the terms with large can be dropped without losing accuracy. In fact, the locality of the interpolated δ-grid is a reflection of the locality of the real-space cubic B-spline interpolation algorithm, which is the intrinsic reason for the orders of magnitude speedup provided by the real-space projection procedure.
Note that , as given by Eq. (36), and thus Eq. (18), is independent of , further reducing the computational cost of the interpolation scheme. As long as the grid setup, the PAW library setup, and the atomic geometry are unchanged, the results from Eq. (18) remain constant. Consequently, during the OEP optimization, Eq. (18) needs to be computed only once in the beginning, and the results can be recycled in further iterations. This property further speeds up the code by several orders of magnitude.
E. Implementation details for real-space projection
In this section, we introduce several important aspects about the implementation of the real-space projection algorithm.
The most important factor in the real-space projection is the selection of the radial basis sets RK(r). Instead of the r-scaled Legendre polynomials used in the original Misner method, we adopt sinc functions,
where sinc functions are orthonormal within [0, Rcut], which is a fundamental requirement for RK(r),
In the r → 0 limit, all the sinc functions have well-defined finite limits, in contrast to the r-scaled Legendre polynomials, which diverge at short range. Because of this property of the sinc functions, the integration in Eq. (29) is well-defined within the entire augmentation sphere. By contrast, in the original Misner method, the real-space integration can only be performed within a shell to avoid the numerical instability at r → 0. This is an another major improvement we made to the original Misner method.
Although the above procedure is theoretically feasible, we find that the resulting is still numerically inaccurate at very short range. Increasing the number of radial basis functions (Kmax) improves the accuracy but slows down the computation. Furthermore, higher K values introduce strongly oscillating basis functions, which require a much finer secondary grid for the real-space integration. Compared to a radial grid, the uniform Cartesian grid lacks the necessary accuracy close to the center of the sphere, introducing larger numerical errors. This is an intrinsic problem for uniform grid layouts and should be avoided also from other considerations. If we Taylor-expand the potential and project it onto different angular channels, we can derive the short-range behavior of analytically,
To avoid the problem stated above, we switch to Eq. (39) in the limit r < ϵ. The value ϵ is usually inversely proportional to Kmax. As a rule of thumb, we take ϵ = Rcut/Kmax, with Kmax set to 20 in this paper, which we found high enough for our applications. The prefactor α is chosen to enforce continuity of .
In the limit r → Rcut, all sinc functions rigorously converge to zero and thus can only expand functions with the same boundary condition. In general, our arbitrary embedding potential has finite values over the entire space and thus cannot be expanded by the sinc functions in a naïve way. To solve this problem, we introduce a padding region outside of the augmentation spheres, within which the potential is gradually switched off by a cosine function,
Here, d is the thickness of the padding region, which equals to 0.1 Å in this work. Instead of , we actually project and define the cutoff radius as Rcut = Raug + d. After the projection, we discard the part of in the padding region and only keep the part within Raug in the PAW equations.
In Eq. (16), the soft charges () are almost negligible in computational cost, while the third term which corresponds to the core correction is much more expensive. Although the third term exists only in the augmentation sphere, it is still challenging to compute the corrections for all the points inside the sphere. Usually, at the boundary of the augmentation sphere, the AE and PS wavefunctions are already very close, so the derivative correction is small. Therefore, we introduce a scaling factor Fcorr. The derivative corrections are only computed for points within Fcorr ⋅ Raug. In practice, Fcorr controls the balance between accuracy and computational cost: for larger Fcorr, the derivative is more accurate and the calculation is more expensive. We have found a reasonable selection of Fcorr to usually lie in 0.67–0.90, with no significant effects on final results within this range.
Another problem we need to resolve during the OEP optimization is the unphysical oscillation of the resulting embedding potential. This problem is well-known for OEP optimizations40 and caused by the unbalanced basis sets for the density and the potential. Furthermore, the frozen core orbitals cannot react to the potential within the PAW sphere, possibly leading to unphysically large potential fluctuations. Using the formula in Ref. 40, we introduce a penalty function in the WY functional, damping potential oscillations,
Mathematically, the penalty function unfortunately damps both physical and unphysical potential fluctuations uniformly. Therefore, the contribution of the penalty function (controlled by the parameter λ) should not be too large, lest physical potential oscillations are damped out. In this work, unless stated otherwise, we always set λ to 10−5 eV/(V2/Å2), which we have empirically found to give robust results. We will show that the physical results derived from the embedding potential are relatively insensitive to the exact value of λ. Nevertheless, a more rigorous approach will be needed to solve the ill-posed OEP problem in PW basis sets, similar to algorithms put forward for Gaussian basis sets.41
F. Implementation of embedding potential for all-electron Gaussian type orbitals (GTOs) calculations in NWCHEM
In this section, we discuss the algorithms we implemented for the embedded cluster calculations using all-electron GTO basis sets in NWCHEM.42 Considering a system embedded in an arbitrary external potential V, we only have to modify the one-electron integrals to include the effects of V in a quantum chemical calculation. The new terms introduced by V are
Here, G1 and G2 are the two primitive Gaussians centered at atom positions and , with exponents of a1 and a2, respectively.
Normally, this integral can be directly evaluated via a summation over the uniform grid,
However, this real-space summation is problematic if the term is sharp in the corresponding region. Particularly, when performing all-electron calculations, the basis functions describing the core electrons are usually associated with very large exponents. The typical grid densities adopted in the VASP-PAW setup are certainly not fine enough for computing Eq. (43). Notice that , so the sharpness of the integrand is directly controlled by the total exponent a1 + a2. Therefore, we adopt an alternative reciprocal-space scheme if a1 + a2 is larger than 20 a.u. and (note that when the Gaussians are sharp and , they do not overlap and the integrand vanishes, so we only consider the situation here).
We assume that Cartesian functions are used for the angular part of the primitive Gaussians. So, the two primitive Gaussians can be written as
We perform a Fourier transformation of (note that the potential is relatively smooth compared to the density, so no ultra-fine uniform grid is needed for this Fourier transform) and plug it into Eq. (42). Then, we reach the following final working equation:
Here, is the Fourier transform of and represents all the reciprocal-space vectors. The integrals can be derived using the following recursion relations:
In this work, we always use Cartesian functions for the angular part of the basis sets. Theoretically, when spherical functions are used, we can still perform the calculations in Cartesians first and transform them to spherical form.43 Nevertheless, once we obtain the modified one-electron integrals, we conduct the rest of the calculations using the existing code. We are thus able to carry out embedded quantum chemistry calculations using any software with the modified one-electron integrals.
To sum up, with the real-space projection algorithm, we are able to evaluate the exact energy derivatives () of an embedded system in a very efficient way. With the energy derivatives of the two subsystems and the total reference system at hand, we can maximize the WY functional given by Eq. (20). This maximization eventually leads to the density functional embedding potential at the PAW level of theory. All algorithms mentioned in this section were implemented in a modified version of VASP 5.3.3.29–32 For the embedded cluster calculations using GTO basis sets, we also developed an accurate reciprocal-space algorithm to compute the effect of an arbitrary embedding potential within the one-electron integrals. This algorithm has been implemented in NWCHEM-6.5,42 enabling us to conduct embedded cluster calculations using all-electron GTO basis sets. In Sec. III, we use the code to compute the density functional embedding potentials for various test systems; we also demonstrate the robustness of the algorithm and the embedding method.
III. RESULTS AND DISCUSSION
In this section, we present the results of several test cases. Unless stated otherwise, all calculations were conducted using a modified version of VASP with the Perdew-Burke-Ernzerhof (PBE)44 XC functional and the PAW method. All calculations were conducted using only Γ-point sampling in reciprocal space as we only study molecules and large supercells. The readers are referred to the supplementary material45 for more computational details for each test case.
A. Chlorine dimer
The first test case we present is the covalently bound Cl2 molecule, serving as proof-of-principle for our approach. Two chlorine atoms are placed parallel to the z-axis, at the center of a 10 Å × 10 Å × 10 Å box, with a bond length of 2 Å. Naturally, the two subsystems are the two chlorine atoms, respectively.
We first examine the quality of the real-space projection using the reciprocal-space projection results as a benchmark. We take the optimized embedding potential of Cl2 and decompose it into different angular momentum channels using either real-space or reciprocal-space projection algorithms (see Fig. 2).
Comparison of real-space and reciprocal-space projection results for the decomposition of the optimized embedding potential of Cl2. The black curves are the s-channels and the red curves are the pz-channels. Due to symmetry, the px and py channels are all zero and not shown in the figure. Solid lines are real-space projection results and dashed lines are reciprocal-space projection results.
Comparison of real-space and reciprocal-space projection results for the decomposition of the optimized embedding potential of Cl2. The black curves are the s-channels and the red curves are the pz-channels. Due to symmetry, the px and py channels are all zero and not shown in the figure. Solid lines are real-space projection results and dashed lines are reciprocal-space projection results.
At short range, the potential is sharper and thus more sensitive to details of the interpolation, so small deviations can be observed between the two algorithms. Nevertheless, overall, the real-space algorithm reproduces the more accurate reciprocal-space results with very high fidelity and much lower computational cost. This confirms the numerical accuracy of the algorithm and validates our selection of projection parameters introduced in Sec. II.
To investigate the effect of the penalty function, we compare converged embedding potentials with and without applying a penalty function (see Fig. 3). Without a penalty function, the resulting potential features unphysical oscillations. The penalty function smoothens the potential, yielding a much more physical shape, with peaks (electron depletion) located in the core regions and valleys (electron concentrations) in the σ-bond region.
Embedding potentials along the bond axis of Cl2. The black solid curve is the potential obtained with the penalty function and the red dashed curve is the potential obtained without the penalty function. The two Cl atoms are located at 4 Å and 6 Å.
Embedding potentials along the bond axis of Cl2. The black solid curve is the potential obtained with the penalty function and the red dashed curve is the potential obtained without the penalty function. The two Cl atoms are located at 4 Å and 6 Å.
More systematically, we conducted the OEP optimization with different λ values and investigated the change of the optimized WY functional values with respect to λ (see Fig. S145). We observed a sharp decrease of the value of the WY functional below λ = 10−5 eV/(V2/Å2), mainly corresponding to the damping of unphysical oscillations. As we discussed above (Fig. 3), at λ ≈ 10−5 eV/(V2/Å2), a smooth potential can be obtained, while larger λ values start to damp the physical variations, which features a slower increase of the WY functional value. Therefore, we will adopt λ = 10−5 eV/(V2/Å2) in the following study.
As discussed above, care must be taken not to remove physical oscillations: the figure of merit here is the final correspondence between the sum of the subsystem densities and the reference density and the transferability of the final embedding potential to other approaches (i.e., quantum chemistry codes). For the OEP application required for density functional embedding, we can thus easily assess the quality of the embedding potential via comparing the densities/energy derivatives of the subsystems to the reference system (see Fig. 4).
Differences between the summations of the subsystem densities and the reference densities. Black dotted curves are the differences before optimization (without embedding potential), red dashed curves are the differences after optimization with penalty functions, and blue solid curves are the differences after optimization without penalty functions. All the data are plotted along the bond axis of Cl2: (a) energy derivative (δE/δV) differences and (b) soft charge () differences.
Differences between the summations of the subsystem densities and the reference densities. Black dotted curves are the differences before optimization (without embedding potential), red dashed curves are the differences after optimization with penalty functions, and blue solid curves are the differences after optimization without penalty functions. All the data are plotted along the bond axis of Cl2: (a) energy derivative (δE/δV) differences and (b) soft charge () differences.
Before optimization (i.e., with Vemb = 0), significant differences between the subsystem densities and the reference system densities appear (the root-mean-square deviation (RMSD) equals 1.1 × 10−2 e/Å3 for the energy derivative and 1.0 × 10−2 e/Å3 for the soft charge). This is expected as the interactions between the two subsystems are completely missing without embedding potential. After the optimization, the energy derivative errors are significantly reduced (RMSD = 3.4 × 10−4 e/Å3 without penalty function), which is expected as we were explicitly optimizing them. Moreover, the differences in densities on the real-space grids are also greatly reduced (RMSD = 4.6 × 10−4 e/Å3 without penalty function), just as required by the original definition of the density functional embedding scheme. This illustrates that our modifications to the original scheme defined by Eq. (20) do not change the essential physical idea of the original method.
We also note that the penalty function introduces small but non-negligible residual errors in the core region for δE/δV. For the entire 3D grid, the RMSD of the energy derivatives increases from 3.4 × 10−4 e/Å3 to 7.1 × 10−4 e/Å3 when using a penalty function. By introducing the penalty function, we effectively introduce some extra constraints on the potential, and thus, the density fitting is numerically less accurate. This shows that the penalty function not only damps unphysical but also a certain level of physical variations of the embedding potential, which is almost inevitable. However, the penalty function is necessary to remove the unphysical oscillations, which is critical for the robustness and the transferability of the resulting potential. In order to obtain a potential that can be utilized in conjunction with different basis sets and methods, we have to filter out the oscillations on the length scale of the uniform grid size. As we showed in Fig. 3, the penalty function accomplished this goal, though with a little compromise in reproducing the reference densities.
B. Al metal surface
We conduct the density functional embedding calculations for an Al12 cluster in a 5 × 5 × 4 Al slab surface (see Figs. 4(a) and 4(b)), which was previously studied using our ABINIT implementation.14 Using this well-understood metal system, we will verify that our new implementation is able to reproduce the same physics demonstrated by our previous implementations. We would also like to understand the effects of the PAW scheme on the numerical details of the resulting embedding potential, compared to the PP calculations. To be consistent with our previous work,14 no penalty functions are applied in this case. The embedding potentials obtained from both VASP-PAW and ABINIT-PP implementations are shown in Figs. 5(c)-5(e).
Embedding potentials for Al slab: (a) and (b) the cross-sectional views of the structures of the Al slab and the Al12 cluster; (c) isosurface of the embedding potential from the ABINIT-PP calculations, isovalue = − 1.206 V; (d) same as (c) for VASP and PAW; (e) scatter plot of the comparison, versus . Points within 1.2 Å (≈Fcorr ⋅ Raug) from the atomic centers are omitted due to the large difference between the PAW and PP wavefunctions at the atomic centers. The RMSD between and shown in (e) is 0.2 V.
Embedding potentials for Al slab: (a) and (b) the cross-sectional views of the structures of the Al slab and the Al12 cluster; (c) isosurface of the embedding potential from the ABINIT-PP calculations, isovalue = − 1.206 V; (d) same as (c) for VASP and PAW; (e) scatter plot of the comparison, versus . Points within 1.2 Å (≈Fcorr ⋅ Raug) from the atomic centers are omitted due to the large difference between the PAW and PP wavefunctions at the atomic centers. The RMSD between and shown in (e) is 0.2 V.
The ABINIT-PP potential (Fig. 5(c)) features negative values mainly in the boundary region between the cluster and the environment, representing the attractive interaction due to the metal bonds. Comparison with the VASP-PAW potential (Fig. 5(d)) reveals two distinct regions: in the core region, the potential obtained from the PAW calculations shows many more features compared to the potential from the PP calculations, as expected since the all-electron PAW approach yields a more complete description of the core region. In the interstitial region, the VASP-PAW potential and the ABINIT-PP potential show excellent agreement, as their isosurfaces are similar in both size and shape. For a more systematic and quantitative comparison for the interstitial region, we make a scatter plot comparing the two embedding potentials on a point-by-point basis (Fig. 5(e)). Due to the effects of PAW scheme on the interstitial regions, we find a root mean square deviation of 0.2 V between the two potentials. The PAW wavefunction gives slightly different densities and response properties compared to the PP wavefunction, so some numerical differences are expected. Furthermore, we can never completely rule out the effects of unphysical oscillations, which possibly vary with respect to different core region treatment methods. Nevertheless, the nice overall correlation (R = 0.96) between the two embedding potentials demonstrates that both potentials capture the same physics in the interstitial region, consistent with the isosurface plots.
In summary, comparing embedding potentials from the VASP-PAW and ABINIT-PP calculations shows that our new implementation is consistent with the old PP-based implementation. Moreover, we are able to obtain new features in the core region representing the embedding effects on the core electrons, which were missing from the previous PP calculations.
C. Defects in semiconductors
In this section, we examine the capabilities of the density functional embedding method for treating defect states in semiconductors. We use a 2 × 2 × 2 64-atom supercell of zincblende ZnS as our test system and introduce one SnZn (CuZn) antisite in the center to represent an n-type (p-type) point defect (see Fig. 6). In this case, we always apply a penalty function to the WY functional, so we can obtain more physical and more robust potentials.
Bulk and cluster structures of the test semiconductor systems: (a) bulk structure of the Sn-doped ZnS; (b) cluster structure of the Sn-doped ZnS; (c) bulk structure of the Cu-doped ZnS; and (d) cluster structure of the Cu-doped ZnS.
Bulk and cluster structures of the test semiconductor systems: (a) bulk structure of the Sn-doped ZnS; (b) cluster structure of the Sn-doped ZnS; (c) bulk structure of the Cu-doped ZnS; and (d) cluster structure of the Cu-doped ZnS.
For the Sn-doped system, we carve out a cluster containing the central Sn defect and its first and second nearest neighbors, i.e., a SnZn12S4 cluster. For the Cu-doped system, we adopt the same cluster size, generating a CuZn12S4 cluster. In the previous cases (homogeneous diatomic molecules and metals), it was natural to assume that both subsystems were neutral. However, the partitioning of electrons is much less straightforward in semiconductors: if we assume covalent bonds between Zn and S, it is reasonable to cleave the Zn–S bond homogeneously; if we emphasize the ionic character of the Zn–S bonds, we should assign all the bonding electrons to sulfur. It is difficult to determine which way is more physical as the Zn–S bonds have both covalent and ionic characters. The DFET requires an appropriate partition as an input and, theoretically, the optimization process should work for any electron or atom partitions. Therefore, in the framework of the density functional embedding scheme, the partition of electrons is an extra degree of freedom that cannot be automatically optimized but has to be determined a priori. In this work, as we are trying to reproduce the electronic structure of the central atom, we will use the Bader charges46 of the central atom as a gauge to determine the partition. Basically, we optimize the embedding potentials with different electron partitions and conduct embedded cluster calculations using a consistent level of theory (PBE-PW). Then, we obtain the Bader charge of the central atom from the embedded cluster calculation and compare it with the supercell value. We will show that determining the correct charge for the cluster is critical in this study, and that the Bader charges indeed serve as a good gauge for this purpose.
Through the analysis described above, we find that the optimal cluster configurations are [SnZn12S4]18− and [CuZn12S4]0, respectively (see Table I). In order to confirm that the embedded cluster models are able to reproduce the correct physics in the bulk, we conduct embedded cluster calculations of defect states with a consistent level of theory (PBE-PW) and compare the projected density of states (PDOSs) at the central atom with results from non-embedded bulk crystal supercell calculations of the same defect (see Fig. 7 and Table I).
Defect state energy levels relative to the Fermi level at 0 eV. Bader charges (in (e)) of the defect atom of the corresponding system are given in parentheses.
. | Defect band position (eV) . | |
---|---|---|
Systems . | Bare calculation . | Embedded calculation . |
Supercell with SnZn | −0.22 (1.25) | … |
[SnZn12S4]0 | 0.45 | 0.12 (1.35) |
[SnZn12S4]18− | −0.55 | −0.23 (1.23) |
Supercell with CuZn | −0.01 (0.55) | … |
[CuZn12S4]0 | −0.45 | −0.09 (0.52) |
[CuZn12S4]18− | −2.00 | −1.00 (0.45) |
. | Defect band position (eV) . | |
---|---|---|
Systems . | Bare calculation . | Embedded calculation . |
Supercell with SnZn | −0.22 (1.25) | … |
[SnZn12S4]0 | 0.45 | 0.12 (1.35) |
[SnZn12S4]18− | −0.55 | −0.23 (1.23) |
Supercell with CuZn | −0.01 (0.55) | … |
[CuZn12S4]0 | −0.45 | −0.09 (0.52) |
[CuZn12S4]18− | −2.00 | −1.00 (0.45) |
PDOS of the central defect atoms: (a) PDOS of Sn in bulk Sn-doped ZnS; (b) PDOS of Sn in the embedded [SnZn12S4]18− cluster; (c) PDOS of Cu in bulk Cu-doped ZnS; and (d) PDOS of Cu in the embedded [CuZn12S4]0 cluster. All plots are shifted along the horizontal axis such that the Fermi level is located at zero.
PDOS of the central defect atoms: (a) PDOS of Sn in bulk Sn-doped ZnS; (b) PDOS of Sn in the embedded [SnZn12S4]18− cluster; (c) PDOS of Cu in bulk Cu-doped ZnS; and (d) PDOS of Cu in the embedded [CuZn12S4]0 cluster. All plots are shifted along the horizontal axis such that the Fermi level is located at zero.
In Fig. 7(a), the PDOS of the central Sn atom shows a distinctive peak −0.22 eV below the Fermi level, which is the fully occupied n-type SnZn state mainly composed of the Sn 4s orbital. This feature is well-reproduced by the embedded cluster results plotted in Fig. 7(b), with the Sn 4s band located at −0.23 eV (see also Table I). In Figs. 7(c) and 7(d), similar agreement is also found for the p-type case, as the embedded cluster calculation gives a Cu-3d peak position of −0.09 eV, consistent with the position of the half occupied CuZn defect state in bulk (−0.01 eV). To verify the role of the embedding potential, we also provide bare cluster results (i.e., with Vemb = 0) in Table I. Unsurprisingly, without a proper treatment of the cluster-environment interactions, the defect states in the bare cluster all appear at wrong energies. In the SnZn case, the Sn 4s state is pushed towards lower energies (−0.55 eV), being no longer a frontier orbital, which is contradictory to the supercell results. In the CuZn case, the Cu-3d state is fully occupied at −0.45 eV, rather than being a partially occupied p-type state at the Fermi level. For further comparison, we also performed classical point-charge embedding calculations for the Sn-doped ZnS system using a scheme previously developed for more ionic oxide crystals.47 Note that a point-charge embedding scheme requires us to assign formal charges to each ion, so we lose the degrees of freedom associated with tuning the total cluster charges. Moreover, it requires us to terminate our cluster with anions so we can use simplified effective core potential (ECP) sites representing the neighboring cations, so a larger cluster has to be used ([SnZn12S28]30−). The resulting Sn 4s peak is located at −0.57 eV, much lower than the values obtained with supercell and our DFET embedded cluster calculations. Comparing the success of the embedded clusters with the failure of the bare clusters and point charge embedding schemes, we emphasize the importance of a proper embedding potential. Moreover, we confirm that the density functional embedding is a suitable scheme to generate such a proper embedding potential.
In Table I, we also demonstrate the effect of the cluster charges on the defect state energy. On the one hand, the embedded cluster calculations always outperform the bare cluster calculations regardless of the charges on the cluster, showing the capability of the embedding potential in all cases. On the other hand, comparing the negative and neutral cluster results, we find that the final defect state energies are sensitive to the cluster charges. Therefore, in order to obtain results quantitatively comparable with the supercell benchmark data, we have to adopt an appropriate electron partition. Unfortunately, the optimal choice of the cluster charges is case dependent and there is no simple rule for the charge assignment. In the SnZn case, the negatively charged [SnZn12S4]18− cluster outperforms the neutral one, while in the CuZn case, the neutral [CuZn12S4]0 cluster result is better. As we mentioned above, we propose to use the Bader charge of the central atom as a simple index for the electron partition. As shown in Table I, in the SnZn case, the Bader charge of Sn in [SnZn12S4]18∓ is 1.23 e, in better agreement with the supercell value (1.25 e) compared to the neutral cluster (1.35 e). Similarly, in the CuZn case, the Bader charge of Cu in [CuZn12S4]0 is 0.52 e, matching the supercell value (0.55 e) accurately. In general, whenever the Bader charge of the central atom agrees with the supercell calculation, an excellent agreement in defect state energy can also be found. This shows that Bader charge indeed serves as a good gauge for the cluster charge assignment.
While being successful in describing the frontier defect states, the embedded cluster cannot correctly reproduce all the states in the PDOS plot. Core levels are generally shifted from their bulk positions and extra states appear due to the dangling bonds at the boundary. However, these states are also less important as long as they do not significantly couple with the frontier defect levels (i.e., those close to the Fermi level), as only the defect states are the ones in which we are interested.
At the end of this section, we briefly discuss the effects of the penalty function on the final results. All the results shown above were computed with λ = 10−5 eV/(V2/Å2). Using the [SnZn12S4]18− system as an example, we obtain a Sn 4s band located at −0.28 eV with λ = 5 × 10−5 eV/(V2/Å2), −0.05 eV lower than the results we obtained above. Therefore, a factor of five times change in λ leads to a few hundredths of an eV orbital energy variation, which is not significant. Our results suggest that within a certain range, predictions are stable with respect to the exact value of λ. This is considered to be a good property, given that our selection of λ is purely based on experience.
To sum up, in this section, we demonstrate the power of the density functional embedding method in describing semiconductor defect states. We showed that the embedded cluster calculations can reproduce the defect state energy at a consistent level of theory. As a preliminary study, our embedded cluster calculations were all conducted using DFT-PBE and consistent PW basis sets for the bulk and embedding potential calculations. Therefore, we are presently only testing the capability of the embedding scheme on the DFT level, and not yet solving PW-DFT’s problem using more advanced methods. However, this work lays out a solid foundation for future ECW studies, as it indicates that the embedding potential is able to accurately describe the cluster-environment interactions at the DFT level.
D. Potential transferability with respect to GTO basis sets
As discussed above, we used a periodic PW basis set to conduct all embedded cluster calculations; however, typical quantum chemistry programs usually adopt atom-centered GTO basis functions. Wavefunctions expanded in different basis sets are likely to have different polarizabilities and thus could respond differently to the same external potential. Therefore, the transferability of our results to other basis sets might be questionable. To understand the sensitivity of our PAW density-based embedding approach to different basis sets, we conduct the embedded [SnZn12S4]18− DFT calculations with different GTO basis sets below.
The GTO embedded cluster calculations were all carried out with a modified version of NWCHEM-6.5,42 with the embedding potential function incorporated. In the first five calculations, we use all-electron basis sets without ECPs, being consistent with the PAW method in VASP. For the central Sn atom, we utilize the double-zeta ADZP basis set, which contains diffusion and polarization functions.48 For other atoms (S and Zn) inside the cluster, we tested ADZP,49,50 as well as the slightly smaller double-zeta basis set def2-SVPD.51 In some cases, we also keep part of the basis sets from the environmental sulfur atoms that directly contact the cluster. For these “ghost” sulfur atoms, only a few of the most diffuse shells of the def2-SVPD basis sets are kept while all other core shells are removed, similar to the approach used by Barnes et al.52 In this section, we denote the entire basis set using three labels, describing the basis set we used for the three different parts individually. For example, (ADZP/def2-SVPD/2s,2p) indicates that we use ADZP for the central Sn atom, def2-SVPD for Zn and S atoms, and the two most diffuse s and p shells from def2-SVPD for the ghost sulfur atoms. Using these basis sets, we conduct PBE-GTO calculations with the embedded cluster and identify the molecular orbitals dominated by the Sn s-functions. Then, we can calculate the relative positions of the highest occupied Sn s-orbitals with respect to the Fermi energy and compare it with the PW results from Sec. III C (see Table II).
Basis sets (Sn/Zn,S/ghost S) . | Sn 4s band position (eV) . |
---|---|
PW-500 eV | −0.23 |
ADZP/ADZP/none | −0.04 |
ADZP/def2-SVPD/none | −0.37 |
ADZP/def2-SVPD/2s,2p | −0.33 |
ADZP/def2-SVPD/3s,3p | −0.29 |
ADZP/def2-SVPD/2s,2p,1d | −0.24 |
Def2-SVPD/def2-SVPD/2s,2p,1d | −0.24 |
DZP-DKH/def2-SVPD/2s,2p,1d | −0.25 |
Def2-TZVPD/def2-SVPD/2s,2p,1d | −0.10 |
Def2-TZVPD/def2-TZVPD/2s,2p,1d | −0.15 |
Basis sets (Sn/Zn,S/ghost S) . | Sn 4s band position (eV) . |
---|---|
PW-500 eV | −0.23 |
ADZP/ADZP/none | −0.04 |
ADZP/def2-SVPD/none | −0.37 |
ADZP/def2-SVPD/2s,2p | −0.33 |
ADZP/def2-SVPD/3s,3p | −0.29 |
ADZP/def2-SVPD/2s,2p,1d | −0.24 |
Def2-SVPD/def2-SVPD/2s,2p,1d | −0.24 |
DZP-DKH/def2-SVPD/2s,2p,1d | −0.25 |
Def2-TZVPD/def2-SVPD/2s,2p,1d | −0.10 |
Def2-TZVPD/def2-TZVPD/2s,2p,1d | −0.15 |
Comparing the second and the third lines in Table II, we find a strong basis set dependence for the Sn 4s-orbital position when no ghost basis set is used. Changing the basis sets of Zn and S from def2-SVPD to ADZP creates an orbital energy fluctuation of ±0.1-0.2 eV and both of them deviate from the PW benchmark value in opposite directions. Apparently, compared to the systematically converging PWs, it is more difficult to reach convergence for Gaussian type basis sets. Some critical parts of the wavefunction might be missing as Gaussian type functions are localized whereas PWs are delocalized. Adding ghost basis sets at the cluster boundary region alleviates this problem, as they extend the cluster basis sets spatially, in particular along the cut covalent bonds, effectively making it more diffuse. On top of (ADZP/def2-SVPD/none), we systematically increase the size of the ghost basis sets and, clearly, the resulting orbital energies gradually converge to the benchmark value. Eventually, excellent agreement is achieved with the (ADZP/def2-SVPD/2s,2p,1d) basis sets, with an extraordinarily small error of 0.01 eV.
We also test the transferability of the embedding potential to GTO calculations with ECPs. We use def2-SVPD51 (which corresponds to an all-electron basis set for Zn and S but is used with an ECP for the heavier Sn) for the central Sn atom (see Table II). Obviously, the (def2-SVPD/def2-SVPD/2s,2p,1d) result is still in good agreement with the PW benchmark, even though the embedding potential itself was derived at the frozen-core all-electron level. This shows that the features of the embedding potential in the core region are not affecting the pseudized wavefunction in the interstitial region significantly, and the embedding potential obtained with PAW can be utilized in both all-electron and ECP calculations.
We further investigate the relativistic effect for the Sn atoms, considering that no such effect is included in the previous all-electron GTO calculations. We used the Douglas-Kroll-Hess (DKH) Hamiltonian53–55 implemented in NWCHEM and the corresponding DZP-DKH basis set for Sn developed by Barros et al.56 The result is shown in Table II. We observe very small changes to the Sn 4s peak position with the DKH Hamiltonian, showing that relativistic effects are not very significant in our test case. However, it would certainly be more important for even heavier atoms, and in those cases, the simple non-relativistic all-electron GTO calculations could be problematic.
While we achieved excellent agreement in double-zeta calculations, it does not guarantee that we have reached absolute convergence: if we increase the Sn basis set to triple-zeta level, while keeping the double-zeta description for all the other atoms (def2-TZVPD/def2-SVPD/2s,2p,1d), we obtain −0.10 eV for the Sn 4s state, worse than the double-zeta result. This result is slightly improved with a fully triple-zeta setup (def2-TZVPD/def2-TZVPD/2s,2p,1d), demonstrating the importance of a balanced description for the entire system. However, even the (def2-TZVPD/def2-TZVPD/2s,2p,1d) calculation is still 0.08 eV away from the PW benchmark. Moreover, we notice that in this triple-zeta calculation, the molecular orbital of Sn 4s state is much more diffuse and strongly coupled with the boundary states and the Sn p/d states. This is likely due to the unbalanced description of the central atom and the rest of the cluster, as well as the unbalanced description of the Sn 4s orbitals and other components with higher angular momentum. All of these observations indicate the complexity of the GTO basis sets, and careful considerations have to be taken in such calculations. Generally speaking, the uncertainty associated with the finite basis set effects is around 0.1 eV, and the double-zeta GTOs with proper ghost basis sets on the boundary region seem to achieve the most balanced description, leading to the best agreement with the PW benchmark.
Overall, these results illustrate that with carefully chosen basis sets, the embedding potential is fairly robust with respect to transfer to GTO calculations even though it was initially derived using only PW basis sets. We highlight the importance of the ghost basis sets right outside of the cluster boundary, at the position of the nearest neighbors, to sufficiently support the cut covalent bonds. We showed that incorporating these ghost basis sets is a very efficient way to reproduce the diffuse nature of the PW basis functions. As long as we construct the basis sets carefully, we are able to reach consistent results using both PW and GTO basis sets. This lays out another important foundation for future ECW studies using GTO basis sets.
IV. SUMMARY AND CONCLUSIONS
In this work, the density functional embedding method was successfully implemented in VASP, in conjunction with the PAW formalism. We found that the dual grid scheme adopted in PAW leads to extra complications and requires slight modifications to the original density functional embedding formulae. A new real-space projection algorithm was developed as an alternative to the more expensive reciprocal-space algorithm to transform the embedding potentials between different types of grids. This algorithm leads to fast evaluations of energy derivatives and enables conducting embedding potential optimizations at the PAW-DFT level of theory.
The embedding potentials at the PAW level are consistent with our previous results using norm-conserving PPs in the interstitial region and exhibit additional features due to the all-electron nature of PAW in the core region. Apart from proving the correctness of our implementation, this observation has further implications in practice. For the test calculations shown in Sec. III D, we utilized all-electron GTO basis sets to be truly consistent with PAW. This choice is potentially problematic since all-electron calculations might be both slow and incorrect (as they usually miss relativistic effects) for systems containing heavy elements. In such cases, it is more appropriate to perform ECW calculations employing ECPs. The agreement between the PAW and norm-conserving PP embedding potentials in the interstitial region indicates that the PAW embedding potential is also likely to be transferrable to ECP calculations. The difference in the core region should have less impact as the pseudo-wavefunction in this region is largely damped and relatively rigid. To verify these conjectures, we performed an ECP calculation using the SnZn system as an example, and the transferability of the embedding potential in both all-electron and ECP GTO calculations was confirmed. Above all, all the simple tests with covalent bonding and metals prove that the obtained embedding potentials are physically meaningful and numerically stable.
After we verified the fidelity of the resulting embedding potential, we examined its capability for describing defect states in semiconductors. We found that at a consistent DFT level of theory, the embedded cluster correctly reproduces the electronic structure of the point defects in non-embedded supercell calculations. Furthermore, the results of the embedded cluster calculations are very robust with respect to both PW and GTO basis sets. All these observations are very encouraging for future ECW studies on semiconductor defects using the methodologies introduced in this work.
Essentially, our work makes it possible to conduct first-principles calculations embedded in any arbitrary external potential within the PAW method. It also allows us to evaluate energy derivatives with respect to an external potential efficiently. In fact, this new function offers us an efficient way to conduct OEP calculations within PAW in general and density functional embedding is merely one of many possible applications. Therefore, we hope to see additional applications for our algorithm in the near future.
Acknowledgments
E.A.C. thanks the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0002120 for funding this project. We thank Dr. Georg Kresse and Dr. Martijn Marsman from the VASP development team for helpful discussions. We also thank Dr. Niranjan Govind for helpful discussions during the development of the embedding formalism within the NWChem program.