The WIEN2k program is based on the augmented plane wave plus local orbitals (APW+lo) method to solve the Kohn–Sham equations of density functional theory. The APW+lo method, which considers all electrons (core and valence) self-consistently in a full-potential treatment, is implemented very efficiently in WIEN2k, since various types of parallelization are available and many optimized numerical libraries can be used. Many properties can be calculated, ranging from the basic ones, such as the electronic band structure or the optimized atomic structure, to more specialized ones such as the nuclear magnetic resonance shielding tensor or the electric polarization. After a brief presentation of the APW+lo method, we review the usage, capabilities, and features of WIEN2k (version 19) in detail. The various options, properties, and available approximations for the exchange-correlation functional, as well as the external libraries or programs that can be used with WIEN2k, are mentioned. References to relevant applications and some examples are also given.

## I. INTRODUCTION

Quantum mechanical calculations play a central role in understanding the properties of materials and, increasingly, predicting the properties of new materials. While in the early days, the emphasis was mainly on understanding the energy, atom positions, and band structure, modern codes now calculate a large number of different properties ranging from piezoelectric response to nuclear magnetic resonance (NMR) shielding, examples of which will be given later. With the advent of increasingly sophisticated methods and the ever increasing speed of computers over the last decades, in some cases the accuracy of quantum mechanical calculations rivals or even surpasses the accuracy of experimental measurements.

There are many different methods of theoretically modeling the behavior of electrons and atoms in materials. While earlier approaches focused on dealing with the electrons via wave functions,^{1} many current methods use density functional theory (DFT),^{2} which has significant speed advantages. Following the method outlined by Kohn and Sham^{3} (KS), the interacting many-body system of electrons is mapped onto a non-interacting system of quasiparticles, characterized by KS orbitals with a specific KS energy. They have many of the properties of the true electron wave functions and of particular importance is that one can fill up these KS orbitals as a function of their KS energy yielding the true electron density. The KS approach needs an exchange-correlation (XC) functional and the corresponding XC potential. However, the exact functional is unknown and approximations are needed (see Sec. II B).

A second split in terms of methods is how the atomic positions are considered, and there are two main methods: cluster calculations for a finite number of atoms, which focus on the local properties of some atomic arrangement, and those which are designed to exploit the periodic nature of most solids; the WIEN2k code is an example of the latter. We represent the solid by a unit cell, which is repeated in all three directions, corresponding to periodic boundary conditions. This assumes that the solid is perfect, ordered, and infinite; however, a real crystal differs from this ideal situation, since it is finite, may contain defects or impurities, and may deviate from its ideal stoichiometry. For these important aspects and how to handle them using supercells, see Chap. 8.2 of Ref. 4.

There are many computational methods for solving the KS equations, for instance, linear combination of atomic orbitals (LCAO), numerical basis sets, pseudopotential schemes, or space partitioning methods. A recent comparison^{5} of these methods showed that especially all-electron codes predict essentially identical results, demonstrating a high reproducibility, whereas some pseudopotential codes lead to large deviations. One of the most accurate codes is our WIEN2k code,^{6} which is the focus of this paper and is based on the augmented plane wave (APW) method. Detailed descriptions including many conceptual and mathematical details are given in Ref. 7. The term all-electron (see Chap. 8.4 of Ref. 4) means that all electrons from the core (starting from the 1*s* shell) to the valence states are included.

Turning to some historical specifics of our approach (see also Ref. 8), Slater^{9} proposed the original APW method. Unfortunately, the original formulation leads to a nonlinear eigenvalue problem due to the energy-dependent radial basis functions, which is computationally expensive. An important improvement came from Andersen,^{10} who introduced a linearization of this energy dependency, and Koelling and Arbman^{11} made the linearized-APW (LAPW) method a practical computational scheme using the muffin-tin (MT) approximation (see Sec. II). This was taken a step further by Freeman and collaborators who made the LAPW method a full-potential all-electron total energy method.^{12,13}

This LAPW method formed the basis for the original WIEN code.^{14} However, the LAPW method had the drawback that only one principal quantum number per angular momentum *ℓ* could be described and thus failed to give reliable results for all elements on the left of the periodic table because these atoms require a proper description of shallow core states (semi-core) and valence states at the same time (e.g., 1*s* and 2*s* in Li or 3*sp* and 4*sp* in Ti). This problem was solved by Singh,^{15} who introduced local orbitals (LOs) for the description of semi-core states. He also noted that the LAPW method needed a larger plane-wave basis set than the APW method. To overcome this problem, he suggested the augmented plane wave plus local orbitals (APW+lo) method,^{16,17} where the linearization of the energy-dependent radial wave function was facilitated by an extra local orbital (lo, different from an LO, see Sec. II), which has a superior plane-wave convergence compared to LAPW. Last but not least, the linearization of the energy dependency can introduce some inaccuracy in high precision calculations. This problem was finally solved by introducing additional higher (second) derivative LOs (HDLOs).^{18,19} These latest developments form the basis of the present WIEN2k_19 code,^{6} while previous versions have been described in several reviews.^{7,8,20–23} The method of our choice could be named (L)APW+lo+LO+HDLO, but we use a shorter acronym APW+lo. It is described in detail in Sec. II A.

## II. THEORY

In the APW-based methods, the unit cell is decomposed into spheres centered at the nuclear sites and an interstitial region,^{7} as shown in Fig. 1. These atomic spheres with radii *R*_{MT} must not overlap, but should be chosen for computational efficiency as large as possible with the additional constraint that *R*_{MT} for *d*-elements should be chosen to be about 10%–20% bigger than for *sp*-elements, while *f*-elements should get even larger spheres because for identical sphere sizes the number of plane-waves (PWs) to reach convergence is largest for the localized 4*f* (5*f*) electrons, medium for 3*d* (4*d*, 5*d*)-electrons, and much smaller for *sp*-states. An exception is the H atom, whose sphere with short C–H or O–H bonds should be chosen approximately half the size of *R*_{MT}(C) or *R*_{MT}(O). In WIEN2k, these sphere radii can be set automatically in an optimal way using the *setrmt* utility. Note that non-optimal sphere sizes may lead to poor convergence (eventually only for one particular atom) and significantly longer computing time or suffer from truncated Fourier or spherical-harmonic expansions. In the worst case, they can even produce “ghost-states” (unphysical eigenvalues in the occupied spectrum) if the *R*_{MT} of an *sp*-element is much larger than that of the other atoms.

The electron density *ρ* and KS potential *v*^{KS} (defined in Sec. II B) are expanded as a Fourier series in the interstitial (*I*) region (**K** denotes a reciprocal lattice vector in units of inverse bohr) and as lattice harmonics (symmetry adapted combinations of spherical harmonics *Z*_{LM})^{24} times radial functions *ρ*_{LM}(*r*) [$vLMKS(r)$ for the potential] inside the spheres (*S*_{t}, where *t* is the atom index),

By default, the Fourier expansion runs up to |**K**| = 12 for large *R*_{MT} (16 for *R*_{MT} < 1.2; 20 for *R*_{MT} < 0.7 bohr), while the angular momentum expansion truncates at *L* = 6. Note that the old “MT” approximation uses a constant value in the interstitial (i.e., only **K** = 0) and a spherically symmetric density/potential inside the spheres (i.e., only *L* = 0).

This space decomposition plays a crucial role in the definition of core and valence electrons, which are treated differently in APW-type methods. Core states are defined as having wave functions (densities) completely confined inside the atomic spheres. Thus, we do not use the standard definitions of core and valence, but, e.g., in 3*d* transition metals (TMs), the 3*s* and 3*p* states are also considered as valence, since a couple of percent of their charge leaks out of the atomic sphere. To distinguish them from the conventional definition, we call them semi-core states. Typically, these states are treated using LOs (see below) and their energies are less than 6 Ry below the Fermi energy, but in special situations (small spheres due to short nearest neighbor distances or high pressure), even lower lying states (such as Al-2*p*) have to be included. Relativistic effects are important for the core states, and thus, they are calculated by numerically solving the radial Dirac equation in the spherical symmetric part of the potential *v*^{KS}. Core states are constrained to be localized and not hybridized with states at the neighboring atoms, but we use a thawed core (no frozen core approximation), i.e., the core states are recalculated in each self-consistent field cycle.^{7} The semi-core and valence electrons are commonly treated scalar relativistically, i.e., including mass velocity and Darwin *s*-shift corrections, but neglecting spin–orbit (SO) interactions.^{7,25} The SO effects can later on be included in a second variational step using the scalar-relativistic orbitals as a basis.^{7,26} Since *p*_{1/2} radial wave functions differ considerably from scalar relativistic (or *p*_{3/2}) orbitals, one can also enrich the basis set with additional *p*_{1/2} local orbitals, specifically, an LO (see below) with a *p*_{1/2} radial wave function, which is added in the second-variational SO calculation.^{27}

### A. The APW+lo method as implemented in WIEN2k

The basis functions for the valence electrons consist of APWs, which are plane waves in the interstitial region augmented with radial wave functions *u*_{tℓ}(*r*, *E*_{tℓ}) defined at a fixed energy *E*_{tℓ}, and lo.^{7,16,17} An APW is given by

where **k** is a point in the first Brillouin zone (BZ), $Y\u2113m(r^)$ are spherical harmonics, and *u*_{tℓ} are solutions of the scalar-relativistic radial KS equation^{7} inside the sphere *S*_{t}. Note that these radial functions *u*_{tℓ} are recalculated in each self-consistent-field (SCF) cycle, allowing for an expansion/contraction corresponding to the given charge state (ionicity) of the atom. These adaptive basis functions are part of the reason for the high accuracy of APW-based methods. The coefficients $At\u2113mk+K$ are chosen such that the interstitial and sphere parts of the APW match at the sphere boundary. However, these APWs allow no variations of the radial functions for eigenvalues different than *E*_{tℓ} and thus would be a poor basis. To overcome this constraint, the energy dependency is handled by a lo, which is nonzero only inside a MT sphere, and given by

where $u\u0307t\u2113$ is the first energy derivative of *u*_{tℓ}. The coefficients $At\u2113mlo$ and $Bt\u2113mlo$ are chosen such that $\varphi t\u2113mlo$ is zero at *R*_{MT} and normalized.

The APW+lo basis set has the advantage of a superior PW convergence as compared to the standard LAPW method,^{7,17} reducing the number of PWs by almost 50%, but it needs additional lo basis functions. Thus, in WIEN2k, the default is to restrict the *ℓ*_{max} to the chemical *ℓ*-values *sp*(*d*, *f*), which are hard to converge, but use a standard LAPW basis set inside the spheres for the higher *ℓ* values (by default up to *ℓ*_{max} = 10),

where the coefficients $At\u2113mk+K$ and $Bt\u2113mk+K$ are chosen such that $\varphi k+KLAPW$ and its first derivative are continuous at the sphere boundary.

As mentioned before, semi-core states (or also high-lying empty states) cannot be described accurately by APW+lo. For these states, the basis set has to be improved, and this can be done by adding another type of local orbitals, the LOs, containing radial functions *u*_{tℓ} calculated at the appropriate (e.g., semi-core) energy $Et\u2113LO,i$,

For instance, for TiO_{2}, one would use the Ti-3*p* energy to calculate $uTi,1(r,ETi,1LO,i)$ and then add some Ti-4*p* radial function *u*_{Ti,1}(*r*, *E*_{Ti,1}), choosing the coefficients $At\u2113mLO,i$ and $Ct\u2113mLO,i$ such that the LO is zero at the *R*_{MT} and is normalized. By adding such LOs (representing Ti-3*p* states), a consistent and accurate description of both the Ti-3*p* semi-core and Ti-4*p* valence states is possible, retaining orthogonality, which is not assured when the multiple-window approach is used (see Ref. 7). Cases where this improvement is essential is the electric field gradient (EFG) calculation of rutile TiO_{2}^{28} or lattice parameter calculations of compounds with such elements. Note how lo [Eq. (4)] and LO [Eq. (6)] differ in their respective second terms.

A clever choice of energy parameters *E*_{tℓ} in Eqs. (3)–(6) is essential for accurate results, and WIEN2k has several automatic ways to make an optimal choice in most cases.^{6} *E*_{tℓ} of semi-core states (actually, of all states whose energy in the free atom is more than 0.5 Ry below the highest occupied atomic orbital, e.g., also C-2*s* or Ar-3*s* states) are determined by taking the average of the two energies *E*_{bottom} and *E*_{top}, where the corresponding *u*_{tℓ}(*R*_{MT}) is zero or has zero slope. For localized *d* or *f* valence electrons, the same procedure is used, but *E*_{top} is searched only 0.5 Ry above *E*_{F} to ensure that the energy parameters are set below *E*_{F}. The energy parameters of all other valence states are set to 0.2 Ry below *E*_{F} (0.2 Ry above *E*_{F} if there is a high lying semi-core LO). Thus, all our energy parameters are dynamically updated during the SCF cycle and not fixed by input.

Implicit in this approximation is a linearization of the energy dependency of the radial wave functions. Since the true *u*_{tℓ}(*r*, *E*_{tℓ} = *ε*_{i}) varies most for more localized states (e.g., 3*d* or 4*f*) at larger distances from the nucleus, large spheres and a large valence bandwidth could cause a poor description of the variations of *u*_{tℓ}(*r*, *ε*_{i}) with energy, leading to a significant dependency of the results on *R*_{MT} (where smaller *R*_{MT} yield more correct results but with a larger computational effort). This can be solved by adding an LO, which involves the second energy derivative of *u*_{tℓ}, called an HDLO,^{18,19}

Figure 2 illustrates the effect of adding HDLOs for the lattice parameter of fcc-La as a function of *R*_{MT}. The lattice parameter becomes independent of *R*_{MT} when both *d* and *f*-HDLOs are added to the basis set, while the standard APW+lo+LO basis (LOs for 5*s*, 5*p* states) produces an error of 0.04 bohr for the largest *R*_{MT}.

The KS orbitals are expanded using the combined basis set described above (*n* is the band index),

and the coefficients $cnki$ are determined by the Rayleigh–Ritz variational principle. The number of APW (or LAPW) basis functions Eq. (3) [or Eq. (5)] is determined by the cutoff value *K*_{max} for the reciprocal lattice vectors **K** such that $k+K\u2264Kmax$ and depends on the smallest of the atomic radii $RMTmin$ and the type of atom. Typically, the necessary $RMTminKmax$ values range from 3 (for small H-spheres) to 7 for *sp*-elements, 8 for TM-*d* elements, and 9 for 4*f* lanthanides. These values can be reduced by 0.5–1 for low quality screening calculations and increased by 0.5–2 for highest precision. It should be mentioned that the efficiency of the APW+lo method depends crucially on the possible *R*_{MT} values. For instance, the O-2*p* states converge well with $RMTOKmax$ = 7. In MgO, one can use $RMTO$ = 2 bohrs, leading to a very small PW cutoff energy of 170 eV. However, in Mg(OH)_{2}, one has to use $RMTO$ = 1.1 bohrs due to the short O–H distances, leading to a PW cutoff of 550 eV, i.e., an order of magnitude larger effort.

The specific setup of all these basis functions can, of course, be selected manually by experts, but one of the great strengths of WIEN2k is that the default input usually works quite well and is fairly robust.

### B. Available DFT approximations

In DFT, the total energy of the system is given by^{3}

The terms on the right-hand side represent the noninteracting kinetic, electron–nucleus, Hartree, XC, and nucleus–nucleus energies, respectively. The variational principle leads to the KS [or generalized KS^{29} (gKS)] equations (in this section, the orbital index *i* is a shorthand notation for valence and core orbitals),

where *v*^{KS} is the KS potential,

which is the sum of the electron–nucleus, Hartree, and XC potentials. Choosing an appropriate functional *E*_{xc} in Eq. (9) [and potential *v*_{xc} in Eq. (11)] for the XC term is crucial in order to obtain reliable results for the problem at hand.^{30–32} Several hundred^{33,34} different functionals are available in the literature; some of them were proposed as general-purpose functionals, while others were devised for a specific property (e.g., bandgap) or types of systems (e.g., van der Waals). Numerous functionals have been implemented in the WIEN2k code, and below, we provide a brief overview of the different families of functionals. Note that for XC functionals that depend explicitly on the electron density *ρ*, e.g., the local density approximation (LDA) or the generalized gradient approximation (GGA), the XC potential *v*_{xc} is multiplicative, while for functionals that depend implicitly on *ρ*, e.g., meta-GGA (MGGA) or hybrids, the XC potential is non-multiplicative when implemented in the gKS scheme.

#### 1. LDA, GGA, and MGGA

The LDA, GGA, and MGGA represent the first three rungs of Jacob’s ladder of XC functionals.^{35} These approximations are semilocal, since *E*_{xc} is defined as

and the XC energy density *ε*_{xc} depends only locally on some properties of the system. In the LDA, *ε*_{xc} depends on the electron density $\rho =\u2211i=1N\psi i2$, while in the GGA, *ε*_{xc} depends also on the first derivative ∇*ρ*. At the MGGA level, the functionals depend additionally on the Laplacian of the electron density ∇^{2}*ρ* and/or the kinetic-energy density $t=1/2\u2211i=1N\u2207\psi i*\u22c5\u2207\psi i$. Semilocal functionals are the most commonly used methods in the solid-state community for the calculation of properties depending on the total energy such as the geometry, cohesive energy, or the adsorption energy of a molecule on a surface. The main reason is that they are faster than all other types of approximations and therefore allow calculations of larger systems.

There is a huge literature on the performance of semilocal functionals, concerning the geometry and cohesive energy of solids. Extensive benchmark studies have been conducted by us^{36–39} and others (see, e.g., Refs. 40 and 41). The results of these works showed that among the GGA functionals, those with a small enhancement factor such as AM05,^{42} PBEsol,^{43} or a few others^{44–46} are the most accurate for the lattice constant and bulk modulus, while the standard PBE^{47} is the best choice for the cohesive energy. At the MGGA level, the SCAN functional^{48} is becoming increasingly popular and has been shown to be simultaneously as good as the best GGAs for the geometry (e.g., PBEsol) and the cohesive energy (PBE).^{39,41} However, it should be mentioned that SCAN can be quite problematic for iterant magnetic systems^{49,50} or alkali metals.^{51}

Many semilocal functionals have been implemented directly in the WIEN2k code, but basically all existing semilocal functionals can be used because WIEN2k is interfaced to the Libxc^{33,34} library of XC functionals. One current limitation is that the MGGA functionals are not yet implemented self-consistently (by default, the GGA PBE potential is used for generating the orbitals although the user can choose another potential).

#### 2. Hybrid functionals

Beginning in the 21st century, hybrid functionals,^{52} which belong to the fourth rung of Jacob’s ladder, started to be extensively used for calculations of solids.^{53–55} The one that is currently the most popular is HSE06,^{56–58} which is a screened version of the other well-known PBE0.^{59,60} In (screened) hybrid functionals, the exchange energy is a linear combination of a semilocal (SL) functional and the Hartree–Fock (HF) expression,

where

In Eq. (14), *v* is either the bare Coulomb potential $v=1/r\u2212r\u2032$ for unscreened hybrids or a potential that is screened at short or long range for screened hybrids. For solids, it is computationally advantageous to use a potential that is short range, for instance, the Yukawa potential $v=e\u2212\lambda r\u2212r\u2032/r\u2212r\u2032$ (Ref. 61) or $v=erfc(\mu r\u2212r\u2032)/r\u2212r\u2032$ (Ref. 56), where erfc is the complementary error function. Although hybrid functionals are also used for the total energy (geometry optimization, and cohesive energy), they are particularly interesting for properties derived from the electronic band structure such as the bandgap, for which they significantly improve upon standard GGA functionals such as PBE (see Refs. 62–65 for recent extensive benchmarking).

In WIEN2k, unscreened and screened hybrid functionals are implemented^{66} according to the scheme of Massidda *et al.*,^{67} which is based on the pseudo-charge method for calculating the Coulomb potential.^{12} The treatment of the Coulomb singularity is done by multiplying the Coulomb potential by a step function,^{68} which is very efficient compared to other methods.^{69} The screened hybrid functionals use the Yukawa potential, and in Ref. 66, it was shown that the results obtained with the PBE-based hybrid YS-PBE0 for the bandgap are almost identical to those obtained with HSE06 (which uses the erfc screened potential), provided that the screening parameter is chosen appropriately [$\lambda =3/2\mu $, see Ref. 70]. Because of the double integral and summations over orbitals in the HF exchange [Eq. (14)], the calculations are much more expensive (between 10 and 1000 times) than semilocal methods; however, there are a couple of ways to speed-up such calculations significantly. For instance, one can first use a rather crude **k**-mesh and later on improve the **k** convergence in a few additional iterations continuing the previous calculations. Furthermore, a reduced **k**-mesh for the internal loop in the HF potential is possible,^{71} and finally, often a one-shot procedure^{72,73} is sufficient, where the hybrid orbitals and eigenvalues are calculated perturbatively on top of a calculation with the semilocal functional on which the hybrid functional is based.

#### 3. On-site methods for strongly correlated electrons

The high computational cost of hybrid methods discussed in Sec. II B 2 limits the size of the systems that can be treated. Alternatively, one can use an on-site method, namely, DFT+*U*,^{81} exact exchange for correlated electrons (EECE),^{82} or on-site hybrids,^{83} which can be viewed as approximate but cheap versions of the hybrid or HF methods. In these methods, a hybrid/HF treatment is applied only to the electrons of a particular angular momentum belonging to a selected atom. However, using such an on-site scheme only makes sense when the considered electrons are well localized around the atom, which is, in general, the case for strongly correlated electrons. The on-site methods are mostly applied to open 3*d*-, 4*f*-, or 5*f*-shells in strongly correlated materials in order to improve the description of the electronic and magnetic properties. For such systems, the standard GGA methods provide results that are often even qualitatively inaccurate.^{84}

Different versions of DFT+*U* exist in the literature, and those available in WIEN2k are the following: (1) the original version,^{81} called Hubbard in mean field (HMF) in WIEN2k, (2) the fully localized limit version,^{85,86} called self-interaction correction (SIC), and (3) the around mean-field (AMF) version.^{86} The details of the implementation of DFT+*U* in the LAPW method can be found in the work of Shick *et al.*,^{87} while a very good summary and discussion of the DFT+*U* flavors is given in Ref. 88. Note that since the on-site term is applied only inside the sphere surrounding the atom of interest,^{87} the results may depend on the radius *R*_{MT} of this sphere (see Refs. 89 and 90 for illustrations), which is a drawback of the on-site methods. Among our works reporting DFT+*U* calculations, we mention Refs. 77 and 91–93.

For many technical aspects, the EECE and on-site hybrid methods are quite similar to DFT+*U*; however, there are two conceptual differences. The first one concerns the double-counting term. While in DFT+*U*, the double-counting term is derived using concepts from the Hubbard model (see Ref. 88 for a summary of the various expressions), in EECE and on-site hybrids, it is given by the semilocal expression of *E*_{xc} (e.g., PBE) evaluated with the density of the strongly correlated electrons.^{82} The second difference is the calculation of the Slater integrals in the Coulomb and Hartree–Fock terms. In DFT+*U*, they are parameterized with screened intra-atomic Coulomb (*U*) and exchange (*J*) interactions, which are usually chosen empirically. However, in EECE and on-site hybrids, the Slater integrals are calculated explicitly using the orbitals of the strongly correlated electrons.^{82} The results obtained with the on-site hybrid methods can be found in Refs. 83 and 94–98.

The results obtained with DFT+*U* and on-site hybrids should be qualitatively similar in many (but not necessary all) cases.^{83,99–101} Actually, both methods contain empirical parameters: *U* and *J* (or only *U*_{eff} = *U* − *J*) in DFT+*U* and *α*_{x} [Eq. (13)] in on-site hybrids. In both cases, the results will depend crucially on the value of the parameters (*U*, *J*) or *α*_{x}. For applications, the EECE method^{82} is less interesting since it consists of 100% of unscreened Hartree–Fock exchange applied to correlated electrons, which usually is not accurate.

Technically, we mention that calculations with the on-site methods can only be done in a spin-polarized mode, i.e., with *runsp*_*lapw*. However, it is possible to apply on-site methods to non-magnetic systems by using the script *runsp*_*c*_*lapw*, which constrains the system to have no spin polarization.

#### 4. Methods for bandgaps

It is well known that the GGA functionals that are commonly used for total-energy calculations, such as PBE, provide bandgaps that are much smaller than experiment.^{57} Thus, one has to resort to other methods to get reliable results for the bandgap. Hybrid functionals and the *GW* method^{102,103} (see Sec. III I 4) provide much more accurate values; however, they are also significantly more expensive than semilocal methods and cannot be applied easily to very large systems. Therefore, fast semilocal methods have been proposed specifically intended for bandgap calculations, and those which are available in WIEN2k are discussed below. Note that a more detailed discussion of the DFT methods for bandgaps is provided in Ref. 104.

The Tran–Blaha modified Becke–Johnson (TB-mBJ) potential^{105} consists of a modified version of the BJ potential^{106} for exchange (which reproduces the exact KS potential of atoms very well^{106,107}) and LDA^{108} for correlation. The exchange part, which is a MGGA since it depends on the kinetic-energy density *t*, is

where $vxBR$ is the Becke–Roussel (BR) potential^{109} and

with

being the average of $\u2207\rho /\rho $ in the unit cell. The parameters in Eq. (16) are *α* = −0.012, *β* = 1.023 bohrs^{1/2}, and *p* = 1/2 and were determined by minimizing the mean absolute error of the bandgap for a set of solids.^{105} As shown in benchmark studies,^{64,65,80} the TB-mBJ potential is currently the most accurate semilocal method for bandgap prediction. Other parameterizations of Eq. (16) were proposed in Refs. 110 and 111 and are also available in WIEN2k.

The GLLB-SC potential^{112,113} is given by

where $exPBEsol$ is the PBEsol exchange-energy density per electron, $vcPBEsol=\delta EcPBEsol/\delta \rho $ is the PBEsol correlation potential, and $KxLDA=82/3\pi 2$. Since the GLLB-SC potential depends on the orbital energies (*ε*_{H} is the one at the valence band maximum), a non-zero derivative discontinuity^{114,115} can be calculated and added to the KS bandgap for comparison with the experimental value.^{113,116} Similar to TB-mBJ, the GLLB-SC potential is significantly more accurate than traditional GGA functionals, as shown in Refs. 80, 104, 113, and 117. On the other hand, we note that these potentials are not obtained as the functional derivative of an energy functional.

Among other DFT methods, which have been shown to provide bandgaps more accurately than PBE, are the GGAs EV93PW91,^{118,119} AK13,^{120,121} and HLE16,^{122} as well as the LDA-type functional Sloc.^{123} As mentioned in Sec. II B 1, the potential of MGGA energy functionals is not implemented in WIEN2k; however, it is still possible to calculate bandgaps non-self-consistently using the total energy (see Ref. 124 for details). Such MGGAs that are particularly interesting for bandgaps are HLE17^{125} and TASK.^{126}

Figure 3 shows results for the bandgap of 76 solids, which we considered in our previous works.^{64,80} Compared to the standard PBE functional, the results are much improved when TB-mBJ, GLLB-SC, or HSE06 is used, since the mean absolute error (MAE) drops from 1.99 eV with PBE to 0.47 eV, 0.64 eV, or 0.82 eV, respectively, for the other methods. Among all methods considered in Refs. 64 and 80, TB-mBJ leads not only to the smallest MAE but also to a slope (*b* = 0.97) of the linear fit that is closest to 1 (the AK13^{120} functional also leads to a slope of 0.97).

Finally, we also mention that the Slater^{127} and Krieger–Li–Iafrate^{128} potentials have been implemented in the WIEN2k code.^{129,130} However, these *ab initio* potentials, which are as expensive as the HF/hybrid methods, are not really intended for bandgap calculations, but may be interesting for more fundamental studies or as a better starting point for approximating the exact KS exchange potential.

#### 5. Methods for van der Waals systems

The semilocal and hybrid functionals are, in general, quite inaccurate for describing weak interactions.^{131,132} This is mainly due to the London dispersion forces that are not included properly in these approximations. Nevertheless, much better results can be obtained by adding to the semilocal/hybrid functional a correlation term (*E*_{c,disp}) accounting for the dispersion forces. There are essentially two types of dispersion corrections. The first one is of the atom-pairwise (at-pw) type,

where $CnAB$ are the dispersion coefficients for atom pair *A*–*B* separated by the distance *R*_{AB} and $fndamp$ is a damping function. Among the at-pw dispersion methods, Grimme’s DFT-D2^{133} and DFT-D3^{134} can be used with WIEN2k through the separate software DFT-D3^{135} that supports periodic boundary conditions.^{136} A feature of the DFT-D3 method is that the dispersion coefficients are not precomputed and fixed but depend on the coordination number of the system.

The second type of dispersion methods available in WIEN2k are the nonlocal van der Waals (NL-vdW) functionals^{137}

where the kernel Φ is a function of *ρ*, ∇*ρ*, and the interelectronic distance $r\u2212r\u2032$. Compared to the single integral for semilocal functionals [Eq. (12)], evaluating such a double integral is clearly more involved computationally. In order to make the calculations affordable, Román-Pérez and Soler^{138} proposed a very efficient method based on fast Fourier transformation (FFT) for evaluating Eq. (20). Their method is used for the implementation of NL-vdW functionals in WIEN2k. However, to apply the FFT-based method efficiently in an all-electron method, it is necessary to make *ρ* smoother by removing the high-density region close to the nucleus. This is done with the following formula:

where *A* = 1 bohr^{3} and *ρ*_{c} is the density cutoff that determines how smooth the new density *ρ*_{s} should be. More details can be found in Ref. 139, where it is shown that converged benchmark results can easily be obtained at a relatively modest cost.

In our previous works,^{39,140} a plethora of DFT-D3 and NL-vdW functionals were assessed on solids. The test set consists of strongly bound solids and van der Waals solids such as rare gases or layered compounds. The results showed that among the at-pw methods, PBE-D3/D3(BJ)^{134,141} seems to be a pretty good choice, while among the nonlocal methods, rev-vdW-DF2^{142} is the most balanced and actually more or less the best among all tested functionals.

### C. SCF convergence, total energies, forces, and structure optimization

The total energy *E*_{tot} of a periodic solid (with frozen nuclear positions) is given by Eq. (9). The individual terms are of opposite sign and, in an all-electron method, very large. In order to cancel the Coulomb singularity, we follow the algorithm of Weinert *et al.*,^{143} where the kinetic and potential-energy terms are combined and a numerically stable method is obtained.

The force exerted on an atom *t* residing at position **R**^{t}, defined as

is calculated from the Hellman–Feynman theorem and includes Pulay corrections,^{144} thereby taking into account that parts of the basis set used in WIEN2k depend upon the position of atoms (see Refs. 17 and 145–147 for the derivation specific to APW-based methods).

WIEN2k exploits the self-consistency of the KS equations, running through a sequence of calculations where the target is a density (and other parameters), which when passed through these SCF calculations yield the same density. This is equivalent to finding the solution to a set of simultaneous equations and is, in the most general case, referred to mathematically as a “fixed-point problem,” although it goes under the different name of “mixing” in the DFT literature. The general method used for just the electron density *ρ* and other density-like variables (for instance, the density matrix) is discussed in Ref. 148, while the extension to include atomic positions is described in Ref. 149. At any given iteration *n*, the relevant variables can be written as a vector $(\rho n,vnorb)$, including the density *ρ* at the start of an iteration as a function of the Cartesian coordinates **r** and an orbital potential *v*^{orb} (if used as in DFT+*U* or on-site hybrids, see Sec. II B 3), as well as other relevant variables. After running through the SCF sequence, a new density [symbolized by the SCF mapping $KS(\rho n,vnorb)$] is produced. The fixed-point for just the density is when the two are equal, i.e., for all of the variables, the set of simultaneous equations

where $D(\rho n,vnorb)$ is the density residual. Since the total energy with respect to the position of the atoms also has to be minimal, when these are allowed to vary, this can be expanded to include the forces $Fn(\rho n,Rnt,vnorb)=\u2212\u2207RntE(\rho n,Rnt,vnorb)$, i.e., solve the larger problem

This is equivalent to solving the SCF problem for an extended KS-equation, finding the variational minimum of both the atomic positions and densities.

The Hellman–Feynman forces due to the input density *ρ*_{n} are calculated within *lapw0*, while the Pulay corrections are calculated in *lapw2* using the new density $KS(\rho n,Rnt,vnorb)$, so the forces above are, in general, not true derivatives of the energy, rather pseudo-forces that converge to them as the density converges. The general method is to expand to first order, i.e., write for the next value of the variables,

where **H**_{n} is an approximation to the inverse Jacobian, which is constructed as a Simplex gradient, i.e., a multi-dimensional numerical derivative using some number of the prior steps. The approximate inverse Jacobian has two components:

A predicted component where the density, positions, and other variables have changed in a way that maps onto the prior steps in the SCF iterations, so some information is already available.

An unpredicted component where the changes in the variables are new, so no prior information is available.

Equation (25) ignores higher-order terms in the expansion, which can break down far from the fixed point. In addition, because it is generated by a type of numerical differentiation, it can have limited accuracy if the step sizes are inappropriate. In addition, the unpredicted component can lead to instabilities. The approach taken is to control the algorithm greed for the unpredicted step and also use trust regions for both the predicted and unpredicted steps. The general idea of trust regions is that Eq. (25) is only reasonably accurate for changes of the variables, which are smaller than some value, which is called the trust radius, as illustrated in Fig. 4. Starting from some initial defaults, at each iteration, the algorithm will check to see if the step the *mixer* proposes to use is small enough; if it is too large, then the step is reduced. In the next iteration, if the step used led to an adequate improvement when bounded by the trust radius, then this radius is increased; if the step was not good, then the trust radius is decreased. In addition to changing the trust radii based upon improvement (or not), the most recent algorithm^{150} also looks at the last step to see how large it should have been for both the predicted and unpredicted parts. This approach is significantly more stable and often leads to much smaller steps than earlier versions of the *mixer* used.

By design, the algorithm requires minimal user input beyond an estimate for the initial step to take for the unpredicted step; the most recent version of the algorithm^{150} automatically controls all the internal parameters. No algorithm is perfect, and the convergence of the mixing depends significantly upon the nature of the physical problem being considered. The better the description of the underlying quantum mechanical problem, primarily the XC potential, the more rapidly it will converge. A very badly posed problem, in contrast, may converge only very slowly or not at all.

Two main algorithms are used within WIEN2k. The first is MSEC3, which is an updated version of a multisecant Broyden method^{148} with trust region controls. This is a conservative algorithm, which uses the least greedy approach at every iteration. It is recommended for problems that converge badly. The more pushy MSR1^{149} uses a more aggressive algorithm, which is significantly better for problems with soft modes that may converge only very slowly with MSEC3.

One unique feature of WIEN2k is that it can simultaneously converge both the density and atomic positions by solving the fixed-point problem of Eq. (24) using a multisecant approach.^{149} This is often considerably faster than converging them independently as done in many other DFT codes and different from molecular dynamics approaches such as Car–Parrinello.^{151} The convergence rate depends upon the number and width of the eigenvalue clusters^{149} of the combined electron and atomic position Jacobian. This approach does not follow the Born–Oppenheimer surface, which is the energy surface when the density is converged, rather some other surface, which is a balance between having converged densities and pseudo-forces as illustrated in Fig. 5. As such it can be somewhat confusing to the user, particularly as the pseudo-forces can vary in a strange fashion. This mode can be used with both the more conservative MSEC3 and the more aggressive MSR1 algorithm.

For the optimization of lattice parameters, WIEN2k offers a couple of workflows and utilities to generate structures with different lattice parameters, running the corresponding SCF calculations and analyzing the results. The optimized lattice parameters, however, are found only from the lowest total energy since there is no stress tensor in WIEN2k yet. This makes the optimization tedious for low symmetry cases and practically impossible for triclinic lattices.

### D. User interface and utilities

WIEN2k consists of a large set of individual programs (mostly written in Fortran 90), which are linked together via *tcsh*-shell scripts representing a particular workflow. With this modular structure, WIEN2k is, on the one hand, very flexible and one can run a dedicated program for a particular task. On the other hand, there is not just one program and the specific task will be determined by directives in the input file, but a user has to know which program performs this specific task.

WIEN2k can be driven either from the command line or using a web-based graphical user interface (GUI), called *w2web*, which can be accessed by any web browser. Most likely, an experienced user will use the command line and explore all advanced features of WIEN2k, but for the beginner, the web-based GUI provides a very good starting point and it also teaches the user the corresponding command line.

#### 1. Structure generation

The first task of every calculation is to define the structural data. As an example, the *StructGen@w2web* page is shown in Fig. 6 for the case of TiCoSb. The necessary basic input consists of the following:

the lattice type (P, B, F, and H for primitive, body centered, face centered, and hexagonal, respectively) or, if already known, one of the 230 space groups; for the Heusler compound TiCoSb, we can select F lattice or space group 216 ($F4\xaf3m$);

the lattice parameters

*a*,*b*, and*c*(in bohr or Å) and the angles*α*,*β*, and*γ*(in degree);the atoms and their positions; if the space group is given, only one of the equivalent atoms has to be specified.

When the new structure is saved, the *setrmt* utility determines the nearest neighbor distances and automatically sets optimized atomic sphere sizes *R*_{MT} for this structure. The choice of *R*_{MT} has nothing to do with ionic radii but depends on the convergence properties of the atoms as discussed before (see the end of Sec. II A). It is important to note that if one wants to compare total energies for a series of calculations (e.g., for volume optimization), the *R*_{MT} should be kept constant.

An alternative on the command line is the *makestruct* utility, which works analogous to *StructGen@w2web*. More complex structures can be converted from cif or xyz files using the *cif2struct* or *xyz2struct* utilities. The generated structures can be conveniently visualized using XCrysDen^{152} or VESTA.^{153}

Starting from a basic (simple) structure, WIEN2k has powerful tools to generate supercells and manipulate them. *supercell* generates quickly *h* × *k* × *l* supercells (with/without B- or F-centering so that the supercell size can be increased by factors of two) and can add vacuum for surface slab generation. The *structeditor*, a collection of GNU Octave scripts, is even more powerful, since it can create arbitrary supercells (e.g., $3\xd73\xd7l$), rotate or merge structures, and delete or add atoms.

#### 2. Input generation

As mentioned above, WIEN2k consists of many individual programs and most of them have their own input file. Although this sounds very tedious at first, there are default inputs for all programs and several tools for changing the most important parameters on the fly. In *w2web*, the next step would be to check the symmetry of the newly generated structure and generate the input files for the SCF calculation (*initialize@w2web*). The user can provide a couple of parameters (only needed if one wants to change the defaults, see below) and run the following steps in batch mode or step by step:

*nn*: Determines the distances between all atoms up to twice the nearest neighbor distance. In addition, it checks for overlapping spheres and will issue an error message if the spheres overlap. It also checks if identical elements have the same environment and eventually regroups them into equivalent sets.*sgroup*: Checks the structure and determines the space-group. It will group the atoms into sets of equivalent ones according to the Wyckoff positions of the corresponding space-group. In addition, it will check and determine the smallest possible (primitive) cell and create the corresponding structure file if necessary. For instance, if one enters the NaCl structure as a primitive cubic structure with four Na and four Cl atoms, it will automatically create a primitive FCC cell with only one Na and Cl atom.*symmetry*: Finds the symmetry operations of the space group as well as the point group symmetry of each atom and the corresponding*LM*expansion for the density/potential [Eqs. (1) and (2)].*lstart*: Solves numerically the radial Dirac equation for free atoms and creates atomic densities. Using the eigenvalues (or the localization within the atomic spheres) of all atomic states, it groups them into core and valence states. It selects automatically LOs for semi-core states and writes the starting energy parameters*E*_{ℓ}to**case.in1**(during the SCF cycle, they are searched and adapted automatically to ensure best possible settings in all cases).*kgen*: Generates a shifted or non-shifted equidistant**k**-mesh with a user specified density in the irreducible part of the BZ.*dstart*: Superposes the atomic densities and creates the starting density for the SCF cycle.

On the command line, a corresponding script is called, which optionally allows us to specify various parameters (the most important ones are given below with their default values for reference):

*init_lapw* [-b -vxc PBE -ecut -6.0 -rkmax 7.0 -numk 1000]

The switches are described as follows: *-b* indicates batch mode (instead of step by step), *-vxc* selects the DFT functional, *-ecut* gives the core–valence separation energy (in Ry), *-rkmax* determines the plane wave cut-off parameter $RMTminKmax$, and *-numk* determines the total number of **k**-points in the full BZ.

The most critical parameter is $RMTminKmax$, which determines not only the quality but also the required computing time. The type of atom with the smallest *R*_{MT} determines this value because the *R*_{MT} for other atoms are set such that when the smallest atom is converged with the number of PWs, all others are also converged with the number of PWs. If the smallest sphere is a H atom (for instance, in short O–H bonds), $RMTminKmax=3$ is sufficient, most *sp*/*d*/*f*-elements converge with $RMTminKmax=7$/8/9. For lower (higher) precision, one can decrease (increase) these values by 10%–20%.

Of similar importance is the selection of a **k**-mesh. Generally speaking, small unit cells and metallic character require a large number of **k**-points (typical starting values for the SCF cycle would be a 10 × 10 × 10 mesh), while large cells (100 atoms) and insulators can be started with only one **k**-point. In any case, after the first SCF cycle, the **k**-mesh should be increased and the results (e.g., the forces on the atoms) should be checked. Certain properties (DOS, optics, and NMR) may need an even denser mesh, which in WIEN2k can be easily done and is fairly cheap, since it is only used for the property of interest.

#### 3. SCF cycle

The SCF cycle consists in WIEN2k of a complex workflow using several different programs. The main steps are as follows:

*lapw0*: Calculates the Coulomb and XC potential from the density.*lapw1*: Calculates the valence and semi-core eigenvalues and eigenvectors at all requested**k**-points*lapw2*: Calculates the valence electron density*lcore*: Calculates the core eigenvalues and the core density*mixer*: Adds up the core and valence densities and mixes the total density with densities from previous iterations. In addition, it may update the atomic positions according to the calculated forces (see Sec. II C) and also the density matrices or orbital potentials when DFT+*U*or on-site hybrid methods are used.

Additional programs may be called depending on the requested options to include SO coupling or one of the specialized functionals discussed in Sec. II B (DFT-D3, NL-vdW, DFT+*U*, on-site hybrid/EECE, or hybrid-DFT/HF).

In *w2web*, the SCF cycle can be started by clicking on *SCF@w2web*. In this interface, one can then specify several parameters such as convergence criteria, parallelization, simultaneous optimization of internal atomic positions, or adding SO coupling.

The most important parameters for the corresponding command line script are as follows:

*run_lapw* [-ec 0.0001 -cc 0.0001 -fc 1.0 -p -so -min]

The SCF cycle will stop when the (optional) convergence criteria *-ec* (energy in Ry), *-cc* (charge in e^{−}), and *-fc* (forces in mRy/bohr) are fulfilled three times in a row. SO coupling (only possible after a previous *init_so_lapw* step) is switched on using *-so*, *-min* relaxes the atomic positions simultaneously with the electron density (Sec. II C), and *-p* switches on parallelization (Sec. II E).

The basic summary of the SCF cycle is written into the **case.scf** file and all relevant quantities are labeled **:LABEL:** and can be searched/monitored using *analyse@Utils@w2web* or the Linux *grep* command. If the desired convergence has been reached, it is advisable to save all relevant input/output files using either *save@Utils@w2web* or the *save_lapw* utility. One can now either check results using more **k**-points (*kgen@single_prog@Execution@w2web*) or modify other inputs (*input files@Files@w2web*) such as $RMTminKmax$ or the XC functional and then continue with the SCF cycle. Later on, it is always possible to come back to a previously saved calculation using *restore@Utils@w2web* (*restore_lapw*).

#### 4. Tasks

Once these steps have been finished, one could, for instance, optimize the lattice parameters (*optimize@w2web*) or perform various other tasks (*Tasks@w2web*) such as *Bandstructure@Tasks@w2web*, *DOS@Tasks@w2web*, *ElectronDensities@Tasks@w2web*, *XSPEC@Tasks@w2web*, *TELNES@Tasks@w2web*, or *OPTIC@Tasks@w2web* (see Fig. 6). Each of these tasks consists of a guided workflow and let the user prepare the necessary inputs, executes various small programs, and visualizes the results.

### E. Software requirements and parallelization

#### 1. Software

WIEN2k runs on any Linux platform and also on Mac. It is written mainly in Fortran 90 (a few programs are written in C), and the workflows are managed by *tcsh* scripts. Most of the time-critical parts use libraries such as BLAS and LAPACK, and efficient libraries are therefore mandatory. There is direct installation support for the standard Linux tools GFortran+OpenBLAS (at least gcc 6.x) and Intel IFORT+MKL. The latter still gives the best performance. *w2web* is a specialized web server written in perl and listens on a user-defined high port. Its access is, of course, password protected and can be limited to specific IP addresses.

For the optional installation of the MPI-parallel version (useful only on clusters with InfiniBand network or larger shared memory workstations with at least 16 cores), one needs obviously MPI (e.g., Open MPI or Intel MPI) and also ScaLAPACK (included in the MKL), FFTW,^{154} and, optional but highly recommended, ELPA.^{155,156}

The following Linux tools are necessary (not all of them are always installed by default): tcsh, Perl 5, Ghostscript, gnuplot, GNU Octave, and Python 2.7.x+NumPy.

Optional, but highly recommended, programs for certain tasks include the following:

XCrysDen

^{152}and VESTA^{153}for structure and electron density visualization but also generation of band structure**k**-meshes or plotting Fermi-surfaces.Libxc

^{33,34}for XC functionals not directly implemented in WIEN2k.DFT-D3

^{135}for DFT+D3 calculations of van der Waals systems.Wannier90

^{157,158}for constructing Wannier functions using the*wien2wannier*utility.phonopy,

^{159,160}Phonon,^{161,162}or PHON^{163,164}for phonon calculations.BoltzTraP2

^{165}for transport calculations (see Sec. III I 1).fold2bloch

^{166}to fold supercell band structures back to the primitive BZ.SKEAF

^{167}to extract de Haas-van Alphen frequencies from WIEN2k.Critic2

^{168,169}is an alternative program to the WIEN2k program*aim*to analyze 3D scalar fields such as the electron density by using the “atoms in molecules” (AIM) theory of Bader.^{170,171}

#### 2. Parallelization

WIEN2k is a highly accurate all-electron code based on the APW method and thus certainly not as fast as some other (pseudopotential or minimal basis set) codes. However, it takes advantage of inversion symmetry and when present it will automatically use the “real” instead of the “complex” version of the code, thus saving half of the memory and running almost four times as fast. In addition, it is highly optimized and efficiently parallelized at three different levels, which can be optimally chosen depending on the size of the problem and the available hardware.

Except for OpenMP parallelization (see below), parallelization is activated by a *-p* switch in our scripts and needs a **.machines** file as listed and described below:

OpenMP parallelization:

The main (time consuming) programs are all parallelized using OpenMP and can use the corresponding threaded BLAS, LAPACK, and FFTW libraries. It is activated by either setting the *OMP_NUM_THREADS* variable globally or using *omp_prog:N* directives in **.machines**. While many parts of the code scale very well with the number of parallel threads on a multi-core shared memory machine, unfortunately, the scaling of the matrix diagonalization is, at present, limited to 2–4 cores due to performance bottlenecks in the corresponding OpenBLAS or MKL libraries.

**k**-point parallelization:

This together with OpenMP is a very simple and highly efficient parallelization, which works even on a loosely coupled cluster of simple PCs with a slow network for small to medium sized cases (up to 100 atoms/cell) where the eigenvalue problem needs to be solved for several **k**-points. It requires a common (NFS) filesystem on all machines and password-less *ssh* (private/public keys). *N* lines **speed:hostname** in **.machines** will split the list of **k**-points into *N* junks, and *N* jobs will be started in parallel on the corresponding hosts, followed by a summation step of the partial densities. Since WIEN2k can use temporary (local) storage for the eigenvectors, we are not limited in the number of **k**-points and our personal record is a NMR chemical shift calculation for fcc Al with 10^{6}**k**-points.

MPI-parallelization:

With a sufficiently powerful hardware (at least 16 cores or a cluster with InfiniBand network) and for medium to large sized problems (more than 50 atoms/cell), it is possible, and actually necessary, to parallelize further using MPI. Besides a tremendous speedup that can be achieved by parallelization over atoms and, in particular, over basis functions, this version will distribute the necessary memory on all requested computers, thus allowing calculations for unit cells with more than 1000 atoms.^{172–174} Such cells require basis sets of about 10^{5} APWs, and the resulting Hamiltonian, overlap, and eigenvector matrices may need about 500 GB of memory, which are distributed over the nodes in the standard ScaLAPACK block-cyclic distribution. For such large systems, the solution of the general eigenvalue problem can become the time-limiting (*N*^{3}) step (depending on the number, type, and *R*_{MT} of the atoms, the setup of the complicated matrix elements can take a comparable fraction of the total time), but the ELPA library provides a highly efficient and scalable (1000 cores) diagonalization. In cases with fewer atoms in large cells (isolated molecules in a large box or surface slabs with sufficient vacuum), we can use an iterative diagonalization^{175} using the previous eigenvectors as start. Depending on the requested number of eigenvalues, this method may be up to 100 times faster than full diagonalization and still scales very well with the number of cores. The MPI version of the code is used when lines with **speed:hostname:N** (or lines with more than one hostname) in **.machines** are specified. Of course, coupling of **k**- and MPI-parallelization (and/or OpenMP) is possible.

## III. PROPERTIES AND FEATURES

### A. Energy bands, density of states, electron densities

Once a self-consistent solution for a chosen atomic structure is done, one can focus on the electronic structure. The energy eigenvalues as a function of the **k**-vector obtained at the end of the KS calculation define the band structure. The **k**-path along high symmetry lines in the irreducible BZ (see the Bilbao Crystallographic Server^{176}) can be either obtained from WIEN2k default templates or generated graphically using XCrysDen.^{152} In WIEN2k, one can plot the energy bands by using the program *spaghetti*, indicating that its interpretation is difficult. However, there are some tools to help. A first tool is a symmetry analysis, which determines the irreducible representation (of the corresponding point group) for each KS eigenvalue. With this knowledge, one can connect the KS eigenstates to bands by using compatibility relations and satisfying the non-crossing rule. The chemical bonding information of state *n***k** is contained in the corresponding wave function *ψ*_{nk}, which is complex and three dimensional. However, when computing the square of its modulus, one obtains an electron density, which is a real function and easy to visualize. Integrating this (normalized) electron density, one obtains a charge *q*, which can be decomposed into contributions from the interstitial region *I* and the atomic spheres *S*_{t} (labeled by the atom number *t* and the quantum number *ℓ* according to the atomic-like basis set),

This allows us to compress the detailed information contained in the wave function of a single eigenstate state *ψ*_{nk} to a few numbers that can be stored and analyzed. In addition, WIEN2k decomposes the *q*_{tℓ} according to the symmetry of the corresponding point group. For example, the five *d*-orbitals of a TM atom surrounded by ligands in octahedral symmetry are split into the *t*_{2g} and *e*_{g} manifold (crystal field splitting), while for lower symmetry, a splitting into five different *d*-orbitals is obtained. A review paper^{8} (Sec. 6.2) illustrates these tools for TiC, a refractory metal (crystallizing in the sodium chloride structure) that is almost as hard as diamond but has metallic, covalent, and ionic bonding contributions. These data are very useful to analyze the electronic structure, and we illustrate this for the band structure of the Heusler compound TiCoSb (see Ref. 177). If one wants to know which atomic states (e.g., Co-*d*, Ti-*d*, or Sb-*p* states) contribute most to a certain band, one can show the character of the bands, sometimes called “fat bands.” For each eigenvalue *ε*_{nk}, the size of the circle represents the weight of the chosen character (e.g., a particular *q*_{tℓm}). Figure 7 shows which band states originate mainly from Co-*d*, Ti-*d*, or Sb-*p* states, giving the band structure a chemical interpretation.

From the KS eigenvalues calculated on a sufficiently fine **k**-grid in the irreducible BZ, one can obtain the density of states (DOS), usually by means of the (modified) tetrahedron method.^{178} By using the partial charges [Eq. (26)], one can decompose the total DOS into partial DOS (PDOS), which are useful for understanding chemical bonding and interpreting various spectroscopic data. Figure 8(a) shows how much each region (atomic spheres of Co, Ti, Sb, and the interstitial) contributes to the total DOS. We show the dominating valence contributions from the Co-3*d* and Ti-3*d* electrons in Fig. 8(b) and those from Sb-5*s*/5*p* in Fig. 8(c). Unfortunately, the interstitial PDOS cannot be decomposed into atomic and *ℓ*-like contributions uniquely. However, by analyzing the atomic orbitals in the free atom, we see (Table I) that only a fraction of the related electron density resides inside the corresponding atomic sphere, e.g., 81% for the Co-3*d* but only 15% for the Co-4*s* orbital. Therefore, a significant part of the density lies outside the atomic sphere, leading to a non-negligible PDOS from the interstitial region [Fig. 8(a)]. A simple scheme to eliminate this interstitial part is to renormalize the partial DOS with a factor $qt\u2113ren$ (determined by a least squares fit) such that the sum of the renormalized PDOS contributions yields the total DOS,

This sum runs only over the “chemical” *ℓ*, which are the main contributions. In Table I, we see that the $qt\u2113ren$ are close to the free atom situation for more localized orbitals but differs significantly, for example, for Co-4*s*, which is more localized in the solid than in the free atom. For the importance of this effect, see Sec. III B.

. | $qt\u2113free$ . | $qt\u2113ren$ . |
---|---|---|

Co-4s | 0.34 | 0.48 |

Co-4p | … | 0.45 |

Co-3d | 0.95 | 0.90 |

Ti-4s | 0.15 | 0.30 |

Ti-4p | … | 0.43 |

Ti-3d | 0.81 | 0.86 |

Sb-5s | 0.60 | 0.61 |

Sb-5p | 0.35 | 0.40 |

. | $qt\u2113free$ . | $qt\u2113ren$ . |
---|---|---|

Co-4s | 0.34 | 0.48 |

Co-4p | … | 0.45 |

Co-3d | 0.95 | 0.90 |

Ti-4s | 0.15 | 0.30 |

Ti-4p | … | 0.43 |

Ti-3d | 0.81 | 0.86 |

Sb-5s | 0.60 | 0.61 |

Sb-5p | 0.35 | 0.40 |

The fundamental variable in DFT is the electron density *ρ*, which can be compared to experimental data. The total *ρ*, which is obtained by summing over all occupied states, can be decomposed into its contributions coming from the core, semi-core, and valence states. A variety of tools (such as XCrysDen^{152} or VESTA^{153}) allows one to visualize the density *ρ* along a line, in a plane (2D), or in the unit cell (3D). One can easily compute the density corresponding to a selected energy window of electronic states in order to visualize their bonding character. By taking the difference between the crystalline density and a superposition of atomic densities (placed at the atomic position of the crystal), one obtains a difference density Δ*ρ* = *ρ*_{crystal} − *ρ*_{atoms}, which shows chemical bonding effects much more clearly than the total or valence density. Figure 9 provides an illustration of Δ*ρ* for TiCoSb within the (110) plane, where we can observe the strong asphericities in the electron density around the Ti and Co atoms originating from different occupations of the five 3d orbitals, as well as the charge transfer (discussed below using Bader charges).

When we want to compare the computed electron densities or the related X-ray structure factors (computed using the *lapw3* module) with experimental data, we must take into account the motion of the nuclei. In DFT calculations, we assume that the nuclei are at rest, whereas in an experiment, this motion must be considered, for example, by means of the Debye–Waller factors, which can also be calculated by phonon calculations.

It is a strength of theory to allow various decompositions (of the DOS or electron density, for instance), which are often useful for interpreting properties, but these may depend on the basis set used in a calculation, for example, when deriving atomic charges. In a LCAO scheme, one takes the weights of all atomic orbitals centered at a given atom to determine how much charge corresponds to that atom (Mulliken’s population analysis). In an APW scheme, the charge inside the related atomic sphere would give an atomic charge, but this value clearly depends on the chosen atomic radius and lacks the interstitial contribution. However, the renormalized partial charges $Qtren$ obtained from an integral over the renormalized PDOS gives a meaningful measure of charge transfer, as shown in Table II.

Atom . | $Qt,crystalBader$ . | $Qt,superBader$ . | $Qtren$ . |
---|---|---|---|

Co | −0.89 | −0.18 | −1.25 |

Ti | +1.28 | +0.82 | +0.85 |

Sb | −0.39 | −0.64 | +0.40 |

Atom . | $Qt,crystalBader$ . | $Qt,superBader$ . | $Qtren$ . |
---|---|---|---|

Co | −0.89 | −0.18 | −1.25 |

Ti | +1.28 | +0.82 | +0.85 |

Sb | −0.39 | −0.64 | +0.40 |

A basis-set independent alternative is the AIM procedure proposed by Bader,^{170,171} which is based on a topological analysis of the density. It uniquely defines volumes (called “atomic basins”) that contain exactly one nucleus by enforcing a zero-flux boundary: $\u2207\rho \u22c5n^=0$. Inside such an atomic basin, this scheme uniquely defines the Bader charge for a given density independently of the basis-set method that was used to calculate the electron density.^{170,171} An example of the application of the AIM method is given in Table II, which shows the charge inside the atomic basins of TiCoSb determined with the *aim* module of WIEN2k. $Qt,crystalBader$ is the nuclear charge *Z*_{t} minus the number of electrons (a positive value indicates a depletion of electrons) using the SCF density, and $Qt,superBader$ is the same quantity but using a density from a superposition of the free neutral atoms. In this crystal structure, according to a Bader analysis, even the superposition of neutral densities leads to a significant charge transfer from Ti to Co and Sb, which is enhanced for Co and Ti but reduced for Sb during the SCF cycle. These Bader charges can be compared to the $Qtren$. As we can see, in both methods, there is a transfer of electrons from the Ti to the Co atom, but the specific amount and, in particular, the charge state of Sb differs significantly depending on the way it is calculated. An inspection of the difference density (Fig. 9) shows a negative Δ*ρ* around Sb and thus indicates a positive charge of the Sb atom in contrast to the Bader charges, which seems to pick up a lot of charges in the interstitial region leading to a negative Sb charge. In essence, one should be careful with quantitative charge state assignments.

An alternative to *aim* is the Critic2 package,^{168,169} which determines Bader charges using a pre-calculated 3D mesh of densities. It is very fast; however, the integration of total charges on such a crude mesh is inaccurate and one should restrict its usage for magnetic moments (integrating the spin densities) or valence charges densities (be careful with the 3D mesh).

### B. Photoelectron spectroscopy

#### 1. Valence-band photoelectron spectroscopy

Experimental valence band photoelectron spectra (PES) are often just compared to the total DOS. Such a comparison, however, can at best reproduce certain peak positions, but usually not the experimental intensities. This is even more true with modern synchrotron-based hard X-ray PES (HAXPES), where the spectra differ considerably depending on the excitation energy. This is because the cross sections of different atomic orbitals change dramatically as a function of excitation energy and this effect should be taken into account. The *pes* module^{179} of WIEN2k uses the partial DOS (PDOS_{tℓ}) and multiplies it with the corresponding energy-dependent atomic orbital cross sections *σ*_{tℓ}, taken from various tables,^{180,181}

In addition, *pes* can use the renormalized PDOS (see Sec. III A), so that the contributions from the less localized orbitals (whose wave functions are mainly in the interstitial region) are also properly taken into account. This module allows us to specify the X-ray energy and can handle unpolarized and linearly polarized light as well as linear dichroism in angular distribution (LDAD). It was successfully applied for various examples^{179} (SiO_{2}, PbO_{2}, CeVO_{4}, In_{2}O_{3}, and ZnO).

Here, we compare in Fig. 10 the experimental HAXPES spectrum^{177} of TiCoSb at 6 keV with the theoretical calculation. The theoretical spectrum reproduces the experimental intensities very well, but the bandwidth is too small so that the Sb-*s* peak has about 1 eV less binding energy. This is a well-known DFT problem and concerns all states at lower energy. The decomposition of the total spectrum allows us to analyze the contributions to the different peaks in the spectrum. The low energy feature is almost exclusively from Sb-*s* states, the double peak at −5 eV is from Sb-*p* states, and the double peak at −2 eV is from Co-*d* states. However, Co-*s* also contributes significantly to the feature at −6 eV, and Sb-*p* contributions are even larger than Co-3*d* for the lowest binding energy peak at −1.5 eV. All Ti contributions are very small and thus not shown in Fig. 10.

#### 2. Core-level photoelectron spectroscopy

X-ray photoelectron spectroscopy (XPS) determines the binding energy (BE) of core states. These BEs are specific to certain atoms, but the possible small changes of BEs (core level shifts) provide important additional information about the chemical environment and, in particular, the oxidation state of that element. Since WIEN2k is an all-electron method, it has the self-consistent core eigenvalues available and one could calculate their BE as energy difference with respect to the Fermi level. However, according to Janak’s theorem,^{182} DFT eigenvalues represent the partial derivative of the total energy with respect to the orbital occupancy and are therefore not necessarily good approximations of experimental excitation energies. Such BEs are typically 10%–20% too small. In fact, even BE differences (core level shifts) from ground-state calculations might not be reliable because screening effects of the final state are not included. Much better approximations to experimental BEs can be obtained according to Slater’s transition state theory,^{183} where half of a core electron is removed.^{173,184,185} The corresponding SCF eigenvalue, which represents the slope of the total energy vs occupation at half occupation, is a much better approximation to the actual energy difference for *n* and *n* − 1 occupation, and typical BE errors are reduced to a few percent. For solids, such calculations should employ large supercells where only one of the atoms gets excited. This method also allows for some possible screening due to the valence electrons but suffers from the fact that, in solids, a neutral unit cell is required. The missing half electron can be compensated by adding a negative background charge, by increasing the number of valence electrons by one half, or by playing slightly with the nuclear charge^{186} according to a virtual crystal approximation. However, it is not always clear which of these methods should be preferred. Successful applications include, for instance, the N-1*s* shifts of h-BN covered Pt, Rh, and Ru(111) surfaces (with a unit cell of the “nanomesh” containing more than 1000 atoms)^{173} or to Pb-5*d* and Ta-4*f* shifts in the misfit layer compound (PbS)_{1.14}TaS_{2}.^{185}

The work function can also be obtained from surface slab calculations as the difference of the Fermi level and the Coulomb potential in the middle of the vacuum region. An example can be found for free and h-BN covered Ni and Rh(111) surfaces in Refs. 187 and 188. One has to carefully check the convergence of the work function with respect to the size of the vacuum region.

### C. X-ray absorption/emission spectroscopy and electron energy loss spectroscopy

Experimental techniques such as X-ray emission (XES), near-edge X-ray absorption (XAS, NEXAFS, and XANES), and electron energy loss spectroscopy (EELS) represent an electronic transition between a core state and a corresponding valence/conduction band state, which leads to the measurement of emitted/absorbed X-rays or the energy loss of transmitted electrons. The intensity of such a spectrum is given by Fermi’s golden rule according to dipole transitions between an initial (Ψ_{I}) and a final (Ψ_{F}) state,

The dipole selection rule is valid when the X-ray energy is not too large and limits transitions between a core state on atom *X* and angular momentum *ℓ* into/from a conduction/valence band state with Δ*ℓ* ± 1 on the same atom. In essence, the spectrum is calculated from the corresponding partial DOS times the squared radial matrix element. In the case of polarized light and oriented samples, the orientation-dependent spectra can be obtained by substituting the *ℓ*-like partial DOS by an appropriate *ℓm*-like DOS, e.g., for a K-spectrum of a tetragonal/hexagonal system by replacing the total *p*-DOS by *p*_{z} and *p*_{x} + *p*_{y}-DOS.^{189}

Such a scheme leads to very good results for XES spectra, where the final state has a filled core-hole and the valence-hole is usually well screened. For XANES and EELS spectra, however, the final state^{190} determines the spectrum. The final state has a core hole and an excited electron in the conduction band, and they will interact with each other leading to strong excitonic effects. In order to describe this effect in a DFT-based band-structure code, one has to create a supercell (as large as possible, but depending on the specific system and the hardware resources) of about 32–256 atoms and remove a core electron from one of the atoms. This electron should be added either to the valence electrons (if there are proper states in the conduction band, e.g., in B–K edges of BN) or to the constant background charge (if the lowest conduction band states are of completely wrong character, e.g., for the O–K edge of a TM-oxide) to keep the system neutral. In the following SCF cycle, the valence states on the atom with the core hole will get a lower energy and localize, but the surrounding electrons are allowed to partially contribute to the screening.

As an example, the Cs-L_{3} spectrum of CsK_{2}Sb is shown in Fig. 11(a) and compared with state-of-the-art calculations^{191} using the Bethe–Salpeter equation (BSE) (for further details on BSE, see Sec. III I 3). Obviously, the spectrum calculated with the ground state electronic structure is very different from core hole supercell or BSE calculations, where the spectral weight is redistributed into the first (excitonic) peak. In particular, the Cs-5*d* states come down in energy and partially screen the core hole [Fig. 11(b)]. The Cs-6*s* contributions (not shown) are 1–2 orders of magnitude smaller because both their dipole matrix elements and the 6*s*-PDOS are smaller than the corresponding Cs-5*d* quantities. On the other hand, the BSE and core hole supercell calculations agree quite well. Note that the BSE data have a larger broadening. In this example, the size of the supercell is easy to converge, but it should be noted that for details of the spectrum, a rather good **k**-mesh (8 × 8 × 8 for the 128 atom supercell) is necessary.

While the core hole approach works generally quite well, it also has clear limitations or needs extensions:

For metals or small bandgap semiconductors, a full core hole is sometimes too much because the static screening in the supercell might not be enough. Better results can be obtained using a “partial” hole,

^{192,193}although adjusting the size of the hole until the resulting spectra match the experiment is not fully*ab initio*anymore.The used DFT approximation may not be accurate. This can concern the ground state, for instance, the O–K edge of NiO could be greatly improved using the TB-mBJ approximation

^{194}instead of PBE or PBE+*U*. Even more problematic are excited-state effects due to the additional*d*electron in strongly correlated materials (for instance, 3*d*in TM oxides), where very poor L_{2,3}edges are obtained in single particle approaches and sophisticated methods such as dynamical mean field theory (DMFT, see Sec. III I 5) or configuration interactions^{195}are needed.For early (“

*d*^{0}”) TM compounds, the L_{2,3}edges are influenced strongly by interactions and interference effects between the 2*p*_{1/2}and 2*p*_{3/2}states, which are split only by a few eV. This can be accounted for using fully relativistic BSE calculations (see Sec. III I 3), where both the 2*p*_{1/2}and 2*p*_{3/2}states are taken into account simultaneously.^{196}The B-K edge of hexagonal BN has been investigated many times in the literature.

^{189,197}While the first strong excitonic peak originating from antibonding B-*p*_{z}(*π*^{*}) states is well described by supercell calculations and in full agreement with BSE calculations,^{197}the experimental double peak at around 7 eV above the*π*^{*}peak originating from*σ*^{*}(B-*p*_{x,y}) states shows up in the calculations only as a single peak. This can only be fixed by taking electron–phonon interactions into account. The approach is based on statistical averages over all vibrational eigenmodes of the system.^{197}Thus, one calculates first the vibrational modes of h-BN and then, using a supercell of e.g., 128 atoms, all atoms are displaced according to the vibrational eigenmodes with amplitudes determined by the Bose–Einstein occupations. Even at*T*= 0 K, the zero-point motion is enough to split the degenerate*p*_{x}and*p*_{y}states, since, on average, each B will have one*N*neighbor at a smaller (larger) distance than the two others. This leads to the desired splitting of the single*σ*^{*}peak into a double peak.^{197}

EELS has fairly similar basic principles as XAS but differs slightly because of the finite momentum of the electrons.^{198} The *telnes3* module of WIEN2k calculates the double differential scattering cross section on a grid of energy loss values and impulse transfer vectors. This double differential cross section is integrated over a certain momentum transfer *q* to yield a differential cross section, which can be compared to the experiment. This formalism allows the calculation of relativistic EELS including transitions of arbitrary order (i.e., non-dipole transitions), and it can take into account the relative orientation between the sample and beam.^{199} Practical aspects on how to perform EELS calculations have been given by Hébert,^{200} and some examples can be found in Refs. 201–204.

### D. Optics

The *optics* module of WIEN2k uses the independent-particle approximation (IPA) and calculates the direct transitions (conserving **k**) between occupied *n***k** and unoccupied *n*′**k** states, where for both states, KS eigenvalues are used.^{205} The joint density of states is modified by transition probabilities given by the square of the momentum matrix elements *M* = ⟨*n*′**k**|*A*·*p*|*n***k**⟩ between these states, which determine the intensity of optical spectra using dipole selection rules and clearly distinguish between optically allowed and forbidden transitions. From the resulting imaginary part *ε*_{2} of the dielectric function, its real part *ε*_{1} can be obtained by the Kramers–Kronig transformation and then additional optical functions such as conductivity, reflectivity, absorption, or the loss function can also be calculated. In a metallic solid, an additional Drude term accounts for the free-electron intraband contribution. For insulators and semiconductors, where the DFT gap is often too small when compared to the experiment, one can use a “scissor operator.” This sounds complicated but is nothing else than a rigid shift of the unoccupied DFT bands to adjust the (too small) DFT bandgap, either using the experimental gap or, more *ab initio*, using the gap calculated with TB-mBJ. Note that TB-mBJ usually gives very good bandgaps, but the bandwidth of both the valence and conduction bands are too small, and thus, the optical properties with TB-mBJ might not be very accurate, but still more accurate than standard GGA (see Sec. III B 2 in Ref. 104 for a brief summary of literature results with TB-mBJ). Alternatively, hybrid-DFT functionals^{66} can be used, which give quite good bandgaps for semiconductors, but one should be aware that the optical properties usually require a quite dense **k**-mesh, which makes hybrid calculations fairly expensive.

As an example we present in Fig. 12 the imaginary part of the dielectric function *ε*_{2} for CsK_{2}Sb using various approximations. First, we note that very dense **k**-meshes are necessary for converged results, which makes the application of more expensive many-body perturbation methods such as *GW* even more difficult. As expected, PBE calculations yield the smallest bandgap of 1.06 eV, while hybrid YS-PBE0 gives 1.68 eV and TB-mBJ 2.08 eV. This can be compared to *G*_{0}*W*_{0} results^{191} of 1.62 eV or early experimental estimates of 1.0–1.2 eV, which, however, have been criticized. It should be noted that the *GW* result^{191} for *ε*_{2} probably suffers from an under converged **k**-mesh and even with their large smearing (note the large tail below 1.62 eV in the *GW* results shown in Fig. 12), a distinct peak structure emerges, which is not present in the **k**-converged results.

Experimental results for optical conductivity, reflectivity, or absorption as well as the low energy valence electron energy loss spectrum (VEELS) can often be successfully interpreted in the IPA.^{203,204} However, sometimes (in particular, for wide gap insulators) the frequency-dependent dielectric function *ε* in the IPA may have little in common with the experimental situation. This is because excitations are two-particle processes and the missing electron–hole interaction (i.e., the excitonic effect already mentioned in Sec. III C) can significantly affect the calculated optical response of a material when they are strong. In order to overcome this problem, one needs to include the electron–hole correlation explicitly by solving the BSE (see Sec. III I 3).

### E. Magnetism

When magnetism occurs in a solid, it may come from localized electrons (e.g., from *f* electrons of rare-earth atoms) or itinerant (delocalized) electrons (e.g., in Fe, Co, or Ni). In any case, magnetism comes mainly from exchange splitting causing a partial occupation of states, which differ between the spin-up (*N*_{↑}) and spin-down (*N*_{↓}) electrons. The corresponding magnetic moment *M* is defined as the difference between these occupation numbers (*M* = *N*_{↑} − *N*_{↓}). For such cases, one must perform spin-polarized calculations (*runsp_lapw*) and needs the spin density in addition to the total electron density. The default is collinear magnetic order as found in ferromagnets, for example, in Fe, Co, Ni, or antiferromagnets, for example, in Cr. In addition to collinear magnets mentioned here, one can also handle non-collinear magnetism (for example, systems with canted magnetic moments or spin spirals), as described in Sec. III I 2.

For spin-polarized calculations of a specific complex antiferromagnetic structure, in most cases, it is essential to specify proper (antiferromagnetic) atomic spin-moments as an input for the SCF cycle and the WIEN2k tool *instgen* allows us to set this easily. If one is interested in the orientation of the magnetic moments with respect to the crystal structure (easy or hard axis) or the magneto-crystalline anisotropy,^{206} the SO interaction must be included. For heavy elements or when orbital moments become important, one needs this full relativistic treatment by including the SO interaction.

Nowadays, one can study complicated systems, for example, BaFe_{2}O_{5}, an oxygen-deficient perovskite-like structure, which shows a Verwey transition. At low temperature, this system has a charge-ordered state (with Fe^{2+} and Fe^{3+} at different Fe-sites), but above the Verwey transition temperature (at about 309 K), a valence mixed state with the formal oxidation state Fe^{2.5+} appears. DFT calculations made it possible to interpret this complicated situation (for details see Ref. 207 and Sec. 7.4.1 of Ref. 208). Another complex system is PrO_{2}, which has a Jahn–Teller-distorted CaF_{2} structure.^{209} It contains Pr-4*f* electrons, which form a localized band (lower Hubbard band) for one 4*f* electron, but the others hybridize with the valence electrons forming a situation between Pr^{3+} and Pr^{4+}. This compound is an antiferromagnetic insulator that requires a relativistic treatment. PBE+*U* calculations^{93} provide results that are consistent with all experimental data for the bandgap, magnetic moment, and structural distortion.

In the 1980s, a numerical problem occurred in connection with several studies of the Fe_{65}Ni_{35} INVAR alloy, which has a vanishing thermal expansion around room temperature. This is one of the systems for which the magnetization shows a hysteresis when a magnetic field is applied. The hysteresis causes numerical difficulties, since for a certain magnetic field, there are three solutions (magnetic moments) with very similar total energies causing difficulties in the convergence of a conventional SCF procedure. In order to solve this problem, the fixed spin moment (FSM) method was proposed.^{210,211} It is a computational trick interchanging dependent and independent variables. Physically speaking, one applies a magnetic field and obtains a moment, but computationally one chooses the moment (as input) and calculates the field afterwards. In a conventional spin-polarized calculation, the Fermi energy must be the same for the spin-up and spin-down electrons. The magnetic moment *M* is an output. In the FSM scheme, one does several constrained calculations, where the moment *M* is an input, but allowing different Fermi energies for the two spin states. One can interpret the difference in the Fermi energies as a magnetic field. Although one needs to perform several calculations (instead of a single conventional one), they converge rather rapidly. The FSM method allows expanding the usual total-energy vs volume curve to an energy surface *E*_{tot}(*V*, *M*) as a function of volume *V* and moment *M*, which also provides new insights.

### F. Hyperfine fields and electric field gradients

All aspects of nucleus–electron interactions, which go beyond the electric point-charge model for a nucleus, define the hyperfine interactions. Nuclei with a nuclear quantum number *I* ⩾ 1 have an electric quadrupole moment *Q*. The nuclear quadrupole interaction (NQI) stems from the interaction of such a moment and the EFG, the second derivative of the Coulomb potential at the corresponding nuclear site. One can measure the EFG with Mössbauer spectroscopy, NMR, nuclear quadrupole resonance (NQR), or perturbed angular correlation (PAC). The NQI determines the product of *Q* and the EFG, a traceless tensor. The latter has a principal component and an asymmetry parameter *η*. The EFG is a ground-state property that can be determined experimentally (measuring NQI), provided the nuclear quadrupole moment is known. In early studies, the EFG was interpreted as a simple point charge model with additional corrections (Sternheimer factor^{212}). However, later it was shown that the EFG can be calculated with DFT, as was illustrated^{213} for LiN_{3}. The EFG is sensitive to the asymmetric charge distribution around a given nucleus and thus is a local probe, which often helps in clarifying the local atomic arrangement. The reader can find a short description of several EFG calculations for selected examples in Chap. 6.4 of Ref. 8, and here, we describe some important aspects and examples below.

Computationally, it is important to treat both valence and semi-core states very accurately because, due to the 1/*r*^{3} factor in the EFG expression, even small asphericities near the nucleus lead to important contributions. This was demonstrated for the first time for TiO_{2} in the rutile structure, where the radial functions of the fully occupied Ti-3*p* semi-core orbitals are slightly different for *p*_{x}, *p*_{y}, and *p*_{z} states and thus contribute significantly to the EFG.^{28}

The mapping of the two Cu-EFGs in the high-temperature superconductor YBa_{2}Cu_{3}O_{7} to the “plane” and “chain” Cu sites provides insight into which Cu atom is responsible for superconductivity, and the analysis of the EFG on all other sites helped us to interpret chemical bonding.^{214} It should be mentioned that the EFG at the (superconducting) Cu-plane site comes out quite inaccurate and GGA+*U* calculations are necessary, leading to a redistribution of 0.07 e^{−} from a $3dx2\u2212y2$ orbital to a $3dz2$ orbital and an EFG in agreement with the experiment.^{92}

Next, we briefly discuss the study on 16 fluoroaluminates from Ref. 215, for which experimental NMR data were compared to DFT results for the ^{27}Al EFG. In all of these compounds, the aluminum atoms occur in Al$F63\u2212$ octahedra, which have a wide diversity of connectivity and distortions. One of these structures is shown in the inset of Fig. 13 for Ba_{3}Al_{2}F_{12}. These fluoroaluminates illustrate how sensitive the EFG is to the exact position of neighboring atoms. A perfect octahedral symmetry would have a vanishing EFG, but small distortions cause an EFG. The calculations were first done using the less accurate powder diffraction data for the atomic positions [Fig. 13(a)], and the correlation between experiment and theory is not very good. Then, a DFT structure optimization was done leading to an almost perfect correlation between experimental and theoretical EFGs [Fig. 13(b)]. This structure optimization has an even more pronounced effect on the asymmetry parameter, as shown in Ref. 215. We should mention that there can also be a sensitivity to DFT functionals for EFG calculations as described in Refs. 66 and 80.

Last but not least, we demonstrate how one can determine the nuclear quadrupole moment *Q* from a combination of theoretical EFG calculations and experimental measurements of the quadrupole splitting, which is proportional to the product of EFG and *Q*. From the slope of a linear regression for the EFG of several Fe compounds, we could deduce the nuclear quadrupole moment *Q* of ^{57}Fe, the most important Mössbauer isotope. It was found to be about twice as large (*Q* = 0.16 b) as the previous literature value (*Q* = 0.082 b), suggesting to revise this nuclear property using electronic structure calculations.^{216}

The magnetic hyperfine field (HFF) at a nucleus originates from a Zeeman interaction between the magnetic moment *I* of this nucleus and the magnetic field at this site produced by the spin-polarized electrons in a ferromagnet. The HFF has contributions from the Fermi-contact term (the spin density at the nucleus), an orbital, and a spin dipolar contribution. Here, we skip the details but mention that an all-electron treatment is crucial, especially for the Fermi-contact term, since one needs accurate values of the spin-density close to the nucleus, for which the basis set used in WIEN2k is extremely useful. We recalculate the densities of all electrons, including the core, in each cycle of the SCF scheme, in contrast to the frozen-core approximation. The resulting core polarization can often be the main contribution to the HFF. The high quality of such calculations was demonstrated for the double perovskite BaFe_{2}O_{5}, for which the DFT calculations of EFG and HFF provided new insights (for details, see Ref. 207).

### G. NMR chemical and Knight shifts

The NMR shielding $\sigma \u20e1$ tensor is defined as a constant between an induced magnetic field **B**_{ind} at the nucleus at site **R** and the external uniform field **B**_{ext},

Its value is usually in the range of ppm (part per million). Since the magnetic field cannot be controlled with such a precision, the tensor is measured only with respect to some reference,

and often only its isotropic part $\sigma (R)=tr[\sigma \u20e1(R)]$ is known.

The external magnetic field is a relatively weak perturbation compared to the typical energy scale of the electronic structure; therefore, its effect on the spin and orbit of an electron can be separated in the theoretical calculations. Here, we only outline the main features that are specific to the APW+lo method and the WIEN2k code and are vital for achieving high accuracy of the computed NMR tensor. A more in-depth discussion can be found in the original publications. The formalism for computing the orbital part of the shielding (chemical shift) has been described in Refs. 217 and 218, and the spin part (Knight shift) of the response has been described in Ref. 219. The formalism has been applied for computing shielding in various insulating^{78,220,221} and metallic systems.^{222–224}

As will be explained below, this approach can reach the basis set limit for NMR calculations, and benchmark calculations for small molecular systems have proven that standard quantum chemistry methods can only reach this precision with very large uncontracted quintuple-zeta basis sets and only for light atoms.^{225}

#### 1. Orbital component

The orbital part of the shielding, i.e., the orbital component of the induced field **B**_{ind}, is obtained directly from the Biot–Savart law (in atomic units, with *c* as speed of light),

where **j**(**r**) is the induced orbital current, evaluated as expectation value of the current operator

WIEN2k separates the calculation of valence and core states. The core state contribution to the induced current is computed using the spherically symmetric core density only,

The method for computing the valence contribution to **j**(**r**) is based on a linear response approach^{226–228} originally developed by Mauri, Pfrommer, and Louie.^{226} The expression for the induced current involves only the first-order terms with respect to the external field **B**_{ext},

where $\psi o(0)$ is an unperturbed KS occupied orbital, **J**^{0}(**r**) is the paramagnetic part of the current operator [the first term in Eq. (33)], and *J*^{1}(**r**) is the diamagnetic component of the current operator [the second term in Eq. (33)]. $\psi o(1)$ is the first-order perturbation of $\psi o(0)$, given by the standard formula involving a Green’s function,

and involves a sum over empty states (first term) as well as a sum over core states (second term) because the core states have been calculated before [Eq. (34)] and Eq. (36) is only correct if all states of a system are included. Note that core wave functions appear in Eq. (36) as if they were unoccupied states. *H*^{(1)} is the perturbation due to the external magnetic field in symmetric gauge,

As discussed in Sec. II A, the basis functions of the APW+lo method are highly tuned to describe the occupied Bloch states everywhere in the unit cell (in particular, also close to the nucleus), but they are not a complete basis set. Therefore, depending on the perturbation of the Hamiltonian, they may not be well suited to expand the perturbations of wave functions. In fact, magnetic fields and NMR are such cases where the perturbed wave functions, in particular, near the nucleus, are very different, which means that the sum in Eq. (36) cannot be converged with the available set of orbitals. In order to remedy this issue, we had to enhance the original WIEN2k basis set. First, the standard set of local orbitals is extended significantly, both in the number of LOs per *ℓ* (typically to 5–8 LOs) and also in *ℓ* (typically to *ℓ* + 1), where *ℓ* refers to the maximal “chemical” *ℓ* of this atom.^{217} These extra local orbitals are referred to as NMR-LOs, and the energy at which the radial functions of those NMR-LOs are computed is chosen such that each of the NMR-LO radial functions has zero value at the sphere boundary and the number of nodes inside the sphere of subsequent LOs increases by one corresponding to the next principal quantum number.^{217} However, these NMR-LOs alone cannot completely improve the variational flexibility close to the nuclei.

The perturbation of the Hamiltonian due to the external magnetic field is proportional to a product of position and momentum operators. As a result, the perturbation of the radial wave function *u*_{tℓ} contains components proportional to *u*_{tℓ±1} and their radial derivative $r\u2202\u2202rut\u2113\xb11$. A direct introduction of basis functions based on $r\u2202\u2202ru$ is not convenient within the APW formalism because such functions are not eigenstates of the radial Schrödinger equation. Therefore, we have proposed to add the desired term directly to the Green’s function present in the formula for the first-order perturbation of the valence state wave function.^{218} It is referred to as the “$\u2202\u2202ru$ correction” (DUC).

The convergence test with respect to the number of NMR-LO and DUC corrections is presented in Fig. 14. The induced current and shielding calculated within our linear response formalism is compared to the exact value for an isolated Ar atom. The current and shielding for a spherically symmetric atom can be computed exactly using only its density and the same formula [Eq. (34)] as for the core states (diamagnetic current).

Both DUC and several additional NMR-LOs are needed to reproduce the shape of the exact diamagnetic current in a region within 0.5 bohr from the nucleus. It appears in Fig. 14(a) that any error in the representation of the current in this region results in substantial errors of the computed shielding values [Fig. 14(b)]. This method can therefore reach the basis set limit.

The all-electron nature and the modular concept of WIEN2k makes it very easy to perform NMR calculations with wave functions including SO interactions for heavy nuclei or using different DFT approximations, including DFT+*U* (Sec. II B 3), but, in particular, also hybrid functionals (Sec. II B 2). Thus, we can compare the theoretical shielding with the experimental chemical shifts for several different compounds, and from the correlation and slope of the linear regression curve, the quality of a particular approximation to the XC effects can be evaluated. Ideally, the slope of this linear regression line should be −1, but typically for ionic compounds,^{78} the slope with PBE is too large (−1.2), while with the hybrid functional YS-PBE0,^{66} it is too small (−0.8). This is quite in contrast to organic molecules, where hybrid functionals perform much better than the GGAs. Surprisingly, the BJ potential^{106} performs quite well for ionic oxides or halides and yields slopes close to −1.

Most importantly, theory should not only reproduce measured experimental values but also provide insights. WIEN2k allows us to analyze and identify the contributions to the NMR shielding, and for instance, for the F-shielding in the alkali-fluoride series (Fig. 15), the following observations can be made:^{221} (i) Basically, all contributions to the F-shielding come from a region inside the F atomic sphere (the current in the rest of the unit cell contributes negligibly). (ii) The large (diamagnetic) shielding comes from the constant F-1*s* core contribution. (iii) The contributions from the F-2*s* bands are still diamagnetic, but much smaller and again constant within the series. (iv) The diamagnetic metal-*p* semi-core contributions (Na-2*p* to Cs-5*p*) increase within the series. This can be explained by the fact that for heavier elements the metal *p*-states increase in energy and come closer to the F-2*p* band. This leads to an increased bonding (in the metal-*d* band) and anti-bonding (in the F-*p* band) metal-*p*–F-*p* interaction giving slightly different (non-canceling) diamagnetic and paramagnetic contributions. (v) The trend of *σ* within the series comes mainly from the F-2*p* valence band. The most important ingredient, which determines the size of the (mostly paramagnetic) F-2*p* contribution, is the position of the unoccupied metal-*d* band. The perturbation due to the magnetic field couples the occupied F-*p* states with Δ*ℓ* ± 1 to unoccupied *d* states, and due to the energy denominator in Eq. (36), the Cs-*d* states give the largest contribution because they are the closest in energy to the valence bands. We can even artificially apply a (large) *U* value to the empty Cs-5*d* states shifting them further up. In this way, we do not alter the occupied states but still can increase the F shielding in CsF to reproduce the LiF or NaF shifts.

By a similar analysis, we could explain why the ^{33}S magnetic shielding decreases with the metal nuclear charge *Z* in the ionic alkali/alkali-earth sulfides but increases in TM sulfides.^{220}

#### 2. Spin component

In order to compute the induced spin density and spin part of the NMR shielding tensor, we use a direct approach,^{219} instead of applying the linear response formalism proposed, for instance, in Ref. 229. This is possible because the interaction of the spin with the external magnetic field does not break the periodicity. Therefore, we perform self-consistent spin polarized calculations with a finite external magnetic field **B**_{ext} acting on the electron spin only. The interaction with **B**_{ext} is cast into a spin-dependent potential leading to a spin splitting of eigenstates and a finite spin magnetization. The induced magnetic field at a given nucleus is computed using an expression for the magnetic hyperfine field,^{230}

The first term (**B**_{c}) is the Fermi contact term, where **m**_{av} is the average of the spin density in a region near the nucleus with a diameter equal to the Thomson radius. The second term (**B**_{sd}) captures the spin-dipolar contribution to the hyperfine field, where Φ_{1} is the large component of the wave function, *S* is the reciprocal relativistic mass enhancement, and *μ* is the magnetic moment operator of the electron. **B**_{sd} comes almost entirely from within the atomic sphere, which simplifies its calculation. The spin contribution (*σ*_{s}), i.e., the Knight shift to the shielding, is therefore given by two terms,

In order to obtain a sizable response and evaluate the NMR shielding with a numerical precision at the level of 1 ppm, we apply in our calculations an external magnetic field of 100 T, which induces a spin-splitting of approximately 1 mRy. These small changes require an extremely fine **k**-point sampling (for fcc Al, 10^{6}**k**-points are needed), and this must always be carefully checked.

Further details and results of our approach can be found in Refs. 219, 222, and 223, but, here, we summarize the main findings: (i) The previously accepted point of view, namely, that Knight shifts are proportional to the partial *s*-DOS at the Fermi energy and the orbital contribution *σ*_{orb} is identical to that in the (ionic) reference compound, is only true for simple *sp* metals. (ii) In TM *d*-elements or metallic compounds, the orbital part *σ*_{orb} can be as important as the spin part *σ*_{s}. (iii) The *s*-DOS at *E*_{F} is always important, but an induced TM-*d* magnetic moment (proportional to the partial *d*-DOS at *E*_{F}) polarizes the core states in the opposite direction so that the valence and core polarizations can partly cancel. (iv) The dipolar contribution *σ*_{sd} is usually small, but, in anisotropic materials, a large dominance of one particular orbital at *E*_{F} can eventually lead to a very large contribution. We have found this in BaGa_{2}, where the *p*_{z}-DOS has a large and sharp peak at *E*_{F} leading to an aspherical magnetization density and a large dipolar contribution.^{222}

### H. Wannier functions and Berry phases

A single particle state of a periodic system is conventionally represented as a Bloch state *ψ*_{nk}(**r**), which is labeled by a band index *n* and a vector **k** inside the first BZ. It satisfies Bloch’s theorem

where *u*_{nk}(**r**) = *u*_{nk}(**r** + **R**) is a lattice periodic function and **R** is a Bravais lattice vector. Alternatively, one can define Wannier functions (WFs) *w*_{nR}(**r**) in unit cell **R** for a set of *J* bands as^{231,232}

where *V* is the unit cell volume and *U*^{k} are unitary transformation matrices that mix Bloch states at a given **k**. Because of the arbitrary phase of *ψ*_{nk}(**r**), the resulting WFs are usually not localized. Maximally localized WFs can be obtained by choosing the *U*^{k} such that the spread Ω is minimized,

This involves overlap integrals $Mmnk,b=\u27e8umk|unk+b\u27e9$ between the periodic part of the wave functions on a uniform grid in the BZ.

The maximally localized WFs are calculated by Wannier90,^{231} and the transformation matrices *U*^{k} are provided by the *wien2wannier*^{232} module of WIEN2k. The resulting WF can be used for various tasks. They can be visualized and are useful for the interpretation of chemical bonding to generate tight-binding models or interpolations to very fine **k**-meshes for properties that require fine **k**-meshes. Such properties can be transport, anomalous Hall conductivity, linear and non-linear optics, Berry curvatures and topology, or electron–phonon interactions. In particular, for the improved description of electronic correlations via the DMFT approximation,^{233,234} WFs provide a realistic starting point.

In connection with *wien2wannier*, there is also the *BerryPi*^{235} module in WIEN2k, which calculates the polarization of solids using the Berry phase approach.^{236} *BerryPi* can calculate the change of polarization Δ*P* in response to an external perturbation to study ferroelectricity, the Born effective charges, pyroelectric coefficients, or the piezoelectric tensor. In addition, one can define Wilson loops and calculate Chern numbers to study topological properties.^{237}

### I. External programs

#### 1. Thermoelectric transport coefficients

WIEN2k is interfaced with the BoltzTraP2 program^{165} for calculating transport coefficients within the relaxation time approximation. The calculation is based on evaluating the transport distribution function,

using a fine mesh in **k**-space. To obtain the group velocities, **v**_{nk}, and also quasi-particle energies on a fine mesh or effective masses, BoltzTrap2 relies on interpolating the eigenvalues, *ε*_{nk}, and possibly also the relaxation times, *τ*_{nk}, using Fourier sums.

The interpolation is performed so that the calculated eigenvalue energies are reproduced exactly. Within KS theory, the multiplicative potential [Eq. (10)] means that it is often computationally very efficient to calculate a fine mesh of eigenvalues, which can then be interpolated further to evaluate Eq. (43). This argument no longer holds when hybrid functionals (Sec. II B 2), or the *GW* method (Sec. III I 4), are used to obtain the band structure. Therefore, BoltzTraP2 can also include the **k**-space derivatives in the interpolation scheme, as can be obtained from the momentum matrix elements introduced in Sec. III D. This allows a more efficient interpolation and the use of a coarser **k**-mesh in the actual DFT calculation.

Once the transport distribution has been obtained, the temperature and chemical potential dependent transport coefficients

can be obtained by a simple numerical integration. Figure 16 shows the highest valence bands of TiCoSb calculated using the interpolation scheme of BoltzTraP2. Compared to Fig. 7, a very good agreement with the band structure obtained using DFT is found. Since the Fourier interpolation is done bandwise (marked by color in Fig. 16), the band crossing along the Γ − *X* direction is not reproduced. However, the error is hardly visible by eye and, in accordance with the intention of the original algorithm,^{238,239} no Fourier ripples are seen, which means that errors in the derivatives are isolated to the points where the crossing occurs. The calculated thermoelectric power factor using a constant relaxation time is also shown. The power factor peaks at a high value close to the band edge, where a steep transport distribution can be expected. The high power factor can be attributed to the complex constant energy surface shown in the inset, which is typical for half-Heusler compounds with favorable *p*-type thermoelectric performance.^{240}

BoltzTraP2 is written mainly in PYTHON3 and can be used as PYTHON library. The interpolation is handled by a single PYTHON call, fitde3D. Once the Fourier coefficients have been obtained, the interpolation of the bands onto the direction needed for plotting the band structure (getBands), or the fine mesh needed for obtaining the transport distribution (getBTPbands), can be performed. The use of BoltzTraP2 as a library gives a reproducible and flexible work flow. The analysis associated with Fig. 16 can thus be performed with a single PYTHON script, which is included in the most recent distribution.

#### 2. Non-collinear magnetism

WIEN2k can only compute the electronic structure of magnetic systems with a collinear spin arrangement. For performance benefits, WIEN2k assumes that the spin density matrix along some direction is diagonal for each eigenstate. When SO interactions are taken into account, this condition is sometimes not satisfied. In such cases, the off-diagonal terms of the spin density matrix are simply ignored during the SCF procedure and only the *z* component of the spin density is properly converged. Generally, this is not a big issue for cases with collinear spin arrangement and is, in fact, common practice. However, such an approximation cannot be applied for systems with a non-collinear spin arrangement for which the full spin density matrix has to be considered. For that purpose, we have written a non-collinear spin version of WIEN2k, referred to as WIENNCM.

Our implementation is based on a mixed spinor basis set approach.^{241,242} In the interstitial region, the basis functions are pure spinors given in a *global* (*g*) spin coordinate frame,

where $\chi \u2191g=10$ and $\chi \u2193g=01$. Inside the atomic spheres, the basis functions are a combination of both up and down spinors, which are set in a *local* spin coordinate frame with a quantization axis pointing along the direction of the average magnetization of the given atomic sphere. This direction does not have to be the same for each sphere, and the basis functions are [for a LAPW basis set, where we drop the (*r*, *E*)-dependency in the radial *u* functions]

where $\chi \sigma t$ is a spinor given in a local coordinate frame. This choice of the spin coordinate frame allows us to use spin-polarized radial functions with the quantization axis along the direction of the average magnetization. The matching of the $\phi K+k,\sigma LAPW$ basis to the plane waves at *r* = *R*_{MT} is done for up and down plane waves in a *global* spin coordinate frame,

Thus, the *A*_{tℓm} and *B*_{tℓm} depend on *global σ* and *local σ*^{t} spin indices. Multiplying both sides of Eq. (47) by $\chi \sigma tg*$, integrating over the spin variable, and comparing to the standard collinear expression, the $At\u2113mK+k,\sigma \sigma t$ and $Bt\u2113mK+k,\sigma \sigma t$ are given by

where $At\u2113mK+k,\sigma t$ and $Bt\u2113mK+k,\sigma t$ are the collinear matching coefficients calculated for “local” spins. We have extended the original formalism^{241,243} beyond the atomic moment approximation, and the code processes the spin density matrices without any approximations also inside the atomic spheres. Note that the adaptation of the above equations for LOs or the APW+lo basis set is trivial.

A few more details of the implementation are the following: In WIENNCM, the default scalar relativistic Hamiltonian is extended with SO interactions and this doubles the size of the Hamiltonian. It is also possible to use the DFT+*U* method for correlated systems. The setup and execution of WIENNCM is as in WIEN2k; however, the atomic structure has to be augmented with definitions of the magnetic structure, which requires to define the orientation of the average magnetic moment for each atom. This is to some extent automatized in such a way that the user only needs to provide the orientation for the “magnetic atoms” (e.g., only U atoms in UO_{2}). The orientation for “non-magnetic” atoms (O in UO_{2}) is generated automatically. WIENNCM makes use of the spin symmetry, which simplifies the calculations.

If one wants to calculate spin spirals, one can either handle this by (big) supercells or, more efficiently, by using the generalized Bloch theorem^{244} (neglecting SO interactions) so that these calculations can be done in the small crystallographic cell.

#### 3. Electron–hole interactions

The state-of-the-art method to include electron–hole interactions is based on the solution of the equation of motion of the two-particle Green’s function, known as the BSE.^{246–248} The WIENBSE code allows the calculation of the optical response taking into account excitonic effects. The BSE is solved in an approximate manner by representing them in the form of an effective eigenvalue problem with the so-called BSE Hamiltonian,^{249,250}

where the sum runs over occupied (they form the hole upon excitation) valence (*v*) and unoccupied (they become occupied upon excitation) conduction (*c*) bands and **k** points (supplied by a DFT calculation performed with WIEN2k) and the electron–hole Hamiltonian consists of three terms, *H*^{e} = *H*^{diag} + *H*^{dir} + *H*^{x}, which are given by [**x** = (**r**, *σ*)],

The *H*^{diag} term depends only on the eigenvalues and accounts for the response in the non-interacting limit. The exchange *H*^{x} and the direct *H*^{dir} Coulomb terms couple the electron–hole pairs.^{251} The direct term, in principle, depends on the dynamically screened Coulomb electron–hole interaction and on the excitation energy (*E*^{λ}), but here we apply the usual approximation and only account for non-local but static screening. The coupling coefficients $Avck\lambda $ define the electron–hole correlation function and enter the expression for the imaginary part of the dielectric function,

The BSE approach is very successful in dealing with excitons and the response in the optical regime. The excitonic effects are sometimes small (e.g., in small gap semiconductors), but still important, with binding energies of some tens of meV, sometimes large (in particular, in insulators) with binding energies of a couple of eV such that the resulting optical functions have hardly any resemblance with independent-particle results. The WIENBSE implementation has been used in several works for various semiconductors,^{252–255} but together with supercell calculations and *GW*- or TB-mBJ-based single particle states, it can also successfully describe F centers in wide bandgap alkali halides.^{76,256} It has also been extended to include a fully relativistic treatment of core states to study XANES at L_{2,3} edges of 3*d* TM compounds.^{196} In particular, for formally 3*d*^{0} compounds such as TiO_{2}, the correct L_{2}/L_{3} branching ratio can be obtained due to interference effects of the 2*p*_{1/2} and 2*p*_{3/2} core states. Even fine differences in line shape between the rutile and anatase modifications are in agreement with the experiment demonstrating the power of this approach.

#### 4. *GW* approximation for quasi-particle calculations

The *GW* method^{102,103} is considered as the state-of-the-art method for a first-principles description of the electronic quasi-particle band structure in solids. The name of this many-body perturbation theory based method comes from the interacting Green’s function *G*(**r**, **r**′, *ω*), whose poles in the complex frequency plane determine the single-particle excitation energies and *W*, the dynamically screened Coulomb potential, which is obtained using the polarizability in the random-phase approximation. The central quantity, namely, the self-energy Σ(**r**, **r**′, *ω*) from which the first-order correction to the KS eigenvalues can be calculated, is obtained from

where *η* is an infinitesimal positive number.

The GAP2 code^{257,258} is the second version of an all-electron *GW* implementation based on the WIEN2k code. As WIEN2k, this highly parallelized code can use an arbitrary number of HDLOs,^{259,260} which ensures that one can obtain fully converged *GW* results even in difficult cases such as ZnO. Since it is based on an all-electron method and can use orbitals from DFT+*U*, a major advantage of this code is the possibility to explore *d*- and *f*-electron systems in a meaningful way.^{260} We note that for materials that are traditionally categorized as strongly correlated (e.g., the oxides), the standard semilocal functionals usually fail (see Sec. II B 3). For such systems, using the DFT+*U* (or hybrid) orbitals as input for one-shot, *G*_{0}*W*_{0} performs much better.^{261,262} In addition, the code allows for partially self-consistent *GW*_{0} calculations by updating *G* with the modified eigenvalues.

#### 5. Dynamical mean field theory

Many systems have valence electrons in orbitals, which are quite extended in space and overlap strongly with their neighbors. This leads usually to strong bonding-antibonding effects, large bandwidth *W*, and dominant contributions from the kinetic energy. If, in addition, the bare Coulomb interaction *U* between two electrons on the same site is strongly screened (maybe due to metallic character), such systems are usually quite well described by standard DFT approximations. However, as already mentioned in Sec. II B 3, 3*d* or 4*f* electrons may be more localized so that they participate much less in bonding and have a more atomic-like character. In these cases, the bandwidth *W* (metal) and Coulomb interaction *U* (Mott insulator) compete as does the crystal-field splitting (low-spin state) and Hund’s rule coupling *J* (high-spin state). We talk about “correlated electrons” and standard semilocal DFT approximations may fail badly in certain cases. The DFT+*U* and hybrid methods discussed in Secs. II B 3 and II B 2, respectively, can be much more accurate depending on the investigated property. However, the state-of-the-art approach for these correlated electron systems is DMFT,^{233,234} which is based on the Hubbard model on a lattice, described by the following Hamiltonian:

The first term (kinetic energy) describes the hopping *t*_{ij} of an electron with spin *σ* from lattice site *j* to lattice site *i*, while the second term (potential energy) accounts for the strong Coulomb repulsion *U* between two electrons at the same lattice site *i*, which is responsible for the correlations in the system. Within DMFT, the complicated lattice problem is replaced by a single-site impurity model, which hybridizes with a self-consistently determined non-interacting bath.

The basic idea of DFT+DMFT is to divide the electrons in the system into two groups: weakly correlated electrons (i.e., electrons in *s*- and *p*-orbitals) that are well described by an approximate DFT functional and strongly correlated electrons (i.e., *d*- and *f*-electrons) well described using DMFT. The model Hamiltonian for DFT+DMFT is then constructed for the correlated subset with a suitable basis usually defined by Wannier functions as discussed in Sec. III H. The full-orbital KS Hamiltonian *H*^{KS} is then projected onto the correlated subspace of the partially filled orbitals and many-body terms *H*^{U} as well as a double counting correction *H*^{DC} are added.

Several such DFT+DMFT codes^{264–269} use WIEN2k as basis, and numerous applications have proven the power of the combined DFT+DMFT approach. The DMFT approach is often applied to explain optical, XAS, or ARPES spectra (see, e.g., Ref. 270 for V_{2}O_{3}) and can also estimate the intensities of the spectral features due to lifetime broadening. Recently, free energies and forces have also been made available,^{269,271} which allows us to study structural (e.g., *α*-*γ* Ce), magnetic (e.g., bcc-fcc Fe), or metal–insulator (e.g., NdNiO_{3}) phase transitions with temperature.^{272}

#### 6. Phonons

WIEN2k does not have its own program to calculate phonon spectra, but it is interfaced with at least three different external phonon programs: phonopy,^{159,160} Phonon,^{161,162} and PHON.^{163,164} They all employ the finite-displacement method^{162} and a harmonic approximation. First, the crystal structure must be very well relaxed so that residual forces on all atoms are very small. For this structure, the phonon codes suggest a systematic set of displacements (depending on symmetry) in a chosen supercell and WIEN2k calculates the forces for these displacements. These forces are then used by the phonon codes to calculate harmonic force constants and setup and diagonalize the dynamical matrices at the desired **k**-points, which yields the phonon frequencies and their eigenmodes. For Γ-phonons (infrared or Raman spectroscopy), a supercell is not required; otherwise, the supercell should be large enough (typically more than 50 atoms/cell) such that the force constants between atoms separated by more than the supercell size become negligible. In ionic solids, the frequency splitting of the optical vibrational modes parallel and perpendicular to the electric field (the so-called LO-TO splitting) in the small wave-vector limit can be obtained when additionally the Born effective charges (see Sec. III H) are supplied to the phonon programs. An alternative approach for phonon calculations, namely, density functional perturbation theory,^{273} is not implemented.

Phonon calculations can be used to investigate various properties of materials. Frequencies at Γ are analyzed according to their symmetry and can be compared to IR and Raman spectra (see, e.g., Ref. 274 for application on PbFBr_{1−x}I_{x}). The full phonon band structure and the corresponding phonon-DOS can be calculated and integrated, yielding thermodynamic quantities such as the mean square thermal displacements, the specific heat, entropy, or free energy, which together with the quasi-harmonic approximation can be used to determine thermal expansion. Imaginary frequencies at certain **k**-points indicate an instability of this phase (at 0 K) and occur, for instance, in all cubic perovskites.^{275} Freezing in one of the corresponding eigenmodes with a certain amplitude and subsequent structure relaxation yields a more stable phase in a particular space group of lower symmetry and can be used to detect and analyze second-order phase transitions in various materials.^{276–278}

#### 7. Band structure unfolding

The standard way to model defects, vacancies, alloys (disorder), or surfaces is by means of a supercell approach. While sufficiently large supercells can handle the energetics of these problems quite well, it is fairly difficult to describe the effect of the perturbation on the bulk electronic structure. A band structure from a supercell calculation usually looks like a bunch of spaghetti and is very difficult to interpret. It is therefore highly desirable to display the band structure in the original BZ of the bulk material and indicate the original Bloch character as much as possible. This unfolding can be done conveniently by the fold2Bloch utility.^{166,279}

In an *h* × *k* × *l* supercell band structure, each **k**-point transforms into *h* × *k* × *l* **k**-points of the bulk BZ. fold2Bloch calculates the corresponding spectral weights *w*_{n}(**k**), which amounts to the Bloch character **k** of the *n*th eigenvalue *ε*_{n}, subject to the normalization that *∑*_{k}*w*_{n}(**k**) = 1, and displays *w*_{n}(**k**) in the unfolded band structure so that one can distinguish between regular bulk and defect states.

#### 8. de Haas–van Alphen effect

The knowledge of the Fermi surface (FS) of a metallic compound is important to understand its electronic and transport properties. de Haas–van Alphen (dHvA) measurements of the quantum oscillatory magnetization contain detailed information about the FS and report frequencies that are proportional to extremal FS cross sections perpendicular to the magnetic field direction.^{280} However, it is not so easy to reconstruct from the measured data the actual multi-band FS.

On the other hand, FS calculations in WIEN2k are rather trivial and can be well presented using XCrysDen.^{152} For a quantitative comparison with experiment, it is highly desirable to calculate the corresponding dHvA frequencies as well as the corresponding effective masses. This can be done conveniently using the SKEAF (Supercell **k**-space Extremal Area Finder) tool.^{167,280} An application can be found for the determination of transport properties in the HoBi^{281} compound.

## IV. DISCUSSION AND SUMMARY

In this paper, we have reviewed the widely used WIEN2k code, which is based on the APW+lo method to solve the KS equations of DFT. Particular emphasis was placed on the various types of basis functions that are available. One of the strengths of the WIEN2k code is the possibility to use an arbitrary number of local orbitals, which allows an accurate calculation of all states, from the low-lying occupied semi-core to the high-lying unoccupied states. For the latter, the use of local orbitals is crucial in order to get converged results for a property that is calculated using perturbation theory such as the NMR chemical shift.

Various types of approximations for the treatment of XC effects are mentioned, and the large number of functionals that are available constitutes another strength of the WIEN2k code. They range from the semilocal approximations (all the existing ones can be used via the Libxc library) to the more sophisticated approximations such as DFT+*U*, the hybrids, or functionals specifically developed for van der Waals interactions. In particular, the popular Tran–Blaha mBJ potential is implemented in WIEN2k, which is a cheap but accurate method to calculate bandgaps in solids. Since the WIEN2k code is a full-potential all-electron code, it is, in principle, able to provide the exact result within a chosen XC approximation. Thus, WIEN2k is ideally suited for the testing of XC functionals.

The structure of the WIEN2k code, as well as the workflow of programs in a SCF calculation, has been described. WIEN2k has also a user-friendly interface that is especially useful for beginners. In principle, an APW-based method needs many specific input parameters (various PW and *LM* cutoffs and case specific *LM* expansions and specific *E*-parameters for each atom and angular momentum), but one of the great strengths of our implementation is that for all these parameters very good defaults are provided automatically to the user so that WIEN2k can also be mastered by non-experts. Of course, an all-electron code cannot be as fast as PW pseudopotential codes, where the extensive use of FFTs speeds up the calculations. Despite this, an APW-based method can be fairly efficient when large atomic spheres can be used because of the relatively fast PW convergence in such cases. It also has a fast and robust method to solve the SCF problem, including a simultaneous optimization of the atomic positions. WIEN2k is a very efficient implementation of the APW+lo method from the computational point of view. The code is highly optimized and uses whenever possible efficient numerical libraries (BLAS, LAPACK, and ELPA). It has three different parallelization schemes, which allows us to run the code efficiently on a laptop as well as on a huge high performance computing cluster.

WIEN2k can calculate a large number of different properties. Besides the basic quantities such as the optimized atomic structure, cohesive energy, electronic band structure, or magnetism, numerous more specialized properties are available and can be readily calculated. Among them, those whose corresponding programs or modules are part of the WIEN2k code are, for instance, the optical properties, electric polarization, electric-field gradients, NMR chemical and Knight shifts, or magnetic hyperfine fields. In particular, for the latter two quantities, an all-electron method is mandatory. We also described programs that are not part of the WIEN2k distribution but are compatible with it. This includes, for instance, WIENNCM (non-collinear magnetism), WIENBSE (electron–hole interactions), BoltzTraP2 (thermoelectric transport coefficients), GAP2 (*GW*), or the various programs that calculate phonons.

The main advantage of working with an all-electron code such as WIEN2k is the possibility to implement methods for calculating properties exactly. However, implementations within the APW+lo method are not always straightforward, since the dual basis-set representation, atomic-like functions inside the atomic spheres, and plane waves between the atoms may lead to complicated equations. In addition, the discontinuity of the derivatives of the basis functions at the sphere boundary may require a careful treatment. DFT codes using a basis set consisting only of plane waves or only of localized basis functions (e.g., Gaussian) lead, in principle, to easier implementations. However, once a method has been implemented in the APW+lo method, it can then be applied to any element of the periodic table, from hydrogen to the actinides, without any restriction, which is the great power of the APW+lo method.

## ACKNOWLEDGMENTS

We would like to thank the current and former colleagues of the WIEN2k group and external collaborators who have contributed to the development of the WIEN2k code (see our website^{282} or the WIEN2k user’s guide^{283}). We also thank all the users keeping the WIEN2k mailing^{284} list alive. L.D.M. acknowledges support from the National Science Foundation, USA, under Grant No. DMR-1507101. P.B. acknowledges support from the Austrian Science Foundation (FWF) for Project W1243 (Solids4Fun).