GPAW: An open Python package for electronic-structure calculations

We review the GPAW open-source Python package for electronic structure calculations. GPAW is based on the projector-augmented wave method and can solve the self-consistent density functional theory (DFT) equations using three different wave-function representations, namely real-space grids, plane waves, and numerical atomic orbitals. The three representations are complementary and mutually independent and can be connected by transformations via the real-space grid. This multi-basis feature renders GPAW highly versatile and unique among similar codes. By virtue of its modular structure, the GPAW code constitutes an ideal platform for implementation of new features and methodologies. Moreover, it is well integrated with the Atomic Simulation Environment (ASE), providing a flexible and dynamic user interface. In addition to ground-state DFT calculations, GPAW supports many-body GW band structures, optical excitations from the Bethe-Salpeter equation (BSE), variational calculations of excited states in molecules and solids via direct optimization, and real-time propagation of the Kohn-Sham equations within time-dependent DFT. A range of more advanced methods to describe magnetic excitations and non-collinear magnetism in solids are also now available. In addition, GPAW can calculate non-linear optical tensors of solids, charged crystal point defects, and much more. Recently, GPU acceleration has been achieved with only minor modifications of the GPAW code, thanks to the CuPy library. We end the review with an outlook describing some future plans for GPAW.

The electronic-structure (ES) problem, i.e. the solution to the time-independent Schrödinger equation for a collection of electrons and atomic nuclei, forms the starting point for the quantum-mechanical treatment of matter. Indeed, all chemical and physical properties of any substance (solid, molecule, surface, etc.) can in principle be obtained from the energies and wave functions that constitute the solution. A pioneering step towards solving the many-body ES problem was the formulation and formal proof of density functional theory (DFT) by Hohenberg and Kohn in 1964 [1], and a practical scheme for its solution by Kohn and Sham in 1965 [2]. Today, most codes solving the ES problem from first principles are based on DFT. Such codes are extremely powerful and allow one to determine the atomic structure of solids and molecules containing hundreds of atoms with a relative error below 1% [3][4][5]. Once the atomic structure of the compound has been solved, its properties (electronic, magnetic, optical, topological, etc.) can in principle be determined. The evaluation of properties often involves theories beyond the formal DFT framework to account for effects such as temperature and lattice vibrations [6,7], many-body interactions in excited states [8,9], or time dependence [10,11]. As such, first-principles atomistic calculations often involve two successive phases: the solution of the ground-state ES problem (including ion dynamics) and the subsequent evaluation of physical properties. This review is structured accordingly: Secs. III-IV deal with the first phase, while Secs. V-IX are devoted to the second.
In recent years, the scientific significance of ES codes has shifted from a useful tool to describe and understand matter at the atomic scale to an independent driver of the discovery and development of new materials [12][13][14][15][16]. This change in scope has been fueled by the exponential increase in computer power accompanied by improved numerical algorithms [17,18] as well as the use of workflow management software for high-throughput computations [19][20][21][22] and the adoption of machine-learning techniques to leverage the rapidly growing data generated by ES codes [23][24][25]. In parallel with these capacity-extending developments, continuous progress in the fundamental description of exchange-correlation effects has advanced the predictive power of ES calculations to a level where they rival experiments in terms of accuracy for many important properties [26][27][28][29][30][31].
The GPAW code was originally intended as a Python-based multigrid solver of the basic DFT equations within the projector-augmented wave (PAW) formalism [32]. The name GPAW accordingly was an abbreviation for "grid-based projector-augmented waves". Today, other choices than regular grids for representations of the wave functions exist in GPAW, but the name has stuck. During the years 2005-2010, GPAW evolved into a full-blown DFT package [33] supporting most of the functionality expected from a modern ES code, in addition to a few more specialised features including real-time propagation of wave functions [34] and an alternative basis of numerical atomic orbitals (referred to as the LCAO basis) [35] to supplement the real-space grid. In 2011, a plane-wave (PW) basis set was also implemented. Today, the possibility to use three different types of basis sets, and even to combine them within a single run, remains a unique feature of GPAW, rendering the code very versatile.
The implementation of the PW basis set laid the groundwork for GPAW's linear-response module, which today supports the calculation of linear response functions [36], total energies from the adiabatic-connection fluctuation-dissipation theorem [29,37], the GW self-energy method for quasiparticle band structures [38], the Bethe-Salpeter equation (BSE) for optical excitations [39], and more. The code also supports a wide range of features related to the calculation of magnetism and spin-orbit effects. Examples include spin-spiral calculations using the generalized Bloch theorem [40], external magnetic fields, orbital magnetization, magnetic anisotropy [41], adiabatic magnon dispersions from the magnetic force theorem [42], and dynamic magnetic response from TDDFT [43]. For solids, the k-space Berry phases can be computed directly from the Bloch orbitals and may be used to obtain the spontaneous polarization [44], Born effective charges, piezoelectric response tensors [45] and various indices characterising the band topology [46].
In addition, GPAW can compute the localisation matrices forming the basis for the construction of Wannier functions with e.g. the Atomic Simulation Environment (ASE) [47] or Wannier90 [48]. Electrostatic corrections to the formation energies of charged point defects in insulators are implemented, as are calculations of the hyperfine coupling and zero-field splitting for localised electron spins. GPAW also offers the possibility to perform time-independent, variational calculations of localised electronic excitations, e.g. in molecules or at crystal point defects, using direct orbital-optimisation strategies implemented for all three types of basis sets [49][50][51]. This provides an efficient and robust alternative to traditional "∆SCF" approaches. GPAW can also be used to describe ultrafast electron dynamics within time-dependent density functional theory (TDDFT) with wave functions represented either on a real-space grid [34] or in the LCAO basis [52]. The latter can provide a significant speed-up due to the relatively small size of the basis [53][54][55]. The LCAO representation also forms the basis for the calculation of electron-phonon couplings as well as non-linear optical spectra such as Raman scattering [56] (which can alternatively be obtained in the PW mode as a finite difference of the dielectric tensor), second-harmonic generation [57], and shift currents [58] using higher-order perturbation theory.

II. WHY GPAW?

A. User's perspective
There are dozens of electronic-structure codes available for the interested user. The codes differ in their license (in particular, whether open or proprietary), the underlying programming language (e.g. Fortran, C, Python), their treatment of core electrons (all-electron versus pseudopotentials), the employed representations of the wave functions (plane waves, atom-centered orbitals, real-space grids), and the beyond-DFT features they support. Why should one choose GPAW?
In this section, we describe some of the features that make GPAW interesting from the point of view of a common user who wants to perform electronic-structure calculations. The next section focuses on its possibilities for more advanced users, who perhaps want to modify the code or implement completely new functionalities.
A first point to note is that GPAW is written almost exclusively in Python and is directly integrated with the Atomic Simulation Environment. This integration with ASE makes the setup, control, and analysis of calculations easy and flexible. The programming language is of course a key issue for developers, but the common user also benefits from Python and the ASE/GPAW integration. A typical stand-alone program only offers a fixed (though possibly large) set of tasks that it can perform, while Python scripting allows for a more flexible use of the code. This could for example mean combining several different GPAW calculations in new ways. Another advantage is that "inner parts" of the code like the density or the Kohn-Sham eigenvalues are directly accessible in a structured format within Python for further analysis. It is even possible to "open up the main loop" of GPAW and access, inspect, and also modify key quantities during program execution (see Fig. 1).
As already mentioned in the introduction, GPAW distinguishes itself from other available ES codes by supporting three different ways of representing the wave functions. The most commonly used basis set is plane waves (PW), which is appropriate for small or medium-size systems where high precision is required. Convergence is easily and systematically controlled by tuning the cut-off energy. A large number of advanced features and "beyond-DFT" methods are available in the PW mode. These include the calculation of hybrid functionals, RPA total energies, linear-response TDDFT, and many-body perturbation theory techniques like GW and the Bethe-Salpeter equation. The new GPU implementation also uses the PW mode.
The wave functions can alternatively be represented on real-space grids, which was the original approach in GPAW. The implementation of this so-called finite-difference (FD) mode relies on multi-grid solutions of the Poisson and Kohn-Sham equations. The FD mode allows for more flexible boundary conditions than the PW mode, which is restricted to periodic supercells. The boundary conditions may for example be taken to reflect the charge distribution in the unit cell. Calculations in the FD mode can be systematically converged by lowering the grid spacing, but the approach to full convergence is slower than in the PW mode. The FD mode is particularly well suited for large systems because the wave-function representation allows for large-scale parallelization through real-space decomposition. Furthermore, it is possible to perform time-propagation TDDFT including Ehrenfest dynamics in this mode.
The third representation of the wave functions is a basis of numerical atom-centered orbitals in the linear combination of atomic orbitals (LCAO) mode. The size of the basis set can be varied through inclusion of more angular-momentum channels, additional orbitals within a channel, or polarization functions. GPAW comes with a standard set of orbitals, but a basis-set generator is included with the code so that users may construct different basis sets depending on their needs and requirements. The LCAO mode is generally less accurate than the PW and FD modes, but it allows for the treatment of considerably larger systems, with more than ten thousand atoms. It is also possible to study electron dynamics through a fast implementation of time-propagation DFT, and Ehrenfest dynamics is under development.
As explained, the different modes have different virtues and limitations, and it can therefore be an advantage to apply several modes in a project. For larger systems, it is for example possible to divide a structure optimization into two steps. First, an optimization is performed with the fast LCAO basis, leading to an approximately correct structure. This is then followed by an optimization in either the PW or FD mode, which now requires much fewer steps because of the good initial configuration. Due to the ASE/Python interface, this combined calculation can easily be performed within a single script.
Since GPAW was originally created with the FD mode only, and the LCAO mode was added next, some features have been implemented for only those modes. Examples are real-time TDDFT (see section VII) and electron-phonon coupling (see section VI G). Conversely, some new features only work for the PW mode, which was added after the real-space modes. Examples are RPA total energies (see section VI C) and calculation of the stress tensor. To summarize, given the limitations just mentioned, users should most of the time use the PW or LCAO mode, and the choice will depend on the accuracy needed and the resources available.

B. Developer's perspective
The GPAW source code is written in the Python and C languages and is hosted on GitLab [59], licensed under the GNU General Public License v3.0. This ensures the transparency of all features and allows developers to fully customise their experience and contribute new features to the community.
An advantage of having a Python code is that the Python script you write to carry out your calculations will have access to everything inside a GPAW calculation.An example showing the power and flexibility this affords is the possibility to have user-code inserted inside the self-consistent field (SCF) loop as demonstrated in Fig. 1.
FIG. 1. The variable calc is the ground-state DFT calculator object and its icalculate method yields a context object at every self-consistent field (SCF) step. As seen, one can use this in a for-loop to implement special logic for termination of the SCF iterations or for diagnostics. In this example, the memory usage is written to the log-file for the first 15 SCF iterations.

At the time of this writing (July 2023), GPAW has two versions of the ground-state DFT code in the main branch. There is the older version that has grown organically since the birth of GPAW: it has many features, but also a lot of technical debt that makes it harder to maintain and less ideal to build new features on top of. The newer ground-state code addresses these issues by having a better overall design.
The new design greatly improves the ease of implementation of new features. The goal is to make the new code feature-complete so that it can pass the complete test suite, and then delete the old code once that is achieved. At the moment, we recommend that all production calculations are done with the old code and that work on new features is done on top of the new code, even though certain features are not yet production-ready. Three new features, not present in the old code base, have already been implemented based on the new code: a GPU implementation of PW-mode calculations (see section III B 9), reuse of the wave functions after unit-cell changes during cell optimization, and spin-spiral calculations (see section V E).
GPAW uses pytest [60] for its test suite, which currently consists of approximately 1600 unit and integration tests (see Table I). A subset of those tests runs as part of GitLab's continuous integration (CI), thereby checking the correctness of every code change. Unfortunately, the full test suite is too time-consuming to run as part of CI, so we run it nightly, both in serial and in parallel using MPI.
Many of the code examples in GPAW's documentation, exercises and tutorials [61] require resources (time and number of CPUs) beyond what would make sense to run as part of the pytest test suite. For those, we use MyQueue [21] to submit the scripts as jobs to a local supercomputer every weekend. At the moment this amounts to approximately 5200 core-hours of calculations.
As can be seen from Table I, the majority of the code is written in Python, which is an interpreted language that is easy to read, write and debug.
Interpreter-executed code will not run as efficiently as code that is compiled to native machine code. It is therefore important to make sure that the places in the code where most of the time is spent (hot spots) are in native machine code and not in the interpreter. GPAW achieves this by implementing the hot spots in C code with Python wrappers that can be called from the Python code. Examples of such computationally intensive tasks are applying a finite-difference stencil to a uniform grid, interpolating from one uniform grid to another, or calculating overlaps between projector functions and wave functions. In addition, we have Python interfaces to the numerical libraries FFTW [62], ScaLAPACK [63], ELPA [17], BLAS, Libxc [64,65], libvdwxc [66], and MPI. Finally, GPAW makes heavy use of the NumPy [67] and SciPy [68] Python packages. NumPy provides us with the numpy.ndarray data type, an N-dimensional array that we use for storing wave functions, electron densities, potentials, matrices like the overlap matrix or the LCAO wave-function coefficients, and much more. The use of NumPy arrays allows us to use the many sub-modules of SciPy to manipulate data. It also gives us an efficient memory layout, allowing us to simply pass a pointer to the memory whenever we need to call the C code from the Python code. With this strategy, we can get away with having most of the code written in a relatively slow interpreted language and still have most of the time spent in highly efficient C code or optimized numerical libraries.
The advantage of the original FD mode, where there are no Fourier transforms of the wave functions, is that the algorithms should parallelize well for large systems. In practice, it has turned out that the FD mode has a number of disadvantages: 1) Since integrals over the unit cell are evaluated as sums over grid points, there will be a small periodic energy variation as atoms are translated, with a period equal to the grid spacing (the so-called egg-box error); 2) The system sizes that are typically most interesting for applications of DFT are too small for the parallel scalability to be the decisive advantage; 3) The memory used to store the wave functions on uniform grids in real space is significant. In contrast, the PW mode has practically no egg-box error, is very efficient for the most typical system sizes, and often uses a factor of 10 less memory compared to an FD-mode calculation of similar accuracy. The main advantages of the LCAO mode are low memory usage and high efficiency for large systems; for small unit cells with many k-points the PW mode is most efficient. One disadvantage of the LCAO mode is egg-box errors: it uses the same uniform grids as the FD mode for integration of matrix elements like ⟨Φ_µ|ṽ|Φ_ν⟩ and therefore has a similar egg-box energy variation. A second disadvantage of the LCAO mode is that, as for any localized basis set, reaching the complete-basis-set limit is more involved than in the PW and FD modes. This can have severe consequences even for ground-state calculations of difficult systems such as Cr2 [?]. In the PW and FD modes the complete-basis-set limit is easy to reach by simply increasing the number of plane waves or grid points, respectively, which leads to a smooth convergence [32]. At the moment we only provide double-ζ polarized (DZP) basis sets, and going beyond DZP is left for users to do themselves.

A. Projector augmented-wave method
The diverging Coulomb potential causes rapid oscillations in the electronic wave functions near the nuclei, and special care is required to be able to work with smooth wave functions. The projector augmented-wave (PAW) method by Blöchl [69] is a widely adopted generalization of pseudopotential methods, utilizing their strength of smooth pseudo wave functions while retaining a mapping from all-electron wave functions (|ψ_n⟩) to pseudo wave functions (|ψ̃_n⟩).
The crux is to define a linear transformation $\hat T$ from pseudo to all-electron space,
$$|\psi_n\rangle = \hat T |\tilde\psi_n\rangle = |\tilde\psi_n\rangle + \sum_a \sum_i \left(|\phi_i^a\rangle - |\tilde\phi_i^a\rangle\right) \langle \tilde p_i^a | \tilde\psi_n\rangle. \tag{1}$$
Here $\tilde p_i^a(\mathbf r)$, $\phi_i^a(\mathbf r)$ and $\tilde\phi_i^a(\mathbf r)$ are called projectors, partial waves and pseudo partial waves, respectively. The pseudo partial waves and projectors are chosen to be biorthogonal, $\int d\mathbf r\, \tilde\phi_i^a(\mathbf r)\, \tilde p_j^a(\mathbf r) = \delta_{ij}$, allowing for an approximate closure relation, $\sum_i |\tilde\phi_i^a\rangle\langle \tilde p_i^a| \approx 1$, which is utilized heavily to obtain an efficient, but all-electron, picture. In addition to biorthogonality, the pseudo and all-electron partial waves are chosen to be equal outside the PAW augmentation-sphere cutoff radius: $\tilde\phi_i^a(\mathbf r) = \phi_i^a(\mathbf r)$ for $|\mathbf r - \mathbf R^a| > r_c^a$.
The basic recipe for converting an operator $\hat O$ to its PAW counterpart is
$$\tilde O = \hat T^\dagger \hat O \hat T.$$
In this way the all-electron Kohn-Sham equations,
$$\hat H |\psi_n\rangle = \varepsilon_n |\psi_n\rangle,$$
where $\hat H$ is the single-particle all-electron Kohn-Sham Hamiltonian operator, can be transformed to their PAW counterparts:
$$\hat T^\dagger \hat H \hat T |\tilde\psi_n\rangle = \varepsilon_n \hat T^\dagger \hat T |\tilde\psi_n\rangle.$$
We have used Eq. (1), and also multiplied with $\hat T^\dagger$ from the left to make its dual space the pseudo one, i.e. $\langle \tilde\psi | \in \tilde{\mathcal H}^*$ can act from the left. This results in a PAW Hamiltonian
$$\tilde H = \hat T^\dagger \hat H \hat T = -\tfrac{1}{2}\nabla^2 + \tilde v + \sum_a \sum_{ii'} |\tilde p_i^a\rangle \Delta H^a_{ii'} \langle \tilde p_{i'}^a| \tag{6}$$
and a PAW overlap operator
$$\tilde S = \hat T^\dagger \hat T = 1 + \sum_a \sum_{ii'} |\tilde p_i^a\rangle \Delta S^a_{ii'} \langle \tilde p_{i'}^a|.$$
Terms such as $\Delta H^a_{ii'}$ and $\Delta S^a_{ii'}$ represent so-called PAW corrections. In each part of the description, which handles a particular kind of operator, such as the kinetic energy, spin operators, or the electrostatic potential, the respective PAW corrections must be calculated. The most crucial ones, such as the overlap, kinetic-energy and Coulomb corrections, are precalculated and stored in the 'setup' file, which also stores the partial waves and projectors. As an example, the overlap PAW corrections are precalculated for the setup as
$$\Delta S^a_{ii'} = \langle \phi_i^a | \phi_{i'}^a \rangle - \langle \tilde\phi_i^a | \tilde\phi_{i'}^a \rangle.$$
We further define the atomic density matrices as
$$D^a_{ii'} = \sum_n f_n \langle \tilde\psi_n | \tilde p_i^a \rangle \langle \tilde p_{i'}^a | \tilde\psi_n \rangle.$$
The atomic density matrix contains all information required to construct PAW corrections to any local all-electron expectation value:
$$\langle \hat A \rangle = \sum_n f_n \langle \tilde\psi_n | \tilde A | \tilde\psi_n \rangle + \sum_a \sum_{ii'} D^a_{ii'} \left( \langle \phi_i^a | \hat A | \phi_{i'}^a \rangle - \langle \tilde\phi_i^a | \hat A | \tilde\phi_{i'}^a \rangle \right).$$
The all-electron atomic density can be constructed as
$$n^a(\mathbf r) = \sum_{ii'} D^a_{ii'}\, \phi_i^a(\mathbf r)\, \phi_{i'}^a(\mathbf r) + n_c^a(\mathbf r), \tag{11}$$
and the corresponding equation for the pseudo density holds with $n \to \tilde n$ and $\phi \to \tilde\phi$.
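The bookkeeping behind these corrections is simple to illustrate. The following toy NumPy sketch (all numbers are invented for illustration, not real PAW data for any element) evaluates a pseudo expectation value plus its atomic correction:

```python
# Toy illustration of PAW correction bookkeeping for a local operator A:
#   <A> = (smooth pseudo part) + sum_a sum_ii' D^a_ii' * dA^a_ii',
# where dA^a_ii' collects the difference between all-electron and pseudo
# partial-wave matrix elements.  All numbers below are invented.
import numpy as np

pseudo_expectation = 1.25          # sum_n f_n <psi~_n|A~|psi~_n>

# One atom with two projector channels:
D = np.array([[0.9, 0.1],
              [0.1, 0.2]])         # atomic density matrix D^a_ii'
dA = np.array([[0.30, 0.05],
               [0.05, 0.10]])      # precalculated correction matrix dA^a_ii'

correction = np.einsum('ij,ij->', D, dA)   # sum_ii' D_ii' * dA_ii'
total = pseudo_expectation + correction    # 1.25 + 0.30 = 1.55
```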
Since the exchange-correlation (xc) potential is nonlinear, the PAW corrections must be evaluated explicitly.
The xc PAW corrections are evaluated by constructing the atomic all-electron and pseudo densities as given by Eq. (11):
$$\Delta E^a_{\mathrm{xc}} = \int_{|\mathbf r - \mathbf R^a| < r_c^a} \left( \epsilon_{\mathrm{xc}}[n^a](\mathbf r) - \epsilon_{\mathrm{xc}}[\tilde n^a](\mathbf r) \right) d\mathbf r.$$
This integral is numerically evaluated on a Cartesian product of a Lebedev angular grid and a non-uniform radial grid (with a denser mesh closer to the nucleus) for each atom.

Wave-function representations
GPAW supports three representations for the smooth wave functions. The plane-wave (PW) and linear combination of atomic orbitals (LCAO) representations rely on basis functions. The finite-difference (FD) mode relies on a representation of the kinetic-energy operator on a uniform Cartesian grid.

All-electron quantities
The beauty of the PAW method is that you never need to transform the pseudo wave functions to all-electron wave functions, but you can do it if you want to. GPAW has tools for interpolating the pseudo wave functions to a fine real-space grid and adding the PAW corrections. A fine real-space grid is needed to properly represent the cusp and all the oscillations necessary for the wave function to be orthogonal to all the frozen core states.
GPAW also has tools for calculating the all-electron electrostatic potential. This is useful for transmission electron microscopy (TEM) simulations [70]. Most TEM simulations have relied on the so-called independent atom model (IAM), where the specimen potential is described as a superposition of isolated atomic potentials. While this is often sufficient, there is increasing interest in understanding the influence of valence bonding [71]. This can be investigated with a TEM simulation code such as abTEM [72], which can directly use ab initio scattering potentials from GPAW.

Solving the Kohn-Sham equation
The default method for solving the Kohn-Sham equation in the PW and FD modes is iterative diagonalization combined with density mixing; for the LCAO mode we do a full diagonalization of the Hamiltonian. Alternatively, one can do direct minimization as described in the next section.
For the PW and FD modes, we need an initial guess for the wave functions. For this, we calculate the effective potential from a superposition of atomic densities and diagonalize an LCAO Hamiltonian in a small basis set consisting of all the pseudo partial waves corresponding to bound atomic states.
Each step in the self-consistent field (SCF) loop consists of the following operations: 1) diagonalization of the Hamiltonian in the subspace of the current wave functions (skipped for LCAO); 2) one or more steps through the iterative eigensolver (except for LCAO, where a full diagonalization is performed); 3) update of eigenvalues and occupation numbers; 4) density mixing and symmetrization. See previous work [32,33] and Ref. [73] for details.
GPAW has two kinds of Poisson-equation solvers: direct solvers based on Fourier transforms or Fourier-sine transforms, and iterative multi-grid solvers (Jacobi or Gauss-Seidel). The default is to use a direct solver, whereas the iterative solvers may be chosen for larger systems where they can be more efficient.
For 0-, 1- and 2-dimensional systems, the default boundary condition is for the potential to go to zero on the cell boundaries. This becomes a problem for systems with large dipole moments: the potential due to the dipole is long-ranged and, thus, the converged potential requires large vacuum sizes. For molecules (0D systems), the boundary conditions can be improved by adding multipole-moment corrections to the density so that the corresponding multipoles of the density vanish. The potential of these corrections is then added to the obtained potential. The same trick is used to handle charged systems. For slabs (2D systems), a dipole layer can be added to account for differences in the work functions on the two sides of the slab.
Methods for calculating occupation numbers are the Fermi-Dirac, Marzari-Vanderbilt [74] and Methfessel-Paxton distributions as well as the tetrahedron method and the improved tetrahedron method [75].
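As a concrete illustration of the first of these schemes, the following self-contained sketch (with toy eigenvalues, not GPAW's internal implementation) fills Fermi-Dirac occupations and locates the Fermi level by bisection so that the occupations sum to the electron count:

```python
# Fermi-Dirac occupations: f(eps) = 1/(exp((eps - mu)/kT) + 1), with the Fermi
# level mu chosen by bisection so that sum(f) equals the number of electrons.
# Eigenvalues and electron count below are toy numbers for illustration.
import numpy as np

def fermi_dirac(eigenvalues, mu, kT):
    return 1.0 / (np.exp((eigenvalues - mu) / kT) + 1.0)

def find_fermi_level(eigenvalues, nelectrons, kT, tol=1e-12):
    lo = eigenvalues.min() - 10 * kT        # bracket that must contain mu
    hi = eigenvalues.max() + 10 * kT
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if fermi_dirac(eigenvalues, mu, kT).sum() > nelectrons:
            hi = mu                          # too many electrons: lower mu
        else:
            lo = mu                          # too few electrons: raise mu
    return 0.5 * (lo + hi)

eps = np.array([-2.0, -1.0, 0.5, 1.5])       # toy eigenvalues (eV)
mu = find_fermi_level(eps, nelectrons=2.0, kT=0.1)
f = fermi_dirac(eps, mu, kT=0.1)             # occupations summing to 2
```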

Updating wave functions in dynamics
Simulations commonly move the atoms without changing other parameters. If an atom moves only slightly, we would expect most of the charge in its immediate vicinity to move along with it. We use this to compute an improved guess for the wave functions in the next self-consistency loop in the FD or PW mode, where the eigensolver is iterative.
Near the atoms, the dual basis of pseudo partial waves and projectors is almost complete, i.e.
$$\sum_i |\tilde\phi_i^a\rangle\langle \tilde p_i^a| \approx 1$$
close to atom $a$. If an atom moves by $\Delta\mathbf R^a$, the wave functions $\tilde\psi_n(\mathbf r)$ are updated by rigidly moving the projection $\sum_i \tilde\phi_i^a(\mathbf r) \langle \tilde p_i^a | \tilde\psi_n \rangle$ along with it, i.e.,
$$\tilde\psi_n(\mathbf r) \rightarrow \tilde\psi_n(\mathbf r) + \sum_i \left[ \tilde\phi_i^a(\mathbf r - \Delta\mathbf R^a) - \tilde\phi_i^a(\mathbf r) \right] \langle \tilde p_i^a | \tilde\psi_n \rangle.$$
As the partial waves on different atoms are not orthonormal, this expression generally "double-counts" contributions, resulting in wave functions that are to some extent unphysical. Nevertheless, we have found that this simple method achieves a significant speedup (∼15% in realistic structure optimisations) compared to not updating the wave functions. The method could be further improved by using the LCAO basis set and the overlap matrix to prevent double-counting.
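The effect of this update is easy to demonstrate in a one-dimensional toy model (all shapes and coefficients below are invented for illustration): rigidly shifting the projected, localized part of the wave function yields a much better starting guess than reusing the stale wave function.

```python
# 1D toy model of the wave-function update: the localized part of a pseudo wave
# function is shifted rigidly with the atom, improving the next SCF starting guess.
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

def phi(R):                      # localized pseudo partial wave centred at R
    return np.exp(-(x - R)**2)

def p(R):                        # toy projector, normalized so that <p|phi> = 1
    g = np.exp(-(x - R)**2)
    return g / ((g * phi(R)).sum() * dx)

R, dR = 0.0, 0.3
background = 0.1 * np.cos(0.2 * x)            # delocalized part of psi~_n
psi_old = background + 0.8 * phi(R)           # wave function before the move
psi_target = background + 0.8 * phi(R + dR)   # "true" wave function after the move

c = (p(R) * psi_old).sum() * dx               # projection <p|psi~>, close to 0.8
psi_guess = psi_old + c * (phi(R + dR) - phi(R))  # rigidly shifted localized part

err_guess = np.abs(psi_guess - psi_target).max()  # error of the updated guess
err_stale = np.abs(psi_old - psi_target).max()    # error without any update
```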

Direct minimization
Direct orbital minimization [76][77][78][79] is a robust alternative to the conventional eigensolver and density-mixing routines. The orbitals can be expressed as a unitary transformation of a set of reference, or auxiliary, orbitals $\Psi_0$:
$$\Psi = \Psi_0 U.$$
In the direct minimization (DM) method implemented in GPAW [50,80], the unitary matrix $U$ is parametrized as an exponential transformation, i.e. $U = e^{A}$, where $A$ is an anti-Hermitian matrix ($A = -A^\dagger$). The energy can be considered as a functional of both $A$ and $\Psi_0$:
$$E = E(A, \Psi_0) = E[\Psi_0 e^{A}].$$
Thus, in general, the optimal orbitals corresponding to the minimum of the energy functional can be found in a dual-loop procedure: first, the energy is minimized with respect to the elements of $A$ in an inner loop, and second, the functional is minimized with respect to the reference orbitals $\Psi_0$ in an outer loop. Since anti-Hermitian matrices form a linear space, the inner-loop minimization can use well-established local minimization strategies such as efficient quasi-Newton methods with inexact line search, e.g. the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. The outer-loop minimization follows the gradient $\partial L / \partial \Psi_0$ projected on the tangent space at $\Psi_0$. The GPAW formulation of DM is applicable with all representations of the orbitals available in GPAW, as well as with Kohn-Sham (unitary-invariant) and orbital-density-dependent (non-unitary-invariant) energy functionals, and can be used for both finite and extended systems. In LCAO calculations, the reference orbitals are expressed as a linear combination of the atomic basis functions $\Phi$, $\Psi_0 = \Phi C_0$, where the matrix of coefficients $C_0$ is fixed. Therefore, only a minimization with respect to the elements of the matrix $A$ is required [80]. For plane-wave and real-space grid representations, a minimization in the space tangent to the reference orbitals is sufficient if the functional is unitary invariant. Otherwise, if the functional is non-unitary-invariant, such as when a self-interaction correction is used (see section III C 5), an inner-loop minimization in the occupied-occupied block of the matrix $A$ is performed to make the energy stationary with respect to unitary transformations of the occupied orbitals [50].
The DM method avoids diagonalization of the Hamiltonian at each step, and as a result it usually involves a smaller computational effort. The DM method has also been shown to be more robust than conventional eigensolvers and density mixing in calculations of molecules and extended systems [80]. However, the current implementation does not support a finite-temperature distribution of occupation numbers and thus can only be used for systems with a finite band gap.

Convergence criteria
The modular architecture of GPAW allows the user to have precise control over how the SCF loop decides that the electronic structure has converged to sufficient precision. GPAW contains simple keywords for common convergence criteria, such as "energy", "forces", (electron) "density", and "work function", which are sufficient for the most common use cases.
Internally, all convergence criteria are instances of convergence classes, and each convergence class is called at every step through the SCF loop. When a convergence class is called, it is passed a context that contains the current state of the calculation, such as the wave functions and the Hamiltonian. The criterion can thus pull relevant data from the calculation to decide if it is converged. Because the convergence criterion itself is an object, it can store information, such as previous values of the energy for comparison with the new value. When all convergence criteria report that they are converged, the calculation as a whole is considered to be converged and terminates.
This modular nature gives the user full control over how each convergence criterion operates. For example, the user can easily ask the energy criterion to check the differences in the last four values of the energy rather than the last three. If a convergence criterion is itself expensive to calculate, it can make sense not to check it until the rest of the convergence criteria are met. This can be accomplished by activating an internal "calculate_last" flag within the convergence criterion.
Users can easily add their own custom convergence criteria to the SCF loop. If a user would like to use a criterion not included by default with GPAW, it is straightforward to write a new criterion as a Python class and pass it to the convergence dictionary of GPAW. For example, to be sure that the band gap of a semiconductor is converged, a criterion could check the band gap at each iteration, compare it to stored values from previous iterations, and report that the calculation is converged when the peak-to-peak variation among the last n iterations is below a given threshold.
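A stand-alone sketch of such a criterion is shown below. The exact base class and call signature that GPAW expects are not reproduced here; this class only illustrates the pattern (pull a quantity from the context passed at each SCF step, remember past values, report convergence), with the band gap standing in for any user-chosen quantity:

```python
# Sketch of a user-defined convergence criterion in the spirit described above.
# GPAW's real criterion interface is not reproduced; 'context' stands in for the
# object passed at each SCF step and is only assumed to expose a band gap.
class GapConverged:
    """Converged when the last n band gaps agree to within tol (eV)."""

    def __init__(self, tol=1e-4, n=3):
        self.tol = tol
        self.n = n
        self.gaps = []            # the criterion object remembers past values

    def __call__(self, context):
        self.gaps.append(context.bandgap)
        if len(self.gaps) < self.n:
            return False          # not enough history yet
        last = self.gaps[-self.n:]
        return max(last) - min(last) < self.tol   # peak-to-peak variation

# Dummy context to exercise the criterion outside of GPAW:
class Context:
    def __init__(self, bandgap):
        self.bandgap = bandgap

crit = GapConverged(tol=1e-3, n=3)
results = [crit(Context(g)) for g in [1.5, 1.2, 1.11, 1.101, 1.1005, 1.1004]]
```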

PAW data sets and pseudopotentials
GPAW can read PAW datasets from (possibly compressed) XML files following the PAW-XML specification [81]. Dataset files for most of the periodic table can be downloaded from the GPAW webpage or installed with the gpaw install-data command-line tool. The datasets are available for the LDA, PBE, revPBE, RPBE and GLLBSC functionals. The electronic-structure code Abinit [82] also reads the PAW-XML format, allowing GPAW and Abinit to share PAW dataset collections such as the Jollet-Torrent-Holzwarth collection [83].
Specialized datasets can be generated with the gpaw dataset command-line tool.This allows one to tweak the properties of a dataset.Some examples could be: 1) add more/less semi-core states; 2) increase/decrease the augmentation sphere radius to make the pseudo wave functions more/less smooth; 3) add/remove projector functions and corresponding pseudo and all-electron partial waves; or 4) base the PAW dataset on a different XC-functional.These changes will affect the accuracy and cost of the calculations.
GPAW is also able to use norm-conserving pseudopotentials (NCPP) such as HGH [84] and pseudopotentials in the UPF format, such as SG15 [85].Non-local NCPPs can be considered an approximation to PAW: in the PAW description, the non-local part of the Hamiltonian (the term containing ∆H a ij in Eq. ( 6)) will adapt to the environment, whereas for a NCPP, ∆H a ij will be diagonal and have a fixed value taken from a reference atom.Because of the norm-conservation, NCPPs will have ∆S a ij = 0.

Parallelization
GPAW can parallelize over various degrees of freedom depending on the type of calculation, and implements multiple algorithms for achieving good parallel performance and scalability. In calculations involving k-points, parallelization over them is typically the most efficient, as little communication is needed beyond the summation over wave functions to calculate the density and any derived quantities. As the number of k-points is often limited, especially in large systems, parallelization is also possible over real-space grids in the FD and LCAO modes, as well as over plane waves in the PW mode. All modes also support parallelization over electronic bands, which is particularly efficient for real-time TDDFT, where the time propagation of each band can be carried out independently. Additional parallelization possibilities exist depending on the type of calculation, such as over electron-hole pairs in linear-response TDDFT calculations.
Parallelization is done mainly with MPI. In the FD and LCAO modes, it is additionally possible to use OpenMP within shared-memory nodes, which can improve performance when the number of CPU cores per node is large. Dense linear algebra, such as matrix diagonalization and Cholesky decomposition, can be carried out with the parallel ScaLAPACK or ELPA libraries. This applies both to the direct diagonalization in the LCAO mode and to the subspace diagonalizations in the iterative Davidson and RMM-DIIS methods in the FD and PW modes.
For ground-state calculations, GPAW divides the cores into three MPI communicators: k-points, bands and domain. When parallelizing over k-points and/or bands, all the cores of the k-point and/or band communicators hold a copy of the density (possibly distributed over the domain communicator). GPAW has the option to redistribute the density from the domain communicator to all cores, so that operations such as evaluating the XC energy and solving the Poisson equation can be done more efficiently.
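The bookkeeping behind this three-way split can be illustrated with a small stand-alone sketch; the factorization below is purely illustrative and not GPAW's internal routine:

```python
def split_ranks(world_size, kpt_groups, band_groups):
    """Factorize a pool of MPI ranks into k-point x band x domain
    communicators, mirroring the three-communicator layout described
    in the text (illustrative bookkeeping only)."""
    if world_size % (kpt_groups * band_groups):
        raise ValueError("world size must be divisible by kpt * band groups")
    domain_size = world_size // (kpt_groups * band_groups)
    # Every rank belongs to exactly one (kpt, band, domain) triple.
    layout = {}
    for rank in range(world_size):
        kpt, rem = divmod(rank, band_groups * domain_size)
        band, domain = divmod(rem, domain_size)
        layout[rank] = (kpt, band, domain)
    return domain_size, layout

# 24 ranks split over 4 k-point groups and 2 band groups leave
# 3 ranks each for real-space domain decomposition.
domain_size, layout = split_ranks(24, 4, 2)
```

All ranks sharing the same (kpt, band) pair would then hold one copy of the density, distributed over their 3-rank domain communicator.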
For each k-point, all the cores in the band and domain communicators cooperate on calculating matrix elements of the Hamiltonian and the overlap operators. Dense-matrix linear-algebra operations on those matrices can be done on a single core (most efficient for small systems) or with ScaLAPACK, where the matrices are distributed over either all or some of the cores in the pool formed by the band and domain communicators.
One drawback of Python is that in large parallel calculations its import mechanism may place a heavy load on the file system, because all the parallel processes try to read the same Python files. GPAW alleviates this with a special "broadcast imports" mechanism: during the initial module imports, only a single process loads the modules; afterwards, MPI is used to broadcast the data to all processes. Parallel scalability depends strongly on the calculation mode and the system. The FD mode offers the best scalability at high core counts, as only nearest-neighbour communication is needed between domains in the domain decomposition. In PW mode, the limiting factor is the all-to-all communication in the parallelization over plane waves. In LCAO mode, communication arises from multi-center integrals of basis functions across domains. At best, GPAW scales to tens or hundreds of nodes on supercomputers.

GPU implementation
The GPU implementation of GPAW works on both NVIDIA and AMD GPUs, targeting the CUDA and HIP backends, respectively. GPAW uses a combination of manually written GPU kernels, external GPU libraries (such as cuBLAS/hipBLAS), and CuPy [86]. CuPy offers an easy-to-use Python interface for GPUs centered around a NumPy-like GPU array and makes many hardware details completely transparent to the end user.
In the manually written GPU kernels, both GPU backends (CUDA and HIP) are targeted using a header-only porting approach [87], in which generic GPU identifiers are translated to vendor-specific identifiers at compile time. For example, to allocate GPU memory, the identifier gpuMalloc is used in the code and is translated to either cudaMalloc or hipMalloc depending on which GPU backend is targeted. This allows us to avoid unnecessary code duplication while still targeting multiple hardware platforms natively.
An earlier GPU implementation of GPAW [88,89] served as the starting point for the recent work on a new GPU code based on the rewritten ground-state code. The objects that store quantities like ψ̃_n(r), p̃^a_i(r), D^a_ii′, ñ(r) and ṽ(r) use NumPy arrays in the CPU code and CuPy arrays when running on a GPU. At the moment, GPUs can be used for total-energy calculations with LDA/GGA in the PW mode.
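The NumPy/CuPy duality can be sketched with the common "array module" pattern, where the same numerical code runs on either backend. This is a sketch of the pattern, not GPAW's internal code; it falls back to NumPy when CuPy or a GPU is unavailable:

```python
import numpy as np

try:
    import cupy as xp  # GPU path: CuPy mirrors the NumPy API
    _ = xp.zeros(1)    # fails if no GPU device is present
except Exception:
    xp = np            # CPU fallback

def kinetic_energy(psi_G, G2):
    """|G|^2-weighted sum over plane-wave coefficients, written once
    and running unchanged on either backend."""
    return 0.5 * xp.sum(G2 * xp.abs(psi_G) ** 2)

# Toy coefficient array and |G|^2 values (illustrative numbers).
psi_G = xp.asarray([1.0 + 0.0j, 0.5j, 0.25])
G2 = xp.asarray([0.0, 1.0, 4.0])
ekin = float(kinetic_energy(psi_G, G2))
```

Because CuPy reproduces the NumPy API, swapping the array module is often the only change needed to move an array-based routine to the GPU.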
Parallelization over multiple GPUs is done using MPI. Each MPI rank is assigned a single GPU, and communication between the GPUs is handled by MPI. Support for GPU-aware MPI makes it possible to do direct GPU-to-GPU communication without unnecessary memory copies between the GPU device and the host CPU.

C. XC-functionals
Exchange-correlation (XC) functionals provide a mapping between the interacting and the non-interacting system of electrons. In Kohn-Sham DFT, the density is built from a set of occupied non-interacting single-particle orbitals, n(r) = Σ_i f_i |ψ_i(r)|², with f_i denoting the occupation numbers, leading to the same density as the interacting system. The total energy in DFT is expressed as a sum of density functionals for the different contributions,

E[n] = T_S[n] + V_ext[n] + U_H[n] + E_xc[n],

where T_S[n] denotes the kinetic energy of the non-interacting system, V_ext[n] the energy of the density in the external potential, U_H[n] the classical Coulomb energy of the density with itself, and E_xc[n] the so-called exchange-correlation energy, which collects all energy contributions missing from the prior terms. While the first three terms can be calculated exactly, even the form of E_xc is unknown; although it is proven to exist and to be exact in principle, it has to be approximated in practice. A huge number of approaches belonging to several families exist [1,2,90]. Several of these approximations are available in GPAW, and an overview is given in the following.

Libxc and Libvdwxc
The libxc library [65] provides implementations of several (semi-)local variants of the XC functional from the LDA, GGA, and MGGA families. These are available in GPAW by combining their libxc names as, e.g., "GGA_X_PBE+GGA_C_PBE" for PBE [3]. Additionally, GPAW provides its own implementation of several (semi-)local functionals called by their short names, e.g. TPSS [91], PBE [3] and LDA, the latter with the correlation of Perdew and Wang [92]. Several hybrids, see below, are implemented in GPAW with the support of the libxc library for their local parts.
For fully non-local van der Waals functionals, like the vdW-DF functional [31], GPAW uses the efficient fast-Fourier-transform convolution algorithm of Roman-Perez and Soler [93] as implemented in the libvdwxc library [66].

Hubbard U
DFT+U calculations using a Hubbard-like correction can be performed in GPAW to improve the treatment of Coulombic interactions of localized electrons. This correction is most commonly applied to the valence orbitals of transition metals to help reproduce experimental band gaps of oxides [99], which may otherwise be underestimated [100,101]. Formation energies and magnetic states are also often improved due to a more correct description of the electronic structure. The correction may also be applied to main-group elements such as N, O, P, or S, but this is less commonly done [102].
In the formalism chosen in GPAW [103], one uses a single U_eff parameter rather than separate U and J parameters for the on-site Coulomb and on-site exchange interactions, respectively [104]. The correction influences the calculation by applying an energy penalty to the system,

E_DFT+U = E_DFT + Σ_a Σ_i (U^i_eff / 2) Tr[ρ^a_i − ρ^a_i ρ^a_i],

where the sum runs over the atoms a and orbitals i for which the correction should be applied. E_DFT is calculated by a standard GPAW calculation and is corrected to E_DFT+U by penalizing the energy such that fully occupied or fully unoccupied orbitals are stabilized. The magnitude of the correction depends on U^i_eff, and the atomic orbital occupation matrix ρ^a_i controls which orbitals contribute to the correction based on their occupation. In principle, any orbital on any element that is partially occupied can be corrected.
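A minimal numerical sketch of such a penalty, assuming the common Dudarev-type single-parameter form E_U = (U_eff/2) Tr[ρ − ρ²] (the occupation matrices and U value below are illustrative):

```python
import numpy as np

def hubbard_penalty(rho, u_eff):
    """Energy penalty (U_eff/2) Tr[rho - rho @ rho] for one atomic
    occupation matrix: zero for idempotent (fully occupied or empty)
    occupations, positive for fractional ones."""
    return 0.5 * u_eff * np.trace(rho - rho @ rho).real

u = 4.0                            # eV, illustrative U_eff
full = np.diag([1.0, 1.0, 0.0])    # integer occupations -> no penalty
frac = np.diag([0.5, 0.5, 0.5])    # fractional occupations -> penalized

e_full = hubbard_penalty(full, u)
e_frac = hubbard_penalty(frac, u)
```

The penalty vanishes for idempotent occupation matrices, which is exactly the stabilization of fully occupied or fully unoccupied orbitals described above.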
GPAW supports normalization of the occupation matrix, accounting for the truncated portion of the wave function outside the augmentation sphere. To maintain consistency with other codes that do not support normalization, this normalization can be disabled, although large disagreements are then expected for p orbitals. GPAW is one of the few codes that currently supports multiple simultaneous corrections to orbitals on the same atom; this is useful when two types of orbitals, such as p and d orbitals, are nearly degenerate but both are partially occupied.
There is no strictly correct U_eff, but methods such as the RPA [105] or linear response [106,107] allow a value to be calculated from first principles. More commonly, U_eff is chosen semi-empirically to fit experimental properties such as the formation energy [108] or band gap [109], or, more recently, by machine-learning predictions [110].

Hybrids
Hybrid functionals, especially range-separated functionals (see below), correct problems present in calculations using (semi-)local functionals, such as the wrong asymptotic behavior of the effective potential, which leads to an improper description of Rydberg excitations [111]; the improper variation of the total energy with fractional charge, which leads to the charge-delocalization error [112]; and the wrong description of (long-range) charge-transfer excitations due to the locality of the exchange hole [113][114][115].
The exchange-correlation energy E_xc can be split into contributions from exchange, E_x, and correlation, E_c [2,116]. Hybrid functionals combine the exchange from (semi-)local functionals with the exchange from Hartree-Fock (HF) theory. Global hybrids such as PBE0 [117] mix the exchange from DFT with the exchange from HF by a fixed global amount. Range-separated functionals (RSFs) add a separation function ω_RSF [118,119] to split the Coulomb kernel in the exchange integrals as

1/r_12 = {α + β[1 − ω_RSF(γ, r_12)]}/r_12 + {1 − α − β[1 − ω_RSF(γ, r_12)]}/r_12,

where the first term is treated with HF exchange and the second with (semi-)local exchange. Here α and β are mixing parameters for global and range-separated mixing, respectively. ω_RSF is a soft function with values ranging from one for r_12 = 0 to zero for r_12 → ∞, where the decay is controlled by the parameter γ. Long-range-corrected RSFs such as LCY-PBE [120] use (semi-)local approximations for the short-range (SR) interaction and apply HF exchange to the long-range (LR) interaction. Short-range-corrected RSFs such as HSE [5] reverse this approach. The parameter γ is either fixed or can be varied to match criteria of the ideal functional, e.g. that the energy of the highest occupied molecular orbital matches the ionization potential [116,121]. Details on the FD-mode implementation of long-range-corrected RSFs can be found in Refs. [122,123]. In general, the FD-mode implementation of hybrids is limited to molecules, and forces have not been implemented. The PW-mode implementation of hybrids handles k-points, exploits symmetries and can calculate forces.
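The effect of the two mixing parameters can be illustrated numerically. The sketch below assumes a Yukawa-type separation function ω_RSF(γ, r) = exp(−γ r) and a CAM-style HF fraction α + β[1 − ω_RSF]; the precise ω_RSF differs between functionals:

```python
import math

def hf_fraction(r12, alpha, beta, gamma):
    """Fraction of exact (HF) exchange at interelectronic distance r12
    for a kernel split with omega(r) = exp(-gamma * r).
    alpha: global mixing, beta: range-separated mixing (illustrative
    parametrization, not the exact form of any specific RSF)."""
    omega = math.exp(-gamma * r12)
    return alpha + beta * (1.0 - omega)

# Long-range-corrected setup: no HF exchange at r12 = 0,
# approaching full HF exchange as r12 grows.
lc = [hf_fraction(r, alpha=0.0, beta=1.0, gamma=0.75)
      for r in (0.0, 1.0, 50.0)]
```

With α = 0 and β = 1 the HF fraction rises smoothly from zero at coalescence toward one at large separation, which is the qualitative behavior of long-range-corrected RSFs described above.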

SIC
A fully self-consistent and variational implementation of the Perdew-Zunger self-interaction correction [124] (PZ-SIC) is available in GPAW. It corrects the various problems with (semi-)local Kohn-Sham functionals mentioned above in the context of hybrid functionals. Atomic forces are available in the GPAW implementation with all three types of basis sets. The corrected KS functional has the form

E_SIC[{n_i}] = E_KS[n] − α Σ_i (U_H[n_i] + E_xc[n_i]),

where the self-Coulomb and self-XC energy of each occupied orbital density n_i is subtracted from the Kohn-Sham energy functional. Due to the explicit dependence on the orbital densities, the corrected energy functional is not invariant under unitary transformations among the occupied orbitals and is thereby not a KS functional. As a result, the minimization of E_SIC requires special direct-minimization techniques (see section III B 5) and delivers a specific set of (typically localized) optimal orbitals. The calculations should be carried out using complex orbitals [125][126][127]. The full PZ-SIC has been shown to over-correct binding energies as well as band gaps, and improved results for these properties are obtained by scaling the SIC by α = 1/2 [128,129], while the long-range form of the effective potential, necessary for Rydberg-state calculations, requires the full correction [130].
PZ-SIC has been shown to give accurate results in cases where commonly used KS functionals fail. This includes, for example, the Mn dimer, where the PBE functional gives qualitatively incorrect results while the corrected functional agrees closely with high-level quantum-chemistry results as well as experimental measurements [131]. Another example is the defect state of a substitutional Al atom in α-quartz [132]. PZ-SIC has also been shown to improve excitation energies of molecules obtained in variational calculations of excited states [50,133] (see section VIII B).

BEEF
A great strength of Kohn-Sham DFT and its extensions is that reasonably high accuracy for physical, material, and chemical properties can be obtained at a relatively moderate computational cost. DFT is thus often used to simulate materials, reactions, and properties for which there is de facto no "better" alternative. Even though more accurate electronic-structure methods might in principle exist for a given application, the poor scaling of systematically more accurate methods often makes them computationally infeasible at the system sizes studied with DFT. One is therefore often in the situation that the accuracy of a given DFT calculation of some material or chemical property cannot be verified against, e.g., a more accurate solution of the Schrödinger equation, even on the biggest available supercomputers.
On the other hand, the wealth of available XC functionals naturally allows one to examine how sensitive a DFT result is to the choice of functional, and accuracy is therefore often judged primarily by applying a small set of different XC functionals, especially if no accurate theoretical benchmark or experimental measurement is available. A challenge, however, is that the available functionals are often known to be particularly good at simulating certain properties and poor at others. It is thus not at all clear how much one should trust a given functional for a given simulated property. The Bayesian Error Estimation (BEE) class of functionals [134] attempts to provide a practical framework for establishing an error estimate from a selected set of functional "ingredients".
Assume that the XC model M(a) is a function of a set of parameters a that can be varied freely. If a benchmark data set D of highly accurate properties, established from experiments or higher-accuracy electronic-structure simulations, is available, we can attempt to identify the ensemble of models given by some distribution function P(M(a)), such that the most likely model in the ensemble, M(a_0), makes accurate predictions for the benchmark data set, while the spread of the ensemble reproduces the spread between the predictions of the most likely model and the benchmark data.

FIG. 2. Bayesian ensemble of XC functionals around BEEF-vdW. The orange solid line is the BEEF-vdW exchange enhancement factor, while the blue lines depict F_x(s) for 50 samples of the randomly generated ensemble. Dotted black lines mark the exchange-model perturbations that yield DFT results ±1 standard deviation away from the BEEF-vdW results.
Bayes' theorem provides a natural framework for finding the ensemble distribution. If a joint distribution between M(a) and D exists, which we assume, then Bayes' theorem gives

P(M(a)|D) ∝ P(D|M(a)) P_0(M(a)),

where P_0(M(a)) is the prior expectation of the distribution of models before looking at the data D, and P(D|M(a)) is the likelihood of seeing the data given the model. To achieve a useful ensemble, much care has to be put into finding a large enough set of varied and accurate data for different materials and chemical properties, and into how the ensemble is regularized to avoid overfitting [135,136]. In all the BEE functionals, choices are made such that ultimately the XC functional is linear in a, and such that the distribution of a ends up following a multidimensional normal distribution given by a regularized covariance matrix Γ,

P(a) ∝ exp[−(a − a_0)^T Γ^{−1} (a − a_0)/2],

where Γ has been scaled such that the ensemble reproduces the observed standard deviation between M(a_0) and D.
Figure 2 shows an ensemble of exchange enhancement factors F_x(s) from the BEEF-vdW functional, where s = |∇n|/(2k_F n) is the reduced density gradient. The approach has given rise to several functionals, BEEF-vdW [135], mBEEF [136], and mBEEF-vdW [137], which all include error estimation and are readily available in GPAW. Through the ASE interface, one can, for example, use the error ensembles to establish error estimates on Python-implemented models that are parametrized using DFT simulations. An example is the application to error estimates on adsorption scaling relations, microkinetics, and materials selection [138].
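The propagation of the parameter ensemble to an error bar can be sketched for a toy model that is linear in a, mirroring the BEE construction (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "XC model": a property prediction linear in the parameters a,
# E(a) = x . a, where x plays the role of the fixed expansion
# coefficients of one particular calculation.
x = np.array([1.0, -0.5, 0.2])
a0 = np.array([0.8, 0.1, -0.3])        # most likely parameter set
Gamma = np.diag([0.01, 0.04, 0.02])    # regularized covariance (toy)

# Sample the parameter ensemble and propagate it to predictions.
ensemble = rng.multivariate_normal(a0, Gamma, size=2000)
predictions = ensemble @ x

best = x @ a0                # prediction of the most likely model
error_bar = predictions.std()  # ensemble spread = error estimate
```

For a linear model the ensemble spread equals sqrt(x^T Γ x) analytically, so the sampled error bar can be checked against the closed form.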
One risk of establishing error estimates from a small selected "ensemble" of XC functionals is clearly that if the simulated property in question is poorly described by all functionals in the ensemble, the error estimate may also become poor. This could, for example, be the case if one tried to establish an error estimate for band gaps in oxides, or for van der Waals bonding of adsorbates on surfaces, based on an ensemble of GGA XC functionals, since no GGA functional may be accurate for either property.

IV. ION DYNAMICS
GPAW can be employed as a 'black-box' calculator, supplying energies and forces to other programs such as ASE, which then perform optimization of ground-state geometries and reaction paths or carry out molecular dynamics. In fact, this is a key design principle behind GPAW: methodological developments and general implementations that do not depend directly on fast access to detailed electronic-structure information should preferably be implemented externally to GPAW. This leads to maximal simplicity of the GPAW code itself, while also allowing the external code to be used and tested with other electronic-structure codes and atomistic potentials. A key to the high efficiency of GPAW simulations involving ionic displacements is the versatile implementation of constraints in ASE. Many types of constraints are readily accessible, from the simple removal of degrees of freedom, to more exotic constraints allowing for rigid-molecule dynamics [139], to harmonic restoring forces [140], space-group preservation, and combined ionic-unit-cell dynamics [141]. Many algorithms are available for various structure-optimization and molecular-dynamics tasks.

A. Structure relaxation
Local structure optimization in GPAW is typically achieved using an optimizer from ASE. A large range of standard optimizers is available, such as quasi-Newton algorithms, including BFGS [142][143][144][145] and limited-memory BFGS [146], or Newtonian-dynamics-based algorithms such as MDMin and FIRE [147]. That the optimizers are implemented externally to GPAW provides benefits in terms of simple procedures for restarting long simulations and monitoring their convergence. Some optimizers from SciPy [68] are also available through the open-source ASE package, which provides a simple way to interface any optimizer in the SciPy format to GPAW. Preconditioning is implemented in an accessible way [148], which often leads to significant performance improvements.
Among the classical optimization methods, the quasi-Newton algorithms are often highly competitive. Here one builds up information on the Hessian or inverse-Hessian matrix from the calculated forces, ultimately leading to an accurate harmonic model of the potential-energy surface in the vicinity of the local minimum. Such algorithms can, however, have problems dealing both with anharmonicity in the potential-energy surface and with any noise in the electronic-structure simulations. It often makes sense instead to fit a Gaussian process to the calculated energies and forces and to minimize within this model potential. This is implemented as the GPMin method in ASE and often converges on the order of three times faster than the best quasi-Newton optimizers [149].
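The idea behind GPMin can be illustrated with a bare-bones Gaussian-process surrogate on a one-dimensional toy potential. ASE's GPMin also uses forces and a prior mean; this sketch fits energies only:

```python
import numpy as np

def rbf(x1, x2, ell=1.0):
    """Squared-exponential kernel matrix between two 1D point sets."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ell**2)

def gp_surrogate_min(x_train, e_train, grid, ell=1.0, jitter=1e-8):
    """Fit a zero-mean GP to sampled energies and return the grid
    point minimizing the surrogate mean, i.e. minimize in the model
    potential instead of the expensive one."""
    K = rbf(x_train, x_train, ell) + jitter * np.eye(len(x_train))
    weights = np.linalg.solve(K, e_train)
    mean = rbf(grid, x_train, ell) @ weights
    return grid[np.argmin(mean)]

# Harmonic toy potential with its minimum at x = 0.7.
def pes(x):
    return (x - 0.7) ** 2

x_train = np.linspace(-1.0, 2.0, 7)   # 7 "single-point calculations"
grid = np.linspace(-1.0, 2.0, 601)
x_min = gp_surrogate_min(x_train, pes(x_train), grid)
```

The surrogate locates the minimum near x = 0.7 from only seven energy evaluations; in practice the surrogate is refined iteratively as new points are computed.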

B. Reaction paths and barriers
Reliable calculations of energy barriers are of key importance for determining the rates of atomistic processes. In many quantum-chemical codes utilizing accurate atom-centered basis functions, this is achieved using analytical second derivatives in a direct search for first-order saddle points. This approach is less useful in the plane-wave or grid-based modes of GPAW. The dimer method [150] is implemented in ASE and can be used with GPAW, but often one would like an overview of the entire reaction path of an atomic-scale process, both to verify that the correct energy barrier has been determined and to obtain a full picture of the atomistic mechanism. For this purpose, the nudged elastic band (NEB) method is typically employed through the ASE interface to GPAW. Both the original method [151] and a range of later improvements are available [152][153][154][155][156][157]. Special care has to be taken in selecting the optimizer for a NEB calculation, as this choice can have a drastic influence on the convergence rate. For optimization of the reaction path, drastic performance improvements can be obtained by carrying out the optimization in a surrogate machine-learning model fitted to the potential-energy surface [158][159][160]. GPAW has been used to drive NEB calculations on surface systems [161][162][163][164] as well as on molecules [164,165].

C. Global structure optimization
GPAW is integrated with various tools for global optimization of structures, compositions, and materials morphologies. Generally applicable global-optimization tools include Basin Hopping [166], Minima Hopping [167], and genetic algorithms [168]. Some of the most powerful global-optimization problems addressed using GPAW rely on machine-learning-accelerated strategies. These have, for example, been applied to surfaces, clusters [169], and crystal structures in general [170]. In other machine-learning-accelerated global-optimization routines, GPAW was used to generate initial databases of surface and bulk systems and for later model validation in a strategy that uses Gaussian processes to generate surrogate potential-energy surfaces. These were then explored with Bayesian-optimization techniques, achieving speed-ups of several orders of magnitude over more conventional methods in finding optimal structures of the systems under investigation [171][172][173]. The method has been augmented by introducing extra (hyper)dimensions that interpolate between chemical elements, which improves the efficiency of the global search [174]. GPAW has also been integrated with a covariance-matrix-adaptation evolution-strategy (CMA-ES) framework, providing energies and forces, generating training data, and evaluating CMA-ES candidate structures [175].

D. Molecular dynamics and QM/MM
Ab initio molecular dynamics (MD) can be performed and analyzed through the ASE interface to GPAW. This includes standard simulations in the NVE, NVT, and NPT ensembles, with access to, e.g., Langevin, Andersen, Nose-Hoover, and Berendsen dynamics. Because the dynamics is externalized, the development of novel algorithms and analysis tools becomes facile. Examples of the use of GPAW include ab initio MD studies of the liquid structure of water [176] and of the water/Au(111) electrochemical interface [177].
Furthermore, GPAW can work with external electrostatic potential terms from user-supplied point-charge values and positions, enabling quantum-mechanics/molecular-mechanics (QM/MM) simulations. Since GPAW is designed from the outset around highly efficient grid operations, the computational overhead of evaluating this potential, as well as the resulting forces on the MM point charges from the QM density, is kept low [178]. GPAW has been central to a range of studies of ion dynamics in solution, both in and out of equilibrium. Using the molecular-dynamics functionality of ASE, researchers have performed QM/MM MD simulations of excited-state bond formation in photocatalyst model systems [179][180][181][182], and of electron transfer as well as coupled solute-solvent conformational dynamics in photosensitizer systems [165,183]. Work on polarizable-embedding QM/MM within GPAW is ongoing [184,185].

V. MAGNETISM AND SPIN
Many important technological applications utilize magnetic order or the manipulation of spin in materials. GPAW has a wide range of functionalities that facilitate the analysis of magnetic properties. These include calculations with non-collinear spin, the inclusion of external magnetic fields, spin-orbit coupling, and spin-spiral calculations within the generalized Bloch theorem. The implementation of these features is described below, while additional methods for calculating magnetic excitations are described in section VI D.

A. Spin-orbit coupling
The spin-orbit coupling is typically completely dominated by the regions close to the nuclei, where the electrostatic potential becomes strong. Within the PAW augmentation sphere of atom a, the spin-σ component of orbital n is given by

ψ^a_nσ(r) = Σ_i ⟨p̃^a_i|ψ̃_nσ⟩ ϕ^a_i(r).

Assuming a spherically symmetric potential, the spin-orbit Hamiltonian for atom a is then written as

H^a_SO = (1/2c²) (1/r) (dV^a_r/dr) L · S,    (28)

where V^a_r is the radial electrostatic potential of atom a. We evaluate V^a_r as the spherical part of the XC and Hartree potential from the local expansion of the density given by Eq. (11). Since the partial waves ϕ^a_i are eigenstates of the scalar-relativistic Hamiltonian, they are independent of spin and it is straightforward to evaluate the action of L on them.
The eigenenergies can be accurately calculated in a non-self-consistent treatment of the spin-orbit coupling and may be obtained by diagonalizing the full Hamiltonian in a basis of scalar-relativistic orbitals [186],

H_mm′ = ε⁰_m δ_mm′ + ⟨ψ⁰_m|H_SO|ψ⁰_m′⟩.    (29)

Here ε⁰_m and ψ⁰_m represent the scalar-relativistic eigenenergies and eigenstates, respectively. This constitutes a fast post-processing step for any scalar-relativistic calculation and only requires the projector overlaps ⟨p̃^a_j|ψ̃⁰_nσ′⟩. It should be noted that this approach in principle requires convergence with respect to the number of scalar-relativistic states included in the basis, but the eigenvalues typically converge rather rapidly. In Fig. 3, we show the band structure of a WS₂ monolayer obtained from PBE with non-self-consistent spin-orbit coupling. The W atoms introduce strong spin-orbit coupling in this material, and the valence band is split by 0.45 eV at the K point. The spin degeneracy is retained along the Γ-M line, which is left invariant by two non-commuting mirror symmetries.
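The structure of this post-processing step can be sketched in a tiny basis: place the scalar-relativistic eigenvalues on the diagonal, add a toy spin-orbit block, and diagonalize. The coupling constant below is chosen to mimic the 0.45 eV valence splitting quoted above; this is illustrative arithmetic, not a GPAW calculation:

```python
import numpy as np

# Scalar-relativistic eigenvalues of two spin-degenerate band pairs
# at one k-point (illustrative numbers, eV).
eps0 = np.array([-1.0, -1.0, 0.5, 0.5])

# Toy spin-orbit matrix in this 4-state basis: couples the two spin
# channels of the upper pair with strength lam (eV, illustrative).
lam = 0.225
H_soc = np.zeros((4, 4), dtype=complex)
H_soc[2, 3] = 1j * lam
H_soc[3, 2] = -1j * lam

# Post-processing in the spirit of Eq. (29): diagonalize the
# scalar-relativistic eigenvalues plus the spin-orbit coupling.
H = np.diag(eps0).astype(complex) + H_soc
eps = np.linalg.eigvalsh(H).real       # ascending eigenvalues
splitting = eps[3] - eps[2]            # SOC splitting of the top pair
```

The degenerate pair at 0.5 eV splits symmetrically into 0.5 ± lam, i.e. a 2·lam = 0.45 eV splitting, while the uncoupled pair stays degenerate.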
For magnetic materials, the non-self-consistent treatment of spin-orbit coupling is convenient for evaluating the magnetic anisotropy. The magnetic force theorem [187] implies that rotating the magnetic moments away from the ground-state configuration yields a contribution to the energy that is well approximated by the change in the Kohn-Sham eigenvalues. The change in energy for a given orientation of the magnetization density can thus be obtained as

E(θ, φ) = Σ_n f^θ,φ_n ε^θ,φ_n,    (30)

where ε^θ,φ_n are the eigenvalues obtained from diagonalizing Eq. (29) with the spins rotated to a direction defined by the angles (θ, φ), and f^θ,φ_n are the associated occupation numbers. This corresponds to rotating the xc-magnetic field (defined below), which leads to different eigenvalues when spin-orbit coupling is included. In two-dimensional magnets, an easy-axis anisotropy is decisive for magnetic order [41,188,189], and Eq. (30) is easily applied in high-throughput computations of magnetic properties [190,191].

B. Self-consistent non-collinear magnetism
The Kohn-Sham framework for treating non-collinear magnetism was developed by Barth and Hedin [192] and involves the spin-density matrix ρ as the basic variable. The electronic density and magnetization are then given by n = Tr[ρ] and m = Tr[σρ], respectively. The XC potential acquires four components, which are the functional derivatives of the XC energy with respect to the density matrix and may be represented as a 2 × 2 matrix acting on spinor Kohn-Sham states. These can be expressed in terms of the density and magnetization, which leads to the XC part of the Kohn-Sham Hamiltonian

v_xc = v_xc(r) 1 + σ · B_xc(r),

where the scalar potential and XC magnetic field are given by

v_xc(r) = δE_xc/δn(r),   B_xc(r) = δE_xc/δm(r).

In GPAW, the self-consistent treatment of non-collinear spin is implemented within LDA, where B_xc is approximated as

B_xc(r) = [∂E_xc/∂m(r)] m̂(r).    (34)

Here m(r) = |m(r)| is the magnitude and m̂(r) the direction of the magnetization. The generalization of Eq. (34) to GGAs is plagued by formal as well as numerical problems [193,194], and the self-consistent solution of the non-collinear Kohn-Sham equations is presently restricted to LDA. It should be emphasized that the PAW formalism allows for a fully non-collinear treatment that does not rely on intra-atomic collinearity, which is often imposed in other electronic-structure packages. Spin-orbit coupling may be included by adding Eq. (28) to the Kohn-Sham Hamiltonian, which constitutes a fully self-consistent framework for spin-orbit calculations within LDA.
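Constructing the 2 × 2 spinor potential from the scalar potential and the XC magnetic field amounts to a Pauli-matrix expansion, as in this stand-alone sketch (all values illustrative):

```python
import numpy as np

# Pauli matrices sigma_x, sigma_y, sigma_z.
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def vxc_matrix(v, B):
    """2x2 spinor XC potential v*1 + sigma . B at one grid point;
    in the LDA treatment above, B points along the local
    magnetization direction."""
    return v * np.eye(2) + np.einsum('i,ijk->jk', B, sigma)

# Collinear point: m along z with |B_xc| = 0.1 (illustrative units).
V_col = vxc_matrix(-0.3, np.array([0.0, 0.0, 0.1]))

# Non-collinear point: B_xc tilted away from z.
V_nc = vxc_matrix(-0.3, np.array([0.1, 0.2, 0.3]))
```

In the collinear case the matrix is diagonal, recovering the familiar separate spin-up and spin-down potentials; a tilted B_xc produces the off-diagonal elements that couple the two spin channels.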

C. Orbital magnetization
Current ab initio methods for determining the orbital magnetization of a material involve either the modern theory, i.e. a Berry-phase formula, or the atom-centered approximation (ACA), where contributions to the expectation value of the angular-momentum operator are restricted to atom-centered muffin-tin (MT) spheres with specified cutoff radii. The PAW formulation of the wave functions allows for an approximation similar to the MT-ACA, where only the PAW expansion of the wave functions is assumed to contribute significantly to the expectation value of the angular momentum. The orbital magnetic moments can then be calculated through

m^a_orb = Σ_σ Σ_ii′ D^a_σii′ ⟨ϕ^a_i|L|ϕ^a_i′⟩,    (35)

where D^a_σii′ are elements of the atomic density matrix as defined in Eq. (9), ϕ^a_i(r) are the bound all-electron partial waves of atom a, and L is the angular-momentum operator.
We note that this expression does not entail cutoff radii for the atomic contributions to the orbital magnetic moments; instead, the entire all-electron partial waves are included, although the all-electron atomic expansion is only formally exact inside the PAW spheres. Additionally, this means that the unbounded, i.e. non-normalisable, all-electron partial waves must be excluded in Eq. (35).

TABLE II. Orbital magnetic moments (in μB) of the simple ferromagnets.

                     bcc-Fe   fcc-Ni   fcc-Co   hcp-Co
PAW-ACA              0.0611   0.0546   0.0845   0.0886
MT-ACA [195]         0.0433   0.0511   0.0634   0.0868
Modern theory [195]  0.0658   0.0519   0.0756   0.0957
Experiment [196]     0.081    0.053    0.120    0.133

A prerequisite for nonzero orbital magnetization is broken time-reversal symmetry, as represented by a complex Hamiltonian that is not unitarily equivalent to a real counterpart. In practice, this means that a finite orbital magnetization requires magnetic order and either the inclusion of spin-orbit coupling or non-coplanar spin textures. In GPAW, the spin-orbit interaction can be included either self-consistently in a non-collinear calculation or as a non-self-consistent post-processing step following a collinear calculation. The orbital magnetization has been calculated using Eq. (35) for the simple ferromagnets bcc-Fe, fcc-Ni, fcc-Co, and hcp-Co, with the spin-orbit interaction included non-self-consistently. The PAW-ACA results are displayed in Table II along with MT-ACA results, modern-theory results, and experimental measurements, demonstrating first that the PAW-ACA can be an improvement over the MT-ACA and second that there is decent agreement between the PAW-ACA and the modern theory for these systems.
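The trace expression can be illustrated for a single atom with a p-shell density matrix. This is an L_z-only toy version of Eq. (35); the occupations and the sign convention m_orb = −μ_B⟨L_z⟩ (in units of μ_B) are illustrative assumptions:

```python
import numpy as np

# L_z in a complex-spherical-harmonic p-orbital basis (m = -1, 0, +1),
# in units of hbar.
Lz = np.diag([-1.0, 0.0, 1.0])

# Illustrative atomic density matrix: a slight imbalance between the
# m = +1 and m = -1 orbitals, as induced by spin-orbit coupling.
D = np.diag([0.95, 1.0, 0.90])

# Orbital moment along z from the Tr[D Lz] contraction,
# m_orb = -mu_B * Tr[D Lz], here with mu_B = 1.
m_orb = -np.trace(D @ Lz)
```

Without spin-orbit coupling the m = ±1 occupations would be equal and the trace, and hence the orbital moment, would vanish; the small imbalance yields a small net moment, consistent in magnitude with the hundredths of μ_B in Table II.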

D. Constant B-field
The coupling of the electronic spin magnetic moments to a constant external magnetic field B can be included by adding a Zeeman term,

Ĥ_Z = μ_B B · σ̂,

to the Kohn-Sham Hamiltonian. As an example, we can consider the spin-flop transition in Cr₂O₃. The ground state is antiferromagnetic with a weak anisotropy that prefers alignment of the spins with the z-axis. If a magnetic field is applied along the z-direction, it becomes favorable to align the spins along a perpendicular direction (with a small ferromagnetic component) at the critical field where the Zeeman energy overcomes the anisotropy. This is clearly a spin-orbit effect, and one has to carry out the calculations in a fully non-collinear framework with self-consistent spin-orbit coupling. In Fig. 4 we show the energy of the two spin alignments as a function of external magnetic field obtained with LDA+U (U = 2 eV). The minimum-energy configuration changes from S_z to S_x alignment at 6 T, i.e., the spins flop at this critical field value. This is in excellent agreement with the experimental value of Ref. [197]. The critical field is, however, rather strongly dependent on the chosen value of U and increases by a factor of two in the absence of U. The magnetic anisotropy (per unit cell) can be read off from the energy difference at B = 0 T.
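The energetics of such a spin-flop transition can be illustrated with a minimal classical model. All numbers below are hypothetical (chosen only so that the crossover lands near 6 T) and are not GPAW output: the easy-axis configuration is taken as the energy zero, while the flopped configuration pays the anisotropy energy K but gains Zeeman energy ½χ⊥B² by canting toward the field.

```python
import numpy as np

# Toy spin-flop model (all numbers hypothetical, not GPAW output).
# E_parallel(B) = 0                        : spins along z, no net moment
# E_perp(B)     = K - 0.5 * chi_perp * B^2 : spins flopped, canting toward B
K = 0.05e-3          # easy-axis anisotropy energy per cell (eV)
chi_perp = 2.8e-6    # transverse susceptibility (eV / T^2), hypothetical

B = np.linspace(0.0, 12.0, 1201)          # field strength (T)
E_par = np.zeros_like(B)
E_perp = K - 0.5 * chi_perp * B**2

B_c_numeric = B[np.argmax(E_perp < E_par)]  # first field where the flop wins
B_c_analytic = np.sqrt(2 * K / chi_perp)    # crossover: K = 0.5*chi_perp*B^2
print(f"{B_c_numeric:.2f} T vs {B_c_analytic:.2f} T")
```

The crossover field sqrt(2K/χ⊥) plays the role of the critical field read off from the crossing of the two energy curves in Fig. 4.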

E. Spin spirals
The ground-state magnetic structure of frustrated and/or chiral magnets is often non-collinear and may be incommensurate with the chemical unit cell. In the classical isotropic Heisenberg model, the energy is always minimized by a planar spin spiral [198]. The energy of a general planar spin spiral can be evaluated efficiently within the chemical unit cell using the generalized Bloch theorem (GBT) [199]. The GBT implementation in GPAW [40] places no restriction on interatomic collinearity and encodes a rotation of the all-electron magnetization density by an angle φ = q · R_i upon translation by a lattice vector R_i. The Kohn-Sham equations can then be solved self-consistently for a fixed wave vector q, using only the periodic part of the generalized Bloch orbitals, |ψ_{q,nk}⟩ = U†_q(r) e^{ik·r} |u_{q,nk}⟩, where U_q denotes the generator of spin rotations about the z-axis. The generalized Bloch Hamiltonian without spin-orbit coupling is given in Eq. (37), and once it has been solved self-consistently for a given q, the corresponding spin-spiral energy is evaluated as usual. Using the GPAW functionality to compute the spin-spiral energy as a function of q, one can then search for the ground-state wave vector Q that minimizes the energy. In Fig. 5, we show that monolayer NiBr₂ has an incommensurate spin-spiral ground state, and that the local magnetic moment on the Ni atom depends only weakly on the wave vector q.

FIG. 5. LSDA spin-spiral spectrum of the NiBr₂ monolayer (structure taken from the C2DB [15]). The ground state has an incommensurate wave vector Q ≃ [0.1, 0.1, 0]. The magnetic moment displays only weak longitudinal fluctuations, and the band gap remains finite for all wave vectors q, indicating that a Heisenberg model description of the material would be appropriate. The inset shows the spin-orbit correction to the spin-spiral energies (in meV) as a function of the normal vector n of the planar spin spiral. n is depicted in terms of its stereographic projection in the upper hemisphere above the monolayer plane. The spiral plane is found to be orthogonal to Q and tilted slightly with respect to the out-of-plane direction.
Furthermore, the orientation of the ground-state spin spiral can be obtained by including spin-orbit coupling non-self-consistently in the projected spin-orbit approximation [200] and searching for the orientation that minimizes the energy. The ordering vector Q and the normal vector n of the spiral plane thus constitute a complete specification of the magnetic ground state within the class of single-q states. The normal vector is of particular interest, since spin spirals may lead to spontaneous breaking of the crystal symmetry, and the normal vector largely determines the direction of magnetically induced spontaneous polarization [40]. In Fig. 5, we show that n is perpendicular to the wave vector Q in the monolayer NiBr₂ ground state, corresponding to a cycloidal spin spiral.
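The action of the spin-rotation generator underlying the GBT can be made concrete with a two-component spinor: translating by a lattice vector R rotates the transverse magnetization by φ = q · R about the z-axis while leaving the z-component intact. The following sketch uses toy numbers and explicit 2×2 Pauli matrices, not the GPAW API:

```python
import numpy as np

# Sketch of the GBT spin rotation: a lattice translation R rotates the
# in-plane magnetization by phi = q . R about z (toy numbers, not GPAW API).
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def U(phi):
    """Spin-1/2 rotation about the z-axis by angle phi."""
    return np.array([[np.exp(-1j * phi / 2), 0],
                     [0, np.exp(1j * phi / 2)]])

def magnetization(spinor):
    """Expectation values of (sigma_x, sigma_y, sigma_z) for a spinor."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    return np.real([spinor.conj() @ s @ spinor for s in (sx, sy, sigma_z)])

q = np.array([0.1, 0.1, 0.0]) * 2 * np.pi   # spiral wave vector (toy units)
R = np.array([1.0, 0.0, 0.0])               # lattice translation
phi = q @ R                                  # rotation angle phi = q . R

spinor = np.array([1.0, 1.0]) / np.sqrt(2)   # spin pointing along +x
m0 = magnetization(spinor)
m1 = magnetization(U(phi) @ spinor)

# The in-plane moment is rotated by exactly phi; the z-component is unchanged:
angle = np.arctan2(m1[1], m1[0]) - np.arctan2(m0[1], m0[0])
print(angle, phi)
```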

VI. RESPONSE FUNCTIONS AND EXCITATIONS
Linear response functions are the bread and butter of condensed matter physics. Their applications include the description of dielectric screening, optical and electron energy-loss spectra, many-body excitations, and ground-state correlation energies. In this section we describe the methods available in GPAW for calculating electronic response functions as well as GW quasiparticle band structures and optical excitations from the Bethe-Salpeter Equation (BSE). In addition to electronic response functions, GPAW can also calculate the transverse magnetic susceptibility, which holds information about the magnetic excitations, e.g. magnons, and can be used to derive parameters for classical Heisenberg spin models and to estimate magnetic transition temperatures. Finally, we present methods to calculate Raman spectra of solids and quadratic optical response tensors describing second-harmonic generation and the Pockels electro-optical effect.

A. Linear-response TDDFT
To linear order, the change in electron density induced by a time-dependent external (scalar) potential V_ext(r, t) is governed by the electronic susceptibility χ:

δn(r, t) = ∫ dt′ ∫ dr′ χ(r, r′, t − t′) V_ext(r′, t′).

The susceptibility is itself given by the Kubo formula [201],

χ(r, r′, t − t′) = −(i/ħ) θ(t − t′) ⟨[n̂(r, t), n̂(r′, t′)]⟩,

where the expectation value is taken with respect to the ground state at zero temperature and the density operators are cast in the interaction picture. In the noninteracting Kohn-Sham system, the susceptibility can be evaluated explicitly from the Kohn-Sham orbitals, eigenvalues, and occupations:

χ⁰(r, r′, ω) = Σ_{n,n′} (f_n − f_{n′}) ψ*_n(r) ψ_{n′}(r) ψ*_{n′}(r′) ψ_n(r′) / [ħω − (ε_{n′} − ε_n) + iη],   (42)

where n and n′ are composite band and k-point indices. Based on the Kohn-Sham susceptibility χ⁰, the many-body susceptibility can be calculated via a Dyson-like equation [202]:

χ(r, r′, ω) = χ⁰(r, r′, ω) + ∫ dr₁ ∫ dr₂ χ⁰(r, r₁, ω) K_Hxc(r₁, r₂) χ(r₂, r′, ω).   (43)

Here, electronic interactions are accounted for via the Hartree-XC kernel, which is defined in terms of the effective potentials of time-dependent DFT (TDDFT) [203]:

K_Hxc(r, r′) = v_c(r, r′) + f_xc(r, r′),   (44)

where v_c is the Coulomb interaction and f_xc is the XC-kernel defined as

f_xc(r, r′) = δv_xc[n](r)/δn(r′) |_{n=n₀}.

In prototypical linear-response TDDFT (LR-TDDFT) calculations, the exchange-correlation part of the kernel is either neglected (leading to the random phase approximation, RPA) or approximated via the adiabatic local density approximation (ALDA),

f_xc^ALDA(r, r′) = δ(r − r′) d²[n ε_xc(n)]/dn² |_{n=n₀(r)},

where ε_xc(n) denotes the XC energy per electron of the homogeneous electron gas of density n. The ALDA is in fact rather restricted, as it only ensures a correct description of the kernel for metals in the long-wavelength limit. This implies that it cannot account for excitons in extended systems, and furthermore it leads to a divergent XC hole. The latter problem can be resolved by a simple renormalization of the ALDA that regulates the on-top XC hole and drastically improves the description of local correlations over the ALDA (and RPA) [37].
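In a discrete basis the Dyson equation χ = χ⁰ + χ⁰ K χ is an ordinary linear matrix equation with the closed-form solution χ = (1 − χ⁰K)⁻¹χ⁰. A minimal sketch with hypothetical random matrices (not GPAW data):

```python
import numpy as np

# Minimal sketch of the LR-TDDFT Dyson equation in a discrete basis:
# chi = chi0 + chi0 @ K @ chi is solved by chi = (1 - chi0 @ K)^-1 @ chi0.
# The matrices below are hypothetical stand-ins, not GPAW quantities.
rng = np.random.default_rng(0)
n = 4
chi0 = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # KS susceptibility
K = 0.1 * rng.normal(size=(n, n))                              # Hartree-XC kernel

chi = np.linalg.solve(np.eye(n) - chi0 @ K, chi0)

# Verify that chi satisfies the Dyson equation it came from:
residual = np.max(np.abs(chi - (chi0 + chi0 @ K @ chi)))
print(residual)
```

In the plane-wave implementation described below, the same inversion is carried out independently for each wave vector q, since the Dyson equation is diagonal in q.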

Implementation for periodic systems
In crystalline systems, the susceptibility is periodic with respect to translations on the Bravais lattice, χ(r + R, r′ + R, ω) = χ(r, r′, ω). Consequently, the susceptibility can be Fourier-transformed according to

χ(r, r′, ω) = (1/Ω) Σ_q Σ_{GG′} e^{i(q+G)·r} χ_{GG′}(q, ω) e^{−i(q+G′)·r′},

translating the Dyson equation (43) into a plane-wave matrix equation, which is diagonal in the wave vector q and can be inverted numerically.
By using a plane-wave representation, the crux of the implementation becomes the calculation of the reciprocal-space pair densities,

n_{nk,n′k+q}(G) = ⟨ψ_{nk}| e^{−i(q+G)·r} |ψ_{n′k+q}⟩,   (48)

and the Fourier transform of the XC-kernel. Crucially, both can be evaluated by adding a PAW correction to the analogous pseudo quantity, which itself can be evaluated by means of a fast Fourier transform (FFT).
The Hartree contribution to the kernel (44) is simply given by the bare Coulomb interaction, which in the plane-wave representation (and atomic units) reads v_{c,G}(q) = 4π/|q + G|² and can be evaluated analytically for finite q. However, in the optical limit q → 0, the G = 0 component diverges, and one needs to be careful when inverting the Dyson equation (43). In GPAW, this is handled by expanding the pair densities within k·p perturbation theory. In the expansion, the diverging Coulomb term is exactly cancelled by the q-dependence of the pair densities, so that the product χ⁰v_c remains finite. For additional details, see Ref. [36].

Spectral representation
GPAW offers two different ways of dealing with the frequency dependence of the Kohn-Sham susceptibility (42). One is to evaluate the expression explicitly for the frequencies of interest, which is advantageous if one is interested in a few specific frequencies. The other is to evaluate the associated spectral function,

S⁰_{GG′}(q, ω) = Σ_k Σ_{nn′} (f_{nk} − f_{n′k+q}) n_{nk,n′k+q}(G) n*_{nk,n′k+q}(G′) δ(ħω − (ε_{n′k+q} − ε_{nk})),   (52)

from which the susceptibility can be obtained via a Hilbert transform,

χ⁰_{GG′}(q, ω) = ∫₀^∞ dω′ S⁰_{GG′}(q, ω′) [1/(ω − ω′ + iη) − 1/(ω + ω′ + iη)].

To converge the Hilbert transform, the spectral function is evaluated on a nonlinear frequency grid spanning the range of eigenenergy differences included in Eq. (52). Although the calculation becomes more memory-intensive as a result, it is usually faster to compute χ⁰ via its spectral function. Furthermore, since S⁰ can be interpolated linearly, the tetrahedron method can be employed to improve the convergence with respect to k-points.
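The equivalence of the two frequency treatments can be checked on a toy spectral function built from a few discrete poles (hypothetical energies and weights, not GPAW data): the Hilbert transform of the discretized spectral function reproduces the direct pole-by-pole evaluation.

```python
import numpy as np

# Toy check of the spectral (Hilbert-transform) representation:
# chi(w) = sum_p S_p * [1/(w - w_p + i*eta) - 1/(w + w_p + i*eta)].
eta = 0.05                           # artificial broadening (eV)
poles = np.array([0.8, 1.3, 2.1])    # transition energies (eV), hypothetical
weights = np.array([0.5, 0.3, 0.2])  # spectral weights, hypothetical

def chi_direct(w):
    """Direct frequency-by-frequency evaluation (one term per transition)."""
    return np.sum(weights * (1.0 / (w - poles + 1j * eta)
                             - 1.0 / (w + poles + 1j * eta)))

def chi_hilbert(w, wgrid, S):
    """Discretized Hilbert transform of the spectral function S on wgrid."""
    return np.sum(S * (1.0 / (w - wgrid + 1j * eta)
                       - 1.0 / (w + wgrid + 1j * eta)))

# Discretize the spectral function on a grid that contains the poles:
wgrid = np.linspace(0.0, 3.0, 301)
S = np.zeros_like(wgrid)
for p, wt in zip(poles, weights):
    S[np.argmin(np.abs(wgrid - p))] = wt   # place each pole on its grid point

w = 1.0
diff = abs(chi_hilbert(w, wgrid, S) - chi_direct(w))
print(diff)
```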

B. Dielectric function
The longitudinal part of the dielectric tensor is related to the susceptibility χ through

ε⁻¹_{GG′}(q, ω) = δ_{GG′} + v_c(q + G) χ_{GG′}(q, ω).   (54)

From the dielectric matrix, the macroscopic dielectric function, including local-field corrections, is given by

ε_M(q, ω) = 1 / ε⁻¹_{00}(q, ω),

from which it is straightforward to extract the optical absorption spectrum,

ABS(ω) = Im ε_M(q → 0, ω),

as well as the electron energy-loss spectrum,

EELS(q, ω) = −Im [1/ε_M(q, ω)].   (57)

It is also possible to define a symmetrized version of the dielectric matrix,

ε̃_{GG′}(q, ω) = δ_{GG′} − 4π χ⁰_{GG′}(q, ω) / (|q + G| |q + G′|).

The susceptibility can be evaluated at the RPA level (by setting f_xc in Eq. (43) to zero) or using one of the XC-kernels implemented in GPAW, such as the local ALDA, the non-local rALDA [37], or the bootstrap kernel [204].
The evaluation of χ⁰ is computationally demanding, since it involves an integration over the Brillouin zone (BZ) as well as a summation over occupied and unoccupied states. The k-point convergence can be improved substantially with the tetrahedron method [205]. Contrary to simple point integration, where the δ-function in Eq. (52) is replaced by a smeared-out Lorentzian, the tetrahedron method utilizes linear interpolation of the eigenvalues and matrix elements.
In Fig. 6 we compare the dielectric function computed using the two integration methods for two prototypical cases: (semiconducting) Si and (metallic) Ag. Since Si is semiconducting, there are no low-energy excitations, and consequently the imaginary part of the dielectric function is zero while the real part is flat at low frequencies. The point integration and tetrahedron integration yield the same value for ω → 0, but the tetrahedron integration avoids the unphysical oscillations of ε at higher frequencies exhibited by the point integration due to the finite k-point sampling. For metals, the k-point convergence is even slower, and the difference between the two methods is thus even more pronounced for Ag. By increasing the k-point sampling, the point-integration results will eventually approach those obtained with the tetrahedron method, but at a much higher computational cost.

Screening in low-dimensional systems
In a three-dimensional (3D) bulk crystal, the macroscopic dielectric function is related to the macroscopic polarizability α_M of the material as ε_M = 1 + 4π α_M.

FIG. 6. Real and imaginary parts of the dielectric function for Ag (solid) and Si (dashed) using the tetrahedron and point integration methods.
Since ε_M is related to the macroscopic average of the induced potential, it will depend on the unit cell in low-dimensional systems and tend to unity when increasing the cell size in the non-periodic direction(s). In contrast, it is straightforward to generalize the definition of the polarizability such that it measures the induced dipole moment per length or area rather than per volume. The d-dimensional polarizability is thus defined as α^d_M = Ω_d α_M, where Ω_d is the cell volume with the periodic directions projected out. For example, for a 2D material, Ω_d is simply the length of the unit cell in the non-periodic direction. To improve convergence with respect to the size of the unit cell, the Coulomb kernel is truncated using the reciprocal-space method introduced in Ref. [206]. This enables efficient calculations of the dielectric properties of low-dimensional materials [207,208]. It does, however, imply that the polarizability cannot be evaluated directly from the dielectric constant (which is simply unity), but it may instead be obtained directly from the susceptibility.
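The cell-size dependence of ε_M versus the cell-size independence of the sheet polarizability can be illustrated with a toy model. Assuming the model relation ε_M(L) = 1 + 4π α_2D/L for a 2D sheet in a supercell with vacuum length L (hypothetical numbers, not a GPAW calculation):

```python
import numpy as np

# Toy illustration: eps_M depends on the vacuum size L and tends to 1,
# while alpha_2D = L * (eps_M - 1) / (4*pi) is cell-independent.
# Model relation eps_M(L) = 1 + 4*pi*alpha_2D/L; hypothetical numbers.
alpha_2d_true = 5.9   # sheet polarizability (Angstrom), hypothetical

L = np.array([10.0, 20.0, 40.0, 80.0])        # vacuum sizes (Angstrom)
eps_M = 1.0 + 4 * np.pi * alpha_2d_true / L   # cell-dependent dielectric constant

alpha_extracted = L * (eps_M - 1.0) / (4 * np.pi)
print(eps_M)             # tends to 1 as L grows
print(alpha_extracted)   # identical for every L
```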

C. Adiabatic-connection fluctuation-dissipation theorem
The adiabatic-connection fluctuation-dissipation theorem (ACFDT) is a highly promising approach for constructing accurate correlation functionals with nonlocal characteristics. Unlike regular XC-functionals, ACFDT correlation functionals do not rely on error cancellation between exchange and correlation. Instead, the ACFDT provides an exact expression for the electronic correlation energy E_c in terms of the interacting electronic susceptibility,

E_c = −∫₀¹ dλ ∫₀^∞ (dω/2π) Tr{v_c [χ^λ(iω) − χ⁰(iω)]},

which can be combined with the exact exchange energy. The interacting response function at coupling strength λ can be expressed in terms of the Kohn-Sham response function and the exchange-correlation kernel through the Dyson equation (43). The random phase approximation (RPA) is obtained from the ACFDT if the XC-kernel f_xc is neglected. RPA has the strength of capturing non-local correlation effects and provides high accuracy across different bonding types, including van der Waals interactions [29,209-211].
Simple exchange-correlation kernels, such as the adiabatic LDA (ALDA) kernel, can also be incorporated into the response function. However, the locality of adiabatic kernels leads to divergent behavior of the pair-correlation function [212]. This issue can be overcome by a renormalization (r) scheme [37], which is implemented in GPAW as rALDA. This class of renormalized kernels provides a significantly better description of short-range correlations and hence also yields highly accurate total correlation energies [213-216].
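The structure of the ACFDT-RPA correlation energy can be sketched for a one-mode toy model. In the RPA the coupling-constant integral can be done analytically, giving E_c = (1/2π) ∫₀^∞ dω Tr[ln(1 − vχ⁰(iω)) + vχ⁰(iω)]; the model χ⁰ and Coulomb element below are hypothetical, and the integrand is non-positive, so E_c ≤ 0.

```python
import numpy as np

# One-mode sketch of the RPA correlation energy:
# E_c = (1/2pi) * Int_0^inf dw [ln(1 - v*chi0(iw)) + v*chi0(iw)],
# with a hypothetical model chi0 on the imaginary frequency axis.
v = 2.0                                  # toy Coulomb matrix element

def chi0_iw(w):
    return -1.0 / (w**2 + 1.0)           # model KS susceptibility, chi0(iw) <= 0

w = np.linspace(0.0, 200.0, 400001)
x = v * chi0_iw(w)
integrand = np.log(1.0 - x) + x          # <= 0 everywhere for x <= 0

# Trapezoidal integration (written out to stay NumPy-version agnostic):
E_c = np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(w)) / (2 * np.pi)
print(E_c)
```

For this model the integral is analytic, E_c = (√3 − 2)/2 ≈ −0.134, which the numerical quadrature reproduces.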

D. Magnetic response
The LR-TDDFT framework described in Sec. VI A can be generalized to include spin degrees of freedom [43,217]. Using the four-component density n^μ(r) (where μ ∈ {0, x, y, z}) as the basic variable, one can define the four-component susceptibility tensor

χ^{μν}(r, r′, t − t′) = −(i/ħ) θ(t − t′) ⟨[n̂^μ(r, t), n̂^ν(r′, t′)]⟩.   (63)

In a similar fashion to Eq. (43), the many-body χ^{μν} can be calculated from the corresponding susceptibility tensor of the Kohn-Sham system, χ^{μν}_KS. In the most general case, the Dyson equation for χ^{μν} is a matrix equation in the μ and ν indices, explicitly coupling the charge and spin degrees of freedom. However, for collinear magnetic systems, the transverse components of the susceptibility decouple from the rest in the absence of spin-orbit coupling.

Transverse magnetic susceptibility
Taking the spins to be polarized along the z-direction, the transverse magnetic susceptibility of collinear nonrelativistic systems may be expressed in terms of the spin-raising and spin-lowering density operators n̂⁺(r) = (n̂^x(r) + i n̂^y(r))/2 = ψ̂†_↑(r) ψ̂_↓(r) and n̂⁻(r) = (n̂^x(r) − i n̂^y(r))/2 = ψ̂†_↓(r) ψ̂_↑(r). In the ALDA, the Dyson equation for χ⁺⁻ then becomes a scalar one [43],

χ⁺⁻(r, r′, ω) = χ⁺⁻_KS(r, r′, ω) + ∫ dr₁ χ⁺⁻_KS(r, r₁, ω) f⁻⁺_LDA(r₁) χ⁺⁻(r₁, r′, ω),   (64)

where the transverse XC-kernel is given by f⁻⁺_LDA(r) = 2W_LDA(r)/n^z(r). The susceptibilities χ⁺⁻ and χ⁺⁻_KS are themselves defined via the Kubo formula (63), where the latter can be evaluated explicitly in complete analogy to Eq. (42):

χ⁺⁻_KS(r, r′, ω) = Σ_{n,m} (f_{n↑} − f_{m↓}) ψ*_{n↑}(r) ψ_{m↓}(r) ψ*_{m↓}(r′) ψ_{n↑}(r′) / [ħω − (ε_{m↓} − ε_{n↑}) + iη],   (65)

where n and m are composite band and k-point indices. In GPAW, the transverse magnetic susceptibility is calculated using a plane-wave basis as described in Sec. VI A 1, with the notable exception that no special care is needed to treat the optical limit, since the Hartree kernel plays no role in the Dyson equation (64). In terms of the temporal representation, it is at the time of writing only possible to do a literal evaluation of Eq. (65) at the frequencies of interest. For metals, this means that η has to be left as a finite broadening parameter, which must be chosen carefully depending on the k-point sampling; see Ref. [43].

The spectrum of transverse magnetic excitations
Based on the transverse magnetic susceptibility, one may calculate the corresponding spectral function [43], which directly governs the energy dissipation in a collinear magnet arising from induced changes in the spin projection S_z along the z-axis. In particular, one can decompose the spectrum into contributions from majority and minority spin excitations, S⁺⁻(r, r′, ω) = A⁺⁻(r, r′, ω) − A⁻⁺(r′, r, −ω), that is, into spectral functions for the excited states where S_z has been lowered or raised by one unit of spin angular momentum, respectively. Here, α iterates the system eigenstates, with α = 0 denoting the ground state.

FIG. 7. Spectrum of transverse magnetic excitations for ferromagnetic hcp-Co evaluated at q = 5q_M/6. The spectrum was calculated using 8 empty-shell bands per atom, a plane-wave cutoff of 800 eV, a (60, 60, 36) k-point mesh, and a spectral broadening of η = 50 meV. In the upper panel, the spectrum diagonal is depicted for the 1st and 2nd Brillouin zones out of the hexagonal plane, from which the acoustic and optical magnon frequencies can be respectively extracted. In the lower panel, the spectrum of majority excitations is shown. The full spectral weight A(ω) is calculated as the sum of all positive eigenvalues of S⁺⁻, the acoustic and optical mode lineshapes a₀(ω) and a₁(ω) are obtained via the two largest eigenvalues (which are significantly larger than the rest), and the Stoner spectrum is extracted as the difference.
For collinear magnetic systems, the spectrum S⁺⁻ is composed of two types of excitations: collective spin-wave excitations (referred to as magnons) and excitations in the Stoner-pair continuum (electron-hole pairs of opposite spin). Since GPAW employs a plane-wave representation of the spectrum, S⁺⁻_{GG′}(q, ω), one can directly compare the calculated output to the inelastic neutron scattering cross-section measured in experiments [218].
In particular, one can extract the magnon dispersion directly by identifying the positions of the peaks in the spectrum diagonal; see Fig. 7. In this way, GPAW allows the user to study various magnon phenomena in magnetic systems of arbitrary collinear order, such as nonanalytic dispersion effects in itinerant ferromagnets [43], correlation-driven magnetic phase transitions [219], and the emergence of distinct collective modes inside the Stoner continuum of an antiferromagnet [220].
Additionally, one can analyze the spectrum in more detail by extracting the majority and minority eigenmodes from S⁺⁻ as the positive and negative eigenvalues, respectively. In contrast to an analysis of the plane-wave diagonal, this makes it possible to completely separate the analysis of each individual magnon lineshape as well as the many-body Stoner continuum; see Fig. 7.

Liechtenstein MFT
Not only can the transverse magnetic susceptibility be used to study magnetic excitations in a literal sense, one can also use it to map the spin degrees of freedom onto a classical Heisenberg model,

H = −(1/2) Σ_{ij} Σ_{ab} J^{ab}_{ij} u_{ia} · u_{jb},   (68)

where i, j and a, b denote the indices of the Bravais lattice and the magnetic sublattice, respectively, u_{ia} being the direction of spin polarization of the given magnetic site. Based on the magnetic force theorem (MFT), the LSDA Heisenberg exchange parameters J^{ab}_{ij} can be calculated from the reactive part of the static Kohn-Sham susceptibility, χ′⁺⁻_KS [42], and the effective magnetic field B^xc(r) = δE_xc/δm(r), using the well-known Liechtenstein MFT formula [187]:

J^{ab}_{ij} = −2 ∫_{Ω_ia} dr ∫_{Ω_jb} dr′ B^xc(r) χ′⁺⁻_KS(r, r′) B^xc(r′).

Here Ω_ia denotes the site volume, which effectively defines the Heisenberg model (68).
Using GPAW's plane-wave representation of χ′⁺⁻_KS, one can directly compute the lattice-Fourier-transformed exchange parameters J̃^{ab}(q) [42], where the right-hand side is written in a plane-wave basis and K^a denotes the sublattice site kernel. Since an a priori definition of the magnetic site volumes does not exist, GPAW supplies functionality to calculate exchange parameters based on spherical, cylindrical, and/or parallelepipedic site configurations of variable size. Once the exchange parameters J̃^{ab}(q) have been calculated, it is straightforward to compute the magnon dispersion within the classical Heisenberg model using linear spin-wave theory and to estimate thermal quantities such as the Curie temperature; see e.g. [42]. In Fig. 8 the MFT magnon dispersion of hcp-Co is compared to the majority magnon spectrum calculated within LR-TDDFT. For Co, the two are in excellent agreement, except for the dispersion of the optical magnon mode along the K-M-Γ high-symmetry path, where MFT underestimates the magnon frequency and misses the fine structure of the spectrum. The fine structure of the ALDA spectrum appears due to the overlap between the magnon mode and the Stoner continuum. This gives rise to so-called Kohn anomalies (nonanalytic points in the magnon dispersion), which are a trademark of itinerant electron magnetism. Since itinerancy is largely ignored in a localized spin model such as Eq. (68), one cannot generally expect to capture such effects.
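The final step, from exchange parameters to a magnon dispersion, can be sketched for a toy 1D nearest-neighbour ferromagnet with a single magnetic sublattice. One common linear spin-wave convention for the model H = −Σ_ij J_ij u_i·u_j with magnetic moment M is ω(q) = (2/M)[J(0) − J(q)]; all parameters below are hypothetical, not hcp-Co values.

```python
import numpy as np

# Linear spin-wave dispersion from Heisenberg exchange parameters for a
# toy 1D nearest-neighbour ferromagnet (hypothetical parameters).
# Convention: omega(q) = (2/M) * (J(0) - J(q)),
#             J(q) = sum_j J_0j * exp(-i q . R_j).
J = 5.0e-3   # nearest-neighbour exchange (eV), hypothetical
M = 2.2      # magnetic moment (Bohr magnetons), hypothetical
a = 1.0      # lattice constant

q = np.linspace(0.0, np.pi / a, 101)
Jq = 2 * J * np.cos(q * a)          # two neighbours, at +a and -a
omega = (2.0 / M) * (Jq[0] - Jq)    # magnon dispersion; omega(0) = 0

print(omega[0], omega[-1])
```

The Goldstone mode ω(0) = 0 comes out by construction, and the zone-boundary frequency is 8J/M in this convention.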

E. GW approximation
GPAW supports standard G₀W₀ quasiparticle (QP) calculations based on a first-order perturbative treatment of the linearized QP equation [9],

E^QP_{nkσ} = ε_{nkσ} + Z_{nkσ} Re⟨ψ_{nkσ}| Σ^GW(ε_{nkσ}) + v_x − v_xc |ψ_{nkσ}⟩,

where ε_{nkσ} and ψ_{nkσ} are Kohn-Sham eigenvalues and wave functions, and v_xc and v_x are the local XC-potential and the nonlocal exchange potential, respectively. Σ^GW is the (dynamical part of the) GW self-energy, whose frequency dependence is accounted for to first order by the renormalization factor Z_{nkσ} = [1 − Re Σ′^GW(ε_{nkσ})]⁻¹. As indicated by the σ index, spin-polarized G₀W₀ calculations are supported.
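The linearized QP correction can be illustrated with a model self-energy matrix element (all numbers hypothetical, the linear Σ(ω) is purely illustrative): the renormalization factor follows from the slope of Σ at the Kohn-Sham eigenvalue, here obtained by central finite differences.

```python
# Sketch of the linearized G0W0 quasiparticle correction with a model,
# frequency-dependent self-energy (hypothetical matrix elements, in eV):
#   Z = 1 / (1 - dSigma/dw),  E_qp = eps + Z * Re[Sigma(eps) + v_x - v_xc].
eps = -5.2        # Kohn-Sham eigenvalue
v_xc = -12.0      # local XC-potential matrix element
v_x = -14.5       # nonlocal exchange matrix element

def sigma(w):
    """Model dynamical self-energy, linear near eps (illustrative only)."""
    return 1.8 + 0.25 * (w - eps)

h = 1e-4                                        # finite-difference step
dsig = (sigma(eps + h) - sigma(eps - h)) / (2 * h)
Z = 1.0 / (1.0 - dsig)
E_qp = eps + Z * (sigma(eps) + v_x - v_xc)

print(Z, E_qp)
```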
The GW self-energy is calculated in a plane-wave basis using full frequency integration along the real axis to evaluate the convolution between G and W [38]. Compared to alternative schemes employing contour-deformation techniques or analytical continuation [221-223], this approach is time-consuming but numerically accurate and can provide the full spectral function. A highly efficient and accurate evaluation of the self-energy based on a multipole expansion of the screened interaction W [224] is currently being implemented, and a GPU version of the full GW code is under development.
An important technical issue concerns the treatment of the head and wings of W (G = 0 and/or G′ = 0, respectively) in the q → 0 limit. The divergence of the Coulomb interaction appears both in the evaluation of the inverse dielectric matrix ε⁻¹_{GG′} and in the subsequent evaluation of the screened interaction

W_{GG′}(q, ω) = ε⁻¹_{GG′}(q, ω) v_c(q + G′).

For 3D bulk crystals, GPAW obtains these components by evaluating ε⁻¹_{GG′} on a dense k-grid centered at k = 0, while v_{c,GG′} can be integrated numerically or analytically around the Γ-point.
For low-dimensional structures, in particular atomically thin 2D semiconductors, GPAW can use a truncated Coulomb interaction to avoid interactions between periodic images when evaluating W [206]. It has been shown that the use of a truncated Coulomb kernel leads to slower k-point convergence [207]. To mitigate this drawback, a special 2D treatment of W(q) at q = 0 can be applied, which significantly improves the k-point convergence [225]. A detailed account of the GW implementation in GPAW can be found in Ref. [38].
Figure 9 shows two matrix elements of the dynamical GW self-energy for the valence- and conduction-band states at the Γ point. As can be seen, the agreement with the corresponding quantities obtained with the Yambo GW code [226] is striking.

F. Bethe-Salpeter Equation (BSE)
In addition to the LR-TDDFT approach discussed in Sec. VI A, the interacting response function may be approximated by solving the Bethe-Salpeter equation (BSE) [11]. In particular, for a given wave vector q, one may obtain the two-particle excitations by diagonalizing the BSE Hamiltonian, Eq. (73), where ε_{km} are the Kohn-Sham eigenvalues and f_{km} the associated occupation numbers. The kernel is defined in terms of the static screened Coulomb interaction Ŵ = ε⁻¹v_c. The matrix elements of the kernel are evaluated in a plane-wave basis, where they are easily expressed in terms of the pair densities (48) and the reciprocal-space representation of the dielectric matrix (54).
In the Tamm-Dancoff approximation, one only includes states with ε_{m₁k₂+q} − ε_{m₂k₂} > 0 in Eq. (73), and the BSE Hamiltonian becomes Hermitian [11]. The interacting retarded response function may then be constructed from the eigenvalues E_λ(q) and eigenvectors A^λ_{m₁m₂k}(q) of the Hamiltonian (73). In GPAW, the construction of the BSE Hamiltonian proceeds in two steps [39]. First, the static screened interaction is calculated at all inequivalent q-points in a plane-wave basis; the kernel is then subsequently expressed in a basis of two-particle KS states. The first step is efficiently parallelized over either states or k-points, and the second step is parallelized over pair densities. The Hamiltonian elements are thus distributed over all CPUs, and the diagonalization is carried out using ScaLAPACK such that the full Hamiltonian is never collected on a single CPU. The dimension of the BSE Hamiltonian, and hence the memory requirements, are therefore only limited by the number of CPUs used for the calculation. We note that the implementation is not limited to the Tamm-Dancoff approximation, but calculations become more demanding without it. The response function may be calculated for spin-paired as well as spin-polarized systems, and spin-orbit coupling can be included non-self-consistently [45,227]. In low-dimensional systems it is important to eliminate the spurious screening from periodic images of the structure, which is accomplished with the truncated Coulomb interaction of Ref. [206].
The most important application of the BSE is arguably the calculation of optical absorption spectra of solids, where the BSE provides an accurate account of the excitonic effects that are not captured by semi-local approximations for K_Hxc (44). In 2D systems, the excitonic effects are particularly pronounced due to inefficient screening [207,227], and in Fig. 10 we show the 2D polarizability of WS₂ calculated from the BSE with q = 0. Comparing with Fig. 3, the absorption edge is expected to be located at the K point, where spin-orbit coupling splits the highest valence band by 0.45 eV. This splitting shows up as two excitonic peaks below the band gap, which are interpreted as distinct excitons originating from the highest and next-highest valence bands (the splitting of the lowest conduction band is negligible in this regard). For comparison, we also show the RPA polarizability obtained with the BSE module by neglecting the screened interaction in the kernel; this shows the expected absorption edge at the band gap. It yields identical results to the Dyson-equation approach of Sec. VI A, but has the advantage that the eigenvalues and weights of Eq. (76) are obtained directly, such that the artificial broadening η may be varied without additional computational cost. The eigenstate decomposition also allows one to access "dark states", and the BSE calculation reveals two (one for each valley) triplet-like excitons situated 70 meV below the lowest bright exciton in Fig. 10.
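The advantage of the eigenstate representation can be sketched with a small Hermitian toy Hamiltonian (hypothetical matrix and oscillator strengths, not a real BSE kernel): one diagonalization yields energies and weights, after which the spectrum can be rebuilt for any broadening η at negligible cost.

```python
import numpy as np

# Toy eigenstate representation of a BSE-like response: diagonalize a
# small Hermitian two-particle Hamiltonian once, then evaluate the
# spectrum for any eta (hypothetical matrix, not a real BSE kernel).
H = np.array([[2.0, 0.3, 0.1],
              [0.3, 2.4, 0.2],
              [0.1, 0.2, 3.1]])
weights = np.array([1.0, 0.5, 0.1])   # toy oscillator strengths of KS pairs

E, A = np.linalg.eigh(H)              # excitation energies and eigenvectors

def spectrum(w, eta):
    """Lorentzian-broadened spectrum built from the eigenpairs."""
    f = np.abs(A.T @ weights)**2      # weight of each two-particle eigenstate
    return np.sum(f * eta / ((w - E)**2 + eta**2))

s_broad = spectrum(2.0, 0.2)          # eta can be varied freely ...
s_sharp = spectrum(2.0, 0.02)         # ... without re-diagonalizing H
print(s_broad, s_sharp)
```

An eigenstate with vanishing weight f would be a "dark state" in this picture: it contributes no spectral weight yet is still present in the eigenvalue spectrum.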
In addition to optical properties, GPAW allows for solving the BSE at finite q, which can be used to obtain plasmon dispersion relations from the EELS (57) and magnon dispersions from the transverse magnetic susceptibility of Sec. VI D [228].

G. Electron-phonon coupling
The electron-phonon coupling is the origin of several important materials properties, ranging from electrical and thermal conductivity to superconductivity. In addition, it provides access to the deformation potential, which can be used to obtain transport properties for electrons and holes in solids [229].

FIG. 9. The real (Re) and imaginary (Im) parts of the self-energy matrix elements at the Γ point for valence (top) and conduction (bottom) bands evaluated with Yambo and GPAW. Both codes use full frequency integration with a broadening of 0.1 eV. Yambo uses norm-conserving pseudopotentials, and GPAW its standard PAW setup. The k-point grid was 12×12×12, the plane-wave cutoff was 200 eV, and the number of bands was 200 for both codes. The results are virtually indistinguishable.
The first-order electron-phonon coupling matrix g^ν_{mn}(k, q) measures the strength of the coupling between a phonon branch ν with wave vector q and frequency ω_ν and the electronic states m(k + q) and n(k) [230,231]:

g^ν_{mn}(k, q) = (ħ/2m₀ω_ν)^{1/2} M^ν_{mn}(k, q),

with

M^ν_{mn}(k, q) = ⟨ψ_{m,k+q}| e_ν · ∇_u v_KS(r) |ψ_{n,k}⟩.

Here m₀ is the sum of the masses of all the atoms in the unit cell, ∇_u denotes the gradient with respect to atomic displacements, and e_ν projects the gradient onto the direction of the phonon displacements. In the case of the three translational modes at |q| = 0, the matrix elements g^ν_{mn} vanish as a consequence of the acoustic sum rule [231].
In GPAW, ∇_u v_KS(r) is determined using a finite-difference method with a supercell description of the system. This step can be performed in any of the wave-function representations available in GPAW. The derivative is then projected onto a set of atomic orbitals φ_{NM} from an LCAO basis, where N denotes the cell index and M the orbital index. The Fourier transform from the Bloch to the real-space representation makes it possible to compute M^ν_{mn} for arbitrary q. Finally, the electron-phonon coupling matrix is obtained by projecting the matrix corresponding to the supercell onto the primitive-unit-cell bands m, n and phonon modes ν, where C_{nM} are the LCAO coefficients and u_{qν} are mass-scaled phonon displacement vectors.
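The finite-difference step can be sketched on a 1D model potential (not GPAW's v_KS): displacing the atom by ±h and forming a central difference yields the derivative of the potential with respect to the atomic displacement, which can be checked against the analytic result.

```python
import numpy as np

# Sketch of the finite-difference step behind the electron-phonon matrix
# elements: grad_u v ~ [v(u + h) - v(u - h)] / (2h) on a real-space grid.
# The Gaussian "atomic potential" below is a model, not GPAW's v_KS.
r = np.linspace(-5.0, 5.0, 401)          # 1D real-space grid

def v(u):
    """Model atomic potential centred at displacement u."""
    return -np.exp(-(r - u)**2)

h = 1e-3                                 # finite displacement
dv_fd = (v(h) - v(-h)) / (2 * h)         # central finite difference
dv_exact = -2 * r * np.exp(-r**2)        # analytic d v / d u at u = 0

err = np.max(np.abs(dv_fd - dv_exact))
print(err)                               # O(h^2) finite-difference error
```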

H. Raman spectrum
The Raman effect describes inelastic light scattering in which vibrational modes are excited within the material. Resonant and non-resonant Raman spectra of finite systems such as molecules can be calculated in various approximations [232] using the corresponding interfaces in ASE. The Stokes Raman intensity is then expressed in terms of the Raman tensor R^ν_{αβ}, where ν denotes a phonon mode at q = 0 with frequency ω_ν and n_ν is the corresponding Bose-Einstein occupation. Furthermore, u^α_in and u^β_out are the polarization vectors of the incoming and outgoing light.
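Under one common convention, each mode contributes a Stokes intensity proportional to (n_ν + 1)/ω_ν times the squared projection of the Raman tensor onto the light polarizations. The sketch below uses hypothetical Raman tensors and phonon energies (not MoS₂ data) to evaluate this prefactor:

```python
import numpy as np

# Sketch of the Stokes intensity prefactor (one common convention):
#   I_nu  ~  (n_nu + 1) / omega_nu * |u_in . R_nu . u_out|^2,
# with n_nu the Bose-Einstein occupation. Toy tensors, hypothetical numbers.
kT = 0.0257                                   # room temperature (eV)
omega = np.array([0.023, 0.048])              # phonon energies (eV), toy
R = np.array([np.diag([1.0, 1.0, 0.2]),       # toy Raman tensor, mode 0
              np.diag([0.5, -0.5, 0.0])])     # toy Raman tensor, mode 1

u_in = np.array([1.0, 0.0, 0.0])              # incoming polarization
u_out = np.array([1.0, 0.0, 0.0])             # outgoing polarization

n = 1.0 / (np.exp(omega / kT) - 1.0)          # Bose-Einstein occupations
I = (n + 1.0) / omega * np.abs(u_in @ R @ u_out)**2

print(I)   # relative intensities of the two modes in this geometry
```

Changing u_in and u_out selects different tensor components, which is the mechanism behind the polarization-resolved spectra shown in Fig. 11.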
The predominant approach for calculating R^ν_{αβ} involves the Kramers-Heisenberg-Dirac (KHD) method. Within the KHD framework, the Raman tensor is determined by taking the derivative (using a finite-difference method) of the electric polarizability with respect to the vibrational normal modes. Alternatively, one can employ time-dependent third-order perturbation theory to compute Raman tensors. The two approaches are equivalent when local-field effects are negligible [56]. However, each approach comes with its own set of advantages and drawbacks. The KHD method is computationally more efficient but is limited to first-order Raman processes. The perturbative approach can be extended to higher-order Raman processes involving multiple phonons, but it is more computationally demanding, necessitating a greater number of bands and a finer k-mesh to achieve convergence. The perturbative approach has been implemented in GPAW and is elaborated below, while the KHD method has been implemented in the ASR package [22], utilizing GPAW as the computational backend.
In the perturbative approach, the Raman tensor R^ν_{αβ} is given by a sum of terms [56,233], where the first term is referred to as the resonant part and the remaining terms represent different time orderings of the interaction in terms of Feynman diagrams. The remaining ingredients are matrix elements of the light-matter interaction between electronic bands m and n (with transition energy ε_{nm} = E_n − E_m) in polarization direction α, and the electron-phonon coupling strength M^ν_{nm} in the optical limit |q| = 0, as defined in Eq. (82).

FIG. 11. Polarization-resolved Raman spectrum of bulk MoS₂ in the 2H phase at an excitation wavelength of 488 nm. Phonons and potential changes were computed using a 700 eV plane-wave cutoff and a 2 × 2 × 2 k-point mesh in a 3 × 3 × 2 supercell. Each peak is labeled according to its irreducible representation.
Fig. 11 shows the polarization-resolved Raman spectrum of bulk MoS2 in the 2H phase, computed with a laser wavelength of 488 nm. This example uses only the resonant term, as the other contributions are small in this case. We have compared the calculated spectra with those from the ASR package and observed a high level of agreement for several materials, for example MoS2. The results obtained from both methods closely align in terms of peak positions and dominant peaks, and minor disagreements between the two can be attributed to differences in implementation details and the distinct approximations employed by each. Specifically, within the ASR package, we utilized the phonopy package [234] to compute phonon frequencies and eigenvectors, whereas in GPAW, we computed phonon frequencies and eigenvectors directly using ASE's phonon module. Furthermore, in the ASR implementation, we rigorously enforced the symmetry of the polarizability tensor; the ASR results therefore typically exhibit a more accurate adherence to the required symmetry of the Raman tensor than the GPAW implementation.

I. Quadratic optical response functions
The nonlinear optical response of materials can be obtained by going beyond first-order perturbation theory. Presently, the GPAW implementation is restricted to the second-order response within the dipole approximation and without inclusion of local field effects. We apply the independent-particle approximation, which cannot capture collective behavior such as excitonic effects [57]. A spatially homogeneous incident electric field can be written in terms of its Fourier components as E(t) = Σ_α Σ_ω1 E_α(ω1) e_α e^{−iω1t}, where ω1 runs over positive and negative frequencies, e_α denotes the unit vector along the α-direction, and E_α(ω1) is the electric field at frequency ω1. The induced quadratic polarization density P^(2)_γ(t) can then be expressed as P^(2)_γ(t) = ε0 Σ_αβ Σ_{ω1,ω2} χ^(2)_γαβ(ω1, ω2) E_α(ω1) E_β(ω2) e^{−i(ω1+ω2)t}, where χ^(2)_γαβ is the rank-3 quadratic susceptibility tensor.
Due to intrinsic permutation symmetry, i.e. χ^(2)_γαβ(ω1, ω2) = χ^(2)_γβα(ω2, ω1), the susceptibility has at most 18 independent elements, which may be further reduced by the Neumann principle and point-group symmetries [235]. We note that the corresponding quadratic conductivity tensor is readily derived from χ^(2) through the relationship between current density and polarization density [236].
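A small sketch of the counting argument: for a monochromatic process (ω1 = ω2, as in SHG), intrinsic permutation symmetry makes the tensor symmetric in its last two indices, leaving 3 × 6 = 18 independent elements out of 27. Further point-group reductions are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
chi = rng.standard_normal((3, 3, 3))          # generic rank-3 tensor

# Intrinsic permutation symmetry for omega1 == omega2:
# chi_{gamma,alpha,beta} = chi_{gamma,beta,alpha}
chi_sym = 0.5 * (chi + chi.transpose(0, 2, 1))
assert np.allclose(chi_sym, chi_sym.transpose(0, 2, 1))

# Independent elements: 3 choices of gamma x 6 unordered (alpha, beta) pairs
independent = [(g, a, b) for g in range(3)
               for a in range(3) for b in range(a, 3)]
print(len(independent))  # 18
```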
Among the various response functions that can be calculated from χ^(2)_γαβ(ω1, ω2), we have implemented second-harmonic generation (SHG) and the shift-current tensor. The implementation currently requires time-reversal symmetry, which limits the application to non-magnetic systems. For SHG, the susceptibility tensor is separated into a pure interband term χ^(2e)_γαβ and a mixed term. Here, C0 ≡ e³/(2ε0 V), where V is the crystal volume, Δ^α_mn ≡ (p^α_mm − p^α_nn)/m denotes the velocity difference between bands n and m, and r^α_nm is the interband (n ≠ m) position matrix element, obtained from i m r^α_nm = ℏ p^α_nm/ε_nm. All energies, occupations and matrix elements in the preceding expressions depend on the k-vector. Also, the summation over k implies an integral over the first BZ, i.e. (2π)³/V Σ_k → ∫_BZ d³k. The primed summation signs indicate omission of terms with two or more identical indices. Finally, the generalized derivative r^β_nm;α (for n ≠ m) is evaluated from the sum rule of Ref. [237], in which infinite sums are substituted with finite sums over a limited, yet sizable, set of bands. It is important to emphasize that both sides of the sum rule depend on the k-vector, and that the summation on the right-hand side pertains exclusively to bands. Another implementation of the quadratic susceptibility tensor, in the velocity gauge, is also available in the code but is not documented here. For sufficiently many bands, the results of the two implementations are identical [236].
Regarding the shift current, where a DC current is induced in response to an incident AC field, one needs to compute the quadratic conductivity tensor σ^(2). In practice, the delta function is replaced by a Lorentzian with a finite broadening η. To avoid numerical instabilities, the implementation of Eqs. (87)-(90) uses tolerances, such that terms are neglected if the associated energy differences or differences in occupation (Fermi) factors are smaller than the tolerance. The default values are 10⁻⁶ eV and 10⁻⁴ for the energy and Fermi-level differences, respectively.
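The Lorentzian replacement and the tolerance screening can be sketched as follows. The tolerance values are those quoted above; the skip logic is a schematic illustration of how near-degenerate terms would be filtered, not the actual GPAW code path.

```python
import numpy as np

def delta_lorentzian(x, eta):
    """Normalized Lorentzian used in place of the delta function."""
    return (eta / np.pi) / (x**2 + eta**2)

# The broadened delta integrates to ~1 and sharpens as eta decreases
x = np.linspace(-50.0, 50.0, 400001)
dx = x[1] - x[0]
for eta in (0.5, 0.1, 0.02):
    print(eta, np.sum(delta_lorentzian(x, eta)) * dx)

# Schematic screening of near-degenerate terms (tolerances from the text)
tol_energy, tol_occ = 1e-6, 1e-4   # eV and occupation difference
de, df = 1e-8, 0.3
neglect = abs(de) < tol_energy or abs(df) < tol_occ
print(neglect)  # True: vanishing energy denominator, term dropped
```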

VII. REAL-TIME TDDFT
The real-time TDDFT (RT-TDDFT) scheme, also known as time-propagation TDDFT, is implemented in the FD [34] and LCAO [52] modes. It requires non-periodic boundary conditions, but it is not restricted to the linear regime and can be applied to model molecules in strong fields. The method may be combined with hybrid quantum-classical modeling to simulate the dynamical interaction between molecules and plasmon resonances at metal surfaces [238]. LCAO-RT-TDDFT is the more recent implementation, supporting versatile analyses and enabling modeling of large systems thanks to its efficiency [52][53][54][239][240][241][242][243][244][245]. We focus on the capabilities of the LCAO version in this section, but some of the described functionalities are also available in FD mode.
The time-dependent KS equation in the PAW formalism is given by Eq. (91), where the Kohn-Sham Hamiltonian Ĥ[n(r)] depends implicitly on time through the time-dependent density and v(t) is an explicitly time-dependent external potential. We have additionally assumed that the overlap matrix Ŝ is independent of time, i.e. there are no ion dynamics.
Starting from the ground state, Eq. (91) is propagated forward numerically. After each step, a new density is computed and Ĥ[n(r)] is updated accordingly. The user can freely define the external potential; the implemented standard potentials include the delta kick v(t) = xδ(t), where x is the dipole operator in the direction x, and a Gaussian pulse.
During the propagation, different time-dependent variables can be recorded, and after the propagation they can be post-processed into quantities of physical and chemical interest. As a basic example, the time-dependent dipole moment recorded for a delta-kick perturbation can be Fourier-transformed to yield the photoabsorption spectrum [246]. Observables are recorded by attaching observers to the calculation; implemented observers include writers for the dipole moment, the magnetic moment, the KS density matrix in frequency space, and the wave functions.
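The Fourier-transform step can be sketched in post-processing (this is an illustrative routine, not GPAW's own; the damping η and the prefactor 2ω/π are assumptions following the standard linear-response expression for a delta-kick perturbation):

```python
import numpy as np

def absorption_spectrum(t, dmu, kick, omega, eta=0.1):
    """Fold the induced dipole dmu(t), recorded after a delta kick of
    strength `kick`, into a spectrum S(w) ~ (2 w / pi) Im alpha(w)."""
    dt = t[1] - t[0]
    damp = np.exp(-eta * t)                 # artificial lifetime broadening
    alpha = np.array([np.sum(dmu * damp * np.exp(1j * w * t)) * dt
                      for w in omega]) / kick
    return 2.0 * omega / np.pi * alpha.imag

# A dipole oscillating at a single frequency gives one Lorentzian peak:
t = np.linspace(0.0, 100.0, 20001)
dmu = 1e-5 * np.sin(3.0 * t)
omega = np.linspace(0.5, 6.0, 1101)
S = absorption_spectrum(t, dmu, kick=1e-5, omega=omega)
print(omega[np.argmax(S)])                  # peak close to 3.0
```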
RT-TDDFT calculations can be started from the ground state or continued from the last state of a previous time propagation, and the time-limiter feature allows one to limit jobs to a predefined amount of wall time. Together with the continuation capability, this facilitates time propagation in short chunks, efficiently using shared high-performance resources.
In the LCAO-RT-TDDFT implementation, Eq. (91) is cast into a matrix equation and solved with ScaLAPACK [52]. The intermediate step of updating the Hartree and XC potentials is performed on the real-space grid.

A. Kohn-Sham decomposition
The time-dependent KS density matrix can be written as ρ(r, r′, t) = Σ_m f_m ψ_m(r, t) ψ*_m(r′, t), where ψ_m(r, t) are the time-dependent KS orbitals with ground-state occupation factors f_m. The KS density matrix is a central quantity enabling the computation of observables and may be evaluated efficiently in the LCAO mode [54].
The Fourier transform of the induced density matrix can be built on the fly during time propagation through the density-matrix observer. Details on the implementation are described in Ref. [54]. The KS density matrix in frequency space is related to the Casida eigenvectors and gives similar information as the solution of the Casida equation [54]. Observables such as the polarizability can be decomposed into a sum over the electron-hole part of the KS density matrix ρ_nn′, where f_n > f_n′. This enables illustrative analyses, e.g., by visualizing ρ_nn′(ω) on energy axes as a transition contribution map [247], from which the nature of the localized surface plasmon resonance can be understood (Fig. 12; see Ref. [54] for a detailed discussion).

B. Hot-carrier analysis
The KS density matrix is a practical starting point for analyzing hot-carrier generation in plasmonic nanostructures. In the regime of weak perturbations, the KS density matrix resulting from arbitrary pulses can be obtained from delta-kick calculations by a post-processing routine [243,244]. By decomposing the matrix into different spatial and energetic contributions, hot-carrier generation in nanostructures [243,245] and across nanoparticle-semiconductor [241] and nanoparticle-molecule [240,244] interfaces can be studied. Computational codes and workflows for such hot-carrier analyses are provided in Refs. [248,249].

C. Circular dichroism for molecules
Electronic circular dichroism (ECD) is a powerful spectroscopic method for investigating chiral properties at the molecular level. The quantity that characterizes the ECD is the rotatory strength R(ω), which is defined through the magnetic dipole moment m(ω); the index α enumerates Cartesian directions, the superscript (α) in parenthesis indicates the δ-kick direction, and κ is the strength of the kick. To resolve R(ω), one needs to perform the δ-kick in all three Cartesian directions using a perturbing electric field of strength κ applied as a δ-pulse along each direction α.
The frequency components of the magnetic dipole moment m^(α)_α(ω) are calculated by Fourier transforming m^(α)_α(t), which is recorded during the propagation. The time-dependent magnetic dipole moment is obtained as the expectation value of the operator m = −(i/2c) r × ∇, where f_n is the occupation number of KS orbital n and ψ_n(r, t) is the time-evolved KS state. The current GPAW implementation supports both the FD and LCAO modes, and the computational efficiency of the LCAO mode enables calculation of the ECD of nanoscale metal-organic clusters. More details on the implementation can be found in Ref. [55].

D. Radiation reaction potential
Plasmonic and collective molecular excitations are strongly susceptible to any kind of optical interaction. Induced currents couple via Maxwell's equations to the optical environment and result in radiative losses, i.e., decay towards the ground state. It is possible to solve the Maxwell problem formally by obtaining the Green's tensor G⊥(ω). The dipolar interaction between the electric field and the electronic dipole can be absorbed into a local potential v_rr(r, t) suitable for Kohn-Sham TDDFT, where R(t) is the total electronic dipole moment. A detailed discussion can be found in Ref. [250].
For many simple structures, such as free space, one-dimensional waveguides, or dielectric spheres, G⊥(ω) is analytically known and radiative losses can then be included in TDDFT without additional computational cost. The tutorials on the GPAW web page [61] include an example for one-dimensional waveguides, for which the user can specify the cross-sectional area and the polarization of the propagating modes. Extending the functionality of the radiation-reaction potential to 3D free space and to the collective interaction of large ensembles in Fabry-Pérot cavities from first principles [251] is essential for the understanding of polaritonic chemistry. This functionality is currently under development.

E. Ehrenfest dynamics
Molecular dynamics (MD) simulations usually rely on the Born-Oppenheimer approximation, where the electronic system is assumed to react so much faster than the ionic system that it reaches its ground state at each time step. Thus, forces for the dynamics are calculated from the DFT ground-state density. While this approximation is sufficiently accurate in most situations, there are cases where the explicit dynamics of the electronic system can affect the molecular dynamics, or where the movement of the atoms can affect averaged spectral properties. These cases can be handled using so-called Ehrenfest dynamics, i.e. time-dependent density-functional-theory molecular dynamics (TDDFT/MD).
Ehrenfest dynamics is implemented in the FD mode [252]. A description of the theory and a tutorial are available on the GPAW web page [61]. This functionality has been used to model the electronic stopping of ions including core-electron excitations [253], study charge transfer at hybrid interfaces in the presence of water [254], simulate the coherent diffraction of neutral atom beams from graphene [255], model the dependence of carbon bond breaking under Ar⁺-ion irradiation on sp hybridization [256], and reveal charge-transfer dynamics at electrified sulfur cathodes [257]. An LCAO implementation, inspired by recent work in the Siesta code [258,259], is currently under development.

A. Improved virtual orbitals
The linear-response TDDFT approach generally provides reasonably accurate excitation energies for low-lying valence excited states, where the orbitals associated with the holes and excited electrons overlap significantly. However, it tends to fail for excitations involving spatial rearrangement of the electrons, such as charge-transfer [113,260], Rydberg [261] and doubly excited states [262].
Some of these problems can be alleviated by using range-separated functionals (see section III C 4). However, these functionals come with a significantly increased computational cost due to the evaluation of exchange integrals. Moreover, due to the missing cancellation of Coulomb and exchange terms for canonical unoccupied orbitals within Hartree-Fock theory, one obtains spurious unoccupied orbitals. This leads to slow convergence of linear-response TDDFT calculations with respect to the number of unoccupied orbitals when hybrid and range-separated functionals are used [122,123].
Substantial improvement in convergence with respect to unoccupied orbitals can be obtained using improved virtual orbitals as devised by Huzinaga and Arnau [263,264]. In this approach, a modified Fock operator is used for the unoccupied orbitals, which mimics the interaction between a hole, arbitrarily chosen among the occupied ground-state orbitals, and the excited electron. This leads to faster convergence in linear-response TDDFT calculations with hybrid and range-separated functionals, and it also makes it possible to evaluate excited-state properties. For example, the energetics of long-range charge transfer can be obtained by means of a ground-state calculation, because the difference between the energy of an improved virtual orbital and a hole tends to approximate the excitation energy. The improved-virtual-orbitals approach is available in GPAW, and details on the implementation are described in Ref. [123].

B. Variational excited-state calculations
GPAW also offers the possibility to perform excited-state calculations using an alternative time-independent density-functional approach [265] (sometimes referred to as the "∆SCF" method), which does not suffer from the limitations of linear-response TDDFT mentioned in the previous section. The method involves variational optimization of the orbitals corresponding to a specific excited state by optimizing the electronic energy to a stationary point other than the ground state. The computational effort is similar to that of ground-state calculations, and the variational optimization guarantees that the Hellmann-Feynman theorem is fulfilled. Therefore, all the ground-state machinery available in GPAW to evaluate atomic forces can be used for geometry optimization and for simulating the dynamics of atoms in the excited state. Furthermore, coupling this time-independent, variational approach for excited-state calculations with external MM potentials (see section IV) does not involve additional implementation effort compared to ground-state calculations and provides a means for performing excited-state QM/MM molecular dynamics simulations that include the state-specific response of a solvent, i.e. the response due to changes in the electron density of the solute.
Variationally optimized excited states correspond to single Slater determinants of optimal orbitals with a non-aufbau occupation and are typically saddle points on the electronic energy surface. Hence, variational calculations of excited states are prone to collapsing to lower-energy solutions which preserve the symmetry of the initial guess. A simple maximum overlap method (MOM) [266,267] is available in GPAW to address this problem. At each SCF step, the MOM occupies those orbitals that overlap most with the orbitals of a non-aufbau initial guess, usually obtained from a ground-state calculation. The MOM, however, does not guarantee that variational collapse is avoided, and convergence issues are common when using SCF eigensolvers with density-mixing schemes developed for ground-state calculations.

Direct orbital optimization
To alleviate the issues leading to variational collapse and achieve more robust convergence to excited-state solutions, GPAW contains two alternative strategies that are more reliable than conventional SCF eigensolvers with the MOM. They are based on direct optimization of the orbitals and use saddle-point search algorithms akin to those for transition-state searches on potential energy surfaces of atomic rearrangements. These approaches also facilitate variational excited-state calculations with non-unitary-invariant functionals, such as self-interaction-corrected functionals (see section III C 5).
The first of these methods is a direct orbital optimization approach supplemented with the MOM (DO-MOM). This method is an extension of the direct-minimization approach using the exponential transformation illustrated in section III B 5, where the search targets a generic stationary point of E[Ψ] instead of a minimum. DO-MOM is available in GPAW for LCAO [51,268], real-space grid, and plane-wave basis sets [50]. For the LCAO basis, the excited-state optimization only necessitates making the energy stationary with respect to the elements of the anti-Hermitian matrix A (the orbital rotation angles), while calculations using the real-space grid and plane-wave basis include an outer-loop minimization with respect to the reference orbitals Ψ0. The optimization in the linear space of anti-Hermitian matrices uses efficient quasi-Newton algorithms that can handle negative Hessian eigenvalues and therefore converge on saddle points. GPAW implements a novel limited-memory SR1 (L-SR1) algorithm, which has proven to be robust for calculations of excitations in molecules [50,51].
DO-MOM relies on estimating the degrees of freedom along which the energy needs to be maximized starting from an initial guess. For valence and Rydberg excitations, an initial guess consisting of ground-state canonical orbitals with non-aufbau occupation numbers, together with preconditioning by a diagonal approximation of the electronic Hessian using the orbital energies [79], can be sufficient. However, if the excitation involves significant charge transfer, large rearrangements of the energy ordering of the orbitals can occur, and DO-MOM can struggle to converge. A second direct-optimization method with generalized mode following (DO-GMF) [49], which alleviates these problems, is also implemented. In DO-GMF, the components of the energy gradient g along the modes v_i corresponding to the n lowest eigenvalues of the electronic Hessian are inverted, yielding g_mod = g − 2 Σ_{i=1}^{n} (v_i^T g) v_i, and a minimization using the modified gradient g_mod is performed by following the n modes simultaneously. This procedure guarantees convergence to an nth-order saddle point, eliminating the risk of variational collapse altogether. While it is computationally more expensive than DO-MOM due to the need for a partial Hessian diagonalization, DO-GMF is more robust. Hence, it is particularly useful in the exploration of potential energy surfaces, because it is able to follow an excited state through bond-breaking configurations, where broken-symmetry solutions appear, by targeting the solution that preserves the saddle-point order. This important advantage is exemplified by the challenging double-bond twisting in ethylene [49,133], where DO-GMF calculations of the lowest doubly excited state provide an avoided crossing with the ground state whereas other methods fail to do so.
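The mode-following step can be sketched on a model quadratic surface (the function names and the toy two-dimensional Hessian are illustrative, not GPAW internals): inverting the gradient component along the lowest Hessian mode turns descent into ascent along that mode, so plain steepest descent converges to a first-order saddle point.

```python
import numpy as np

def gmf_gradient(g, hessian, n):
    """Invert the gradient components along the n lowest Hessian
    eigenmodes: g_mod = g - 2 sum_i (v_i . g) v_i."""
    _, evecs = np.linalg.eigh(hessian)     # eigh sorts eigenvalues ascending
    g_mod = g.copy()
    for i in range(n):
        v = evecs[:, i]
        g_mod -= 2.0 * (v @ g) * v
    return g_mod

# Toy surface E = -x^2/2 + y^2 with a first-order saddle at the origin:
H = np.diag([-1.0, 2.0])
x = np.array([0.8, 0.6])
for _ in range(200):                       # steepest descent on g_mod
    g = H @ x                              # gradient of the quadratic model
    x -= 0.1 * gmf_gradient(g, H, n=1)
print(np.round(x, 6))                      # converges to the saddle [0, 0]
```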

Example applications of direct optimization
The efficiency and robustness of the direct-optimization approaches, combined with the possibility of choosing different basis-set types, make variational calculations of excited states in GPAW applicable to a great variety of systems, ranging from molecules in the gas phase or in solution to solids.
State-specific orbital relaxation enables the description of challenging excitations characterized by large density rearrangements. Figure 13 shows the error in the vertical excitation energy of a charge-transfer excitation in the twisted N-phenylpyrrole molecule [49] obtained with direct optimization in GPAW using the LDA, PBE and BLYP functionals and an sz+aug-cc-pVDZ [53] basis set, as compared to the results of linear-response TDDFT calculations with the same basis set and functionals, as well as with the hybrid functionals PBE0 and B3LYP (results from Ref. [269]). For the variational calculations, the energy of the singlet excited state is computed using the spin-purification formula E_s = 2E_m − E_t, where E_m is the energy of a spin-mixed state obtained by promoting an electron in one spin channel and E_t is the energy of the triplet state with the same character. The variational calculations underestimate the theoretical best-estimate value of the excitation energy (5.58 eV) in Ref. [269] by 0.15-0.3 eV, an error that is significantly smaller than that of linear-response TDDFT calculations with the same functionals (−2.0 eV) or with the more computationally intensive PBE0 hybrid functional (−0.85 eV) [269].
The method has also been used to simulate the photoinduced structural changes of photocatalytic metal complexes and concomitant solvation dynamics [165,181,182,270,272].
Figure 13 shows an application to the prototypical copper-complex photosensitizer [Cu(dmphen)2]+ (dmphen = 2,9-dimethyl-1,10-phenanthroline) in acetonitrile, where QM/MM molecular dynamics simulations elucidated an intricate interplay between deformation of the ligands and rearrangement of the surrounding solvent molecules following photoexcitation to a metal-to-ligand charge-transfer state [165,270].
The last example shown in Figure 13 is a calculation of the excited states of a solid-state system [273], the negatively charged nitrogen-vacancy center in diamond, which is a prototypical defect for quantum applications. The system is described with a large supercell of up to 511 atoms, and the calculations use a plane-wave basis set. In contrast to previous reports, a range of different density functionals is found to give the correct energy ordering of the excited states, with the r²SCAN functional providing the best agreement with high-level many-body quantum-embedding calculations, with an error of less than 0.06 eV [271,273]. This example shows that the direct-optimization methods in GPAW are promising tools for simulating excited states in extended systems, where alternative approaches are either computationally expensive or lack accuracy.

IX. OTHER FEATURES

A. Electric polarization
The formal polarization of bulk materials may be calculated from the modern theory of polarization [274,275] as the sum of an electronic contribution and a contribution from the nuclei, the latter being (e/V_cell) Σ_a Z_a r_a. Here the sum runs over atoms in the unit cell and Z_a is the charge of nucleus a (including core electrons), situated at position r_a. The electronic contribution can be viewed as a Brillouin-zone integral of k-space Berry phases and may be evaluated from a finite-difference version of Eq. (98) [276]. This involves the overlaps between Bloch orbitals at neighbouring k-points, which are straightforward to evaluate in the PAW formalism [46]. Eq. (97) is only defined modulo eR_i/V_cell, which follows from the arbitrary choice of unit cell for the atomic positions as well as from the choice of phases for u_kn, which can shift the Berry phase by 2π.

[Fig. 13: Left: experimental results from femtosecond X-ray scattering measurements [270] compared to the average over excited-state molecular dynamics trajectories obtained using the QM/MM electrostatic embedding scheme in GPAW [165] (see section IV), convoluted with the experimental instrument-response function [270]. Right: vertical excitation energies of the negative nitrogen-vacancy center in diamond obtained with the r²SCAN functional, compared to the results of previous calculations using an advanced quantum embedding approach [271]. The orbitals involved in the electronic transitions are visualized in the inset (C atoms grey, N atom orange).]

The change in polarization under any adiabatic deformation is, however, well defined and may be calculated as the integral of dP_F/dλ, where λ is a dimensionless variable parameterizing the adiabatic path. In particular, for ferroelectrics the spontaneous polarization P_S can be evaluated by choosing a path that deforms the structure from the polar ground state at λ = 1 to a non-polar structure at λ = 0. In Fig. 14, we show an example of this for tetragonal KNbO3, which is a well-known ferroelectric [277]. The polar structure was relaxed under the constraint of tetragonal symmetry using PBE (λ = 1) and then linearly interpolated to the inverted structure (λ = −1), passing through a centrosymmetric point at λ = 0. There are infinitely many polarization branches differing by the polarization quantum ec/V_cell (c being the lattice constant in the z-direction), and the spontaneous polarization is obtained by choosing a single branch and evaluating the difference in formal polarization at λ = 1 and λ = 0. Interestingly, the centrosymmetric point has a non-vanishing polarization given by half the polarization quantum. This is allowed due to the multi-valued nature of the formal polarization, and such a "topological polarization" in non-polar materials has been shown to yield gapless surface states [278,279]. Here, however, we merely use the topological polarization to emphasize the importance of evaluating the spontaneous polarization as the change in P_F along an adiabatic path.
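The branch-fixing step can be sketched numerically: track the formal polarization along the adiabatic path and remove jumps by the polarization quantum before taking the difference (the function name and the numbers below are made up for illustration).

```python
import numpy as np

def spontaneous_polarization(P_path, P_quantum):
    """Unwrap formal polarization values along an adiabatic path onto a
    single branch and return P(lambda=1) - P(lambda=0)."""
    P = np.array(P_path, dtype=float)
    for i in range(1, len(P)):
        # remove jumps by integer multiples of the polarization quantum
        P[i] -= np.round((P[i] - P[i - 1]) / P_quantum) * P_quantum
    return P[-1] - P[0]

# Raw values jump by one quantum between the 3rd and 4th point:
raw = [0.50, 0.60, 0.70, -0.20, -0.10]
print(spontaneous_polarization(raw, P_quantum=1.0))  # -> 0.4
```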
The expressions (97)-(100) can be applied to extract various properties of non-polar materials as well, for example the Born effective charge tensors, which yield the change in polarization resulting from small shifts in atomic positions. In GPAW, these are obtained by a simple call to a module that introduces a (user-defined) shift of all atoms in the unit cell and calculates the resulting change in polarization from Eqs. (97)-(99). The Born charges may be combined with the atomic force matrix to calculate equilibrium positions under an applied static electric field, and the lattice contribution to the dynamic polarizability can be calculated from the eigenvalues and eigenvectors of the force matrix [280].
In addition, the piezoelectric response can be obtained by calculating the change in polarization in response to an external strain [45]. The lattice contribution to the polarizability is typically orders of magnitude smaller than the electronic part, but for ferroelectrics the soft phonon modes associated with the spontaneous polarization can give rise to significant lattice polarizabilities. We exemplify this with the well-known case of 2D ferroelectric GeS [281][282][283], where we obtain a spontaneous polarization of 490 pC/m [44,284], in excellent agreement with previous calculations [283]. The lattice and electronic 2D in-plane polarizabilities are α²D_lat = 4.32 Å and α²D_el = 3.75 Å, respectively [45]. For comparison, the non-polar case of 2D MoS2 yields α²D_lat = 0.09 Å and α²D_el = 6.19 Å [45].

B. Berry phases and band topology
Topological phases such as the quantum spin Hall state and the Chern insulator depend crucially on the presence of spin-orbit coupling, and the band topology may be obtained from the evolution of k-space Berry phases across the Brillouin zone. In particular, for insulators the eigenvalues of the Berry-phase matrix of occupied states must change by an integer multiple of 2π when a component of k⊥ (the components of k orthogonal to k_i) is cycled through the Brillouin zone [285]. Here, u_n(k) are spinor Bloch states, which are typically not smooth functions of k. The evaluation of Eq. (102) thus requires the construction of a smooth gauge, and in GPAW this is handled by the parallel-transport algorithm of Ref. [286].
The method has been applied in a high-throughput search for new topological two-dimensional materials [46], and Fig. 15 shows an example of such an analysis; the Berry-phase spectrum is characteristic of a topological state and closely related to the presence of gapless edge states [285].
The eigenvalues of Eq. (102) may also be used to calculate the electronic contribution to the formal polarization, since the sum of all individual Berry phases yields the same value as obtained from Eq. (98). The present approach is, however, more involved, since it requires the construction of a smooth gauge, which is not needed in Eq. (98).
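The discretized Berry-phase construction can be illustrated on a two-band 1D model (the SSH chain, chosen here purely for illustration; it is not part of GPAW). The Wilson-loop product of overlaps between eigenvectors at neighbouring k-points gives a gauge-invariant total phase, which is π in the topological phase and 0 in the trivial one:

```python
import numpy as np

def zak_phase(hk, nk=200):
    """Discretized Berry (Zak) phase of the lowest band of a 1D Bloch
    Hamiltonian h(k), via the gauge-invariant Wilson loop."""
    ks = np.linspace(0.0, 2.0 * np.pi, nk, endpoint=False)
    u = []
    for k in ks:
        _, vecs = np.linalg.eigh(hk(k))
        u.append(vecs[:, 0])               # lowest-band eigenvector
    prod = 1.0 + 0.0j
    for i in range(nk):                    # product of neighbour overlaps
        prod *= np.vdot(u[i], u[(i + 1) % nk])
    return -np.angle(prod)

# SSH chain: topological (Zak phase pi) for t2 > t1, trivial for t1 > t2
def ssh(t1, t2):
    return lambda k: np.array([[0.0, t1 + t2 * np.exp(-1j * k)],
                               [t1 + t2 * np.exp(1j * k), 0.0]])

print(abs(zak_phase(ssh(0.5, 1.0))))  # ~pi (topological)
print(abs(zak_phase(ssh(1.0, 0.5))))  # ~0  (trivial)
```

The product is gauge invariant because the arbitrary phase of each eigenvector cancels between consecutive overlap factors, which is the same reason the finite-difference Berry-phase formula works without an explicit smooth gauge.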

C. Wannier functions
Wannier functions (WFs) provide a localized representation of the electronic states of a solid. The WFs are defined by a unitary transformation of the Bloch eigenstates that minimizes the spatial extent of the resulting orbitals. Specifically, the nth Wannier function in unit cell i is written in terms of generalized Bloch functions ψ̃_nk (superpositions of Bloch eigenstates at k).
The minimization of the spatial extent of a set of WFs {w_n(r)}, n = 1, …, N_w, is equivalent to the maximization of the spread functional Ω [288]. Here {q_α} is a set of at most six reciprocal vectors connecting a k-point to its neighbors, and W_α are the corresponding weights accounting for the shape of the unit cell [289].
The generalized Bloch functions of Eq. (103) are determined by minimizing Ω using e.g. a conjugate-gradient scheme as implemented in the ASE Wannier module. The inputs to this Wannierization algorithm are overlap matrices, where ΔS^a_ii′ are the PAW corrections from Eq. (8). From these matrices, the ASE Wannier module can be used to construct partially occupied Wannier functions [290,291], which are a generalization of maximally localized Wannier functions [286] to entangled bands and non-periodic systems. Recently, a further improvement in the robustness of the Wannierization procedure was achieved using a modified spread functional containing a penalty term proportional to the variance of the spread distribution of the WFs, which leads to a more uniform spread distribution [292].

D. Point defect calculations with hybrid functionals
Point defects play a crucial role in many applications of semiconductors [293,294]. First-principles calculations can be used to determine the atomic structure, formation energy, and charge-transition levels of point defects. It is well established that the best description of point defects in semiconductors and insulators is obtained using range-separated hybrids, such as the HSE06 xc-functional [5,295]. To illustrate the use of GPAW for point-defect calculations, we determine the formation-energy diagrams of the C_N and C_B defects in the hexagonal boron nitride (hBN) crystal with the HSE06 functional. These defects have been proposed to be responsible for the deep-level luminescence signal with a zero-phonon line (ZPL) around 4.1 eV [295,296]. The results are compared to similar results obtained with the VASP software package.
For a point defect D in charge state q, the formation energy E_f is calculated from E_f[D^q] = E_tot[D^q] − E_tot^bulk − Σ_i n_i μ_i + q E_F + E_corr. Here E_tot[D^q] and E_tot^bulk are the total energies of the crystal with the point defect in charge state q and of the neutral pristine crystal, respectively; μ_i is the chemical potential of element i, while n_i is the number of atoms added to (n_i > 0) or removed from (n_i < 0) the crystal to create the defect. E_F is the chemical potential of the electrons, i.e. the Fermi level, which is written as E_F = E_VBM + ΔE_F, where E_VBM is the valence band maximum. Finally, E_corr is a correction term which accounts for (i) the spurious electrostatic interaction between the periodic images of the defect and their interaction with the compensating homogeneous background charge, and (ii) the potential shift between the pristine and defect systems.
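A sketch of how these quantities combine into a formation-energy diagram and a charge-transition level. The function names and the example numbers are illustrative only, not taken from the hBN calculations below:

```python
def formation_energy(E_def, E_bulk, q, n_mu, E_vbm, dE_F, E_corr=0.0):
    """E_f[D^q] = E_def - E_bulk - sum_i n_i mu_i + q (E_VBM + dE_F) + E_corr.
    n_mu is a list of (n_i, mu_i) pairs for the atoms added or removed."""
    return (E_def - E_bulk - sum(n * mu for n, mu in n_mu)
            + q * (E_vbm + dE_F) + E_corr)

def transition_level(Ef_q1_at_vbm, Ef_q2_at_vbm, q1, q2):
    """Fermi level (relative to the VBM) where the formation-energy lines
    of charge states q1 and q2 cross."""
    return (Ef_q2_at_vbm - Ef_q1_at_vbm) / (q1 - q2)

# Hypothetical numbers: a +1 state at 1.0 eV and a neutral state at 4.0 eV
# (both evaluated at E_F = VBM) cross at a (+1/0) level 3.0 eV above the VBM.
print(transition_level(1.0, 4.0, 1, 0))  # -> 3.0
```

At the transition level, the two formation-energy lines E_f(q; E_F) = E_f(q; VBM) + q ΔE_F take the same value, which is the defining condition of a charge-transition level.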
For more details on the methodology of point defect calculations we refer the reader to the excellent review papers on the topic [297][298][299][300].
All calculations have been performed using the HSE06 functional with the default mixing parameter α = 0.25, a plane-wave cutoff of 800 eV, and forces converged to 0.01 eV/Å. The lattice of the hBN crystal was fixed at the experimental parameters (a = 2.50 Å and c = 6.64 Å) [301]. The band gap of the pristine crystal was determined to be 5.58 eV (using 8 × 8 × 4 k-points), in good agreement with the experimental band gap of 6.08 eV [302]. The structures of the point defects were relaxed in a 4 × 4 × 2 (128-atom) supercell using Γ-point sampling. For each defect, three different charge states (q = 1, 0, −1) were considered. The corrections (E_corr) due to image charges and potential alignment are evaluated following Freysoldt, Neugebauer and Van de Walle [303], as implemented in GPAW.
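For orientation, a setup along these lines could be expressed through GPAW's Python API roughly as sketched below. The structure file name is a placeholder and the keyword choices should be checked against the GPAW documentation; the calculator lines are shown commented out because the actual run requires an installed GPAW and considerable compute time.

```python
# Sketch of an HSE06 point-defect setup; 'CN_4x4x2.json' is a placeholder
# for a pre-built 128-atom hBN supercell containing the C_N defect.
params = dict(ecut=800.0,          # plane-wave cutoff (eV)
              xc='HSE06',          # default mixing parameter alpha = 0.25
              kpts=(1, 1, 1),      # Gamma-only sampling for the supercell
              charge=-1)           # defect charge state q

# from ase.io import read
# from ase.optimize import BFGS
# from gpaw import GPAW, PW
# atoms = read('CN_4x4x2.json')
# atoms.calc = GPAW(mode=PW(params['ecut']), xc=params['xc'],
#                   kpts=params['kpts'], charge=params['charge'],
#                   txt='CN_q-1.txt')
# BFGS(atoms).run(fmax=0.01)      # forces converged to 0.01 eV/Angstrom
```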
Figure 16 shows the defect formation energies as a function of the Fermi level for C_N and C_B under N-rich and N-poor conditions, respectively. We can see that under N-rich conditions C_B is energetically lower, whereas C_N is favorable under N-poor conditions. C_B shows a +1/0 charge transition at 3.73 eV above the VBM, whereas C_N has a 0/−1 charge transition deep inside the band gap, at 3.26 eV. We find good agreement with a similar study [295] also employing plane waves and the HSE06 functional (VASP calculations). Minor discrepancies can be attributed to the use of a different supercell size and a slightly higher fraction of non-local exchange (α = 0.31) in that work.

E. Point-group symmetry representations
GPAW allows for the automated assignment of point-group symmetry representations to pre-computed Kohn-Sham wavefunctions. This can be used for determining the wave-function symmetry representations of both molecules [304] and extended structures [305] in order to analyze, for example, symmetry-induced degeneracies of the bands and selection rules for dipole transitions.
The analysis follows directly from group theory [306], which states that the solutions to the Schrödinger equation inherit the symmetry group of the respective Hamiltonian, or essentially of the external potential set up by the atomic configuration. The representation matrices Γ are computed as

Γ_nn′(T) = ⟨ϕ_n | P(T) | ϕ_n′⟩,

where ϕ_n is a normalized wavefunction, n is the eigenstate label, and P(T) is the operation that corresponds to the transformation T of the symmetry group of the Hamiltonian. The operations P(T) include rotations that are non-trivial for the rectangular grid onto which the computed wavefunctions are projected. The wavefunction rotations are performed on the grid by cubic interpolation. The output of the analysis contains the irreducible-representation weights c_α,n for each eigenstate n, solved from

c_α,n = (1/h) Σ_T χ_α(T) Γ_nn(T),

where χ_α are the character vectors of the group (i.e. rows of the character table) and h is the order of the group.
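The character projection itself is ordinary finite-group arithmetic and can be illustrated independently of GPAW. The sketch below decomposes a reducible representation in the C2v group (all classes have size one, so the sum runs over the four operations); the example characters are a textbook case, not output from GPAW.

```python
# Decomposition of a reducible representation into irreps via the standard
# projection c_alpha = (1/h) * sum_T chi_alpha(T) * chi(T), for the C2v
# point group with operations ordered (E, C2, sigma_v, sigma_v').
C2V = {
    'A1': [1, 1, 1, 1],
    'A2': [1, 1, -1, -1],
    'B1': [1, -1, 1, -1],
    'B2': [1, -1, -1, 1],
}

def decompose(chi, table):
    h = len(chi)  # group order (every class has size one here)
    return {name: sum(ca * c for ca, c in zip(row, chi)) / h
            for name, row in table.items()}

# Characters of the representation spanned by the two O-H bond orbitals
# of a water-like molecule: (E, C2, sigma_v, sigma_v') = (2, 0, 0, 2).
weights = decompose([2, 0, 0, 2], C2V)  # A1 + B2
```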
When doing the analysis, the user needs to input the coordinates of the center of symmetry (typically the coordinates of a single atom) and the point group for which the analysis is run. It is possible to analyze only a part of the wave function by selecting a cutoff radius from the center of symmetry beyond which the wave function is neglected. This enables the investigation of the purity of the local symmetry even if the symmetry of the Hamiltonian is broken far from the center of symmetry [304,305]. To date, the point groups C2, C2v, C3v, D2d, D3h, D5, D5h, I, Ih, Oh, Td, and Th are implemented.

F. Band-structure unfolding
When studying defect formation, charge-ordered phases, or structural phase transitions, it is often necessary to perform DFT calculations on a supercell. A supercell (SC) calculation comes with the cost of having to account for many more electrons in the unit cell than in the primitive cell (PC). Besides the increased computational effort, this implies that the band structure of a SC contains more bands in a smaller Brillouin zone compared to the PC. In order to compare electronic band structures between SC and PC, it is convenient to unfold the band structure of the SC onto that of the PC.
GPAW features the possibility of performing band-structure unfolding in the real-space grid, plane-wave, and LCAO modes. The implementation allows one to unfold the SC band structure without the explicit calculation of the overlap between SC and PC wavefunctions, following the procedure described in Ref. [307]. The unfolded band structure is given in terms of the spectral function

A(k, ε) = Σ_m P_Km(k) δ(ε − ε_Km),

with k, K momenta in the PC and SC Brillouin zones, respectively, and ε_Km the SC eigenvalues obtained for momentum K and band index m. The spectral weights P_Km(k) are calculated as

P_Km(k) = Σ_G |C_Km(G + k − K)|²,

where C_Km are the Fourier coefficients of the eigenstate |Km⟩ and {G} is the subset of SC reciprocal lattice vectors that match the reciprocal lattice vectors of the PC. A more detailed explanation and technical details on how to perform a band-structure unfolding can be found on the GPAW web page [61].
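The weight P_Km(k) is simply a sum of squared plane-wave coefficients over a subset of G-vectors. The toy function below illustrates this for a 1D supercell of N primitive cells, where the matching G-vectors form a residue class modulo N in supercell reciprocal units; the coefficients are made up for illustration.

```python
# Toy evaluation of the unfolding weight P_Km(k) = sum_{G in PC subset}
# |C_Km(G)|^2 for a 1D supercell consisting of N primitive cells.
def unfold_weight(coeffs, N, k_index):
    """coeffs: {g: C} plane-wave coefficients of one SC eigenstate, with g
    in supercell reciprocal units; k_index selects the PC wave vector,
    i.e. the residue class g % N == k_index."""
    return sum(abs(C) ** 2 for g, C in coeffs.items() if g % N == k_index)

# A state that mostly derives from the PC band at k_index = 1 (N = 2):
coeffs = {-3: 0.1, -1: 0.7, 0: 0.1, 1: 0.7, 2: 0.1}
w1 = unfold_weight(coeffs, N=2, k_index=1)   # dominant weight
w0 = unfold_weight(coeffs, N=2, k_index=0)   # small residual weight
```

For a perfect (undistorted) supercell every eigenstate carries weight at exactly one PC wave vector; defects or distortions spread the weight over several, which is what the unfolded spectral function visualizes.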
G. The QEH model

The quantum-electrostatic heterostructure (QEH) model [308] is an add-on GPAW feature for calculating the dielectric response and excitations in vertical stacks of 2D materials, also known as van der Waals (vdW) heterostructures. The QEH model can be used independently of the GPAW code, but it relies on the GPAW implementation for the calculation of the fundamental building blocks used by the model, as elaborated below.
The dielectric screening in 2D materials is particularly sensitive to changes in the environment and depends on the stacking order and thickness of the 2D heterostructure, providing a means to tune the electronic excitations, including quasi-particle band gaps and excitons. While the dielectric response of freestanding layers can be represented explicitly ab initio in GPAW at the linear-response TDDFT, GW, and BSE levels of theory, lattice mismatch between different 2D layers often results in large supercells that make these many-body approaches infeasible. Since the interaction between stacked layers is generally governed by van der Waals forces, the main modification to the dielectric response of the non-interacting layers arises from the long-range electrostatic coupling between the layers.
Therefore, in the QEH model, the dielectric function of the vdW heterostructure is obtained through an electrostatic coupling of the quantum dielectric building blocks of the individual 2D layers [309]. The dielectric building blocks consist of the monopole and dipole components of the density-response function of the freestanding layers, χ̃_i(q∥, ω), calculated ab initio at the RPA level. Subsequently, the full density-response function χ_i,j(q∥, ω) (the density perturbation on layer i due to a monopole or dipole component of the perturbing field acting on layer j) is calculated by solving the Dyson equation

χ_i,j(q∥, ω) = χ̃_i(q∥, ω) δ_i,j + χ̃_i(q∥, ω) Σ_{k≠i} V_i,k(q∥) χ_k,j(q∥, ω),

where the Coulomb matrix elements V_i,k are obtained as real-space overlaps of the density, ρ, and potential, ϕ, basis functions on the different layers.
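For a stack of two layers with a single (monopole) basis function each, the Dyson equation reduces to a 2 × 2 linear system, χ = (1 − χ̃V)⁻¹χ̃, which the sketch below solves by hand; the numerical values of χ̃ and V are invented for illustration and carry no physical units.

```python
# Scalar two-layer toy of the QEH Dyson equation, chi = chi0 + chi0 V chi,
# solved in closed form as chi = (I - chi0 V)^(-1) chi0.
def solve_dyson_2x2(chi0, V):
    """chi0: diagonal building blocks [x1, x2]; V: 2x2 Coulomb matrix."""
    # A = I - diag(chi0) @ V
    a11 = 1 - chi0[0] * V[0][0]; a12 = -chi0[0] * V[0][1]
    a21 = -chi0[1] * V[1][0];    a22 = 1 - chi0[1] * V[1][1]
    det = a11 * a22 - a12 * a21
    # chi = A^(-1) @ diag(chi0)
    return [[ a22 * chi0[0] / det, -a12 * chi0[1] / det],
            [-a21 * chi0[0] / det,  a11 * chi0[1] / det]]

# Invented static building blocks and interlayer coupling:
chi = solve_dyson_2x2(chi0=[-0.5, -0.4], V=[[0.0, 0.3], [0.3, 0.0]])
```

The off-diagonal elements χ_12 = χ_21, absent for the isolated layers, encode how a perturbation on one layer induces density changes on the other; this is the interlayer screening the QEH model captures.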
From the density-response function, the dielectric function is obtained in the basis of monopole/dipole perturbations on each layer in the heterostructure. While building blocks pre-computed with GPAW for a large variety of 2D materials are provided with the QEH package, GPAW offers the possibility of calculating custom building blocks for any 2D material, as explained in Ref. [310].
As an illustrative example of the QEH model, we show how the static dielectric function of a heterostructure can be engineered by multi-layer stacking. Fig. 17 shows the static dielectric function of a vdW heterostructure made up of N MoS2 layers stacked on N WSe2 layers. We see that the dielectric function increases significantly as a function of the number of layers, eventually approaching a bulk limit. The knowledge of the layer dependence of the dielectric response of such heterostructures could be further exploited to investigate inter- and intra-layer excitonic properties and band-edge renormalization effects.

H. Solvent models
The presence of a solvent has a large effect on the energetics and the electronic structure of molecules and extended surfaces. In particular, arguably the most important solvent, water, is able to stabilize ions or zwitterions that would not form in the gas phase. The main effect relates to the large permittivity of water (ε_r = 78), which effectively screens Coulomb interactions.
A convenient and computationally lean method to describe this effect is the inclusion of a position-dependent solvent permittivity ε(r) in the electrostatics via the Poisson solver [311]. The solvent is represented solely as a polarizable continuum that averages out all movements and rearrangements of the solvent molecules and their electrons. The computational cost is therefore practically the same as for a calculation in vacuum. This implementation allows the calculation of solvation free energies of neutral and ionic species in solution [311]. Further, it can be applied to periodic surfaces interfaced with an electrolyte to reproduce reasonable potential drops within simulations of electrochemical reaction processes, as we elaborate in the following.
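The basic physics can be illustrated with a toy estimate of how ε_r = 78 screens an ion-pair interaction. This is of course not how the GPAW implementation works (it solves the Poisson equation with a smoothly varying ε(r)); the sketch only shows the order of magnitude of the screening.

```python
# Screening of a point-charge Coulomb interaction by a uniform solvent:
# E(r) = q1*q2*k_e / (eps_r * r), with k_e = e^2/(4*pi*eps0) ~ 14.40 eV*A.
K_E = 14.3996  # eV * Angstrom

def coulomb(q1, q2, r, eps_r=1.0):
    """Charges in units of e, separation r in Angstrom; returns eV."""
    return q1 * q2 * K_E / (eps_r * r)

E_vac = coulomb(+1, -1, 3.0)             # ion pair in vacuum
E_wat = coulomb(+1, -1, 3.0, eps_r=78)   # same pair screened by water
```

The attraction of roughly 4.8 eV in vacuum shrinks to well under 0.1 eV in water, which is why ions that would recombine in the gas phase remain stable in solution.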

I. Charged electrochemical interfaces
Simulating atomistic processes at a solid-liquid interface held at a controlled electrode potential is most appropriately performed in the electronically grand-canonical ensemble [312][313][314][315]. Here, electrons can be exchanged dynamically with an external electron reservoir at a well-defined electrochemical potential. In a periodic system, a non-zero net charge would lead to a divergence of the energy; therefore, any fractional electrons added to or removed from the system must be compensated by an equal amount of counter charge. Several approaches able to account for this change in boundary conditions have recently been brought forward [316][317][318][319][320]. In GPAW, this is conveniently accomplished by introducing a jellium slab with a charge equal and opposite to the required electronic charge; the jellium is embedded in an implicit solvent localized in the vacuum region above the simulated surface (cf. Sec. IX H). As a particular highlight of this Solvated Jellium Method (SJM) [321], as it is known in GPAW, the excess charge can be localized on only the top side of the simulated atomistic surface, which occurs naturally by introducing the jellium region solely in the top-side vacuum and electrostatically decoupling the two sides of the cell via a dipole correction. Both a purely implicit and a hybrid explicit-implicit solvent can be applied.
In the SJM, the simulated electrode potential is a monotonic function of the number of electrons in the simulation; calculations can be run in either a constant-charge or a constant-potential ensemble. The electrode potential ϕ_e within SJM is defined as the Fermi level (µ) referenced to an electrostatic potential deep in the solvent (the solution inner potential Φ_w), where the whole charge on the electrode has been screened and no electric field is present:

ϕ_e = (Φ_w − µ)/e.

We can relate ϕ_e to commonly used reference potentials, for example the standard hydrogen electrode, by subtracting its absolute potential, e.g. the experimental value of 4.44 V reported by Trasatti [322]. In practice, the reference potentials depend on the solvent model used [323], and the reference can be calibrated using computed and measured potentials of zero charge [324]. The energy used in the analysis of electrode reactions is the grand-potential energy

Ω = E_tot + e ϕ_e N_e,

where N_e is the number of excess electrons in the simulation. This allows for energetic comparisons between calculations with different numbers of electrons and is the default energy returned to atomistic methods by SJM. While E_tot is consistent with the forces in traditional electronic-structure calculations, the grand-potential energy Ω is consistent with the forces in constant-potential simulations [315,325]. This means that relaxations that follow the forces will correctly find local minima in Ω, and any kind of structure optimization or dynamical routine can be performed on the grand-canonical potential-energy surface, such as the search for saddle points [326][327][328] or molecular-dynamics simulations [329].
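The grand-potential bookkeeping can be sketched in a few lines. The helper below uses eV/V units with e = 1 and Φ_w as the zero of potential; the energies are invented placeholders. Note how two calculations with different N_e only become comparable after the ϕ_e·N_e term is added.

```python
# Grand-potential energy Omega = E_tot + e*phi_e*N_e (e = 1 in eV/V units,
# solution inner potential taken as the zero of potential).
def grand_potential(E_tot, phi_e, N_e):
    """E_tot in eV, phi_e in V, N_e = number of excess electrons."""
    return E_tot + phi_e * N_e

# Comparing two states of a surface reaction held at the same electrode
# potential but with different excess charge (made-up numbers):
dOmega = (grand_potential(-510.2, 1.0, 0.25)
          - grand_potential(-509.8, 1.0, -0.10))
```

It is this ΔΩ, not the difference in E_tot, that enters reaction energetics at fixed potential; comparing raw total energies would mix in the cost of moving charge to or from the reservoir.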
In constant-potential mode, the potential is controlled by a damped iterative technique that varies N_e to find the target ϕ_e. In practice, a trajectory (such as a relaxation or a nudged elastic band) is run, where the potential is equilibrated in a first series of SCF cycles. Upon reaching the target potential within the given threshold, the code conducts the chosen geometry-optimization routine under constant-potential control.

J. Constrained DFT
Constrained DFT (cDFT) [330][331][332] is a computationally efficient method for constructing diabatic or charge/spin-localized states. GPAW includes a real-space implementation of cDFT [333], which can be used in both the FD and LCAO modes. Compared to most cDFT implementations, in GPAW the periodicity can be chosen flexibly between isolated molecules and systems that are periodic in one, two, or three dimensions with k-point sampling.
The key difference between cDFT and normal DFT is the introduction of an auxiliary potential that forces a certain region (in real space, around a molecule, molecular fragment, or atom) to carry a predefined charge or spin. This leads to a modified energy functional

F[n(r), {V_i}] = E_KS[n(r)] + Σ_i V_i (Σ_σ ∫ w_i^σ(r) n^σ(r) dr − N_i),   (116)

where E_KS is the Kohn-Sham energy functional, σ denotes the spin, n^σ(r) is the spin-dependent electron density, and N_i is the predefined charge or spin constraint.
V_i acts as a Lagrange multiplier, which determines the strength of the auxiliary potential and needs to be determined self-consistently, as discussed below. w_i^σ(r) is the weight function that defines how the charge or spin is to be partitioned, i.e. the regions where the charge/spin is to be localised; the choice of weight function is also discussed below.
Introducing the constraining term of Eq. (116) turns the cDFT problem into finding a stationary point of F with respect to both the density and the multipliers. In practice, the strength of the constraining potential V_i is found through a self-consistent two-stage optimization of both {V_i} and n(r). As the derivatives of F[n(r), {V_i}] with respect to V_i are readily available [333], gradient-based optimization algorithms from SciPy [68] are used for optimizing {V_i}. The weight functions are defined by a Hirshfeld-type partitioning scheme with Gaussian atomic densities, and w_i^σ(r) and the resulting external potential are represented on the grid. With these definitions, the forces resulting from the cDFT external potential can be computed analytically and used in e.g. geometry optimizations or molecular-dynamics simulations.
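The two-stage logic, an outer search for V_i that enforces the constraint on the charge delivered by the inner SCF loop, can be mimicked with a toy model. Below, the SCF response of the fragment charge to V is replaced by an invented sigmoidal function, and the multiplier is found by bisection rather than the SciPy gradient-based optimizers used in the actual implementation.

```python
# Toy self-consistent determination of a cDFT Lagrange multiplier V_i.
import math

def charge_on_region(V, N0=0.3, k=1.5):
    # Invented stand-in for the SCF result: fragment charge as a smooth,
    # monotonically increasing function of the constraining potential V.
    return N0 + (1 - N0) / (1 + math.exp(-k * V))

def find_multiplier(N_target, lo=-20.0, hi=20.0):
    # Bisection on the constraint C(V) = N(V) - N_target = 0.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if charge_on_region(mid) < N_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

V = find_multiplier(N_target=0.8)  # multiplier enforcing N_i = 0.8
```

Bisection works here because the fragment charge is monotone in V; in a real calculation each evaluation of the "response" is a full SCF cycle, which is why efficient gradient-based optimizers matter.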
cDFT has been widely used for computing electron-transfer rates within Marcus theory, which depend on the reorganization and reaction (free) energies and the diabatic coupling matrix element; the GPAW cDFT implementation includes all the tools needed for obtaining these parameters for bulk, surface, and molecular systems [333,334]. Recently, the cDFT approach has been combined with molecular-dynamics methods to compute the reorganization energy at electrochemical interfaces [335], as well as with grand-canonical DFT methods (see Sec. IX I) to construct fixed electron-potential diabatic states [336].
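Given the three cDFT-derived parameters named above, the classical Marcus rate follows from the standard expression k = (2π/ħ)|H_ab|²(4πλk_BT)^(−1/2) exp[−(λ + ΔG)²/(4λk_BT)]. The sketch below evaluates it in eV units with invented parameters; it is a plain transcription of the textbook formula, not GPAW code.

```python
# Classical (non-adiabatic, high-temperature) Marcus electron-transfer rate
# from reorganization energy lam, driving force dG, and coupling Hab (eV).
import math

HBAR = 6.582119569e-16   # eV * s
KB = 8.617333262e-5      # eV / K

def marcus_rate(Hab, lam, dG, T=300.0):
    prefac = (2 * math.pi / HBAR) * Hab ** 2 \
             / math.sqrt(4 * math.pi * lam * KB * T)
    return prefac * math.exp(-(lam + dG) ** 2 / (4 * lam * KB * T))

k_act = marcus_rate(Hab=0.01, lam=0.8, dG=-0.2)   # normal region
k_opt = marcus_rate(Hab=0.01, lam=0.8, dG=-0.8)   # activationless maximum
```

The second call sits at the Marcus optimum (−ΔG = λ), where the activation barrier vanishes and the rate is maximal for a given coupling.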
K. Orbital-free DFT

Orbital-free DFT (OFDFT) approximates the DFT energy functional by modelling the kinetic energy as an explicit functional of the density, e.g. through the von Weizsäcker form

T_W[n] = ∫ n^{1/2}(r) (−½∇²) n^{1/2}(r) dr.

Levy and colleagues showed that a Kohn-Sham-like equation, derived variationally from the functional above, holds for the square root of the density [337]. OFDFT enforces the Pauli principle only approximately, partially accounting for quantum effects in an averaged way.
The OFDFT scheme implemented in GPAW offers the advantage of accessing all-electron values while maintaining linear scaling of the computational time with respect to system size. To achieve this, we employ the PAW method in conjunction with real-space methods, obtaining a mean absolute error of 10 meV per atom compared to reference all-electron values [338].
While OFDFT functionals perform better with local pseudopotentials in bulk materials, the OFDFT PAW implementation can be interesting for assessing density functionals. For example, in studies of the large-Z or semiclassical limits of density functionals, the all-electron values allow one to identify highly performing OFDFT functionals [339].
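The von Weizsäcker kinetic-energy functional, T_W = ½∫|∇n^(1/2)|² dr after integration by parts, can be checked numerically on a grid. The sketch below evaluates it for a 1D Gaussian density normalized to one electron, for which the analytic result is α/4 Hartree; this toy has nothing to do with GPAW's 3D PAW implementation beyond illustrating the functional.

```python
# Von Weizsacker kinetic energy T_W[n] = 1/2 * Int |d/dx sqrt(n)|^2 dx
# on a 1D grid, for n(x) = (alpha/pi)^(1/2) * exp(-alpha*x^2).
import math

def tw_1d(alpha=1.0, L=20.0, npts=4001):
    dx = L / (npts - 1)
    xs = [-L / 2 + i * dx for i in range(npts)]
    psi = [(alpha / math.pi) ** 0.25 * math.exp(-alpha * x * x / 2)
           for x in xs]                       # psi = sqrt(n), normalized
    t = 0.0
    for i in range(1, npts - 1):
        dpsi = (psi[i + 1] - psi[i - 1]) / (2 * dx)   # central difference
        t += 0.5 * dpsi * dpsi * dx
    return t

T = tw_1d()   # analytic value is alpha/4 = 0.25 Hartree for alpha = 1
```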

L. Zero-field splitting
The zero-field splitting (ZFS) refers to the energetic splitting of the magnetic sub-levels of a localized triplet state in the absence of a magnetic field [340]. The origin of the ZFS is the magnetic dipole-dipole interaction between the two electrons of the triplet. This interaction is described by a spin Hamiltonian of the form (α, β = x, y, z) [341,342]

H_ZFS = Σ_{αβ} Ŝ_α D_{αβ} Ŝ_β,

where Ŝ is the total spin operator and D is the ZFS tensor given by

D_{αβ} = (µ_0 g_e² µ_B² / 8π) ∫∫ ρ_2(r_1, r_2) (δ_{αβ} r² − 3 r_α r_β)/r^5 dr_1 dr_2,   (122)

where r_α and r_β denote the Cartesian components of r = r_1 − r_2, ρ_2 is the two-particle density matrix of the Kohn-Sham ground-state Slater determinant, µ_0 is the vacuum permeability, µ_B is the Bohr magneton, and g_e is the Landé splitting factor. GPAW computes the D tensor by evaluating the double integral in reciprocal space using the pseudo density including compensation charges [305].
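The spatial structure of the integrand in Eq. (122) is the familiar traceless dipole-dipole tensor. The sketch below evaluates (δ_αβ r² − 3 r_α r_β)/r⁵ for two point spins, omitting all physical prefactors; for a separation along z the tensor is axial with D_zz = −2 D_xx, the usual single-parameter D situation for a triplet defect.

```python
# Point-dipole sketch of the ZFS spatial factor; physical prefactors
# (mu_0 g_e^2 mu_B^2 / 8 pi) are deliberately omitted.
def dipolar_tensor(r):
    x, y, z = r
    r2 = x * x + y * y + z * z
    r5 = r2 ** 2.5
    d = [[0.0] * 3 for _ in range(3)]
    for a, ra in enumerate((x, y, z)):
        for b, rb in enumerate((x, y, z)):
            d[a][b] = ((r2 if a == b else 0.0) - 3 * ra * rb) / r5
    return d

D = dipolar_tensor((0.0, 0.0, 2.0))       # two spins separated along z
trace = D[0][0] + D[1][1] + D[2][2]       # traceless by construction
```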

M. Hyperfine coupling
The hyperfine coupling describes the interaction between the magnetic dipole of a nuclear spin, Î_N, and the magnetic dipole of the electron-spin distribution, Ŝ(r). The interaction is described by the spin Hamiltonian (α, β = x, y, z)

H_HF = Σ_{αβ} Ŝ_α A^N_{αβ} Î_{N,β},

where the hyperfine tensor of nucleus N at R = 0 is given by [343]

A^N_{αβ} = (2µ_0/3) g_e µ_B g_N µ_N δ_{αβ} ∫ δ_T(r) ρ_s(r) dr + (µ_0/4π) g_e µ_B g_N µ_N ∫ [(3 r_α r_β − δ_{αβ} r²)/r^5] ρ_s(r) dr.

The first term is the isotropic Fermi-contact term, which is proportional to the spin density, ρ_s(r), at the centre of the nucleus; δ_T(r) is a smeared-out δ-function. g_e and g_N are the g-factors of the electron and the nucleus, and µ_N is the nuclear magneton. The second term represents the anisotropic part of the hyperfine coupling tensor and results from dipole-dipole interactions between the nuclear and electronic magnetic moments. GPAW evaluates A^N using the pseudo spin density with compensation charges [305].

X. OUTLOOK
As described in this review, GPAW is a highly versatile code that is maintenance-, user-, and developer-friendly at the same time. The continued expansion of the code requires substantial effort and is possible only because of the dedicated team of developers contributing at all levels. There are currently a number of ongoing as well as planned developments for GPAW, which will further improve the performance and applicability of the code. We are currently finishing a major refactoring of the code, which will make it even more developer-friendly and facilitate the implementation of new functionality.
Another priority is to improve the parallelization of hybrid-functional calculations in plane-wave mode by enabling parallelization over bands and k-points. In the same vein, there is ongoing work to support LCAO-based hybrid-functional calculations using a resolution-of-identity approach. A natural next step would then be LCAO-based GW calculations. Such a method could potentially be very efficient compared to plane-wave calculations, but it is currently unclear whether the accuracy can be maintained with the limited LCAO basis. In relation to quasiparticle calculations, there are plans to implement (quasiparticle) self-consistent GW and vertex-corrected GW using nonlocal xc-kernels from TDDFT. Constrained RPA calculations, which provide a partially screened Coulomb interaction useful for the ab initio calculation of interaction parameters in low-energy model Hamiltonians, are currently being implemented.
Underlying any GPAW calculation are the PAW potentials. The current potentials date back to 2009. A new set of potentials, including both soft and norm-conserving potentials (for response-function calculations), is under development.
As described herein, GPAW already has an efficient implementation of real-time TDDFT in the LCAO basis, while Ehrenfest dynamics is supported only in the comparatively slower grid mode. Work to enable Ehrenfest dynamics in LCAO mode is ongoing.
The current version of GPAW supports GPU acceleration only for standard ground-state calculations. The CuPy library greatly simplifies the task of porting GPAW to GPUs, and we foresee that large parts of the code, including more advanced features such as linear-response and GW calculations, will become GPU compatible.

Σ_i |ϕ_i^a⟩⟨p_i^a| = 1 (near atom a).

FIG. 3. Band structure of a WS2 monolayer obtained from PBE with non-selfconsistent spin-orbit coupling. The colors indicate the expectation value of Sz for each state. The grey lines show the band structure without spin-orbit coupling.

FIG. 4. Spin-flop transition in Cr2O3. The alignment of the spins with respect to the magnetic field is sketched for small and large magnitudes of the field. The canting (small ferromagnetic component after the spin flop) is exaggerated for visualization; the actual canting is roughly 1°.
FIG. 5. LSDA spin-spiral spectrum of the NiBr2 monolayer (structure taken from the C2DB [15]). The ground state has an incommensurate wave vector Q ≃ [0.1, 0.1, 0]. The magnetic moment displays only weak longitudinal fluctuations and the band gap remains finite for all wave vectors q, indicating that a Heisenberg-model description of the material would be appropriate. The inset shows the spin-orbit correction to the spin-spiral energies (in meV) as a function of the normal vector n of the planar spin spiral. n is depicted in terms of its stereographic projection in the upper hemisphere above the monolayer plane. The spiral plane is found to be orthogonal to Q and tilted slightly with respect to the out-of-plane direction.

FIG. 8. Magnon spectrum of ferromagnetic hcp-Co calculated using ALDA LR-TDDFT (shown as a heat map), compared to the spin-wave dispersion of Liechtenstein MFT. The acoustic magnon mode a0(ω) is shown to the left of the A-point, while the optical magnon mode a1(ω) is shown to the right. Note that the two modes are degenerate at both the A and K high-symmetry points. The calculations were carried out using 8 empty-shell bands per atom, a plane-wave cutoff of 800 eV, and a (36, 36, 24) k-point mesh. A finite value of η = 100 meV was used to broaden the ALDA spectrum. For the MFT calculations, the dispersion was calculated using linear spin-wave theory based on a Heisenberg model of close-packed spherical sites centered at each of the Co atoms.
FIG. 9. The real (Re) and imaginary (Im) parts of the self-energy matrix elements at the Γ-point for the valence (top) and conduction (bottom) bands, evaluated with Yambo and GPAW. Both codes use full frequency integration with a broadening of 0.1 eV. Yambo uses norm-conserving pseudopotentials, and GPAW its standard PAW setup. For both codes, the k-point grid was 12 × 12 × 12, the plane-wave cutoff was 200 eV, and the number of bands was 200. The results are virtually indistinguishable.

FIG. 10. 2D polarizability of WS2 calculated from the BSE and the RPA. For this calculation, we included spin-orbit coupling and used the 2D Coulomb truncation to eliminate screening from periodic images. A Γ-centered uniform k-point grid of 48 × 48 was applied, and 8 valence states and 8 conduction states (shifted by 1 eV to match the GW band gap [45]) were included in the Tamm-Dancoff approximation. This yields a BSE Hamiltonian of size N × N with N = 147456, which is easily diagonalized with ScaLAPACK on 240 CPUs.

FIG. 12. Photoabsorption spectrum of a Ag147 icosahedral nanoparticle and the transition contribution map at 3.8 eV. The map reveals that transitions between KS states near the Fermi level contribute constructively to the plasmon resonance, while transitions from occupied states at the d-band edge contribute destructively (screening).

FIG. 13. Applications of time-independent, variational calculations of excited states to a molecule in vacuum (left), a molecule in solution (middle), and a solid-state system (right). Left: Deviation of the calculated excitation energy of a charge-transfer excited state in the N-phenylpyrrole molecule from the theoretical best estimate of 5.58 eV [269]. The results of linear-response TDDFT calculations with hybrid functionals are from Ref. [269]. Middle: Time evolution of the interligand angles of the [Cu(dmphen)2]+ complex upon photoexcitation to the lowest metal-to-ligand charge-transfer (MLCT) state in acetonitrile. The experimental results from femtosecond X-ray scattering measurements [270] are compared to the average over excited-state molecular-dynamics trajectories obtained using the QM/MM electrostatic embedding scheme in GPAW [165] (see Sec. IV) and convoluted with the experimental instrument-response function [270]. Right: Vertical excitation energies in the negative nitrogen-vacancy center in diamond obtained with the r2SCAN functional, compared to the results of previous calculations using an advanced quantum embedding approach [271]. The orbitals involved in the electronic transitions are visualized in the inset (C atoms are grey and the N atom is orange).

FIG. 14. Formal polarization along an adiabatic path connecting two states of polarization, and the energy along the path, in tetragonal KNbO3. The spontaneous polarization is obtained as the difference in polarization between the polar ground state (λ = 1) and a non-polar reference structure (λ = 0).

FIG. 15. Berry phases of the quantum spin Hall insulator 1T'-MoS2 obtained from PBE with non-selfconsistent spin-orbit coupling. The colors indicate the expectation value of Sz for each state, as defined in Ref. [46].

FIG. 16. Defect formation energies for CN and CB under N-rich and N-poor conditions, respectively. The dashed lines are reproduced from Ref. [295].

FIG. 17. The static macroscopic dielectric function of a vdW heterostructure interface as a function of the number of layers. The heterostructure is made up of N MoS2 layers on one half and N WSe2 layers on the other half (see inset). Increasing the number of layers eventually leads to a bulk-like limit for the chosen stacking configuration.

The constraints are further enforced by demanding that the V_i satisfy

C_i = Σ_σ ∫ w_i^σ(r) n^σ(r) dr − N_i = 0.   (117)

TABLE I. Number of files and number of lines of code in the git repository of GPAW. The Python source-code files are split into three parts: the actual code, the test suite, and code examples in the documentation.

TABLE II. Calculated and measured values of the orbital magnetization in units of µB per atom.