Localization procedures are an important tool for analysis of complex systems in quantum chemistry, since canonical molecular orbitals are delocalized and can, therefore, be difficult to align with chemical intuition and obscure information at the local level of the system. This especially applies to calculations obeying periodic boundary conditions. The most commonly used approach to localization is Foster–Boys Wannier functions, which use a unitary transformation to jointly minimize the second moment of the orbitals. This procedure has proven to be robust and fast but has a side effect of often mixing - and -type orbitals. /-separation is achieved by the Pipek–Mezey Wannier function (PMWF) approach [Lehtola and Jónsson, J. Chem. Theory Comput. 10, 642 (2014) and Jónsson et al., J. Chem. Theory Comput. 13, 460 (2017)], which defines the spread functional in terms of partial charges instead. We have implemented a PMWF algorithm in the CP2K software package using the Cardoso–Souloumiac algorithm to enable their application to real-time time-dependent density functional theory. The method is demonstrated on stacked molecules, linear acetylenic carbon, boron and nitrogen co-doped graphene, and nitrogen-vacancy doped diamond. Finally, we discuss its computational scaling and recent efforts to improve it with fragment approaches.
I. INTRODUCTION
Due to its high accuracy relative to computational demands, density functional theory (DFT)1,2 is the most popular electronic structure method of quantum chemistry.3 Modern quantum chemistry software implements Born–von Karman periodic boundary conditions (PBC) on unit cells or supercells of a given system to simulate bulk properties in DFT calculations. This method leads to periodic Bloch functions representing crystalline orbitals, as opposed to canonical orbitals (COs) in non-periodic calculations.4 In the following, we use the abbreviation CO for both crystalline and canonical orbitals depending on whether the system is periodic or not. The COs are not restricted in their spatial extent in any way and, as a result, often delocalized over the entire system, making them hard to interpret from chemical intuition and in terms of regional phenomena. The established approach to mitigate these issues in post-processing is to perform a localization procedure.5,6 Localization refers to optimization of a unitary transformation with respect to a certain localization measure, which is then applied to the COs to obtain a set of localized orbitals (LOs) without changing any observables. Under PBC, this results in a transformation into maximally localized Wannier functions (MLWFs).7–10 The most important of such spread-based functionals include the Foster–Boys (FB) criterion11 and its Berry phase analogue for periodic systems,12,13 leading to Foster–Boys Wannier functions (FBWFs). This approach is highly successful in making the orbitals more spatially confined (according to the FB criterion) than the COs but has the downside of commonly mixing - and -type orbitals. Such mixing results in asymmetric hybrid bonds as shown in Fig. 1, occasionally referred to as “banana” bonds. There, on the example of , we see that the delocalized canonical highest occupied molecular orbital (HOMO) spans all three atoms, which is not aligned with chemical intuition expecting two separate double bonds. In contrast, the corresponding FBWF7 is clearly localized between two atomic centers, but the expected -symmetry is lost due to admixture of the -bond.
Three-center canonical HOMO (a), the corresponding FBWF (b), and the corresponding PMWF (c) of , with the FBWF showing a characteristic mixing of - and -type states, creating asymmetric “banana” bonds. The PMWF retains the separation of - and -type states, creating a -symmetric bond with predominantly two-center character. This 3D rendering shows two isosurfaces in cyan at −0.01 a.u. (lighter) and −0.02 a.u. (darker) and two isosurfaces in red at 0.01 a.u. (lighter) and 0.02 a.u. (darker).
Three-center canonical HOMO (a), the corresponding FBWF (b), and the corresponding PMWF (c) of , with the FBWF showing a characteristic mixing of - and -type states, creating asymmetric “banana” bonds. The PMWF retains the separation of - and -type states, creating a -symmetric bond with predominantly two-center character. This 3D rendering shows two isosurfaces in cyan at −0.01 a.u. (lighter) and −0.02 a.u. (darker) and two isosurfaces in red at 0.01 a.u. (lighter) and 0.02 a.u. (darker).
This mixing is often inconsequential to the problem at hand, e.g., when expectation values are summed over many eigenstates. In cases where mixing is undesirable, a different localization criterion is needed. The established Pipek–Mezey (PM) criterion,14 which is based on a partial charge functional, preserves the / symmetry of the LOs. However, there exists ambiguity in the choice of the partial charge estimate: The original formulation used Mulliken charges.15 Since those suffer from having no proper basis set limit, Lehtola and Jónsson generalized the criterion to other partial charge estimates, including well-defined real-space methods.16 While this gives us a great deal of flexibility to choose whatever partial charge estimate is most suited to the problem at hand, they did find that all estimates tested yielded similar results. Thus, they settled on a flexible, generalized real-space method.10 The robustness of this PM localization scheme has recently been demonstrated by Clement et al.,17 which made use of an atomic charge estimate by projection into a minimal basis set and also introduced the idea of the Canonicalize Phase then Randomize (CPR) protocol for generating the initial guess of the localization in periodic systems. In this article, we will use the acronyms FBWF and PMWF for MLWFs obtained using the FB/Berghold and PM spread functionals, respectively. By design, subsystem density embedding methods also keep the COs local to the respective subsystem, which can be exploited to investigate local properties of larger and periodic systems.18,19
The theoretical foundations of DFT are generally limited to the electronic ground state. Thus, modifications are required to capture excited state dynamics, which give rise to some of the most interesting material properties, such as photochemistry, an increasingly important field of chemistry, e.g., in the pursuit of green energy. Time-dependent density functional theory (TDDFT)20–22 can be employed in these cases for detailed insights into excited state dynamics at a low computational cost. The two major approaches to TDDFT are a perturbative treatment,20,23 which operates in the frequency domain, and the real-time approach, which records the response to a given perturbation in the time domain.24 Real-time TDDFT has gained popularity over the past years, mainly due to its favorable scaling at higher density of states, and the ability of predicting natural linewidths,25–28 with our group successfully applying it to theoretical spectroscopy29 in the fields of electronic circular dichroism,30,31 (resonance) Raman,32,33 and (resonance) Raman optical activity.32 There are also variational DFT-based approaches to excited state dynamics, such as the self-consistent field method.34–41
Application of localization procedures in real-time TDDFT, however, requires an adapted formulation with appropriate optimization algorithms. In this article, we present a PMWF algorithm for propagated orbitals and its implementation in the massively parallel quantum chemistry package CP2K,42 based on the Cardoso–Souloumiac algorithm43 we previously employed for propagated FBWFs.19,44,45 We discuss its effectiveness in / separation and computational scaling on the example of stacked slipped carbon dioxide oligomers, polymeric carbon chains, boron and nitrogen (B/N) co-doped graphene, and nitrogen-vacancy doped (NV) diamond. All systems under investigation are subject to PBC, considering the point of the Brillouin zone.
II. THEORY
This theoretical discussion adheres to the following conventions:
Bold lowercase letters signify vectors, while bold capital letters signify matrices.
Latin subscripts run over COs/LOs, and Greek subscripts run over atomic orbitals (AOs).
Time- and location-dependence of CO/LO coefficients and wavefunctions is implied but might be made explicit for clarity.
The Born–Oppenheimer approximation is employed, and atomic positions are fixed in all examples.46
Hartree atomic units47 are used throughout.
Repeated indices imply summation following the Einstein summation convention unless explicit sums are given.48
A. RT-TDDFT
B. TD-PMWF
C. Joint approximate diagonalization algorithm
PM localization procedure using the Cardoso–Souloumiac algorithm for propagated PMWFs. denotes a function returning the Jacobi rotation matrix [Eq. (8)]. The variable “” refers either to the full set of COs (default) or to the set of COs determined by either of the fragmentation methods. denotes the change of the corresponding quantity from the last iteration. The “for” loop is parallelizable as described in Refs. 19 and 53 using the appropriate ScaLAPACK tooling.56
procedure TD-PMWF() ⊳ : tolerance |
while do |
for do ⊳ all ordered combinations |
⊳ set of 3D vectors |
⊳ eigenvalue problem |
if then |
⊳ rotate only in one direction for numerical stability |
end if |
end for |
end while |
end procedure |
procedure TD-PMWF() ⊳ : tolerance |
while do |
for do ⊳ all ordered combinations |
⊳ set of 3D vectors |
⊳ eigenvalue problem |
if then |
⊳ rotate only in one direction for numerical stability |
end if |
end for |
end while |
end procedure |
D. Fragmentation
The general PMWF scheme with an CO-based partial charge estimate scales approximately since matrices of rows and columns need to be diagonalized. As grows linearly with , the overall scaling of the PMWF localization scheme is approximately . This is in contrast to spatial schemes, where there are only one (FB) or two (Berghold) matrices per Cartesian direction, resulting in approximate scaling. To counteract this, there have been efforts to compress the factor in large systems for computational gain.57,58
1. Sequential variant
Introduced in Ref. 58, the sequential variant of F-PMWF (sF-PMWF) uses a double loop to forgo any knowledge of the full set of states. Here, we will present an algorithmic overview of this procedure.
The objective of this algorithm is to find states that are localized on a selected molecular fragment. To do so, we firstly identify a “core” region of states with the largest values for the locality measure . Any remaining states become part of the “rest” space and are ordered by their locality. The “rest” space is then divided into blocks, of which the sizes are given by an arbitrary parameter that can range from 1 to the total size of the rest space . If the size of the rest space is not divisible by , the size of the last block will be the modulo of that division. After this initialization step, the outer loop begins, by selecting one of the blocks in the rest space, selected by taking the modulo of the iteration counter and , ensuring that the rest space is rotated into the actual localization procedure one by one. The inner loop performs a TD-PMWF optimization as outlined before, with only the states enumerated by the core space and the current block being used in the loop over index combinations. After the inner loop terminates, the core and rest spaces are recomputed, the next block of the rest space is rotated into the inner loop, and the next iteration is performed. The outer loop terminates when the spread functional has converged, which is checked after each update of the core and rest spaces. A pseudocode rendition of this TD-sF-PMWF algorithm is given in Algorithm 2. This algorithm opens up a new parallelization opportunity in that one can rotate a different block into the inner loop on each processor and only choose the one with the biggest increase in localization to actually apply to the common set of matrices. It remains to be seen if this strategy performs better or worse than parallelizing the inner loop as implemented before.19
Sequential exhaustion variant of the F-PMWF procedure.
procedure TD-sF-PMWF() |
While do |
TD-PMWF() |
end while |
end procedure |
procedure TD-sF-PMWF() |
While do |
TD-PMWF() |
end while |
end procedure |
III. COMPUTATIONAL METHODOLOGY
IV. RESULTS
Since the time-dependent MOs are complex-valued, we color-mapped the following heatmaps accordingly: The argument of a complex number , with ), normalized to , is mapped to the hue, while its magnitude , normalized to , is mapped to the saturation in an HSV color-mapping69 where the value parameter V is kept constant at . The resulting colormap is shown in Fig. 2. A three-dimensional approach to complex-valued MOs was introduced in Ref. 70.
Complex colormap used in this article, on the example of real and imaginary components ranging from −1 to 1.
Complex colormap used in this article, on the example of real and imaginary components ranging from −1 to 1.
The following examples show four distinct test cases: first, we demonstrate the method on stacked molecules, which demonstrate the method’s ability to localize states from intermolecular bonds to intramolecular -bonds. We also test on linear acetylenic carbon (LAC), to assess the method’s ability to localize states from a totally delocalized system to a well-defined double bond. Third, we examine its ability to localize a heteroatomic bond in a totally delocalized system on the example of B/N-co-doped graphene. Finally, we examine its abilities in a 3D-periodic system on the example of NV diamond. The orbitals shown are the result of localization after of propagation time, corresponding to ten steps.
A. Slipped parallel stacks
The stacks of molecules shown here are extrapolated from the global minimum geometry of the dimer.71 Therefore, an attractive interaction is expected, which is symmetrically allowed for either -type interactions of the in-plane -orbitals oriented head-on or -type interactions of the -orbitals perpendicular to the plane. Here, we investigate two orbitals: the HOMO, which is formed by the -type interactions of the in-plane -orbitals, and the highest occupied -orbital perpendicular to the molecular plane. Figure 3 shows the evolution of the former as a PMWF over the first 100 time steps of 0.1 ℏ/ (2.42 a.s.) each, while Fig. 4 shows a top-down cross section through the bond axes of the stack, to present a sideways view of the expected -bonds and a sideways view to present a cross section of the perpendicular -bond at the end of the propagation. Without any localization method applied, the stacked molecules display a canonical HOMO of predominantly three-center character as in Fig. 1. In addition, the orbital is significantly delocalized over all molecules of the system, which renders it useless to discuss any local effects. After applying a TD-FBWF localization, the Wannier function is localized entirely on a single molecule. However, like in the isolated molecule, the expectedly -symmetric Wannier functions are significantly mixed with the corresponding -bonds, leading to a loss of the nodal plane through the bond. The result of this loss in symmetry is a banana bond in both orientations investigated here. When applying a TD-PMWF optimization instead, we successfully recover the expected nodal plane for a much improved -character of the Wannier function.
Evolution of an in-plane -bond PMWF in a slipped stack of four molecules during the first 100 time steps (242 a.s.) of propagation. Four isosurfaces are shown: cyan and red for the real part at and 0.04 a.u., and lime and purple for the imaginary part at and 0.04 a.u., respectively.
Evolution of an in-plane -bond PMWF in a slipped stack of four molecules during the first 100 time steps (242 a.s.) of propagation. Four isosurfaces are shown: cyan and red for the real part at and 0.04 a.u., and lime and purple for the imaginary part at and 0.04 a.u., respectively.
(a) 3D rendering of the tetramer with the volume slice indicated; cross sections of two orbitals of interest: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right): (b) the HOMO, a -bond oriented in parallel to the main plane, and (c) the highest occupied -bond oriented perpendicular to the main plane, isosurface through the carbon. Gray dots signify the nuclear positions of carbon, and red dots the nuclear positions of oxygen.
(a) 3D rendering of the tetramer with the volume slice indicated; cross sections of two orbitals of interest: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right): (b) the HOMO, a -bond oriented in parallel to the main plane, and (c) the highest occupied -bond oriented perpendicular to the main plane, isosurface through the carbon. Gray dots signify the nuclear positions of carbon, and red dots the nuclear positions of oxygen.
B. Linear acetylenic carbon
LAC is an artificial allotrope of carbon consisting of linear chains of repeated units.72 While it is of considerable interest to nanotechnology, it should be noted that it is likely to cross-link exothermically when two side chains approach each other, making it difficult to employ in condensed phase applications. Nevertheless, with it being the simplest carbon polymer possible, it can serve here as a toy system for observing a -system delocalized over one full dimension of the simulation box. PMWFs of LAC have previously been reported in Ref. 10. We have chosen a chain of eight carbon atoms as the repeating unit. The expected HOMO in that case is any of the eight degenerate -bonds since they are symmetrically equivalent. Thus, Fig. 5 shows a sideways cross section through the -bond of the arbitrarily chosen state. The COs that we obtained directly after the SCF procedure, indeed, show eight orbitals of equal energy distributed uniformly to all nuclei. After maximal localization, this picture changes drastically. While the FBWF is now confined to a single bond, it has been mixed with the bond’s -orbital, resulting in an almost total loss of -character. Only a slight skewedness of the orbital with regard to the rotational symmetry axis remains from the -type bond. In contrast, the PMWF retains a perfect nodal plane and shows a predominantly two-center bond, with a slight antibonding influence on the neighboring pairs.
(a) 3D rendering of the LAC section with the volume slice indicated; (b) sideways cross section of the HOMO: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right). Gray dots signify the nuclear positions of carbon.
(a) 3D rendering of the LAC section with the volume slice indicated; (b) sideways cross section of the HOMO: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right). Gray dots signify the nuclear positions of carbon.
C. B/N-co-doped graphene
In the research effort to further mature graphene-based microelectronics and nanotechnology, doping techniques are essential to fine-tune its properties at the molecular level. Here, we investigate a 2D-periodic graphene sheet doped with a pair of boron and nitrogen, which has recently been shown to behave as a peroxidase mimic73 and allows for finely tuned redox behavior when adsorbed on metal surfaces,74 depending on the doping configuration. The graphene sheet under investigation in this article has been doped with a directly connected pair, with the focus being the expected -orbital between them. As in the LAC example, Fig. 6 shows a sideways view of the system, to provide a cross section view of the expected -bond. We identified the corresponding Wannier function by the position of its Wannier center. After the SCF optimization, the CO displays delocalization over the whole simulation box, with no discernible -symmetry. The corresponding FBWF becomes strongly confined to the system but is still missing any noticeable -symmetry. Finally, the PMWF procedure is able to both confine the orbital to the bonding pair and reproduce the expected -symmetry.
(a) 3D rendering of B/N-co-doped graphene with the volume slice indicated; cross sections of the expected -bond in B/N-co-doped graphene: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right): (b) top-down view, perpendicular to the graphene sheet, and (c) sideways view, parallel to the graphene sheet. Gray dots signify the nuclear positions of carbon, the pink dot signifies the nuclear position of the boron, and the bright blue dot signifies the nuclear position of the nitrogen.
(a) 3D rendering of B/N-co-doped graphene with the volume slice indicated; cross sections of the expected -bond in B/N-co-doped graphene: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right): (b) top-down view, perpendicular to the graphene sheet, and (c) sideways view, parallel to the graphene sheet. Gray dots signify the nuclear positions of carbon, the pink dot signifies the nuclear position of the boron, and the bright blue dot signifies the nuclear position of the nitrogen.
D. NV diamond
In contrast to graphene, where the electrons are totally delocalized over the 2D sheet, the diamond allotrope of carbon only possesses localized -bonds. This property is immensely useful to frustrate electronic systems, as the resulting well-localized electron spin interacts with a wide range of electromagnetic radiation.75 Among other applications, the NV center has been used as a qubit.76,77 Here, we investigate the contribution of the lone pair on the nitrogen center to the vacancy. Therefore, we identified the corresponding Wannier function by its Wannier center and present in Fig. 7 a cross section perpendicular to the longest cell axis in the z-direction, through the NV cavity. In the CO picture, the orbital is already comparatively well-localized within the vacant site and the neighboring carbon and nitrogen atoms, with stronger localization toward the nitrogen site, as expected from their relative electronegativities. Both localization procedures fully agree with each other in that this particular contribution to the vacancy center is fully localized to the nitrogen site, forming an orbital shape strongly resembling an orbital from hybridization theory. This example demonstrates how FBWF and PMWF procedures can both arrive at the same solution in naturally well-localized systems, a finding also noted in Ref. 10.
(a) 3D rendering of diamond with the volume slice indicated; (b) cross section of the lone pair on N extending into the vacancy in NV diamond: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right). Gray dots signify the nuclear positions of carbon, the blue dot signifies the nuclear position of the nitrogen, and the green dot signifies the center of the vacancy.
(a) 3D rendering of diamond with the volume slice indicated; (b) cross section of the lone pair on N extending into the vacancy in NV diamond: as a delocalized CO after SCF optimization (left), the corresponding FBWF (center), and the corresponding PMWF (right). Gray dots signify the nuclear positions of carbon, the blue dot signifies the nuclear position of the nitrogen, and the green dot signifies the center of the vacancy.
E. Scaling
We used the slipped parallel stacks to carry out a scaling test between our TD-FBWF, TD-PMWF, and TD-F-PMWF implementations. With our computational setup as described, we were able to localize up to 32 stacked molecules before the slowest method at that size (TD-PMWF) failed to finish the localization procedure within one hour. With our pseudopotential-based methodology, each molecule contributes eight states, as well as three partial charge matrices to the localization. Figure 8 shows the average walltime required to perform a single real-time TDDFT step including the localization procedure for each combination of algorithms and system sizes. As expected, we observe an initial parity of computational demand between FBWF and PMWF (with a slight edge for PMWF). At more than eight molecules, the PMWF algorithm shows its weakness of the ever-growing amount of partial charge matrices to consider. This makes for much worse scaling behavior than the FBWF implementation, where the amount of matrices to diagonalize remains constant. In contrast, the fragment-based PMWF method (in which we selected one of the molecules as the fragment) suffers some overhead at low molecule counts but is able to retain a very close scaling behavior to FBWF, improving considerably over basic PMWF. Thus, the applicability of PMWF to larger systems is vastly broadened to systems accessible via FBWF using suitable fragments in F-PMWF.
Scaling tests of the tested localization procedures on the slipped parallel stacks, with being the number of molecules in the simulation box.
Scaling tests of the tested localization procedures on the slipped parallel stacks, with being the number of molecules in the simulation box.
F. Spectroscopy
To investigate the impact of / mixing at the application level, we performed a real-time TDDFT simulation with subsystem analysis of . The parameters of the simulation were identical to the previous calculations except for the number of propagation steps, which was set to 10 000 for a total propagation time of 1000 ℏ/. The subsystem analysis followed the process outlined in Ref. 19. In the TD-PMWF case, we replaced the TD-FBWF procedure with the TD-PMWF subroutine introduced in this work. The spectrum (Fig. 9) shows the signals of the - and -bonds between the p-orbitals of one bond localized with TD-PMWF and the mixed / bond localized with TD-FBWF. We observe that the effect of / mixing is significant: The principal difference between the well-separated and orbitals is that the orbital absorbs in all three peaks of the depicted range, while the orbital only absorbs at 86 nm. In contrast, the FBWF orbital displays mixed properties: it does not absorb at 95 nm, just like the well-separated orbital. At 86 nm, it absorbs with an intermediate intensity between the and orbitals. Then, at 83 nm, it absorbs like the orbital but at only half of the intensity. In cases like these, having the choice between different localization procedures can be a significant advantage.
(a) Optical absorption spectrum of a molecule (“total,” black), with subsystem analysis of some important orbitals: (b) the bond of the bond (TD-FBWF, red), (c) the bond of the bond (TD-PMWF, green), and (d) the bond of the bond (TD-PMWF, blue).
(a) Optical absorption spectrum of a molecule (“total,” black), with subsystem analysis of some important orbitals: (b) the bond of the bond (TD-FBWF, red), (c) the bond of the bond (TD-PMWF, green), and (d) the bond of the bond (TD-PMWF, blue).
V. DISCUSSION
In this article, we have adapted the PMWF localization procedure to real-time TDDFT and presented our implementation into the massively parallel quantum chemistry package CP2K. We have shown its efficacy in localizing time-dependent electronic states in cases where traditional FBWFs fall short, e.g., by /-mixing, and produce qualitatively equivalent results where they do not. The method performed well in this regard as well as computational efficiency for all tested systems, including stacked oligomers, 1D carbon polymers, B/N-co-doped graphene, and NV diamond. Inclusion of the sequential fragment variant proved vital to make the TD-PMWF approach feasible on large, periodic systems. We especially see potential in their application to the resolution of real-time TDDFT-derived theoretical spectroscopy by individual PMWFs on user-selected atomic fragments, which should be explored on more complex systems in further research efforts.
ACKNOWLEDGMENTS
This work was supported by the University of Zurich and the Swiss National Science Foundation (Grant No. 200021-197207). We thank the Swiss National Supercomputing Centre for computing resources (Project No. 1036).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Lukas Schreder: Data curation (lead); Formal analysis (lead); Investigation (lead); Validation (lead); Visualization (lead); Writing – original draft (lead). Sandra Luber: Funding acquisition (lead); Project administration (lead); Resources (lead); Supervision (lead); Writing – review & editing (lead).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.