We present a simple scheme to compute X-ray absorption spectra (e.g., near-edge absorption fine structure) and core ionisation energies within coupled cluster linear response theory. The approach exploits the so-called core-valence separation to effectively reduce the excitation space to processes involving at least one core orbital, and it can be easily implemented within any pre-existing coupled cluster code for low energy states. We further develop a perturbation correction that incorporates the effect of the excluded part of the excitation space. The correction is shown to be highly accurate. Test results are presented for a set of molecular systems for which well converged results in full space could be generated at the coupled cluster singles and doubles level of theory only, but the scheme is straightforwardly generalizable to all members of the coupled cluster hierarchy of approximations, including CC3.

## I. INTRODUCTION

The interest in accurate *ab initio* procedures to compute spectroscopic parameters related to X-ray absorption and ionization has significantly grown over the past five years,^{1–10} concurrently with the advances on the experimental side, in particular the advent of third-generation synchrotron radiation sources for the measurement of high-quality X-ray absorption spectra, and the ongoing development of fourth-generation synchrotron facilities (e.g., free-electron lasers) and their perspective opportunities for novel experimental studies on the interaction of matter and X-ray radiation.

The computation of X-ray spectra is an essential element in order to identify the origin of specific experimental signatures and extract detailed electronic and structural information, such as charge transfer, nature of bonding, hybridization, chemical environment, site symmetry. Even though (time-dependent) Density Functional Theory (DFT) methods remain the main workhorse for the simulation of (core-level) spectroscopic properties of large molecular systems, their intrinsic limitations due to the arbitrariness in the choice of functional and the self-interaction error also call for accurate and systematically improvable procedures towards which DFT can be benchmarked.

Coupled cluster (CC) (response) approaches^{11–14} are undoubtedly among the most accurate *ab initio* techniques currently available, but their application to X-ray phenomena is still rather limited. A general complicating factor in the computation of core spectra is due to the fact that traditional eigenvalue solvers apply a bottom-up approach, such that a prohibitive number of valence excitations are obtained before core-excitations are targeted. Increasing the molecular size, as well as the one- and N-electron spaces further complicates the situation.

We have earlier proposed two different approaches to compute near-edge x-ray absorption fine structure (NEXAFS) spectra at the CC level of theory. The first study was based on an asymmetric Lanczos algorithm,^{1,2} where a truncated tridiagonal representation of the full CC Jacobian is diagonalized to yield excitation energies and vectors, oscillator strengths as well as cross-section profiles. In the second,^{15} we presented a reduced space algorithm to solve the complex coupled cluster linear response equations of damped response theory^{16,17} and obtained directly the NEXAFS cross-section profiles from the imaginary part of the complex dipole polarizability. Both approaches, however, still suffered from the fact that core excitations are embedded in a sea of valence excitations. In the Lanczos case, this difficulty is reflected by the need to use large chain lengths to obtain well converged core excitations. For the damped solver, convergence problems are encountered when large basis sets are used.

Here we present a relatively simple solution to these problems based on the so-called core-valence separation (CVS) approximation, which is applied to both the Lanczos and the traditional CC-LR algorithms. The core-valence separation consists in removing all excitations that do not involve at least one core orbital from the excitation manifold when computing the core-level spectra. The fundamental motivation for such approximation, originally proposed by Cederbaum, Domcke, and Schirmer,^{18} lies in the large difference in energy and in the localization in space between core and valence orbitals. The CVS approximation was first implemented within the algebraic diagrammatic construction^{19–21} scheme^{22,23} and later extended to TD-DFT approaches.^{24} Restricted-window or energy-specific generalizations have also been presented in recent years.^{25–27} To the best of our knowledge, this is the first study exploring the applicability of the CVS approximation at the CC level of theory. We note nonetheless a very recent study by Peng *et al.*^{9} where an alternative, and much more elaborate, approach to target core excitation energies within the equation-of-motion coupled cluster singles and doubles (EOM-CCSD) ansatz is presented. In that study,^{9} which only addresses excitation energies, an energy-specific non-Hermitian Davidson eigensolver is used, where energy screening, eigenvector-bracketing, and growing window techniques are applied in order to obtain high-energy solutions without scanning through low-energy states. In Ref. 28, on the other hand, two algorithms for calculating excited states close to a specified energy shift (interior eigenvalues) were introduced, but no specific application to core-excitation and core-ionizations was presented.

In order to estimate the errors introduced by the CVS approximation, we have developed a perturbation correction to the core excitation energies that is obtained from a partitioning of the Jacobian eigenvalue problem. In this way, the quality of the CVS approximation can be tested without performing full space calculations that are, in most cases, unattainable.

Finally, we have coupled the CVS approximation with a restricted excitation onto a super-diffuse orbital, as proposed by Stanton and Gauss^{29} for UV-vis ionizations, to effectively compute core ionization potentials. Test results are presented for a set of molecular systems for which accurate experimental/theoretical results are available.

## II. THEORY

### A. Core-level spectra within CVS-CC-LR theory

The CC wave-function ansatz (for a closed-shell system) is defined by the exponential parametrization

where |HF〉 is the reference (Hartree-Fock) wave function, and *T* is the cluster operator, with *t*_{μ} being the cluster amplitudes and *τ*_{μ} the corresponding excitation operators. The ground state energy and amplitudes are conventionally determined by projection of the Schrödinger equation onto the reference state, and onto a manifold of excitations out of the reference state, respectively

In CC linear response (LR) theory, excitation energies (*ω _{k}*) and left (

**L**

_{k}) and right (

**R**

_{k}) excitation vectors are usually obtained solving the asymmetric eigenvalue equations

under the biorthogonality condition **L**_{j}**R**_{k} = *δ _{ik}*. The Jacobian matrix

**A**is defined as

Transition strengths (for dipole components *X* and *Y*) are determined from the single residues of the linear response function, and take the form

where the left and right transition moments are given by

and the auxiliary Lagrangian multipliers $ M \u0304 ( \omega j ) $ are obtained from the solution of the linear equation

In most implementations, Eq. (4) is solved iteratively via some generalization of the Davidson algorithm.^{30} The iterative procedure is initiated by selecting as starting guesses unit vectors for specific occupied to virtual orbital excitations (often chosen from the Hartree-Fock orbital energy differences). This procedure is however biased towards the lowest eigenvalues, i.e., it will tend to converge towards the lowest eigenvalues and eigenvectors even if the initial start vectors are chosen for selected high energy (e.g., core) excitations.

An alternative approach to solve Eq. (4) consists in building a (truncated) tridiagonal representation **T** of the Jacobian matrix **A** by application of an asymmetric Lanczos algorithm (or an Arnoldi algorithm), followed by its straightforward diagonalization. The non-zero elements of the tridiagonal matrix **T** = **P**^{T}**AQ** (where **P**^{T}**Q** = **1**) are given by

with

The diagonalization of **T**, conveniently truncated to dimension *J* ≪ *n* (*n* being the full dimension), generates an effective excitation spectrum, which is known to converge from the bottom and from the top towards the exact excitation spectrum.^{1,2}

Choosing as Lanczos seeds $ q 1 = u X \u2212 1 \xi X $ and $ p 1 T = v X \u2212 1 \eta X $ (where *u _{X}* = ‖

*ξ*^{X}‖ and $ v X = u X \u2212 1 \eta X \xi X $) yields an approximate diagonal representation of the (complex) linear response function in terms of the eigenvectors of the effective Lanczos spectrum.

^{1,2}The absorption cross-section is computed from its imaginary component

and the transition strengths from its residues

The CVS approximation may be easily implemented within both the (generalized) Davidson and the asymmetric Lanczos algorithms by applying, at each iteration, a projector $ P I v $ that removes all vector elements not referencing at least one core orbital (or a set of selected core orbitals) *I*. For a generic (trial) vector **b** that includes, e.g., single and double excitations,

In the Davidson case, we thus solve the projected eigenvalue equation

and similarly for the left eigenvectors. Moreover, by applying the projector at each iteration during the solution of Eq. (8), the computation of CVS-CC transition moments and transition strengths can also be easily implemented within any pre-existing CC linear-response code targeting low-energy states.

Within the Lanczos algorithm, we apply the projector at each iteration when generating the elements of the truncated **T** matrix, i.e., to both the $ p l T $ and **q**_{l} vectors and their linear transformations $ p l T A$ and **Aq**_{l}. In this way, the resulting Lanczos eigenvectors, as well as the Lanczos trial-vector bases **P**^{T} and **Q**, only contain excitations involving at least one core orbital, effectively decoupling them from excitations with contributions from occupied valence orbitals only. Diagonalization of the tridiagonal matrix generated in this way results in the core-excitations occurring as lowest roots, hereby quickly converging to the exact results with significantly smaller Lanczos chain lengths. The oscillator strengths and cross sections are obtained directly, without further modifications to the general procedure.

The projector is further generalized to yield core-ionization potentials. In the spirit of the recipe proposed by Stanton and Gauss^{29} for UV-vis ionizations, we included a continuum orbital in the basis set—specifically, a Gaussian basis function with a nearly zero exponent—and combined it with the CVS separation by additionally restricting the excitation manifold to excitation into this unoccupied orbital. Indicating with *A* the continuum orbital, the projector then reads

### B. Perturbative correction to the core excitation energies obtained within the CVS approximation

In this section, we propose a perturbative correction to the core-level excitation energies computed within the CVS approximation. We rewrite the eigenvalue equation as

where the Jacobian has been partitioned (Löwdin partitioning) in order to separate excitations from occupied core orbitals (superscript *c*) from those from occupied valence orbitals (superscript *v*). Subscript *i* indicates a specific core excitation. The partition yields the effective eigenvalue equation

where

We can then write

Notice that the effective Jacobian depends on the (excitation) frequency *ω _{i}*, so eigenvalue equation (19) should be solved iteratively with respect to the excitation frequency. Solution of linear equation (21) is also required to determine $ R i v $.

The perturbative correction to *ω _{i}* is now obtained by taking the CVS solution $ R \u0303 i c $ of eigenvalue equation (18)—that is, the solution of Eq. (18) for

**B**=

**C**=

**0**,

Then, we can write

where $ R \u0303 i v $ is the solution to the linear equation

The blocks $C R \u0303 i c $ and $ L \u0303 i c B$ are obtained projecting out the core excitations from $A R \u0303 i c $ and $ L \u0303 i c A$.

The computational cost of determining the perturbative correction to the core excitation energy amounts to solving the linear equation, Eq. (25), for each core excitation of interest.

## III. RESULTS

In Table I, we compare *K*-edge core excitation energies and oscillator strengths obtained applying the CVS approximation at the CCSD level, with corresponding, well-converged, Lanczos results where all excitations from occupied valence orbitals are retained (label “Full”). The latter were partly taken from Refs. 1–3. Augmented correlation-consistent basis sets,^{31} occasionally supplemented with center-of-mass Rydberg-type functions,^{32} were adopted.

. | Full . | CVS . | ||
---|---|---|---|---|

Excitation . | Energy (eV) . | f
. | Energy (eV) . | f
. |

H_{2}O – aug-cc-pCVTZ(O,H)+Rydberg | ||||

O_{1s} → 3s | 535.68 | 0.012 8 | 535.68 | 0.012 1 |

O_{1s} → 3p | 537.47 | 0.026 2 | 537.47 | 0.025 6 |

CO – aug-cc-pCVTZ(C,O)+Rydberg | ||||

C_{1s} → π^{∗} | 288.21 | 0.168 5 | 288.18 | 0.152 1 |

O_{1s} → π^{∗} | 535.85 | 0.081 3 | 535.84 | 0.077 7 |

Ne – aug-cc-pCVTZ+Rydberg | ||||

1s → 3p | 868.17 | 0.011 87 | 868.21 | 0.011 74 |

1s → 4p | 869.84 | 0.003 50 | 869.88 | 0.003 45 |

1s → 5p | 870.42 | 0.001 51 | 870.47 | 0.001 49 |

1s → 6p | 870.71 | 0.000 86 | 870.75 | 0.000 85 |

NH_{3} – aug-cc-pCVTZ(N,H)+Rydberg | ||||

N_{1s} → 3s | 402.14 | 0.006 1 | 402.13 | 0.005 7 |

N_{1s} → 3p | 403.80 | 0.039 8 | 403.78 | 0.039 2 |

N_{1s} → 3p | 404.48 | 0.006 5 | 404.46 | 0.006 5 |

C_{2}H_{4} – aug-cc-pCVDZ (C)/cc-pVDZ(H)+Rydberg | ||||

C_{1s} → π^{∗} | 287.47 | 0.095 2 | 287.46 | 0.089 0 |

C_{1s} → 3s | 289.93 | 0.009 1 | 289.92 | 0.009 3 |

C_{1s} → 3p | 290.57 | 0.026 5 | 290.56 | 0.026 2 |

HF – aug-cc-pCVTZ(F)/cc-pVDZ(H)+Rydberg | ||||

F_{1s} → 4σ^{∗} | 689.09 | 0.023 5 | 689.12 | 0.022 6 |

692.78 | 0.005 8 | 692.82 | 0.005 9 | |

692.94 | 0.011 1 | 692.96 | 0.010 9 | |

N_{2} – aug-cc-pCVTZ+Rydberg | ||||

N_{1s} → π^{∗} | 402.04 | 0.242 0 | 402.04 | 0.228 4 |

N_{1s} → 3s | 407.61 | 0.004 2 | 407.60 | 0.004 5 |

. | Full . | CVS . | ||
---|---|---|---|---|

Excitation . | Energy (eV) . | f
. | Energy (eV) . | f
. |

H_{2}O – aug-cc-pCVTZ(O,H)+Rydberg | ||||

O_{1s} → 3s | 535.68 | 0.012 8 | 535.68 | 0.012 1 |

O_{1s} → 3p | 537.47 | 0.026 2 | 537.47 | 0.025 6 |

CO – aug-cc-pCVTZ(C,O)+Rydberg | ||||

C_{1s} → π^{∗} | 288.21 | 0.168 5 | 288.18 | 0.152 1 |

O_{1s} → π^{∗} | 535.85 | 0.081 3 | 535.84 | 0.077 7 |

Ne – aug-cc-pCVTZ+Rydberg | ||||

1s → 3p | 868.17 | 0.011 87 | 868.21 | 0.011 74 |

1s → 4p | 869.84 | 0.003 50 | 869.88 | 0.003 45 |

1s → 5p | 870.42 | 0.001 51 | 870.47 | 0.001 49 |

1s → 6p | 870.71 | 0.000 86 | 870.75 | 0.000 85 |

NH_{3} – aug-cc-pCVTZ(N,H)+Rydberg | ||||

N_{1s} → 3s | 402.14 | 0.006 1 | 402.13 | 0.005 7 |

N_{1s} → 3p | 403.80 | 0.039 8 | 403.78 | 0.039 2 |

N_{1s} → 3p | 404.48 | 0.006 5 | 404.46 | 0.006 5 |

C_{2}H_{4} – aug-cc-pCVDZ (C)/cc-pVDZ(H)+Rydberg | ||||

C_{1s} → π^{∗} | 287.47 | 0.095 2 | 287.46 | 0.089 0 |

C_{1s} → 3s | 289.93 | 0.009 1 | 289.92 | 0.009 3 |

C_{1s} → 3p | 290.57 | 0.026 5 | 290.56 | 0.026 2 |

HF – aug-cc-pCVTZ(F)/cc-pVDZ(H)+Rydberg | ||||

F_{1s} → 4σ^{∗} | 689.09 | 0.023 5 | 689.12 | 0.022 6 |

692.78 | 0.005 8 | 692.82 | 0.005 9 | |

692.94 | 0.011 1 | 692.96 | 0.010 9 | |

N_{2} – aug-cc-pCVTZ+Rydberg | ||||

N_{1s} → π^{∗} | 402.04 | 0.242 0 | 402.04 | 0.228 4 |

N_{1s} → 3s | 407.61 | 0.004 2 | 407.60 | 0.004 5 |

Applying the CVS has a small, sometimes almost negligible effect, on the excitation energies (less than 0.05 eV in all cases). The differences on the oscillator strengths are also very modest and irrelevant for all practical purposes.

In Table II, we illustrate the performance of the proposed perturbative correction to the CVS-CCSD core-excitation energies by comparing CVS and perturbatively corrected CVS results (label PT-CVS) with results obtained retaining the core-valence coupling. For the selected *K*-edge energies, the correction is very small, as also expected from the results in Table I. For the given *L*-edge (no spin-orbit coupling included), the importance of the correction increases depending on whether excitations from the inner core orbitals (1*s*, 2*s*) are omitted or not.

. | CVS . | PT . | PT-CVS . | Full . |
---|---|---|---|---|

Ne – aug-cc-pCVTZ+Rydberg | ||||

1s → 3p | 868.214 6 | −0.039 6 | 868.174 9 | 868.17 |

1s → 4p | 869.878 1 | −0.040 3 | 869.837 8 | 869.84 |

1s → 5p | 870.496 2 | −0.040 4 | 870.455 8 | 870.42 |

H_{2}S – aug-cc-pCVDZ(S)/cc-pVDZ(H) | ||||

2p → _{z}b_{2}^{a} | 167.537 91 | −0.011 44 | 167.526 47 | 167.526 37 |

2p → _{z}b_{2}^{b} | 167.543 34 | −0.016 87 | 167.526 47 | 167.526 37 |

2p → _{z}b_{2}^{c} | 167.678 73 | −0.151 86 | 167.526 87 | 167.526 37 |

. | CVS . | PT . | PT-CVS . | Full . |
---|---|---|---|---|

Ne – aug-cc-pCVTZ+Rydberg | ||||

1s → 3p | 868.214 6 | −0.039 6 | 868.174 9 | 868.17 |

1s → 4p | 869.878 1 | −0.040 3 | 869.837 8 | 869.84 |

1s → 5p | 870.496 2 | −0.040 4 | 870.455 8 | 870.42 |

H_{2}S – aug-cc-pCVDZ(S)/cc-pVDZ(H) | ||||

2p → _{z}b_{2}^{a} | 167.537 91 | −0.011 44 | 167.526 47 | 167.526 37 |

2p → _{z}b_{2}^{b} | 167.543 34 | −0.016 87 | 167.526 47 | 167.526 37 |

2p → _{z}b_{2}^{c} | 167.678 73 | −0.151 86 | 167.526 87 | 167.526 37 |

^{a}

Include excitations from 1*s*, 2*s*, and 2*p*.

^{b}

Include excitations from 2*s* and 2*p*.

^{c}

Include excitations only from 2*p*.

As larger scale application of the approach, we present in Figure 1 the NEXAFS spectra of uracil at the C and O edges. A planar B3LYP/cc-pVTZ optimized structure was used in the calculations, together with the aug-cc-pCVDZ basis on the core-active nuclei, aug-cc-pVDZ on the other heavy nuclei, and cc-pVDZ on H. The computed spectra are compared with the experimental gas-phase spectra of Ref. 33. A broadening parameter of 1000 cm^{−1} was adopted. The computed spectra have been shifted towards lower energy, in order to align with the first experimental peak. Both *K*-edge spectra well reproduce the experimental profiles, with small difference that we primarily attribute to limitations in the chosen basis sets.

Finally, in Table III, we collect the results for the core-ionisation potentials obtained applying the CVS approximation in conjunction with restricted excitation into a continuum orbital. Some comparative theoretical results from Ref. 8 are also reported, together with experimental results. Our results are in line with previous findings and confirm the validity of the proposed recipe to obtain core-ionization potentials.

System . | Basis . | Ionization . | CCSD . | ΔUGA-SUMRCC^{8}
. | Expt. . |
---|---|---|---|---|---|

H_{2}O | cc-pVDZ | O 1s^{−1} | 542.73^{a} | 541.97 | 539.78 |

cc-pVTZ | 539.93^{a} | 539.02 | |||

cc-pCVTZ | 540.40^{a} | 539.24 | |||

CO | cc-pVTZ | C 1s^{−1} | 296.53^{a} | 295.25 | 296.2^{b} |

cc-pCVTZ | 297.10^{a} | 295.67 | |||

cc-pVTZ | O 1s^{−1} | 542.95^{a} | 542.5^{b} | ||

cc-pCVTZ | 543.43^{a} | ||||

N_{2} | cc-pVTZ | N 1s^{−1} | 409.91 | 409.9^{b} | |

cc-pCVTZ | 410.44 | ||||

HF | cc-pVTZ | F 1s^{−1} | 694.17^{a} | 693.80 | |

cc-pCVTZ | 694.59^{a} | 693.40 | |||

cc-pVTZ | 694.00^{c} | ||||

cc-pCVTZ | 694.42^{c} |

System . | Basis . | Ionization . | CCSD . | ΔUGA-SUMRCC^{8}
. | Expt. . |
---|---|---|---|---|---|

H_{2}O | cc-pVDZ | O 1s^{−1} | 542.73^{a} | 541.97 | 539.78 |

cc-pVTZ | 539.93^{a} | 539.02 | |||

cc-pCVTZ | 540.40^{a} | 539.24 | |||

CO | cc-pVTZ | C 1s^{−1} | 296.53^{a} | 295.25 | 296.2^{b} |

cc-pCVTZ | 297.10^{a} | 295.67 | |||

cc-pVTZ | O 1s^{−1} | 542.95^{a} | 542.5^{b} | ||

cc-pCVTZ | 543.43^{a} | ||||

N_{2} | cc-pVTZ | N 1s^{−1} | 409.91 | 409.9^{b} | |

cc-pCVTZ | 410.44 | ||||

HF | cc-pVTZ | F 1s^{−1} | 694.17^{a} | 693.80 | |

cc-pCVTZ | 694.59^{a} | 693.40 | |||

cc-pVTZ | 694.00^{c} | ||||

cc-pCVTZ | 694.42^{c} |

## IV. CONCLUSION

A practical approach to core-level excitations spectra and ionization potentials is proposed based on the core-valence separation for a coupled cluster wave function ansatz. For the *K*-edge core-absorption excitations, test results show deviations of at most a few hundredths of eV from the results obtained allowing for all valence excitations. No significant loss in accuracy is thus observed when decoupling core and valence excitations. For *L* edges, the difference increases slightly, and partly depends on which core orbitals are included. The perturbative correction is shown to correct this error leading to highly accurate core-excitation energies.

The approach, here illustrated only at the CCSD level, has been extended to all members of the CC hierarchy, CCS, CC2, CCSD, CC3, and also to CCSDR(3). A more extensive benchmark study is in progress and will be presented at a later stage.

## Acknowledgments

We acknowledge useful discussions with R. H. Myhre, M. Stener, F. Pawlowski, P. Norman, D. Mukherjee and O. Christiansen. S. C. acknowledges financial support from the AIAS-COFUND program (Grant Agreement No. 609033). H.K. acknowledges financial support from the FP7-PEOPLE-2013-IOF funding scheme (Project No. 625321). H.K. thanks T. J. Martinez for hosting the project at Stanford University. The COST Action Nos. MP1306 “Modern Tools for Spectroscopy on Advanced Materials (EUSPEC)” and CM1204 “XUV/X-ray light and fast ions for ultrafast chemistry (XLIC)” are also acknowledged.