In this paper, we analyze new approximations of the Green’s function coupled cluster (GFCC) method where locations of poles are improved by extending the excitation level of inner auxiliary operators. These new GFCC approximations can be categorized as the GFCC-i(*n*, *m*) method, where the excitation level of the inner auxiliary operators (*m*) used to describe the ionization potential and electron affinity effects in the *N* − 1 and *N* + 1 particle spaces is higher than the excitation level (*n*) used to correlate the ground-state coupled cluster wave function for the *N*-electron system. Furthermore, we reveal the so-called “*n* + 1” rule in this category [or the GFCC-i(*n*, *n* + 1) method], which states that in order to maintain size-extensivity of the Green’s function matrix elements, the excitation level of inner auxiliary operators *X*_{p}(*ω*) and *Y*_{q}(*ω*) cannot exceed *n* + 1. We also discuss the role of the moments of coupled cluster equations that in a natural way assures these properties. Our implementation in the present study is focused on the first approximation in this GFCC category, i.e., the GFCC-i(2,3) method. As our first practice, we use the GFCC-i(2,3) method to compute the spectral functions for the N_{2} and CO molecules in the inner and outer valence regimes. In comparison with the Green’s function coupled cluster singles, doubles results, the computed spectral functions from the GFCC-i(2,3) method exhibit better agreement with the experimental results and other theoretical results, particularly in terms of providing higher resolution of satellite peaks and more accurate relative positions of these satellite peaks with respect to the main peak positions.

## I. INTRODUCTION

Green’s function formalism^{1–6} is a broadly used tool to calculate properties such as total energies, densities, ionization potentials (IPs), electron affinities (EAs), and neutral excitations for molecules, clusters, nano-structures, and solids. This flexibility of the Green’s function formalism, stemming from the possibility of characterizing many-body systems in terms of electron removal and/or addition processes, has motivated numerous developments towards providing a theoretical framework that can efficiently combine accuracy and relatively low numerical cost. Various computational schemes based on the algebraic and diagrammatic perturbative approaches have been developed so far. Typical examples include the outer valence Green’s function (OVGF) approach,^{7–10} the diagonal third-order self-energy approximation (D3) (for the review of diagonal Green function approximations, see Ref. 11), the non-diagonal renormalized second-order approach (NR2),^{12} the two-hole-one-particle Tamm-Dancoff approximation (2*h* − *p* TDA),^{13} the third-order quasiparticle method P3,^{14} the algebraic-diagrammatic construction (ADC),^{8,15,16} and the algebraic approaches based on the intermediate-state representations.^{17,18} In addition, the state-of-the-art *GW* approximations (one-body Green’s function *G* and dynamically screened Coulomb interaction *W*)^{19–24} permeated many areas of computational chemistry and materials sciences.^{25–27}

Recently, one could witness an increasing interest in developing Green’s function approaches accounting for higher-order correlation effects, which turns out to be indispensable in describing correlated behavior of electrons. These efforts serve multiple purposes and have to meet certain requirements. For example, in order to support the analysis and interpretation of photoelectron and various X-ray spectroscopies, the Green’s function formulations have to efficiently deal with various energy regimes. Also, as can be seen from a class of Green’s function applications that are motivated by the development of various embedding methods including dynamical mean field theories (DMFT)^{28–34} and self-energy embedding theory (SEET)^{35–38} approaches, very accurate representations of Green’s function and/or corresponding self-energies are necessary in describing the so-called impurity regions, which are usually associated with the presence of strong correlation effects.

Hierarchical approximations have been recently proposed to describe self-energies using perturbative techniques for one-particle many-body Green’s function (MBGF)^{39,40} that have been used to re-derive linked cluster and irreducible-diagram theorems as well as to provide algorithms for the general order component of the self-energy. A lot of attention has also been attracted by the Green’s function formulation proposed by Nooijen and Snijders,^{41–43} who successfully the employed coupled cluster (CC) bi-orthogonal formalism to express Green’s function matrix in terms of the cluster operator (*T*) and the so-called Λ operator, which is frequently used in linear response CC (LRCC) theory.^{44,45} In particular, the latter approach provides a natural elimination of the problems associated with normalization of the ground-state wave function in exponential CC representation. More recently, algorithmic developments for calculating Green’s function CC (GFCC) have been focused on the design and solving of the auxiliary *ω*-dependent ionization potential/electron affinity equation-of-motion CC (IP/EA-EOM-CC) type *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators acting in the *N* − 1 and *N* + 1 electron spaces (see Refs. 46 and 47 for details). This representation can be used to prove/derive basic properties of GFCC including connected character of Green’s function matrix elements and their arbitrary-order *ω*-derivatives.^{48,49} The possibility of calculating the GFCC matrix and their *ω*-derivatives, through the Dyson equations, extends in a natural way to the corresponding self-energies. These properties enable one to calculate spectral functions, pole strengths, and other properties using the GFCC method.

In this paper, we further extend the analysis carried out in the previous papers.^{46–50} In particular, motivated by the accuracy attainable by the GFCC model with singles, doubles, and triples (GFCCSDT),^{47} we explore the possibilities of further improving accuracies of calculated pole locations by only increasing the excitation level in the auxiliary inner operators *X*_{p}(*ω*) and *Y*_{q}(*ω*). To this end, we discuss the general class of GFCC-i(*n*, *m*) approximations, where *n* is the rank of excitations included in the *T* and Λ operators, while *m* designates the excitation level used to define *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators. Also, using consistency conditions between *N* and *N* − 1/*N* + 1 particle systems introduced by Nooijen and Snijders,^{42} for the first time ever, we analyze results of the combined GFCC-i(*n*, *m*) formalisms from the point of view of properties of approximate GFCC spectral functions. We will demonstrate that, if *m* > *n* + 1, the disconnected components in the equations for *X*_{p}(*ω*) and *Y*_{q}(*ω*) will lead to the appearance of disconnected components in the corresponding Green’s function. These disconnected components originate in the non-vanishing higher-order moments of the CC equations for the *N*-electron ground-state problem. Moreover, if *m* = *n* + 1, the equations for *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators are defined in terms of connected diagrams, which leads in turn to the fully connected representation of the corresponding GFCC-i(*n*, *n*+1) Green’s function. We will refer to this property of the GFCC-i(*n*, *m*) formulation as an “*n* + 1”-rule. We will illustrate the performance of the GFCC-i(*n*, *n* + 1) formalism by applying the GFCC-i(2,3) method to the computation of the spectral functions of the N_{2} and CO benchmark systems in the inner and outer valence regime, where several challenging satellite peaks have been identified. The GFCC-i(2,3) spectral functions/pole locations will be compared with those obtained with GFCCSD, IP-EOM-CC, configuration interaction (CI), and the experimental results. We will also discuss the favorable numerical scaling of the GFCC-i(2,3) method, which obviates the need of correlating the ground state of the *N*-electron at the CCSDT level that is associated with very high memory requirements limiting the application area of the GFCCSDT method.

## II. METHODOLOGY

The standard form for the matrix elements of the one-body Green function written in terms of a normalized ground-state wave function |Ψ⟩ for the *N*-electron system can be represented as

To construct the GFCC formulation, the direct use of the CC parameterization of the wave function is prohibitive, as in the general case, the exponential form of the wave function cannot be easily normalized. To overcome the normalization problem, Nooijen and Snijders^{41–43} first employed the biorthogonal CC formalism (see, for example, Refs. 51 and 52 for recent reviews of the biorthogonal CC method) to Eq. (1). Similarly, we can also introduce the bi-variational CC formalism into Eq. (1). In the bi-variational formalism, the parameterizations of the bra (⟨Ψ|) and ket (|Ψ⟩) are defined as follows:

Plugging Eqs. (2) and (3) to Eq. (1) and introducing resolution of identity 1 = *e*^{−T}*e*^{T}, the Green’s function CC formulation can then be expressed as

where |Φ⟩ is the reference function, the *ω* parameter denotes the frequency, and the imaginary part *η* is often called the broadening factor. The similarity transformed operators $H\xafN$ (in its norm product representation), *ā*_{p}, and $\u0101q\u2020$ are defined as

Here, *E*_{0} is the CC energy, *T* is the cluster operator, and Λ is the de-excitation operator. *T* and Λ are defined as

with $ta1\u2026ani1\u2026in$ and $\lambda i1\u2026ina1\u2026an$ being the antisymmetric amplitudes and the indices *i*, *j*, *k*, … (*i*_{1}, *i*_{2}, … ) and *a*, *b*, *c*, … (*a*_{1}, *a*_{2}, … ) corresponding to occupied and unoccupied spin-orbitals in the reference function |Φ⟩, respectively. *E*_{0}, *T*, Λ can be obtained from the conventional CC equations

where the *Q* is a projection operator,

representing the projection onto the subspace spanned by excited configurations $|\Phi i1\u2026ina1\u2026an$ defined as $aa1\u2020\u2026aan\u2020ain\u2026ai1|\Phi $. Note that other parameterizations of the *N*-electron wave function can also be used to generate their corresponding Green’s function formulations.

To evaluate the *G*_{pq}(*ω*) matrix elements from Eq. (4), one can formally diagonalize the non-Hermitian effective Hamiltonian $H\xaf=e\u2212THeT$ in the (*N* ± 1)-particle space. For the evaluation of the *G*_{pq}(*ω*) matrix elements in the CCSD level, the dimension of the secular matrix ($H\xafN$) is $no+no2nv$ for the retarded part and $nv+nv2no$ for the advanced part, where *n*_{o} and $nv$ denote the numbers of occupied molecular orbitals and virtual molecular orbitals, respectively. When systems of interest are getting larger, the increasing number of orbitals prohibits the direct diagonalization of the secular matrix, and an iterative algorithm such as *Krylov* subspace methods is often used. However, since only part of the spectrum is solved (or converged) by the conventional iterative subspace diagonalization methods, the introduction of the incomplete eigenpairs to the GFCC formulation would lead to a sum-over-states representation of the GFCC formulation (see Ref. 47), which is usually unfavored in a sense that there is unclearness of how many states should be involved in the representation. Besides, for systems that exhibit a significant many-body effect in the spectral region of interest, solving the secular equations by the conventional iterative diagonalization is often subject to nontrivial construction of initial vectors and convergence problems. To circumvent these problems, approximations such as core-valence separation (CVS)^{53} have been introduced in the high-order Green’s function method to compute the K-shell ionization spectra of small and medium-size molecules.^{54,55} Essentially, the CVS approximation neglects the coupling between core- and valence-excited states (set certain types of Coulomb integrals to zero in practice) and, therefore, reduces the dimension of the effective Hamiltonian for the problem of interest. The reduction of dimension of the effective Hamiltonian becomes more significant for core-electron ionization than for the valence-electron ionization, and the associated error was estimated to be 0.5–1.0 eV for the former and much smaller for the latter.^{56} In particular, for the core-electron ionization, the performance of the convergence would be greatly improved when the CVS is employed, and vast majority states of interest would then be easily positioned. In the context of wave-function-based or propagator-based *ab* initio methods, the CVS approximation has been applied to the ADC, CC2, CCSD, CC3, and CCSDR(3) methods see Ref. 57 for a recent review). Recently, on the computational side, more algorithms have also been proposed for iterative matrix-free eigensolvers [such as the asymmetric Lanczos-chain-driven subspace algorithm,^{58} energy-specific Davidson algorithm,^{59} Generalized Preconditioned Locally Harmonic Residual (GPLHR) method,^{60} etc.] and have been proposed to work with the LRCC or EOM-CC method to solve not only the lowest/highest eigenvalues but also the inner eigenvalues embedded deeply in the spectrum of the large Jacobian matrix such that accurate resolution can be obtained for the frequency window of interest, and viable approximation can even be generated for the entire spectrum. For the GFCC calculation, as mentioned in a recent study of the optical potential calculation for the nucleon-nucleon scattering,^{61} the advantage of using the Lanczos-based method lies in the fact that the resolution of the tridiagonal matrix representing the normal-ordered Hamiltonian matrix in the Lanczos basis only needs to be done once for all frequencies (*ω*’s). Alternatively, we recently proposed to first solve a set of linear equations for IP/EA-EOMCC type vectors (using their zeroth and first order terms as starting points) and then to contract these vectors with the converged amplitudes of the CC Λ de-excitation operators to get the GFCC matrix elements.^{46,47} The computational approach was designed to calculate GFCC for the whole complex plane; therefore, it includes all poles of the GFCC structure and serves naturally for, for example, an embedding scheme when working with low-level methods for large-scale applications. Facilitated by this approach, we then were able to prove the connected character of the diagrams contributing to GFCC matrix elements, as well as the connected character of its *n*th order derivative with respect to the energy and the corresponding CC self-energy operators,^{48,49} which provided a useful guidance for designing and analyzing new and size-extensive GFCC approximation schemes. Furthermore, due to its algebraic structure, the proposed method is highly scalable and is capable of computing the spectral function for a given molecular system in any energy region. We recently demonstrated this capability for several typical molecular systems (such as H_{2}O, N_{2}, CO, 1,3-butadiene, benzene, and adenine molecules).^{50} Consistent with previous ADC results,^{62–65} satellite peaks have been observed from the computed spectral functions within the GFCCSD framework in the energy regions where the many-body effect becomes significant and the single particle picture of ionization often breaks down.^{66,67} It should be emphasized here that solving a set of linear equations and diagonalizing the secular matrix are purely two different computational procedures with similar scaling (i.e., $O(N6)$ for CCSD) and a different computational procedure in the same GFCC theoretical context will not add anything new to the theory itself. Therefore, the peak positions obtained from the two computational procedures should be exactly the same within the same truncated subspace. Also, it needs to point out that the CVS approximation mentioned above is in principle applicable to either a diagonalization routine or a linear solver since similar tensor contractions need to be performed in both routines (see $H\xafNXp$ in the equations below). Nevertheless, it should be pointed out (also see the discussion in Sec. III) that the disadvantage of solving linear equations in comparison to the diagonalization routine lies in these facts that (i) a much smaller frequency interval is necessary in order to explore some detailed information (in particular, the state of interest with weak intensity) which unavoidably increases the computation cost and (ii) it fails to identify any dark state. Despite these disadvantages, solving linear equations has been chosen by some other studies recently for the computation of the spectral function of, for example, uniform electron gas,^{68} light atoms,^{69} heavy metal atoms,^{70} and simple 1-D periodic systems.^{71} A key step in our procedure is to, in a computationally tractable way, introduce *ω*-dependent IP/EA-EOMCC type operators *X*_{p}(*ω*) in an (*N*−1)-electron Hilbert space and *ω*-dependent EA-EOMCC type operators *Y*_{q}(*ω*) in an (*N* + 1)-electron Hilbert space,

which satisfy

and can be used to write a compact expression for the GFCC,

Typical approximations in the GFCC formulation discussed in early papers^{46–50} are defined by limiting the rank of excitation included in the all *T*, Λ, *X*_{p}(*ω*), and *Y*_{q}(*ω*) operators. This general approximation procedure determines the class of correlation effects used to describe the ground state of the *N*-electron system but also impacts the quality of pole locations described by *X*_{p}(*ω*) and *Y*_{q}(*ω*) for (*N* − 1)- and (*N* + 1)-electron systems. For example, in the GFCCSD approximation (GFCC with singles and doubles), the expansions for these operators are truncated at the singles and doubles level, i.e.,

*X*_{p}(*ω*) and *Y*_{q}(*ω*) operators are determined from

where subspaces on which Eqs. (23) and (24) are projected on are chosen to reflect excitation included in the CCSD *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators. This leads to the following for the CCSD Green’s function:

which has been used in several studies to determine the efficiency of the GFCCSD approximations.

As discussed in Ref. 47, the GFCC formalism offers flexibility in treating the correlation effect. In particular, different levels of approximations can be used to describe ground-state effects for the *N*-electron systems (*T* and Λ operators) and location of poles for *N* − 1 and *N* + 1 systems (*X*_{p}(*ω*) and *Y*_{q}(*ω*) operators). In this paper, we will establish a simple rule that assures that the basic properties of the GFCC formalism, such as connected character of contributing diagrams, are still preserved. We will refer to these schemes as GFCC-i(*n*, *m*), where *n* and *m* define the level of excitations used to define approximate operators *T*/Λ and *X*_{p}(*ω*)/*Y*_{q}(*ω*), respectively, and ‘i’ stands for the inner space indicating that two-levels of approximations are used. An interesting feature of the GFCC-i(*n*, *m*) is the fact that, when *m* = *N*, the *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators reproduce exact location of poles in the *N* − 1 and *N* + 1 spaces. This is a straightforward consequence that similarity transformation *e*^{−T}*He*^{T} does not change the spectral properties of the electronic Hamiltonian *H*.

As a specific example, let us focus on the GFCC-i(2,3) method. In this case, in addition to singly (*X*_{p,1}(*ω*)/*Y*_{q,1}(*ω*)) and doubly (*X*_{p,2}(*ω*)/*Y*_{q,2}(*ω*)) excited components of *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators, we add three-body terms (*X*_{p,3}(*ω*)/*Y*_{q,3}(*ω*)), which leads to the following definitions of these operators:

At the same time, the *T* and Λ operators in the GFCC-i(2,3) are given by Eqs. (19) and (20). The working equations for the *X*_{p}(*ω*) and *Y*_{q}(*ω*) in the GFCC-i(2,3) method are given by expressions

The key role in assuring the connected character of the CC Green’s function matrix elements was played by the fact that the *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators are expressible in terms of connected diagrams. In Refs. 48 and 49, we demonstrated that this feature is a consequence of the connected character of equations for the *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators. It is interesting to analyze this property from the point of view of Eqs. (28) and (29). By closer inspection [see Figs. 1(a) and 1(b)], one can notice that the only disconnected diagrams contributing to Eq. (28) correspond to diagrams that involve elements of $H\xafN$ corresponding to CCSD equations [examples on these diagrams corresponding to projections on triples are also shown in Figs. 1(a) and 1(b); analogous behavior can also be found for the *Y*_{q}(*ω*) equations] and numerically disappear once the CCSD ground state converges. Therefore, the equations for *X*_{p}(*ω*) and *Y*_{q}(*ω*) GFCC-i(2,3) amplitudes [i.e., Eqs. (28) and (29)] involve only connected diagrams. Given the fact that GFCC-i(2,3) matrix elements are given by the same expression, Eq. (25), as in the GFCCSD case [with the difference that in the GFCC-i(2,3) case, *X*_{p,i}(*ω*) and *Y*_{q,i}(*ω*) (*i* = 1, 2) are iterated in the presence of triply excited amplitudes *X*_{p.3}(*ω*) and *Y*_{q,3}(*ω*)], their diagrammatic expansion contains connected diagrams only.

However, one should proceed with caution when extending GFCC-i(2,3) to higher *m*’s. For example, in the GFCC-i(2,4), a disconnected diagram shown in Fig. 1(c) appears in the equations for quadruples (*X*_{p,4}(*ω*) equations), which, as explained in Fig. 2, gives rise during the iterations to disconnected components entering *X*_{p}(*ω*) operators that correspond to lower order excitations. The provenance of these problems is associated with the non-vanishing character of higher-order moments of the CCSD equations [see Refs. 72 and 73], which are not used to define CCSD equations. Therefore, the only GFCC-i(*n*, *m*) methods that guarantee the connectivity of the corresponding matrix elements of the Green’s function are of the GFCC-i(*n*, *n* + 1) type.

Generally speaking, the GFCC-i(*n*, *m*) formalism can be viewed as an extension of previous studies of approximate inclusion of higher-order excitations in the IP/EA-EOMCC methods represented by a broad class of iterative and non-iterative methodologies including EOMIP-CCSD^{*},^{74–76} CCSDT-n (n = 1, 2, 3),^{77–81} CCn (n = 2, 3),^{82,83} EOMCC(m,n),^{84–86} and EOM-CCSD(T)(a)^{*} formalisms.^{87,88} Moreover, the EOM-CC(m,n) approach introduced by Hirata *et al.* combines different levels of approximations for the ground-state CC ansatz and equation-of-motion CC operators *R*_{K}, for vertical excitation energies, ionization potential, and electron affinities have already been applied to molecular systems. Several problems of the EOM-CC(m,n) class of methods stemming from possible inconsistencies between approximation levels of the *T* and *R*_{K} operator approaches and resulting in the lack of the size-intensivity of the calculated excitation energies have been discussed by Splipchenko and Krylov.^{86} In this paper, for the first time, we demonstrate that similar procedures applied to *X*_{p}(*ω*) and *Y*_{q}(*ω*) operators (which in contrast to the *R*_{K} operators are determined by solving linear equations) may violate size-extensive character of the corresponding Green’s function if the difference between excitation levels used to approximate *X*_{p}(*ω*)/*Y*_{q}(*ω*) operators and *T* operators is bigger than one.

In this paper, we will focus on the GFCC-i(2,3) method which introduces triple excitations in describing correlation effects needed for the accurate determination of the poles of the Green’s function. In contrast to the GFCCSDT method, the GFCC-i(2,3) formalism bypasses memory demands for storing *T*_{3} amplitudes (proportional to $no3nv3$, where *n*_{o} and $nv$ refer to the total number of occupied and virtual orbitals, respectively), and the maximum memory demand in the GFCC-i(2,3) formalism comes from storing *X*_{p,3}(*ω*) (*Y*_{q,3}(*ω*)) amplitudes that is proportional to $no3nv2$ ($no2nv3$).

To obtain the working equations for the GFCC-i(2,3) method, we have chosen a form of Eq. (28), in which the projections on singles and doubles are the same as in GFCCSDT [see Eqs. (30) and (31), where the full form of *X*_{p,3}(*ω*)-dependent coupling terms including $Q1(N\u22121)VNXp,3(\omega )|\Phi $, $Q2(N\u22121)VNXp,3(\omega )|\Phi $, and $Q2(N\u22121)(VNT1Xp,3(\omega ))C|\Phi $ is maintained],

As to the projections on triples, in contrast to GFCCSDT, we chose to maintain the leading contributing *X*_{p,1}, *X*_{p,2}, and *X*_{p,3} operators, i.e.,

where the one-particle part $FN=\u2211r\u03f5rN[ar\u2020ar]$ and the two-particle part $VN=14\u2211p,q,r,s\u2009vpqrsN[ap\u2020aq\u2020asar]$, with $vpqrs$ denoting antisymmetrized two-electron integrals and *N*[…] designating normal ordered form of a given second-quantized expression.

Remarkably, analogous to CCSDT-n (n = 1, 2, 3) and CC3 methods, in Eq. (32), the triply excited component *X*_{p,3}(*ω*) only contracts with zeroth order terms; i.e., the 3*h*2*p* − 3*h*2*p* coupling is only represented by the lowest-order contribution stemming from the Fock operator (note that full inclusion of $H\xafN$ in the 3*h*2*p* − 3*h*2*p* coupling would significantly increase the numerical overhead from $O(N6)$ to $O(N7)$ per iteration), which allows the on-the-fly determination of *X*_{p,3}(*ω*) as a function of *X*_{p,1}(*ω*) and *X*_{p,2}(*ω*) (hereafter, we call *X*_{p,3}(*ω*) of this type internal *X*_{p,3}(*ω*) or internal triple). That is, based on Eq. (32), the components of *X*_{p,3}(*ω*), $xab,pijk(\omega )$ (including both the real and imaginary parts) can be obtained through

where

and $\Delta \u03f5abijk(\omega )=\omega +\u03f5a+\u03f5b\u2212\u03f5i\u2212\u03f5j\u2212\u03f5k$, with *A*_{1}{…} and *A*_{2}{…} representing the corresponding antisymmetric permutation operators and *ϵ*_{r} denoting the *r*th Hartree-Fock (HF) orbital energy. In the above expressions, *i*, *j*, *k*, *l*, … denote the occupied Hartree-Fock orbital indices, *a*, *b*, *c*, *d*, … denote the virtual Hartree-Fock orbital indices, and *p*, *q*, *r*, *s*, … denote the general Hartree-Fock orbital indices. Note that the computational cost of Eqs. (33) and (34), for a given HF orbital *p* and frequency *ω*, only scales as $no3nv2$, while the cost of Eq. (35) scales as $no3nv3$.

Once *X*_{p,3}(*ω*) is generated, the major goal is then to solve Eqs. (30) and (31), which can alternatively be represented using the following compact form:

in which the index *p* runs over the entire spin-orbital domain and **b**_{p} is a real free term. As discussed in our previous work, by representing **x**_{p}(*ω*) as a vector

then the matrix form (36) becomes

Equation (38) can be approximately solved by iterative procedures (i.e., linear equation solvers). Once the one-body and two-body parts of **x**_{p}(*ω*) are converged, the GFCC matrix element can then be obtained through Eq. (25).

The overall procedure of executing a GFCC-i(2,3) calculation for the retarded part of *G*(*ω*) can then be summarized as three parts, i.e., (Λ-)CCSD calculations that scale as $no2nv4$, the GFCC-i(2,3) iterative part in which each iteration step scales as $no3nv3$ iteration for a given pair of *p* and *ω*, and one-shot contraction [Eq. (25)] that scales as $no2nv2$ for a given pair of *p* and *ω*. It is worth mentioning that, in comparison with the GFCCSDT method, the GFCC-i(2,3) method does not require the cumbersome $no3nv5$ (Λ-)CCSDT calculations. Furthermore, in spite of having the same numerical scalings during the iteration, the iteration part in the GFCC-i(2,3) method bears a much smaller prefactor than that in the GFCCSDT method. Also, in terms of storage requirement, the GFCC-i(2,3) method eases off the storage of *X*_{p,3}(*ω*) at the end of the iteration. Besides, in general, due to the inherent independencies in the GFCC formalism (i.e., the calculation of *X*_{p}(*ω*) for a given pair of *p* and *ω* is independent of the calculation for another pair), our GFCC method is a strong scaling method that allows the efficient utilization of process groups in the calculation to significantly reduce time-to-solution and is suitable for the simulation of larger systems with large basis sets.^{46}

As above mentioned, most of the computation cost in the GFCC-i(2,3) calculation is spent on solving the linear equation, Eq. (38), which is closely related to the linear equation solvers used in the calculation. In the present study, we chose to solve Eq. (38) by exploiting the so-called direct inversion in the iterative subspace (DIIS) procedure. Different from the conventional DIIS approach, here we adapt the DIIS to the complex space. Suppose we have *K* micro-iteration iterates **x**_{p,j+1}(*ω*), …, **x**_{p,j+K}(*ω*), then the new **x**_{p,j+K+1}(*ω*) is obtained through a linear combination of the *K* micro-iteration iterates

where **r**_{p,j+k}(*ω*) represents the residual vector of **x**_{p,j+k}(*ω*), $x\u0303p,j+k+1(\omega )$ is an intermediate update of **x**_{p,j+k}(*ω*), and the coefficient vector $c=c1,\u2026,ck,\u2026,cKT$ is defined in the complex space. The goal is to minimize the DIIS least square functional $JDIIS\u2254\Vert \u2211k=1Kckrp,j+k\Vert 2$ by solving the following system of linear equations:

where $1=1,\u2026,1$ is a constant vector of length *K*, *ζ* is a complex *Lagrangian* multiplier, and the *m* × *m* matrix **B** is constructed through

If we assume that the $H\xafN$ matrix is diagonal dominant, the correction vector at the (*j* + *k*)-th step, **r**_{p,j+k}(*ω*), can be approximated through

with **D** being the orbital-energy dependent diagonal part of the $H\xafN$ matrix.

Figure (3) exhibits the convergence performance of our linear equation solver (DIIS adapted to complex space) in the frameworks of both GFCCSD and GFCC-i(2,3) methods for the calculation of **x**_{p}’s over several frequencies that are close to the first main peak and first satellite peak in the spectral function of an N_{2} molecule. The position of the first main peak is predicted to be at *ω* = −0.558 a.u. (−15.185 eV) by GFCCSD and at *ω* = −0.550 a.u. (−14.968 eV) by GFCC-i(2,3), respectively. The position of the first satellite peak is predicted to be at *ω* = −1.058 a.u. (−28.792 eV) by GFCCSD and at *ω* = −0.930 a.u. (−25.309 eV) by GFCC-i(2,3), respectively. Given the number of micro iterates to be 15, Figs. 3(a) and 3(c) show that the relative residual norm of **x**_{p} of the frequencies that are close to the main peak can be quickly reduced to be <10^{−5} after one DIIS update in both the GFCCSD and GFCC-i(2,3) calculations, while the convergence of **x**_{p} of the frequency domains that include the satellite peak is a bit slower [see Figs. 3(b) and 3(d)], i.e., to reach the same accuracy as for the main peak, at least one more DIIS update would be needed in either GFCCSD or GFCC-i(2,3) calculation.

This performance difference can be attributed to different peak features that impact the complication of solving the coupled linear equations [Eq. (38)]. Note that different from the main peaks which are dominated by the one-electron process, the satellite peaks have significant contributions from double excitations (two-electron process). Thus, in order to solve for **x**_{p}, in the frequency domain for main peaks, the iterating update would be basically on the one-body part of **x**_{p} whose dimension is only *n*_{o}, while in the frequency domain for satellite peaks, the iterating update would be instead on the larger two-body part of **x**_{p} whose dimension is $no2nv$. Apparently, more variables would usually require more iterations to help the linear equations converge.

## III. RESULTS AND DISCUSSION

Our preliminary accuracy test of the GFCC-i(2,3) method is focused on the computed spectral functions of the N_{2} and CO molecules and their comparisons with the GFCCSD spectral function as well as other theoretical results and experimental observations. The experimental geometries^{89} of the N_{2} and CO molecules were used in all the calculations. The conventional (Λ-)CCSD calculations of the two molecules were performed first by using the NWChem suite of quantum chemical codes.^{90} The converged *T* and Λ amplitudes, as well as the two-electron integral tensors, were then used by our pilot code to compute the IP-EOMCCSD type vector **x**_{p}(*ω*) and the *ω*-dependent retarded GFCC matrix elements. The spectral function is then given by the imaginary part of the retarded GFCC matrix,

The computed spectral functions of the N_{2} and CO molecules are shown in Fig. 4, and detailed information of the peaks is given in Tables I and II. The Hartree-Fock electronic configurations of the ground states of the N_{2} and CO molecules can be described as $(1\sigma g)2$$(1\sigma u)2$$(2\sigma g)2$$(2\sigma u)2$$(3\sigma g)2\u2009(1\pi u)4$ and (1*σ*)^{2} (2*σ*)^{2} (3*σ*)^{2} (4*σ*)^{2} (1*π*)^{4} (5*σ*)^{2}, respectively.

State . | . | Significant $GppR$ . | Main configurations . | . |
---|---|---|---|---|

(Symmetry) . | Energy (eV) . | (Contribution ≥0.05) . | (|C|^{2} ≥ 0.02)
. | Weight of |2h, p⟩’s
. |

$X(\u2009\u20092\Sigma g+)$ | −14.968 | $G55R$(0.93) | $3\sigma g\u22121$(0.91) | 0.02 |

A(^{2}Π_{u}) | −16.328 | $G66R(0.48),G77R$(0.48) | $1\pi u\u22121$(0.94) | 0.01 |

$B(\u2009\u20092\Sigma u+)$ | −18.233 | $G44R$(0.94) | $2\sigma u\u22121$(0.90), $1\pi u\u221213\sigma g\u221211\pi g$(0.03) | 0.04 |

$C(\u2009\u20092\Sigma u+)$ | −25.309 | $G44R$(0.95) | $2\sigma u\u22121$(0.20), $1\pi u\u221213\sigma g\u221211\pi g$(0.70), | 0.75 |

$1\pi u\u221213\sigma g\u221212\pi g$(0.06) | ||||

$F(\u2009\u20092\Sigma g+)$ | −29.119 | $G33R(0.93)$, $G55R(0.05)$ | $2\sigma g\u22121$(0.32), $3\sigma g\u22121$(0.04), $1\pi u\u221212\sigma u\u221211\pi g$(0.46), | 0.63 |

$2\sigma g\u221213\sigma u\u221211\pi g$(0.08), $1\pi u\u221212\sigma u\u221212\pi g$(0.02) | ||||

$G(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −32.657 | $G33R(0.16)$, $G44R(0.22)$, | $2\sigma g\u22121$(0.14), $2\sigma u\u22121$(0.03), $2\sigma u\u221213\sigma g\u221211\pi g$(0.55), | 0.79 |

$G66R(0.30)$, $G77R(0.30)$ | $1\pi u\u221213\sigma g\u221211\pi g$(0.17), $1\pi u\u221212\sigma u\u221211\pi g$(0.02), | |||

$2\sigma u\u221213\sigma g\u221212\pi g$(0.02) | ||||

$I(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −33.473 | $G33R(0.11),G44R$(0.84) | $2\sigma g\u22121$(0.08), $2\sigma u\u22121$(0.08), $1\pi u\u221213\sigma g\u221211\pi g$(0.68), | 0.78 |

$1\pi u\u221212\sigma u\u221211\pi g$(0.02), $3\sigma g\u221224\sigma u$(0.03) | ||||

$K(\u2009\u20092\Sigma g+)$ | −35.378 | $G33R$(0.98) | $2\sigma g\u22121$(0.38), $1\pi u\u221212\sigma u\u221211\pi g$(0.54), $2\sigma u\u221213\sigma g\u221214\sigma u$(0.02), | 0.60 |

$(\u2009\u20092\Sigma g+)$ | −38.643 | $G33R$(0.99) | $2\sigma g\u22121$(0.85), $1\pi u\u221212\sigma u\u221211\pi g$(0.08) | 0.13 |

$(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −413.07 | $G11R$(0.46), $G22R$(0.54) | $1\sigma g\u22121$(0.44), $1\sigma u\u22121$(0.51) | 0.05 |

State . | . | Significant $GppR$ . | Main configurations . | . |
---|---|---|---|---|

(Symmetry) . | Energy (eV) . | (Contribution ≥0.05) . | (|C|^{2} ≥ 0.02)
. | Weight of |2h, p⟩’s
. |

$X(\u2009\u20092\Sigma g+)$ | −14.968 | $G55R$(0.93) | $3\sigma g\u22121$(0.91) | 0.02 |

A(^{2}Π_{u}) | −16.328 | $G66R(0.48),G77R$(0.48) | $1\pi u\u22121$(0.94) | 0.01 |

$B(\u2009\u20092\Sigma u+)$ | −18.233 | $G44R$(0.94) | $2\sigma u\u22121$(0.90), $1\pi u\u221213\sigma g\u221211\pi g$(0.03) | 0.04 |

$C(\u2009\u20092\Sigma u+)$ | −25.309 | $G44R$(0.95) | $2\sigma u\u22121$(0.20), $1\pi u\u221213\sigma g\u221211\pi g$(0.70), | 0.75 |

$1\pi u\u221213\sigma g\u221212\pi g$(0.06) | ||||

$F(\u2009\u20092\Sigma g+)$ | −29.119 | $G33R(0.93)$, $G55R(0.05)$ | $2\sigma g\u22121$(0.32), $3\sigma g\u22121$(0.04), $1\pi u\u221212\sigma u\u221211\pi g$(0.46), | 0.63 |

$2\sigma g\u221213\sigma u\u221211\pi g$(0.08), $1\pi u\u221212\sigma u\u221212\pi g$(0.02) | ||||

$G(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −32.657 | $G33R(0.16)$, $G44R(0.22)$, | $2\sigma g\u22121$(0.14), $2\sigma u\u22121$(0.03), $2\sigma u\u221213\sigma g\u221211\pi g$(0.55), | 0.79 |

$G66R(0.30)$, $G77R(0.30)$ | $1\pi u\u221213\sigma g\u221211\pi g$(0.17), $1\pi u\u221212\sigma u\u221211\pi g$(0.02), | |||

$2\sigma u\u221213\sigma g\u221212\pi g$(0.02) | ||||

$I(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −33.473 | $G33R(0.11),G44R$(0.84) | $2\sigma g\u22121$(0.08), $2\sigma u\u22121$(0.08), $1\pi u\u221213\sigma g\u221211\pi g$(0.68), | 0.78 |

$1\pi u\u221212\sigma u\u221211\pi g$(0.02), $3\sigma g\u221224\sigma u$(0.03) | ||||

$K(\u2009\u20092\Sigma g+)$ | −35.378 | $G33R$(0.98) | $2\sigma g\u22121$(0.38), $1\pi u\u221212\sigma u\u221211\pi g$(0.54), $2\sigma u\u221213\sigma g\u221214\sigma u$(0.02), | 0.60 |

$(\u2009\u20092\Sigma g+)$ | −38.643 | $G33R$(0.99) | $2\sigma g\u22121$(0.85), $1\pi u\u221212\sigma u\u221211\pi g$(0.08) | 0.13 |

$(\u2009\u20092\Sigma g+,\u20092\Sigma u+)$ | −413.07 | $G11R$(0.46), $G22R$(0.54) | $1\sigma g\u22121$(0.44), $1\sigma u\u22121$(0.51) | 0.05 |

State . | . | Significant $GppR$ . | Main configurations . | . |
---|---|---|---|---|

(Symmetry) . | Energy/eV . | (Contribution ≥0.05) . | (|C|^{2} ≥ 0.02)
. | Weight of |2h, p⟩’s
. |

X(^{2}Σ^{+}) | −13.335 | $G77R$(0.98) | 5σ^{−1}(0.96) | 0.02 |

A(^{2}Π) | −16.600 | $G55R(0.50),G66R$(0.50) | 1π^{−1}(0.96) | 0.02 |

B(^{2}Σ^{+}) | −19.322 | $G44R$(0.97) | 4σ^{−1}(0.91) | 0.05 |

C(^{2}Σ^{+}) | −23.948 | $G33R$(0.09), $G44R$(0.65), | 4σ^{−1}(0.17), 5σ^{−1}(0.05), 5σ^{−1}1π^{−1}1π^{*}(0.62), | 0.70 |

$G77R$(0.21) | 5σ^{−1}1π^{−1}2π^{*}(0.04) | |||

E, G(^{2}Σ^{+},^{2}Π) | −28.030 | $G33R(0.05)$, $G44R(0.06)$, | 3σ^{−1}(0.06), 4σ^{−1}(0.04), 1π^{−1}(0.02), | 0.67 |

$G55R(0.43)$, $G66R(0.43)$ | 5σ^{−1}(0.16), 4σ^{−1}5σ^{−1}1π^{*}(0.12), | |||

1π^{−2}1π^{*}(0.20), 5σ^{−2}1π^{*}(0.04), | ||||

4σ^{−1}1π^{−1}1π^{*}(0.05), 5σ^{−1}1π^{−1}1π^{*}(0.17), | ||||

F(^{2}Σ^{+}) | −29.935 | $G33R(0.28)$, $G44R(0.34)$, | 3σ^{−1}(0.02), 4σ^{−1}(0.03), 5σ^{−1}(0.08), | 0.80 |

$G77R(0.33)$ | 5σ^{−1}1π^{−1}1π^{*}(0.59), 4σ^{−1}1π^{−1}1π^{*}(0.10), | |||

5σ^{−2}1σ^{*}(0.02) | ||||

I(^{2}Σ^{+}) | −31.840 | $G33R(0.84),G44R$(0.14) | 3σ^{−1}(0.29), 4σ^{−1}(0.03), 4σ^{−1}1π^{−1}1π^{*}(0.46), | 0.64 |

5σ^{−1}1π^{−1}1π^{*}(0.04) | ||||

J(^{2}Σ^{+}) | −34.017 | $G33R$(0.80), $G44R$(0.11), | 3σ^{−1}(0.17), 4σ^{−1}1π^{−1}1π^{*}(0.65), | 0.79 |

$G77R$(0.08) | 5σ^{−1}1π^{−1}1π^{*}(0.03), 4σ^{−1}5σ^{−1}6σ(0.02) | |||

K(^{2}Σ^{+}) | −35.650 | $G33R$(0.33), $G55R$(0.30), | 3σ^{−1}(0.27), 4σ^{−1}1π^{−1}1π^{*}(0.03), | 0.64 |

$G66R$(0.30) | 5σ^{−1}1π^{−1}1σ^{*}(0.54), 1π^{−2}1π^{*}(0.02) | |||

(^{2}Σ^{+}) | −37.555 | $G33R$(0.83), $G33R$(0.09) | 3σ^{−1}(0.31), 4σ^{−1}5σ^{−1}1σ^{*}(0.50), 5σ^{−2}1σ^{*}(0.03) | 0.61 |

(^{2}Σ^{+}) | −38.916 | $G33R$(0.99) | 3σ^{−1}(0.85), 4σ^{−1}1π^{−1}1π^{*}(0.04) | 0.14 |

(^{2}Σ^{+}) | −298.26 | $G22R$(0.99) | 2σ^{−1}(0.92) | 0.08 |

(^{2}Σ^{+}) | −546.73 | $G11R$(0.99) | 1σ^{−1}(0.97) | 0.03 |

State . | . | Significant $GppR$ . | Main configurations . | . |
---|---|---|---|---|

(Symmetry) . | Energy/eV . | (Contribution ≥0.05) . | (|C|^{2} ≥ 0.02)
. | Weight of |2h, p⟩’s
. |

X(^{2}Σ^{+}) | −13.335 | $G77R$(0.98) | 5σ^{−1}(0.96) | 0.02 |

A(^{2}Π) | −16.600 | $G55R(0.50),G66R$(0.50) | 1π^{−1}(0.96) | 0.02 |

B(^{2}Σ^{+}) | −19.322 | $G44R$(0.97) | 4σ^{−1}(0.91) | 0.05 |

C(^{2}Σ^{+}) | −23.948 | $G33R$(0.09), $G44R$(0.65), | 4σ^{−1}(0.17), 5σ^{−1}(0.05), 5σ^{−1}1π^{−1}1π^{*}(0.62), | 0.70 |

$G77R$(0.21) | 5σ^{−1}1π^{−1}2π^{*}(0.04) | |||

E, G(^{2}Σ^{+},^{2}Π) | −28.030 | $G33R(0.05)$, $G44R(0.06)$, | 3σ^{−1}(0.06), 4σ^{−1}(0.04), 1π^{−1}(0.02), | 0.67 |

$G55R(0.43)$, $G66R(0.43)$ | 5σ^{−1}(0.16), 4σ^{−1}5σ^{−1}1π^{*}(0.12), | |||

1π^{−2}1π^{*}(0.20), 5σ^{−2}1π^{*}(0.04), | ||||

4σ^{−1}1π^{−1}1π^{*}(0.05), 5σ^{−1}1π^{−1}1π^{*}(0.17), | ||||

F(^{2}Σ^{+}) | −29.935 | $G33R(0.28)$, $G44R(0.34)$, | 3σ^{−1}(0.02), 4σ^{−1}(0.03), 5σ^{−1}(0.08), | 0.80 |

$G77R(0.33)$ | 5σ^{−1}1π^{−1}1π^{*}(0.59), 4σ^{−1}1π^{−1}1π^{*}(0.10), | |||

5σ^{−2}1σ^{*}(0.02) | ||||

I(^{2}Σ^{+}) | −31.840 | $G33R(0.84),G44R$(0.14) | 3σ^{−1}(0.29), 4σ^{−1}(0.03), 4σ^{−1}1π^{−1}1π^{*}(0.46), | 0.64 |

5σ^{−1}1π^{−1}1π^{*}(0.04) | ||||

J(^{2}Σ^{+}) | −34.017 | $G33R$(0.80), $G44R$(0.11), | 3σ^{−1}(0.17), 4σ^{−1}1π^{−1}1π^{*}(0.65), | 0.79 |

$G77R$(0.08) | 5σ^{−1}1π^{−1}1π^{*}(0.03), 4σ^{−1}5σ^{−1}6σ(0.02) | |||

K(^{2}Σ^{+}) | −35.650 | $G33R$(0.33), $G55R$(0.30), | 3σ^{−1}(0.27), 4σ^{−1}1π^{−1}1π^{*}(0.03), | 0.64 |

$G66R$(0.30) | 5σ^{−1}1π^{−1}1σ^{*}(0.54), 1π^{−2}1π^{*}(0.02) | |||

(^{2}Σ^{+}) | −37.555 | $G33R$(0.83), $G33R$(0.09) | 3σ^{−1}(0.31), 4σ^{−1}5σ^{−1}1σ^{*}(0.50), 5σ^{−2}1σ^{*}(0.03) | 0.61 |

(^{2}Σ^{+}) | −38.916 | $G33R$(0.99) | 3σ^{−1}(0.85), 4σ^{−1}1π^{−1}1π^{*}(0.04) | 0.14 |

(^{2}Σ^{+}) | −298.26 | $G22R$(0.99) | 2σ^{−1}(0.92) | 0.08 |

(^{2}Σ^{+}) | −546.73 | $G11R$(0.99) | 1σ^{−1}(0.97) | 0.03 |

For both N_{2} and CO molecules, it has been well known that strong correlation effects exist in their inner-valence ionization process and lead to the breakdown of the MO picture. A typical example is the splitting of the single 2*σ*_{g} line of N_{2} (similarly a 3*σ* line of CO) into several satellite lines (see Ref. 96 for an early study). In the present study, as shown in Figs. 4(a) and 4(b), the GFCCSD/cc-pVDZ and GFCC-i(2,3)/cc-pVDZ calculations almost give the same main peak positions (labeled by asterisks) above −0.735 a.u. (∼−20 eV). Below this energy regime, several satellite peaks started to emerge. Take N_{2}, for example, and the GFCCSD/cc-pVDZ calculation exhibits three satellite peaks between −43 eV and −20 eV which are dominated by the two-hole-one-particle process, but still show significant 2*σ*^{−1} configuration contributions. However, in comparison to previous experimental observations,^{93–95} the positions of these satellite peaks are not well reproduced by the GFCCSD calculations. For the first satellite peak in the calculated spectral function of N_{2} (similar to the CO molecule case), the GFCCSD calculation predicted its position at ∼−28.79 eV, which is about 3 eV below the experimental value.^{93,95} For all the following satellite peaks, a systematic overestimation (blue shift) of the satellite peak positions can be observed from the GFCCSD results. On the other hand, the GFCC-i(2,3) calculation significantly red-shifts the satellite peaks to be more close to the experimental observations. The largest red-shift for the N_{2} molecule (0.264 a.u. or 7.184 eV) occurs to the $K\u20092\Sigma g+$ state that is dominated by a $1\pi u\u221212\sigma u\u221211\pi g$ two-electron process (see Table I). For the CO molecule, the largest red-shift is 0.320 a.u. (8.708 eV) and is associated with the *I*^{2}Σ^{+} state that is dominated by a 5*σ*^{−1}1*π*^{−1}1*σ*^{*} two-electron process (see Table II). Besides, in comparison with the GFCCSD results, new satellite peaks are resolved from the GFCC-i(2,3) calculation. As shown in the spectral functions computed by the GFCC-i(2,3)/cc-pVDZ approach in Figs. 4(a) and 4(b), despite exhibiting relatively weak intensity, two new satellite peaks at −1.200 a.u. and −1.230 a.u. (that is, at −32.657 eV and −33.473 eV, assigned to the *G*^{2}*Π*_{u} and $I\u20092\Sigma u+$ states, respectively; see Table I) can be easily observed for the N_{2} molecule, and two new satellite peaks at −1.250 a.u. and −1.310 a.u. (that is, at −34.017 eV and −35.650 eV, assigned to the *J* ^{2}Σ^{+} and *K* ^{2}Σ^{+} states, respectively, see Table II) are observed for the CO molecule.

The significant red-shifts and emergence of these inner satellite peaks then distinguish the GFCCSD result and GFCC-i(2,3) result in terms of the agreement with the experimental results. As can be seen from the comparison of the peak positions obtained from the spectral function computed by the GFCC methods with those peak positions read from the experimental photoelectron spectra [Figs. 4(c) and 4(d)], the relative peak positions obtained from the GFCC-i(2,3)/cc-pVDZ calculation show better agreement with the experimental results.^{95,97} For example, for the $N2\u2009C\u20092\Sigma u+$ and CO *C*^{2}Σ^{+} ionic states, the GFCC-i(2,3)/cc-pVDZ results are ∼0.13 a.u. (∼3.538 eV) and ∼0.09 a.u. (∼2.449 eV) more close to the experiment results, respectively, than the GFCCSD/cc-pVDZ results.

It is worth emphasizing that, in Fig. 4, the comparisons of the computed spectral functions and the experimental photoelectron spectra are made in terms of peak positions, while the relative intensities between the different final ionic state manifolds exhibited in the photoelectron spectra are supposed to be different from those observed from the calculated spectral functions. This is because the intensity of the *n*th ionic state in the photoelectron spectrum is given by^{98,99}

where *ϵ*_{k} is the kinetic energy of the emitted electron, *I*_{n} is the *n*th ionization potential, *hν* is the energy of a photon beam, and *δ*(…) is the Dirac delta function. $\theta i,n=\Phi nN\u22121|ai|\Phi 0N$ determines the relative intensity of the *n*th ionic state in the spectral function for removing an electron from orbital *ϕ*_{i} [$|\Phi 0N$ is the initial state of the *N*-electron system and $|\Phi nN\u22121$ is the *n*th state of the (*N* − 1)-electron system], and $sk,i=\varphi k|r^|\varphi i$ is the dipole matrix element for photoionization ($r^$ is the position operator, and *ϕ*_{k} and *ϕ*_{i} are the continuum one-particle state and bound one-particle state, respectively). In the single particle picture (that is, there is only one *θ*_{i,n} contributing significantly), the above equation can be simplified to

where |*θ*_{i,n}|^{2} is the so-called pole strength whose convoluted sums compose the spectral function. Therefore, *P*_{n} is approximately proportional to |*θ*_{i,n}|^{2} only when *hν* is much larger than *I*_{n} (such that the dependence of |*s*_{k,i}|^{2} on *ϵ*_{k} is weak) and there are no appreciable initial state correlation effects (i.e., in the single particle picture). Also, as can be seen from Fig. 4, the computed spectral function includes the effects of orbital degeneracy. Here, due to the degeneracy of 1*π* orbitals, the corresponding intensity at −16.300 eV for *N*_{2} (−16.600 eV for CO) is almost twice as large as that corresponding to the ionization of the adjacent *σ* orbitals.

In Figs. 4(c) and 4(d), we also compare the positions of the peaks in the spectral function computed by our GFCC methods with other theoretical ionization potentials. As can be seen, early IP-EOM-CC calculations^{85} only reported the computed ionization potentials up to ∼-1.0 a.u. (∼−27.214 eV). Using the same cc-pVDZ basis set, the peak positions in this regime computed by the GFCCSD are almost the same as the ionization potentials computed by the IP-EOMCCSD method. By including the internal triple (*X*_{p,3}(*ω*)) in the calculation, the peak positions from the GFCC-i(2,3) calculations are very close to those ionization potentials computed by the IP-EOMCCSDT method. Beyond this regime, the GFCC-i(2,3) results, in comparison with the GFCCSD results, show better agreement with the SAC-CI results in terms of peak positions and associated configurations. The exploitation of a larger basis set in the GFCC calculations would be expected to follow the trends observed in the IP-EOM-CC studies^{85} that larger basis sets in the GFCC method would make the calculated results in the inner valence regime gradually deviate from the experimental values.

To have a more detailed examination, we found that, in the previous theoretical IP-EOM-CC and SAC-CI studies, a satellite peak classified as the *D* ^{2}Π state was also mentioned for the N_{2} and CO molecules to be located slightly above the *C* ^{2}Σ^{+} state. However, the intensity of the *D* ^{2}Π state was reported to be extremely low. According to the previous SAC-CI study,^{91} the ratio of the intensity of the *D* ^{2}Π state to its adjacent satellite *C* ^{2}Σ^{+} state is ∼1/520 for the N_{2} molecule and ∼1/13 for the CO molecule. Given the already low intensity of the *C* ^{2}Σ^{+} state, the direct observation of the *D* ^{2}Π state from the computed spectral function would be hard [in Figs. 4(a) and 4(b), the *D* ^{2}Π state cannot be clearly observed from the computed spectral function with Δ*ω* = 0.01 a.u. or 0.272 eV]. In fact, this state was originally even missing in the XPS spectra and can only be identified in the UPS (He II) and de-excitation (DES) spectra,^{95} which favor different final states in comparison with the photoelectron spectra.

In order to resolve this state, we switch to a smaller frequency interval, Δ*ω* = 0.002 a.u. (0.054 eV), in our GFCC calculations followed by a decomposition analysis. Figure 5 shows the computed spectral function of the CO molecule using the GFCC-i(2,3)/cc-pVDZ method in the energy regime of $\u22120.92,\u22120.84$ a.u. ($\u221225.037,\u221222.860$ eV) as well as the imaginary part of the individual diagonal Green’s function matrix elements that significantly contributes to the computed spectral function in this regime. From Fig. 5(a), slightly above the peak (at −0.88 a.u. or −23.948 eV) classified as the *C* ^{2}Σ^{+} state, a small shoulder bump can be seen being centered at ∼−0.856 a.u. (−23.295 eV). From the decomposition analysis in Fig. 5(b), we can further identify the bump as a satellite peak rather than a numerical fluctuation, and the major contributions to this shoulder satellite peak are from $\u2212I(G55R)$ and $\u2212I(G66R)$, which are associated with N2*p* orbitals. This then confirms the classification of this satellite peak as the *D* ^{2}Π state. The energy difference between the *C* ^{2}Σ^{+} state and the *D* ^{2}Π state in the GFCC-i(2,3), i.e., 0.024 a.u. (∼0.653 eV), is slightly less than the experimental observation (∼1.0 eV) and SAC-CI results (0.88 eV) and more close to the IP-EOM-CCSDT/cc-pVDZ result (0.77 eV). The main configurations of this satellite peak include 5*σ*^{−2}1*π*^{*} (∼64%), 4*σ*^{−1}5*σ*^{−1}1*π*^{*} (∼21%), and 5*σ*^{−2}2*π*^{*} (∼6%), which are consistent with MCS-CI and SAC-CI results.^{91,92}

Compared to the early Green’s function/CI studies of the inner-valence ionization of N_{2} and CO molecules,^{91,92,96,100,101} the GFCC-i(2,3) calculation exhibits better agreement than the GFCCSD calculation in terms of the overall spectral function profile (that is, the consistent prediction of the significant satellite peaks). Note that the previous Green’s function study^{96} is able to provide more detailed information of the inner valence ionization by providing the positions of weaker satellite peaks which might not be observable in our present GFCC calculations. For example, in the same energy regime as used in Fig. 4, there were more than 10 ionization potentials (with corresponding Green’s function residual >0.01) reported in Ref. 96, while only six significant peaks were observed from the GFCC-i(2,3) spectral function in the present study. As discussed in Sec. II, this is because the previous Green’s function methods (such as 2*h* − *p* TDA and ADC methods) are based on the diagonalization scheme which is in principle able to provide all the 1*h* and 2*h* − *p* poles for the given theoretical level within the energy window of interest. The diagonalization scheme will be especially useful to resolve poles that have different characters but are lying very close to each other. In the present study, solving the linear equations and scanning over the energy points in the energy window of interest are on the other hand targeting the overall profile, and to explore more detailed information for a specific energy regime, a smaller energy interval and decomposition analysis of the peaks need to be employed, as evidenced from our above effort to identify the weak *D* ^{2}Π state in the CO spectral function. However, even with the aid of a smaller energy interval and the decomposition analysis, it is still impossible to find dark states (corresponding to zero pole strength) and may still be difficult to find (or identify) the states with extremely low intensity. The missing of the states with low intensity would have direct impact on the accurate calculation of some properties (like the vibronic coupling constant) in which the corresponding energies are indispensable. In such cases, it would be more reliable to combine the GFCC results with the results from recently proposed iterative subspace methods^{58–60} for the narrow energy window of interest to resolve all the possible states.

Since our GFCC approach is able to target any energy regime, we also give the core ionization potentials in Tables I and II. Compared to our previous GFCCSD/cc-pVDZ results,^{50} the inclusion of internal triple in the GFCC framework does not significantly change the main ionic peak positions, and both the GFCCSD and GFCC-i(2,3) results show ∼3 eV deviation from the experimental observation^{102,103} and early fourth-order ADC study.^{96,104} However, similar to what we observed in Ref. 50, employing a larger basis set is able to reduce this deviation. For example, switching the cc-pVDZ basis set to Aug-cc-pVTZ basis, the position of the N 1*σ*^{−1} peak will be red-shifted to −15.09 a.u. (∼−410.62 eV), which is much closer to the experimental value (the deviation is <1 eV). Note that the satellite peaks in the core regime are also an interesting topic, and there were efforts focusing on computing the satellite peaks accompanying the K-shell ionization of N_{2} and CO molecules employing the CVS-ADC(4) method.^{96,104} However, the computation of the satellite peaks in the core regime is not covered in the present study, but will be one of the topics in our future work.

Finally, we would like to compare our GFCC-i(2,3) approach with other approximate EOM-CC (CCLR)-based methods where the treatment of triples is involved in a different manner. Table III lists the first three vertical ionization potentials of the N_{2} and CO molecules computed by GFCC-i(2,3), CCSDT-n (n = 1, 2, 3), and CC3 methods. Generally speaking, the CCSDT-3 and CC3 are able to provide the more accurate numbers than the CCSDT-n (n = 1, 2) and GFCC-i(2,3) methods, due to their more elaborate treatment of the triple equation, and the deviations from the experimental values for both methods are <0.1 eV. The quality of the GFCC-i(2,3) numbers lies between the CCSDT-3/CC3 and CCSDT-n (n = 1,2), and the largest deviation in these GFCC-i(2,3) numbers is ∼0.3 eV (for the $N2\u20091\pi u\u22121$ state). Note that both CCSDT-n (n = 1,2) and GFCC-i(2,3) exclude the orbital relaxation associated with the *T*_{1} cluster amplitude. In comparison to the CCSDT-n (n = 1,2) which totally ignores the triple-single terms (that is, the triple-single block of the associate Jacobian matrix is set to zero), the $(VNT2Xp,1)C$ term in the GFCC-i(2,3) approach contributes at least to the second order correction to the triple equation. Since the GFCC-i(2,3) approach does not require *T*_{3} cluster amplitude, its computational cost is even cheaper than the CCSDT-1 method.

. | . | CCSDT-n . | . | . | |||
---|---|---|---|---|---|---|---|

State . | GFCC-i(2,3) . | n = 1 . | n = 1b . | n = 2 . | n = 3 . | CC3 . | Exp. . |

N_{2} | |||||||

2$\sigma g\u22121$ | 15.67 | 15.85 | 15.84 | 15.89 | 15.58 | 15.54 | 15.60 |

1$\pi u\u22121$ | 17.28 | 17.35 | 17.34 | 17.40 | 16.95 | 16.88 | 16.98 |

2$\sigma u\u22121$ | 18.91 | 19.02 | 19.02 | 19.09 | 18.87 | 18.82 | 18.78 |

CO | |||||||

5σ^{−1} | 14.17 | 14.17 | 14.15 | 14.25 | 13.94 | 13.86 | 14.01 |

1π^{−1} | 17.09 | 17.33 | 17.35 | 17.36 | 17.10 | 17.09 | 16.91 |

4σ^{−1} | 19.81 | 19.88 | 19.88 | 19.91 | 19.77 | 19.76 | 19.72 |

. | . | CCSDT-n . | . | . | |||
---|---|---|---|---|---|---|---|

State . | GFCC-i(2,3) . | n = 1 . | n = 1b . | n = 2 . | n = 3 . | CC3 . | Exp. . |

N_{2} | |||||||

2$\sigma g\u22121$ | 15.67 | 15.85 | 15.84 | 15.89 | 15.58 | 15.54 | 15.60 |

1$\pi u\u22121$ | 17.28 | 17.35 | 17.34 | 17.40 | 16.95 | 16.88 | 16.98 |

2$\sigma u\u22121$ | 18.91 | 19.02 | 19.02 | 19.09 | 18.87 | 18.82 | 18.78 |

CO | |||||||

5σ^{−1} | 14.17 | 14.17 | 14.15 | 14.25 | 13.94 | 13.86 | 14.01 |

1π^{−1} | 17.09 | 17.33 | 17.35 | 17.36 | 17.10 | 17.09 | 16.91 |

4σ^{−1} | 19.81 | 19.88 | 19.88 | 19.91 | 19.77 | 19.76 | 19.72 |

## IV. CONCLUSION

In this paper, we propose a new category of the GFCC approximations, i.e., the GFCC-i(*n*, *m*) method, to improve the pole locations of the CC Green’s function. In the GFCC-i(*n*, *m*) method, the level of excitation (*m*) used to describe the ionization potential and electron affinity effects in the *N* − 1 and *N* + 1 particle spaces is higher than the level of excitation (*n*) used to correlate the ground-state coupled cluster wave function for the *N*-electron system. We demonstrate that in order to maintain size-extensivity of the GFCC matrix elements, *m* cannot be larger than *n* + 1. As our first practice in this direction, we implement the GFCC-i(2,3) method, in which the cluster operators *T* and the de-excitation operator Λ are kept at the CCSD level, while the auxiliary operators *X*_{p}(*ω*) and *Y*_{q}(*ω*) include the inner three-body terms (or internal triples) that can be obtained from one-body and two-body *X*_{p}(*ω*) and *Y*_{q}(*ω*) terms in an on-the-fly manner. The computational cost and storage requirement of the GFCC-i(2,3) method are significantly lower than the corresponding GFCCSDT method. Our preliminary accuracy test has been focused on the accurate computing of the spectral functions of the N_{2} and CO molecules in the inner and outer valence regime. In comparison with the GFCCSD results, the computed spectral functions by the GFCC-i(2,3) method are greatly improved and exhibit better agreement with the experimental results and other theoretical results. The improvement is particularly significant in terms of providing higher resolution of satellite peaks and more accurate relative positions of these satellite peaks with respect to the main peak positions. A similar strategy could also be applied to other CC methods. For example, it is feasible to combine the CCSD treatment of the ground state with excitation calculation that employs an extended biorthogonal CC expansion manifold comprising single (1*h*), double (2*h* − 1*p*), and zeroth order triple (3*h* − 2*p*) excitations. Such a treatment will be able to treat the one-hole (1*h*) main states through third order and give a distinctly better description of the 2*h* − 1*p* satellite states. Finally, it is worth mentioning that vibronic coupling is another important factor for accurately simulating spectral functions.^{105,106} Recent efforts include the development of similarity-transformed EOM vibrational coupled-cluster theory^{107} and relevant studies for benzene and adenine molecules.^{108,109} Our future development will also be focusing on this direction.

## ACKNOWLEDGMENTS

This work was supported by the Center for Scalable, Predictive methods for Excitation and Correlated phenomena (SPEC), which is funded by the U.S. Department of Energy (DOE), Office of Science, Office of Basic Energy Sciences, the Division of Chemical Sciences, Geosciences, and Biosciences. All calculations have been performed using the Molecular Science Computing Facility (MSCF) in the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). EMSL is funded by the Office of Biological and Environmental Research in the U.S. Department of Energy. PNNL is operated for the U.S. Department of Energy by the Battelle Memorial Institute under Contract No. DE-AC06-76RLO-1830. B.P. acknowledges the Linus Pauling Postdoctoral Fellowship from PNNL. B.P. and K.K. acknowledge the constructive comments from the reviewers.