The challenging requirements of large scale quantum information processing using parametric heralded single photon sources involve maximizing the interference visibility while maintaining an acceptable photon generation rate. By developing a general theoretical framework that allows us to include a large number of spatial and spectral modes together with linear and non-linear optical elements, we investigate the combined effects of spectral and photon number impurity on the measured Hong–Ou–Mandel interference visibility of parametric photon sources, considering both threshold and number resolving detectors, together with the effects of spectral filtering. We find that for any degree of spectral impurity, increasing the photon generation rate necessarily decreases the interference visibility, even when using number resolving detection. While tight spectral filtering can be used to enforce spectral purity and increased interference visibility at low powers, we find that the induced photon number impurity results in a decreasing interference visibility and heralding efficiency with pump power, while the maximum generation rate is also reduced.

## I. INTRODUCTION

Almost all tasks in optical implementations of quantum communication,^{1–3} quantum cryptography,^{4–6} quantum sensing, and quantum information processing^{7–9} require sources of single photons.^{10,11} Although the requirements of such a source depend on the particular application,^{12,13} an ideal source would produce single photons with high spectral purity, either deterministically or with a very high efficiency. To this end, parametric non-linear processes can produce pairs of correlated photons in distinct modes with the detection of a photon in one mode heralding the presence of a single photon in another. Such sources benefit from potentially very high photon spectral purity^{14} and can in many cases be more readily incorporated into integrated devices with high density and reproducibility.^{15,16} Such sources, however, are not deterministic, and in order to achieve high efficiencies, it is necessary to multiplex many sources together.^{17,18} While many of the impressive purity metrics of these sources have been measured in the weak-excitation limit, an analysis of the level of multiplexing necessary to achieve a given efficiency obviously requires operation beyond this limit, and in conjunction with any effects of loss, including filtering.^{19,20} This problem is further complicated when one also considers the types of detectors employed, as beyond the weak excitation limit photon number resolving detectors project states in a way not possible with threshold (or “bucket”) detectors, which instead only indicate the presence of one or more photons.

Another route toward making a deterministic source is to instead use single quantum emitters such as semiconductor quantum dots,^{21} defect centers in crystals and 2D materials,^{12} and single organic molecules.^{22,23} Using these systems, an excited state can be populated on demand, and radiative relaxation can then occur with very high internal quantum efficiency, with the emitted photon in many cases having a high spectral purity.^{24–26} However, there are often efficiency–purity trade-offs using such sources,^{27} as it is typically necessary to strongly modify the surrounding photonic environment to boost extraction efficiencies or increase spectral purity (using, e.g., cavities), which can simultaneously have detrimental effects on other figures of merit. These considerations and the challenges involved in incorporating many such sources into a scalable platform^{21} leave significant questions regarding their utility in information processing applications.

For these reasons, the advantages offered by parametric sources motivate a detailed study of their operation beyond the weak excitation limit. While it is possible to write down the quantum state of the field produced by a parametric source in the Fock basis, and from this evaluate the heralding rate and fidelity to a pure single photon state including spectral impurity and multi-photon events,^{28} it is not a straightforward matter to propagate many such multi-photon multi-modal states through subsequent optical elements and calculate measured detection probabilities. This challenge arises due to the cumbersome nature of dealing directly with large Fock states and points toward the difficulties in the concatenation of multiple systems and tracking of errors. This represents a problem even when seeking to accurately model Hong–Ou–Mandel (HOM) interference of two heralded photons on a subsequent beam splitter, and particularly so in the presence of loss or if spectral filters are used, since strongly frequency dependent photon number states evolve non-trivially through latter devices in the system. Regarding detection, recent work on Gaussian boson sampling provides expressions for the detection probabilities using threshold detectors^{29} and number resolving detectors,^{30} while multiphoton contributions for threshold detectors have been investigated for single-mode sources,^{31,32} although in all cases ignoring the photon spectral properties. Spectrally multi-moded sources have been investigated but only for measurements of intensity correlation functions,^{33,34} which are not, in general, the appropriate measurements to investigate single photon interference.

In this work, we present a general framework of multimode Gaussian optics, which includes arbitrary spatial and spectral degrees of freedom and arbitrary photon numbers and readily accounts for linear optical elements such as beam splitters and filters as well as non-linear squeezing operations, which represent parametric sources. The Gaussian formalism offers a means to efficiently model an entire system, requiring propagation of a covariance matrix of size only linear in the number of modes, followed by a threshold or photon counting detection model, which extends previous results to the multi-spectral moded case. We put this new framework to use by performing a thorough evaluation of single photons heralded from a parametric photon pair source valid for all excitation powers and corresponding photon generation rates, and we simultaneously account for spectral impurity and the effects of loss and filtering. We assess the degree of quantum interference from a simulation of the experimentally measured HOM interference statistics between heralded photons from two sources, elucidating the quantitative and qualitative differences found using both detector types. As has been previously established, for threshold detectors, we find that the HOM interference visibility decreases with increasing photon generation rate, as at higher powers multiphoton components contaminate the heralded “single photon” state.^{28,35} When spectral impurity is included, this problem is also present when using number resolving detectors and post-selecting only one-photon events; even in the single photon subspace, increasing the photon generation rate necessarily and detrimentally affects the distribution of spectral modes in the heralded (truly) single photon state. With the inclusion of spectral filtering, while the interference visibility can be made arbitrarily high in the weak excitation limit, any increases in power to improve heralding rates deteriorate both the interference visibility and heralding efficiency. These results indicate that considerable care should be taken when designing a source with targeted figures or merit, but at the same time they provide a general framework to efficiently describe the larger systems for which the sources are intended, allowing for the consequences of multiple and varied source imperfections to be captured.

## II. GAUSSIAN SYMPLECTIC FORMALISM AND PHOTON DETECTION

The general scenario to which our formalism applies is depicted in Fig. 1 and consists of a collection of optical modes, which can be excited, coupled, and the states of which can be detected. As we are interested in accurately describing the spectral properties of photons, we take particular care to distinguish between spatial and spectral mode labels. A spatial mode label indicates where (e.g., in which waveguide) a mode is defined, while a spectral label refers to the frequency or wavelength of that mode. To accommodate both of these properties, we label the creation and annihilation operator of a mode with two indices, generically *i* and *ω*, referring to spatial and spectral degrees of freedom, respectively. We list all of our mode operators in a vector, which takes the form

where *N*_{s} is the number of spatial mode labels, and for each spatial mode label *i*, we have the vector $a\u0302i=(a\u0302i\omega 1,\u2026,a\u0302i\omega Nf)\u22a4$ with $a\u0302i\omega $ being the annihilation operator for a mode with spatial label *i* and spectral label *ω*. The number of spectral (equivalently frequency) modes is *N*_{f}, giving a total number of modes *N* = *N*_{s}*N*_{f}. The mode operators satisfy the usual commutation relations $[a\u0302i\omega ,a\u0302j\omega \u2032\u2020]=\delta ij\delta \omega \omega \u2032$.

A variety of candidates exist for parametric photon sources, such as free-space systems,^{36} cavity-based systems,^{37} in-line or wave-guided systems,^{38} and inter-modal phase-matched systems.^{14} Crucially, each shares the common feature that their dynamics are generated by Hamiltonians, which are at most quadratic in the quantum field mode operators. While the underlying electromagnetic non-linearity may be cubic or quartic, in all cases mentioned above, only two of the field operators in the Hamiltonian are treated quantum mechanically and the remaining bright fields contribute as time-dependent scalars. Consequently, once the dynamics are solved, the solutions are Gaussian transformations, which are described by an effective Hamiltonian $H\u0302$, which is quadratic in the field mode operators of interest and can in all generality be written as

where $H$ is a matrix of scalar coefficients, which may be a discrete approximation to a continuous function under appropriate regularity conditions.^{38} Although we are interested here in optical systems, this form of Hamiltonian and the analysis below is rather general and may also be applied to any quadratic bosonic Hamiltonian, such as those arising in condensed matter systems^{39} as well as quadratically coupled oscillator models in studies of quantum Brownian motion and decoherence.^{40,41} The Hamiltonian has a corresponding unitary time evolution operator $U\u0302=exp[\u2212iH\u0302]$, and using $H\u0302=H\u0302\u2020$ and the basic commutation relations for the creation and annihilation operators, one can show^{42}

where

with $1N$ being the *N*-dimensional identity matrix. Since the commutation relation between the mode operators is preserved under a unitary transformation, we have

### A. Gaussian states and transformations

Rather than working directly with the quantum state of the optical modes $\rho \u0302$, we instead consider the corresponding characteristic functions.^{42} We define the *s*-ordered characteristic function as

where $D\u0302(\Lambda )=expA\u0302\u2020K\Lambda $ and **Λ** is a 2*N*-dimensional vector of complex variables. Equation (6) describes a mapping from the quantum state of the field $\rho \u0302$ in Hilbert space to a multivariate function in an associated phase space. In this phase space representation, the nature and properties of the field (e.g., its energy and coherence) are reflected in the specific functional form of the characteristic function. As we detail in the Appendix, a Gaussian state is one having a Gaussian characteristic function of the complex variables **Λ** and is therefore uniquely determined by a 2*N* × 2*N* matrix of scalar coefficients *σ*, which is known as the covariance matrix, and a 2*N*-dimensional vector ** d** called the displacement vector. In this work, we will exclusively consider states for which

**= 0. If a Gaussian state undergoes a linear symplectic transformation as in Eq. (3), it follows that its covariance matrix transforms as**

*d*By noting that the vacuum has corresponding covariance matrix $12N$, we see that Eq. (7) allows us to construct a description of a quantum state generated by successive symplectic transformations.

In what follows, we show that it is possible to model both threshold (“bucket”) and number resolving detectors using only projections of a state onto the vacuum for different subsets of the modes. For this reason, let us therefore consider a subset of the total *N* modes, which we label *S*. We are then interested in the probability associated with the projector onto the vacuum for all modes in *S*, which we label $vacvacS$. As shown in the Appendix, the projection of a general Gaussian state described by a covariance matrix *σ* onto this state can be found to be^{31}

where |*S*| is the number of modes in *S* and *σ*_{S} is the 2|*S*| × 2|*S*| covariance matrix pertaining only to modes in *S*. Setting $\sigma S=12|S|$, we find *P*_{off}(*S*) = 1 as expected.

### B. Photon detection using threshold detectors

We now consider the probabilities associated with photon detection using threshold detectors, extending the existing results that include only multiple spectral modes^{31} or multiple spatial modes.^{29} Threshold detectors are currently the most widely used experimentally, returning a signal (or “click”) with a probability, which depends on the presence of one or more photons in a given spatial mode. These detectors do not typically resolve spectral degrees of freedom, and as such, the relevant projection operator corresponding to a detector click in *any* spectral mode with spatial label *i* is

which is the projection onto all states with spatial mode label *i* except the vacuum on all *N*_{f} spectral modes. In the following, we use the calligraphic notation $S$ to represent the set of spatial mode labels in which the detection events take place, while $\Theta ={\omega 1,\u2026,\omega Nf}$ is the set of all spectral mode labels. As such, a complete set of (spatial and spectral) labels is $S=S\xd7\Theta $. For example, we may be interested in spatial modes $S={1,3}$ (see Fig. 1), which including spectral labels gives the set $S={1,3}\xd7{\omega 1,\u2026,\omega Nf}={1\omega 1,\u2026,1\omega Nf,3\omega 1,\u2026,3\omega Nf}$.

Using this, the probability to detect at least one photon in each of the spatial modes $i\u2208S$ (and with any spectral mode labels) is

where $2S$ is the power set (the set of all subsets) of $S$ and

with $B=B\xd7\Theta $, and for which we can use Eq. (8) for Gaussian states. The general form for threshold detectors allows us to also calculate joint probabilities of clicks and vacuum detection events in sets of spatial modes. For detection events in spatial modes $X$ and vacuum in spatial modes $Y$, we have

This expression extends the previously known result involving the Torontonian function in Ref. 29, which gives the probability for threshold detector click patterns in the absence of unresolved spectral degrees of freedom (i.e., for *N*_{f} = 1). We note that the number of spectral degrees of freedom increases the size of the reduced covariance matrices in Eq. (8), while the number of explicit terms in the sum in Eq. (10) depends exponentially only on the number of *spatial* modes in which photon(s) are detected. In this way, the computational difficulty of these threshold detector probabilities asymptotically scales exponentially and is dominated by the total number of detection events. This exponential scaling is the origin of the intractability of Gaussian boson sampling with threshold detectors.^{29}

### C. Photon number resolving detection

We now present expressions for probabilities associated with number resolving detection in spatial modes. Our method takes inspiration from the way in which pseudo-number resolving detectors can be constructed experimentally, namely, by “fanning” out a mode across multiple modes and using threshold detectors.^{43} Interestingly, we find that these photon number statistics can be constructed only considering projections onto the vacuum for different combinations of modes.

In our derivation, for each spatial mode, we distribute its amplitude across *m* fictitious ancillary modes using a unitary which has equal amplitudes across its range. Doing so creates an equal superposition of the input state diluted by the vacuum over *m* modes, and at the end of each, we conceptually place a threshold detector. Provided that the number of modes *m* is much larger than the number of photons present in the initial state described by covariance matrix *σ*_{S}, we can assume that each of the *m* diluted modes contains at most one photon. In this way, the probability for a total number of threshold detector clicks, which we label *k*, gives an approximation to the probability to detect *n* photons in the initial state described by *σ*_{S}. Experimentally, this leads to a correspondence between the number of ancillary modes used and the number of photons that can be detected accurately. In our case, the ancillary modes are conceptual, allowing us to take the limit *m* → *∞* and give exact expressions for the probability to detect a fixed number of photons *n*.

We derive our expressions using the following steps, in part inspired by Ref. 44.

- We consider a subsystem of our total Gaussian state, which is also Gaussian and described by a 2|
*S*| × 2|*S*| covariance matrix*σ*_{S}, which may contain both spatial and spectral degrees of freedom. As depicted in Fig. 2, we fan this state out into*m*supermodes (which may each contain spatial and spectral degrees of freedom), and we label the set of supermodes with boldface calligraphic symbols, i.e., $S={1,\u2026,m}$. The complete set of mode labels is then given by $S\xd7S\xd7\Theta =S\xd7S$, of which there are in total $|S\Vert S|=m|S|$. All |*S*|(*m*− 1) ancillary modes are initially in the vacuum state. The total initial covariance matrix is written as $\sigma S\xd7S=\sigma S\u229512|S|(m\u22121)$. The fanning out transformation is achieved using the quantum Fourier transform (QFT) acting on the supermodes, although the specific unitary is unimportant. The QFT evenly distributes the initial amplitude in*σ*_{S}and has the matrix elements $(UQFT)nn\u2032=exp[2\pi inn\u2032/m]/m$. The covariance matrix following the action of the QFT is given bywhere we have defined $\sigma \u0303S=\sigma S\u221212|S|$ and $\sigma \u0303S\u20d7=(\sigma \u0303S,\u2026,\sigma \u0303S)\u22a4$ as the block-constant vector with each of its(13)$\sigma S\xd7S\u2032=UQFT\u229712|S|\sigma \u0303S+12|S|\cdots 0\vdots \ddots \vdots 0\cdots 0+12|S|UQFT\u2020\u229712|S|=1m\sigma \u0303S\cdots \sigma \u0303S\vdots \ddots \vdots \sigma \u0303S\cdots \sigma \u0303S+12|S|\cdots 0\vdots \ddots \vdots 0\cdots 12|S|=1m\sigma \u0303S\u20d7(12|S|\u20d7)\u22a4+12m|S|,$*m*elements equal to $\sigma \u0303S$ and $12|S|\u20d7$ is similarly a block-constant vector with all entries being the identity matrix $12|S|$. We see that this procedure results in a block rank 1 form for the total covariance matrix $\sigma S\xd7S\u2032$. - We next determine the probability to detect zero clicks (the vacuum) in some subset of the supermodes $B$. The quantity of interest iswhere $\sigma B\xd7S$ is the covariance matrix pertaining to the subset of modes with supermode labels in $B$, which takes on precisely the same form as in Eq. (13), but with the constant vectors $\sigma \u0303S\u20d7$ and $12|S|\u20d7$ instead having length $|B|=b$. Using the matrix determinant lemma and commutative subring properties, we can write(14)$PoffB\xd7S=det(12|S\Vert B|+\sigma B\xd7S)/2\u22121/2,$
^{66}which we see depends only on the number of supermodes in subset(15)$Poff(B\xd7S)=poff(b)=det12|S|+b2m\sigma \u0303S\u22121/2,$*b*. This result also reduces the determinant of the potentially high dimensional covariance matrix in Eq. (13) to essentially the determinant of one block $\sigma \u0303S$, which is only 2|*S*|-dimensional, meaning we can evaluate an arbitrary number of fanned out modes without changing the complexity of the determinant. - We now extend Eq. (12) to write down the probability to detect at least one photon in each supermode in the subset $X$ and vacuum in all other supermodes $Y$, namely,Equation (15) tells us that the terms in the summation above depend only on the size of the set of supermode labels $(B\u222aY)$. We then collect terms involving subsets $B$ of the same size to write(16)$Pon,off(X,Y)=\u2211B\u22082X(\u22121)|B|Poff((B\u222aY)\xd7S).$where $k=|X|$ is the total number of detection events. As we might expect, this probability depends only on the number of detection events and not the particular pattern. We then sum over all patterns $X$ with fixed length(17)$Pon,off(X,Y)=\u2211l=0k(\u22121)lklpoff(m\u2212(k\u2212l)),$
*k*to give the probability for all detection patterns with*k*clicks as(18)$\u2211|X|=kPon,off(X,Y)=mk\u2211l=0k(\u22121)lklpoff(m\u2212(k\u2212l)).$ - In the limit that the number of supermodes
*m*becomes very large, the probability that any supermode contains more than one photon becomes vanishingly small, and a threshold detector click amounts to the detection of exactly one photon. Our expression above for*k*clicks then becomes equal to the probability to detect exactly*n*photons. In the limit*m*→*∞*, we have*k*→*n*, we can recognize the right-hand side of Eq. (18) as a derivative, and we find the number resolving detection probability associated with the detection of*n*photons in the state described by covariance matrix*σ*_{S},(19)$PPNR(n)=limm\u2192\u221e\u2211|X|=nPon,off(X,Y)=(\u22121)nn!\u2202tndet12|S|+t2\sigma \u0303S\u22121/2t=1.$ - Similar steps can be used to find probabilities associated with number resolving detection in distinct spatial modes. The probability for a spatial mode detection pattern described by the vector $n=(n1,\u2026,n|S|)$ in spatial modes $S$ is given bywhere $TS=12\u2297(t\u22971|S|)$ with $t=diag(t1,\u2026,t|S|)1/2$.(20)$PPNR(S;n)=\u220fi=1|S|(\u22121)nini!\u2202tini\xd7det12|S|+TS\sigma \u0303STS2\u22121/2T=12|S|,$

Equation (20) constitutes one of the major results of this work. It allows for the calculation of number resolved detection probabilities across multiple spatial modes within which multiple unresolved spectral degrees of freedom may be present. It should be noted that although our expression in Eq. (20) is valid for any finite detector number pattern ** n** and appears relatively compact, the presence of the derivatives means that there is in principle an exponentially large number of terms involved in the limit of large photon numbers. Indeed, this can be seen in Eq. (18). This is to be expected, however, as it is known that the calculation of photon number probabilities from Gaussian states is in the #P computational complexity class.

^{29}

However, the utility of our expression above in fact lies in the way in which the spectral mode degrees of freedom are included. In our expression, the size of the matrix entering the determinant scales only with the number of modes and is fixed with respect to photon number. This should be compared to the other number resolving detection probability involving matrix Hafnians in Ref. 29, for which no distinction between spatial and spectral modes is made. As such, we can here include many spectral modes by simply (linearly) increasing the size of the covariance matrix *σ*_{S}, without affecting the scaling of the computation with respect to photon number. To do so using the expressions from Ref. 29, one would need to calculate matrix Hafnians for all of the different patterns of spectral modes in which photons could have been detected, thus exponentially increasing the number of terms being calculated. Our expressions permit, for example, one to consider arbitrary photon spectra, purity, and indistinguishability.

The derivatives in Eq. (20) can in principle be evaluated analytically using repeated applications of the Jacobi identity for derivatives of determinants. This may be a promising approach to pursue in order to draw conclusions regarding the computational complexity of photon number probabilities in the presence of multiple spectral modes. Being primarily interested in photon number probabilities for relatively small total photon numbers, in this work, we use a finite difference approximation to numerically evaluate the derivatives by taking linear combinations of the vacuum probabilities, in which the relevant function must be evaluated at $\u220fi=1|S|(ni+1)$ points for different values of *T*_{S}.

Finally, we note that making the set of modes explicit in the argument in Eq. (20) clarifies detection patterns when other spectator modes are to be traced out. In order that we have a similar notation for threshold detectors, we introduce a list ** n** for threshold detectors, which is a click pattern, analogous to the photon number resolving (PNR) click pattern

**in Eq. (20), but each element can take on values of only “on” or “off.” We then define**

*n*where it is understood that the set $X$ is those modes for which *n*_{i} = “on,” and $Y$ is those modes for which *n*_{i} = “off.” This notation allows us to specify, for either detector type, the set of modes we keep from the whole system *S* and the particular detector pattern ** n**. For example, we may wish to consider the first four (of potentially greater than four) spatial modes, which we label with italicized numbers, $S={1,2,3,4}$. With number resolving detectors, we then specify the photon numbers in each mode, e.g.,

**= (1, 0, 2, 0), while for threshold detectors, we specify the click pattern, e.g.,**

*n***= (on, off, on, off), and in Eq. (21), we should take $X={1,3}$ and $Y={2,4}$. Of course, in general, $PThres(S;n)\u2260PPNR(S;n)$, so this common notation only indicates a correspondence and not equality.**

*n*### D. Two-mode squeezers

Having introduced Gaussian states and photon detection in general terms, we now explore how to describe specific optical elements within this formalism. We begin with the parametric photon pair sources themselves, which arise from Hamiltonians that take the form of multimode two-mode squeezers. These have the general form

where $a\u0302i\u2020(\nu )$ is the creation operator for a mode with spatial mode *i* and frequency *ν*. We will refer to the function *F*(*ν*_{1}, *ν*_{2}) as the joint-spectral-amplitude (JSA), the properties of which have a significant effect on the quality of a single photon source based on a two-mode squeezer. To see this, we write the JSA in its Schmidt decomposition,^{45,46} namely,

where *λ*_{l} are positive coefficients known as the Schmidt coefficients and the functions {*ψ*_{l}(*ν*)} and {*ϕ*_{l}(*ν*)} are sets of orthonormal functions (though not necessarily equal or mutually orthonormal). In the low squeezing limit ‖*F*(*ν*_{1}, *ν*_{2})‖ ≪ 1, the propagator can be expanded to first order to give the bi-photon state,

where we have introduced the broadband mode operators, $C\u0302l\u2020=\u222bd\nu 1\psi l(\nu 1)a\u03021\u2020(\nu 1)$ and $D\u0302l\u2020=\u222bd\nu 2\varphi l*(\nu 2)a\u03022\u2020(\nu 2)$, which themselves are mutually orthogonal. We can therefore see that when the Schmidt decomposition has more than one term, the biphoton state is entangled, and when detecting one of the photons in an unknown spectral mode, the other will be left in a spectrally mixed state. Conversely, JSA separability guarantees a pure quantum state after detection of one of the photons. For these reasons, we will refer to a source described by a separable JSA as a spectrally pure source.

The continuous nature of the JSA function *F*(*ν*_{1}, *ν*_{2}) means that the Schmidt basis defined by the functions {*ψ*_{l}(*ν*)} and {*ϕ*_{l}(*ν*)} must be found by solving integral eigenvalue equations. In practice, it is often easier to instead discretize the integral in Eq. (22), which, again, is valid under reasonable regularity conditions.^{38} In doing so, we find the Hamiltonian can be written in precisely the form of Eq. (2), where the matrix of coefficients takes the form

and the matrix *F* has elements $F\omega 1\omega 2=\delta \nu F(\delta \nu \omega 1,\delta \nu \omega 2)$ with *δν* being the discretization step in frequency. The Schmidt decomposition of Eq. (23) in the discrete case is the singular value decomposition,

where *F*_{D} is a diagonal matrix of the singular values of *F* and *U* and *V* are the unitary matrices. We note that the factor of *δν* in the definition of the matrix elements of *F* means the singular values in *F*_{D} approximate the true singular values *λ*_{l} in Eq. (23).

Equation (26) allows the Hamiltonian coefficients to be written as $H=UFDU\u2020$, where

with

and we note that $U$ and $U$ are unitary. With $H$ written in this way, we can perform the exponentiation in Eq. (3) straightforwardly and find that the symplectic transformation for the multimode two-mode squeezer can be written as $M=UMDU\u2020$, where

We see that a multimode two-mode squeezer is simply a set of independent two-mode squeezers acting on the appropriate Schmidt modes.

The formalism presented here, which represents optical elements as symplectic matrices, requires that the frequency degree of freedom is discrete. In order that frequency features in the (continuous) JSA are accurately captured, we must choose our discretization step small enough that these features are resolved. We note, however, that the discretization need not necessarily be performed uniformly, and in fact, the Schmidt decompositions above provide one with a formal way in which to identify a small number of dominant frequency modes even in broadband cases where a large span of the frequency range may be important. Although we consider non-frequency resolving detection in this work, we also note that a discretization of frequencies can quite naturally and accurately reflect the timing or spectral resolution of detectors used.

### E. Unitary and passive transformations

In addition to two-mode squeezers that describe parametric sources, we also require unitary mode transformations such as beam splitters and phase shifters. In terms of symplectic transformations, as described in Eq. (5), these take the general block-diagonal form *M* = diag(*α*, *α*^{*}) with *α*^{†} = *α*^{−1}. For a dispersionless (frequency independent) beam splitter, we have

while a dispersionless phase shifter acting on say spatial mode 1 of 2 is described by $\alpha PS(\varphi )=diag(exp[i\varphi ],1)\u22971Nf$. A linear frequency dependent phase shift corresponds to a delay in the time domain. Such a transformation (again acting for illustrative purposes here on mode 1) is described by $\alpha delay=diag(\alpha \tau ,1Nf)$ with spectral matrix elements $(\alpha \tau )\omega 1\omega 2=\delta \omega 1\omega 2\u2061exp[i\omega 1\delta \nu \tau ]$. Here, *ω* is a discrete (integer) index, and we note that the delay *τ* should, in our context, be thought of relative to the bandwidth of photons generated by the two-mode squeezing process, which itself is determined by the typical width of the JSA in Eq. (22).

As well as these unitary transformations, we will also be interested in more general non-unitary yet passive transformations and, in particular, those that correspond to loss or filtering. To implement such transformations, we add ancillary loss mode(s), then use a unitary beam splitter type transformation as in Eq. (22) acting on the mode of interest and ancillary modes, and then trace out the ancillary modes. This can be done analytically, and we do not need to explicitly include the ancillary modes in our calculations. From the condition in Eq. (5), we know that a general pure covariance matrix will take the form

We imagine that Eq. (31) describes modes of interest, and ancillary loss modes are initially in the vacuum state. Coupling into the loss modes is a unitary process, and introducing an ancillary mode for each mode in *S*, the unitary can be written in a general block form

where unitarity is ensured by $USSUSS\u2020+USLUSL\u2020=12|S|$, and we are now working in a basis in which ancillary mode operators are ordered after all mode operators of interest. The action of this unitary on the total covariance matrix is then

where $USS=diag(USS,USS*)$, and we have omitted modes that we will shortly trace out. The top left block pertains to our modes of interest and can be written using $\sigma \u0303S\u2254\sigma S\u22121|2S|$ as $USS(\sigma \u0303S+12|S|)USS\u2020+USL12USL\u2020$. Then, using the unitary condition, we can rearrange to find

This map, described uniquely by the sub-matrix $USS$ of the unitary, is valid for all transformations that can be constructed as a unitary transformation acting on our system of interest and vacuum ancillary modes, followed by a trace over the ancillary modes.

For the purposes of modeling frequency-independent loss on all spatial modes, we can take $USS=1\u2212\u03f51|S|$ with 0 ≤ *ϵ* ≤ 1 being the loss parameter. For loss acting on, for example, spatial mode 1 of 2, we have $USS=diag(1\u2212\u03f5,1)\u22971Nf$. Spectral filtering can be included as a frequency-dependent loss with an associated filter function *f*(*ν*) such that $(USS)\omega 1\omega 2=\delta \omega 1\omega 2f(\omega 1\delta \nu )$. For example, a bandpass filter with central frequency *ν*_{0} and bandwidth Δ*ν*_{f} is described by

This provides us with all the optical components necessary to model a HOM interference experiment.

## III. HERALDED HONG–OU–MANDEL INTERFERENCE VISIBILITIES

To begin our investigation of HOM interference visibilities, let us first discuss how such a measurement may be performed. To measure a HOM visibility in a manner closest to the original experiment, two signals are interfered on a balanced beam splitter and coincidences at the outputs are compared when the signals are made to be as indistinguishable as possible (by, for example, tuning their arrival times to be equal) and when they are made distinguishable by varying some degree of freedom in one of the signals (typically delaying the arrival time of one of the signals).

However, in practice, it is not always straightforward to toggle a distinguishability degree of freedom in this way. In particular, typically HOM visibilities measured in integrated silicon platforms are performed by instead scanning through beam splitter angles using a Mach-Zehnder interferometer^{14} as there is no straightforward way to create a temporal delay with the third order non-linearity in silicon. The maximum and minimum coincidences as a function of beam splitter angle are then used to derive a visibility. Conversely, other platforms such as Ti:LiNbO_{3} have a second order non-linearity and are able to use the polarization degree of freedom to create a time delay.^{47} We introduce the term *interference parameter* to allow us to compare different degrees of freedom in one framework. A measured HOM visibility is then potentially dependent on (a) the interference parameter used, (b) the detector type used, and (c) the visibility function used to combine raw counts into a figure of merit.

We consider the interference between heralded photons from two sources, which we term heralded HOM interference. The optical circuits are shown in Figs. 3(a) and 3(d), and are composed primarily of two sources, a beam splitter and four detectors. In the ideal case, when the detectors in the two herald (equivalently *signal*) modes (1 and 4) register the presence of photons, photons are then necessarily present in the two *idler* modes (2 and 3), which then interfere and bunch, leading to a detection event in either mode 2 or 3. We are therefore interested in the fourfold coincidence terms,

which should vary from large to small as the interference parameter is increased. For later convenience, we will also introduce the bunching terms, which indicate successful heralded HOM interference,

We envisage simultaneously pumping both sources with, for example, a common optical pump field. In this case, the probabilities in both Eqs. (36) and (37) correspond to measured counts per excitation pulse, assuming the specified level of losses.

In Figs. 3(b), 3(c), 3(e), and 3(f), we plot the fourfold coincidences as defined in Eq. (36) for identical non-separable sources, as a function of time delay (top row) and beam splitter angle (bottom row). The black curves [(b) and (e)] correspond to coincidences measured with threshold detectors, while the purple curves [(c) and (f)] correspond to the same quantity using number resolving detectors. For the varying beam splitter case, referred to as the Mach-Zehnder HOM interferometer, the fourfold coincidence probability has a maximum for a beam splitter angle of *θ* = 0 (acting trivially on modes 2 and 3), which we label $(PD4)max$. This is compared to the corresponding value for a balanced beam splitter with *θ* = *π*/4, which we label $(PD4)min$. The visibility is typically defined as^{48}

On the other hand, for a fixed beam splitter angle of *θ* = *π*/4 and variable time delay, the coincidences are typically normalized with respect to a large (effectively infinite) delay, which we label $(PD4)\tau =\u221e$. This is then compared to the zero time delay coincidence probability $(PD4)\tau =0$, which is equal to $(PD4)min$. We then have the visibility metric,^{49}

In both cases, the visibility can depend on the type of detector used, and we see that in general these two visibilities are not equal. Even removing the $(PD4)min$ term in the denominator of Eq. (38) results in different expressions, as the maximum coincidence probability for variable beam splitter $(PD4)max$ is not equal to the maximum coincidence probability for variable time delay $(PD4)\tau =\u221e$, leading to a different normalization in each case. While in the low power limit $(PD4)max=2(PD4)\tau =\u221e$, which reflects the relative number of single photon pathways leading to a coincidence and gives a simple way to equate the two visibilities, this does not hold true away from the low power limit.

To explore this further, we can introduce the ratio $R=(PD4)max/(PD4)\tau =\u221e$, which allows us to write

We see that with access only to coincidence probabilities measured with variable beam splitter, the knowledge of *R* allows a visibility calculated in this way to be equated to that measured using a fixed beam splitter and a time delay. In Fig. 4, we show how *R* varies with increasing pump power as captured by the squeezing parameter *ξ* (defined below) and depending on the type of detector used and level of loss. For threshold detectors, we see *R* decreases with increasing power as coincidence probabilities begin to saturate. However, *R* also decreases for number resolving detectors beyond the low squeezing regime and for non-separable sources.

These findings demonstrate that some care must be taken in deducing a HOM interference visibility when using any given experimental setup. Although it would seem from Fig. 4 that either interference parameter can be used provided the measurements are taken in the low power limit, as we will see in the remainder of this paper, interference visibilities are not in general constant with power, even when using number resolving detectors. Moreover, when considering larger scale systems involving more photons, it is unlikely that their successful operation or fidelity with target states will be linear functions of, or even uniquely defined in terms of, these simple HOM interference visibilities. We will see in what follows, however, that in the limiting case of two identical sources and in the absence of loss, it is the visibility $VPNRHOM$ measured with variable time delay and number resolving detectors, which gives a direct measure of the heralded photon purity. As such, for the sake of concreteness, we will use Eq. (39) as our visibility metric for the remainder of this paper, but note that careful consideration of how this figure of merit affects a specific application will be required.

### A. Effects of spectral and number impurity

We now investigate how the pump power simultaneously affects the heralded HOM interference visibility and heralding rate. We first consider sources which have separable JSAs, which give rise to heralded photons that are spectrally pure. To do so, we take as an example an idealized JSA that takes the functional form of the product of two Gaussian functions,^{14}

where $\Delta \nu i=\nu i\u2212\nu \u0304i$ with $\nu \u0304i$ being the central frequencies of the signal and idler photons and *ζ* represents their bandwidth. The denominator here represents the Frobenius norm of the exponential factor. In this way, the JSA features a single Schmidt coefficient given by *ξ*.

In Fig. 5, we plot the JSA of both sources on the left, and on the right, we show the heralded HOM visibility and joint heralding rate as a function of squeezing parameter *ξ*, which represents the pump power. Visibilities are calculated using both threshold (purple squares) and number resolving detectors (green circles). The heralding rate is the probability that both heralding (signal) arms (modes 1 and 4) register photons and can be thought of as the square of the efficiency of one of the sources per excitation pulse. It is defined as

and

for number resolving detectors (yellow curve) and threshold detectors (blue dashed curve), respectively. We see that for pure sources, the visibility with number resolving detectors is 1 for all values of the squeezing parameter. This reflects the fact that regardless of the power, a heralding event selects single photon states in the idler modes, which then exhibit perfect HOM interference to give no coincidences. The heralding rate, however, begins to decrease for large squeezing parameters as the probability of only single photon events decreases, and the optimal squeezing parameter is approximately *ξ* = 0.9, which corresponds to 7.8dB of squeezing. Using threshold detectors, we see that the HOM visibility decreases with increasing squeezing parameter. As can be seen in Eq. (36), with threshold detectors, all events involving four or more photons are registered, and this decrease in visibility is due to the higher order multiphoton terms contaminating the “single photon” state; threshold detectors register events in which one or both sources produce more than one photon in an idler mode, for which perfect HOM interference does not occur. The heralding rate monotonically increases as it does not distinguish between the single photon and multiphoton subspaces and eventually saturates at 1. As we might expect, threshold detectors do reasonably approximate the number resolving heralding rate for squeezing parameter *ξ* < 0.2 (squeezing of 1.7dB). Since the sources here produce spectrally pure photons, Fig. 5 serves only to highlight the effects of photon number purity, which we see can be mitigated with the use of number resolving detectors.

We now also include the effects of spectral impurity. To do so, we consider two typical waveguide sources, which gives rise to non-separable JSAs. The functional form we use is

where the parameter $L\u0303$ captures the fields’ propagation over the effective length of the two-mode squeezing interaction. Such a JSA is achieved for spontaneous parametric down-conversion using a *χ*^{(2)} non-linearity where *ζ* represents the pump bandwidth or using degenerately pumped spontaneous four-wave mixing under a *χ*^{(3)} non-linearity with *ζ* being the width of the autoconvolution of the pump. The functional form above is a consequence of symmetric phase-matching with the central frequencies obeying $\nu \u0304s+\nu \u0304i\u2212(n\u22121)\nu \u0304p=0$, the central wave-vectors satisfy $k\u0304s+k\u0304i\u2212(n\u22121)k\u0304p+2\pi /P=0$, for possible poling period *P*, and the fields’ group velocities obey $vi\u22121=2vp\u22121\u2212vs\u22121$ so that $L\u0303=L(vs\u22121\u2212vp\u22121)$ with *L* being the interaction length. In all cases, the subscripts indicate the signal, idler, and pump, and *n* = 2, 3 is the order of the non-linearity. The normalization factor in Eq. (43) ensures the JSA has Schmidt coefficients of the form *λ*_{l} = *ξα*_{l} with $\u2211l\alpha l2=1$, and the multiplying factor *ξ* is the (multi-mode) squeezing parameter, which is related to the strength of the pump laser. While in the high-power regime, other non-linear effects such as phase-modulation can become appreciable,^{38} in this work, we neglect these effects and vary the pump strength solely through the parameter *ξ*. In this way, we can separate out the impact of the photon number statistics independently of the more application or material specific nonlinear effects that may also occur.

In Fig. 6, we again show the JSA of each source on the left and the heralded HOM visibility and joint heralding rate on the right. For threshold detectors, we see similar trends with increasing squeezing as for the pure sources case, though now with the HOM visibility starting below unity, reflecting the fact that even in the single photon subspace interference is imperfect owing to the spectral impurity of the heralded photons. In contrast to the case explored above, however, when using number resolving detectors, we see that the HOM interference visibility decreases with increasing squeezing parameter.^{28} This suggests that even when using number resolving detectors to herald photons only in the single photon subspace, the power, as captured by the squeezing parameter, cannot be increased without detrimentally affecting the interference probability of the photons produced.

We can see this decreasing of the interference visibility by following Ref. 28 and using the Schmidt decomposition in Eq. (23) as it allows the two-mode squeezing Hamiltonian in Eq. (22) to be written as $H\u0302=\u2211l\lambda lC\u0302l\u2020D\u0302l\u2020+h.c.$, where $C\u0302l\u2020=\u222bd\nu \psi l(\nu )a\u03021\u2020(\nu )$ and $D\u0302l\u2020=\u222bd\nu \varphi l*(\nu )a\u03022\u2020(\nu )$ are the generalized Schmidt modes as before. Since these generalized modes are independent, the unitary time evolution operator corresponding to a multi-mode two-mode squeezing operation is in fact just a product of independent two-mode squeezers, and we have

where $S\u0302l(2)(z)=exp[zC\u0302l\u2020D\u0302l\u2020\u2212z*C\u0302lD\u0302l]$ is a two-mode squeezer acting on spectral mode *l* in spatial modes 1 and 2, and which in normally ordered form is written as^{50}

where we have written *z* = *r*e^{iϕ}. To find the joint signal–idler state produced by a source including all spectral modes and photon numbers, we can act this on the vacuum to give

where the sum runs over all possible integer tuples indicating the number of photons in each spectral Schmidt mode, i.e., $n=(nl1,nl2,\u2026)$, while $|n\u3009i=\u2a02l|nl\u3009il$ with $|nl\u30091l=C\u0302l\u2020nl|vac\u30091l/nl!$ (and similarly for $|nl\u30092l$) representing the corresponding Fock state in spatial mode *i*, and the coefficients are $c(n)=\u220flsech\lambda l(\u2212i\u2061tanh\lambda l)nl$.

To find the interference probability of the idler photons, we consider the conditional state obtained when a single photon in any Schmidt mode is detected in spatial mode 1, i.e., the signal mode. The corresponding measurement operator is the projector $\Pi 1=\u2211l111l$. The probability for such a detection is $PPNR(1;1)=Tr[\Psi \Psi \Pi 1]$, giving

while the post measurement state is $\rho \u0302=Tr1[\Psi \Psi \Pi 1]/PPNR(1;1)$ with the trace only over modes with spatial label 1, which gives

The purity of this single photon state is a measure of its spectral indistinguishability. For this, we find

which we see depends on the distribution of the Schmidt coefficients. If one Schmidt mode dominates and is much larger than all others, we have simply $Tr[\rho \u03022]\u22481$. However, if this is not the case, then we see that linearly increasing each coefficient causes a decrease in the idler photon purity. Equation (49) is plotted with a black curve in Fig. 6 and perfectly matches the green circular data points as expected.

### B. Effects of photon loss and filtering

While the above analysis in the Fock basis does allow the state produced by a source to be scrutinized in this way, it is not a straightforward matter to describe its evolution through subsequent optical elements. As a pertinent example of this, and one of the main advantages of the formalism presented in this work, we now investigate the effect of frequency selective loss, i.e., filtering. The results so far have demonstrated that non-separability of the JSA is a significant factor affecting source figures of merit, and it is natural to ask to what extent separability of a JSA can be imposed by filtering.

In order to gain insight, let us first consider the case in which a fixed amount of frequency independent loss is present on all modes and investigate the measured interference visibility when using number resolving detectors. The results are shown in Fig. 7 with (a) and (b) corresponding to, respectively and as above, pure sources and non-separable sources. For each, the different curves correspond to different loss levels as indicated. We see that with loss there is a greater decrease in visibility with increasing power as compared to the cases without loss. In fact, we see from (a) that even in the case for which the sources are pure and number resolving detectors are used, when loss is included, there is a decrease in interference visibility with increasing power, which is not the case without loss. In all cases, the detrimental effects of loss can be understood as compromising the ability of a number resolving detector to herald a truly single photon state in the idler modes.

So far, the rate or efficiency of the sources investigated has been characterized by the heralding rate, which is the probability that there is a detection event in both of the heralding (signal) modes. In the absence of loss, this quantity is precisely the probability that photon(s) are heralded in both of the idler modes since the photon numbers in the signal and idler modes are perfectly correlated. With the inclusion of loss, however, this is not the case and it is instructive to also consider the heralding efficiency, defined as the conditional probability of detection events in both of the heralded (idler) modes (2, 3) when no beam splitter is present, given both the herald (signal) modes (1, 4) registered a detection event. In our case, this quantity can be calculated as the ratio of fourfold coincidence events measured with no interference to the heralding rate, and this is therefore written as

where $PDSPS=PD4+PDbunch$ is the sum of the bunching and antibunching terms and is independent of the beam splitter angle. Writing the heralding efficiency in this way allows it to be consistently defined even in the case where the beam splitter angle is fixed.

In Fig. 8, we plot the heralding rate and heralding efficiency when including loss as a function of the squeezing parameter for both (a) number resolving detectors and (b) threshold detectors and, in both cases, consider non-separable sources with the parameters in Fig. 6. We see that the peak in number resolved heralding rate is shifted to higher squeezing values with increasing loss, suggesting that decreases in photon generation rates caused by losses can in principle be easily overcome by increasing the pump power. However, while Fig. 7 demonstrates that this compensation will necessarily decrease the interference probability, Fig. 8 further shows that increasing the power will decrease the heralding efficiency, meaning that photons are less likely to be present in the idler modes when heralding events are registered. Therefore, loss detrimentally affects the effective purity, heralding rate, and heralding efficiency of a source with none of these figures of merit independently compensated by changing the pump power. With threshold detectors, we similarly see that loss-induced decreases in the heralding rate can be compensated by increasing the squeezing parameter. Although the heralding efficiency appears to increase with increasing power for threshold detectors, this reflects only the fact that multiphoton terms increasingly tend to saturate these detectors.

Let us now consider the effect of spectral filtering. We investigate the case in which two sources each described by a non-separable JSA are subject to spectral filtering on both spatial modes (signal and idlers). Each mode is filtered with the same ideal bandpass filter through which photons are either fully transmitted or fully removed depending on their frequency. The bandpass filters are centered at the center of the JSAs ($\nu 0=\nu \u0304=193.1THz$), and the sources have the same parameters as in Fig. 6. The results are shown in Fig. 9, where the left column corresponds to no filtering, the middle column corresponds to a filter width of Δ*ν*_{f} = 2 × 10^{15} rad/s, and the right column corresponds to Δ*ν*_{f} = 1.2 × 10^{15} rad/s. The widths of the filers are indicated by the white lines in the plots of the JSA in the top row, and in all cases, the different curves in the plots below correspond to the different levels of loss. As one might expect, we see that tight filtering of the sources means that at low powers an interference visibility of up to 99.58% can be achieved for the initially non-separable imperfect source. The trade-off is that the filtering process has reduced the maximum heralding rate from 0.07 to 0.063, and the heralding efficiency decreases from 100% to 88% at the point of maximal heralding rate.

More importantly, however, is that it appears spectral filtering of this sort does not allow unit visibility for all squeezing parameter values, even with number resolving detectors; a tightly filtered source does not behave like a separable source, even if perfect interference visibilities are found in the low power regime. For a truly pure source, as explored in Fig. 5, the use of number resolving detectors allows the heralding rate to be increased by increasing the pump power at no cost to the interference visibility. In the present case, however, this is no longer true, even when there is no (frequency independent) loss. Once again, the problem here is that filtering has introduced photon number noise, and number resolving detection is no longer able to herald a truly single photon state.

### C. Structured and non-identical sources

As a final demonstration of the utility of our presented formalism, we will now consider heralded HOM interference between structured and non-identical sources. The pairs of sources that we will consider are shown in the first two rows of Fig. 10. The first column shows a pair of identical sources with the JSA of each consisting of the sum of two Gaussian functions of the form in Eq. (41) with widths *ζ* = 0.03 THz. The fourfold coincidences are shown in the bottom plot, calculated for number resolving detectors (solid, purple curve) and threshold detectors (dashed, black). The heralded photons here have double peaked temporal (and spectral) envelopes, resulting in a HOM dip with multiple features reflecting the Fourier transform of the product of the two heralded photon spectra. In the middle column, we keep one source the same and replace the other with a filtered waveguide source with a non-separable JSA of the form in Eq. (43). The corresponding HOM dip shown in the plot at the bottom again has interesting temporal features, and here, the dip minima are higher than those in the previous case, reflecting both the spectral impurity of the first source and the poor overlap of the two heralded photons. Finally, in the column on the right, we consider again the filtered non-separable waveguide source, now interfering with a double peaked Gaussian with opposite phase, i.e., a JSA that is the difference of two terms of the form in Eq. (41). Here, we see almost no interference at 0 time delay, as the two heralded photons now have approximately orthogonal spectra. However, for time delays of Δ*τ* ≈ ±4 ps, we see the interference is improved, as for these parameters one of the peaks of the second source is centered on that of the first. We stress that in all cases here, while it may be possible to write down the conditional state of the heralded photons in the Fock basis and describe the qualitative features of these HOM dips, calculating the actual fourfold coincidence count rates and deriving a corresponding interference visibility beyond the weak excitation limit is considerably more involved. Within our continuous variable formalism, on the other hand, the complete state of the system to all photon number orders and with all spectral degrees of freedom is explicitly propagated through all optical elements, allowing for photon detection rates to be calculated exactly.

## IV. CONCLUSION

In this work, we have introduced a Gaussian quantum optics formalism that explicitly includes a tensor product structure of the spatial and spectral modes. This formalism allows us to model realistic and relevant highly multi-moded fields generated by collections of parametric sources together with non-Gaussian threshold and number resolving detection. This has allowed us to extend the known expressions for non-frequency resolving threshold and number resolved detection probabilities to include spectral degrees of freedom. We applied this formalism to the study of heralded Hong–Ou–Mandel interference, in which we elucidated the inter-dependencies of the source heralding rates, heralding efficiencies, and interference visibilities. In particular, we demonstrated that any non-separability of the source joint-spectral amplitude results in decreases of the interference visibility beyond the weak excitation limit and further decreases in the interference visibility and heralding efficiency when any loss is present. Furthermore, we found that while spectral filtering can improve a source’s interference visibility, the filtering process necessarily introduces photon number noise, which detrimentally affects the interference visibility and heralding efficiency at higher powers.

The HOM setup considered here constitutes a fundamental primitive for many more complex systems, and as such, understanding these effects will become ever more important as demonstrations (and applications) are increased in scale. Consider, for instance, the current state-of-the-art four-photon experiments achieving detection rates of 10^{−2},^{51} 1,^{52,53} 1–2,^{54} 21,^{55} and 46 s^{−1}^{56} with target state fidelities not exceeding 0.94. While state infidelity is routinely attributed to multi-photon events, with heuristic or idealized error models of such often presented,^{57} an exact analysis including multiphoton events alongside spectral effects and detection mechanisms has yet to be included. As systems are required to provide higher count rates, source figures of merit including purity and multi-photon contamination are traded in favor of achieving acceptable count rates. Recently, a 10-photon experiment was demonstrated at a rate of 4/h^{58} at the expense of a state fidelity of 0.573, and a subsequent 12-photon experiment achieved a fidelity of 0.576 at a rate of 1/h.^{59} Of course, the more efficient generation of higher photon-number states for computation applications requires multiplexing,^{60} and one would necessarily have to choose an operation regime in which generation rates are traded off against multi-photon contamination, and a realistic noise model capturing multi-photon and spectral impurity is then required in order to demonstrate that acceptable error thresholds are indeed overcome. In particular, it is clear that leakage errors associated with the higher dimensional state space of multi-photon states (outside the desired photon number subspaces) will present a key challenge for both the experimental systems and the theory of their operation.^{61}

In this work, we have primarily focused on the generation of heralded single photon states, which could in turn to be used to generate larger states with a fixed photon number. However, our formalism also naturally lends itself to applications where states without fixed photon number are of interest, most notably those of Gaussian boson sampling, which are based on the observation that calculating photon detection probabilities from Gaussian states with many modes is computationally hard.^{29,62–64} Using the present formalism, it would be interesting to investigate the extent to which photon spectral impurity or mutual distinguishability can affect the computational complexity of multi-photon detection probabilities. It is in turn worth mentioning that the number resolving detection probabilities derived here make use only of derivatives of vacuum projections and combinatorics, and it would be interesting to explore the extent to which these new expressions could lead to insights into the development of classical algorithms for simulating photonic measurements of this sort.

## ACKNOWLEDGMENTS

O.F.T. was supported by the Bristol Quantum Engineering Centre for Doctoral Training, EPSRC Grant No. EP/L015730/1. We would like to thank J. R. Scott for useful discussions and S. E. Armstrong and J. F. F. Bulmer for their helpful comments.

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

### APPENDIX: CHARACTERISTIC FUNCTIONS AND SYMPLECTIC TRANSFORMATIONS

We define the *s*-ordered characteristic function as

where $\rho \u0302$ is the corresponding quantum state, $D\u0302(\Lambda )=expA\u0302\u2020\u2061K\Lambda $ with $A\u0302$ given in Eq. (1), and **Λ** is a 2*N*-dimensional vector of complex variables. Note that this vector takes the general form $\Lambda =(\lambda ,\lambda *)\u22a4$ with $\lambda =(\lambda 1\omega 1,\u2026,\lambda N)\u22a4$.^{67} To see how a quantum state and corresponding characteristic function evolve under a unitary transformation, we suppose that the state transforms as $\rho \u0302\u2192\rho \u0302\u2032=U\u0302\rho \u0302U\u0302\u2020$. The corresponding characteristic function is

For a linear symplectic transformation, as described in Eq. (3), we find

For the sake of completeness, we note that generalizing Eq. (3) to affine transformations, $U\u0302\u2020A\u0302U\u0302=MA\u0302+m$, we have

Non-zero ** m** contributions correspond to the injection of coherent states.

The Gaussian states are defined as those states with characteristic functions, which are Gaussian. A general Gaussian state therefore takes the form

where the 2*N* × 2*N* matrix of scalar coefficients *σ* is known as the covariance matrix and the 2*N*-dimensional vector ** d** is the displacement vector. If a Gaussian state undergoes a symplectic transformation described by Eq. (A4), using the symplectic condition in Eq. (5), it follows that the resulting characteristic function remains Gaussian, but with modified covariance matrix and displacement vector given by

#### 1. Calculation of projection onto the vacuum

In terms of characteristic functions, the expectation value of a general operator $O\u0302$ is written as

where *d*^{2N}**Λ** = ∏_{iω}*d*Re[*λ*_{iω}]*d*Im[*λ*_{iω}]. In what follows, it will also be convenient to define the multimode quasi-probability distributions as

The *s* = −1, 0, 1 forms are known as, respectively, the Husimi-Q function, Wigner function, and Glauber–Sudarshan P representation. A Gaussian state of the form in Eq. (A5) has the quasi-probability distributions,

valid for *s* = −1 or *s* = 0. A closed-form expression for the Glauber–Sudarshan P representation (*s* = 1 form) for a general Gaussian state does not exist.

In order to calculate projections of a general Gaussian state onto the vacuum for a subset of the modes, we label the subset *S* and its complement $S\u0304$. We are interested in the probability associated with the projector onto the vacuum for all modes in *S*, which we label $vacvacS$. For a general operator $O\u0302S$ acting only on modes in *S*, we have

where $1\u0302S\u0304$ is the identity in the space of modes $S\u0304$, $\rho \u0302S=TrS\u0304[\rho \u0302]$ is the reduced density matrix on modes *S*, and **Λ**_{S} contains variables pertaining only to modes in *S*. The characteristic function corresponding to the reduced density operator is

which shows that it can be obtained from the total characteristic function by setting the variables pertaining to the complement equal to zero.

In the present case, we are interested in $O\u0302S=vacvacS$, and it is most convenient to use Eq. (A10) with *s* = −1. For a Gaussian state, the *s* = −1 characteristic function is given in Eq. (A5). Using the result above, the characteristic function pertaining only to the modes in *S* is found by setting $\Lambda S\u0304=0$, meaning we can replace *σ* and ** d** with

*σ*

_{S}and

*d*_{S}, the covariance matrix and displacement vector retaining rows and columns pertaining to the modes in

*S*only. The

*s*= +1 characteristic function for a projection onto the vacuum for the set of modes

*S*is simply

and using this, we find

Setting *d*_{S} = 0 and $\sigma S=12|S|$, we find *P*_{off}(*S*) = 1 as expected.

## REFERENCES

The symplectic condition is usually written as *M*Ω*M*^{⊤} = Ω with $\Omega =01\u221210\u22971N$, which is equivalent to Eq. (5) if *M* → *LML*^{†} with $L=121i1\u2212i\u22971N$.

We use the identity $detu\u20d7v\u20d7\u22a4+A=det1+v\u20d7\u22a4A\u22121u\u20d7det[A]$.

With these definitions that can include, for example, both $\lambda $ and $\lambda $^{*} in $\Lambda $, we need to be careful when considering absolute values. In particular, we note that |$\Lambda $|^{2} = $\Lambda $^{†}$\Lambda $ = 2|$\lambda $|^{2} = 2*∑*_{iω}|$\lambda $_{iω}|^{2} = *∑*_{iω}|$\lambda $_{iω}|^{2}.