Quantum integrated photonics requires large-scale linear optical circuitry, and for many applications, it is desirable to have a universally programmable circuit, able to implement an arbitrary unitary transformation on a number of modes. This has been achieved using the Reck scheme, consisting of a network of Mach–Zehnder interferometers containing a variable phase-shifter in one path as well as an external phase-shifter after each Mach–Zehnder. It subsequently became apparent that with symmetric Mach–Zehnders containing a phase-shifter in both paths, the external phase-shifters are redundant, resulting in a more compact circuit. The rectangular Clements scheme improves on the Reck scheme in terms of circuit depth, but it has been thought that an external phase-shifter was necessary after each Mach–Zehnder. Here, we show that the Clements scheme can be realized using symmetric Mach–Zehnders, requiring only a small number of external phase-shifters that do not contribute to the depth of the circuit. This will result in a significant saving in the length of these devices, allowing more complex circuits to fit onto a photonic chip, and reducing the propagation losses associated with these circuits. We also discuss how similar savings can be made in alternative schemes that are robust to imbalanced beam-splitters.

## I. INTRODUCTION

Optical quantum computing requires interferometric circuits to process quantum states of light.^{1,2} In particular, integrated photonic circuits comprising networks of Mach–Zehnder interferometers (MZIs) have emerged as a compact and versatile solution for realizing reconfigurable linear optics,^{3} with applications to linear optical quantum computing,^{4–6} boson sampling,^{7–10} high-dimensional encodings,^{11,12} quantum simulation,^{13,14} photonic neural networks,^{15} and optical field-programmable gate arrays (FPGAs).^{16,17} Often, universal reconfigurability is essential or at least desirable, in the sense that a device can be programmed to realize any unitary transformation between the input and output modes. This can be achieved using the architecture by Reck *et al.* (the Reck scheme),^{18} shown in Fig. 1(a), a triangular mesh with each unit cell comprising a variable beam-splitter and a variable phase-shifter. The rectangular architecture by Clements *et al.*,^{19} shown in Fig. 1(b), is also universal and benefits from improved compactness as well as balanced loss per channel: each path traverses the same number of unit cells, which improves the overall fidelity of the circuit to the desired operation. In integrated photonics, the unit cell is usually implemented as an MZI with an internal phase-shifter in one of the arms and an external phase-shifter on one output, as shown in Fig. 1(c), which we will term an asymmetric MZI (aMZI). The internal phase-shifter controls the splitting ratio between the two outputs, while the external phase-shifter controls the relative phase between the outputs. The reason for using this primitive operation is that it is a universal 2 × 2 circuit.

Alternatively, Fig. 1(d) shows a symmetric MZI (sMZI) with two internal and no external phase-shifters. This lacks control of the relative phase between outputs, yet it has been used in optical FPGAs,^{16,17} and it has been found that the sMZI can replace the aMZIs in a Reck scheme without compromising universality.^{20–23} The sMZI is attractive because it is more compact, without the need for an external phase-shifter, which can account for a significant fraction of the length of the circuit. A shorter structure not only occupies less area on a chip but also suffers from less propagation loss. Furthermore, there is a potential advantage in that a symmetric implementation might be expected to provide more balanced losses and heat distribution. A rectangular “Clements” architecture made up of sMZIs would be particularly beneficial for quantum integrated photonics, where large-scale circuits are required and transmission must be kept as high as possible. However, sMZIs have not yet been utilized in this way because it is not clear that such a non-universal 2 × 2 element can generate a universal circuit or that there is an efficient algorithm for finding the phases that will implement a target unitary.

In this work, we resolve this issue. We show that sMZIs can replace aMZIs in the Clements scheme and provide methods of matrix decomposition that generate the specific set of phases required to implement a given unitary matrix. We initially find that additional external phase-shifters are required in a diagonal layer through the middle of the circuit, as well as at a subset of the inputs and outputs. We subsequently show that the additional mid-circuit phase-shifts can be moved to otherwise vacant positions at the edge of the circuit, where they do not contribute to the overall length. This approximately halves the contribution of the phase-shifters to the length of a circuit, leaving at most external phase-shifters at the inputs and outputs of the circuit, which are not required for some applications. We show how a similar advantage can be obtained for more general circuits comprising alternating layers of beam-splitters and phase-shifters, giving as an example the design by Fldzhyan *et al.*,^{24} which does not have a deterministic method of setting the phases but does appear to heuristically implement arbitrary unitary matrices with an improved level of robustness to imbalanced beam-splitters.

The remainder of this article is set out as follows: In Sec. II, we provide a matrix decomposition method for a Reck scheme made up of sMZIs. In Sec. III, we follow a near-identical method for the Clements scheme, showing how a layer of external phase-shifters arises mid-circuit. In Sec. IV, we propose a variant on the Clements scheme with additional phase-shifters at the edge of the circuit, which do not add to the overall length, and show that these can fulfill the same function as the external mid-circuit phase-shifts. In Sec. V, we discuss how a similar technique can be applied to more general architectures, such as that by Fldzhyan *et al.*^{24} In Sec. VI, we summarize and draw conclusions.

## II. THE RECK SCHEME

We consider a Reck scheme^{18} where the unit cell is an sMZI, with additional external phase-shifts at the input and output of the entire circuit, on each mode except the first, as depicted in Fig. 2(a). The matrix transformation implemented by one sMZI, applied to two adjacent modes, can be written as

$$ M = i e^{i\Sigma}\begin{pmatrix} \sin\delta & \cos\delta \\ \cos\delta & -\sin\delta \end{pmatrix}, $$

where Σ = (*θ*_{1} + *θ*_{2})/2, *δ* = (*θ*_{1} − *θ*_{2})/2, and *θ*_{1,2} are the values of the two internal phase-shifts. Meanwhile, an external phase-shift is given by

$$ P = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\phi} \end{pmatrix}. $$

In the following, we assume that all individual phases *θ*_{1,2} and *ϕ* have a full range of [0, 2*π*).
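As a quick numerical sanity check of this parametrization, the sketch below builds an sMZI from two couplers with a phase-shifter in each arm and compares it with a closed form in Σ and *δ*. The coupler convention (a symmetric 50:50 coupler with matrix $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$), the closed form $i e^{i\Sigma}\begin{pmatrix} \sin\delta & \cos\delta \\ \cos\delta & -\sin\delta \end{pmatrix}$ used for comparison, and all function names are assumptions made for illustration.

```python
import numpy as np

def coupler():
    """Assumed symmetric 50:50 coupler (an illustrative convention)."""
    return np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

def smzi_from_parts(theta1, theta2):
    """Coupler, then phases theta1 and theta2 on the two arms, then coupler."""
    arms = np.diag([np.exp(1j * theta1), np.exp(1j * theta2)])
    return coupler() @ arms @ coupler()

def smzi_closed_form(theta1, theta2):
    """i e^{i Sigma} [[sin d, cos d], [cos d, -sin d]] with Sigma, d as in the text."""
    sigma = (theta1 + theta2) / 2
    delta = (theta1 - theta2) / 2
    s, c = np.sin(delta), np.cos(delta)
    return 1j * np.exp(1j * sigma) * np.array([[s, c], [c, -s]])

if __name__ == "__main__":
    a = smzi_from_parts(0.7, 2.1)
    b = smzi_closed_form(0.7, 2.1)
    print("forms agree:", np.allclose(a, b))
    print("symmetric:", np.allclose(a, a.T))
```

Under these assumptions, Σ only sets a common phase of the two outputs, while *δ* alone sets the splitting ratio, which is why the sMZI by itself cannot adjust the relative output phase.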

Rather than labeling the MZIs by modes, we divide them into diagonals numbered *j* = 1 to *m* − 1 as shown in Fig. 2(a), where *m* is the number of modes in the circuit. The MZIs within each diagonal are numbered from bottom left to top right as *k* = 1 to *j*. The MZI transformations within the circuit can be identified as *M*^{(j,k)}, and the phase settings Σ_{j,k} and *δ*_{j,k} refer to those of the (*j*, *k*) MZI. The input phase-shift operations associated with a diagonal are denoted *P*^{(j)} with phase *ϕ*_{j}. Meanwhile, the output phase-shift operation applied to mode *k* is labeled *Q*^{(k)} with phase *ζ*_{k}.

The matrix decomposition proceeds by applying the circuit operations to an auxiliary matrix *V* (initially set to *U**) so as to successively zero matrix elements. Figure 2(b) shows the first few operations used to zero elements. Multiplying *V* from the right by an *M*^{(j,k)} matrix mixes two columns of *V* together; specifically, the *y* = *j* − *k* + 1 column is mixed with the *y* + 1 column. A particular element (*x*, *y*) can be set to zero by choosing *δ*_{j,k} such that

$$ V_{x,y}\sin\delta_{j,k} + V_{x,y+1}\cos\delta_{j,k} = 0. $$

This has a real solution for *δ*_{j,k} if *V*_{x,y} and *V*_{x,y+1} have the same complex phase, i.e., arg(*V*_{x,y}) = arg(*V*_{x,y+1}). Since Σ_{j,k} affects the phase of both columns equally, it cannot be chosen to achieve this condition; rather, the phases need to be equalized by previous operations. The external phase-shifter *P*^{(j)} can be used to match the phases for the first MZI in each diagonal; then for each MZI, Σ_{j,k} can be chosen to match the phases for the subsequent MZI. The detailed order of operations is given as follows:

1. Set an auxiliary matrix *V* ← *U**.
2. For *j* = 1 to *m* − 1:
   1. Set *x* ← *m* and *y* ← *j*.
   2. Set *V* ← *VP*^{(j)}, choosing *ϕ*_{j} = arg(*V*_{x,y}) − arg(*V*_{x,y+1}).
   3. For *k* = 1 to *j*:
      1. Set *V* ← *VM*^{(j,k)}, choosing *δ*_{j,k} such that *V*_{x,y} is set to zero.
      2. Choose Σ_{j,k} such that arg(*V*_{x−1,y−1}) = arg(*V*_{x−1,y}). For *k* = *j*, this choice is redundant.
      3. Set *x* ← *x* − 1 and *y* ← *y* − 1.
3. For *k* = 2 to *m*:
   1. Set *V* ← *VQ*^{(k)}, choosing *ζ*_{k} = arg(*V*_{1,1}) − arg(*V*_{k,k}).
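The steps above can be sketched numerically as follows. This is a minimal illustration rather than a reference implementation: it assumes the sMZI block $i e^{i\Sigma}\begin{pmatrix} \sin\delta & \cos\delta \\ \cos\delta & -\sin\delta \end{pmatrix}$ acting on columns *y* and *y* + 1, and an external phase-shift acting on the second of the two modes. The helper names (`smzi`, `embed`, `phase`, `reck_smzi_decompose`, `rebuild`) are hypothetical, and indices are 0-based in the code.

```python
import numpy as np

def smzi(sigma, delta):
    """Assumed sMZI block: i e^{i sigma} [[sin d, cos d], [cos d, -sin d]]."""
    s, c = np.sin(delta), np.cos(delta)
    return 1j * np.exp(1j * sigma) * np.array([[s, c], [c, -s]])

def embed(block, first, m):
    """Place a 2x2 block on modes (first, first + 1) of an m-mode identity."""
    T = np.eye(m, dtype=complex)
    T[first:first + 2, first:first + 2] = block
    return T

def phase(mode, phi, m):
    """Single-mode phase-shift on the given 0-indexed mode."""
    T = np.eye(m, dtype=complex)
    T[mode, mode] = np.exp(1j * phi)
    return T

def reck_smzi_decompose(U):
    """Return (ops, V): ops in the order they multiply V from the right."""
    m = U.shape[0]
    V = U.conj()                                   # step 1: V <- U*
    ops = []
    for j in range(1, m):                          # step 2: one diagonal per j
        x, y = m - 1, j - 1                        # 0-indexed x = m, y = j
        P = phase(y + 1, np.angle(V[x, y]) - np.angle(V[x, y + 1]), m)
        V = V @ P
        ops.append(P)
        for k in range(1, j + 1):
            # delta solves V[x,y] sin(d) + V[x,y+1] cos(d) = 0 (equal phases)
            delta = np.arctan2(-abs(V[x, y + 1]), abs(V[x, y]))
            trial = V @ embed(smzi(0.0, delta), y, m)
            # Sigma aligns the phases needed by the next MZI; redundant for k = j
            sigma = np.angle(V[x - 1, y - 1]) - np.angle(trial[x - 1, y]) if k < j else 0.0
            M = embed(smzi(sigma, delta), y, m)
            V = V @ M
            ops.append(M)
            x, y = x - 1, y - 1
    for k in range(1, m):                          # step 3: modes 2..m of the text
        Q = phase(k, np.angle(V[0, 0]) - np.angle(V[k, k]), m)
        V = V @ Q
        ops.append(Q)
    return ops, V

def rebuild(ops, m):
    """Multiply the operations in reverse order (later operations on the left)."""
    W = np.eye(m, dtype=complex)
    for T in ops:
        W = T @ W
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m = 5
    A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    U, _ = np.linalg.qr(A)                         # a random unitary test matrix
    ops, V = reck_smzi_decompose(U)
    g = V[0, 0]                                    # residual global phase
    print("V is a global phase times identity:", np.allclose(V, g * np.eye(m)))
    print("U recovered:", np.allclose(U, np.conj(g) * rebuild(ops, m)))
```

The trial multiplication with Σ = 0 is only a convenient way of reading off the phase that Σ_{j,k} must supply; an implementation could equally compute it in closed form.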

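After step 2, every element of *V* below the diagonal has been set to zero. Since *V* remains unitary, this implies that it is a diagonal matrix, where the remaining diagonal elements are complex with unit modulus. Step 3 then applies the final phase-shifts such that *V* is the identity matrix $\mathbb{1}$, up to a global phase. Expanding *V* as

$$ V = U^{*}\left[P^{(1)}M^{(1,1)}\right]\left[P^{(2)}M^{(2,1)}M^{(2,2)}\right]\cdots\left[P^{(m-1)}M^{(m-1,1)}\cdots M^{(m-1,m-1)}\right]Q^{(2)}\cdots Q^{(m)} = \mathbb{1}, $$

it can be seen that

$$ U = Q^{(m)}\cdots Q^{(2)}\left[M^{(m-1,m-1)}\cdots M^{(m-1,1)}P^{(m-1)}\right]\cdots\left[M^{(2,2)}M^{(2,1)}P^{(2)}\right]\left[M^{(1,1)}P^{(1)}\right], $$

where we have used the fact that all the circuit operations are symmetric unitaries, so their complex conjugate is their inverse. Hence, *U* has been decomposed into individual circuit operations. Since this procedure can be applied to an arbitrary unitary matrix, this demonstrates the universality of the circuit by construction.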

The circuit consists of $\frac{1}{2}m(m-1)$ MZIs and hence *m*(*m* − 1) internal phase-shifters. In each diagonal of MZIs, the final choice of Σ_{j,j} is redundant, so in these MZIs, one of the internal phase-shifters could be omitted. This would leave (*m* − 1)^{2} internal phase-shifters and 2(*m* − 1) external ones. Their total is equal to the number of free parameters in an *m* × *m* unitary matrix, assuming that the global phase is neglected, and hence is the minimum required. We note that in situations where the output modes are to be connected directly to phase-insensitive detectors, the external phase-shifts at the output are redundant and could be omitted. If a phase-invariant input state is used, such as the Fock state inputs required in boson sampling, the external phases at the inputs are also redundant.
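The counting can be checked in one line; the value *m* = 4 below is only an illustrative choice:

$$ m(m-1) - (m-1) = (m-1)^{2}, \qquad (m-1)^{2} + 2(m-1) = (m-1)(m+1) = m^{2} - 1. $$

For *m* = 4, this gives 6 MZIs with 12 internal phase-shifters, reduced to 9 after omitting the redundant ones, plus 6 external phase-shifters, for a total of 15 = 4^{2} − 1 parameters.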

## III. THE CLEMENTS SCHEME

We now consider the Clements scheme^{19} circuit shown in Fig. 3(a). The MZIs are organized into diagonals labeled *j* as in Sec. II, but for even *j*, the direction of *k* has been reversed; as a result, the associated *P*^{(j)} phase-shifters have been moved from input to output. In the decomposition, the order in which matrix elements are zeroed within an even *j* diagonal is also reversed, and the corresponding circuit operations are applied to *V* by left multiplication rather than right multiplication. As a result, these operations mix adjacent rows of the *V* matrix rather than adjacent columns; the first few steps of this process are shown in Fig. 3(b). The *Q*^{(k)} operations appear in a diagonal line through the middle of the circuit, which is not a usual feature of the Clements scheme; in Sec. IV, we will show how these phase-shifts can be relocated to positions where they do not add to the overall length. In other aspects, the decomposition is identical to that of the Reck scheme:

1. Set an auxiliary matrix *V* ← *U**.
2. For *j* = 1 to *m* − 1:
   - If *j* is odd:
     1. Set *x* ← *m* and *y* ← *j*.
     2. Set *V* ← *VP*^{(j)}, choosing *ϕ*_{j} = arg(*V*_{x,y}) − arg(*V*_{x,y+1}).
     3. For *k* = 1 to *j*:
        1. Set *V* ← *VM*^{(j,k)}, choosing *δ*_{j,k} to zero *V*_{x,y}.
        2. Choose Σ_{j,k} so that arg(*V*_{x−1,y−1}) = arg(*V*_{x−1,y}).
        3. Set *x* ← *x* − 1 and *y* ← *y* − 1.
   - If *j* is even:
     1. Set *x* ← *m* − *j* + 1 and *y* ← 1.
     2. Set *V* ← *P*^{(j)}*V*, choosing *ϕ*_{j} = arg(*V*_{x,y}) − arg(*V*_{x−1,y}).
     3. For *k* = 1 to *j*:
        1. Set *V* ← *M*^{(j,k)}*V*, choosing *δ*_{j,k} to zero *V*_{x,y}.
        2. Choose Σ_{j,k} so that arg(*V*_{x+1,y+1}) = arg(*V*_{x,y+1}).
        3. Set *x* ← *x* + 1 and *y* ← *y* + 1.
3. For *k* = 2 to *m*:
   1. Set *V* ← *VQ*^{(k)}, choosing *ζ*_{k} = arg(*V*_{1,1}) − arg(*V*_{k,k}).
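A minimal numerical sketch of this procedure is given below, under the same assumptions as for the Reck case: an sMZI block $i e^{i\Sigma}\begin{pmatrix} \sin\delta & \cos\delta \\ \cos\delta & -\sin\delta \end{pmatrix}$, single-mode external phase-shifts, and hypothetical helper names. For even *j*, the operations act on rows *x* − 1 and *x* by left multiplication, and the zeroing condition is the row analogue of the column one. Indices are 0-based in the code.

```python
import numpy as np

def smzi(sigma, delta):
    """Assumed sMZI block: i e^{i sigma} [[sin d, cos d], [cos d, -sin d]]."""
    s, c = np.sin(delta), np.cos(delta)
    return 1j * np.exp(1j * sigma) * np.array([[s, c], [c, -s]])

def embed(block, first, m):
    """Place a 2x2 block on modes (first, first + 1) of an m-mode identity."""
    T = np.eye(m, dtype=complex)
    T[first:first + 2, first:first + 2] = block
    return T

def phase(mode, phi, m):
    """Single-mode phase-shift on the given 0-indexed mode."""
    T = np.eye(m, dtype=complex)
    T[mode, mode] = np.exp(1j * phi)
    return T

def clements_smzi_decompose(U):
    """Return (left, right, V): operations applied to V from the left and right."""
    m = U.shape[0]
    V = U.conj()
    left, right = [], []
    for j in range(1, m):
        if j % 2 == 1:                             # odd j: mix columns
            x, y = m - 1, j - 1
            P = phase(y + 1, np.angle(V[x, y]) - np.angle(V[x, y + 1]), m)
            V = V @ P
            right.append(P)
            for k in range(1, j + 1):
                delta = np.arctan2(-abs(V[x, y + 1]), abs(V[x, y]))
                trial = V @ embed(smzi(0.0, delta), y, m)
                sigma = np.angle(V[x - 1, y - 1]) - np.angle(trial[x - 1, y]) if k < j else 0.0
                M = embed(smzi(sigma, delta), y, m)
                V = V @ M
                right.append(M)
                x, y = x - 1, y - 1
        else:                                      # even j: mix rows
            x, y = m - j, 0                        # 0-indexed x = m - j + 1, y = 1
            P = phase(x - 1, np.angle(V[x, y]) - np.angle(V[x - 1, y]), m)
            V = P @ V
            left.append(P)
            for k in range(1, j + 1):
                # delta solves V[x-1,y] cos(d) - V[x,y] sin(d) = 0 (equal phases)
                delta = np.arctan2(abs(V[x - 1, y]), abs(V[x, y]))
                trial = embed(smzi(0.0, delta), x - 1, m) @ V
                sigma = np.angle(V[x + 1, y + 1]) - np.angle(trial[x, y + 1]) if k < j else 0.0
                M = embed(smzi(sigma, delta), x - 1, m)
                V = M @ V
                left.append(M)
                x, y = x + 1, y + 1
    for k in range(1, m):                          # final Q phase-shifts
        Q = phase(k, np.angle(V[0, 0]) - np.angle(V[k, k]), m)
        V = V @ Q
        right.append(Q)
    return left, right, V

def rebuild(left, right, m):
    """U up to a global phase: left ops in order, then right ops reversed."""
    W = np.eye(m, dtype=complex)
    for T in right:
        W = T @ W
    for T in reversed(left):
        W = T @ W
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    m = 4
    A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    U, _ = np.linalg.qr(A)
    left, right, V = clements_smzi_decompose(U)
    g = V[0, 0]
    print("U recovered:", np.allclose(U, np.conj(g) * rebuild(left, right, m)))
```

In `rebuild`, the *Q* operations are the last entries of `right`, so they end up between the even-*j* and odd-*j* groups in the product.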

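As before, at the end of step 3, *V* is the identity matrix up to a global phase. However, now when expanding *V*,

$$ V = \cdots\left[M^{(4,4)}\cdots M^{(4,1)}P^{(4)}\right]\left[M^{(2,2)}M^{(2,1)}P^{(2)}\right]U^{*}\left[P^{(1)}M^{(1,1)}\right]\left[P^{(3)}M^{(3,1)}M^{(3,2)}M^{(3,3)}\right]\cdots Q^{(2)}\cdots Q^{(m)} = \mathbb{1}, $$

i.e., the operations with even *j* are to the left of *U** rather than to the right. It can be seen that this naturally places the *Q*^{(k)} operations, which were applied last, in the middle of the circuit,

$$ U = \left[P^{(2)}M^{(2,1)}M^{(2,2)}\right]\left[P^{(4)}M^{(4,1)}\cdots M^{(4,4)}\right]\cdots Q^{(2)}\cdots Q^{(m)}\cdots\left[M^{(3,3)}M^{(3,2)}M^{(3,1)}P^{(3)}\right]\left[M^{(1,1)}P^{(1)}\right]. $$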

## IV. RELOCATING RESIDUAL PHASE-SHIFTS

We now propose a variant on the Clements scheme as shown in Fig. 4(a), consisting of a rectangular network of sMZIs. For an odd total number of modes, each vertical layer of MZIs leaves one mode at the edge that is not involved in an MZI (for an even total number of modes, alternating layers have zero or two modes not involved in an MZI). The only change compared to the circuit considered in Sec. III is that tunable phase-shifters are added to these sections of waveguides, instead of placing the *Q*^{(k)} phase-shifters in the middle of the circuit. The new phase-shifters do not add to the overall length of the circuit, since path-length matching dictates that these sections of waveguide are in any case as long as if they had been involved in an MZI. The external phase-shifters at the inputs and outputs are not shown here.

Now, if a phase-shift *ϕ* is required on a single waveguide between layers of MZIs, we follow a new procedure, shown in Fig. 4(b). Adding +*ϕ* to both phase-shifters in the MZI to the left implements the phase, but a −*ϕ* shift is now required to correct the effect on the waveguide below the original phase-shift. Now, an MZI to the right can be used to implement −*ϕ* while requiring a +*ϕ* correction to a lower waveguide. This can be repeated until the residual phase-shift is moved to the edge of the circuit, where it can be implemented directly with one of the new phase-shifters. This relies on the fact that the operation of applying the same phase-shift to two modes commutes with a beam-splitter operation on those two modes.
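The two identities behind this relocation step can be checked numerically. The sketch below assumes an sMZI built from symmetric 50:50 couplers with phases *θ*_{1} and *θ*_{2} on its arms; the function names are hypothetical.

```python
import numpy as np

def smzi(theta1, theta2):
    """sMZI from two assumed symmetric 50:50 couplers and two arm phases."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    return bs @ np.diag([np.exp(1j * theta1), np.exp(1j * theta2)]) @ bs

def absorbed_at_output(theta1, theta2, phi):
    """+phi on the upper output == (+phi on both arms), then -phi on the lower output."""
    wanted = np.diag([np.exp(1j * phi), 1.0]) @ smzi(theta1, theta2)
    moved = np.diag([1.0, np.exp(-1j * phi)]) @ smzi(theta1 + phi, theta2 + phi)
    return np.allclose(wanted, moved)

def absorbed_at_input(theta1, theta2, phi):
    """-phi on the upper input == +phi on the lower input, then (-phi on both arms)."""
    wanted = smzi(theta1, theta2) @ np.diag([np.exp(-1j * phi), 1.0])
    moved = smzi(theta1 - phi, theta2 - phi) @ np.diag([1.0, np.exp(1j * phi)])
    return np.allclose(wanted, moved)

if __name__ == "__main__":
    print(absorbed_at_output(0.3, 1.9, 0.8), absorbed_at_input(0.3, 1.9, 0.8))
```

Alternating these two steps moves the residual phase one waveguide further toward the edge each time, with its sign flipping at every step.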

This demonstrates that the new circuit is universal; one can follow the decomposition given in Sec. III and then implement the *Q*^{(k)} phase-shifts using this method. Two of these operations are already at the edges of the circuit, so they can be trivially implemented by the edge phase-shifters. One could also follow the original Clements decomposition intended for aMZIs and then implement all the external phase-shifts required by this method by absorbing them into surrounding sMZIs and shifting them to the edge of the circuit.

For each layer of MZIs, there are now *m* phase-shifters. Since adding the same phase to all the phase-shifters in a layer only adds a global phase, there is redundancy here, and any one phase-shifter could be removed from each layer. The circuit would then have the minimum number of degrees of freedom required to implement an arbitrary unitary. For an odd number of modes, this could include removing the single external phase-shifter from each layer, whereas for an even number of modes, alternating layers would contain two external phase-shifters, and there is no obvious way to remove both. Moreover, retaining all the phase-shifters provides a useful redundancy, since any one phase-shifter in each layer can fail (e.g., due to fabrication error) and the overall operation of the circuit will be preserved.

## V. ERROR TOLERANT DESIGNS

The Reck and Clements schemes are provably universal on the assumption that every beam-splitter in the circuit has an exactly balanced splitting ratio. If some beam-splitters deviate from this, for instance, due to uncertainties in fabrication, then the MZIs no longer have full tunability in their splitting ratio, and some unitary transformations become inaccessible to the circuit. Several alternative designs have been proposed, which show improved robustness to randomized beam-splitters, including adding redundant layers of MZIs,^{25} adding permutations of waveguides between MZI layers,^{26} and the design by Fldzhyan *et al.*,^{24} which makes use of alternating layers of beam-splitters and phase-shifters in an arrangement that does not map onto a network of MZIs. For these designs, there is generally no known deterministic method of decomposing them into elementary 2 × 2 unitaries. Rather, the phase settings are optimized to minimize the distance to some target unitary matrix.

For rectangular meshes of MZIs similar to the Clements scheme, it is fairly clear where aMZIs can be replaced by sMZIs; hence, we focus on the Fldzhyan design here. This design is appealing because the depth of the circuit and the number of elements are identical to those of the Clements scheme, while the robustness to beam-splitter imbalance is improved for Haar-random target unitaries. The circuit layout is as shown in Fig. 5(a); here, only four layers of beam-splitters and phase-shifters are shown, with 2*m* such layers required for universality. Figure 5(b) shows a compactified design, where every other layer of phase-shifters has been moved into the preceding layer. It can be seen that this design is equivalent to the original one since wherever a phase is required in a removed layer, it can be applied by adjusting phases in the neighboring layers. Figure 5(c) shows how a phase *ϕ* in a removed layer is implemented by applying +*ϕ* to a subset of phase-shifters in the neighboring layer to the left and −*ϕ* to a subset in the neighboring layer to the right. As mentioned previously, this relies on the fact that equal phase-shifts applied to both modes involved in a beam-splitter can be commuted to the input or output of the beam-splitter. This remains valid regardless of the splitting ratio of the beam-splitter.
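That the commutation argument does not depend on the splitting ratio can be illustrated with a short check. The coupler parametrization below (power splitting cos²*η* : sin²*η*) and the function names are assumptions chosen for illustration, not a model of any particular device.

```python
import numpy as np

def coupler(eta):
    """Assumed lossless coupler with power splitting cos^2(eta) : sin^2(eta)."""
    c, s = np.cos(eta), np.sin(eta)
    return np.array([[c, 1j * s], [1j * s, c]])

def phase_moves_through(eta, phi):
    """A phase on one output equals a common input phase plus an opposite
    correction on the other output, for any splitting ratio."""
    single = np.diag([np.exp(1j * phi), 1.0])       # +phi on the upper output
    common = np.exp(1j * phi) * np.eye(2)           # +phi on both inputs
    correction = np.diag([1.0, np.exp(-1j * phi)])  # -phi on the lower output
    return np.allclose(single @ coupler(eta), correction @ coupler(eta) @ common)

if __name__ == "__main__":
    for eta in (0.2, np.pi / 4, 1.1):               # imbalanced and balanced cases
        print(eta, phase_moves_through(eta, 0.9))
```

The identity holds because a common phase on both modes is proportional to the identity on that pair, so it commutes with any 2 × 2 coupler matrix.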

## VI. CONCLUSIONS

Using sMZIs implies that the circuit length taken up by phase-shifters is approximately halved compared to aMZIs. This can be a significant fraction of the overall circuit length in many integrated photonic platforms; for example, in silicon-on-insulator,^{15} silica-on-silicon,^{27} and lithium niobate-on-insulator,^{28} the length of a thermo-optic phase-shifter is comparable to that of a beam-splitter, whereas in silicon nitride, the phase-shifters are relatively long and can provide a dominant contribution to the circuit length.^{29,30}

The saving in length comes at the expense of a somewhat more complicated control strategy: multiple phase-shifters need to be tuned together to configure the parameters of an sMZI, which could negatively impact their precision. An alternative is to use a four-phase MZI as suggested in Ref. 31, where through dual driving of two internal and two external phase-shifters, the required range of each could be reduced to [0, *π*). Assuming that the length of a phase-shifter is proportional to its required range, this could give a similar saving in length to the sMZI, although at the cost of doubling the number of electrical connections and control channels.

In summary, we have shown that the aMZI can be replaced by an sMZI in a Clements-style rectangular network while retaining universality, providing a deterministic and efficient method of selecting the phases required to implement a target unitary. We expect that sMZIs can replace aMZIs in related designs making use of a rectangular structure and have used a similar logic to suggest a more compact but equivalent form of the circuit by Fldzhyan *et al.*^{24} We expect that reducing the circuit depth while minimizing the number of control channels will allow larger universal linear optical circuits to fit on a chip, while reducing the propagation loss, helping to realize large-scale quantum photonic computation and simulation.

## ACKNOWLEDGMENTS

The decomposition methods presented in Secs. III and IV are implemented in the Strawberry Fields^{32} software library for photonic quantum computing as the *rectangular_compact* function. We thank David Miller for helpful comments. We acknowledge funding from the EPSRC UK Quantum Technologies Programme (Grant No. EP/T001062/1) via the Quantum Computing and Simulation hub. Bryn Bell was supported by a European Commission Marie Skłodowska Curie Individual Fellowship (FrEQuMP, Grant No. 846073).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.