The aim of this review is to provide quantum engineers with an introductory guide to the central concepts and challenges in the rapidly accelerating field of superconducting quantum circuits. Over the past twenty years, the field has matured from a predominantly basic research endeavor to one that increasingly explores the engineering of larger-scale superconducting quantum systems. Here, we review several foundational elements (qubit design, noise properties, qubit control, and readout techniques) developed during this period, bridging fundamental concepts in circuit quantum electrodynamics and contemporary, state-of-the-art applications in gate-model quantum computation.

Quantum processors harness the intrinsic properties of quantum mechanical systems—such as quantum parallelism and quantum interference—to solve certain problems where classical computers fall short.1–6 Over the past two decades, rapid developments in the science and engineering of quantum systems have advanced the frontier in quantum computation, from the realm of scientific explorations on single isolated quantum systems toward the creation and manipulation of multiqubit processors.7,8 In particular, the requirements imposed by larger quantum processors have shifted the mindset within the community, from solely scientific discovery to the development of new, foundational engineering abstractions associated with the design, control, and readout of multiqubit quantum systems. The result is the emergence of a new discipline termed “quantum engineering,” which serves to bridge basic sciences, mathematics, and computer science with fields generally associated with traditional engineering.

One prominent platform for constructing a multiqubit quantum processor involves superconducting qubits, in which information is stored in quantum degrees of freedom (DOFs) of nanofabricated, anharmonic oscillators (AHOs) constructed from superconducting circuit elements. In contrast to other platforms, e.g., electron spins in silicon9–14 and quantum dots,15–18 trapped ions,19–23 ultracold atoms,24–27 nitrogen-vacancies in diamonds,28,29 and polarized photons,30–33 where the quantum information is encoded in natural microscopic quantum systems, superconducting qubits are macroscopic in size and lithographically defined.

One remarkable feature of superconducting qubits is that their energy-level spectra are governed by circuit element parameters and thus are configurable; they can be designed to exhibit “atomlike” energy spectra with the desired properties. Therefore, superconducting qubits are also often referred to as “artificial atoms,” offering a rich parameter space of possible qubit properties and operation regimes, with predictable performance in terms of transition frequencies, anharmonicity, and complexity.

While there are many other excellent reviews on superconducting qubits, see, e.g., Refs. 34–43, this work specifically aims to introduce new quantum engineers (academic and industrial alike) to the terminology and state-of-the-art practices used in the rapidly accelerating field of superconducting quantum computing. The reader is assumed to be familiar with the basic concepts that span classical physics, quantum mechanics, and electrical engineering. In particular, readers will find it useful to have had previous exposure to classical mechanics, the Schrödinger equation, the Bloch sphere representation of qubit states, second quantization, basic concepts of superconductivity, electromagnetism, introductory circuit analysis, classical Boolean logic, linear dynamical systems, analog and digital signal processing, and familiarity with microwave components such as transmission lines and mixers. These topics will be introduced as they arise, but having basic prior knowledge will be helpful.

This review is organized in the following four sections: first, in Sec. II, we explore the parameter space available when designing superconducting circuits. In particular, we look at the promising capacitively shunted planar qubit modalities and how these can be engineered with the desired properties, such as transition frequency, anharmonicity, and reduced susceptibility to various sources of noise. In this section, we also introduce several ways in which interactions between qubits can be engineered, in order to implement two-qubit entangling operations, needed for a universal gate set.

In Sec. III, we discuss systematic and stochastic noise, the concepts of noise strength and qubit noise susceptibility, and the common sources of noise which lead to decoherence in superconducting circuits. We introduce the Bloch-Redfield model of decoherence, characterized by longitudinal and transverse relaxation times T1 and T2, and discuss the implications of 1/f noise. We then define the noise power spectral density (PSD), which is commonly used to characterize noise processes, and describe how it drives decoherence. Finally, we close the section with a review of coherent control methods used to mitigate certain types of coherent, reversible noise.

In Sec. IV, we provide a review of how single- and two-qubit operations are typically implemented in superconducting circuits, by using a combination of local magnetic flux control and microwave drives. In particular, we discuss the family of two-qubit gates arising from a capacitive coupling between qubits, and introduce several recent advances that have been demonstrated to achieve high-fidelity gates, as well as applications in quantum information processing that use these gates. The continued development of high-fidelity two-qubit gates in superconducting qubits is a highly active research area. For this reason, we include sufficient technical details that a reader may use this review as a starting point to critically assess the pros and cons of the various gates, as well as develop an appreciation for the types of gate engineering already implemented in state-of-the-art superconducting quantum processors.

Finally, in Sec. V, we discuss the physics and engineering associated with the dispersive readout technique, typically used to measure the individual qubit states in modern quantum processors. After a discussion of the theory behind dispersive coupling, we give an introduction to the design of Purcell filters and the development of quantum-limited parametric amplifiers (PAs).

In this section, we will demonstrate how quantum systems based on superconducting circuits can be engineered to achieve certain desired properties. Using the most common qubit modalities, we discuss how properties such as the qubit transition frequency, anharmonicity, and noise susceptibility can be tailored by the choice of circuit topology and element parameter values. We also discuss how to engineer the interactions between different quantum systems, in particular, the cases of qubit-qubit and qubit-resonator couplings.

A quantum mechanical system is governed by the time-dependent Schrödinger equation

$$\hat{H}|\psi(t)\rangle = i\hbar\frac{\partial}{\partial t}|\psi(t)\rangle, \tag{1}$$

where $|\psi(t)\rangle$ is the state of the quantum system at time $t$, $\hbar$ is the reduced Planck constant $h/2\pi$, and $\hat{H}$ is the “Hamiltonian” that describes the total energy of the system. The “hat” is used to indicate that $\hat{H}$ is a quantum operator. As the Schrödinger equation is a first-order linear differential equation, the temporal dynamics of the quantum system may be viewed as a straightforward example of a linear dynamical system with a formal solution

$$|\psi(t)\rangle = e^{-i\hat{H}t/\hbar}|\psi(0)\rangle. \tag{2}$$

The time-independent Hamiltonian $\hat{H}$ governs the time evolution of the system through the operator $e^{-i\hat{H}t/\hbar}$. Thus, just as with classical systems, determining the Hamiltonian of a system (whether the classical Hamiltonian $H$ or its quantum counterpart $\hat{H}$) is the first step to deriving its dynamical behavior. In Sec. IV, we consider the case when the Hamiltonian is time-dependent in the context of qubit control.
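
As a minimal numerical sketch of the formal solution $|\psi(t)\rangle = e^{-i\hat{H}t/\hbar}|\psi(0)\rangle$, the snippet below evolves a two-level state by diagonalizing a time-independent Hamiltonian. The choice $\hat{H} = (\omega/2)\sigma_x$, the numerical value of $\omega$, and the helper name `evolve` are illustrative assumptions rather than anything prescribed by the text; units are chosen such that $\hbar = 1$.

```python
import numpy as np

def evolve(H, psi0, t, hbar=1.0):
    """Return exp(-i H t / hbar) @ psi0 for a Hermitian matrix H.

    The propagator is applied in the eigenbasis of H, where it is diagonal.
    """
    energies, vecs = np.linalg.eigh(H)
    phases = np.exp(-1j * energies * t / hbar)
    return vecs @ (phases * (vecs.conj().T @ psi0))

# Hypothetical example: H = (omega / 2) * sigma_x, starting in state |0>.
omega = 2 * np.pi * 1.0                      # angular frequency, arbitrary units
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
H = 0.5 * omega * sigma_x
psi0 = np.array([1, 0], dtype=complex)

# After half a period (omega * t = pi) the population is fully transferred.
psi_half = evolve(H, psi0, t=0.5)
populations = np.abs(psi_half) ** 2
print(populations)
```

Working in the eigenbasis keeps the example free of any dependency beyond NumPy; for larger or time-dependent problems a dedicated solver would normally be used instead.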

To understand the dynamics of a superconducting qubit circuit, it is natural to start with the classical description of a linear LC resonant circuit [Fig. 1(a)]. In this system, energy oscillates between electrical energy in the capacitor C and magnetic energy in the inductor L. In the following, we will arbitrarily associate the electrical energy with the “kinetic energy” and the magnetic energy with the “potential energy” of the oscillator. The instantaneous, time-dependent energy in each element is derived from its current and voltage

$$E(t) = \int_{-\infty}^{t} V(t')\,I(t')\,dt', \tag{3}$$

where $V(t')$ and $I(t')$ denote the voltage and current of the capacitor or inductor.

To derive the classical Hamiltonian, we follow the standard approach used in classical mechanics: the Lagrange-Hamilton formulation. Here, we represent each circuit element in terms of one of its generalized circuit coordinates, charge or flux. In the following, we pick flux, defined as the time integral of the voltage

$$\Phi(t) = \int_{-\infty}^{t} V(t')\,dt'. \tag{4}$$

In this example, the voltage at the node is also the branch voltage across the element. In this section, we will simply refer to these as node voltages and fluxes for convenience. For a more detailed discussion of nodes and branches in this context, we refer the reader to Ref. 44.

Note that in the following, we could have exchanged our associations with kinetic energy (momentum coordinate) and potential energy (position coordinate), and instead start with the charge variable Q(t), which is the time integral of the current I(t).

By combining Eqs. (3) and (4), using the relations $V = L\,dI/dt$ and $I = C\,dV/dt$, and applying the integration by parts formula, we can write down energy terms for the capacitor and inductor in terms of the node flux

$$T_C = \frac{1}{2}C\dot{\Phi}^2, \tag{5}$$

$$U_L = \frac{1}{2L}\Phi^2. \tag{6}$$

The Lagrangian is defined as the difference between the kinetic and potential energy terms and can thus be expressed in terms of Eqs. (5) and (6)

$$\mathcal{L} = T_C - U_L = \frac{1}{2}C\dot{\Phi}^2 - \frac{1}{2L}\Phi^2. \tag{7}$$

From the Lagrangian in Eq. (7), we can further derive the Hamiltonian using the Legendre transformation, for which we need to calculate the momentum conjugate to the flux, which in this case is the charge on the capacitor

$$Q = \frac{\partial\mathcal{L}}{\partial\dot{\Phi}} = C\dot{\Phi}. \tag{8}$$

The Hamiltonian of the system is now defined as

$$H = Q\dot{\Phi} - \mathcal{L} = \frac{Q^2}{2C} + \frac{\Phi^2}{2L} \equiv \frac{1}{2}CV^2 + \frac{1}{2}LI^2, \tag{9}$$

as one would expect for an electrical LC circuit. Note that this Hamiltonian is analogous to that of a mechanical harmonic oscillator, with mass $m = C$ and resonant frequency $\omega = 1/\sqrt{LC}$, which expressed in position, $x$, and momentum, $p$, coordinates takes the form $H = p^2/2m + m\omega^2x^2/2$.
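
The following sketch checks the classical picture numerically: it evaluates $H = Q^2/2C + \Phi^2/2L$ along the harmonic solution $\Phi(t) = \Phi_{\max}\cos(\omega t)$, $Q = C\dot{\Phi}$, and confirms that the total energy stays constant while it sloshes between capacitor and inductor. The component values (10 nH, 100 fF), the flux amplitude, and the function name `lc_energy` are hypothetical choices made only for illustration.

```python
import numpy as np

def lc_energy(flux, charge, L_ind, C_cap):
    """Classical LC Hamiltonian H = Q^2 / (2C) + Phi^2 / (2L)."""
    return charge**2 / (2 * C_cap) + flux**2 / (2 * L_ind)

# Hypothetical component values, roughly in the range of microwave resonators.
L_ind = 10e-9      # inductance in henry
C_cap = 100e-15    # capacitance in farad

omega = 1 / np.sqrt(L_ind * C_cap)   # resonant angular frequency (rad/s)
f_r = omega / (2 * np.pi)            # resonant frequency (Hz)

# Harmonic solution over one period: Phi(t) and Q(t) = C * dPhi/dt.
flux_max = 1e-15                     # flux amplitude in weber (arbitrary)
t = np.linspace(0, 2 * np.pi / omega, 201)
flux = flux_max * np.cos(omega * t)
charge = -C_cap * flux_max * omega * np.sin(omega * t)

energy = lc_energy(flux, charge, L_ind, C_cap)
print(f"f_r = {f_r / 1e9:.3f} GHz")
print("relative energy spread:", np.ptp(energy) / energy[0])
```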

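The Hamiltonian described above is classical. In order to proceed to a quantum-mechanical description of the system, we need to promote the charge and flux coordinates to quantum operators. Whereas the classical coordinates satisfy the Poisson bracket

$$\{f, g\} = \frac{\partial f}{\partial\Phi}\frac{\partial g}{\partial Q} - \frac{\partial g}{\partial\Phi}\frac{\partial f}{\partial Q} \tag{10}$$

$$\rightarrow \{\Phi, Q\} = \frac{\partial\Phi}{\partial\Phi}\frac{\partial Q}{\partial Q} - \frac{\partial Q}{\partial\Phi}\frac{\partial\Phi}{\partial Q} = 1 - 0 = 1, \tag{11}$$

the quantum operators similarly satisfy a “commutation relation”

$$[\hat{\Phi}, \hat{Q}] = \hat{\Phi}\hat{Q} - \hat{Q}\hat{\Phi} = i\hbar, \tag{12}$$

where the operators are indicated by hats. From this point forward, however, the hats on operators will be omitted for simplicity.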

In a simple LC resonant circuit [Fig. 1(a)], both the inductor L and the capacitor C are linear circuit elements. Defining the reduced flux $\phi \equiv 2\pi\Phi/\Phi_0$ and the reduced charge $n = Q/2e$, we can write down the following quantum-mechanical Hamiltonian for the circuit

$$H = 4E_Cn^2 + \frac{1}{2}E_L\phi^2, \tag{13}$$

where $E_C = e^2/(2C)$ is the charging energy required to add “each” electron of the Cooper-pair to the island and $E_L = (\Phi_0/2\pi)^2/L$ is the inductive energy, where $\Phi_0 = h/(2e)$ is the superconducting magnetic flux quantum. Moreover, the quantum operator $n$ is the excess number of Cooper-pairs on the island, and $\phi$, the reduced flux, is denoted the “gauge-invariant phase” across the inductor. These two operators form a canonical conjugate pair, obeying the commutation relation $[\phi, n] = i$. We note that the factor 4 in front of the charging energy $E_C$ is solely a historical artifact, namely, that this energy scale was first defined for single-electron systems and then adopted for two-electron Cooper-pair systems.

The Hamiltonian in Eq. (13) is identical to the one describing a particle in a one-dimensional quadratic potential, a quantum harmonic oscillator (QHO). We can treat $\phi$ as the generalized position coordinate, so that the first term is the kinetic energy and the second term is the potential energy. We emphasize that the functional form of the potential energy influences the eigensolutions. For example, the fact that this term is quadratic ($U_L \propto \phi^2$) in Eq. (13) gives rise to the shape of the potential in Fig. 1(b). The solution to this eigenvalue problem gives an infinite series of eigenstates $|k\rangle$ ($k = 0, 1, 2, \ldots$), whose corresponding eigenenergies $E_k$ are all equidistantly spaced, i.e., $E_{k+1} - E_k = \hbar\omega_r$, where $\omega_r = \sqrt{8E_LE_C}/\hbar = 1/\sqrt{LC}$ denotes the resonant frequency of the system, see Fig. 1(b). We may represent these results in a more compact form (second quantization) for the QHO Hamiltonian

$$H = \hbar\omega_r\left(a^\dagger a + \frac{1}{2}\right), \tag{14}$$

where $a^\dagger$ ($a$) is the creation (annihilation) operator of a single excitation of the resonator. The Hamiltonian in Eq. (14) is written as energy. It is, however, often preferred to divide by $\hbar$ so that the expression has units of radian frequency, since we will later resonantly drive transitions at a particular frequency or reference the rate at which two systems interact with one another. Therefore, from here on, $\hbar$ will be omitted.

The original charge number and phase operators can be expressed as $n = n_{\rm zpf} \times i(a^\dagger - a)$ and $\phi = \phi_{\rm zpf} \times (a + a^\dagger)$, where $n_{\rm zpf} = [E_L/(32E_C)]^{1/4}$ and $\phi_{\rm zpf} = (2E_C/E_L)^{1/4}$ are the “zero-point fluctuations” of the charge and phase variables, respectively. Quantum mechanically, the quantum states are represented as wavefunctions that are generally distributed over a range of values of $n$ and $\phi$ and, consequently, the wavefunctions have nonzero standard deviations. Such wavefunction distributions are referred to as “quantum fluctuations,” and they exist, even in the ground state, where they are called zero-point fluctuations.
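
The relations above can be checked numerically in a truncated Fock basis. The sketch below builds $n$ and $\phi$ from ladder operators with the sign convention $n = i\,n_{\rm zpf}(a^\dagger - a)$ (chosen so that $[\phi, n] = i$), assembles $H = 4E_Cn^2 + \frac{1}{2}E_L\phi^2$, and verifies the equidistant level spacing $\omega_r = \sqrt{8E_LE_C}$. The energy values, the truncation dimension, and the helper name `ladder` are assumptions made for illustration; energies are in units where $\hbar = 1$.

```python
import numpy as np

def ladder(dim):
    """Annihilation operator a in a Fock basis truncated to `dim` levels."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

# Hypothetical energies in GHz (i.e. divided by h).
E_C, E_L = 0.25, 12.5
dim = 40

a = ladder(dim)
adag = a.T

n_zpf = (E_L / (32 * E_C)) ** 0.25
phi_zpf = (2 * E_C / E_L) ** 0.25

n = 1j * n_zpf * (adag - a)       # charge-number operator
phi = phi_zpf * (a + adag)        # reduced-flux (phase) operator

H = 4 * E_C * (n @ n) + 0.5 * E_L * (phi @ phi)
levels = np.linalg.eigvalsh(H)
spacings = np.diff(levels[:5])    # lowest levels are unaffected by truncation

omega_r = np.sqrt(8 * E_L * E_C)
commutator = phi @ n - n @ phi    # equals i * identity away from the cutoff

print("level spacings:", spacings)
print("omega_r       :", omega_r)
print("n_zpf * phi_zpf:", n_zpf * phi_zpf)
```

The last row and column of the commutator deviate from $i$ because of the basis cutoff, which is a generic artifact of truncating ladder operators.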

The linear characteristics of the QHO have a natural limitation in its applications for processing quantum information. Before the system can be used as a qubit, we need to be able to define a computational subspace consisting of only two energy states (usually the two lowest-energy eigenstates) between which transitions can be driven without also exciting other levels in the system. Since many gate operations, such as single-qubit gates (Sec. IV), depend on frequency selectivity, the equidistant level-spacing of the QHO, illustrated in Fig. 1(b), poses a practical limitation.427

To mitigate the problem of unwanted dynamics involving noncomputational states, we need to add anharmonicity (or nonlinearity) into our system. In short, we require the transition frequencies $\omega_q^{0\to1}$ and $\omega_q^{1\to2}$ to be sufficiently different to be individually addressable. In general, the larger the anharmonicity the better. In practice, the amount of anharmonicity sets a limit on how short the pulses used to drive the qubit can be. This is discussed in detail in Sec. IV D 3.

To introduce the nonlinearity required to modify the harmonic potential, we use the Josephson junction, a nonlinear, dissipationless circuit element that forms the backbone in superconducting circuits.46,47 By replacing the linear inductor of the QHO with a Josephson junction, playing the role of a nonlinear inductor, we can modify the functional form of the potential energy. The potential energy of the Josephson junction can be derived from Eq. (3) and the two Josephson relations

$$I = I_c\sin(\phi), \qquad V = \frac{\hbar}{2e}\frac{d\phi}{dt}, \tag{15}$$

resulting in a modified Hamiltonian

$$H = 4E_Cn^2 - E_J\cos(\phi), \tag{16}$$

where $E_C = e^2/(2C_\Sigma)$, $C_\Sigma = C_s + C_J$ is the total capacitance, including both shunt capacitance $C_s$ and the self-capacitance of the junction $C_J$, and $E_J = I_c\Phi_0/2\pi$ is the Josephson energy, with $I_c$ being the critical current of the junction.428 After introducing the Josephson junction in the circuit, the potential energy no longer takes a manifestly parabolic form (from which the harmonic spectrum originates), but rather features a cosinusoidal form, see the second term in Eq. (16), which makes the energy spectrum nondegenerate. Therefore, the Josephson junction is the key ingredient that makes the oscillator anharmonic and thus allows us to identify a uniquely addressable quantum two-level system, see Fig. 1(d).

Once the nonlinearity has been added, the system dynamics is governed by the dominant energy in Eq. (16), reflected in the $E_J/E_C$ ratio. Over time, the superconducting qubit community has converged toward circuit designs with $E_J \gg E_C$. In the opposite case, when $E_J \leq E_C$, the qubit becomes highly sensitive to charge noise, which has proven more challenging to mitigate than flux noise, making it very hard to achieve high coherence. Another motivation is that current technologies allow for more flexibility in engineering the inductive (or potential) part of the Hamiltonian. Working in the $E_J \gg E_C$ limit therefore makes the system more sensitive to changes in the potential Hamiltonian. Accordingly, we will focus here on the state-of-the-art qubit modalities that fall in the regime $E_J \gg E_C$. For readers who are interested in the physics in the $E_J \leq E_C$ regime, such as the earlier Cooper-pair box charge qubit, we refer to Refs. 48–51.

To access the $E_J \gg E_C$ regime, one preferred approach is to make the charging energy $E_C$ small by shunting the junction with a large capacitor, $C_s \gg C_J$, effectively making the qubit less sensitive to charge noise; the resulting circuit is commonly known as the transmon qubit.52 In this limit, the superconducting phase $\phi$ is a good quantum number, i.e., the spread (or quantum fluctuation) of $\phi$ values represented by the quantum wavefunction is small. The low-energy eigenstates are therefore, to a good approximation, localized states in the potential well, see Fig. 1(d). We may gain more insight by expanding the potential term of Eq. (16) into a power series (since $\phi$ is small), that is

$$-E_J\cos(\phi) = -E_J + \frac{1}{2}E_J\phi^2 - \frac{1}{24}E_J\phi^4 + \mathcal{O}(\phi^6). \tag{17}$$

Apart from the constant offset $-E_J$, the quadratic term in Eq. (17) alone will result in a QHO, recall Eq. (13). The next term, however, is quartic, which modifies the eigensolution and disrupts the otherwise harmonic energy structure. Note that the negative coefficient of the quartic term indicates that the anharmonicity $\alpha = \omega_q^{1\to2} - \omega_q^{0\to1}$ is negative, and that its magnitude cannot be made arbitrarily large. For the case of the transmon, $\alpha = -E_C$ is usually designed to have a magnitude of 100–300 MHz, as required to maintain a desirable qubit frequency $\omega_q = (\sqrt{8E_JE_C} - E_C)/\hbar$ of 3–6 GHz, while keeping an energy ratio sufficiently large ($E_J/E_C \geq 50$) to suppress charge sensitivity.52 Fortunately, the charge sensitivity is exponentially suppressed for an increased $E_J/E_C$, while the reduction in anharmonicity only scales as a weak power law, leading to a workable device.
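
To see how well the estimates $\omega_q \approx \sqrt{8E_JE_C} - E_C$ and $\alpha \approx -E_C$ hold, one can diagonalize $H = 4E_Cn^2 - E_J\cos(\phi)$ directly. The sketch below does this in the charge basis, where $\cos(\phi)$ couples neighboring charge states with amplitude $1/2$. The offset charge `n_g`, the cutoff `n_cut`, the function name, and the parameter values ($E_J/E_C = 50$) are assumptions introduced for this example; energies are in GHz with $\hbar = 1$.

```python
import numpy as np

def transmon_levels(E_J, E_C, n_g=0.0, n_cut=30, n_levels=3):
    """Lowest eigenenergies of H = 4 E_C (n - n_g)^2 - E_J cos(phi).

    The Hamiltonian is written in the charge basis |n>, n = -n_cut..n_cut,
    where cos(phi) = (1/2) * sum_n (|n><n+1| + |n+1><n|).
    """
    n_vals = np.arange(-n_cut, n_cut + 1)
    H = np.diag(4.0 * E_C * (n_vals - n_g) ** 2)
    hop = -0.5 * E_J * np.ones(2 * n_cut)
    H = H + np.diag(hop, k=1) + np.diag(hop, k=-1)
    return np.linalg.eigvalsh(H)[:n_levels]

# Hypothetical transmon parameters in GHz, with E_J / E_C = 50.
E_J, E_C = 12.5, 0.25
E0, E1, E2 = transmon_levels(E_J, E_C)

w01 = E1 - E0
alpha = (E2 - E1) - w01

print(f"w01   = {w01:.4f} GHz (estimate {np.sqrt(8 * E_J * E_C) - E_C:.4f})")
print(f"alpha = {alpha:.4f} GHz (estimate {-E_C:.4f})")
```

The numerically exact values differ slightly from the leading-order estimates because terms beyond the quartic one in the expansion of the cosine are retained here.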

Including terms up to fourth order and using the QHO eigenbasis, the system Hamiltonian resembles that of a Duffing oscillator

$$H = \omega_q a^\dagger a + \frac{\alpha}{2}a^\dagger a^\dagger a a. \tag{18}$$

Since $|\alpha| \ll \omega_q$, we can see that the transmon qubit is basically a weakly anharmonic oscillator (AHO). If excitation to higher noncomputational states is suppressed during any gate operation, either due to a large enough $|\alpha|$ or due to robust control techniques such as the derivative removal by adiabatic gate (DRAG) pulse, see Sec. IV D 3, we may effectively treat the AHO as a quantum two-level system, simplifying the Hamiltonian to

$$H = \omega_q\frac{\sigma_z}{2}, \tag{19}$$

where $\sigma_z$ is the Pauli-z operator. However, one should always keep in mind that higher levels physically exist.53 Their influence on the system dynamics should be taken into account when designing the system and its control processes. In fact, there are many cases where the higher levels have proven useful to implement more efficient gate operations.54
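
A short sketch of the Duffing-oscillator form $H = \omega_q a^\dagger a + (\alpha/2)\,a^\dagger a^\dagger a a$ makes the level structure explicit: the Hamiltonian is diagonal in the Fock basis with energies $E_k = k\,\omega_q + \alpha\,k(k-1)/2$, so the $0\to1$ and $1\to2$ transitions are separated by exactly $\alpha$. The numerical values and the function name `duffing_hamiltonian` are hypothetical.

```python
import numpy as np

def duffing_hamiltonian(omega_q, alpha, dim=5):
    """H = omega_q a^dag a + (alpha / 2) a^dag a^dag a a, truncated to `dim` levels."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    adag = a.T
    return omega_q * adag @ a + 0.5 * alpha * adag @ adag @ a @ a

# Hypothetical transmon-like values in GHz.
omega_q, alpha = 5.0, -0.25
H = duffing_hamiltonian(omega_q, alpha)

levels = np.diag(H)               # H is diagonal in the Fock basis
w01 = levels[1] - levels[0]
w12 = levels[2] - levels[1]

print("levels:", levels)
print("w01 =", w01, " w12 =", w12, " w12 - w01 =", w12 - w01)
```

Keeping only the two lowest levels of this matrix corresponds to the two-level truncation described in the text, whereas the third level is the one most relevant for leakage during fast pulses.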

In addition to reducing the charge dispersion, the use of a large shunt capacitor also enables us to engineer the electric field distribution of the quantum system, and thus the participation of surface loss mechanisms. In the development of the 3D transmon,55 e.g., a 2D transmon coupled to a 3D cavity, it was demonstrated that by making the gap between the two lateral capacitor plates large (compared to the film thickness) the coherence time increases since a smaller portion of the electric field interacts with the lossy interfaces, e.g., metal-substrate and substrate-vacuum interfaces, which has been studied extensively.56–61 

1. Tunable qubit: Split transmon

To implement fast gate operations with high fidelity, as needed to implement quantum logic, many (though not all63) of the quantum processor architectures implemented today feature tunable qubit frequencies.64–67 For instance, in some cases, we need to bring two qubits into resonance to exchange (swap) energy, while we also need the capability of separating them during idling periods to minimize their interactions. To do this, we need an external parameter which allows us to access one of the degrees of freedom of the system in a controllable fashion.

One widely used technique is to replace the single Josephson junction with a loop interrupted by two identical junctions, forming a DC superconducting quantum interference device (DC-SQUID).68 Due to the interference between the two arms of the SQUID, the effective critical current of the two parallel junctions can be decreased by applying a magnetic flux threading the loop, see Fig. 2(a). Due to the fluxoid quantization condition, the algebraic sum of the branch fluxes of all of the inductive elements along the loop plus the externally applied flux equals an integer number of superconducting flux quanta, that is

$$\varphi_1 - \varphi_2 + 2\varphi_e = 2\pi k, \tag{20}$$

where $\varphi_e = \pi\Phi_{\rm ext}/\Phi_0$. Using this condition, we can eliminate one degree of freedom and treat the SQUID-loop as a single junction, but with the important modification that $E_J$ is tunable (via the SQUID critical current) by means of the external flux $\Phi_{\rm ext}$. The effective Hamiltonian of the so-called split transmon (ignoring the constant) is

$$H = 4E_Cn^2 - \underbrace{2E_J|\cos(\varphi_e)|}_{E_J'(\varphi_e)}\cos(\phi). \tag{21}$$

We can see that Eq. (21) is analogous to Eq. (16), with $E_J$ replaced by $E_J'(\varphi_e) = 2E_J|\cos(\varphi_e)|$. The magnitude of the net, effective Josephson energy $E_J'$ has a period of $\Phi_0$ in applied flux and spans from 0 to its maximum value $2E_J$. Therefore, the qubit frequency can be tuned periodically with $\Phi_{\rm ext}$, see Fig. 2(b).

While the split transmon enables frequency tunability by the externally applied magnetic field, it also introduces sensitivity to random flux fluctuations, known as flux noise. At any working point, the slope of the qubit spectrum, $\partial\omega_q/\partial\Phi_{\rm ext}$, indicates to first order how strongly this flux noise affects the qubit frequency. The sensitivity is generally nonzero, except at multiples of the flux quantum, $\Phi_{\rm ext} = k\Phi_0$, where $k$ is an integer and $\partial\omega_q/\partial\Phi_{\rm ext} = 0$.

One recent development has focused on reducing the qubit sensitivity to flux noise, while maintaining sufficient tunability to operate our quantum gates. The idea is to make the two junctions in the split transmon asymmetric,69 see Fig. 2(c). This yields the following Hamiltonian

$$H = 4E_Cn^2 - \underbrace{E_{J\Sigma}\sqrt{\cos^2(\varphi_e) + d^2\sin^2(\varphi_e)}}_{E_J'(\varphi_e)}\cos(\phi), \tag{22}$$

where $E_{J\Sigma} = E_{J1} + E_{J2}$ and $d = (\gamma - 1)/(\gamma + 1)$ is the junction asymmetry parameter, with $\gamma = E_{J2}/E_{J1}$. Again, we can treat the two junctions as a single-junction transmon, with an effective Josephson energy $E_J'(\varphi_e)$. In particular, we can recognize two special cases: for $d = 0$, the Hamiltonian in Eq. (22) reduces to the symmetric case with $E_J'(\varphi_e) = E_{J\Sigma}|\cos(\varphi_e)|$, as in Eq. (21) with $E_{J\Sigma} = 2E_J$. In the other limit, when $|d| \to 1$, $E_J'(\varphi_e) \to E_{J\Sigma}$ and the flux-tunability of the Josephson energy vanishes, which is equivalent to the single junction case, recall Eq. (16).
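
The trade-off between tunability and flux sensitivity can be sketched numerically. The code below evaluates the effective Josephson energy $E_{J\Sigma}\sqrt{\cos^2\varphi_e + d^2\sin^2\varphi_e}$ for a symmetric ($d = 0$) and an asymmetric ($d = 0.5$) SQUID, converts it to a qubit frequency using the transmon estimate $\omega_q \approx \sqrt{8E_JE_C} - E_C$, and compares the slopes with respect to $\varphi_e$. All parameter values and function names are assumptions for illustration, and the flux range is restricted so that the transmon estimate remains reasonable for the symmetric device.

```python
import numpy as np

def effective_EJ(phi_e, EJ_sum, d=0.0):
    """Flux-dependent Josephson energy of a (possibly asymmetric) SQUID."""
    return EJ_sum * np.sqrt(np.cos(phi_e) ** 2 + d ** 2 * np.sin(phi_e) ** 2)

def qubit_frequency(phi_e, EJ_sum, E_C, d=0.0):
    """Transmon estimate sqrt(8 E_J' E_C) - E_C; valid only for E_J' >> E_C."""
    return np.sqrt(8 * effective_EJ(phi_e, EJ_sum, d) * E_C) - E_C

# Hypothetical parameters in GHz.
EJ_sum, E_C = 20.0, 0.25

# Reduced external flux phi_e = pi * Phi_ext / Phi_0, kept away from pi/2
# where the symmetric device leaves the transmon regime.
phi_e = np.linspace(-0.4 * np.pi, 0.4 * np.pi, 401)

f_sym = qubit_frequency(phi_e, EJ_sum, E_C, d=0.0)
f_asym = qubit_frequency(phi_e, EJ_sum, E_C, d=0.5)

# First-order flux sensitivity d(omega_q)/d(phi_e) by finite differences.
slope_sym = np.gradient(f_sym, phi_e)
slope_asym = np.gradient(f_asym, phi_e)

print("tuning range (sym) :", np.ptp(f_sym))
print("tuning range (asym):", np.ptp(f_asym))
print("max |slope| (sym)  :", np.max(np.abs(slope_sym)))
print("max |slope| (asym) :", np.max(np.abs(slope_asym)))
```

In this sketch the asymmetric device tunes over a smaller range but is also less sensitive to flux everywhere in the range, while both devices have zero slope at $\varphi_e = 0$.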

From the discussion above we see that going from symmetric to asymmetric transmons does not change the circuit topology. This seemingly trivial modification, however, has a profound impact on practical applications. As we can see from the qubit spectra, Fig. 2(d), the flux sensitivity is suppressed across the entire tunable frequency range. For example, the performance of the cross-resonance gate is optimized with a certain frequency detuning between two qubits.70 Therefore, by using an asymmetric transmon, a small frequency-tuning range is introduced that is sufficient to compensate for fabrication variations, without introducing unnecessarily large susceptibility to flux noise and thus maintaining high coherence. As another example, a surface code scheme based on the adiabatic controlled phase (CPHASE)-gate requires a specific frequency configuration among qubits in order to avoid frequency crowding issues, and asymmetric transmons fit well with its well-defined frequency range.71 In general, as quantum processors scale up and fabrication improves, asymmetric transmons are likely to find wider application in the future.

2. Toward larger anharmonicity: Flux qubit and fluxonium

We see that split transmon qubits, be it symmetric or not, still share the same topology as the single junction version, yielding a sinusoidal potential. Therefore, the degree to which the properties of these qubits can be engineered has not fundamentally changed. In particular, the limited anharmonicity in transmon-type qubits intrinsically causes significant residual excitation to higher-energy states, undermining the performance of gate operations. To go beyond this, it is necessary to introduce additional complexity into the circuit.

One outstanding development in this regard is the invention of the flux qubit,72,73 where the qubit loop is interrupted by three (or four) junctions, see Fig. 2(e). On one branch is one smaller junction; on the other branch are two identical junctions, both a factor γ larger in size compared to the small junction. The addition of one more junction as compared to the split transmon is nontrivial, as it changes the circuit topology and reshapes the potential energy profile.

Each junction is associated with a phase variable, and the fluxoid quantization condition again allows us to eliminate one degree of freedom. Consequently, we have a two-dimensional potential landscape, which, in comparison to the simpler topology of the transmon, complicates the problem both conceptually and computationally. Fortunately, under the assumed setting that the array junctions are larger in size ($\gamma > 1$), it is usually a good approximation to treat the problem as a particle moving in a quasi-1D potential, which also helps us gain more insight and intuition about the system and draw qualitative conclusions. The Hamiltonian under this “quasi-1D approximation” reads

$$H \approx 4E_Cn^2 - E_J\cos(2\phi + \varphi_e) - 2\gamma E_J\cos(\phi). \tag{23}$$

Note that the phase variable in Eq. (23) is half the sum of the branch phases across the two array junctions, $\phi = (\varphi_1 + \varphi_2)/2$, assuming the same current direction across $\varphi_1$ and $\varphi_2$. The external magnetic flux is denoted $\varphi_e = 2\pi\Phi_{\rm ext}/\Phi_0$. The second term in Eq. (23) is contributed by the small junction with Josephson energy $E_J$, whereas the third term takes into account the two array junctions, together with Josephson energy $2\gamma E_J$. Clearly, the sum of these two terms no longer has the characteristics of a simple cosinusoid, and the final potential profile as well as the corresponding eigenstates depend on both the external flux $\varphi_e$ and the junction area ratio $\gamma$.

The most common working point for this system is when φe=π+2πk, where k is an integer—that is when half a superconducting flux quantum threads the qubit loop. At this flux bias point, the qubit spectrum reaches its minimum, and the qubit frequency is first-order insensitive to flux noise, see Fig. 2(f). This point is often referred to as “the flux degeneracy point,” where flux qubits tend to have the optimal coherence time.

At this operation point, the potential energy may assume a single-well ($\gamma \geq 2$) or a double-well ($\gamma < 2$) profile. The single-well case shares some similarities with the transmon qubit, where the quadratic and quartic terms of the Hamiltonian determine the harmonicity and anharmonicity, respectively. The capacitively shunted flux qubit (CSFQ)62,74 was explored in this regime, demonstrating long coherence and moderately high anharmonicity. Note that as opposed to the transmon qubit, the anharmonicity of the CSFQ is “positive” ($\alpha > 0$). While the improvement in anharmonicity can be associated with reshaping the energy potential, the improved coherence over the first flux qubits can be attributed to the introduction of the capacitive shunt, similar to the modified Cooper-pair box leading to the transmon qubit.
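
The single-well versus double-well distinction can be verified from the quasi-1D potential $U(\phi) = -E_J\cos(2\phi + \varphi_e) - 2\gamma E_J\cos(\phi)$ at $\varphi_e = \pi$. Its curvature at $\phi = 0$ is $(2\gamma - 4)E_J$, which changes sign at $\gamma = 2$. The sketch below counts the local minima of $U$ on a grid for two hypothetical values of $\gamma$; the function names and grid size are illustrative assumptions.

```python
import numpy as np

def flux_qubit_potential(phi, gamma, E_J=1.0, phi_e=np.pi):
    """Quasi-1D flux-qubit potential (potential part of the Hamiltonian)."""
    return -E_J * np.cos(2 * phi + phi_e) - 2 * gamma * E_J * np.cos(phi)

def count_wells(gamma, n_points=2001):
    """Count local minima of the potential for phi in (-pi, pi)."""
    phi = np.linspace(-np.pi, np.pi, n_points)
    U = flux_qubit_potential(phi, gamma)
    # An interior grid point is a minimum if it lies below its left neighbor
    # and not above its right neighbor.
    is_min = (U[1:-1] < U[:-2]) & (U[1:-1] <= U[2:])
    return int(np.sum(is_min))

def curvature_at_origin(gamma, E_J=1.0):
    """Second derivative of the potential at phi = 0 for phi_e = pi."""
    return (2 * gamma - 4) * E_J

for gamma in (1.5, 2.5):          # hypothetical junction ratios
    print(f"gamma = {gamma}: wells = {count_wells(gamma)}, "
          f"U''(0) = {curvature_at_origin(gamma):+.1f} E_J")
```

For $\gamma < 2$ the two minima sit at $\cos\phi = \gamma/2$, which is where the circulating-current picture of the flux qubit originates.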

The double-well case obtained for $\gamma < 2$ was demonstrated and investigated much earlier.72,73 The intuitive picture based on circulating current states (hence the name persistent-current flux qubit, PCFQ) gives a satisfying physical description of the qubit degrees of freedom. However, from the perspective of a quantum engineer, the qubit properties are of more interest, even if sometimes we may lose physical intuition about the system in certain regimes, such as when $\gamma \approx 2$ and there are no clear circulating current states. The most important feature of the PCFQ is that its anharmonicity can be much greater than that of the transmon and CSFQ, and the transition matrix elements $|\langle 1|\hat{n}|0\rangle|$, $|\langle 1|\hat{\phi}|0\rangle|$ become considerably smaller given equivalent $E_J/E_C$. Therefore, a longer relaxation time can be expected. These features have been demonstrated even more prominently in its close relative, the fluxonium qubit.75

The flux qubit is a striking example that illustrates how dramatically one can engineer the qubit properties through the choice of various circuit parameters. The introduction of array junctions and the consequent biharmonic profile generates rich dynamics as well as broad applications. An extension of this idea is the fluxonium qubit, which has generated substantial interest recently, due partly to its capability of engineering the transition matrix elements to achieve millisecond T1 times, and due partly to the invention of novel gate schemes applicable to such well-protected qubits.76,77

Compared to flux qubits, which usually contain two or three array junctions,78 the number of array junctions in the fluxonium qubit is dramatically increased,75,79 in some cases, to the order of 100, see Fig. 2(g). Following the same quasi-1D approximation as for the flux qubit, the last term in Eq. (23) becomes $-N\gamma E_J\cos(\phi/N)$, where $N$ denotes the number of array junctions. For large $N$, the argument in the cosine term $\phi/N$ becomes sufficiently small that a second order expansion is a good approximation. This results in the fluxonium Hamiltonian

$$H \approx 4E_Cn^2 - E_J\cos(\phi + \varphi_e) + \frac{1}{2}E_L\phi^2, \tag{24}$$

where $E_L = (\gamma/N)E_J$ is the inductive energy of the effective inductance contributed by the junction array, often known as superinductance due to its large value.79–81 Therefore, we can treat the potential energy as a quadratic term modulated by a sinusoidal term, similar to that of an rf-SQUID type flux qubit.82 However, the kinetic inductance of the Josephson junction array is in general much larger than the geometric inductance of the wire in an rf-SQUID.

Depending on the relative magnitude of EJ and EL, the fluxonium system could involve plasmon states (in the same well) and fluxon states (in different wells). There are a variety of schemes to utilize them for quantum information processing. Generally, the spectrum of the transition between the lowest energy states is similar to that of the flux qubit, see Fig. 2(h). Both long coherence and high anharmonicity can be expected at the flux sweet spot.
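
A minimal sketch of how the fluxonium spectrum can be computed is given below. It diagonalizes $H = 4E_Cn^2 + \frac{1}{2}E_L\phi^2 - E_J\cos(\phi + \varphi_e)$ in the eigenbasis of the harmonic part, evaluating the cosine through the eigendecomposition of the truncated $\phi$ matrix. The parameter values, truncation dimension, and function name are assumptions for illustration only; energies are in GHz with $\hbar = 1$.

```python
import numpy as np

def fluxonium_levels(E_J, E_C, E_L, phi_e, dim=80, n_levels=3):
    """Lowest eigenenergies of the fluxonium Hamiltonian in a truncated
    harmonic-oscillator basis defined by E_C and E_L."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    phi = (2 * E_C / E_L) ** 0.25 * (a + a.T)          # phase operator

    # Harmonic part 4 E_C n^2 + E_L phi^2 / 2 is diagonal in this basis.
    h_lc = np.sqrt(8 * E_L * E_C) * np.diag(np.arange(dim) + 0.5)

    # cos(phi + phi_e) via the eigendecomposition of the (real symmetric) phi.
    lam, vecs = np.linalg.eigh(phi)
    cos_term = vecs @ np.diag(np.cos(lam + phi_e)) @ vecs.T

    H = h_lc - E_J * cos_term
    return np.linalg.eigvalsh(H)[:n_levels]

# Hypothetical parameters in GHz.
E_J, E_C, E_L = 4.0, 1.0, 1.0

levels_zero = fluxonium_levels(E_J, E_C, E_L, phi_e=0.0)
levels_half = fluxonium_levels(E_J, E_C, E_L, phi_e=np.pi)

w01_zero = levels_zero[1] - levels_zero[0]
w01_half = levels_half[1] - levels_half[0]
alpha_half = (levels_half[2] - levels_half[1]) - w01_half

print(f"w01 at zero flux : {w01_zero:.3f} GHz")
print(f"w01 at half flux : {w01_half:.3f} GHz")
print(f"anharmonicity at half flux: {alpha_half:.3f} GHz")
```

At half a flux quantum the two lowest states are split only by tunneling between the two wells, which is why the qubit frequency is lowest there and the anharmonicity is large and positive.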

Lastly, we want to point out a further extension—the 0–π qubit—which has even stronger topological protection from noise.83,84 However, the strongly suppressed sensitivity to external fluctuations also makes it hard to manipulate.

To generate entanglement between individual quantum systems, it is necessary to engineer an interaction Hamiltonian that connects degrees of freedom in those individual systems. In this section, we discuss the physical coupling mechanism and its representation in the qubit eigenbasis. The use of coupling to form two-qubit gates is discussed in Sec. IV.

1. Physical coupling: Capacitive and inductive

The Hamiltonian of two coupled systems takes a generic form

$$H = H_1 + H_2 + H_{\rm int}, \tag{25}$$

where $H_1$ and $H_2$ denote the Hamiltonians of the individual quantum systems, which could be any combination of the qubit circuits mentioned in Secs. II A and II B. The last term, $H_{\rm int}$, is the interaction Hamiltonian, which couples the variables of both systems. In superconducting circuits, the physical form of the coupling energy is either an electric or magnetic field (or a combination thereof).

To achieve capacitive coupling, a capacitor is placed between the voltage nodes of the two participating circuits, yielding an interaction Hamiltonian Hint of the form

$$H_{\mathrm{int}} = C_g V_1 V_2, \tag{26}$$

where Cg is the coupling capacitance and V1 (V2) is the voltage operator of the corresponding voltage node being connected. Figure 3(a) illustrates a realistic example of a direct capacitive coupling between the top nodes of two transmon qubits. Circuit quantization in the limit Cg ≪ C1, C2 yields

$$H = \sum_{i=1,2}\left[4E_{C,i}\,n_i^2 - E_{J,i}\cos(\phi_i)\right] + 4e^2\frac{C_g}{C_1 C_2}\,n_1 n_2, \tag{27}$$

where the expressions in brackets are the Hamiltonians of the two individual qubits [see Eq. (16)], and we take Vi = (2e/Ci)ni in Eq. (26). From Eq. (27), we see that the coupling energy depends on the coupling capacitance as well as the matrix elements of the voltage operators. The dependencies are bilinear in the perturbative limit (Cg ≪ C1, C2).
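As a rough numerical illustration of the coupling term, the sketch below evaluates the prefactor 4e²Cg/(C1C2) of the n1n2 term and converts it into an exchange rate. The conversion relies on an assumption that is not made in the text above: the standard transmon zero-point charge fluctuation, n_zpf = (EJ/8EC)^(1/4)/√2. The capacitances and the EJ/EC ratio are illustrative values, not parameters of a specific device.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge (C)
PLANCK_H = 6.62607015e-34   # Planck constant (J s)


def coupling_prefactor_hz(c_g, c_1, c_2):
    """Prefactor 4 e^2 C_g / (C_1 C_2) of the n1*n2 term, expressed in Hz."""
    return 4.0 * E_CHARGE**2 * c_g / (c_1 * c_2) / PLANCK_H


def n_zpf(ej_over_ec):
    """Zero-point charge fluctuation of a transmon (assumed standard result)."""
    return (ej_over_ec / 8.0) ** 0.25 / math.sqrt(2.0)


def exchange_rate_hz(c_g, c_1, c_2, ratio_1, ratio_2):
    """Estimate g/h = prefactor * n_zpf,1 * n_zpf,2 (valid for C_g << C_1, C_2)."""
    return coupling_prefactor_hz(c_g, c_1, c_2) * n_zpf(ratio_1) * n_zpf(ratio_2)


# Hypothetical device: 1 fF coupling capacitor between two 100 fF transmons.
g_hz = exchange_rate_hz(1e-15, 100e-15, 100e-15, 50.0, 50.0)
print(f"g/h ~ {g_hz / 1e6:.1f} MHz")
```

For these values the estimate lands in the tens-of-MHz range, and doubling Cg doubles the result, which reflects the bilinear dependence in the perturbative limit.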

To implement the coupling capacitance, one need only bring the edges of the capacitor pads into close proximity, as has been demonstrated in state-of-the-art planar designs.85 The coupling capacitance is determined by the planar capacitor geometry as well as the surrounding environment, such as the dielectric constant of the substrate and the ground plane proximity.

In the case of inductive coupling, a mutual inductance shared by two loops is the coupling mechanism, yielding an interaction Hamiltonian of the form

$$H_{\mathrm{int}} = M_{12} I_1 I_2, \tag{28}$$

where M12 denotes the mutual inductance and I1 (I2) is the current operator of the loop current. A typical example is two closely positioned (rf-SQUID type) flux qubits, as illustrated in Fig. 3(c). The system Hamiltonian can be expressed as

$$H = H_1 + H_2 + M_{12}\,I_{c1} I_{c2}\sin(\phi_1)\sin(\phi_2), \tag{29}$$

where the individual qubit Hamiltonians are identical to that of the fluxonium in Eq. (24), and the current operators, Ii = Ici sin(ϕi) with i ∈ {1, 2}, follow from the familiar DC Josephson relation for each junction, see Eq. (15). In this case, the strength of the inductive coupling energy depends on the mutual inductance as well as the matrix elements of the current operators.

To realize a mutual inductance, two looped circuits are brought into close proximity to one another; to strengthen the coupling, the loops may overlap,86 or may even share the same wire or Josephson junction inductor.87–90 In the case of a Josephson junction, and for certain metals, the inductance is dominated by “kinetic inductance” contributions, rather than solely geometric inductance.91,92 Kinetic inductance arises from the mechanical, inertial mass of the charge carriers, but is only practically witnessed in very high-conductance materials like superconductors. A primary feature of kinetic inductance is that its values can vastly exceed those of conventional geometric inductances, which are generally limited by electromagnetic considerations.79

2. Coupling axis: Transverse and longitudinal

Regardless of its physical realization, the effect of a coupling on system dynamics is determined by its form as represented in the eigenbasis of the individual systems. That is, it depends on how Hint appears in the representation spanned by the eigenbasis of H1 ⊗ H2.

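Let us start with the previous example of two capacitively coupled transmon qubits [Fig. 3(a)]. Using second quantization, the system Hamiltonian in Eq. (27) can be expressed as

$$H = \sum_{i=1,2}\left[\omega_i a_i^\dagger a_i + \frac{\alpha_i}{2}\,a_i^\dagger a_i^\dagger a_i a_i\right] - g\,(a_1 - a_1^\dagger)(a_2 - a_2^\dagger), \tag{30}$$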

where the expressions within brackets represent the Duffing oscillator Hamiltonians of the qubits and g is the coupling energy. Since we define V ∝ n ∝ i(a − a†), and consequently I ∝ ϕ ∝ (a + a†), the original n1n2-term becomes what is shown in Eq. (30). Such a coupling is called “transverse,” because the coupling Hamiltonian has nonzero matrix elements only at off-diagonal positions with respect to both oscillators, i.e., ⟨k|(ai − ai†)|k⟩ = 0 for any integer k and for i ∈ {1, 2}, and in this case, ⟨k ± 1|(ai − ai†)|k⟩ ≠ 0.

If we can ignore higher energy levels (k ≥ 2), either because of sufficient anharmonicity or through careful control protocols that ensure these levels never have influence, we may truncate the Hamiltonian in Eq. (30) to

$$H = \sum_{i=1,2}\frac{1}{2}\omega_i\sigma_{z,i} + g\,\sigma_{y,1}\sigma_{y,2}. \tag{31}$$

This is a Hamiltonian of two spins, coupled by an exchange interaction. As we will see in Sec. IV D 1, such a Hamiltonian is most commonly used in contemporary implementations and can generate various types of two-qubit entangling gates. Note that, more often, we see that the interaction term is expressed in σxσx instead of σyσy. The choice in the context here is arbitrary and does not change the dynamics. However, when both capacitive and inductive couplings are present in the system, both σxσx and σyσy may be needed. In this case, the voltage operator Vi ∝ i(ai − ai†) (reduced to σy after two-level approximation in the lab frame) is transverse to the current operator Ii ∝ (ai + ai†) (reduced to σx), and both of them may be transverse to the qubit. A similar example is demonstrated between a qubit and a resonator by Lu et al.93
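The statement that the choice between σyσy and σxσx does not change the dynamics can be checked numerically. The following sketch builds the two-spin Hamiltonian in the assumed form H = Σ (ωi/2)σz,i + g σ⊗σ (with ħ = 1) and compares the spectra for both choices; the frequencies and coupling are illustrative values.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def two_qubit_hamiltonian(w1, w2, g, coupling=SY):
    """H = (w1/2) sz(x)1 + (w2/2) 1(x)sz + g * coupling(x)coupling."""
    return (0.5 * w1 * np.kron(SZ, I2)
            + 0.5 * w2 * np.kron(I2, SZ)
            + g * np.kron(coupling, coupling))


w, g = 5.0, 0.01  # illustrative values in GHz, qubits on resonance
ev_yy = np.linalg.eigvalsh(two_qubit_hamiltonian(w, w, g, SY))
ev_xx = np.linalg.eigvalsh(two_qubit_hamiltonian(w, w, g, SX))

# The two middle eigenvalues span the single-excitation subspace.
print("exchange splitting:", ev_yy[2] - ev_yy[1])
print("same spectrum for xx and yy:", np.allclose(ev_xx, ev_yy))
```

On resonance, the single-excitation doublet is split by 2g, which is the energy scale that sets the speed of exchange-type (iSWAP-like) gates.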

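Transverse coupling can be engineered between a qubit and a harmonic oscillator, see Fig. 3(b). In this case, the Hamiltonian becomes

$$H_{\mathrm{JC}} = \frac{1}{2}\omega_q\sigma_z + \omega_r a^\dagger a + g\left(\sigma_+ a + \sigma_- a^\dagger\right), \tag{32}$$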

where ωq and ωr denote the qubit and resonator frequencies, and σ+ = |0⟩⟨1| and σ− = |1⟩⟨0| describe the processes of exciting and de-exciting the qubit, respectively. Here, we have assumed that the coupling is weak compared with the bare frequencies, i.e., g ≪ ωq, ωr (the rotating-wave approximation), hence ignoring the double (de)excitation terms proportional to σ+a† and σ−a, which under typical operation regimes oscillate sufficiently fast to average to zero. The Hamiltonian in Eq. (32) is the standard model used for describing how a two-level atom interacts with a resonant cavity that houses it. Such a structure is also known as cavity quantum electrodynamics (cQED), and it is extended to the circuit version here. It has many useful applications in superconducting quantum information architectures, such as high-fidelity readout,94 see Sec. V, cavity buses,95 quantum memory,96,97 quantum computation with cat states,98–100 etc.
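A minimal numerical sketch of this qubit-resonator model is given below. It assumes the Jaynes-Cummings form H = (ωq/2)σz + ωr a†a + g(σ+a + σ−a†) with ħ = 1, in which the σz = +1 state is the higher-energy qubit state, and it truncates the resonator to a small Fock space (the `n_fock` parameter is our choice). The parameter values are illustrative.

```python
import numpy as np


def jaynes_cummings(wq, wr, g, n_fock=6):
    """Jaynes-Cummings Hamiltonian in a truncated Fock space (hbar = 1)."""
    sz = np.diag([1.0, -1.0])
    sp = np.array([[0.0, 1.0], [0.0, 0.0]])  # sigma_+: raises the sigma_z eigenvalue
    sm = sp.T                                 # sigma_-
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)  # annihilation operator
    iq, ir = np.eye(2), np.eye(n_fock)
    return (0.5 * wq * np.kron(sz, ir)
            + wr * np.kron(iq, a.T @ a)
            + g * (np.kron(sp, a) + np.kron(sm, a.T)))


w, g = 5.0, 0.05  # illustrative values in GHz, qubit and resonator on resonance
levels = np.linalg.eigvalsh(jaynes_cummings(w, w, g))

# levels[0] is the ground state; levels[1:3] form the one-excitation doublet.
vacuum_rabi_splitting = levels[2] - levels[1]
print("vacuum Rabi splitting:", vacuum_rabi_splitting)
```

On resonance, the one-excitation doublet is split by 2g (the vacuum Rabi splitting). Because the excitation-conserving coupling only mixes states within a manifold, the low-lying levels are unaffected by the Fock-space truncation.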

Here, we briefly mention the use of a cavity or resonator to mediate coupling between two qubits, which may be physically well-separated (≈1 cm). Since most superconducting resonators are in the GHz frequency range, they can be made much longer than any dimension of a qubit circuit (≈1 mm). One can use such a resonator to mediate coupling between two or more otherwise noninteracting qubits. An example is shown in Fig. 3(b), where two transmon qubits are both capacitively coupled to the center resonator. The two-level system Hamiltonian is

$$H = \omega_r a^\dagger a + \sum_{i=1,2}\left[\frac{1}{2}\omega_i\sigma_{z,i} + g_{ir}\left(\sigma_{+,i}\,a + \sigma_{-,i}\,a^\dagger\right)\right]. \tag{33}$$

It can be shown that in the dispersive limit, i.e., gir ≪ |ωi − ωr|, the resonator can (after proper transformation and approximation) be treated as an isolated system, and the composite system simplifies to two transversely coupled qubits, see Eq. (31).

We now turn to the previous example of two inductively coupled flux qubits, see Fig. 3(c). Assume that the double-well potential [Fig. 2(g)] has a relatively high interwell barrier, which leads to an exponentially small qubit transition frequency at the energy degeneracy point (Φe = π). Around this degeneracy point, the off-diagonal matrix element of sin(ϕ) is zero, i.e., the ground and excited states are localized in different wells and ⟨1|sin(ϕ)|1⟩ = −⟨0|sin(ϕ)|0⟩ ≠ 0. We can then rewrite the Hamiltonian in Eq. (29) as

$$H = \sum_{i=1,2}\frac{1}{2}\omega_i\sigma_{z,i} + g\,\sigma_{z,1}\sigma_{z,2}. \tag{34}$$

Now, the coupling axis is the same as the qubit quantization axes and therefore termed “longitudinal coupling.” Note, however, that the physical σxσx and σzσz couplings can change in the qubit frame.

Longitudinal coupling is an important type of interaction, because it can generate entanglement without energy exchange. Moreover, it is a necessary ingredient in quantum annealing, where certain hard combinatorial optimization problems can be mapped onto the Ising Hamiltonian in Eq. (34), such that finding its ground state solves the problem.

An intermediate qubit mode may also be used as a coupler in the longitudinal case. In Fig. 3(d), an additional rf-SQUID is used to mediate the coupling. The coupling strength can be tuned by the flux bias of the coupler SQUID.101 Note that a tunable coupler may also be realized in a structure with capacitive couplings.63 A tunable coupler is useful because it provides a wide range of coupling strengths,102 a high on-off ratio103 for reducing gate error-rates, and more ways of achieving high-fidelity entangling gates.67,104–106 The trade-off is an additional control line.

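In addition to the purely transverse and longitudinal qubit-qubit interactions, there are also examples of mixed types of interaction Hamiltonians,107 e.g.,

$$H = \frac{1}{2}\omega_q\sigma_z + \omega_r a^\dagger a + g\,\sigma_z\left(a + a^\dagger\right), \tag{35}$$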

which are longitudinal with respect to a qubit, but transverse with respect to a harmonic oscillator in a qubit-resonator system. Such a model is called longitudinal but one should note that it is only longitudinal to one participating system. It is hard to engineer physically longitudinal coupling with respect to a harmonic oscillator, since either the E-field (V) or the B-field (I) is transverse with respect to the eigen field of the harmonic oscillator. Note, however, that a transversal model such as in Eq. (32) may be transformed into a longitudinal one in certain operating regimes, see Sec. V.

In some applications, such as for quantum annealing, both longitudinal and transverse couplings are desired (σzσz coupling for mapping the problem and σxσx coupling for enhancing the annealing performance) and require independent control.

Random, uncontrollable physical processes in the qubit control and measurement equipment or in the local environment surrounding the quantum processor are sources of noise that lead to decoherence and reduce the operational fidelity of the qubits. In this section, we introduce the basics of noise leading to decoherence in superconducting circuits, and we discuss coherent control methods to mitigate certain types of noise.

In a closed system, the dynamical evolution of a qubit state is deterministic. That is, if we know the starting state of the qubit and its Hamiltonian, then we can predict the state of the qubit at any time in the future. However, in open systems, the situation changes. The qubit now interacts with uncontrolled degrees of freedom in its environment, which we refer to as fluctuations or noise. In the presence of noise, as time progresses, the qubit state looks less and less like the state we would have predicted and, eventually, the state is lost. There are many different sources of noise that affect quantum systems, and they can be categorized into two primary types: systematic noise and stochastic noise.

1. Systematic noise

Systematic noise arises from a process that is traceable to a fixed control or readout error. For example, we apply a microwave pulse to the qubit that we believe will impart a 180° rotation. However, the control field is not tuned properly and, rather than rotating the qubit 180°, the pulse slightly over-rotates or under-rotates the qubit by a fixed amount. The underlying error is “systematic,” and it therefore leads to the same rotation error each time it is applied. However, when such erroneous pulses are used in practice in a variety of control sequences, the observed results may appear to be influenced by random noise. This is because the pulse is generally not applied in the same way for each experiment: it could be applied a different number of times, interspersed with different pulses in different orders, and therefore generally differs from experiment to experiment. However, once systematic errors are identified, they can generally be corrected through proper calibration or the use of improved hardware.

2. Stochastic noise

The second type of noise is stochastic noise, arising from random fluctuations of parameters that are coupled to our qubit.108 For example, thermal noise of a 50 Ω resistor in the control lines leading to the qubit will have voltage and current fluctuations—Johnson noise—with a noise power that is proportional to both temperature and bandwidth. Or, the oscillator that provides the carrier for a qubit control pulse may have amplitude or phase fluctuations. Additionally, randomly fluctuating electric and magnetic fields in the local qubit environment—e.g., on the metal surface, on the substrate surface, at the metal-substrate interface, or inside the substrate—can couple to the qubit. This creates unknown and uncontrolled fluctuations of one or more qubit parameters, and this leads to qubit decoherence.
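To give a sense of scale for the Johnson noise mentioned above, the sketch below evaluates the classical expression V_rms = √(4 kB T R B) for a 50 Ω resistor. This formula is an assumption on our part (it is not written out in the text) and holds only in the classical limit hf ≪ kBT, so the temperatures and bandwidth are chosen to stay in that regime.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def johnson_voltage_rms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS open-circuit thermal noise voltage, sqrt(4 k_B T R B)."""
    return math.sqrt(4.0 * K_B * temperature_k * resistance_ohm * bandwidth_hz)


# Illustrative comparison: room temperature versus a 4 K stage, 1 GHz bandwidth.
for temp_k in (300.0, 4.0):
    v_rms = johnson_voltage_rms(50.0, temp_k, 1e9)
    print(f"T = {temp_k:g} K: V_rms = {v_rms * 1e6:.2f} uV")
```

The noise power (proportional to V_rms²) scales linearly with both temperature and bandwidth, which is one reason control lines are attenuated and filtered at the cold stages of the refrigerator.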

3. Noise strength and qubit susceptibility

The degree to which a qubit is affected by noise is related to the amount of noise impinging on the qubit, and the qubit's susceptibility to that noise. The former is often a question of materials science and fabrication; that is, can we make devices with lower levels of noise. Or, it may be related to the quality of the control electronics and cryogenic engineering to limit the levels of noise on the control lines that necessarily connect to the qubits to control them. The latter—qubit susceptibility—is a question of qubit design. Qubits can be designed to trade-off sensitivity to one type of noise at the expense of increased sensitivity to other types of noise. Thus, materials science, fabrication engineering, electronics design, cryogenic engineering, and qubit design all play a role in creating devices with high coherence. In general, one should strive to eliminate the sources of noise, and then design qubits that are insensitive to the residual noise.

The qubit response to noise depends on how the noise couples to it, either through a longitudinal or a transverse coupling as referenced to the qubit quantization axis. This can be visualized using a Bloch sphere picture of the qubit state, as illustrated in Fig. 4 and discussed in detail in Sec. III B.

1. Bloch sphere representation

The “Bloch sphere” is a unit sphere used to represent the quantum state of a two-level system (qubit). Figure 4(a) shows a Bloch sphere with a “Bloch vector” representing the state |ψ⟩ = α|0⟩ + β|1⟩. If we visualize the Bloch sphere as the planet Earth, then by convention, the north pole represents state |0⟩ and the south pole state |1⟩. For pure quantum states such as |ψ⟩, the Bloch vector is of unit length, |α|² + |β|² = 1, connecting the center of the sphere to a point on its surface.

The z-axis connects the north and south poles. It is called the “longitudinal axis,” since it represents the “qubit quantization axis” for the states |0⟩ and |1⟩ in the qubit eigenbasis. In turn, the xy plane is the “transverse plane” with “transverse axes” x and y. In this Cartesian coordinate system, the unit Bloch vector a = (sin θ cos ϕ, sin θ sin ϕ, cos θ) is represented using the polar angle 0 ≤ θ ≤ π and the azimuthal angle 0 ≤ ϕ < 2π, as illustrated in Fig. 4(a). Following our convention, state |0⟩ at the north pole is associated with +1, and state |1⟩ (the south pole) with −1. We can similarly represent the quantum state using the angles θ and ϕ,

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle. \tag{36}$$

The Bloch vector is stationary on the Bloch sphere in the “rotating frame picture.” If state |1⟩ has a higher energy than state |0⟩ (as it generally does in superconducting qubits), then in a stationary frame, the Bloch vector would precess around the z-axis at the qubit frequency (E1 − E0)/ħ. Without loss of generality (and much easier to visualize), we instead “choose” to view the Bloch sphere in a reference frame where the x and y-axes also rotate around the z-axis at the qubit frequency. In this “rotating frame,” the Bloch vector appears stationary as written in Eq. (36). The rotating frame will be described in detail in Sec. IV D 1 in the context of single-qubit gates.

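For completeness, we note that the density matrix ρ = |ψ⟩⟨ψ| for a pure state |ψ⟩ is equivalently

$$\begin{aligned}
\rho &= \frac{1}{2}\left(I + \vec{a}\cdot\vec{\sigma}\right) = \frac{1}{2}\begin{pmatrix} 1+\cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & 1-\cos\theta \end{pmatrix} && (37)\\
&= \begin{pmatrix} \cos^2\frac{\theta}{2} & e^{-i\phi}\cos\frac{\theta}{2}\sin\frac{\theta}{2} \\ e^{i\phi}\cos\frac{\theta}{2}\sin\frac{\theta}{2} & \sin^2\frac{\theta}{2} \end{pmatrix} && (38)\\
&= \begin{pmatrix} |\alpha|^2 & \alpha\beta^* \\ \alpha^*\beta & |\beta|^2 \end{pmatrix}, && (39)
\end{aligned}$$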

where I is the identity matrix, and σ = [σx, σy, σz] is a vector of Pauli matrices. If the Bloch vector a is a unit vector, then ρ represents a pure state |ψ⟩ and Tr(ρ²) = 1. More generally, the Bloch sphere can be used to represent “mixed states,” for which |a| < 1; in this case, the Bloch vector terminates at points “inside” the unit sphere, and 1/2 ≤ Tr(ρ²) < 1. To summarize, the surface of the unit sphere represents pure states, and its interior represents mixed states.6
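The correspondence between Bloch vectors and density matrices can be made concrete with a short sketch. It uses the relation ρ = (I + a·σ)/2 and the identity Tr(ρ²) = (1 + |a|²)/2 for a qubit; the function names and the example angles are ours.

```python
import numpy as np

PAULI = np.array([
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
], dtype=complex)


def density_matrix(bloch_vec):
    """rho = (I + a . sigma) / 2 for a Bloch vector a with |a| <= 1."""
    a = np.asarray(bloch_vec, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", a, PAULI))


def bloch_vector(rho):
    """Inverse map: a_i = Tr(rho sigma_i)."""
    return np.real([np.trace(rho @ s) for s in PAULI])


def purity(rho):
    """Tr(rho^2): 1 for pure states, smaller for mixed states."""
    return float(np.real(np.trace(rho @ rho)))


theta, phi = np.pi / 3, np.pi / 4  # arbitrary example angles
a_pure = np.array([np.sin(theta) * np.cos(phi),
                   np.sin(theta) * np.sin(phi),
                   np.cos(theta)])

print("pure state purity: ", purity(density_matrix(a_pure)))
print("mixed state purity:", purity(density_matrix(0.5 * a_pure)))
```

Shrinking the Bloch vector to half its length lowers the purity from 1 to 0.625, and the fully depolarized state at the center of the sphere has purity 1/2.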

2. Bloch-Redfield model of decoherence

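Within the standard Bloch-Redfield109–111 picture of two-level system dynamics, noise sources weakly coupled to the qubits have short correlation times with respect to the system dynamics. In this case, the relaxation processes are characterized by two rates (see Fig. 4),

$$\text{longitudinal relaxation rate:}\quad \Gamma_1 \equiv \frac{1}{T_1}, \tag{40}$$

$$\text{transverse relaxation rate:}\quad \Gamma_2 \equiv \frac{1}{T_2} = \frac{\Gamma_1}{2} + \Gamma_\varphi, \tag{41}$$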

which contains the pure dephasing rate Γφ. We note that the definition of Γ2 as a sum of rates presumes that the individual decay functions are exponential, which occurs for Lorentzian noise spectra (centered at ω = 0) such as white noise (short correlation times) with a high-frequency cutoff.

The impact of noise on the qubit can be visualized on the Bloch sphere in Fig. 4(a). For an initial state (t = 0)

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{42}$$

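the Bloch-Redfield density matrix ρBR for the qubit is written112,113

$$\rho_{\mathrm{BR}} = \begin{pmatrix} 1 + (|\alpha|^2 - 1)\,e^{-\Gamma_1 t} & \alpha\beta^*\,e^{i\delta\omega t}\,e^{-\Gamma_2 t} \\ \alpha^*\beta\,e^{-i\delta\omega t}\,e^{-\Gamma_2 t} & |\beta|^2\,e^{-\Gamma_1 t} \end{pmatrix}. \tag{43}$$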

There are a few important distinctions between Eqs. (43) and (39), which we list here and then describe in more detail in Secs. III B 2 a–III B 2 c.

  • First, we have introduced the “longitudinal decay function” exp(−Γ1t), which accounts for longitudinal relaxation of the qubit.

  • Second, we introduced the “transverse decay function” exp(−Γ2t), which accounts for transverse decay of the qubit.

  • Third, we have introduced an explicit phase accrual exp(iδωt), where δω = ωq − ωd, which generalizes the Bloch sphere picture to account for cases where the qubit frequency ωq differs from the rotating-frame frequency ωd, as we will see later when discussing measurements of T2 using Ramsey interferometry,114,115 and in Sec. IV D 1, in the context of single-qubit gates.

  • Fourth, we have constructed the matrix such that for t ≫ (T1, T2), the upper-left matrix element will approach a unit value, indicating that all populations relax to the ground state, while the other three matrix elements decay to zero. This is related to the assumption that the environmental temperature is low enough that thermal excitations of the qubit from the ground to the excited state rarely occur.
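The four points above can be checked with a small sketch of the Bloch-Redfield density matrix. We assume the matrix elements ρ00 = 1 + (|α|² − 1)exp(−Γ1t), ρ11 = |β|²exp(−Γ1t), and ρ01 = αβ* exp(iδωt) exp(−Γ2t), which is our reading of Eq. (43); the relaxation times used in the example are illustrative.

```python
import numpy as np


def rho_bloch_redfield(alpha, beta, t, gamma_1, gamma_2, delta_omega=0.0):
    """Bloch-Redfield density matrix for the initial state alpha|0> + beta|1>."""
    decay_1 = np.exp(-gamma_1 * t)           # longitudinal decay function
    decay_2 = np.exp(-gamma_2 * t)           # transverse decay function
    phase = np.exp(1j * delta_omega * t)     # phase accrual from frame detuning
    return np.array([
        [1 + (abs(alpha) ** 2 - 1) * decay_1,
         alpha * np.conj(beta) * phase * decay_2],
        [np.conj(alpha) * beta * np.conj(phase) * decay_2,
         abs(beta) ** 2 * decay_1],
    ])


# Equal superposition with illustrative T1 = 85 us and T2 = 95 us (times in us).
alpha = beta = 1 / np.sqrt(2)
for t_us in (0.0, 100.0, 5000.0):
    rho = rho_bloch_redfield(alpha, beta, t_us, 1 / 85.0, 1 / 95.0)
    print(t_us, np.round(rho.real, 4).tolist())
```

The trace stays equal to one at all times, the off-diagonal elements decay at the rate Γ2, and for t ≫ (T1, T2) the matrix approaches the ground-state projector.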

a. Longitudinal relaxation

The longitudinal relaxation rate Γ1 describes depolarization along the qubit quantization axis, often referred to as “energy decay” or “energy relaxation.” In this language, a qubit with polarization p = 1 is entirely in the ground state (|0⟩) at the north pole, p = −1 is entirely in the excited state (|1⟩) at the south pole, and p = 0 is a completely depolarized mixed state at the center of the Bloch sphere.

As illustrated in Fig. 4(b), longitudinal relaxation is caused by “transverse noise,” via the x- or y-axis, with the intuition that off-diagonal elements of an interaction Hamiltonian are needed to connect and drive transitions between states |0⟩ and |1⟩.

Depolarization occurs due to energy exchange with an environment, generally leading to both an “up transition rate” Γ1↑ (excitation from |0⟩ to |1⟩), and a “down transition rate” Γ1↓ (relaxation from |1⟩ to |0⟩). Together, these form the longitudinal relaxation rate Γ1:

$$\Gamma_1 \equiv \frac{1}{T_1} = \Gamma_{1\downarrow} + \Gamma_{1\uparrow}. \tag{44}$$

T1 is the 1/e decay time in the exponential decay function in Eq. (43), and it is the characteristic time scale over which the qubit population will relax to its steady-state value. For superconducting qubits, this steady-state value is generally the ground state, due to Boltzmann statistics and typical operating conditions. Boltzmann equilibrium statistics lead to the “detailed balance” relationship Γ1↑ = exp(−ħωq/kBT) Γ1↓, where T is the temperature and kB is the Boltzmann constant, with an equilibrium qubit polarization approaching p = tanh(ħωq/2kBT). Typical qubits are designed at frequency ωq/2π ≈ 5 GHz and are operated at dilution refrigerator temperatures T ≈ 20 mK. In this limit, the up-rate Γ1↑ is exponentially suppressed by the Boltzmann factor exp(−ħωq/kBT), and so only the down-rate Γ1↓ contributes significantly, relaxing the population to the ground state. Thus, qubits generally spontaneously lose energy to their cold environment, but the environment rarely introduces a qubit excitation. As a result, the equilibrium polarization approaches unity [see Eq. (43)].116,117
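To make the size of the Boltzmann suppression concrete, the sketch below evaluates the detailed-balance factor exp(−hf/kBT) and the equilibrium polarization tanh(hf/2kBT) for the typical values quoted above (5 GHz, 20 mK). It assumes the qubit is in equilibrium with a bath at the stated temperature, which real devices only approximate.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck constant (J s)
K_B = 1.380649e-23         # Boltzmann constant (J/K)


def boltzmann_factor(freq_hz, temperature_k):
    """exp(-h f / k_B T): ratio of up-rate to down-rate under detailed balance."""
    return math.exp(-PLANCK_H * freq_hz / (K_B * temperature_k))


def equilibrium_polarization(freq_hz, temperature_k):
    """p = tanh(h f / 2 k_B T); p approaches 1 for a cold environment."""
    return math.tanh(PLANCK_H * freq_hz / (2.0 * K_B * temperature_k))


factor = boltzmann_factor(5e9, 0.02)
polarization = equilibrium_polarization(5e9, 0.02)
print(f"Boltzmann factor: {factor:.2e}")
print(f"equilibrium polarization: {polarization:.6f}")
```

At these values hf/kBT is about 12, so the up-rate is smaller than the down-rate by roughly five to six orders of magnitude, consistent with the statement that the equilibrium polarization approaches unity.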

Only noise at the qubit frequency mediates qubit transitions, whether absorption or emission, and this noise is generally “well behaved” (short correlation time, many modes weakly coupled to qubit, no divergences) around the qubit frequency for superconducting qubits. The intuition is that qubit-transition linewidths are relatively narrow in frequency, and so the noise generally does not vary much over this narrow frequency range. Although there are a few notable exceptions, for example, qubit decay in the presence of hot quasiparticles,118–120 which can lead to nonexponential decay functions, longitudinal depolarization measurements generally present exponential decay functions consistent with the Bloch-Redfield picture.

An example of a T1 measurement is shown in Fig. 5(a). The qubit is prepared in its excited state using an Xπ-pulse, and then left to spontaneously decay to the ground state for a time τ, after which the qubit is measured. A single measurement will project the quantum state into either state |0⟩ or state |1⟩, with probabilities that correspond to the qubit polarization. To make an estimate of this polarization, one needs to identically prepare the qubit and repeat the experiment many times. This is analogous to flipping a coin: any single flip will yield heads or tails, but the probability of obtaining heads or tails can be estimated by flipping the coin many times and taking the ensemble average. The resulting exponential decay has a characteristic time T1 = 85 μs.
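The coin-flip picture of a T1 measurement can be sketched as a simple simulation. Each shot is modeled as a Bernoulli trial with excited-state probability exp(−τ/T1), ignoring state-preparation and readout errors (an idealization), and T1 is recovered from a log-linear fit. The delays, shot count, and seed are illustrative choices.

```python
import numpy as np


def simulate_t1_experiment(t1_us, delays_us, shots, rng):
    """Return the measured excited-state fraction at each delay."""
    p_excited = np.exp(-np.asarray(delays_us) / t1_us)
    counts = rng.binomial(shots, p_excited)  # each shot projects onto |0> or |1>
    return counts / shots


def estimate_t1(delays_us, excited_fraction):
    """Log-linear least-squares fit of exp(-t/T1); assumes all fractions > 0."""
    slope, _ = np.polyfit(delays_us, np.log(excited_fraction), 1)
    return -1.0 / slope


rng = np.random.default_rng(seed=1)
delays = np.linspace(0.0, 200.0, 21)  # delays in microseconds
fractions = simulate_t1_experiment(85.0, delays, shots=20000, rng=rng)
t1_estimate = estimate_t1(delays, fractions)
print(f"estimated T1 ~ {t1_estimate:.1f} us")
```

With fewer shots per delay, the scatter of the measured fractions grows as the inverse square root of the shot count, and the fitted T1 becomes correspondingly less certain.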

b. Pure dephasing

The “pure dephasing” rate Γφ describes depolarization in the xy plane of the Bloch sphere. It is referred to as “pure dephasing,” to distinguish it from other phase-breaking processes such as energy excitation or relaxation.

As illustrated in Fig. 4(c), pure dephasing is caused by “longitudinal noise” that couples to the qubit via the z-axis. Such longitudinal noise causes the qubit frequency ωq to fluctuate, such that it is no longer equal to the rotating frame frequency ωd, and causes the Bloch vector to precess forward or backward in the rotating frame. Intuitively, we can imagine identically preparing several instances of the Bloch vector along the x-axis. For each instance, the stochastic fluctuations of qubit frequency will result in a different precession frequency, resulting in a net fanout of the Bloch vector in the xy plane. This eventually leads to a complete depolarization of the azimuthal angle ϕ. Note that this stochastic effect will be captured in the transverse relaxation rate Γ2 (Sec. III B 2 c); it is “not” the deterministic term exp(±iδωt) that appears in Eq. (43), which represents intentional detuning of the qubit reference frame.

There are a few important distinctions between pure dephasing and energy relaxation. First, in contrast to energy relaxation, pure dephasing is not a resonant phenomenon; noise at any frequency can modify the qubit frequency and cause dephasing. Thus, qubit dephasing is subject to broadband noise. Second, since pure dephasing is elastic (there is no energy exchange with the environment), it is in principle “reversible.” That is, the dephasing can be “undone”—with quantum information being preserved—through the application of unitary operations, e.g., dynamical decoupling pulses,78 see Sec. III D 2.

The degree to which the quantum information can be retained depends on many factors, including the bandwidth of the noise, the rate of dephasing, the rate at which unitary operations can be performed, etc. This should be contrasted with spontaneous energy relaxation, which is an “irreversible” process. Intuitively, once the qubit emits energy to the environment and its myriad uncontrollable modes, the quantum information is essentially lost with no hope for its recovery and reconstitution back into the qubit.

c. Transverse relaxation

The transverse relaxation rate Γ2 = Γ1/2 + Γφ describes the loss of coherence of a superposition state, for example (1/√2)(|0⟩ + |1⟩), pointed along the x-axis on the equator of the Bloch sphere as illustrated in Fig. 4(d). Decoherence is caused in part by longitudinal noise, which fluctuates the qubit frequency and leads to pure dephasing Γφ (red). It is also caused by transverse noise, which leads to energy relaxation of the excited-state component of the superposition state at a rate Γ1 (blue). Such a relaxation event is also a phase-breaking process, because once it occurs, the Bloch vector points to the north pole, |0⟩, and there is no longer any knowledge of which direction the Bloch vector had been pointing along the equator; the relative phase of the superposition state is lost.

Transverse relaxation T2 can be measured using Ramsey interferometry, as shown and described in Fig. 5(b). The protocol positions the Bloch vector on the equator using an Xπ/2-pulse. Typically, the carrier frequency of this pulse is slightly detuned from the qubit frequency by an amount δω. As a result, the Bloch vector will precess around the z-axis at a rate δω. This is done for convenience, so that the resulting Ramsey measurement will oscillate, making it easier to analyze. After precessing for a time τ, a second Xπ/2-pulse projects the Bloch vector back onto the z-axis. Repeated measurements are made to take an ensemble-averaged estimate of the qubit polarization as a function of τ. The resulting oscillations in Fig. 5(b) feature an approximately exponential decay function with time T2* = 98 μs. The “*” indicates that the Ramsey experiment is sensitive to “inhomogeneous broadening.” That is, it is highly sensitive to quasistatic, low-frequency fluctuations that are constant within one experimental trial, but vary from trial to trial, e.g., due to 1/f-type noise. This sensitivity to quasistatic noise is related to the corresponding N = 0 noise filter function shown in Fig. 5(d) being centered at zero frequency, as described in more detail in Sec. III D 2.

The Hahn echo shown in Fig. 5(c) is an experiment that is less sensitive to quasistatic noise. By placing a Yπ-pulse at the center of a Ramsey interferometry experiment, the quasistatic contributions to dephasing can be “refocused,” leaving an estimate T2E that is less sensitive to inhomogeneous broadening mechanisms. The pulses are generally chosen to be resonant with the qubit transition for a Hahn echo, since any frequency detuning would be nominally refocused anyway. The resulting decay function in Fig. 5(c) is essentially exponential with time T2E = 120 μs.

With the known T1 and T2 times, one can infer the pure dephasing time Tφ from Eq. (41), provided the decay functions are exponential. In superconducting qubits, however, the broadband dephasing noise (e.g., flux noise, charge noise, critical-current noise, …) tends to exhibit a 1/f-like power spectrum. Such noise is singular near ω = 0, has long correlation times, and generally does not fall within the Bloch-Redfield description. The decay function of the off-diagonal terms in Eq. (43) is generally nonexponential, and for such cases, the simple expression in Eq. (41) is not applicable.
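When the decay functions are exponential, the inference of the pure dephasing time amounts to solving 1/T2 = 1/(2T1) + 1/Tφ for Tφ. The helper below is a minimal sketch of that arithmetic; it is only meaningful within the Bloch-Redfield description, and using the echo time T2E as the T2 input is our illustrative choice.

```python
def pure_dephasing_time(t1, t2):
    """Solve 1/T2 = 1/(2 T1) + 1/T_phi for T_phi (exponential decays assumed)."""
    rate = 1.0 / t2 - 1.0 / (2.0 * t1)
    if rate < 0:
        raise ValueError("T2 cannot exceed 2*T1")
    return float("inf") if rate == 0 else 1.0 / rate


# Illustrative inputs: T1 = 85 us and the echo time T2E = 120 us quoted for Fig. 5.
t_phi = pure_dephasing_time(85.0, 120.0)
print(f"T_phi ~ {t_phi:.0f} us")
```

For these inputs the result is about 408 μs. The function returns infinity when T2 reaches its upper bound of 2T1, and it rejects inputs with T2 > 2T1, which cannot occur in this model.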

3. Modification due to 1/f-type noise

If we assume that the qubit is coupled to many independent fluctuators, then, regardless of their individual statistics, they will in concert generate noise with a Gaussian distribution due to the central limit theorem. We therefore say that the longitudinal fluctuations exhibit Gaussian-distributed 1/f noise. For 1/f noise spectra, the phase decay function is itself a Gaussian exp[−(t/Tφ,G)²], where we write Tφ,G to distinguish it from Tφ used in Eq. (41). Furthermore, this function is separable from the T1-type exponential decay, because the T1-noise remains regular at the qubit frequency. The density matrix in Eq. (43) becomes, following Refs. 78 and 112,

$$\rho(t) = \begin{pmatrix} 1 + (|\alpha|^2 - 1)\,e^{-\Gamma_1 t} & \alpha\beta^*\,e^{i\delta\omega t}\,e^{-\Gamma_1 t/2}\,e^{-\chi_N(t)} \\ \alpha^*\beta\,e^{-i\delta\omega t}\,e^{-\Gamma_1 t/2}\,e^{-\chi_N(t)} & |\beta|^2\,e^{-\Gamma_1 t} \end{pmatrix}, \tag{45}$$

where the decay function exp[−χN(t)] contains the “coherence function” χN(t), which generalizes pure dephasing to include nonexponential decay functions. As we shall see later, the subscript N labeling the decay function refers to the number of π-pulses used to refocus the low-frequency noise, which impacts the form of the decay function. Because the function is no longer purely exponential, we cannot formally write the transverse relaxation decay function as exp(−t/T2). However, an exponential decay remains a practically reasonable approximation for Tφ ≳ T1. We also note that the energy decay component of the transverse relaxation is exp(−t/2T1), and so T2 can never be larger than 2T1. In the absence of pure dephasing, the maximum T2 = 2T1 is reached.

As an example, consider the Ramsey interferometry data in Fig. 5(b). Since the dephasing is relatively weak, the transverse relaxation function exp(−t/T2) is a reasonable fit and yields T2 = 95 μs. However, using the value T1 = 85 μs from Fig. 5(a) and dividing out exp(−t/2T1) from the data in Fig. 5(b), the remaining pure dephasing decay function is shown in Fig. 5(d) and assumes a Gaussian envelope exp[−χN(t)] = exp[−(t/Tφ,G)²], with Tφ,G = 98 μs. The Hahn echo data in Fig. 5(c) may be treated similarly.
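The origin of a Gaussian envelope can be illustrated with a simplified model in which the frequency detuning is constant within one trial but Gaussian-distributed from trial to trial. This quasistatic picture is our simplification of 1/f noise, not the full treatment; in it, averaging cos(δ·t) over detunings with standard deviation σ gives exp(−σ²t²/2), i.e., Tφ,G = √2/σ.

```python
import numpy as np


def ramsey_envelope_quasistatic(times, sigma, n_samples, rng):
    """Average cos(delta * t) over Gaussian-distributed static detunings delta."""
    deltas = rng.normal(0.0, sigma, size=n_samples)
    return np.cos(np.outer(times, deltas)).mean(axis=1)


t_phi_g = 98.0                   # us, the Gaussian dephasing time quoted for Fig. 5
sigma = np.sqrt(2.0) / t_phi_g   # rad/us, chosen so the envelope is exp(-(t/T)^2)
times = np.linspace(0.0, 200.0, 11)

rng = np.random.default_rng(seed=0)
simulated = ramsey_envelope_quasistatic(times, sigma, 100_000, rng)
analytic = np.exp(-(times / t_phi_g) ** 2)
max_deviation = np.max(np.abs(simulated - analytic))
print("max deviation from Gaussian envelope:", max_deviation)
```

The Monte Carlo average follows the Gaussian envelope to within sampling error, in contrast to the exponential decay produced by short-correlation-time noise.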

For completeness, in addition to 1/f dephasing mechanisms, we note that there are also “white” pure dephasing mechanisms, which give rise to an exponential decay function for the dephasing component of T2. One common example is dephasing due to the shot noise of residual photons in the readout resonator coupled to superconducting qubits, as we discuss in Sec. III C 3.

4. Noise power spectral density (PSD)

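The frequency distribution of the noise power for a stationary noise source λ is characterized by its PSD Sλ(ω),

$$S_\lambda(\omega) = \int_{-\infty}^{\infty} d\tau\,\langle\lambda(\tau)\lambda(0)\rangle\,e^{-i\omega\tau}. \tag{46}$$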

The Wiener-Khintchine theorem states that the PSD is the Fourier transform of the autocorrelation function cλ(τ) = ⟨λ(τ)λ(0)⟩ of the noise source λ. Since the integration limits are (−∞, ∞), this is the bilateral PSD. Symmetrizing the PSD allows one to consider only positive frequencies, which is termed a unilateral PSD. Both unilateral and bilateral PSDs are used, often with the same notation, and so one needs to know how the PSD is defined, to keep track of the factors 2 and π, and also be aware of the implications for quantum systems.

For classical systems, the noise power spectral density is symmetric. This is because the autocorrelation function of real signals is itself a real function, and the Fourier transform of a real temporal function is symmetric in the frequency domain. Dephasing noise is caused by real, fluctuating fields, and so its PSD is generally symmetric. Examples of such classical noise include thermal (Johnson) noise and 1/f noise122 (see Fig. 6).

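In turn, the inverse Fourier transform of the PSD will yield the autocorrelation function

$$c_\lambda(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,S_\lambda(\omega)\,e^{i\omega\tau}. \tag{47}$$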

This implies that integrating the noise power spectral density with τ = 0 yields the second moment of the noise, or, for zero-mean fluctuations, the variance.
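The relation between the integrated PSD and the variance can be verified numerically for a simple model. The sketch assumes noise with an exponential autocorrelation c(τ) = σ² exp(−|τ|/τc), whose bilateral PSD is the Lorentzian 2σ²τc/(1 + ω²τc²); this specific noise model and its parameters are chosen for illustration only.

```python
import numpy as np


def lorentzian_psd(omega, variance, tau_c):
    """Bilateral PSD of noise with autocorrelation variance * exp(-|tau| / tau_c)."""
    return 2.0 * variance * tau_c / (1.0 + (omega * tau_c) ** 2)


def autocorrelation_at_zero(psd_values, omega):
    """c(0) = (1 / 2 pi) * integral of S(omega) d omega, via the trapezoid rule."""
    integral = np.sum(0.5 * (psd_values[1:] + psd_values[:-1]) * np.diff(omega))
    return integral / (2.0 * np.pi)


variance, tau_c = 2.5, 1.0
omega = np.linspace(-1e3, 1e3, 400_001)  # finite window; the tails are truncated
recovered = autocorrelation_at_zero(lorentzian_psd(omega, variance, tau_c), omega)
print("variance recovered from the PSD:", recovered)
```

The recovered value agrees with the input variance up to the small contribution of the truncated Lorentzian tails. Had a unilateral PSD been used instead, the integral would run over positive frequencies only and the PSD would carry an extra factor of 2.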

However, the autocorrelation function for a quantum system may be complex-valued due to the fact that quantum operators generally do not commute at different times. This means that time-ordering of the operators matters, and the PSD need not be symmetric in frequency. This is generally the case for transverse noise causing longitudinal energy relaxation. Noise at a positive frequency S(ωq) corresponds to energy transfer from the qubit to the environment, including both stimulated and spontaneous emission, associated with the down-rate Γ1↓. Noise at a negative frequency S(−ωq) corresponds to energy transfer to the qubit from the environment, associated with the up-rate Γ1↑. This energy transfer becomes exponentially suppressed when the qubit energy ħωq is larger than the thermal energy kBT, as shown in Fig. 6. For a detailed discussion, see Refs. 123 and 124. Spontaneous emission to a cold environment or electromagnetic vacuum, represented by Nyquist noise in Fig. 6, is an example of an asymmetric noise PSD.121

In general, making a connection between Sλ(ω) and the measured qubit decay functions is the basis for noise spectroscopy up to second-order statistics.78,125–128 The search for higher-order spectra related to non-Gaussian noise is a current topic of active research.129 

There are many sources of stochastic noise in superconducting qubits, and we refer the reader to Ref. 40 for a review. Here, we briefly present several of the most common types of noise, their effect on coherence, and refer the reader to the references for a more detailed discussion.

1. Charge noise

“Charge noise” is ubiquitous in solid-state devices. It arises from charged fluctuators present in the defects or charge traps that reside in interfacial dielectrics, the junction tunnel barrier, and in the substrate itself. These are often modeled as an ensemble of fluctuating two-level systems or as bulk dielectric loss.130,131 For example, in the case of a transmon qubit, the electric field between the capacitor plates traverses and couples to dielectric defects residing on the metal surfaces of the plates (for lateral-plate-type capacitors) or the capacitor dielectric between the plates (for parallel-plate-type capacitors). The electric field variable is transverse with respect to the quantization axis of the transmon qubit, which means that this noise is mainly responsible for energy relaxation (T1). Additionally, if the EJ/EC ratio of the transmon is not made sufficiently large (smaller than around 60), the qubit frequency itself will also be sensitive to broadband charge fluctuations. In this case, low-frequency charge noise couples longitudinally to the transmon and causes pure dephasing (Tφ).

Charge noise is modeled primarily as a combination of inverse-frequency noise and Nyquist noise, also referred to as “ohmic” noise. At lower frequencies, the spectral density takes the form

$$S_Q(\omega) = A_Q^2 \left( \frac{2\pi \times 1\,\mathrm{Hz}}{\omega} \right)^{\gamma_Q},$$

with quasiuniversal values AQ² = (10⁻³ e)²/Hz at 1 Hz, and γQ ≈ 1. In addition to large 1/f fluctuations, early charge qubits often witnessed discrete charge offsets reminiscent of random telegraph noise. Together, these two mechanisms severely limited the utility of charge qubits, and served as a strong motivation to move to capacitively shunted charge qubits (transmons), which greatly reduced the qubit longitudinal sensitivity to charge noise. At higher frequencies, the power spectrum takes the form SQ(ω) = BQ²[ω/(2π × 1 Hz)], where the noise strength BQ² at 1 Hz can assume a range of values depending on the level of dissipation in the system. Likewise, the crossover from 1/f-like behavior to f-like behavior generally occurs at around 1 GHz, but will vary higher or lower between samples depending on the degree of dissipation.62,132

2. Magnetic flux noise

Another commonly observed noise in solid-state devices is magnetic “flux noise.” The origin of this noise is understood to arise from the stochastic flipping of spins (magnetic dipoles) that reside on the surfaces of the superconducting metals comprising the qubit,133 resulting in random fluctuations of the effective magnetic field that biases flux-tunable qubits.

For example, in the case of the split transmon, the external magnetic field threading the loop couples longitudinally to the qubit and modulates the transition frequency via the Josephson energy EJ (except at φe = 0, where the qubit is first-order insensitive to magnetic-field fluctuations). Because the flux noise is longitudinal to the transmon, it contributes to pure dephasing (Tφ). However, in the case of the flux qubit, and depending on the flux-bias point, the flux noise may be either longitudinal, causing dephasing (Tφ), or transverse, thus contributing to T1 relaxation.62,78 The noise power spectrum of these fluctuations generally exhibits a “quasiuniversal” dependence

$$S_\Phi(\omega) = A_\Phi^2 \left( \frac{2\pi \times 1\,\mathrm{Hz}}{\omega} \right)^{\gamma_\Phi},$$

with γΦ ≈ 0.8–1.0 and AΦ² ≈ (1 μΦ0)²/Hz, and has been shown to extend from less than millihertz to beyond gigahertz frequencies.78,127,128,134,135

The large, low-frequency weighting of the 1/f power distribution enables the use of engineered error mitigation techniques (such as dynamical decoupling) to achieve better coherence78,136,137 and to improve single- and two-qubit gate fidelity.138 It was recently demonstrated that 1/f flux noise is also a T1-mechanism when extended out to the qubit frequency,62 and one similarly expects a crossover to ohmic flux noise at high enough frequencies.139

Although much is known about the statistics and number of the defects presumed responsible for flux noise, their precise physical manifestation remains uncertain.133,140 The fact that the 1/f noise is quasiuniversal and largely independent of device strongly suggests a common origin for the noise. Recent studies suggest that adsorbed molecular oxygen may be responsible for flux noise.140,141

3. Photon number fluctuations

In the circuit QED architecture, resonator “photon number fluctuation” is another major decoherence source.142 Residual microwave fields in the cavity have photon-number fluctuations that, in the dispersive regime, impact the qubit through an interaction term χσzn̂ (see Sec. II C 2), leading to a frequency shift ΔStark = 2ηχn̄, where n̄ is the average photon number, and η = κ²/(κ² + 4χ²) effectively scales the photon population seen by the qubit due to the interplay between the qubit-induced dispersive shift of the resonator frequency (χ) and the resonator decay rate (κ).

In the dispersive limit, the noise is longitudinally coupled to the qubit and leads to pure dephasing at a rate

$$\Gamma_\phi = \eta\, \frac{4\chi^2}{\kappa}\, \bar{n}.$$

The fluctuations originate from residual photons in the resonator, typically due to radiation from higher temperature stages in the dilution refrigerator.106,143 The corresponding noise spectral density is of Lorentzian type

$$S(\omega) = 4\chi^2\, \frac{2\eta \bar{n} \kappa}{\omega^2 + \kappa^2},$$

which exhibits an essentially white noise spectrum up to a 3 dB cutoff frequency ω = κ set by the resonator decay rate κ, see Ref. 62.
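As a rough numerical illustration, the sketch below evaluates the scaling factor η and the Stark shift quoted above, together with the dephasing rate in the commonly quoted form Γφ = η(4χ²/κ)n̄, which is assumed here. The function name and all parameter values are hypothetical and chosen only to show orders of magnitude.

```python
import math

def photon_dephasing(chi, kappa, n_bar):
    """Return (eta, stark_shift, gamma_phi) for angular rates chi and kappa in rad/s."""
    eta = kappa**2 / (kappa**2 + 4 * chi**2)       # scaling factor from the text
    stark_shift = 2 * eta * chi * n_bar            # frequency shift from the text
    gamma_phi = eta * 4 * chi**2 / kappa * n_bar   # assumed form of the dephasing rate
    return eta, stark_shift, gamma_phi

# Hypothetical device: chi/2pi = kappa/2pi = 1 MHz, 0.01 residual photons on average.
chi = 2 * math.pi * 1e6
kappa = 2 * math.pi * 1e6
n_bar = 0.01
eta, stark, gamma_phi = photon_dephasing(chi, kappa, n_bar)
print(f"eta = {eta:.2f}, T_phi = {1e6 / gamma_phi:.1f} us")
```

With these example numbers, even one percent of a photon on average limits the pure dephasing time to a few tens of microseconds, which is why the thermal photon population of the resonator matters so much.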

4. Quasiparticles

“Quasiparticles,” i.e., unpaired electrons, are another important noise source for superconducting devices.119 The tunneling of quasiparticles through a qubit junction may lead to both T1 relaxation and pure dephasing Tφ, depending on the type of qubit, the bias point, and the junction through which the tunneling event occurs.118,120

Quasiparticles are naturally excited due to thermodynamics, and the quasiparticle density in equilibrium superconductors should be exponentially suppressed as the temperature decreases. However, below about 150 mK, the quasiparticle density observed in superconducting devices (generally in the range 10⁻⁸–10⁻⁶ per Cooper pair) is much higher than what the BCS theory would predict for a superconductor in equilibrium with its cryogenic environment at 10 mK. The reason for this excess quasiparticle population is unclear, but it is very likely related to the presence of additional, nonthermal mechanisms that increase the generation rates, “bottleneck effects” that occur at millikelvin temperatures to reduce recombination rates, or a combination of both.

It has been shown that the observed T1 and excess excited-state population measured in today's state-of-the-art high-coherence transmons are self-consistent with excess “hot” nonequilibrium quasiparticles at the quasiuniversal density of around 10⁻⁷–10⁻⁶ per Cooper pair.144,145 Although this quasiparticle generation mechanism is not yet well understood, it has been shown that quasiparticles can be transiently pumped away, improving T1 times and reducing T1 temporal variation.120

Similar to the way two qubits are coupled, a qubit may couple and interact with uncontrolled degrees of freedom (DOF) in its environment (the noise sources). The interaction Hamiltonian between the qubit DOF (Ôq) and those of the noise source (λ̂) may be expressed in a general form

$$\hat{H}_{\mathrm{int}} = \nu\, \hat{O}_q\, \hat{\lambda},$$

where ν denotes the coupling strength (which is related to the sensitivity of the qubit to environmental fluctuations, ∂Ĥq/∂λ), and we assume that Ôq is a qubit operator within the qubit Hamiltonian Ĥq. The noisy environment represented by the operator λ̂ produces fluctuations δλ. Note that we retained the hats in this section to remind us that these are quantum operators.

1. Connecting T1 to S(ω)

If the coupling is transverse to the qubit, e.g., Ôq is of the type σx or (a + a†) (see the related case of qubit-qubit coupling treated in Sec. II C), then noise at the qubit frequency can cause transitions between the qubit eigenstates. Since this is a stochastic process, the ensemble average manifests itself as a decay (usually exponential) of the qubit population toward a certain equilibrium value (usually the qubit ground state |0⟩ for kBT ≪ ħωq). Again, this process is equivalently referred to as “T1 relaxation,” “energy relaxation,” or “longitudinal relaxation.” As stated above, T1 is the characteristic time scale of the decay. Its inverse, Γ1 = 1/T1, is called the relaxation rate and depends on the power spectral density of the noise S(ω) at the transition frequency of the qubit, ω = ωq,

$$\Gamma_1 = \frac{1}{\hbar^2} \left| \left\langle 0 \left| \frac{\partial \hat{H}_q}{\partial \lambda} \right| 1 \right\rangle \right|^2 S_\lambda(\omega_q), \tag{53}$$

where ∂Ĥq/∂λ is the qubit transverse susceptibility to fluctuations δλ, such that ⟨|δλ|²⟩ is the ensemble average value of the environmental noise sources as seen by the qubit. Equation (53) is equivalent to Fermi's Golden Rule, in which the qubit's transverse susceptibility to noise is driven by the noise power spectral density. The qubit transverse susceptibility can be used to calculate the prefactors; for example, for fluctuations δλ = δn, the relevant term in the transmon Hamiltonian in Eq. (16) is 4EC(n̂ − ng)², where we allow for an offset charge ng, and the susceptibility is given by 8ECn̂. We refer the reader to Refs. 40, 146, and 147 for more details.

2. Connecting Tφ to S(ω)

If the coupling to the qubit is instead longitudinal, e.g., Ôq is of the type σz or a†a, the noise will stochastically modulate the transition frequency of the qubit and thereby introduce a stochastic phase evolution of a qubit superposition state. This gradually leads to a loss of phase information, and it is therefore called pure dephasing (time constant Tφ). Unlike T1 relaxation, which is generally an irreversible (incoherent) error, pure dephasing Tφ is in principle reversible (a coherent error). The degree of pure dephasing depends on the control pulse sequence applied while the qubit is subject to the noise process.

Consider the relative phase φ of a superposition state undergoing free evolution in the presence of noise. The superposition state's accumulated phase


diffuses due to adiabatic fluctuations of the transition frequency,


where ∂ωq/∂λ = (1/ħ)|⟨∂Ĥq/∂λ⟩| is the qubit's longitudinal sensitivity to λ-noise. For noise generated by a large number of fluctuators that are weakly coupled to the qubit, the statistics are Gaussian. Ensemble averaging over all realizations of the Gaussian-distributed stochastic process δλ(t), the dephasing is


leading to a coherence decay function,


where g(ω,τ) is a dimensionless weighting function.
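The chain of steps described above can be sketched for the simplest case of free evolution without refocusing pulses. This is a hedged reconstruction under stated assumptions: the noise is classical, zero-mean, Gaussian and stationary, and the PSD is defined as S_λ(ω) = ∫ ⟨δλ(τ)δλ(0)⟩ e^{−iωτ} dτ. Prefactors and integration limits depend on that convention and need not coincide with those of Eq. (57).

```latex
\varphi(t) = \int_0^t \omega_q(t')\, dt' = \langle \omega_q \rangle t + \delta\varphi(t),
\qquad
\delta\varphi(t) = \frac{\partial \omega_q}{\partial \lambda} \int_0^t \delta\lambda(t')\, dt' ,

\left\langle e^{i\delta\varphi(\tau)} \right\rangle
  = e^{-\frac{1}{2} \langle \delta\varphi^2(\tau) \rangle}
  = \exp\!\left[ -\frac{\tau^2}{2} \left( \frac{\partial \omega_q}{\partial \lambda} \right)^{2}
      \frac{1}{2\pi} \int_{-\infty}^{\infty} S_\lambda(\omega)\,
      \mathrm{sinc}^2\!\left( \frac{\omega\tau}{2} \right) d\omega \right] .
```

The second line uses the Gaussian identity ⟨e^{iX}⟩ = e^{−⟨X²⟩/2} together with |∫₀^τ e^{iωt} dt|² = τ² sinc²(ωτ/2). In this sketch, sinc²(ωτ/2) plays the role of the dimensionless weighting function for free evolution, and applying refocusing pulses replaces it by a sequence-dependent weight.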

The function gN(ω,τ) can be viewed as a frequency-domain filter of the noise Sλ(ω) [see Fig. 7(a)]. In general, its filter properties depend on the number N and distribution of applied pulses. For example, considering sequences of π-pulses,78,148–152

$$g_N(\omega,\tau) = \frac{1}{(\omega\tau)^2} \left| 1 + (-1)^{1+N} e^{i\omega\tau} + 2 \sum_{j=1}^{N} (-1)^j e^{i\omega\delta_j\tau} \cos\!\left( \frac{\omega\tau_\pi}{2} \right) \right|^2,$$

where δj ∈ [0,1] is the normalized position of the center of the jth π-pulse between the two π/2-pulses, τ is the total free-induction time, and τπ is the length of each π-pulse,151,152 yielding a total sequence length τ + Nτπ. As the number of pulses increases for a fixed τ, the filter function's peak shifts to higher frequencies, leading to a reduction in the net integrated noise for 1/f^α-type noise spectra with α > 0. Similarly, for a fixed N, the filter function will shift in frequency with τ. Additionally, for a fixed time separation τ′ = τ/N (valid for N ≥ 1), the filter sharpens and asymptotically peaks at ω/2π = 1/(2τ′) as more pulses are added. gN(ω,τ) is thus called the “filter function,”78,150 and it depends on the pulse sequences being applied. From Eq. (57), the pure dephasing decay arises from a noise spectral density that is “shaped” or “filtered” by the sequence-specific filter function. By choosing the number of pulses, their rotation axes, and their arrangement in time, we can design filter functions that minimize the net noise power for a given noise spectral density within the experimental constraints (e.g., the pulse-modulation bandwidth of the electronics used to control the qubits).

To give a standard example, we compare the coherence integral for two cases: a Ramsey pulse sequence and a Hahn echo pulse sequence. Both sequences involve two π/2 pulses separated by a time τ, during which free evolution of the qubit occurs in the presence of low-frequency dephasing noise. The distinction is that the Hahn echo will place a single π pulse (N = 1) in the middle of the free-evolution period, whereas the Ramsey does not use any additional pulses (N = 0). The resulting filter functions are

$$g_0(\omega,\tau) = \mathrm{sinc}^2\!\left( \frac{\omega\tau}{2} \right), \qquad g_1(\omega,\tau) = \sin^2\!\left( \frac{\omega\tau}{4} \right) \mathrm{sinc}^2\!\left( \frac{\omega\tau}{4} \right),$$

where the subscripts N =0 and N =1 indicate the number of π-pulses applied for the Ramsey and Hahn echo experiments, respectively. The filter function g0(ω,τ) for the Ramsey case is a sinc-function centered at ω = 0. For noise that decreases with frequency, e.g., 1/f flux noise in superconducting qubits, the Ramsey experiment windows through the noise in S(ω) where it has its highest value. This is the worst choice of filter function for 1/f noise. In contrast, the Hahn echo filter function has a centroid that is peaked at a higher frequency, away from ω = 0. In fact, it has zero value at ω = 0. For noise that decreases with frequency, such as 1/f noise, this is advantageous. This concept extends to larger numbers N of π pulses, and is called a Carr-Purcell-Meiboom-Gill (CPMG) sequence.153,154 In Fig. 7(b), the T2 time of a qubit under the influence of strong dephasing noise is increased toward the 2T1 limit using a CPMG dynamical error-suppression pulse sequence with an increasing number of pulses, N. We refer the reader to Refs. 78, 155, and 156, where these experiments were performed with superconducting qubits.
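The filter-function picture can be checked numerically. The sketch below assumes the widely used π-pulse filter function g_N(ω,τ) = |1 + (−1)^{1+N} e^{iωτ} + 2 Σ_j (−1)^j e^{iωδ_jτ} cos(ωτ_π/2)|²/(ωτ)², and compares it with the closed forms sinc²(ωτ/2) for Ramsey and sin²(ωτ/4) sinc²(ωτ/4) for Hahn echo. All function names are hypothetical, and the CPMG pulse positions δ_j = (j − 1/2)/N are the standard equally spaced choice, taken here as an assumption.

```python
import cmath
import math

def filter_function(omega, tau, deltas=(), tau_pi=0.0):
    """g_N(omega, tau) for pi-pulses centred at normalized positions `deltas`."""
    n = len(deltas)
    amp = 1 + (-1) ** (1 + n) * cmath.exp(1j * omega * tau)
    for j, delta in enumerate(deltas, start=1):
        amp += (2 * (-1) ** j * cmath.exp(1j * omega * delta * tau)
                * math.cos(omega * tau_pi / 2))
    return abs(amp) ** 2 / (omega * tau) ** 2

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit 1 at x = 0."""
    return math.sin(x) / x if x != 0 else 1.0

def ramsey(omega, tau):
    """Closed form for N = 0 (no pi-pulse)."""
    return sinc(omega * tau / 2) ** 2

def hahn_echo(omega, tau):
    """Closed form for N = 1 (one pi-pulse at the midpoint)."""
    return math.sin(omega * tau / 4) ** 2 * sinc(omega * tau / 4) ** 2

def cpmg_positions(n):
    """Equally spaced CPMG pulse centres, delta_j = (j - 1/2) / n."""
    return [(j - 0.5) / n for j in range(1, n + 1)]

tau = 1.0
for omega in (0.3, 3.0, 17.0):
    print(omega,
          filter_function(omega, tau), ramsey(omega, tau),
          filter_function(omega, tau, [0.5]), hahn_echo(omega, tau))

# For N = 8 the passband sits near omega / 2pi = N / (2 tau).
print(filter_function(8 * math.pi, tau, cpmg_positions(8)))
```

The general expression reproduces both closed forms, and for an eight-pulse sequence the response near ω = 0 is strongly suppressed while the peak moves out to ω/2π = N/(2τ), which is the behavior that makes such sequences useful against 1/f-type noise.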

3. Noise spectroscopy

The qubit is highly sensitive to its noisy environment, and this feature can be used to map out the noise power spectral density. In general, one can map the noise PSD during “free evolution” (periods of time for which no control is applied to the qubit, except for very short dynamical decoupling pulses) and during “driven evolution” (periods of time during which the control fields are applied to the qubit). Both free-evolution and driven-evolution noise are important to characterize, as the noise PSD may differ for these two types of evolution, and both are utilized in the context of universal quantum computation. We refer the reader to Ref. 128 for a summary of noise spectroscopy during both types of evolution.

The Ramsey frequency itself is sensitive to longitudinal noise, and monitoring its fluctuations is one means to map out the noise spectral density over the submillihertz to ∼100 Hz range.127,157

At higher frequencies, the CPMG dynamical decoupling sequence can be used to create narrow-band filters that “sample” the noise at different frequencies as a function of the free-evolution time τ and the number of pulses N. This has been used to map out the noise PSD in the range 0.1–300 MHz.78 One must be careful of the additional small peaks of the filter function at higher frequencies, which all contribute to the dephasing used to perform the noise spectroscopy.158

In fact, using pulse envelopes such as Slepians159—which are designed to have a concentrated frequency response—to perform noise spectroscopy is one means to reduce such errors.151 

At even higher frequencies, measurements of T1 can be used in conjunction with Fermi's golden rule to map out the transverse noise spectrum above 1 GHz.62,78,160

The aforementioned are all examples of noise spectroscopy during free evolution. Noise spectroscopy during driven evolution was also demonstrated using a “spin-locking” technique, where a strong drive along x or y axes defines a new qubit quantization axis, whose Rabi frequency is the new qubit frequency in the spin-locking frame. The spin-locking frame is then used to infer the noise spectrum while the qubit is continually subject to a driving field. For more information, we refer the reader to Ref. 128.

Here, we briefly review a few examples of techniques that have been developed to reduce noise or reduce its impact on decoherence (sensitivity). We stress that improving gate fidelity is a comprehensive optimization task, one that is full of trade-offs. It is thus important to identify what the limiting factors are, what price we have to pay to diminish these limiting factors, and what advantage we can achieve in return. These all require an accurate understanding of the limitations on the gate fidelity, the sources of decoherence, the properties of the noise, and how it affects the system performance.

1. Materials and fabrication improvements

Numerous efforts have been undertaken to reduce noise-inducing defects due to materials and fabrication.40,161 In the case of charge noise, significant efforts have been made to reduce the number of defects, such as substrate cleaning,59,162 substrate annealing,163 and trenching.41,61 In the case of flux noise, several groups have performed experiments to characterize the behavior and properties of magnetic-flux defects.133,164,165 More recently, a number of groups have tried optical surface treatments to remove these defects.140

In the context of residual quasiparticles, it has been shown that adding quasiparticle traps to the circuit design can reduce the quasiparticle number, particularly in devices that create excess quasiparticles, such as classical digital logic or operation in the presence of thermal radiation.166

2. Design improvements

Another strategy is to reduce qubit sensitivity to the noise by design. A qubit can only lose energy to defects if it couples to them. It has been demonstrated that altering the capacitor geometry to increase the electric-field mode volume reduces the electric field density in the thin dielectric regions that cause loss. This effectively reduces the “participation” of the defects and makes the qubits less sensitive to these noise sources.55,62,130

In another example, split transmons built using asymmetric junctions have lower sensitivity to flux noise than their symmetric counterparts, at the expense of decreased frequency tunability.69 This is a good trade-off to make, because generally one is interested in tuning the qubit frequency only over a somewhat restricted range (typically around 1 GHz). When such asymmetric transmons are used in a gate scheme such as the adiabatic CPHASE-gate65 (see Sec. IV F), the qubit is less sensitive to flux noise and has a lower dephasing rate, which should improve the gate fidelity in general.

3. Dynamical error suppression

As introduced in Sec. III D 2, it is advantageous to leverage the 1/ω distribution of flux noise, wherein a considerable amount of the noise power resides at low frequencies, and so the noise is “quasistatic.” The spin-echo technique,115 which disrupts the free evolution by a π-pulse, is extremely effective in mitigating the pure dephasing by refocusing the coherent phase dispersion due to low-frequency noise. The more advanced versions, such as the CPMG-sequence, use multiple π-pulses to interrupt the system more frequently, pushing the filter band to even higher frequencies—a technique known as “dynamical decoupling.”78 

Returning to excess quasiparticles, it has been shown that quasiparticles can be stochastically pumped away from the qubit region, resulting in longer and more stable T1 times.120 Although the pumping technique uses a series of π-pulses, it differs from dynamical error suppression of coherent errors in that the pulses are stochastically applied, and in that it addresses incoherent errors (T1).

4. Cryogenic engineering

In the case of photon shot-noise, in addition to applying dynamical decoupling techniques, there have been several recent studies aimed at reducing the thermal photon flux that reaches the device. These include optimizing the attenuation of the cryogenic setup,106,144,167 remaking the cryogenic attenuators with more efficient heat sinking,143 adding absorptive “black” material to absorb stray thermal photons,168,169 and adding cavity filters for thermalization.170

In this section, we will introduce how superconducting qubits are manipulated to implement quantum algorithms. Since the transmonlike variety of superconducting qubits has so far been the most widely deployed modality for implementing quantum programs, the discussion throughout this section will be focused on modern techniques for transmons. Nonetheless, the techniques introduced here are applicable to all types of superconducting qubits.

We start with a brief review of the gates used in classical computing as well as quantum computing, and the concept of universality. Subsequently we discuss the most common technique of driving single-qubit gates via capacitive coupling of a microwave line to the qubit. We introduce the notion of “virtual Z gates” and “DRAG” pulsing. In the latter part of this section, we review some of the most common implementations of two-qubit gates in both tunable and fixed-frequency transmon qubits. The single-qubit and two-qubit operations together form the basis of many of the medium-scale superconducting quantum processors that exist today.

Throughout this section, we write everything in the computational basis {|0⟩, |1⟩}, where |0⟩ is the +1 eigenstate of σz and |1⟩ is the −1 eigenstate. We use capitalized serif-fonts to indicate the rotation operator of a qubit state, e.g., a rotation around the x-axis by an angle θ is written as

$$X_\theta := R_X(\theta) = e^{-i\frac{\theta}{2}\sigma_x} = \cos\!\left(\frac{\theta}{2}\right) \mathbb{1} - i \sin\!\left(\frac{\theta}{2}\right) \sigma_x,$$

and we use the shorthand notation “X” for a full π rotation about the x axis, X := Xπ (and similarly for Y := Yπ and Z := Zπ).

Universal Boolean logic can be implemented on classical computers using a small set of single-bit and two-bit gates. Several common classical logic gates are shown in Fig. 8 along with their truth tables. In classical Boolean logic, bits can take on one of two values: state 0 or state 1. The state 0 represents logical FALSE, and state 1 represents logical TRUE.

Beyond the trivial “identity operation,” which simply passes a Boolean bit unchanged, the only other possible single-bit Boolean logic gate is the NOT gate. As shown in Fig. 8, the NOT gate flips the bit: 0 → 1 and 1 → 0. This gate is reversible, because it is trivial to determine the input bit value given the output bit value. As we will see, for two-bit gates, this is not the case.

There are several two-bit gates shown in Fig. 8. A two-bit gate takes two bits as inputs, and it passes as an output the result of a Boolean operation. One common example is the AND gate, for which the output is 1 if and only if both inputs are 1; otherwise, the output is 0. The AND gate, and the other two-bit gates shown in Fig. 8, are all examples of irreversible gates; that is, the input bit values cannot be inferred from the output values. For example, for the AND gate, an output of logical 1 uniquely identifies the input 11, but an output of 0 could be associated with 00, 01, or 10. Once the operation is performed, in general, it cannot be “undone” and the input information is lost. There are several variants of two-bit gates, including,

  • AND and OR;

  • NAND (a combination of NOT and AND) and NOR (a combination of NOT and OR);

  • XOR (exclusive OR) and NXOR (NOT XOR).

The XOR gate is interesting, because it is a “parity” gate. That is, it returns a logical 0 if the two inputs are the same values (i.e., they have the same parity), and it returns a logical 1 if the two inputs have different values (i.e., different parity). Still, the XOR and NXOR gates are not reversible, because knowledge of the output does not allow one to uniquely identify the input bit values.

The concept of “universality” refers to the ability to perform any Boolean logic algorithm using a small set of single-bit and two-bit gates. A universal gate set can in principle transform any state to any other state in the state space represented by the classical bits. The set of gates which enable universal computation is not unique, and may be represented by a small set of gates. For example, the NOT gate and the AND gate together form a universal gate set. Similarly, the NAND gate itself is universal, as is the NOR gate. The efficiency with which one can implement arbitrary Boolean logic, of course, depends on the choice of the gate set.
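The universality of NAND and the irreversibility of two-bit gates can both be illustrated with a few lines of code. This is a minimal sketch using one standard construction; the helper names are hypothetical and other, equally valid, constructions exist.

```python
from itertools import product

def nand(a, b):
    """The only primitive used below."""
    return 1 - (a & b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor_(a, b):
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))

# Every derived gate reproduces the usual truth table.
for a, b in product((0, 1), repeat=2):
    print(a, b, and_(a, b), or_(a, b), xor_(a, b))

# Irreversibility: collect all inputs that lead to each AND output.
preimages = {0: [], 1: []}
for a, b in product((0, 1), repeat=2):
    preimages[and_(a, b)].append((a, b))
print(preimages)
```

An AND output of 1 identifies the input 11 uniquely, whereas an output of 0 is compatible with three different inputs, so the input cannot be recovered from the output.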

Quantum logic can similarly be performed by a small set of single-qubit and two-qubit gates. Qubits can of course assume the classical states |0⟩ and |1⟩, at the north pole and south pole of the Bloch sphere, but they can also assume arbitrary superpositions α|0⟩ + β|1⟩, corresponding to any other position on the sphere.

Single-qubit operations translate an arbitrary quantum state from one point on the Bloch sphere to another point by rotating the Bloch vector (spin) a certain angle about a particular axis. As shown in Fig. 9, there are several single-qubit operations, each represented by a matrix that describes the quantum operation in the computational basis represented by the eigenvectors of the σz operator, i.e., |0⟩ = [1 0]^T and |1⟩ = [0 1]^T.

For example, the “identity gate” performs no rotation on the state of the qubit. This is represented by a two-by-two identity matrix. The X-gate performs a π rotation about the x axis. Similarly, the Y-gate and Z-gate perform a π rotation about the y axis and z axis, respectively. The S-gate performs a π/2 rotation about the z axis, and the T-gate performs a rotation of π/4 about the z axis. The Hadamard gate H is also a common single-qubit gate that performs a π rotation about an axis diagonal in the xz plane, see Fig. 9.
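The relations between these gates can be verified directly from their matrices. The sketch below uses the standard textbook matrix forms, which are assumed here to match Fig. 9 and which equal the corresponding rotations up to a global phase. The helper functions are hypothetical and written in plain Python to stay self-contained.

```python
import cmath
import math

def matmul(a, b):
    """Product of two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def allclose(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

r = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
S = [[1, 0], [0, 1j]]                           # pi/2 about z, up to a global phase
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]  # pi/4 about z, up to a global phase
H = [[r, r], [r, -r]]

checks = {
    "T*T == S": allclose(matmul(T, T), S),
    "S*S == Z": allclose(matmul(S, S), Z),
    "H*H == I": allclose(matmul(H, H), I2),
    "H*Z*H == X": allclose(matmul(H, matmul(Z, H)), X),
    "H == (X+Z)/sqrt(2)": allclose(
        H, [[r * (x + z) for x, z in zip(row_x, row_z)] for row_x, row_z in zip(X, Z)]),
}
for name, ok in checks.items():
    print(name, ok)
```

The last check expresses the statement that the Hadamard gate is a π rotation about the axis halfway between x and z.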

Two-qubit quantum-logic gates are generally “conditional” gates and take two qubits as inputs. Typically, the first qubit is the “control” qubit, and the second is the “target” qubit. A unitary operator is applied to the target qubit, dependent on the state of the control qubit. The two common examples shown in Fig. 10 are the controlled NOT (CNOT-gate) and controlled phase (CZ or CPHASE gate). The CNOT-gate flips the state of the target qubit conditioned on the control qubit being in state |1⟩. The CPHASE-gate applies a Z gate (σz) to the target qubit, conditioned on the control qubit being in state |1⟩. As we will show later, the iSWAP gate, another two-qubit gate, can be built from the CNOT-gate and single-qubit gates. The unitary operator of the CNOT gate can be written in a useful way, highlighting that it applies an X-gate (a σx operator) depending on the state of the control qubit

$$U_{\mathrm{CNOT}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} = |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes X, \tag{62}$$

and similarly for the CPHASE gate

$$U_{\mathrm{CPHASE}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} = |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes Z. \tag{63}$$

Comparing the last equality above with the unitary for the CNOT [Eq. (62)], it is clear that the two gates are closely related. Indeed, a CNOT can be generated from a CPHASE by applying two Hadamard gates

$$U_{\mathrm{CNOT}} = (\mathbb{1} \otimes H)\, U_{\mathrm{CPHASE}}\, (\mathbb{1} \otimes H),$$

since HZH = X. Due to the form of Eq. (63), the CPHASE gate is also denoted the CZ gate, since it implements a controlled Z gate (a controlled-σz operation), by analogy with CNOT (a controlled application of the X-gate, i.e., the σx operation). The definition of CPHASE in Fig. 10 makes no distinction between which qubit acts as the target and which as the control and, consequently, the circuit diagram is sometimes drawn in a symmetric fashion (with a dot on both qubit lines).


The CNOT (with qubit 1 as control and qubit 2 as target) can be realized in terms of the CPHASE operation and single-qubit Hadamard gates,


Some two-qubit gates such as CNOT and CPHASE are also called “entangling gates,” because they can take product states as inputs and output entangled states. They are thus an indispensable component of a universal gate set for quantum logic. For example, consider two qubits A and B in the following state:

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right)_A |0\rangle_B .$$

If we perform a CNOT gate, UCNOT, on this state, with qubit A the control qubit, and qubit B the target qubit, the resulting state is (see the truth table in Fig. 10)

$$U_{\mathrm{CNOT}} |\psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B \right),$$

which is a state that cannot be factored into an isolated qubit-A component and a qubit-B component. This is one of the two-qubit entangled “Bell states,” a manifestly quantum mechanical state.
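The statements of this passage (CNOT obtained from CPHASE and Hadamards, the control/target symmetry of CPHASE, and the creation of a Bell state) can be checked with small matrices. The sketch below assumes the standard projector forms |0⟩⟨0| ⊗ 1 + |1⟩⟨1| ⊗ X and |0⟩⟨0| ⊗ 1 + |1⟩⟨1| ⊗ Z with qubit A as the first tensor factor. All helper names are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def allclose(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

r = 1 / math.sqrt(2)
I2, X, Z = [[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 0], [0, -1]]
H = [[r, r], [r, -r]]
P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]   # projectors |0><0| and |1><1|

CNOT = add(kron(P0, I2), kron(P1, X))         # control = first qubit
CZ = add(kron(P0, I2), kron(P1, Z))
CZ_swapped = add(kron(I2, P0), kron(Z, P1))   # control and target exchanged

IH = kron(I2, H)                              # Hadamard on the target only
cnot_from_cz = matmul(IH, matmul(CZ, IH))

psi = [x * y for x in (r, r) for y in (1, 0)] # (|0> + |1>)/sqrt(2) on A, |0> on B
bell = apply(CNOT, psi)

def is_product_state(state, tol=1e-12):
    """A two-qubit pure state [a, b, c, d] factorizes iff a*d - b*c = 0."""
    a, b, c, d = state
    return abs(a * d - b * c) < tol

print(allclose(cnot_from_cz, CNOT), allclose(CZ, CZ_swapped))
print(bell, is_product_state(psi), is_product_state(bell))
```

The factorization test confirms that the input is a product state while the output, (|00⟩ + |11⟩)/√2, is not.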

A universal set of single-qubit and two-qubit gates is sufficient to implement arbitrary quantum logic. This means that this gate set can in principle reach “any” state in the multiqubit state-space. How efficiently this is done depends on the choice of quantum gates that comprise the gate set. We also note that each of the single-qubit and two-qubit gates is reversible, that is, given the output state, one can uniquely determine the input state. As we discuss further, this distinction between classical and quantum gates arises because quantum gates are based on “unitary” operations U. If a unitary operation U is a particular gate applied to a qubit, then its Hermitian conjugate U† can be applied to recover the original state, since U†U = I is the identity operation.

The gate-sequences used to represent quantum algorithms have certain similarities to those used in classical computing, with a few striking differences. As an example, we consider first the classical NOT gate (discussed previously), and the related quantum circuit version, shown in Fig. 11.

While the classical bit-flip gate inverts any input state, the quantum bit-flip does not in general produce the antipodal state (when viewed on the Bloch sphere), but rather exchanges the prefactors of the wavefunction written in the computational basis. The X operator is sometimes referred to as “the quantum NOT” (or “quantum bit-flip”), but we note that X only acts similarly to the classical NOT gate in the case of classical data stored in the quantum bit, i.e., X|g⟩ = |ḡ⟩ for g ∈ {0,1}.
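A short numerical illustration of this point is given below. The example states and helper names are chosen for illustration only. The overlap |⟨ψ|Xψ⟩| is 0 when X maps a state to its antipodal (orthogonal) state and 1 when X leaves the state unchanged up to a phase.

```python
import math

def apply_x(state):
    """X exchanges the amplitudes of |0> and |1>."""
    alpha, beta = state
    return (beta, alpha)

def overlap(a, b):
    """Magnitude of the inner product <a|b>."""
    return abs(sum(x.conjugate() * y for x, y in zip(a, b)))

r = 1 / math.sqrt(2)
zero = (1.0, 0.0)     # classical data: X acts like NOT
plus = (r, r)         # |+> is an eigenstate of X
generic = (0.6, 0.8)  # an arbitrary real superposition

for name, state in (("|0>", zero), ("|+>", plus), ("0.6|0>+0.8|1>", generic)):
    print(name, apply_x(state), round(overlap(state, apply_x(state)), 3))
```

Only for the computational basis states does X produce the orthogonal state. The state |+⟩ is left unchanged, and the generic superposition ends up at overlap 0.96 with itself.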

As briefly mentioned in Sec. IV B, “all” quantum gates are reversible, due to the underlying unitary nature of the operators implementing the logical operations. Certain other processes used in quantum information processing, however, are irreversible, namely, measurements (see Sec. V for detailed discussion) and energy loss to the environment (if the resulting state of the environment is not known). Here, we will not consider how these processes are modeled, but refer the interested reader to, e.g., Ref. 172, and will only consider unitary control operations throughout the rest of this section. Finally, we note that quantum circuits are written left-to-right (in order of application), while the calculation of the result of a gate sequence, e.g., the circuit

|ψin⟩ ─ U0 ─ U1 ─ ⋯ ─ Un ─ |ψout⟩

is performed right-to-left, i.e.,

$$|\psi_{\mathrm{out}}\rangle = U_n \cdots U_1 U_0\, |\psi_{\mathrm{in}}\rangle .$$

As discussed in Sec. IV A, the NOR and NAND gates are each individually universal gates for classical computing. Since both of these gates have no direct quantum analog (because they are not reversible), it is natural to ask which gates “are” needed to build a universal quantum computer. It turns out that the ability to rotate about arbitrary axes on the Bloch-sphere (i.e., a complete single-qubit gate set), supplemented with any entangling 2-qubit operation, will suffice for universality.172,173 By using what is known as the “Kraus-Cirac decomposition,” any two-qubit gate can be decomposed into a series of CNOT operations (together with single-qubit gates).172,174
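Returning to the ordering convention noted above (circuits are read left-to-right, operator products right-to-left), the following minimal sketch composes a gate list given in circuit order into a single unitary. The helper names and the choice of H followed by S as the example circuit are assumptions made for illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def allclose(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def circuit_unitary(gates):
    """Compose gates listed in circuit order (left-to-right, first applied first)."""
    dim = len(gates[0])
    total = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
    for gate in gates:
        total = matmul(gate, total)   # each later gate multiplies from the left
    return total

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
S = [[1, 0], [0, 1j]]

u = circuit_unitary([H, S])              # circuit: H first, then S
out = [row[0] for row in u]              # action on |0> is the first column
print(out)                               # (|0> + i|1>)/sqrt(2)
print(allclose(u, matmul(S, H)), allclose(u, matmul(H, S)))
```

Reading the circuit as the product HS instead of SH would give a different unitary, which is why the order of the operator product has to be reversed relative to the diagram.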

1. Gate sets and gate synthesis

A common universal quantum gate set is

$$\mathcal{G}_0 = \{ X_\theta,\, Y_\theta,\, Z_\theta,\, \mathrm{Ph}_\theta,\, \mathrm{CNOT} \},$$

where Phθ = e^{iθ}·1 (with 1 the identity) applies an overall phase θ to a single qubit. For completeness we mention another universal gate set which is of particular interest from a theoretical perspective, namely,

$$\mathcal{G}_1 = \{ H,\, S,\, T,\, \mathrm{CNOT} \}.$$

As a technical aside, we mention that the restriction to a discrete gate set still gives rise to universality. This fact relies on the so-called Solovay-Kitaev theorem,175,176 which (roughly) states that any other single-qubit gate can be approximated to an error ϵ using only O(log^c(1/ϵ)) (where c > 0) single-qubit gates from G1. The gate set G1 is typically referred to as the “Clifford + T” set, where H, S, and CNOT are all Clifford gates.

Each quantum computing architecture will have certain gates that are simpler to implement at the hardware level than others (sometimes referred to as “native” gates of the architecture). These are typically the gates for which the Hamiltonian governing the gate-implementation gives rise to a unitary propagator that corresponds to the gate itself. We will show several examples of this in Secs. IV E, IV F, and IV G. Regardless of which gates are natively available, as long as one has a complete gate set, one can use the Solovay-Kitaev theorem to synthesize any other set efficiently. In general one wants to keep the overall number of time steps in which gates are applied (denoted the “depth” of a circuit) as low as possible, and one wants to use as many of the native gates as possible, to reduce the amount of time spent on the synthesis. Moreover, running a quantum algorithm also depends on the qubit connectivity of the device. The process of designing a quantum gate sequence that efficiently implements a specific algorithm, while taking into account the considerations outlined above, is known as “gate synthesis” or “gate compilation.” A full discussion of this large research effort is outside the scope of this review, but the interested reader may consult, e.g., Refs. 177–179 and references therein as a starting point. As a concrete (and trivial) example of how gate identities can be used, in Eq. (73) we illustrate how the Hadamard gate from G1 can be generated by two single-qubit gates (from G0) and an overall phase gate

$$H = \mathrm{Ph}_{\frac{\pi}{2}}\, Y_{\frac{\pi}{2}}\, Z_{\pi} = i \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{73}$$

As we show in Sec. IV D 1, the gates Xθ,Yθ and Zθ are all natively available in a superconducting quantum processor.
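
The gate identity for the Hadamard can be checked numerically. The sketch below assumes the rotation convention $R_\theta = \exp(-i\theta\sigma/2)$ for $X_\theta$, $Y_\theta$, $Z_\theta$ and an overall phase gate $\mathrm{Ph}_\theta = e^{i\theta}\mathbb{1}$; the helper names `rot` and `phase` are illustrative and not taken from the text.

```python
import numpy as np

# Pauli matrices and the identity
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def rot(axis, theta):
    """Rotation exp(-i*theta/2*axis) for a Pauli matrix `axis` (assumed convention)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * axis

def phase(theta):
    """Overall phase gate Ph_theta = exp(i*theta) * identity (assumed convention)."""
    return np.exp(1j * theta) * I2

H_target = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

# Candidate decomposition: a z rotation by pi, a y rotation by pi/2, and a phase gate
H_built = phase(np.pi / 2) @ rot(SY, np.pi / 2) @ rot(SZ, np.pi)
hadamard_ok = np.allclose(H_built, H_target)
print(hadamard_ok)
```

Because matrix products act right to left, the z rotation is the first gate applied in time.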

We now address the question of how single qubit rotations and two-qubit operations are implemented in transmon-based superconducting quantum processors.

2. Addressing superconducting qubits

The modes of addressing transmonlike superconducting qubits can roughly be split into two main categories: (i) Capacitive coupling between a resonator (or a feedline) and the superconducting qubit dipole-field allows for microwave control to implement single-qubit rotations (see Sec. IV D) as well as certain two-qubit gates (see Secs. IV G and IV G 4). (ii) For flux-tunable qubits, the local magnetic fields can be used to tune the frequency of individual qubits. This allows the implementation of z-axis single-qubit rotation as well as multiple two-qubit gates (see Secs. IV E, IV F, and IV H).

In this section, we will review the steps necessary to demonstrate that capacitive coupling of microwaves to a superconducting circuit can be used to drive single-qubit gates. To this end, we consider coupling a superconducting qubit to a microwave source (sometimes referred to as a “qubit drive”) as shown in Fig. 12(a). A full circuit analysis of the circuit in Fig. 12(a) is beyond the scope of this review, so here we settle for highlighting the steps that elucidate the physics of the qubit/drive coupling. The interested reader may consult a number of lecture notes and pertinent theses (e.g., Refs. 44, 157, and 180–182). Here we follow Ref. 157.

1. Capacitive coupling for X,Y control

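We start by modeling the qubit as a harmonic oscillator, for which the (classical) circuit Hamiltonian can be calculated by circuit quantization techniques, starting from Kirchhoff's laws, and is given by157

$$H = \frac{\tilde{Q}(t)^2}{2C_\Sigma} + \frac{\Phi^2}{2L} + \frac{C_d}{C_\Sigma} V_d(t)\,\tilde{Q}, \tag{74}$$

where $C_\Sigma = C + C_d$ is the total capacitance to ground and $\tilde{Q} = C_\Sigma\dot{\Phi} - C_d V_d(t)$ is a renormalized charge variable for the circuit. We can now promote the flux and charge variables to quantum operators and assume weak coupling to the drive line, so that $\tilde{Q} \approx \hat{Q}$, and arrive at

$$H = H_{LC} + \frac{C_d}{C_\Sigma} V_d(t)\,\hat{Q}, \tag{75}$$

where $H_{LC} = \hat{Q}^2/(2C) + \hat{\Phi}^2/(2L)$ and we have kept only terms that couple to the dynamic variables. Similar to the momentum operator for a harmonic oscillator in (x, p)-space, we can express the charge variable in terms of raising and lowering operators, as done in Sec. II

$$\hat{Q} = -i Q_{\mathrm{zpf}}\,(a - a^\dagger), \tag{76}$$

where $Q_{\mathrm{zpf}} = \sqrt{\hbar/2Z}$ is the zero-point charge fluctuation and $Z = \sqrt{L/C}$ is the impedance of the circuit to ground. Thus, the LC oscillator capacitively coupled to a drive line can be written as

$$H = \omega\left(a^\dagger a + \tfrac{1}{2}\right) - \frac{C_d}{C_\Sigma} V_d(t)\, i Q_{\mathrm{zpf}}\,(a - a^\dagger). \tag{77}$$

Finally, by truncating to the lowest transition of the oscillator, we can make the replacement $a \to \sigma^-$ and $a^\dagger \to \sigma^+$ throughout and arrive at429

$$H = \underbrace{-\frac{\omega_q}{2}\sigma_z}_{H_0} + \underbrace{\Omega V_d(t)\,\sigma_y}_{H_d}, \tag{78}$$

where $\Omega = (C_d/C_\Sigma)\,Q_{\mathrm{zpf}}$ and $\omega_q = (E_1 - E_0)/\hbar$.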

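To elucidate the role of the drive, we move into a frame rotating with the qubit at frequency $\omega_q$ (also denoted “the rotating frame” or “the interaction frame”). To see the usefulness of this rotating frame, consider a state $|\psi_0\rangle = (1 \;\; 1)^T/\sqrt{2}$. By the time-dependent Schrödinger equation this state evolves according to

$$|\psi_0(t)\rangle = U_{H_0}|\psi_0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{i\omega_q t/2} \\ e^{-i\omega_q t/2} \end{pmatrix}, \tag{79}$$

where $U_{H_0}$ is the propagator corresponding to $H_0$. By calculating, e.g., $\langle\psi_0|\sigma_x|\psi_0\rangle = \cos(\omega_q t)$, it is evident that the phase is winding with a frequency of $\omega_q$ due to the $\sigma_z$ term. By going into a frame rotating with the qubit at frequency $\omega_q$, the action of the drive can be more clearly appreciated. To this end, we define $U_{\mathrm{rf}} = e^{iH_0 t} = U_{H_0}^\dagger$, and the new state in the rotating frame is $|\psi_{\mathrm{rf}}(t)\rangle = U_{\mathrm{rf}}|\psi_0\rangle$. The time evolution in this new frame is again found from the Schrödinger equation (using the shorthand $\partial_t = \partial/\partial t$)

$$\begin{aligned} i\partial_t|\psi_{\mathrm{rf}}(t)\rangle &= i(\partial_t U_{\mathrm{rf}})|\psi_0\rangle + iU_{\mathrm{rf}}(\partial_t|\psi_0\rangle) \\ &= i\dot{U}_{\mathrm{rf}}U_{\mathrm{rf}}^\dagger|\psi_{\mathrm{rf}}\rangle + U_{\mathrm{rf}}H_0|\psi_0\rangle \\ &= \underbrace{\left(i\dot{U}_{\mathrm{rf}}U_{\mathrm{rf}}^\dagger + U_{\mathrm{rf}}H_0U_{\mathrm{rf}}^\dagger\right)}_{\tilde{H}_0}|\psi_{\mathrm{rf}}\rangle. \end{aligned} \tag{82}$$

We can think of the term $\tilde{H}_0$ in the parentheses in Eq. (82) as the form of $H_0$ in the rotating frame. Simple insertion shows that $\tilde{H}_0 = 0$ as expected (the rotating frame should take care of the time dependence). However, one could also think of the term in brackets in Eq. (82) as a prescription for calculating the form of any Hamiltonian in the rotating frame given by $U_{\mathrm{rf}}$, by replacing $H_0$ with some other $H$. In general, we will not find $\tilde{H} = 0$.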
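
The effect of moving to the rotating frame can be illustrated with a short numerical sketch. It assumes $\hbar = 1$, $H_0 = -(\omega_q/2)\sigma_z$, and a hypothetical qubit frequency of 5 GHz; `propagator` is an illustrative helper, not a function from the text. In the lab frame the expectation value of $\sigma_x$ winds as $\cos(\omega_q t)$, while in the rotating frame it stays constant.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

omega_q = 2 * np.pi * 5.0e9       # hypothetical qubit frequency (rad/s)
H0 = -0.5 * omega_q * SZ          # H0 = -(omega_q/2) sigma_z with hbar = 1
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)

def propagator(H, t):
    """exp(-i*H*t) for a Hermitian matrix H, via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w * t)) @ v.conj().T

times = np.linspace(0.0, 1.0e-9, 201)
sx_lab, sx_rot = [], []
for t in times:
    psi_lab = propagator(H0, t) @ psi0       # lab-frame evolution under H0
    psi_rf = propagator(-H0, t) @ psi_lab    # U_rf = exp(+i*H0*t) undoes the winding
    sx_lab.append(np.real(psi_lab.conj() @ SX @ psi_lab))
    sx_rot.append(np.real(psi_rf.conj() @ SX @ psi_rf))
sx_lab = np.array(sx_lab)
sx_rot = np.array(sx_rot)
```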

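Returning to Eq. (78), the form of $H_d$ in the rotating frame is found to be

$$\tilde{H}_d = U_{\mathrm{rf}}H_dU_{\mathrm{rf}}^\dagger = \Omega V_d(t)\left(\cos(\omega_q t)\,\sigma_y - \sin(\omega_q t)\,\sigma_x\right).$$

We can in general assume that the time-dependent part of the voltage ($V_d(t) = V_0 v(t)$) has the generic form

$$v(t) = s(t)\sin(\omega_d t - \phi) = s(t)\left(\cos(\phi)\sin(\omega_d t) - \sin(\phi)\cos(\omega_d t)\right),$$

where $s(t)$ is a dimensionless envelope function, so that the amplitude of the drive is set by $V_0 s(t)$. Adopting the definitions

$$I = \cos(\phi)\ \ \text{(the in-phase component)}, \qquad Q = \sin(\phi)\ \ \text{(the out-of-phase component)}, \tag{85}$$

the driving Hamiltonian in the rotating frame takes the form

$$\tilde{H}_d = \Omega V_0 s(t)\left(I\sin(\omega_d t) - Q\cos(\omega_d t)\right)\left(\cos(\omega_q t)\,\sigma_y - \sin(\omega_q t)\,\sigma_x\right).$$

Performing the multiplication and dropping fast rotating terms that will average to zero (i.e., terms with $\omega_q + \omega_d$), known as the rotating wave approximation (RWA), we are left with

$$\tilde{H}_d = \frac{1}{2}\Omega V_0 s(t)\left[\left(-I\cos(\delta\omega\, t) + Q\sin(\delta\omega\, t)\right)\sigma_x - \left(I\sin(\delta\omega\, t) + Q\cos(\delta\omega\, t)\right)\sigma_y\right],$$

where $\delta\omega = \omega_q - \omega_d$. Finally, by reusing the definitions from Eq. (85), the driving Hamiltonian in the rotating frame using the RWA can be written as

$$\tilde{H}_d = -\frac{\Omega}{2}V_0 s(t)\begin{pmatrix} 0 & e^{-i(\delta\omega\, t + \phi)} \\ e^{i(\delta\omega\, t + \phi)} & 0 \end{pmatrix}. \tag{90}$$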
Equation (90) is a powerful tool for understanding single-qubit gates in superconducting qubits. As a concrete example, assume that we apply a pulse at the qubit frequency, so that $\delta\omega = 0$, then

$$\tilde{H}_d = -\frac{\Omega}{2}V_0 s(t)\left(I\sigma_x + Q\sigma_y\right),$$

showing that an “in-phase” pulse ($\phi = 0$, i.e., the I-component) corresponds to rotations around the x-axis, while an out-of-phase pulse ($\phi = \pi/2$, i.e., the Q-component) corresponds to rotations about the y-axis. As a concrete example of an in-phase pulse, writing out the unitary operator yields

$$U_{\mathrm{rf},d}^{\phi=0}(t) = \exp\left(\left[\frac{i}{2}\Omega V_0\int_0^t s(t')\,dt'\right]\sigma_x\right), \tag{92}$$

which depends only on the macroscopic design parameters of the circuit as well as the envelope of the baseband pulse $s(t)$ and amplitude $V_0$, which can both be controlled using arbitrary waveform generators (AWGs). Equation (92) is known as “Rabi driving” and can serve as a useful tool for engineering the circuit parameters needed for efficient gate operation (subject to the available output voltage $V_0$). To see this, we define the shorthand

$$\Theta(t) = -\Omega V_0\int_0^t s(t')\,dt',$$

which is the angle by which a state is rotated given the capacitive couplings, the impedance of the circuit, the magnitude $V_0$, and the waveform envelope, $s(t)$. This means that to implement a π-pulse on the x-axis, one would solve the equation $\Theta(t) = \pi$ and output the signal in-phase with the qubit drive. In this language, a sequence of pulses [see Fig. 13(a)] $\Theta_k, \Theta_{k-1}, \ldots, \Theta_0$ is converted to a sequence of gates operating on a qubit as

$$U_k \cdots U_1 U_0 = \mathcal{T}\prod_{n=0}^{k}\exp\left[-\frac{i}{2}\Theta_n(t)\left(I_n\sigma_x + Q_n\sigma_y\right)\right], \tag{94}$$

where $\mathcal{T}$ is an operator that ensures the pulses are generated in the time-ordered sequence corresponding to $U_k \cdots U_1 U_0$.
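
As a rough illustration of Rabi driving, the sketch below calibrates a Gaussian π-pulse and composes a time-ordered pulse sequence. It assumes a resonant drive under the RWA, a rotation of the form $\exp[-\tfrac{i}{2}\Theta(I\sigma_x + Q\sigma_y)]$ with $I = \cos\phi$ and $Q = \sin\phi$, and $\Theta = -\Omega V_0\int s\,dt$. The pulse width, the sign chosen for $\Omega V_0$, and the helper `pulse_unitary` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

def pulse_unitary(theta, phi):
    """exp(-i*theta/2*(cos(phi)*sx + sin(phi)*sy)): one resonant pulse under the RWA."""
    axis = np.cos(phi) * SX + np.sin(phi) * SY
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * axis

# Gaussian envelope s(t) with a hypothetical width and duration
sigma = 5e-9
t = np.linspace(0.0, 8 * sigma, 2001)
s = np.exp(-0.5 * ((t - 4 * sigma) / sigma) ** 2)
area = np.sum(0.5 * (s[1:] + s[:-1]) * np.diff(t))   # trapezoid rule for the integral of s

# Solve Theta = -Omega*V0*area = pi for the product Omega*V0 (rad/s)
omega_v0 = -np.pi / area
theta = -omega_v0 * area

U_pi = pulse_unitary(theta, 0.0)      # in-phase pulse: rotation about x
p_excited = abs(U_pi[1, 0]) ** 2      # population transferred from |0> to |1>

# Time ordering: the earliest pulse is the rightmost factor of the product
pulses = [(np.pi / 2, 0.0), (np.pi / 2, 0.0)]   # (Theta_n, phi_n) for two x pulses
U_seq = I2
for th, ph in pulses:
    U_seq = pulse_unitary(th, ph) @ U_seq
```

In practice the required amplitude is found by calibration (for example, a Rabi amplitude sweep) rather than from design values alone, since line attenuation and pulse distortion are not captured here.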

In Fig. 13, we outline the typical in-phase and quadrature (IQ) modulation setup used to generate the pulses used in Eq. (94). Figure 13(a) shows how a pulse at frequency $\omega_d$ is generated using a low phase-noise microwave generator [typically denoted “the local oscillator (LO)”], while the pulse is shaped by combining the LO signal in an IQ mixer with pulses generated in an AWG. To allow for frequency multiplexing, the AWG signal will typically be generated with a low-frequency component, $\omega_{\mathrm{AWG}}$, and the LO signal will be offset, so that $\omega_{\mathrm{LO}} + \omega_{\mathrm{AWG}} = \omega_d$. By mixing in more than one frequency $\omega_{\mathrm{AWG1}}, \omega_{\mathrm{AWG2}}, \ldots$, it is possible to address multiple qubits (or readout resonators) simultaneously, via the superposition of individual drives.

The I (Q) input of the IQ mixer will multiply the baseband signal to the in-phase (out-of-phase) component of the LO. In Fig. 13(b), we schematically show the comparison between XY gates in a quantum circuit and the corresponding waveforms generated in the AWG (omitting for clarity the frequency ωAWG component). The inset in Fig. 13(b) shows an example of a gate on the Bloch sphere, with the indication of (I, Q) axes. More sophisticated and compact approaches exist to reduce the hardware needed for XY qubit control, relative to the setup shown in Fig. 13, see, e.g., Refs. 183–185.
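
The frequency offset $\omega_{\mathrm{LO}} + \omega_{\mathrm{AWG}} = \omega_d$ can be illustrated with an idealized mixer model. This is a minimal sketch that assumes a perfect IQ mixer (no amplitude or phase imbalance, no LO leakage) computing $I(t)\cos(\omega_{\mathrm{LO}}t) - Q(t)\sin(\omega_{\mathrm{LO}}t)$; all frequencies and the envelope are hypothetical values chosen for illustration.

```python
import numpy as np

f_lo = 4.9e9       # hypothetical LO frequency (Hz)
f_awg = 100e6      # hypothetical AWG intermediate frequency (Hz)
phi = np.pi / 2    # pulse phase set in the AWG
sigma = 10e-9      # hypothetical Gaussian width (s)

t = np.linspace(0.0, 80e-9, 8001)
s = np.exp(-0.5 * ((t - 40e-9) / sigma) ** 2)   # dimensionless envelope s(t)

# Baseband waveforms played by the two AWG channels
i_bb = s * np.cos(2 * np.pi * f_awg * t + phi)
q_bb = s * np.sin(2 * np.pi * f_awg * t + phi)

# Ideal IQ mixer: multiply with the in-phase and out-of-phase LO components
rf = i_bb * np.cos(2 * np.pi * f_lo * t) - q_bb * np.sin(2 * np.pi * f_lo * t)

# The output is a single tone at f_lo + f_awg carrying the envelope and the phase
expected = s * np.cos(2 * np.pi * (f_lo + f_awg) * t + phi)
single_sideband = np.allclose(rf, expected)
```

Because the mixer model is linear, adding a second pair of baseband waveforms at another intermediate frequency produces a second tone, which is the basis of the frequency multiplexing described above.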

2. Virtual Z gate

As we saw in Sec. IV D, the distinction between x- and y-rotations was merely a choice of phase on the microwave signals, and the angle to be rotated is given by $\Theta(t)$, both of which are generated using an AWG. Since the choice of phase $\phi$ has an arbitrary starting point, we could consider $\phi \to \phi + \pi/2$. This would lead to $I \to Q$ and $Q \to -I$. Therefore, changing the phase effectively changes rotations around x to rotations around y (and vice versa, with a change of sign). This is reminiscent of the result of applying a $Z_\pi$ rotation to x- and y-rotations, where $Z_\pi X_\pi = iY_\pi$ and $Z_\pi Y_\pi = -iX_\pi$. This analogy between shifting a phase of an AWG-generated signal and applying Z rotations can be utilized to implement “virtual” Z gates.186 As shown by McKay et al., this intuition can be formalized via the following example: consider the case of applying a pulse with an angle $\theta$ on the I channel (i.e., an $X_\theta$) followed by another $\theta$ pulse on the I channel, but with a phase $\phi_0$ relative to the first pulse (denoted $X_\theta^{(\phi_0)}$, where X indicates we still use the I channel, but the rotation axis is now an angle $\phi_0$ away from the x-axis). Using Eq. (94), this corresponds to a pulse sequence


from which we see that the effect of the offset phase ϕ0 is to apply Zϕ0. The equality above can be verified with a little trigonometric footwork. The final Zϕ0 is due to the rotation being in the frame of reference of the qubit. However, since the readout is along the z-axis (see Sec. V), a final phase rotation about z will not change the measurement outcome. Thus, if one wants to implement the gate sequence


where Ui's are arbitrary gates, this can be done by revising the gate sequence (in the control software for the AWG) and changing the phase of subsequent pulses


which reduces the number of overall gates. Moreover, the virtual-Z gates are “perfect,” in the sense that no additional pulses are required, and the gate takes “zero time,” and thus the gate fidelity is nominally unity. As we show in Secs. IV E and IV F, operation of two-qubit gates can incur additional single-qubit phases. Using the virtual-Z strategy, these phases can be canceled out, leaving a pure two-qubit interaction.

Finally, we mention one more salient feature of the virtual-Z gates. As shown in Ref. 63, any single-qubit operation (up to a global phase) can be written as

$$U(\theta, \phi, \lambda) = Z_{\phi - \pi/2}\, X_{\pi/2}\, Z_{\pi - \theta}\, X_{\pi/2}\, Z_{\lambda - \pi/2} \tag{99}$$

for appropriate choice of angles $\theta, \phi, \lambda$. This means that access to a single physical $X_{\pi/2}$ combined with the virtual-Z gate gives access to a complete single-qubit gate set. An explicit example of Eq. (99) in action is the Hadamard gate, which can be written as $H = Z_{\pi/2} X_{\pi/2} Z_{\pi/2}$, but since the Z's can be virtual, it is possible to implement Hadamards as an effective single-pulse operation in superconducting qubits.
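
Both statements about virtual-Z gates can be checked numerically. The sketch below assumes the convention $R_\theta = \exp(-i\theta\sigma/2)$ and that a pulse with phase offset $\phi_0$ rotates about the axis $\cos(\phi_0)\sigma_x + \sin(\phi_0)\sigma_y$; with the opposite sign convention for the phase, the roles of $Z_{\phi_0}$ and $Z_{-\phi_0}$ are interchanged. The helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def rot(axis, theta):
    """exp(-i*theta/2*axis) for any `axis` that squares to the identity."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * axis

def same_up_to_phase(a, b):
    """True if the unitaries a and b differ only by a global phase."""
    return bool(np.isclose(abs(np.trace(a.conj().T @ b)), a.shape[0]))

theta, phi0 = 0.7, 1.1   # arbitrary test angles

# A phase-shifted pulse equals an x pulse sandwiched between opposite z rotations
shifted_pulse = rot(np.cos(phi0) * SX + np.sin(phi0) * SY, theta)
sandwiched = rot(SZ, phi0) @ rot(SX, theta) @ rot(SZ, -phi0)
frame_change_ok = np.allclose(shifted_pulse, sandwiched)

# Hadamard from one physical X_{pi/2} pulse and two (virtual) Z_{pi/2} rotations
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
H_virtual = rot(SZ, np.pi / 2) @ rot(SX, np.pi / 2) @ rot(SZ, np.pi / 2)
hadamard_ok = same_up_to_phase(H, H_virtual)
```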

3. The DRAG scheme

In going from Eq. (77) to Eq. (78), we assumed we could ignore the higher levels of the qubit. However, for weakly anharmonic qubits, such as the transmon (see Sec. II), this may not be a justified assumption, since $\omega_q^{1\to 2}$ only differs from $\omega_q$ ($\equiv \omega_q^{0\to 1}$) by the anharmonicity, $\alpha = \omega_q^{1\to 2} - \omega_q$, which is negative, with a magnitude typically around 200 to 300 MHz. This situation is sketched in Figs. 14(a)–14(c), where we illustrate how Gaussian pulses with standard deviations $\sigma = \{1, 2, 5\}$ ns have spectral content that leads to nonzero overlaps with the $\omega_q^{1\to 2} = \omega_q - |\alpha|$ frequency. This leads to two deleterious effects: (1) leakage errors, which take the qubit out of the computational subspace, and (2) phase errors. Effect 1 can occur because a qubit in the state $|1\rangle$ may be excited to $|2\rangle$ as a π pulse is applied, or be excited directly from $|0\rangle$, since the qubit spends some amount of time in the $|1\rangle$ state during the π pulse. Effect 2 occurs because the presence of the drive results in a repulsion between the $|1\rangle$ and $|2\rangle$ levels, in turn changing $\omega_q^{0\to 1}$ as the pulse is applied. This leads to the accumulation of a relative phase between $|0\rangle$ and $|1\rangle$.188 The so-called DRAG procedure189–191 (Derivative Reduction by Adiabatic Gate) seeks to combat these two effects by applying an extra signal in the out-of-phase component. The trick is to modify the waveform envelope $s(t)$ according to

$$s(t) \to s'(t) = \begin{cases} s(t) & \text{on } I, \\ \lambda\,\dfrac{\dot{s}(t)}{\alpha} & \text{on } Q, \end{cases} \tag{100}$$

where $\lambda$ is a dimensionless scaling parameter, $\lambda = 0$ corresponds to no DRAG pulse, and $\dot{s}(t)$ is the time derivative of $s(t)$. The theoretically optimal choice for reducing dephasing error is $\lambda = 0.5$ and an optimal choice for reducing leakage error is $\lambda = 1$.190,192 Interchanging I and Q in Eq. (100) corresponds to DRAG pulsing for the Q component.
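
A DRAG envelope pair for a Gaussian pulse might be generated as in the following sketch. It assumes the anharmonicity $\alpha$ is expressed as an angular frequency (rad/s) so that $\dot{s}(t)/\alpha$ is dimensionless; the pulse width, the value of $\alpha$, and the function name `drag_envelopes` are hypothetical.

```python
import numpy as np

def drag_envelopes(t, sigma, alpha, lam):
    """Return (I, Q) envelopes: Gaussian s(t) on I and lam * ds/dt / alpha on Q."""
    s = np.exp(-0.5 * (t / sigma) ** 2)
    ds_dt = -(t / sigma ** 2) * s          # analytic time derivative of the Gaussian
    return s, lam * ds_dt / alpha

sigma = 5e-9                     # hypothetical Gaussian width (s)
alpha = 2 * np.pi * (-250e6)     # hypothetical anharmonicity (rad/s), negative for a transmon
lam = 0.5                        # scaling parameter; 0 disables the DRAG correction

t = np.linspace(-4 * sigma, 4 * sigma, 801)
i_env, q_env = drag_envelopes(t, sigma, alpha, lam)

# The Q envelope is odd about the pulse center and peaks at t = -sigma and t = +sigma
q_peak = np.max(np.abs(q_env))
```

The two arrays would be played on the I and Q channels of the AWG; in an experiment, $\lambda$ is then fine-tuned by calibration, as discussed in the text.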

In practice, there can be a deviation from these two optimal values, often due to pulse distortions in the lines leading to the qubits. Typically, randomized benchmarking experiments combined with single-shot measurements (see Sec. V) of the $|2\rangle$ state are used to determine the optimal value of $\lambda$. The $\lambda = \{0.5, 1\}$ trade-off was demonstrated explicitly in Refs. 186 and 193. However, by extending the original DRAG pulse implementation,194,195 it is possible to reduce both errors simultaneously. By introducing a frequency detuning parameter $\delta f$ to the waveform190 (defined such that $\delta f = 0$ corresponds to the qubit frequency), i.e.,

$$s_{\delta f}(t) = s'(t)\,e^{i2\pi\,\delta f\, t},$$

and choosing $\lambda$ to minimize leakage errors, phase errors can be reduced simultaneously.193 Similarly, by a judicious use of the virtual-Z gate, it is also possible to reduce phase errors in combination with DRAG pulsing to reduce leakage.186 Modern single-qubit gates using DRAG pulsing now routinely reach fidelities $F_{\mathrm{1qb}} \gtrsim 0.99$.65,67,193,196–199 Other techniques also exist for operating single-qubit gates in a spectrally crowded device.200,201

As briefly mentioned in Sec. IV C, single-qubit gates supplemented with an entangling two-qubit gate can form the gate set required for universal quantum computation. The two-qubit gates available in the transmonlike superconducting qubit architecture can roughly be split into two broad families as outlined previously: one group requiring local magnetic fields to tune the transition frequency of qubits and one group consisting of all-microwave control. There exist several hybrid schemes that combine various aspects of these two categories and, in particular, the notions of tunable coupling and parametric driving are proving to be important ingredients in modern superconducting qubit processors.63,67,89,103,105,106,202–207 In this section, however, we start by introducing the iSWAP gate, and then review the CPHASE (controlled-phase) in Sec. IV F and the CR (cross-resonance) in Sec. IV G. We briefly review a few other two-qubit gates and discuss their merits in Secs. IV G 4 and IV H.

1. Deriving the iSWAP unitary

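As we saw in Sec. II, Eq. (31), the interaction term between two capacitively coupled qubits (in the two-level approximation) is given by

$$H_{qq} = g\,\sigma_y \otimes \sigma_y, \tag{102}$$

where $g$ is the coupling strength and ⊗ is used to emphasize the tensor product. If the capacitive coupling is mediated through a bus resonator, then208,209

$$g \to g_{qrq} = \frac{g_1 g_2\,(\Delta_1 + \Delta_2)}{2\Delta_1\Delta_2}, \tag{103}$$

where $g_i$ is the resonator coupling to qubit $i$ (dependent on the qubit-resonator coupling capacitance $C_{q_i r}$) and $\Delta_i = \omega_{q_i} - \omega_r$ is the detuning of qubit $i$ to the resonator. In the simpler case where the qubits are directly coupled210

$$g \to g_{qq} = \frac{1}{2}\sqrt{\omega_{q1}\omega_{q2}}\,\frac{C_{qq}}{\sqrt{C_{qq} + C_1}\sqrt{C_{qq} + C_2}}, \tag{104}$$

where $C_{qq}$ is the qubit-qubit coupling capacitance and $C_i$ is the capacitance of qubit $i$. Throughout this section, we will assume a direct capacitive coupling between qubits of the flux-tunable transmon type, so that $g = g_{qq}$ and $\omega_{q_i} \to \omega_{q_i}(\Phi_i)$. For simplicity, we suppress the explicit flux dependence of the $\omega_{q_i}$'s and simply refer to the coupling as $g$. Equation (102) can be rewritten as

$$H_{qq} = -g\,(\sigma^+ - \sigma^-) \otimes (\sigma^+ - \sigma^-), \tag{105}$$

and then using the rotating wave approximation again (i.e., dropping fast rotating terms) we arrive at

$$H_{qq} = g\left(e^{i\delta\omega_{12}t}\,\sigma^+\sigma^- + e^{-i\delta\omega_{12}t}\,\sigma^-\sigma^+\right), \tag{106}$$

where we have introduced the notation $\delta\omega_{12} = \omega_{q1} - \omega_{q2}$ and suppressed the explicit tensor product between qubit subspaces. If we now change the flux of qubit 1 to bring it into resonance with qubit 2 ($\omega_{q1} = \omega_{q2}$), then

$$H_{qq} = g\left(\sigma^+\sigma^- + \sigma^-\sigma^+\right) = \frac{g}{2}\left(\sigma_x\sigma_x + \sigma_y\sigma_y\right). \tag{107}$$

The first part of Eq. (107) shows that a capacitive interaction leads to a swapping of excitations between the two qubits, giving rise to the “swap” in iSWAP. Moreover, due to the last part of Eq. (107), this capacitive coupling is also sometimes said to give rise to an “XY” interaction.211 The unitary corresponding to an XY (swap) interaction is

$$U_{qq}(t) = e^{-i\frac{g}{2}(\sigma_x\sigma_x + \sigma_y\sigma_y)t} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(gt) & -i\sin(gt) & 0 \\ 0 & -i\sin(gt) & \cos(gt) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{108}$$

Since the qubits are tunable in frequency, we can now consider the effect of tuning the qubits into resonance for a time $t = \pi/(2g)$

$$U_{qq}\!\left(\frac{\pi}{2g}\right) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \equiv \mathrm{iSWAP}. \tag{109}$$

From this result, we see that a capacitive coupling between qubits turned on for a time $t$ (inversely related to the coupling strength in units of radial frequency) leads to implementing a so-called “iSWAP” gate,209,210,212–215 which acts to swap an excitation between the two qubits and add a phase of $-i = e^{-i\pi/2}$. For completeness, we note that for $t = \pi/(4g)$, the resulting unitary

$$U_{qq}\!\left(\frac{\pi}{4g}\right) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1/\sqrt{2} & -i/\sqrt{2} & 0 \\ 0 & -i/\sqrt{2} & 1/\sqrt{2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \equiv \sqrt{\mathrm{iSWAP}} \tag{110}$$

is typically referred to as the “squareroot-iSWAP” gate. The iSWAP gate can be used to generate Bell-like superposition states, e.g., $|01\rangle + i|10\rangle$.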
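
The iSWAP and squareroot-iSWAP unitaries can be obtained numerically by exponentiating the resonant XY interaction $\frac{g}{2}(\sigma_x\sigma_x + \sigma_y\sigma_y)$. The sketch below assumes $\hbar = 1$, a hypothetical coupling of $g/2\pi = 20$ MHz, and the basis ordering $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$; `propagator` is an illustrative helper.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

def propagator(H, t):
    """exp(-i*H*t) for a Hermitian matrix H, via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w * t)) @ v.conj().T

g = 2 * np.pi * 20e6    # hypothetical coupling strength (rad/s)
H_xy = 0.5 * g * (np.kron(SX, SX) + np.kron(SY, SY))   # resonant XY interaction

t_iswap = np.pi / (2 * g)                 # full swap time
U_iswap = propagator(H_xy, t_iswap)
U_sqrt_iswap = propagator(H_xy, t_iswap / 2)

expected_iswap = np.array([[1, 0, 0, 0],
                           [0, 0, -1j, 0],
                           [0, -1j, 0, 0],
                           [0, 0, 0, 1]])

# Half the swap time turns |01> into an equal superposition of |01> and |10>
psi = U_sqrt_iswap @ np.array([0, 1, 0, 0], dtype=complex)
populations = np.abs(psi) ** 2
```

For this value of $g$ the swap time evaluates to 12.5 ns, which gives a sense of the gate durations implied by couplings in the tens of MHz.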

To elucidate the operating principle behind an iSWAP implementation, we show the spectrum of a flux-tunable qubit using typical transmonlike parameters in Fig. 15(a). The iSWAP is performed at the avoided crossing, where $\Phi = \Phi_{\mathrm{iSWAP}}$. By preparing QB1 in state $|1\rangle$, moving into the avoided crossing, and waiting there for a time $\tau$ [see pulse sequence in inset in Fig. 15(b)], the excitation is swapped back and forth between the two qubits, as shown in Fig. 15(b). In Fig. 15(c), we plot the linecuts of (b) at $\Phi_{\mathrm{iSWAP}}$, showing the excitation oscillating back and forth between $|01\rangle$ and $|10\rangle$ with the predicted time $t = \pi/(2g)$. In turn, the measured swap time can be used to extract the strength of the coupling, $g = \pi/(2t)$.

So far, we have ignored the role of the single-qubit phases acquired by tuning the qubit frequency. Referring to the pulse-sequence shown in the top panel of Fig. 15(a), we see that each qubit will acquire a phase given by


This phase can be conveniently removed either by subsequent application of virtual-Z gates to all following pulses,186 or by shaping the waveform of the excursion such that single-qubit phases are exactly canceled.216 

Equations (104) and (108) together present a useful result from a quantum processor design perspective: The operating regime, frequency, and time $\tau$ of the iSWAP can be calculated (typically simulated) to a high precision before any processor fabrication is undertaken. The only “quantum” parts that enter $g_{qq}$ (and $g_{qrq}$) are the qubit frequencies, $\omega_{q1}(\Phi_1)$ and $\omega_{q2}(\Phi_2)$. If the Josephson energies of the qubits are known (which they typically are, from fabrication parameters), then by simulating the capacitances in $g_{qq}$ or $g_{qrq}$, the time $\tau$ and the pulse shape needed to implement an iSWAP can be estimated to high precision. Typical values of the coupling strength, $g/(2\pi)$, for architectures using the iSWAP gate are 5–40 MHz and are often very close to expectations from EM simulations.213,215–217

2. Applications of the iSWAP gate

The iSWAP cannot generate a CNOT gate by itself. Rather, implementing a CNOT gate requires stringing together two iSWAPs and several single-qubit gates211


As evident from Eq. (112), the iSWAP gate in general needs to be used twice to generate a single CNOT, leading to a significant overhead when compiling CNOT-dense circuits from iSWAP gates. However, depending on the context, the iSWAP can be used efficiently (i.e., without any two-qubit gate overhead) to mimic the behavior of a CNOT. Typically such circuits will not be completely equivalent, but will share certain salient features for specified input states. As an example of this procedure, Neeley et al.214 demonstrated the generation of a 3-qubit Greenberger-Horne-Zeilinger (GHZ) state (which requires two subsequent CNOTs in the simplest construction), by using only two iSWAPs in a circuit that correctly generates the 3-qubit GHZ state on the $|000\rangle$ input. Moreover, the XY interaction is a powerful tool for certain types of quantum simulation algorithms.218 If one is interested in digital quantum simulation of spinlike systems, then the XY interaction can natively simulate, e.g., a Heisenberg interaction


This approach to the XY interaction was demonstrated by Salathé et al.,216 where repeated application of the iSWAP gate interspersed with single-qubit rotations was used to generate successive XY, XZ, and YZ interactions that lead to an aggregate $H_{\mathrm{Heisenberg}}$ Hamiltonian. State-of-the-art operation of the iSWAP gate has also been used to demonstrate a ten-qubit GHZ state.219

In our discussion of the iSWAP gates, we assumed that the higher energy levels of the superconducting qubit do not play a role. As we show below, it turns out that for the case of transmon qubits (with negative anharmonicity), the higher levels can in fact be utilized to generate the CPHASE gate directly.64,220

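Recall from Sec. IV C that the CPHASE gate implements the following unitary:

$$U_{\mathrm{CPHASE}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{114}$$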


Our goal for the remainder of this section is to show that the unitary operator of the CPHASE gate appears naturally for capacitively coupled transmon superconducting qubits and review a few of the modern applications of this gate. We have chosen to include a considerable amount of details for the implementation of this gate, as a means to review some of the issues one has to resolve, to engineer high quality two-qubit gates.

The structure of the matrix in Eq. (63) indicates that we need to apply a phase ($-1 = e^{i\pi}$) to the qubits whenever both are in the excited state $|11\rangle$. Considering the nature of the XY interaction, which couples $|01\rangle \leftrightarrow |10\rangle$ and leads to the iSWAP gate (see Sec. IV E), we expect avoided level crossings to exist between higher levels, e.g., $|11\rangle \leftrightarrow |20\rangle$ and $|11\rangle \leftrightarrow |02\rangle$. The flux-tunable implementation of the CPHASE gate relies on this higher-level avoided crossing.

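To motivate this intuition, we plot the spectrum for two coupled transmon qubits, in Fig. 16(a), including levels with two excitations, as the local magnetic flux in qubit 1 is being tuned. The Hamiltonian for this spectrum, written in the $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle, |02\rangle, |20\rangle\}$-basis, is approximately given by

$$H = \begin{pmatrix} E_{00} & 0 & 0 & 0 & 0 & 0 \\ 0 & E_{01} & g & 0 & 0 & 0 \\ 0 & g & E_{10} & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{11} & \sqrt{2}g & \sqrt{2}g \\ 0 & 0 & 0 & \sqrt{2}g & E_{02} & 0 \\ 0 & 0 & 0 & \sqrt{2}g & 0 & E_{20} \end{pmatrix}, \tag{115}$$

where $E_{nm} = E_n^{q1}(\Phi_1) + E_m^{q2}(\Phi_2)$ and $E_n(\Phi_i)$ is the flux-dependent energy of the $n$-th level of transmon $i$,52 and the $\{|02\rangle, |20\rangle\} \leftrightarrow |11\rangle$ transitions are scaled by a factor $\sqrt{2}$ due to the higher photon number. In Fig. 16, we plot the frequencies $\omega_{nm} = E_{nm} - E_{00}$ calculated from Eq. (115), using standard, symmetric, transmonlike parameters, as the local magnetic field of qubit 1 is increased.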
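
The level repulsion that underlies the CPHASE gate can be explored by diagonalizing a six-level Hamiltonian of this form. In the sketch below, all quantities are frequencies in GHz (i.e., divided by $2\pi$), the bare energies use a simple anharmonic-oscillator model $E_2 = 2f + \alpha$, and the qubit frequencies, anharmonicity, and coupling are hypothetical. The shift of the dressed $|11\rangle$ level relative to the single-excitation levels, $\omega_{11} - (\omega_{01} + \omega_{10})$, is computed far from and near the $|11\rangle \leftrightarrow |20\rangle$ crossing.

```python
import numpy as np

def two_transmon_levels(f1, f2, alpha, g):
    """Dressed frequencies (GHz) ordered by bare label: 00, 01, 10, 11, 02, 20."""
    s2g = np.sqrt(2) * g
    bare = np.array([0.0, f2, f1, f1 + f2, 2 * f2 + alpha, 2 * f1 + alpha])
    H = np.diag(bare)
    H[1, 2] = H[2, 1] = g        # |01> <-> |10>
    H[3, 4] = H[4, 3] = s2g      # |11> <-> |02>
    H[3, 5] = H[5, 3] = s2g      # |11> <-> |20>
    w, v = np.linalg.eigh(H)
    # Label each dressed state by the bare state it overlaps with most
    idx = np.argmax(np.abs(v) ** 2, axis=1)
    return w[idx]

f2, alpha, g = 5.0, -0.3, 0.02    # hypothetical parameters (GHz)

def zeta(f1):
    """Frequency shift of |11> relative to the sum of |01> and |10> (GHz)."""
    lv = two_transmon_levels(f1, f2, alpha, g)
    return lv[3] - (lv[1] + lv[2])

zeta_far = zeta(5.6)     # qubit 1 parked well above the |11>-|20> crossing at 5.3 GHz
zeta_near = zeta(5.35)   # qubit 1 tuned close to the crossing

# Second-order perturbative estimate of the shift at the parking frequency
f1 = 5.6
zeta_pert = 2 * g**2 / (f2 - f1 - alpha) + 2 * g**2 / (f1 - f2 - alpha)
```

The overlap-based labeling is only reliable away from exact degeneracies; it is a convenience of this sketch, not a general method.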

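The result of the higher levels on the computational basis can be understood by considering a concrete example. By preparing the combined qubit state $|11\rangle$ and moving slowly toward the avoided crossing between $|11\rangle$ and $|20\rangle$ at $\Phi_{\mathrm{CPHASE}}$, waiting for some time $\tau$ and moving back [see black line with arrows in Fig. 16(b)], the resulting unitary operator in the computational basis is given by

$$U_{\mathrm{ad}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & e^{i\theta_{01}(\ell)} & 0 & 0 \\ 0 & 0 & e^{i\theta_{10}(\ell)} & 0 \\ 0 & 0 & 0 & e^{i\theta_{11}(\ell)} \end{pmatrix}, \tag{116}$$

where

$$\theta_{ij}(\ell) = \int_0^\tau dt\,\omega_{ij}[\ell(t)] \tag{117}$$

is the phase acquired by the state $|ij\rangle$ along the trajectory $\ell$ in $(\Phi, t)$-space during time $\tau$. The movement should be sufficiently slow on a time scale set by $g$ that the moving state never populates the $|20\rangle$ state, i.e., the movement should be adiabatic. In terms of applied flux, the avoided crossing between the $|11\rangle \leftrightarrow |20\rangle$ states happens before $|10\rangle \leftrightarrow |01\rangle$ (due to the negative anharmonicity of the transmons, $\alpha \approx -E_C$) and consequently does not take the states through the $\Phi_{\mathrm{iSWAP}}$ operating point. As shown in Fig. 16(b), we can define a parameter (typically denoted $\zeta$) quantifying the difference in phase acquired by the $|11\rangle$ state relative to the single-excitation states

$$\zeta = \omega_{11} - (\omega_{01} + \omega_{10}). \tag{118}$$

The parameter $\zeta$ can be thought of as the result (in the computational space) of the repulsion of $|11\rangle$ due to the $|20\rangle$ state. If we now choose a trajectory $\ell_\pi$, designed so that $\int_0^\tau \zeta(\ell_\pi(t))\,dt = \pi$, then

$$\theta_{11}(\ell_\pi) = \pi + \theta_{01}(\ell_\pi) + \theta_{10}(\ell_\pi). \tag{119}$$

Inserting this expression into Eq. (116), we see that

$$U_{\mathrm{ad}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & e^{i\theta_{01}(\ell_\pi)} & 0 & 0 \\ 0 & 0 & e^{i\theta_{10}(\ell_\pi)} & 0 \\ 0 & 0 & 0 & e^{i(\pi + \theta_{01}(\ell_\pi) + \theta_{10}(\ell_\pi))} \end{pmatrix}. \tag{120}$$

After the adiabatic excursion, one can now apply single-qubit pulses (or use virtual-Z gates) to exactly cancel the single-qubit phases such that $\theta_{10}(\ell_\pi) = \theta_{01}(\ell_\pi) = 0$. This changes $U_{\mathrm{ad}}$ to

$$U_{\mathrm{ad}} \to \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = U_{\mathrm{CPHASE}}. \tag{121}$$

From Eq. (121), it is evident that an adiabatic movement of $|11\rangle$, followed by single-qubit gates (virtual or real), efficiently implements a CPHASE and, through Eq. (66), also efficiently implements a CNOT. The CPHASE gate is one of the workhorses of modern superconducting qubit processors, with gate fidelities $\gtrsim 0.99$.65,221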

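One is, of course, free to choose an arbitrary trajectory $\ell_\phi$ that implements the phase $e^{i\phi}$ on the $|11\rangle$ state. Assuming that the single-qubit phases are properly canceled, one sees that the arbitrary phase version of the CPHASE gate (typically denoted $CZ_\phi$) can be written as

$$U_{CZ_\phi} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{i\phi} \end{pmatrix}. \tag{122}$$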


Because of the form of Eq. (122), one can think of the avoided crossing with the higher levels outside the computational subspace as giving rise to an effective σzσz coupling within the computational subspace.220 
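
Two properties of the controlled-phase unitary can be verified in a few lines: that a CNOT follows from CPHASE by applying Hadamards to the target qubit, and that $\mathrm{diag}(1, 1, 1, e^{i\phi})$ is generated by an operator containing a $\sigma_z\sigma_z$ term. The sketch assumes the basis ordering $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ with the first qubit as control; the function name `cz_phi` is illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def cz_phi(phi):
    """Controlled-phase unitary diag(1, 1, 1, exp(i*phi))."""
    return np.diag([1, 1, 1, np.exp(1j * phi)])

# CNOT (control = first qubit) from CPHASE and Hadamards on the target qubit
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
cnot_built = np.kron(I2, H) @ cz_phi(np.pi) @ np.kron(I2, H)

# Projector onto |11> written with Pauli operators: (11 - Z1 - 1Z + ZZ)/4
phi = 0.3
projector_11 = (np.kron(I2, I2) - np.kron(SZ, I2) - np.kron(I2, SZ) + np.kron(SZ, SZ)) / 4

# The projector is diagonal, so exponentiating it only requires its diagonal
cz_from_zz = np.diag(np.exp(1j * phi * np.diag(projector_11)))
```

The single-qubit $\sigma_z$ terms in the generator correspond to the single-qubit phases that are removed with real or virtual Z gates, leaving the $\sigma_z\sigma_z$ term as the entangling part.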

An alternative to the adiabatic approach outlined above is to realize the CPHASE gate by making a sudden excursion to the $\Phi_{\mathrm{CPHASE}}$ operating point. After waiting a time $t = \pi/(\sqrt{2}g)$, the state will have completed a single Larmor-type rotation from $|11\rangle$ to $|20\rangle$ and back again to $|11\rangle$, acquiring an overall π phase in the process, similar to the iSWAP gate but in the $\{|11\rangle, |20\rangle\}$ subspace.54 In fact, such excursions near or through avoided crossings leading to adiabatic and nonadiabatic transitions have been studied extensively in the context of interferometry, cooling, spectroscopy, and quantum control.117,222–231

The remainder of this subsection is devoted to an overview of some of the recent advances and demonstrations using the CPHASE gate since its first demonstration in 2009 where it was used to generate Bell-states and demonstrate two-qubit algorithms.64 

1. Trajectory design for the CPHASE gate

The (adiabatic) implementation of $U_{\mathrm{CPHASE}}$ outlined above assumed that the trajectory $\ell_\pi$ was completely adiabatic and that the $|11\rangle$ state never left the computational subspace. Since the fidelity of gates is bounded from above by the coherence times of the qubits, short gate times are desirable.232 This presents a tension for optimally operating the CPHASE gate: the gate should be fast, yet the movement must remain adiabatic. A relevant question is then: what is the “optimal” trajectory $\ell_\pi$ that implements the necessary phase as fast as possible, with as little leakage as possible, for a given size of the avoided crossing between $|11\rangle$ and $|20\rangle$? Given a typical coupling rate $g/2\pi \sim 20$ MHz (as discussed in Sec. IV E), one expects a heuristic lower time limit to be $2\pi/g \sim 50$ ns (stronger coupling of course leads to shorter gate times, but will limit the on/off ratio of the gate). Traditional optimal control of adiabatic movement assumes the movement is “through” the avoided crossing (see, e.g., Ref. 233), but the trajectory $\ell_\pi$ moves close to and then back from the avoided crossing. This modification to the adiabatic movement protocol was addressed by Martinis and Geller,234 specifically in the context of errors for a CPHASE gate implementation. The authors show that nonadiabatic errors can be minimal for gate times only slightly longer than $2\pi/g$ using an optimal waveform (based on a Slepian waveform235) to parametrize the trajectory $\ell_\pi(\tau)$.

2. The CPHASE gate for quantum error correction

Using the approach of Martinis and Geller, Barends et al. were able to demonstrate a two-qubit gate fidelity $F_{\mathrm{CPHASE}} = 0.9944$ (determined via a technique known as “interleaved randomized benchmarking”236–239). This implementation had a gate time $\tau = 43$ ns and was implemented with the $\ell_\pi$ waveform,65 in an “xmon” device,85 a transmon with a “+”-shaped capacitor. A two-qubit gate fidelity $F > 0.99$ represents a significant milestone, not just from a technical and engineering perspective, but also from a foundational standpoint: The surface code (a quantum error correcting code) has a lenient fault-tolerance threshold of 1%.240–242 This means, roughly speaking, that if the underlying operations on the qubits have fidelities $F > 0.99$, then by adding more qubits to the circuit (and correctly implementing the fault-tolerant quantum error correction protocol) the overall error rate can be reduced, and one can in principle perform arbitrarily long quantum computations, without errors spreading uncontrollably and corrupting the calculation. Because of its relatively lenient threshold under circuit noise (compared to, e.g., Steane or Shor codes172,243,244) and its use of solely nearest-neighbor coupling, the surface code is one of the most promising quantum error correction codes for medium-to-large scale quantum computing in solid state systems.240 Therefore, surpassing the fault-tolerance threshold using CPHASE represents a significant milestone for the field.245 Moreover, practical blueprints for implementing scalable subcells of the surface code using the CPHASE as the fundamental two-qubit gate have also been proposed71 as well as in-situ calibration protocols for large-scale systems operating with CPHASE.246 For a full review of the pros and cons of various quantum error correcting codes we refer the interested reader to, e.g., an introductory review article, Ref. 247, or any of the excellent textbooks and more detailed review articles in Refs. 172, 174, 244, and 247–250.

Returning to the CPHASE gate, numerical optimization of $\ell_\pi$ was demonstrated by Kelly et al.221 using the interleaved randomized benchmarking sequence fidelity as a cost function to push a native implementation of $\ell_\pi$ with a fidelity $F = 0.984$ up to $F = 0.993$, surpassing the surface code fault tolerance threshold. In the same work that demonstrated $F_{\mathrm{CPHASE}} = 0.9944$, Barends et al.65 used the CPHASE gate to generate GHZ states, $|\mathrm{GHZ}\rangle = (|0\rangle^{\otimes N} + |1\rangle^{\otimes N})/\sqrt{2}$, of up to $N = 5$ qubits, with a fidelity for the $N = 5$ state of $F = \mathrm{Tr}(\rho_{\mathrm{ideal}}\,\rho_{N=5}) = 0.817$. The protocol for generating the GHZ state with $N = 2$ and $N = 3$ from CPHASE was originally demonstrated by DiCarlo et al.54,64 The textbook route to generating the $N = 2$ GHZ state, $|\Phi^+\rangle$ (a Bell state), from the all-zero input is


An equivalent circuit using CPHASE and native single-qubit gates in superconducting qubits is


By repeating the operation inside the dashed box on additional qubits, an N-qubit GHZ state can be generated.65 Since the demonstration of the N =5 GHZ state using the CPHASE gate, the gate has been deployed to demonstrate several important aspects of quantum information processing using superconducting qubits. A nine-qubit implementation of the five-qubit repetition code (five data qubits + four syndrome qubits)247 was demonstrated, and the error suppression factor of a single logical quantum bit was shown to increase as the encoding was changed from three data qubits to five data qubits.66 Similarly, in a five qubit processor the three-qubit repetition code with artificially injected errors was demonstrated,251 building on earlier results utilizing a combination of iSWAP and CPHASE gates to perform parallelized stabilizer readout.252 
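
The repeated-block construction of an N-qubit GHZ state can be sketched with a small state-vector simulation. The version below uses Hadamard gates around each CPHASE to form a CNOT, which is an assumption of this example; the circuit built from native single-qubit rotations is equivalent only up to the choice of single-qubit gates. The function names are illustrative, and qubit 0 is taken as the leftmost bit of the basis label.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def single(n, gate, k):
    """Embed a single-qubit gate acting on qubit k into an n-qubit operator."""
    return reduce(np.kron, [gate if j == k else I2 for j in range(n)])

def cphase(n, a, b):
    """Diagonal CPHASE between qubits a and b: -1 when both bits are 1."""
    d = np.ones(2 ** n, dtype=complex)
    for idx in range(2 ** n):
        bit_a = (idx >> (n - 1 - a)) & 1
        bit_b = (idx >> (n - 1 - b)) & 1
        if bit_a and bit_b:
            d[idx] = -1
    return np.diag(d)

def ghz_state(n):
    """Prepare (|0...0> + |1...1>)/sqrt(2) from |0...0>."""
    psi = np.zeros(2 ** n, dtype=complex)
    psi[0] = 1
    psi = single(n, H, 0) @ psi
    for k in range(1, n):
        # H - CPHASE - H on the target acts as a CNOT controlled by qubit k-1
        psi = single(n, H, k) @ cphase(n, k - 1, k) @ single(n, H, k) @ psi
    return psi

psi5 = ghz_state(5)
ideal = np.zeros(2 ** 5, dtype=complex)
ideal[0] = ideal[-1] = 1 / np.sqrt(2)
fidelity = abs(np.vdot(ideal, psi5)) ** 2
```

This noiseless simulation gives unit fidelity by construction; the measured fidelities quoted in the text reflect decoherence and gate errors that are not modeled here.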

3. Quantum simulation and algorithm demonstrations using CPHASE

As an example of the utility of the CPHASE gate, we briefly discuss a particular demonstration of a digital quantum simulation. In this context, the CPHASE gate has been utilized to simulate a two-site Hubbard model with four fermionic modes, using four qubits.253 Using the Jordan-Wigner transformation,254,255 it is possible to map fermionic operators onto Pauli spin matrices.254 As shown in Ref. 253, a Hubbard model with two fermionic modes, whose Hamiltonian is given by


can be written in terms of Pauli operators as


where $U$ is the repulsion energy and $t$ is the hopping strength. Similar to the Heisenberg interaction discussed briefly in Sec. IV E, it is now a question of producing $\alpha\,\sigma_i \otimes \sigma_i$-type interactions, where the prefactor $\alpha$ can be tuned. Using the $CZ_\phi$ version of CPHASE, a $U_{ZZ}(\phi) = \exp\left(-i\frac{\phi}{2}\sigma_z \otimes \sigma_z\right)$ unitary can be generated via


where $A_\pi \in \{X_\pi, Y_\pi\}$ is used to allow for small and negative angles. Finally, for completeness, we mention an alternative approach to creating $U_{ZZ}$, given by42,256


which has the benefit of relying on CPHASE (through the CNOTs), and the angle can be controlled using the single-qubit Z gates. We refer the interested reader to two reviews on quantum simulations, see, e.g., Refs. 257 and 258.
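
One standard construction of this kind places a single-qubit Z rotation on the target qubit between two CNOTs. The sketch below checks numerically that this reproduces $\exp(-i\frac{\phi}{2}\sigma_z \otimes \sigma_z)$; the rotation convention $Z_\phi = \exp(-i\phi\sigma_z/2)$, the sign of the exponent, and the choice of the first qubit as control are assumptions of this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def rz(phi):
    """Z rotation exp(-i*phi/2*sigma_z) (assumed convention)."""
    return np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])

phi = 0.42   # arbitrary test angle

# CNOT, then Z rotation on the target, then CNOT
U_built = CNOT @ np.kron(I2, rz(phi)) @ CNOT

# Target unitary: sigma_z x sigma_z is diagonal, so its exponential is elementwise
zz_diag = np.diag(np.kron(SZ, SZ))
U_zz = np.diag(np.exp(-1j * phi / 2 * zz_diag))
zz_ok = np.allclose(U_built, U_zz)
```

Since the Z rotation can be implemented virtually, the cost of this construction is dominated by the two CNOTs.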

The CPHASE gate has also been used in a variety of other contexts, e.g., for calculating the dissociation of diatomic hydrogen ($\mathrm{H}_2$) using the variational quantum eigensolver method,259 for feed-forward based teleportation experiments,260,261 as well as initial steps toward demonstrating quantum supremacy262 and a 2 × 2 implementation of the Harrow-Hassidim-Lloyd algorithm.263,264 In the field of hybrid semiconducting nanowire/superconducting qubits (known as the “gatemon” approach265–267), where the qubit frequency is modified by electrostatically changing the density of carriers in a semiconducting region with proximity-induced superconductivity, the CPHASE gate was also demonstrated between two nanowire qubits.268

One may worry that, in a system with multiple qubits, operating a qubit by moving its frequency can lead to overlap with frequencies already used by other qubits. This issue is known as “frequency crowding.” While the use of asymmetric transmons [with two sweet spots in the range $[-\Phi_0, +\Phi_0]$; recall Fig. 2(c)] may help alleviate some frequency crowding issues, a more long-term strategy is needed. One way to circumvent the problem is to utilize on/off tunable coupling schemes, in which qubits can exchange energy only if a coupler activates the interaction.63,103 To address this issue in the context of the CPHASE gate, Chen et al.103 demonstrated a device (named “the gmon”) where the qubit interaction can be tuned with an on/off ratio on the order of 1000, and a CPHASE gate fidelity of $F = 0.9907$ was demonstrated.

This concludes the introduction to the physics and operation of the CPHASE gate in its native form. In the remainder of this section, we will introduce a few of the microwave-only gates that have been demonstrated in an effort to sidestep the need for local tunability (and the resulting increased sensitivity to noise) as required by the iSWAP and CPHASE gates.

One common (potential) drawback of the iSWAP and CPHASE gates is that their operation requires flux-tunable qubits. Introducing a new control knob, such as flux control, also introduces a new noise channel for the system. Furthermore, flux tunability increases the sensitivity of the devices to flux noise: tuning the qubits away from their “sweet spots” increases the dephasing rate. From this perspective, one could envision using all-microwave-based gates to remedy these issues. To this end, the cross-resonance (“CR”) gate was developed for operating fixed-frequency superconducting qubits,269–271 which typically feature longer lifetimes and reduced sensitivity to flux noise.

1. The operational principle of the CR gate

To elucidate the operation of the CR gate, we briefly revisit the driving Hamiltonian derived in Sec. IV D. There, we considered only a single qubit. However, if one extends this formalism to two qubits [see Fig. 17(a)], denoting the frequency difference by $\Delta_{12} = \omega_{q1} - \omega_{q2}$ and assuming a coupling $g \ll \Delta_{12}$, and performs a Schrieffer-Wolff transformation to go to the dressed-state picture, the driving Hamiltonians for qubits 1 and 2 become270,272




and $\Omega V_{di}(t)$ is the driving for qubit $i$. From Eq. (130), it is evident that if we drive qubit 1 at the frequency of qubit 2, then to qubit 2, this will look like a combination of $\nu_1\,\mathbb{1}\otimes\sigma_x$ and $\mu_1\,\sigma_z\otimes\sigma_x$. This means that the Rabi oscillations of qubit 2 will have a frequency given by


where $z_1 = \langle\sigma_z\otimes\mathbb{1}\rangle$, and $z_1$ depends on the state of qubit 1. This effect is demonstrated in Fig. 17(c), where a simulated drive is applied to qubit 1 while the resulting Rabi oscillations in qubit 2 are recorded. We have used typical fixed-frequency transmon parameters from experiments, and we have included a spurious cross-talk term $\eta = 0.03$.239,273 In Fig. 17(d), we plot the difference in angle in the (z, y) plane acquired by qubit 2 for different initializations of qubit 1, $\Delta\phi = \phi^{zy}_{|00\rangle} - \phi^{zy}_{|10\rangle}$. For this particular choice of parameters, the cross-resonance gate achieves a $\pi$-phase shift in ≈200 ns.

This strategy was first demonstrated using flux-tunable transmons in Ref. 274, where a Bell state with fidelity $F_{\mathrm{Bell}} = \langle\Phi^+|\rho|\Phi^+\rangle = 0.90$ was achieved. Using quantum process tomography, the gate fidelity was found to be $F_{\mathrm{QPT}} = 0.81$. By moving to fixed-frequency qubits with increased lifetimes, the gate fidelity was increased to $F_{\mathrm{QPT}} = 0.98$ (with subtraction of state initialization and measurement errors).273 For completeness, we note that due to the form of the last term in Eq. (130), the CR gate is also sometimes denoted the $ZX_\theta$ gate. The unitary matrix representation of the $\mathrm{CR}_\theta$ gate is


where $\theta = \mu_1\Omega V_{d1}(t)$, which can be used to generate a CNOT with the addition of only single-qubit gates


up to a phase $e^{i\pi/4}$.
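The statement that a CR-type rotation plus single-qubit gates yields a CNOT can be verified with a few lines of linear algebra. The following sketch is a hedged illustration rather than the decomposition used in the cited works: it assumes the convention $\mathrm{CR}_\theta = \exp(-i\frac{\theta}{2}\sigma_z\otimes\sigma_x)$, and the particular choice of rotation angles and signs is an assumption that may differ from the source by equivalent single-qubit conventions. The helper names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_rotation(pauli, theta):
    """exp(-i * theta/2 * P), valid for any operator P with P @ P = identity."""
    dim = pauli.shape[0]
    return np.cos(theta / 2) * np.eye(dim) - 1j * np.sin(theta / 2) * pauli

def cr(theta):
    """Assumed CR unitary: exp(-i * theta/2 * sigma_z (x) sigma_x)."""
    return pauli_rotation(np.kron(Z, X), theta)

# CNOT with the first qubit as control (basis order |00>, |01>, |10>, |11>).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Z rotation on the control and X rotation on the target, both by pi/2.
local_gates = np.kron(pauli_rotation(Z, np.pi / 2), pauli_rotation(X, np.pi / 2))

# Two-qubit ZX rotation by -pi/2, single-qubit gates, and a global phase.
candidate = np.exp(1j * np.pi / 4) * local_gates @ cr(-np.pi / 2)

print(np.allclose(candidate, CNOT))
```

The identity follows from writing the CNOT as $\exp[i\frac{\pi}{4}(\mathbb{1}-\sigma_z)\otimes(\mathbb{1}-\sigma_x)]$; expanding the product gives one two-qubit $\sigma_z\otimes\sigma_x$ rotation, two commuting single-qubit rotations, and the global phase $e^{i\pi/4}$.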

2. Improvements to the CR gate and quantum error correction experiments using CR

Since qubit 1 is being driven off-resonance, an ac-Stark shift will add a term $\sigma_z\otimes\mathbb{1}$ to the driving Hamiltonian of qubit 1. The effect of both the spurious ac-Stark shift and the direct $\nu_1\,\mathbb{1}\otimes\sigma_x$ single-qubit rotations was studied in Ref. 239. By modifying the original CR protocol to effectively “echo away” the two unwanted contributions from the $\sigma_z\otimes\mathbb{1}$ and $\mathbb{1}\otimes\sigma_x$ terms, the fidelity of the CR gate was improved to $F_{\mathrm{CR}} = 0.8799$,239 using quantum process tomography. Using interleaved randomized benchmarking of this improved “echo-CR” gate ($\mathrm{eCR}_{\pi/2}$), a gate fidelity of $F_{\mathrm{eCR}_{\pi/2}} = 0.9347$ was achieved. This gate implementation was used to demonstrate two-qubit parity measurements in a three-qubit device,275 as well as to detect bit-flip and phase-flip errors in a Bell state encoded in a four-qubit device,276 with gate fidelities from interleaved randomized benchmarking in the range 0.94 to 0.96. Using a similar device, but with five qubits, weight-four parity measurements of the forms ZZZZ and XXXX were demonstrated,277 where the crosstalk to qubits not involved in the CR gates was studied, leading to the development of a four-pulse $\mathrm{eCR}_{4\mathrm{pulse}}$ scheme.

Based on improvements in the analysis of the Hamiltonian describing the CR drive, Sheldon et al.197 subsequently demonstrated a version of the CR gate which reduced the gate time to τ = 160 ns and added an active cancelation tone to the eCR sequence previously developed. Using this “active cancelation echo CR” (aceCR), the fidelity was increased to $F_{\mathrm{aceCR}_{\pi/2}} = 0.991$, measured with interleaved randomized benchmarking. The same sequence without active cancelation on the same qubits yielded $F_{\mathrm{eCR}_{\pi/2}} = 0.948$. The interested reader may consult the follow-up theoretical work278 for more details on the effective Hamiltonian models. Other approaches to fast, high-fidelity cross-resonance gates have also been proposed.279 This series of improvements to the original cross-resonance implementation has increased the gate fidelity to beyond the threshold for fault-tolerance in a surface code, with similar quality to the CPHASE gate. Although improvements should still be made, with the advent of the CR gate, superconducting qubit based quantum computing platforms now offer two entangling two-qubit gates that can be used for implementing surface-code based error correction schemes.

In the initial experiments using CR gates, the gate times were significantly longer than the typical CPHASE gate times ($\tau_{\mathrm{CPHASE}} = 30$–$60$ ns and $\tau_{\mathrm{CR}} = 300$–$400$ ns), which to a large extent accounts for the observed CR gate fidelities. The time scale for CR operation is set by the frequency detuning, the anharmonicity, and the coupling strength, through Eq. (132). This has the unfortunate drawback that if qubits do not have the intended frequencies (due to fabrication variation), the deviation is immediately manifested as longer gate times and, in turn, reduced gate fidelity. As fabrication techniques become more sophisticated and reliable, this problem may be of reduced importance. However, since the coupling in the CR scheme is always on, there is an inherent tension between keeping qubits well isolated for high-fidelity single-qubit operations and coupling them strongly for fast, high-fidelity two-qubit gates.

3. Quantum simulation and algorithm demonstrations with the CR gate

Since the form of the CR Hamiltonian ($\sigma_z\sigma_x$) is neither a ($\sigma_x\sigma_x + \sigma_y\sigma_y$)-type interaction (leading to the iSWAP gate) nor an effective ($\sigma_z\sigma_z$)-type interaction (leading to the CPHASE gate), one could question its applicability to quantum-simulation-type experiments, which often involve terms of the form $\sigma_i\sigma_i$. However, by developing a variational quantum eigensolver routine that efficiently generates entangled trial states using just the CR interaction, Kandala et al.280 calculated the ground-state energy for H$_2$, LiH, and BeH$_2$. This experiment was performed on six fixed-frequency qubits, and it employed a technique for compact encoding of the Hamiltonians corresponding to each molecule.281 As of this writing, this experiment represents the largest molecule for which the ground state has been found using a purely quantum processing approach.

The CR gate is also the native two-qubit gate available on the IBM Quantum Experience quantum processor,282 which is accessible online. Using the IBM Quantum Experience processor, Takita et al.283 demonstrated an implementation of a two-logical-qubit (four physical qubit) error detection code.284 The implementation was inspired by the proposal of Gottesman,285 which proposed a minimal experiment to claim observation of fault-tolerant encodings,248 using a four qubit error detection code in a five qubit setup. Due to constraints on the connectivity, the work by Takita et al. demonstrated a modified version of the Gottesman encoding, in which two logical qubits are initialized, but only one of them in a fault-tolerant manner. By artificially injecting an error in the state preparation circuit, the authors demonstrate that the probability of correctly preparing a fault tolerant state is greater than the probability of correctly preparing a non-fault-tolerant qubit. This behavior is consistent with expectations for how fault-tolerant encodings work. Simultaneously, Vuillot286 also used the IBM Quantum Experience machine to study fault-tolerant schemes encoded in that connectivity.

Beyond the applications to error-correction and error-detection, the cross-resonance gate has also been employed in early demonstrations of quantum advantages in machine learning. Risté et al.287 studied the so-called “learning parity with noise” problem, in which one attempts to learn a bit-string $k$ by querying an oracle function $f(D,k) = D\cdot k \bmod 2$ with a user-input bit-string $D$. In a first implementation of this problem, the authors show that for a specific instance of the bit-string $k = 11$, a learner with access to quantum operations needs fewer queries to the function $f$. However, by extending the model of learning parity with noise, the authors demonstrated a consistent advantage of the learner with access to quantum operations.287
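To make the oracle function concrete, the following sketch implements $f(D,k) = D\cdot k \bmod 2$ and a simple classical learner. It is only a noiseless baseline under stated assumptions: the experiment discussed above includes noise and a quantum learner, neither of which is modeled here, and the function names are hypothetical.

```python
import numpy as np

def parity_oracle(query, hidden):
    """f(D, k) = D . k mod 2 for bit-strings given as arrays of 0/1 integers."""
    return int(np.dot(query, hidden) % 2)

def learn_classically(hidden):
    """Recover k with one noiseless query per bit, using unit-vector queries."""
    n_bits = len(hidden)
    learned = np.zeros(n_bits, dtype=int)
    for i in range(n_bits):
        query = np.zeros(n_bits, dtype=int)
        query[i] = 1          # probe only bit i of the hidden string
        learned[i] = parity_oracle(query, hidden)
    return learned

k = np.array([1, 1])          # the instance k = 11 mentioned in the text
print(parity_oracle(np.array([1, 1]), k))
print(learn_classically(k))
```

In this noiseless setting a classical learner needs one query per bit; with noisy oracle outputs each query must be repeated, which is where the query counts of classical and quantum learners can start to differ.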

The CR gate was also used to demonstrate the implementation of a supervised learning algorithm where the feature space is encoded as quantum data on the Bloch sphere.256 In typical supervised learning, an algorithm is exposed to a training set of labeled data, and is subsequently asked to classify a new, unlabeled set of data.288 In the support vector machine (SVM) approach to such problems, the data is then mapped nonlinearly onto the so-called “feature space,” in which the trained algorithm has constructed a separating hyperplane to classify the data. While a full “quantum Support Vector Machine” proposal exists, the algorithm assumes that the data are already present in a coherent superposition.289 Instead, Havlicek et al.256 proposed, and demonstrated, that mapping the classical data nonlinearly onto the Bloch sphere can also be utilized to provide a quantum advantage. For a wider discussion of the important role of quantum data in many quantum machine learning algorithms, the reader is referred to Ref. 290.

4. Other microwave-only gates: bSWAP, MAP, and RIP

The CR gate (as outlined above) is not the only all-microwave two-qubit gate available. In particular, the bSWAP gate291 is an interesting alternative. The bSWAP gate directly drives the $|00\rangle \leftrightarrow |11\rangle$ transition, made possible by interactions with the higher levels of the qubit, see Fig. 18. Usually, the matrix element for such a transition is small (third order in the coupling strength), but if the detuning between the qubits is equal to the anharmonicity, the transition rate is enhanced. Applying a sequence of Schrieffer-Wolff transformations to the coupled-qubit system, and using a carefully chosen drive frequency (close to the midpoint of $\omega_{q1}$ and $\omega_{q2}$), it can be shown272 that the drive gives rise to a unitary operator




The two unitaries $U_{ZZ}$ and $U_{IZZI}$ only contain terms that commute with $U_{\mathrm{bSWAP}}(\theta,\phi)$, and their effects can be offset in postprocessing.272 In Eq. (138), $\phi$ is the phase of the drive relative to the single-qubit drive pulses, and $\theta = \Omega_B t$ with


where $\Omega$ is the amplitude of the drive, $\gamma$ is a dimensionless parameter quantifying the coupling coefficient of the drive to qubit 2 in units of coupling strength to qubit 1, and $\alpha_\Sigma = \alpha_1 + \alpha_2$. Explicit derivations leading to Eq. (137) can be found in the supplement of Ref. 291. By applying $U_{\mathrm{bSWAP}}$ for a time that yields $\theta = \pi/2$, and with $\phi = 0$, the resulting gate is denoted bSWAP and can act as the entangling gate that (together with single-qubit gates) forms a universal gate set. Moreover, the power of the bSWAP becomes apparent when one applies it for the time that yields $\theta = \pi/4$, which from the ground state $|00\rangle$ directly produces the entangled Bell state $|00\rangle + e^{i\phi}|11\rangle$. In line with the definition of $\sqrt{\mathrm{iSWAP}}$, this gate is denoted the $\sqrt{\mathrm{bSWAP}}$. In the work by Poletto et al.,291 the fidelity of the bSWAP gate was $F_{\mathrm{bSWAP}} = 0.9$ (determined from quantum process tomography). The main source of error was the increased dephasing during the relatively long high-power pulse needed to drive the $|00\rangle \leftrightarrow |11\rangle$ transition. The bSWAP gate can be viewed as the superconducting qubit analog of the Mølmer–Sørensen gate.292 In Fig. 18, we outline the level diagram of two coupled qubits, along with the higher levels of the qubits. The arrows indicate which coupled states are utilized to implement the corresponding gate. As an application of the bSWAP gate, Colless et al.293 used this gate to calculate energies of the excited states of an H$_2$ molecule using a two-qubit transmon processor.

Another all-microwave gate is the so-called “microwave-activated CPHASE” (or “MAP” for short).70 The MAP gate is similar in spirit to the CPHASE gate, in that noncomputational states are used to impart a conditional phase inside the computational subspace. In contrast to CPHASE, the MAP gate is implemented without tuning individual qubit frequencies. Rather, the canonical implementation of this gate comprises two fixed-frequency qubits, where the frequencies are carefully designed (and fabricated) such that the $|12\rangle$ and $|03\rangle$ levels are resonant. This leads to a splitting of the otherwise degenerate $|02\rangle \leftrightarrow |01\rangle$ and $|12\rangle \leftrightarrow |11\rangle$ transitions. By driving near resonance with the $|n2\rangle \leftrightarrow |n1\rangle$ transition, an effective $\sigma_z\sigma_z$ interaction is generated. In a setup comprising two fixed-frequency qubits, the MAP gate was used to implement the unitary


with a gate fidelity $F_{\mathrm{MAP}} = 0.87$ (determined via quantum process tomography) in a time $\tau_{\mathrm{MAP}} = 514$ ns.70 As the number of qubits in a system increases, one drawback of this gate is the need for a precise matching of higher energy levels across multiple qubits, while simultaneously avoiding spurious couplings to other modes in the system.

The CR, bSWAP and MAP gates all have quite stringent requirements on the spectral landscape of the qubits in order to obtain fast, efficient gate operation. To address this issue, another all-microwave gate was developed, the so-called “resonator induced phase gate” (“RIP”).294,295 The RIP gate operates by coupling two fixed-frequency qubits to a bus cavity, from which they are far detuned. By adiabatically applying and removing an off-resonant pulse to the cavity, the system undergoes a closed loop in phase space, after which the cavity is left unchanged, but the qubits acquire a state-dependent phase. By a careful choice of the amplitude and detuning of the pulse, and taking into account the dispersive shift of the cavity, a CPHASE gate can be implemented on the two qubits. This effect was experimentally demonstrated by Paik et al.296 in a 3D transmon system,55 where four qubits are coupled to the same bus. In this setup, the RIP gate operation results in unitaries with weight on all four qubits simultaneously. In order to isolate just the desired two-qubit coupling terms, Paik et al. developed a “refocused” RIP (rRIP) gate that implements


where the coupling rate (for an unmodulated drive) scales as


where $\bar{n}$ denotes the average number of photons in the bus, $\chi$ is the dispersive shift, and $\Delta_{cd}$ is the detuning of the drive (d) from the cavity (c). By choosing $\dot{\theta}t = \pi/4$, it is possible to implement the CPHASE gate. The power of the RIP gate lies in its capability to accommodate large differences in qubit frequencies. To demonstrate this, Paik et al.296 performed two-qubit randomized benchmarking between pairs of qubits in a four-qubit device with frequency differences spanning from 0.38 GHz to 1.8 GHz, all with fidelities in the range 0.96–0.98 and gate times in the range of 285 to 760 ns.

Finally, we briefly review tunable coupling architectures, which have recently emerged as a promising alternative. The idea is to engineer an effective qubit-qubit coupling $\tilde{g}$ that is tunable (typically by applying a flux); gates based on this approach are referred to as parametric gates. This can be implemented in two different ways: (i) the coupling strength between two qubits is tuned by a flux, $g \to g(\Phi(t))$,193,202,297–299 or (ii) the resonant frequency of the coupling element is modified, $\omega_{\mathrm{coupler}} \to \omega_{\mathrm{coupler}}(\Phi(t))$,89,106,300–304 with a fixed $g$, leading to an effective time-dependent coupling parameter. When the tunable coupling element is driven at frequencies corresponding to the detuning of the qubits from the coupler, an entangling gate can be implemented.

In a setup of type (ii), an implementation of the iSWAP gate was demonstrated by parametrically driving a flux-tunable coupler between two fixed-frequency qubits,63 yielding a fidelity $F_{\mathrm{iSWAP}} = 0.9823$ (using interleaved randomized benchmarking) in a time τ = 183 ns. Similarly, the bSWAP (and iSWAP) gates were recently demonstrated using a flux-tunable transmon connecting two fixed-frequency transmons. Driving the flux through the tunable qubit at the sum frequency of the fixed-frequency transmons results in the bSWAP gate.104 This parametrically driven approach is generally significantly faster than implementations relying solely on fixed-frequency qubits.

A hybrid approach, in which a combination of tunable and fixed-frequency qubits is used, was recently demonstrated for both iSWAP and CPHASE gates.67,105,206 This scheme has no added tunable qubits (or resonators) acting as the coupling element, but rather relies solely on an always-on capacitive coupling between the qubits, and the effective coupling is roughly half that of the always-on coupling. The operational principle here is to modulate the frequency of the tunable qubit (using local flux control) at the transition frequency corresponding to $|01\rangle \leftrightarrow |10\rangle$ for iSWAP and $|11\rangle \leftrightarrow |02\rangle$ for CPHASE. Using interleaved randomized benchmarking, the authors demonstrated $F_{\mathrm{iSWAP}} = 0.94$ (τ = 150 ns), $F_{\mathrm{CPHASE}_{02}} = 0.93$ (τ = 210 ns), and $F_{\mathrm{CPHASE}_{20}} = 0.88$ (τ = 290 ns), showing a slight asymmetry in the direction in which the CPHASE is applied. This hybrid technique was used in Ref. 67 to demonstrate a four-qubit GHZ state with fidelity $F_{\text{4-qubit GHZ}} = 0.79$ (using state tomography). Finally, this gate architecture was used to demonstrate a hybrid quantum/classical implementation of an unsupervised learning task (determining clustering of data), using nineteen qubits and supplemented by a classical computer as part of the minimization loop.305

The ability to perform fast and reliable (high fidelity) readout of the qubit states is an important cornerstone of any quantum processor.3 

In this section, we give a brief introduction to how readout is performed on superconducting qubits. We start by reviewing the fundamental theory behind “dispersive readout”—the most common readout technique used today in the circuit QED architecture—in which each qubit is coupled to a readout resonator. In the dispersive regime, i.e., when the qubit is detuned from the resonator frequency, the qubit induces a state-dependent frequency shift of the resonator from which the qubit state can be inferred by interrogating the resonator.

A dispersive readout allows us to map the quantum degree of freedom of the qubit onto the classical response of the linear resonator, thus transforming the readout optimization process into obtaining the best signal-to-noise ratio (SNR) of the microwave signal used to probe the resonator.

We then provide guidance on how to optimize system parameters to perform high-fidelity, single-shot readout. After choosing parameters, such as resonant frequencies and coupling rates, we address the filter and amplifier circuitry positioned in-between the qubit plane and the data acquisition hardware outside of the dilution refrigerator. On this note, we review the basic principles of Purcell filters as well as parametric amplifiers, both of which are necessary to obtain a fast, high-fidelity readout in scaled-up quantum processors.

A quantum measurement can be described as an entanglement of the qubit degree of freedom with a “pointer variable” of a measurement probe with a quantum Hamiltonian,306 followed by classical measurement of the probe. In circuit QED, the qubit (the quantum system) is entangled with an observable of a superconducting resonator (the probe), see Fig. 19(a), allowing us to gain information about the qubit state by interrogating the resonator—rather than directly interacting with the qubit. Therefore, the optimization of the readout performance is translated to maximizing the signal-to-noise ratio of a microwave probe tone sent to the resonator, while minimizing the unwanted “back-action” on the qubit.

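The qubit-resonator interaction is described by the Jaynes–Cummings Hamiltonian,307–309 previously introduced in Sec. II,

$$H_{\mathrm{JC}} = \omega_r\left(a^\dagger a + \frac{1}{2}\right) + \frac{\omega_q}{2}\sigma_z + g\left(\sigma_+ a + \sigma_- a^\dagger\right), \tag{143}$$
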
where $\omega_r$ and $\omega_q$ denote the resonator and qubit frequencies, respectively, and $g$ is the transverse qubit-resonator coupling rate. The operators $\sigma_+$ and $\sigma_-$ represent the processes of exciting and de-exciting the qubit, respectively.

In the limit when the detuning between the qubit and the resonator is small compared with their coupling rate, i.e., $\Delta = |\omega_q - \omega_r| \ll g$, the energy levels of the two systems hybridize and a vacuum Rabi splitting of frequency $\sqrt{n}\,g/\pi$ opens up, where $n = 1, 2, 3, \ldots$ denotes the resonator mode. In this regime, excitations are coherently swapped between the two systems. Although useful for certain two-qubit gate operations (recall Sec. IV E), such transverse interactions change the qubit state (since energy is directly exchanged between the resonator and the qubit) and are therefore not desired in the context of “quantum nondemolition” (QND) readout, in which the outcome of the quantum measurement is not altered in the act of reading out the system.

In the dispersive limit, i.e., when the qubit is far detuned from the resonator compared with their coupling rate $g$ and the resonator linewidth $\kappa$, $\Delta \gg g, \kappa$, there is no longer a direct exchange of energy between the two systems. Instead, the qubit and resonator push each other's frequencies. To see this, the Hamiltonian can be approximated using second-order perturbation theory208,310 in terms of $g/\Delta$, taken in the limit of few photons in the resonator. This is known as the “dispersive approximation,” after which the Hamiltonian takes the form

$$H_{\mathrm{disp}} = \omega_r\left(a^\dagger a + \frac{1}{2}\right) + \chi\,\sigma_z\,a^\dagger a + \frac{\tilde{\omega}_q}{2}\sigma_z, \tag{144}$$

where $\chi = g^2/\Delta$ is the qubit-state-dependent frequency shift, the so-called “dispersive shift,” see Fig. 19(b), allowing us to distinguish the two qubit states. This is an asymptotically longitudinal interaction, yielding a QND measurement. Note that, in addition, the qubit frequency also picks up a “Lamb shift,” $\tilde{\omega}_q = \omega_q + g^2/\Delta$, induced by the vacuum fluctuations in the resonator. Also note that the dispersive Hamiltonian in Eq. (144) is derived for a two-level atom.430 Taking the second excited state into account and introducing the anharmonicity $\alpha = \omega_q^{12} - \omega_q^{01}$ modifies the expression for the dispersive shift,


which for a transmon qubit with $\alpha < 0$ implies that the dispersive shift will depend on the detuning. This effect is plotted in Figs. 20(a) and 20(b), where the second energy level manifests itself as a second vertical asymptote at $\Delta/2\pi = E_C/h$. It is also worth noting that for qubit modalities with positive anharmonicity, e.g., flux qubits, the dispersive shift will also change sign.62

In the small photon-number limit, the interaction term of the Hamiltonian in Eq. (144) commutes with the qubit observable $\sigma_z$,431 resulting in a QND measurement.306 This is an important condition for many applications in quantum information processing. However, it has been demonstrated that it is still possible to read out the qubit state by applying a very strong resonator drive tone, even though this readout scheme is not QND.317

In the case when the resonator photon number $n = \langle a^\dagger a\rangle$ exceeds a “critical photon number” $n_c = \Delta^2/(4g^2)$, the dispersive Hamiltonian in Eq. (144) is no longer a valid approximation.208,311,312 Therefore, the critical photon number sets an upper bound for the power level of the resonator probe signal to maintain (an approximate) QND measurement.432 This limitation could be lifted by implementing a pure (and not only approximate) QND readout using a manifestly longitudinal coupling between a qubit and the resonator. Several groups are currently pursuing the implementation of “longitudinal readout,” in which QND readout could be performed even with a larger number of resonator photons, thus improving the SNR.107,314,315
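The dispersive shift, Lamb shift, and critical photon number can be checked against the exact Jaynes–Cummings spectrum, which is known in closed form. The sketch below is a rough illustration under stated assumptions: the parameter values are hypothetical, frequencies are given as $\omega/2\pi$ in GHz, the zero-point energy of the resonator is dropped, and the convention is assumed in which the qubit ground state pulls the resonator to $\omega_r - \chi$ and the excited state to $\omega_r + \chi$.

```python
import numpy as np

def energy_qubit_ground(n, w_r, w_q, g):
    """Exact JC energy of the dressed state connected to |n photons, qubit ground>."""
    delta = w_q - w_r
    return (n - 0.5) * w_r - 0.5 * np.sign(delta) * np.sqrt(delta**2 + 4 * g**2 * n)

def energy_qubit_excited(n, w_r, w_q, g):
    """Exact JC energy of the dressed state connected to |n photons, qubit excited>."""
    delta = w_q - w_r
    return (n + 0.5) * w_r + 0.5 * np.sign(delta) * np.sqrt(delta**2 + 4 * g**2 * (n + 1))

# Hypothetical parameters (frequencies omega / 2 pi, in GHz).
w_r, w_q, g = 7.0, 5.0, 0.1
delta = w_q - w_r

chi = g**2 / delta                 # dispersive shift (two-level approximation)
n_crit = delta**2 / (4 * g**2)     # critical photon number

# Resonator frequency for each qubit state, and the dressed qubit frequency.
res_freq_ground = energy_qubit_ground(1, w_r, w_q, g) - energy_qubit_ground(0, w_r, w_q, g)
res_freq_excited = energy_qubit_excited(1, w_r, w_q, g) - energy_qubit_excited(0, w_r, w_q, g)
qubit_freq = energy_qubit_excited(0, w_r, w_q, g) - energy_qubit_ground(0, w_r, w_q, g)

print("chi =", chi, " n_crit =", n_crit)
print("resonator, qubit ground :", res_freq_ground, " dispersive estimate:", w_r - chi)
print("resonator, qubit excited:", res_freq_excited, " dispersive estimate:", w_r + chi)
print("qubit (Lamb shifted)    :", qubit_freq, " dispersive estimate:", w_q + chi)
```

For these values $g/|\Delta| = 0.05$, so the exact and dispersive results agree to within a few tens of kHz; the residual difference grows with photon number, which is the breakdown that the critical photon number quantifies.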

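We can also interpret the dispersive qubit-resonator interaction in another way; by rearranging the terms in Eq. (144), we can equivalently write

$$H_{\mathrm{disp}} = \omega_r\left(a^\dagger a + \frac{1}{2}\right) + \frac{1}{2}\left(\omega_q + \frac{g^2}{\Delta} + \frac{2g^2}{\Delta}\,a^\dagger a\right)\sigma_z, \tag{146}$$
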
where the bare qubit frequency is shifted by a fixed amount $g^2/\Delta$, known as the Lamb shift,433 as well as by an amount proportional to the number of photons populating the resonator.52,208 This effect is known as the “ac-Stark shift.” It has the consequence that photon number fluctuations (noise) in the resonator induce small shifts of the qubit frequency, slightly bringing the qubit out of its rotating frame and thus causing dephasing.142 This means that spurious photon occupation and fluctuation in the resonator, be it from thermal or coherent photons, shift the qubit frequency and cause dephasing.311,316 For this reason, it is important to make sure that the processor is properly thermalized106 and its control lines well filtered317 and attenuated143 to reduce photon number fluctuation.

In Sec. V A, we outlined the underlying physics behind the dispersive readout technique, in which we concluded that the qubit induces a state-dependent frequency shift of the resonator. We now focus our attention on how to probe the resonator to “read out the qubit,” that is, best distinguish the two classical resonator signatures corresponding to our qubit states, see Figs. 19(b) and 19(c).

The readout circuit can be set up to measure either reflection or transmission. The best state discrimination is obtained by maximizing the separation between the two states in the (I, Q)-plane, i.e., the in-phase and quadrature components of the voltage, see Fig. 19(c). It can be shown that this separation is maximal when the resonator is probed just in-between the two qubit-state-dependent resonance frequencies,157 $\omega_{\mathrm{RF}} = (\omega_r^{|0\rangle} + \omega_r^{|1\rangle})/2$. In this case, the reflected magnitude is identical for $|0\rangle$ and $|1\rangle$, and all information about the qubit state is encoded in the phase $\theta$, see dashed line in Fig. 19(b). In turn, the qubit-resonator detuning should be designed to obey the criterion for maximal state visibility, $\chi = \kappa/2$, which maximizes the signal for phase measurements while constraining qubit dephasing.
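The criterion $\chi = \kappa/2$ can be illustrated with a simple steady-state model. The sketch below assumes an ideal, lossless single-port resonator measured in reflection, probed midway between the two qubit-state-dependent resonances (which sit at $\omega_r \pm \chi$), with a fixed probe amplitude. The reflection-coefficient formula is written in one common sign convention, and the function names are hypothetical; this is not a model of any specific device.

```python
import numpy as np

def reflection(detuning, kappa):
    """Reflection coefficient of an ideal lossless single-port resonator.

    detuning is the probe frequency minus the resonator frequency; the
    magnitude is always 1, so only the phase carries information.
    """
    return (kappa / 2 + 1j * detuning) / (kappa / 2 - 1j * detuning)

def iq_separation(chi, kappa):
    """Distance in the (I, Q) plane between the two qubit-state responses
    when the probe sits midway between the two resonances."""
    return np.abs(reflection(+chi, kappa) - reflection(-chi, kappa))

kappa = 1.0                                  # linewidth, arbitrary units
chi_values = np.linspace(0.05, 2.0, 391)     # scan chi in steps of 0.005
separations = iq_separation(chi_values, kappa)

best_chi = chi_values[np.argmax(separations)]
print("chi/kappa at maximum separation:", best_chi / kappa)
print("maximum separation:", separations.max())
```

In this model the two responses are $e^{\pm i\varphi}$ with $\varphi = 2\arctan(2\chi/\kappa)$, so their separation $2|\sin\varphi|$ peaks when the phase difference between the states is $\pi$, i.e., at $\chi = \kappa/2$.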

Once we have picked the resonator probe frequency, the quantum dynamics of the qubit can be mapped onto the phase of the classical microwave response. In the following, we discuss how we can use heterodyne detection to measure the phase of the resonator response. We assume that the reader is already somewhat familiar with basic mixer operations, such as modulation and demodulation of signals. For interested readers, we refer to Ref. 318.

1. Representation of the readout signal

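A readout event commences with a short microwave tone directed to the resonator at the resonator probe frequency $\omega_{\mathrm{RO}}$. After interacting with the resonator, the reflected (or transmitted) microwave signal has the form

$$s(t) = A_{\mathrm{RO}}\cos\left(\omega_{\mathrm{RO}}t + \theta_{\mathrm{RO}}\right), \tag{147}$$
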
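where $\omega_{\mathrm{RO}}$ is the “carrier frequency” used to probe the resonator. $A_{\mathrm{RO}}$ and $\theta_{\mathrm{RO}}$ are, respectively, the qubit-state-dependent amplitude and phase that we want to measure. One can equivalently use a “complex analytic representation” of the signal,

$$s(t) = \mathrm{Re}\left\{A_{\mathrm{RO}}\,e^{j(\omega_{\mathrm{RO}}t + \theta_{\mathrm{RO}})}\right\}, \tag{148}$$
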
where Re takes the real part of an expression, e.g., $\mathrm{Re}[\exp(jx)] = \mathrm{Re}(\cos x + j\sin x) = \cos x$.

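To gain intuition, we can rewrite Eq. (148) in a static “phasor” notation that separates out the time dependence $\exp(j\omega_{\mathrm{RO}}t)$,

$$s(t) = \mathrm{Re}\left\{A_{\mathrm{RO}}\,e^{j\theta_{\mathrm{RO}}}\,e^{j\omega_{\mathrm{RO}}t}\right\}, \tag{149}$$
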
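where the phasor $A_{\mathrm{RO}}e^{j\theta_{\mathrm{RO}}} \equiv A_{\mathrm{RO}}\angle\theta_{\mathrm{RO}}$ is a shorthand that fully specifies a harmonic signal $s(t)$ at a known frequency $\omega_{\mathrm{RO}}$. To perform qubit readout, we want to measure the “in-phase” component $I$ and the “quadrature” component $Q$ of the complex number represented by the phasor,

$$A_{\mathrm{RO}}\,e^{j\theta_{\mathrm{RO}}} = A_{\mathrm{RO}}\cos\theta_{\mathrm{RO}} + jA_{\mathrm{RO}}\sin\theta_{\mathrm{RO}} \equiv I + jQ, \tag{150}$$

to determine the amplitude $A_{\mathrm{RO}}$ and the phase $\theta_{\mathrm{RO}}$ (Fig. 20).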

2. I-Q mixing

One direct means to extract $I$ and $Q$ is to perform a “homodyne” or “heterodyne” measurement using an analog “I-Q mixer.” Figure 21 shows a basic electrical schematic of an I-Q mixer. The readout signal $s(t)$ and a reference local-oscillator signal $y(t) = A_{\mathrm{LO}}\cos\omega_{\mathrm{LO}}t$ are fed into the mixer via the RF and LO mixer ports. The mixer then equally splits the signal and local oscillator into two branches and multiplies them in the following way: in the I-branch, the signal $s_I(t) = s(t)/2$ is multiplied by the local oscillator $y_I(t) = (A_{\mathrm{LO}}/2)\cos\omega_{\mathrm{LO}}t$; and in the Q-branch, the signal $s_Q(t) = s(t)/2$ is multiplied by a $\pi/2$-phase-shifted version of the local oscillator, $y_Q(t) = -(A_{\mathrm{LO}}/2)\sin\omega_{\mathrm{LO}}t$. The “-” sign arises from the choice of using $A\cos(\omega t + \phi)$ as the reconstructed real signal. At the I and Q ports, the output signals $I(t)$ and $Q(t)$ contain terms at the sum and difference frequencies, generally referred to as “intermediate frequencies,” $\omega_{\mathrm{IF}} = \omega_{\mathrm{RO}} \pm \omega_{\mathrm{LO}}$. The resulting signals are low-pass filtered, passing only the terms at the difference frequency, $I_{\mathrm{IF}}(t)$ and $Q_{\mathrm{IF}}(t)$, which are then digitized. After digital signal processing, one obtains the static in-phase ($I$) and quadrature ($Q$) components, from which one calculates the amplitude $A_{\mathrm{RO}}$ and the phase $\theta_{\mathrm{RO}}$.

Microwave mixers use square-law-type diodes to implement multiplication. The optical analog of a mixer operation is a combination of a balanced (50–50) beamsplitter followed by optical photodetectors, as shown in the inset of Fig. 22(a). The signal and local-oscillator optical fields are first combined at the beamsplitter, yielding superpositions of both fields, and then detected at the photodetectors, which act as square-law devices. To build intuition for how this works, note that the square of the sum of two electric fields, $(E_1 + E_2)^2 = E_1^2 + E_2^2 + 2E_1E_2$, has a cross term that is the product of the two fields. We refer the reader to Ref. 319 for further details.
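The square-law picture can be made concrete by squaring the sum of two tones and inspecting the spectrum. The following sketch uses hypothetical tone frequencies (given in cycles per sampling window) and an idealized square-law response; it is only meant to show where the cross term $2E_1E_2$ places its frequency components.

```python
import numpy as np

n_samples = 1000
t = np.arange(n_samples) / n_samples   # one window of unit length
f_signal, f_lo = 50, 40                # hypothetical tones, cycles per window

e1 = np.cos(2 * np.pi * f_signal * t)  # "signal" field
e2 = np.cos(2 * np.pi * f_lo * t)      # "local oscillator" field
detected = (e1 + e2) ** 2              # idealized square-law detector

# Normalized magnitude spectrum; bin index equals cycles per window.
spectrum = np.abs(np.fft.rfft(detected)) / n_samples
peaks = [int(k) for k in np.flatnonzero(spectrum > 0.1)]
print(peaks)
```

The nonzero bins are DC (0), the difference frequency (10), the sum frequency (90), and the second harmonics of each tone (80 and 100). The difference and sum terms come from the cross term, which is why a square-law device followed by a low-pass filter acts as a mixer.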

3. Homodyne demodulation

One direct means to extract $I$ and $Q$ is to perform a microwave homodyne measurement using an analog I-Q mixer of the type shown in Fig. 21. In an analog homodyne measurement, the local oscillator (LO) is chosen to be at the carrier frequency, $\omega_{\mathrm{LO}} = \omega_{\mathrm{RO}}$. Upon mixing, $I(t)$ and $Q(t)$ contain terms at both DC ($\omega_{\mathrm{IF}} = 0$) and twice the carrier frequency. Time-averaging (filtering) $I(t)$ and $Q(t)$ directly yields the DC terms $I_{\mathrm{IF}}(t) = I$ and $Q_{\mathrm{IF}}(t) = Q$:


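where $T$ is a time interval taken to be an integer number of periods of the readout signal. $I$ and $Q$ are then sampled and used to calculate the amplitude and phase,

$$A_{\mathrm{RO}} = \sqrt{I^2 + Q^2}, \qquad \theta_{\mathrm{RO}} = \arctan\left(\frac{Q}{I}\right).$$
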
Note that the global value of $A_{\mathrm{RO}}$ or $\theta_{\mathrm{RO}}$ is not what matters; what matters is the “change” in $A_{\mathrm{RO}}$ and $\theta_{\mathrm{RO}}$ between the qubit being in state $|0\rangle$ and state $|1\rangle$. For example, the value of $A$ leaving the resonator and the value $G \times A$ reaching a measurement stage are different, where $G$ represents the net gain in the measurement amplifier chain. However, the gain is the same, independent of the qubit state, whereas $A$ may be different, e.g., $A_{\mathrm{RO}}^{(0)} = G \times A_{|0\rangle}$ or $A_{\mathrm{RO}}^{(1)} = G \times A_{|1\rangle}$. Similarly, the propagation phase $\phi$ accumulated while a signal travels between the resonator and the measurement stage is also independent of the qubit state, and simply imparts a phase offset to the qubit-induced phase shift, e.g., $\theta_{\mathrm{RO}}^{(0)} = \theta_{|0\rangle} + \phi$ or $\theta_{\mathrm{RO}}^{(1)} = \theta_{|1\rangle} + \phi$.
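The homodyne procedure described above can be sketched numerically. The example below is a simplified illustration, not a model of a real mixer: the signal amplitude and phase are hypothetical, the splitting factors and LO amplitude are normalized away (hence the factor of 2), and the averaging window is chosen to contain an integer number of carrier periods.

```python
import numpy as np

n_samples = 1000
cycles = 10                              # integer number of carrier periods
t = np.arange(n_samples) / n_samples     # averaging window of unit length
omega = 2 * np.pi * cycles               # carrier = LO frequency (homodyne)

amp_true, phase_true = 0.7, 0.4          # hypothetical A_RO and theta_RO
s = amp_true * np.cos(omega * t + phase_true)

# Mix with the in-phase LO and the pi/2-shifted LO (note the minus sign),
# then time-average to keep only the DC terms.
i_dc = 2 * np.mean(s * np.cos(omega * t))
q_dc = 2 * np.mean(s * (-np.sin(omega * t)))

amp_est = np.hypot(i_dc, q_dc)           # sqrt(I^2 + Q^2)
phase_est = np.arctan2(q_dc, i_dc)       # quadrant-aware arctan(Q / I)
print(amp_est, phase_est)
```

The minus sign on the sine makes $Q = A_{\mathrm{RO}}\sin\theta_{\mathrm{RO}}$ for a signal written as $A\cos(\omega t + \theta)$. Using `arctan2` instead of a plain arctangent is a design choice made here to resolve the quadrant of the phase.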

Homodyning works in principle, but there are two drawbacks. First, signals directly demodulated to DC may be subject to lower signal-to-noise ratios, since they fight against 1/f electronics noise, as well as any other noise signals that may have inadvertently been demodulated (e.g., via a square-law detector). The second is that homodyning is not compatible with frequency division multiplexing (FDM), where a single pulse can be used to interrogate N resonators at different frequencies by applying tones at each resonator frequency using the superposition principle, e.g.,


Homodyning an FDM signal will put all resonator signals at DC, and once downconverted, they cannot be differentiated. To work around this, it is generally advantageous to use “heterodyning,” which uses a two-step demodulation process via an intermediate frequency ωIF. Such a scheme is readily compatible with FDM, because the readout signal is first demodulated to unique intermediate frequencies (IFs) ωIF(i), and then digitally demodulated to extract each ARO(i) and θ(i). In the following, we will consider N = 1 for simplicity, but the process is applicable to larger N provided the frequencies are sufficiently spaced to avoid interference with one another during the demodulation process.

4. Heterodyne demodulation

In a heterodyne scheme, a local oscillator at frequency ωLO is offset by an intermediate frequency ωIF to target a unique readout frequency ωRO. Up-conversion techniques such as single-sideband modulation with suppressed carrier (SSB-SC) using balanced I-Q mixers (operated in reverse compared with Fig. 21) are commonly used to create such readout signals. We refer the reader to Ref. 318 for more information on how to create such pulses.

Here, we want to extract ARO and θRO (or their scaled and offset versions) from the reflected/transmitted tone using a heterodyning scheme. The first step is to perform analog I-Q mixing, as illustrated in Fig. 22(a). In contrast to the homodyning case, here, the local oscillator and readout tone are at different frequencies, ωIF=|ωRO−ωLO|>0. Mixing the LO and RO signals yields the signals I(t) and Q(t) with terms at both sum and difference frequencies. Filtering out the sum frequencies using low-pass filtering (time averaging) yields the IF signals:


As before, we have omitted any offset phases from the LO or from the wave propagation between the resonator and the measurement. Again, these offset values are not what matters; it is the change in ARO and θRO with a change in qubit state that allows state discrimination.

The analog-demodulated IIF(t) and QIF(t) are now oscillating at a frequency that is generally low enough to be digitized using commonly available analog-to-digital converters (ADCs). The resulting digital signals are now written as IIF[n] and QIF[n]


where n=t/Δt indexes the sample number of the continuous-time signals IIF(t) and QIF(t), ΩIF=ωIFΔt is the digital frequency, and Δt is the sampling period (typically around 1 ns). Pulsing the resonator is necessarily accompanied by a ring-up time, related to the quality factor of the resonator, and the first few samples may decrease the overall signal-to-noise ratio. Consequently, a delayed window of samples [n1:n2] is often used to perform the second, digital demodulation of the discrete-time signals IIF[n1:n2] and QIF[n1:n2]. Note that more complicated windowing functions may also be used to improve state discrimination, but here we use a simple boxcar [see Fig. 22(b)].

Digital demodulation comprises the point-by-point multiplication of IIF[n1:n2] and QIF[n1:n2] by cosΩIFn and sinΩIFn. Averaging the resulting time series eliminates the 2ΩIF component while retaining the DC component. As in a homodyne measurement, one obtains


where M=n2−n1+1. As before, I and Q can then be used to find ARO and θRO.
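The boxcar-windowed digital demodulation can be sketched as follows. This is a simplified illustration rather than the exact procedure of the text: it demodulates a single IF channel, models the ring-up as a plain exponential envelope, and uses an assumed factor of 2 and sign convention. The function names, the 50 MHz IF, the 1 ns sampling period, and the 20-sample ring-up constant are all hypothetical.

```python
import math

def ring_up_trace(amplitude, theta, omega_if, n_samples, tau_samples):
    """I_IF[n] with a simple exponential ring-up envelope (assumed model)."""
    return [amplitude * (1.0 - math.exp(-n / tau_samples))
            * math.cos(omega_if * n + theta) for n in range(n_samples)]

def boxcar_demodulate(trace, omega_if, n1, n2):
    """Multiply by cos and sin at Omega_IF over [n1:n2], then average.

    The window should hold an integer number of IF periods so that the
    2*Omega_IF component averages to zero.
    """
    m = n2 - n1 + 1
    i_sum = sum(trace[n] * math.cos(omega_if * n) for n in range(n1, n2 + 1))
    q_sum = sum(trace[n] * math.sin(omega_if * n) for n in range(n1, n2 + 1))
    i, q = 2 * i_sum / m, -2 * q_sum / m   # factor 2 and sign are conventions
    return math.hypot(i, q), math.atan2(q, i)

omega_if = 2 * math.pi * 0.05       # 50 MHz IF sampled every 1 ns (assumed)
trace = ring_up_trace(1.0, 0.7, omega_if, n_samples=1000, tau_samples=20)
amp_full, phase_full = boxcar_demodulate(trace, omega_if, 0, 999)
amp_late, phase_late = boxcar_demodulate(trace, omega_if, 200, 999)
```

In this toy model, the window that starts after the ring-up recovers the steady-state amplitude and phase more accurately than the window that includes the transient.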

The same procedure may be viewed in the complex I-Q plane using the analytic function zIF[n], as illustrated in Figs. 22(c) and 22(d),


where the digital in-phase and quadrature signals are represented here as the voltages VI[n] and VQ[n] sampled by the ADC, and we have separated the static phasor (AROALO/8)exp[jθRO] from the rotating term exp[jΩIFn]. One can digitally demodulate the time series zIF[n] by multiplying by the complex conjugate of the oscillatory exponential,


where .* indicates a point-by-point multiplication, and the result is a vector of length M of nominally identical values of the phasor, one for each sample point, with a small amount of additive noise due to noise in the measurement chain, digitization errors, etc. A single phasor value is then estimated by taking the average,


Such “single-shot measurements” may then be repeated a large number of times to obtain an ensemble average z̄[n].
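The complex-plane version of the demodulation, including the single-shot average and the ensemble average over repeated shots, can be sketched as below. The Gaussian noise model, the function name, the seed, and all numerical values are assumptions made for illustration only.

```python
import cmath
import random

def single_shot_phasor(amplitude, theta, omega_if, n1, n2, noise_sigma, rng):
    """Demodulate one noisy record z_IF[n] and average it to a single phasor."""
    total = 0j
    for n in range(n1, n2 + 1):
        noise = complex(rng.gauss(0.0, noise_sigma), rng.gauss(0.0, noise_sigma))
        z_if = amplitude * cmath.exp(1j * (omega_if * n + theta)) + noise
        total += z_if * cmath.exp(-1j * omega_if * n)   # undo the IF rotation
    return total / (n2 - n1 + 1)

rng = random.Random(1)                      # fixed seed for reproducibility
omega_if = 2 * cmath.pi * 0.05              # assumed 50 MHz IF at 1 ns sampling
shots = [single_shot_phasor(1.0, 0.7, omega_if, 200, 999, 0.5, rng)
         for _ in range(100)]
z_mean = sum(shots) / len(shots)            # ensemble average over shots
```

Each entry of `shots` is one noisy estimate of the static phasor; averaging over shots reduces the scatter further, at the cost of no longer being a single-shot result.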

In quantum measurements, noise plays an essential role as it dictates the fidelity of the measurement outcome,124,320 recall Fig. 22(c). In the absence of noise, any nonzero dispersive shift (resulting in a resonator field displacement) would suffice to unambiguously separate the qubit states, given a properly chosen resonator linewidth. In practice, however, the outcome of the quantum measurement is generally Gaussian distributed in the (I, Q)-plane due to the presence of classical and quantum noise. In this section, we review the main sources of noise, as well as how it impedes our ability to extract information from the quantum system. For a rigorous discussion of noise and quantum measurements, the interested reader is referred, e.g., to the work by Clerk et al.124 and to the textbook by Haus.123 

The total noise added to the signal has multiple origins. One part of the noise is associated with the microwave signal used to probe the resonator, where each photon has an intrinsic quantum noise power of ħω/2 per unit bandwidth. Another contribution comes from the phase-preserving amplifiers, adding both classical noise and at least ħω/2 of noise as required by Heisenberg's uncertainty relation. Finally, any attenuation of the signal prior to the first amplifier will appear as added noise. Combined, these noise sources amount to a “system noise temperature,” which can be characterized using a sensitive thermometer, such as a shot-noise tunnel junction321 or a qubit322 as a sensor.

The noise results in time-dependent fluctuations of the measured signal, which in turn translate into uncertainty in our demodulated signals, see Fig. 22(c). This can be intuitively understood by considering that our heterodyne detection method requires us to sample for a finite amount of time.

To quantify the impact of the noise on our measurement, we first project the distributed (I, Q) data, corresponding to |0⟩ and |1⟩, onto the axis for which their relative separation in the complex plane is maximized.434 The line that is used to separate |0⟩ from |1⟩ is called a “separatrix.”

The noise can now be quantified by comparing the widths of the Gaussian probability distribution surrounding the mean with the peak separation in the (I, Q)-plane, thus defining a signal-to-noise ratio SNR=δθ/(Δθ1+Δθ0), see Fig. 23(a), with δθ=|θ1−θ0| representing the signal and Δθ0, Δθ1 representing the noise (2σ) of each distribution. The SNR allows us to distinguish between a weak and a strong quantum measurement, as illustrated in Figs. 23(b)–23(d).
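The projection and the SNR definition can be illustrated with synthetic data. The sketch below assumes isotropic Gaussian clouds in the (I, Q)-plane and projects onto the axis joining the two cloud means, using 2σ as the width of each distribution as in the text. The function names, the cloud centers, the width, and the seed are hypothetical.

```python
import random
import statistics

def projected_snr(shots0, shots1):
    """SNR = separation / (2*sigma0 + 2*sigma1) along the axis joining the means."""
    mean0 = sum(shots0) / len(shots0)
    mean1 = sum(shots1) / len(shots1)
    axis = (mean1 - mean0) / abs(mean1 - mean0)        # unit vector in the IQ plane
    proj0 = [((z - mean0) * axis.conjugate()).real for z in shots0]
    proj1 = [((z - mean0) * axis.conjugate()).real for z in shots1]
    signal = statistics.fmean(proj1) - statistics.fmean(proj0)
    noise = 2 * statistics.pstdev(proj0) + 2 * statistics.pstdev(proj1)
    return signal / noise

def gaussian_cloud(center, sigma, count, rng):
    """Synthetic single-shot phasors scattered around a center (assumed model)."""
    return [center + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            for _ in range(count)]

rng = random.Random(7)
cloud0 = gaussian_cloud(1.0 + 0.0j, 0.5, 5000, rng)    # hypothetical |0> shots
cloud1 = gaussian_cloud(1.0 + 4.0j, 0.5, 5000, rng)    # hypothetical |1> shots
snr = projected_snr(cloud0, cloud1)
```

With a separation of 4 and σ = 0.5 for both clouds, the expected value is 4/(1 + 1) = 2, which falls in the strong-measurement regime described next.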

In a weak measurement, the probabilities are broadly distributed as compared to their relative separation (SNR < 1), which means that only partial information of the quantum state is revealed to the observer, see Fig. 23(b). In a strong measurement, on the other hand, the quantum state is collapsed onto one of the two eigenstates. In this case, the outcome of the measurement can be distinguished unambiguously, which is reflected in two fully separated distributions (SNR > 1), see Fig. 23(d).

In many applications of quantum measurements, it is necessary to unambiguously (and with high fidelity) tell the outcome without repeating the readout measurement. This is known as “single-shot readout” and it often requires the use of a parametric amplifier—a preamplifier used to increase system SNR—which is further discussed in Sec. V E 3.

Assuming that the widths of the two distributions are identical, Δθ0=Δθ1=Δθ, the separation error can be calculated by deriving the weight of the overlapping region of the Gaussian distributions as157 


where erfc(x) denotes the complementary Gaussian error function, defined as

$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2}\,dt. \tag{170}$$

Using the erfc in Eq. (170), the separation error in Eq. (169) can be compactly expressed in terms of the signal-to-noise ratio


Note, however, that the separation error between the two state distributions only tells us the signal-to-noise ratio of our detection scheme. On top of the separation error, fidelity is reduced if the qubit relaxes (or is excited) during the readout. This will result in a count on the “wrong” side of the threshold. This leads to an additional constraint on the readout: the readout cycle needs to be completed on a time scale much shorter than the qubit relaxation time.
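The overlap of two equal-width Gaussians can be checked numerically. The sketch below is written in terms of the peak separation and the standard deviation σ, assuming a threshold at the midpoint; it is a generic Gaussian-overlap calculation and does not attempt to reproduce the prefactor conventions of Eqs. (169) and (171). Function names and numerical values are hypothetical.

```python
import math
import random

def separation_error(separation, sigma):
    """Misassignment probability for two equal Gaussians split at the midpoint."""
    return 0.5 * math.erfc(separation / (2 * math.sqrt(2) * sigma))

def monte_carlo_error(separation, sigma, shots, rng):
    """Empirical check: draw shots from each state and count wrong-side events."""
    threshold = separation / 2
    wrong = sum(rng.gauss(0.0, sigma) > threshold for _ in range(shots))
    wrong += sum(rng.gauss(separation, sigma) < threshold for _ in range(shots))
    return wrong / (2 * shots)

analytic = separation_error(4.0, 1.0)
empirical = monte_carlo_error(4.0, 1.0, 50000, random.Random(3))
```

For a separation of 4σ, the analytic value is about 2.3%, and the Monte Carlo estimate agrees within its statistical uncertainty. Relaxation during readout, discussed above, adds to this error and is not captured here.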

In summary, we see that to optimize the qubit readout fidelity, the readout needs to fulfill the following two requirements:

  • “Fast readout”: The readout cycle needs to be completed within a time that is short compared with the qubit coherence time. The longer the readout time, the more likely the qubit is to relax, thus reducing the readout fidelity.

  • “High signal-to-noise ratio”: The signal-to-noise ratio needs to be sufficiently large to suppress the state separation errors below an acceptable limit where it does not limit the readout fidelity.

In Secs. V D and V E 3, we review how these two conditions are met by carefully engineering the signal path of the readout circuitry.

To ensure high-fidelity readout performance, it is important to perform single-shot readout at a time scale much shorter than the qubit coherence time, τro ≪ T1. This motivates us to: (i) make the resonator linewidth wide, thus reducing its ring-up time, τrd, and (ii) keep the integration time τs as short as possible, see Fig. 22(b). The need to isolate a quantum system from decohering into its environment while, at the same time, being able to read out its state in a short time represents two contradictory criteria, which must be traded off.320 

Even though dispersive readout (in the few-photon limit) has only a small back-action on the qubit state, the qubit will still suffer from T1-relaxation while we are performing a measurement. In fact, this “decay during the readout” often limits the readout fidelity, reducing it to


where τro=τrd+τs/2 denotes the total time for the readout, consisting of the readout delay τrd due to the resonator transient, and half the sampling time τs/2. The fidelity drop in Eq. (172) can be interpreted as a manifestation of the competition between two time scales: whether our quantum information reaches our detector or the environment first.
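To get a feel for the numbers, the sketch below evaluates the effective readout time τro = τrd + τs/2 and the probability that a qubit prepared in |1⟩ relaxes within that time, assuming simple exponential T1 decay. This is an illustrative estimate and not a reconstruction of Eq. (172). The function names and the values of τrd, τs, and T1 are hypothetical.

```python
import math

def readout_time(tau_rd, tau_s):
    """Effective readout time: resonator delay plus half the sampling time."""
    return tau_rd + tau_s / 2

def decay_probability(tau_ro, t1):
    """Probability that |1> relaxes within tau_ro for exponential T1 decay."""
    return 1.0 - math.exp(-tau_ro / t1)

tau_ro = readout_time(100e-9, 400e-9)      # assumed 100 ns delay, 400 ns sampling
p_decay = decay_probability(tau_ro, 50e-6) # assumed T1 = 50 microseconds
```

For τro ≪ T1, the decay probability is approximately τro/T1, so halving the readout time roughly halves this contribution to the readout error.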

The limitation of qubit coherence originates from an enhanced spontaneous emission of photons, induced by its environment. This is known as the Purcell effect,323 and is an important consideration when designing qubit-resonator systems.324 The portion of spontaneous emission that is mediated by the resonator describes how qubit relaxation is enhanced by the resonator Q when on-resonance, and suppressed off-resonance. The aim of this section is twofold: first, we develop an intuition for how the Purcell decay limits qubit coherence, and second, we show how to mitigate this limitation by designing a so-called “Purcell filter,” which modifies the impedance seen by the qubit through the readout resonator. This allows us to maintain a fast readout, while protecting the qubit from relaxing into its environment.

If we were to choose the qubit and resonator operating frequencies guided only by the resonator linewidth κ, the qubit-resonator coupling g, and the dispersive shift χ, we would reduce the detuning between the qubit and the resonator, thus maximizing the dispersive shift (recall Fig. 20). However, this presents a trade-off between two important system requirements. On one hand, we want the qubit to be far off-resonant from the resonator, and thereby isolated from its environment, to avoid Purcell-enhanced decay. On the other hand, looking at the dispersive shift, we want the two rates g and κ to be large, yielding a larger dispersive shift as well as a shorter resonator transient and thus a faster readout.

Fortunately, when operating in the dispersive regime, the qubit and resonator are far detuned from each other, Δ ≫ g, κ, which means that their impedance (environment) can be independently engineered through filter design. In essence, one designs a filter to have strong coupling to the readout port at the resonator frequency (large κ), while isolating the qubit from its environment at the qubit frequency.325,326 In other words, the filter implements an impedance transformation.

Depending on the design of the readout for the quantum processor to which the filter should be coupled, there are different ways to design a Purcell filter, such as quarter-wave stubs,325 low-Q bandpass filters,66,326 and stepped-impedance filters.327 Which one is optimal depends on system properties such as qubit-resonator detunings, required bandwidth, and allowed insertion loss.

The most promising Purcell filter designs are the ones that allow for frequency multiplexing, such as the low-Q bandpass filter design,66,326 which in addition to Purcell filtering has the function of a quantum bus, connecting several frequency-multiplexed readout resonators sharing the same amplifier chain.

The Purcell effect can be framed in terms of Fermi's golden rule, where noise in the environment causes the qubit to decay with some probability. We can gain intuition about the Purcell effect (as well as how the qubit can be protected from it) by replacing the Josephson junction in the qubit circuit with an ac-current source, outputting I(t)=I0sin(ωt), with I0=eω and study the rate at which power is lost into an environmental load resistor R=Z0=50Ω, see Fig. 24(a).

Expressing the power lost in the resistor as P=I0²(Cg/CΣ)²R=(eωβ)²Z0, with β=Cg/CΣ, the qubit Purcell decay rate into the continuum can be written as


To protect the qubit from decaying into the 50 Ω environment (as well as for deploying our dispersive readout), we can now add a resonator in parallel with the qubit, see Fig. 24(b). The presence of the resonator has the effect of shaping the impedance at the qubit frequency, which in turn modifies the decay rate in Eq. (173) into


where Zr(ω) denotes the impedance of the shunted resonator. We can express the real part of the impedance in terms of the resonator quality factor Q=ωr/κ and qubit-resonator detuning Δ=ωq−ωr


Now, by substituting Eq. (175) into Eq. (174), we see that the Purcell decay rate for the qubit depends on the detuning between the resonator and the qubit. This is intuitive, since the resonator can be thought of as a bandpass filter, with center frequency ωr and bandwidth κ. For the resonant condition, i.e., when Δ = 0, the emission rate into the resonator takes the form


In the dispersive regime Δ ≫ g, κ, which is also relevant for us in the context of qubit readout, we can make the approximation Re[Zr] ≈ QZ0(κ/Δ)², yielding the familiar expression for the Purcell decay rate in circuit QED324 

$$\gamma_{\mathrm{Purcell}} = \kappa \left(\frac{g}{\Delta}\right)^{2}. \tag{177}$$

The relation for the Purcell limit in Eq. (177) thus provides us with a useful guide on how to design the coupling rates g and κ, as well as how large qubit-resonator detuning Δ is necessary to avoid the Purcell limit.
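As a worked example of this design guide, the sketch below evaluates the familiar dispersive-regime expression κ(g/Δ)² and the corresponding upper bound on T1. The function name and the parameter values (g/2π = 100 MHz, κ/2π = 5 MHz, Δ/2π = −1.5 GHz) are hypothetical and chosen only for illustration.

```python
import math

def purcell_rate(g, kappa, delta):
    """Dispersive-regime Purcell decay rate kappa * (g / delta)**2 (angular units)."""
    return kappa * (g / delta) ** 2

two_pi = 2 * math.pi
g = two_pi * 100e6          # hypothetical qubit-resonator coupling
kappa = two_pi * 5e6        # hypothetical resonator linewidth
delta = two_pi * -1.5e9     # hypothetical detuning (qubit below resonator)

t1_limit = 1.0 / purcell_rate(g, kappa, delta)            # about 7.2 microseconds
t1_limit_far = 1.0 / purcell_rate(g, kappa, 2 * delta)    # doubled detuning
```

Doubling the detuning raises the Purcell-limited T1 by a factor of four, while widening κ for faster readout lowers it proportionally, which is the trade-off discussed in the text.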

In recent years, however, the intrinsic coherence times for superconducting qubits have exceeded 100 μs, recall Sec. II, imposing practical limitations on how to simultaneously optimize g and κ to render fast readout without compromising the qubit coherence. Considering the parameters in Eq. (177), it is not possible to simply raise the bound on the relaxation time T1 without at the same time trading off the readout speed and contrast.

We can now introduce the Purcell filter [Figs. 24(c) and 24(d)] in between the readout resonator and the 50 Ω environment, leading to a reduction of the decay rate according to325 


where QF denotes the quality factor of the Purcell filter. This is schematically depicted in Fig. 24(d), where the Purcell filter is placed around the resonator frequency, while far detuned from the qubit.
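The benefit of a filter can be estimated with a short calculation. The sketch below assumes the commonly quoted bandpass-filter result, in which the bare rate κ(g/Δ)² is multiplied by (ωq/ωr)(ωr/2QFΔ); since Eq. (178) is not reproduced here, this form should be read as an assumption of the example rather than as the expression of the text. All names and parameter values are hypothetical.

```python
import math

def purcell_rate_bare(g, kappa, delta):
    """Purcell rate without a filter: kappa * (g / delta)**2."""
    return kappa * (g / delta) ** 2

def purcell_rate_filtered(g, kappa, delta, omega_q, omega_r, q_filter):
    """Assumed bandpass-filter form: bare rate times (wq/wr) * (wr / (2 QF |delta|))."""
    suppression = (omega_q / omega_r) * (omega_r / (2 * q_filter * abs(delta)))
    return purcell_rate_bare(g, kappa, delta) * suppression

two_pi = 2 * math.pi
omega_q, omega_r = two_pi * 5.0e9, two_pi * 6.5e9     # hypothetical frequencies
g, kappa, q_filter = two_pi * 100e6, two_pi * 5e6, 30.0
delta = omega_q - omega_r

bare = purcell_rate_bare(g, kappa, delta)
filtered = purcell_rate_filtered(g, kappa, delta, omega_q, omega_r, q_filter)
improvement = bare / filtered      # equals 2 * QF * |delta| / omega_q
```

Under these assumptions the filter extends the Purcell-limited T1 by a factor of 18 without changing κ, so the readout speed is unaffected.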

In light of the aforementioned limited signal-to-noise ratio associated with the low photon number of the dispersive qubit readout, and the short sampling time, the noise temperature of the amplifier chain plays a crucial role in determining the fidelity of the measurement.

A useful benchmark for quantum measurements is the “quantum efficiency,” defined as


which quantifies the ratio of the photon energy to the system noise temperature Tsys, thus yielding a measure of how close the signal is to the standard quantum limit (SQL), as imposed by Heisenberg's uncertainty relation, adding 1/2 photon of noise when ηSQL approaches unity. Since the energy of each microwave photon is much smaller than that of optical photons, it is not easy to build a single-photon detector operating in the microwave domain.328,329 Instead, for heterodyne detection in circuit QED, a set of cascaded microwave amplifiers is used. The system noise temperature for the amplifier chain can be expressed in terms of the individual gain figures Gn and noise temperatures TN,n of each constituent amplifier330 

$$T_{\mathrm{sys}} = T_{N,1} + \frac{T_{N,2}}{G_1} + \frac{T_{N,3}}{G_1 G_2} + \cdots \tag{180}$$

where n =1,2,3,… denotes the order of the amplifiers, starting from the qubit chip. From Eq. (180), we see that the noise temperature Tsys is dominated by the noise contribution from the first amplifier, whereas the gain of the first amplifier has the effect of suppressing the noise from the second amplifier, and so on. If the first amplifier is a low-noise high-electron mobility transistor (HEMT) amplifier (TN ≈ 2 K), the system noise temperature when implemented in a cryostat is around 7–10 K, corresponding to around 10–20 added photons of noise per signal photon around 5 GHz. In practice, this is generally too much noise to perform a single-shot readout.
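The cascaded-amplifier relation can be evaluated with a few lines of Python. The sketch below sums each stage's noise temperature divided by the gain that precedes it, and converts the result to an equivalent number of noise photons, kB·Tsys/(hf). Losses between the sample and the first amplifier are ignored, so the numbers are more optimistic than the 7–10 K quoted above. The stage gains and noise temperatures, including the 0.3 K parametric-amplifier value, are hypothetical.

```python
H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)

def system_noise_temperature(stages):
    """stages: list of (linear power gain, noise temperature in K), first stage first."""
    t_sys, gain_so_far = 0.0, 1.0
    for gain, t_noise in stages:
        t_sys += t_noise / gain_so_far   # later stages are divided by preceding gain
        gain_so_far *= gain
    return t_sys

def noise_photons(t_sys, frequency_hz):
    """Equivalent number of noise photons at the given frequency."""
    return K_B * t_sys / (H * frequency_hz)

hemt_first = [(10 ** 4, 2.0), (10 ** 3, 300.0)]   # HEMT, then room-temperature amp
pa_first = [(10 ** 2, 0.3)] + hemt_first          # assumed 20 dB preamplifier in front

t_hemt = system_noise_temperature(hemt_first)
t_pa = system_noise_temperature(pa_first)
photons_hemt = noise_photons(t_hemt, 5e9)
photons_pa = noise_photons(t_pa, 5e9)
```

Placing a low-noise, moderate-gain stage first lowers the system noise temperature from about 2.03 K to about 0.32 K in this example, because the HEMT contribution is divided by the preamplifier gain.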

This inherently poor signal-to-noise ratio has revived interest in developing quantum-limited parametric amplifiers (PAs)—tailored for the readout of superconducting qubits—featuring the ability to amplify small microwave signals, and adding only approximately the minimum amount of noise allowed by quantum mechanics.123,124,320

1. Quantum-limited amplification processes

In a linear, phase-insensitive amplifier, an input state ain is amplified to an output state aout, with an amplitude gain factor √G. Microwaves are electromagnetic fields and therefore considered to be coherent light comprising microwave photons. As such, they must obey the commutation relations123,320,331,332

$$[a_{\mathrm{in}}, a_{\mathrm{in}}^{\dagger}] = [a_{\mathrm{out}}, a_{\mathrm{out}}^{\dagger}] = 1, \tag{181}$$

from which it can be shown that it is not possible to simultaneously amplify both quadratures of ain without also adding noise. This is known as the “Caves theorem” after the work by Caves,320 based on an earlier work by Haus and Mullen.331 This can be seen by considering the scattering relation between the input and output microwave fields

$$a_{\mathrm{out}} = \sqrt{G}\, a_{\mathrm{in}}. \tag{182}$$

The gain relation in Eq. (182) constitutes our ideal scenario for an amplifier process. However, the problem is that this relation does not satisfy the commutation relation in Eq. (181). To satisfy this relation, we need to also take into account the vacuum fluctuations of another mode,124,333–335 called the “idler” mode bin, which satisfies the same commutation relation [bin, bin†]=1. To satisfy the commutation relation, the idler mode is amplified by the gain factor √(G−1). For a large gain, it can be shown that a minimum of half a photon of noise, ħω/2, needs to be added to a signal amplified with gain G.

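Finally, taking the idler mode into account, the scattering relation for the coherent output field takes the form

$$a_{\mathrm{out}} = \sqrt{G}\, a_{\mathrm{in}} + \sqrt{G-1}\, b_{\mathrm{in}}^{\dagger}. \tag{183}$$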

Generally, this process results in a so-called “phase-insensitive” parametric amplification process, in which both quadratures of the input field get equally amplified. This is illustrated in Fig. 25, where the in-phase (Iin) and quadrature (Qin) components of the fields are plotted, before and after the parametric amplifier.
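The bookkeeping behind the added noise can be made explicit with a small calculation. The sketch below assumes the standard scattering form, in which the signal is amplified by √G and the conjugated idler by √(G−1), evaluates the output commutator with and without the idler, and computes the input-referred noise added by an idler in vacuum. The function names are hypothetical.

```python
def output_commutator(power_gain, include_idler=True):
    """[a_out, a_out^dagger] for a_out = sqrt(G) a_in + sqrt(G-1) b_in^dagger."""
    signal_part = power_gain * 1.0       # G * [a_in, a_in^dagger]
    # (G - 1) * [b_in^dagger, b_in] = -(G - 1)
    idler_part = -(power_gain - 1.0) if include_idler else 0.0
    return signal_part + idler_part

def added_noise_photons(power_gain):
    """Input-referred noise added by a vacuum idler, in photons."""
    return 0.5 * (power_gain - 1.0) / power_gain

with_idler = output_commutator(100.0)                           # stays at 1
without_idler = output_commutator(100.0, include_idler=False)   # grows to G
high_gain_noise = added_noise_photons(1e4)                      # approaches 1/2
```

Without the idler the output commutator would equal G instead of 1, and with it the added noise tends to half a photon at large gain.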

Considering the amplification process in Eq. (183), we can find a special case for the idler mode, for which noiseless amplification can be accomplished for one of the two quadratures, but at the expense of adding more noise to the other, thus not violating Heisenberg's uncertainty relation for the two field quadratures. This mode of operation is known as “phase-sensitive” amplification, and is obtained when the idler mode oscillates at the same frequency as the signal (or a multiple thereof), but can be shifted with an overall phase ϕ ∈ [0,2π]. By substituting the idler mode in Eq. (183) with bin=exp(iϕ)ain, the scattering relation becomes


The overall phase factor allows us to tune the orientation of the amplification (or de-amplification) by means of the pump phase, thus allowing us to choose a quadrature for which we want to reduce the noise, see Fig. 26. Intuitively, this can be understood by considering the interference that occurs when two waves with the same frequency are confined in space, where we obtain constructive or destructive interference, depending on the phase between the two waves. Due to this interference, the noise can be suppressed even below the standard quantum limit (without violating Heisenberg's uncertainty relation). This is known as “single-mode squeezing” and was first observed in superconducting circuits by Yurke et al.336 In particular, after the theoretical prediction by Gardiner,337 Murch et al. showed that the coherence time of a qubit can be enhanced when the qubit is exposed to squeezed vacuum.338,339 Also “two-mode squeezing” was demonstrated by Eichler et al.,340 where the demodulation setup squeezes both quadratures of the acquired signal.107 
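The phase dependence of the gain can be illustrated for a classical input amplitude. The sketch below assumes an output of the form √G·α + exp(−iϕ)√(G−1)·α*, where α is the input amplitude; the sign convention for ϕ and the function name are assumptions of this example.

```python
import cmath
import math

def phase_sensitive_gain(power_gain, pump_phase, signal_phase):
    """Power gain of a unit classical amplitude under the assumed scattering form."""
    alpha = cmath.exp(1j * signal_phase)
    out = (math.sqrt(power_gain) * alpha
           + cmath.exp(-1j * pump_phase)
           * math.sqrt(power_gain - 1) * alpha.conjugate())
    return abs(out) ** 2

G = 100.0
amplified = phase_sensitive_gain(G, pump_phase=0.0, signal_phase=0.0)
deamplified = phase_sensitive_gain(G, pump_phase=0.0, signal_phase=math.pi / 2)
product = amplified * deamplified     # (sqrt(G)+sqrt(G-1))^2 * (sqrt(G)-sqrt(G-1))^2
```

One quadrature is amplified by (√G + √(G−1))² and the orthogonal one is de-amplified by the inverse factor, so the product of the two gains is unity. Rotating the pump phase rotates which quadrature is amplified.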

In the context of qubit readout, however, phase-sensitive amplification tends to be experimentally inconvenient. This is mainly due to its phase-dependent gain, which imposes stringent requirements on continuous phase-calibration of the readout signal.

For a detailed theoretical framework developed for quantum limited amplification, the reader is referred to an earlier work by Roy and Devoret,341 Clerk et al.,124 and Wustmann and Shumeiko.342 

2. Operation of Josephson parametric amplifiers

In this section, we review the basic operation characteristics of parametric amplifiers, and in particular, the Josephson parametric amplifiers (JPAs), that have been exploited for qubit readout. Although many different flavors of parametric systems exist, we here focus on the resonant implementations of the Josephson parametric amplifier (JPA), serving as a good system for reviewing the fundamental concepts around parametric amplification.

All parametric amplifiers operate based on one fundamental principle: the incoming “signal” photons are mixed with an applied “pump” tone via an intrinsic nonlinearity, by which energy from the pump is converted into signal photons and thereby providing gain. As we recall from Sec. II, such a nonlinearity can be engineered in the microwave domain using Josephson junctions,343 and the resonant parametric amplifiers are built from slightly anharmonic oscillators.

The first Josephson parametric amplifiers were built from a coplanar waveguide resonator, made nonlinear by adding a nonlinear Josephson contribution to its total inductance, see Fig. 27(a). The word “parametric” refers to the process of modulating (or “pumping”) one of the parameters of the system's equation-of-motion (such as frequency or damping) in time.342,344,345 The natural way to perform this parametric pumping is to modulate the nonlinear Josephson inductance, which in turn has the effect of modulating the resonator frequency ωr(t)=1/√(L(t)C).

Depending on how the pumping is implemented, there are two different mixing processes that can be exploited in Josephson parametric amplifiers, which determines the characteristics of the amplifier. These are illustrated in Figs. 27(b) and 27(c) and are referred to as “current-pumping”344,346–349 and “flux-pumping,”95,342,345,350–355 respectively. The type of mixing process that takes place depends on the leading order of the nonlinearity of the system, as reflected in its Hamiltonian. In the following, we briefly review the difference between these two pump-schemes.

In the current-pumped case, the dynamics of the system has characteristics of a Duffing oscillator,356 with a fourth-order nonlinear term in addition to the harmonic oscillator term in its Hamiltonian


where c denotes the resonator field operator and K is the “Kerr-nonlinearity.” This process is a so-called “four-wave mixing” process, since it mixes four photons: one signal (ωs), one idler (ωi), and two pump photons (ωp), obeying the energy conservation relation ωs+ωi=2ωp, see Fig. 27(b). Pioneered by Yurke,344 this was the first demonstration of microwave amplification using a Josephson parametric amplifier. When the signal and idler modes are at the same frequency, the amplification is said to be “degenerate.” This pumping scheme is the foundation for the Josephson Bifurcation Amplifier (JBA), developed by Siddiqi et al.,348,357,358 which has been used to perform single-shot qubit readout, by mapping the quantum states onto the high and low resonator field originating from the sharp bifurcation point of the amplifier.359 

In the other case, when the system is flux-pumped, the parametric process is driven by threading a magnetic flux Φac through a SQUID loop, thereby modulating the frequency of the resonator. This results in a “three-wave mixing” process, comprising three photons: one signal, one idler, and one pump photon, with ωs+ωi=ωp, see Fig. 27(c). Therefore, we see that the pump frequency is about twice that of the signal, ωp ≈ 2ωs for ωs ≈ ωi. For degenerate, flux-pumped systems, the leading nonlinearity is a third-order term, yielding a Hamiltonian


where the p operator denotes the flux-pump mode. This approach to building parametric amplifiers was developed by Yamamoto et al.,351 as well as by Sandberg et al.95 

The flux-pumping scheme has several practical advantages. First, the large detuning of the pump makes it easier to filter, isolating the readout signal as it passes into the digitizer downstream and preventing the saturation of following amplifier stages. Second, if the resonator is a quarter-wavelength resonator, it has no resonant mode at the pump frequency ωp, reducing spurious population or saturation of the system as well as backaction on the qubits in the processor. Third, since the flux pump line is a separate on-chip microwave line, no additional directional coupler is needed.

Due to its rich dynamics, flux-pumping has also proven a useful platform to study the quantum dynamics of Josephson parametric oscillators, in the context of qubit readout,360–363 in studies of the dynamical Casimir effect,364–366 and in efforts to better understand their complex nonlinear dynamics.353,356,367–371

In addition to the degenerate parametric interactions described above, parametric gain can be obtained between different resonant modes, either between different modes of the same resonator,321,372 or between different resonators,373 as with the Josephson parametric converter (JPC).332,374–378 In addition to the possibility of isolating and amplifying certain frequencies, the JPC can implement frequency conversion, which gives it areas of application beyond those of other types of parametric amplifiers.

3. The traveling wave parametric amplifier

In the previously described JPA, parametric amplification is realized using resonators that enhance the parametric interaction between the input signal and the Josephson junction nonlinearity. Essentially, the Q-enhancement of the resonator forces each photon to pass through the junction on average Q times before leaving the resonator, thereby enhancing the nonlinear interaction. Although resonator-based amplifiers have been shown to approach the standard quantum limit of noise for readout of a small number of qubits, the community is moving toward amplifier technologies that are compatible with multiplexed readout of several qubits coupled to the same amplifier chain.65,66,379–381 In this context, resonator-based parametric amplifiers suffer from two major drawbacks. First, the amplifier bandwidth is limited to the resonator linewidth, typically ≈ 10–50 MHz, practically limiting the number of multiplexed frequencies that can be amplified. Second, since the Josephson nonlinearity is realized by a small number of junctions, the saturation power is low due to the interplay of higher order nonlinearities, effectively taking the system outside its desired operation regime.356,367,368,378 In practice, this limits how many readout resonators can be read out simultaneously.

These two bottlenecks can, to a degree, be overcome with microwave engineering. For instance, the linewidth can be made an order of magnitude wider by altering the impedance along the resonator. This is called a stepped-impedance transformer, where the impedance is ramped down from a matched 50 Ω at the capacitor down to a small impedance at the SQUID382 shorting the device to the ground. Also, the saturation power can be increased by distributing the nonlinearity across an array consisting of many identical junctions, reducing the Kerr-nonlinearity by a factor 1/N², with N representing the number of junctions in the array. This has been demonstrated by using an array of SQUIDs in a resonator, rather than a single one.352 

However, despite the above-mentioned engineering efforts to improve the resonator-based JPAs, the most prominent approach to date is to get rid of the resonator altogether and, instead, construct a microwave analog to optical parametric amplifiers, where kilometers of weakly nonlinear fibers are used. Such a device is called a “traveling wave parametric amplifier” (TWPA) and was developed to surmount the bandwidth and dynamic range limitations of the resonator-based JPAs.

Although operated in a similar way, the nonlinearity of TWPAs can be realized in different ways, such as the kinetic inductance of a superconducting film383–385 or using an array of Josephson junctions,322,386,387 through which the four-wave mixing process is distributed across a nonlinear lumped element transmission line, see Fig. 28(a).

The Josephson TWPA consists of a few thousand identical unit cells, each comprising a shunt capacitor to ground and a nonlinear Josephson inductor, together yielding a characteristic impedance of Z0=√(LJ/C) ≈ 50 Ω, see Fig. 28(a).

The fact that the nonlinearity is distributed allows for a high saturation power, since each Josephson junction is accessed once. However, even though energy conservation is satisfied in the four-wave mixing process in the device, there is a problem with phase (or momentum) conservation. This is associated with the system nonlinearity as well as the large frequency detuning between the signal and pump photons, yielding a difference in phase-velocity between the two, which in turn gives rise to a nonflat gain profile, as well as an overall reduction in gain.386 

Again, by taking inspiration from the dispersive engineering developed in quantum optics and photonics, where the refractive index can be periodically altered to engineer the momentum of a transferred signal, the solution to this phase-mismatch problem was introduced by O'Brien et al.386 By introducing resonators at periodic intervals of TWPA unit cells, the pump tone can be given a “momentum kick,” effectively slowing it down and phase-matching the device by means of its wave vector. This technique is called resonant-phase matching (RPM), see Fig. 28(d), and requires that the pump frequency is set on the left side of the dispersion feature (where the wave vector diverges), defined by the resonant frequency of the phase-matching resonators. Note, finally, that broadband parametric amplification with a high dynamic range has been demonstrated in other Josephson-based circuits, e.g., the superconducting nonlinear asymmetric inductive element (SNAIL) parametric amplifier (SPA).388 

In this review, we have discussed the phenomenal progress over the last decade in the engineering of superconducting devices, the development of high-fidelity gate operations, and quantum nondemolition measurements with a high signal-to-noise ratio. Putting these advances together, we hope that it is clear that the planar superconducting qubit modality is a promising platform for realizing near-term, medium-scale quantum processors. While we have focused on highlighting the advances made in realizing, controlling, and reading out planar superconducting qubits specifically used for quantum information processing, there has of course also been tremendous activity in the surrounding fields. In this final section, we briefly mention a few of those fields and invite the reader to look into the references for further details.

Superconducting qubits also form the basis for certain quantum annealing platforms.389,390 Quantum annealing operates by finding the ground state of a given Hamiltonian (typically a classical Ising Hamiltonian), and this state corresponds to the solution of an optimization problem. By utilizing a flux-qubit type design (see Sec. II), the company D-Wave has demonstrated quantum annealing processors86 which have now reached beyond 2000 qubits.391 The benchmarking of quantum annealers and attempts to demonstrate a quantum speedup for a general class of problems form a highly active research field, and we refer the reader, for example, to Refs. 392–394 and references therein.
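To make the optimization problem concrete, the following Python sketch finds the ground state of a small classical Ising Hamiltonian by exhaustive search. This is not an annealing algorithm; it only illustrates the kind of problem an annealer targets. The energy convention $H(s) = \sum_i h_i s_i + \sum_{i<j} J_{ij} s_i s_j$, the function names, and the three-spin instance are assumptions chosen for illustration.

```python
from itertools import product


def ising_energy(spins, h, J):
    """Energy of a spin configuration: sum_i h_i s_i + sum_{i<j} J_ij s_i s_j."""
    field = sum(h[i] * s for i, s in enumerate(spins))
    coupling = sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
    return field + coupling


def brute_force_ground_state(h, J):
    """Check all 2^N configurations of spins in {-1, +1}; feasible only for small N."""
    best = min(product((-1, 1), repeat=len(h)),
               key=lambda spins: ising_energy(spins, h, J))
    return best, ising_energy(best, h, J)


# Hypothetical instance: a three-spin ferromagnetic chain with a local field on spin 0.
h = [1.0, 0.0, 0.0]
J = {(0, 1): -1.0, (1, 2): -1.0}

ground_spins, ground_energy = brute_force_ground_state(h, J)
print(ground_spins, ground_energy)
```

The search cost doubles with every added spin, which is why heuristic approaches such as annealing become attractive for large instances.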

A parallel effort to the planar superconducting qubits discussed in this review is the development of 3D cavity-based superconducting qubits. In these systems, quantum information is encoded in superpositions of coherent states of the cavity's photonic modes, so-called cat states.99 These cat states can be highly coherent due to the inherently high quality factors associated with 3D cavities.100,395,396 This approach has a fairly small hardware overhead for encoding a logical qubit,397 and lends itself to certain implementations of asymmetric error-correcting codes, because errors due to single-photon loss in the cavity are a tractable observable to decode. Using this architecture, several important advances were recently demonstrated, including extending the lifetime of an error-corrected qubit beyond that of its constituent parts,98 randomized benchmarking of logical operations,397 a CNOT gate between two logical qubits,398 and Ramsey interference of an encoded quantum error-corrected qubit.399

We briefly mentioned the electrical engineering, software development, and cryogenic considerations associated with the control wiring and on-chip layout of medium-scale quantum processors. While dilution refrigerators are now readily available as off-the-shelf commercial products, how to optimally perform signal routing and rapid data processing in a scalable fashion remains a field in rapid development. However, with the recent demonstrations of enabling technologies such as 3D integration, packages for multilayered devices, and superconducting interconnects,400–406 some of the immediate concerns about how to scale the “number” of qubits in the superconducting modality have been addressed. On the control software side, there currently exist multiple commercial and free software packages for interfacing with quantum hardware, such as QCoDeS,