In this Tutorial, we introduce basic conceptual elements to understand and build a gate-based superconducting quantum computing system.

## I. INTRODUCTION

Quantum computing is widely regarded as a next-generation information processing technology. The basic element of a quantum computing system is a quantum bit, often called a qubit. Over the last few decades, considerable progress has been made toward realizing quantum computing systems by physically implementing qubits in various platforms such as ion traps, quantum dots, nuclear spins, and cavity quantum electrodynamics. Scalability of the qubit is considered a prerequisite for a practical quantum computer. In this regard, solid-state qubits have been considered indispensable. Superconducting quantum systems are among the most promising candidates because their qubits are intrinsically integrated in a solid-state device, and the wide range of available qubit parameters gives considerable flexibility in designing quantum circuits.

In this Tutorial, we try to provide basic conceptual elements to understand and build a potentially scalable superconducting quantum computing system based on gate operations. The logical flow is roughly from principle to practice. After introducing the qubit and structure of a universal quantum computing system (Sec. II), we explain a superconducting circuit that can be used as a qubit (Secs. III and IV) and how to implement basic functions that are required for quantum computation (Secs. V and VI). Then, we introduce a quantum error-correction scheme, called the surface code, that is believed to be suitable for superconducting qubit systems (Sec. VII). Last, we deal with practical topics, such as how to characterize and control a quantum system (Secs. VIII and IX). The contents of this Tutorial are briefly summarized in Table I.

Since this is a Tutorial, the topics covered here are very selective rather than comprehensive. Hence, we cite references that are more accessible to readers. Another reason for this is that many concepts and experimental techniques for superconducting circuits were originally developed in other branches of science—tracing all historical literature is not meaningful for readers. For comprehensive reviews on this field, see Refs. 1–6.

Regarding the difficulty of this Tutorial, we assume that readers are somewhat familiar with quantum mechanics, especially the Dirac notation and the occupation number representation (second quantization), and with elementary statistical mechanics, such as the Boltzmann distribution. Since superconducting quantum computing systems are electrical circuits, knowledge of basic electrical engineering, especially the S-parameters, will be helpful. However, readers do not need to be masters of these topics. Reading this Tutorial does not require deep physical insights; it is more like learning a new language.^{7} Once you get used to it, you will enjoy it.

Before entering the main part, we would like to point out that the word “scaling” in quantum engineering is different from that in the semiconductor industry. In the semiconductor industry, scaling means reducing the size of the information processing device used, such as a transistor, and the energy cost per bit, so that we can integrate more and more devices into a chip. In quantum engineering, “scaling” simply means adding more qubits because physical quantities involved in operations of a superconducting quantum computing platform, such as the charge of a Cooper pair and magnetic flux quantum, are already at the quantum limit, and a quantum information processing device is lossless. Thus, the dramatic size reduction as demonstrated in Moore’s law may not be expected for superconducting qubits.

The formulas used for deriving the equations in this Tutorial are summarized in Table II.

## II. UNIVERSAL QUANTUM COMPUTING SYSTEM

### A. Essential elements

#### 1. Quantum bit

Note that a qubit and a spin-1/2 system are mathematically identical. This allows us to represent the qubit state conveniently as an arrow, called the Bloch vector, in the Bloch sphere [Fig. 1(a)]. Conventionally, the qubit quantization axis is set as the $z$ axis, and the north and south poles represent $|0\rangle$ and $|1\rangle$, respectively. Hence, the longitudinal component of the Bloch vector corresponds to the polarization of the qubit, and the transverse component corresponds to the coherence between the two basis states.

When we use the Bloch sphere, we are free to choose a frame of reference. In the majority of the literature, including this Tutorial, the dynamics of the qubit state are described in the rotating frame [Fig. 1(b)]. To determine the rotating frame frequency, we have to know the dynamics we want to focus on. Then, we eliminate the trivial evolution by performing a unitary transformation, which changes our frame of reference. Note that this is conceptually and mathematically identical to switching into the interaction picture. Usually, the qubit frequency, the resonator frequency (see Sec. V B), or the external drive frequency (Sec. VI C) is chosen as the rotating frame frequency.

A qubit is often implemented by the two lowest states of a quantum system, such as (artificial or natural) atoms [Fig. 1(a)]. This subspace is called the computational subspace. In general, any Hilbert space whose dimension is truncated into two can be used as a qubit. This generalized definition of a qubit is essential for constructing a logical qubit (Sec. VII).

In this Tutorial, the notations denoting the qubit states, {$|0\rangle$, $|1\rangle$, $|2\rangle$ (higher excitation level)} and {$|g\rangle$, $|e\rangle$, $|f\rangle$}, are used interchangeably to avoid confusion with the photon or the charge number states. In addition, $\omega_q$, which we call the qubit frequency, is the transition frequency between $|0\rangle$ and $|1\rangle$, and $\omega_{i-j}$ (with a hyphen in the subscript) is the transition frequency between $|i\rangle$ and $|j\rangle$; $\omega_{ij}$ (without a hyphen in the subscript) indicates the energy level of the two-qubit state $|i\rangle \otimes |j\rangle$ (or $|ij\rangle$ in the short form).

#### 2. Quantum gate

A quantum gate is a discrete control acting on qubits inducing the unitary evolution of the quantum states of the qubits. Quantum computation is basically a series of quantum gate operations.

Any multiqubit gate operation can be decomposed into a set of single-qubit and controlled-NOT (CNOT) gates. Thus, the gate set {single-qubit gates, CNOT} is called a universal quantum gate set. An arbitrary single-qubit gate can be well approximated by the discrete gate set { $H$, $S$, $T$} (Solovay–Kitaev theorem^{10}). Hence, we can rewrite a universal gate set as { $H$, $S$, $T$, CNOT}. The definitions of these gates and other popular gates are summarized in Table III.

Among these universal quantum gates, the quantum gates generated by the $H$, $S$, and CNOT gates form a group called the Clifford group. This group is important in quantum computation, especially for quantum error correction (Sec. VII) and efficient gate qualification (Sec. IX D). However, it is known that a quantum computer operated by only Clifford gates can be simulated efficiently on a probabilistic classical computer (Gottesman–Knill theorem^{10}). Thus, a non-Clifford gate, such as the $T$ gate, is required to show the advantage of quantum computation.
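The distinction between Clifford and non-Clifford gates can be checked numerically. The following is a minimal sketch, assuming the standard matrix forms of $H$, $S$, $T$, and CNOT (taken here to agree with Table III up to a global phase); the helper names `is_pauli_up_to_phase` and `maps_paulis_to_paulis` are ours. A single-qubit gate is Clifford if conjugation by it maps the Pauli operators $X$ and $Z$ to Pauli operators.

```python
import numpy as np

# Standard matrix definitions (assumed to match Table III up to a global phase).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = (X + Z) / np.sqrt(2)
S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])
# CNOT with the first qubit as control, basis order |00>, |01>, |10>, |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

PAULIS = [I2, X, Y, Z]


def is_pauli_up_to_phase(U):
    """Return True if U equals (phase) * P for some single-qubit Pauli P."""
    for P in PAULIS:
        overlap = np.trace(P.conj().T @ U) / 2
        if np.isclose(abs(overlap), 1.0):
            return True
    return False


def maps_paulis_to_paulis(U):
    """Clifford test: U P U^dagger must be a Pauli (up to phase) for P = X, Z."""
    return all(is_pauli_up_to_phase(U @ P @ U.conj().T) for P in (X, Z))


print("H is Clifford:", maps_paulis_to_paulis(H))
print("S is Clifford:", maps_paulis_to_paulis(S))
print("T is Clifford:", maps_paulis_to_paulis(T))
```

With these definitions, $T^2 = S$ and $S^2 = Z$, and the check fails only for $T$, because $T X T^\dagger = (X + Y)/\sqrt{2}$ is not a Pauli operator.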

### B. Structure

A gate-operation-based, universal, and scalable superconducting quantum computer will likely have the following structure (Fig. 2):^{11,12}

1. **Physical resources:** This layer is a collection of physical qubits and necessary circuits for the control and readout of the physical qubits.
2. **Error correction resources:** In this layer, errors acting on quantum information stored in a set of physical qubits are corrected. This operation produces a single error-free logical qubit. For this, high-fidelity controls, such as initialization, gate operation, readout, and feedback, for physical qubits are required.
3. **Logical resources:** Initialization, gate operation, and readout of logical qubits are performed in this layer.
4. **Algorithmic resources:** Quantum algorithms, such as Shor’s factoring and Grover’s search algorithms, are performed in this layer.

## III. SUPERCONDUCTING QUBIT

### A. Design criteria

A superconducting qubit is the two lowest energy eigenstates of an artificial atom made of a superconducting circuit. To be a useful qubit, the circuit must be designed to satisfy the following conditions:

1. *Proper operating frequency range*: A qubit must have a transition frequency that is significantly higher than the thermal energy of a typical solid-state system to observe quantum nature. The only continuous refrigeration method for solid-state devices below 0.3 K is to use a dilution refrigerator, whose base temperature is usually about 10 mK ($\sim 200$ MHz). This means that the transition frequency of a qubit must be at least a few gigahertz. At the same time, the qubit transition frequency should be sufficiently lower than the superconducting energy gap of the host superconductor so as not to excite quasiparticles. For aluminum, which is the most popular material for superconducting qubit systems, the energy gap is about 100 GHz.
2. *Large anharmonicity*: To be a well-defined two-level system, a qubit should have anharmonicity $\alpha \equiv \omega_{1-2} - \omega_q$ of at least $\sim 100$ MHz in magnitude to perform a reasonably fast gate operation (see Sec. IX A 1 for the gate time and frequency selectivity). Recently, it has been found that having a third level in an accessible frequency range can be beneficial, such as for initialization or two-qubit gate operation (see Secs. VI D and VI E 1).
3. *Long coherence time*: The assigned quantum state should last for a long time compared with the time for gate operations.
4. *Ease of coupling*: For readout and (multi)qubit gate operation, a reasonably strong coupling between a qubit and another quantum system, such as a resonator or neighboring qubit, should be achieved easily.
5. *Ease of control*: The quantum state should be brought to a superposition easily and straightforwardly by external means.
6. *Ease of fabrication*: A qubit should be easy to fabricate with standard nanotechnology for good reproducibility.
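The frequency condition can be made concrete with a short calculation. The sketch below converts a temperature into the equivalent frequency $k_B T/h$ and estimates the thermal excited-state population of an ideal two-level system from the Boltzmann distribution. The function names are ours, and the model ignores everything except the two qubit levels, so it is only an order-of-magnitude guide.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant (J/K), exact SI value
H_PLANCK = 6.62607015e-34  # Planck constant (J s), exact SI value


def thermal_frequency_hz(temperature_k):
    """Frequency f such that h * f = k_B * T."""
    return K_B * temperature_k / H_PLANCK


def excited_state_population(freq_hz, temperature_k):
    """Boltzmann population of the upper level of an ideal two-level system."""
    x = H_PLANCK * freq_hz / (K_B * temperature_k)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    t_base = 0.010  # 10 mK, the typical base temperature quoted in the text
    print(f"k_B T / h at 10 mK: {thermal_frequency_hz(t_base) / 1e6:.0f} MHz")
    for f_ghz in (0.2, 1.0, 5.0):
        p = excited_state_population(f_ghz * 1e9, t_base)
        print(f"{f_ghz} GHz qubit: thermal excited population = {p:.2e}")
```

At 10 mK the equivalent frequency is about 208 MHz, so a 200 MHz two-level system would be strongly thermally mixed, whereas a 5 GHz qubit is almost entirely in its ground state in this idealized model.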

### B. Josephson junction

A superconductor is a macroscopic quantum mechanical system in the sense that it can be described by a single macroscopic wavefunction, i.e., the order parameter $\Psi$. However, this property alone is not sufficient for a qubit; we need a confinement potential to obtain discrete energy eigenstates, just as electrons confined in the Coulomb potential form an atom. Moreover, to control the two lowest energy eigenstates selectively, the potential must be anharmonic so that the energy separations between adjacent eigenstates are distinct.

The solution for discrete energy eigenstates is to make an electrical circuit. In a superconducting circuit, the quantized energy level emerges from the quantization of the charge and the magnetic flux stored in various electrical components just like the position and the momentum of electrons in a real atom. (Since the charge and the magnetic flux are collective coordinates that represent the cooperative motion of large numbers of electrons, the circuit quantization is essentially phenomenological.^{14})

### C. Elementary circuits

#### 1. Generic Hamiltonian

We can categorize elementary circuits of superconducting qubits into two groups, an island and a loop (Fig. 4). In the early literature, these two kinds of qubits were called a charge qubit and a flux qubit, respectively, on the basis of the spread of the wavefunctions in the number (charge) and phase (flux) bases [typical wavefunctions of a charge qubit are shown in Fig. 5(d)].^{17} However, such a classification is valid only for a certain parameter range; it does not work well for sophisticated qubits whose wavefunctions often show exotic distributions in both the number and the phase bases. Therefore, we simply categorize circuits of superconducting qubits based on the geometry. Then, we will show how the qubit properties change as we tune the circuit parameters. The knowledge acquired in this way can also be used for analyzing more complex qubits.

*relative* number and phase operators between two superconductors, and this number imbalance of electrons is much less than the number of electrons in each superconductor. For details, see Refs. 15 and 19.) The resulting circuit Hamiltonian $\hat{H}_q$ is given by

#### 2. Island-based qubit

In the small $E_J/E_C$ limit, the $E_C$ term is dominant in Eq. (15); as a result, the wavefunctions are localized in the number basis as shown in Fig. 5(d), suggesting that the number basis will be more convenient to describe the physics in this regime. In Fig. 5(a), the gray lines indicate the $E_C$ term associated with the charge states $N = 0$ and $\pm 1$. At $N_\mathrm{ext} = 0.5$, the states $N = 0$ and $1$ are energetically degenerate. Here, the $E_J$ term hybridizes these two states via coherent charge tunneling [Fig. 4(a)], resulting in an anticrossing whose size is approximately $E_J$. At zero bias, a similar, but significantly smaller, hybridization occurs between $N = -1$ and $1$. This results in the first excitation level at $\approx 4E_C$.

More systematic plots regarding the two observations, (i) the flattening of the energy band and (ii) the suppression of the anharmonicity in the large $E_J/E_C$ limit, are given in Figs. 5(g) and 5(h), respectively. Note that the difference between the transition frequencies at $N_\mathrm{ext} = 0$ and 0.5, denoted by $\Delta\omega_q$, decreases exponentially as shown in the inset of Fig. 5(g). This indicates that the energy levels are practically flat if $E_J/E_C \gtrsim 50$.

The anharmonicity at $N_\mathrm{ext} = 0$ and 0.5 also collapses into a single curve because of the flattening of the energy band [Fig. 5(h)]. The crucial observation is that, although $\alpha$ is also approaching zero, it does so algebraically rather than exponentially. This suggests that we can use the circuit in the large $E_J/E_C$ limit as a charge-insensitive qubit, which is called a transmon (see Sec. IV B for the implementation of a transmon).^{20}
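Both trends can be reproduced with a few lines of numerical diagonalization. The sketch below assumes the standard Cooper-pair-box form of the island Hamiltonian, $4E_C(\hat{N} - N_\mathrm{ext})^2 - E_J\cos\hat{\phi}$, written in a truncated charge basis where the cosine term becomes a nearest-neighbor hopping of strength $-E_J/2$. The truncation `n_cut` and the function names are illustrative choices, not taken from the text.

```python
import numpy as np


def island_qubit_levels(ej, ec, n_ext, n_cut=30, n_levels=3):
    """Lowest eigenvalues of 4*EC*(N - n_ext)^2 - EJ*cos(phi) in the charge basis."""
    n = np.arange(-n_cut, n_cut + 1)
    h = np.diag(4.0 * ec * (n - n_ext) ** 2)
    hop = -0.5 * ej * np.ones(2 * n_cut)  # cos(phi) couples N and N +/- 1
    h = h + np.diag(hop, 1) + np.diag(hop, -1)
    return np.linalg.eigvalsh(h)[:n_levels]


def charge_dispersion_and_anharmonicity(ej, ec):
    """Return (|wq(N_ext=0) - wq(N_ext=0.5)|, anharmonicity at N_ext=0)."""
    e_zero = island_qubit_levels(ej, ec, 0.0)
    e_half = island_qubit_levels(ej, ec, 0.5)
    wq_zero = e_zero[1] - e_zero[0]
    wq_half = e_half[1] - e_half[0]
    alpha_zero = (e_zero[2] - e_zero[1]) - wq_zero
    return abs(wq_zero - wq_half), alpha_zero


if __name__ == "__main__":
    ec = 1.0  # energies in units of E_C
    for ratio in (1, 5, 20, 50, 100):
        disp, alpha = charge_dispersion_and_anharmonicity(ratio * ec, ec)
        print(f"EJ/EC = {ratio:4d}: dispersion = {disp:.3e} EC, alpha = {alpha:+.3f} EC")
```

In this model the dispersion collapses by many orders of magnitude between $E_J/E_C = 1$ and 50, while the anharmonicity stays on the order of $-E_C$, which is the tradeoff the transmon exploits.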

#### 3. Loop-based qubit

A loop-based qubit is not as simple as an island-based qubit because we have to consider all terms in Eq. (15). We start with the effect of $E_L$. Since the $E_L$ term is a function of $\phi$, it is convenient to take the phase basis and, consequently, to treat the $E_J$ and $E_L$ terms as the potential. We first consider the regime in which $E_J/E_L \gg 1$. In this regime, the periodic shape is prominent in the potential as shown in Figs. 6(a)–6(d). When $E_J/E_C \ll 1$ [Fig. 6(e)], the energy level diagram is almost independent of $\Phi$, and $\omega_q \approx \sqrt{8E_LE_C}/\hbar$. The reason is that the oscillating potential is averaged out owing to the large kinetic energy [Fig. 6(a)], and consequently, only the harmonic terms are effective in Eq. (15). In this regime, $\hat{N}$ is localized. Thus, $\hat{N}$ corresponds to the position in the spring-block oscillator analogy, $L$ corresponds to the mass, and $C^{-1}$ corresponds to the spring constant.

For $E_J/E_C > 1$, the physics of a loop-based qubit can be understood in a similar way to that of an island-based qubit. In Figs. 6(f) and 6(g), the gray lines show the $E_L$ term in Eq. (15) associated with $\phi = 0$ and $\pm 2\pi$, which means that the numbers of trapped fluxes in the loop are 0 and $\pm 1$, respectively. Note that, at $\Phi/\Phi_0 = 0.5$ ($\phi_\mathrm{ext} = \pi$), the potential has a double-well shape, resulting in energy degeneracy between $\phi \approx +\pi$ and $\phi \approx -\pi$. These degenerate states correspond to two superposed currents circulating in opposite directions [Fig. 6(b) and its inset]. Similarly to the degeneracy point in an island-based qubit, the hybridization mediated by the kinetic energy ($E_C$ term) breaks the degeneracy, resulting in an anticrossing. This process can be understood as coherent flux tunneling between the flux island (loop) and the flux reservoir [Fig. 4(b)]. On the basis of this explanation, it is easy to understand that $\omega_q$ at $\Phi/\Phi_0 = 0.5$ decreases monotonically as a function of $E_J/E_C$ [Fig. 6(m), dashed lines].

At zero flux bias, the first excitation level is formed through the hybridization of states $\phi \approx \pm 2\pi$, as shown in Fig. 6(c). Since this hybridization requires tunneling through two potential barriers, the energy gap is significantly smaller than that at $\Phi/\Phi_0 = 0.5$. Hence, $\omega_q$ at zero bias is approximately $2\pi^2 E_L$ [$= E_L(\pm 2\pi)^2/2$] and weakly depends on $E_J/E_C$. This explains why $\omega_q$ at zero bias shows a plateau in Fig. 6(m).

As $E_C$ decreases further [Fig. 6(h)], the ground and excited states at zero bias become bound states within a well of the periodic potential. In this case, we can approximate the potential as a weakly nonlinear harmonic potential [Fig. 6(d)] as we did in Sec. III C 2. Hence, $\omega_q \approx \sqrt{8E_JE_C}/\hbar$. This explanation suggests that the physics of a loop-based qubit in this regime is actually close to that of two island-based qubits connected by an inductor, i.e., an inductively shunted junction [the inset of Fig. 6(d)]. The reason is that $L$ is very large in the regime $E_J/E_L \gg 1$ such that the reactance at $\omega_q$ is significant, whereas the circuit is electrically shorted at the low-frequency limit.

In the regime $E_J/E_L \sim 1$, the harmonic contribution to the potential is substantial; thus, it is difficult to separate the contributions from the periodic and harmonic potentials to the energy levels. One consequence is that the minimum of the potential becomes almost flat at $\Phi/\Phi_0 = 0.5$ as shown in Fig. 6(j). The other consequence is that, in Figs. 6(k) and 6(l), the first excitation level near zero bias already has a parabolic shape rather than a cross shape because the first excitation level at zero bias is mostly governed by the physics shown in Fig. 6(d), rather than the hybridization shown in Fig. 6(c). This explains why $\omega_q$ at $\Phi/\Phi_0 = 0$ in this regime decreases monotonically with increasing $E_J/E_C$ without any plateau in Fig. 6(m).

The experimentally accessible range of $E_J/E_C$ is typically from $\sim 0.1$ to $\sim 100$. In this range, $\omega_q$ of a loop-based qubit with $E_J/E_L \gg 1$ at $\Phi/\Phi_0 = 0.5$ is often too low to satisfy condition 1 in Sec. III A, while $\omega_q$ of a qubit with $E_J/E_L \sim 1$ at zero bias is too high. Regarding the anharmonicity, a loop-based qubit with $E_J/E_L \gg 1$ is more advantageous than that with $E_J/E_L \sim 1$ as shown in Fig. 6(n). If $E_L$ increases even further such that $E_J/E_L \ll 1$, the potential becomes almost harmonic, and as a result, the circuit does not show enough anharmonicity to be a qubit.
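The regimes discussed in this subsection can be explored numerically. The following sketch assumes a loop Hamiltonian of the form $4E_C\hat{N}^2 - E_J\cos(\hat{\phi} - \phi_\mathrm{ext}) + E_L\hat{\phi}^2/2$, with the external flux placed in the cosine term so that the wells at half flux sit near $\phi \approx \pm\pi$, and diagonalizes it on a finite-difference phase grid. The grid size, the cutoff `phi_max`, and the example energies are illustrative choices of ours.

```python
import numpy as np


def loop_qubit_levels(ec, ej, el, phi_ext, n_grid=801, phi_max=6 * np.pi, n_levels=3):
    """Lowest eigenvalues of 4*EC*N^2 - EJ*cos(phi - phi_ext) + EL*phi^2/2.

    N = -i d/dphi is discretized with central finite differences on a
    uniform phase grid with hard walls at +/- phi_max.
    """
    phi = np.linspace(-phi_max, phi_max, n_grid)
    d = phi[1] - phi[0]
    kin_diag = np.full(n_grid, 8.0 * ec / d**2)      # 4*EC * (2 / d^2)
    kin_off = np.full(n_grid - 1, -4.0 * ec / d**2)  # 4*EC * (-1 / d^2)
    potential = -ej * np.cos(phi - phi_ext) + 0.5 * el * phi**2
    h = np.diag(kin_diag + potential) + np.diag(kin_off, 1) + np.diag(kin_off, -1)
    return np.linalg.eigvalsh(h)[:n_levels]


def transition_frequency(ec, ej, el, phi_ext):
    levels = loop_qubit_levels(ec, ej, el, phi_ext)
    return levels[1] - levels[0]


if __name__ == "__main__":
    # Harmonic limit (EJ = 0): expect sqrt(8 * EL * EC).
    print("harmonic:", transition_frequency(1.0, 0.0, 1.0, 0.0), "vs", np.sqrt(8.0))
    # EJ/EL >> 1 (illustrative values in GHz): zero bias vs half flux.
    print("zero bias:", transition_frequency(2.5, 8.9, 0.5, 0.0))
    print("half flux:", transition_frequency(2.5, 8.9, 0.5, np.pi))
```

With $E_J = 0$ the result reduces to $\sqrt{8E_LE_C}$, and for $E_J/E_L \gg 1$ the transition frequency at half flux comes out far below the zero-bias value, in line with the discussion of condition 1 above.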

## IV. EFFECT OF NOISE

### A. Relaxation

#### 1. Concept

*independently* of its direction.

^{10,13}) $T_2$ is the time constant for the decay of the transverse component of the Bloch vector to zero. Note that there are two contributions to $T_2$ in Fig. 7(a): one is the shortening of arrows and the other is the spreading of arrows. The shortening of arrows is due to the growth of the longitudinal component, whereas the spreading is due to the loss of the phase coherence of the qubit, called dephasing. As shown in Figs. 7(b) and 7(c), dephasing is caused by the temporal fluctuation in qubit transition frequency. Hence, both thermalization and dephasing contribute to $T_2$, while $T_1$ is entirely determined by thermalization. This explanation can be written as
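In its standard textbook form, this decomposition reads $1/T_2 = 1/(2T_1) + 1/T_\varphi$, where $T_\varphi$ denotes the pure dephasing time. The sketch below assumes this standard relation; the symbol $T_\varphi$ and the function name are our notation. It evaluates $T_2$ for a given pair of time constants.

```python
import math


def t2_from_t1_and_tphi(t1, t_phi):
    """Transverse decay time from 1/T2 = 1/(2*T1) + 1/T_phi (standard form).

    All times must share the same unit. Pass math.inf for t_phi to get the
    thermalization-limited value T2 = 2*T1.
    """
    rate = 1.0 / (2.0 * t1) + 1.0 / t_phi
    return 1.0 / rate


if __name__ == "__main__":
    t1_us = 50.0  # hypothetical T1 in microseconds
    for t_phi_us in (math.inf, 200.0, 100.0, 20.0):
        t2 = t2_from_t1_and_tphi(t1_us, t_phi_us)
        print(f"T1 = {t1_us} us, T_phi = {t_phi_us} us -> T2 = {t2:.1f} us")
```

The relation makes the upper bound $T_2 \le 2T_1$ explicit: suppressing dephasing can at best bring $T_2$ to twice the thermalization time.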

#### 2. Thermalization

Equations (19)–(22) suggest that thermalization due to various noise processes acting on a qubit is determined by the circuit parameters and the off-diagonal matrix elements of $\hat{N}$ and $\hat{\phi}$, i.e., the overlap between wavefunctions in the circuit variable space.

Currently, there are three approaches to suppressing thermalization:

1. *Clean environment*: This approach eliminates the noise source by removing any unnecessary quantum systems, such as defects, which could possibly couple to the qubit. Naturally, this approach requires much knowledge and engineering regarding materials, such as host superconductors, substrates, and oxide layers.^{27–29} For example, it is known that a qubit on a silicon substrate usually shows a shorter $T_1$ than that on a sapphire substrate, partly because of a lossy amorphous silicon oxide layer.^{30} For a comprehensive review of this approach, see Ref. 31.
2. *Reducing participation ratio*: This approach reduces losses in dielectric media, such as oxides or organics at the surface of a qubit, by minimizing the participation ratio that is defined as the fraction of the electric field energy stored within the volume of each dielectric medium.^{32,33} Since an electric field in a planar device is highly concentrated near the edges, a qubit made of large superconducting pads with a simple design shows good performance in general.^{30,32}
3. *Reducing wavefunction overlap*: We can engineer the potential by choosing the geometry and parameters of the circuit to minimize the effective dipole moment, i.e., the wavefunction overlap in the circuit variable space, as shown in Fig. 7(e). This is the strategy that the so-called protected qubit takes.^{34–39} However, reducing the effective dipole moment inevitably makes the qubit difficult to control [compare Eqs. (22) and (34)].

#### 3. Dephasing

On the basis of what we have learned thus far, we explain two approaches to suppressing dephasing.

1. *Geometry*: We can select a circuit geometry that is insensitive to a certain type of noise. A fixed-frequency island-based qubit is insensitive to flux noise simply because there is no loop that can contain a flux [Fig. 4(a)]. For a loop-based qubit, the sensitivity to flux noise depends on the circuit parameters. If the qubit is in a circuit parameter range in which the qubit states are the circulating current states shown in Fig. 4(b) and the inset of Fig. 6(b), the qubit is insensitive to charge noise. The reason is that a continuously flowing DC supercurrent does not allow any charge offset within the current path, i.e., the circuit is electrically shorted in the low-frequency limit. However, such a state is sensitive to flux noise. If the circuit parameters are chosen such that the qubit states are similar to the island-like qubit shown in the inset of Fig. 6(d), then the qubit states are sensitive to charge noise but less sensitive to flux noise.
2. *Bias dependence*: Since $\Gamma_\phi$ is proportional to $\partial_\lambda \omega_q$ [Eq. (25)], we can choose the circuit parameters that give minimal bias dependence as shown in Fig. 7(f). In this regard, operating a qubit at a bias at which $\partial_\lambda \omega_q = 0$, called a sweet spot [red circles in Fig. 7(f)], is necessary because the qubit is first-order insensitive to noise at this particular bias [$\Gamma_\phi = 0$ in Eq. (24)].

Note that the energy of a qubit is conserved during dephasing, in contrast to thermalization. This allows us to recover the phase coherence by applying pulses that can revert the direction of the time evolution. Such a technique is called refocusing and will be discussed in Sec. IX C.
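The idea behind refocusing can be illustrated with a toy model. The sketch below is not the protocol of Sec. IX C; it only assumes that dephasing comes from a Gaussian-distributed detuning of the qubit frequency and that a single instantaneous $\pi$ pulse at the midpoint reverses the sign of the phase accumulated afterward. The noise strength `sigma` and the sample size are hypothetical.

```python
import numpy as np


def ramsey_coherence(detunings, t):
    """Ensemble-averaged coherence after free evolution for a time t."""
    return abs(np.mean(np.exp(1j * detunings * t)))


def echo_coherence(detunings_first_half, detunings_second_half, t):
    """Coherence when a pi pulse at t/2 inverts the phase accumulated afterward."""
    phase = detunings_first_half * t / 2 - detunings_second_half * t / 2
    return abs(np.mean(np.exp(1j * phase)))


rng = np.random.default_rng(0)
sigma = 2 * np.pi * 0.1e6  # detuning spread in rad/s (hypothetical value)
t = 3.0 / sigma            # evolve long enough for strong free dephasing

# Quasi-static noise: each run has one detuning that is constant in time.
delta = rng.normal(0.0, sigma, size=20000)
ramsey = ramsey_coherence(delta, t)
echo_static = echo_coherence(delta, delta, t)

# Fast noise: the detuning in the second half is independent of the first.
delta_late = rng.normal(0.0, sigma, size=20000)
echo_fast = echo_coherence(delta, delta_late, t)

print(f"free evolution      : {ramsey:.3f}")
print(f"echo, static noise  : {echo_static:.3f}")
print(f"echo, fast noise    : {echo_fast:.3f}")
```

In this model the echo fully restores the coherence when the frequency fluctuation is slow compared with the sequence, and only partially when the noise changes between the two halves, which is why refocusing targets low-frequency noise.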

In Sec. IV B, we briefly explore several noise-resilient qubit designs and discuss how to improve the robustness of the qubit by tuning the circuit parameters.

### B. Noise-resilient designs

#### 1. Island-based qubit

The most successful noise-resilient design of an island-based qubit is a transmon. As mentioned in Sec. IV, the dephasing rate of an island-based qubit in Fig. 8(a) is insensitive to flux noise because of the absence of a loop. To suppress the effect of charge noise, the transmon design pushes strategy 2 in Sec. IV A 3 to the limit: eliminating the $N_\mathrm{ext}$ dependence by choosing $E_J/E_C = 50$–100 (Fig. 5).

This limit can be achieved by adopting a shunt capacitor [red capacitor in Fig. 8(a)]. The shunt capacitor takes the majority of the effect of the charge noise and thus minimizes this effect on the junction. The physics of this idea is the same as adding a heavy mass to reduce the sensitivity to mechanical noise.

As mentioned in Sec. III C 2, the tradeoff is the reduced anharmonicity: in the large $E_J/E_C$ limit, the qubit wavefunctions are localized in the phase space; hence, a transmon is a weakly nonlinear harmonic oscillator. From this reasoning, we can easily imagine that, if we treat the qubit wavefunction as a rolling glass bead in a potential well, the bead sees more anharmonicity as the kinetic energy ($E_C$) increases. Indeed, the anharmonicity of a transmon is roughly given by $-E_C$ (see Sec. V B 1). $E_C$ is usually chosen to be 100–500 MHz to satisfy condition 2 in Sec. III A. Then, $E_J$ must be 10–30 GHz to satisfy condition 1 in Sec. III A. The resulting circuit parameters are summarized in Table IV.
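These design numbers can be checked with a back-of-the-envelope estimate. The sketch below combines the harmonic estimate $\sqrt{8E_JE_C}$ with the anharmonicity $-E_C$ quoted above; the first-order correction $-E_C$ to the qubit frequency is the usual transmon approximation and is an assumption here, valid only for $E_J/E_C \gg 1$. Energies are expressed as frequencies in GHz, and the example values are hypothetical.

```python
import math


def transmon_estimates(ej_ghz, ec_ghz):
    """Approximate (qubit frequency, anharmonicity) in GHz for EJ/EC >> 1."""
    f_q = math.sqrt(8.0 * ej_ghz * ec_ghz) - ec_ghz
    alpha = -ec_ghz
    return f_q, alpha


if __name__ == "__main__":
    ej, ec = 15.0, 0.3  # hypothetical design values in GHz
    f_q, alpha = transmon_estimates(ej, ec)
    print(f"EJ/EC = {ej / ec:.0f}, f_q ~ {f_q:.2f} GHz, alpha ~ {alpha * 1e3:.0f} MHz")
```

For $E_J = 15$ GHz and $E_C = 0.3$ GHz, this gives $E_J/E_C = 50$, a qubit frequency of about 5.7 GHz, and an anharmonicity of about $-300$ MHz, so conditions 1 and 2 in Sec. III A are met simultaneously.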

A DC SQUID is employed to tune the qubit frequency as explained in Sec. III B [rightmost figure in Fig. 8(a)]. However, in this case, the transmon is exposed to the flux noise. Therefore, we need to design a DC SQUID with minimal flux dependence based on Eq. (14).^{43} In addition, we have to operate the tunable transmon at the flux bias sweet spot.

#### 2. Loop-based qubit

The main difficulty in implementing a loop-based qubit is designing an inductor with sufficiently large inductance because the inductance of a superconducting loop made of aluminum or niobium is usually very small such that $E_L > E_J$. Consequently, the resulting anharmonicity is too small to satisfy condition 2 in Sec. III A as explained in Sec. III C 3.

A popular strategy is to add multiple Josephson junctions, where the Josephson energy for each junction is $\beta E_J$, as an effective inductor. Here, we still want to keep the current flowing in the loop dominated by the main junction [black junction in Fig. 8(b)]. (In the literature, the main junction is often called the “$\alpha$ junction,” where $\alpha = \beta^{-1}$, for historical reasons.) Since the flux tunneling rate through a Josephson junction is roughly proportional to $\exp(-\sqrt{E_J/E_C})$ (Ref. 44), $\beta$ must be larger than 1.

First, we consider the case when $N_J$ is 2 or 3 and $\beta \approx 2$ [upper figures in Fig. 8(b)]. The circuit with these parameters, which roughly corresponds to a loop-based qubit with $E_J/E_L \sim 1$, is called a flux qubit. Although the resulting energy level structure from Eq. (26) is not the same as that from Eq. (15), the overall dependence of the energy levels on the circuit parameters is qualitatively similar to that in Figs. 6(k)–6(n).

For a noise-resilient qubit, we need to select $E_J$ to minimize the $\phi_\mathrm{ext}$ dependence as mentioned in Sec. IV. At the same time, we also need to satisfy condition 1 in Sec. III A. It was found that $E_J \sim 10$–100 GHz and $\beta \approx 2$ balance these two requirements.^{23} However, there is a tradeoff: the qubit becomes sensitive to charge noise because the circulating currents are close to zero even at $\Phi/\Phi_0 = 0.5$. To circumvent this, a shunt capacitance is added to the main junction as we did for the transmon; thus, we have $E_C = 0.1$–1 GHz. The resulting circuit is called a capacitively shunted (C-shunt) flux qubit [upper rightmost figure in Fig. 8(b)].^{47}

One might ask, “can a flux-tunable transmon be considered as a kind of C-shunt flux qubit?” Our answer is yes, but the working flux bias at which condition 1 in Sec. III A is satisfied is different: a transmon is usually operated at zero flux bias, while a C-shunt flux qubit is operated at $\Phi/\Phi_0 = 0.5$.

With a sufficiently large $N_J$ ($\sim 10$–100), Eq. (26) can be treated as a linear inductor with $E_L \approx \beta E_J/N_J$ [lower figures in Fig. 8(b)], if the self-resonance frequency of the junction array is sufficiently higher than that of each junction, $\sqrt{8E_JE_C}/h$.^{48} This condition can be satisfied by limiting $N_J$ to $N_J \lesssim \sqrt{C_J/C_g}$, where $C_J$ is the capacitance across each junction and $C_g$ is the capacitance between the junction array and the ground. By tuning $\beta$ and $N_J$, we can satisfy $E_L \ll E_J$. A superconducting qubit in this regime is called a fluxonium or an RF SQUID qubit. In this case, it is easy for $\omega_q$ at zero bias to satisfy condition 1 in Sec. III A; at $\Phi/\Phi_0 = 0.5$, $\omega_q$ might be too low. This drawback can be resolved by employing active qubit initialization protocols (see Sec. VI E). According to Fig. 6(h), the anharmonicity is significantly larger than that of a flux qubit. Capacitive shunting has also been applied to a fluxonium [lower rightmost figure in Fig. 8(b)], resulting in improved $T_2$.^{42}

Last, we would like to point out that a Josephson junction array as a linear inductor itself is an interesting system. The reason is that it is difficult, although not impossible,^{49} to make a geometric inductor whose reactance exceeds the superconducting resistance quantum $R_Q = h/(2e)^2 \approx 6.5$ k$\Omega$ because of stray capacitance and radiation to vacuum, whose impedance is about 377 $\Omega$. Such a linear inductor whose impedance is similar to or larger than $R_Q$ is often called a superinductor. Thus, implementations of superinductors have usually been based on kinetic inductance,^{48,50,51} i.e., an inductive contribution to the impedance that arises from kinetic energy of the charge carrier, instead of geometric inductance (for further discussion about kinetic inductance, see Sec. VI B 2). Very recently, a qubit made of Josephson junction arrays with extremely high inductance ($E_L < 100$ MHz) succeeded in implementing the regime shown in Figs. 6(a) and 6(e).^{52}
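To get a feeling for the numbers involved, the sketch below evaluates $R_Q$ from the SI constants and converts an inductive energy into an inductance and a reactance. It assumes the standard definition $E_L = (\Phi_0/2\pi)^2/L$ together with the array estimate $E_L \approx \beta E_J/N_J$ from the text; the example values of $\beta$, $E_J$, $N_J$, and the probe frequency are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant (J s), exact SI value
E_CHARGE = 1.602176634e-19  # elementary charge (C), exact SI value
PHI0_REDUCED = H_PLANCK / (2 * math.pi) / (2 * E_CHARGE)  # Phi_0 / (2 pi), in Wb


def resistance_quantum():
    """Superconducting resistance quantum R_Q = h / (2e)^2 in ohms."""
    return H_PLANCK / (2 * E_CHARGE) ** 2


def array_el_ghz(beta, ej_ghz, n_junctions):
    """Inductive energy of an array of n junctions, each with energy beta*EJ."""
    return beta * ej_ghz / n_junctions


def inductance_from_el(el_ghz):
    """Inductance in henries from E_L = (Phi_0 / 2 pi)^2 / L, with E_L/h in GHz."""
    return PHI0_REDUCED ** 2 / (H_PLANCK * el_ghz * 1e9)


def reactance(inductance_h, freq_ghz):
    """Inductive reactance omega * L in ohms at the given frequency."""
    return 2 * math.pi * freq_ghz * 1e9 * inductance_h


if __name__ == "__main__":
    el = array_el_ghz(beta=2.0, ej_ghz=25.0, n_junctions=100)  # hypothetical array
    ind = inductance_from_el(el)
    print(f"R_Q = {resistance_quantum():.1f} ohm")
    print(f"E_L/h = {el:.2f} GHz -> L = {ind * 1e9:.0f} nH")
    print(f"reactance at 5 GHz = {reactance(ind, 5.0) / 1e3:.1f} kohm")
```

For these example values, $E_L/h = 0.5$ GHz corresponds to an inductance of roughly 330 nH, whose reactance at 5 GHz already exceeds $R_Q \approx 6.45$ k$\Omega$.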

## V. COUPLING

Thus far, we have explained how to make a qubit out of superconductors. To perform actual computation, a qubit must be coupled to other systems so that the qubit state can be controlled or read. The most commonly used framework for these operations is cavity quantum electrodynamics.^{53} It provides an integrated control/readout scheme via the interaction between an atom and a cavity. The same physics can be applied to a superconducting circuit as the interaction between a qubit and a resonator. This circuit version of cavity quantum electrodynamics is called circuit quantum electrodynamics (cQED).^{6,54,55} In addition, for multiqubit gate operation, qubit–qubit coupling is required. In this section, we discuss how to couple a qubit to other systems.

### A. Two coupled classical oscillators

Note that, although the coupling spring is always present, its effect on the dynamics strongly depends on the system parameters. When $\kappa$ is the static parameter $\kappa_0$, the two oscillators exchange their energy only when they are on-resonance [Fig. 9(a)]. Even if the oscillators are off-resonance, we can force them to exchange their energy by modulating $\kappa$ with the frequency difference between the two oscillators, $|f_1 - f_2|$, where $2\pi f_i = \sqrt{(k_i + \kappa_0)/m_i}$ [Fig. 9(b)]. These two phenomena can be seen in both classical and quantum systems regardless of whether statistics is fermionic (qubit) or bosonic (resonator).
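The resonance condition for energy exchange can be verified with a small simulation. The sketch below assumes the usual spring-block model, in which block $i$ feels the force $-k_i x_i - \kappa(x_i - x_j)$, and integrates it with the velocity-Verlet method for a static coupling $\kappa = \kappa_0$. The parameter values, the initial condition, and the definition of the energy of block 2 (which leaves out the cross term $-\kappa x_1 x_2$) are illustrative choices.

```python
import numpy as np


def energy_fraction_in_block2(k1, k2, kappa, m=1.0, t_end=100.0, dt=0.01):
    """Simulate two spring-coupled blocks; block 1 starts displaced, block 2 at rest.

    Returns the energy of block 2 over time, normalized to the initial energy.
    """
    x = np.array([1.0, 0.0])
    v = np.zeros(2)
    k = np.array([k1, k2])

    def accel(pos):
        # pos[::-1] swaps the two blocks, so pos - pos[::-1] is (x1 - x2, x2 - x1).
        return (-k * pos - kappa * (pos - pos[::-1])) / m

    n_steps = int(round(t_end / dt))
    energy2 = np.empty(n_steps)
    a = accel(x)
    for i in range(n_steps):
        x = x + v * dt + 0.5 * a * dt**2
        a_new = accel(x)
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        energy2[i] = 0.5 * m * v[1] ** 2 + 0.5 * (k2 + kappa) * x[1] ** 2
    initial_energy = 0.5 * (k1 + kappa) * 1.0**2
    return energy2 / initial_energy


if __name__ == "__main__":
    on_res = energy_fraction_in_block2(k1=1.0, k2=1.0, kappa=0.05)
    off_res = energy_fraction_in_block2(k1=1.0, k2=1.5, kappa=0.05)
    print(f"max energy fraction in block 2, on resonance : {on_res.max():.2f}")
    print(f"max energy fraction in block 2, off resonance: {off_res.max():.2f}")
```

On resonance, nearly all of the energy migrates to the second block after half a beat period, whereas for detuned blocks only a few percent is ever transferred.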

Next, we inject energy into the system by two methods. One is to modulate the coupling constant with $f_m = f_1 + f_2$. In this case, as one can see in Fig. 9(b), both $x_1$ and $x_2$ increase exponentially with time. This is parametric amplification, which is important for realizing noiseless amplification. The concept and applications of parametric amplification will be discussed further in Sec. VI B 2. The other is to drive oscillator 1 with the frequency $f_d$. When $f_d = f_2$, $x_2$ increases linearly with time.

When we apply the physics learned from these energy injection processes, we need to consider the quantum statistics. If two coupled quantum systems are bosonic, we can simply interpret the displacement of the blocks as the population. However, if one or both of the systems are fermionic, we will see an oscillation in the population, instead of the linear increase that we saw in Fig. 9(c). Such an oscillation is called the Rabi oscillation, which will be discussed further in Sec. VI C.

### B. From circuit to atom

#### 1. Qubit, resonator, and somewhere between them

Although Eq. (32) successfully models several important properties of a weakly anharmonic/nonlinear system, complicated circuits still require a unified description of superconducting circuits over a wide range of nonlinearity. Moreover, off-resonant resonator modes are known to contribute substantially to the relaxation times of a qubit via the Purcell effect (see Sec. VI B 1).^{56} To remedy these issues, semiclassical superconducting circuit quantization methods have been proposed and have shown good agreement with experimental data. Interested readers should see Refs. 33 and 57–60.

For the rest of this section, we model a superconducting qubit as an ideal two-level system because this provides a qualitatively satisfactory picture to understand the physics of various couplings associated with a qubit at the level of this Tutorial.

#### 2. Qubit–resonator coupling

The transverse coupling mediates the energy exchange between the qubit and the resonator [Fig. 10(b)]. Thus, the transverse coupling is effective when the coupled system has a mode whose frequency is close to $\omega_q$, as we saw in Fig. 9(a). The longitudinal coupling changes the qubit frequency. It is effective when $\omega_q$ varies considerably with the external bias [Fig. 10(b)]. Note that the physics of the relaxation processes in Sec. IV can be understood within this framework; the transverse and longitudinal couplings are actually the mechanisms for thermalization and dephasing, respectively.

Equations (34) and (36) suggest that if $\omega_q$ does not depend on the physical parameter that is coupled to the effective dipole moment, the voltage in this case, there is no longitudinal coupling. Note that, because of this, the dominant coupling associated with a qubit at its sweet spot is the transverse coupling. One consequence is that the only possible coupling associated with a capacitively coupled transmon is the transverse coupling because $\omega_q$ of a transmon is insensitive to the external voltage fluctuation, i.e., a transmon is always at its charge bias sweet spot. To implement the longitudinal coupling, a transmon needs a flux-tunable element, such as a DC SQUID, and should be coupled to the target system inductively.^{61}

#### 3. Qubit–qubit coupling

In Eq. (42), the $XX$ interaction corresponds to the transverse interaction. Regarding the longitudinal interaction, there is ambiguity in its definition. If we follow the convention in the qubit–resonator interaction consistently, only the $XZ$ and $ZX$ interactions must be called the longitudinal interactions. However, a considerable number of papers designate all non-transverse interactions, including the $ZZ$ interaction, as the longitudinal interactions. In this Tutorial, we use the term “longitudinal interaction” for the qubit–resonator interaction only. For the qubit–qubit interaction, we name the type of interaction explicitly, such as the $XZ$ interaction, for clarity.

If $J$ is static and $|\omega_{q1} - \omega_{q2}| \gg J$, it is clear that the coupling term will be averaged out and consequently cannot be used for two-qubit gate operation unless one of the following actions is taken: (i) tuning $\omega_{q1}$ or $\omega_{q2}$ so that $\omega_{q1} \approx \omega_{q2}$; (ii) modulating $J$ with the frequency $|\omega_{q1} \pm \omega_{q2}|$ to cancel out oscillating factors; or (iii) adding an additional drive term. These strategies are based on the lessons learned in Sec. V A and will be the basis of two-qubit gates in Sec. VI D.
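How strongly a static coupling is suppressed by detuning can be made concrete with a small sketch. It assumes a transverse exchange coupling restricted to the single-excitation subspace spanned by $|10\rangle$ and $|01\rangle$, with $\hbar = 1$ and the hypothetical effective Hamiltonian written in the code; the function names and numbers are illustrative and are not taken from Eq. (44).

```python
import numpy as np

def max_swap_probability(delta, J):
    """Peak probability of |10> -> |01> for H = [[delta/2, J], [J, -delta/2]]."""
    return 4 * J**2 / (4 * J**2 + delta**2)

def swap_probability(delta, J, t):
    """The same transfer probability from explicit time evolution (hbar = 1)."""
    H = np.array([[delta / 2, J],
                  [J, -delta / 2]])
    E, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T
    return abs(U[1, 0])**2

J = 1.0
for delta in (0.0, 2.0, 10.0):                  # detuning in units of J
    omega = np.sqrt(J**2 + delta**2 / 4)        # generalized oscillation frequency
    t_peak = np.pi / (2 * omega)                # time of maximum transfer
    print(delta, swap_probability(delta, J, t_peak), max_swap_probability(delta, J))
```

In this model, a detuning of ten times $J$ already limits the transfer to a few percent, which is the sense in which the static coupling is averaged out.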

It is often necessary to couple two qubits separated by a macroscopic distance. In this case, a resonator or even a qubit is employed as a coupler; such a scheme is called indirect coupling (Fig. 11). Here, we need to be careful not to excite the coupler itself; otherwise, the information will leak to the Hilbert space of the coupler. Hence, the resonance frequency of the coupler must be significantly far from the transition frequency of the qubits such that $|\omega_r - \omega_{qi}| \gg g^{(i)}$, where $g^{(i)}$ is the transverse coupling constant associated with the resonator and qubit $i$. The coupler mediates the exchange of virtual photons between the two qubits. Such a system can also be modeled as Eq. (44).^{66,67}
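The virtual-photon picture can be checked numerically in the single-excitation subspace. The sketch below assumes two identical qubits coupled with the same strength $g$ to one resonator, sets $\hbar = 1$, and compares the splitting of the two qubit-like modes with the second-order estimate $2g^2/|\omega_q - \omega_r|$. This estimate is a standard perturbative result quoted here as an assumption, and the function name and numbers are hypothetical.

```python
import numpy as np

def qubit_like_splitting(wq, wr, g):
    """Splitting of the two qubit-like modes in the basis (qubit 1, qubit 2, coupler)."""
    H = np.array([[wq, 0.0, g],
                  [0.0, wq, g],
                  [g, g, wr]])
    E = np.linalg.eigvalsh(H)
    # The two eigenvalues closest to wq belong to the qubit-like modes
    near = np.sort(E[np.argsort(np.abs(E - wq))[:2]])
    return near[1] - near[0]

wq, wr, g = 5.0, 7.0, 0.1          # e.g., in units of 2*pi*GHz (illustrative)
exact = qubit_like_splitting(wq, wr, g)
estimate = 2 * g**2 / abs(wq - wr) # twice the mediated exchange coupling
print(exact, estimate)
```

The two qubits have no direct coupling term in this matrix, so the splitting arises entirely through the detuned coupler.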

### C. Strong (transverse) coupling

In this subsection, we consider how to quantify the strength of the transverse coupling because the current standard qubit control and readout methods are based on the transverse coupling. (There are many studies on the potential use of the longitudinal qubit–resonator coupling for quantum computation. Interested readers should see Refs. 61 and 63–65.) For efficient qubit control and readout, we need a reasonably strong qubit–resonator coupling; otherwise, the signal will be too small and the control will be too slow. Similarly, we also need a strong qubit–qubit interaction for efficient two-qubit gate operation (see Secs. VI B 1 and VI D for further explanation). Then, what are the criteria that must be satisfied to be called a strong coupling?

The strength of the qubit–resonator coupling is usually characterized by three quantities: $g$, $\kappa$, and $\gamma$ [Fig. 10(a)]. Here, $g/2\pi$ is the transverse coupling strength in Hz, $\kappa/2\pi$ is the loss rate of photons from the resonator, i.e., the spectral linewidth of the resonator, in Hz ($\kappa = \omega_r/Q$, where $Q$ is the quality factor of the resonator), and $\gamma/2\pi$ ($= 1/\pi T_2$) is the transverse relaxation rate, i.e., the spectral linewidth, of the qubit in Hz. When the system satisfies $g > \kappa/2, \gamma/2$, the coupling is regarded as a strong coupling. The physical meaning is clear: to ensure a strong qubit–resonator interaction, the photon must stay in the resonator and the qubit needs to keep its coherence while the two systems exchange their energy.

The experimental signature of a strong qubit–resonator or qubit–qubit coupling is an anticrossing called the vacuum Rabi splitting (Fig. 12). Such a situation is well described by the Jaynes–Cummings Hamiltonian [Eq. (41)]. In the Jaynes–Cummings Hamiltonian, when the qubit and the resonator are far off-resonance, the ground state of the entire system is roughly given by $|g0\rangle$ (biases a and c in Fig. 12), where $|ij\rangle$ denotes the quantum state where the $i$th state of the bare qubit and the $j$th state of the bare resonator are occupied. At on-resonance, the ground state becomes $(|g1\rangle + |e0\rangle)/\sqrt{2}$ because of the hybridization between the qubit state and the resonator state (bias b in Fig. 12). In the time domain, the population of the two systems oscillates out-of-phase. This oscillation is called the vacuum Rabi oscillation.
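As a quick illustration of this criterion, the following sketch converts a quality factor and a $T_2$ into $\kappa$ and $\gamma$ using the relations above and tests the inequality. All numerical values are hypothetical and chosen only to show the orders of magnitude involved; the function name is not from this Tutorial.

```python
import math

def is_strong_coupling(g, kappa, gamma):
    """True when g exceeds both kappa/2 and gamma/2 (all in the same angular units)."""
    return g > kappa / 2 and g > gamma / 2

# Hypothetical device numbers, angular frequencies in rad/s
omega_r = 2 * math.pi * 7e9      # resonator frequency
Q = 1e4                          # resonator quality factor
T2 = 20e-6                       # qubit coherence time in seconds
g = 2 * math.pi * 100e6          # transverse coupling strength

kappa = omega_r / Q              # kappa = omega_r / Q
gamma = 2 / T2                   # from gamma / 2pi = 1 / (pi * T2)

print(kappa / (2 * math.pi), gamma / (2 * math.pi), is_strong_coupling(g, kappa, gamma))
```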
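The anticrossing can be reproduced with a two-state sketch. It assumes the single-excitation block of a Jaynes–Cummings-type Hamiltonian in the basis $\{|g1\rangle, |e0\rangle\}$ with $\hbar = 1$; the variable names and the numbers (in arbitrary frequency units) are illustrative only.

```python
import numpy as np

def dressed_levels(wq, wr, g):
    """Eigenenergies of the one-excitation block, basis (|g1>, |e0>)."""
    H = np.array([[wr, g],
                  [g, wq]])
    return np.linalg.eigvalsh(H)

wr, g = 7.0, 0.1
wq_sweep = np.linspace(6.5, 7.5, 1001)       # sweep the qubit through resonance
gaps = np.array([np.diff(dressed_levels(wq, wr, g))[0] for wq in wq_sweep])

min_gap = gaps.min()                         # size of the anticrossing
wq_at_min = wq_sweep[gaps.argmin()]          # where it occurs
print(min_gap, wq_at_min)
```

In this model, the minimum gap equals $2g$ and occurs at $\omega_q = \omega_r$, so the splitting measured at the anticrossing gives a direct estimate of the coupling strength.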

*vacuum* mode of the resonator. (There is no longitudinal coupling in our classical oscillators because this system is harmonic. The state of a harmonic system, i.e., boson, cannot be represented in the Bloch sphere because there is no well-defined geometrical quantization axis. However, bosons can couple to each other and exchange their energy; we just call this coupling transverse to be consistent with that for fermionic systems.)

The strong transverse qubit–qubit coupling also yields a similar anticrossing. However, the transition probability, i.e., the strength of the signal, near the anticrossing is more complex than that of the qubit–resonator coupling. The reason is that there are two types of symmetry, triplet and singlet, associated with the quantum states of two entangled qubits, and transitions between different symmetries are forbidden.^{66,68}

Note that, compared with other quantum systems, a superconducting planar circuit is a particularly convenient system for realizing a strong coupling because the low-dimensional nature of this system results in a strongly concentrated electromagnetic field profile and consequently produces a large $V_{r,0}$ in Eq. (34).

## VI. IMPLEMENTATION OF QUANTUM COMPUTATION

### A. Equation of motion

^{53,69}

Although the Lindblad master equation is an appropriate tool to describe the dynamics of a quantum system induced by uncontrolled interactions with the environment, we need another formalism that describes the interaction between the system and a “controlled” environment, such as traveling electromagnetic fields through transmission lines, to model an actual experiment. Input–output theory provides this formalism. Here, “input” refers to the field that drives the system and “output” refers to the field that propagates away from the system. Interested readers should see Refs. 70–72.

### B. Readout

#### 1. Dispersive readout

Readout of a qubit state means transferring the information of the qubit state to a change in a physical quantity of a classical device. At the time of writing, the standard method of detecting the superconducting qubit state is dispersive readout, i.e., detecting the qubit state by observing the shift in the resonance frequency of a readout resonator interacting with the qubit [Fig. 13(a)].

The advantages of dispersive readout are that (i) it does not rely on the dominant degree of freedom of a qubit, such as charge or flux, and (ii) it is nondestructive. Before dispersive readout, a single-electron transistor was employed for island-based qubit readout and a DC SQUID was used for loop-based qubit readout because of their excellent sensitivity to charge and flux, respectively. The problem was that if the eigenstates of the qubit show significant spread or superposition in the number or phase basis [Figs. 5(d) and 5(e)], which happens in all noise-resilient qubits mentioned in Sec. IV B, these quantity-specific detection methods are not effective and often suffer from a strong backaction that disturbs the subsequent evolution of the measured observable. As a result, the qubit state becomes uncertain after the readout. This prevents any feedback scheme based on the measurement outcome.

In the dispersive readout scheme, a qubit state is detected and controlled by a resonator via a strong qubit–resonator interaction. However, near on-resonance ($\omega_q \approx \omega_r$), we cannot selectively detect or control the qubit state because, in this regime, the strong qubit–resonator interaction hybridizes the qubit and resonator states (see Sec. V C). Hence, we detune $\omega_q$ such that the qubit–resonator detuning $\Delta_{qr} (\equiv \omega_r - \omega_q)$ is much greater than $g$ and $\kappa$. This limit is called the dispersive limit. In this off-resonant regime, a qubit transition induced by photon exchange with the resonator is negligible. However, the qubit shows small but easily measurable frequency shifts that depend on the resonator state; at the same time, the resonator also shows a small frequency shift that depends on the qubit state. The qubit state is detected by measuring this frequency shift of the resonator.

^{75,76} (the same results can be obtained using the standard perturbation theory^{53,62}). A unitary operator $\hat{U}_{\mathrm{disp}} = e^{\hat{S}}$ for the Schrieffer–Wolff transformation is defined such that $\hat{S}^{\dagger} = -\hat{S}$ and $[\hat{S}, \hat{H}_0] = -\hat{H}_{qr}$. Then, we have (use the formulas in Table II)

Note that the dispersive term [$\chi \hat{\sigma}_z \hat{a}^{\dagger} \hat{a}$ in Eq. (57)] commutes with the bare qubit and bare resonator terms; in other words, measuring the qubit state does not disturb the subsequent evolution of the qubit and resonator, meaning that the dispersive readout scheme is nondestructive. Such a measurement scheme is called Quantum NonDemolition (QND) measurement.^{53,77,78} Here, we emphasize that the term “nondemolition” does not mean the absence of wavefunction collapse. If the measurement scheme is QND-type, a measured qubit remains in the eigenstate that we record as a measurement outcome, and subsequent measurements reproduce the outcome of the first measurement.
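The qubit-state-dependent resonator frequency can be checked by diagonalizing a Jaynes–Cummings-type Hamiltonian numerically. The sketch below assumes an ideal two-level qubit, the rotating-wave form $\omega_r \hat{a}^{\dagger}\hat{a} + (\omega_q/2)\hat{\sigma}_z + g(\hat{a}^{\dagger}\hat{\sigma}_- + \hat{a}\hat{\sigma}_+)$ with $\hbar = 1$, and hypothetical parameter values. Dressed states are labeled by their largest overlap with a bare state, which is a simplification that is reasonable only deep in the dispersive limit. The sign convention of the shift here is the sketch's own and may differ from that of Eq. (57).

```python
import numpy as np

def jc_hamiltonian(wr, wq, g, n_fock=10):
    """Qubit (basis order g, e) coupled to a resonator truncated at n_fock levels."""
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)   # resonator annihilation operator
    sz = np.diag([-1.0, 1.0])                         # sigma_z in the (g, e) order
    sm = np.array([[0.0, 1.0], [0.0, 0.0]])           # sigma_minus = |g><e|
    I_q, I_r = np.eye(2), np.eye(n_fock)
    return (wr * np.kron(I_q, a.T @ a)
            + 0.5 * wq * np.kron(sz, I_r)
            + g * (np.kron(sm.T, a) + np.kron(sm, a.T)))

def resonator_frequency(wr, wq, g, qubit, n_fock=10):
    """Dressed one-photon transition frequency with the qubit in g (0) or e (1)."""
    E, V = np.linalg.eigh(jc_hamiltonian(wr, wq, g, n_fock))

    def dressed(q, n):
        bare_index = q * n_fock + n
        return E[np.argmax(np.abs(V[bare_index, :]))]

    return dressed(qubit, 1) - dressed(qubit, 0)

wr, wq, g = 7.0, 5.0, 0.1                   # illustrative, in units of 2*pi*GHz
shift = resonator_frequency(wr, wq, g, 1) - resonator_frequency(wr, wq, g, 0)
estimate = 2 * g**2 / (wq - wr)             # perturbative pull between e and g
print(shift, estimate)
```

In this sketch, the resonator frequency differs by about $2g^2/|\Delta_{qr}|$ between the two qubit states, and this difference is the quantity that the readout tone probes.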

For quantum error correction, the state of an ancilla qubit (see Sec. VII) must be determined in a single shot, i.e., without averaging the output signals of repeated identical measurements. Thus, maximizing the Signal-to-Noise Ratio (SNR) is crucial. While we can enhance the SNR by increasing the probe power, i.e., the average number of photons $\bar{n}$ for the detection of the resonator state, $\bar{n}$ must be significantly less than the critical photon number $n_{\mathrm{crit}}$, which is given by $\Delta_{qr}^2/4g^2$.^{55} If not, Eq. (57) would no longer be valid. Then, the readout process is no longer QND, and the photons induce an unwanted qubit transition as a backaction.^{79} The resulting change in qubit population during the readout process reduces the readout fidelity. In experiments, it was found that $\bar{n}$ must be $\lesssim 10$ for high-fidelity readout.^{79,80}

The next question we have to consider is the following: with a given $\bar{n}$, what is the optimal readout condition that ensures fast readout with high fidelity? From Fig. 13(a), we can see that $\kappa$ should not be too large compared with $\chi$; otherwise, the frequency shift would be difficult to observe. The opposite limit, the small $\kappa$ limit, is not desirable either; it would make the readout process inefficient because photons would stay too long in the resonator, resulting in a small signal. Careful theoretical and experimental studies^{80,81} found that the condition for the best SNR is $2\chi = \kappa$. In experiments, the quality factor ($= \omega_r/\kappa$) of the readout resonator is usually designed to be of the order of 100–1000.^{82} If we lower the quality factor further, the readout process will be faster; however, achieving a comparable $\chi$ might not satisfy the dispersive limit because a large $\chi$ is achieved by either enhancing $g$ or reducing $\Delta_{qr}$. Moreover, if $g$ is on the order of $0.1\omega_q$ or larger, the qubit–resonator coupling enters the so-called ultrastrong coupling regime, in which the RWA is no longer applicable.^{83,84}
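The design relations in this discussion can be combined into a small calculator. It is a sketch only: it assumes the two-level estimate $\chi = g^2/\Delta_{qr}$, the best-SNR condition $2\chi = \kappa$, and the standard expression $n_{\mathrm{crit}} = \Delta_{qr}^2/4g^2$; the function name and the input values are hypothetical.

```python
import math

def readout_numbers(f_r, f_q, g_hz):
    """Rough readout design numbers; all inputs are frequencies in Hz."""
    delta = f_r - f_q                 # Delta_qr / 2pi
    chi = g_hz**2 / delta             # two-level estimate of chi / 2pi
    kappa = 2 * abs(chi)              # linewidth from the condition 2*chi = kappa
    return {
        "chi_hz": chi,
        "kappa_hz": kappa,
        "Q": f_r / kappa,             # Q = omega_r / kappa
        "n_crit": delta**2 / (4 * g_hz**2),
    }

nums = readout_numbers(f_r=7e9, f_q=5e9, g_hz=100e6)
print(nums)
```

For these illustrative inputs, the resulting quality factor falls inside the 100–1000 range quoted above, and the critical photon number is well above the photon numbers typically used for readout.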

Another important phenomenon that must be considered is the Purcell effect, which refers to the increase in the spontaneous emission rate of a qubit, i.e., the reduction in $T_1$, by the resonator coupled to the qubit. The Purcell effect is maximized when $\omega_q = \omega_r$ because it is caused by the resonator concentrating the density of decay channels.^{62} Therefore, if $\Delta_{qr}$ is not large enough, $T_1$ will be severely limited.^{56} The problem is that a large $\Delta_{qr}$ is not always desirable because the resulting $\chi$ might be too small to ensure a high readout fidelity. To maximize the readout fidelity while maintaining the fast readout capability, the Purcell filter was developed. Contrary to the Purcell effect, the Purcell filter “filters out” the density of decay channels near the qubit transition frequency; hence, it protects the qubit from the unwanted acceleration of energy relaxation. For details, see Refs. 85 and 86.

In a real experiment, the measured dispersive shift can be significantly different from the value of $g^2/\Delta_{qr}$ because of the contribution from higher excitation levels interacting with the resonator. To fully account for this contribution, we also need to calculate $g_{ij}$ using Eq. (34), and $\chi_{ij} = |g_{ij}|^2/(\omega_r - \omega_{i\text{-}j})$, where $\omega_{i\text{-}j}$ is the energy *released* in the transition from $|i\rangle$ to $|j\rangle$ ($i \neq j$) (note that $\omega_{i\text{-}j}$ is negative when $i < j$). The total dispersive shift $\chi_{\mathrm{total}}$, which is what we observe in experiments, is given by $\sum_{j=0}^{M} [(\chi_{j1} - \chi_{1j}) - (\chi_{j0} - \chi_{0j})]/2$, where $M$ is the cutoff energy level.^{87,88} Fortunately, by considering $\chi_{\mathrm{total}}$ as an empirical $\chi$, we can understand the readout process with the physics explained in this section.

On the basis of the physics we have explored thus far, the dispersive readout process can be described as follows:^{81} (i) A set of photons, the energy of each of which is about $\omega_r$, enter the resonator. (ii) The qubit state information is encoded to the photons, for example, as the phase of the transmission, by the qubit–resonator interaction. The measurement-induced dephasing caused by the same interaction makes the qubit state lose its phase coherence and collapse into $|0\rangle$ or $|1\rangle$. (iii) The photons escape from the resonator and are then detected.

#### 2. Josephson parametric amplifier

Although dispersive readout can give a reasonably good SNR, achieving single-shot readout is still difficult because the SNR is degraded by thermal noise while the signal travels from the chip to the room-temperature instruments. This is an inevitable consequence of the limited number of microwave photons, the energy of each of which is orders of magnitude smaller than the room-temperature thermal energy. Hence, the pre-amplification of the signal before detection is indispensable. One might think that using multiple amplifiers will solve the problem. However, this is not a good option because, if a signal passes a chain of amplifiers, the SNR is primarily established by the noise figure of the first amplifier in the chain (Friis formula). Therefore, having a good amplifier immediately after the qubit is important.

Commercially available High-Electron-Mobility Transistor (HEMT) amplifiers are widely used and installed in the output microwave wiring (at the 4 K plate of the dilution refrigerator), typically providing a gain of 30–40 dB. Although an HEMT amplifier has a high gain and broad operation bandwidth, it adds, on average, 10–20 noise photons to the signal photons, which in turn worsens the SNR. To overcome this, practically noiseless parametric amplifiers were developed using Josephson junctions.
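The dominance of the first amplifier can be seen from the Friis formula written in terms of noise temperatures, $T_{\mathrm{sys}} = T_1 + T_2/G_1 + T_3/(G_1 G_2) + \cdots$. The sketch below evaluates it for a hypothetical two-stage chain; the gains and noise temperatures are illustrative placeholders, not specifications of any real device, and cable losses between stages are ignored.

```python
def db_to_linear(gain_db):
    """Convert a power gain in dB to a linear ratio."""
    return 10 ** (gain_db / 10)

def system_noise_temperature(stages):
    """Friis formula; stages is a list of (linear gain, noise temperature in K),
    ordered from the first amplifier after the sample to the last."""
    total, gain_so_far = 0.0, 1.0
    for gain, noise_temperature in stages:
        total += noise_temperature / gain_so_far   # later stages are divided down
        gain_so_far *= gain
    return total

low_noise_first = [(db_to_linear(20), 0.3), (db_to_linear(40), 4.0)]
noisy_first = [(db_to_linear(40), 4.0), (db_to_linear(20), 0.3)]

print(system_noise_temperature(low_noise_first))   # dominated by the 0.3 K stage
print(system_noise_temperature(noisy_first))       # dominated by the 4 K stage
```

Swapping the order of the same two amplifiers changes the system noise by more than a factor of ten in this example, which is why the quietest amplifier is placed first.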

A parametric amplifier basically transfers energy from a strong pump to a weak signal by mixing the signal and pump frequencies via the modulation of the reactance. Hence, the reactance is a time-varying parameter, from which the name of the amplifier originates. This type of amplifier has low noise because it modulates the reactance instead of the resistance. In superconducting circuits, a variable inductor can be implemented using Josephson junctions, hence the name Josephson Parametric Amplifier (JPA).

The physics of parametric amplification was introduced in Fig. 9(b): $x_1$ and $x_2$ correspond to the signal (non-zero initial value) and the idler (zero initial value), respectively. Here, the idler is a tone generated during the amplification process as a consequence of the energy conservation: $\omega_m = \omega_s + \omega_I$, where $\omega_m$ is the modulation frequency, $\omega_s$ is the signal frequency, and $\omega_I$ is the idler frequency.

In general, JPAs can be categorized on the basis of two factors. One is how to maximize the time for energy transfer from the pump to the signal. This can be achieved by using either a resonator (multiple bounces in a cavity) or a long waveguide. The other is how to modulate the inductance. Depending on the modulation method or operation conditions, $\omega_m$ can be either the same as or twice the pump frequency $\omega_p$.

^{89} Here, we decompose $\phi_{\mathrm{ext}}$ into the DC component $\phi_{\mathrm{ext}}^{\mathrm{dc}}$ and the pump component $\phi_{\mathrm{ext}}^{p}$. We can operate the amplifier in two regimes depending on $\phi_{\mathrm{ext}}^{\mathrm{dc}}$:

If $\phi_{\mathrm{ext}}^{\mathrm{dc}} = 0$ [bias a in Fig. 14(b)], $L_{J,\mathrm{eff}}$ varies quadratically with $\phi_{\mathrm{ext}}^{p}$ because $1/\cos(x/2) \approx 1 + x^2/8$. This results in $\omega_m = 2\omega_p$, where $\omega_p = \omega_r$. Such a process is called the four-wave mixing process ($\omega_s$, $\omega_I$, and two $\omega_p$).

For a suitable value of $\phi_{\mathrm{ext}}^{\mathrm{dc}}$, we can have an appreciable contribution from the linear term in Eq. (59). Bias b in Fig. 14(b) is an example of this. In this case, $\omega_m = \omega_p$. The parametric amplification of the signals is then achieved by applying a pump tone with $\omega_p = 2\omega_r$. This process is called the three-wave mixing process ($\omega_s$, $\omega_I$, and one $\omega_p$). Advantages of this operation are that (i) we can easily separate the pump tone and the signal in the frequency domain, and (ii) we can tune $\omega_r$ by adjusting $\phi_{\mathrm{ext}}^{\mathrm{dc}}$.
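The difference between the two bias regimes can be verified numerically. The sketch below assumes that the effective inductance scales as $1/|\cos(\phi/2)|$, as suggested by the expansion quoted above, and extracts the linear and quadratic coefficients of a small pump excursion around a DC bias by finite differences. The function names, the step size, and the bias values are hypothetical.

```python
import numpy as np

def inductance_factor(phi):
    """Assumed flux dependence of the effective inductance, up to a constant prefactor."""
    return 1.0 / np.abs(np.cos(phi / 2.0))

def taylor_coefficients(phi_dc, h=1e-3):
    """Coefficients c1, c2 in f(phi_dc + d) ~ f(phi_dc) + c1*d + c2*d**2."""
    f0 = inductance_factor(phi_dc)
    fp = inductance_factor(phi_dc + h)
    fm = inductance_factor(phi_dc - h)
    c1 = (fp - fm) / (2 * h)             # central first difference
    c2 = (fp - 2 * f0 + fm) / (2 * h**2) # half of the central second difference
    return c1, c2

c1_zero, c2_zero = taylor_coefficients(0.0)            # zero DC bias
c1_bias, c2_bias = taylor_coefficients(0.6 * np.pi)    # finite DC bias
print(c1_zero, c2_zero)
print(c1_bias, c2_bias)
```

At zero bias the linear coefficient vanishes and the quadratic coefficient is 1/8, so the inductance follows the square of the pump (modulation at $2\omega_p$); at a finite bias the linear coefficient is nonzero, so the inductance follows the pump itself (modulation at $\omega_p$).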

There is another method for the inductance modulation, called current pumping [green arrows in Fig. 14(a)].^{90} In this method, the inductance of a JPA is modulated by applying a large current, i.e., pumping current, flowing through the Josephson junctions. Roughly speaking, the number of charge carriers, i.e., Cooper pairs, is locally and partially reduced in a Josephson junction owing to its weak link nature. Because of this, the charge carriers must have a higher speed near the junctions to maintain the same current in and out of the junctions. The resulting large kinetic energy of the charge carriers contributes to the inductance of the circuit, in addition to the geometric inductance. This additional inductance is called the kinetic inductance. Since the kinetic energy is proportional to the square of the velocity of the charge carriers, the kinetic inductance is roughly proportional to the square of the pump current. Thus, the current-pumped amplification is a four-wave mixing process regardless of a DC flux bias.^{93} Such an inductive contribution from current pumping shifts the resonance frequency of the resonator as shown in Fig. 14(c). This characteristic line shape can be modeled as a Duffing oscillator.^{91,92}

The resonator-based JPA is relatively easy to make but suffers from a gain-bandwidth tradeoff. The reason is that, to have a higher gain, the signal must bounce in the resonator more; this requires a higher quality factor and inevitably reduces the bandwidth. A waveguide-based JPA is called a Josephson Traveling-Wave Parametric Amplifier (JTWPA). A JTWPA is free from the gain-bandwidth tradeoff. This allows us to achieve high-bandwidth (several gigahertz) and high-gain ( $>20$ dB) amplification.^{94} Since the waveguide has to be nonlinear to mix the signal and pump tones, it is implemented using a long Josephson junction array with current pumping.

The challenge in making a JTWPA is the phase matching: the phases of all interacting tones (pump, signal, and idler) must be matched (a resonator-based JPA is free from this problem because it is geometrically confined). The phase mismatch results from changes in phase velocity, i.e., the refractive index, caused by the interaction between the strong pump tone and the nonlinear medium. To match the phase, we create a local distortion in the dispersion relation as shown in Fig. 14(e). Then, depending on the pump current $I_p$, a value of $\omega_p$ that matches the phase can be chosen. If $I_p$ is very small compared to the critical current of the junction, the phase mismatch will be small; thus, a frequency where the wave vector is close to that from the linear dispersion relation (long-dashed line) will match the phase well. Frequency a in Fig. 14(e) is an example of this. For $I_p$ comparable to the junction critical current, we have to choose, as $\omega_p$, a frequency where the dispersion relation deviates significantly from the linear one; frequency b is such an example. For actual implementation, we can either insert a resonant structure near each Josephson junction^{94,95} or periodically modulate the refractive index similarly to what is done in photonic crystals [Fig. 14(d)].^{96} Here, modulating the refractive index can be achieved by modulating the size of the junctions in DC SQUIDs.

These JPAs can perform as quantum-limited amplifiers, which add only the minimum noise allowed by the laws of quantum mechanics. For a phase-preserving linear amplifier, whose gain is the same regardless of the phase of the input signal, this minimum amount of added noise is equivalent to half a photon.^{97,98} For theories of parametric amplification, including JPA, see Refs. 92, 72, and 99–101.

### C. Single-qubit gate

Since the standard qubit readout technique is dispersive readout (Sec. VI B 1), we consider a qubit–resonator system and assume that the qubit is driven via the resonator. A classical analog of this system is shown in Fig. 9(c). As we drive oscillator 1 (control oscillator) with $f_d = f_2$, the amplitude of oscillator 2 (target oscillator) increases indefinitely. However, this does not happen for a qubit because it is intrinsically a nonlinear quantum object. Instead, the population of the qubit oscillates as a function of time or the amplitude of the drive. This oscillation in the qubit population is called the Rabi oscillation (Fig. 15).
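The oscillating population can be illustrated with a minimal two-level sketch. It assumes a resonant drive, the rotating-wave approximation, no relaxation or dephasing, and a rotating-frame Hamiltonian of the assumed form $(\Omega/2)\hat{\sigma}_x$ with $\hbar = 1$; the symbol $\Omega$ (the Rabi frequency) and the function name are introduced here only for illustration.

```python
import numpy as np

def excited_population(rabi_freq, t):
    """Excited-state population of a resonantly driven two-level system (rotating frame)."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    # exp(-i * (rabi_freq * t / 2) * sigma_x), using sigma_x**2 = identity
    U = (np.cos(rabi_freq * t / 2) * np.eye(2)
         - 1j * np.sin(rabi_freq * t / 2) * sx)
    psi = U @ np.array([1, 0], dtype=complex)   # start in the ground state
    return abs(psi[1]) ** 2

rabi_freq = 2 * np.pi * 10e6                    # hypothetical 10 MHz Rabi frequency
t_pi = np.pi / rabi_freq                        # duration of a pi pulse
for t in (0.0, t_pi / 2, t_pi, 2 * t_pi):
    print(t, excited_population(rabi_freq, t))
```

Unlike the driven harmonic oscillator, whose amplitude keeps growing, the population here returns to zero after a $2\pi$ pulse; the same oscillation appears if the pulse duration is fixed and the drive amplitude, which sets $\Omega$, is swept instead.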

^{53,55}

One might ask how the rotation axis is defined in a real experiment. The answer is that the reference phase of the instruments, such as the phase of the first pulse in the experiment, defines the rotation axis, which is usually set as $x$.^{102} If we want to change the rotation axis from $x$ to $y$, then all we have to do is to add a $\pi/2$ phase shift to the subsequent pulses and the reference phase of the measurement instruments. This means that, if the $x$ rotation is driven by $\cos(\omega t)$, then the $y$ rotation is driven by $-\sin(\omega t)$ (do not omit the minus sign!).

Note that changing the rotation axis from $x$ to $y$ is actually a $z$ rotation. This suggests that shifting the reference phase of the instruments is functionally equivalent to a rotation about the $z$ axis, which is called a virtual $z$ rotation.^{103} Since we do not apply a real pulse for this, the virtual $z$ rotation is a nearly perfect and zero-time operation.
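The equivalence between a drive-phase shift and a $z$ rotation can be confirmed with $2\times 2$ matrices. The sketch assumes ideal rotations $R_{\hat{n}}(\theta) = \exp(-i\theta\,\hat{n}\cdot\hat{\boldsymbol{\sigma}}/2)$ and a convention in which a drive phase $\varphi$ selects the equatorial axis $(\cos\varphi, \sin\varphi, 0)$; the sign convention and the helper names are assumptions of this sketch.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def rot(axis_op, theta):
    """exp(-i*theta/2 * axis_op) for an operator that squares to the identity."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * axis_op

def drive_rotation(theta, phase):
    """Rotation by theta about the equatorial axis selected by the drive phase."""
    return rot(np.cos(phase) * SX + np.sin(phase) * SY, theta)

theta, phase = 0.7, np.pi / 2
direct = drive_rotation(theta, phase)                          # phase-shifted pulse
via_z = rot(SZ, phase) @ rot(SX, theta) @ rot(SZ, -phase)      # x pulse dressed by z rotations
print(np.allclose(direct, via_z))
```

Because the phase-shifted pulse equals an $x$ pulse sandwiched between opposite $z$ rotations, a $z$ rotation can be tracked in software by updating the phase of all later pulses, with no physical pulse applied.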

### D. Two-qubit gate

Many two-qubit gates have been implemented by various methods. Among them, four gates and four methods are introduced in this Tutorial. These methods are based on the transverse qubit–qubit interaction whose physics is explored in Sec. V A. The two-qubit gates introduced in this Tutorial are summarized in Table V.

#### 1. iSWAP: Coherent exchange

This method implements a two-qubit gate using changes in the phase and population of the qubit states during the coherent exchange of a photon. The basic mechanism for this is tuning the transition frequency of one of the qubits so that $\omega_{q1} = \omega_{q2}$. The relevant analogy for this is shown in Fig. 9(a).

The actual implementation can be done via the following steps [black arrowed path in Fig. 16(a) and lower figure in Fig. 16(c)]: (i) Prepare the initial state. In Fig. 16, $|01\rangle$ was chosen as an example. At this stage, the tunable qubit, qubit 2 in this case, is at its sweet spot. (ii) Increase the flux bias to the point at which the energy levels of $|10\rangle$ and $|01\rangle$ are equal. At this bias, the new eigenstates, $(|01\rangle \pm |10\rangle)/\sqrt{2}$, exchange their energy at a rate of $J/\pi$. (iii) Wait for a while to satisfy $J\tau = \pi/2$. (iv) Decrease the flux bias to the sweet spot.

The bias ramping in steps (ii) and (iv) must be as fast as possible for efficient gate operation. Thus, the iSWAP gate implemented using this method is a diabatic gate. For qubits with negative anharmonicity, such as the two-transmon system shown in Fig. 16(a), the gate fidelity of a diabatic gate is limited mainly by population leakage out of the computational subspace. This population leakage is driven by unwanted transitions, such as $|11\rangle$-$|02\rangle$ and $|11\rangle$-$|20\rangle$ transitions, because we pass anticrossings associated with $|11\rangle$ during the flux bias ramping. Note that this leakage is still a unitary evolution, suggesting that the population leakage oscillates with time. Thus, the error due to this leakage can be minimized by synchronizing the periods of the leakage and the iSWAP gate time.^{105}
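Step (iii) can be illustrated by computing the propagator during the waiting time. The sketch assumes that, with the two qubits on resonance and in the frame rotating at their common frequency, the Hamiltonian reduces to an exchange term $J(|01\rangle\langle 10| + |10\rangle\langle 01|)$ with $\hbar = 1$. This form and its sign are assumptions of the sketch; with the opposite sign of $J$, the factors $-i$ below become $+i$.

```python
import numpy as np

def exchange_propagator(J, tau):
    """exp(-i*H*tau) for H = J(|01><10| + |10><01|); basis order |00>, |01>, |10>, |11>."""
    H = np.zeros((4, 4))
    H[1, 2] = H[2, 1] = J
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * tau)) @ V.conj().T

J = 2 * np.pi * 5e6                 # hypothetical coupling strength in rad/s
tau = np.pi / (2 * J)               # waiting time satisfying J*tau = pi/2
U = exchange_propagator(J, tau)

half_way = abs(exchange_propagator(J, tau / 2)[2, 1]) ** 2   # population moved at tau/2
print(np.round(U, 6))
print(half_way)
```

At $J\tau = \pi/2$ the states $|01\rangle$ and $|10\rangle$ are fully swapped with an additional phase factor, while $|00\rangle$ and $|11\rangle$ are untouched in this two-level model; leakage to $|02\rangle$ and $|20\rangle$ is outside the scope of the sketch.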

#### 2. iSWAP and bSWAP: Parametric coupling

The previous implementation of the iSWAP gate relies on the frequency tunability of the qubit. This means that the qubit must be out of its sweet spot for a while, which potentially degrades the coherence. To resolve this issue, another scheme that allows qubits to stay at their sweet spots during the gate operation has been developed.^{114}

In this scheme, the control knob is the qubit–qubit coupling. Consider the two-qubit Hamiltonian in Eq. (44) (here, the qubits do not need to be transmons). If $J$ is static and $|\omega_{q1} - \omega_{q2}| \gg J$, the qubit–qubit interaction is effectively turned off, i.e., the interaction is very slow compared with the time scale we are interested in, as indicated in Eq. (46).

^{106,115}

#### 3. CZ: Adiabatic excursion

^{112,113} Since the CNOT gate is the essential gate for quantum error correction (see Sec. VII), a more efficient implementation of the CNOT gate is desired. For this, we need a non-transverse qubit–qubit interaction. It was soon realized that we have an effective $ZZ$ interaction that comes from the transverse interaction associated with higher excitation levels.

^{116} This interaction allows us to implement the controlled-$Z$ (CZ) gate, which is identical to the CNOT gate up to single-qubit rotations:

If we consider the computational subspace only, the difficulty in implementing the CZ gate is that $\omega_{11}$ is always the same as $\omega_{10} + \omega_{01}$; thus, $\theta_{11} = \theta_{10} + \theta_{01}$. This is why we need a $ZZ$ interaction. The crucial observation is that $\omega_{11}$ deviates from $\omega_{10} + \omega_{01}$ because of the anticrossing between the $|11\rangle$ and $|02\rangle$ levels [left figure in Fig. 16(b)]. This gives an effective $ZZ$ interaction whose strength $\zeta$ is $\omega_{10} + \omega_{01} - \omega_{11}$. As shown in the left figure in Fig. 16(b), $\zeta$ increases rapidly as the energy levels of $|02\rangle$ and $|11\rangle$ become closer, suggesting that varying the flux bias can be used to tune $\zeta$ by orders of magnitude.^{116}

Now, we have the $ZZ$ interaction. Note that, in Eq. (78), the population of each state must remain the same. One way to implement the CZ gate is to tune the energy levels adiabatically. In addition, another condition to make the CZ gate operation feasible is negative anharmonicity such that the $|11\rangle$-$|02\rangle$ anticrossing appears earlier than the $|10\rangle$-$|01\rangle$ anticrossing. Otherwise, the transition between $|10\rangle$ and $|01\rangle$ will be activated during the gate operation, resulting in an unwanted population change. Therefore, the most suitable qubit for the CZ gate operation is a transmon and its variants.

Implementing a high-fidelity and fast CZ gate is then reduced to finding an optimal trajectory satisfying $\int_0^{\tau} \zeta(t)\,dt = \pi$. In general, the flux must be ramped up fast at the beginning to reduce the gate time. Near the anticrossing, the flux sweep is relatively slow to satisfy adiabaticity and acquire the phase we need [center figure in Fig. 16(b)]. It was found that the Slepian shape is close to the optimal trajectory.^{117}
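The growth of $\zeta$ near the $|11\rangle$-$|02\rangle$ anticrossing can be reproduced with a small diagonalization. The sketch below is a toy model under explicit assumptions: each qubit is a three-level Duffing oscillator with frequency $\omega_i$ and anharmonicity $\alpha_i$, the coupling is an exchange term $J(\hat{a}_1^{\dagger}\hat{a}_2 + \hat{a}_1\hat{a}_2^{\dagger})$, $\hbar = 1$, and dressed states are labeled by their largest overlap with a bare state. All names and numbers are hypothetical.

```python
import numpy as np

def zz_strength(w1, w2, alpha1, alpha2, J, levels=3):
    """zeta = w10 + w01 - w11 for two exchange-coupled Duffing oscillators."""
    a = np.diag(np.sqrt(np.arange(1, levels)), k=1)   # single-mode annihilation operator
    I = np.eye(levels)
    a1, a2 = np.kron(a, I), np.kron(I, a)             # state index = n1 * levels + n2
    n1, n2 = a1.T @ a1, a2.T @ a2
    Id = np.eye(levels ** 2)
    H = (w1 * n1 + 0.5 * alpha1 * n1 @ (n1 - Id)
         + w2 * n2 + 0.5 * alpha2 * n2 @ (n2 - Id)
         + J * (a1.T @ a2 + a1 @ a2.T))
    E, V = np.linalg.eigh(H)

    def dressed(i, j):
        # Energy of the eigenstate with the largest weight on bare |i j>
        return E[np.argmax(np.abs(V[i * levels + j, :]))]

    e00 = dressed(0, 0)
    return (dressed(1, 0) - e00) + (dressed(0, 1) - e00) - (dressed(1, 1) - e00)

w1, alpha, J = 5.0, -0.3, 0.01                   # illustrative, units of 2*pi*GHz
zz_far = zz_strength(w1, 6.00, alpha, alpha, J)  # far from the |11>-|02> anticrossing
zz_near = zz_strength(w1, 5.35, alpha, alpha, J) # close to it (resonance at w2 = 5.3)
print(zz_far, zz_near, zz_near / zz_far)
```

In this toy model, moving qubit 2 from a detuning of 1.0 to 0.35 raises $\zeta$ by more than an order of magnitude. For a constant $\zeta$, the phase condition $\int_0^{\tau}\zeta(t)\,dt = \pi$ reduces to $\tau = \pi/\zeta$, which sets the scale of the gate time.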

#### 4. CZ: Coherent exchange and parametric coupling

^{67,105} Regarding coherent exchange, when $\omega_{11} = \omega_{02}$, we can construct a propagator similar to Eq. (70),

Parametric coupling can also be used.^{109} Since we do not need to tune the qubit frequencies, negative anharmonicity is not required. Moreover, we can use the coupling between $|11\rangle$ and either $|02\rangle$ or $|20\rangle$.

#### 5. CR: All microwave control

All implementations of two-qubit gates discussed thus far require some tunability of the qubit frequency or qubit–qubit coupling. However, the phase coherence is so delicate that any control line potentially degrades $T_2$. This motivates the development of two-qubit gates purely driven by microwave activation. Among them, the cross-resonance (CR) gate is the most widely used one.

The CR gate basically excites one qubit (target qubit) through the other qubit (control qubit). Hence, it is similar to a single-qubit gate, and its classical analogy is Fig. 9(c). The difference is that, instead of a harmonic oscillator (resonator), a nonlinear oscillator (qubit) is used as a control oscillator.^{120} Because of the nonlinearity, the result of the gate operation depends on the state of the control qubit, resulting in a two-qubit gate.

In the following, we repeat the calculations we made in Sec. VI C with slight modifications for a two-qubit system instead of a qubit–resonator system. We make three assumptions. First, $J\ll\Delta_{qq}$, where $\Delta_{qq}\equiv\omega_{q1}-\omega_{q2}$. This is a kind of dispersive limit. Second, the control qubit, which is assumed to be qubit 1, is an ideal two-level system, i.e., $\alpha_1\gg\Delta_{qq}$, where $\alpha_1$ is the anharmonicity of qubit 1. Last, there is no spurious crosstalk between the two qubits.

### E. Initialization

After computation, qubits should be initialized for the next computation. However, condition 3 in Sec. III A and fast initialization seem to be contradictory. Here, we introduce two categories of qubit initialization and their working principles.

#### 1. Entropy dumping

This method pumps the entropy of the target system into another system, which we call the pumping system, that interacts with the target system. The required condition is that the relaxation of the pumping system must be much faster than that of the target system. This idea has long been used in magnetic resonance under the name “Dynamic Nuclear Polarization (DNP).”^{121,122} In DNP, the target system is a nuclear spin, the pumping system is an electron spin, and their interaction is mediated by the hyperfine interaction.

In a superconducting circuit, the target system is a superconducting qubit and the pumping system is usually the readout resonator because the qubit must be isolated as much as possible to maintain the coherence, while the resonator needs to be strongly coupled to the microwave feedline for fast readout. The strategy is to find an efficient and controllable energy transfer path to the environment.

One path is from the qubit state $|e\rangle$ to the resonator state $|1\rangle$ [Fig. 17(a)].^{56,85} For this, the Purcell effect (Sec. VI B 1) is used. By using a frequency-tunable qubit, we can tune $\omega_q$ to $\omega_r$ to speed up the qubit relaxation process. Once the energy is emitted from the qubit to the resonator, the energy is quickly dissipated to the environment.

An all-microwave option is also available.^{123,124} In this case, the higher excitation level $|f\rangle$ is inserted into the $|e\rangle$-$|1\rangle$ path [Fig. 17(b)]. The transitions between the steps are induced by microwave drives. Hence, this method is useful for a fixed-frequency transmon. Note that we cannot induce a direct transition between $|e0\rangle$ and $|g1\rangle$ because this transition is forbidden when the qubit–resonator interaction is of the Jaynes–Cummings type [Eq. (40)].^{125}

These two methods can be understood using mechanical analogs shown in Figs. 17(a) and 17(b).

#### 2. Measurement-based initialization

This is a completely different initialization method based on the QND measurement mentioned in Sec. VI B 1 and the measurement postulate, which states that a measurement of an observable acting on a quantum state destroys the phase coherence and forces the state to collapse into one of the eigenstates of the observable.^{8,9} If a measurement is QND-type and its outcome is $|0\rangle$, then the qubit is initialized. If the outcome is $|1\rangle$, we simply apply a $\pi$-pulse to flip the qubit state. The flow chart for this procedure is shown in Fig. 17(c). This method is often called measurement-based initialization.^{126–128}
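The measure-and-flip procedure can be sketched as a simple probability update. The Python snippet below is a minimal model that assumes a perfectly QND measurement with a symmetric assignment error and a conditional $\pi$-pulse with a fixed failure probability; the function name and all numerical values are hypothetical, and relaxation during the feedback latency is ignored.

```python
def reset_round(p_excited, readout_error, pi_pulse_error=0.0):
    """Excited-state population after one measure-and-flip round (assumed model)."""
    # Qubit in |1>, read correctly as 1: the pi-pulse is applied and the
    # qubit stays excited only if the pulse fails.
    stays_after_flip = p_excited * (1 - readout_error) * pi_pulse_error
    # Qubit in |1>, misread as 0: no pulse is applied, so it stays excited.
    missed = p_excited * readout_error
    # Qubit in |0>, misread as 1: the pi-pulse wrongly excites it.
    wrongly_flipped = (1 - p_excited) * readout_error * (1 - pi_pulse_error)
    return stays_after_flip + missed + wrongly_flipped

p = 0.10                      # hypothetical initial thermal population of |1>
history = [p]
for _ in range(3):
    p = reset_round(p, readout_error=0.02, pi_pulse_error=0.01)
    history.append(p)
print(history)
```

In this model the residual excited-state population after a round is set almost entirely by the readout error, which is consistent with the reliance on high-fidelity projective measurement.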

One technical difficulty is that this method heavily relies on high-fidelity projective measurement and fast feedback. It can suffer from latency due to the classical data processing and the pulse generation and injection.

Note that, in this method, we reduce the entropy of the qubit by extracting information about its state. This suggests that information processing and thermodynamics are connected intrinsically. One of the most famous examples showing this connection is Maxwell’s demon, which has recently been implemented in superconducting qubit systems.^{129–131}

## VII. QUANTUM ERROR CORRECTION

### A. Introduction

Noise in real physical systems, both classical and quantum, cannot be eliminated completely. As quantum algorithms of scientific and/or commercial use consist of many qubits and time steps, error correction schemes are essential for reliable quantum computation at scale. The goal of Quantum Error Correction (QEC) is to generate an error-free logical qubit, i.e., two selected quantum states, out of the large Hilbert space of a system composed of multiple quantum systems. This introduces redundancy that can be used to both detect and correct physical errors.

Classical error correction generally introduces redundancy by duplicating bits multiple times; errors can then be detected and corrected by comparing the copies. However, in QEC, the use of redundancy is different and much more difficult to achieve for the following reasons. First, the no-cloning theorem^{10,13} prevents this type of error correction in quantum information, as arbitrary quantum states cannot be copied. Second, direct measurement of quantum states is not possible, as it will collapse the qubit state (measurement postulate in Sec. VI E 2). Hence, measurements that are used to detect actual errors have to be performed in an indirect way. Finally, besides the bit flip error ($|0\rangle\to|1\rangle$ or $|1\rangle\to|0\rangle$), which is the standard model for classical information, quantum information can experience a second type of error, called the phase flip error ($\alpha|0\rangle+\beta|1\rangle\to\alpha|0\rangle-\beta|1\rangle$). Consequently, a quantum error correction code must be able to simultaneously correct for both bit flips and phase flips.

As a quantum algorithm creates complex entangled states between the constituent qubits, we require a technique that can detect individual errors on any physical qubits *without* extracting any information regarding the computational state of the computing system. Parity measurement that measures bit/phase parity of neighboring qubits is such a technique. For this, additional qubits, which are not used for actual computation, are introduced. These are commonly referred to as ancilla qubits or syndrome qubits. Syndrome qubits are entangled with encoded qubits, often called data qubits, within a logical qubit and are used to extract information only related to physical errors that may have occurred on individual data qubits. These syndrome qubits are then measured, generating classical information called the error syndrome. This syndrome extraction procedure is specifically designed to avoid direct measurement of the computational state of any qubit and hence preserves the computation during the error-correction process. Multiple syndrome measurements of the data qubit are taken and this classical information is decoded. This decoding procedure determines the most likely physical errors that resulted in the specific set of syndrome measurements that were observed.

Since we have to detect errors without knowing any information about the qubit state, it is more convenient to define a state as an eigenstate of a certain operator in the Heisenberg representation than to write out the state itself. A formalism based on this idea is the stabilizer formalism.^{139,140} As will be shown in the following, the stabilizer formalism describes our actions for error detection and logical state construction in a unified manner.

A stabilizer set is a set of commuting multiqubit operators, made up of tensor products of Pauli-$X$, $Y$, and $Z$ operators. These multiqubit operators are commonly known as stabilizer operators or simply stabilizers. By using multiqubit projective measurement, we can force an arbitrary quantum state into simultaneous eigenstates of these stabilizers. One consequence is that, if we repeat these projective measurements in the absence of errors, we will repeatedly measure the same eigenvalue and project the quantum state into the same eigenstate; this is why these operators are called stabilizer operators. Note that the projective measurement on each stabilizer is actually a parity measurement and its outcome is an error syndrome. Physical errors cause eigenvalues to flip between $+1$ and $-1$, depending on whether the physical errors commute or anti-commute with the stabilizers. Bit flip errors will anti-commute with stabilizers made up of Pauli-$Y$ or $Z$ operators and phase flip errors will anti-commute with stabilizers made up of Pauli-$X$ or $Y$ operators, allowing for the correction of both types of errors.

Another consequence is that the size of the stabilizer set determines the size of the restricted subspace of states that satisfy the eigen-conditions, thus defining a logical qubit. An $N$-qubit state is spanned by $2^N$ possible basis states, where a single qubit is spanned by two ($|0\rangle$ and $|1\rangle$). Consequently, we say that there are $N$ degrees of freedom for an $N$-qubit state. Taking $N$ physical qubits and encoding them into a single logical qubit requires us to *fix* $N-1$ degrees of freedom with the one left over to represent the logical qubit. This is done by requiring the multiqubit state that defines the logical qubit to be in definite eigenstates of stabilizers whose eigenvalue is $+1$. The logical qubit states, $|0_L\rangle$ and $|1_L\rangle$, both satisfy the eigen-conditions defined by these stabilizers.

In a real system, physical errors do not occur as discrete bit and phase flips. Instead, either coherent errors (such as imprecise control) or incoherent errors (such as thermalization or dephasing) perturb a qubit state in a continuous manner. However, measuring the eigenvalues of the stabilizer operators acts to discretize this noise, which translates the continuous nature of the noise into a probability of detecting a discrete error through the measurement of the syndrome qubit. An arbitrary error operator can be written as a linear combination of the identity and $X$, $Z$, and $Y$ errors. Therefore, correcting all three is sufficient to correct for all possible errors on a single physical qubit. Here, as a $Y$ error can be decomposed into $X$ and $Z$ errors, $\hat{Y}=i\hat{X}\hat{Z}$, correcting for $X$ and $Z$ errors will together correct for any $Y$ error. $X$ and $Z$ errors are conventionally referred to as bit flip and phase flip errors, respectively.
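To make the discretization argument concrete, the following Python sketch decomposes a small coherent over-rotation into Pauli components using $c_P=\mathrm{Tr}(\hat{P}\hat{E})/2$. The helper names and the rotation angle are hypothetical; the only facts used are the standard Pauli matrices, for which $i\hat{X}\hat{Z}$ reproduces $\hat{Y}$.

```python
import math

I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def pauli_coefficients(error):
    """Coefficients c_P = Tr(P E) / 2, so that E = sum over P of c_P * P."""
    paulis = (("I", I2), ("X", X), ("Y", Y), ("Z", Z))
    return {name: (matmul(p, error)[0][0] + matmul(p, error)[1][1]) / 2
            for name, p in paulis}

eps = 0.1  # hypothetical over-rotation angle about the x axis (rad)
c, s = math.cos(eps / 2), math.sin(eps / 2)
over_rotation = ((c, -1j * s), (-1j * s, c))   # exp(-i * eps * X / 2)

coeffs = pauli_coefficients(over_rotation)
x_error_probability = abs(coeffs["X"]) ** 2    # weight of the discrete X error
print(coeffs, x_error_probability)
```

The continuous error therefore appears, after syndrome measurement, as a discrete $X$ error with probability $\sin^2(\epsilon/2)$ and as no error otherwise.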

There are a plethora of QEC codes in existence and many of them have been studied extensively. Among these, the surface code remains the most suitable scheme for solid-state qubit systems.^{132,133} One of the advantages of the surface code is that it requires each qubit to be coupled to at most four nearest neighboring qubits. This allows us to arrange qubits into a two-dimensional lattice (Fig. 18). A single square patch of qubits defines a single, logically encoded qubit. This patch is generally parameterized by the number of physical qubits along an edge, which is also related to the distance, $d$, of the underlying quantum code. The distance of a quantum code is the minimum number of physical errors needed to induce a logical error. Another advantage is the fault-tolerant threshold of the surface code. The fault-tolerant threshold is the maximum physical error rate that the code can correct. For the surface code, the fault-tolerant threshold is approximately 1% including errors related to state readout and all gate operations. This error threshold is one of the highest of any error correction scheme to date and remains the highest of a code that is compatible with the architectural constraints of current quantum computing hardware. However, we emphasize that fault-tolerant thresholds vary significantly, depending on the actual QEC code that is utilized.

In this Tutorial, we explain how to construct a logical qubit, how to detect and correct errors, and how to perform gate operations on a logical qubit in the context of the surface code. Interested readers are referred to Refs. 10, 13, 134, and 135 for introductions to QEC and Refs. 136–138 for comprehensive reviews.

### B. Surface code

#### 1. Definition

The surface code is defined over a two-dimensional lattice of physical qubits that allow for nearest neighbor interactions. For a distance $d$ code, the size of the lattice is $(2d-1)\times(2d-1)$ physical qubits. Of the total $4d^2-4d+1$ physical qubits in the lattice, approximately half ($2d^2-2d+1$) are data qubits and the rest of the physical qubits are syndrome qubits. It is sufficient to detect $\lceil(d-1)/2\rceil$ single-qubit errors, i.e., errors associated with a single data qubit, or correct $\lfloor(d-1)/2\rfloor$ single-qubit errors. Thus, the smallest $d$ required to correct any single-qubit error is 3. Figure 18 illustrates a $d=4$ surface code, which can detect two single-qubit errors and correct one single-qubit error.
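The qubit counts quoted above are easy to tabulate. The short Python sketch below simply evaluates the expressions in the text for a given distance; the function name and the dictionary keys are hypothetical.

```python
def surface_code_counts(d):
    """Qubit counts for a distance-d surface code patch, using the formulas in the text."""
    side = 2 * d - 1
    total = side * side                 # (2d-1) x (2d-1) = 4d^2 - 4d + 1
    data = 2 * d * d - 2 * d + 1
    return {
        "total": total,
        "data": data,
        "syndrome": total - data,       # 2d^2 - 2d
        "correctable": (d - 1) // 2,    # floor((d-1)/2)
    }

for distance in (3, 4, 5):
    print(distance, surface_code_counts(distance))
```

For $d=3$ this gives 25 physical qubits, of which 13 are data qubits and 12 are syndrome qubits.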

#### 2. Logical qubit construction

There are other methods for constructing a logical qubit when logical qubits are defined using defects in a qubit lattice. However, these methods are out of the scope of this Tutorial. The standard introduction to this topic is Ref. 133.

#### 3. Error detection and correction

Error detection requires repeated parity measurements of the stabilizers of the surface code. The syndrome qubits are used for this. These syndrome qubits are the only qubits measured during the error detection (syndrome extraction) process to avoid destruction of the logical information stored.

As shown in Fig. 18, there are two types of syndrome qubits, namely, $Z$ syndrome qubits, which are measured in the $Z$ basis, and $X$ syndrome qubits, which are measured in the $X$ basis. These syndrome qubits are used to measure the eigenvalue of $Z$ and $X$ stabilizers of the surface code, respectively. Having these two types of syndrome qubits allows us to detect both bit and phase flip errors as physical bit flip errors will be detected by the measurement of $Z$ stabilizers and physical phase flip errors will be detected by the measurement of $X$ stabilizers.

The quantum circuits [Eqs. (96) and (97)], called parity check circuits, are designed to infer the bit and phase parities of neighboring data qubits by measuring the eigenvalue of specific four-qubit Pauli operators. This not only extracts the eigenvalue of the relevant Pauli operator but also projects the four data qubits into the appropriate eigenstate. In Eq. (96), our measurement on the Za syndrome qubit forces the neighboring data qubits (D0, D2, D3, and D5) into an eigenstate of $\hat{Z}_{D0}\hat{Z}_{D2}\hat{Z}_{D3}\hat{Z}_{D5}$ ($Z$ stabilizer). Similarly, in Eq. (97), the measurement on the Xb syndrome qubit forces the neighboring data qubits (D2, D3, D4, and D6) into an eigenstate of $\hat{X}_{D2}\hat{X}_{D3}\hat{X}_{D4}\hat{X}_{D6}$ ($X$ stabilizer).

If the D2 qubit has a phase error, Xa and Xb syndrome qubits [Eq. (97)] return $-1$ as the error syndromes if there is no error in the syndrome qubits; if the D2 qubit has a bit flip error, Za and Zb syndrome qubits [Eq. (96)] return $-1$. Note that the existence of both a bit flip error and a phase flip error can be detected simultaneously because $\hat{Z}_{D0}\hat{Z}_{D2}\hat{Z}_{D3}\hat{Z}_{D5}$ and $\hat{X}_{D2}\hat{X}_{D3}\hat{X}_{D4}\hat{X}_{D6}$ commute: they overlap on an even number of data qubits (D2 and D3). Therefore, the parity measurement allows us to know the existence of both the bit flip and the phase flip errors without collapsing the quantum state of data qubits.
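The syndrome logic for these two stabilizers can be sketched with a simple counting rule: two Pauli strings anti-commute when they act with different non-identity Paulis on an odd number of shared qubits. The Python snippet below applies this rule to the Za and Xb stabilizers named in the text; the dictionary representation and function names are hypothetical, and the remaining stabilizers of the lattice are not modeled.

```python
def anticommutes(p, q):
    """True if Pauli strings p and q anti-commute.

    Each string is a dict mapping a qubit label to 'X', 'Y', or 'Z';
    qubits that are absent carry the identity.
    """
    clashes = sum(1 for qubit in p if qubit in q and p[qubit] != q[qubit])
    return clashes % 2 == 1

ZA = {qubit: "Z" for qubit in ("D0", "D2", "D3", "D5")}   # Z stabilizer
XB = {qubit: "X" for qubit in ("D2", "D3", "D4", "D6")}   # X stabilizer

def syndrome(error):
    """Eigenvalues (+1 or -1) reported by Za and Xb for a given data-qubit error."""
    return {
        "Za": -1 if anticommutes(error, ZA) else +1,
        "Xb": -1 if anticommutes(error, XB) else +1,
    }

print("stabilizers commute:", not anticommutes(ZA, XB))
print("bit flip on D2:  ", syndrome({"D2": "X"}))
print("phase flip on D2:", syndrome({"D2": "Z"}))
print("Y error on D2:   ", syndrome({"D2": "Y"}))
```

A bit flip on D2 flips only the Za outcome, a phase flip flips only the Xb outcome, and a $Y$ error flips both.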

These circuits have to be performed across the entire lattice, with measurement of all $X$ syndromes occurring simultaneously followed by simultaneous measurement of all $Z$ syndromes. The measurement of every $X$ and $Z$ syndrome is referred to as an error correction cycle, which is repeated continuously as the quantum computer is in operation. Consequently, after each error correction cycle, there will be $d^2-d$ classical bits of information (error syndromes) related to $X$ errors and $d^2-d$ classical bits of information related to $Z$ errors. After multiple cycles of error correction, all this information is then processed by a classical error-correction decoder to determine the most likely set of *actual* errors that resulted in the syndrome measurements that are observed.^{143}

Once the existence of an error is confirmed, the error can be corrected by applying a microwave pulse that flips the phase or bit of the D2 qubit. However, in practice, we can simply record the error in a classical computer and correct measurement outcomes that are affected by the error. This is known as tracking the Pauli frame.^{141}

In the above example, the position and the type (bit or phase flip) of the error were already known. However, determining the position and the type of the error from syndrome measurements is difficult because it is an inverse problem. Moreover, the error position cannot be determined uniquely for some error patterns. For example, the error syndromes of Xa and Xc resulting from phase errors in D0 and D5 are identical to those from phase errors in D2 and D3. However, if the number of such errors is reasonably small, the identity of each error can be almost completely inferred by a minimum-weight perfect matching algorithm.^{133} The accuracy with which minimum-weight matching identifies the most likely physical errors corresponding to a measured syndrome pattern is what effectively determines the fault-tolerant threshold. If error rates are too high, or errors spread too much through the quantum circuits used in the syndrome extraction process, then decoding algorithms will not decode physical errors accurately and corrections may induce logical errors.
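The idea behind minimum-weight matching can be illustrated on a toy problem. The Python sketch below pairs up flipped syndromes (defects) so that the total Manhattan distance between partners is minimal, using an exhaustive search. This is only an illustration: the function name and the defect coordinates are hypothetical, the search scales exponentially, and practical decoders use efficient matching algorithms and also account for lattice boundaries and repeated measurement cycles.

```python
def min_weight_pairing(defects):
    """Cheapest perfect pairing of an even number of defects (brute force).

    defects: list of (row, column) positions of syndromes that returned -1.
    Returns (total_weight, list_of_pairs), with Manhattan distance as the weight.
    """
    if len(defects) % 2:
        raise ValueError("this toy decoder needs an even number of defects")
    if not defects:
        return 0, []
    first, rest = defects[0], defects[1:]
    best = None
    for i, partner in enumerate(rest):
        weight = abs(first[0] - partner[0]) + abs(first[1] - partner[1])
        sub_weight, sub_pairs = min_weight_pairing(rest[:i] + rest[i + 1:])
        candidate = (weight + sub_weight, [(first, partner)] + sub_pairs)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best

defects = [(0, 0), (0, 1), (3, 3), (4, 3)]   # hypothetical flipped syndromes
total_weight, pairs = min_weight_pairing(defects)
print(total_weight, pairs)
```

Each pair is interpreted as the two ends of the shortest error chain that could have produced it, which is the most likely explanation when physical errors are rare and independent.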

One interesting consequence is that initialization of a logical qubit can be considered as a kind of error correction starting from a known state, such as $|00000\rangle$. Initializing the logical qubit from this state only requires projective measurement of the $X$ stabilizers as the state $|00000\rangle$ already satisfies the eigen-conditions of the $Z$ stabilizers. Additionally, when starting in the $|00000\rangle$ state, we initialize into the logical $|0_L\rangle$, because again our initial state before encoding is already in the $+1$ eigenstate of the logical $Z$ operator. Further examples can be found in Ref. 135.

Last, we briefly mention how we perform readout of a logical qubit. The logical readout of an error-corrected qubit ideally requires the direct measurement of all the physical qubits in an encoded block.^{133,142} To maintain correct fault-tolerant operation, all of these physical measurements on the encoded block have to take place. While it is possible to perform readout of an encoded qubit by performing a logical single-qubit parity check of the $X$ or $Z$ operator,^{142} this would still require the assistance of a fully encoded ancillary qubit that would still require the physical measurement of all component qubits to perform the readout.

#### 4. Logical gate operation

After encoding, computing is performed on the code using logical gates. Logical gates must preserve the symmetries enforced by the stabilizer operators, but manipulate the operators that define the logical state of the encoded qubit in the same way as operations on physical qubits do. Consequently, logical gate operators commute with all elements in the stabilizer set $\{\hat{S}\}$, but by definition are not contained in $\{\hat{S}\}$. For the surface code, for example, the logical $X$ gate, $\hat{X}_L$, corresponds to applying the physical $X$ gate to all data qubits in one of the columns, not including the $Z$ syndrome qubits. The logical $Z$ gate is achieved by applying the physical $Z$ gate to all data qubits in one of the rows, not including the $X$ syndrome qubits.

Figure 19 reproduces the yellow area in Fig. 18 with the logical $Z$ and the logical $X$ operations. Applying $\hat{Z}_L=\hat{Z}_{D0}\hat{Z}_{D1}\hat{I}_{D2}\hat{I}_{D3}\hat{I}_{D4}$ and $\hat{X}_L=\hat{X}_{D0}\hat{I}_{D1}\hat{I}_{D2}\hat{X}_{D3}\hat{I}_{D4}$ to Eq. (93) helps to understand these logical operations. Note that $\hat{Z}_L$ and $\hat{X}_L$ commute with every stabilizer in Eq. (95) and anti-commute with each other as they must intersect on an odd number of physical qubits and physical $X$ and $Z$ gates anti-commute. Consequently, they form a pair of Pauli operators that have the same commutation properties as physical $X$ and $Z$.
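The anti-commutation of the two logical operators can be checked by letting them act explicitly on every computational basis state of the five data qubits D0 to D4. The Python sketch below does this with a hypothetical string representation, in which `"ZZIII"` stands for $\hat{Z}_L$ and `"XIIXI"` for $\hat{X}_L$; the helper names are assumptions made for illustration.

```python
from itertools import product

def apply_pauli(pauli, bits, amplitude=1):
    """Apply a string of 'I', 'X', 'Z' to a basis state; return (bits, amplitude)."""
    new_bits = []
    for op, bit in zip(pauli, bits):
        if op == "X":
            new_bits.append(1 - bit)          # X flips the bit
        else:
            if op == "Z" and bit == 1:
                amplitude = -amplitude        # Z adds a sign to |1>
            new_bits.append(bit)
    return tuple(new_bits), amplitude

def relative_sign(p, q):
    """Sign s such that P Q |b> = s Q P |b> for every basis state |b>."""
    signs = set()
    for bits in product((0, 1), repeat=len(p)):
        state_pq, amp_pq = apply_pauli(p, *apply_pauli(q, bits))
        state_qp, amp_qp = apply_pauli(q, *apply_pauli(p, bits))
        assert state_pq == state_qp
        signs.add(amp_pq * amp_qp)
    (sign,) = signs                           # the sign is the same for all states
    return sign

Z_L = "ZZIII"   # Z on D0 and D1
X_L = "XIIXI"   # X on D0 and D3
print(relative_sign(Z_L, X_L))
```

The two strings overlap only on D0, so the relative sign is $-1$, as required for a pair of logical Pauli operators.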

The logical two-qubit gate, such as the logical CNOT gate, is not as simple as the logical single-qubit gates. Implementing a non-Clifford gate, such as the $T$ gate (see Table III for its definition), for a logical qubit is even more difficult than implementing the logical CNOT gate. The reason for this is that non-Clifford gates cannot be operated directly on the stabilizer codes in a fault-tolerant manner, i.e., ensuring errors do not cascade out of control, whereas gates in the Clifford group (gates that map Pauli operators to Pauli operators) can generally be applied directly to encoded data. It should be noted that not all stabilizer codes can enact the full Clifford group of logical operations directly. In fact, the surface code cannot enact the $S$ gate (Table III) directly and must use other constructions.^{133} Thus, these gates will not be covered in this Tutorial. Interested readers are referred to Ref. 142 for the logical CNOT gate and Refs. 144 and 145 for non-Clifford logical gates.

### C. Proposed device architectures

One of the biggest problems in scaling up a superconducting qubit system is the so-called wiring problem.^{146} The wiring problem is that the number of wires required to operate qubits increases too fast with the number of qubits because each qubit requires multiple channels such as control lines and measurement devices. In such a situation, it will be difficult to access a qubit inside a chip.

One natural idea that avoids this problem and fits well with Figs. 18 and 19 is the use of the third dimension [Fig. 20(a)].^{107} For a classical solid-state circuit, a three-dimensional (3D) structure is made by employing a silicon oxide film as an interlayer insulator. However, this method cannot be used for quantum systems because such layers are so lossy that they can be severe decoherence channels. To get around this problem, flip-chip bonding is often used. On one chip, qubits are arranged into a square lattice; on another chip, other circuit components such as control lines and readout resonators are fabricated. Then, the two chips are combined face-to-face using superconducting bumps.^{147,148} To introduce microwaves to qubits from the top or back side of the wafer, pogo pins^{149} and through-silicon vias^{150,151} can be used, respectively. Here, a through-silicon via is a coaxial structure that passes through a silicon wafer.

Recently, it has been reported that the 2D lattice for the surface code can be folded like origami as shown in Fig. 20(b).^{152} In this architecture, qubits and control lines can be fabricated on the same plane if the coupling resonators are allowed to intersect by air bridges [inset in Fig. 20(b)].^{149} Because of these cross-connections, this method was named a pseudo-2D architecture. The appealing point of this architecture is that all qubits and their associated lines can exist on the same chip. Moreover, it can be made by utilizing the standard 2D microwave technology so that we can avoid the complex techniques required for a 3D architecture.

Possible concerns are crosstalk between intersecting resonators and the degradation of the quality factor caused by air bridges. In Ref. 152, it was shown that the crosstalk is at most about $-50$ dB; in addition, resonators with 15–20 air bridges showed an internal quality factor in a range where the infidelity is lower than the threshold value of the surface code.

## VIII. CHARACTERIZING A QUANTUM SYSTEM

To control a quantum system precisely, we have to know the Hamiltonian of the system. In other words, a set of parameters, called system parameters, that characterize the system must be determined. Since the system parameters are based on our model describing the system, appropriate modeling is essential. For example, if we treat our qubit as an ideal two-level system as we did in Sec. V B 1, knowing $\omega_q$ might be good enough to set up the Hamiltonian. However, for high-fidelity gate operation, we must consider higher excitation levels; hence, we have to extract more information, such as the transition frequency between $|1\rangle$ and $|2\rangle$. If our target operation requires strong drive, we may also have to consider the nonlinearity of the readout resonator.

In this section, we explain the minimal procedure for extracting system parameters and the related experimental methods. These are summarized in Table VI.

### A. Spectroscopy

#### 1. Single-tone spectroscopy

The first step in characterizing a superconducting qubit system is finding the resonance frequency of the readout resonator, $\omega_r$. For this, we inject a microwave continuously [Continuous Wave (CW)] and measure the S-parameters of the system as a function of microwave frequency [Fig. 21(a)]. This task is usually done using a vector network analyzer (VNA). Since a single microwave source, which is a part of the VNA, is used in this step, this type of measurement is called single-tone spectroscopy.
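As a minimal illustration of how the resonance is located in such a sweep, the Python sketch below generates a synthetic transmission trace and takes the frequency of minimum $|S_{21}|$ as the estimate. The notch-type response used here is an assumed idealized model rather than the response of any particular device, and the resonance frequency, quality factors, and sweep range are hypothetical.

```python
def notch_s21(f, f_r, q_loaded, q_coupling):
    """Idealized notch-type S21 of a resonator side-coupled to a feedline (assumed model)."""
    x = (f - f_r) / f_r
    return 1 - (q_loaded / q_coupling) / (1 + 2j * q_loaded * x)

F_R_TRUE = 7.0003e9                    # hypothetical resonance frequency (Hz)
START, STEP, POINTS = 6.99e9, 10e3, 2001

freqs = [START + i * STEP for i in range(POINTS)]
magnitudes = [abs(notch_s21(f, F_R_TRUE, 5e3, 8e3)) for f in freqs]

# The dip in |S21| marks the resonance of the readout resonator.
f_r_estimate = freqs[magnitudes.index(min(magnitudes))]
print(f"estimated resonance: {f_r_estimate / 1e9:.4f} GHz")
```

With measured data, a fit of the complex response is usually preferable to taking the minimum, because it also yields the quality factors and is less sensitive to noise.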

Since the transition frequency of the qubit, $\omega_q$, is usually designed to be far detuned from $\omega_r$ for dispersive readout (Sec. VI B 1), most of the microwave power whose frequency is close to $\omega_q$ is filtered out by the resonator. As a result, the qubit signal is not visible in single-tone spectroscopy. This is why we need another type of spectroscopy, called two-tone spectroscopy.

#### 2. Two-tone spectroscopy

Once $\omega_r$ is known, we fix the excitation frequency of the VNA near $\omega_r$ for readout. Subsequently, we inject another microwave into the circuit to drive the qubit. $\omega_q$ is found by sweeping the frequency of the second microwave, called the drive frequency $\omega_d$, while monitoring the changes in the S-parameters of the readout resonator. If $\omega_d$ becomes close to $\omega_q$, the S-parameters of the readout resonator will vary because of the dispersive shift in resonance frequency [Fig. 13(a)]. This type of measurement is called two-tone spectroscopy.

When we characterize $\omega_q$, we have to minimize the excitation power of the VNA; if the excitation power is too high, then the readout resonator is populated by multiple photons, resulting in the shift or splitting of the qubit spectrum as mentioned in Sec. VI B 1.

We can repeat this procedure for different external biases to obtain the full bias dependence of $\omega_q$, which informs us of the position of the sweet spot. If the transition frequencies of two qubits coincide at a certain bias, we can see an anticrossing. From this, we can estimate the qubit–qubit coupling constant.

#### 3. Dispersive shift

As explained in Sec. VI B 1, the dispersive shift $\chi$ is the qubit-state-dependent frequency shift of the readout resonator. From this, we can estimate the qubit–resonator coupling $g$ using $\chi=g^2/\Delta_{qr}$, where $\Delta_{qr}=\omega_r-\omega_q$.
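Inverting this relation gives $g=\sqrt{\chi\Delta_{qr}}$. The Python sketch below evaluates it for hypothetical values expressed in MHz; the function name and the numbers are assumptions, and only the two-level expression quoted above is used.

```python
import math

def coupling_from_dispersive_shift(chi, omega_r, omega_q):
    """Estimate g from chi = g**2 / (omega_r - omega_q); all inputs share one unit."""
    delta_qr = omega_r - omega_q
    if chi * delta_qr <= 0:
        raise ValueError("chi and the detuning must have the same sign in this model")
    return math.sqrt(chi * delta_qr)

# Hypothetical values, all divided by 2*pi and expressed in MHz.
CHI, OMEGA_R, OMEGA_Q = 1.0, 7000.0, 6000.0
g = coupling_from_dispersive_shift(CHI, OMEGA_R, OMEGA_Q)
dispersive_ratio = g / (OMEGA_R - OMEGA_Q)   # should be much smaller than 1
print(f"g/2pi = {g:.1f} MHz, g/Delta = {dispersive_ratio:.3f}")
```

The ratio $g/\Delta_{qr}$ is a quick consistency check that the extracted parameters are indeed in the dispersive regime.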

To measure $\chi$, we first prepare the qubit in either $|0\rangle$ or $|1\rangle$. In this step, a $\pi$-pulse is required to prepare $|1\rangle$. Because of this, the dispersive shift measurement has to be preceded by the Rabi oscillation measurement (Sec. VIII B 1). Once the qubit state is prepared, we apply the readout pulse and measure the S-parameters. The sweep parameter is the frequency of the readout pulse [Fig. 21(a)]. Hence, the measurement of the dispersive shift is like pulsed single-tone spectroscopy with the qubit state preparation. By comparing the spectrum of the readout resonator with two different qubit states, we can obtain $\chi$ as shown in Fig. 13(a).

### B. Time-domain measurement

#### 1. Rabi oscillation

In time-domain measurements, we calibrate the qubit drive by observing the Rabi oscillation [Fig. 21(b)]. First, the qubit drive pulse with $\omega_q$ is applied to excite the qubit. After the drive pulse, the readout pulse near $\omega_r$ is applied, and the S-parameters of this readout pulse are measured via quadrature detection.^{4,102} The sweep parameter is the area of the drive pulse; in actual experiments, it can be either the length or amplitude of the drive pulse.

Since the drive pulse rotates the qubit state in the Bloch sphere (Fig. 15), and the dispersive measurement detects the longitudinal component of the Bloch vector (Sec. VI B 1), the amplitude of the signal oscillates with the drive pulse area: this oscillation is the Rabi oscillation. The Rabi oscillation provides a correspondence between the nutation angle of the Bloch vector and the drive pulse area. The names of frequently used pulses, such as $\pi$- and $\pi/2$-pulses, indicate the nutation angles induced by such pulses.
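The correspondence between pulse area and nutation angle can be sketched as follows. The Python snippet assumes an ideal, decoherence-free Rabi signal in which the excited-state population varies as $\sin^2(\theta/2)$ with $\theta$ proportional to the drive amplitude, and reads off the $\pi$-pulse amplitude from the first maximum of a simulated amplitude sweep. The function names, the amplitude units, and the value 0.37 are hypothetical.

```python
import math

def rabi_population(amplitude, pi_amplitude):
    """Ideal excited-state population after a resonant pulse (assumed model)."""
    return math.sin(math.pi * amplitude / (2 * pi_amplitude)) ** 2

def first_maximum(xs, ys):
    """Return the x value at the first local maximum of a sampled curve."""
    for i in range(1, len(ys) - 1):
        if ys[i] >= ys[i - 1] and ys[i] > ys[i + 1]:
            return xs[i]
    raise ValueError("no maximum found in the sweep")

TRUE_PI_AMPLITUDE = 0.37   # hypothetical value in arbitrary DAC units
amplitudes = [i * 0.001 for i in range(1001)]
signal = [rabi_population(a, TRUE_PI_AMPLITUDE) for a in amplitudes]

pi_amplitude = first_maximum(amplitudes, signal)   # nutation angle of pi
half_pi_amplitude = pi_amplitude / 2               # nutation angle of pi/2
print(pi_amplitude, half_pi_amplitude)
```

With real data the sweep is noisy, so a fit to a damped sinusoid is normally used instead of a peak search.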

With the extracted system parameters, we can establish the correspondence between the control parameters we set and the actual response of the system. This process is called calibration. The Rabi oscillation measurement is the simplest calibration procedure; however, the Rabi frequency estimated from this method is usually not accurate enough for high-fidelity control. Probably, the next simplest and more accurate one is to apply a train of pulses; interested readers should see Refs. 154 and 155. Optimal control theory can also be adopted (see Refs. 156 and 157). More advanced techniques for Google’s devices can be found in Refs. 158 and 159 and the supplementary material of Ref. 160.

#### 2. Relaxation time: *T*_{1}

#### 3. Relaxation time: *T*_{2}

The standard measurement procedure for the transverse relaxation time ($T_2$) is to observe the Ramsey fringes [Fig. 21(d)]. Since we need to detect the transverse component of the Bloch vector, we apply a $\pi/2$ pulse at the beginning of the time interval and then a $-\pi/2$ pulse at the end to transfer the transverse component back to the longitudinal component. Subsequently, we make a detection. The time constant for the decay of the $|0\rangle$ population is $T_2$.

If $\omega_d=\omega_q$ (on-resonance), we observe single exponential decay shown on the left side of Fig. 21(d). However, if $\omega_d\neq\omega_q$ (off resonance), the Bloch vector will acquire an additional rotation in the $x'y'$ plane during the time interval, resulting in an oscillation with the frequency $\omega_d-\omega_q$. The reason for the oscillation is that only the transverse component perpendicular to the rotating axis is transferred to the longitudinal component. This oscillation is called Ramsey fringes.
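The two cases can be compared with a simple signal model. The Python sketch below assumes an exponentially damped cosine at the drive-qubit detuning and estimates the fringe frequency by counting crossings of the mid-level. The signal model, the function names, and the values of $T_2$ and the detuning are hypothetical.

```python
import math

def ramsey_signal(t, t2, detuning_hz):
    """Assumed Ramsey signal: a fringe at the detuning, damped with time constant t2."""
    return 0.5 * (1 + math.exp(-t / t2) * math.cos(2 * math.pi * detuning_hz * t))

def fringe_frequency(times, signal):
    """Estimate the fringe frequency from the number of crossings of the 0.5 level."""
    above = [value > 0.5 for value in signal]
    crossings = sum(1 for a, b in zip(above, above[1:]) if a != b)
    return crossings / (2 * (times[-1] - times[0]))

T2 = 20e-6                                        # hypothetical coherence time (s)
times = [i * 10e-9 for i in range(501)]           # 0 to 5 us in 10 ns steps
on_resonance = [ramsey_signal(t, T2, 0.0) for t in times]
off_resonance = [ramsey_signal(t, T2, 1e6) for t in times]

print(fringe_frequency(times, on_resonance))      # no fringes: pure decay
print(fringe_frequency(times, off_resonance))     # fringes at the detuning
```

In practice a small intentional detuning is often convenient, because fitting a damped oscillation gives both $T_2$ and a precise correction to the qubit frequency.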

^{25,26}This happens easily when the flux bias is out of the sweet spot. In this case, the fitting function ($\omega_{qd}$ is assumed to be zero for simplicity)

## IX. CONTROLLING A QUANTUM SYSTEM

In this section, we discuss how to control a superconducting qubit system. As a quantum computing system becomes larger, its precise control becomes as important as making the system itself. Without efficient control, we cannot reduce errors enough to perform quantum error correction.

^{161}Single-qubit gates in Sec. VI C are good examples. In this case, the control Hamiltonian is $\hat{H}_d$ [Eq. (61)], and the control parameters are $E_r(t)$ and the phase of the drive pulse, which selects the rotation axis.

To achieve a high-fidelity gate operation, a control pulse must satisfy the following conditions:

- *Fast qubit manipulation*: The control pulse must be as short as possible to avoid loss of coherence.
- *Narrow excitation bandwidth*: The excitation bandwidth has to be sufficiently narrow to minimize the information leakage through an unwanted transition. For a multiqubit system, the excitation bandwidth of the readout pulse is also important because the readout of a certain qubit might induce the dephasing of other qubits by populating readout resonators for these qubits (see Sec. VI B 1).
- *Decoupling from unwanted interactions*: Couplings to uncharted or unaccountable external degrees of freedom, which induce unwanted interactions and information leakage, should be minimized.
- *Self-compensating*: The gate operation must compensate for the complex response of classical control electronics: bandwidth and long-time transients or nonlinearity such as kinetic inductance in a superconducting resonator, amplifier nonlinearity, or mixer imbalance. Although these nonlinearities are controllable in principle,^{162} controlling them is a difficult task. Thus, the best option is to minimize any unnecessary nonlinearity by careful device design and by operating the amplifier in the linear regime, which will set the power limit.
- *Robustness against experimental imperfections*: The control must be robust against uncertainties and stochastic variations in the system’s internal and control Hamiltonians, such as amplitude and phase noises in the control pulse.

### A. Elementary pulse shaping

#### 1. Excitation bandwidth

In this subsection, we estimate the excitation bandwidth by Fourier transforming the pulse applied to the qubit. We stress that this method is valid only if the system’s response is linear, while the evolution on the Bloch sphere is intrinsically nonlinear.^{165,166} To control the excitation bandwidth precisely, we must solve the full equation of motion of the system. Here, we use the concept of Fourier transformation because it simplifies our discussion and helps develop intuition.

To grasp the concept of controlling the excitation bandwidth, we compare two widely used pulses: square and Gaussian pulses. As shown in Figs. 22(a) and 22(b), the excitation bandwidth of a Gaussian pulse is significantly narrower than that of a square pulse.^{167} One drawback of a Gaussian pulse is that it does not have well-defined starting and end points; hence, the pulse envelope must be truncated somewhere.^{168} Because of this, a cosine pulse is also widely used.
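The comparison between the two pulse shapes can be reproduced numerically. The following is a minimal sketch, assuming a 20 ns pulse, a Gaussian truncated at three standard deviations, and a bandwidth defined as the highest frequency at which the normalized spectrum exceeds 1%; all of these numbers and the helper names `spectrum` and `bandwidth` are illustrative choices, not values taken from Fig. 22.

```python
import numpy as np

def spectrum(envelope, dt, pad=16):
    """Return (frequencies, |FFT| normalized to its DC value) of a real envelope."""
    n = pad * len(envelope)               # zero-padding gives a smoother spectrum
    amp = np.abs(np.fft.rfft(envelope, n=n))
    return np.fft.rfftfreq(n, d=dt), amp / amp[0]

def bandwidth(freqs, amp, level=1e-2):
    """Highest frequency at which the normalized spectrum still exceeds `level`."""
    return freqs[np.nonzero(amp > level)[0][-1]]

dt = 0.01                                 # ns, sampling step (illustrative)
duration = 20.0                           # ns, pulse length (illustrative)
t = (np.arange(int(round(duration / dt))) + 0.5) * dt

square = np.ones_like(t)
sigma = duration / 6.0                    # Gaussian truncated at +/- 3 sigma
gauss = np.exp(-0.5 * ((t - duration / 2) / sigma) ** 2)
gauss -= gauss[0]                         # remove the step left by the truncation

f_sq, a_sq = spectrum(square, dt)
f_ga, a_ga = spectrum(gauss, dt)
bw_square = bandwidth(f_sq, a_sq)         # GHz, because time is in ns
bw_gauss = bandwidth(f_ga, a_ga)
print(f"square: {bw_square * 1e3:.0f} MHz, Gaussian: {bw_gauss * 1e3:.0f} MHz")
```

With these settings, the square pulse keeps spectral weight above the threshold out to frequencies roughly an order of magnitude higher than the Gaussian pulse does, which is the qualitative point of the comparison.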

For qubits like transmons, however, a Gaussian pulse is still not enough to perform nanosecond qubit control because its bandwidth is of the order of 100 MHz, which is comparable to the typical anharmonicity of a transmon [Fig. 22(b)]. A pulse with this bandwidth is likely to induce transitions not only between $|0\rangle$ and $|1\rangle$ but also between $|1\rangle$ and $|2\rangle$, resulting in information leakage out of the computational subspace. A more advanced pulse resolving this issue is the Derivative Removal by Adiabatic Gate (DRAG) pulse.^{169} The DRAG scheme is very effective for weakly nonlinear qubits, such as transmons; all high-fidelity controls achieved in superconducting qubit systems are based on the DRAG pulse and its variants.

In the frequency domain, taking the derivative gives sharp suppression at a certain frequency [black curve in Fig. 22(d)]. The coefficient $-1/\alpha$ in Eq. (109) matches this suppressed frequency and the transition frequency between $|1\rangle$ and $|2\rangle$. This idea is very intuitive and easy to apply; for example, if we want to suppress two frequencies, one higher and the other lower than our working frequency, we can simply take a second derivative of the pulse as the imaginary part.^{170} Although we explained the DRAG scheme for a Gaussian pulse, the concept of the DRAG pulse can also be applied to other pulse shapes.
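The spectral suppression produced by the derivative quadrature can be checked with a short numerical sketch. Here we assume a complex envelope of the form $E(t)+i\beta\,\dot{E}(t)$ with a free coefficient $\beta$; with the sign convention of NumPy's FFT, its spectrum is $(1-\beta\omega)$ times that of $E(t)$, so it vanishes at $\omega=1/\beta$. Which sign of the detuning corresponds to the $|1\rangle$–$|2\rangle$ transition depends on the rotating-frame convention, which this sketch does not fix. The pulse width and the suppressed detuning are illustrative values.

```python
import numpy as np

dt = 0.01                                  # ns, sampling step (illustrative)
n = 4000
t = (np.arange(n) - n / 2) * dt            # centered time axis, ns
sigma = 2.0                                # ns, Gaussian width (illustrative)
delta_null = 2 * np.pi * 0.2               # rad/ns, detuning to suppress (illustrative)

gauss = np.exp(-0.5 * (t / sigma) ** 2)
d_gauss = -t / sigma**2 * gauss            # analytic time derivative of the Gaussian
beta = 1.0 / delta_null                    # places the spectral zero at delta_null
drag_like = gauss + 1j * beta * d_gauss    # derivative as the imaginary quadrature

omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
spec_gauss = np.abs(np.fft.fft(gauss))
spec_drag = np.abs(np.fft.fft(drag_like))

k_null = np.argmin(np.abs(omega - delta_null))
k_opposite = np.argmin(np.abs(omega + delta_null))
suppression = spec_drag[k_null] / spec_gauss[k_null]
opposite = spec_drag[k_opposite] / spec_gauss[k_opposite]
print(f"relative weight at the null: {suppression:.1e}")
print(f"relative weight at the opposite detuning: {opposite:.2f}")
```

The suppression is one-sided: the spectral weight is removed at the chosen detuning and doubled at the opposite detuning, which is why the sign of the coefficient has to match the sign of the anharmonicity.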

One interesting point of view is to consider the DRAG scheme as an extension of “shortcuts to adiabaticity,” which are fast routes to the final results of slow, adiabatic changes of the controlling parameters of a system.^{171} From this point of view, the imaginary part of the pulse makes the qubit couple to the non-computational subspace only adiabatically, resulting in the removal of population leakage.^{172} Interested readers should see Refs. 171 and 172.

#### 2. Pulse distortion

Although we carefully design a microwave or a DC pulse for qubit control, the shape of the pulse is distorted as it travels from the pulse generator to the qubit. This is mainly due to the finite bandwidth of various electrical components, such as cables, filters, and resonators. The problem is even worse for the readout pulse because, in this case, the pulse is close to the resonance of the high-quality superconducting resonator. A typical example of pulse distortion is shown in Fig. 23(a). If the time scale of the transient is not negligible compared with $T_1$, which is often the case, the readout fidelity will be limited. The simplest solution is to add overshoot and negative pulses at the beginning and end of the readout pulse as shown in Fig. 23(b).^{173}

A more advanced way to solve this problem is to model the transient behavior using various filter functions^{174,175} or an RLC circuit.^{162,176}
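As a toy illustration of model-based compensation, the sketch below assumes that the whole signal chain behaves as a single first-order low-pass filter with a known time constant. This model, the time constant, and the function names `lowpass` and `predistort` are assumptions made for the example; a real line generally needs a richer model such as those in the references above.

```python
import numpy as np

def lowpass(x, a):
    """First-order low-pass filter: y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    y = np.empty(len(x))
    prev = 0.0
    for i, xi in enumerate(x):
        prev = (1.0 - a) * xi + a * prev
        y[i] = prev
    return y

def predistort(target, a):
    """Exact inverse of `lowpass`: x[n] = (y[n] - a * y[n - 1]) / (1 - a)."""
    delayed = np.concatenate(([0.0], target[:-1]))
    return (target - a * delayed) / (1.0 - a)

dt, tau = 1.0, 5.0                  # sample step and filter time constant (illustrative)
a = np.exp(-dt / tau)

target = np.zeros(200)
target[50:150] = 1.0                # desired square pulse at the device

distorted = lowpass(target, a)      # what arrives without compensation
drive = predistort(target, a)       # compensated waveform sent by the generator
recovered = lowpass(drive, a)       # what arrives with compensation

print("max error without compensation:", np.max(np.abs(distorted - target)))
print("max error with compensation:   ", np.max(np.abs(recovered - target)))
print("drive overshoot and undershoot:", drive.max(), drive.min())
```

The compensated waveform equals the target except for an overshoot at the rising edge and a negative spike at the falling edge, the same qualitative shape as the simple overshoot and negative pulses described for the readout pulse. In practice, the available drive power limits how large these spikes can be.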

### B. Numerical optimization

^{177}If our interest is unitary gate optimization, gate infidelity $J_{\text{gate}}$ is a natural choice as the cost function. Mathematically, $J_{\text{gate}}$ is defined by Eq. (110).

There are two types of optimization algorithm: gradient-based and gradient-free. In gradient-based algorithms, the performance of the pulse is evaluated by the cost function. Then, the control parameters are updated for the next iteration based on the derivative of the cost function with respect to the control parameters. The advantage of gradient-based algorithms is that they are much faster than gradient-free algorithms. However, gradient-based algorithms do not work well if the cost function is discontinuous or noisy because calculating the gradient is difficult in such a case. Thus, they are difficult to use for direct optimization on an experimental system, i.e., closed-loop optimal control.

Despite their slowness, gradient-free algorithms are simple to implement and work well with a noisy measurement outcome, suggesting that we can perform closed-loop optimal control using these algorithms. Thus, gradient-free algorithms are useful for the calibration or optimization of pulses defined by a limited number of parameters.^{156,157}
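A closed-loop calibration with a gradient-free method can be sketched as follows. The "experiment" is simulated: we assume a toy model in which a pulse of relative amplitude $a$ excites the qubit with probability $\sin^2(\pi a/2)$, and the cost is estimated from a finite number of shots. The model, the shot count, and the simple pattern search (try a step in each direction, halve the step when neither helps) are illustrative assumptions, not the method of Refs. 156 and 157.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def measured_infidelity(amplitude, shots=20000):
    """Simulated experiment: fraction of shots not found in |1> after the pulse.

    Toy model (assumption): P(|1>) = sin^2(pi * amplitude / 2).
    """
    p1 = np.sin(0.5 * np.pi * amplitude) ** 2
    return 1.0 - rng.binomial(shots, p1) / shots

def pattern_search(cost, x0, step, min_step=1e-3):
    """Derivative-free 1D search that only compares measured cost values."""
    x, best = x0, cost(x0)
    while step > min_step:
        for trial in (x + step, x - step):
            c = cost(trial)
            if c < best:            # accept the first improving move
                x, best = trial, c
                break
        else:                       # no improvement in either direction
            step *= 0.5
    return x

best_amplitude = pattern_search(measured_infidelity, x0=0.55, step=0.2)
print(f"calibrated relative amplitude: {best_amplitude:.3f}")
```

Because the search compares cost values only, shot noise limits the final resolution but does not break the procedure, whereas a finite-difference gradient estimated from the same noisy data would be unreliable near the optimum.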

There are many numerical optimization algorithms that have been used widely in quantum control. One helpful “decision tree” for the choice of an optimization algorithm is available in Ref. 178.

#### 1. GRAPE algorithm

We introduce GRadient Ascent Pulse Engineering (GRAPE)^{179} because it has been found to be one of the most reliable algorithms for controlling quantum systems. As the name implies, GRAPE is a gradient-based algorithm. Hence, it is fast but the control parameters are determined in simulations. This suggests that the GRAPE algorithm requires complete system modeling.

In the GRAPE algorithm, the control pulses are defined as a collection of pulses with a piecewise constant amplitude and phase over a number of intervals $N$, each of length $\Delta t$, which yields an overall pulse length of $T=N\Delta t$ [see Figs. 24 (insets) and 25]. Hence, the control parameters are a set of amplitudes and phases (or amplitudes of two quadratures) of these piecewise pulses.

A brief description of the GRAPE algorithm for unitary gate optimization is as follows (Fig. 24):

1. *Characterize the system*: In this step, the system parameters are extracted as described in Sec. VIII. From these parameters, we set up the model Hamiltonian that describes the dynamics of the system properly [$\hat{H}_0$ in Eq. (107)].
2. *Guess initial control parameters and construct the Hamiltonian*: With a set of initial control parameters, we can construct the full Hamiltonian [$\hat{H}$ in Eq. (107)]. An example of the initial pulse shape is shown in the upper inset of Fig. 24.
3. *Calculate the propagator*: The time evolution of the system during a time step $j$ is given by the propagator [Eq. (49)],

   $$\hat{U}_j=\exp\left\{-\frac{i\Delta t}{\hbar}\left[\hat{H}_0+\sum_{k}^{K}u_k(j)\hat{H}_k\right]\right\}.$$

   Thus, $\hat{U}(T)=\hat{U}_N\cdots\hat{U}_1$.
4. *Evaluate the infidelity*: In this step, it is determined how close the resulting state is to the target one via $J_{\text{gate}}$.
5. *Calculate derivatives and update the control parameters*: Because $J_{\text{gate}}$ is to be minimized, the update of the control parameters can be written as

   $$u_k(j)\rightarrow u_k(j)-\epsilon\frac{\delta J_{\text{gate}}}{\delta u_k(j)},$$

   where $\epsilon$ is an adjustable small step size. An example is shown in the lower inset of Fig. 24. For small enough $\Delta t$, the gradients are calculated using the following formula:^{179}

   $$\frac{\delta J_{\text{gate}}}{\delta u_k(j)}\approx 2\,\Re\left\{\mathrm{tr}\left(\frac{i\Delta t}{\hbar}\hat{H}_k\hat{V}_j\hat{\Lambda}_j^{\dagger}\right)\mathrm{tr}\left(\hat{V}_j^{\dagger}\hat{\Lambda}_j\right)\right\},$$

   where $\hat{V}_j$ and $\hat{\Lambda}_j$ are defined by

   $$J_{\text{gate}}=1-\Big|\mathrm{tr}\Big(\underbrace{\hat{U}_{\text{target}}^{\dagger}\hat{U}_N\cdots\hat{U}_{j+1}}_{\equiv\hat{\Lambda}_j^{\dagger}}\underbrace{\hat{U}_j\cdots\hat{U}_1}_{\equiv\hat{V}_j}\Big)\Big|^2=1-\mathrm{tr}\left(\hat{V}_j\hat{\Lambda}_j^{\dagger}\right)\mathrm{tr}\left(\hat{V}_j^{\dagger}\hat{\Lambda}_j\right).$$

   Here, the property that the trace is invariant under cyclic permutations was used.
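The steps above can be sketched for the simplest possible case. The example below assumes a resonantly driven two-level system in the rotating frame ($\hat{H}_0=0$, controls $\hat{\sigma}_x/2$ and $\hat{\sigma}_y/2$, $\hbar=1$) and an $X$ gate as the target. The overlap is divided by the dimension $d$ so that a perfect gate gives zero infidelity; this normalization, the step size, and all function names are assumptions of the sketch rather than part of the algorithm as published. The controls are moved in the direction that lowers $J_{\text{gate}}$.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)

def propagator(h, dt):
    """exp(-i h dt) for a Hermitian matrix h (hbar = 1), via eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def infidelity_and_gradient(u, h0, hk, u_target, dt):
    """u has shape (K, N): K control quadratures, N piecewise-constant slices."""
    n_ctrl, n_steps = u.shape
    d = h0.shape[0]
    props = [propagator(h0 + sum(u[k, j] * hk[k] for k in range(n_ctrl)), dt)
             for j in range(n_steps)]
    # Forward products V_j = U_j ... U_1
    fwd, v = [], np.eye(d, dtype=complex)
    for p in props:
        v = p @ v
        fwd.append(v)
    # Backward products Lambda_j^dagger = U_target^dagger U_N ... U_{j+1}
    bwd, lam_dag = [None] * n_steps, u_target.conj().T
    for j in range(n_steps - 1, -1, -1):
        bwd[j] = lam_dag
        lam_dag = lam_dag @ props[j]
    overlap = np.trace(bwd[-1] @ fwd[-1])        # tr(U_target^dagger U(T))
    infid = 1.0 - abs(overlap) ** 2 / d**2       # normalized by d^2 (assumption)
    grad = np.zeros_like(u)
    for j in range(n_steps):
        for k in range(n_ctrl):
            a = np.trace(1j * dt * hk[k] @ fwd[j] @ bwd[j])
            # First-order (small dt) gradient of the infidelity
            grad[k, j] = 2.0 * np.real(a * np.conj(overlap)) / d**2
    return infid, grad

n_steps, dt = 50, 0.2                            # illustrative discretization
total = n_steps * dt
h0 = np.zeros((2, 2), dtype=complex)
controls = [0.5 * SX, 0.5 * SY]
target = SX                                      # pi rotation about x, up to a phase

# Deliberately poor initial guess: too small on x, spurious weight on y
u = np.vstack([np.full(n_steps, 0.6 * np.pi / total),
               np.full(n_steps, 0.1 * np.pi / total)])
step_size = 0.5
for _ in range(300):
    infid, grad = infidelity_and_gradient(u, h0, controls, target, dt)
    u -= step_size * grad                        # move toward lower infidelity
final_infid, _ = infidelity_and_gradient(u, h0, controls, target, dt)
print(f"final infidelity: {final_infid:.2e}")
```

For a realistic problem, $\hat{H}_0$ would include the higher levels and couplings obtained in the characterization step, and the fixed step size would typically be replaced by a line search or a quasi-Newton update.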

Note that the piecewise constant amplitude and phase assumption is not valid in reality because of the pulse distortion mentioned in Sec. IX A 2. However, this does not make the pulse optimization particularly harder. Once the transient behavior is suitably modeled, it can be integrated into the GRAPE algorithm easily by introducing an additional level of discretization. One discretization is for integrating the state evolution and the other is for the control parameters.^{176,180,181}

#### 2. Example: Controlling excitation bandwidth

##### 1. Goal

##### 2. Hamiltonian

^{20}We also use this number. We emphasize that the $y$-rotation is implemented by $-\sin(\omega_d t)$, not by $+\sin(\omega_d t)$, as mentioned in Sec. VI C.

##### 3. Constraint

The constraint is usually set by our instruments. In this case, it is the maximal power of the drive.

##### 4. Result

The optimal pulse shape found by GRAPE is shown in Fig. 25. If the system is an ideal two-level system, i.e., $\lambda=0$, we do not need one of the quadratures, say $\Omega_y$, and any shape satisfying $\int_0^{\tau}E_x(t)\,dt=\pi$ achieves a perfect $\pi$-rotation, where $\tau$ is the total gate time. However, because of the existence of $|2\rangle$, the numerical solutions found by GRAPE have $\Omega_y(t)$ whose shape is similar to that of the DRAG pulse as shown in Fig. 25.
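The statement about the ideal two-level system can be verified numerically. The sketch below assumes a resonant drive described by $[\Omega(t)/2]\hat{\sigma}_x$ in the rotating frame with $\hbar=1$, so that the rotation angle equals the pulse area; the convention, the two test shapes, and the function names are assumptions made for the illustration.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def excited_population(envelope, dt):
    """Propagate |0> slice by slice under H = (envelope / 2) * sigma_x."""
    state = np.array([1, 0], dtype=complex)
    for omega in envelope:
        angle = omega * dt
        step = np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * SX
        state = step @ state
    return abs(state[1]) ** 2

def scale_to_area(shape, dt, area=np.pi):
    """Rescale a pulse shape so that its time integral equals `area`."""
    return shape * area / (shape.sum() * dt)

dt = 0.05
t = np.arange(0.0, 20.0, dt)
square = scale_to_area(np.ones_like(t), dt)
gaussian = scale_to_area(np.exp(-0.5 * ((t - 10.0) / 3.0) ** 2), dt)
half_area = scale_to_area(np.ones_like(t), dt, area=np.pi / 2)

p_square = excited_population(square, dt)
p_gauss = excited_population(gaussian, dt)
p_half = excited_population(half_area, dt)
print(p_square, p_gauss, p_half)
```

Both shapes with area $\pi$ transfer the full population, and the pulse with half the area transfers half of it, independent of the shape. This shape independence is what the third level removes.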

### C. Refocusing technique

Although a qubit must interact with external systems to be controlled or read, the interaction must be turned on only when we need it. Any unwanted interaction makes the qubit lose its coherence and degrades the performance of gate operation. The problem is that interactions associated with qubits are not always controllable; even if we can control some of the interactions using a tunable coupler or flux bias, such control knobs always introduce additional noises. A refocusing technique^{185} allows us to cancel the evolution caused by unwanted interactions by applying appropriate pulses that can effectively reverse the direction of evolution. In this context, such a pulsing technique is also called dynamical decoupling.^{186,187}

The simplest and most well-known refocusing scheme is the Hahn echo sequence [Fig. 26(a)]. (Although the majority of the literature, including this Tutorial, calls the pulse sequence in Fig. 26(a) the Hahn echo sequence, it was proposed by Carr and Purcell.^{183} In Hahn’s original paper, he applied pulses with the same rotation angle.^{182}) The Bloch spheres with circled numbers show how the qubit state evolves in repeated identical measurements. If the transition frequency of a qubit fluctuates for each measurement because of noise or unwanted interactions, then the qubit will lose its phase coherence [③ in Fig. 26(a)] (Sec. IV A 3). Here, the role of the $\pi$-pulse is to flip the population of the qubit, thus reversing the direction of evolution [④ in Fig. 26(a)]. This makes the net area of the “direction of evolution” in Fig. 26(a) zero, indicating that the unwanted evolution is canceled out. Thus, $T_2$ measured using this pulse sequence is usually much longer than that measured via the Ramsey fringes (Sec. VIII B 3). To distinguish the relaxation time constants obtained from the Ramsey fringes and Hahn echo sequence, notations such as $T_2^{\text{Ramsey}}$ and $T_2^{\text{echo}}$ are often used. (In the magnetic resonance literature, the notations $T_2^{*}$ and $T_2$ are used for $T_2^{\text{Ramsey}}$ and $T_2^{\text{echo}}$, respectively, for historical reasons.)
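The cancellation can be illustrated with a small Monte Carlo sketch. We assume a detuning that consists of a random offset, constant within each repetition but different from repetition to repetition, plus a weak fast component. The accumulated phase is the time integral of the detuning multiplied by the direction of evolution ($+1$ before the $\pi$-pulse and $-1$ after it). The noise strengths are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_shots, n_steps = 4000, 200
total_time = 1.0                                   # arbitrary units
dt = total_time / n_steps

# Per-repetition static offset (rms phase of 3 rad over the full time)
static = rng.normal(0.0, 3.0 / total_time, size=(n_shots, 1))
# Weak fast noise (rms phase of 0.2 rad over the full time)
fast = rng.normal(0.0, 0.2 / (np.sqrt(n_steps) * dt), size=(n_shots, n_steps))
detuning = static + fast

ramsey_direction = np.ones(n_steps)
echo_direction = np.where(np.arange(n_steps) < n_steps // 2, 1.0, -1.0)

def coherence(direction):
    """Ensemble average of cos(phase), phase = integral of direction * detuning."""
    phase = (detuning * direction).sum(axis=1) * dt
    return np.mean(np.cos(phase))

ramsey = coherence(ramsey_direction)
echo = coherence(echo_direction)
print(f"Ramsey coherence: {ramsey:.3f}, echo coherence: {echo:.3f}")
```

The static offset destroys the Ramsey signal but cancels exactly in the echo, because the net area of the direction of evolution is zero. Only the fast component, which changes between the two halves, still dephases the echo.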


The Hahn echo sequence is also useful for a multiqubit system. Figure 26(b) shows an example of engineering $ZZ$ interactions in a four-qubit system.^{104} We can intuitively see which interaction will be eliminated by multiplying the two directions of evolution curves and checking if the net area is zero.

One might notice that, if we use the Hahn echo sequence to remove the $ZZ$ interaction for the CR gate operation as mentioned in Sec. VI D 5, our gate operation based on the $ZX$ interaction will also be removed. To prevent such a situation, we apply the $\pi $-pulses to the control qubit and change the sign of the CR drive after the first $\pi $-pulse.^{111} As a result, the $ZX$ interaction survives, while the $ZZ$ interaction is canceled out.

The Hahn echo sequence works well only when the unwanted interaction is static in the time scale of $\tau $. This means that only low-frequency noise can be canceled out by the Hahn echo sequence. The Carr–Purcell–Meiboom–Gill (CPMG) pulse sequence [Fig. 26(c)], an extension of the Hahn echo sequence, can remove a wider frequency range of noise by applying multiple $\pi $-pulses.^{183,184} This can be understood using the concept of the filter function, which is basically the Fourier transformation of the direction of evolution.^{26,188} Figure 26(d) clearly shows that increasing the number of $\pi $-pulses filters out a wider range of low-frequency noise.
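The filter-function picture can be made concrete with a short numerical sketch. Time is measured in units of the total sequence length, the $\pi$-pulses are assumed to be instantaneous and placed at the CPMG times $(k-1/2)/n$, and the filter function is taken as the squared magnitude of the Fourier transform of the direction of evolution. The normalization and the function names are choices made for this illustration.

```python
import numpy as np

def switching_function(n_pulses, n_samples):
    """Direction of evolution (+1 or -1) on a uniform grid over [0, 1]."""
    t = (np.arange(n_samples) + 0.5) / n_samples
    pulse_times = (np.arange(n_pulses) + 0.5) / max(n_pulses, 1)
    flips_so_far = np.searchsorted(pulse_times, t)   # pi-pulses applied before t
    return t, (-1.0) ** flips_so_far

def filter_function(n_pulses, omegas, n_samples=1024):
    """|Fourier transform of the switching function|^2 at angular frequencies omegas."""
    t, y = switching_function(n_pulses, n_samples)
    transform = (y * np.exp(1j * np.outer(omegas, t))).sum(axis=1) / n_samples
    return np.abs(transform) ** 2

omegas = np.linspace(0.05, 80.0, 800)                # in units of 1 / (sequence length)
filters = {n: filter_function(n, omegas) for n in (0, 1, 4, 8)}
peaks = {n: omegas[np.argmax(f)] for n, f in filters.items()}

for n in (0, 1, 4, 8):
    print(f"n = {n}: passband peak at omega = {peaks[n]:.1f}, "
          f"weight at omega = {omegas[0]:.2f}: {filters[n][0]:.1e}")
```

Here $n=0$ corresponds to the Ramsey sequence and $n=1$ to the Hahn echo. As $n$ increases, the passband moves to higher frequency and the weight at low frequency drops, so a wider band of low-frequency noise is rejected.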

As the number of qubits increases, the dimension of the Hilbert space associated with the qubits increases exponentially and cannot be efficiently simulated on a classical computer, which is a requirement for optimization as suggested in Sec. IX B. Furthermore, the mere task of perfectly characterizing all multiqubit Hamiltonian terms becomes impossible. Nevertheless, the increase in $ T 2$ by refocusing techniques shows us that we can engineer the system dynamics without full knowledge of the system-environment couplings and the state of the environment in some large Hilbert space that cannot be characterized. This success of the refocusing techniques arises from treating the problem perturbatively for a class of possible perturbations. These same ideas can be employed when controlling multiqubit systems whose Hilbert space can be divided into small manageable sectors (e.g., either one or a few qubits) with unwanted/uncharacterized Hamiltonian terms treated as perturbations. An approach based on this philosophy is effective Hamiltonian engineering. The most popular theoretical tool for this is average Hamiltonian theory. Interested readers should see Refs. 104, 165, and 191–193.

### D. Evaluation of gate operation

For precise quantum control, a quantitative measure that shows how close the actual gate operation is to the target operation is required. The quantity showing this is the error rate or gate infidelity $J_{\text{gate}}$ [see Eq. (110) for the definition]. For the surface code, $J_{\text{gate}}\lesssim 0.01$ is required.

The standard method of estimating $J_{\text{gate}}$ in the early days of quantum computation (before the 2010s) was quantum process tomography (QPT).^{10} Although it gives complete information about the dynamics occurring in the system during a given time, QPT has two drawbacks. First, it is not scalable: the time for fidelity estimation increases exponentially with the number of qubits. Second, QPT is susceptible not only to errors associated with gate operation but also to errors associated with state preparation and measurement (SPAM), resulting in inaccurate gate fidelity estimation.

Nowadays, the Clifford group randomized benchmarking (standard RB) is the standard measure of error rates associated with a set of gate operations, i.e., average gate fidelity, because of the following advantages. First, the Clifford group RB is scalable: the time for fidelity estimation increases polynomially with the number of qubits. Second, the estimated error is insensitive to the type of gate. Last, this protocol is robust against SPAM errors.

The idea of RB is to observe how the gate error accumulates with the number of gate operations, while SPAM errors remain similar with increasing number of gate operations. We can understand the idea of RB with a toy block analogy in the following way [Fig. 27(a)]. Imagine that we have lots of toy blocks in a toy storage bag with different height errors. Our job is to measure the average height error of these blocks. For this, we pick a set of $m$ blocks randomly from the toy storage bag. Then, we stack these blocks and measure their total height. By comparing the measured height and our expected height from the design, we can estimate the error $\epsilon$ as shown in Fig. 27(a). We can repeat this process by varying $m$ to obtain $\epsilon$ as a function of $m$ [Fig. 27(c)]. The slope from the fitting indicates the average height error of the toy blocks [$\epsilon_1$ in Fig. 27(c)]. The advantage of measuring the slope is that it is free from errors originating from an unintended position offset of the stack [$\epsilon_0$ in Fig. 27(c)].

The procedure of RB is essentially the same as that of the toy blocks explained above. A toy block corresponds to a quantum gate, and the toy storage bag corresponds to the Clifford group. The main difference is that the experimentally measured quantity for the toy blocks is the error in height, which increases linearly with $m$, whereas the measured quantity for quantum gates is the survival probability, which decays exponentially with $m$. Here, the survival probability means the probability that the initial state is not changed by the gate sequence. After the entire process, the survival probability is fitted with the function $Ap^m+B$. The crucial point is that the average gate fidelity is estimated from $p$ only; the effects of SPAM errors are reflected in $A$ and $B$.

The detailed protocol of the standard RB is as follows:^{194}

1. Generate a sequence of $m+1$ gates picked uniformly at random from the Clifford group. Here, the last gate is chosen so that the net sequence is the identity operation.
2. Prepare a state, such as $|0\rangle$ for a single-qubit system or $|00\rangle$ for a two-qubit system.
3. Perform the gate operation.
4. Repeat steps 1–3 to measure the survival probability.
5. Repeat steps 1–4 for various $m$ to obtain the survival probability as a function of $m$.
6. Fit the survival probability with the decay function $Ap^m+B$.
7. The average gate infidelity is given by $J_{\text{gate}}=(d-1)(1-p)/d$, where $d\equiv 2^n$ is the dimension of the Hilbert space and $n$ is the number of qubits.
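The fitting and conversion steps at the end of the protocol can be sketched as follows. The survival probabilities here are synthetic: they are drawn from the decay model itself with binomial shot noise, not from simulated Clifford sequences, and the "true" values of $A$, $p$, and $B$ are arbitrary choices for the illustration. The fit scans $p$ over a grid and solves for $A$ and $B$ by linear least squares, which is one simple option among many.

```python
import numpy as np

def fit_rb_decay(lengths, survival, p_grid=None):
    """Fit survival ~ A * p**m + B; for each trial p, A and B follow from least squares."""
    if p_grid is None:
        p_grid = np.linspace(0.5, 0.9999, 5000)
    best = None
    for p in p_grid:
        design = np.column_stack([p ** lengths, np.ones(len(lengths))])
        coef, *_ = np.linalg.lstsq(design, survival, rcond=None)
        resid = np.sum((design @ coef - survival) ** 2)
        if best is None or resid < best[0]:
            best = (resid, coef[0], p, coef[1])
    _, a, p, b = best
    return a, p, b

def average_infidelity(p, n_qubits):
    """J_gate = (d - 1)(1 - p) / d with d = 2**n_qubits."""
    d = 2 ** n_qubits
    return (d - 1) * (1.0 - p) / d

# Synthetic single-qubit data (all numbers are illustrative assumptions)
rng = np.random.default_rng(0)
lengths = np.array([1, 2, 4, 8, 16, 32, 64, 128, 256, 512])
true_a, true_p, true_b = 0.48, 0.99, 0.5       # A < 0.5 mimics SPAM errors
shots = 50000
survival = rng.binomial(shots, true_a * true_p ** lengths + true_b) / shots

a_fit, p_fit, b_fit = fit_rb_decay(lengths, survival)
infidelity = average_infidelity(p_fit, n_qubits=1)
print(f"fitted p = {p_fit:.4f}, estimated infidelity = {infidelity:.2e}")
```

Changing `true_a` or `true_b` changes the fitted $A$ and $B$ but leaves the estimated infidelity essentially unchanged, which is how the robustness of RB against SPAM errors shows up in the fit.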

The standard RB gives a single value, the average gate (in)fidelity. Although such convenience is an advantage of RB, we often need to know the fidelity of a specific gate. One example is the fidelity evaluation for the closed-loop optimization of a certain gate operation.^{156,157} In this case, we use the interleaved RB.^{195}

The procedure for the interleaved RB is similar to that for the standard RB. The differences are (i) a random gate and the gate of interest $\mathcal{C}$ appear alternately; (ii) the infidelity associated with the gate of interest, $r_{\mathcal{C}}$, is given by $r_{\mathcal{C}}=(d-1)(1-p_{\mathcal{C}}/p)/d$, where $p_{\mathcal{C}}$ is the fitting parameter in the decay function and $p$ is the fitting parameter obtained from the standard RB. Note that this formula works only when the gate error is due to a stochastic process, such as thermalization or dephasing. If the gate error is due to coherent errors, such as imprecise control errors, we might have interferences between these errors. In this case, we have to estimate the upper and lower bounds of $p_{\mathcal{C}}$ assuming the best and worst cases of interference.^{196}
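As a worked example of the two formulas, the sketch below evaluates them for made-up single-qubit decay parameters; the numbers are illustrative and do not come from any experiment.

```python
def standard_rb_infidelity(p, n_qubits):
    """Average infidelity per Clifford: (d - 1)(1 - p) / d."""
    d = 2 ** n_qubits
    return (d - 1) * (1.0 - p) / d

def interleaved_rb_infidelity(p_interleaved, p_standard, n_qubits):
    """Infidelity of the interleaved gate: (d - 1)(1 - p_C / p) / d."""
    d = 2 ** n_qubits
    return (d - 1) * (1.0 - p_interleaved / p_standard) / d

p_standard = 0.995        # decay parameter from standard RB (illustrative)
p_interleaved = 0.990     # decay parameter from interleaved RB (illustrative)

r_clifford = standard_rb_infidelity(p_standard, n_qubits=1)
r_gate = interleaved_rb_infidelity(p_interleaved, p_standard, n_qubits=1)
print(f"average Clifford infidelity: {r_clifford:.2e}")
print(f"infidelity of the interleaved gate: {r_gate:.2e}")
```

Both results are close to $2.5\times10^{-3}$ for these inputs. The ratio $p_{\mathcal{C}}/p$ removes the contribution of the random Clifford gates from the interleaved decay, so that only the error of the gate of interest remains.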

Last, we point out that, although RB is scalable in principle, its scalability in practice is not clear. The reason is that implementing an $N$-qubit Clifford operation requires $O(N^2/\log N)$ primitive two-qubit gate operations,^{197} which implies that, even if the fidelity of a primitive two-qubit gate looks reasonably good, the quality of the Clifford operation degrades rapidly with increasing number of qubits. This increases the number of measurements required to estimate the fidelity. One of the scalable alternatives is cycle benchmarking: in this protocol, the uncertainty of the fidelity estimate is independent of the number of qubits. Interested readers should see Ref. 198.

## ACKNOWLEDGMENTS

S.K. thanks Sota Ino, Paul Magnard, Hiroto Mukai, Shotaro Shirai, Teruaki Yoshioka, and Dengke Zhang for helpful discussions and Giuseppe Falci, Holger Haas, Ian Hincks, Anita Fadavi Roudsari, Rui Wang, Alex Wozniakowski, and anonymous reviewers for valuable comments on the initial manuscript. This work was supported by CREST, JST (Grant No. JPMJCR1676) and the New Energy and Industrial Technology Development Organization (NEDO).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

## REFERENCES

*What is Quantum Mechanics? A Physics Adventure*

*Modern Quantum Mechanics*

*Quantum Computation and Quantum Information: 10th Anniversary Edition*

*Principles of Quantum Computation and Information: A Comprehensive Textbook*

*Optical Coherence and Quantum Optics*

*Exploring the Quantum: Atoms, Cavities, and Photons*

*Quantum Machines: Measurement and Control of Engineered Quantum Systems—Lecture Notes of the Les Houches Summer School: Volume 96, July 2011*, edited by M. Devoret, B. Huard, R. Schoelkopf, and L. F. Cugliandolo (Oxford University Press, 2014).