Communication is an integral part of human life. Today, optical pulses are the preferred information carriers for long-distance communication. The exponential growth in data leads to a “capacity crunch” in the underlying physical systems. One of the possible methods to deter the exponential growth of physical resources for communication is to use quantum, rather than classical measurement at the receiver. Quantum measurement improves the energy efficiency of optical communication protocols by enabling discrimination of optical coherent states with the discrimination error rate below the shot-noise limit. In this review article, the authors focus on quantum receivers that can be practically implemented at the current state of technology, first and foremost displacement-based receivers. The authors present the experimentalist view on the progress in quantum-enhanced receivers and discuss their potential.

## I. INTRODUCTION

The communication capacity crunch is upon us,^{1,2} owing to the exponential expansion of the Internet. With monthly Internet traffic of 200 exabytes at the time of writing, the underlying communications systems will no longer be able to support the service reliability and Internet traffic congestion will only worsen as the exponential trend continues. Claude Shannon analyzed and described the limits of a communication channel.^{3} He found a universal relation between information capacity, available channel resources, and noise. The connection between information and physics turns out to be even more fundamental. This connection is now well-established as a result of the progress in information theory and computer science, on one hand,^{4,5} and quantum physics on the other.^{6} With physical measurement at the heart of communication, the fundamental communication channel limits are related to fundamental properties of measurements.

Quantum theory established tools that quantitatively connect physical measurement and communication. In 1962, Gordon found the maximal capacity of electromagnetic (bosonic) channels from first principles.^{7} The connection between the channel's physical resource and its capacity was solidified in subsequent work.^{8–15} On one hand,^{9,10} there is a fundamental limit on measurement accuracy, called the Helstrom bound (HB), that leads to occasional errors in discriminating between the physical states used for communication. On the other hand, it was found that commonly used detection techniques such as homodyne/heterodyne measurement, even with ideal components, are limited by measurement noise^{7,11–15} due to the quantum nature of the photoelectric effect and the Poisson photon statistics of coherent light.^{16} This inherent noise is called shot noise, and it prevents reaching the error rate prescribed by Helstrom's work.

Thus, at least potentially, quantum measurements can surpass the capabilities of classical measurements and improve channel capacity (to within Gordon's bound). A new field of research was born, whose main goal is to use quantum effects to surpass shot-noise-limited measurement. The first practically attainable quantum measurement-based receivers were proposed in the 1970s.^{17,18} This pioneering work was followed by further theoretical and experimental research.^{19–26} The field became particularly active in the late 2010s as highly efficient, low dark noise single-photon detectors became available.^{27}

Although there are reviews on discrete quantum state discrimination,^{28–33} there is no comprehensive review of experimental efforts in state discrimination of continuous-variable states. In this review, we focus on the state-of-the-art quantum measurement schemes and communication protocols for classical communications with finite sets of continuous-variable states, such as coherent states. Because we primarily describe receivers that have been practically implemented, the central attention is devoted to coherent displacement-based receiver designs. Averting the capacity crunch in global communications may require paradigm-shifting research and engineering efforts. Quantum measurement could provide new tools that will help take full advantage of communication channels, up to the theoretical maximum, and thus enable this paradigm shift.

This review is organized as follows. In Sec. II, we briefly review the theoretical foundations of classical and quantum-enabled channels. We introduce a simple classification of the communication channels based on the type of encoding and the type of measurement. In Sec. III, we discuss conventional and novel communication protocols used for coherent optical communication. We compare power limited and bandwidth limited encodings and the trade-off between their resource efficiencies. In Sec. IV, we discuss displacement-based quantum receivers for discrimination of coherent states. The two main classes of the receivers are considered: receivers with adaptive displacement and passive displacement receivers. In Sec. V, we outline research efforts beyond displacement receivers and beyond the use of coherent states in noiseless communication channels. Section VI summarizes advantages and challenges of the potential widespread use of quantum measurement for communication and concludes the review.

## II. THEORY OF DETECTION; CLASSICAL SHOT-NOISE LIMIT AND QUANTUM HELSTROM BOUND

### A. Quantum-enabled channels

Quantum theory revisited the fundamentals of communication. The calculation of information capacity limits of conventional communication channels from the first principles became possible. Then, questions on using quantum enhancement to improve conventional communication channels emerged. Although we expect a higher capacity for a quantum-enabled channel, additional steps may be required to take advantage of it. A communication link requires an encoding scheme that maps user information to physical states and a measurement device for a physical state detection. Thus, on the most practical level, quantum properties of physical states and measurement need to be considered, potentially limiting the practically accessible channel capacity. We will start from the most abstract analysis and then consider practical constraints.

In general, a quantum-enabled channel supports (1) classical encoding and measurement, (2) classical encoding and quantum measurement, (3) quantum encoding and classical measurement, and (4) quantum encoding and measurement. The information capacity of quantum-enabled channels is bounded from above by Holevo's theorem.^{8}

From a quantum standpoint, electromagnetic waves are described by expanding them into a series of orthogonal modes and assigning each mode a discrete number of excitations, i.e., photons. For the sake of simplicity, we assume communication via a single spatial mode, which is most commonly the case. Then, the number of orthogonal modes is directly related to the frequency bandwidth *B* of the channel. The average number of photons per state *n* is directly proportional to the average energy: $E=\hbar\omega n$. The average power is $W=EB=n\hbar\omega B$. After substituting the maximal achievable entropy per optical mode into the Holevo theorem, one finds the capacity of a lossless and noiseless quantum-enabled channel:^{7}

$$C_Q=B\log_2\left(1+\frac{W}{\hbar\omega B}\right)+\frac{W}{\hbar\omega}\log_2\left(1+\frac{\hbar\omega B}{W}\right).\tag{1}$$

Therefore, the number of modes and the average energy of an optical state fully describe the physical resource use when electromagnetic waves are used as information carriers. This important result is referred to as the Gordon capacity (or Holevo bound). To aid comparison, the channel capacity is often divided by the channel bandwidth. Then, the normalized channel capacity $C_Q/B$ conveniently characterizes the spectral efficiency. Formally, the spectral efficiency is measured in bits, but often the units bits/s/Hz are used to emphasize the physical meaning of *C*/*B* as a measure of data rate in bits per second over a channel with a bandwidth of 1 Hz. $C_Q/B$ is also used for classification of communication protocols.^{34,35}

In the classical limit, $W\gg B\hbar\omega$, the second term becomes negligible compared with the first, and the capacity becomes $C_Q\approx C_{\mathrm{Shannon}}=B\log_2(1+W/(\hbar\omega B))$. This result is identical to the classical channel capacity given by the Shannon limit, where $W/(\hbar\omega B)$ is the signal-to-noise ratio.
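As a short worked step, the two limiting regimes can be read off from the normalized capacity. The lines below are a sketch that assumes the Gordon capacity is rewritten in terms of the mean photon number per mode, $n=W/(\hbar\omega B)$, as defined above:

$$\frac{C_Q}{B}=\log_2(1+n)+n\log_2\left(1+\frac{1}{n}\right),$$

$$n\gg 1:\quad n\log_2\left(1+\frac{1}{n}\right)\to\log_2 e\approx 1.44,\qquad \frac{C_Q}{B}\approx\log_2(1+n),$$

$$n\ll 1:\quad \log_2(1+n)\approx n\log_2 e\ll n\log_2\frac{1}{n},\qquad \frac{C_Q}{B}\approx n\log_2\frac{1}{n}.$$

For bright signals the second term saturates near 1.44 bits per mode while the first keeps growing logarithmically, which is why it can be dropped in the classical limit; for faint signals the roles are reversed.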

In the photon-starved regime, $W\ll B\hbar\omega$, the capacity is mainly defined by the second term in (2). This result can be interpreted as follows. For low input power, one can use several orthogonal modes and each time send the entire available energy (a single photon) in one mode. The more modes are available, the more bits of information can be encoded per photon. For example, if one photon can be sent per time interval *T* (so that $W=\hbar\omega/T$), then for an available bandwidth *B*, one can divide this interval into *M* = *BT* slots. The information can be encoded by sending a photon in a particular time slot. The number of encoded bits is $\log_2 BT$. Therefore,

$$C_{\mathrm{OME}}=\frac{1}{T}\log_2 BT=\frac{W}{\hbar\omega}\log_2\frac{\hbar\omega B}{W},\tag{3}$$

where OME stands for orthogonal mode encoding. Spectral efficiencies $C_Q/B$, $C_{\mathrm{Shannon}}/B$, and $C_{\mathrm{OME}}/B$ are shown in Fig. 1 as a function of energy efficiency, defined as the average number of photons used to transmit 1 bit of information.
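The comparison of the three spectral efficiencies can be reproduced numerically. The sketch below is illustrative only. It assumes the normalized forms $C_Q/B=\log_2(1+n)+n\log_2(1+1/n)$ and $C_{\mathrm{Shannon}}/B=\log_2(1+n)$ with $n=W/(\hbar\omega B)$ photons per mode, and it takes $C_{\mathrm{OME}}/B\approx n\log_2(1/n)$ from the time-slot argument. The function names are hypothetical.

```python
import math

def gordon_se(n):
    """Assumed normalized Gordon capacity C_Q/B (bits/s/Hz) at n photons per mode."""
    return math.log2(1 + n) + n * math.log2(1 + 1 / n)

def shannon_se(n):
    """Classical-limit term, C_Shannon/B = log2(1 + n)."""
    return math.log2(1 + n)

def ome_se(n):
    """Assumed photon-starved estimate C_OME/B ~ n*log2(1/n); sensible only for n < 1."""
    return n * math.log2(1 / n)

def photons_per_bit(n, spectral_efficiency):
    """Energy efficiency: average number of photons spent per transmitted bit."""
    return n / spectral_efficiency

for n in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
    cq = gordon_se(n)
    row = f"n={n:8.3f}  C_Q/B={cq:8.4f}  Shannon={shannon_se(n):8.4f}"
    if n < 1:
        row += f"  OME={ome_se(n):8.4f}"
    row += f"  photons/bit at C_Q: {photons_per_bit(n, cq):8.4f}"
    print(row)
```

In this model the Shannon term tracks the Gordon capacity for large *n*, while the OME estimate approaches it from below for small *n*.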

Note that this simple result is based on the assumption of a noiseless and lossless channel. In this ideal case only encoding using Fock states in conjunction with ideal photon-number resolving (PNR) measurement can attain the Gordon capacity, Table I. Typically, optical channels exhibit a significant loss. The upper bound (2) can be corrected by changing assumptions. In particular, adding a model for losses leads to a different capacity bound.^{36–38} Any practical optical communication system requires physical states that are resilient to optical loss, at least to some extent. To this end, classical states, especially coherent states of light, are particularly useful. In this review, we focus on channels with classical encoding and quantum measurement.

Metric | Channel assumptions | Measurement assumptions | Encoding assumptions |
---|---|---|---|
Gordon capacity (Holevo) | Lossless, noiseless | Photon number resolving | Fock states |
Helstrom bound | Noiseless | n/a | Any alphabet |
Shot noise limit | Noiseless | Ideal classical | Any alphabet |


To design a practical digital communication system, an encoding method to map digital information onto transmitted physical states is needed. The set of states $\{|\psi_j\rangle\}$ is called an alphabet; it can be of an arbitrary length *M*. We assume equiprobable states and a noiseless channel, Table I. How well can the alphabet symbols be distinguished? To answer this question quantitatively, we use the symbol error rate (SER) *P*, the probability that a transmitted symbol is received incorrectly. Helstrom determined that the lower bound on this error is related to the overlap of the alphabet states.^{10} One uses the square root measurement (SRM) method^{8,23,39,40} to find the Helstrom bound (HB). This method relies on a Gram matrix defined as

$$G_{mj}=\langle\psi_m|\psi_j\rangle.\tag{4}$$

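Note that the dot product in (4) cannot be zero for coherent states. Indeed, in the Fock basis, one writes

$$|\alpha\rangle=e^{-|\alpha|^2/2}\sum_{k=0}^{\infty}\frac{\alpha^k}{\sqrt{k!}}|k\rangle,\tag{5}$$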

where *α* is a coherent state parameter. Recall that $\langle\mathrm{vac}_m|\mathrm{vac}_j\rangle=1$ even if modes *m* and *j* are orthogonal. This property of coherent states is important for other applications such as quantum fingerprinting.^{41} Interestingly, even if the communication alphabet uses quantum states with no vacuum component, any loss in a channel admixes the vacuum component to the initial state. The error probability bound for a quantum receiver can be written in terms of the elements of the square root of the Gram matrix:

$$P_{\mathrm{HB}}=1-\frac{1}{M}\sum_{j=1}^{M}\left|\left(G^{1/2}\right)_{jj}\right|^{2}.$$

In general, the Helstrom bound cannot be found analytically. For some encodings, $G^{1/2}$ has an analytical form. We will give examples of Helstrom bounds for typical encodings in Sec. III.
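When no closed form is available, the square-root-of-the-Gram-matrix recipe can be evaluated numerically. The following is a minimal sketch, not the procedure of any cited work. It assumes single-mode coherent states with overlap $\langle\alpha|\beta\rangle=\exp(-|\alpha|^2/2-|\beta|^2/2+\alpha^*\beta)$, equiprobable symbols, and an SRM error of $1-\frac{1}{M}\sum_j|(G^{1/2})_{jj}|^2$, which is known to be optimal for symmetric alphabets such as PSK. The helper names are hypothetical.

```python
import numpy as np

def psk_alphabet(n_mean, M):
    """Complex amplitudes of M coherent states equally spaced in phase (assumed alphabet)."""
    return np.sqrt(n_mean) * np.exp(2j * np.pi * np.arange(M) / M)

def gram_matrix(alphas):
    """G[j, k] = <alpha_j|alpha_k> for single-mode coherent states."""
    a = np.asarray(alphas, dtype=complex)
    norm = -0.5 * np.abs(a) ** 2
    return np.exp(norm[:, None] + norm[None, :] + np.conj(a)[:, None] * a[None, :])

def srm_error(G):
    """Error probability of the square-root measurement for equiprobable symbols."""
    w, V = np.linalg.eigh(G)              # G is Hermitian and positive semidefinite
    w = np.clip(w, 0.0, None)             # guard against tiny negative round-off
    sqrt_G = (V * np.sqrt(w)) @ V.conj().T
    return 1.0 - np.mean(np.abs(np.diag(sqrt_G)) ** 2)

n_mean = 0.5  # mean photon number per symbol (illustrative value)
p_bpsk = srm_error(gram_matrix(psk_alphabet(n_mean, 2)))
p_closed = 0.5 * (1.0 - np.sqrt(1.0 - np.exp(-4.0 * n_mean)))
print(f"BPSK: SRM {p_bpsk:.6f}, closed form {p_closed:.6f}")
for M in (2, 4, 8):
    print(f"{M}-PSK at n = {n_mean}: {srm_error(gram_matrix(psk_alphabet(n_mean, M))):.6f}")
```

For two states the numerical value agrees with the binary closed form, which gives a quick sanity check before applying the same routine to alphabets without an analytical $G^{1/2}$.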

Because the Holevo theorem bounds channel capacity and the HB puts a limit on error rate, the two bounds cannot be directly compared. However, the HB gives the resource use, i.e., the required power and bandwidth to reach a certain error probability. Thus, to benchmark an encoding, the error probability is fixed. Then the normalized data rate *R*/*B* and the required power are compared to the normalized channel capacity $C(W)/B$. Obviously, for sufficiently low SER $P'_{\mathrm{HB}}$ and a lossy communication channel, $R_{\mathrm{HB}}/B<C(W')/B$, where $W'$ is the power required to achieve the SER $P'_{\mathrm{HB}}$. It is very important to note here that the HB merely establishes the lowest possible error probability, but does not guarantee a measurement method capable of achieving the HB. Hence, we expect that the experimental spectral efficiency $R_E/B\le R_{\mathrm{HB}}/B<C(W')/B$.

### B. Classical channels

Unless OME is used, classically, the information capacity is given by the Shannon theorem^{3}

$$C_{\mathrm{Shannon}}=B\log_2\left(1+\frac{W}{N}\right).$$

This classical model does not specify the origin of channel noise *N*. Naively, this noise is a property of a communication channel and can be arbitrarily small, which would result in an infinite channel capacity. In reality, noise is a fundamental property of any measurement. Because communication cannot occur without a physical measurement at the receiver, it is the measurement noise that would limit an otherwise noiseless channel. Although noise can be introduced *ad hoc* into a classical model of measurement, it is much more convenient to derive the minimal measurement noise using a quantum mechanical description of an otherwise classical measurement.^{7,42,43} A typical classical receiver measures the optical signal via heterodyne and/or homodyne measurements. In both cases, the signal undergoes interference on a beam splitter with a local oscillator (LO). The LO is a coherent state with the same optical frequency as the signal carrier in the case of the homodyne and a different frequency in the case of the heterodyne. After interference, the signal is detected on one or more detector(s). In all cases, there will be a current at the detector, and hence there will be shot noise. Assuming unit detection efficiency, the information capacity of coherent homodyne and heterodyne receivers is^{7}

We see that the measurement-induced noise is proportional to the channel's bandwidth, and the dependence of capacity on power and bandwidth in (7) is similar to the first term of the Gordon capacity, cf. Eq. (1).

For OME, a so-called direct detection measurement can be used. In principle, orthogonal optical modes can be physically separated without introducing extra noise or loss. Once separated, each mode can be separately measured with a detector. For instance, if spectral modes were used, a dispersive element such as a grating could be employed, followed by *M* spatially separated detectors. For pulse position modulation (PPM), where the position of a short pulse within a larger temporal window encodes information, one time-resolving detector is sufficient because modes are separated in time. A successful detection occurs when light is detected in one and only one mode. Although the exact analytical expression for the OME capacity in a classical Poisson channel is not known,^{44,45} the channel capacity scales as $\frac{W}{\hbar\omega B}\log_2\left(1+\frac{\hbar\omega B}{W}\right)$ in the limit of weak optical input.^{46} This limit is identical to the second term of the Gordon capacity, cf. (3).

We see that $C_Q>C_{\mathrm{heterodyne}},C_{\mathrm{homodyne}},C_{\mathrm{OME}}$. Therefore, the channel capacity of the quantum-enabled channel exceeds that of the classical channel. In the derivations above, one does not specify a modulation scheme to obtain the channel capacity. Finding the upper bound for SER requires selecting a modulation scheme. The uncertainty due to shot noise on the detector^{12,47} leads to state discrimination errors. The lowest classically attainable symbol error rate is often referred to as the shot noise limit (SNL), quantum noise limit (QNL), or standard quantum limit (SQL). We will give examples of SNL derivations for particular modulation protocols in Sec. III.
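The ordering of these capacities can be checked with a few lines of code. The sketch below is an assumption-laden illustration: it uses the commonly quoted per-mode forms $\log_2(1+n)$ for ideal heterodyne and $\tfrac{1}{2}\log_2(1+4n)$ for ideal homodyne detection, together with the per-mode Gordon capacity, all at $n=W/(\hbar\omega B)$ photons per mode. These expressions and the function names are not taken from the text above.

```python
import math

def c_gordon(n):
    """Per-mode Gordon capacity (bits), assumed form."""
    return math.log2(1 + n) + n * math.log2(1 + 1 / n)

def c_heterodyne(n):
    """Per-mode capacity with ideal heterodyne detection (assumed textbook form)."""
    return math.log2(1 + n)

def c_homodyne(n):
    """Per-mode capacity with ideal homodyne detection (assumed textbook form)."""
    return 0.5 * math.log2(1 + 4 * n)

for n in (0.01, 0.1, 1.0, 2.0, 10.0, 100.0):
    print(f"n={n:7.2f}  Gordon={c_gordon(n):7.4f}  "
          f"heterodyne={c_heterodyne(n):7.4f}  homodyne={c_homodyne(n):7.4f}")
```

Under these assumptions homodyne detection is the better classical choice for faint signals and heterodyne for bright ones (the two coincide at *n* = 2), while the Gordon capacity stays above both.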

To benchmark an encoding, a SER *P* is fixed (at a sufficiently low value). Then the normalized data rate *R*/*B* and the required power can be compared to the normalized channel capacity $C(W)/B$. Thus, the highest attainable data rates for classical and quantum-enabled channels as well as the Holevo bound and the Shannon limit can be presented on the same graph. As we will see below, $R_{\mathrm{SNL}}(W')/B<R_{\mathrm{HB}}(W')/B<C_Q(W')/B$, where $W'$ is a fixed power. Note that because we explicitly assume the measurement method to compute the SNL, the classical lowest possible error probability $P_{\mathrm{SNL}}$ can in principle be achieved using ideal components, as opposed to $P_{\mathrm{HB}}$, because the ideal quantum measurement method might be unknown.

Table I summarizes channel capacity and SER bounds and the assumptions that are required to derive them.

## III. CONVENTIONAL AND NOVEL COMMUNICATION PROTOCOLS

In digital communications, the ratio between data rate and bandwidth *R*/*B* gives the spectral efficiency of the communication protocols. Two main families of modulation schemes are generally distinguished: power-limited $R/B>1$ and bandwidth-limited $R/B<1$, Fig. 2. The power-limited family includes such encodings as pulse amplitude modulation (PAM), quadrature amplitude modulation (QAM), phase-shift keying (PSK), and others. In these modulation schemes, the bit rate *R* for a fixed signal pulse duration grows as $\log_2 M$ as the number of states in the alphabet *M* increases. Communication bandwidth *B* remains constant, which means that the spectral efficiency *R*/*B* improves with *M*. However, power-limited modulation schemes using longer communication alphabets require more power than those with shorter alphabets for reliable communication because it is generally harder to discriminate a larger number of non-orthogonal states. The maximal possible *R*/*B* is set by the power limit of the communication channel. While energy per symbol requirements increase as a power function of *M*, the number of encoded bits increases logarithmically. Thus, even though spectral efficiency *R*/*B* improves, energy requirements per bit increase exponentially with the number of encoded bits.

The bandwidth-limited $R/B<1$ family includes pulse position modulation (PPM), biorthogonal and simplex signal modulation, and orthogonal frequency-shift keying (OFSK). These encodings typically use classically orthogonal states. The number of bits carried with each signal pulse depends on the alphabet length as $\log_2 M$, the same dependence as for power-limited protocols. However, the bandwidth occupied by orthogonal communication symbols grows linearly with the alphabet length *M*. The energy efficiency improves with *M* because each signal pulse carries more information, while the energy required for reliable discrimination does not depend on the number of orthogonal signals. The spectral efficiency *R*/*B* of bandwidth-limited protocols decreases as $(\log_2 M)/M$, Fig. 2. The largest *M* is given by the bandwidth limit of the communication channel.
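The opposite scaling of the two families with alphabet length can be illustrated with simple arithmetic. The sketch below assumes an idealized normalization in which one symbol confined to a single mode occupies unit bandwidth, so a single-mode alphabet has $R/B=\log_2 M$ and an orthogonal alphabet spread over *M* modes has $R/B=(\log_2 M)/M$. The function names are hypothetical.

```python
import math

def spectral_efficiency_single_mode(M):
    """R/B for an alphabet confined to one mode (e.g., PSK): log2(M) bits per mode."""
    return math.log2(M)

def spectral_efficiency_orthogonal(M):
    """R/B for an orthogonal alphabet (e.g., PPM): log2(M) bits spread over M modes."""
    return math.log2(M) / M

for M in (2, 4, 8, 16, 64):
    print(f"M={M:3d}  single-mode R/B={spectral_efficiency_single_mode(M):5.2f}  "
          f"orthogonal R/B={spectral_efficiency_orthogonal(M):6.4f}")
```

In this normalization the orthogonal family never exceeds $R/B=0.5$, reached at *M* = 2 and *M* = 4, and falls off for longer alphabets.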

We will discuss different modulation methods, review their theoretical limits for detection error rates, and compare their performance with the fundamental channel capacity. We will focus on encoding schemes that have been actively considered for quantum-enabled communication experiments.

### A. Binary protocols

Binary protocols are well studied, and they are a rare case where analytical expressions for error rate limits can be found, see Table II. It is not surprising that the first quantum receiver outperforming SNL was proposed for the binary modulation.^{17} In addition, the first projection measurement that achieves the HB (or the optimal projection) was found for binary encodings.^{10,18} Here we discuss binary encodings based on amplitude and phase modulations.

Encoding | Optimal classical receiver | Shot noise limit $P_{\mathrm{SNL}}$ | Helstrom bound $P_{\mathrm{HB}}$ | Ref. |
---|---|---|---|---|
OOK | Direct detection | $\frac{1}{2}e^{-n}$ | $\frac{1}{2}\left(1-\sqrt{1-e^{-n}}\right)$ | 20 |
BPSK | Homodyne | $\frac{1}{2}\left(1-\mathrm{erf}\left(\sqrt{2n}\right)\right)$ | $\frac{1}{2}\left(1-\sqrt{1-e^{-4n}}\right)$ | 10,34 |
M-PSK | Homodyne | $1-\frac{1}{\pi}\int_{-\pi/M}^{\pi/M}\int_{0}^{\infty}e^{-\lvert re^{i\theta}-\sqrt{n}\rvert^{2}}\,r\,dr\,d\theta$ | $1-\frac{1}{M^{2}}\left(\sum_{q=1}^{M}\sqrt{e^{-n}\sum_{m=1}^{M}e^{(1-q)\frac{2\pi i m}{M}+ne^{\frac{2\pi i m}{M}}}}\right)^{2}$ | 24 |
M-PPM | Direct detection | $\frac{M-1}{M}e^{-n}$ | $\frac{M-1}{M^{2}}\left(\sqrt{1+(M-1)e^{-n}}-\sqrt{1-e^{-n}}\right)^{2}$ | 50,51 |
M-CFSK | Homodyne | Numerical | Numerical SRM | 48,49 |


The binary PSK (BPSK) uses two coherent states with opposite phases for encoding and encodes exactly one bit per symbol:

$$s_0=|-\alpha\rangle,\qquad s_1=|\alpha\rangle.$$

The corresponding constellation diagram is shown in Fig. 3. Fuzzy circles represent coherent states on a phase diagram, the distance from the state to the origin is proportional to the square root of the average number of photons in the state, and the average phase is measured as the angle between the positive direction of axis *I* and the vector from the origin to the center of the coherent state. Because both BPSK symbols *s*_{0} and *s*_{1} are states of the same optical mode, this encoding is non-orthogonal even if one can neglect the vacuum component, cf. (5). Faint states *s* can significantly overlap. The optimal classical discrimination of the BPSK signals can be achieved via a homodyne measurement. The only relevant measurement value for BPSK is the projection of the measured state on the in-phase quadrature (*I* axis in Fig. 3). The probability density function to receive a projection *x* when state *s*_{i} was sent is

$$p(x|s_i)=\sqrt{\frac{2}{\pi}}\,e^{-2\left(x-x_i\right)^{2}},\qquad x_0=-\sqrt{n},\quad x_1=+\sqrt{n},$$

where $n=\langle n\rangle=|\alpha|^2$ is the average number of photons in a state *s*_{i}. A decision that the input state is *s*_{0} is made if the measured *x* < 0; otherwise, if *x* > 0, the decision is *s*_{1}. Therefore, to find the probability of a discrimination error, we need to compute the probability of measuring *x* > 0 when *s*_{0} was sent (or the probability of measuring *x* < 0 when *s*_{1} was sent),

$$P_{\mathrm{SNL}}^{\mathrm{BPSK}}=\frac{1}{2}\left(1-\mathrm{erf}\left(\sqrt{2n}\right)\right).$$

The HB can be readily found as

$$P_{\mathrm{HB}}^{\mathrm{BPSK}}=\frac{1}{2}\left(1-\sqrt{1-e^{-4n}}\right).$$

As we expect, $P_{\mathrm{HB}}<P_{\mathrm{SNL}}$.

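The BPSK constellation is similar to binary on-off keying (OOK) (i.e., $s_0=|0\rangle,\ s_1=|\alpha\rangle$) when the origin is shifted to the center of the left state in Fig. 3. Thus, we immediately get^{20}

$$P_{\mathrm{SNL}}^{\mathrm{OOK}}=\frac{1}{2}e^{-n},\qquad P_{\mathrm{HB}}^{\mathrm{OOK}}=\frac{1}{2}\left(1-\sqrt{1-e^{-n}}\right).$$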

Note that the classical measurement of OOK states distinguishes coherent states from vacuum states, and this measurement does not require a heterodyne; $P_{\mathrm{SNL}}^{\mathrm{OOK}}$ is based on direct optical power measurement (direct detection). OOK requires four times higher peak energy and two times higher average signal energy than BPSK to match its quantum discrimination error rate bound, HB. This inefficiency can be explained by calculating the geometrical distance between signal vectors $d_{01}^{\mathrm{PSK}}=2\alpha$ and $d_{01}^{\mathrm{OOK}}=\alpha$.^{34}
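The binary bounds are simple enough to tabulate directly. The sketch below evaluates the OOK and BPSK entries of Table II, taking *n* as the mean photon number of a BPSK pulse or of the OOK "on" pulse (an interpretation assumed here), and checks the factor-of-four statement numerically. The function names are hypothetical.

```python
import math

def snl_bpsk(n):
    """Shot-noise limit for BPSK with ideal homodyne detection."""
    return 0.5 * (1.0 - math.erf(math.sqrt(2.0 * n)))

def hb_bpsk(n):
    """Helstrom bound for BPSK."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))

def snl_ook(n_on):
    """Shot-noise limit for OOK with ideal direct detection; n_on is the 'on' pulse energy."""
    return 0.5 * math.exp(-n_on)

def hb_ook(n_on):
    """Helstrom bound for OOK."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-n_on)))

for n in (0.1, 0.5, 1.0, 2.0):
    print(f"n={n:4.1f}  BPSK SNL={snl_bpsk(n):.3e} HB={hb_bpsk(n):.3e}  "
          f"OOK SNL={snl_ook(n):.3e} HB={hb_ook(n):.3e}  OOK HB at 4n={hb_ook(4 * n):.3e}")
```

The last column reproduces the BPSK Helstrom bound, which is the numerical counterpart of the statement that OOK needs four times the peak energy (twice the average energy for equiprobable symbols).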

Binary protocols can carry only one bit of information with each signal pulse. It may be beneficial to encode more than one bit of information per signal pulse, i.e., by using larger encoding alphabets.

### B. *M*-ary PSK

A natural extension of BPSK is when more than two states are encoded in the phase of a coherent state. From symmetry considerations, the states are separated by equal phases $\Delta\varphi=2\pi/M$, where *M* is the number of states in the alphabet. As an example, the constellation diagram of the 4-ary PSK is presented in Fig. 4. This modulation method encodes more than one bit per state, which may be beneficial for two reasons. First, when detectors are slow, a single measurement yields several ($\log_2 M$) bits, so that the rate of information exchange improves. Second, the number of bits transmitted per optical mode in a unit time is higher; thus, spectral efficiency is higher. SNL and HB can be found analytically in integral form (Refs. 23 and 34); see Table II. It is convenient to plot energy and bandwidth requirements of *M*-ary PSK protocols on one graph, where points with different *M* are connected to guide the eye, Fig. 2. Even though $P_{\mathrm{SNL}}>P_{\mathrm{HB}}$ for all *M*, error probability bounds for classical and quantum detection grow fast with *M* for a constant energy per bit $n/\log_2 M$.^{34,48} Unfortunately, the potential advantage of the quantum measurement $P_{\mathrm{SNL}}/P_{\mathrm{HB}}$ also decreases with *M*. Therefore, quantum receivers are most effective for PSK protocols with relatively low *M* (see SNL PSK and HB PSK in Fig. 2).

### C. *M*-ary orthogonal encodings

The information can be encoded in *M* orthogonal modes, where a single optical pulse occupies one such mode. Modes can be made orthogonal using non-overlapping time bins, non-overlapping frequency bands, polarization, and spatio-angular distributions. Particularly, pulse-position modulation (PPM) is a modulation scheme in which $\log_2 M$ bits are encoded in one of *M* time bins, Fig. 5. Because different symbols of the alphabet do not overlap in time, this encoding is classically orthogonal. Each time bin can be thought of as an optical mode; therefore, an *M*-ary alphabet occupies *M* modes. Because the duration of a signal is one *T*/*M* time bin, the required bandwidth for this protocol is *M* times broader than that for the flat-top pulse of duration *T*. Instead of using time bins, one can use symbols that are separated in frequency, in which case information will be encoded in spectral modes, and the required bandwidth will still be *M* times broader than that for the flat-top pulse of duration *T*. Linear expansion of bandwidth use is unavoidable for all modulation schemes using orthogonal modes. Other degrees of freedom, such as polarization or spatial modes, can be used when available.

Direct detection is classically the best detection strategy. Specifically, in PPM, modes are separated in time, so the arrival time of the pulse to the detector is sufficient for the physical separation of modes. For other encodings, mode separation may involve spectral filtering, spatial mode sorters, and so on. The classical error limit for ideal signal-shot-noise limited (background-free) detector operation, Table II, is proportional to $e^{-n}$, i.e., the probability to detect vacuum states in all modes. There is no dependence of $P_{\mathrm{SNL}}$ on *M* for large *M*. Therefore, for a given power, error per bit reduces with *M* as $\log_2(M)$ (DD orthogonal in Fig. 2). This feature is used for photon-starved communications although the energy-bandwidth trade-off becomes inefficient for large *M*.

Even though heterodyne detection is not optimal due to larger shot noise, it is often used in optical communications for orthogonal frequency shift keying. Heterodyne noise increases with the bandwidth. On the other hand, noiseless physical separation of the closely-spaced frequency modes may be practically unfeasible. Interestingly, when heterodyne detection is employed, nearly all gain in bits per unit energy for large *M* is canceled by increasing noise (see SNL OFSK in Fig. 2).

As we discussed above, from the quantum viewpoint, faint coherent states are always nonorthogonal. The Helstrom bound is therefore above zero. Its value can be readily found, Table II (HB orthogonal in Fig. 2), and it can be shown that $P_{\mathrm{HB}}<P_{\mathrm{SNL}}$.^{9} Therefore, orthogonal encoding receivers can also benefit from a quantum measurement.
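The gap between the two bounds for orthogonal encodings can be evaluated from the *M*-PPM expressions in Table II. The sketch below assumes ideal, background-free direct detection for the classical limit and takes *n* as the mean photon number of the single occupied time bin. The function names are hypothetical.

```python
import math

def snl_ppm(n, M):
    """Classical (direct-detection) error limit for M-ary PPM."""
    return (M - 1) * math.exp(-n) / M

def hb_ppm(n, M):
    """Helstrom bound for M-ary PPM with coherent-state pulses."""
    k = math.exp(-n)  # squared overlap of a pulse of mean photon number n with vacuum
    return (M - 1) / M**2 * (math.sqrt(1 + (M - 1) * k) - math.sqrt(1 - k)) ** 2

for M in (2, 4, 16):
    for n in (0.1, 1.0, 5.0):
        print(f"M={M:2d} n={n:3.1f}  SNL={snl_ppm(n, M):.3e}  HB={hb_ppm(n, M):.3e}  "
              f"ratio={snl_ppm(n, M) / hb_ppm(n, M):.1f}")
```

For *M* = 2 the Helstrom expression reduces to the binary form $\frac{1}{2}(1-\sqrt{1-e^{-2n}})$, which serves as a consistency check on the general formula.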

### D. *M*-ary coherent frequency shift keying

The *M*-ary coherent frequency shift keying (CFSK) encodes information in both the frequency and phase of coherent state pulses, $|\alpha_m\rangle=|\alpha(\omega_m,\theta_m)\rangle$. The adjacent symbols *m* and *m* + 1 are separated by $\Delta\omega$ in frequency space, and their initial phases differ by $\Delta\theta$, so that $|\alpha_m\rangle=|\alpha(\omega_0+(m-1)\Delta\omega,(m-1)\Delta\theta)\rangle$. This alphabet is illustrated in the constellation diagram, Fig. 6. In this diagram, coherent states rotate with time around the origin with rates given by their detuning. The keying can be described by two parameters: $\Delta\theta$ and $\Delta\omega T$. This parameter space contains the PSK modulation scheme: $\Delta\omega T=0,\ \Delta\theta=2\pi/M$ and the orthogonal frequency shift keying (OFSK): $\Delta\omega T=2\pi$. The goal here is to reduce the bandwidth of the communication protocol while maintaining low error probabilities. Therefore, one is interested in small frequency separation: $\Delta\omega T<2\pi$. In this parameter space, states are nonorthogonal. Therefore, neither $P_{\mathrm{HB}}$ nor $P_{\mathrm{SNL}}$ can be expressed analytically. Numerical methods^{48,49} are used instead. Both $\Delta\theta$ and $\Delta\omega T$ can be adjusted to meet certain optimization goals. For instance, when optimizing for energy efficiency, the minimal Helstrom bound is achieved with one set of parameters, the lowest shot noise limit requires another parameter set, and the minimal error rate is achieved in a quantum receiver with yet another one. Interestingly, as the numerical analysis of $P_{\mathrm{HB}}$ shows, this keying balances energy requirements and bandwidth requirements at the same time, for $4\le M\le 32$, see Fig. 2. As a consequence, its rate graph crosses the $R/B=1$ value. Therefore, this keying is neither power limited nor bandwidth limited.

For a properly optimized CFSK, $P_{\mathrm{SNL}}^{\mathrm{CFSK}}<P_{\mathrm{SNL}}^{\mathrm{PSK}}$, which is expected, because the bandwidth of CFSK is wider than that of PSK. However, it may be difficult to build an efficient classical CFSK receiver in practice. Interestingly, it turns out that a time-resolving quantum receiver, discussed later, uses the same hardware for many encodings including PSK and CFSK. The only difference is the feedback algorithm encoded in firmware. Therefore, the quantum measurement can be used to provide bandwidth and power efficiency simultaneously in a practical way.

## IV. DISPLACEMENT-BASED QUANTUM STATE DISCRIMINATION

Quantum theory establishes a lower discrimination error bound than that accessible through classical measurement. However, the design of a practical measurement method does not directly follow from theory. In 1973, Kennedy proposed the first near-optimum receiver approaching Helstrom bound for binary coherent states.^{17} In less than a year, Dolinar proposed an improved receiver for binary coherent states.^{18} In both receivers, the input state is displaced from its original state through interference with a local oscillator, which can be practically accomplished with a heavily unbalanced (typically, 99:1) beam splitter. These two seminal papers have triggered theoretical and experimental research of quantum receivers.
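The effect of a fixed displacement can be illustrated for BPSK. The sketch below uses the textbook error probability of an idealized Kennedy receiver, $\frac{1}{2}e^{-4n}$, which follows when the displacement exactly nulls one of the two states and an on/off detector with unit efficiency and no dark counts decides on "click" versus "no click". These idealizations are assumptions of the example and do not model any particular experiment. The function names are hypothetical.

```python
import math

def snl_bpsk(n):
    """Shot-noise limit for BPSK with ideal homodyne detection."""
    return 0.5 * (1.0 - math.erf(math.sqrt(2.0 * n)))

def hb_bpsk(n):
    """Helstrom bound for BPSK."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))

def kennedy_bpsk(n):
    """Idealized Kennedy receiver: an error occurs only when the displaced state
    |2*alpha> produces no click, with probability exp(-4n), for half of the symbols."""
    return 0.5 * math.exp(-4.0 * n)

for n in (0.1, 0.2, 0.5, 1.0, 2.0):
    print(f"n={n:4.1f}  SNL={snl_bpsk(n):.3e}  Kennedy={kennedy_bpsk(n):.3e}  HB={hb_bpsk(n):.3e}")
```

In this idealized model the fixed-displacement receiver falls below the shot-noise limit only for sufficiently bright pulses and never reaches the Helstrom bound, which is consistent with its description as near-optimum.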

Most theoretical and nearly all experimental reports to date take advantage of coherent state displacement in one way or another even though coherent state displacement is not the optimal quantum measurement for some encodings. As it has been shown recently, an optimal projective measurement may require ancillary quantum states or quantum nodes, such as a single atom. We cover this exciting work in Sec. V.

In this section, we discuss the experiments with coherent state displacement-based quantum receivers. To aid the reader, we present a simple classification of these receivers in Fig. 7. The classification is based on the principle of operation. Coherent displacement can be either non-adaptive, where the local oscillator does not change throughout measurement (as in a Kennedy receiver) or adaptive, where the coherent state is actively controlled (as in a Dolinar receiver). The lowest level in the figure contains references (in bold) to experimental demonstrations and mentions a modulation protocol(s) used in the experiment.

At the time of writing, we are aware of OOK, BPSK, *M*-ary PSK, *M*-ary PPM, and *M*-ary CFSK experiments. To gauge the performance of quantum receivers, we compiled a table with the experimental results, Table III. The improvement from quantum measurement is typically measured as a ratio of the observed error rate to the classical SNL limit for a noiseless receiver with the same system detection efficiency as the quantum receiver, i.e., adjusted SNL. This measurement quantifies the so-called “quantum advantage” over a classical measurement under similar conditions. However, using this characterization method does not account for any inefficiency of the quantum measurement experiment. Some inefficiencies may be due to imperfect off-the-shelf components that were used, while other inefficiencies may be intrinsic to the chosen quantum measurement method. Thus, one could argue that a more relevant comparison of quantum versus classical receivers is to use the absolute SNL—the limit of the ideal classical receiver with unity efficiency. The error rates below the absolute SNL cannot be achieved by a classical receiver in principle. Although all quantum receivers surpass the adjusted SNL, not all of them achieve SER below the absolute SNL. We also compare input state energy required to achieve SER of 10% for demonstrated quantum receivers versus the SNL-limited receivers where applicable. This comparison shows the possible reduction of energy requirements by switching to quantum receivers.

| Encoding protocol | $P_E/P_{\mathrm{SNL}}(\eta)$ (dB) | at $\langle n\rangle_E/\log_2 M$ (photons/bit) | $\eta$ | $P_E/P_{\mathrm{SNL}}$ (dB) | at $\langle n\rangle_E/\log_2 M$ (photons/bit) | $\langle n\rangle_E/\log_2 M$ at $P_E=0.1$ (photons/bit) | $\langle n\rangle_{\mathrm{SNL}}/\langle n\rangle_E$ at $P_E=0.1$ | $\langle n\rangle_{\mathrm{SNL}}(\eta)/\langle n\rangle_E$ at $P_E=0.1$ | References |
|---|---|---|---|---|---|---|---|---|---|
| OOK | −2.2 | 2 | 0.35 | −0.31 | 0.29 | ^{a} | ^{a} | ^{a} | 57 |
| | −0.5 | 0.2 | −0.75 | 0.2 | 0.7 | ^{a} | ^{a} | ^{a} | 20 |
| BPSK | −0.77 | 0.44 | 0.55 | ^{b} | ^{b} | 0.73 | 0.56 | 1.03 | 54 |
| | −0.42 | 0.21 | 0.91 | −0.15 | 0.21 | 0.41 | 1 | 1.1 | 22 |
| | −6 | 7 | 0.72 | ^{b} | ^{b} | 0.5 | 0.82 | 1.14 | 60 |
| | −4 | 2.2 | 0.58 | ^{b} | ^{b} | 0.78 | 0.53 | 0.91 | 55 ^{c} |
| 4-PPM | −2.3 | 1.6 | 0.4 | ^{b} | ^{b} | ^{a} | ^{a} | ^{a} | 51 |
| 4-PSK | −0.22 | 1.51 | 0.53 | ^{b} | ^{b} | ^{a} | ^{a} | ^{a} | 59 |
| | −13 | 4.5 | 0.72 | −6.7 | 5.5 | 1.25 | 1.07 | 1.48 | 67 |
| | −27 | 10 | 0.72 | −14 | 10 | 1.25 | 1.07 | 1.48 | 69 |
| | −6.8 | 2 | 0.7 | −3.7 | 2 | 1.02 | 1.31 | 1.87 | 73 |
| | −8.9 | 4.7 | 0.65 | −1.7 | 4.2 | 1.3 | 1 | 1.58 | 68 ^{c} |
| | −6.3 | 2 | 0.75 | −3.7 | 2 | 1 | 1.33 | 1.79 | 75 |
| 4-CFSK | −11 | 2.7 | 0.75 | −7.1 | 2.7 | 0.84 | 1.49 | 1.98 | 49 |
| 8-CFSK | −7.1 | 2 | 0.75 | −3.1 | 2 | 1 | 1.25 | 1.67 | 49 |
| 16-CFSK | −2.6 | 1 | 0.75 | −0.30 | 1 | 1.28 | 0.95 | 1.27 | 49 |
| 8-PSK | −3.8 | 3.1 | 0.75 | −1.86 | 3.1 | 2.42 | 1.27 | 1.69 | 75 |
| 16-PSK | −2.7 | 7.4 | 0.75 | −1.2 | 6.3 | 7.75 | 1.15 | 1.33 | 75 |

^{a} Experimentally measured SER is above $P_E=0.1$.

^{b} Experimental SER does not surpass the absolute SNL.

^{c} Experiments at the telecom wavelength (1550 nm).
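The difference between the adjusted and the absolute SNL can be illustrated numerically. The following is a minimal sketch for BPSK, assuming the ideal homodyne limit $P_{\mathrm{SNL}}(\eta)=\frac{1}{2}\,\mathrm{erfc}\sqrt{2\eta\langle n\rangle}$ and the convention $10\log_{10}(P_E/P_{\mathrm{SNL}})$ for a ratio in dB. The function names and the sample error rate are hypothetical and are not taken from Table III.

```python
import math

def snl_bpsk(n_mean, eta=1.0):
    """Ideal homodyne error probability for BPSK with mean photon number
    n_mean and system efficiency eta (eta=1 gives the absolute SNL)."""
    return 0.5 * math.erfc(math.sqrt(2.0 * eta * n_mean))

def ratio_db(p_measured, p_reference):
    """Error-rate ratio in dB; negative values mean the measured rate is lower.
    The 10*log10 convention is an assumption of this sketch."""
    return 10.0 * math.log10(p_measured / p_reference)

# Hypothetical experiment: 1 photon per bit, 70% system efficiency,
# and a measured error rate of 3%.
n_mean, eta, p_measured = 1.0, 0.7, 0.03

adjusted_db = ratio_db(p_measured, snl_bpsk(n_mean, eta))
absolute_db = ratio_db(p_measured, snl_bpsk(n_mean))

print(f"relative to adjusted SNL: {adjusted_db:+.2f} dB")
print(f"relative to absolute SNL: {absolute_db:+.2f} dB")
```

In this made-up case the receiver beats the adjusted SNL but not the absolute one, which is the situation marked by footnote b in the table.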

### A. Kennedy receiver

Helstrom determined the fundamental SER bound for the optimum receiver in 1968,^{10} where a projection measurement on a quantum superposition state, often called a “Schrödinger cat state,” was proposed to reach the quantum limit for the binary coherent state encoding. The experimental implementation of the proposed optimal measurement is very difficult because it relies on a superposition basis and entanglement measurements.^{52} This method requires very high-fidelity entanglement and near-unit detection efficiency.^{53} In 1973, Kennedy proposed the first receiver using a simple displacement operation on the input coherent state followed by photon detection.^{17} While the overall performance of the receiver falls short of the HB, the receiver achieves exponentially optimum performance and outperforms the shot-noise limit.^{17} The receiver scheme proposed for BPSK states $|+\alpha\rangle$ and $|-\alpha\rangle$ is shown in Fig. 8(a). The input signal is displaced using a local coherent state and measured using a photon detector. The displacement occurs by interfering the input signal with the local state on a beam splitter. As shown in Fig. 8(a), the local state is set to $|+\alpha\rangle$. Destructive interference occurs for the input signal $|-\alpha\rangle$, which is displaced to vacuum $|0\rangle$, so no photon can be detected. Constructive interference occurs for $|+\alpha\rangle$, such that the output is displaced to $|+2\alpha\rangle$. A brighter output makes the probability of detecting a photon higher. Therefore, in the ideal noiseless case and with perfect displacement, no photons will be detected when the input state is $|-\alpha\rangle$, but there is a probability [proportional to $\exp(-4|\alpha|^2)$] that no photons will be detected when the input state is $|+\alpha\rangle$. This nonzero probability causes a discrimination error.

In spite of the apparent simplicity of the method, experimental implementations^{54,55} fell short of outperforming the absolute SNL due to low system efficiency, non-ideal displacement, and dark noise at the detector. Modified Kennedy receivers use an optimized displacement and a more sophisticated discrimination algorithm. Those receivers unconditionally surpass the SNL in experiments,^{20,22} as discussed below.
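For orientation, the ideal error probabilities behind this discussion can be compared directly. The sketch below assumes equal priors, a lossless and noiseless system, and the standard closed-form expressions for BPSK: $\frac{1}{2}\mathrm{erfc}\sqrt{2|\alpha|^2}$ for homodyne detection (SNL), $\frac{1}{2}e^{-4|\alpha|^2}$ for the Kennedy receiver, and $\frac{1}{2}\bigl(1-\sqrt{1-e^{-4|\alpha|^2}}\bigr)$ for the Helstrom bound. The function names are illustrative.

```python
import math

def p_snl(n_mean):
    """Ideal homodyne (shot-noise-limited) BPSK error probability."""
    return 0.5 * math.erfc(math.sqrt(2.0 * n_mean))

def p_kennedy(n_mean):
    """Ideal Kennedy receiver: the only error is 'no click' for |+2 alpha>,
    which happens with probability exp(-4 n) for half of the inputs."""
    return 0.5 * math.exp(-4.0 * n_mean)

def p_helstrom(n_mean):
    """Helstrom bound for two coherent states with overlap exp(-4 n)."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n_mean)))

for n_mean in (0.1, 0.5, 1.0, 2.0):
    print(f"n = {n_mean:3.1f}:  SNL = {p_snl(n_mean):.3e}  "
          f"Kennedy = {p_kennedy(n_mean):.3e}  HB = {p_helstrom(n_mean):.3e}")
```

Under these assumptions the Kennedy receiver lies between the SNL and the HB at about one photon per bit, but for very faint inputs (for example 0.1 photon) its error probability is above the SNL even in the ideal case.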

### B. Dolinar receiver

Following the proposal of the first quantum receiver using non-Gaussian measurements to beat the shot-noise limit, Dolinar proposed a receiver^{18} that can reach the Helstrom bound for discrimination of binary coherent states. This receiver theoretically approaches the quantum limit in binary state discrimination by using real-time quantum feedback with the so-called optimal displacement and photon counting measurements, i.e., without the need for a “cat-state” measurement.^{10} In contrast to the Kennedy receiver, the displacement amplitude *β* changes continuously. The phase is adjusted every time a photon is detected, i.e., it is determined from the number of photons $n_t$ detected in the time interval $[0,t)$.^{18,56} For on–off keying, the optimal displacement amplitude is given in Ref. 57.

The discrimination decision is based on the total number of photons $n_T$ counted during the entire measurement $[0,T]$, so that $|\alpha\rangle$ ($|-\alpha\rangle$) is chosen when $n_T$ is even (odd), as shown in Fig. 8(b). Formally, the optimal displacement amplitude [Eq. (12)] diverges at the beginning of the pulse, $t=0$, which cannot be practically implemented because of the finite energy of the LO and the saturation of a single-photon detector.

Yet, this issue can be practically alleviated in a laboratory environment. A binary Dolinar-like receiver with finite displacement amplitudes was successfully implemented experimentally in Ref. 57. In that work, the authors demonstrated that, for input signals with a low average number of photons ($n<1$), the OOK receiver not only surpasses the adjusted SNL but also approaches the adjusted HB; for comparison, both the SNL and the HB were adjusted to the system efficiency.

Dolinar's idea of adaptive feedback enabled multiple new quantum receiver configurations. In particular, sub-SNL receivers for *M*-ary encodings were invented and experimentally demonstrated.

### C. Novel quantum receivers and experiments

#### 1. The optimized displacement receiver

A few attempts were made to modify Kennedy receivers to achieve a lower SER. One such enhancement is the optimized displacement receiver (ODR). The Kennedy receiver displaces the input state by interfering it with a local state of equal amplitude. In their theoretical paper, Takeoka and Sasaki proposed to adjust the displacement of the input signal using the local state.^{58} Their ODR uses a local state with an amplitude *β* greater than the input signal amplitude *α*, Fig. 9(a). Because the amplitudes are unequal, the input signal is no longer displaced to vacuum. There are no other changes to the Kennedy design, cf. Fig. 8(a). Since a larger displacement results in a higher probability of photon detection when the input signal state is displaced to $|\alpha+\beta\rangle$, the probability of detecting no photons, $e^{-|\alpha+\beta|^2}$, is reduced from that of the Kennedy receiver. However, because $|-\alpha\rangle$ is no longer displaced to vacuum, there is a possibility to collect photons, which leads to errors. The trade-off between these “false” detections due to the non-ideal vacuum $|\beta-\alpha\rangle$ and the reduced probability to get no clicks for the $|\beta+\alpha\rangle$ state results in an optimization problem. The optimal displacement amplitude *β* minimizes the combined error probability. Experimental implementations of the ODR have shown discrimination error rates below the SNL adjusted for the experimental conditions^{54,59} and unconditionally,^{20,22} i.e., in comparison to the absolute SNL. The most significant improvement in discrimination accuracy is shown for faint coherent states with $|\alpha|^2\approx 1$. The amplitude of the optimized displacement approaches the amplitude of the input state, $|\beta|\to|\alpha|$, as $|\alpha|\to\infty$. A similar optimization of displacement can reduce the discrimination error rate of adaptive feedback receivers for binary and *M*-ary alphabets as well.
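The optimization described above can be sketched numerically. The example assumes an ideal on/off detector, equal priors, real amplitudes, and the error probability $P(\beta)=\frac{1}{2}\bigl[e^{-(\alpha+\beta)^2}+1-e^{-(\beta-\alpha)^2}\bigr]$, i.e., a missed click for $|\alpha+\beta\rangle$ or a false click for $|\beta-\alpha\rangle$. A plain grid search stands in for whatever optimizer one would use in practice; all names are illustrative.

```python
import math

def odr_error(alpha, beta):
    """Error probability of an ideal on/off displacement receiver."""
    p_no_click_plus = math.exp(-(alpha + beta) ** 2)       # missed |alpha+beta>
    p_click_minus = 1.0 - math.exp(-(beta - alpha) ** 2)   # false click, |beta-alpha>
    return 0.5 * (p_no_click_plus + p_click_minus)

def optimize_beta(alpha, span=3.0, steps=30000):
    """Grid search for the best beta in [alpha, alpha + span]."""
    best_beta, best_err = alpha, odr_error(alpha, alpha)
    for k in range(1, steps + 1):
        beta = alpha + span * k / steps
        err = odr_error(alpha, beta)
        if err < best_err:
            best_beta, best_err = beta, err
    return best_beta, best_err

alpha = math.sqrt(0.2)                       # faint input: 0.2 photons
beta_opt, p_odr = optimize_beta(alpha)
p_kennedy = odr_error(alpha, alpha)          # beta = alpha recovers Kennedy
p_snl = 0.5 * math.erfc(math.sqrt(2.0) * alpha)

print(f"beta_opt = {beta_opt:.3f} (alpha = {alpha:.3f})")
print(f"ODR = {p_odr:.4f}, SNL = {p_snl:.4f}, Kennedy = {p_kennedy:.4f}")
```

For this faint input the optimized displacement brings the ideal error probability below the SNL, whereas the plain Kennedy setting $\beta=\alpha$ does not. For bright inputs the search returns $\beta$ essentially equal to $\alpha$, in line with the limit quoted above.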

The discrimination error rate of ODR receivers can be further reduced with photon-number-resolving (PNR) measurements [Fig. 9(b)], which also extend the below-SNL performance of the receiver to higher input energies $|\alpha|^2$.^{22,60} The discrimination threshold is a particular number of photons detected during *T*. If the total number of detections is below that threshold, $|-\alpha\rangle$ is received; otherwise, $|\alpha\rangle$ is received. Note that, with the notable exception of Refs. 20 and 22, where transition edge sensor (TES) detectors were employed, other receivers use a quasi-PNR detector: the clicks of a conventional single-photon avalanche photodiode (SPAD) are counted. The total count of clicks corresponds to the number of photons to within the detector's deadtime, afterpulsing, and dark count probability.^{27,61,62}
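A threshold rule of this kind can be sketched with Poisson statistics. The example assumes an ideal PNR detector, equal priors, real amplitudes, and a constant background mean (a stand-in for dark counts and imperfect interference, which is an assumption of this sketch and not a model from the cited experiments). The function names are illustrative.

```python
import math

def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson variable; returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for n in range(1, k + 1):
        term *= mean / n
        total += term
    return total

def threshold_error(alpha, beta, k, background=0.0):
    """Error probability when '+alpha' is declared for k or more counts."""
    mean_minus = (beta - alpha) ** 2 + background   # input |-alpha>
    mean_plus = (beta + alpha) ** 2 + background    # input |+alpha>
    p_false = 1.0 - poisson_cdf(k - 1, mean_minus)  # k or more counts for -alpha
    p_miss = poisson_cdf(k - 1, mean_plus)          # fewer than k counts for +alpha
    return 0.5 * (p_false + p_miss)

def best_threshold(alpha, beta, background=0.0, k_max=50):
    """Threshold in 1..k_max with the smallest error probability."""
    return min(range(1, k_max + 1),
               key=lambda k: threshold_error(alpha, beta, k, background))

alpha = beta = 2.0                 # 4 photons per bit, Kennedy-like nulling
k_ideal = best_threshold(alpha, beta, background=0.0)
k_noisy = best_threshold(alpha, beta, background=0.5)
print("best threshold without background:", k_ideal)
print("best threshold with background 0.5:", k_noisy)
print("on/off error with background:", threshold_error(alpha, beta, 1, 0.5))
print("PNR error with background:   ", threshold_error(alpha, beta, k_noisy, 0.5))
```

With no background, a single click is already conclusive and the threshold stays at one. Once some background is present, a higher threshold lowers the error probability considerably in this model.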

#### 2. Conditional pulse nulling receiver

Conditional pulse nulling (CPN) receivers are explicitly designed for pulse position modulation (PPM), which is widely used in photon-starved free-space communications due to its high energy efficiency. Dolinar proposed the CPN receiver in 1982.^{50} He theoretically showed that CPN performs near the optimum.^{50} Almost three decades later, the CPN receiver was experimentally demonstrated for 4-ary PPM with a discrimination error below the adjusted SNL.^{51} The experimental scheme of the CPN receiver is shown in Fig. 10(a). The input signal is displaced to vacuum using the local state pulse. The decision strategy for 4-ary PPM is shown in Fig. 10(b). The receiver starts by nulling the pulse in position 1 (Fig. 5). Photon detection (failure) leads to the nulling of pulses in the subsequent steps. If no photons were detected in position 1 (success), then the received state is $|\alpha_1\rangle$. The same strategy is repeated for subsequent positions. The green boxes represent the received state after a discrimination. Even in ideal experimental conditions, errors arise from the Poisson nature of the coherent states, cf. the Kennedy receiver: the displacement of the input signal with a wrong local state does not necessarily lead to photon detection.

#### 3. Multi-stage receivers

The optimal receiver for binary coherent states proposed by Dolinar^{18} requires feedback to adjust the LO as more information about the input state becomes available. A possible modification of the Dolinar receiver that makes it more experimentally feasible breaks the input into segments, or stages, either spatially [Fig. 11(a)] or temporally [Fig. 11(b)]. Then, the measurement result from each segment can be used to choose the best displacement state for the next segment. The number of stages is predefined. Switching rules can be represented as a decision-making tree that is typically precomputed. It can be shown^{63,64} that with the proper choice of the displacement intensity at each stage *n* ($|\beta_n|^2>|\alpha_i|^2$, cf. Dolinar receiver) and in the limit of an infinite number of stages, such a multi-stage receiver can optimally discriminate binary states. Thus, choosing the same LO intensity for all stages does not enable HB-limited discrimination even when that intensity is optimized.

For example, the BPSK input state $|\alpha_i\rangle$ is split into *m* copies with equal intensity, Fig. 11(a). Thus, the energy of the input to each stage is reduced by a factor of *m*. Each attenuated copy of the state is sent to a displacement setup. An optical delay is inserted in each stage so that the measurement on the (*n*+1)th stage does not start before the measurement on the *n*th stage is completed. For the first stage, an arbitrary state of the LO is chosen. If the LO matches the input, the input state is displaced to vacuum and no photons will be detected; otherwise, a photon can be detected. To achieve close to optimal performance, the value of $|\beta_n|^2$ should be corrected at each stage, but the phase of $|\beta_n\rangle$ only changes with photon detection. The potential drawback of this scheme is that the number of optical elements and single-photon detectors grows with the number of stages. Excess loss of the optical signal occurs due to imperfect optical components. In addition, the alignment of the multi-stage setup may be complicated.

A signal pulse can also be divided into equal temporal intervals rather than spatially. In this case, just one LO with feedback and one detector are needed. As before, the feedback is used to update the LO after each measurement segment of equal duration *T*/*m*. Figure 11(b) shows the experimental scheme of the multi-stage receiver with temporal stages. During each measurement segment, the strategy tests the hypothesis that the most probable input signal is $\alpha_i$. At the end of the signal pulse *T*, the final Bayesian probabilities are computed, and the hypothesis with the highest probability is used to make the discrimination decision. The main drawback of temporal segmenting is the need for faster detectors and electronic components. The deadtime of single-photon detectors is yet another obstacle.

The idea of adjusting the feedback after each photon detection can be generalized to *M*-ary communication protocols, although the optimal feedback algorithm is not known. An *M*-ary discrimination strategy that uses *m* measurement stages, where the LO can be adjusted after each stage, was proposed in Ref. 24. In that paper, the signal is split on a beam-splitting tree, where each measurement stage has its own LO and detector. Formally, the Bayesian probabilities for the possible input states are calculated based on the outcome of the measurement (click or no click) in each stage. The Bayesian probabilities are used to set the LO of the next stage to verify the most probable hypothesis. Finally, after all *m* measurement results are known, the Bayesian probabilities are updated one last time and the most likely hypothesis is used as the discrimination outcome. This proposal does not consider optimizing the LO intensity separately for each stage. The spatial multi-stage approach was further investigated theoretically for $M=3,4$ PSK in Ref. 65. In Ref. 66, the authors investigate the theoretical performance of the 4-PSK receiver with added PNR capabilities.
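The click/no-click Bayesian strategy can be written down compactly. The following is a minimal sketch for *M*-ary PSK, assuming ideal on/off detectors, perfect displacement, an LO amplitude matched to the tested hypothesis (no intensity optimization), and equal splitting into *m* stages so that each stage sees a mean photon number $|\alpha_j-\alpha_k|^2/m$. It enumerates every click pattern to obtain the exact error probability of this idealized model; the function names are illustrative and not taken from Ref. 24.

```python
import cmath
import math

def psk_alphabet(M, n_mean):
    """M coherent amplitudes of equal energy n_mean on a circle."""
    amp = math.sqrt(n_mean)
    return [amp * cmath.exp(2j * math.pi * k / M) for k in range(M)]

def bayes_update(prior, alphabet, lo_index, click, m):
    """Posterior over hypotheses after one stage nulled with alphabet[lo_index]."""
    lo = alphabet[lo_index]
    post = []
    for p, a in zip(prior, alphabet):
        p_no_click = math.exp(-abs(a - lo) ** 2 / m)
        post.append(p * ((1.0 - p_no_click) if click else p_no_click))
    norm = sum(post)
    return [x / norm for x in post]

def error_probability(M, n_mean, m):
    """Exact error probability of the m-stage strategy (ideal components)."""
    alphabet = psk_alphabet(M, n_mean)

    def p_correct(true_idx, prior, stage):
        best = max(range(M), key=lambda j: prior[j])   # most probable hypothesis
        if stage == m:
            return 1.0 if best == true_idx else 0.0
        p_no_click = math.exp(-abs(alphabet[true_idx] - alphabet[best]) ** 2 / m)
        total = 0.0
        for click, p in ((False, p_no_click), (True, 1.0 - p_no_click)):
            if p > 0.0:                                 # skip impossible branches
                nxt = bayes_update(prior, alphabet, best, click, m)
                total += p * p_correct(true_idx, nxt, stage + 1)
        return total

    uniform = [1.0 / M] * M
    return 1.0 - sum(p_correct(j, uniform, 0) for j in range(M)) / M

for stages in (1, 2, 5, 10):
    print(stages, "stages:", error_probability(4, 2.0, stages))
```

A click in a stage rules out the hypothesis that was being nulled, so its posterior probability drops to zero, while a missing click only makes it more likely. With a single stage the strategy cannot separate four states, and the error probability falls as stages are added.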

Temporal adaptive receivers can also be generalized to longer alphabets, Fig. 11(b). The temporal adaptive receiver design was used in the experimental demonstration of the 4-PSK quantum receiver that unconditionally surpassed the SNL.^{67} A similar design was used for the first demonstration of the 4-PSK receiver at a telecom wavelength.^{68} A more sophisticated version of this receiver counts the number of photons in each measurement. This approach enables more precise Bayesian calculations and especially helps with sub-SNL measurements of mesoscopic input states. The information about the number of detected photons is particularly helpful against experimental imperfections such as dark counts and non-ideal visibility. Thus, a lower discrimination error probability can be achieved. In Ref. 69, SPAD-based quasi-PNR detection was used. The authors extended the sub-SNL performance of their receiver to inputs with more than 20 photons per pulse on average.^{69} They achieved a record SER (below $10^{-6}$). A similar quasi-PNR enhancement with a SPAD detector was used to optimize other multi-stage receivers.^{66,70–72} Adjusting the intensity of the LO is yet another path to sensitivity improvement. In Ref. 73, the theoretical model of displacement for *M*-ary receivers is optimized by optimizing $|\beta|^2$ at each step, and an error rate unconditionally below the SNL is experimentally demonstrated.

#### 4. Time-resolving receivers

Another class of receivers consists of one displacement module and one single-photon detector and uses single-photon detection times for discrimination. Unlike multi-stage receivers, such a receiver provides instantaneous feedback to switch the LO state right after each photon detection. By design, the receiver can test an unrestricted number of hypotheses and allocates the optimal time to verify each hypothesis. Owing to the nature of coherent states, with a sufficiently fast detector, the probability to detect more than one photon in the field is negligible. Therefore, PNR detection is not required.

The first receiver of this class was introduced by Bondurant.^{19} The Type-I Bondurant receiver probes hypotheses in a simple sequential order and uses the hypothesis at time *T* as the discrimination decision, Fig. 12(a), while the Type-II receiver uses the same sequential order but compares photon interarrival times to make the final discrimination decision. Bondurant receivers have a near-optimal performance for 4-PSK state discrimination, where a Type-II receiver outperforms the Type-I receiver at low input energies. The probing is executed by switching the local state from one hypothesis to the next, $\alpha_1\to\alpha_2\to\dots\to\alpha_M$, until all hypotheses are tested or no more clicks are detected. In a practical setting, a detection event can be induced by a dark count or non-ideal displacement. After any photon detection, the Bondurant receiver discards the hypothesis and never tests it again, leading to extra errors. A cyclic strategy can correct some of these errors. A cyclic receiver is similar to the Bondurant Type-I receiver, except that after testing the last state of the alphabet, $\alpha_M$, it switches back to the first state, $\alpha_1$, and continues the measurement until the end of the pulse *T*.^{26} The cyclic receiver was demonstrated experimentally.^{74} The measured SER is unconditionally better than the SNL for 4-PSK, 8-PSK, 4-CFSK, and 8-CFSK encodings.

A much better result can be obtained if the time-resolving quantum receiver uses both instantaneous feedback and Bayesian inference.^{48} A Bayesian classifier uses the knowledge about the prior local state and a photon arrival time to predict the most probable input state after each photon detection. This strategy converges to the right hypothesis with a minimal number of photon detections, and it can be applied to any encoding.^{75} The strategy works best if the encoding is developed to take advantage of the instantaneous feedback.^{48,49} This holistic approach, in which the receiver and the encoding are developed side by side, has resulted in record low error rates in the discrimination of large alphabets with faint signals ($|\alpha|^2\lesssim 1$ photon per bit). This receiver is shown to perform unconditionally below the SNL for $M\le 16$ alphabets, the largest number of states in an alphabet reported to date.^{49}
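A time-resolving Bayesian receiver of this kind can be simulated in a few lines. The sketch below is an idealized model and not the algorithm of Refs. 48 and 49: it assumes *M*-ary PSK, a flat pulse of duration *T*, perfect displacement, no dark counts, no deadtime, and zero feedback latency. Under hypothesis *j* with the LO set to hypothesis *k*, photons arrive as a Poisson process with rate $|\alpha_j-\alpha_k|^2/T$. All names are illustrative.

```python
import cmath
import math
import random

def psk_alphabet(M, n_mean):
    amp = math.sqrt(n_mean)
    return [amp * cmath.exp(2j * math.pi * k / M) for k in range(M)]

def update(prior, alphabet, lo_index, dt, T, click):
    """Posterior after waiting dt with the LO at alphabet[lo_index].
    click=True means the wait ended with a photon detection."""
    lo = alphabet[lo_index]
    post = []
    for p, a in zip(prior, alphabet):
        rate = abs(a - lo) ** 2 / T
        likelihood = math.exp(-rate * dt) * (rate if click else 1.0)
        post.append(p * likelihood)
    norm = sum(post)
    return [x / norm for x in post]

def receive(true_idx, alphabet, rng, T=1.0):
    """Simulate one pulse; returns the index of the declared state."""
    M = len(alphabet)
    post = [1.0 / M] * M
    t = 0.0
    while True:
        lo = max(range(M), key=lambda j: post[j])       # test the best hypothesis
        rate = abs(alphabet[true_idx] - alphabet[lo]) ** 2 / T
        wait = rng.expovariate(rate) if rate > 0.0 else math.inf
        if t + wait >= T:                               # pulse ends without a click
            post = update(post, alphabet, lo, T - t, T, click=False)
            break
        post = update(post, alphabet, lo, wait, T, click=True)
        t += wait
    return max(range(M), key=lambda j: post[j])

rng = random.Random(1)
alphabet = psk_alphabet(4, 2.0)
trials = 2000
errors = sum(receive(k % 4, alphabet, rng) != k % 4 for k in range(trials))
print("simulated SER:", errors / trials)
```

Each detection eliminates the hypothesis that was being tested and re-weights the remaining ones by how long the receiver had to wait, so the LO switches immediately to the next most probable state.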

### D. Summary of displacement receivers

In summary, a direct comparison of different displacement receivers is not always possible. For binary protocols, the optimal measurement is theoretically possible; measurement schemes that are asymptotically optimal have a clear advantage. For longer alphabet lengths, displacement measurements are not optimal. Theoretically, time-resolving protocols and the protocols that adjust LO intensity throughout the measurement are the most advantageous.

In experiment, practical considerations may play the decisive role. In general, based on experimental evidence, Table III, the protocols that take advantage of photon number resolution perform particularly well for brighter input states. Time-resolving protocols perform better with dimmer input states (with $\approx 1$ photon/bit and lower). This is because detectors have deadtime and the feedback components have latency; therefore, fewer feedback cycles may be practically advantageous. Other considerations include the following:

- Transmission loss and detection efficiency. Both properties reduce system efficiency and reduce the unconditional advantage over the absolute SNL.
- Imperfect alignment of the displacement reduces both the conditional and the unconditional advantage of the quantum measurement, but can be partially mitigated by including the inefficiency in the feedback model.
- Background and dark counts similarly reduce both the conditional and the unconditional advantage of the quantum measurement, and can be partially mitigated by adjusting the feedback model.

We see that spatial multiplexing can remedy time delays, but it may introduce higher losses and alignment issues. The choice of the optimal modulation protocol and alphabet length may also depend on experimental and/or practical conditions. In making the choice, it is important to consider both the conditional and the unconditional performance of a receiver (Table III): the conditional performance shows the degree of the advantage made specifically by a non-classical measurement, whereas the unconditional performance reveals the system efficiency penalty.

## V. NEW TRENDS

In Sec. IV, we discussed theoretical and experimental achievements in coherent state discrimination with displacement-based quantum receivers. The field of quantum measurement is very active, and many new ideas for using quantum measurement in optical networks have emerged. Here we briefly discuss new research directions that in our view have a significant practical potential.

### A. Noisy communication channels

Realistic communication channels may distort and contaminate communication signals. Given that the theory of quantum receivers assumes noiseless channels, it is important to understand whether the quantum measurement advantage extends to channels with noise. An important realistic channel model is the non-Gaussian channel with bosonic phase noise. In Ref. 76, the authors investigate a communication strategy over channels with phase noise and demonstrate that quantum measurement may be advantageous. In particular, the authors optimize the displacement of the BPSK signals by varying the LO amplitude, cf. Ref. 58, paired with a Kennedy-like receiver that takes advantage of PNR, cf. Ref. 60 [Fig. 13(a)]. They demonstrated an SER below the homodyne limit adjusted for the system efficiency of 72% in the presence of phase noise. A similar strategy for a channel with thermal noise is considered in Ref. 77. The authors theoretically demonstrate that a PNR-enabled Kennedy-like receiver with the optimized displacement (see Refs. 58 and 60) can surpass the SNL when the average number of thermal photons is smaller than 0.2. Practical implementations of many quantum receivers require interferometric stability of the communication channel or a pilot signal providing the reference phase. In long-distance communication, it may be challenging to interferometrically stabilize the communication channel. In Ref. 78, the authors experimentally demonstrate a phase-tracking protocol for quantum receivers that corrects for time-varying phase noise and keeps the SER below the SNL.

### B. Discrimination of optical states other than coherent states

So far we considered coherent states as communication carriers. This is because coherent states of light are widely used for communication. Other types of states can be discriminated using quantum methods as well. The optimal discrimination of optical states with non-Poissonian photon number statistics^{81–84} has recently attracted a lot of interest. In these new experiments, ancillary coherent states are used for displacement in a receiver. Clearly, the perfect displacement of a non-Poissonian state to a vacuum state with a coherent state is impossible. Still, the probability to detect at least one photon can be significantly increased for one type of input and significantly reduced for the other.

In Ref. 79, the authors investigate a binary communication channel that uses squeezed vacuum states as information carriers. The information is encoded by displacing the squeezed vacuum state by $\hat{D}(\pm\alpha)$,^{16} resulting in two displaced squeezed states (DSS) $|\pm\mathrm{DSS}\rangle$ with opposite phases [cf. BPSK, see Fig. 13(b)]. Squeezing of one of the quadratures of the carrier gives a smaller overlap between the DSS states in comparison to coherent states with the same average number of photons. Thus, the discrimination error probability for the squeezed states, in theory, may fall below the Helstrom bound for BPSK with coherent states in the absence of loss. When the channel has some phase noise, but no significant loss, even a homodyne-based “classical” receiver can approach the quantum optimum.
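The effect of squeezing on the overlap can be checked with a short calculation. The sketch assumes pure states, no loss, a real displacement $\pm\alpha$ along the squeezed quadrature with squeezing parameter $r$, the overlap $|\langle -\mathrm{DSS}|+\mathrm{DSS}\rangle|^2=\exp(-4\alpha^2e^{2r})$, and the total mean photon number $\alpha^2+\sinh^2 r$. These are standard Gaussian-state results, stated here as assumptions of the example and not as formulas from Ref. 79; the function names are illustrative.

```python
import math

def helstrom(overlap_sq):
    """Helstrom bound for two equiprobable pure states."""
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap_sq))

def helstrom_coherent(n_mean):
    """BPSK with coherent states of mean photon number n_mean."""
    return helstrom(math.exp(-4.0 * n_mean))

def helstrom_dss(n_mean, r):
    """Displaced squeezed states with the same total energy n_mean;
    the photons spent on squeezing, sinh(r)^2, are taken from the displacement."""
    n_squeezing = math.sinh(r) ** 2
    if n_squeezing > n_mean:
        raise ValueError("squeezing alone exceeds the energy budget")
    alpha_sq = n_mean - n_squeezing
    return helstrom(math.exp(-4.0 * alpha_sq * math.exp(2.0 * r)))

n_mean = 1.0
print("coherent states:", helstrom_coherent(n_mean))
for r in (0.0, 0.2, 0.4):
    print(f"DSS, r = {r}:", helstrom_dss(n_mean, r))
```

In this lossless model a moderate amount of squeezing lowers the bound below the coherent-state value at the same energy, which is the behavior described above.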

In Ref. 85, the fundamental quantum limit for discrimination error probability between a coherent and a thermal optical state is computed. Additionally, error probability bounds for direct detection, coherent homodyne detection, and the Kennedy-like receiver are given. The generalization of the Kennedy receiver for discrimination of coherent and thermal states with a low average photon number is shown to closely approach the quantum limit.

The displacement-based discrimination strategies used by quantum receivers were recently adopted for the discrimination of single-rail qubits, i.e., superpositions of the vacuum state with a single photon. In Refs. 80 and 86, the authors theoretically and experimentally investigate a receiver for orthogonal single-rail qubits $|\pm\rangle=(|0\rangle\pm|1\rangle)/\sqrt{2}$ [see Fig. 13(c)]. The authors have shown that their setup can discriminate the superposition states using weak coherent states for displacement. Both input states have vacuum and single-photon components. After coherent state displacement, the resulting states have distinct photon-number statistics, Fig. 14. This difference in mean photon numbers can be assessed with a single-photon detector. A feedback discrimination strategy generalized for single-rail qubits yields an SER below that of perfect homodyne detection. These results can facilitate the implementation of quantum information processing protocols using single-rail qubits.
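The photon-number statistics after displacement can be computed from the Fock-basis amplitudes. The sketch assumes a real displacement $\beta$ and uses $\langle n|\hat{D}(\beta)|0\rangle=e^{-\beta^2/2}\beta^n/\sqrt{n!}$ and $\langle n|\hat{D}(\beta)|1\rangle=e^{-\beta^2/2}(n\beta^{n-1}-\beta^{n+1})/\sqrt{n!}$. The sign convention of the displacement and the function names are choices made for this example and do not describe the setup of Refs. 80 and 86.

```python
import math

def photon_distribution(sign, beta, n_max=40):
    """P(n) for the displaced qubit D(beta)(|0> + sign*|1>)/sqrt(2), real beta."""
    probs = []
    envelope = math.exp(-beta ** 2 / 2.0)
    for n in range(n_max + 1):
        root = math.sqrt(math.factorial(n))
        c0 = beta ** n / root                                    # from |0>
        c1 = ((n * beta ** (n - 1) if n > 0 else 0.0)
              - beta ** (n + 1)) / root                          # from |1>
        amplitude = envelope * (c0 + sign * c1) / math.sqrt(2.0)
        probs.append(amplitude ** 2)
    return probs

beta = 1.0
p_plus = photon_distribution(+1, beta)
p_minus = photon_distribution(-1, beta)
print("P(no click | +):", p_plus[0])
print("P(no click | -):", p_minus[0])

# On/off decision: declare '-' when nothing is detected, '+' otherwise.
error = 0.5 * (p_plus[0] + (1.0 - p_minus[0]))
print("on/off error probability:", error)
```

With this convention and $\beta=1$, the vacuum component of the displaced $|+\rangle$ state vanishes, so a missing click points to $|-\rangle$. The residual error comes from the clicks that $|-\rangle$ still produces.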

### C. Quantum unambiguous state discrimination

Displacement-based quantum receivers can be employed for so-called unambiguous state discrimination (USD).^{71,87,88} Unlike a typical receiver, whose goal is to provide the best guess for all input states, a USD receiver aims at error-free discrimination of states and rejects the measurement as inconclusive when that cannot be done. In Ref. 71, sub-SNL USD is experimentally demonstrated for BPSK. Later, sub-SNL USD was extended to 4-PSK in Refs. 87 and 88.

### D. Optimal quantum measurements

We saw that displacement receivers are optimal for some encodings. For other encodings, displacement receivers cannot reach the HB. There is an alternative to displacement measurements, however. For instance, an optimal projective measurement with the help of quantum states, such as cat states,^{89} has been proposed and experimentally implemented. To our knowledge, this work is the only experimental effort to date that enables a quantum receiver that is not based on coherent state displacement. Yet another idea is to take advantage of an ancillary quantum system, such as a single atom.^{90–92} In these proposals, the input light field is mapped onto a discrete set of atomic states, followed by a projection measurement. Near-optimal discrimination of BPSK, *M*-PSK, and *M*-ASK (amplitude shift keying) encodings has been discussed. An efficient interaction of the light field with an ancilla atom is required, which may be challenging to implement experimentally with today's technology. Another theoretical proposal shows how to design the optimal receiver for an arbitrary alphabet length and an arbitrary modulation scheme with the help of a universal quantum computer. The input signal is split into *m* copies, each of which is transferred to the quantum computer. The quantum computer performs *m* unitary operations on the ancilla quantum register. The final state of the ancilla register is measured to arrive at the discrimination result.^{93} This idea uses two properties of coherent states: first, splitting a coherent state produces coherent states with the same properties, except for amplitudes; second, a coherent state with a sufficiently small amplitude is well approximated by a single-rail qubit (cf. Refs. 80 and 86). The problem of discriminating coherent states is thus reduced to discriminating multicopy single-rail qubit states by a sequential coherent-processing receiver.^{94}

### E. Artificial intelligence in communication

One of the interesting future directions for quantum receivers is the possible use of artificial intelligence for real-time feedback and discrimination. Recently, artificial neural networks were successfully applied to reduce the error probability of a classical communication system, achieving the classical optimal limit.^{95} Replacing or pairing Bayesian inference with artificial neural networks could optimize the feedback strategy and reduce the error rates of quantum receivers in practical settings.

## VI. THE QUANTUM MEASUREMENT ENHANCED CLASSICAL INTERNET OF THE FUTURE? (IN LIEU OF CONCLUSION)

As is evident by now, discrimination error rates below the shot-noise limit have been achieved for coherent states in many laboratories and for different encoding methods. The properties of displacement-based quantum receivers using non-Gaussian measurement have been extensively studied. The field, however, is still at an early stage. Indeed, just one experimental report achieved SERs unconditionally below the classical limit at a telecom wavelength,^{68} while other proof-of-principle experiments either use visible light or cannot unconditionally surpass the SNL.^{55} Conventional communication systems, on the other hand, are very successful, mature, and competitive. Let us discuss the possible future of quantum technologies for classical communication.

Figure 15 shows the channel resource use required for nearly fault-free communication ($P_e=10^{-5}$) using traditional modulation methods with ideal classical detection. This theoretical plot does not consider channel noise in a practical communication link, which would make energy requirements significantly greater. The sources of such noise include in-line optical amplifiers, cross-talk between wavelength-multiplexed channels, nonlinear effects in fiber, and dark noise of detectors. On the other hand, this plot does not take error correction into account, which can somewhat relax the energy requirements. Yet, we believe that this curve is a good estimation for the threshold of classical technologies. Some classical systems that are currently near this threshold use single-photon detectors^{96–99} because of their low dark noise.^{27} We see that quantum measurement could potentially reduce channel energy requirements from this threshold by more than one order of magnitude while not requiring more bandwidth.
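To make the notion of a target error probability concrete, the sketch below estimates the mean photon number per symbol needed to reach a given error probability for the simplest case, BPSK with states $|\pm\alpha\rangle$. It uses the standard textbook expressions for the Helstrom bound, $\tfrac{1}{2}\left(1-\sqrt{1-e^{-4|\alpha|^2}}\right)$, and for ideal homodyne detection, $\tfrac{1}{2}\,\mathrm{erfc}\left(\sqrt{2}\,|\alpha|\right)$. It is not a reproduction of Figure 15, which covers other modulation formats; the function names and the bisection bounds are assumptions made for this illustration.

```python
import math


def helstrom_bpsk(n: float) -> float:
    """Helstrom bound for BPSK coherent states |+a>, |-a> with n = |a|^2."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-4.0 * n)))


def homodyne_bpsk(n: float) -> float:
    """Error probability of ideal homodyne detection (shot-noise limit) for BPSK."""
    return 0.5 * math.erfc(math.sqrt(2.0 * n))


def photons_for_target(error_fn, target: float,
                       lo: float = 0.0, hi: float = 50.0) -> float:
    """Bisection for the mean photon number at which error_fn drops to target.

    Assumes error_fn decreases monotonically with n and that the answer
    lies between lo and hi (illustrative bounds).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if error_fn(mid) > target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    target = 1e-5
    n_hb = photons_for_target(helstrom_bpsk, target)
    n_snl = photons_for_target(homodyne_bpsk, target)
    print(f"Helstrom bound: {n_hb:.3f} photons per symbol")
    print(f"Homodyne (SNL): {n_snl:.3f} photons per symbol")
    print(f"Ratio SNL/HB:   {n_snl / n_hb:.2f}")
```

For BPSK the gap between the two limits is modest; the same bisection can be applied to any other error-probability expression, for example those for larger alphabets, by passing a different `error_fn`.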

In certain cases, for instance, for photon-starved communication links, reducing channel energy requirements may be the goal, which is achievable by switching to a quantum measurement at the receiver. However, reducing channel energy requirements does not automatically reduce the total energy consumption of the entire communication link. In fact, the total energy requirements of a state-of-the-art communication link using quantum receivers can be higher than those of a link using classical receivers. Below, we discuss whether reducing the total energy consumption of communication systems using quantum measurement is fundamentally possible. We also list the major technological obstacles that prevent such an energy reduction.

In order to tame the power needs of telecom links, all components of a communication system should be taken into account. The power requirements of some electronic components scale proportionally to the optical power used, and those components dominate the power budget of fast (>10 GB/s) optical communication systems.^{100} Quantum measurement reduces the energy of light required to transmit one bit; thus, the power required for those electronic components, excluding the receiver, is reduced proportionally. Displacement quantum receivers require a significantly stronger LO than ideal classical homodyne or heterodyne measurement. On the other hand, consider a long-distance fiber link where the optical loss is significant. The energy savings at the transmitter scale proportionally to loss and eventually overcome the additional optical power needs at the receiver. Yet another issue with quantum receivers is that certain single-photon detectors, such as SPADs, are less energy-efficient than classical detectors. A new generation of single-photon detectors, particularly superconducting nanowire detectors, can use significantly lower currents to reliably register photons than amplified classical detectors, ultimately dissipating approximately 5 aJ per photon detection.^{101,102} Therefore, on balance, long-distance communication systems using quantum receivers can fundamentally be more energy efficient than classical systems. Significant energy savings could also come from a conceptual rethinking of the network topology. Currently, a series of optical amplification stations mitigates light loss in fiber. Amplification stations are used because they require less wall power to operate than a transceiver. If transceiver power requirements could be dropped below those of an amplifier, the topology of the network would significantly change.

Given that a large fraction of the optical noise in current networks is due to amplification and optical power-dependent effects (Raman cross-talk, cross- and self-phase modulation, etc.), the quantum-measurement-based communication system can be made nearly noiseless by reducing optical power. Such a nearly noiseless communication system can naturally support the coexistence of classical and quantum communication channels (such as quantum key distribution and entanglement distribution channels).

This optimistic outlook faces serious technological challenges. Currently, even the best single-photon detectors at telecom wavelengths can count fewer than 100 × 10^{6} photons per second. In addition, adaptive algorithms employed in receivers may require extra time to execute. Thus, per-channel data rates may be lower than those of conventional receivers. Wavelength division multiplexing can alleviate this issue, but it will require denser channel “packing” than is currently used. Such packing would require better frequency stabilization of telecom light sources, multiplexers/demultiplexers with better resolution, etc. Some single-photon detectors, such as superconducting nanowire detectors, require a low ambient temperature to operate. Because these detectors generate very little heat when operating, hundreds of such detectors could share the same cooling module.^{102} Also, the efficiency of state-of-the-art cooling systems is far from theoretically optimal, leaving a lot of room for improvement. Lastly, although most proof-of-principle experiments currently use one laser source for both the signal and the local oscillator, local laser sources with long coherence times and phase control should be used to realize the potential energy savings. To this end, new phase correction protocols are being actively considered. One such protocol^{78} demonstrates phase estimation based on the output of quantum state discrimination, potentially requiring no exchange of phase information between the transmitter and receiver.
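The argument that transmitter savings grow with link loss and eventually outweigh the extra energy spent at a quantum receiver can be sketched with a toy optical energy balance. Everything numerical below is a hypothetical placeholder chosen only for illustration: the photons per bit required at the receiver, the per-bit receiver overhead, the 1550 nm wavelength, and the 0.2 dB/km attenuation are assumptions rather than values from this review, and the model ignores wall-plug efficiency and electronics.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 299_792_458.0    # speed of light in vacuum, m/s


def photon_energy(wavelength_m: float = 1550e-9) -> float:
    """Energy of one photon at the given wavelength, in joules."""
    return H * C / wavelength_m


def transmitter_saving(n_classical: float, n_quantum: float, loss_db: float,
                       wavelength_m: float = 1550e-9) -> float:
    """Optical energy per bit saved at the transmitter.

    n_classical and n_quantum are the photons per bit required at the
    receiver (hypothetical inputs); the launched energy is larger by the
    link loss factor 10**(loss_db / 10).
    """
    if n_classical <= n_quantum:
        raise ValueError("expected n_classical > n_quantum")
    received = (n_classical - n_quantum) * photon_energy(wavelength_m)
    return received * 10.0 ** (loss_db / 10.0)


def break_even_loss_db(n_classical: float, n_quantum: float,
                       receiver_overhead_j: float,
                       wavelength_m: float = 1550e-9) -> float:
    """Link loss (dB) above which the transmitter saving exceeds the overhead."""
    saving_0 = transmitter_saving(n_classical, n_quantum, 0.0, wavelength_m)
    return max(0.0, 10.0 * math.log10(receiver_overhead_j / saving_0))


if __name__ == "__main__":
    # Hypothetical placeholders, not measured values.
    n_classical, n_quantum = 10.0, 1.0
    overhead_j = 1e-15          # assumed extra receiver energy per bit (LO, detection)
    attenuation_db_per_km = 0.2  # assumed fiber attenuation

    loss = break_even_loss_db(n_classical, n_quantum, overhead_j)
    print(f"photon energy at 1550 nm: {photon_energy() / 1e-18:.3f} aJ")
    print(f"break-even loss: {loss:.1f} dB "
          f"(about {loss / attenuation_db_per_km:.0f} km of fiber)")
```

Because the saving grows exponentially with distance while the receiver overhead in this toy model is fixed, any finite overhead is eventually recovered on a sufficiently long unamplified span; where that point lies depends entirely on the assumed inputs.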

In conclusion, in light of the exponential growth of Internet traffic and the capacity crunch,^{1,2} research on practical quantum measurement for communications is of urgent importance. We are cautiously optimistic that quantum technology will be used in the near future, either on a global scale or at least for some niche applications. We hope that our review has helped the curious reader to get acquainted with this exciting field.

## ACKNOWLEDGMENTS

The authors thank Abdella Battou, Alan Migdall, Carl Miller, and Thomas Gerrits for a critical review of this manuscript and fruitful discussions. This work was partially supported by the National Science Foundation through ECCS 1927674.

## DATA AVAILABILITY

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

## References

*, FF1D.1*(