## Abstract

Because a quantum measurement generally disturbs the state of a quantum system, one might think that it should not be possible for a sender and receiver to communicate reliably when the receiver performs a large number of sequential measurements to determine the message of the sender. We show here that this intuition is not true, by demonstrating that a sequential decoding strategy works well even in the most general ‘one-shot’ regime, where we are given a single instance of a channel and wish to determine the maximal number of bits that can be communicated up to a small failure probability. This result follows by generalizing a non-commutative union bound to apply for a sequence of general measurements. We also demonstrate two ways in which a receiver can recover a state close to the original state after it has been decoded by a sequence of measurements that each succeed with high probability. The second of these methods will be useful in realizing an efficient decoder for fully quantum polar codes, should a method ever be found to realize an efficient decoder for classical-quantum polar codes.

## 1. Introduction

The reliable communication of classical data over quantum channels is one of the earliest problems to be considered in quantum information theory. Some of the most important contributions to this problem (to name just a few) are the Holevo upper bound on the accessible information [1], the coding theorem due to [2,3] Holevo–Schumacher–Westmoreland (HSW), and the fact that entangled signalling states can enhance communication rates for certain quantum channels [4].

The main difference between the proofs of the HSW theorem and Shannon's classical channel capacity theorem [5] is that, in the former case, one has to specify a quantum measurement that recovers the classical data being transmitted (a quantum decoder), as opposed to a classical algorithm that does so. Indeed, in their respective proofs, HSW demonstrated that a quantum measurement known as the ‘pretty-good’ or ‘square-root’ measurement [6–8] allows a receiver to decode classical information reliably at a rate equal to the Holevo rate. For a pure-loss bosonic channel modelling free-space communication, for example, this Holevo rate can be significantly higher than data rates that are achievable with more traditional measurement strategies such as homodyne or heterodyne detection [9].

As in the HSW decoding measurement, we typically perform measurements on quantum systems in order to gain information about them, and one well-known feature of quantum mechanics is that a measurement can disturb the state of the system that we are measuring. Thus, it came as a surprise when Lloyd–Giovannetti–Maccone (LGM) [10,11] showed that it is possible to achieve the Holevo rate by performing independent non-commuting sequential measurements, in analogy with classical sequential decoding strategies [12]. A sequential decoding scheme proceeds according to the following simple algorithm:

(1) Let

*M*be the total number of codewords. Initialize a counter*i*=1.(2) Perform a quantum measurement to determine whether the transmitted codeword is the

*i*th codeword.(3) If the measurement result is ‘yes’, decode as codeword

*i*and conclude. If the measurement result is ‘no,’ increment*i*.(4) If

*i*≤*M*, go to step 2. Otherwise, declare failure.

After the work of LGM, Sen [13] presented a remarkable simplification of their error analysis, by establishing a non-commutative union bound that holds for a set of projective measurements applied sequentially to a quantum state (where the projective measurements do not necessarily commute). This non-commutative union bound is an extension of the familiar union bound from probability theory, and as such, it should find wide application in settings beyond those considered in quantum communication theory. Sen applied his non-commutative union bound to a variety of problems in [13], including the problem of classical communication over quantum channels, and it has since been applied in designing Holevo-rate-achieving polar codes for classical-quantum channels [14] and in demonstrating how to decode the pure-loss bosonic channel at the Holevo rate [15].

All of the above results apply to a setting in which the channel is memoryless and identically distributed, so that one use of it does not depend on the others and so that each use leads to the same noise at the output as the other uses, respectively. Given that this ‘IID’ setting is really just an idealization, there has been a strong effort to develop a theory of quantum information that goes beyond the IID setting and applies to channels with no structure whatsoever [16–18]. This regime beyond the IID setting is known as the ‘one-shot’ regime, where we are concerned with a single instance of a resource and desire to make the best use of it up to some controllable failure probability. In this vein, there have been several contributions characterizing the reliable communication of classical data over quantum channels [19–21], and all of these used the ‘pretty-good’ measurement as the decoder.

Many of the developments listed above have improved our understanding of classical communication over quantum channels, but there are some important considerations left unanswered:

— We know very well that the most general kind of measurement allowed in quantum mechanics is a positive operator-valued measure (POVM). Does Sen's bound generalize so that it applies for a sequence of general measurements?

— Does sequential decoding work well in the one-shot regime?

— When can one conclude that the state resulting from a sequence of general measurements is close to the state before this sequence of measurements occurs?

This paper resolves the above problems, by showing that

— Sen's non-commutative union bound applies not just for a sequence of projections, but for the more general case of a sequence of positive operators each with spectrum less than one. This result follows simply by applying the well-known Naimark extension theorem.

^{1}Thus, the non-commutative union bound now applies for a sequence of general measurements (POVMs) and, as such, it should find wide application in other areas of quantum information science.— Indeed, sequential decoding works well even in the one-shot regime. That is, one can give a meaningful bound on the amount of information that can be transmitted up to a failure probability no larger than

*ε*for some*ε*>0 when using a sequential decoding strategy. The information bound we present is very similar to the bound of Wang & Renner [21].— A sequence of measurements followed by the reverse sequence of these measurements causes only a negligible disturbance to a state if the original sequence of measurements has a high probability of success. This last result generalizes Winter's gentle measurement lemma [22] to the more general setting of a sequence of measurements. One application of this last result is in decoding fully quantum polar codes for arbitrary quantum channels [23].

^{2}

We structure this paper as follows. Section 2 reviews some background material, including the definition of the hypothesis testing relative entropy [21,24], the Naimark extension theorem and Sen's non-commutative union bound [13]. We then proceed in the order given above.

## 2. Review

### (a) Hypothesis testing relative entropy

The hypothesis testing relative entropy, denoted as , is an entropy measure derived from the error probabilities arising from a quantum measurement that attempts to distinguish between the states *ρ* and *σ* (a quantum hypothesis test). The most general measurement that one could use in such a test is a two-outcome POVM {*Q*,*I*−*Q*}, where 0≤*Q*≤*I*. The outcome *Q* corresponds to deciding that the state is *ρ*, and the outcome *I*−*Q* corresponds to deciding that the state is *σ*. Thus, the probability of guessing correctly when the state is *ρ* is equal to Tr{*Qρ*}, and the probability of guessing incorrectly when the state is *σ* is equal to Tr{*Qσ*}. In an asymmetric quantum hypothesis test, we try to find a POVM that guesses *ρ* correctly with high probability, so that
2.1for some small, fixed *ε*≥0, while minimizing the probability that we guess *σ* incorrectly. This naturally leads to a semidefinite optimization programme, specified by the following quantity:
2.2By taking the negative logarithm of *β*_{ε}(*ρ*,*σ*), we arrive at the hypothesis testing relative entropy defined in Buscemi & Datta [24] and Wang & Renner [21]:
2.3One can derive other entropic measures based on the hypothesis testing relative entropy that have various natural properties [25].

### (b) Naimark extension theorem

We briefly review the Naimark extension theorem and a straightforward proof of it. The importance of this theorem is that it demonstrates how one can implement a general quantum measurement simply by performing a unitary on the system of interest and a probe system, followed by a von Neumann measurement of the probe.

### Theorem 2.1 (Naimark)

*For any POVM* *acting on a system S, there exists a unitary U*_{SP} *(acting on the system S and a probe system P) and an orthonormal basis* *such that
*2.4

### Proof.

For every POVM {*Γ*_{x}}, we can form the following isometry:
2.5which can be extended to a unitary operator *U*_{SP} by appropriately filling out the other entries of the form:
2.6for some operators *A*_{x,x′} and where . The statement of the theorem then follows easily from this choice of unitary. □

### Example 2.2

Let {*Γ*,*I*−*Γ*} be a binary POVM acting on the system *S*. Consider the following unitary operator *U*_{SP} acting on the system *S* and a qubit probe system *P*:
2.7The above unitary corresponds to a Naimark extension of the POVM {*Γ*,*I*−*Γ*}.

### (c) Non-commutative union bound

This section recalls Sen's [13] non-commutative union bound. As we mentioned in §1, this bound should find wide application in settings beyond those considered for communication, because it generalizes the union bound from probability theory.

### Theorem 2.3 (Sen)

*For a subnormalized state σ such that σ≥0 and Tr{σ}≤1, and a sequence of Hermitian projectors Π*_{1}*,…,Π*_{M}*, the following non-commutative union bound holds:
*2.8

## 3. Non-commutative union bound for positive operator-valued measures

We now give an extension of Sen's non-commutative union bound that applies for general measurements.

### Lemma 3.1

*Let σ be a subnormalized state such that σ≥0 and Tr{σ}≤1, and let Λ*_{1},…,*Λ*_{M} *denote a set of positive operators such that* 0≤*Λ*_{m}≤*I for all m*∈{1,…,*M*}. *Then, the following non-commutative union bound holds*:
3.1*where* *is an ancillary state of M probe systems and Π*_{Λi} *is a projector defined as* , *for some unitary U*_{i} *and projector P*_{i} *such that*
3.2

### Proof.

This extension of Sen's bound follows easily by using the Naimark extension theorem and Sen's non-commutative union bound.

To each POVM element *Λ*_{i} (as in the statement of the theorem), there exists a unitary (acting on the system *S* and the *ith* probe system) and a projector *I*_{S}⊗|*i*〉〈*i*|_{Pi} such that the following relation holds
3.3where
3.4Observe that the operator *Π*_{Λi} is a Hermitian projector, so that Sen's bound applies to each of these operators. Then, (3.1) follows from theorems 2.3 and 2.1. □

### Remark 3.2

Because lemma 3.1 applies for general measurements, it can be used in the context of sections 3 and 4 of Sen [13] without the need for constructing a particular kind of ‘intersection projector’ as is performed there.

## 4. Sequential decoding in the one-shot regime

This section provides a proof for one of our main results: that a sequential decoding strategy works well even in the one-shot regime. More specifically, the theorem bounds the *ε*-one-shot classical capacity of a classical-quantum channel, defined operationally as the maximum number of bits that a sender can transmit to a receiver using such a channel with a failure probability no larger than *ε*. The general idea behind the proof is the same as that in the proof of theorem 1 of Wang & Renner [21], with the exception that we use a sequential decoding strategy and use lemma 3.1 to bound the error probability of this decoding strategy.

### Theorem 4.1

*A sequential decoding strategy leads to the following bound on the ε-one-shot classical capacity C*^{ε}*(W) of a classical-quantum channel* 4.1*for some ε*^{′} *such that ε*^{2}*/4>ε*^{′}*, where ρ*_{XB} *is the following classical-quantum state that depends on the distribution p*_{X}*(x) and the channel W:
*4.2

### Proof.

Fix *ε*≥0 and a distribution *p*_{X}(*x*). Let *Q*_{XB} be an operator such that 0≤*Q*_{XB}≤*I*_{XB} and
4.3where *ε*^{′} is chosen as in the statement of the theorem. We generate a codebook by choosing its codewords *x*_{j} at random, each independently according to *p*_{X}(*x*). Let *A*_{xj} denote the following operator:
4.4From theorem 2.1, we know that to each *A*_{xj} there is associated a qubit probe system *P*_{j}, a unitary *U*_{BPj} and a projector *I*_{B}⊗|1〉〈1|_{Pj} such that for every state *σ*
4.5Furthermore, it follows that for the complementary operator *I*−*A*_{xj}, we have the following relation:
4.6(Because {*I*−*A*_{xj},*A*_{xj}} is a two-outcome POVM, the unitary operator can have the form given in example 2.2.)

For a specific codebook {*x*_{j}}_{j∈[M]}, the decoding strategy of the receiver Bob is as follows. Suppose that the sender Alice wishes to transmit message *m*, so that she transmits codeword *x*_{m} over the channel *W*. Then, the state at the receiver is *ρ*_{xm}. The receiver first appends *M* ancillas, each set to |0〉, to the state *ρ*_{xm} received. Then, the state at the receiving end is as follows:
4.7Bob then checks whether the codeword transmitted by Alice is the first codeword. He does so by performing the unitary corresponding to the first POVM {*I*−*A*_{x1},*A*_{x1}}, and the state becomes
4.8He then measures the probe system *P*_{1} in the computational basis {|0〉〈0|_{P1},|1〉〈1|_{P1}}. If he obtains the outcome |1〉, then he decodes that the first message was sent (in this case, there would be an error if *m*≠1). Otherwise, he performs the inverse of . At this point, if *m*≠1 and if there is no error, the subnormalized state becomes
4.9Making the abbreviations
4.10and
4.11we can write the above subnormalized state as
4.12

The receiver then continues by performing similar actions to determine whether the transmitted codeword was the second one. That is, he performs the unitary corresponding to *A*_{x2}, measures the probe system *P*_{2} in the computational basis and inverts the unitary if he does not receive the outcome |1〉 from the measurement of *P*_{2}.

The success probability of this sequential decoding procedure when the *mth* codeword is sent is equal to
4.13Thus, the error probability is given by
4.14We can then upper bound this error probability by using lemma 3.1:
4.15and
4.16Taking the expectation of the error with respect to all codebooks (but keeping the codeword *x*_{m} fixed) and exploiting concavity of the square-root function, this upper bound becomes
4.17Taking the expectation of the error with respect to the codeword *x*_{m} itself (and again exploiting concavity), this upper bound becomes
4.18Using the facts that
4.19and
4.20we can write the upper bound in (4.18) as
4.21and
4.22Let . By optimizing the choice of the operator *Q*_{XB} with respect to the hypothesis testing relative entropy defined in (2.2) and (2.3), we find the following upper bound on the error *ε*:
4.23Because we proved an upper bound on the expectation of the average error probability with respect to the codebook choice, we can conclude that there exists at least one code with the above bound on its average error probability. Rewriting this upper bound on *ε*, we find that the sequential decoding scheme gives the following bound on the *ε*-one-shot capacity of *W*:
4.24 □

### Remark 4.2

We recover the Holevo rate for communication by considering a memoryless classical-quantum channel and evaluating a limit as the number of channel used tends to infinity. We do not discuss this point any further here, because Wang & Renner [21] already discussed it in detail.

### Remark 4.3

Of course, it is not actually necessary to use *M* ancillas when decoding. After performing each measurement, the receiver could store the result in a classical memory and simply refresh a single ancilla to the state |0〉.

### Remark 4.4

The proof of the above theorem and lemma 3.1 make it clear that one can always use Sen's bound in the error analysis for any random coding classical communication scheme of the above form, thus serving as a substitute for the well-known bound in lemma 2 of Hayashi & Nagaoka [19]. However, the performance is slightly worse than that obtained with the Hayashi–Nagaoka bound owing to the square root on the right-hand side of Sen's bound (one can see this explicitly by comparing theorem 4.1 with theorem 1 of [21]).

### Remark 4.5

The operation of the sequential decoder is similar in spirit to the conditional pulse nulling receiver introduced in Guha *et al.* [26] and experimentally implemented in Chen *et al.* [27], in the sense that it proceeds by performing a unitary operation, a projection and the inverse of the unitary for every codeword in the codebook.

### Remark 4.6

We can also use sequential decoding for a task known as one-shot classical data compression with quantum side information [28–30]. In such a task, the sender and receiver are given a classical-quantum state of the form , where the sender has the system *X* and the receiver the system *B*. The goal is for the sender to transmit as few classical bits as possible to the receiver, such that he can recover the register *X* up to a failure probability no larger than some *ε*>0. In this case, we can show that the number of bits that need to be sent is related to the conditional hypothesis testing entropy [25,30], defined as
by exploiting the same kind of proof as given in Renes & Renner [29] and Tomamichel & Hayashi [30] combined with our proof given above. The use of sequential decoding in the IID setting for this task was first performed in §4 of Wilde *et al.* [31].

### (a) Performing sequential decoding coherently

We can also consider a fully coherent implementation of the sequential decoding strategy (i.e. with unitary operations alone). For the sake of simplicity, let |*ψ*〉 denote the state on which the coherent sequential decoding operations will act. As before, the procedure begins by the receiver appending *M* probe ancillas, so that the state becomes
4.25The receiver first performs the unitary corresponding to the first codeword, leading to
4.26Rather than perform an incoherent projection of the probe, the receiver can perform a controlled-NOT operation from the first probe system to another ancillary system *A*_{1} initialized in the state |0〉_{A1}. This leads to the state:
4.27The receiver then performs the inverse unitary , and by using the shorthand in (4.11) and (4.10), we can write the resulting state as
4.28Continuing a similar procedure for the second codeword leads to the expansion:
4.29and so forth.

## 5. Gentle sequential measurements

Winter's gentle operator lemma has found numerous applications in quantum information theory [22,32].^{3} It states that if a two-outcome measurement has one outcome that occurs with high probability, then the subnormalized post-measurement state is close to the original state. More formally,

### Lemma 5.1 (gentle operator)

*Let ρ be a state, and let Λ be an operator such that* 0≤*Λ*≤*I*. *Then*
5.1*Thus, if* Tr*{Λρ*}≥1−*ε for some small ε*≥0, *then*
5.2

Of course, lemma 5.1 can be extended with the Naimark extension theorem as well. Suppose that we know that
5.3for a two-outcome POVM {*Λ*,*I*−*Λ*}. By theorem 2.1, we know that there exists a unitary *U*_{SP} such that for the orthonormal basis {|0〉_{P},|1〉_{P}}, we have that
5.4where
5.5Thus, if we perform the unitary *U*_{SP}, the projective measurement {|0〉〈0|_{P},|1〉〈1|_{P}}, followed by the inverse unitary , we can conclude from lemma 5.1 that the resulting state of both the system and the probe is close to the original state:
5.6So, for the sake of simplicity, in what follows, we just consider the state *ρ* to be the state of the combined system and any necessary ancillas so that we can consider projective measurements only (this is due to the above observation and the Naimark extension theorem).

In a sequential decoding scheme, we also might like to conclude that the state after the decoding procedure is close to the original state. This would be pleasing conceptually and would also have applications in constructing decoders for quantum data from decoders for classical data [23,34,35]. However, as noted in [13], we cannot generally make the above conclusion. Here, we show how performing additional operations leads to a state close to the original one.

There are at least two ways that we can perform additional operations in order to guarantee that the sequentially decoded state is close to the original one. The first was mentioned at the end of §4.3 of Wilde *et al.* [31] and relies on the polar decomposition. Given that the post-measurement state is of the following form (omitting normalization):
5.7the receiver could perform a unitary *V* given by the polar decomposition
5.8so that the post-measurement state becomes
5.9In this case, we can apply the gentle operator lemma (lemma 5.1) to upper bound the disturbance:
5.10and
5.11and then once again apply Sen's non-commutative union bound (theorem 2.3) to upper bound the disturbance as
5.12(In the context of classical-quantum polar codes, a quantity such as the above will be exponentially small in the number of channel uses because each term is exponentially small while there are only a linear number of terms [14].)

One practical problem with the above approach is as follows. Suppose that we can efficiently implement each of the measurements corresponding to the projections *Π*_{1}, …, *Π*_{N} (say, on a quantum computer). Then, we can clearly perform the sequential decoding procedure efficiently if *N* is not too large. On the other hand, given a particular sequence of measurements, it is not clear at all that the unitary given by the polar decomposition in (5.8) has an efficient implementation. Thus, the above approach does not realize both desirable requirements of having a small disturbance and an efficient sequential decoding (if each of the measurements can be efficiently implemented to begin with).

There is a simple way to remedy the aforementioned problem if the sequential decoding strategy has a very small error probability. We can simply perform the projections *Π*_{1} through *Π*_{N} and then perform them again in the opposite order. This gives the subnormalized post-measurement state
5.13and the error probability is bounded as follows, again by applying Sen's bound (theorem 2.3):
5.14Thus, by performing the measurements again in reverse, we increase only the error probability by a factor of . Furthermore, the gentle operator lemma (lemma 5.1) gives the following upper bound on the disturbance:
5.15
5.16Applying the bound in (5.14) gives the following upper bound on the disturbance:
5.17Thus, with this scheme, we can realize both requirements of having an efficient implementation and a small disturbance—the receiver simply has to perform 2*N* measurements (each of which were assumed to have an efficient implementation) while the disturbance increases only by a factor of . An efficient coherent implementation of these operations follows from the discussion in §4*a*, if each measurement has an efficient implementation.

By the methods of Wilde & Renes [23], this latter approach will be useful for decoding quantum polar codes, should a method ever be found to realize an efficient decoder for classical-quantum polar codes [14] (see [36] for progress in this direction). At the very least, this latter approach answers an open question from Wilde & Renes [23].

## Acknowledgements

I am grateful to Nilanjana Datta and Renato Renner for organizing the ‘Beyond i.i.d. in information theory’ workshop in Cambridge, UK, where some of the questions in this work were raised. I also thank them and Aram Harrow, Patrick Hayden, Olivier Landon-Cardinal, Ligong Wang and Andreas Winter for helpful discussions. I acknowledge support from the Centre de Recherches Mathématiques in Montreal and am grateful for the hospitality of the Center for Mathematical Sciences at the University of Cambridge and the Institute for Theoretical Physics at ETH Zurich during a research visit in January and February of 2013, when some of the work for this paper was completed.

## Footnotes

↵1 This observation is due to Andreas Winter and Aram Harrow from a discussion in December 2011 at QIP 2012.

↵2 Quantum polar codes are the only known near-explicit quantum-error-correcting codes that achieve the coherent information rate of an arbitrary quantum channel.

↵3 In quantum complexity theory, there is a similar lemma known as the ‘almost as good as new lemma’ discovered independently by Aaronson [33].

- Received April 24, 2013.
- Accepted June 13, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.