## Abstract

Let *G*(*A*, *B*) denote the two-qubit gate that acts as the one-qubit *SU*(2) gates *A* and *B* in the even and odd parity subspaces, respectively, of two qubits. Using a Clifford algebra formalism, we show that arbitrary uniform families of circuits of these gates, restricted to act only on nearest neighbour (n.n.) qubit lines, can be classically efficiently simulated. This reproduces a result originally proved by Valiant using his matchgate formalism, and subsequently related by others to free fermionic physics. We further show that if the n.n. condition is slightly relaxed, to allow the same gates to act only on n.n. and next n.n. qubit lines, then the resulting circuits can efficiently perform universal quantum computation. From this point of view, the gap between efficient classical and quantum computational power is bridged by a very modest use of a seemingly innocuous resource (qubit swapping). We also extend the simulation result above in various ways. In particular, by exploiting properties of Clifford operations in conjunction with the Jordan–Wigner representation of a Clifford algebra, we show how one may generalize the simulation result above to provide further classes of classically efficiently simulatable quantum circuits, which we call Gaussian quantum circuits.

## 1. Introduction

Quantum computation is widely regarded as being more powerful than classical computation. Indeed, in some scenarios there are provable benefits, such as an exponential reduction in communication resources for some distributed computing tasks (e.g. Raz 1999), and in quantum cryptography the ability to communicate with unconditional security against eavesdropping. These results depend neither on any computational hardness assumptions nor on the presence of any oracle relativizations. Furthermore, in suitably relativized oracle models of computation, there are various known exponential savings in quantum versus classical query complexity, such as Deutsch & Jozsa (1992) and Simon (1997). However, for pure (unrelativized) computation, there is to date no proof of separation and it is still possible that efficient classical and quantum computational power might coincide, i.e. the complexity classes BPP and BQP might be equal. (Nielsen & Chuang (2000) provide a definition of these classes. Here and below, the term ‘efficient’ is synonymous with ‘polynomial time’.) Note that this is not the same question as the issue of efficient classical simulation of quantum processes, since any BQP algorithm is a quantum process of only a severely restricted kind, required to satisfy infinitely many constraints relative to an infinite set of input states, *viz*. all computational basis states. In this paper, we will study a representation of quantum computation in which the gap (if it exists) between efficient classical and efficient quantum computation appears to be surprisingly fragile, being provably bridged by a very modest use of a seemingly trivial resource (cf. theorems 1.1 and 2.1).

We will provide a self-contained development of a class of quantum circuits based on so-called matchgates, a notion that was introduced in Valiant (2002). Our approach is closely related to the work of Knill (2001), Terhal & DiVincenzo (2002) and DiVincenzo & Terhal (2005), relating matchgate circuits to free fermionic quantum computation (and later further extended in Bravyi (2005, 2008)). Here, we will emphasize the underlying mathematical ingredients and consider some further properties and generalizations that go beyond the fermionic formalism. Indeed, the existence of a physical interpretation in terms of fermionic physics, although interesting, appears to be entirely fortuitous and of no particular consequence for our considerations of computational complexity issues *per se*.

Consider two-qubit gates *G*(*A*, *B*) of the form (in the computational basis)(1.1)where *A* and *B* are both in *SU*(2) or *U*(2) with the *same determinant*. Thus, the action of *G*(*A*, *B*) amounts to *A* acting in the even parity subspace (spanned by |00〉 and |11〉) and *B* acting in the odd parity subspace (spanned by |01〉 and |10〉). Occasionally, we will wish to consider two-qubit gates of the form equation (1.1) but having det *A*≠det *B*. In this case, the gate will be denoted . To emphasize this distinction, we sometimes refer to a gate with det *A*=det *B* as an *allowable G*(*A*, *B*) gate. We will denote the Pauli operators by *X*, *Y* and *Z*.

*Consider any uniform* (*hence poly-sized*) *quantum circuit family comprising only G*(*A*, *B*) *gates such that*

*the G*(*A*,*B*)*gates act on nearest neighbour*(*n.n.*)*lines only,**the input state is any product state**,**and**the output is a final measurement in the computational basis on any single line*.

*Then the output may be classically efficiently simulated*. *More precisely,* *for any k,* *we can classically efficiently compute the expectation value* *,* *where Z*_{k} *is the Pauli Z operator on the kth line*; |*ψ*_{out}〉 *is the final state;* *and p*_{0} *and p*_{1} *are the outcome probabilities*.

Theorem 1.1 is very similar to the classical simulation result of Terhal & DiVincenzo (2002) and Valiant (2002). Our result is more general in the feature of allowing arbitrary product state inputs (rather than just computational basis states). It is more restrictive in considering only single-bit outputs (rather than individual probabilities of computational basis measurements across many lines) and in not encompassing the adaptive circuits that are included in Terhal & DiVincenzo (2002).

The notion of *efficient classical simulation* that we will use in this paper is the following. Let *C*_{n} be any uniform family of quantum circuits together with a specified class of (i) input states (usually taken to be product states, but we may also restrict to just computational basis states) and (ii) output measurements (which we take to be *Z*_{k}, a *Z*-basis measurement on any single line). We say that *C*_{n} is *classically efficiently simulatable* (relative to (i) and (ii)) if the probabilities of measurement outcomes can be *computed* by classical means to *m* digits of accuracy in poly(*n*, *m*) time.

Note that this ability to efficiently *compute* the probabilities to *exponential* accuracy is a rather strong notion, e.g. we might instead adopt weaker criteria, such as the ability to compute the probabilities to accuracy 1/poly(*m*), i.e. *O*(log *m*) digits in poly(*n*, *m*) time, or the ability to *sample* the output probability distribution *once* (by classical efficient probabilistic means, to suitable accuracy). Indeed, the last requirement would suffice in issues of the comparison of quantum with classical computational power but we will in fact achieve the strongest notion above in our results and we thus adopt it as our definition. We make further comments about implications of this strong notion of classical simulation in §7.

Note that if the n.n. *G*(*A*, *B*) circuits in theorem 1.1 are considered with computational basis inputs and also required to satisfy the BQP bounded probability conditions (*viz*. that the output probabilities are always greater than or equal to 2/3 or less than or equal to 1/3), then the ability to classically *calculate* the output probabilities (rather than the ability merely to sample the output distribution once) implies that the corresponding decision problem is not just in BPP but actually in P, i.e. *deterministic* classical polynomial time.

In the following sections we will first prove a universality result for *G*(*A*, *B*) gates, if these gates are also allowed to act on next n.n. lines in addition to the n.n. lines of theorem 1.1. Then, we give some background on the origin of the notion of matchgates, which first lead to the consideration of circuits of *G*(*A*, *B*) gates in Valiant (2002). Next, we consider a formalism of anticommuting variables that form a Clifford algebra, leading to a proof of theorem 1.1. This approach is essentially the one given in Knill (2001) and Terhal & DiVincenzo (2002), but we give some more transparent proofs and we develop further properties. First, we elucidate the n.n. condition (i) in theorem 1.1: we show that the general class of simulatable gates comprises a uniformly describable family that may act on any number of the *n* qubits, in which the *G*(*A*, *B*) gates appear as the subset of n.n. two-qubit gates. Furthermore, we show that all gates in the family can be obtained as *circuits* of n.n. *G*(*A*, *B*) gates (hence, adding nothing new) and that non-n.n. *G*(*A*, *B*) gates are not generally in the family. Second, by considering Clifford *operations*1 (i.e. *n*-qubit unitary operations that normalize the *n*-qubit Pauli group in *U*(2^{n})) in conjunction with the Clifford algebra formalism, we will describe an avenue for generalizing theorem 1.1 and give some examples of simulatable circuits, which cannot be obtained as circuits of n.n. *G*(*A*, *B*) gates only. Since all our classes of classically simulatable circuits comprise gates that are generated by Hamiltonians expressible as quadratic elements of a Clifford algebra, we call them *Gaussian quantum circuits*.

## 2. Universality of n.n. and next n.n. *G*(*A*, *B*) gates

The n.n. condition in theorem 1.1(i) is perhaps a surprising ingredient but it is crucial: it was already mentioned in Terhal & DiVincenzo (2002) (based on a result in Kempe *et al*. 2001) that n.n. *G*(*A*, *B*) gates together with the swap gate *SWAP*, or equivalently *G*(*A*, *B*) gates acting on arbitrary pairs of qubit lines, can perform universal quantum computation. We will prove a stronger result:

*Let C*_{n} *be any uniform family of quantum circuits with output given by a Z-basis measurement on the first line*. *Then,* *C*_{n} *may be simulated by a circuit of G*(*A*, *B*) *gates acting on n.n. or next n.n. lines only* (*i.e. on line pairs at most distance* 2 *apart*) *with at most a constant increase in the size of the circuit*.

Before the proof we make a few remarks. As an immediate corollary we have that any BQP algorithm can be simulated by a poly-sized circuit of *G*(*A*, *B*) gates acting on the n.n. and next n.n. lines only. This fact together with theorem 1.1 shows that a very limited use of the seemingly innocuous operation *SWAP* (on n.n. lines) allowing n.n. *G*(*A*, *B*) gates to act on lines just one further apart, suffices to bridge the gap between classical and quantum efficient computational power. The result becomes perhaps even more striking if we note that *SWAP* itself is very close to being expressible in the allowed *G*(*A*, *B*) form. Indeed, and fails only through a mere minus sign in det *X*=−det *I*. Thus, if we drop the det *A*=det *B* condition in equation (1.1), then the resulting gates acting on n.n. lines become efficiently universal for quantum computation.

The significance of *SWAP* (or equivalently the ability of two-qubit gates to act on distant lines) for quantum computational power appears also in a different context. Using the formalism of tensor network contractions, it may be shown (Jozsa 2006; Yoran & Short 2006; Markov & Shi 2008) that any poly-sized quantum circuit of one- and two-qubit gates, which has log depth and in which the two-qubit gates are restricted to act at bounded range (i.e. on line pairs at most distance *c* apart, for some constant *c*) may be classically efficiently simulated. It is also known (Cleve & Watrous 2000) that Shor's quantum factoring algorithm (Shor 1997) can be implemented as a log depth circuit but its two-qubit gates act on distant lines, *O*(*n*) apart, which is not bounded with increasing input size *n*. Thus, from this point of view, the quantum advantage of the algorithm (over classical algorithms) rests entirely on the presence of *unboundedly* distant actions (or unbounded use of *SWAP*). Also it is shown in Terhal & DiVincenzo (2004) and Jozsa (2006) that all depth 2 circuits (followed by a measurement) are classically efficiently simulatable even if two-qubit gates act on arbitrary line pairs while the same simulation result for depth 3 circuits (with a suitably strong notion of classical simulation) would imply equality of BPP and BQP. Here again, the feature of unboundedly distant action is essential, whereas our result in theorems 1.1 and 2.1 achieves full efficient quantum computational power by passage from distance 1 to just bounded distance 2.

Given any uniform quantum circuit family, we may assume w.l.o.g. that it comprises n.n. controlled-*Z* gates (n.n. *CZ*) and one-qubit gates generically denoted as *A*.

We start with a *quadrupled* number of qubit lines and encode the original input |0〉's and |1〉's as logical basis states |0_{L}〉=|0000〉 and |1_{L}〉=|1001〉, respectively, in consecutive blocks of four lines each (cf. the remark after the proof). Then, with suitably encoded gate operations the whole computation will stay within tensor products of span{|0_{L}〉,|1_{L}〉} of each quadruple of qubits.

On any such quadruple of lines, say 1234, we can perform the encoded one-qubit gate *A* as the following sequence of allowed n.n. gates (depicted in figure 1*a*):(2.1)(where the subscripts denote the line numbers). To see this, note thatso that can be thought of (for our logical basis states) as just swapping lines 1 and 4 into positions 2 and 3. In view of the form of the encoding |0_{L}〉 and |1_{L}〉, the logical qubit is then encoded in the {|00〉, |11〉} subspace of lines 2 and 3, so *G*(*A*, *A*) will apply the one-qubit gate *A* to it. Finally, the lines are swapped back to their original positions, restoring the encoding.

To perform an encoded *CZ* on two consecutive quadruples, say 1234 and 5678, we simply apply *CZ*_{45}, i.e. *CZ* on the ‘crossover’ pair of lines 4 and 5. Indeed, for any pair of basis states and we will get a minus sign if and only if *d*=*e*=1, giving the correct action on any encoded |*x*_{L}〉 and |*y*_{L}〉. Next, note that since composition of *G*(*A*,*B*) gates amounts to multiplying the *A*'s and *B*'s separately, we obtain (with all gates acting on lines 4 and 5)(2.2)where is the Hadamard operator. In the last expression, is *SWAP* and all other gates are allowable *G*(*A*,*B*) gates. This implementation of *CZ*_{45} is depicted in figure 1*b*.

Finally, note that in an arbitrary such circuit, *SWAP* is used only on crossover pairs (4, 5), (8, 9), etc. of the encoding quadruples (1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), etc. Hence, no line is ever moved more than one position distant from its original location, by overall action of any number of such *SWAPs*. We may commute all these *SWAP* operations out to the output end of the circuit. In so doing, any line of each (originally n.n.) *G*(*A*,*B*) gate in equation (2.1) may be moved by at most one place, but we can never move both lines of any n.n. *G*(*A*,*B*) gate in view of the block size 4 of the encoding. Thus, the resulting circuit (with *SWAPs* eliminated) comprises only *G*(*A*,*B*) gates acting on n.n. or next n.n. lines, as required.

In this process, *SWAP* gates need to be commuted across *G*(*X*, *X*) and *G*(*H*, *H*) gates (cf. equation (2.2)). But for any *G*(*A*,*B*), we have (using )and the last gate is an allowed *G*(*A*, *B*) gate. Since the whole computation is engineered to represent the original given circuit, recoded in the subspaces of consecutive line quadruples, a final measurement on line 1 will produce the same output distribution as a measurement on qubit 1 in the original given circuit. This completes the proof of theorem 2.1. ▪

We remark that instead of the quadruple encoding above, we might have considered the simpler |0_{L}〉=|00〉 and |1_{L}〉=|11〉 as a potentially more natural choice. Indeed, in that case, the one-qubit gate *A* is applied very simply as *G*(*A*, *A*) (in contrast to equation (2.1)) but the *SWAPs* from the *CZ* actions may now move both lines of a n.n. *G*(*A*, *B*) gate in opposite directions, resulting in *G*(*A*, *B*)'s on lines up to distance 3 (rather than just 2) apart.

## 3. Perfect matchings and matchgates

Before beginning our development of theorem 1.1, we give some brief background remarks on the interesting provenance of Valiant's notion of matchgates. (These remarks will not be used in any further results.) Matchgates arose (Valiant 2002, 2007) in the context of the theory of perfect matchings in graphs. For a graph *G*, a perfect matching is a set *M* of edges such that each vertex is the endpoint of exactly one edge in *M*. It is known that the problem of counting the number of perfect matchings in a graph is computationally very hard (being complete for the complexity class #P, cf. Papadimitriou 1994) but for *planar* graphs it is, remarkably, computable in polynomial time, using the Fisher–Kasteleyn–Temperley (FKT) algorithm (Kasteleyn 1961; Temperley & Fisher 1961; Jerrum 2003).

More generally, we may consider weighted graphs *G* in which each edge (*ij*) is assigned a weight *w*_{ij} and introduce the so-called match sum,(3.1)Then, the FKT algorithm provides a polynomial time computation of the match sum for any planar graph.

Next, consider a (planar) weighted graph with a designated set {*v*_{1}, …, *v*_{n}} of ‘input’ vertices and a disjoint designated set of ‘output’ vertices, and consider the indexed collection (tensor)where each index takes values 0 and 1 and is the graph obtained from *G* by deleting all those input and output vertices (together with their incident edges) that have corresponding index *i* or *j* set to 1. Hence, the tensor components are each computable in poly time by the FKT algorithm.

Matchgates are essentially these tensors with some additional technical modifications (Valiant 2002, 2007) whose specification we will omit here. Suffice it to say that the full definition is chosen so that a circuit of matchgates, representing contraction of a matchgate tensor network, corresponds to the problem of evaluating the match sum of *G*_{tot}, where *G*_{tot} (with possibly some residual uncontracted input and output vertices) is the graph obtained from the graphs of the individual matchgates by identifying (or ‘glueing along’) input and output vertices that are contracted in the tensor network. It follows that the components of the contraction are computable in poly time too. This is essentially the content of Valiant's so-called Holant theorem (Valiant 2007). The expression and clarification of the Holant theorem in terms of tensor contractions (and its invariance under appropriate basis changes in representing the tensors) were developed in a series of works by Cai & Choudhary (2006, 2007).

For some choices of graphs and weights (with equal numbers *m*=*n* of input and output nodes), the matchgate tensors can be *unitary*, i.e. unitary operations on *m* qubits. Hence, the above formalism leads to a class of quantum circuits (comprising unitary matchgates) that can be classically efficiently simulated. For example, it may be shown (Valiant 2002) that the unitary gates *G*(*A*, *B*) in equation (1.1) arise as matchgates with two input and two output vertices.

The FKT algorithm (Kasteleyn 1961; Temperley & Fisher 1961; Jerrum 2003) proceeds by setting up a suitable antisymmetric incidence matrix *A* of the graph's weights and then computing Perf*M*(*G*) as the Pfaffian of *A* (which in fact equals ). Like the determinant of an arbitrary matrix, the definition of Pfaffian of an antisymmetric matrix is an expression involving exponentially many terms *a priori*, yet computable on polynomial time. It is known that Pfaffians also occur in the mathematical formalism of fermionic quantum physics, which suggests that there may be some relationship (or at least some form of translation of basic problems) between fermionic physics and perfect matchings in graphs. Indeed, soon after the appearance of Valiant's work (Valiant 2002) on classical simulation of matchgate quantum circuits, Knill (2001) and Terhal & DiVincenzo (2002) provided an interpretation of it in terms of fermionic quantum gates and this formalism was subsequently further developed by Bravyi (2005, 2008).

## 4. Clifford algebras, quadratic Hamiltonians and classical simulation

We now return to developing a formalism for treating theorem 1.1 and some generalizations. For *n*-qubit lines, we introduce the set of 2*n* Hermitian operators *c*_{μ} that satisfy the anticommutation relations,(4.1)These relations define a *Clifford algebra* _{2n} on 2*n* generators, whose elements are arbitrary complex linear combinations of products of generators.2 Since each generator squares to the identity a general element in the algebra may be expressed as a polynomial of degree at most 2*n*,(4.2)(where the index set {*i*_{1}, …, *i*_{k}} may be empty). It follows from equation (4.1) that the monomials in the sum are linearly independent so as a vector space _{2n} has dimension 2^{2n}=2^{n}×2^{n}. Hence (Hermitian) matrix representations of the *c*_{μ}'s will involve matrices of size 2^{n}×2^{n}.

Operators *c*_{μ} satisfying equation (4.1) arise in the formalism of fermionic physics, where they are known as Majorana spinors. In that formalism, we start with a set of operators *a*_{1}, …, *a*_{n} associated with *n* free fermionic modes, satisfying the standard anticommutation relations for fermionic creation and annihilation operators,

Then, as a consequence of these relations, the following Hermitian operators (which are fermionic analogues of position and momentum operators)satisfy the Clifford algebra relations equation (4.1). However, we emphasize that in the present paper we are not concerned with the study of free fermions *per se* but rather, consideration of general quantum circuit simulation properties, based on the Clifford algebra structure, which can also go beyond the fermionic formalism (such as the statement in theorem 2.1 and results in §6).

For theorem 1.1 and generalizations we will (in later sections) consider matrix representations of the Clifford algebra but we first develop some further abstract algebra. A *quadratic Hamiltonian* is an element of of the form(4.3)where *h*_{μν} is a 2*n*×2*n* matrix of coefficients. Note that we omit *μ*=*ν* terms, which contribute only an overall additive constant to *H*. Since and imposing *H*=*H*^{†} we may w.l.o.g. take *h*_{μν} to be a real antisymmetric matrix.

Given *H* we consider the unitary operation *U*=e^{iH} (where the exponential is calculated in the algebra of as the power series). Any such unitary operation corresponding to a *quadratic* Hamiltonian is called a *Gaussian* operation. The following result from fermionic linear optics (cf. Knill 2001; Terhal & DiVincenzo 2002; Bravyi 2005; DiVincenzo & Terhal 2005) will be basic for our classical simulation results and we include a simple proof of it.

*Let H be any quadratic Hamiltonian and U*=e^{iH} *be the corresponding Gaussian operation*. *Then for all μ,**where the matrix R is in SO*(2*n*)*,* *and we obtain all of SO*(2*n*) *in this way*. *In fact R*=e^{4h}.

Write *c*_{μ} as *c*_{μ}(0) and introduce with *U*(*t*)=e^{iHt}. Then,(with square brackets [*a*,*b*] denoting the commutator *ab*−*ba*). But, if *μ*≠*ν*_{1}, *ν*_{2} and [*c*_{μ}*c*_{ν},*c*_{μ}]=−2*c*_{ν} (using equation (4.1)) so,where *R*=e^{4ht}. It is well known that antisymmetric matrices are the infinitesimal generators of rotations and the theorem follows by just setting *t*=1. ▪

The significance of theorem 4.1 for us is the following: note that e^{iH} generally involves all products of all generators so that the expression *Uc*_{μ}*U*^{†} could potentially finish up anywhere in the exponentially large (2^{2n}-dimensional) linear space _{2n}. However, it always happens to stay within the *polynomially small* (2*n*-dimensional) subspace spanned by just the generators themselves. We exploit this feature of the adjoint representation (cf. also Somma *et al*. 2006 for a more general Lie-theoretic setting), using the following strategy.

We find a Hermitian representation of the *c*_{μ}'s on *n*-qubit operators, and then the Gaussian operations corresponding to quadratic Hamiltonians define a class of *n*-qubit unitary gates. Let be any circuit of these with for some choice of input state. Then, by theorem 4.1 for each *μ* we have the expectation value(4.4)where is the product of all *SO*(2*n*) matrices corresponding to the individual gates of *M*. Hence the full matrix is poly-time computable.

Now suppose further that is a *product* state and that *c*_{μ} is represented by a *product* operator *P*_{1}⊗⋯⊗*P*_{n}. Then, is also poly-time computable and hence 〈*c*_{μ}〉_{out} is poly-time computable for each *μ*.

However, we really want , where *Z*_{k} is the Pauli *Z* operator acting on the *k*th line. Recall that _{2n} as a vector space has dimension 2^{n}×2^{n}, so it spans all *n*-qubit matrices in our representation. Thus, *Z*_{k} must be expressible as some polynomial of the form equation (4.2). If this polynomial has a constant degree, independent of *n*, then 〈*Z*_{k}〉_{out} will be poly-time computable too. As an example suppose *Z*_{1}=−i*c*_{1}*c*_{2}. Then(4.5)

If the *c*_{μ} are product operators then so are all the monomials such as , and will be poly-time computable for any product state input |*ψ*_{in}〉. Note also that the size of the sum in equation (4.5) is *O*(*n*^{2}) (compared with the *O*(*n*)-sized sum for 〈*c*_{μ}〉_{out} in equation (4.4)) and hence 〈*Z*_{1}〉_{out} is poly-time computable too. This argument is easily generalized to give the following result.

*Consider any poly-sized circuit of Gaussian gates acting on a product input state*. *If the observable Z*_{k} *in the final measurement is expressible in* _{2n} *as a polynomial of degree d,* *then for each of its monomials the corresponding sum as in equation* (*4.5*) *for* 〈*Z*_{k}〉_{out} *will be O*(*n*^{d}) *sized and hence* 〈*Z*_{k}〉_{out} *will be poly-time computable if d does not increase with n*.

## 5. The Jordan–Wigner representation and theorem 1.1

Introduce the 2*n* Hermitian operators on *n* qubits (omitting tensor product symbols ⊗ throughout),(5.1)where *X* and *Y* are in the *k*th slot for *c*_{2k−1} and *c*_{2k} and *k* ranges from 1 to *n*. Thus, the operators *c*_{2k−1} and *c*_{2k} are associated with the *k*th qubit line. It is straightforward to check that these matrices satisfy the relations equation (4.1) so that we have a representation of the Clifford algebra _{2n}, known as the Jordan–Wigner representation (Jordan & Wigner 1928). This is in fact the unique representation of _{2n} up to a *global unitary equivalence*. Furthermore, , which has bounded degree 2, and the *c*_{μ}'s are all product operators. Hence, for any poly-sized circuit of Gaussian gates with a product state input, 〈*Z*_{k}〉_{out} is poly-time computable. But what do these Gaussian gates actually look like?

Consider first just qubit lines 1 and 2 (i.e. *c*_{1}, *c*_{2}, *c*_{3} and *c*_{4}) and corresponding quadratic Hamiltonians that involve six possible terms,These operators are all trace free and all preserve the even and odd parity subspaces. Hence, the corresponding six-parameter family of Gaussian gates must be *SU*(2)⊕*SU*(2) decomposed relative to the two parity subspaces. More explicitly, we may first construct the Pauli *X*, *Y*, *Z* operators acting within the two subspaces (e.g. is *X* acting in the odd subspace relative to the {|01〉, |10〉} basis, and maps the even subspace to zero) and generate the two *SU*(2)'s by direct exponentiation. Hence, we get precisely the *G*(*A*, *B*) gates for lines 1 and 2 as the Gaussian operations *U*=e^{iH}, with the quadratic Hamiltonian of equation (4.3) restricted to the use of *c*_{1}, *c*_{2}, *c*_{3} and *c*_{4} only. Similarly, for any pair of *consecutive* lines we get all n.n. *G*(*A*, *B*) gates (since for lines *k*, *k*+1, the initial *Z* operators in equation (5.1) are eliminated in all quadratic products *c*_{μ}*c*_{ν} and the calculation proceeds exactly as above).

Thus, all n.n. *G*(*A*, *B*) gates are Gaussian for the Jordan–Wigner representation and this completes the proof of theorem 1.1.

But there are still more Gaussian gates, generated by quadratic Hamiltonians involving more *c*_{μ}'s associated with a larger number and more distant lines. Note first that if we use only the four *c*_{μ}'s associated with a pair of not-n.n. lines, we do *not* get the corresponding (now non-n.n.) *G*(*A*, *B*) gate acting on those lines. For example, consider the quadratic term *c*_{2}*c*_{4} (associated with n.n. lines 1 and 2) replaced by *c*_{2}*c*_{6} (being the corresponding operator associated with lines 1 and 3). We have *c*_{2}*c*_{4}=*X*_{1}*Y*_{2} but *c*_{2}*c*_{6}=*X*_{1}*Z*_{2}*Y*_{3}. Hence, exponentiation of the latter does not correspond to exponentiation of *XY* for lines 1 and 3 but gives a gate acting non-trivially across all three lines.

In summary so far, we see that non-n.n. *G*(*A*, *B*)'s are not generally Gaussian. But n.n. *G*(*A*, *B*)'s are all examples of Gaussian operations, albeit only special cases in the full set of such operations that may generally act on any number of qubit lines. Finally, we show that these apparently more general Gaussian operations actually bring nothing new, in the context of *circuits* of gates.

*Let* *be any quadratic Hamiltonian with corresponding Gaussian gate V*=e^{iH} *on n qubits*. *Then,* *V as an operator on n qubits is expressible as a circuit of O*(*n*^{3}) *n.n. G*(*A*, *B*) *gates,* *i.e.* *where each U*_{j}=e^{iHj} *having* *with the sum involving only four c*'*s associated with two n.n. lines,* *viz*. *c*_{2k−1}*,* *c*_{2k}*,* *c*_{2k+1}*,* *c*_{2k+2} *for some fixed k*.

Note that (as shown in the proof below) the circuit expression of theorem 5.1 is *exact, analytic and explicitly describable in poly time*, in contrast to an alternative standard, but generally inefficient, asymptotic decomposition using the Lie-Trotter expansion (for an exponential of a sum of generally non-commuting operators).

Let *V*=e^{iH} be any Gaussian operation as above. We havewith *R*∈*SO*(2*n*). We can efficiently decompose *R* into its generalized Euler angles (by the algorithm of §4 in Hoffman *et al*. 1972), obtaining *R*=*r*_{1}*r*_{2}…*r*_{M} where *M*=*O*(*n*^{2}) and each *r*_{j} is a rotation in 2*n* dimensions that acts non-trivially only in two dimensions, spanned by, say, the *a*th and *b*th coordinates. Thus, where *h*_{j} is an antisymmetric matrix with non-zero values (denoted ±*θ*/2) only in its *a*th and *b*th columns and rows. Then introduce so that hasIn this construction, *c*_{a} and *c*_{b} do not generally belong to n.n. qubit lines. To remedy this, we introduce the n.n. ‘modified swap’ operation (Bravyi & Kitaev 2002) defined, for example, for lines 1 and 2, by(5.2)We can readily verify thati.e. *S*_{12} swaps the roles of the pairs (*c*_{1},*c*_{2}) and (*c*_{3},*c*_{4}). Similarly, we have *S*_{k,k+1} for any n.n. line pair to swap pairs (*c*_{2k−1},*c*_{2k}) and (*c*_{2k+1},*c*_{2k+2}). Note that the exponent in equation (5.2) is n.n. quadratic so that *S*_{12} is an n.n. Gaussian gate. In fact, in the Jordan–Wigner representation we get .

Returning to *U*_{j} and *H*_{j}, if *c*_{a} and *c*_{b} are not associated with n.n. qubit lines we can use a ladder of *S*_{k,k+1} conjugations to express *U*_{j} as a product of at most *O*(*n*) n.n. *G*(*A*,*B*)'s. Thus, starting from *U* we obtain a product of at most *O*(*n*^{3}) n.n. *G*(*A*,*B*) gates such that for all *c*_{μ}. Hence, this relation holds for all monomials too and thus for arbitrary matrices *M* (as the monomials span all, matrices), i.e. for all *M* so that for some overall phase *δ*, which may be set to zero by a further trivial *G*(*A*,*B*) gate.

We remark that theorem 5.1 has a direct application in digital quantum simulation (algorithmic quantum simulation by the set of elementary gates) of a one-dimensional quantum system whose Hamiltonian *H* is describable in the form of equation (4.3). In particular, this includes the one-dimensional *XY* Hamiltonian, which exhibits a quantum phase transition for suitable choice of its parameters. We see that the real-time dynamics of the *XY* Hamiltonian *for any length of time t* can be efficiently quantumly simulated in terms of n.n. *G*(*A*, *B*) gates. Another efficient circuit simulation of the *XY* Hamiltonian was described recently in Verstraete *et al*. (2008).

## 6. Gaussian quantum circuits intertwined by Clifford operations

Recall (cf. Nielsen & Chuang 2000) that the Pauli group on *n* qubits contains all *n*-fold tensor products *P*_{1}⊗⋯⊗*P*_{n} of Pauli matrices (i.e. each *P*_{j} is *I*, *X*, *Y* or *Z*), together with overall factors of ±1 and ±*i*. An *n*-qubit operation *T* is a Clifford operation if and only if for all i.e. conjugation by *T* preserves the product Pauli structure. It is known (Gottesman 1997) that a unitary operation *T* is a Clifford operation if and only if it can be expressed as a circuit of *CNOT*, Hadamard *H* and *P*=diag(1, *i*) gates.

With this in mind, recall also that the Jordan–Wigner representation of _{2n} comprises not only product operators, but also *Pauli* products. It is easy to verify that if a set of Hermitian operators *c*_{μ} satisfy the Clifford algebra relations equation (4.1), then so do for any unitary *V*. Now, recall that our classical simulation result relied upon the quadratic Hamiltonian property in theorem 4.1—which in turn rests on the algebra relations equation (4.1)—and the product structure of the matrix representation (associated with product state inputs). Hence, if we choose *V* in to be a *Clifford operation T*, we preserve *both* features and we can obtain new classes of classically efficiently simulatable quantum circuits using the Gaussian operations provided by the 's (assuming that the conditions of theorem 4.2 are also satisfied). Note that the Clifford operation *T* itself cannot generally be obtained as a Gaussian gate of the original Clifford algebra representation *c*_{μ}, nor thus by a circuit of n.n. *G*(*A*, *B*) gates.

The conjugation action of *T* can be taken outside the quadratic Hamiltonian and the exponential power series sum, showing that the new Gaussian gate is just the original one *U*_{old} (e.g. n.n. *G*(*A*, *B*)'s) conjugated by *T*. In a new circuit comprising new Gaussian gates *U*_{new}, the intermediate Clifford operations *T* can be viewed as cancelling each other by the unitarity identity . Thus, we can alternatively think of these new simulatable circuits as being the same as the old ones, but the input states are now (now generally entangled) and the final measurement is now *TZ*_{k}*T*^{†} (now generally a multi-line observable) rather than *Z*_{k} itself, i.e. we extend the class of allowed inputs and measured outputs in theorem 1.1 while maintaining classical efficient simulatability. From this point of view, the new freedom associated with the use of Clifford operations *T* appears at the *boundary* of circuits, which is analogous to Valiant's use of basis changes in his notion of holographic algorithm (Valiant 2007).

In our construction *T* is generally a global (*n*-qubit) operation, and we will require two further features.

The Pauli operator

*Z*_{k}should be expressible as a*bounded*degree polynomial in the 's. Recall that the classical simulation cost depends on this degree*d*as*O*(*n*^{d}) (cf. theorem 4.2), and it was previously quadratic, but with arbitrary*T*'s we may get*d*=*O*(*n*).We wish to identify suitable

*local*new gates*U*_{new}acting on, say, some constant number*K*of qubit lines. For general*T*operators, even the conjugates of n.n.*G*(*A*,*B*) gates may become global*n*-qubit operators; so we may, for example, seek Clifford*T*'s such that these particular conjugates remain*K*-local for some*K*. In contrast to (i) this requirement is not essential for the existence of an efficient classical simulation but it is desirable in view of the usual notion of quantum circuit as comprising local gates, each acting on a bounded number of lines.

We also remark that, in the above construction, we need to choose a Clifford operation *T*_{n} for each number *n* of qubit lines. A curious feature is that in addition to being able to vary the structure of *T*_{n} with *n*, each *T*_{n} need not itself be ‘translationally uniform’ across the *n* lines, whereas the class of all n.n. *G*(*A*, *B*) gates as a whole does have a translationally uniform structure. Hence, we can obtain classically simulatable quantum circuits that have different kinds of gates allowed on different sections of the qubit line set.

To conclude this section, we give three illustrative examples of this construction.

Clearly, any circuit *T* of *SWAP* operations is a Clifford operation. In this case, amounts to allowing the *G*(*A*, *B*) gates to act on correspondingly selected distant lines. However, in any such resulting Gaussian circuit the lines may always be simply reordered to restore all *G*(*A*, *B*)'s to n.n. status.

Let *CNOT*_{ij} denote the *n*-qubit operation that applies the two-qubit *CNOT* gate with control line *i* and target line *j*. Let *H*_{i} denote the one-qubit Hadamard gate on line *i*. Consider the (translationally uniform) Clifford operation,as depicted in figure 2*a*. Indeed, this *T* operation is known as a duality transformation of a one-dimensional quantum system (cf. Plenio 2007). Conjugating the Jordan–Wigner representation *c*_{μ}'s of equation (5.1), we obtain givingso that remains quadratic in the generators. The six n.n. quadratic Hamiltonian terms on lines *k*, *k*+1 areand correspond in the Jordan–Wigner representation toConjugation by *T* gives the Hamiltonian (also known as the three-body cluster-state interaction),Note that this Hamiltonian (although quadratic in the 's) has now become 3-local so that the six-parameter family of n.n. *G*(*A*, *B*)'s on lines *k*, *k*+1 will conjugate to a six-parameter family of 3-local gates on lines *k*−1, *k*, *k*+1 (and we omit computation of the explicit form of these three-qubit gates). Since we have expanded into three lines, we may go back and consider arbitrary quadratic Hamiltonian terms in the *c*_{μ}'s associated with lines *k*−1, *k*, *k*+1, involving ^{6}C_{2}=15 parameters. By computing their conjugates under *T* we find that they all remain 3-local, giving a 15-parameter family of 3-local Gaussian gates. However, by theorem 5.1, any member of this 15-parameter family is obtainable as a circuit of the initial six-parameter family. Finally, we see, by the construction, that arbitrary poly-sized circuits of the 15-parameter family of 3-local gates, with input product states and a final *Z*_{k} measurement can be classically efficiently simulated.

For odd *n*, consider the translationally uniform Clifford operation *T*_{n} given byas depicted in figure 2*b*. Conjugating the Jordan–Wigner representation, we obtain the following in this case:supplemented by boundary terms,

It follows from these expressions that the conjugations of n.n. *G*(*A*, *B*) gates on lines *k*, *k*+1 become 4-local gates on lines *k*, *k*+1, *k*+2, *k*+3. By considering all possible quadratic terms of *c*_{μ}'s associated with these four lines we obtain for each *k*, a 13-parameter family of Gaussian gates (i.e. not all ^{8}*C*_{2} quadratic terms remain 4-local under conjugation). These are generated by the following Hamiltonians and their commutators:andThus, when *k* is odd *Z*_{k} is obtained as a quadratic expression in the 's, whereas when *k* is even *Z*_{k} requires a sixth-degree Clifford algebra monomial, *viz*. the product of *Z*_{k−1}*Z*_{k}*Z*_{k+1}, *Z*_{k−1} and *Z*_{k+1} each of which is in the *k*-even list above and hence quadratically representable. Thus, arbitrary poly-sized circuits of the 26-parameter family of four-qubit gates defined by the Hamiltonians above are classically efficiently simulatable albeit with a higher simulation cost which now scales as *O*(*n*^{6}).

## 7. Concluding remarks

In theorems 1.1 and 2.1, we have seen that quantum computational power may be made to appear as a surprisingly delicate extension of its classical counterpart. Is it conceivable that the passage from n.n. to next n.n. use of *G*(*A*, *B*) gates may be achieved while maintaining classical simulatability? We relate this question to some more formal complexity theoretic considerations after introducing some further terminology.

Recall that BQP is the class of languages decided by a uniform (poly-sized) family of quantum circuits for which, given any input computational basis state, each output probability *p*_{0} and *p*_{1} is greater than or equal to 2/3 or less than or equal to 1/3 (with output 1 resp. 0 designating acceptance resp. rejection of the input). Introduce PQP (a quantum analogue of the classical class PP, cf. Papadimitriou 1994) to denote the corresponding class of languages for which the bounded probability conditions are relaxed to requiring only that *p*_{0} and *p*_{1} are greater than or equal to (1/2)+(1/2^{n}) or less than or equal to (1/2)−(1/2^{n}). Clearly, BQP⊆PQP but we also have NP⊆PQP (e.g. using a quantum algorithm for SAT that simply computes a Boolean function on an equal superposition of all its inputs and measures the function output register for values 0 versus 1). Similarly, it is straightforward to see that PP⊆PQP but furthermore it may be shown (Watrous in press) that PP=PQP.

Now, let *V*_{n.n.}⊆PQP be the class of languages decided by PQP circuits of n.n. *G*(*A*, *B*) gates. With our strong notion of classical simulation, theorem 1.1 gives *V*_{n.n.}⊆P. Also theorem 2.1 shows that every language in PQP is decidable (relative to the PQP probability conditions) by a circuit comprising only n.n. and next n.n. *G*(*A*, *B*) gates (and applied to a suitably restricted set of inputs, encoding strings of 0's and 1's). Thus, if the latter were also classically simulatable we would have P=NP=PP, i.e. in the context of the PQP probability conditions, an extra supraclassical computational power *must* be associated with the single distance extension of the range of n.n. two-qubit *G*(*A*, *B*) gates if these classical computational complexity classes are to be unequal.

On the other hand, the same analysis carried out relative to the (far more stringent) BQP probability conditions (*viz*. requiring *p*_{0} and *p*_{1} to be bounded away from 1/2 by at least 1/6) is less compelling. Indeed, it is generally believed (although not proven) that neither NP nor PP is contained in BQP so in the context of BQP circuits it becomes less implausible that the passage from n.n. to next n.n. *G*(*A*, *B*) circuits might retain classical simulatability (now no longer implying equality of P, NP and PP). But then we would have P=BQP. Actually, more simply, to obtain BPP=BQP it would suffice to simultaneously relax our (very strong) notion of classical simulation to a far weaker requirement, *viz*. the ability to merely *sample* the output distribution *once* by classical efficient means, in contrast to classically efficiently computing the probabilities to exponential accuracy.

## Acknowledgments

R.J. and A.M. acknowledge support from the EC network QICS that provided collaborative opportunities for this work. A.M. acknowledges discussions with H. J. Briegel, B. Kraus and R. Somma, and is supported by FWF and the EC networks OLAQUI, SCALA. R.J. is supported by EPSRC QIP-IRC and EC network QAP.

## Footnotes

↵The appellation ‘Clifford’ here, commonly used in quantum computation literature, appears not to be mathematically related to the well established notion of Clifford

*algebra*in mathematics generally.↵As mentioned in Knill (2001) and Somma

*et al*. (2006), it is possible to consider 2*n*+1 anticommuting operators to define the Clifford algebra and correspondingly to have*SO*(2*n*+1) symmetry in theorem 4.1. However, this extension appears not to lead to a significant generalization of our results.- Received May 7, 2008.
- Accepted June 24, 2008.

- © 2008 The Royal Society