Improved Soundness for QMA with Multiple Provers

We present three contributions to the understanding of QMA with multiple provers: 1) We give a tight soundness analysis of the protocol of [Blier and Tapp, ICQNM '09], yielding a soundness gap Omega(1/N^2). Our improvement is achieved without the use of an instance with a constant soundness gap (i.e., without using a PCP). 2) We give a tight soundness analysis of the protocol of [Chen and Drucker, arXiv '10], thereby improving their result from a monolithic protocol, where Theta(sqrt(N)) provers are needed in order to have any soundness gap, to a protocol with a smooth trade-off between the number of provers k and a soundness gap Omega(k^2/N), as long as k >= Omega(log N). (When k = Theta(sqrt(N)), we recover the original parameters of Chen and Drucker.) 3) We make progress towards an open question of [Aaronson et al., ToC '09] about what kinds of NP-complete problems are amenable to sublinear multiple-prover QMA protocols, by observing that a large class of such examples can easily be derived from results already in the PCP literature: namely, at least the languages recognized by non-deterministic RAMs in quasilinear time.

Introduction
The class QMA is the natural quantum analogue of NP (or, rather, MA): with the help of a quantum proof (given by the all-powerful "Merlin"), a quantum polynomial-time verifier ("Arthur") attempts to decide whether an input string x is in a given language L or not; this class was first studied by Knill [Kni96], Kitaev [Kit99], and Watrous [Wat00]. For more details, see the survey of Aharonov and Naveh [AN02].
Kobayashi et al. [KMY09] first introduced and studied the class QMA(κ), where Arthur receives κ ∈ [2, poly(N)] quantum proofs that are promised to be unentangled. While multiple proofs in the classical case do not increase the power of the class (i.e., "NP(κ) = NP" and "MA(κ) = MA"), there is some evidence that multiple unentangled proofs in the quantum case are in fact more powerful than one (as currently conjectured): for example, Liu et al. [LCV07] have proposed a problem, pure state N-representability, that is known to lie in QMA(2) but is not known to lie in QMA; also, several works [BT09, Bei10, ABD+09, CD10, LGNN12, Per12] have proposed multi-prover QMA protocols for certain NP languages whose (soundness and proof length) parameters are not known to be achievable with only one prover. (See Table 1 for a summary of such results.) Harrow and Montanaro [HM10] recently answered several open problems regarding the class QMA(κ), by proving that amplification within QMA(κ) is possible and that QMA(poly(N)) = QMA(2); the "collapse" is achieved by giving an analysis of a product test, which allows a verifier to use the unentanglement promise of only two registers to ensure that states within a single register are close to a separable state.
Brandão et al. [BaCJ10, Corollary 4] prove (among other things) that two-prover QMA protocols where the verifier is restricted to LOCC measurements (i.e., adaptive unentangled measurements) can be simulated by a single-prover QMA protocol, incurring only a quadratic increase in total proof length. In particular, this implies that a two-prover QMA protocol for 3SAT with an LOCC verifier and total proof length of o(√N) is unlikely to exist (for, otherwise, 3SAT could be solved in deterministic subexponential time). In a related theme, Brandão and Harrow [BaH12] show that the O(√N)-prover LOCC protocol of Chen and Drucker [CD10] is optimal in the poly(N)-prover regime, under the same hardness assumption for 3SAT as above.
Particularly interesting is the gap between the "lower bound" of Brandão et al. [BaCJ10] and the "upper bound" results that are known for multi-prover QMA protocols for certain NP languages. Specifically, Aaronson et al. [ABD+09] give a Θ(√N)-prover QMA protocol for 3SAT, with perfect completeness and constant soundness gap, where each prover sends Θ(log N) qubits; two improvements, in different directions, on this protocol are known:
• Reducing the number of provers. Harrow and Montanaro [HM10], through their product test, reduce the number of provers of [ABD+09] to only two, thereby obtaining a two-prover QMA protocol for 3SAT, with perfect completeness and constant soundness gap, where each prover sends Θ(√N) qubits.
• Avoiding use of the swap test (and any entangling measurement). Chen and Drucker [CD10] simplify the verifier of [ABD+09] by avoiding the swap test, thereby making the verifier perform only LOCC (in fact, Bell) measurements; along the way, they also manage to greatly simplify the soundness analysis. (Also, but less relevant: they (i) use a coloring problem as a starting point instead of a "structured" SAT instance, and (ii) lose perfect completeness.)
However, no result that improves in both directions is known; such a result, in light of the lower bound of Brandão et al. [BaCJ10], would be a tight upper bound (under plausible hardness assumptions). Thus, the following is an interesting open question:
Question 1. Does there exist a two-prover QMA protocol for 3SAT, with a constant soundness gap and O(√N) total number of qubits, where the verifier is only allowed to perform LOCC measurements?
Table 1. Summary of multi-prover QMA protocols for NP languages (columns: paper, language, gap?, provers, qubits, c, c − s, verifier, test).
• We give a tight soundness analysis of the protocol of Chen and Drucker [CD10] for 3SAT, thereby improving their result from a "monolithic" protocol, where Θ(√N) provers are needed in order to have any soundness gap, to a protocol with a smooth trade-off between the number of provers κ and a soundness gap Ω(κ^2 N^{−1}), as long as κ ∈ Ω(log N). (When κ ∈ Θ(√N), we recover the original parameters of [CD10].) Further, we explain why even our tight analysis cannot give any soundness gap for the "κ ∈ O(1) regime", implying that new protocols are needed for any "sublinear" constant-prover LOCC QMA protocol with an inverse-polynomial soundness gap.
• We give a tight soundness analysis of the protocol of Blier and Tapp [BT09] for 3SAT, yielding a soundness gap Ω(N^{−2}). Maybe surprisingly, our improvement is achieved without the use of an instance with a constant soundness gap (i.e., without using a "PCP"); this is unlike the soundness gap of Ω(N^{−3−ε}) given by Beigi [Bei10], which was achieved using a (balanced) 2-out-of-4 instance with constant soundness gap. Independently from us, Le Gall et al. [LGNN12] have been able to use PCPs in the protocol of Blier and Tapp [BT09] to obtain a soundness gap of Ω(N^{−1}).
We now discuss each of the above contributions; the technical details are left to subsequent sections.
What is surprising about the result is that, for M ∈ O(N), the total number of qubits sent by all the provers to the verifier is sublinear; instead, the best known proof length in the case of only one prover (i.e., in the case of QMA) is linear (and we believe one cannot do better, by the Exponential-Time Hypothesis [IP01], which says that 3SAT cannot be solved in subexponential time in the worst case). … was the existence of a quasilinear reduction from 3SAT to 2CSP with constant soundness gap; we note that the works of Ben-Sasson and Sudan [BSS08] and Dinur [Din07] actually imply that a similar reduction holds for any language that can be recognized in non-deterministic quasilinear time by a random-access machine.
Proposition 1.1. Let L be any language that can be recognized in non-deterministic quasilinear time by a random-access machine. Let x be an instance in L of size N. Then one can prove that x is in L, with perfect completeness and constant soundness, using Θ(√N) unentangled quantum proofs, each with Θ(log N) qubits.
More generally, letting NTIME_RAM(t) be the class of languages solvable in non-deterministic t(n) time by a random-access machine, for any L in NTIME_RAM(t) it is possible to prove membership in L, with perfect completeness and constant soundness, using Θ(√t(n)) unentangled quantum proofs, each with Θ(log t(n)) qubits.
In order to obtain statements analogous to Proposition 1.1 for the parameters obtained by other multi-prover QMA protocols (including [CD10, Bei10, BT09, LGNN12]), we state the "size-efficient" reduction from NTIME_RAM(t) in a very generic form. For details, see Section 3.

Improvements to [CD10]
Aaronson et al. [ABD+09] raised the question of whether it is possible to construct a (multi-prover) QMA protocol with constant soundness gap and sublinear proof size for an NP-complete language, but using no entangled measurements. Chen and Drucker [CD10] gave a positive answer:
Theorem ([CD10]). Let ϕ be a satisfiable 3SAT instance with N variables and M clauses (and M ≥ N). Then one can prove satisfiability of ϕ, with almost-perfect completeness and constant soundness, using Θ(√M) unentangled quantum proofs, each with Θ(log M) qubits, and by only making LOCC (in fact, Bell) measurements.
The analysis of [CD10] does not give a smooth tradeoff between the number of provers and soundness, because their proof only shows a soundness gap when the number of provers is Θ( √ M ). We give a tight analysis of their protocol that yields a soundness gap for a number of provers κ ∈ Ω(log N ). We believe the smooth trade-off is of interest because it helps us "push the barrier closer" to the best two-prover LOCC QMA protocols with logarithmic proof length and small soundness gap.
The proof follows by improving the second-moment argument of [CD10] by using a one-sided Chebyshev inequality. See Section 5 for more details.
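For reference, the one-sided Chebyshev (Cantelli) inequality bounds a single tail of a random variable X with mean μ and variance σ^2 by Pr[X − μ ≥ a] ≤ σ^2/(σ^2 + a^2) for a > 0, which is never weaker than the two-sided bound σ^2/a^2. The following sketch compares both bounds with an exact tail probability. The binomial distribution and the parameters n, p, a are illustrative assumptions, not the random variable analysed in Section 5.

```python
import math

def binomial_pmf(n, p, k):
    """Pr[X = k] for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(n, p, a):
    """Exact Pr[X - mean >= a] for X ~ Binomial(n, p)."""
    mean = n * p
    return sum(binomial_pmf(n, p, k) for k in range(n + 1) if k - mean >= a)

def cantelli_bound(var, a):
    """One-sided Chebyshev (Cantelli): Pr[X - mean >= a] <= var / (var + a^2)."""
    return var / (var + a * a)

def chebyshev_bound(var, a):
    """Two-sided Chebyshev, used here as a (weaker) bound on one tail."""
    return min(1.0, var / (a * a))

# Illustrative parameters (assumptions): variance is n * p * (1 - p).
n, p, a = 100, 0.1, 3.0
var = n * p * (1 - p)
exact = upper_tail(n, p, a)
print(exact, cantelli_bound(var, a), chebyshev_bound(var, a))
```

At a deviation of one standard deviation the two-sided bound is trivial, while the one-sided bound still gives 1/2; this kind of gain is what a one-sided second-moment argument exploits.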

Improvements to [BT09]
We give a tight soundness analysis of the protocol of Blier and Tapp [BT09].
Proposition 1.3. The protocol of Blier and Tapp [BT09] for 2CSPs on N vertices and M edge constraints over a K-size alphabet has soundness s = 1 − Ω(N^{−2}), assuming K ∈ O(1). Moreover, our analysis is tight, for reasons described in Remark 6.1.
The above result improves on the original analysis by Blier and Tapp [BT09], who show that the protocol, for instances of graph 3-coloring on N vertices and M edges, has completeness c = 1 and soundness s = 1 − Ω(N^{−6}). It also improves on the result of Beigi [Bei10], who gives a protocol for constant-gap instances of (balanced) 2-out-of-4 SAT with M clauses that has a soundness gap of Ω(N^{−3−ε}) for every ε > 0. In independent work, Le Gall, Nakagawa and Nishimura [LGNN12] also gave an improved version of the Blier and Tapp [BT09] protocol, by changing the protocol to utilize 2CSP instances with constant soundness gaps, and achieve c = 1 and s = 1 − Ω(1/N), which, when applied with the requisite PCP results, improves upon our soundness gap by nearly a quadratic factor. See Section 6 for details.

Conclusions
Our results have made limited progress in answering Question 1, and we now comment on avenues for further progress.
One possible approach is to first construct a two-prover LOCC QMA protocol for 3SAT with Ω(1/√N) soundness gap and logarithmic proof size, and then suitably amplify the protocol to constant soundness. (Note that since LOCC QMA protocols amplify naturally, there is no need to use a product test [HM10] (which would in any case have created a non-LOCC protocol), nor is there any need to invoke additional assumptions such as the Weak Additivity Conjecture [ABD+09, Theorem 35].) Through Proposition 1.3 we have made some progress in this direction by improving the soundness gap of two-prover protocols for 3SAT with a polylogarithmic proof size to Ω(N^{−2}). Unfortunately, the protocol is not LOCC and does not achieve the required soundness gap of Ω(1/√N).
Another approach is to construct an LOCC protocol that acts on all the provided qubits at the same time, possibly in a much more complicated way than amplifying a two-prover protocol. The main difficulty in such an approach is that one of the main tools (as used in [ABD+09, KMY09, BT09, Bei10, HM10, LGNN12]) for multi-prover QMA protocols is the swap test, which is not LOCC. Attempting to replicate the properties of the product test of Harrow and Montanaro [HM10] (which relies on the swap test) within the LOCC framework (in order to apply it to LOCC protocols such as [CD10]) runs the risk of implying that QMA(κ) = QMA_LOCC(2), which, through the result of Brandão et al. [BaCJ10], would have the unlikely consequence of QMA(κ) = QMA. Thus, any such approach must make essential "non-black-box" use of the structure of the language at hand (e.g., that of 2-out-of-4 SAT) to avoid being a "generic" test.

Preliminaries
Two languages and non-deterministic time. First, we define two NP languages that we will be working with. The first language is constraint-satisfaction problems on graphs:
Definition 2.1 (Graph Constraint Satisfaction). Let G = (V, E) be a graph (possibly with self-loops) and let Σ be an alphabet. A graph constraint-satisfaction problem is a pair C = (G, {R_e}_{e∈E}), where each R_e is a predicate on the labels assigned to the endpoints of the edge e. We say that C is satisfiable if there is a labeling C : V → Σ such that every edge predicate evaluates to 1. We say that C is δ-satisfiable if, for every possible labeling of the vertices, at most a δ fraction of the edge predicates evaluate to 1.
Fix positive integers N, M, and K. The class 2CSP(N, M, K) consists of satisfiable graph constraint-satisfaction problems over K-size alphabets on graphs of N vertices and M edges.
The second language is SAT formulae with some additional structure: The class (2, 4)SAT consists of 2-out-of-4 satisfiable 4-CNF formulae in which every variable appears in Θ(1) clauses. (A 4-CNF formula is 2-out-of-4 satisfiable if there is an assignment to the variables such that, for every clause in the 4-CNF, exactly two of the four literals are satisfied.)
Next, we recall the definition of a proper complexity function: …
Finally, we recall non-deterministic time complexity classes with respect to multi-tape Turing machines and random-access machines:
Definition 2.4 (NTIME_mTM). A multi-tape Turing machine is a finite state machine attached to multiple tapes, with one head per tape. The tapes are infinite in one direction. The machine can read and write to each tape, moving one cell per time step. See [Pap94] for more details.
That the machine is non-deterministic means that the finite state control of the Turing machine can non-deterministically decide its next move, such that the machine accepts if and only if there is some non-deterministic choice that allows it to accept.
For a proper complexity function t, NTIME_mTM(t) denotes the class of languages that can be recognized by a t-time non-deterministic multi-tape Turing machine.
Definition 2.5 (NTIME_RAM). A random-access machine (RAM) is a list of commands that includes a finite number of control registers as well as an unbounded number of indexable registers. Each register holds an integer. Commands include addition, multiplication (with a log-cost penalty), branching on register contents, and indexing the registers with the contents of other registers. See [GS89] for more details.
That the machine is non-deterministic means that the finite list of commands of the random-access machine allows non-deterministic branching to its next move, such that the machine accepts if and only if there is some non-deterministic choice that allows it to accept.
For a proper complexity function t, NTIME_RAM(t) denotes the class of languages that can be recognized by a t-time non-deterministic random-access machine.
Information theory. First, we recall the classical information-theoretic notion of statistical distance between two probability distributions [NC00, Sec. 9.1]:
Definition 2.6 (Statistical Distance). Let P and Q be two probability distributions over the same finite set S. The statistical distance between P and Q, denoted |P − Q|_1, is defined as the quantity
$$|P - Q|_1 = \frac{1}{2} \sum_{x \in S} |P(x) - Q(x)| .$$
Next, we recall its quantum analogue, the trace distance [NC00, Sec. 9.2.1]:
Definition 2.7 (Trace Distance). The trace distance between two quantum states ρ and σ, denoted |ρ − σ|_Tr, is defined as the quantity
$$|\rho - \sigma|_{\mathrm{Tr}} = \frac{1}{2} \operatorname{Tr} |\rho - \sigma| , \qquad |A| = \sqrt{A^{\dagger} A} .$$
If ρ and σ commute, then they can be simultaneously diagonalized, and the trace distance between ρ and σ reduces to the statistical distance between the two probability distributions induced by the two sets of eigenvalues of ρ and σ.
Also, we recall that, given a projective measurement (i.e., a Hermitian operator) M, if P and Q are the probability distributions describing the outcomes obtained when measuring M on |φ⟩ and |ψ⟩ respectively, then |P − Q|_1 ≤ | |φ⟩⟨φ| − |ψ⟩⟨ψ| |_Tr. For convenience, we will denote by dstr_M(|φ⟩) the distribution P, and simply dstr(|φ⟩) when M is assumed to be a full computational basis measurement.
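The inequality between the two distances can be checked numerically on a small example. The sketch below assumes the normalization of [NC00] (a factor 1/2 in both distances) and uses the closed form sqrt(1 − |⟨φ|ψ⟩|^2) for the trace distance between two pure states; the function names and the two single-qubit states are illustrative choices.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi> of two state vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def dstr(phi):
    """Outcome distribution of a full computational-basis measurement."""
    return [abs(a) ** 2 for a in phi]

def statistical_distance(p, q):
    """(1/2) * sum_x |P(x) - Q(x)| (normalization assumed as in NC00)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))

def trace_distance_pure(phi, psi):
    """Trace distance of two pure states: sqrt(1 - |<phi|psi>|^2)."""
    overlap = abs(inner(phi, psi)) ** 2
    return math.sqrt(max(0.0, 1.0 - overlap))

s = 1 / math.sqrt(2)
phi = [complex(1), complex(0)]   # |0>
psi = [complex(s), complex(s)]   # |+>
sd = statistical_distance(dstr(phi), dstr(psi))
td = trace_distance_pure(phi, psi)
print(sd, td)
```

For these two states the measured distributions are (1, 0) and (1/2, 1/2), so the statistical distance is strictly smaller than the trace distance, as the inequality requires.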
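Swap test. The swap test on two quantum states ρ and σ [BBD+97, BCWdW01] is given by the following quantum circuit: an ancilla qubit is prepared in |0⟩ and a Hadamard gate is applied to it; ρ and σ are swapped, controlled on the ancilla; a second Hadamard gate is applied to the ancilla, which is then measured in the computational basis to obtain a bit b.
Essentially, the swap test measures the overlap between two quantum states, because
$$\Pr[b = 0] = \frac{1 + \operatorname{Tr}(\rho\sigma)}{2} .$$
If ρ and σ are two pure states |φ⟩⟨φ| and |ψ⟩⟨ψ|, then the probability above is equal to (1 + |⟨φ|ψ⟩|^2)/2. Interpreting b = 0 as an "accept" and b = 1 as a "reject", we define:
$$\mathrm{REJ}(\mathrm{Swap}(\rho, \sigma)) = \Pr[b = 1] = \frac{1 - \operatorname{Tr}(\rho\sigma)}{2} .$$
The swap test can thus be used to check whether |φ⟩⟨φ| and |ψ⟩⟨ψ| are equal or not: if they are equal, then REJ(Swap(|φ⟩, |ψ⟩)) = 0; if they are not equal, then the probability of rejection is (1 − |⟨φ|ψ⟩|^2)/2, which grows as the overlap between |φ⟩⟨φ| and |ψ⟩⟨ψ| shrinks.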
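As a sanity check on the pure-state formula, the acceptance probability can be computed directly from the standard swap-test circuit (Hadamard on an ancilla, controlled swap, Hadamard, measure). This is a minimal sketch for pure states only; the helper names are hypothetical, and states are plain Python lists of amplitudes.

```python
import math

def kron(a, b):
    """Tensor product of two state vectors."""
    return [x * y for x in a for y in b]

def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def swap_test_accept_probability(phi, psi):
    """Pr[b = 0]: after H, controlled-SWAP, H, the ancilla-0 branch carries
    the unnormalised vector (phi⊗psi + psi⊗phi) / 2."""
    branch0 = [(x + y) / 2 for x, y in zip(kron(phi, psi), kron(psi, phi))]
    return sum(abs(x) ** 2 for x in branch0)

def rej_swap(phi, psi):
    """Rejection probability Pr[b = 1] of the swap test."""
    return 1.0 - swap_test_accept_probability(phi, psi)

s = 1 / math.sqrt(2)
zero = [complex(1), complex(0)]   # |0>
one = [complex(0), complex(1)]    # |1>
plus = [complex(s), complex(s)]   # |+>

for a, b in [(zero, zero), (zero, plus), (zero, one)]:
    closed_form = (1 - abs(inner(a, b)) ** 2) / 2
    print(rej_swap(a, b), closed_form)
```

Equal states are never rejected, while orthogonal states are rejected with probability exactly 1/2, which is why a single swap test on its own gives only a bounded rejection probability.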
If the probability that the swap test rejects two quantum states is bounded from above, then the statistical distance between the two probability distributions arising when measuring the two quantum states (in any basis) is also bounded from above:
Lemma 2.1. Let |φ⟩ and |ψ⟩ be quantum states and δ a number …
Quantum Fourier transform. Finally, we recall the quantum Fourier transform:
Definition 2.8 (Quantum Fourier Transform). Let H_n be an n-dimensional Hilbert space with orthonormal basis {|0⟩, . . . , |n − 1⟩}. The n-dimensional quantum Fourier transform, denoted F_n, is the linear operator whose action on the basis vector |j⟩, j ∈ {0, 1, . . . , n − 1}, is given by
$$F_n |j\rangle = \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} e^{2\pi i jk/n} |k\rangle .$$
We recall that F_n is a unitary operator. Also, we will denote by |0⟩_{F_n} the image of |0⟩ under F_n, and it satisfies the following equation:
$$|0\rangle_{F_n} = F_n |0\rangle = \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} |k\rangle .$$
For more details on the quantum Fourier transform, see [NC00, Ch. 5].
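The two properties of F_n recalled above (unitarity, and the fact that |0⟩ is mapped to the uniform superposition) can be verified numerically. The sketch assumes the standard convention F_n|j⟩ = n^{−1/2} Σ_k e^{2πi jk/n}|k⟩ of [NC00, Ch. 5]; the dimension n = 5 and the helper names are arbitrary illustrative choices.

```python
import cmath
import math

def qft_matrix(n):
    """Matrix of F_n: entry [k][j] is e^{2 pi i jk/n} / sqrt(n)."""
    return [[cmath.exp(2j * cmath.pi * j * k / n) / math.sqrt(n)
             for j in range(n)] for k in range(n)]

def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def is_unitary(matrix, tol=1e-9):
    """Check that F^dagger F equals the identity up to tol."""
    n = len(matrix)
    for a in range(n):
        for b in range(n):
            # (F^dagger F)[a][b] = sum_k conj(F[k][a]) * F[k][b]
            entry = sum(matrix[k][a].conjugate() * matrix[k][b] for k in range(n))
            expected = 1.0 if a == b else 0.0
            if abs(entry - expected) > tol:
                return False
    return True

n = 5
F = qft_matrix(n)
basis0 = [1.0] + [0.0] * (n - 1)
fourier0 = apply(F, basis0)   # image of |0> under F_n
print(is_unitary(F), fourier0)
```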
Quantum proofs. In a κ-prover QMA protocol, the BQP verifier (Arthur) receives a classical input x ∈ {0, 1}* together with κ quantum proofs |Ψ^(1)⟩, . . . , |Ψ^(κ)⟩ sent by the κ provers (Merlins); the provers (and thus the quantum proofs they send) are promised to be unentangled. The verifier will then decide whether to accept x or not, based on all the quantum proofs he received.
The class QMA (κ, c, s) is defined to be the class QMA_M(κ, c, s) where M is the set of all Hermitian operators. The class QMA(κ, c, s) is defined to be the class QMA_{poly(·)}(κ, c, s). The class QMA(κ) is defined to be the class QMA(κ, 2/3, 1/3).
Note that any set of admissible Hermitian operators M induces a set of binary measurements, where each M ∈ M means "accept" and I − M means "reject". For example, M = Bell is the set of Bell measurements (non-adaptive, unentangled measurements), M = LOCC is the set of LOCC measurements (adaptive, unentangled measurements), and M = SEP is the set of separable measurements (which includes the swap test and product test).

Quasilinear-Time Has Sublinear Unentangled Quantum Proofs
In this section, we show how a few simple observations suffice to generalize the known positive results on multi-prover QMA protocols for NP languages (i.e., [BT09], [Bei10], [ABD+09], [CD10], and [HM10]). Doing so allows us to exhibit a large class of problems that qualify as positive examples to Question 2 raised by Aaronson et al.
The main observation is the fact that "short" PCPs exist not only for 3SAT but, more generally, for every NTIME language:
Claim 3.1 (Quasilinear PCPs for NTIME Languages). Let t : N → N be any proper complexity function and let L be a language in NTIME_RAM(t). Then, there exist
• a size function S_L : …,
• a reduction complexity function C_L : N → N with C_L ∈ poly(t(n)),
• a gap constant η_L ∈ (0, 1), and
• a regularity constant d_L ∈ N,
such that, for every instance x ∈ {0, 1}^n, the following properties hold:
• Efficiency: …
The claim is simply a statement about the short PCPs that are obtained from the works of Ben-Sasson and Sudan [BSS08] and Dinur [Din07], together with some simple observations about the generality of their results.
In order to construct "short" PCPs for 3SAT, Dinur (i) observes [Din07, Lemma 8.3] that the result of Ben-Sasson and Sudan implies that 3SAT can be reduced to 2CSP instances that satisfy all the properties of the claim, with the exception that the soundness gap is only 1/polylog(n); and (ii) then applies her main technical result of gap amplification [Din07, Theorem 1.5] to bring the soundness gap to a constant η.
Then, (i) and (ii) together easily imply "short" PCPs for 3SAT [Din07, Theorem 8.1]. We note that Dinur's first observation, (i), only relies on the fact that 3SAT ∈ NTIME_mTM(t(n)) for some t(n) ∈ O(n), and a similar observation can be made for a general language L ∈ NTIME_mTM(t(n)), which, again combined with her gap amplification result, yields the claim for languages in NTIME_mTM(t(n)).
We choose not to state the claim for languages in NTIME_mTM(t(n)), because it is not as illuminating; it seems quite tedious (and difficult) to check whether a given language L can be recognized in non-deterministic t(n) time by some multi-tape Turing machine. Instead, we observe that, by using a result of Gurevich and Shelah [GS89, Theorem 2], which implies that NTIME_RAM(t) ⊆ NTIME_mTM(t′(n)) for some t′(n) ∈ O(t(n)), we obtain the claim as stated; this way, the task of checking whether a given language L is in NTIME_RAM(t) is much simpler: one only needs to write "pseudocode" for the non-deterministic verifier, and prove that it halts in time t(n).
The 2CSP instance guaranteed by Claim 3.1 is already a "nice" instance for which multiple-prover QMA results have been proved. For example, a 2CSP instance is the starting point of Blier and Tapp [BT09] and Chen and Drucker [CD10], so that we obtain generic results for both protocols. Other works, instead, such as [ABD+09] and [Bei10], "process the 2CSP instance further" in order to give it additional structure (that is exploited in their protocols). These additional processing steps also inherit the more general reduction guaranteed by Claim 3.1 for all of the languages in NTIME_RAM(t):
Corollary 1 (Constant-Gap Boolean Formulae for NTIME Languages). Let t : N → N be any proper complexity function and let L be a language in NTIME_RAM(t). Then,
(i) there exist …, a reduction complexity function C_L : N → N with C_L ∈ poly(t(n)), a C_L-time reduction R_L : {0, 1}* → {0, 1}* from L to 3SAT, and a gap constant η_L ∈ (0, 1), such that, for every instance x ∈ {0, 1}^n, the following properties hold: …
(ii) there exist …, a reduction from L to (2, 4)SAT, a gap constant η_L ∈ (0, 1), and a balance constant b_L ∈ N, such that, for every instance x ∈ {0, 1}^n, the following properties hold: …
Of course, one could add more items to the above corollary, other than 3SAT and (2, 4)SAT, if other languages to which 2CSP can be efficiently reduced are found to be useful. We chose to mention only 3SAT because of its general importance and (2, 4)SAT because it was successfully used by [ABD+09] and [Bei10].
The proof of the corollary was partly sketched, in the particular case of t(n) = n, in [ABD+09, Lemma 12]. We give here the more general proof:
Proof of Corollary 1. To obtain (i), it suffices to convert the instance guaranteed by Claim 3.1, which is a 2CSP instance over a constant-size alphabet, into a 3SAT instance, in a way that preserves perfect completeness and degrades the soundness gap by at most a constant factor.
First, consider a 2CSP instance over a constant-size alphabet. Observe that we can transform this into a CSP over a binary alphabet by allowing constraints to restrict multiple variables. As the original alphabet was of constant size, this only increases the arity, number of variables, and number of constraints in the CSP by a constant factor. Further, the soundness gap is preserved.
So consider now a constraint C in the CSP over variables x. By the Cook-Levin Theorem, there exist a 3SAT formula ϕ_C and additional variables y such that C(x) holds if and only if there exists y such that ϕ_C(x, y) holds. Observe that the size of ϕ_C is at most some constant g, because the original CSP is over a constant-size alphabet and has arity 2. Define the output of this reduction to be the 3SAT formula ϕ := ⋀_C ϕ_C.
We now analyze the properties of ϕ. First, observe that the number of clauses in ϕ is at most g times the number of constraints in the original CSP, and the number of variables is at most a constant factor larger than the number of variables in the original CSP. Further, if the original CSP was satisfiable, then so must be ϕ, so perfect completeness is preserved. To analyze soundness, suppose that the original CSP was at most δ-satisfiable. Then, in any assignment to ϕ, at least (1 − δ) · E clauses must be unsatisfied, where E is the number of constraints in the original CSP. As there are at most gE clauses in ϕ, this means that ϕ can have at most a (1 − (1 − δ)/g)-fraction of satisfied clauses. Thus, there is still a constant soundness gap.
To obtain (ii), we first invoke (i) so as to obtain a reduction to 3SAT, and then follow the outline of Aaronson et al. [ABD+09, Lemma 12]. Specifically, the instance output by the reduction guaranteed by (i) can first be further modified using a reduction of Papadimitriou and Yannakakis [PY91] from 3SAT to 3SAT that makes each variable appear in at most (in fact, exactly) b_L = 29 clauses (and this reduction preserves the constant soundness gap); then, we apply a reduction of Khanna et al. [KSTW01] (that preserves both the constant soundness gap and the balanced property of the formula) from 3SAT to (2, 4)SAT. The reason that the outline of Aaronson et al. [ABD+09, Lemma 12] also works in the general case considered in this corollary is that the number of variables and clauses increases only by a constant factor through these two additional reductions.
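The counting step in the proof of item (i) can be made concrete with exact arithmetic. In the sketch below, delta is the maximum satisfiable fraction of the original CSP and g is the constant bound on the size of each ϕ_C; the specific values 9/10 and 8 are illustrative assumptions only.

```python
from fractions import Fraction

def max_satisfied_fraction(delta, g):
    """Upper bound on the fraction of satisfied clauses of phi: at least
    (1 - delta) * E of at most g * E clauses are unsatisfied."""
    return 1 - (1 - delta) / g

# Illustrative values (assumptions): a CSP that is at most 9/10-satisfiable,
# with each constraint expanded into at most g = 8 clauses.
delta = Fraction(9, 10)
g = 8
bound = max_satisfied_fraction(delta, g)
gap = 1 - bound
print(bound, gap)
```

The soundness gap shrinks from 1 − δ to (1 − δ)/g, that is, by exactly the constant factor g, which is why a constant gap survives the reduction.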
By combining the above results with [ABD + 09], we have now established Proposition 1.1.

Graph Coloring States
Let G = (V, E) be a graph with N vertices and M edges, and let Σ be a color alphabet of size K. The graph G and the color alphabet Σ will be fixed throughout the rest of the paper.
We say that a quantum state |Ψ⟩ is a graph coloring state (for G and Σ) if it is a quantum state over a Hilbert space H = (H_2)^{⊗ log N} ⊗ (H_2)^{⊗ log K}. Thus, any graph coloring state |Ψ⟩ can be written as
$$|\Psi\rangle = \sum_{v=0}^{N-1} \alpha_v |v\rangle \otimes \sum_{j=0}^{K-1} \beta_{v,j} |j\rangle ,$$
where Σ_{v=0}^{N−1} |α_v|^2 = 1 and Σ_{j=0}^{K−1} |β_{v,j}|^2 = 1 for each v ∈ {0, . . . , N − 1}. Note that the definition of a graph coloring state is independent of the edge set E. Such a state is intended to allow the provers to honestly encode a coloring χ : V → Σ of the graph via the state
$$|\Psi_\chi\rangle = \frac{1}{\sqrt{N}} \sum_{v=0}^{N-1} |v\rangle \otimes |\chi(v)\rangle .$$
The challenge of using these states is to force the provers to act as in the honest case. If they encode the coloring as above, we can recover and check it by measuring the state and recovering vertex/color pairs. If the coloring is invalid, then with measurements of independent (identical) states we can observe invalidly colored edges. However, dishonest provers could allow many colors for each vertex, or give some vertices zero amplitude. Doing so would fool the above coloring test. Thus, the main challenge is to detect this dishonest case and reject it with good probability.
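To make the honest encoding concrete, the sketch below builds the amplitudes of a graph coloring state for a given coloring and splits them back into the vertex amplitudes α_v and the per-vertex color vectors β_v. It assumes the honest state is the uniform superposition over vertices, each tagged with its color; the function names, the phase convention α_v ≥ 0, and the 4-vertex example are illustrative assumptions.

```python
import math

def honest_coloring_state(coloring, num_colors):
    """Amplitudes a[v][j] of (1/sqrt(N)) * sum_v |v>|chi(v)>."""
    n = len(coloring)
    amp = 1 / math.sqrt(n)
    return [[amp if j == coloring[v] else 0.0 for j in range(num_colors)]
            for v in range(n)]

def decompose(state):
    """Split amplitudes a[v][j] into alpha_v and unit vectors beta_v.
    Convention (an assumption): alpha_v is taken real and non-negative."""
    alphas, betas = [], []
    for row in state:
        norm = math.sqrt(sum(abs(a) ** 2 for a in row))
        alphas.append(norm)
        betas.append([a / norm for a in row] if norm > 0 else None)
    return alphas, betas

coloring = [0, 1, 2, 0]          # chi(v) for v = 0..3 (illustrative)
state = honest_coloring_state(coloring, 3)
alphas, betas = decompose(state)
print(alphas)
print(betas)
```

In the honest case every |α_v|^2 equals 1/N and every β_v is a basis vector; a dishonest state can violate either property, which is what the tests on graph coloring states are meant to detect.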
We believe that developing strong tools for quantum property testing is essential for making progress towards better multi-prover QMA protocols. As a first step in that direction, we give in the next subsection two lemmas for graph coloring states, which were implicitly used in both [BT09] and [CD10] with very different parameters, and present them in a generic form. After that, we summarize the tests for graph coloring states that have been used in previous protocols.

Two Lemmas on Graph Coloring States
Let us first introduce some simple notation: given any graph coloring state |Ψ⟩,
• for c ∈ (0, 1], R_c(|Ψ⟩) is the subset of V consisting of those vertices v for which |α_v|^2 < c;
• for c ∈ (0, 1], S_c(|Ψ⟩) is equal to V − R_c(|Ψ⟩);
• for j = 0, . . . , K − 1, p_j(|Ψ⟩) is equal to the probability of measuring j in the color register of the quantum state (I_N ⊗ F_K)|Ψ⟩; and
• for j = 0, . . . , K − 1, |γ(j)⟩ = Σ_{v=0}^{N−1} γ_v(j)|v⟩ is the reduced quantum state obtained when we measure j in the color register of (I_N ⊗ F_K)|Ψ⟩.
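First, we prove that, as long as a color j has a large-enough probability of being measured in the color register of (I_N ⊗ F_K)|Ψ⟩, if a vertex v has small amplitude then it will also have a small amplitude in the reduced state conditioned on measuring j.
Proof. Let |X⟩ be the quantum state obtained from |Ψ⟩ after performing the quantum Fourier transform on the color register of |Ψ⟩, i.e.,
$$|X\rangle = (I_N \otimes F_K)|\Psi\rangle = \sum_{v=0}^{N-1} \alpha_v |v\rangle \otimes F_K \sum_{j'=0}^{K-1} \beta_{v,j'} |j'\rangle .$$
For each v ∈ {0, . . . , N − 1}, let P_{v,j}(|Ψ⟩) be the probability that the color register of |X⟩ is measured j and the vertex register is measured v. Recalling that |γ(j)⟩ = Σ_{v=0}^{N−1} γ_v(j)|v⟩ is the reduced quantum state when outcome j occurs, we have that
$$P_{v,j}(|\Psi\rangle) = p_j(|\Psi\rangle) \cdot |\gamma_v(j)|^2 .$$
On the other hand, it is also the case that
$$P_{v,j}(|\Psi\rangle) = |\alpha_v|^2 \cdot \Big| \langle j | F_K \sum_{j'=0}^{K-1} \beta_{v,j'} |j'\rangle \Big|^2 \le |\alpha_v|^2 ,$$
since F_K is unitary and Σ_{j'} β_{v,j'}|j'⟩ is a unit vector. We deduce that p_j(|Ψ⟩) · |γ_v(j)|^2 ≤ |α_v|^2 or, equivalently, that
$$|\gamma_v(j)|^2 \le \frac{|\alpha_v|^2}{p_j(|\Psi\rangle)} .$$
By assumption, the probability of measuring j in the color register of |X⟩ = (I_N ⊗ F_K)|Ψ⟩, which is p_j(|Ψ⟩), is at least 1/c_1. Also by assumption, |α_v|^2 < 1/(c_2 N). Therefore,
$$|\gamma_v(j)|^2 < \frac{c_1}{c_2 N} ,$$
as desired.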
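The key inequality of the proof, p_j(|Ψ⟩) · |γ_v(j)|^2 ≤ |α_v|^2, can be checked numerically on an arbitrary state. The sketch below draws a random normalised state over N = 6 vertices and K = 3 colors (arbitrary illustrative sizes, fixed seed), applies the Fourier transform to the color register, and computes the quantities named in the proof; the helper names are hypothetical.

```python
import cmath
import math
import random

def random_state(n_vertices, n_colors, rng):
    """Random normalised amplitudes a[v][j] (an arbitrary test state)."""
    raw = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_colors)]
           for _ in range(n_vertices)]
    norm = math.sqrt(sum(abs(a) ** 2 for row in raw for a in row))
    return [[a / norm for a in row] for row in raw]

def fourier_on_color_register(state):
    """Apply I_N ⊗ F_K: transform each row of amplitudes with F_K."""
    k = len(state[0])
    return [[sum(cmath.exp(2j * cmath.pi * j * jp / k) * row[jp]
                 for jp in range(k)) / math.sqrt(k)
             for j in range(k)] for row in state]

rng = random.Random(0)
N, K = 6, 3
psi = random_state(N, K, rng)
x = fourier_on_color_register(psi)

alpha_sq = [sum(abs(a) ** 2 for a in row) for row in psi]          # |alpha_v|^2
P = [[abs(x[v][j]) ** 2 for j in range(K)] for v in range(N)]       # P_{v,j}
p = [sum(P[v][j] for v in range(N)) for j in range(K)]              # p_j
gamma_sq = [[P[v][j] / p[j] for j in range(K)] for v in range(N)]   # |gamma_v(j)|^2

print(all(p[j] * gamma_sq[v][j] <= alpha_sq[v] + 1e-12
          for v in range(N) for j in range(K)))
```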
Next, we prove that if a quantum state has at least one amplitude that is "far" from uniform, then the probability of measuring any given outcome in the Fourier basis can be upper bounded.

Summary of Tests for Graph Coloring States
We give a brief summary and description of the tests that have been used successfully in protocols with graph coloring states. The first one is the swap test, which checks whether two states are close to each other:
Swap(|Ψ⟩, |Φ⟩) ≡
1. Perform the swap test on the two quantum (graph) states |Ψ⟩ and |Φ⟩.
2. Accept if and only if the swap test accepts.
The swap test applies, in superposition, both the identity and the swap of the two states; the resulting interference makes the rejection probability depend only on the overlap of the two original states. See Section 2 for the details and properties of the swap test. Another test that is often useful is the uniformity test:
1. …
2. Measure the vertex and color registers of |Φ⟩ in the computational basis to get outcome (v, j).
3. If j = 0 but v ≠ 0, then reject.
4. Accept.
The uniformity test seeks to ensure that the total amplitude of each vertex is large, assuming that the probability of measuring j = 0 is large. This is used in ensuring that a graph coloring state meaningfully assigns a color to each vertex in the graph. A generalization of this test is the conditional uniformity test: for any z ∈ [0, κ], …
Intuitively, the conditional uniformity test also makes sure that a significant fraction of the graph coloring states are such that, when their color register is measured in the Fourier basis, the color 0 has a not too small probability of occurring. Once this is ensured, the uniformity test ensures that vertices have near-uniform amplitudes, and thus are meaningfully colored.
Finally, the consistency test with respect to a given 2CSP instance C = (G, {R_e}_{e∈E}) is:
1. For i = 1, . . . , κ, measure the graph coloring state |Ψ^(i)⟩ in the standard basis to get outcome …
The consistency test just checks that the states meaningfully encode a solution to the 2CSP instance, by ensuring that each vertex has a unique color and that no edge is violated. This test is only meaningful with honest encodings of the graph coloring state, and we can perform other tests (such as the conditional uniformity test) to rule out dishonest encodings.
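The classical post-processing of the consistency test, as described in prose above, can be sketched as follows. The exact steps of the test are not reproduced here, so the decision rule below (reject if a vertex is seen with two colors, or if two sampled vertices span an edge whose predicate fails) is an assumption based on that description; the function name and the triangle instance are illustrative.

```python
def consistency_rejects(outcomes, constraints):
    """outcomes: list of (vertex, color) pairs, one per measured proof.
    constraints: dict mapping an edge (u, v) to a predicate on (color_u, color_v).
    Returns True if the sampled outcomes expose an inconsistency."""
    seen = {}
    for v, j in outcomes:
        if seen.setdefault(v, j) != j:
            return True          # same vertex observed with two colors
    for (u, v), pred in constraints.items():
        if u in seen and v in seen and not pred(seen[u], seen[v]):
            return True          # a sampled edge violates its constraint
    return False

# Illustrative instance (assumption): 3-coloring constraints on a triangle.
def differ(a, b):
    return a != b

constraints = {(0, 1): differ, (1, 2): differ, (0, 2): differ}

print(consistency_rejects([(0, 0), (1, 1)], constraints))   # properly colored edge
print(consistency_rejects([(0, 0), (1, 0)], constraints))   # violated edge
print(consistency_rejects([(0, 0), (0, 1)], constraints))   # two colors for vertex 0
```

Note that the rule only ever inspects edges whose two endpoints both happen to be sampled, which is the source of the birthday-type behaviour of the rejection probability.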
Throughout, we denote by $\mathrm{REJ}(\cdot)$ the rejection probability of a given test; e.g., $\mathrm{REJ}(\mathrm{Swap}(|\Psi\rangle, |\Phi\rangle))$ denotes the rejection probability of the swap test on the two quantum states $|\Psi\rangle$ and $|\Phi\rangle$.
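To make the $\mathrm{REJ}(\cdot)$ notation concrete, here is a minimal sketch that evaluates the swap test's rejection probability from the standard formula $\frac{1}{2}(1 - |\langle\Psi|\Phi\rangle|^2)$. The function name and the plain-list encoding of state vectors are assumptions made for this example only.

```python
import math


def swap_test_rejection(psi, phi):
    """Rejection probability of the swap test on two pure states.

    Uses the standard formula (1 - |<psi|phi>|^2) / 2; inputs are amplitude
    lists and are normalized here, so unnormalized vectors are accepted.
    """
    norm_psi = math.sqrt(sum(abs(complex(a)) ** 2 for a in psi))
    norm_phi = math.sqrt(sum(abs(complex(b)) ** 2 for b in phi))
    overlap = sum(complex(a).conjugate() * complex(b) for a, b in zip(psi, phi))
    overlap /= norm_psi * norm_phi
    return (1.0 - abs(overlap) ** 2) / 2.0


if __name__ == "__main__":
    print(swap_test_rejection([1, 0], [1, 0]))  # identical states
    print(swap_test_rejection([1, 0], [0, 1]))  # orthogonal states
    print(swap_test_rejection([1, 0], [1, 1]))  # partial overlap
```

Identical states are never rejected, orthogonal states are rejected with probability 1/2, and a global phase on either state does not change the result.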
5 An Improvement on the Soundness Analysis of [CD10]

In this section, we give the details for our tight soundness analysis of the $\kappa$-prover QMA protocol of Chen and Drucker [CD10]. Specifically, we prove:

Proposition (Proposition 1.2, restated). The $\kappa$-prover QMA protocol for 2CSP(N, M, K) given by Algorithm 1 has completeness $1 - e^{-\Omega(\kappa)}$ and soundness $1 - \Omega\left(\frac{\kappa^2}{N + \kappa^2}\right)$, assuming $K \in O(1)$; thus, for $\kappa \in \Omega(\log N)$ and $\kappa \in O(\sqrt{N})$, the soundness gap is $\Omega(\kappa^2 N^{-1})$. Moreover, the analysis of the soundness of the protocol cannot be improved, in the sense of Remark 5.1 below.
The proposition improves the status quo by giving a smooth trade-off between the number of provers $\kappa$ and the soundness gap, whereas the soundness analysis of [CD10] only gave a soundness gap for $\kappa \in \Theta(\sqrt{N})$.
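For readers who want to see how the stated gap follows from the completeness and soundness bounds of the proposition, a short derivation is sketched below. The constants $c$, $c_1$, $c_2$ are placeholders introduced only for this illustration; they do not appear in the paper.

```latex
% Assume \kappa \le c \sqrt{N}, so that N + \kappa^2 \le (1 + c^2) N. Then
\[
  \frac{\kappa^2}{N + \kappa^2} \;\ge\; \frac{1}{1 + c^2} \cdot \frac{\kappa^2}{N} .
\]
% Assume the completeness error is at most e^{-c_1 \kappa} and \kappa \ge c_2 \log N. Then
\[
  e^{-c_1 \kappa} \;\le\; N^{-c_1 c_2} ,
\]
% which is o(\kappa^2 / N) as soon as c_1 c_2 \ge 1. Hence the gap between
% completeness and soundness satisfies
\[
  \bigl(1 - e^{-c_1 \kappa}\bigr)
  - \Bigl(1 - \Omega\Bigl(\frac{\kappa^2}{N + \kappa^2}\Bigr)\Bigr)
  \;\ge\; \Omega\Bigl(\frac{\kappa^2}{N}\Bigr) - N^{-c_1 c_2}
  \;=\; \Omega\Bigl(\frac{\kappa^2}{N}\Bigr) .
\]
```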
Algorithm 1 Verifier of [CD10]
inputs: a 2CSP(N, M, K) instance $\mathcal{C} = (G, \{R_e\}_{e \in E})$
proofs: $\kappa$ unentangled graph coloring states $|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle$
verifier: draw $r \in \{1, 2\}$ at random, and perform the $r$-th test below:
1. $\mathrm{CondUnif}_{\frac{99}{100}\kappa}(|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle)$
2. $\mathrm{Cons}_{\mathcal{C}}(|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle)$

Remark 5.1 ("Tightness" of Our Analysis). Consider a 2CSP(N, M, K) instance $\mathcal{C} = (G, \{R_e\}_{e \in E})$; suppose that $\mathcal{C}$ is not satisfiable, and suppose also that $\mathcal{C}$ has constant soundness gap $\eta$. Hence, for any coloring $C : V \to \Sigma$, at least $\eta|E|$ of the edge constraints $\{R_e\}_{e \in E}$ are not satisfied. So fix some coloring $C$. Now suppose that the $\kappa$ graph coloring states $|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle$ given to the verifier are all equal and are indeed a uniform superposition of all vertices, each with the unique color determined by $C$. If so, the test $\mathrm{Cons}_{\mathcal{C}}(|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle)$ rejects with probability $O(\kappa^2/N)$ by the Birthday Problem (indeed, we only have $\binom{\kappa}{2}$ chances to see a particular edge in the constraint graph, and a constrained edge is seen with probability only $\Theta(N^{-1})$ because the graph is sparse). Thus, our analysis is "tight" in the sense that the assumptions we made could indeed hold, so one cannot hope to exhibit an even better soundness analysis that proves a soundness gap of $\omega(\kappa^2/N)$.
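The birthday-style counting in Remark 5.1 can be checked on tiny instances. The sketch below compares an exact enumeration against the union bound $\binom{\kappa}{2} \cdot (\text{number of violated edges}) / N^2$. It assumes that all provers send the same uniform state with a fixed coloring, and that the consistency test checks the ordered pair $(v_i, v_j)$, $i < j$, against a set of directed violated edges; both the function names and this ordered-pair convention are assumptions for illustration.

```python
from itertools import combinations, product
from math import comb


def union_bound(kappa, n, num_violated):
    """Union bound over the C(kappa, 2) pairs; each pair hits a given edge w.p. 1/n^2."""
    return min(1.0, comb(kappa, 2) * num_violated / n ** 2)


def exact_rejection(kappa, n, violated):
    """Exact rejection probability by enumerating all n^kappa vertex outcomes.

    violated: set of ordered vertex pairs (u, w) whose constraint fails under the
    fixed coloring. Equal vertices never cause rejection here, because every
    vertex carries a single color.
    """
    rejecting = 0
    for vs in product(range(n), repeat=kappa):
        if any((vs[i], vs[j]) in violated for i, j in combinations(range(kappa), 2)):
            rejecting += 1
    return rejecting / n ** kappa


if __name__ == "__main__":
    violated = {(0, 1), (3, 4)}  # hypothetical violated edges on 6 vertices
    for kappa in (2, 3, 4):
        print(kappa, exact_rejection(kappa, 6, violated), union_bound(kappa, 6, len(violated)))
```

With a sparse graph, the number of violated edges is $O(N)$, so the union bound is $O(\kappa^2/N)$, matching the remark.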
Furthermore, if instead $\mathcal{C}$ is satisfiable (and the verifier receives $\kappa$ equal and uniform graph coloring states $|\Psi^{(1)}\rangle, \ldots, |\Psi^{(\kappa)}\rangle$ with a satisfying coloring), then completeness would be only $1 - e^{-\Theta(\kappa)}$, due to the imperfect completeness of the conditional uniformity test. The imperfection comes from Line (3) of that test, which rejects whenever the number of 0's measured is below the threshold. Due to natural variability, this can happen with non-zero probability even in the satisfiable case. Thus, we are forced to take $\kappa \in \Omega(\log N)$ in order for there to be any inverse-polynomial soundness gap. (In other words, the protocol of [CD10] has no soundness gap in the "constant regime" $\kappa \in O(1)$; to breach the constant regime, it seems that one would have to strengthen the verifier with additional LOCC measurements to increase the soundness gap, or, at the very least, to endow the protocol with perfect completeness.)

We now proceed to the proof of Proposition 1.2, which follows closely the proof of Chen and Drucker [CD10]. Throughout, we use the notation for graph coloring states introduced in Section 4.
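The imperfect completeness described in the remark above is a binomial tail effect, which the following sketch illustrates. It models the number of 0 outcomes among $\kappa$ honest states as a binomial random variable and computes the probability of falling below a threshold. The per-state probability `p_zero` and the threshold used in the example are hypothetical values chosen only to show the exponential decay in $\kappa$; they are not the paper's parameters.

```python
from math import comb


def prob_too_few_zeros(kappa, p_zero, threshold):
    """Pr[Binomial(kappa, p_zero) < threshold], computed exactly."""
    return sum(
        comb(kappa, m) * p_zero ** m * (1 - p_zero) ** (kappa - m)
        for m in range(kappa + 1)
        if m < threshold
    )


if __name__ == "__main__":
    # Hypothetical parameters: each honest state yields 0 with probability 1/2,
    # and the test rejects when fewer than kappa / 4 zeros are observed.
    for kappa in (8, 20, 40, 80):
        print(kappa, prob_too_few_zeros(kappa, 0.5, 0.25 * kappa))
```

The printed probabilities are strictly positive for every $\kappa$ but shrink rapidly as $\kappa$ grows, which is why a logarithmic number of provers suffices to push the completeness error below any inverse polynomial.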
Observe that the completeness in Proposition 1.2 follows exactly as in the analysis of Chen and Drucker. Thus, it remains to examine the soundness. Chen and Drucker [CD10, Lemma 3] gave sufficient conditions for an arbitrary graph coloring state $|\Psi\rangle$ to be accepted by the uniformity test $\mathrm{Unif}(|\Psi\rangle)$ with constant probability. We first show how to use the generic lemmas of Section 4 to prove the same result (these same lemmas are used with very different parameters in our soundness analysis of the protocol of Blier and Tapp [BT09] in Section 6). In particular, the next lemma says that, assuming the color 0 is measured with good probability, we can reject in the dishonest case where the provers assign too small an amplitude to many vertices.
The above result shows that, for graph coloring states with constant probability of measuring the color 0, we reject with good probability if there are many vertices with small amplitudes. When 0 is not measured with such probability, nothing can be said. Thus, Chen and Drucker argue that, amongst the different graph coloring states, we can detect whether very few of them have a good probability of measuring 0. This can be done simply by measuring said states and comparing the number of zeroes measured with the relevant threshold value. Thus, the remaining case to analyze is when there are many states with good probability of measuring 0, and each state has few vertices with small amplitude. They give a reduction (with some loss in the constants) to a slightly simpler normal form of this case, which they then analyze. We present a slightly better analysis of this normal form.

Suppose that for each $i \in \{1, \ldots, \kappa\}$ there exists $S_i \subseteq V$ with $|S_i| \geq (1 - \varepsilon)N$ such that $v_i$ is uniformly distributed over $S_i$, and $\varepsilon < \eta/20$. Then, when sampling $(v_i, c_i)$ from $D_i$ for all $i$, with probability at least $\Omega_{\varepsilon,d}\left(\frac{\kappa^2}{N + \kappa^2}\right)$ there exist $i < j$ such that: either $e = (v_i, v_j)$ is an edge of $G$ and $R_e(c_i, c_j) = 0$, or $v_i = v_j$ and $c_i \neq c_j$.
Proof. We follow the proof of Chen and Drucker [CD10]. For $i, j \in \{1, \ldots, \kappa\}$, define $V_{i,j}$ to be an indicator for the event that either $e = (v_i, v_j)$ is an edge of $G$ and $R_e(c_i, c_j) = 0$, or $v_i = v_j$ and $c_i \neq c_j$, and let $V = \sum_{i < j} V_{i,j}$.
Observe that the result follows from upper bounding $\Pr[V = 0]$. To bound this probability, we use Cantelli's inequality (also known as the one-sided Chebyshev inequality, cf. [Ros84]): for a random variable $X$ and $a > 0$,

$\Pr[X \leq \mathbb{E}[X] - a] \leq \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + a^2}$.

Thus, taking $X = V$ and $a = \mathbb{E}[V]$, and using the fact that $V$ is a non-negative random variable, we have

$\Pr[V = 0] = \Pr[V \leq 0] \leq \frac{\mathrm{Var}(V)}{\mathrm{Var}(V) + \mathbb{E}[V]^2}$.

The result will then follow from an upper bound on $\mathrm{Var}(V)$ and a lower bound on $\mathbb{E}[V]^2$.
We now invoke the relevant facts from the analysis of [CD10]. The upper bound for $\mathrm{Var}(V)$ is already given there. As for the lower bound on $\mathbb{E}[V]^2$, it follows by linearity of expectation and the first of these facts. Combining with the above, we conclude that $\Pr[V > 0] \geq \Omega\left(\frac{\kappa^2}{N + \kappa^2}\right)$, where the big-O and big-Ω notation hide constants depending on $\varepsilon$ and $d$. Thus, we obtain the desired lower bound on $\Pr[V > 0]$.
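The second-moment argument in the proof above can be illustrated with a toy model. The sketch below applies the bound $\Pr[V > 0] \geq \mathbb{E}[V]^2 / (\mathrm{Var}(V) + \mathbb{E}[V]^2)$ to a sum of $\binom{\kappa}{2}$ independent indicators, each equal to 1 with probability $1/N$. Independence and the value $1/N$ are simplifying assumptions made here; the actual indicators $V_{i,j}$ are correlated and are handled via the facts from [CD10].

```python
from math import comb


def second_moment_bound(mean, var):
    """Cantelli-type lower bound on Pr[V > 0] for a non-negative V."""
    return mean ** 2 / (var + mean ** 2)


def toy_bounds(kappa, n):
    """Return (lower bound, exact Pr[V > 0]) for the toy independent model."""
    m = comb(kappa, 2)       # number of pairs i < j
    p = 1.0 / n              # assumed per-pair success probability
    mean = m * p
    var = m * p * (1 - p)    # variance of a sum of independent indicators
    exact = 1 - (1 - p) ** m
    return second_moment_bound(mean, var), exact


if __name__ == "__main__":
    for kappa, n in ((5, 1000), (10, 100), (30, 100)):
        bound, exact = toy_bounds(kappa, n)
        print(kappa, n, round(bound, 4), round(exact, 4))
```

In this toy model the bound simplifies to $mp/(1 - p + mp)$ with $m = \binom{\kappa}{2}$ and $p = 1/N$, which is $\Theta(\kappa^2/(N + \kappa^2))$, the same shape as the bound in the lemma.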
The above lemma, combined with the rest of Chen and Drucker's analysis, readily yields Proposition 1.2. That is, [CD10] use the conditional uniformity test to rule out dishonest provers presenting malformed graph coloring states, and then use an analysis of the above type to handle the case of dishonest provers presenting invalid colorings. With the improved analysis, we can analyze the protocol in a larger parameter regime, giving the claim.
6 An Improvement on the Soundness Analysis of [BT09]

In this section, we give the details for our tight soundness analysis of the two-prover QMA protocol of Blier and Tapp [BT09]. Specifically, we prove:

Proposition (Proposition 1.3, restated). The two-prover QMA protocol for 2CSP(N, M, K) given in Algorithm 2 has (perfect completeness and) soundness $1 - \Omega(N^{-2})$, assuming $K \in O(1)$. Moreover, the analysis of the soundness of the protocol cannot be improved, in the sense of Remark 6.1 below.
Algorithm 2 Verifier of [BT09]
inputs: a 2CSP(N, M, K) instance $\mathcal{C} = (G, \{R_e\}_{e \in E})$
proofs: two unentangled graph coloring states $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$
verifier: draw $r \in \{1, 2, 3\}$ at random, and perform the $r$-th test below:
1. $\mathrm{Swap}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle)$
2. $\mathrm{Cons}_{\mathcal{C}}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle)$
3. $\mathrm{Unif}(|\Psi^{(1)}\rangle)$ and $\mathrm{Unif}(|\Psi^{(2)}\rangle)$

Remark 6.1 ("Tightness" of Our Analysis). Consider a 2CSP(N, M, K) instance $\mathcal{C} = (G, \{R_e\}_{e \in E})$; suppose that $\mathcal{C}$ is not satisfiable, and suppose further that there exists a coloring of the vertices $C : V \to \Sigma$ for which there exists exactly one edge $(\tilde{v}, \tilde{w}) \in E$ such that $R_{(\tilde{v}, \tilde{w})}(C(\tilde{v}), C(\tilde{w})) = 0$. Now suppose that the two graph coloring states $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$ given to the verifier are equal and that they are indeed a uniform superposition of all vertices, colored with $C$. If so, both the first test (i.e., the swap test) and the third test (i.e., the two uniformity tests) accept with probability 1; however, the second test (i.e., the consistency test) accepts with probability that is exactly $1 - N^{-2}$.
In other words, our analysis is "tight" in the sense that the assumptions we made could indeed hold, thus implying that one cannot hope to exhibit an even better soundness analysis that proves a soundness of $1 - \omega(N^{-2})$.
First, we show that, as long as two graph coloring states $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$ are "close enough" and the colors of the vertices are "consistent enough", then a definite color can be chosen for vertices with large enough amplitude. (Indeed, if vertices with large enough amplitude were to be colored very inconsistently, then we would be able to catch them through the second test.)

Lemma 6.2 (modified [BT09, Lemma 3.4]). Fix any 2CSP instance $\mathcal{C}$. Define $\delta = \frac{1}{2 \cdot 1600^2 K^4 N^2}$ and $\mu = \frac{1}{1600^2 K^4 N^2}$.
Next, we show that, under the same assumptions as Lemma 6.2, the probability of measuring $j$ in the color register of $(I_N \otimes F_K)|\Psi^{(1)}\rangle$ is at least $\frac{1}{4K}$ for every color $j \in \{0, \ldots, K - 1\}$.
Next, we prove that, under the same assumptions as Lemma 6.2 and Lemma 6.3, if we further require that the uniformity test does not reject with high probability, then we can be sure that all the vertices have a somewhat large amplitude.
Suppose that:

Proof. Recall that:
• $p_j(|\Psi^{(1)}\rangle)$ is the probability of measuring $j$ in the color register of $(I_N \otimes F_K)|\Psi^{(1)}\rangle$, and
• $|\gamma(j)^{(1)}\rangle = \sum_{v=0}^{N-1} \gamma_v(j)^{(1)} |v\rangle$ is the reduced quantum state obtained when we measure $j$ in the color register of $(I_N \otimes F_K)|\Psi^{(1)}\rangle$.
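The two quantities recalled above can be computed directly for small examples. The following sketch returns the probability of a given color outcome and the corresponding reduced vertex state for a normalized graph coloring state. The function name, the list-of-lists encoding `amp[v][c]`, and the phase convention $\omega_K = e^{2\pi i/K}$ for $F_K$ are assumptions made for this illustration.

```python
import cmath
import math


def color_fourier_branch(amp, outcome):
    """Return (p, gamma) for a color outcome of (I_N (x) F_K)|Psi>.

    amp[v][c] is the (normalized) amplitude of |v>|c>. p is the probability of
    measuring `outcome` in the color register, and gamma is the normalized
    reduced state on the vertex register (None if p is zero).
    """
    k = len(amp[0])
    unnormalized = [
        sum(cmath.exp(2j * math.pi * outcome * c / k) * a for c, a in enumerate(row))
        / math.sqrt(k)
        for row in amp
    ]
    p = sum(abs(x) ** 2 for x in unnormalized)
    gamma = [x / math.sqrt(p) for x in unnormalized] if p > 0 else None
    return p, gamma


if __name__ == "__main__":
    # Honest state on N = 4 vertices and K = 3 colors with coloring (0, 1, 2, 0).
    coloring, k = [0, 1, 2, 0], 3
    amp = [[0.5 if c == coloring[v] else 0.0 for c in range(k)] for v in range(4)]
    for outcome in range(k):
        p, gamma = color_fourier_branch(amp, outcome)
        print(outcome, round(p, 6), [round(abs(g), 6) for g in gamma])
```

For an honest state every color outcome occurs with probability $1/K$, comfortably above the $\frac{1}{4K}$ threshold, and every reduced state has uniform magnitudes $1/\sqrt{N}$ over the vertices.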
Finally, we can now lower bound the soundness of the protocol: