Parallel repetition and concentration for (sub-)no-signalling games via a flexible constrained de Finetti reduction

We use a recently discovered constrained de Finetti reduction (aka "Post-Selection Lemma") to study the parallel repetition of multi-player non-local games under no-signalling strategies. Since the technique allows us to reduce general strategies to independent plays, we obtain parallel repetition (corresponding to winning all rounds) in the same way as exponential concentration of the probability to win a fraction larger than the value of the game. Our proof technique leads us naturally to a relaxation of no-signalling (NS) strategies, which we dub sub-no-signalling (SNOS). While for two players the two concepts coincide, they differ for three or more players. Our results are most complete and satisfying for an arbitrary number of sub-no-signalling players, where we get universal parallel repetition and concentration for any game, while the no-signalling case is obtained as a corollary, but only for games with "full support".

A multi-player non-local game is played between cooperating but non-communicating players. Each player receives an input from some input alphabet and has to produce an output in some output alphabet. The common goal of the players is to satisfy some pre-defined predicate on their inputs and outputs. For that, they may agree on a strategy before the game starts, but are then not allowed to communicate anymore. Such games are especially relevant in theoretical physics in the context of the foundations of quantum mechanics and quantum information, and in computer science where they arise in multi-prover interactive proof systems. Indeed, they may provide an intuitive and quantitative understanding of the role played by various degrees of correlations in global systems which are composed of several local subsystems. These games also arise in complexity theory, under the formulation of multi-provers with some shared resources producing a protocol that should convince a referee, or in cryptography as attacks from malicious parties having a more or less restricted physical power.
The value of a game is the maximum winning probability of the players, over all allowed joint strategies, using possibly some prescribed correlation resource such as shared randomness, quantum entanglement or no-signalling correlations. It has been a subject of considerable study how the availability of different resources affects the values of certain games [3,4,12,32,36].
In this context, a natural question is how the value of a game behaves when n independent instances of the game are played simultaneously, i.e. each player gets n independent inputs and has to provide n outputs such that each game instance is won (or a large fraction of them). This is the parallel repetition problem. Playing independently the optimal single-game strategy on all n game instances will result in an exponentially decreasing winning probability. Although this was found paradoxical at first, such an independent strategy is in general not optimal [18,19]. For classical two-player games, Raz [34], later simplified and improved by Holenstein [23], established the first general parallel repetition theorem, showing that the value of n repetitions decreases exponentially for every game. Holenstein [23] also proved an analogous parallel repetition theorem for the no-signalling value of general two-player games. Only recently were parallel repetition theorems proved for the entangled value of two-player games: for general games, nothing better than a polynomial decay result is known up to now (this was initially proved for a slightly modified game [29], and only very recently for the game itself [37]), while exponential decay results have been established in several special cases (perfect parallel repetition for XOR games [13], exponential decrease under parallel repetition for unique games [28], projection games [14], free games [9,25]).
Even less is known concerning multi-player games. And apart from [11] (containing both classical and quantum statements), results were obtained only in the no-signalling setting [2,8,35]. The present work has the same focus on multiple no-signalling players, albeit we will find that the theory becomes much more satisfying for sub-no-signalling players.
Before getting into more precise and technical statements, let us give a high-level exposition of the philosophy of the present work, and especially how it compares to or differs from previous approaches. The standard proof technique to tackle parallel repetition (in either the classical, the quantum or the no-signalling case) consists in iteratively assuming that the players have won a given instance of the game and then studying how this affects their winning probability in the others. Hence, if one can show that, conditioned on the event "the players have already won k instances of the game", the probability is high that they lose in at least 1, resp. most, of the n − k remaining instances, one gets exponential decay of the probability of winning all, resp. a fraction above the game value of, n instances of the game played in parallel. The main drawback of this approach is probably its "locality", which makes it neither straightforward nor easy to generalize to more than two players. Here, we take a more "global" look at the problem, by attempting to reduce the study of such an n-instance game to that of n i.i.d. 1-instance games, whose analysis is trivial. This is where de Finetti type statements come into play: using the fact that the repeated game is symmetric under permutation of its parallel rounds, these allow relating it in some way to independent rounds. However, there are certain steps from the standard route which we do not avoid in our approach. One of them is some kind of reconstruction step. Phrased informally: we have to be able to say at some point that, if our strategy almost satisfies the constraints defining our set of interest, then there must exist a strategy which exactly satisfies them and which is not too far away from it. Nonetheless, it is not always easy to get tractable quantitative versions of this quite natural expectation.
This is the main reason why the set of sub-no-signalling strategies that we introduce is such a nice one for studying parallel repetition. Indeed, we prove that if a strategy satisfies all the sub-no-signalling constraints, up to some error ε, then it is Cε-close to the set of sub-no-signalling strategies, with a constant C which depends only on the number of players. And this fact ultimately translates into a universal exponential decay statement for the sub-no-signalling value of repeated multi-player games. This kind of stability property actually also holds for the set of two-player no-signalling strategies, which was discovered and used by Holenstein to prove universal parallel repetition in that case [23]. By contrast, it remains unknown whether this is still true for three or more players, which explains why all parallel repetition results for strictly more than two no-signalling players are game-dependent ones [2,8]. Viewing the no-signalling setting inside our broader sub-no-signalling framework, we are also able to reproduce these earlier findings. One notable advantage of our approach is that it is particularly well-suited to studying the concentration problem as well: once exponential decay of the probability of winning all game instances is established, exponential decay of the probability of winning too high a fraction of them comes almost for free.

II. NON-LOCAL MULTI-PLAYER GAMES AND (SUB-)NO-SIGNALLING STRATEGIES: DEFINITIONS AND FIRST OBSERVATIONS
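Specifically, we will consider here ℓ-player games G with input alphabets $\mathcal{X}_1, \ldots, \mathcal{X}_\ell$ and output alphabets $\mathcal{A}_1, \ldots, \mathcal{A}_\ell$. By way of notation, $\mathcal{X} := \mathcal{X}_1 \times \cdots \times \mathcal{X}_\ell$ and $\mathcal{A} := \mathcal{A}_1 \times \cdots \times \mathcal{A}_\ell$. Furthermore, for any subset $I \subset [\ell]$ of indices, $\mathcal{X}_I := \prod_{i \in I} \mathcal{X}_i$ and $\mathcal{A}_I := \prod_{i \in I} \mathcal{A}_i$. An element from $\mathcal{X}_i$, $\mathcal{X}_I$, $\mathcal{X}$ will usually be denoted by $x_i$, $x_I$, $x$, respectively, sometimes without explicitly specifying the set it belongs to (and similarly for $\mathcal{A}_i$, $\mathcal{A}_I$, $\mathcal{A}$). Also, for any $I, J \subset [\ell]$, given T a probability distribution (which we may quite often abbreviate by "p.d.") on $\mathcal{X}_I$, resp. P a conditional probability distribution on $\mathcal{A}_J|\mathcal{X}_I$, we may denote it by $T_{X_I}$, resp. $P_{A_J|X_I}$, when there is a risk of confusion about the alphabets considered.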
From now on, we will be interested in making minimal a priori assumptions on how powerful the ℓ players may be. This will naturally lead us to considering that their common strategy to win the game G could be any no-signalling (or even sub-no-signalling) strategy, which we define now.

Definition 1
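The sets of no-signalling and sub-no-signalling correlations, denoted respectively $\mathrm{NS}(\mathcal{A}|\mathcal{X})$ and $\mathrm{SNOS}(\mathcal{A}|\mathcal{X})$, consist of non-negative densities $P(a|x) \ge 0$ defined as follows:
$$P \in \mathrm{NS}(\mathcal{A}|\mathcal{X}) \;:\Leftrightarrow\; \forall\, \emptyset \neq I \subsetneq [\ell]\ \ \exists\, Q(\cdot|x_I) \text{ p.d.'s s.t. } \forall\, x, a_I,\ \ P(a_I|x) = Q(a_I|x_I), \qquad (1)$$
$$P \in \mathrm{SNOS}(\mathcal{A}|\mathcal{X}) \;:\Leftrightarrow\; \forall\, \emptyset \neq I \subsetneq [\ell]\ \ \exists\, Q(\cdot|x_I) \text{ p.d.'s s.t. } \forall\, x, a_I,\ \ P(a_I|x) \le Q(a_I|x_I). \qquad (2)$$
Here, $P(a_I|x)$ denotes the marginal density, $P(a_I|x) := \sum_{a_{I^c}} P(a_I a_{I^c}|x)$, where $I^c := [\ell] \setminus I$.
Remark Note that under this definition, $\mathrm{NS}(\mathcal{A}|\mathcal{X}) \subset \mathrm{SNOS}(\mathcal{A}|\mathcal{X})$, but the latter is a strictly larger set (e.g. it always contains the all-zero density). Furthermore, $P \in \mathrm{NS}(\mathcal{A}|\mathcal{X})$ iff $P \in \mathrm{SNOS}(\mathcal{A}|\mathcal{X})$ and P is normalized in the sense that for all $x \in \mathcal{X}$, $\sum_a P(a|x) = 1$. Indeed, NS consists of conditional probability distributions, while SNOS allows, given each input, a total "probability" of less than or equal to 1. Also, it can be shown that in equation (1), only sets of the form $I = [\ell] \setminus \{i\}$ need to be considered. This is because the no-signalling conditions take the form of equations and this subset spans the set of all equations required (cf. [22], Lemma 2.7). The analogous statement for sub-no-signalling is not known and likely false. Nevertheless, one might in other contexts consider relaxing the conditions of equation (2) to hold only for a selected family of subsets $I \subset [\ell]$. □
An ℓ-player game G is characterized by a probability distribution $T(x)$ on the queries $\mathcal{X}$, and a binary predicate $V(a, x) \in \{0, 1\}$ on the answers and queries $\mathcal{A} \times \mathcal{X}$, as illustrated in Figure 1.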
The no-signalling, resp. sub-no-signalling, value of the game, denoted ω NS (G), resp. ω SNOS (G), is the maximum of the winning probability over all P ∈ NS(A|X ), resp. P ∈ SNOS(A|X ), where the distribution of X = X 1 . . . X ℓ and A = A 1 . . . A ℓ is as expected, ∀ x, a, P (X = x, A = a) = T (x)P (a|x).
In words, the (sub-)no-signalling value of a game is the maximal probability of winning it when no limitation is assumed on the power of the players, apart from the fact that they cannot signal information instantaneously to one another. In the sub-no-signalling case, constraints are relaxed even more: players are not forced to always produce an output, and it is only required that their strategy "looks as if it were no-signalling" (even though they may have "hidden" in their abstentions the fact that it is signalling). In Section VI, we expand on the physical interpretation of sub-no-signalling, and briefly discuss other kinds of restrictions that one may put on the players' physical power, such as shared randomness or shared quantum entanglement only.

A. Two-player SNOS ≡ NS
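Not surprisingly, the no-signalling and sub-no-signalling values of games are related. We start by showing that for any two-player game G, they are identical, i.e. ω NS (G) = ω SNOS (G). As NS ⊂ SNOS, the inequality "≤" is evident, and we only need to prove the opposite inequality "≥". This follows from the following structural lemma.
Lemma 2 Let P be a two-player sub-no-signalling correlation. Then there exists a two-player no-signalling correlation P′ which dominates P pointwise, i.e. P(ab|xy) ≤ P′(ab|xy) for all a, b, x, y.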
Since playing a game G with a strategy P necessarily yields a value no larger than playing it with a strategy P′ which dominates P pointwise, it is clear that, once Lemma 2 is proved, we just have to apply it to an optimal SNOS strategy P for G to get the inequality "≥".
Proof If P is normalized, i.e. if for all x, y, $\sum_{ab} P(ab|xy) = 1$, there is nothing to prove because P is already no-signalling. Otherwise, fix x, y such that the total weight $w = \sum_{ab} P(ab|xy)$ is strictly smaller than 1, and let Q(·|x), Q(·|y) be probability distributions dominating the marginals of P, as guaranteed by the sub-no-signalling conditions.
As the total weight of both marginals of P(·|xy) is w < 1, we can find a and b such that P(a|xy) < Q(a|x), P(b|xy) < Q(b|y), so we can increase P(ab|xy) by some ε > 0 to P′(ab|xy) = P(ab|xy) + ε and still satisfy the sub-no-signalling conditions. By choosing ε maximal, we reduce the total number of strict inequality signs in the SNOS conditions. Iterating this procedure, we arrive at a sub-no-signalling correlation P′ with all inequalities met with equality, i.e. a no-signalling correlation.
Another presentation of this argument appeals to compactness. Consider the set $\mathcal{X}_{P,Q}$ of correlations P′ which dominate P pointwise and satisfy the sub-no-signalling conditions with the same dominating marginals Q as P. Since $\mathcal{X}_{P,Q}$ is compact and the map $P' \mapsto \sum_{xy} \sum_{ab} P'(ab|xy)$ is continuous, $\sup \{ \sum_{xy} \sum_{ab} P'(ab|xy) : P' \in \mathcal{X}_{P,Q} \}$ is actually attained. If it were less than $|\mathcal{X} \times \mathcal{Y}|$, we could use the procedure above to increase the objective function, contradicting that it is a maximum. □
Note that the "bumping up" procedure described above, in order to transform any two-player sub-no-signalling strategy into a no-signalling one dominating it pointwise, may fail for more players. The two-player case is indeed special, because the two SNOS or NS constraints do not overlap. However, already in the case of three players, even just the three inequalities may be impossible to bring simultaneously to equalities by pointwise increment (as illustrated by the Example in Section II B below).
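The two-player "bumping up" procedure from the proof of Lemma 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bump_up`, the dictionary encoding of correlations and the toy input (a PR-box-like correlation scaled down to total weight 1/2) are assumptions made here, not objects taken from the paper. Exact rational arithmetic is used so that the equality checks are meaningful.

```python
from fractions import Fraction
from itertools import product


def bump_up(P, QA, QB, A, B, X, Y):
    """Return P2 >= P pointwise whose marginals equal QA(.|x) and QB(.|y).

    P[(a, b, x, y)]        : sub-normalized two-player correlation
    QA[(a, x)], QB[(b, y)] : normalized marginals dominating those of P
    """
    P2 = dict(P)
    for x, y in product(X, Y):
        # Slack left in each SNOS inequality for this pair of inputs.
        slack_a = {a: QA[(a, x)] - sum(P[(a, b, x, y)] for b in B) for a in A}
        slack_b = {b: QB[(b, y)] - sum(P[(a, b, x, y)] for a in A) for b in B}
        assert all(s >= 0 for s in slack_a.values()), "P not dominated by QA"
        assert all(s >= 0 for s in slack_b.values()), "P not dominated by QB"
        while True:
            a = next((a for a in A if slack_a[a] > 0), None)
            b = next((b for b in B if slack_b[b] > 0), None)
            if a is None or b is None:
                break  # both slack vectors carry the same total, so both are 0
            eps = min(slack_a[a], slack_b[b])  # maximal admissible increment
            P2[(a, b, x, y)] += eps
            slack_a[a] -= eps  # at least one strict inequality becomes tight
            slack_b[b] -= eps
    return P2


# Hypothetical toy input: a PR-box-like correlation with total weight 1/2.
A = B = X = Y = (0, 1)
half, quarter = Fraction(1, 2), Fraction(1, 4)
P = {(a, b, x, y): quarter if (a ^ b) == (x & y) else Fraction(0)
     for a, b, x, y in product(A, B, X, Y)}
QA = {(a, x): half for a in A for x in X}
QB = {(b, y): half for b in B for y in Y}
P2 = bump_up(P, QA, QB, A, B, X, Y)
```

Each pass through the loop makes at least one inequality tight, which mirrors the counting argument in the proof. The loop relies on there being exactly two marginal constraints per input pair, which is the feature that is lost for three or more players.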

B. Multi-player SNOS vs NS
Clearly, ω NS (G) ≤ ω SNOS (G) for every game, and there are examples of games (with game distribution T having strictly smaller than full support) where ω NS (G) < 1 but ω SNOS (G) = 1, for instance the anti-correlation game.
Example (cf. [2], Appendix A) Consider the three-player anti-correlation game $A_3$, which has binary input and output for all players and game distribution T supported on $\{0, 1\}^3 \setminus \{111\}$, i.e. 111 does not occur as a triple of questions. The winning predicate is that if any two inputs are 1, say $x_i = x_j = 1$, then the corresponding outputs must be different, $a_i \neq a_j$. If there are zero or only a single 1 amongst the inputs, the outputs may be arbitrary.
It is straightforward to verify that the following correlation is in $\mathrm{SNOS}(\{0, 1\}^3 | \{0, 1\}^3)$ and wins the game with certainty: So $\omega_{\mathrm{SNOS}}(A_3) = 1$. On the other hand, for, say, T uniform on {011, 101, 110}, one can check by elementary means that $\omega_{\mathrm{NS}}(A_3) = 2/3$. □
What happens in the above example is that it is possible to satisfy any two amongst the three no-signalling constraints, but not the three of them at the same time. This is a phenomenon sometimes referred to as "frustration".
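To make the example concrete, here is a small Python check that a sub-no-signalling strategy can win the anti-correlation game with certainty. The strategy below is a hypothetical one chosen for this illustration (abstain on input 111, otherwise answer uniformly at random among the winning outputs); it is not necessarily the correlation the authors have in mind. The membership test in `is_snos` uses the observation, which is an assumption of this sketch though an elementary one, that a dominating probability distribution Q(·|x_I) exists if and only if the maxima over compatible inputs of the marginals P(a_I|x) sum to at most 1.

```python
from fractions import Fraction
from itertools import combinations, product

ELL = 3
BITS = (0, 1)
INPUTS = list(product(BITS, repeat=ELL))
OUTPUTS = list(product(BITS, repeat=ELL))
SUPPORT = [x for x in INPUTS if x != (1, 1, 1)]  # 111 is never asked


def wins(a, x):
    """Anti-correlation predicate: x_i = x_j = 1 forces a_i != a_j."""
    return all(a[i] != a[j] for i, j in combinations(range(ELL), 2)
               if x[i] == 1 and x[j] == 1)


def strategy(a, x):
    """Hypothetical strategy: abstain on 111, else uniform on winning outputs."""
    if x == (1, 1, 1):
        return Fraction(0)
    winning = [b for b in OUTPUTS if wins(b, x)]
    return Fraction(1, len(winning)) if a in winning else Fraction(0)


def marginal(P, I, a_I, x):
    """P(a_I | x): sum over the outputs of the players outside I."""
    return sum(P(a, x) for a in OUTPUTS if tuple(a[i] for i in I) == a_I)


def is_snos(P):
    """Check condition (2) for every non-empty proper subset I of players."""
    for size in range(1, ELL):
        for I in combinations(range(ELL), size):
            for x_I in product(BITS, repeat=size):
                compatible = [x for x in INPUTS
                              if tuple(x[i] for i in I) == x_I]
                total = sum(max(marginal(P, I, a_I, x) for x in compatible)
                            for a_I in product(BITS, repeat=size))
                if total > 1:  # no p.d. Q(.|x_I) can dominate all marginals
                    return False
    return True


def win_probability(P):
    """Winning probability for T uniform on the support of the game."""
    weight = Fraction(1, len(SUPPORT))
    return sum(weight * P(a, x)
               for x in SUPPORT for a in OUTPUTS if wins(a, x))
```

With these definitions, `is_snos(strategy)` holds and `win_probability(strategy)` equals 1, while the total weight on input 111 is 0, so the strategy is not normalized and hence not no-signalling.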
However, for a game distribution T having full support, a simple reasoning shows that ω NS (G) < 1 implies ω SNOS (G) < 1. Indeed, we show the contrapositive, assuming that ω SNOS (G) = 1. Because of the full support of T, this implies that for the optimal sub-no-signalling strategy P and every x, $1 = \sum_a V(a, x) P(a|x) \le \sum_a P(a|x) \le 1$, hence equality (i.e. normalization) holds for all x. Thus, P is really a no-signalling correlation and so ω NS (G) = 1. In fact, we can show something stronger, namely the following quantitative relationship.

Lemma 3
Consider a game distribution T with full support on $\mathcal{X}$. Then there exists Γ = Γ(T) ≥ 0, which only depends on T, such that for every game G with query distribution T and every ε ≥ 0, $\omega_{\mathrm{SNOS}}(G) \ge 1 - \varepsilon$ implies $\omega_{\mathrm{NS}}(G) \ge 1 - (\Gamma + 1)\varepsilon$. The definition of Γ can be taken from [8] or [2], where it is implicitly defined as some robustness parameter of the linear program whose optimal value is ω NS (G).
Proof Take an optimal strategy $P \in \mathrm{SNOS}(\mathcal{A}|\mathcal{X})$, so that $P(a_I|x) \le Q(a_I|x_I)$ for all I, $a_I$, x. Then, And so we get, for all I, because the difference term in the sum is non-negative. Now simply "bump up" the sub-normalized probability distribution $P_{A|X}$ to a properly normalized conditional probability distribution $P'_{A|X}$, adding at most an averaged weight over $T_X$ of ε, and hence, for all I, At this point we can invoke the stability of linear programs, used in [8] and [2], to conclude that there is Γ = Γ(T) ≥ 0 such that there is a no-signalling correlation $P''_{A|X} \in \mathrm{NS}(\mathcal{A}|\mathcal{X})$ with This gives where we have used the total variational bound on $P'' - P'$, the fact that $P'$ dominates P and the assumption on the probability of winning G when played with P. □
The rest of the paper is structured as follows: In Section III we introduce parallel repetition of games, and state our main results, which improve upon, and partly clarify, earlier findings by Holenstein [23], Buhrman et al. [8] and Arnon-Friedman et al. [2]. In Section IV, we present the main technical tool, one of the constrained de Finetti reductions from [30], adapted to our present needs, followed by the proofs of the main theorems and corollaries in Section V. We conclude in Section VI.

III. PARALLEL REPETITION: DEFINITIONS AND MAIN RESULTS
Given an ℓ-player game G, with probability distribution T (x) on X and binary predicate V (a, x) ∈ {0, 1} on A × X , we are interested in playing the same game n times independently in parallel, and in looking at the probability of winning all n or a subset of t of them.
Formally, the n-fold parallel repetition of G is the ℓ-player game $G^n$ having the product probability distribution on $\mathcal{X}^n$, $T^{\otimes n}(x^n) = T(x^{(1)}) \cdots T(x^{(n)})$, and the product binary predicate on $\mathcal{A}^n \times \mathcal{X}^n$, $V^{\otimes n}(a^n, x^n) = V(a^{(1)}, x^{(1)}) \cdots V(a^{(n)}, x^{(n)})$. The no-signalling, resp. sub-no-signalling, value of this n-fold parallel repetition game, denoted $\omega_{\mathrm{NS}}(G^n)$, resp. $\omega_{\mathrm{SNOS}}(G^n)$, is thus the maximum of the winning probability $P(\mathrm{win}) = \sum_{a^n, x^n} T^{\otimes n}(x^n) V^{\otimes n}(a^n, x^n) P(a^n|x^n)$ over all $P \in \mathrm{NS}(\mathcal{A}^n|\mathcal{X}^n)$, resp. $P \in \mathrm{SNOS}(\mathcal{A}^n|\mathcal{X}^n)$.
In words, the players win $G^n$ if they win all n instances of G played in parallel. So we obviously always have (for the allowed set of strategies being X ∈ {NS, SNOS})
$$\omega_X(G)^n \le \omega_X(G^n) \le \omega_X(G). \qquad (3)$$
However, in the case where $\omega_X(G) < 1$, the gap between the lower and upper bounds in equation (3) widens exponentially fast with n, making equation (3) of little use. The parallel repetition problem is thus the following: If none of the players' allowed strategies can make them win 1 instance of G with probability 1, does it necessarily imply that they have an exponentially decaying probability of winning n of them at the same time? And if so, at which rate?
More generally, we can study the game $G^{t/n}$, whose winning predicate is defined as winning any t (or more) out of n repetitions [8], i.e. $V^{t/n}(a^n, x^n) := 1$ if $\sum_{i=1}^n V(a^{(i)}, x^{(i)}) \ge t$, and $V^{t/n}(a^n, x^n) := 0$ otherwise. Note that, with our notation, $G^n = G^{n/n}$.
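As a point of reference for the definitions above, the following Python sketch evaluates what happens when the players simply play each of the n rounds independently and win each round with the same probability p. The values of `p`, `n` and `t` are hypothetical. Independent play is only one admissible strategy, so these numbers are lower bounds on the value of the repeated game, not the value itself. The Hoeffding expression is the standard bound for sums of independent variables bounded in [0, 1].

```python
from math import comb, exp


def iid_win_all(p, n):
    """Probability of winning all n rounds under independent play."""
    return p ** n


def iid_win_at_least(p, n, t):
    """Probability of winning at least t of n rounds (binomial tail)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(t, n + 1))


def hoeffding_bound(p, n, t):
    """Hoeffding bound exp(-2 n (t/n - p)^2), meaningful when t/n >= p."""
    gap = t / n - p
    return exp(-2 * n * gap ** 2)


# Hypothetical numbers: single-round winning probability 3/4, 100 rounds,
# threshold of 85 won rounds.
p, n, t = 0.75, 100, 85
tail = iid_win_at_least(p, n, t)
bound = hoeffding_bound(p, n, t)
```

For these inputs `tail` lies below `bound`, as Hoeffding's inequality guarantees, and `iid_win_at_least(p, n, n)` coincides with `iid_win_all(p, n)`, in line with $G^n = G^{n/n}$.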
The main results of the present paper are gathered below, where we set $C_\ell := 2^{\ell+1} - 3$.
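Theorem 4 (Parallel repetition of sub-no-signalling ℓ-player games) Let G be an ℓ-player game such that $\omega_{\mathrm{SNOS}}(G) \le 1 - \delta$ for some 0 < δ < 1. Then, for any n ∈ N, and any t ≥ (1 − δ + α)n for some 0 < α ≤ δ, we have
$$\omega_{\mathrm{SNOS}}(G^n) \le \left(1 - \frac{\delta^2}{5 C_\ell^2}\right)^n, \qquad \omega_{\mathrm{SNOS}}(G^{t/n}) \le \exp\left(-n \frac{\alpha^2}{5 C_\ell^2}\right).$$
As immediate consequences or refinements of Theorem 4, we can get parallel repetition results for the no-signalling value of multi-player games in some particular instances.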
Note that the constant Γ in this corollary depends on the game, and in the worst case carries a heavy dependence on the players' alphabet sizes. This is in contrast to Holenstein's two-player result for no-signalling games, which has no alphabet dependence at all [23]. This is generalized in our Theorem 4, since for two players we know by Lemma 2 that NS ≡ SNOS, and we could directly read off bounds with constants already improving on Holenstein's. Looking a little into the proof allows us to optimize the constants even more, which we record as follows.

IV. CONSTRAINED DE FINETTI REDUCTION
De Finetti reductions are a useful tool when trying to understand any permutation-invariant information processing task. Indeed, they make it possible to restrict the analysis to that of i.i.d. scenarios, which are usually trivially understood. In the context of multi-player games played n times in parallel, one would like to use the fact that the numbering of the n instances of the repeated game is irrelevant to reduce the study of strategies for the latter to the study of so-called de Finetti strategies (i.e. convex combinations of n i.i.d. strategies).
The seminal de Finetti reduction (aka post-selection) lemma was stated in [10], later finding applications in many areas of quantum information theory, from quantum cryptography [31] to quantum Shannon theory [5]. Our proofs though, will rely on two more recently established de Finetti reduction results, which are stated below. Just to fix some definitions: we will say that a (sub-)probability distribution P Z n , resp. a conditional (sub-)probability distribution P B n |Y n , is n-symmetric if for any permutation π of n elements, ∀ z n , P (π(z n )) = P (z n ), resp. ∀ b n , y n , P (π(b n )|π(y n )) = P (b n |y n ).

Lemma 7 (de Finetti reduction for conditional p.d.'s, [1])
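Let $\mathcal{B}$, $\mathcal{Y}$ be finite alphabets. There exists a probability measure $\mathrm{d}R_{B|Y}$ on the set of conditional probability distributions $R_{B|Y}$ such that, for any n-symmetric conditional probability distribution $P_{B^n|Y^n}$,
$$P_{B^n|Y^n} \le \mathrm{poly}(n) \int R_{B|Y}^{\otimes n}\, \mathrm{d}R_{B|Y},$$
where the polynomial pre-factor may be upper bounded as $\mathrm{poly}(n) \le (n + 1)^{|\mathcal{B}||\mathcal{Y}|}$.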

Lemma 8 (Constrained de Finetti reduction for (sub-)p.d.'s, [30])
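Let $\mathcal{Z}$ be a finite alphabet. There exists a probability measure $\mathrm{d}Q_Z$ on the set of probability distributions $Q_Z$ on $\mathcal{Z}$ such that, for any n-symmetric (sub-)probability distribution $P_{Z^n}$ on $\mathcal{Z}^n$,
$$P_{Z^n} \le \mathrm{poly}(n) \int F\left(P_{Z^n}, Q_Z^{\otimes n}\right)^2 Q_Z^{\otimes n}\, \mathrm{d}Q_Z,$$
where the polynomial pre-factor may be upper bounded as $\mathrm{poly}(n) \le (n + 1)^{3|\mathcal{Z}|^2}$.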
In Lemma 8 above, as well as in the remainder of this paper, F(P, Q) stands for the fidelity between probability distributions P and Q, defined as $F(P, Q) = \|\sqrt{P}\sqrt{Q}\|_1 = \sum_z \sqrt{P(z) Q(z)}$.
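For readers who want to experiment with the fidelity, here is a minimal Python sketch of it for distributions on a finite alphabet, together with the total variation (trace) distance. The dictionary encoding and the example distributions are assumptions of this illustration. The two-sided relation checked in the usage lines, 1 − F ≤ TV ≤ sqrt(1 − F²) for normalized distributions, is the standard link between the two quantities that the proofs later appeal to.

```python
from math import sqrt


def fidelity(P, Q):
    """F(P, Q) = sum_z sqrt(P(z) Q(z)) for (sub-)distributions given as dicts."""
    keys = set(P) | set(Q)
    return sum(sqrt(P.get(z, 0.0) * Q.get(z, 0.0)) for z in keys)


def tv_distance(P, Q):
    """Total variation distance: half the l1-norm of P - Q."""
    keys = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(z, 0.0) - Q.get(z, 0.0)) for z in keys)


# Hypothetical example distributions on a binary alphabet.
P = {0: 0.5, 1: 0.5}
Q = {0: 0.9, 1: 0.1}
P_sub = {0: 0.25, 1: 0.25}  # sub-normalized and dominated pointwise by P

f = fidelity(P, Q)
d = tv_distance(P, Q)
```

Here `1 - f <= d <= sqrt(1 - f**2)` holds, and `fidelity(P_sub, Q) <= fidelity(P, Q)` illustrates that the fidelity is order-preserving in its first argument, a property used repeatedly in the proof of Lemma 9.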
We are now ready to present the technical lemma that will allow us in Section V to reduce the study of strategies for repeated games to the study of so-called de Finetti strategies, and hence prove our main results.

Lemma 9 (de Finetti reduction for sub-no-signalling correlations)
There exists a probability measure dQ on the set of probability distributions Q on $\mathcal{A} \times \mathcal{X}$ such that for any probability distribution T on $\mathcal{X}$ and any $P \in \mathrm{SNOS}(\mathcal{A}^n|\mathcal{X}^n)$ an n-symmetric sub-no-signalling correlation, it holds that where we defined We mention for the sake of completeness that the poly(n) pre-factor in equation (4) may be upper bounded by $(n + 1)^{3|\mathcal{A}|^2|\mathcal{X}|^2 + 2|\mathcal{A}||\mathcal{X}|}$.
Proof Since $T_X^{\otimes n} P_{A^n|X^n}$ is an n-symmetric sub-probability distribution on $(\mathcal{A}\mathcal{X})^n$, we first of all have by Lemma 8 that Notice next that, for any $\emptyset \neq I \subsetneq [\ell]$, The first inequality is by monotonicity of the fidelity under stochastic maps (in particular taking marginals), while the second inequality is because $P \in \mathrm{SNOS}(\mathcal{A}^n|\mathcal{X}^n)$, so that $P_{A_I^n|X^n} \le P'_{A_I^n|X_I^n}$ for some conditional p.d. $P'_{A_I^n|X_I^n}$, and because the fidelity is order-preserving. What is more, for any $\emptyset \neq I \subsetneq [\ell]$, $P'_{A_I^n|X_I^n}$ can be chosen to be an n-symmetric conditional probability distribution. Indeed, if it were not, its n-symmetrization would still upper bound $P_{A_I^n|X^n}$ (since the latter is by assumption n-symmetric). We then have by Lemma 7 that and subsequently, using first, once more, that the fidelity is order-preserving, and second that it is multiplicative on tensor products, Recapitulating, we get equation (4), as announced. □

V. PROOFS OF THE MAIN THEOREMS
In this section we prove Theorem 4, Corollary 5 and Theorem 6. We need first of all the following extension of Lemma 9.5 in [23]:
Lemma 10
For $\mathcal{Z} = \prod_{j=1}^m \mathcal{Z}_j$ and $\mathcal{B} = \prod_{j=1}^m \mathcal{B}_j$, consider probability distributions T on $\mathcal{Z}$ and P on $\mathcal{B} \times \mathcal{Z}$ satisfying If for each j ∈ [m] there exists a conditional probability distribution $Q(b_j|z_j)$ such that then there exists a conditional probability distribution $P'(b|z)$ such that, for each j ∈ [m], $P'(b_j|z) = P'(b_j|z_j)$ for all $b_j$, z, and
Proof This works exactly as the proofs of the case m = 2, appearing as Lemma 9.5 in [23], or of the case m = 3, appearing as Lemma 5.4 in [35]. Both statements follow from applying either two or three times Lemma 9.4 in [23]. Let us state the latter for completeness, and then only sketch how the proofs of the cases m = 2 or m = 3 generalize to any m.
Holenstein ([23], Lemma 9.4): Let $P_{ST}$ and $Q_S$ be probability distributions over $\mathcal{S} \times \mathcal{T}$ and $\mathcal{S}$ respectively. There exists a probability distribution $R_{ST}$ over $\mathcal{S} \times \mathcal{T}$ such that Thanks to this result, we know that we can recursively construct a sequence $P^{(1)}, \ldots, P^{(m)}$ of probability distributions on $\mathcal{B} \times \mathcal{Z}$ such that, setting $P^{(0)} = Q$, for each j ∈ [m], we have: for any fixed $z \in \mathcal{Z}$, The probability distribution $P^{(m)}$ then satisfies and can therefore be chosen as the desired $P'$. □
We just mention as a side note that Lemma 9.4 in [23] crucially relies on the following fact: the statistical distance between two probability distributions $P_1$, $P_2$, i.e. $\frac{1}{2}\|P_1 - P_2\|_1 = \frac{1}{2}\sum_x |P_1(x) - P_2(x)|$, can be equivalently characterized as the minimum probability that $X_1$ differs from $X_2$ over pairs of random variables $(X_1, X_2)$ sampled from a joint distribution P having $P_1$ and $P_2$ as marginals.
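The coupling characterization of the statistical distance mentioned above can be illustrated with the classical "maximal coupling" construction. The Python sketch below is only an illustration under the assumption that distributions are given as dictionaries of exact fractions; the function names and the example distributions are hypothetical. It keeps as much mass as possible on the diagonal and spreads the remaining excess mass as a product.

```python
from fractions import Fraction


def tv_distance(P1, P2):
    """Statistical distance: half the l1-norm of P1 - P2."""
    keys = set(P1) | set(P2)
    return sum(abs(P1.get(z, 0) - P2.get(z, 0)) for z in keys) / 2


def maximal_coupling(P1, P2):
    """Joint law with marginals P1, P2 and Pr[X1 != X2] = tv_distance(P1, P2)."""
    keys = sorted(set(P1) | set(P2))
    common = {z: min(P1.get(z, 0), P2.get(z, 0)) for z in keys}
    joint = {(z, z): common[z] for z in keys if common[z] > 0}
    # Excess masses have disjoint supports and the same total weight d.
    excess1 = {z: P1.get(z, 0) - common[z] for z in keys}
    excess2 = {z: P2.get(z, 0) - common[z] for z in keys}
    d = sum(excess1.values())
    if d > 0:
        for z1 in keys:
            for z2 in keys:
                w = excess1[z1] * excess2[z2] / d
                if w > 0:
                    joint[(z1, z2)] = joint.get((z1, z2), 0) + w
    return joint


# Hypothetical example on the alphabet {0, 1, 2}.
P1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
P2 = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
joint = maximal_coupling(P1, P2)
prob_differ = sum(w for (z1, z2), w in joint.items() if z1 != z2)
```

In this example `prob_differ` equals `tv_distance(P1, P2)`, and summing `joint` over either coordinate returns `P1` and `P2` respectively. That no coupling can do better is the content of the fact quoted above.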
Note that the conditions enforced in Lemma 10 are not enough to ensure no-signalling of P ′ for three or more players. They would be sufficient though to guarantee that P ′ satisfies the relaxed no-signalling constraints considered in [35], namely that any group of ℓ−1 players together cannot signal to the remaining player. In other words, if a correlation approximately satisfies the Markov chain conditions necessary for being no-signalling, in the form of equations (5) and (6), then it is approximated, in the sense of equation (7), by a "weak" no-signalling correlation, as considered in [35]. Nevertheless, we can leverage this result to approximate the given no-signalling correlation by a sub-no-signalling correlation.

Lemma 11
Let P be a probability distribution on $\mathcal{A} \times \mathcal{X}$ and T be a probability distribution on $\mathcal{X}$. If the no-signalling conditions (1) hold approximately, namely then there exists a sub-no-signalling correlation $P' \in \mathrm{SNOS}(\mathcal{A}|\mathcal{X})$ that approximates P, in the sense that $2\varepsilon_I$.
In the two-player case ℓ = 2, P ′ can be chosen to be no-signalling itself, P ′ ∈ NS(A|X ).
Proof We will apply Lemma 10, with $m = 2^\ell - 2$, the index j identifying a non-empty and non-full set $\emptyset \neq I \subsetneq [\ell]$ (for instance via the expansion of j into ℓ binary digits). The local input and output alphabets are $\mathcal{Z}_j = \mathcal{X}_I$ and $\mathcal{B}_j = \mathcal{A}_I$, and the distribution we apply it to is Likewise, the prior distribution on $\mathcal{Z}$ is given by and we use the conditional distributions $Q(b_j|z_j) = Q(a_I|x_I)$. Now, the prerequisites of Lemma 10 are given, with $\varepsilon_j = \varepsilon_I$, and thus we get a conditional probability distribution $P'$ with $P'(b_j|z) = P'(b_j|z_j)$ for all j, and We would like to conclude here by "pulling back" this conditional distribution to a correlation on $\mathcal{A} \times \mathcal{X}$, which we would wish to be no-signalling. This almost works, except that $P'$ has support outside the image of the diagonal embedding and likewise for $\Delta : \mathcal{X} \to \mathcal{Z}$.
To resolve this issue, we simply remove this part of the distribution, and define the desired sub-normalized conditional densities by letting P ′ (a|x) := P ′ ∆(a)|∆(x) .
From this we see directly that x) for b = ∆(a) and z = ∆(x), and it is 0 outside the image of ∆.
It remains to check that $P'$ is sub-no-signalling. Let $\emptyset \neq I \subsetneq [\ell]$ be a subset with corresponding index $1 \le j \le 2^\ell - 2$. Let also $x \in \mathcal{X}$, $a_I \in \mathcal{A}_I$ be tuples, and set $z = \Delta(x)$, $b = \Delta(a)$ (so that $z_j = x_I \in \mathcal{X}_I = \mathcal{Z}_j$, $b_j = a_I \in \mathcal{A}_I = \mathcal{B}_j$). Then, Here, we have used the definition of the marginal and of $P'$. The inequality in the third line is because we enlarge the domain of the summation, and the equality in the last line is by the marginal property of $P'$.
The last claim, regarding ℓ = 2 players, is the original Lemma 9.5 in [23]. □
We are now ready to prove our main theorem, namely the parallel repetition and concentration results for the sub-no-signalling value of multi-player games.
Proof of Theorem 4 Let $P_{A^n|X^n}$ be a sub-no-signalling correlation which is optimal to win the game $G^n$. The distribution $T_X^{\otimes n}$ and the predicate $V_{AX}^{\otimes n}$ of $G^n$ being n-symmetric, we can assume without loss of generality that $P_{A^n|X^n}$ is also n-symmetric. Indeed, since for any permutation π of n elements, T ∘ π = T and V ∘ π = V, playing $G^n$ with P or with P ∘ π yields the same winning probability. And therefore, if P is an optimal strategy then so is its symmetrization over all permutations of n elements. Hence, by Lemma 9, Now, fix 0 < ε < 1 and define Observe that, by well-known relations between fidelity and trace-distance (see e.g. [20]), if $Q_{AX} \notin \mathcal{P}_\varepsilon$, then automatically $F(Q_{AX})^2 \le 1 - \varepsilon^2$. Hence, On the other hand, if $Q_{AX} \in \mathcal{P}_\varepsilon$, then by definition By Lemma 11, the latter condition implies that there exists a sub-no-signalling correlation $R'_{A|X}$ such that Yet, the winning probability when playing G with a strategy $R'_{A|X} \in \mathrm{SNOS}(\mathcal{A}|\mathcal{X})$ is, by assumption on G, at most 1 − δ. So the average of the predicate of G over $Q_{AX} \in \mathcal{P}_\varepsilon$ is at most $1 - \delta + 2C_\ell\varepsilon$.
Putting everything together, we eventually get that the winning probability when playing $G^n$ with strategy $P_{A^n|X^n}$ is upper bounded as Choosing in equation (8) and recalling that $P_{A^n|X^n}$ is, by hypothesis, an optimal sub-no-signalling strategy, we obtain In order to conclude, we have to remove the polynomial pre-factor. So assume that there exists a constant C > 0 such that for some N ∈ N, $\omega_{\mathrm{SNOS}}(G^N) \ge C \left(1 - \delta^2/(5C_\ell^2)\right)^N$. Then, for any n ∈ N, On the other hand, however, we still have by equation (9) Letting n grow, we see that the only option to make these two conditions compatible is to have C ≤ 1, which is precisely what we wanted to show. Following the exact same lines as above, we also get the concentration bound. Indeed, for any t ≥ (1 − δ + α)n, we now have in place of equation (8) that, for any 0 < ε < 1, The first term in the r.h.s. of equation (10) is a consequence of Hoeffding's inequality: if A, X are distributed according to $Q_{AX} \in \mathcal{P}_\varepsilon$, then the value of the game predicate is on average at most $1 - \delta + 2C_\ell\varepsilon$, so for n independent such A, X, the probability that the sum of the n values of the game predicate is above (1 − δ + α)n is at most $\exp[-2n(\alpha - 2C_\ell\varepsilon)^2]$. The second term in the r.h.s. of equation (10) is obtained by simply using that $e^{-x} \ge 1 - x$ for any x > 0.
The announced upper bound follows from choosing in equation (10) and removing the polynomial pre-factor by the same trick as before. □
Proof of Corollary 5 By Lemma 3, we know that if G is an ℓ-player game with full support satisfying $\omega_{\mathrm{NS}}(G) \le 1 - \delta$, then $\omega_{\mathrm{SNOS}}(G) \le 1 - \delta/(\Gamma + 1)$. And thus by Theorem 4, The concentration bound for $\omega_{\mathrm{NS}}(G^{t/n})$ follows analogously. □
Proof of Theorem 6 We follow the exact same reasoning as in the proof of Theorem 4, and keep the same notation. In the case ℓ = 2, we have by Lemma 11 that, for any 0 < ε < 1, Yet, if the winning probability when playing G with a strategy $R'_{A|X} \in \mathrm{NS}(\mathcal{A}|\mathcal{X})$ is, by assumption on G, at most 1 − δ, then the average of the predicate of G over $Q_{AX} \in \mathcal{P}_\varepsilon$ is at most 1 − δ + 5ε. This is because we are here dealing with normalised probability distributions. Hence, for any 0 < ε < 1, We can now choose $\varepsilon = (\sqrt{29} - 5)\delta/2$ in the parallel repetition estimate and $\varepsilon = (10 - \sqrt{2})\alpha/49$ in the concentration bound, and argue as in the proof of Theorem 4 to remove the polynomial pre-factor, which yields the two advertised results. □
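The step that removes the polynomial pre-factor, used in each of the proofs above, can be spelled out as the following short derivation. It is a sketch under the assumption that the bound with pre-factor has the form $\omega(G^n) \le \mathrm{poly}(n)\, q^n$, with the shorthand $q := 1 - \delta^2/(5C_\ell^2)$ and $\omega := \omega_{\mathrm{SNOS}}$ introduced here for readability; the first inequality in the second line is the lower bound of equation (3) applied to the game $G^N$.

```latex
% Assumption: the bound with pre-factor reads \omega(G^n) \le \mathrm{poly}(n)\, q^n,
% where q := 1 - \delta^2/(5 C_\ell^2) and \omega := \omega_{\mathrm{SNOS}}.
\begin{align*}
\text{Suppose } \omega(G^N) &\ge C\, q^N
  \quad \text{for some } N \in \mathbb{N} \text{ and some } C > 0. \\
\omega(G^{Nn}) &\ge \omega(G^N)^n \ge C^n q^{Nn}
  && \text{($n$ independent copies of an optimal strategy for $G^N$)} \\
\omega(G^{Nn}) &\le \mathrm{poly}(Nn)\, q^{Nn}
  && \text{(bound with pre-factor, for $Nn$ repetitions)} \\
\Longrightarrow \quad C &\le \mathrm{poly}(Nn)^{1/n}
  \xrightarrow[n \to \infty]{} 1 .
\end{align*}
```

Since the last inequality holds for every n and the n-th root of a fixed polynomial in n tends to 1, no constant C > 1 is possible, which gives the bound without pre-factor for every N.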

VI. DISCUSSION
Our main contribution in the present paper is a concentration result for the sub-no-signalling value of multi-player games under parallel repetition. In fact, we believe that our work is the first to recognize the intrinsic interest of the class of sub-no-signalling correlations, which appears naturally as a relaxation of the no-signalling ones. In particular, the fact that sub-no-signalling correlations have total probability less than or equal to 1 can be interpreted as the possibility of "abstaining" from giving an answer in A, with a certain probability depending on the input in X . However, each marginal P A I |X has to be consistent "locally" with the no-signalling behaviour, in that it has to be dominated by a correlation Q A I |X I that depends only on the I positions of the input x. This means that each group I of players is able to interpret their observed statistics as being "really" governed by a local marginal Q A I |X I , only that sometimes the device generating the correlation defaults and does not give an answer. The probability of abstention depends on the entire input x, and thus would be signalling, if observed. Indeed, the anti-correlation game discussed in Example II B shows that a sub-no-signalling correlation cannot always be embedded in a no-signalling one, except in the case of two players (cf. Lemma 2). Abstention thus gives more power in general, but it comes with a price as well, since abstaining does not mean winning the game. In this sense, it should not be confused with plain post-selection (that is, conditioning) on the non-abstaining event, which is well-known to allow the violation of Bell inequalities by otherwise local hidden variables, via the so-called "detection loophole" [16,21].
To state our main result precisely: if an $\ell$-player game $G$ has SNOS value $1 - \delta$, then the probability for SNOS players to win a fraction at least $1 - \delta + \alpha$ of $n$ instances of $G$ played in parallel is at most $\exp(-n C_\ell \alpha^2)$, where $C_\ell > 0$ is a constant which only depends on the number $\ell$ of players. This universal multi-player parallel repetition and concentration bound contrasts with the results of [8] and [2], which are restricted to full-support game distributions and come with constants that seem to depend heavily on the game. We think of these findings as evidence that sub-no-signalling correlations are natural, due to their well-behaved parallel repetition properties. As hinted at in [8], such a result, valid for games involving strictly more than 2 players and where not all queries are asked [6], might potentially find applications in position-based cryptography [7,17]. It would also be interesting to investigate whether the recent work of [26], showing multi-prover interactive proofs for EXP (exponential time languages) that are robust against no-signalling provers, remains valid for sub-no-signalling provers, and whether our result can be generalized to amplify the soundness gap of their scheme. The latter is not self-evident, as they require a polynomial number of provers, whereas our bounds carry a penalty exponential in the number of players.
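As a point of comparison for the shape of this bound, one may look at the idealised situation in which the $n$ instances are played independently, each being won with probability exactly $1 - \delta$. The Python sketch below (function names are hypothetical) compares the exact binomial tail with Hoeffding's bound $\exp(-2n\alpha^2)$. The constant 2 is the one from Hoeffding's inequality for independent trials; it is not the constant $C_\ell$ of our results, which is only known to depend on $\ell$.

```python
from math import ceil, comb, exp

def binomial_tail(n, p, threshold):
    """P[at least `threshold` successes in n independent trials of success probability p]."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(threshold, n + 1))

def hoeffding_bound(n, alpha):
    """Hoeffding's bound on exceeding the expected fraction of successes by alpha."""
    return exp(-2 * n * alpha**2)

delta, alpha = 0.25, 0.1  # illustrative values, not taken from any specific game
for n in (20, 100, 500):
    # Smallest number of wins giving a winning fraction of at least 1 - delta + alpha
    # (the small offset guards against floating-point rounding before ceil).
    threshold = ceil((1 - delta + alpha) * n - 1e-9)
    print(n, binomial_tail(n, 1 - delta, threshold), hoeffding_bound(n, alpha))
```

Both columns decay exponentially in $n$, with the same quadratic dependence on $\alpha$ in the exponent as in the bound stated above.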
In the case $\ell = 2$, our concentration statement is actually equivalent to the analogous one for the no-signalling value of $G$, thus with a universal constant $c = C_2$ in the exponential bound. Moreover, the dependence on $\alpha$ that we obtain cannot be improved, even in the special case $\alpha = \delta$, as proved in [27]. In the case $\ell > 2$, our result implies a concentration bound for the no-signalling value of $G$, but only if its input distribution has full support. Besides, the constant in the exponential bound is this time highly game-dependent (it depends on the sizes of the input and output alphabets, and on the smallest weight occurring in the input distribution). This is fully comparable to previous work in this direction due to Buhrman, Fehr and Schaffner [8], and Arnon-Friedman, Renner and Vidick [2].
Hence, the most immediate open problem at this point concerns games with non-full support in the case of three or more players (e.g. the anti-correlation game): does a parallel repetition result hold for the no-signalling value of such multi-player games? Answering this question probably requires first understanding whether, in Corollary 5, the presence of the game parameter $\Gamma$ is really necessary or is just an artifact of the proof technique. In other words, does the rate at which the no-signalling value of a game decays under parallel repetition truly depend on the game distribution?
Another issue that would be worth investigating is whether constrained de Finetti reductions could also be used to establish parallel repetition results for the classical or quantum value of multi-player games. Formally, the sets of classical correlations $\mathrm{C}(A|X)$ and quantum correlations $\mathrm{Q}(A|X)$ are defined as follows:

$$P \in \mathrm{C}(A|X) \ :\Leftrightarrow\ \forall\, x, a,\ \ P(a|x) = \sum_{m \in M} Q(m)\, P_1(a_1|x_1 m) \cdots P_\ell(a_\ell|x_\ell m),$$

for some p.d. $Q$ on some alphabet $M$ and some p.d.'s $P_i(\cdot|x_i m)$ on $A_i$;

$$P \in \mathrm{Q}(A|X) \ :\Leftrightarrow\ \forall\, x, a,\ \ P(a|x) = \langle\psi|\, M(x_1)_{a_1} \otimes \cdots \otimes M(x_\ell)_{a_\ell}\, |\psi\rangle,$$

for some pure state $|\psi\rangle\langle\psi|$ on $\mathrm{H}_1 \otimes \cdots \otimes \mathrm{H}_\ell$ and some POVMs $M(x_i)$ on $\mathrm{H}_i$.

The classical, resp. quantum, value of an $\ell$-player game $G$ with distribution $T$ and predicate $V$, denoted $\omega_{\mathrm{C}}(G)$, resp. $\omega_{\mathrm{Q}}(G)$, is then naturally defined as the maximum, resp. supremum, of the winning probability over all $P \in \mathrm{C}(A|X)$, resp. $P \in \mathrm{Q}(A|X)$.
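To make the definition of the classical value concrete: the winning probability is linear in $P$, and a classical correlation is a mixture (over the shared value $m$) of products of local responses, so the maximum is attained by a deterministic strategy in which each player's output is a function of their own input. The Python sketch below enumerates such strategies for small games. The function name `classical_value` is hypothetical, and the CHSH game used as input (uniform binary inputs, winning condition $a_1 \oplus a_2 = x_1 \wedge x_2$) is a standard example chosen for illustration, not one taken from this paper.

```python
from itertools import product

def classical_value(T, V, inputs, outputs):
    """Maximise the winning probability over deterministic local strategies.

    T(x) is the probability of the input tuple x, V(x, a) the winning predicate.
    A local strategy for player i is a tuple f with f[j] the output on inputs[i][j].
    """
    local_strategies = [list(product(outs, repeat=len(ins)))
                        for ins, outs in zip(inputs, outputs)]
    best = 0.0
    for strategy in product(*local_strategies):
        win = 0.0
        for x in product(*inputs):
            # Each player answers as a function of their own input only.
            a = tuple(f[ins.index(xi)] for f, ins, xi in zip(strategy, inputs, x))
            if V(x, a):
                win += T(x)
        best = max(best, win)
    return best

# CHSH game (illustrative assumption): two players, binary inputs and outputs.
chsh_inputs = [(0, 1), (0, 1)]
chsh_outputs = [(0, 1), (0, 1)]

def chsh_T(x):
    return 0.25  # uniform distribution on the four input pairs

def chsh_V(x, a):
    return (a[0] ^ a[1]) == (x[0] & x[1])

chsh_classical = classical_value(chsh_T, chsh_V, chsh_inputs, chsh_outputs)
print(chsh_classical)
```

The enumeration grows exponentially with the input alphabet sizes and the number of players, so this is only practical for toy games; for CHSH it recovers the well-known classical value 3/4.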
In the classical case, the first parallel repetition result for two-player games was established by Raz [34], and later improved by Holenstein [23], while Rao [33] gave a concentration bound. However, the proof techniques are arguably not as straightforward as via de Finetti reductions, and do not generalise directly to any number $\ell$ of players. In the quantum case, even less is known. The best parallel repetition result up to now is the one established by Chailloux and Scarpa [9] (subsequently improved by Chung, Wu and Yuen [11]), which applies to two-player (resp. $\ell$-player) free games, and from there to games with full support. That is why being able to export ideas from the de Finetti approach to these two cases would be of great interest. Roughly speaking, the problem we are facing is the following: given an $n$-symmetric correlation $P_{A^n|X^n}$, we can always write the first step in the proof of Lemma 9, i.e.
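$$T_X^{\otimes n} P_{A^n|X^n} \ \leq\ \mathrm{poly}(n) \int_{Q_{AX}} F\left(T_X^{\otimes n} P_{A^n|X^n},\, Q_{AX}^{\otimes n}\right)^2 Q_{AX}^{\otimes n}\, \mathrm{d}Q_{AX}. \tag{11}$$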
Now, we would like to argue that if $P_{A^n|X^n}$ is a classical, resp. quantum, correlation, then the p.d.'s $Q_{AX}$ for which the fidelity weight in the r.h.s. of equation (11) is not exponentially small are necessarily close to being of the form $T_X R_{A|X}$ for some classical, resp. quantum, correlation $R_{A|X}$. This was precisely our proof philosophy in the no-signalling case. However, the fact that the classical and quantum conditions are not properties that one can read off from the marginals, contrary to the no-signalling one, seems to be a first obstacle to surmount.
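The reason why a fidelity weight of this kind can be exponentially small is easiest to see in the simplest special case, where the first argument is itself a product distribution. Assuming that $F$ denotes the classical fidelity $F(P, Q) = \sum_z \sqrt{P(z) Q(z)}$ (the Bhattacharyya coefficient), one has $F(P^{\otimes n}, Q^{\otimes n}) = F(P, Q)^n$, which decays exponentially in $n$ as soon as $P \neq Q$. The Python sketch below checks this numerically on a toy pair of distributions; the names `fidelity` and `tensor_power` and the two distributions are illustrative assumptions, and the general case of a non-product $P_{A^n|X^n}$ is of course not covered by this computation.

```python
from itertools import product
from math import prod, sqrt

def fidelity(P, Q):
    """Classical fidelity (Bhattacharyya coefficient) between two p.d.'s given as dicts."""
    return sum(sqrt(p * Q.get(z, 0.0)) for z, p in P.items())

def tensor_power(P, n):
    """n-fold product distribution of P, indexed by n-tuples of outcomes."""
    return {zs: prod(P[z] for z in zs) for zs in product(P, repeat=n)}

# Two toy distributions on a binary alphabet (illustrative assumption).
P = {0: 0.5, 1: 0.5}
Q = {0: 0.9, 1: 0.1}

single_copy = fidelity(P, Q)
for n in (1, 2, 5, 8):
    n_copies = fidelity(tensor_power(P, n), tensor_power(Q, n))
    # The squared n-copy fidelity agrees with the 2n-th power of the single-copy one.
    print(n, n_copies**2, single_copy**(2 * n))
```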
One related legitimate question would be the following: is it possible to make an even stronger statement than the one that, as explained above, we either are looking for (in the classical and quantum cases) or already have (in the no-signalling case)? Namely, could we upper bound $T_X^{\otimes n} P_{A^n|X^n}$ by a de Finetti distribution analogous to that in the r.h.s. of equation (11), but with weight strictly 0 on p.d.'s $Q_{AX}$ which are not of the form $T_X R_{A|X}$, for $R_{A|X}$ belonging to the same class as $P_{A^n|X^n}$? The answer to this question is no. Indeed, such an improved de Finetti reduction would imply a strong parallel repetition result, which we know does not hold (see [2] for a similar discussion). So the best we can hope for is really to show that the fidelity weight in our upper bounding de Finetti distribution is exponentially small on the p.d.'s which are too far from being of the desired form.
Finally, let us briefly comment on the main difference in spirit between the present work and the one by Arnon-Friedman et al. [2]. Our approach consists in using a more "flexible" de Finetti reduction, in which the information on the correlation $P_{A^n|X^n}$ and the p.d. $T_X^{\otimes n}$ of interest is kept in the upper bounding de Finetti distribution, through the fidelity weight $F(T_X^{\otimes n} P_{A^n|X^n}, Q_{AX}^{\otimes n})^2$. In [2], by contrast, any initial correlation is first upper bounded by the same universal de Finetti correlation, on which a test (specifically tailored to the considered game distribution) is performed in a second step; this test has the property of letting pass, resp. rejecting, with high probability the strategies which are no-signalling, resp. too signalling. So it seems in the end that both approaches are quite closely related: in our case, the "signalling test" which is applied to a given p.d. $Q_{AX}$ is nothing else than the maximal fidelity of $Q_{AX}$ to the set of p.d.'s of the form $T_X R_{A|X}$, with $R_{A|X}$ no-signalling, being above or below a certain threshold value. Also, it would be interesting (and potentially fruitful) to investigate whether one could combine in some way the techniques yielding Lemmas 7 and 8, to get a de Finetti reduction result that would have the advantages of both: namely, one that is designed for conditional p.d.'s while at the same time carrying the relevant information on the conditional p.d. it is applied to.