Computing on Anonymous Quantum Network

This paper considers distributed computing on an anonymous quantum network, a network in which no party has a unique identifier and quantum communication and computation are available. It is proved that the leader election problem can exactly (i.e., without error in bounded time) be solved with at most the same complexity up to a constant factor as that of exactly computing symmetric functions (without intermediate measurements for a distributed and superposed input), if the number of parties is given to every party. A corollary of this result is a more efficient quantum leader election algorithm than existing ones: the new quantum algorithm runs in O(n) rounds with bit complexity O(mn^2), on an anonymous quantum network with n parties and m communication links. Another corollary is the first quantum algorithm that exactly computes any computable Boolean function with round complexity O(n) and with smaller bit complexity than that of existing classical algorithms in the worst case over all (computable) Boolean functions and network topologies. More generally, any n-qubit state can be shared with that complexity on an anonymous quantum network with n parties.


Quantum Leader Election
Our first result shows that the leader election problem is not harder than computing symmetric functions on anonymous quantum networks. Let $n$ be the number of parties and let $H_k : \{0,1\}^n \to \{\text{true}, \text{false}\}$ be the function over distributed $n$ bits that is true if and only if the Hamming weight, i.e., the sum, of the $n$ bits is $k$. Let $\mathcal{H}_k$ be any quantum algorithm that exactly computes $H_k$ without intermediate measurements on an anonymous quantum network, and let $Q^{\mathrm{rnd}}(\mathcal{H}_k)$ and $Q^{\mathrm{bit}}(\mathcal{H}_k)$ be the worst-case round and bit complexities of $\mathcal{H}_k$ over all possible quantum states as input.
Theorem 1 If the number $n$ of parties is provided to each party, the leader election problem can exactly be solved in $O(Q^{\mathrm{rnd}}(\mathcal{H}_0) + Q^{\mathrm{rnd}}(\mathcal{H}_1))$ rounds with bit complexity $O(Q^{\mathrm{bit}}(\mathcal{H}_0) + Q^{\mathrm{bit}}(\mathcal{H}_1))$ on an anonymous quantum network of any unknown topology.
This is the first non-trivial characterization of the complexity of leader election relative to computing Boolean functions on anonymous quantum networks. It has no classical counterpart, since, for some network topologies (e.g., rings), symmetric Boolean functions can exactly be computed [19,20,14] but a unique leader cannot exactly be elected [20]. In fact, any symmetric function can exactly be computed on an anonymous classical network of any unknown topology (and thus, on an anonymous quantum network). Therefore, Theorem 1 subsumes the computability result in Ref. [18] that the leader election problem can exactly be solved on an anonymous quantum network. Our second result is that computing $H_1$ is reducible to computing $H_0$.
Theorem 2 If the number $n$ of parties is provided to each party, $H_1$ can exactly be computed without intermediate measurements for any possible quantum states as input in $O(Q^{\mathrm{rnd}}(\mathcal{H}_0))$ rounds with bit complexity $O(n \cdot Q^{\mathrm{bit}}(\mathcal{H}_0))$ on an anonymous quantum network of any unknown topology.
Theorem 1 together with Theorem 2 implies that the complexity of the leader election problem is characterized by that of computing H 0 . This would be helpful in intuitively understanding the hardness of the leader election problem on an anonymous quantum network, since computing H 0 can be interpreted as just a simple problem of checking if all parties have the same value.
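As a concrete reading of the predicates involved, the following minimal Python sketch evaluates $H_k$ on a classical bit string held in one place. The function name `hamming_predicate` is hypothetical, and the sketch ignores the distributed, anonymous and quantum aspects entirely; it only fixes what value an $H_k$-algorithm is supposed to produce.

```python
from typing import Sequence


def hamming_predicate(k: int, bits: Sequence[int]) -> bool:
    """Return H_k(bits): True iff the Hamming weight of `bits` equals k."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("every input must be a bit (0 or 1)")
    return sum(bits) == k


# H_0 asks whether every party holds 0.
all_zero = hamming_predicate(0, [0, 0, 0, 0])
# H_1 asks whether exactly one party holds 1.
exactly_one = hamming_predicate(1, [0, 1, 0, 0])
too_many = hamming_predicate(1, [1, 1, 0, 0])
```

Here `all_zero` and `exactly_one` evaluate to `True`, while `too_many` evaluates to `False`.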
Since Theorem 1 (Theorem 2) is proved by quantumly reducing the leader election problem (resp. computing H 1 ) to computing H 0 and H 1 (resp. computing H 0 ), the theorems provide ways of developing quantum leader election algorithms by plugging in algorithms that compute H 0 (and H 1 ). Since there is a classical algorithm that exactly computes H 0 in O(n) rounds with bit complexity O(mn) for the number m of edges of the underlying graph (e.g., Ref. [14]) and it can be converted into a quantum algorithm with the same complexity up to a constant factor, Theorem 1 together with Theorem 2 yields a quantum leader election algorithm.

Corollary 3 The leader election problem can exactly be solved in $O(n)$ rounds with bit complexity $O(mn^2)$ on an anonymous quantum network of any unknown topology, if the number $n$ of parties is given to every party, where $m$ is the number of edges of the underlying graph.
This leader election algorithm has better round and bit complexity than existing algorithms: the two quantum algorithms given in Ref. [18] have round [bit] complexities of $O(n^2)$ [$O(mn^2)$] and $O(n \log n)$ [$O(mn^4 \log n)$], respectively. The proofs of Theorems 1 and 2 carry over to asynchronous networks in a straightforward manner. Thus, the theorems also hold for asynchronous networks.

Quantum State Sharing
Once a unique leader is elected, it is possible to construct a spanning tree and assign a unique identifier drawn from $\{1, \ldots, n\}$ to each party with the same order of complexity as that of electing a unique leader on anonymous quantum networks. Then, the leader can recognize the underlying graph by gathering along the spanning tree the adjacency matrices of subgraphs with a unique identifier on each node. This implies that, if every party $i$ is given a bit $x_i$ as input, a unique leader (who is elected by the leader election algorithm) can compute any Boolean function $f(x_1, \ldots, x_n)$ that depends on the underlying graph $G$ with node labels $x_i$ (and send the function value to every party along the spanning tree). Here, the index $i$ of each party is introduced just for explanation, and it is not necessarily the same as the identifier assigned by the leader to the party having $x_i$. An example of $f$ is a majority function that is true if and only if the sum over all $x_i$'s is more than $n/2$. Another example is a function that is true if and only if there is a cycle in which each node $i$ has input $x_i = 1$. Similarly, if each party is given a qubit as node label so that the $n$ parties share some $n$-qubit state $\xi$, the leader can generate any quantum state $\rho$ computable from $\xi$ and the underlying graph $G$.
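The step from an elected leader to unique identifiers can be pictured with a centralized sketch: a breadth-first search rooted at the leader yields a spanning tree, and identifiers from $\{1, \ldots, n\}$ are handed out in the order parties are reached. The function name `spanning_tree_with_ids` and the node names in the example are hypothetical and serve only as simulation bookkeeping; in the anonymous setting the parties have no such names, and the distributed construction is not reproduced here.

```python
from collections import deque
from typing import Dict, List, Tuple


def spanning_tree_with_ids(
    adj: Dict[str, List[str]], leader: str
) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Breadth-first search from the leader.

    Returns (ids, parent): ids maps each party to a unique identifier in
    {1, ..., n}; parent maps each non-leader party to its tree parent.
    """
    ids = {leader: 1}
    parent: Dict[str, str] = {}
    queue = deque([leader])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in ids:
                ids[v] = len(ids) + 1   # next unused identifier
                parent[v] = u           # tree edge (u, v)
                queue.append(v)
    return ids, parent


# A 4-cycle; the keys exist only so the simulation can address parties.
ring = {"p": ["q", "s"], "q": ["p", "r"], "r": ["q", "s"], "s": ["r", "p"]}
ids, parent = spanning_tree_with_ids(ring, leader="r")
```

With the tree in hand, the leader can gather inputs along `parent` links and send results back along the same edges.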

Corollary 4
Suppose that every party $i$ is given the number $n$ of parties and a qubit as node label so that the $n$ parties share some $n$-qubit state $\xi$. Let $\rho$ be any $n$-qubit quantum state computable from $\xi$ and the underlying graph. Then, state $\rho$ can exactly be shared among the $n$ parties in $O(n)$ rounds with bit complexity $O(mn^2)$ on an anonymous quantum network, where $m$ is the number of edges of the underlying graph. A special case is a Boolean function $f$ that is determined by the underlying graph in which each node $i$ is labeled with a bit $x_i$. If every party $i$ is given $n$ and $x_i$, function $f$ can exactly be computed in $O(n)$ rounds with bit complexity $O(mn^2)$ on an anonymous quantum network.
This gives the first quantum algorithm that exactly computes any computable Boolean function with round complexity O(n) and with smaller bit complexity than that of existing classical algorithms [20,14,18] in the worst case over all (computable) Boolean functions and network topologies.

GHZ-State Sharing
From the viewpoint of quantum information, our leader election algorithm exactly solves the problem of sharing an $n$-partite W-state (e.g., a state $(|100\rangle + |010\rangle + |001\rangle)/\sqrt{3}$ for the three-party case). As described above, this essentially solves the more general problem of sharing an $n$-qubit state $\rho$. We are then interested in whether a certain non-trivial $\rho$ can be shared with less computational resources than a W-state. Specifically, we focus on the number of distinct quantum gates required to share $\rho$, since, for the leader election problem, all known exact algorithms (including ours) require quantum gates that depend on the number $n$ of parties.
Among non-trivial quantum states other than W -states, an n-partite GHZ state would be one of the most interesting quantum states, since it would be a useful resource for quantum computing and communication.
We give exact quantum algorithms that solve, with a constant-sized gate set, the problem of sharing an $n$-partite GHZ state (or an $n$-partite cat state) $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ with qubits, and the problem of sharing an $n$-partite generalized-GHZ state $(|0\rangle^{\otimes n} + \cdots + |k-1\rangle^{\otimes n})/\sqrt{k}$ with $k$-level qudits for a constant integer $k \ge 2$, among $n$ parties on an anonymous quantum network. We call this problem the GHZ-state sharing problem. Notice that $k$-level qudits are physically realizable [16] and are just qubits for $k = 2$. Let $F_k$ be a function such that $F_k(x_1, \ldots,$ … Let $\mathcal{F}_k$ be any quantum algorithm that exactly computes $F_k$ without intermediate measurements on an anonymous network, and let $Q^{\mathrm{rnd}}(\mathcal{F}_k)$ and $Q^{\mathrm{bit}}(\mathcal{F}_k)$ be the worst-case round and bit complexities, respectively, of $\mathcal{F}_k$ over all possible quantum states as input.

Theorem 5
If every party is given the number $n$ of parties and an integer $k \ge 2$, the GHZ-state sharing problem can exactly be solved on an anonymous quantum network in $O(Q^{\mathrm{rnd}}(\mathcal{F}_k))$ rounds with bit complexity $O(Q^{\mathrm{bit}}(\mathcal{F}_k))$. Moreover, every party uses only a constant-sized gate set to perform all operations for any integer constant $k \ge 2$, if an algorithm $\mathcal{F}_k$ is given as a black box.
For every integer constant $k \ge 2$, there is an algorithm that exactly and reversibly computes $F_k$ for any possible quantum state as input in $O(n)$ rounds with bit complexity $O(mn^4 \log n)$ on an anonymous classical/quantum network of any unknown topology [18]. Therefore, the theorem implies that there exists a quantum algorithm that exactly solves the GHZ-state sharing problem with a constant-sized gate set for any constant $k \ge 2$. For $k = 2$, we have the following corollary.

Corollary 6
The GHZ-state sharing problem with k = 2 can exactly be solved on an anonymous quantum network for any number n of parties with a gate set that can perfectly implement the Hadamard transformation and any classical reversible transformations. In particular, the problem can exactly be solved with either the Shor basis or the gate set consisting of the Hadamard gate, the CNOT gate, and the Toffoli gate.
If many more rounds are allowed, there exists a more bit-efficient algorithm that exactly solves the GHZ-state sharing problem in $O(n^2)$ rounds with bit complexity $O(mn^2)$ by using only a constant-sized gate set for any $n$. The algorithm is obtained by modifying Algorithm I in Ref. [18].
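For reference, the target states of the GHZ-state sharing problem can be written down explicitly on a single machine. The NumPy sketch below builds the generalized-GHZ state vector for $k$-level qudits and, for $k = 2$, checks it against the textbook centralized circuit of one Hadamard gate followed by a chain of CNOT gates. This is not the anonymous-network algorithm of this paper; the helper names (`ghz_state`, `apply_single`, `apply_cnot`, `ghz_by_circuit`) are hypothetical and the circuit assumes one party controls all qubits.

```python
import numpy as np


def ghz_state(n: int, k: int = 2) -> np.ndarray:
    """State vector of (|0>^n + ... + |k-1>^n) / sqrt(k) over n k-level qudits."""
    state = np.zeros(k ** n, dtype=complex)
    for j in range(k):
        index = sum(j * k ** p for p in range(n))  # base-k digits all equal to j
        state[index] = 1 / np.sqrt(k)
    return state


def apply_single(state: np.ndarray, gate: np.ndarray, target: int, n: int) -> np.ndarray:
    """Apply a one-qubit gate to qubit `target` of an n-qubit state vector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [target]))
    psi = np.moveaxis(psi, 0, target)  # put the acted-on axis back in place
    return psi.reshape(-1)


def apply_cnot(state: np.ndarray, control: int, target: int, n: int) -> np.ndarray:
    """Flip qubit `target` on the part of the state where `control` is 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    t_axis = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t_axis).copy()
    return psi.reshape(-1)


def ghz_by_circuit(n: int) -> np.ndarray:
    """Centralized textbook circuit: H on qubit 0, then CNOT(q-1 -> q)."""
    hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1
    state = apply_single(state, hadamard, 0, n)
    for q in range(1, n):
        state = apply_cnot(state, q - 1, q, n)
    return state
```

Calling `ghz_by_circuit(4)` and `ghz_state(4)` gives the same vector, and `ghz_state(3, k=3)` has exactly three nonzero amplitudes of modulus $1/\sqrt{3}$.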

Related Work
Refs. [17,7] have dealt with the leader election and GHZ-state sharing problems in a different setting where pre-shared entanglement is assumed but only classical communication is allowed. The relation between several network models that differ in available quantum resources is discussed in Ref. [8].

Organization
Section 2 describes the network model, and some tools and notations used in the paper. Sections 3 and 4 prove Theorems 1 and 2. Section 5 then gives a quantum leader election algorithm as a corollary of the theorems. Section 6 considers the problems of computing Boolean functions and sharing a quantum state. Section 7 presents a quantum algorithm for the GHZ-state sharing problem.

Distributed Computing
The Network Model: A classical network is composed of multiple parties and bidirectional classical communication links connecting parties. In a quantum network, every party can perform quantum computation and communication, and each adjacent pair of parties has a bidirectional quantum communication link between them (we do not assume any prior shared entanglement). When the parties and links are regarded as nodes and edges, respectively, the topology of the network is expressed by a connected undirected graph. We denote by G n the set of all n-node connected undirected graphs with no multiple edges and no self-loops.
In what follows, we may identify each party/link with its corresponding node/edge in the underlying graph for the system, provided that doing so is not confusing. Every party has ports corresponding one-to-one to communication links incident to the party. Every port of party $l$ has a unique label $i$, $1 \le i \le d_l$, where $d_l$ is the number of parties adjacent to $l$. More formally, the underlying graph $G = (V, E)$ has a port numbering [20], which is a set $\sigma$ of functions $\{\sigma[v] \mid v \in V\}$ such that, for each node $v$ of degree $d_v$, $\sigma[v]$ is a bijection from the set of edges incident to $v$ to $\{1, 2, \ldots, d_v\}$. It is stressed that each function $\sigma[v]$ may be defined independently of any other $\sigma[v']$. In our model, each party knows the number of his ports and the party can appropriately choose one of his ports whenever he transmits or receives a message.
Initially, every party $l$ has local information $I_l$, the information that only party $l$ knows, such as his local state and the number of his adjacent parties, and global information $I_G$, the information shared by all parties (if it exists), such as the number of parties in the system (there may be some information shared by not all parties, but it is not necessary to consider such a situation when defining anonymous networks). Every party $l$ runs the same algorithm, which is given the local and global information, $I_l$ and $I_G$, as its arguments. If all parties have the same local information except for the number of ports they have, the system and the parties in the system are said to be anonymous. For instance, if the underlying graph of an anonymous network is regular, this is essentially equivalent to the situation in which every party has the same identifier (since we can regard the local information $I_l$ of each party $l$ as his identifier). This paper deals with only anonymous networks, but may refer to a party with its index (e.g., party $i$) only for the purpose of simple description.
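The port-numbering condition can be checked mechanically. The sketch below represents an undirected edge as a `frozenset` of its two endpoints and a port numbering as a dictionary of per-node dictionaries; both the representation and the function name `is_port_numbering` are assumptions made for illustration, and the node names are simulation bookkeeping only.

```python
from typing import Dict, FrozenSet, Hashable, Set

Node = Hashable
Edge = FrozenSet[Node]


def is_port_numbering(edges: Set[Edge], sigma: Dict[Node, Dict[Edge, int]]) -> bool:
    """Check that sigma[v] is a bijection from the edges incident to v
    onto {1, ..., d_v} for every node v of the graph."""
    nodes = set()
    for e in edges:
        nodes |= set(e)
    for v in nodes:
        incident = {e for e in edges if v in e}
        ports = sigma.get(v, {})
        if set(ports) != incident:
            return False  # some incident edge lacks a port, or a port is spurious
        if sorted(ports.values()) != list(range(1, len(incident) + 1)):
            return False  # labels are not exactly 1..d_v
    return True


# A triangle on three parties.
ab, bc, ca = frozenset("ab"), frozenset("bc"), frozenset("ca")
triangle = {ab, bc, ca}
good = {"a": {ab: 1, ca: 2}, "b": {ab: 2, bc: 1}, "c": {bc: 1, ca: 2}}
bad = {"a": {ab: 1, ca: 1}, "b": {ab: 2, bc: 1}, "c": {bc: 1, ca: 2}}
```

Note that `good` labels the edge between `a` and `b` as port 1 at `a` but port 2 at `b`, which is allowed because each $\sigma[v]$ is chosen independently.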
A network is either synchronous or asynchronous. In the synchronous case, message passing is performed synchronously. The unit interval of synchronization is called a round. Following the approach in Ref. [15], one round consists of the following two sequential steps, where we assume that two (probabilistic) procedures that generate messages and change local states are defined in the algorithm invoked by each party: (1) each party changes his local state according to a procedure that takes his current local state and the incoming messages as input, and then removes the messages from his ports; (2) each party then prepares messages and decides the ports through which the messages should be sent by using the other procedure that takes his current local state as input, and finally sends the messages via the ports. Notice that, in the quantum setting, the two procedures are physically realizable operators. A network that is not synchronous is asynchronous. In asynchronous networks, the number of rounds required by an algorithm is defined by convention as the length of the longest chains of messages sent during the execution of the algorithm. This paper focuses on the required number of rounds as a complexity measure (called round complexity). This is often used as an approximation of time complexity, which includes the time taken by local operations as well as that taken by message exchanges. Another complexity measure we use is bit complexity, which is the number of bits, including qubits, communicated over all communication links. In this paper, we do not assume any faulty party and communication link.

Leader Election Problem in Anonymous Networks
The leader election problem is formally defined as follows.
Definition 7 (n-party leader election problem (LE n )) Suppose that there is an n-party network whose underlying graph is in G n , and that each party i ∈ {1, 2, . . . , n} in the network has a variable y i initialized to 1. Create the situation in which y k = 1 for a certain k ∈ {1, 2, . . . , n} and y i = 0 for every i in the rest {1, 2, . . . , n} \ {k}.
This paper considers $LE_n$ on an anonymous network (when each party $i$ has his own unique identifier, i.e., $I_i \neq I_j$ for all distinct $i, j \in \{1, \ldots, n\}$, $LE_n$ can deterministically be solved in $\Theta(n)$ rounds in both synchronous and asynchronous cases [15]).
The leader election problem on an anonymous network was first investigated by Angluin [2]. Subsequently, Yamashita and Kameda [20] gave a necessary and sufficient condition on network topologies under which LE n can exactly be solved for given n. Their result implies that LE n cannot exactly be solved for a broad class of graphs, including rings, complete graphs, and certain families of regular graphs. Interested readers should consult Refs. [1,22] and the references in them for detailed information about the leader election problem on anonymous networks.
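The obstruction on rings can be observed in a small simulation of the deterministic classical case only: if every party starts in the same state, runs the same deterministic rule, and the ports are numbered consistently around the ring, then all parties hold identical states after every round, so no single party can ever be singled out. The sketch simplifies the model by letting each party send its whole state through both ports; the names `run_symmetric_ring` and `example_step` are hypothetical, and the update rule is arbitrary.

```python
from typing import Callable, List, Tuple

Step = Callable[[int, Tuple[int, int]], int]


def run_symmetric_ring(n: int, rounds: int, step: Step) -> List[int]:
    """Synchronous rounds on an n-ring with identical parties.

    Port 1 of party i leads to party i+1 and port 2 to party i-1 (mod n);
    the indices are used by the simulator only, never by the parties.
    """
    states = [0] * n
    for _ in range(rounds):
        incoming = [
            (states[(i + 1) % n], states[(i - 1) % n])  # messages on ports 1, 2
            for i in range(n)
        ]
        states = [step(states[i], incoming[i]) for i in range(n)]
    return states


def example_step(state: int, messages: Tuple[int, int]) -> int:
    """An arbitrary deterministic local update rule."""
    return (31 * state + 7 * messages[0] + 3 * messages[1] + 1) % 1009


final_states = run_symmetric_ring(n=6, rounds=25, step=example_step)
```

Every entry of `final_states` is the same value, whatever deterministic `step` is supplied.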

Quantum Computing
We assume that readers have some basic knowledge of quantum computing introduced in standard textbooks [16,11]. The following well-known theorem is called "exact quantum amplitude amplification", which will be used repeatedly.
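Theorem 8 Let $\mathcal{A}$ be any quantum algorithm that searches, without using measurements, for $z \in \{0,1\}^n$ such that $\chi(z) = \text{true}$, and suppose that $|\Psi\rangle = \mathcal{A}|0\rangle^{\otimes n} = \sum_z \alpha_z |z\rangle$. Let $F_\chi(\theta)$ be the operator that multiplies every basis state $|z\rangle$ with $\chi(z) = \text{true}$ by $e^{i\theta}$ and leaves the other basis states unchanged, and let $F_0(\phi)$ be the operator that multiplies the all-zero basis state $|0 \cdots 0\rangle$ by $e^{i\phi}$ and leaves the other basis states unchanged. If the initial success probability $a = \sum_{z : \chi(z) = \text{true}} |\alpha_z|^2$ of $\mathcal{A}$ is exactly known and at least $1/4$, then, up to a global phase,
$$-\mathcal{A} F_0(\phi_a) \mathcal{A}^{-1} F_\chi(\theta_a) |\Psi\rangle = \frac{1}{\sqrt{a}} \sum_{z : \chi(z) = \text{true}} \alpha_z |z\rangle$$
for some values $\phi_a$ and $\theta_a$ ($0 \le \phi_a, \theta_a \le 2\pi$) computable from $a$.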

Notations
A Boolean function $f : \{0,1\}^n \to \{\text{true}, \text{false}\}$ depending on $n$ variables $x_1, \ldots, x_n$ is said to be symmetric if $f(\vec{x})$ is determined by the Hamming weight of $\vec{x} = (x_1, \ldots, x_n)$. We say that $n$ parties exactly compute a Boolean function $f : \{0,1\}^n \to \{\text{true}, \text{false}\}$ if every party $i$ has variables $y_i$ (initialized to "true") and $x_i \in \{0,1\}$ before computation, and sets $y_i$ to $f(\vec{x})$ with certainty after computation. If a quantum algorithm exactly computes $f$ without intermediate measurements on an anonymous quantum network, we say that the algorithm is an $f$-algorithm.
In general, an $f$-algorithm transforms (with ancilla qubits) an input state $|\vec{x}\rangle |\text{"true"}\rangle^{\otimes n} |0\rangle$ into $|\vec{x}\rangle |f(\vec{x})\rangle^{\otimes n} |g_{\vec{x}}\rangle$, where $|g_{\vec{x}}\rangle$ is "garbage" left after computing $f(\vec{x})$. For algorithms over networks with bidirectional communication links, any $f$-algorithm is reversible. Hence we can totally remove the "garbage" by the standard garbage-erasing technique, as the $f$-algorithm exactly and reversibly computes $f$. Putting everything together, we may assume without loss of generality (at the cost of doubling each complexity) that any $f$-algorithm transforms an input state $|\vec{x}\rangle |\text{"true"}\rangle^{\otimes n}$ into $|\vec{x}\rangle |f(\vec{x})\rangle^{\otimes n}$.
Similarly, for the more general function $f : X^n \to Y$ depending on distributed $n$ variables $(x_1, \ldots, x_n)$ with $x_i \in X$, we say that a quantum algorithm is an $f$-algorithm if the algorithm exactly computes $f$ without intermediate measurements on an anonymous quantum network. For an $f$-algorithm $\mathcal{F}$ on an anonymous quantum network with the underlying graph $G \in \mathcal{G}_n$, we denote by $Q^{\mathrm{bit}}_G(\mathcal{F})$ and $Q^{\mathrm{rnd}}_G(\mathcal{F})$ the worst-case bit and round complexities, respectively, of $\mathcal{F}$ over all possible quantum states given as input. For simplicity, we may write $Q^{\mathrm{bit}}(\mathcal{F})$ and $Q^{\mathrm{rnd}}(\mathcal{F})$ if $G$ is clear from context.

Proof of Theorem 1

Basic Idea
Initially, every party is eligible to be the leader and is given the number $n$ of parties as input. Every party flips a coin that gives heads with probability $1/n$ and tails with probability $1 - 1/n$. If exactly one party sees heads, the party becomes a unique leader. The probability of this successful case is given by
$$s(n) = \binom{n}{1} \cdot \frac{1}{n} \cdot \left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} > \frac{1}{e} > \frac{1}{4}.$$
We shall amplify the probability of this case to one by applying the exact quantum amplitude amplification in Theorem 8. To do this, we use an $H_1$-algorithm in a black-box manner to check (in $F_\chi(\theta_{s(n)})$) whether or not a run of the above randomized algorithm results in the successful case, and use an $H_0$-algorithm in a black-box manner to realize the diffusion operator (more strictly, $F_0(\phi_{s(n)})$). In other words, we shall quantumly reduce the leader election problem to computing $H_0$ and $H_1$. In our algorithm, all communication is performed for computing $H_0$, $H_1$ and their inversions. The non-trivial part is how to implement $F_\chi(\theta_{s(n)})$ and $F_0(\phi_{s(n)})$ in a distributed way on an anonymous network, since every party must run the same algorithm.
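The success probability of the coin-flipping step is easy to check with exact rational arithmetic. The sketch below compares the closed form $(1 - 1/n)^{n-1}$ with a brute-force sum over all $2^n$ coin outcomes and confirms that the value stays above $1/4$, which is the precondition of the exact amplitude amplification. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def success_probability(n: int) -> Fraction:
    """Closed form: probability that exactly one of n parties sees heads."""
    return (1 - Fraction(1, n)) ** (n - 1)


def success_probability_bruteforce(n: int) -> Fraction:
    """Sum the probabilities of all outcomes with exactly one head."""
    p_heads = Fraction(1, n)
    total = Fraction(0)
    for outcome in product([0, 1], repeat=n):
        if sum(outcome) == 1:
            prob = Fraction(1)
            for bit in outcome:
                prob *= p_heads if bit else 1 - p_heads
            total += prob
    return total


checks = [
    (n, success_probability(n), success_probability_bruteforce(n))
    for n in range(2, 9)
]
```

For every $n$ in `checks` the two values coincide and exceed $1/4$; for example, $n = 2$ gives $1/2$ and $n = 3$ gives $4/9$.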

The Algorithm
Before describing the algorithm, we introduce the concept of solving and unsolving strings. Suppose that each party $i$ has a bit $x_i$, i.e., the $n$ parties share $n$-bit string $\vec{x} = (x_1, x_2, \ldots, x_n)$. A string $\vec{x}$ is said to be solving if $\vec{x}$ has Hamming weight one. Otherwise, $\vec{x}$ is said to be unsolving. We also say that an $n$-qubit pure state $|\psi\rangle = \sum_{\vec{x} \in \{0,1\}^n} \alpha_{\vec{x}} |\vec{x}\rangle$ shared by the $n$ parties is solving (unsolving) if $\alpha_{\vec{x}} \neq 0$ only for $\vec{x}$ that is solving (unsolving).
Fix an H 0 -algorithm and an H 1 -algorithm, which we are allowed to use in a black-box manner.

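Base algorithm $\mathcal{A}$: Let $A$ be the two-by-two unitary matrix defined by
$$A = \frac{1}{\sqrt{n}} \begin{pmatrix} \sqrt{n-1} & 1 \\ 1 & -\sqrt{n-1} \end{pmatrix}.$$
At the beginning of the algorithm, each party prepares three single-qubit quantum registers R, S, and S′, where the qubit in R is initialized to $|0\rangle$, and the qubits in S and S′ are initialized to $|\text{"true"}\rangle$ (the qubits in S and S′ will be used as ancillary qubits when performing phase-shift operations on the qubit in R). First, each party applies $A$ to the qubit in R to generate the quantum state $|\psi\rangle = A|0\rangle = \sqrt{1 - \frac{1}{n}}\,|0\rangle + \sqrt{\frac{1}{n}}\,|1\rangle$. Equivalently, all $n$ parties share the $n$-qubit quantum state $|\Psi\rangle = |\psi\rangle^{\otimes n}$. Let $S_n = \{\vec{x} \in \{0,1\}^n : \vec{x} \text{ is solving}\}$ be the set of solving strings of length $n$, and let $|\Psi_{\mathrm{solving}}\rangle = \frac{1}{\sqrt{n}} \sum_{\vec{x} \in S_n} |\vec{x}\rangle$ be the quantum state which is the uniform superposition of solving strings of length $n$. Notice that $|\Psi\rangle$ is a superposition of the solving state $|\Psi_{\mathrm{solving}}\rangle$ and some unsolving state $|\Psi_{\mathrm{unsolving}}\rangle$: $|\Psi\rangle = \alpha_{\mathrm{solving}} |\Psi_{\mathrm{solving}}\rangle + \alpha_{\mathrm{unsolving}} |\Psi_{\mathrm{unsolving}}\rangle$.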
Exact amplitude amplification: Now the task for the $n$ parties is to amplify the amplitude of $|\Psi_{\mathrm{solving}}\rangle$ to one via exact amplitude amplification, which involves one run of $-\mathcal{A} F_0(\phi_a) \mathcal{A}^{-1} F_\chi(\theta_a)$ for $\mathcal{A} = A^{\otimes n}$, since the initial success probability is $\alpha_{\mathrm{solving}}^2 > 1/4$. To realize $F_\chi(\theta_{s(n)})$ in a distributed manner, where $\chi(\vec{x}) = 1$ if $\vec{x}$ is solving and $\chi(\vec{x}) = 0$ otherwise, each party wants to multiply the amplitude of any basis state $|\vec{x}\rangle$ for $\chi(\vec{x}) = 1$ by a factor of $e^{i\theta_{s(n)}/n}$, where $s(n) = (1 - 1/n)^{n-1}$. This will multiply the amplitude of the basis state by a factor of $e^{i\theta_{s(n)}}$ as a whole. At this point, however, no party can check if $\chi(\vec{x}) = 1$ for each basis state $|\vec{x}\rangle$, since he knows only the content of his R. Thus, every party runs the $H_1$-algorithm with R and S, which sets the content of S to "true" if the number of 1's among the contents of R's of all parties is exactly one and sets it to "false" otherwise (recall that the $H_1$-algorithm computes $H_1$ for each basis state in a superposition). This operation transforms the state as follows:
$$|\Psi\rangle |\text{"true"}\rangle^{\otimes n} \mapsto \alpha_{\mathrm{solving}} |\Psi_{\mathrm{solving}}\rangle |\text{"true"}\rangle^{\otimes n} + \alpha_{\mathrm{unsolving}} |\Psi_{\mathrm{unsolving}}\rangle |\text{"false"}\rangle^{\otimes n},$$
where the last $n$ qubits are those in S's. Every party then multiplies the amplitude of each basis state by a factor of $e^{i\theta_{s(n)}/n}$ if the content of S is "true" (here, no party measures S; every party just performs the phase-shift operator controlled by the qubit in S). Namely, the state over R's and S's of all parties is transformed into
$$\left(e^{i\theta_{s(n)}/n}\right)^{n} \alpha_{\mathrm{solving}} |\Psi_{\mathrm{solving}}\rangle |\text{"true"}\rangle^{\otimes n} + \alpha_{\mathrm{unsolving}} |\Psi_{\mathrm{unsolving}}\rangle |\text{"false"}\rangle^{\otimes n}.$$
Finally, every party inverts every computation and communication of the $H_1$-algorithm to disentangle S.
The implementation of $F_0(\phi_{s(n)})$ is similar to that of $F_\chi(\theta_{s(n)})$, except that $F_0(\phi_{s(n)})$ multiplies the all-zero basis state $|0 \cdots 0\rangle$ by $e^{i\phi_{s(n)}}$. First, every party runs the $H_0$-algorithm with R and S′, which sets the content of S′ to "true" in the case of the all-zero state, and sets it to "false" otherwise. Next, every party multiplies the amplitude of the all-zero state by a factor of $e^{i\phi_{s(n)}/n}$ if the content of S′ is "true". Finally, every party inverts every computation and communication of the $H_0$-algorithm to disentangle S′.

Figure 1: Algorithm QLE.

More precisely, every party sets his classical variable status to "eligible", and runs Algorithm QLE with status and $n$, given in Figure 1. After the execution of the algorithm, exactly one party has the value "eligible" in status. Since all communication is performed to compute $H_0$ and $H_1$ and their inversions, the algorithm runs in $2(Q^{\mathrm{rnd}}_G(\mathcal{H}_0) + Q^{\mathrm{rnd}}_G(\mathcal{H}_1))$ rounds with bit complexity $2(Q^{\mathrm{bit}}_G(\mathcal{H}_0) + Q^{\mathrm{bit}}_G(\mathcal{H}_1))$ for any graph $G \in \mathcal{G}_n$, where $\mathcal{H}_0$ and $\mathcal{H}_1$ are the $H_0$-algorithm and $H_1$-algorithm, respectively, that we fixed. This completes the proof of Theorem 1.
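The effect of one exact amplification step on the coin-flip state can be checked numerically for small $n$ with a centralized state-vector simulation. The sketch below is not the distributed algorithm: it applies the phase on weight-one basis states directly and uses the identity $\mathcal{A} F_0(\phi) \mathcal{A}^{-1} = I + (e^{i\phi} - 1)|\Psi\rangle\langle\Psi|$ for $|\Psi\rangle = \mathcal{A}|0 \cdots 0\rangle$. The phase choice $\theta = \phi = \arccos(1 - 1/(2a))$ is an assumption of this sketch (one valid solution for $1/4 \le a \le 1$, obtained by requiring the unsolving component to vanish), not necessarily the values used by the authors; the function names are hypothetical.

```python
import numpy as np


def exact_phase(a: float) -> float:
    """One valid choice theta = phi for exact amplification (assumed here)."""
    if not 0.25 <= a <= 1.0:
        raise ValueError("exact one-step amplification needs 1/4 <= a <= 1")
    return float(np.arccos(1.0 - 1.0 / (2.0 * a)))


def amplify_leader_election(n: int):
    """Return (final_state, solving_mask, a) after one amplification step."""
    single = np.array([np.sqrt(1 - 1 / n), np.sqrt(1 / n)], dtype=complex)
    psi = np.array([1.0 + 0j])
    for _ in range(n):
        psi = np.kron(psi, single)          # |Psi> = (A|0>)^{tensor n}

    weights = np.array([bin(x).count("1") for x in range(2 ** n)])
    solving = weights == 1                   # basis states of Hamming weight one
    a = float(np.sum(np.abs(psi[solving]) ** 2))

    phase = np.exp(1j * exact_phase(a))
    state = psi.copy()
    state[solving] *= phase                  # F_chi(theta)
    # A F_0(phi) A^{-1} = I + (e^{i phi} - 1) |Psi><Psi|
    state = state + (phase - 1) * np.vdot(psi, state) * psi
    return -state, solving, a


final_state, solving_mask, a_init = amplify_leader_election(4)
```

After the step, the amplitudes outside `solving_mask` vanish up to rounding and every solving basis state has modulus $1/\sqrt{n}$, i.e., the state is an $n$-partite W-state up to a global phase.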

Proof of Theorem 2
The proof consists of the following two steps:
• Reduce computing $H_1$ to computing $H_0$ and the consistency function $C_S$, where $C_S$ is a Boolean function that is true if and only if a subset (specified by $S$) of all parties has the same classical value (its formal definition will be given later).
• Reduce computing $C_S$ to computing $H_0$.
Actually, the second step is almost trivial. We start with the first step.

Basic Idea
Suppose that every party $i$ is given a Boolean variable $x_i$. We can probabilistically compute $H_1(\vec{x})$ with the following classical algorithm, where $\vec{x} = (x_1, \ldots, x_n)$: every party $i$ with $x_i = 1$ sets a variable $r_i$ to 0 or 1 each with probability 1/2 and sends $r_i$ to all parties (by taking $\delta$ rounds for the diameter $\delta$ of the underlying graph); every party $i$ with $x_i = 0$ sets variable $r_i$ to "$*$" and sends $r_i$ to all parties. It is not difficult to see that the following three hold: (i) if $|\vec{x}| = 0$, every party receives only "$*$", (ii) if $|\vec{x}| = 1$, either no party receives "1" or no party receives "0", and (iii) if $|\vec{x}| = t \ge 2$, every party receives both "0" and "1" with probability $1 - 2/2^t$. Therefore, every party can conclude that $H_1(\vec{x})$ is "true" if and only if exactly one of "0" and "1" is received, and this conclusion is wrong only when $|\vec{x}| \ge 2$ and all the chosen $r_i$'s happen to coincide.
Roughly speaking, our quantum algorithm for computing $H_1$ is obtained by first quantizing this probabilistic algorithm and then applying the exact quantum amplitude amplification to boost the success probability to one. More concretely, we amplify the probability $p$ that there are both 0 and 1 among all $r_i$'s by using the exact amplitude amplification. Let $p_{\mathrm{init}}$ and $p_{\mathrm{final}}$ be the values of $p$ before and after, respectively, applying the amplitude amplification. Obviously, if $p_{\mathrm{init}} = 0$, then $p_{\mathrm{final}} = 0$ also. Hence, for $|\vec{x}| \le 1$, $p_{\mathrm{final}} = 0$.
For | x| ≥ 2, p could be boosted to one if the exact value of p init were known to every party. However, p init is determined by t, the value of which may be harder to compute than to just decide whether t = 1 or not. Therefore, instead of actual t, we run the amplitude amplification for each t ′ := 2, . . . , n, a guess of t, in parallel. We can then observe that exactly one of the (n − 1) runs boosts p to one if and only if | x| ≥ 2.
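The initial probability that both 0 and 1 appear among the random bits of the $t$ parties with input 1 can be confirmed by exhaustive enumeration. The sketch below counts the favourable assignments exactly and compares the result with $1 - 2/2^t$; the function name `prob_both_values` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_both_values(t: int) -> Fraction:
    """Probability that t independent fair bits contain both a 0 and a 1."""
    if t == 0:
        return Fraction(0)  # nobody draws a bit, so neither value appears
    hits = sum(1 for r in product([0, 1], repeat=t) if 0 in r and 1 in r)
    return Fraction(hits, 2 ** t)


table = {t: prob_both_values(t) for t in range(0, 9)}
```

The entries for $t = 0$ and $t = 1$ are zero, matching the observation that the amplified probability stays zero when the Hamming weight is at most one, and for $t \ge 2$ the entries equal $1 - 2/2^t$, each of which corresponds to a different guess in the parallel runs.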

Terminology
Suppose that each party $i$ has a bit $x_i$, i.e., the $n$ parties share $n$-bit string $\vec{x} = (x_1, x_2, \ldots, x_n)$. For convenience, we may consider that each $x_i$ expresses an integer, and identify string $x_i$ with the integer it expresses. For an index set $S \subseteq \{1, \ldots, n\}$, string $\vec{x}$ is said to be consistent over $S$ if $x_i$ is equal to $x_j$ for all $i, j$ in $S$. Otherwise $\vec{x}$ is said to be inconsistent over $S$. Here, index set $S$ is used just for the definition (recall that no party has an index or identifier in the anonymous setting). Formally, we assume that every party has a variable $z \in \{\text{"marked"}, \text{"unmarked"}\}$, and $S$ is defined as the set of all parties with $z = \text{"marked"}$. If $S$ is the empty set, any $\vec{x}$ is said to be consistent over $S$. We also say that an $n$-qubit pure state $|\psi\rangle = \sum_{\vec{x} \in \{0,1\}^n} \alpha_{\vec{x}} |\vec{x}\rangle = \sum_{\vec{x} \in \{0,1\}^n} \alpha_{\vec{x}} |x_1\rangle \otimes \cdots \otimes |x_n\rangle$ shared by the $n$ parties is consistent (inconsistent) over $S$ if $\alpha_{\vec{x}} \neq 0$ only for $\vec{x}$'s that are consistent (inconsistent) over $S$ (there are pure states that are neither consistent nor inconsistent over $S$, but we do not need to define such states). We next define the consistency function $C_S : \{0,1\}^n \to \{\text{"consistent"}, \text{"inconsistent"}\}$, which decides if a given string $\vec{x} \in \{0,1\}^n$ distributed over $n$ parties is consistent over $S$. Namely, $C_S(\vec{x})$ returns "consistent" if $\vec{x}$ is consistent over $S$ and "inconsistent" otherwise.
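On a classical string, the consistency function is a one-line check. In the sketch below the set $S$ is given implicitly through each party's marking variable, as in the definition; the function name `consistency` and the list-based representation are assumptions made for illustration.

```python
from typing import Sequence


def consistency(x: Sequence[int], z: Sequence[str]) -> str:
    """Return C_S(x), where S is the set of positions whose z is "marked"."""
    if len(x) != len(z):
        raise ValueError("x and z must describe the same parties")
    values = {xi for xi, zi in zip(x, z) if zi == "marked"}
    # Zero or one distinct value among marked parties means consistent;
    # in particular the empty set S is always consistent.
    return "consistent" if len(values) <= 1 else "inconsistent"


same = consistency([1, 0, 1, 1], ["marked", "unmarked", "marked", "marked"])
mixed = consistency([1, 0, 1, 1], ["marked", "marked", "unmarked", "unmarked"])
empty = consistency([1, 0, 1, 1], ["unmarked"] * 4)
```

Here `same` and `empty` are "consistent", while `mixed` is "inconsistent".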

The H 1 -Algorithm
As in the previous section, we fix an H 0 -algorithm and a C S -algorithm, which we use in a black-box manner. At the beginning of the algorithm, every party prepares two one-qubit registers X and Y. We shall describe an H 1 -algorithm that exactly computes function H 1 over the contents of X's and sets the content of each Y to the function value. Here, we assume that registers Y's are initialized to |"true" for an orthonormal basis {|"true" , |"false" } of C 2 . We basically follow the idea in Section 4.1 to reduce computing H 1 to computing the binary-valued functions H 0 and C S . However, the idea actually represents a three-valued function, i.e., distinguishes among three cases: | x| = 0, | x| = 1, and | x| ≥ 2. Thus, we cast the idea into two yes-no tests. Namely, the algorithm first tests if | x| is 0 or not. If | x| = 0, then it concludes H 1 ( x) = "false". The algorithm then performs another test to decide if | x| ≤ 1 or | x| ≥ 2, which determines H 1 ( x).

First Test
To test if $|\vec{x}| = 0$, each party prepares a single-qubit register $S_0$, the content of which is initialized to $|\text{"true"}\rangle$. Each party then performs the $H_0$-algorithm to exactly compute the value of $H_0$ over the contents of X's, and stores the computed value in each $S_0$.
From the definition of the $H_0$-algorithm, this transforms the state in X's and $S_0$'s as follows: by rearranging registers, $|\vec{x}\rangle |\text{"true"}\rangle^{\otimes n} \mapsto |\vec{x}\rangle |H_0(\vec{x})\rangle^{\otimes n}$. If the content of $S_0$ is "true", then the content of Y will be set to "false" later (because this means $|\vec{x}| = 0$).

Second Test
Next each party tests if $|\vec{x}| \le 1$ or $|\vec{x}| \ge 2$ with certainty. Recall the probabilistic algorithm in which every party $i$ sets a variable $r_i$ to 0 or 1 each with probability 1/2 if $x_i = 1$ and sets variable $r_i$ to "$*$" if $x_i = 0$, and then sends $r_i$ to all parties. Our goal is to amplify the probability $p$ that there are both 0 and 1 among all $r_i$'s by using the exact amplitude amplification. The difficulty is that no party knows the value of $p_{\mathrm{init}}$ ($= 1 - 2/2^{|\vec{x}|}$). The test thus uses a guess $t$ of $|\vec{x}|$ and tries to amplify $p$ assuming that $p_{\mathrm{init}} = 1 - 2/2^t$. If $t = |\vec{x}|$, then the procedure obviously outputs the correct answer with probability one. If $t \neq |\vec{x}|$, the procedure may output the wrong answer. As will be proved later, however, we can decide if $|\vec{x}| \le 1$ or $|\vec{x}| \ge 2$ without error from the outputs of $(n-1)$ runs of the test for $t = 2, \ldots, n$, which are performed in parallel.
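We now describe the test procedure for each $t$. Assume that one-qubit register $Z_t$ is initialized to $|\text{"unmarked"}\rangle$ and one-qubit register $R_t$ to $|0\rangle$. The initial state is thus, for any fixed $\vec{x}$,
$$\bigotimes_{i=1}^{n} \left( |x_i\rangle \, |\text{"unmarked"}\rangle \, |0\rangle \right),$$
where registers Y and $S_0$ are omitted to avoid complication.
The base algorithm $\mathcal{A}$ (to be amplified) is described as follows. If the content of X is 1, the party flips the content of $Z_t$ to "marked", where $\{|\text{"marked"}\rangle, |\text{"unmarked"}\rangle\}$ is an orthonormal basis in $\mathbb{C}^2$. This operation just copies the contents of X to those of $Z_t$ (in the different orthonormal basis) for parallel use over all $t$. The state is thus, for any fixed $\vec{x}$,
$$\bigotimes_{i=1}^{n} \left( |x_i\rangle \, |z_t(x_i)\rangle \, |0\rangle \right),$$
where $|z_t(x_i)\rangle \in \{|\text{"marked"}\rangle, |\text{"unmarked"}\rangle\}$ is the content of $Z_t$ when the content of X is $x_i$, and $S$ is the set of the parties whose $Z_t$ is in the state $|\text{"marked"}\rangle$ (note that $|S| = |\vec{x}|$).
If the content of $Z_t$ is "marked", apply the Hadamard operator $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ to the qubit in $R_t$ to create $(|0\rangle + |1\rangle)/\sqrt{2}$ (note that register $R_t$ of each party $i$ is the quantum equivalent of $r_i$). The state is now represented as, for the $\vec{x}$,
$$\bigotimes_{i \in S} \left( |x_i\rangle \, |z_t(x_i)\rangle \, \frac{|0\rangle + |1\rangle}{\sqrt{2}} \right) \otimes \bigotimes_{i \notin S} \left( |x_i\rangle \, |z_t(x_i)\rangle \, |0\rangle \right).$$

By rearranging registers, we have
$$|\vec{x}\rangle \, |z_t(\vec{x})\rangle \, |\psi_t(\vec{x})\rangle,$$
where $|z_t(\vec{x})\rangle$ is the $n$-tensor product of $|\text{"marked"}\rangle$ or $|\text{"unmarked"}\rangle$ corresponding to $\vec{x}$, and
$$|\psi_t(\vec{x})\rangle = \frac{1}{\sqrt{2^{|S|}}} \sum_{\vec{y} \in \{0,1\}^{|S|}} |\vec{y}\rangle \, |0\rangle^{\otimes (n - |S|)}.$$
This is the end of the base algorithm $\mathcal{A}$.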
We then boost the amplitudes of the basis states superposed in $|\psi_t(\vec{x})\rangle$ such that there are both $|0\rangle$ and $|1\rangle$ in $R_t$'s of parties in $S$, i.e., the amplitudes of the states that are inconsistent over $S$, with amplitude amplification. Here, function $\chi$ in Theorem 8 is the consistency function $C_S$ and $a(t) = 1 - 2\left(\frac{1}{2}\right)^{t}$ is used as the success probability $a$. For convenience, we express $|\psi_t(\vec{x})\rangle$ as …
To realize $F_\chi(\theta_{a(t)})$, every party prepares a single-qubit register $S_t$ initialized to $|\text{"consistent"}\rangle$ and then performs the next operations: (1) Perform a $C_S$-algorithm with $R_t$, $S_t$ and $Z_t$, which computes $C_S$ for each basis state $|\vec{y}\rangle |0\rangle^{\otimes (n - |S|)}$ of $|\psi_t(\vec{x})\rangle$ and sets the content of $S_t$ to the value of $C_S$; (2) Multiply the amplitude of each basis state of $R_t$ by a factor of $\exp\left(i \theta_{a(t)} / n\right)$ if the content of $S_t$ is "inconsistent"; (3) Finally invert every computation and communication of (1) to disentangle $S_t$. The state evolves with the above operations as follows: … We have now finished the first operation, $F_\chi(\theta_{a(t)})$, of $-\mathcal{A} F_0(\phi_{a(t)}) \mathcal{A}^{-1} F_\chi(\theta_{a(t)})$.
Then $\mathcal{A}^{-1}$ is performed. Operation $F_0(\phi_{a(t)})$ can be realized with the $H_0$-algorithm in the same way as in Algorithm QLE in the previous section. Finally, perform operation $\mathcal{A}$ again. This is the end of the amplitude amplification. In summary, the state over $Z_t$'s and $R_t$'s is transformed as follows: … where $|\psi'_t(\vec{x})\rangle$ is expressed as in the following claim.

Final Evaluation
After the first test and the second tests for t = 2, . . . , n, the state is now Recall that every party has registers Y, $S_0$, and $S''_t$ for t = 2, . . . , n. In the final step of our algorithm for computing $H_1$, every party concludes the value of $H_1(\vec{x})$ from the contents of $S_0$ and the $S''_t$'s as follows:
• If either the content of $S_0$ is "true" or the content of $S''_t$ is "inconsistent" for some t ∈ {2, . . . , n}, then every party sets the content of Y to "false".
• Otherwise, every party sets the content of Y to "true".
It is not difficult to show the correctness. If the content of $S_0$ is "true", then the value of $H_1(\vec{x})$ is obviously "false" (because $|\vec{x}| = 0$). Suppose that the content of $S_0$ is "false", i.e., $|S| \neq 0$. From the definition of $|\Psi_t(\vec{x})\rangle$, we can observe the following facts: (1) If $|\vec{x}| = |S| = 1$, then the contents of $S''_t$ are "consistent" for all t = 2, . . . , n. (2) If $|\vec{x}| = |S| \geq 2$, then the content of $S''_t$ is "inconsistent" for some t ∈ {2, . . . , n}. A more precise description of our algorithm is given in Figure 2.
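The decision rule of the final evaluation can be illustrated with a small classical sketch. It only mirrors the logic on classical values; in the algorithm itself the rule is applied to register contents without intermediate measurements. The function name `evaluate_h1` and its argument layout are hypothetical and chosen here for illustration.

```python
def evaluate_h1(s0_is_true, second_test_results):
    """Classical sketch of the final decision rule for H_1.

    s0_is_true: result of the first test (True iff H_0 holds, i.e. |x| = 0).
    second_test_results: list of "consistent"/"inconsistent" strings,
        one entry per second test t = 2, ..., n.
    Returns the value written to Y (True for "true", False for "false").
    """
    if s0_is_true:
        # Hamming weight is 0, so it cannot be 1.
        return False
    if any(r == "inconsistent" for r in second_test_results):
        # Some test detected at least two marked parties.
        return False
    # Weight is nonzero and no test found two or more marked parties.
    return True
```

For example, `evaluate_h1(False, ["consistent", "consistent"])` corresponds to the case $|S| = 1$ with n = 3.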

Lemma 9
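For any graph $G \in \mathcal{G}_n$, if every party knows the number n of parties, there is an $H_1$-algorithm that runs in $O(Q^{\mathrm{rnd}}_G(\mathcal{H}_0) + Q^{\mathrm{rnd}}_G(\mathcal{C}_S))$ rounds with bit complexity $O(n(Q^{\mathrm{bit}}_G(\mathcal{H}_0) + Q^{\mathrm{bit}}_G(\mathcal{C}_S)))$, where $\mathcal{H}_0$ and $\mathcal{C}_S$ are any $H_0$-algorithm and any $C_S$-algorithm, respectively.
Proof. The correctness follows from the above description of the algorithm. For the complexity, all communication takes place in computing $H_0$ and then computing $C_S$ for t = 2, . . . , n in parallel. Because the n − 1 second tests run in parallel, the round complexity grows only by a constant factor, whereas the bit complexity grows by a factor of O(n). Therefore, the lemma follows.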

Computing C S with Any H 0 -Algorithm
We now show that computing C S is reducible to computing H 0 .

Lemma 10 For any graph
Proof. Function $C_S$ can be computed by first computing in parallel $H_0$ and $H_{|S|}$ over the input bits of the parties associated with S, and then computing the OR of them. To compute $H_0$ over the |S| bits with any $H_0$-algorithm over n bits, every party i with $i \notin S$ sets his input to 0, and all parties then run the $H_0$-algorithm. Similarly, (the negation of) $H_{|S|}$ over the |S| bits can be computed except that every party i with $i \in S$ negates his/her input.
Lemmas 9 and 10 imply that, for any graph $G \in \mathcal{G}_n$, there is an $H_1$-algorithm that runs in $O(Q^{\mathrm{rnd}}_G(\mathcal{H}_0))$ rounds with bit complexity $O(n \cdot Q^{\mathrm{bit}}_G(\mathcal{H}_0))$. This completes the proof of Theorem 2. Theorem 2 can easily be generalized to the case where only an upper bound N of n is given to every party. Suppose that we are given an $H_0$-algorithm that works for a given upper bound N of n. The proof of Lemma 10 then implies that there exists a $C_S$-algorithm that works even if only an upper bound N is given. We can thus construct an $H_1$-algorithm that works for the upper bound N, by performing the first test and then the second tests for t = 2, . . . , N in parallel.
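The reduction in this proof can be sketched classically as follows. The function `h0` is a stand-in for an arbitrary $H_0$-algorithm (here a plain local computation, which is an assumption made only for illustration), and `consistency` and its arguments are hypothetical names. Parties outside S feed 0 into both runs, and parties inside S negate their bit in the second run.

```python
def h0(bits):
    """Stand-in for any H_0-algorithm: True iff all n input bits are 0."""
    return all(b == 0 for b in bits)


def consistency(bits, in_s):
    """Return "consistent" iff the bits of the parties in S are all equal.

    bits: list of n input bits, one per party.
    in_s: list of n booleans, in_s[i] is True iff party i belongs to S.
    """
    # H_0 restricted to S: parties outside S replace their input by 0.
    all_zero = h0([b if s else 0 for b, s in zip(bits, in_s)])
    # H_|S| restricted to S: parties in S negate their bit,
    # parties outside S again use 0, and H_0 is run on the result.
    all_one = h0([1 - b if s else 0 for b, s in zip(bits, in_s)])
    # The bits over S are consistent iff they are all 0 or all 1.
    return "consistent" if (all_zero or all_one) else "inconsistent"
```

Both runs use the same n-party $H_0$ routine, which is why the cost of computing $C_S$ stays within a constant factor of the cost of computing $H_0$.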

Theorem 11 If only an upper bound N of the number n of parties is provided to each party, function $H_1$ can exactly be computed without intermediate measurements for any possible quantum state as input in $O(Q^{\mathrm{rnd}}(\mathcal{H}_0))$ rounds with bit complexity $O(N \cdot Q^{\mathrm{bit}}(\mathcal{H}_0))$ on an anonymous quantum network of any unknown topology.

Improved Algorithm for LE n
As an application of Theorems 1 and 2, we present a quantum algorithm that exactly solves $LE_n$ with lower round complexity than the existing algorithms while keeping the best bit complexity.
Proof of Corollary 3. We first give a simple $H_0$-algorithm in order to apply Theorems 1 and 2. The algorithm is a straightforward quantization of the following deterministic algorithm: Every party sends his input bit to each adjacent party (and keeps a copy of the bit for himself). Every party then computes the OR of all the bits he received and the bit he kept, and sends the resulting bit to each adjacent party (again keeping a copy for himself). By repeating this procedure ∆ times for an upper bound ∆ of the network diameter, every party learns the OR of all bits and thus the value of $H_0$. This classical algorithm can easily be converted to a quantum equivalent with the same complexity (up to a constant factor).
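The deterministic algorithm described above can be sketched as a round-by-round simulation. This is only the classical procedure, not its quantization. The node labels in `adjacency` exist solely for the simulation (the parties themselves are anonymous), and the names `flood_h0`, `adjacency`, `inputs`, and `delta` are hypothetical.

```python
def flood_h0(adjacency, inputs, delta):
    """Simulate synchronous OR-flooding and return each party's H_0 value.

    adjacency: dict mapping a node to the list of its neighbours.
    inputs: dict mapping a node to its input bit (0 or 1).
    delta: an upper bound on the network diameter (number of rounds).
    """
    state = dict(inputs)
    for _ in range(delta):
        new_state = {}
        for v in adjacency:
            # OR of the bit kept by v and the bits received from neighbours.
            bit = state[v]
            for u in adjacency[v]:
                bit |= state[u]
            new_state[v] = bit
        state = new_state
    # H_0 is true iff the OR of all input bits is 0.
    return {v: state[v] == 0 for v in state}
```

On a path with four nodes and a single 1 at one end, three rounds are needed before the far end sees the 1, which is why ∆ must be at least the diameter.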
Thus, we have proved the following claim. Corollary 3 improves the complexity of the existing quantum algorithms for $LE_n$ in Ref. [18]. For particular classes of graphs, it is known that $H_1$ can be computed as efficiently as $H_0$. In this case, a direct application of Theorem 1 gives a better bound. For a ring network, both $H_0$ and $H_1$ can be computed in O(n) rounds with bit complexity $O(n^2)$.
More generally, Kranakis et al. [14] developed a random-walk-based classical algorithm that efficiently computes any symmetric function if the stochastic matrix P of the random walk on the underlying graph augmented with self-loops has a small second largest eigenvalue (in absolute value), i.e., a large eigenvalue gap. By using this algorithm to compute $H_0$ and $H_1$, Theorem 1 yields an efficient algorithm for the graphs with a large eigenvalue gap.

Corollary 12
Let $G \in \mathcal{G}_n$ and let $G'$ be the graph G with self-loops added to each node. Let λ be the second largest eigenvalue (in absolute value) of the stochastic matrix P associated with $G'$. There is an algorithm that exactly elects a unique leader in $O\left(-\frac{\log n}{\log \lambda}\right)$ rounds with bit complexity $O\left(-\frac{m}{\log \lambda}(\log n)^2\right)$ on an anonymous quantum network with the underlying graph G, where m is the number of edges of G.
In particular, a unique leader can exactly be elected in $O(n^{2/d} \log n)$ rounds with bit complexity $O(n^{1+2/d} \log n)$ for an anonymous quantum d-dimensional torus for any integer constant d ≥ 2, since $-1/\log \lambda \in O(n^{2/d})$.
We next consider a more general setting, in which only an upper bound N of n is given to each party. In this case, our algorithm can be modified so that it attains round complexity linear in N. The algorithm, however, has a larger bit complexity than $O(mN^2)$, which is attainable by an existing algorithm.
Proof. Theorem 11 and Claim 2 imply that there exist an $H_0$-algorithm and an $H_1$-algorithm that work even if only an upper bound N of n is given to each party.
Since Theorem 1 depends on the high success probability of the base randomized algorithm (i.e., the algorithm in which every party flips a coin that gives heads with probability 1/n), the reduction works only if N = n. We thus modify the reduction in Theorem 1 as follows: (1) We attempt the quantum reduction in Theorem 1 for every guess $n'$ of n in parallel, where $n' = 2, \ldots, N$. (2) Each attempt is followed by performing the $H_1$-algorithm to verify that a unique leader is elected. Observe that for at least one of $n' = 2, \ldots, N$, a unique leader is elected, which is correctly verified by Step (2) due to Theorem 11. Therefore, the round complexity is O(N) and the bit complexity is $O(mN^3)$.

Computing Boolean Functions
Once a unique leader is elected, a spanning tree can be constructed by starting at the leader and traversing the underlying graph (e.g., in a depth first manner) and the leader can assign a unique identifier to every party by traversing the tree. Moreover, if a unique leader exists, the underlying graph is recognizable, i.e., every party can know the adjacency matrix of the graph, as shown in Lemma 14. Hence, it is possible to compute a wider class of Boolean functions than symmetric functions, i.e., all Boolean functions that may depend on the graph topology (but are independent of the way of assigning unique identifiers to parties). We call such functions computable functions.

Lemma 14
Once a unique leader is elected on an anonymous quantum network of any topology, the underlying graph can be recognized in O(n) rounds with $O(n^3)$ bit complexity.
Proof. Once a unique leader has been elected, the following procedure can recognize the underlying graph. First, construct a spanning tree in O(n) rounds with O(m) bit complexity by traversing the graph, where m is the number of edges of the underlying graph. Second, assign a unique identifier to each party in O(n) rounds with bit complexity O(n log n) by traversing the spanning tree starting at the leader (the first and second steps can be merged, but we describe them separately for simplicity). Finally, gather at the leader the information about which parties are adjacent to each party by conveying adjacency matrices along the spanning tree as follows: Each party communicates with each adjacent party to learn the identifier of the adjacent party in one round with O(m log n) bit complexity. Next, each leaf node i prepares an n-by-n adjacency matrix with all entries being zero, puts 1 in the entries (i, j) of the matrix for all adjacent parties j, and then sends the matrix to its parent node in the tree with $O(n^2)$ bit complexity. Every internal node k of the tree merges all received matrices, puts 1 in the entries (k, j) for all adjacent parties j, and then sends the resulting matrix to its parent node. Finally, the leader obtains the adjacency matrix of the underlying graph, and he then broadcasts the matrix along the tree. These gathering and broadcasting steps take O(n) rounds with bit complexity $O(n^3)$.
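The gathering step of this proof can be sketched classically as a bottom-up merge along the spanning tree. The sketch assumes identifiers 0, ..., n−1 have already been assigned and the tree is given by a parent map; the names `gather_adjacency`, `neighbours`, and `parent` are hypothetical, and the recursion replaces the actual message passing.

```python
def gather_adjacency(n, neighbours, parent, leader):
    """Return the adjacency matrix that the leader obtains.

    n: number of parties, with identifiers 0, ..., n-1 already assigned.
    neighbours: dict mapping an identifier to the set of adjacent identifiers.
    parent: dict mapping an identifier to its parent in the spanning tree
        (the leader maps to None).
    leader: identifier of the leader (root of the tree).
    """
    children = {v: [] for v in range(n)}
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)

    def collect(k):
        # Node k starts from an all-zero n-by-n matrix and records its row.
        matrix = [[0] * n for _ in range(n)]
        for j in neighbours[k]:
            matrix[k][j] = 1
        # Merge the matrices received from the children (entrywise OR).
        for c in children[k]:
            received = collect(c)
            for a in range(n):
                for b in range(n):
                    matrix[a][b] |= received[a][b]
        return matrix  # this is what k sends to its parent

    return collect(leader)
```

Each message is an n-by-n bit matrix, and one such message crosses each of the n − 1 tree edges, which matches the $O(n^3)$ bound on the bit complexity of the gathering step.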
We now give a proof of Corollary 4.
Proof of Corollary 4. Once a unique leader is elected and the underlying graph is recognized, it is sufficient for the leader to gather the input bit of every party, together with that party's identifier of O(log n) bits, along the spanning tree. This input gathering can be done in O(n) rounds with bit complexity $O(n^2 \log n)$. Thus, together with Corollary 3 and Lemma 14, any computable Boolean function can be computed in O(n) rounds with bit complexity $O(mn^2)$, where m is the number of edges of the underlying graph. More generally, suppose that every party i has a qubit so that the n parties share some n-qubit state ξ, and let ρ be any n-qubit quantum state computable from ξ and the underlying graph. Then, by replacing an input bit with an input qubit for each party in the above proof for the classical case, the leader can gather the n qubits to have ξ in his local space. Now the leader can locally generate ρ from ξ, and send back the corresponding qubit to each party to share ρ, again along the spanning tree, in O(n) rounds with $O(n^2 \log n)$ bit complexity. This completes the proof of Corollary 4.

GHZ-State Sharing Problem
In this section, we prove Theorem 5 by reducing the GHZ-state sharing problem to computing function $F_k$, where $F_k$ is a function such that $F_k(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i \pmod{k}$ for distributed inputs $x_i \in \{0, \ldots, k-1\}$. Hereafter, we assume the existence of an $F_k$-algorithm. The basic idea can be well understood by considering the case of k = 2.

Basic Case (k = 2)
The algorithm consists of two phases. The first phase runs two attempts of the same procedure in parallel, each of which lets all parties share either $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ or $(|0\rangle^{\otimes n} - |1\rangle^{\otimes n})/\sqrt{2}$, each with probability 1/2. If the parties share at least one copy of $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ after the first phase, they succeed. If the parties share two copies of $(|0\rangle^{\otimes n} - |1\rangle^{\otimes n})/\sqrt{2}$, the second phase distills the state $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ from them with classical communication and partial measurements. A more detailed description is as follows.
Let i ∈ {1, 2} be the index of each attempt of the procedure performed in the first phase. The first phase performs the following procedure for each i (notice that function F 2 is equivalent to the parity of distributed n bits).
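1. Every party prepares two single-qubit registers $R_i$ and $S_i$ initialized to $|0\rangle$.
2. Every party applies the Hadamard operator H to the qubit in $R_i$.
3. All parties collaborate to compute the parity (i.e., the sum modulo 2) of the contents of $R_i$ of all parties and store the result into $S_i$ of each party.
4. Every party measures $S_i$ in the basis $\{|0\rangle, |1\rangle\}$ and applies H to $R_i$.

If the state over all $R_i$'s is $(|0\rangle^{\otimes n} + |1\rangle^{\otimes n})/\sqrt{2}$ for at least one of i = 1, 2, we are done; otherwise, we go on to the second phase. Observe that the state over all $R_1$'s and $R_2$'s is
$$\left(|0\rangle^{\otimes n} - |1\rangle^{\otimes n}\right)_{R_1}\left(|0\rangle^{\otimes n} - |1\rangle^{\otimes n}\right)_{R_2} = |0\rangle^{\otimes n}|0\rangle^{\otimes n} - |0\rangle^{\otimes n}|1\rangle^{\otimes n} - |1\rangle^{\otimes n}|0\rangle^{\otimes n} + |1\rangle^{\otimes n}|1\rangle^{\otimes n},$$
where we omit normalization coefficients. If every party locally computes the parity of the contents of $R_1$ and $R_2$ and measures the result, the entire state will be either $|0\rangle^{\otimes n}|0\rangle^{\otimes n} + |1\rangle^{\otimes n}|1\rangle^{\otimes n}$ or $|0\rangle^{\otimes n}|1\rangle^{\otimes n} + |1\rangle^{\otimes n}|0\rangle^{\otimes n}$ (up to a global phase). It is easy to see that the state $|0\rangle^{\otimes n} + |1\rangle^{\otimes n}$ can be obtained from either of these states by applying a CNOT to $R_2$ using $R_1$ as control (all $R_2$'s are then disentangled). If we use a quantum simulation of a classical algorithm that deterministically computes the parity of distributed n bits (e.g., view-based algorithms [20,14,18]), our algorithm uses only a constant-sized gate set.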
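One attempt of the first phase can be checked numerically for small n with a plain state-vector simulation. The sketch below tracks only the $R_i$ registers and models the measurement of $S_i$ by projecting onto basis states with the observed parity; it is a centralized simulation for illustration, not a distributed implementation, and the function names are hypothetical.

```python
import math


def hadamard_all(state, n):
    """Apply H to each of the n qubits of a state vector of length 2**n."""
    state = list(state)
    for q in range(n):
        bit = 1 << q
        for idx in range(len(state)):
            if not idx & bit:
                a, b = state[idx], state[idx | bit]
                state[idx] = (a + b) / math.sqrt(2)
                state[idx | bit] = (a - b) / math.sqrt(2)
    return state


def first_phase_attempt(n, outcome):
    """State over the R registers after steps 1-4, given parity `outcome`."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0                      # step 1: all R registers are |0>
    state = hadamard_all(state, n)      # step 2: H on every R register
    # Steps 3-4: measuring S keeps the basis states whose parity is `outcome`.
    kept = [amp if bin(idx).count("1") % 2 == outcome else 0.0
            for idx, amp in enumerate(state)]
    norm = math.sqrt(sum(a * a for a in kept))
    kept = [a / norm for a in kept]
    return hadamard_all(kept, n)        # step 4: H on every R register
```

In this simulation, outcome 0 leaves amplitudes $1/\sqrt{2}$ on $|0\rangle^{\otimes n}$ and $|1\rangle^{\otimes n}$, and outcome 1 flips the sign of the amplitude on $|1\rangle^{\otimes n}$.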

General Case (k > 2)
In the following, we assume that k-level qudits are available for simplicity (the algorithm can easily be carried over to the case where only qubits may be used). Any pure state of a k-level qudit can be represented as $\sum_{i=0}^{k-1} \alpha_i |i\rangle$ with complex numbers $\alpha_i$ such that $\sum_{i=0}^{k-1} |\alpha_i|^2 = 1$ (for k = 2, this is just a qubit). Our algorithm uses the following operator $W_k$ over one k-level qudit, instead of H used in the case of k = 2: For $x \in \{0, \ldots, k-1\}$,
$$W_k |x\rangle = \frac{1}{\sqrt{k}} \sum_{y=0}^{k-1} \omega_k^{xy} |y\rangle,$$
where $\omega_k = e^{2\pi i/k}$. In what follows, we denote
$$\mathrm{CAT}_k(t) = \frac{1}{\sqrt{k}} \sum_{x=0}^{k-1} \omega_k^{tx} |x\rangle^{\otimes n}.$$

First Phase
The purpose of the first phase is to share k states drawn from the set $\{\mathrm{CAT}_k(t) : t \in \{0, \ldots, k-1\}\}$.
The operations, which are similar to those in the case of k = 2, are described as follows.
2. Apply $W_k$ to the qudit in $R_i$, which maps the state $|0\rangle^{\otimes n}$ into $(W_k|0\rangle)^{\otimes n} = \frac{1}{\sqrt{k^n}} \sum_{y=0}^{k^n-1} |y\rangle$.
3. Run an $F_k$-algorithm to compute the value of $F_k(y) := \sum_{j=1}^{n} y_j \pmod{k}$, where $y_j$ is the content of $R_i$ of the jth party, and store the result into a single-qudit register $S_i$.

Apply
The following lemma implies that, for each i ∈ {1, . . . , k}, the state of the $R_i$'s after the first phase is $\mathrm{CAT}_k(-s_i \bmod k)$. If $s_i = 0$ for some i, we are done. Otherwise, the parties perform the second phase (described later) to distill the state $\mathrm{CAT}_k(0)$ from the k states shared by all parties.
The proof is given in Appendix.
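The claim that the first phase yields $\mathrm{CAT}_k(-s_i \bmod k)$ can be checked numerically for small n and k. The sketch below rests on three assumptions that are not spelled out in the surviving text: $W_k$ is the Fourier transform $|x\rangle \mapsto \frac{1}{\sqrt{k}}\sum_y \omega_k^{xy}|y\rangle$, $\mathrm{CAT}_k(t) = \frac{1}{\sqrt{k}}\sum_x \omega_k^{tx}|x\rangle^{\otimes n}$, and the last step of the first phase measures $S_i$ (obtaining $s_i$) and applies $W_k^\dagger$ to each $R_i$. All function names are hypothetical.

```python
import cmath
import math


def qudit_transform(state, n, k, sign):
    """Apply W_k (sign=+1) or its inverse (sign=-1) to each of n qudits.

    state: dict mapping n-tuples over range(k) to complex amplitudes.
    """
    omega = cmath.exp(sign * 2j * math.pi / k)
    for pos in range(n):
        new_state = {}
        for basis, amp in state.items():
            for y in range(k):
                target = basis[:pos] + (y,) + basis[pos + 1:]
                phase = omega ** (basis[pos] * y)
                new_state[target] = new_state.get(target, 0) + amp * phase / math.sqrt(k)
        state = new_state
    return state


def first_phase(n, k, s):
    """State over the R registers when the measured value of F_k is s."""
    state = {(0,) * n: 1.0 + 0j}
    state = qudit_transform(state, n, k, +1)         # W_k on every qudit
    # Measuring S keeps the basis states whose digit sum is s modulo k.
    kept = {b: a for b, a in state.items() if sum(b) % k == s}
    norm = math.sqrt(sum(abs(a) ** 2 for a in kept.values()))
    kept = {b: a / norm for b, a in kept.items()}
    return qudit_transform(kept, n, k, -1)           # assumed: inverse of W_k


def cat_state(n, k, t):
    """CAT_k(t) under the definition assumed in the lead-in."""
    omega = cmath.exp(2j * math.pi / k)
    return {(x,) * n: omega ** (t * x) / math.sqrt(k) for x in range(k)}
```

Under these assumptions, `first_phase(3, 3, 1)` agrees with `cat_state(3, 3, 2)` up to rounding error, since −1 mod 3 = 2.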

Second Phase
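Suppose that, after the first phase, all parties share k states, $\mathrm{CAT}_k(-s_i \bmod k)$ with $s_i \neq 0$ for i = 1, . . . , k. Then, there must be two distinct integers l, m ∈ {1, . . . , k} with $s_l = s_m$, since the k values $s_1, \ldots, s_k$ all lie in {1, . . . , k − 1}. We can distill the state $\mathrm{CAT}_k(0)$ from the states $\mathrm{CAT}_k(-s_l \bmod k)$ and $\mathrm{CAT}_k(-s_m \bmod k)$ as follows. Suppose that n parties share two copies of $\mathrm{CAT}_k(t)$ for any t ∈ {1, . . . , k − 1} in their quantum registers $R_1$'s and $R_2$'s. Namely, the state over all $R_1$'s and $R_2$'s is
$$\frac{1}{k} \sum_{x=0}^{k-1} \sum_{y=0}^{k-1} \omega_k^{t(x+y)} |x\rangle^{\otimes n}_{R_1} |y\rangle^{\otimes n}_{R_2}.$$
By rearranging the registers, the state is
$$\frac{1}{k} \sum_{r=0}^{k-1} \sum_{x=0}^{k-1} \omega_k^{t \cdot r} \left( |x\rangle_{R_1} |r - x \ (\mathrm{mod}\ k)\rangle_{R_2} \right)^{\otimes n}.$$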
Every party then performs the following operations. |x ⊗n R 1 .

Proof of Theorem 5
The correctness of the algorithm follows from the above description of the algorithm. Communication occurs only when computing $F_k$ (in the first phase). Thus, the algorithm works in $O(Q^{\mathrm{rnd}}(\mathcal{F}_k))$ rounds with bit complexity $O(Q^{\mathrm{bit}}(\mathcal{F}_k))$, where $\mathcal{F}_k$ is the given $F_k$-algorithm. Apart from the given $F_k$-algorithm, the algorithm uses only the operators $W_k$ and $W_k^\dagger$ and the operators for computing classical functions (such as addition modulo k), all of which are independent of n. Therefore, the algorithm can be implemented with a gate set whose size is finite and independent of n if an $F_k$-algorithm is given.
We next calculate $\alpha_y$ for each y. If α Suppose that $\alpha^{(1)}_y = \omega_k^{q}$ for some number q not prime to k. Let g be the greatest common divisor (GCD) of q and k. Since $\alpha^{(1)}_y = e^{2\pi i \,(q/g)/(k/g)}$ is a (k/g)th root of 1, we have $\sum_{j=0}^{k/g-1} (\alpha^{(1)}_y)^j = 0$. Therefore, Hence, only the basis states $|y\rangle$ such that α Thus, the lemma for t = 0 follows from eq. (1).
We now consider the case of t > 0. Suppose that