Computing the Degenerate Ground Space of Gapped Spin Chains in Polynomial Time

Given a gapped Hamiltonian of a spin chain, we give a polynomial-time algorithm for finding the degenerate ground space projector. The output is an orthonormal set of matrix product states that approximate the true ground space projector up to an inverse polynomial error in any Schatten norm, with a runtime exponential in the degeneracy. Our algorithm is an extension of the recent algorithm of Landau, Vazirani, and Vidick for the nondegenerate case, and it includes the recent improvements due to Huang. The main new idea is to incorporate the local distinguishability of ground states on the half-chain to ensure that the algorithm returns a complete set of global ground states.

Strongly correlated quantum systems are at the heart of such diverse physical phenomena as frustrated magnets, high-temperature superconductors, and topological quantum phases, to name but a few, and understanding their low-temperature behaviour is a grand challenge at the interface of physics and computer science. The computational tools to study these systems numerically are however inevitably stymied by the exponential growth of the Hilbert space of an n-particle system.
Many numerical tools have been developed to surmount the computational difficulties surrounding the exponential growth of the state space. Tools like quantum Monte Carlo [3,4], coupled cluster methods [5], and the density-matrix renormalisation group (DMRG) [6] all exploit some underlying physical structure to perform heuristically efficient simulations of interacting quantum systems. However, like all heuristics they often lack basic theoretical guarantees except in certain special cases, and some of them have failure modes that are difficult to diagnose or characterise. This makes it hard to precisely determine which physical features enable efficient simulation and which features might be responsible in the cases where the heuristic fails. Very recently, a success story has emerged in the rigorous understanding of gapped 1D spin chains. Hastings' proof of the 1D area law [7] established that approximating the ground state of a gapped spin chain is in NP. The proof technique crucially used the idea of matrix product states (MPS) [8], an ansatz for quantum systems that exploits the area law structure of entanglement [9]. The first rigorous nontrivial algorithm for finding ground states, due independently to Schuch and Cirac [10] and Aharonov, Arad, and Irani [11], used dynamic programming to find MPS ground states. The runtime of the algorithm was exponential in the bond dimension, and since Hastings' result used a polynomial bond dimension to achieve a sufficiently accurate approximation, the runtime could only be guaranteed to be exponential for general gapped 1D systems. The first subexponentialtime algorithm was found by Arad, Kitaev, Landau, and Vazirani [12]. They used the same algorithm as Refs. [10,11], but showed that the ground state could be approximated to sufficient accuracy with an MPS of bond dimension n o (1) , where n is the length of the chain. This line of research culminated in the breakthrough algorithm of Landau, Vazirani, and Vidick (LVV) [1], who gave the first polynomial-time algorithm for approximating the ground state in the case of a non-degenerate ground space. This was subsequently improved by Huang [2] who gave a deterministic algorithm with an exponentially improved scaling with respect to the gap that also applied to highly frustrated systems.
In this paper, we extend the LVV algorithm and its subsequent improvements by Huang to the case where the Hamiltonian has a degenerate ground space. To give a precise statement of our results, we first introduce some notation.
Consider a 1D chain of d-level quantum systems (qudits) with d = O(1) and a Hamiltonian H that is a sum of local terms on contiguous sites, and let n be the number of sites in the chain. After sufficient coarse-graining, shifting the energy, and dividing by the largest operator norm from the terms in the Hamiltonian, we can take the Hamiltonian to be of the form H = n−1 i=1 H i , where 0 ≤ H i ≤ 1 and H i acts non-trivially only on the i and i + 1 qudits. We additionally require that this transformed Hamiltonian has a constant lower bound ǫ 1 − ǫ 0 ≥ ǫ ≥ Ω(1) on the spectral gap, i.e. the difference between the first excited energy (ǫ 1 ) and the ground state energy (ǫ 0 ) of H is bounded from below by ǫ independent of n. Our Hamiltonians will have a degenerate spectrum in the ground space with a degeneracy g = O(1). We say that such a Hamiltonian is in standard form, though this is not a unique normal form because of the ambiguity in the coarse-graining step.
Our main result is an approximation of the ground space in the form of an orthonormal basis of matrix product states that approximate the true ground space projector. More precisely, we have the following.
Theorem 1 (Main result). Given an n-qudit Hamiltonian H in standard form and constant bounds ǫ and g on the gap and degeneracy respectively, then for every sufficiently small approximation error η satisfying 1/η = n O(1) , Algorithm 1 constructs the orthonormal approximate ground states {|γ j } j in matrix product form for 1 ≤ j ≤ g in runtime n O (1) . The approximation error gives a bound in the Frobenius norm of where G is the true ground space projector, and Υ := g j=1 |γ j γ j | is the approximate ground space projector.
The proof of this statement is given in Section 2.7, and a more precise dependence of the runtime on the gap ǫ and the ground space degeneracy g is made explicit in Section 2.8. Note that since both G and Υ have rank g, we can get an approximation error in the trace norm at the cost of a factor of √ 2g = O(1) increase in error. Our result relies heavily on the ideas in LVV [1] and the improvements of Huang [2], so let us recall briefly how this algorithm proceeds. The LVV algorithm proceeds by constructing a viable set, which roughly speaking is a small set of matrix product states on a half-chain that is guaranteed to contain the local Schmidt vectors of an approximate ground state on the full chain. In each step of the algorithm, the half-chain of the current viable set is extended by one site to create a new viable set. This initial extension is too large, but trimming away some states with too high an energy yields another suitably small set that is still viable.
The main new idea is an extension of the size trimming step to degenerate systems, based on the concept of local distinguishability of ground states. The algorithm works by iteratively sweeping across the chain, and finds the degenerate ground states one-by-one in a manner similar to that of LVV on each iteration. As we run these iterations, we must ensure that the viable set of potential ground states on the half-chain contains a witness that not only has low energy, but also does not simply reduce to something in the span of the previously found ground states. To ensure this, we break the analysis into two cases of high-and low-distinguishability.
In the low-distinguishability case, the Schmidt vectors on the left half-chain of the previously found ground states and the desired witness have large overlap. By recycling these Schmidt vectors, we therefore get a viable set for this new witness for free.
In the high-distinguishability case however, this recycling would not provide a viable set. In this case we need to trim the viable set in a way that not only ensures the existence of a low-energy witness, but a witness which has low overlap with the existing states. If the relevant Schmidt vectors of the desired witness and the existing states have low overlap, then we can optimise for a state of low energy and overlap efficiently.

Preliminaries
Let us begin by giving a more precise statement of the LVV algorithm, with the definitions suitably extended to our treatment of the degenerate case.
Given the Hamiltonian H = n−1 j=1 H j , we will often consider a partition of the system into the first i spins and the rest, we will call this the i|i + 1 cut. The Hamiltonian on a contiguous subset of spins {j, . . . , k} is denoted H [j,k] = k−1 i=j H i . In this notation the local terms can also be expressed as H i = H [i,i+1] , and the full Hamiltonian as H = H [1,n] . In the special case that an index i is clear from context, we define H L = H [1,i] [1,n] , and similarly for the other possible subscripts. It will sometimes be convenient to subtract a constant energy shift from a Hamiltonian, in which case we define H ′ L = H L − ǫ L , where ǫ L is the least eigenvalue of H L , and similarly for H ′ R and ǫ R . When a particular state ρ is clear from context, we refer to Tr(ρH L ) as the left energy of ρ, and similarly for the right energy.
A viable set is simply a set of vectors that can be used to construct the Schmidt vectors of a new approximate ground state on a half-chain, assuming we already have approximate ground states |γ 1 ,. . . ,|γ h−1 . The aim is to inductively grow the region of viability to the entire system starting from one end of the chain. The span of such a set gives the desired small domain on which candidate ground states can be found efficiently.
• The states in S are supported on the first i ≤ n qudits of the system, i.e. S ⊂ H [1,i] .
• The cardinality is bounded |S| ≤ s.
• Each state in S is represented by an MPS with bond dimension at most b.
• There exists a witness state |v which has, depending on context, either -(δ case) overlap with the ground space projector G satisfying G |v ≥ 1 − δ, or -(∆ case) the expected energy satisfying v|H|v ≤ ǫ 0 + ∆ǫ.
• The reduced density operator of this witness must be supported on Span(S).
When it is clear from the context (and it almost always will be clear), we will drop the explicit h dependence in the definition of viable set and refer to a (i, s, b, δ)-viable set, and similarly for ∆.
The two error parameters specified, δ or ∆, will be used contextually and we will sometimes refer to them as the overlap error and the energy error, respectively. The two error parameters however can be used interchangeably, though at the cost of some overhead which we would like to avoid. The proof is simple and can be found in [1,2].
For later use, we define the following absolute constants c and c ′ that are independent of all other parameters, and a simple function f of the ground state degeneracy, where we note that 200 4 √ c = c ′ . These values can be used to make all of the constants hidden by the big-O notation in our derivations and procedures completely explicit. However, our final result still relies on certain implicit estimates (in particular, the estimates in Refs. [13,14]) so we are not able to give a completely explicit scaling.

The Algorithm
Algorithm 1 works by first running the slight modifications on the existing nondegenerate ground state approximation algorithms of Ref. [1,2], which is discussed in more detail in Section 3.1.1. As discussed earlier, the extended algorithm works by iteratively finding ground states one-by-one, with each guaranteed to be orthogonal to those previously found until all g ground states have been found. It is not necessary to know a priori a specific value for g, but unless g = O(1) holds, then there is no guarantee from the area law alone that all the ground states can be approximated well by the algorithm. Moreover, if the algorithm is iterated more times than the true degeneracy of the system, then it will necessarily produce high energy states which will herald this over-estimation, as we will show in Appendix A. As such we will henceforth assume that the degeneracy g is known. The Step/FinalStep procedures will be discussed below in Section 2.1, and its four sub-procedures will follow in Sections 2.2-2.5. Finally the ApproxGroundState procedure will be discussed in Section 2.6. Algorithm 1 Ground space approximation 1: procedure DegenerateGSA(H, g, ǫ, η)

Step
The Step procedure, shown above in Algorithm 2, takes a viable set defined on i−1 qudits, and constructs a new set which is viable on i qudits. As well as physically extending the set, this procedure can be efficiently performed such that neither the cardinality nor the error grow. Starting from the trivially (0,1,1,0)-viable set {1}, the Step procedure allows you to inductively construct (i,s,b,∆ = cǫ 6 /f 4 )-viable sets S h,i , for all i. Here the parameters s and b are all polynomial functions of n and are given in Table 1. On the final qudit the Step procedure is tweaked in the form of FinalStep, which performs a more powerful error reduction procedure to bring the error down to the desired final level. By constructing a viable set on the entire system, a polynomial-sized subspace has been constructed on which the Hamiltonian may be efficiently optimised, allowing the Red indicates the desired parameter change in each sub-step, grey the weaker of the two error parameters as given by Lemma 3. Here p is some absolute polynomial p = n O(1) and the remaining polynomials scale as p 0 , p 1 , p 2 , q = n a where a = O 1 + ǫ −1 log η −1 / log n . The polynomials s := pp 1 + pq + q and b := pp 2 denote the fixed points of the cardinality and bond dimension under the Step procedure.
previously inaccessible witness state to be harvested, yielding the final desired ground state approximation. This final optimisation is performed in ApproxGroundState, which follows the Step-based viable set construction.

Extension
By induction, suppose we start with a (i − 1,s,b,∆ = cǫ 6 /f 4 )-viable set S i−1 . The first step is to take a viable set on i − 1 qudits and extend it trivially to a viable set on i qudits. To do this we take the element-wise tensor product of the existing viable set with a basis on the ith qudit. It is trivial to see that the set resulting from Extension, S h,i , is (i,ds,b,∆ = cǫ 6 /f 4 )-viable; see Ref. [1] for the explicit argument. We simply note that the net result of this step is to increase the cardinality of the viable set from s to ds, while keeping all other parameters fixed.

Size trimming
For the size-trimming we are going to use two nets to compensate for our ignorance of two properties of the ground states: the local energy differences between them, and their entanglement structure across the i|i + 1 cut. The former is represented by the difference in the expectation value of H L between our desired witness and (say) the first ground state found. The latter is represented by the boundary contraction (defined below) of the witness state in question. In both cases we are going to construct a net on the space of possible values of these unknowns, and perform the necessary optimisations in the neighbourhoods of these elements, thus allowing for error guarantees in spite of this ignorance.

Boundary contraction
Viable sets, like reduced density matrices, capture local information about a given state. As defined, viable sets are solely determined by the left Schmidt vectors of a witness state; the reduced density operator however also contains the Schmidt coefficients across the cut. As such, merely finding vectors of low energy on the half-chain [1, i] in no way guarantees consistency with a low-energy global state, as they might not be consistent with the entanglement structure across the i|i + 1 cut. As well as this the system may be frustrated, with a globally low-energy state appearing locally excited. Both of these problems are circumvented by performing the desired optimisations alongside boundary contraction constraints.
Given a state of Schmidt rank B and Schmidt decomposition across the i|i + 1 cut given by |v = B j=1 λ j |a j |b j , let U v : C B → H R be the partial isometry specified by U v |j = |b j . Then we have the following.
Definition 4 (Boundary Contraction [1]). The left state of |v with respect to the i|i + 1 cut is defined as The boundary contraction of |v is defined as Note that the density matrix cont(v) is supported on H i ⊗ C B . The interpretation of a boundary contraction is the following. The difference in right energies between a candidate state and a valid witness state for the full chain is bounded by how close the boundary contractions of the states are. By extending a state by the as-yet-unknown witness state, we can argue the existence of a low energy solution within our extended viable set.
Proof. We give a sketch of the proof and refer the reader to Ref. [1] for more detail. Begin with Tr(σ ′ H) = Tr (σ ′ + |v v| − |v v|)H for some ground state |v . Expanding H = H L + H i + H R and using the matrix Hölder inequality together with the bound H i ≤ 1 yields the proof.

Nets
Recall that an ǫ-net on a metric space is a set of points such that any point in the space is within distance ǫ of a point in the net. We will use two nets in our argument: one on the interval [−1, 1 + η] in the standard absolute value metric and one on the set of matrices with bounded max-norm on C d ⊗ C B in the trace norm.
We refer to this net as the energy net. The choice of limits [−1, 1 + η] is justified in Appendix C. Let N η be the matrices on C d ⊗ C B with entries whose real and imaginary parts are in the set The cardinality of N η is upper bounded by (2 ⌈Bd/η⌉ + 1) 4Bd . For any positive semidefinite matrix ρ on C d ⊗ C B with trace at most 1, there exists a point X ∈ N η such that ρ − X 1 ≤ η. We refer to this net as the boundary contraction net.

Procedure
Let L h,i be the set of left Schmidt vectors with respect to the i|i + 1 cut of the states |γ 1 , . . . , |γ h . Furthermore define Π h,i to be the projector onto Span(L h,i ). Now define parameters δ = 8 √ cǫ 3 /f 2 , η = 4c ′ ǫ/f , and ξ to be chosen later in Eq. (7).
Trim: For each boundary contraction X ∈ N ξ/2 with bond dimension B δ (to be defined below in Eq. (3)) and left-energy difference Y ∈ M η , take σ to be a density matrix supported on Span S We will specifically choose σ to be a solution to the degenerate size-trimming convex program: where Tr H L (σ − |γ 1 γ 1 |) ≤ Y, Now consider the set of vectors {|v k } k consisting of all the eigenvectors of σ with eigenvalues at least 10 −9 /g. Now define where |v k,j is the jth left Schmidt vector of |v k across the i|B cut.
The next Claim, whose proof is postponed until the next section, establishes that the viability parameters after each Trim step remain bounded as advertised.
Claim 6 (Proof in Section 3.2). The set resulting from Trim, S

Bond truncation
First, we recall some results on low-rank approximate ground states [12,14,15]. Let us define a function Specifically we will take B δ to be defined by the following lemma, a corollary to the area law [7,12,14].
Lemma 7 ( [12,14]). For any ground state |Γ and bipartite cut of the system in question, there exists a state |γ with Schmidt rank across that cut bounded by B δ such that | Γ|γ | ≥ 1−δ. Moreover, there exists an MPS approximation |γ ′ with bond dimension across every cut of at most B δ/n and for which | Γ|γ ′ | ≥ 1 − δ.
If we know the Schmidt decomposition of a given state across a bipartite cut, then the Eckart-Young theorem [16] guarantees that the truncation of that state to Schmidt-rank D is the best approximation with said Schmidt rank.
Definition 8 (Truncation). Consider a state |v with Schmidt decomposition |v = j λ j |a j |b j across the i|i + 1 cut, where any degeneracy in the Schmidt coefficients is resolved arbitrarily, and with non-increasing λ j . Given an integer D, define the truncation of |v by This approximation is optimal in the following sense [16]. For any state |v , and any integer D, the truncated vector |v ′ := Trunc D |v / Trunc D |v satisfies v|v ′ ≥ | v|w | for any |w of Schmidt rank at most D.
Combining these results, we get a precise statement that low-energy states can also be approximated faithfully with low-rank approximations. See Ref. [2] for a proof.
Building on these lemmas, we can define the bond truncation procedure Truncate for our algorithm as follows.
Truncate: Take each vector within S (2) h,i , and truncate all bonds on 1, . . . , i − 1 to P := 800nB 1/800n = n 1+o(1) . Take S As before, the following Claim establishes that the Truncate step preserves the viability with the advertised parameters.
Claim 10 (Proof in Section 3.3). The set resulting from Truncate, S

Error reduction
Next step in the algorithm is to push the error level back down. To do this we need to construct operators called approximate ground state projectors (AGSPs). One convenient choice of AGSP would be of the form where x is a parameter defined later. We cannot however construct such an operator.
Firstly it would require us to exactly know the ground state energy. Secondly it would require exponential time to construct and specify. We already have a ground state approximation |γ 1 which we got from the nondegenerate algorithm with energy at most ǫ 0 +η 2 ǫ/4f . As such, if we define ǫ ′ 0 := γ 1 |H|γ 1 then for sufficiently small error η we have In order to construct an approximation of A by a more efficient operator K, we define where U B (t) is an approximation of exp(−iHt) with Schmidt rank B across every cut, the construction and analysis of which is given in Ref. [13]. The approximation between A and K is broken into three distinct approximations. The first is that the integral on (−∞, ∞) is truncated to an integral on [−T, T ], with an exponentially small error in T . The second is that this integral is approximated by a Riemann sum, specifically the rectangle rule 1 with a discrete step-size of τ . The third is that exp(−iHt) is approximated by a unitary U B (t) that has low Schmidt rank. All three errors are considered in Section 3.4. If we take parameters in Eq. 4 to have the scaling then this approximate AGSP is capable of lowering the energy error down to ∆ = ζ. As such we will henceforth refer to this as a ζ-approximate AGSP. With the definition of K in hand, we can give a precise specification of Reduce: Reduce: Decompose a ζ-approximate AGSP with ζ = cǫ 6 /f 4 as K = j A j ⊗ B j . Then return the set given by applying this approximate AGSP, combined with the recycled Schmidt vectors of the previous approximate ground states.
As we will prove in Section 3.4, the Reduce step can lower the error of a viable set from δ = 1/5 down to ∆ = ζ, increasing both the bond dimension and cardinality by a factor of Therefore the above construction with ζ = cǫ 6 /f 4 can send a (i,p 1 + q,p 2 ,δ = 1/5)-viable set to a (i,pp 1 + pq + q,pp 2 ,cǫ 6 /f 4 )-viable set where This forms our next Claim.

Final error reduction
As for the final step, we now want to reduce the error further than was previously necessary. We do this by simply constructing a stronger version of the previously used approximate AGSP.
FinalReduce: Construct a η 2 /4f -approximate AGSP K. Return the set constructed by applying this AGSP, combined with the previous approximate ground states, S h,n := K |s |s ∈ S Claim 12 (Proof in Section 3.4). The set resulting from FinalReduce, S h,n , is (n,

Ground state approximation
Once we have S h,n , a viable set on the entire system, we then need to extract the witness state. By construction, a viable set provides a domain on which known optimisation techniques can be applied to efficiently find approximate ground states. We summarise this with the last procedure.
ApproxGroundState: Take σ to be a density matrix supported on Span(S h,n ), found as the solution to the convex program min Tr(Hσ h ) where γ j |σ h |γ j = 0 for 1 ≤ j < h , Return |γ h , a leading eigenvector of σ h .
The objective functions and constraints can both be evaluated efficiently, firstly because our witness states are MPS with small bond dimension, and the Hamiltonian is a matrix product operator (MPO) with small bond dimension, and also because the domain of the optimisation is polynomially large. Since the program takes the form of a convex optimisation, these two properties are sufficient to allow the program to be solved efficiently. By optimality of the above program, the energy of σ h must be at most that of the witness, ǫ 0 + η 2 ǫ/4f . If we take η ≤ 1/3 and using Lemma 15 we get that the leading eigenvector, |γ h , has an energy at most ǫ 0 + η 2 ǫ/4g, and ground space overlap G |γ h ≥ 1 − η 2 /4g.

Proof of Theorem 1
We can now turn to our proof of the main result Theorem 1. The proof is conditional on Claims 6, 10, 11, and 12, whose proof is given below in Section 3.
Proof. The Claims 6, 10, 11, and 12 establish that in each iteration of Algorithm 1 a viable set with energy error at most η 2 /4f is produced, and Lemma 15 establishes that this yields an orthonormal set of vectors such that γ j | G |γ j ≥ 1 − η 2 /2g for all j = 1, . . . , g. Define Υ := g j=1 |γ j γ j |, and compute This completes the proof of the main result conditional on the Claims.

Runtime
As all the parameters given in Table 1 are polynomials in n, and all operations required to execute Algorithm 1 (convex optimisation, inner product of MPS etc.) can be performed at polynomial overhead, the overall runtime is also polynomial in n. The leading order term in n gives a runtime of T = n a where a = O 1 + log η −1 / log n , which reduces to In Ref. [22], an improved analysis of the FinalReduce step (which we will discuss in Section 3.4.2) is presented. The scaling the polynomials p 0 , p 1 , and p 2 are all improved to q = n O(1) in the η −1 = n O(1) case. This means that the corresponding run-time reduces to T = n O(1) .
One thing to note is that this leading order scaling is independent of the degeneracy. If we now focus on the degeneracy scaling, fixing n and allowing g to vary, we find that the dominant term now comes from the cardinality of the boundary contraction net. As such the leading-order scaling becomes 2 b where b = gÕ (1/ǫ) ; this is also the leading order scaling in the inverse-gap.

Preliminaries
To prove the claims in the previous section, we first state some simple results that are easy to prove. These results were also used in Refs. [1,2], and we state them here without proof.
Lemma 14 (Orthogonalising lemma). Suppose you have two vectors |u and |v , such that u|H|u , v|H|v ≤ ǫ 0 + ∆ǫ, with bounded overlap | u|v | ≤ β. Then there exist orthogonal vectors Proof. The proof is constructive, take the two vectors to be As |v ′ ∝ (I − |u u|) |v this vector is clearly orthogonal to |u ′ . As for the energy bound, define After reintroducing the ǫ 0 term in H we get the final energy bound.
If we apply the same argument to the projector orthogonal to the ground space, the same (1 + β)/(1 − β) growth can also be shown to occur in the overlap error. The Schmidt rank of the resulting orthogonalised vectors will in general be the sum of the two original vectors' Schmidt ranks. If we can bound the overlap between a vector and any member of a vector space, then that vector can be similarly orthogonalised with respect to that whole space at the same cost.
As expectation values are bilinear in pure states, the optimisations in the Trim and ApproxGroundState are both performed over mixed states such that their objective functions are linear. To aid in the analysis of these optimisations, we will next show that the leading eigenvector of a low energy mixed state must be a low energy pure state.
Proof. First decompose the mixed state as σ = k>0 λ k |v k v k |, where λ k ≥ 0. By the normalisation of σ k we have λ k = 1. Next define the set of indices K as By comparing the energy bound of the mixed state as a whole as well as those on indices in K, we get which in turn gives k∈K λ k ≥ 2g/(2g + 1), analogous to a Markov inequality. As we required ∆ ≤ 1/3g, each vector corresponding to a index in K has an energy at most ǫ 0 + ǫ/3g, and thus a ground space overlap at least 1 − 1/3g. By Lemma A.3 there can only be g such orthonormal vectors, leading us to conclude that |K| ≤ g. Using this we can bound the largest eigenvalue corresponding to an index in K As such we can conclude that the largest eigenvalue must belong within K, meaning that the leading eigenvector has energy at most ǫ 0 + ∆ǫ.

Non-degenerate ground state approximation
Here we will briefly address NondegenerateGSA, which we have claimed is a small modification on the nondegenerate ground state approximation algorithms of Ref. [1,2]. The analysis of these algorithms largely holds in the degenerate case, appropriately replacing the notions of ground state overlap with ground space overlap. The only key difference is the use of a special case of Lemma 15. As Lemma 15 will be invoked during the size-trimming and ground state approximation steps (Sections 2.6, 3.2), the energy levels in these steps grow by a factor of g overlooked in the non-degenerate analysis. By applying a slightly more powerful error reduction, dropping error levels beforehand by g to compensate, the same overall scaling can be proved, at the mere introduction of an extra constant factor. Due to this rather superficial change we will use these algorithms without proof of their validity. Furthermore we will take to denote the bond dimension of this first approximate ground state |γ 1 given by this algorithm. We will show that the bond dimension of all subsequent ground state approximations follows the same scaling, and thus use q(n) to generally denote the bond dimension of any combination thereof.

Size trimming
Next we will consider the size trimming sub-step, and prove the associated Claim 6. In Lemma 19 we will show that there exists a low-energy state orthogonal to the existing ground states, which we hope to find. After this we will introduce and define localdistinguishability, and give some intuition about how our size-trimming works in the lowand high-distinguishability regimes. We will then consider how the error levels behave in these two regimes, showing that the size-trimming procedure yields a good viable set for any level of distinguishability. We will conclude this section in with a full proof of Claim 6.

Low-energy state
The optimisation Eq. (1) in Trim is performed with respect to H L and not the whole Hamiltonian. There is no reason to assume a priori that states which have low energy with respect to the global Hamiltonian H, have low energy with respect to half-chain Hamiltonians H L or H R . Using the locality of the Hamiltonian however we will see that a global ground state of H can be excited by an energy at most 1 with respect to H L + H R .
Lemma 16. For any ground state |Γ of H, Proof. Consider the state |v := |l ⊗ |r , where |l and |r are in the 0-eigenspace of H ′ L and H ′ R respectively, which are non-empty by construction. The energy of |v is so for |v to not have an energy lower than that of a ground state, we require ǫ 0 ≤ ǫ L + ǫ R + v|H i |v . By bounding this further by the normalisation H i ≤ 1, this means the ground expectation of H ′ L + H ′ R is at most 1.
As global ground states have low energy with respect to H L + H R , there must exist a good approximant of the ground state in the low-energy subspace of H L + H R . As such we will restrict our analysis to this subspace, which we will refer to as the truncated Hilbert space.
As the left and right Hamiltonians commute and are positive, we have P t ≥ Q t .
In general whilst a ground state can be heavily frustrated, containing contributions from states with arbitrarily high local energy, these contributions will be small. Specifically we expect that a given ground state will have a large overlap with P t and Q t -this result is known as the truncation lemma, and was proven in Refs. [2,12].
Lemma 18 (Truncation lemma [2,12]). For any ground state |Γ of H, Given these results, we can now move on to proving the existence of a low-energy witness state which Eq. (1) will approximate.
i , such that it is orthogonal to the previous ground state approximations and has energy at most ǫ 0 + cǫ 6 /f 4 . By Lemma 3 there exists a ground state |Γ such that | v ′ |Γ | ≥ 1 − cǫ 6 /f 4 . Using Lemma 13 we get that Using this together with the energy of |v ′ , and again assuming that cǫ 6 /f 4 is sufficiently small gives Decomposing the above using the projector P t and its orthogonal complement -both of which commute with H ′ L and H ′ R -then this gives where we have defined |v := P t |v ′ / P t |v ′ , such that |v now lies in the desired domain Span{S (1) i } ⊗ H ≤t [i+1,n] by construction. Using P t |v ′ 2 + (1 − P t )|v ′ 2 = 1 as well as the inequality t ≥ Γ|(H ′ L + H ′ R )|Γ + 4 √ cǫ 3 /f 2 , we can reduce the above simply to The overlap between this truncated witness and the ground state is bounded and so once again applying Lemma 13 gives Combining Equations (8) and (9) we can bound the total energy of the witness projected into the truncated Hilbert Space Next we need to considering trimming the bond dimension. By applying Lemma 9 and recalling the definition of B δ from Eq. (3), we get that the truncated state has w|H|w ≤ ǫ 0 + 96 4 √ 4cǫ/f ≤ ǫ 0 + 0.75c ′ ǫ/f . Next we consider the overlap induced with the existing ground states. To bound this we consider the overlap between |w and |v ′ . We had that | v ′ |Γ | ≥ 1 − cǫ 6 /f 4 and | v|Γ | ≥ 1 − 2cǫ 6 /f 4 , Lemma 13 therefore gives v ′ |v ≥ 1 − 6cǫ 6 /f 4 .
Lemma 13 once more gives that | v|γ | ≥ 1 − 20 √ cǫ 3 /f 2 . As |w is a truncation of |v , by the Eckart-Young theorem it must have a larger overlap than |γ Applying Lemma 13 a third time we get finally that As |γ 1 , . . . , |γ h−1 are all perpendicular to |v ′ , the overlap between |w and any element of Span{|γ 1 , . . . , |γ h−1 } is upper bounded by the component of |w orthogonal to |v ′ Taking ǫ ≤ 10 7 without loss of generality this overlap is less than 1/7, so we can orthogonalise |w to the existing ground states at a multiplicative energy cost of 4/3, resulting in a final state with w|H|w ≤ ǫ 0 + c ′ ǫ/f , and all other desired properties by construction.

Distinguishability
Viable sets capture local information about a witness state, specifically the Schmidt vectors. If two states are globally orthogonal, this can mean that their Schmidt vectors span orthogonal spaces, but they can also span identical regions. Indeed global orthogonality of witness states cannot be locally imposed for this reason.
When considering viable sets, a natural notion of distinguishability is the overlap of the desired witness state, and the Schmidt vectors of the already constructed states. If we take |w to be a witness state which we wish to capture, L h,i to be the set of Schmidt vectors on [1, i] of the h already constructed ground states, and Π h,i the projector onto Span(L h,i ), then the distinguishability is defined as From here we will omit the subscripts on L, Π, and D, since they will be clear from context. If we think of Π as being the projector onto the vectors on [1, i] which we already have, then D can be considered a measure of how much of the state is unaccounted for.
Clearly if D is low, then a viable set can be constructed entirely by recycling existing Schmidt vectors. The precise bounds for the quality of viable set constructed in this manner, as a function of D, are given in Lemma 20. In the case of a high D new vectors need to be found. In this case we know there is little overlap between the Schmidt vectors of our new witness and Span(L). Our procedure works by restricting to a low-energy subspace, and projecting away from the existing vectors, by minimising the expectation value of Π. In Lemma 21 we will show that this procedure must produce a low energy state. Given that our witness is highly distinguishable, Lemma 22 will show that this new witness after size trimming must also be highly distinguishable. This will culminate in Lemma 23 where we show that this therefore implies that the new witness has small overlap with the previous ground states, producing a good viable set.
We will conclude this subsection by showing in Lemma 24 that by performing both the low-and high-distinguishability procedures and combining the results, we can upper bound the error of our viable set in a distinguishability-independent manner.

Low D
By definition, in the low D case the left Schmidt vectors of |w have large overlap with L. As a result we can see that L will form a viable set, whose witness is the projection of |w onto L, and that the error of this viable set is low for low D.
Lemma 20 (Low D Solution). There exists a state |v orthogonal to |γ 1 , . . . , |γ h−1 with Tr [i+1,n] |v v| supported on the span of L h,i , such that there exists a ground state |Γ with Proof. The idea here is to project |w into Span (L h,i ) ⊗ H [i+1,n] and use the distinguishability to bound the error associated with renormalisation. Specifically take where we recall that w|Π|w = 1 − D. As the existing ground state approximations are all stable under Π by construction, the orthogonality of |w to these vectors carries over to |v . The energy of |w is at most ǫ 0 + c ′ ǫ/f , by Lemma 3 this means there is a ground state |Γ with Decomposing this with the projector Π, and using both the triangle inequality and Cauchy's ineqality, we get Rearranging this gives the desired ground state overlap stated above.

High D
Take σ to both be the solution to Program 1 for the X closest to cont(w) and the smallest Y which is greater than or equal to w|H L |w − γ 1 |H L |γ 1 , i.e. the tightest set of constraints satisfied by |w w|. By the nature of the two nets, described in Section 2.3.2, we have that Here we want to argue that if D is not too low, then one of the σ outputs of Eq. (1) has an eigenvector which can be extended to a good witness. To extend the eigenvectors of σ we take the 'boundary-uncontracted' version σ ′ = U w σU † w , eigendecomposing these matrices as: where |v ′ k = U w |v k . In Trim we kept all eigenvectors whose value exceeded 10 −9 /g. We next want to show that there can only exist a finite number of such eigenvectors, and that they represent the vast majority of the spectrum of σ.
Proof. Defining σ ′ = U w σU † w , the boundary contraction property Lemma 5 gives an energy bound on σ ′ .
By applying an argument similar to Lemma 15, if we increase the excitation not only by a factor of (2g + 1) but also by 5000, we get that there are between 1 and g eigenvectors which have energy at most ǫ 0 + 30,000c ′ ǫ/g. Moreover we also get that the sum of the corresponding eigenvalues is at least 1 − 1/(10,000g + 5000) = Λ.
Next we rearrange the order of the eigenvectors. Let 1, . . . , g refer to these g vectors 2 described in Lemma 21. In other words: Next we drop any such vectors with eigenvalues less than the 10 −9 /g threshold 3 . It should first be noted that by applying Lemma 15 to σ we get that the dominant eigenvalue that is at least 2/(2g + 1) has a corresponding eigenvector in this set, and so this set is non-empty. Next we note that as there are at most g − 1 such vectors dropped, so decrease in the sum of eigenvalues is at most 10 −9 · g−1 g ≤ 10 −9 . As such we can conclude that the sum of the corresponding eigenvalues is still bounded Without loss of generality, take |v ′ 1 to be the member of {|v ′ 1 , . . . , v ′ g } for which the expectation value of Π is the lowest. We now want to show that the expectation of Π with respect to |v ′ 1 not significantly higher than it is for |w (which is 1 − D by definition). As such this will imply that in the high D case |v ′ 1 has small overlap with L, meaning that we have succeeded in finding a state with low overlap with the existing ground states.
Proof. The specific iteration of Eq. (5) we are referring to is that for which |w satisfies all of the constraints of the optimisation. As such, by the optimality of the solution σ, we know that w|Π|w ≥ Tr(σΠ). Decomposing this and using the definition of D: w|Π|w ≥ Tr(σΠ) As Π acts strictly to the left of the cut and U w strictly to the right they comute, meaning the expectation value of Π for primed and unprimed eigenvectors is identical.
As we defined |v ′ 1 to have minimal expectation this can be further loosened to Rearranging this together with the bound for g k=1 λ k , we get After a binomial expansion, followed by a loosening of the inequality, we arrive at our final bound.
In general |v ′ 1 will have some non-trivial overlap with the existing ground state approximations. This low expectation value for Π can however be used to bound this overlap when D is not too small. Lemma 23. For any vector |γ ∈ Span{|γ 1 , . . . , |γ h−1 } the overlap between |γ and |v ′ 1 is upper bounded Proof. The main property to use here is that Π |γ = |γ by definition. Using this and Cauchy-Schwartz gives Take |u to be |v ′ 1 projected orthogonally to |γ 1 , . . . , |γ h−1 , as per Lemma 14. Then the ground space overlap of this vector can be bounded to be | Γ|u | ≥ 1 − δ High for some ground state |Γ , where δ High = 30,000c ′ g

All D
Next we can combine the two above errors. We have shown that there exist bounds on the overlap error δ in the low-and high-D cases. By combining these, we can bound δ independent of the unknown D.
Proof. Consider the case of D = 10 −4 and g = 2. The two error levels in this case can be explicitly calculated, and are It can also be shown that for a fixed g that δ Low and δ High are monotonically increasing and decreasing in D respectively, and that for a fixed D both decrease monotonically with g. As such we can conclude that δ ≤ 0.01 for all D ∈ [0, 1] and g ≥ 2.
Proof. As shown above, marginalising over distinguishability for degeneracies at least two, the overlap error is upper bounded by 1/100. The cardinality contribution comes in two parts: the results given by Program 1, and the previously obtained left-Schmidt vectors that are recycled. Let the cardinality of the Program 1 results be r where Schmidt rank The cardinality contribution from recycled Schmidt vectors is bounded by the bond dimension of each previous ground state approximation, as given in Section 3.1.1. This contribution is q(n) = n a where a = O(1 + ǫ −1 log η −1 / log n), and dominates over r.
The overall cardinality is the summation of these two contributions, and as such is given by p 1 (n) = n a where a has the scaling given above.
The results of Program 1 are linear combinations of vectors in Span(S (1) i ∪ L h,i ). The contributions from both spaces to the bond dimension of the worst-case result is the product of their cardinalities and bond dimensions respectively. As such these contributions are dsb for S (1) i , and q 2 for L h,i respectively.

Bond truncation
The idea here is to use the existence of low-rank approximate ground states to argue that truncation induces little error due to its optimal nature as a low-rank approximation [16]. First we note an important result [1,2] about truncation, that large overlap with a low-rank vector is maintained under truncation.
Lemma 25 ([1, 2]). Given a vector |v with Schmidt rank R across i|i + 1, for any |u Let |u denote a witness of S (2) i such that there exists a ground state |Γ such that .
Fix |Γ to denote the ground state with which |u has maximal overlap. Next we invoke Lemma 7, taking r(n) := B 1/800n such that there exists an MPS |γ with bond dimension bounded by r(n) such that Applying Lemma 13 we can combine these, showing |γ and |u have large overlap Next take |w to be the renormalised truncation of |u across every bond where we have taken P (n) := 800n r(n).
Applying Lemma 25 across every bond we get that the overlap-error induced by truncation grows additively, giving a final overlap of Once again applying Lemma 13 we get therefore that the ground space overlap of |w , which is lower bounded by the overlap with |Γ , can in turn be lower bounded Once again we must turn our attention now to orthogonalising |w with respect to the existing ground states to satisfy the relevant orthogonality condition. The procedure for bounding this once again is to note that because |u is orthogonal to Span{|γ 1 , . . . , |γ h−1 }, any overlap |w has with this space is upper bounded by the component of |w orthogonal to |u . Applying Lemma 13 once again to their mutual overlap with |Γ , we get that and thus that the orthogonal component is bounded By Lemma 14 then we can orthogonalise |w with respect to the existing ground state approximations, at most tripling the overlap-error. Thus, after orthogonalisation, there must exist a ground state |Γ ′

Claim 10
The set resulting from Truncate, S i , is (i, p 1 +q, p 2 , δ = 1/5)-viable, where Proof. The error level δ = 1/5 is proven above. As the truncation procedure is performed element-wise, the cardinality is changed only by the reintroduction of recycled Schmidt vectors, adding q. After truncation, the bond dimension is bounded by the trimming level The orthogonalisation procedure then will additively increase this bond dimension by q, the bond dimension of the previous approximate ground states. As such the final bond dimension can be bounded by a polynomial p 2 of the same scaling.

Error reduction
Next we need to address the Reduce and FinalReduce procedures. We will do this by considering the construction of an AGSP A, and an approximation thereof K. Both operators are subject to a free parameter ζ, which controls the magnitude of the error reduction they can perform. The only difference between Reduce and FinalReduce will be the final level of ζ utilised. As such we will analyse the procedure for an arbitrary ζ, stating the Reduce and FinalReduce as special cases of a more general analysis.

Exact AGSP
First we consider the exact AGSP construction where ǫ ′ 0 := γ 1 |H|γ 1 is an inverse polynomial approximation to the ground state energy, and x is a ζ-dependent parameter given in Appendix B. As well as specifying the precise parameters of the AGSP/AAGSP construction, in Appendix B we also show that for each ground state |Γ and excited state Γ ⊥ : It is in this sense that our AGSP approximates a ground state projector. Take |u to be a witness of S i , such that there exists a ground state |Γ with We can decompose this witness in the energy eigenbasis where |E 0 is a ground state, {|E j } j>0 are all excited states 4 with energy ǫ j ≥ ǫ 0 + ǫ. Due to the role of |u as a witness of S We can bound the renormalising denominator term by applying A to |u in this decomposed form If we assume ζ ≤ 1 then A |u ≥ 1/5. Using this bound we can argue that because A shrinks the high energy terms, yet does not shrink the magnitude all that much, the energy of |w should be a drastic improvement over that of |u .
Lemma 26. w|H|w ≤ ǫ 0 + ζǫ/80,000 Proof. We can bound the energy by simply expanding the expectation of H ′ := (H − ǫ 0 ), and seeing how it changes under the application of A using the above decomposition in the energy basis.
Because A commutes with H, the double sum collapses and we have Using ǫ ′ 0 − ǫ 0 ≤ ǫ/2, and then using the Hölder inequality, we have The maximum is on the boundary because x is sufficiently large, so we find ≤ 25ǫ e −x/4 ≤ ζǫ/3200 Where above we have used the chosen value of x := 33 − 8 log(ζ) from Appendix B.

Approximate AGSP
As well as the bounds on the power of A, Appendix B also gives that we can approximate A by a low bond dimension operator K with For simplicity we can express K = A + ζ ′ O, where O ≤ 1. As with K we can bound the renormalising term for |u , Now if we take ζ ′ ≤ 1/10 then we get K |u ≥ 1/10. As such we can replace the state |w that had an exact AGSP with an approximate version using the AAGSP, and similarly bound the energy.
w ′ H w ′ ≤ ǫ 0 + ζǫ/400 Proof. Once again we consider the expectation value of H ′ Reintroducing ǫ 0 gives the desired energy bound.
Finally we must orthogonalise to negate any overlap with the existing vectors this has caused.
Proof. The original witness |u has a large ground state overlap, which in terms of the previous decomposition means Using the fact that K |u ≤ 1 and that A is diagonal in the energy basis, we can bound the overlap that |w ′ in turn has with |E 0 . Using ζ ′ ≤ 1/100, Applying Lemma 13 we therefore get that As has been argued before, the component of |w ′ overlapping with Span{|γ 1 , . . . , |γ h−1 } can be bounded by the component perpendicular to |u . This component is bounded ≤ 400 − 1 400 + 1 .
By Lemma 14 and the previous lemma, we can thus orthogonalise |w ′ to a vector |w ′′ with a 400-fold increase in the excitation, resulting in an energy of w ′′ H w ′′ ≤ ǫ 0 + ζǫ .
As such setting ζ to the desired final energy error level and applying the local Schmidt operators of K, the energy error can be reduced to ζ. Claim 11,12 The set resulting from Reduce, S The set resulting from FinalReduce, S n , is (n, Proof. The construction described above gives a K with a bond dimension that scales (see Ref. [13]) as n a ′ where a ′ = O(1 + ǫ −1 log ζ −1 / log n).
For Reduce the cardinality growth is caused by the fact that K must be applied locally, i.e. each Schmidt operator must be individually applied. The number of these operators is bounded by the bond dimension of K. As such Reduce, for which ζ = cǫ 6 /f 4 = O(1), has a multiplicative growth in cardinality of p(n) = n O(1) . On top of this the recycled Schmidt vectors additively increase the cardinality once more by q.
For FinalReduce however the whole operator can be applied without Schmidt decomposition, thus leaving cardinality unchanged. The recycled previous ground states grow the cardinality additively by g.
The bond dimension increase is once again multiplicative, and once again bounded by the bond dimension of K as an MPO. For Reduce this means a growth by a factor of p(n). For FinalReduce however ζ = η 2 /4f is not necessarily constant, as such the growth in bond dimension has the more general scaling of p 0 (n) = n a where a = O(1 + ǫ −1 log η −1 / log n) .
Here we briefly comment on the improved analysis of Ref. [22]. Suppose we perform the following for our FinalReduce: instead applying of a single powerful AGSP, we interleave a mild AGSP with bond trimming multiple times. With sufficient parameters (the details are given in Sec.5.5 of [22]), this can be done such that the bond dimension remains at most n O(1) , and the error level is halved. Performing this procedure the requisite number of times, the scaling of final bond dimension can be improved to p 0 = n O(1) for η −1 = n O (1) . Carrying this analysis through, this also reduces the scaling of p 1 and q to both be n O(1) , and therefore a final run-time also of T = n O(1) .

Conclusion
We have shown the ground state approximation of Refs. [1,2] can be extended to degenerate systems with an approximation of the ground space projector that is up to inverse polynomial error in any Schatten norm.
AGSPs have proven powerful tools for relating the structural properties of gapped systems to various computational complexity bounds regarding them. Most AGSP constructions are either functions of the Hamiltonians [2,12,14] or parts of the Hamiltonian terms [1,[17][18][19][20]. While we utilise such AGSPs in intermediate steps, our algorithm outputs a state-based AGSP, and more thoroughly exploits the underlying one-dimensional structure, allowing for a more powerful overall construction. While previous AGSP constructions have typically been full-rank, our AGSP takes the form of a minimal-rank projector, that moreover has a polynomially bounded bond dimension as a matrix product operator (MPO). It remains an open question for which systems there exist efficiently constructible AGSPs that are also low-rank AGSPs. Note that this is important for applications because while AGSPs that are close to the ground space in operator norm are easy to construct even with small bond dimension in MPO form, they do not generally allow for low-error estimation of expectation values of other MPO observables.
While running in polynomial-time, the specific runtime scaling of the above algorithm is highly suboptimal, leaving wide room for further optimisation. One potential avenue for optimisation is the level of redundancy within the viable sets that are generated. While there are rigorous upper bounds on the cardinality of these sets, no care has been taken to explore to possibility of high levels of linear dependence, opening the door to far better practical runtimes than our analysis would suggest.
Note added in preparation: This work began as the undergraduate Honours thesis of the first author as a project to extend LVV to the degenerate case. During that time, the results of Huang in Ref. [2] (version 1) appeared, and were incorporated into our independent analysis. Near the completion of this manuscript, Huang also extended the results in Ref. [2] to the degenerate case and published this as Ref. [21] (version 3), an updated version of the original manuscript. While our techniques rely on the methods in version 1 (Ref. [2]), our treatment of the degenerate case was made completely independently from version 3 (Ref. [21]).
Our results and the new results in Ref. [21] are essentially the same, but with the following differences. The most notable is that the algorithm of Ref. [21] is single-pass, finding a full set of ground states with a single sweep of the system. In principle our multipass algorithm will generate viable sets with far higher levels of redundancy, causing worse scaling in the parameters of Table 1. In the final runtime however this slow-down only influences the various constants hidden under big-O terms, giving an overall scaling that is identical to that of Ref. [21]. The second major difference regards the size-trimming, the most complicated step to generalise. Our algorithm works by applying an energy constraint and minimising the overlap between witnesses, while Ref. [21] constrains the overlap and minimises the energy. This allows one to use boundary contraction to circumvent the issue of local distinguishability, simplifying the analysis of this step. The final difference is that our algorithm does not require knowing a specific bound on the ground state degeneracy g, only that a bound of the form g = O(1) exists, whereas it is not clear how to perform the algorithm in Ref. [21] without an explicit upper bound.
Subsequent to the journal submission of this manuscript, but prior to its publication, Arad et.al. released a yet further improved algorithm in Ref. [23]. While our algorithm works by iterating along the spin chain, constructing matrix product states, the main new technique in Ref. [23] is to instead apply divide-and-conquer approach, generating states in a tree-like fashion. This method allows for much tighter runtime analysis (e.g. runtime reduction from poly(n) to O(n 3.38 ) for frustration-free systems), as well as extensions to much larger classes of systems (e.g. a quasi-polynomial-time algorithm for certain gapless models).
We thank Thomas Vidick for comments on the previous version of this manuscript. We acknowledge support from the Australian Research Council via EQuS project number CE11001013, the US Army Research Office via grant numbers W911NF-14-1-0098 and W911NF-14-1-0103, and by iARPA via the MQCO program. S.T.F. also acknowledges support from an ARC Future Fellowship FT130101744.
where we have used the normalisation i |c i | 2 = 1 and the resulting bound i |c i | ≤ √ g.
Lemma A.2 (Basis under projection). The projections G |v i form a basis of im(G).
Proof. As the number of vectors matches the dimension of the target space, proving linear independence is sufficient to prove the formation of a basis. If the projected vectors were linearly dependent then this would imply there exist a non-zero |v ∈ S such that G |v = 0. Lemma A.1 however gives v|G|v ≥ 1 − δ ≥ 0 for all such vectors, thus proving linear independence by contradiction.
Next we want to generalise the concept of a full basis, specifically the fact that any vector perpendicular to a full basis is necessarily perpendicular to the entire space, meaning that there is a natural maximum to the size of any given basis. Lemma A.3 (Fullness). Any vector |v ′ which is orthogonal to S has v ′ |G|v ′ ≤ δ.
Proof. As the projection of this space is spanning, we can pick |v ∈ S such that it projects into the same vector as |v ′ , i.e. G |v ∝ G |v ′ . As such we can decompose both states as where |g ∈ im(G) and |h , |h ′ ∈ ker(G). The orthogonality of |v and |v ′ gives Using | h|h ′ | ≤ 1 in turn gives λ + λ ′ ≤ 1. Lemma A.1 gives that λ ≥ 1 − δ, leading us to conclude that λ ′ ≤ δ, and thus that v ′ |G|v ′ ≤ δ generically for any vector orthogonal to S.

B Approximate AGSP construction
In this section we explicitly lay out the various errors associated with approximating the AGSP defined in Sections 2.5 and 3.4. The scaling of these errors was stated in less detail in Ref. [2]. First we consider the AGSP itself where ǫ ′ 0 is the ground energy approximation given by |γ 1 . If we take η, ζ ≤ 1/2 and assume that ζ ≥ η 2 /4f then The magnitude of this AGSP on the excited subspace, known as the shrinking factor, can be bounded If we define x := 33 − 8 log ζ, then this shrinking factor is at most ζ/2. Further assuming that ζ ≤ 1/25, the magnitude of A on the ground space can be lower bounded 20 .
Next we move on to the approximate AGSP, defined by where T , τ and D are parameters and U D is some unitary operator all of which we shall define below. This definition relies on taking the Fourier decomposition of A, and approximating this by truncating and discretising the integral and approximating the propagators by a known algorithm [13].
This error can be upper bounded in turn by the difference between upper and lower Riemann sums which is equal to the left and right Riemann sums owing to the monotonicity of the integrand.
The difference between these two summations is telescopic. As such it equals the difference between the highest and lowest terms.
Using the trivial bound on the real exponential term, we get Again this error can also be bounded by δ D ≤ ζ ′ /3 if we take the time-step

B.3 Bond Truncation Error
To upper bound the total truncation error for the entire approximation by ζ ′ /3, it is sufficient to bound the truncation error for each propagator exp(−iHτ j) by ζ ′ /3 since the total sum is normalized. Using the construction of Ref. [13] this error can be achieved if the bond dimension of each propagator is where we have assumed ζ −1 = n O(1) .

C Frustration bound
Consider a 1D system with nearest neighbour interactions, were each local Hamiltonian term H i is bounded 0 ≤ H i ≤ 1. For a cut i|i + 1 we split the Hamiltonian into a left-Hamiltonian H L = i−1 j=1 H j , middle-Hamiltonian H M = H i , and right-Hamiltonian H R = n j=i+1 H j ; we refer to their expectation values as the left/middle/right-energy.
Lemma C.1. For a Hamiltonian normalised as above, the left-energies relative to any cut for any two states with an energy less than ∆E over the ground, differ by no more than 1 + ∆E.
Proof. Suppose we have two states |v 1 and |v 2 with local-energies given in Table 2.
where E = E 0 + ∆E, and E 0 is the ground energy, and we take ∆L ≥ 0 without loss of generality. Using the positivity of H M on the expectations for |v 2 gives L + R ≤ E − ∆L .
As |v 1 has a left-energy L, it must possess at least one left Schmidt vector |l with a left-energy at most L, similarly |v 2 has a right Schmidt vector |r with right-energy at most R. As such we can construct a state |v = |l ⊗ |r with energy upper bounded v|H|v = v|H L |v + v|H M |v + v|H R |v = l|H L |l + v|H M |v + r|H R |r ≤ L + 1 + R ≤ E + 1 − ∆L .
As v|H|v ≥ E 0 , we get that in general the difference between the left-energies is bounded ∆L ≤ 1 + ∆E .
A simple example of a perturbed Ising model shows that this bound is tight.