Characterizing overstretched NTRU attacks

Overstretched NTRU is a variant of NTRU with a large modulus. Recent lattice subfield and subring attacks have broken suggested parameters for several schemes. There are a number of conflicting claims in the literature over which attack has the best performance. These claims are typically based on experiments more than analysis. In this paper, we argue that comparisons should focus on the lattice dimension used in the attack. We give evidence, both analytically and experimentally, that the subring attack finds shorter vectors and thus is expected to succeed with a smaller dimension lattice than the subfield attack for the same problem parameters, and also to succeed with a smaller modulus when the lattice dimension is fixed.


Introduction
NTRU is a public key cryptosystem [11] that serves as a basis for many cryptographic protocols (e.g. [9,10,14]) and is believed to remain secure in the presence of quantum computers. See the survey [23] for a complete description of the NTRU cryptosystem and its applications. In this paper, we consider the overstretched NTRU variant in the cyclotomic field K = Q[x]/(x^n + 1), with n a power of 2. Let R = Z[x]/(x^n + 1) be the ring of integers of K, and let R_q = Z_q[x]/(x^n + 1) for some integer q that is super-polynomial in n. The private key consists of two polynomials f, g ∈ R_q, with f invertible. The public key h is defined by h := g f^{-1}. The coefficients of f and g are chosen to be small from a given distribution, most commonly the uniform distribution over {−1, 0, 1}. The (general) NTRU key recovery problem is to recover a, b ∈ R_q with "sufficiently small" coefficients such that h = b a^{-1}.

Cryptanalyses of NTRU. Coppersmith and Shamir [5] presented a lattice attack on NTRU. The lattice has dimension 2n, and in the context of this paper the attack works over the full field K. Later variants of this attack decrease the lattice dimension, resulting in feasible computations for larger parameters. The first cryptanalyses of overstretched NTRU were given independently by Albrecht et al. [1] and Cheon et al. [4]. These works present subfield attacks that exploit subfields in K. Their main result is that if q is exponentially large, the dimension can be much smaller and the attack runs in polynomial time.
Kirchner and Fouque [12] presented a subring attack against overstretched NTRU. This attack allows more flexibility in choosing different (larger) dimensions. Moreover, they proved that despite the larger dimension, the "full field" attack does not perform worse than the subfield and subring attacks.
Experimentally, the subring and subfield attacks are significantly faster than the full field attack. Kirchner and Fouque [12] compare with the experiments in [1] and conclude that if one wishes to minimize the ratio between n and q then the subring attack "performs better" than the subfield attack. Subsequent work by Duong et al. [8] concludes that for different choices of subfields, the subfield attack is not worse than the subring attack. These comparisons are mainly experimental and not analytic.
However, these works directly compare lattices of different dimensions and observe that the attack succeeds on a smaller modulus q for a given degree n using larger-dimension lattices. The benefit of this comparison is questionable, as it follows from [12] that the full field attack, which has the largest dimension, is expected to achieve the lowest ratio between n and q among all these lattice attacks. Furthermore, the point of the subfield and subring attacks is to decrease the dimension, so increasing the dimension is in opposition to the goal of the attack construction. While lattice dimension is not the only parameter that affects the running time or approximation factor of lattice basis reduction algorithms, in all of our experiments, as in the reported experiments of previous works, lattice dimension seems to play a crucial role in the actual running time.

Our results. The main goal of our work is to resolve the conflicting claims in previous work. We formally analyze the relative performance of the attacks, focusing on the lattice dimension, and use experiments to validate our analysis. Our focus on the lattice dimension follows from our claim that this is the correct metric for comparison.
1. We formally justify the projection technique of May [17] and May and Silverman [18], which is key to the subring attack. We formalize its necessary conditions, and explain its relation to standard assumptions on NTRU. This analysis fills a theoretical gap in prior work on the subring attack.
2. We show that the subring lattice is expected to contain shorter vectors than the subfield lattice, which resolves the incompatible claims in prior work. In short, if this is the case, for fixed n and q, the subring attack can use projection to discard more equations and obtain a lower dimension lattice. Thus, for a given degree n and fixed lattice dimension, the subring attack is expected to succeed on a smaller modulus q.

Our analysis does not show that these desired vectors are the shortest vectors in the corresponding lattices. Thus, our bounds may not be tight. We present experimental evidence suggesting that these bounds are conservative. Our result focuses on the structure of the lattices more than actual implementations of the attacks. In particular, we fix the subfield index and analyze the asymptotics of the length of short vectors in the lattices. Implementations of the attacks would try to optimize the choice of the subfield with respect to the degree of the field. An analysis of such an optimization seems challenging.

Preliminaries
We denote the ring representation of an element by a ∈ R, and the vector of its coefficients by a ∈ Z_q^n; which of the two is meant is clear from context. The Euclidean norm ||f|| of a polynomial f is the norm of the vector consisting of its coefficients. We use [x]_q to denote x mod q.
A number field is a finite field extension of Q. Its degree is [K : Q]. The ring of integers O of a number field K is the set of algebraic integers contained in K. For any field K and subfield L of K, we define r = [K : L] to be the index of the subfield we consider. If n ′ = [L : Q] and n = [K : Q], then r = n/n ′ .
Let ζ_m be a primitive m-th root of unity. Let the m-th cyclotomic field be K = Q(ζ_m), with m a power of 2. The m-th cyclotomic polynomial is x^n + 1, where n = ϕ(m) and ϕ is Euler's phi function, and K ≅ Q[x]/(x^n + 1).
Let K be a number field and L a subfield of K. For an element a ∈ K, consider the L-linear map m_a : K → K given by x ↦ ax. The trace of a ∈ K, denoted Tr_{K/L}(a), is the trace of m_a. The relative norm of a ∈ K, denoted N_{K/L}(a), is the determinant of m_a. The trace is additive and the norm is multiplicative. In particular, if K is a Galois extension of Q and we define G = Gal(K/L), then Tr_{K/L}(a) = ∑_{σ∈G} σ(a) and N_{K/L}(a) = ∏_{σ∈G} σ(a). The embeddings σ ∈ G permute or conjugate the coefficients of x ∈ K in the canonical embedding. Hence, for all σ ∈ G, ||σ(x)|| = ||x||. When we enumerate the embeddings σ_1, . . . , σ_r for a subfield L of index r, we set σ_1 = Id. To prevent confusion, while we use canonical embeddings, the norms are taken with respect to the coefficients.
A lattice is a discrete additive subgroup of ℝ^n. We represent an n-dimensional lattice as an n × n matrix whose rows are the basis vectors b_i, and write L(B) for the lattice generated by the basis matrix B.

Characterization of NTRU attacks
We give a short presentation of the different attacks on NTRU. A comprehensive characterization of the attacks appears in the full version of this paper [6].

The full field attack
Coppersmith and Shamir [5] considered the following 2n × 2n matrix

\[ A_{\text{full}} = \begin{pmatrix} I_n & M_h \\ 0 & q I_n \end{pmatrix}, \]

where I_n is the n-dimensional identity matrix and M_h represents multiplication in R by the public key h. The lattice generated by A_full, which we call L(A_full), contains the vector (f, g). The vector (f, g) has relatively small Euclidean norm, and is most likely a vector of the smallest non-zero length in the lattice L(A_full). Thus, the NTRU problem can be reduced to computing short vectors in L(A_full). Coppersmith and Shamir [5] note that one can derive useful information to recover the secret key even when only a multiple of (f, g) is found, provided the multiple has relatively small norm (yet larger than that of (f, g)). See [5] for more details. In fact, one can focus on finding a small multiple of f, since given αf, a small multiple of g can be found by computing αf · h = αg. The different attacks exploit this fact, and aim to recover (a small multiple of) N_{K/L}(f), for a subfield L ⊆ K, which can be shown to have a relatively small Euclidean norm.
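The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy parameters n = 8 and q = 12289 are far too small to be overstretched, the helper names are hypothetical, and the inverse is computed by recursing on a(x)·a(−x), which is a convenient choice here rather than the method of any cited work. The sketch builds the rows of the full field basis and checks that (f, g) is an integer combination of them.

```python
import random

def negacyclic_mul(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n + 1); inputs are coefficient lists of length n."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj  # x^n = -1
    return [x % q for x in c]

def inverse(a, q):
    """Invert a in Z_q[x]/(x^n + 1), n a power of 2, using that a(x)*a(-x) lies in Z_q[x^2]."""
    n = len(a)
    if n == 1:
        return [pow(a[0], -1, q)]  # raises ValueError if a is not invertible
    a_neg = [-c if i % 2 else c for i, c in enumerate(a)]  # a(-x)
    norm = negacyclic_mul(a, a_neg, q)                     # odd coefficients vanish
    sub_inv = inverse(norm[0::2], q)                       # recurse with y = x^2
    lifted = [0] * n
    lifted[0::2] = sub_inv
    return negacyclic_mul(a_neg, lifted, q)

def keygen(n, q, rng):
    """Sample ternary f, g with f invertible and return (f, g, h = g * f^{-1})."""
    while True:
        f = [rng.choice((-1, 0, 1)) for _ in range(n)]
        g = [rng.choice((-1, 0, 1)) for _ in range(n)]
        try:
            return f, g, negacyclic_mul(g, inverse(f, q), q)
        except ValueError:
            continue

def full_field_basis(h, q):
    """Rows of the 2n x 2n block matrix [[I_n, M_h], [0, q I_n]]."""
    n = len(h)
    rows = []
    for i in range(n):
        x_i = [0] * n
        x_i[i] = 1
        rows.append(x_i + negacyclic_mul(x_i, h, q))  # row i is (x^i, x^i * h)
    for j in range(n):
        row = [0] * (2 * n)
        row[n + j] = q
        rows.append(row)
    return rows

rng = random.Random(1)
n, q = 8, 12289  # toy parameters, chosen only for illustration
f, g, h = keygen(n, q, rng)
A = full_field_basis(h, q)
# Combine the top rows with coefficients f, then correct by multiples of q.
top = [sum(f[i] * A[i][j] for i in range(n)) for j in range(2 * n)]
k = [(g[j] - top[n + j]) // q for j in range(n)]
vec = [top[j] + (q * k[j - n] if j >= n else 0) for j in range(2 * n)]
```

Here `vec` is the lattice vector obtained from the integer coefficients (f, k), so comparing it with `f + g` confirms membership of the secret key in the lattice.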
In the following descriptions, we use the term "f part" for the part of the vector corresponding to the identity matrix, and "g part" for the part corresponding to multiplication by h (or by N K/L (h) in the subfield lattice).

The subfield attacks [1, 4]
The norm attack [1] uses the lattice generated by the rows of the 2n' × 2n' matrix

\[ A_{\text{subfield}} = \begin{pmatrix} I_{n'} & M'_{N_{K/L}(h)} \\ 0 & q I_{n'} \end{pmatrix}, \]

where n' is the degree of the subfield L and M'_{N_{K/L}(h)} is an n' × n' matrix representing multiplication by N_{K/L}(h) in L. This lattice contains the short vector (N_{K/L}(f), N_{K/L}(g)). In the trace attack [4] one replaces M'_{N_{K/L}(h)} with M'_{Tr_{K/L}(h)}, the matrix representing multiplication by Tr_{K/L}(h) in L. We focus on the norm attack because it performs slightly better when the polynomials are balanced (that is, the polynomials f and g have approximately the same number of non-zero coefficients).
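The subfield lattice contains (N_{K/L}(f), N_{K/L}(g)) because the relative norm is multiplicative, so N_{K/L}(f) · N_{K/L}(h) = N_{K/L}(g) modulo q. The following Python sketch checks this for the index-2 subfield only, where the norm of a is a(x)·a(−x). The parameters and helper names are illustrative assumptions, and g is defined as f·h directly so that no inversion is needed.

```python
import random

def negacyclic_mul(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n + 1); inputs are coefficient lists of length n."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj  # x^n = -1
    return [x % q for x in c]

def relative_norm_index2(a, q):
    """N_{K/L}(a) = a(x) * a(-x) for [K : L] = 2, as n/2 coefficients in y = x^2."""
    a_neg = [-c if i % 2 else c for i, c in enumerate(a)]
    prod = negacyclic_mul(a, a_neg, q)
    assert not any(prod[1::2])  # the product is an even polynomial
    return prod[0::2]

rng = random.Random(3)
n, q = 16, 65537  # toy parameters, chosen only for illustration
f = [rng.choice((-1, 0, 1)) for _ in range(n)]
h = [rng.randrange(q) for _ in range(n)]
g = negacyclic_mul(f, h, q)  # so that f * h = g in R_q

# Multiplication inside L uses the same routine on length-n/2 vectors (y^(n/2) = -1).
lhs = negacyclic_mul(relative_norm_index2(f, q), relative_norm_index2(h, q), q)
rhs = relative_norm_index2(g, q)
```

Equality of `lhs` and `rhs` is exactly the relation that places (N_{K/L}(f), N_{K/L}(g)) in the lattice spanned by the subfield basis.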

The subring attack [12]
The subring attack can be divided into two steps. First, consider the following (n' + n) × (n' + n) matrix

\[ A_{\text{subring}} = \begin{pmatrix} I_{n'} & M'_h \\ 0 & q I_n \end{pmatrix}, \]

where M'_h is an n' × n matrix representing multiplication of elements of L by h. This lattice contains the short vector (N_{K/L}(f), N_{K/L}(f)h). The dimension n' + n is larger than 2n', the subfield attack lattice dimension. The second step "projects" the lattice, i.e., deletes columns of M'_h, to reduce the dimension. We denote the projected vector by (N_{K/L}(f), N_{K/L}(f)h), where the g part is restricted to the retained columns. For rigorous analysis, columns should be chosen independently. This approach offers more flexibility than the subfield attack by allowing one to choose dimensions greater than 2n'.
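The two steps can be sketched as follows. This is a minimal illustration, assuming the subring is spanned by the powers of x^r and that projection keeps the first d columns of M'_h (that is, deletes the right-most ones). The function name, the toy parameters, and the random h are hypothetical choices for the example.

```python
import random

def negacyclic_mul(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n + 1); inputs are coefficient lists of length n."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj  # x^n = -1
    return [x % q for x in c]

def projected_subring_basis(h, q, r, d):
    """Rows of [[I_{n'}, M'_h restricted to d columns], [0, q I_d]] with n' = n / r."""
    n = len(h)
    n_sub = n // r
    rows = []
    for i in range(n_sub):
        x_ri = [0] * n
        x_ri[r * i] = 1              # x^(r*i), a basis element of the subring
        unit = [0] * n_sub
        unit[i] = 1
        rows.append(unit + negacyclic_mul(x_ri, h, q)[:d])
    for j in range(d):
        row = [0] * (n_sub + d)
        row[n_sub + j] = q
        rows.append(row)
    return rows

rng = random.Random(7)
n, r, q, d = 16, 4, 65537, 6  # toy parameters; d = n gives the unprojected lattice
n_sub = n // r
h = [rng.randrange(q) for _ in range(n)]
basis = projected_subring_basis(h, q, r, d)

# Any subring element u gives the lattice vector (u, first d coefficients of u*h) mod q.
u = [rng.choice((-1, 0, 1)) for _ in range(n_sub)]
u_lifted = [0] * n
u_lifted[0::r] = u
target = negacyclic_mul(u_lifted, h, q)[:d]
combo = [sum(u[i] * basis[i][j] for i in range(n_sub)) for j in range(n_sub + d)]
```

Choosing d between n' and n interpolates between the subfield dimension 2n' and the full subring dimension n' + n, which is the flexibility described above.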
Previous comparison of the attacks. The size of the modulus q causes a tension between applications and security. Many applications, such as fully homomorphic encryption, need a relatively large modulus, but NTRU with a very large modulus can be broken in polynomial time, even with the Coppersmith-Shamir attack. Hence, it is important to analyze the relation between q and n when studying the security of cryptosystems that are based on overstretched NTRU.
In [1], the experimental results focus on finding the minimal modulus q for which the NTRU problem can be solved with the subfield attack on a fixed n using the LLL algorithm. The subsequent work [8,12] followed this approach and directly compared to these experiments, both claiming to achieve "better results". Moreover, [12,Theorem 9] shows that under some conditions, working over a subfield of smaller index (including the full field) will not give worse results despite the increase in dimension (however, see our experimental results for the full field compared to the other attacks in Table 2).
We claim that the lattice dimension, which strongly influences the attacks' running time, should be central to the attack comparisons. This point has been overlooked in prior work, and thus it is not clear whether the different experiments remain comparable. Table 1 gives a series of experimental "improvements" that decrease q by modifying the dimension. We give results from previous papers and from our own experiments. We ran the subfield and the subring attack using our implementations of these attacks and used the projection technique to reduce the dimension of the lattice. Except for the case log(n) = 12, these comparisons are inconclusive when dimension is taken into account. For a detailed comparison, see the full version of this paper [6].

Main results
Our main contribution is a full characterization of the various attacks on overstretched NTRU. In Section 4.1, we give a detailed analysis of the applicability of the projection technique to NTRU lattices. With the analysis of the subring attack complete, we compare it to the subfield attack in Section 4.2.

The projection technique
The key to the projection technique is that the system of equations corresponding to f · h = g is assumed to be overdetermined, so one can discard some of the equations. We formalize this assumption, relate it to standard assumptions on NTRU and derive concrete results. Some of these details are missing in [12, Section 3.2]. In the following, let L(A) be the lattice generated by any of the above attack matrices A, L' be the projected lattice of dimension n' + d, and M the upper right quadrant of A.

The discrepancy D(Γ) [7,13] measures how equidistributed a sequence of points Γ = {γ_1, . . . , γ_n} in the interval [0, 1] is. Formally, it is defined as

\[ D(\Gamma) = \sup_{J \subseteq [0,1]} \left| \frac{T(J, n)}{n} - |J| \right|, \]

where the supremum is taken over all subintervals J of [0, 1], |J| is the length of J, and T(J, n) is the number of points γ_i in J for 1 ≤ i ≤ n. Let T be a sequence of elements in Z^n with the dot product. T is ∆-homogenously distributed modulo q if for any a ∈ Z^n with at least one coordinate coprime to q, the discrepancy of {[a · t]_q / q}_{t∈T} is at most ∆. We would like to consider the columns of M as a set of elements in T. However, this sequence is not ∆-homogenously distributed modulo q for small ∆: for example when M = M_h, h · f = g does not distribute homogenously. We define the following weaker notion.

Theorem 1 considers M_h, and thus also M'_h, as these are the cases of interest in [12,17] respectively. However, it can be generalized to the other matrices M described above. In general, it is not known that the columns of M_h are (B, f)-weakly homogenously distributed for sufficiently small B. Understanding the distribution of h is extremely important as it underlies the security of NTRU. In general, h is not uniformly distributed in R_q, as can be seen from a simple information-theoretic argument. Hence, understanding the distribution of f^{-1} is important in order to understand the distribution of h.
Banks and Shparlinski [2] studied how "well spread" the coefficients of f −1 are, that is, whether they "look and behave like random polynomials". We remark that the desired property on h may follow from the behaviour of f −1 , but this property is not well formed.
Moreover, it is standard to assume that h = g/f is indistinguishable from random in R_q, see [14]. We remark that this assumption has a strong relation with the weakly homogenous distribution of M_h. Indeed, if the latter is not true, then one can pick a set of small polynomials a and analyze the distribution of {[a · t]_q / q}_{t∈M_h} to distinguish h from a random element. Thus, under the indistinguishability assumption, this set of polynomials has to be negligibly small.
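To make the quantity {[a · t]_q / q} concrete, the following Python sketch computes the discrepancy of a finite point set in [0, 1]. It relies on the standard closed form for the one-dimensional (extreme) discrepancy of sorted points x_1 ≤ … ≤ x_n, namely 1/n + max_i(i/n − x_i) − min_i(i/n − x_i), rather than on an explicit search over intervals. The function names and the sample inputs are assumptions made for illustration.

```python
import random

def discrepancy(points):
    """Extreme discrepancy of points in [0, 1], via the closed form for sorted points."""
    xs = sorted(points)
    n = len(xs)
    gaps = [(i + 1) / n - x for i, x in enumerate(xs)]
    return 1 / n + max(gaps) - min(gaps)

def dot_mod_discrepancy(a, T, q):
    """Discrepancy of the points [a . t]_q / q for t ranging over the sequence T."""
    return discrepancy([(sum(x * y for x, y in zip(a, t)) % q) / q for t in T])

evenly_spread = discrepancy([(2 * i + 1) / 8 for i in range(4)])  # midpoints of 4 cells
all_equal = discrepancy([0.5] * 4)                                # worst case

# A hypothetical sample: 200 uniformly random vectors modulo q and a small vector a.
rng = random.Random(5)
q, dim = 257, 8
T = [[rng.randrange(q) for _ in range(dim)] for _ in range(200)]
a = [1, -1, 0, 1, 0, 0, -1, 1]
sample = dot_mod_discrepancy(a, T, q)
```

A sequence T is ∆-homogenously distributed in the sense above when `dot_mod_discrepancy(a, T, q)` stays below ∆ for every admissible a; the sketch evaluates it for a single a only.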
We ran experiments with ternary f, g and verified that the coefficients of [f^{-1}]_q equidistribute in Z_q and that M_h is (B, f)-weakly homogenously distributed for sufficiently small B. For the rest of the paper, we rely on the following assumption.

Proof. Using notation from Theorem 1, the probability that there exists a lattice point […]. Setting d ≥ (n log(2B + 1) + 1)/log(q/(2B + 1)) gives the result.
Setting d as required in Corollary 1, observe that the dimension of L' is n + d ≥ (n log(q) + 1)/log(q/(2B + 1)). This is similar to the subring lattice in [12, Theorem 6]. Thus, Corollary 1 completes the missing details on the validity of the subring attack and, along with [12, Theorem 6], gives a complete analysis of the subring attack under Assumption 1. We formalize this result in the following theorem. Here, β denotes the block size in the BKZ [21] algorithm.
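For completeness, the dimension bound on L' follows from the choice of d in Corollary 1 by a one-line computation, which holds for any fixed base of the logarithm:

```latex
\begin{align*}
n + d
  &\ge n + \frac{n\log(2B+1) + 1}{\log\bigl(q/(2B+1)\bigr)} \\
  &= \frac{n\bigl(\log q - \log(2B+1)\bigr) + n\log(2B+1) + 1}{\log\bigl(q/(2B+1)\bigr)} \\
  &= \frac{n\log q + 1}{\log\bigl(q/(2B+1)\bigr)}.
\end{align*}
```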

Comparing ||N K/L (g)|| and ||N K/L (f)h||
In Section 3, we showed that a small vector in the subfield lattice is the vector (N_{K/L}(f), N_{K/L}(g)), while in the subring lattice the vector (N_{K/L}(f), N_{K/L}(f)h) is small. The f part of these vectors is the same. Our interest is therefore in the g part. Moreover, we know that N_{K/L}(g) is an n'-dimensional vector and N_{K/L}(f)h is an n-dimensional vector. We show that these two elements have approximately the same Euclidean norm. It then follows that on average the coefficients of N_{K/L}(g) are larger than the coefficients of N_{K/L}(f)h. When we truncate the latter to n' coordinates, its norm becomes smaller than the norm of N_{K/L}(g). Using an assumption on the distribution of the coefficients of N_{K/L}(g), we quantify the difference in size. More precisely, Theorem 3 shows, with no further assumptions, that when [K : L] = 2, the average size of the coefficients of N_{K/L}(g) is expected to be √2 times that of the coefficients of N_{K/L}(f)h. In light of this result, our aim is to show that the ratio between ||N_{K/L}(g)|| and ||N_{K/L}(f)h|| tends to 1 as n increases. Then, we can conclude that the subring lattice of dimension 2n' is expected to contain shorter vectors than the subfield lattice of the same dimension.

We use random walks to model the coefficients of a product of two polynomials. A first case is for polynomials whose coefficients are drawn independently and uniformly from the set {−1, 0, 1}. A one-dimensional random walk over Z starts at 0 and at each step moves either +1 or −1 with equal probability. Let a_i, for i = 1, . . . , n, denote independent random variables with value either +1 or −1 with uniform probability, and let w_0 = 0 and w_n = ∑_{i=1}^{n} a_i. The series {w_n} defines a random walk over Z. The expected distance after n steps is on the order of √n. As n increases, the distribution of w_n approaches the normal distribution.
A second case is for polynomials whose coefficients are drawn from a Gaussian distribution: in a Gaussian random walk, we let a i follow the Gaussian distribution with standard deviation σ and mean zero. The expected distance after n steps is then on the order of σ √ n. For further background, see [15]. We now consider the specific case of subfield L ⊆ K of index 2.
Theorem 3. Let f, g ∈ R_q be two polynomials whose coefficients are drawn independently and uniformly from {−1, 0, 1}. Let N_{K/L}(g) = (u_1, . . . , u_n) and gσ_2(f) = (w_1, . . . , w_n). Then E[u_i^2 | u_i ≠ 0] = 8n/9 − 4/9 and E[w_i^2] = 4n/9. Thus, as n goes to infinity, the ratio between the expected magnitudes of the non-zero coefficients of N_{K/L}(g) and of gσ_2(f) tends to √2 in absolute value, and the ratio of the expected squared Euclidean norms E[||N_{K/L}(g)||^2] / E[||gσ_2(f)||^2] tends to 1.

Proof. We start by comparing the expected size of the coefficients of N_{K/L}(g) to those of gσ_2(f). Let a_i ∈ {−1, 0, 1} be chosen uniformly and independently at random and consider the polynomial g = a_{n−1} x^{n−1} + a_{n−2} x^{n−2} + · · · + a_1 x + a_0 ∈ R_q. Then σ_2(g) = −a_{n−1} x^{n−1} + a_{n−2} x^{n−2} − · · · − a_1 x + a_0. Each coefficient u_k of N_{K/L}(g) = gσ_2(g) is a sum of n terms. For k odd, u_k = 0, because each of the terms a_i a_j in this sum appears twice with opposite signs. For k even, u_k = 2 ∑_{i<j, i+j ≡ k mod n} ±a_i a_j + a_{k/2}^2 + a_{(n+k)/2}^2, since each term a_i a_j with i ≠ j appears twice with the same sign. Then […]

We would like to generalize this result to r > 2. While the coefficients of gσ_2(g) and gσ_2(f) can be expressed as random walks, and thus follow a Gaussian distribution, they may not be independent. This assumption seems natural and allows us to prove Theorem 4, a generalisation of Theorem 3 to any r > 0. We experimentally verified that as n grows, the ratio of the norms tends to 1, as in Theorem 4, see Figure 1. From this, we directly get that the ratio of the average coefficient tends to √r, see Claim 1.

Proof. We give a proof by induction on the index r where the base case is proven in Theorem 3. Suppose the claim holds for index r; we show that it holds for [K : L] = 2r. First note that for a tower of fields L ⊆ E ⊆ K we have N_{K/L}(a) = N_{E/L}(N_{K/E}(a)) for every a ∈ K (see [16, Theorem 2.29]).
Consider the case [K : E] = r, [E : L] = 2, and denote G := N_{K/E}(g) and F := N_{K/E}(f). Then, N_{K/L}(g) = N_{E/L}(G) = Gσ'_2(G) and N_{K/L}(f)h = N_{E/L}(F)h = G'σ'_2(F), where G' := Fh. The previous case of the construction, i.e. [K : E] = r, shows that each (non-zero) coefficient of F, G and G' follows a Gaussian distribution. Under Assumption 2 the coefficients can be considered to be independent. We can now repeat the process of Theorem 3. While G' ∈ K, note that F, G ∈ E so they have n/r non-zero coefficients. Thus, similarly to Theorem 3, each non-zero coefficient of Gσ'_2(G) is approximately 2 multiplied by a Gaussian random walk with n/2r steps, while each coefficient of G'σ'_2(F) is a random walk with n/r steps. By the induction hypothesis, a coefficient of G is expected to be larger than the coefficients of G' by a factor of √r for sufficiently large n. The result on the coefficients follows from evaluating the expected size of the random walks and the claim on the norms follows from Claim 1.
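The index-2 statement of Theorem 3 can be checked empirically with a small Monte Carlo experiment. The sketch below is an illustration only: it works over the integers, which assumes q is large enough that no reduction modulo q occurs, and the values n = 32 and 300 trials are arbitrary. It estimates the mean squared non-zero coefficient of N_{K/L}(g) = gσ_2(g) and of gσ_2(f), together with the ratio of the summed squared norms.

```python
import random

def negacyclic_mul(a, b):
    """Multiply a and b in Z[x]/(x^n + 1) over the integers (no reduction mod q)."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj  # x^n = -1
    return c

def sigma2(a):
    """The non-trivial automorphism for index 2: a(x) -> a(-x)."""
    return [-c if i % 2 else c for i, c in enumerate(a)]

def simulate(n, trials, seed=0):
    rng = random.Random(seed)
    sum_u2 = sum_w2 = 0
    odd_all_zero = True
    for _ in range(trials):
        f = [rng.choice((-1, 0, 1)) for _ in range(n)]
        g = [rng.choice((-1, 0, 1)) for _ in range(n)]
        u = negacyclic_mul(g, sigma2(g))  # N_{K/L}(g)
        w = negacyclic_mul(g, sigma2(f))  # g * sigma_2(f)
        odd_all_zero = odd_all_zero and not any(u[1::2])
        sum_u2 += sum(c * c for c in u)
        sum_w2 += sum(c * c for c in w)
    mean_u2 = sum_u2 / (trials * (n // 2))  # average over the even positions only
    mean_w2 = sum_w2 / (trials * n)
    return mean_u2, mean_w2, sum_u2 / sum_w2, odd_all_zero

mean_u2, mean_w2, norm_ratio, odd_all_zero = simulate(n=32, trials=300)
```

In such runs one expects `mean_u2 / mean_w2` to be close to 2 (a ratio of about √2 between magnitudes) and `norm_ratio` to be close to 1, with the gap to the limiting values shrinking as n grows.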
The following corollary compares small vectors in the subfield and subring lattices of same dimensions, without using projection in the subfield lattice.

Proof. To simplify notation, we write coeff(f) to denote the average size of the coefficients of a polynomial f. We have B_subfield^2 = (n/r)(coeff(N_{K/L}(f))^2 + coeff(N_{K/L}(g))^2), and B_subring^2 = (n/r)(coeff(N_{K/L}(f))^2 + coeff(N_{K/L}(f)h)^2). We know that coeff(N_{K/L}(f))^2 ≈ coeff(N_{K/L}(g))^2 and from Theorem 4, we also know that coeff(N_{K/L}(g))^2 ≈ r · coeff(N_{K/L}(f)h)^2. BKZ is guaranteed to output a vector of norm bounded by β^{2n'/β} λ_1(L). The second result follows from bounding this value by √q in both lattices.
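The two approximations quoted in the proof already determine the ratio between the two bounds. The following worked computation is a sketch under an explicit assumption, namely that in a lattice of dimension 2n' the subring vector keeps n/r coordinates of N_{K/L}(f)h, so that both vectors have n/r coordinates in each part. Writing c := coeff(N_{K/L}(f))^2:

```latex
\begin{align*}
B_{\text{subfield}}^2 &\approx \frac{n}{r}\,(c + c) = \frac{2n}{r}\,c, \\
B_{\text{subring}}^2  &\approx \frac{n}{r}\left(c + \frac{c}{r}\right) = \frac{n}{r}\cdot\frac{r+1}{r}\,c, \\
\frac{B_{\text{subfield}}^2}{B_{\text{subring}}^2} &\approx \frac{2r}{r+1}.
\end{align*}
```

Under these assumptions the ratio equals 4/3 for r = 2 and approaches 2 as r grows, so the predicted advantage of the subring vector increases with the index but stays bounded.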
It follows that if we take the same lattice dimension in both attacks, the subring lattice contains vectors of smaller size. Therefore one can solve the NTRU problem using the subring attack with a smaller q. As mentioned in [1, Section 6], it is not known that B_subfield is the norm of the shortest vector (see [12, Theorem 5]). Moreover, our experiments in Table 2 show that the ratio between q_subfield and q_subring grows with r. One possible explanation is that (N_{K/L}(f), N_{K/L}(f)h) is unbalanced, as we show its f part is much larger than the g part. Therefore, if there exists an integral multiple of this vector that decreases the size of the f part and increases the size of the g part so that the vector becomes balanced, the ratio between the feasible values of q would increase.
If the systems of equations derived from the lattices given in Corollary 2 are overdetermined, we can project and get smaller lattices. Since the g part in the subring lattice is smaller than in the subfield one, the subring attack system is more determined than the subfield one.
The following corollary shows that one can use projection to discard more equations in the subring attack and achieve a lower dimension.

Corollary 3. With the notation from Theorem 2 and Corollary 2, set B := B_subfield. Under Assumptions 1 and 2, for sufficiently large n, we can find a multiple αv for some non-zero α ∈ O such that using BKZ with block size β on the lattice L' of dimension n' + d, we have the following two cases. If L' is the subfield lattice, then n' + d ≥ (n' log(q) + 1)/log(q/(2B + 1)) and β/log β = 2n' log q/(log(q/B))^2, and if L' is the subring lattice, then n' + d ≥ […]

Table caption: For a fixed dimension, we compare the subring attack to the subfield attack without projection, and note whether the attack succeeded. In some cases the full field attack only succeeded with projection; (384;512) means we ran the full field attack in these two dimensions.

Experimental results
We implemented all three attacks in Sage and experimentally compared them using Sage's default LLL implementation. A success in our experiments is recovering a vector v such that ||v|| < q^{3/4}. As noted by [12], we either get vectors which are roughly of size q, or vectors of size √q that are short integral multiples of the private key.

First, we fix the parameters (n, dimension, r) and compare the smallest modulus q that succeeded for each of the three attacks using LLL. Details are given in Table 2. Note that we do not apply the projection technique to the subfield attack in these experiments. The analysis of the subfield attack without projection is given in Corollary 2.

Then, we compared the subring and subfield attacks by fixing (n, q, r) and comparing the smallest dimension that succeeded using LLL. For some of the lattices we used the projection technique by deleting the right-most columns until we reached the desired dimension. The difference is greater than our analysis predicts. Details are given in Table 3. The analysis of the subfield attack with projection is given in Corollary 3.

Experiments were run on a single core of an Intel Xeon E5-2699 v3 running at 2.30GHz, with 128 GB of RAM.
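The success criterion ||v|| < q^{3/4} can be tested exactly with integer arithmetic by comparing ||v||^4 with q^3. The helper below is a hypothetical sketch of such a check, not the code used in the experiments; the sample vectors are made up to illustrate the two regimes (norm around √q versus norm around q).

```python
def is_success(v, q):
    """Return True if the Euclidean norm of v is below q^(3/4), i.e. ||v||^4 < q^3."""
    norm_sq = sum(c * c for c in v)
    return norm_sq * norm_sq < q ** 3

q = 2 ** 20                        # here q^(3/4) = 2^15 exactly
short_vector = [2 ** 8] * 16       # norm 2^10 = sqrt(q): counted as a success
long_vector = [2 ** 18] * 16       # norm 2^20 = q: counted as a failure
boundary_vector = [2 ** 15]        # norm exactly q^(3/4): not strictly smaller
```

Working with the fourth power avoids floating-point square roots, which matters for the very large moduli used with overstretched parameters.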