Improved cryptanalysis of the AJPS Mersenne based cryptosystem

Abstract At Crypto 2018, Aggarwal, Joux, Prakash and Santha (AJPS) described a new public-key encryption scheme based on Mersenne numbers. Shortly after the publication of the cryptosystem, Beunardeau et al. described an attack with complexity $\mathcal{O}(2^{2h})$. In this paper, we describe an improved attack with complexity $\mathcal{O}(2^{1.75h})$.


Introduction
The AJPS public-key encryption scheme. At Crypto 2018, Aggarwal, Joux, Prakash and Santha (AJPS) described a new public-key encryption scheme based on arithmetic modulo Mersenne numbers [AJPS18]. A Mersenne prime is a prime integer $p$ of the form $p = 2^n - 1$, where $n$ is a prime. Arithmetic modulo $p$ has good properties, and one can establish a correspondence between integers modulo $p$ and binary strings of length $n$, up to the identification $0^n \sim 1^n$. In particular, one can define the Hamming weight of a number as the Hamming weight of the unique binary string associated to it, i.e. the number of 1s in its binary representation. In the earliest version of their work, the authors presented a public-key encryption scheme (AJPS-1) somewhat similar to the NTRU cryptosystem, but based on a new assumption, the Mersenne Low Hamming Ratio Assumption: given $H = F/G \bmod p$, where the binary representations of $F$ and $G$ modulo $p$ have low Hamming weight, $H$ looks pseudorandom, namely it is hard to distinguish $H$ from a random integer modulo $p$.
The Beunardeau et al. attack. Even though the authors claimed that the known lattice attacks against NTRU would not apply, very soon Beunardeau et al. [BCGN17] described a lattice-based attack against the first AJPS proposal. The attack complexity is $\mathcal{O}(2^{2h})$, where $h$ is the Hamming weight of $F$ and $G$. The attack was further analyzed in [dBDJdW18]; the authors also described a Meet-in-the-Middle attack against AJPS-1 based on locality-sensitive hash functions to obtain collisions; they showed that the lattice attack from [BCGN17] is more efficient.
Since AJPS-1 encrypts only a single bit at a time, it is not very efficient. However, in a later version of the article, published at Crypto 2018 [AJPS18], Aggarwal et al. described a variant (AJPS-2) that encrypts many bits at a time, with much larger security parameters to prevent the lattice attack.
Our contribution. In this paper we describe a variant of the Beunardeau et al. attack against AJPS-2, with improved complexity $\mathcal{O}(2^{1.75h})$ instead of $\mathcal{O}(2^{2h})$. Instead of recovering the private key, our attack only breaks the indistinguishability of ciphertexts.

The AJPS Cryptosystems
In this section we recall the two versions of the AJPS cryptosystems; see [AJPS18] for further details.
AJPS-1: bit-by-bit encryption. Let $p = 2^n - 1$ be a Mersenne prime, where $n$ itself is prime. Let $h$ be an integer such that $4h^2 < n \le 16h^2$, and let $F$ and $G$ be two random integers modulo $p$ with Hamming weight $h$. The public key is $pk = H = F/G \bmod p$ and the private key is $sk = G$. To encrypt, choose two random integers $A$ and $B$ of Hamming weight $h$, and encrypt the bit $b$ as:
$$C = (-1)^b \cdot (A \cdot H + B) \bmod p.$$
To decrypt, compute $d = \mathrm{Ham}(C \cdot G)$. Output 0 if $d \le 2h^2$, otherwise output 1.
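Decryption works because
$$C \cdot G = (-1)^b \cdot (A \cdot F + B \cdot G) \bmod p,$$
which has Hamming weight at most $2h^2$ if $b = 0$, and at least $n - 2h^2$ if $b = 1$. Indeed, for any number $x$ of Hamming weight $h$, the integer $x \cdot 2^z \bmod p$ for $z \ge 0$ is a cyclic shift of $x$, and therefore its Hamming weight remains unchanged. Since $F$ is a sum of $h$ powers of two, $A \cdot F$ is a sum of $h$ cyclic shifts of $A$ and has Hamming weight at most $h^2$; likewise the Hamming weight of $B \cdot G$ is at most $h^2$. Therefore the Hamming weight of $C \cdot G$ is at most $2h^2$ for $b = 0$. For $b = 1$, negation modulo $p$ complements every bit, which gives a Hamming weight of at least $n - 2h^2$.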
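The decryption argument above can be checked on a toy instance. The following is a minimal sketch, not the authors' implementation: the parameters $n = 127$, $h = 4$ and the helper names `ham`, `rand_weight`, `keygen`, `encrypt`, `decrypt` are illustrative assumptions, and the encryption formula $C = (-1)^b (A \cdot H + B) \bmod p$ is the one assumed from the AJPS-1 description.

```python
import random

n = 127                  # 2**127 - 1 is a Mersenne prime
p = 2**n - 1
h = 4                    # toy weight with 4*h*h < n <= 16*h*h

def ham(x):
    """Hamming weight of the n-bit representation of x mod p."""
    return bin(x % p).count("1")

def rand_weight(h, rng):
    """Random n-bit integer with exactly h bits set (hypothetical helper)."""
    return sum(1 << i for i in rng.sample(range(n), h))

def keygen(rng):
    F, G = rand_weight(h, rng), rand_weight(h, rng)
    H = F * pow(G, -1, p) % p          # H = F/G mod p
    return H, G

def encrypt(H, b, rng):
    A, B = rand_weight(h, rng), rand_weight(h, rng)
    return (-1) ** b * (A * H + B) % p

def decrypt(G, C):
    return 0 if ham(C * G) <= 2 * h * h else 1

rng = random.Random(1)
H, G = keygen(rng)

# Multiplying by 2**z modulo p rotates the bit string, so the weight is unchanged.
x = rand_weight(h, rng)
shift_ok = all(ham(x * 2**z) == h for z in range(n))

# Both plaintext bits decrypt correctly.
roundtrip_ok = all(decrypt(G, encrypt(H, b, rng)) == b for b in (0, 1))
print(shift_ok, roundtrip_ok)
```

The modular inverse uses the three-argument `pow` with exponent `-1`, which requires Python 3.8 or later.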
AJPS-2: error correcting codes. Let $n$ be a positive integer such that $p = 2^n - 1$ is a Mersenne prime. Let $h \in \mathbb{N}$ be such that $10h^2 < n \le 16h^2$. Let $F, G$ be two random integers modulo $p$ with Hamming weight $h$ and $R$ be a random integer modulo $p$. Set
$$pk = (R, T) \quad \text{with} \quad T = F \cdot R + G \bmod p,$$
and $sk = F$. To encrypt a message $m \in \{0, 1\}^h$, first generate three random integers $A, B_1, B_2$ modulo $p$, with Hamming weight $h$. Then, using the encoding algorithm $E : \{0, 1\}^h \to \{0, 1\}^n$ of an error correcting code $(E, D)$, compute the ciphertext:
$$C_1 = A \cdot R + B_1 \bmod p, \qquad C_2 = (A \cdot T + B_2 \bmod p) \oplus E(m).$$
To decrypt, compute $D((F \cdot C_1) \oplus C_2)$, where $D$ is the corresponding decoding algorithm. Decryption works because
$$F \cdot C_1 = A \cdot F \cdot R + F \cdot B_1 = A \cdot T - A \cdot G + F \cdot B_1 \pmod p,$$
and therefore the Hamming distance between $A \cdot T + B_2$ and $F \cdot C_1$ is expected to be low, which allows $m$ to be recovered with good probability.
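The identity behind AJPS-2 decryption can be verified numerically. The sketch below assumes the key and ciphertext equations $T = F \cdot R + G$, $C_1 = A \cdot R + B_1$ and mask $A \cdot T + B_2$ (all modulo $p$), with toy sizes $n = 127$, $h = 3$; the helper `rand_weight` is hypothetical. It checks the algebraic identity only and prints the observed Hamming distance without asserting any bound on it, since that distance is only expected to be small.

```python
import random

n, h = 127, 3                      # toy sizes with 10*h*h < n <= 16*h*h
p = 2**n - 1

def rand_weight(h, rng):
    """Random n-bit integer with exactly h bits set (hypothetical helper)."""
    return sum(1 << i for i in rng.sample(range(n), h))

rng = random.Random(2)
F, G = rand_weight(h, rng), rand_weight(h, rng)
R = rng.randrange(p)
T = (F * R + G) % p                # public key (R, T), secret key F

A, B1, B2 = (rand_weight(h, rng) for _ in range(3))
C1 = (A * R + B1) % p
mask = (A * T + B2) % p            # C2 is this mask XORed with E(m)

# F*C1 - (A*T + B2) = F*B1 - A*G - B2 (mod p): only low-weight terms remain.
lhs = (F * C1 - mask) % p
rhs = (F * B1 - A * G - B2) % p
identity_ok = lhs == rhs

# Observed Hamming distance between F*C1 and the mask (the decoder's noise).
distance = bin((F * C1 % p) ^ mask).count("1")
print(identity_ok, distance)
```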

The Beunardeau et al. Attack
Basic attack. Beunardeau et al. described an attack against AJPS-1 in [BCGN17] that recovers the private key from the public key. More precisely, they consider the following problem:
Definition 3.1 (Mersenne Low Hamming Ratio Search Problem (MLHSP)). Let $p = 2^n - 1$ be an $n$-bit Mersenne prime and $h$ an integer. Let $F, G$ be two $n$-bit random strings with Hamming weight $h$. Given $H = F/G \bmod p$, recover $F$ and $G$.
Their basic attack is based on the following observation. With probability $2^{-2h}$, we have both $F < \sqrt{p}$ and $G < \sqrt{p}$, and therefore, given $H = F/G \bmod p$, one can recover $F$ and $G$ by applying LLL in dimension 2. In the original proposal [AJPS17], it was recommended to take $h = 17$ for $\lambda = 120$ bits of security. However, here we have an attack that recovers the private key from the public key with probability $2^{-34}$; see also [dBDJdW18] for a detailed analysis.
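The weak-key observation above can be tried on a toy instance. The following sketch uses Lagrange-Gauss reduction, which plays the role of LLL in dimension 2, on the basis rows $(1, H)$ and $(0, p)$; this basis, the parameters $n = 127$, $h = 3$, and the particular values of $F$ and $G$ are assumptions chosen for illustration so that $F^2 + G^2 < p$. The shortest vector is then $\pm(G, F)$ divided by $\gcd(F, G)$.

```python
from math import gcd

n = 127
p = 2**n - 1

def lagrange_gauss(u, v):
    """Reduce a basis (u, v) of a 2-dimensional integer lattice.

    Returns a reduced basis whose first vector is a shortest nonzero
    lattice vector.
    """
    def norm2(w):
        return w[0] * w[0] + w[1] * w[1]

    if norm2(u) > norm2(v):
        u, v = v, u
    while True:
        dot = u[0] * v[0] + u[1] * v[1]
        # Nearest integer to dot / |u|^2, using exact integer arithmetic.
        m = (2 * dot + norm2(u)) // (2 * norm2(u))
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if norm2(v) >= norm2(u):
            return u, v
        u, v = v, u

# Weak key: both F and G have weight 3 and lie well below sqrt(p).
F = 2**50 + 2**20 + 1
G = 2**60 + 2**30 + 2**3
H = F * pow(G, -1, p) % p          # public key H = F/G mod p

short, _ = lagrange_gauss((1, H), (0, p))
g, f = abs(short[0]), abs(short[1])
c = gcd(F, G)
recovered = (g, f) == (G // c, F // c)
print(recovered)
```

Exact integer arithmetic is used throughout, so no floating-point precision issue arises for 127-bit inputs.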
More precisely, one considers the lattice $L$ generated by the rows of the matrix:
$$\begin{pmatrix} 1 & H \\ 0 & p \end{pmatrix}$$
We have that $\det L = p$; hence by the Gaussian heuristic it contains a vector of norm $(\det L)^{1/2} = \sqrt{p}$. Moreover $(G, F)$ is a short vector of the lattice. Therefore if both $F < \sqrt{p}$ and $G < \sqrt{p}$ we can recover $F$ and $G$; since $F$ and $G$ have Hamming weight $h$, this happens with probability $2^{-2h}$. We note that a similar attack can also be applied to the encryption equation:
$$C = (-1)^b \cdot (A \cdot H + B) \bmod p.$$
Namely if both $A < \sqrt{p}$ and $B < \sqrt{p}$, then we can recover $A$ and $B$ by applying LLL in dimension 3, and hence the plaintext bit $b$. Indeed, only one of $(H, C)$ and $(-H, C)$ is an instance of the following problem:
Definition 3.2 (Mersenne Low Hamming Combination Search Problem (MLHCSP)). Let $p = 2^n - 1$ be an $n$-bit Mersenne prime, $h$ be an integer, $R$ be a uniformly random $n$-bit string and $F, G$ have Hamming weight $h$. Given the pair $(R, F \cdot R + G \bmod p)$, find $F, G$.
Given $R$ and $T = F \cdot R + G \bmod p$, a variant attack recovers $F, G$ with probability $2^{-2h}$. More precisely, the attack works by considering the lattice $L$ of row vectors:
$$\begin{pmatrix} 2^{n/2} & 0 & T \\ 0 & 1 & -R \\ 0 & 0 & p \end{pmatrix}$$
We have that $(2^{n/2}, F, G)$ belongs to the lattice $L$. Moreover $\det L = 2^{n/2} \cdot p \approx 2^{3n/2}$. Hence by the Gaussian heuristic the lattice $L$ contains a vector of norm about $2^{n/2}$. Therefore if both $F < \sqrt{p}$ and $G < \sqrt{p}$ we can recover $F$ and $G$ by applying LLL to the lattice $L$.
Extension with random partitions. The basic attack from [BCGN17] is only a weak-key attack that recovers the private key from the public key with probability $2^{-2h}$ over the set of possible public keys. Similarly, the above variant attack against the encryption equation can only decrypt a fraction $2^{-2h}$ of the ciphertexts. Therefore, the authors extended their attack by considering random partitions, with higher dimensional lattices. In that case, the attack can recover the private key from any public key, solving MLHSP, with complexity $\mathcal{O}(2^{2h})$. The same partition strategy can be used for the MLHCSP with the same complexity. In our improved attack in the next section, we will also use random partitions.

Our new attack
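We describe our new attack against AJPS-2. We consider the previous encryption equation:
$$C_1 = A \cdot R + B_1 \bmod p, \qquad C_2 = (A \cdot T + B_2 \bmod p) \oplus E(m).$$
Given the public key $(R, T)$ and a ciphertext $(C_1, C_2)$, our attack can distinguish between $m = 0$ and $m \neq 0$. Assume that $m = 0$ and $E(m) = 0$. In that case, we have:
$$C_1 = A \cdot R + B_1 \bmod p, \qquad C_2 = A \cdot T + B_2 \bmod p.$$
We claim that if $A$, $B_1$ and $B_2$ are less than $2^{2n/3}$, then we can recover $A$, $B_1$ and $B_2$ with LLL. Namely we consider the lattice $L$ of row vectors:
$$\begin{pmatrix} 2^{2n/3} & 0 & C_1 & C_2 \\ 0 & 1 & -R & -T \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}$$
We have that $\det L = 2^{2n/3} \cdot p^2 \approx 2^{8n/3}$. Therefore by the Gaussian heuristic the lattice $L$ contains vectors of norm about $2^{2n/3}$. Moreover the lattice $L$ contains the vector $(2^{2n/3}, A, B_1, B_2)$. Therefore if $A$, $B_1$ and $B_2$ are less than $2^{2n/3}$, we can recover $A$, $B_1$ and $B_2$ by applying LLL to $L$.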
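The membership claim for the target vector can be checked directly, without running LLL. The sketch below assumes the basis rows $(\beta, 0, C_1, C_2)$, $(0, 1, -R, -T)$, $(0, 0, p, 0)$, $(0, 0, 0, p)$ with $\beta = 2^{\lceil 2n/3 \rceil}$, which is one basis consistent with the stated determinant and target vector; the toy sizes and the helper names `low_weight` and `in_lattice` are hypothetical.

```python
import random

n, h = 127, 3
p = 2**n - 1
m = -(-2 * n // 3)                 # ceil(2n/3) = 85
beta = 2**m                        # scaling factor, about 2^(2n/3)

rng = random.Random(3)

def low_weight(bound):
    """Weight-h integer with all set bits below position `bound`."""
    return sum(1 << i for i in rng.sample(range(bound), h))

# Key pair and an encryption of m = 0 with E(m) = 0.
F, G = low_weight(n), low_weight(n)
R = rng.randrange(p)
T = (F * R + G) % p
A, B1, B2 = low_weight(m), low_weight(m), low_weight(m)   # "lucky" case
C1 = (A * R + B1) % p
C2 = (A * T + B2) % p

def in_lattice(v):
    """Test whether v is an integer combination of the assumed basis rows."""
    x0, a, y1, y2 = v
    if x0 % beta:
        return False
    c = x0 // beta                 # coefficient of the first row
    # After removing c*row1 + a*row2, the rest must be multiples of p.
    return (y1 - c * C1 + a * R) % p == 0 and (y2 - c * C2 + a * T) % p == 0

target = (beta, A, B1, B2)
det_bits = (beta * p * p).bit_length()   # determinant of the triangular basis
print(in_lattice(target), det_bits, 8 * n / 3)
```

Recovering the target from the public data would additionally require a lattice reduction routine, which is deliberately left out of this sketch.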
Since $A$ has Hamming weight $h$, the probability that $A < 2^{2n/3}$ is approximately $(2/3)^h$; the same holds for $B_1$ and $B_2$. The success probability of the attack is therefore:
$$(2/3)^{3h} \approx 2^{-1.75h},$$
which gives a slightly better success probability than the original attack with $2^{-2h}$. Therefore, using the same partition technique as in [BCGN17], the attack complexity to break the indistinguishability of any ciphertext is $\mathcal{O}(2^{1.75h})$ instead of $\mathcal{O}(2^{2h})$.
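The exponent $1.75$ comes from $3 \log_2(3/2) \approx 1.755$. The short computation below is an illustrative check under toy assumptions ($n = 127$, $h = 3$, threshold $2^{\lceil 2n/3 \rceil}$): it compares the exact probability that all $h$ set bits of a random weight-$h$ integer fall below the threshold with the approximation $(2/3)^h$.

```python
from math import comb, log2

n, h = 127, 3
m = -(-2 * n // 3)                 # ceil(2n/3): A < 2^m iff all set bits are below m

# Exact probability for a uniformly random n-bit integer of weight h.
exact = comb(m, h) / comb(n, h)
approx = (2 / 3) ** h

# Exponents of the success probabilities, per unit of h.
exponent_new = 3 * log2(3 / 2)     # new attack: (2/3)^(3h) = 2^(-exponent_new * h)
exponent_old = 2.0                 # original attack: 2^(-2h)

print(exact, approx, exponent_new, exponent_old)
```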

Working with random partitions
We show that using the same random partition technique as in [BCGN17], we can break the indistinguishability property of any ciphertext $(C_1, C_2)$, whereas the basic attack above only works when $A$, $B_1$ and $B_2$ are less than $2^{2n/3}$, which only happens with probability $(2/3)^{3h}$.
In the following, we determine under which conditions the secret vector $s$ is the unique shortest vector of the lattice $L_{\beta,P,Q,S}$. Given $A, B_1, B_2$, we say that the triple $(P, Q, S)$ of partitions of $[n]$ is a lucky triple if there exists a scaling factor $\beta \in \mathbb{N}$ such that the secret vector $s$ is the unique shortest vector of $L_{\beta,P,Q,S}$. In that case $L_{\beta,P,Q,S}$ is said to be a lucky lattice with respect to $A, B_1, B_2$. In other words, we aim to establish sufficient conditions under which a lattice $L_{\beta,P,Q,S}$ is lucky given a ciphertext $C = (C_1, C_2)$ such that $E(m) = 0$.
We write $\beta = 2^{tn}$; thus we have $\mathrm{vol}(L_{\beta,P,Q,S}) \approx 2^{(2+t)n}$. By the Gaussian heuristic, we obtain the following estimate of the length of the shortest vector of $L_{\beta,P,Q,S}$:
$$\sqrt{\frac{d}{2\pi e}} \cdot \mathrm{vol}(L_{\beta,P,Q,S})^{1/d} \qquad (1)$$
Since the Hamming weight of $A, B_1, B_2$ is the same, we take the same number of intervals in the three partitions, $k = j = \ell$. We note that the lattice $L_{\beta,P,Q,S}$ contains intrinsic short vectors $u = (0, \ldots, 0, 2^g, -1, 0, \ldots, 0)$ whose norm is about $2^g$, where $g$ is of the form $p_i - p_{i-1}$ or $q_i - q_{i-1}$ or $s_i - s_{i-1}$. If we consider partitions with intervals of similar length, we obtain $\|u\| \approx 2^{n/k}$. Therefore we have to ensure that such vectors are not shorter than our target secret vector.
In low dimensions we can assume that LLL recovers the shortest vector $s$ of the lattice. From (1) we must therefore ensure:
$$\|s\| \le \sqrt{\frac{d}{2\pi e}} \cdot 2^{(2+t)n/d}$$
where $d = 3k + 1$ is the lattice dimension. We expect the entries of the secret vector to be of about the same size for a lucky triple, hence we take the scaling factor $\beta = 2^{tn}$ to match the size of these entries. Combining this with the previous inequality gives approximately:
$$t \le \frac{2}{3k} - \frac{3k+1}{6kn}.$$
Therefore we obtain the following approximate condition for a lucky triple $(P, Q, S)$ of partitions: the entries of the secret vector $s$ must be smaller than
$$2^{\frac{2n}{3k}}. \qquad (2)$$
It remains to evaluate the probability of finding a lucky triple of partitions $(P, Q, S)$. It is actually easier to assume that the partitions $(P, Q, S)$ are fixed, and that the ciphertext $C = (C_1, C_2)$ is random. In that case, from the bound (2), each of the $h$ bits of the integers $A$, $B_1$ and $B_2$ must land in one of the subintervals of length $2n/(3k)$ of the $k$ partition intervals. For a single bit, this happens with probability roughly $k \cdot \frac{2n}{3k} \cdot \frac{1}{n} = 2/3$. Therefore, as in the basic attack, the success probability is roughly $(2/3)^{3h} \approx 2^{-1.75h}$, and the number of partitions to try before finding a lucky one is approximately $\mathcal{O}(2^{1.75h})$, instead of $\mathcal{O}(2^{2h})$ in the original attack from [BCGN17].
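The single-bit probability of $2/3$ can be illustrated by counting positions. The sketch below is a simplified model, not the paper's experiment: it assumes a fixed partition of the $n$ bit positions into $k$ intervals of nearly equal length and calls a position lucky when it lies in the lowest two thirds of its interval. The names `lucky_fraction` and `expected_partitions` are hypothetical.

```python
def lucky_fraction(n, k):
    """Fraction of bit positions in the low 2/3 of their partition interval."""
    bounds = [i * n // k for i in range(k + 1)]   # near-equal intervals
    lucky = 0
    for lo, hi in zip(bounds, bounds[1:]):
        length = hi - lo
        # Offsets off with off < 2*length/3 keep the chunk below 2^(2*length/3).
        lucky += sum(1 for off in range(length) if 3 * off < 2 * length)
    return lucky / n

def expected_partitions(h, q=2 / 3):
    """Expected number of partition triples to try if each of 3h bits is lucky w.p. q."""
    return q ** (-3 * h)

frac = lucky_fraction(127, 4)
print(frac)
for h in (3, 6, 9):
    print(h, round(expected_partitions(h)), 2 ** (2 * h))
```

Under this model the expected count grows like $(3/2)^{3h}$, which stays below the $2^{2h}$ of the original attack for every $h \ge 1$.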
Security parameter selection. In the latest version of the paper, the authors recommend taking $h = \lambda$ for $\lambda$ bits of security, in order to prevent possible improvements of the Beunardeau et al. attack. Therefore our attack does not affect the choice of parameters proposed in [AJPS18].

Practical experiments
We have performed some practical experiments for various values of the bitsize $n$ and Hamming weight $h$ of AJPS-2, in order to compare our new attack with the original Beunardeau et al. attack. For both attacks, since we do not know a priori the optimal partition size $k$ to recover the secret, we loop over all possible $1 \le k \le h$. We summarize our results in Table 1, showing that our attack indeed requires fewer partitions than the original attack.
Table 1. Average number $\bar{y}$ of partitions required to recover the secret values $A, B_1, B_2$, compared to the average number $\bar{Y}$ required for the original attack. We used 70 samples for h = 3, 6, 7, and 9 samples for h = 9.