BY-NC-ND 3.0 license Open Access Published by De Gruyter October 18, 2017

RSA: A number of formulas to improve the search for p+q

  • Ahmed Mohammed and Abdulrahman Alkhelaifi

Abstract

Breaking RSA is one of the fundamental problems in cryptography. Due to its reliance on the difficulty of the integer factorization problem, no efficient solution has been found despite decades of extensive research. One of the possible ways to break RSA is by finding the value p+q, which requires searching the set of all even integers. In this paper, we present a number of formulas for p+q that depend on the form of n=pq. These formulas make the set to be searched much smaller than the set of all even integers.

MSC 2010: 94A60

1 Introduction

The security of RSA is one of the most researched topics in cryptography. It depends strongly on the size of the key used: large keys are considered secure and hard to break. A large body of research in cryptology has analyzed the security of RSA and possible ways to break it. The naive way is brute force, where an attacker searches for the key in the set of all possible keys. However, this method is impractical for sufficiently large keys and is believed to be computationally infeasible.

Another way requires solving the integer factorization problem. Integer factorization is a very difficult mathematical problem: for sufficiently large numbers it requires tremendous time and computing power, and can take hundreds or even thousands of years on the most advanced computers. No efficient algorithm is known that factors large integers in a reasonable time. For that reason, RSA is still widely used today to secure internet communications.

Another possible way to break RSA is to find the value p+q. Finding p+q allows us to find p and q if we combine it with the following equation for p-q (taking p>q):

(1) $p-q=\sqrt{(p+q)^2-4n}.$

When we have the values p+q and p-q, we can construct a system of two linear equations to solve for p and q, where p=[(p+q)+(p-q)]/2 and q=[(p+q)-(p-q)]/2.
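The recovery step can be sketched in Python as follows. This is a minimal illustration, assuming a candidate value r for p+q is already available; the function name `factors_from_sum` is ours and does not appear in the paper.

```python
from math import isqrt

def factors_from_sum(n: int, r: int):
    """Return (p, q) with p >= q if r equals p + q for n = p * q, else None."""
    disc = r * r - 4 * n              # equals (p - q)^2 when r is the true sum
    if disc < 0:
        return None
    d = isqrt(disc)                   # integer square root, candidate for p - q
    if d * d != disc or (r + d) % 2:  # not a perfect square, or parity mismatch
        return None
    p, q = (r + d) // 2, (r - d) // 2
    return (p, q) if p * q == n else None

# Small illustration: n = 7 * 11 = 77 and r = 7 + 11 = 18.
print(factors_from_sum(77, 18))
```

The final check `p * q == n` guards against a wrong candidate r; equation (1) alone only guarantees the result when r really is p+q.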

The trivial method to find p+q is to search the set of all even numbers. In our work, we try to find a better way to search for p+q, given that n=pq, by analyzing the relationship between the two values. We present a number of formulas for p+q for any form of n. These formulas limit the search to a much smaller set than the set of all even numbers, thus making this method a significant improvement on simple brute force attacks.

2 Background and related work

2.1 The RSA cryptosystem

RSA is a well-known cryptosystem that was introduced in 1977 by Ronald Rivest, Adi Shamir and Leonard Adleman [10]. They invented RSA to provide confidentiality and authenticity to digital information travelling through insecure communication channels. RSA is one of the first cryptosystems to implement asymmetric encryption using public key cryptography. Before RSA, people relied on symmetric encryption whereby two parties must share the same key in order to communicate secretly. This shared key requirement is especially difficult in practice. Moreover, if one person wants to privately communicate with multiple people, this person needs to share a different key with each one of them adding more complexity to the problem. Public key cryptography solves this problem by using two different keys. One key is used for encryption, and that key is distributed publicly, while the other private key is used for decryption and it must be kept secret. If Alice wants to communicate with Bob in a secure way, she would use Bob’s public key to encrypt her message and then send the resulting cipher text to Bob. Using his private key, Bob then decrypts the cipher text to obtain the original message from Alice.

The inventors of RSA needed a special type of mathematical function, called a one-way trapdoor function, in order to implement public key cryptography. Such a function is easy to compute for any given input but hard to invert without an additional piece of information called the trapdoor. The function they used was modular exponentiation, where the encryption of a message m can be easily computed by raising m to some exponent e, and then taking the remainder after dividing by some number n called the modulus. The integer n is the product of two large prime numbers p and q, and e is an odd integer such that $e \ge 3$. The pair (n,e) is distributed as the public key. To reverse this computation, another exponent d is needed to undo the effect of e, and that integer d is the trapdoor. The integer d is calculated so that $ed \equiv 1 \pmod{\phi(n)}$, where $\phi$ is Euler’s totient function and $\phi(n)=(p-1)(q-1)$. The exponent d must be kept secret along with the integers p and q. To break RSA, one must find d given only (n,e), which requires factoring n to find its prime factors p and q.
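The key setup described above can be illustrated with a toy Python sketch. The primes 61 and 53, the exponent 17, and the helper names are illustrative assumptions chosen for readability; real keys use primes hundreds of digits long together with message padding.

```python
from math import gcd

def make_keypair(p: int, q: int, e: int):
    """Build a toy RSA key pair from primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # Euler's totient of n = pq
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)              # modular inverse of e (Python 3.8+)
    return (n, e), d

def encrypt(m: int, public_key) -> int:
    n, e = public_key
    return pow(m, e, n)              # m^e mod n

def decrypt(c: int, public_key, d: int) -> int:
    n, _ = public_key
    return pow(c, d, n)              # c^d mod n

public_key, d = make_keypair(61, 53, 17)
ciphertext = encrypt(65, public_key)
print(public_key, d, decrypt(ciphertext, public_key, d))
```

Knowing p and q makes computing d a one-line operation, which is why recovering the factors (or p+q) breaks the key.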

2.2 The RSA problem

Breaking RSA is known as the RSA problem. It is formally defined as follows: compute M given a public key (n,e) and a ciphertext $C=M^e \bmod n$; see [9]. It is believed to be as difficult as the integer factorization problem; however, no definite proof has been found. Clearly, a solution to the integer factorization problem also solves the RSA problem, but it is still unknown whether the converse is also true. On the one hand, some research shows that breaking RSA is not equivalent to integer factorization for a very small public exponent [4]. On the other hand, multiple researchers suggest that the RSA problem and integer factorization are equivalent [1, 5].

The fastest known method for factoring large numbers is the general number field sieve [2]. Another method is the Fermat factoring attack, which can be used if the factors p and q are very close to each other [6]. Moreover, elliptic curves have been used to factor large integers [7].

There are other ways to break RSA without factoring n. One way is to solve the discrete logarithm problem to find the private exponent d that satisfies the equation $M=C^d \bmod n$. This is also a difficult problem that has not been solved efficiently yet; see [8]. Other attacks target the RSA cryptosystem itself. The most notable attacks include common modulus, low private exponent, low public exponent, Hastad’s broadcast attack, the Franklin-Reiter related message attack, Coppersmith’s short pad attack, the partial key exposure attack, and other implementation attacks, all explained by Boneh in [3].

Despite the large number of attacks against RSA, it is still considered secure if implemented properly and large key sizes are used.

3 Technical details

In this section, we explain in detail the methodology to find a formula for p+q. Let n=pq and r=p+q. We first define general forms for p and q. Then we consider all the possible forms that n can have, which are all the different combinations of the product pq. After that, we explain how to find r for one form of n. We omit the details for the remaining forms of n, which follow the same methodology, but we provide the corresponding formulas in Appendix B (see Tables 2–23).

3.1 Divisibility by 3 and 4

In this section, we try to find a relationship between n and r. Since r is the sum of two odd numbers, then it must be divisible by 2. Therefore, we will only consider the set of even positive integers rather than the set of all positive integers. Next, we consider divisibility by 3 and 4 and find the following results.

Proposition 3.1.

Let p and q be distinct primes not equal to 3, n=pq and r=p+q. Then $3 \mid n+1$ if and only if $3 \mid r$.

Proof.

Let p and q be two distinct prime numbers not equal to 3. Then each of them has one of the following forms:

  1. 3k+1,

  2. 3k+2.

Then the possible forms of n are:

  1. (3k+1)(3m+1)=9km+3(k+m)+1=3k+1,

  2. (3k+1)(3m+2)=9km+3(2k+m)+2=3k+2,

  3. (3k+2)(3m+2)=9km+6(k+m)+4=3k+1,

and the possible forms of r will, respectively, be:

  1. (3k+1)+(3m+1)=3(k+m)+2=3k+2,

  2. (3k+1)+(3m+2)=3(k+m)+3=3k,

  3. (3k+2)+(3m+2)=3(k+m)+4=3k+1.

Here n+1 is not divisible by 3 for the first and third forms, and similarly the corresponding forms of r are also not divisible by 3. For the second form, both n+1 and r are divisible by 3. Therefore, we conclude that $3 \mid n+1$ if and only if $3 \mid r$. ∎

Proposition 3.2.

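Let p and q be distinct primes, n=pq and r=p+q. Then $4 \mid n+1$ if and only if $4 \mid r$.

Proof.

Let p and q be two distinct prime numbers. If one of them is 2, then n+1 and r are both odd, so neither is divisible by 4. Otherwise, p and q are odd and each has one of the following forms: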

  1. 4k+1,

  2. 4k+3.

Then the possible forms of n are:

  1. (4k+1)(4m+1)=16km+4(k+m)+1=4k+1,

  2. (4k+1)(4m+3)=16km+4(3k+m)+3=4k+3,

  3. (4k+3)(4m+3)=16km+12(k+m)+9=4k+1,

and the possible forms of r will, respectively, be:

  1. (4k+1)+(4m+1)=4(k+m)+2=4k+2,

  2. (4k+1)+(4m+3)=4(k+m)+4=4k,

  3. (4k+3)+(4m+3)=4(k+m)+6=4k+2.

Here n+1 for the first and third forms is not divisible by 4, and similarly the corresponding forms of r are also not divisible by 4. For the second form, both n+1 and r are divisible by 4. Therefore, we conclude that $4 \mid n+1$ if and only if $4 \mid r$. ∎

By combining Propositions 3.1 and 3.2, we arrive at the following result.

Corollary 3.3.

Let p and q be distinct primes, n=pq and r=p+q. Then $12 \mid n+1$ if and only if $12 \mid r$.
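The three divisibility statements can be spot-checked numerically. The sketch below is an empirical check over small primes, not a proof; the bound 500 and the helper names `primes_up_to` and `divisibility_agrees` are assumptions made for illustration.

```python
from itertools import combinations

def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def divisibility_agrees(d: int, primes) -> bool:
    """True if d | (pq + 1) exactly when d | (p + q), for all pairs p < q."""
    return all(
        ((p * q + 1) % d == 0) == ((p + q) % d == 0)
        for p, q in combinations(primes, 2)
    )

primes = primes_up_to(500)
for d in (3, 4, 12):
    print(d, divisibility_agrees(d, primes))
```

For comparison, the same test fails for d=5: with p=2 and q=3 we have r=5 but n+1=7.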

3.2 Finding a formula for p+q

Consider a prime not equal to 2,3,5. Then it has one of the following forms:

  1. 10x+1,

  2. 10x+3,

  3. 10x+7,

  4. 10x+9.

Let $p=10x+z_1$, and without loss of generality, let $q=10(x+j)+z_2$, where $z_1,z_2 \in \{1,3,7,9\}$ and $j \ge 0$ if $z_1 \ne z_2$, or $j>0$ if $z_1=z_2$. Then n=pq can have one of the following forms:

  1. 100x(x+j)+10(2x+j)+1,

  2. 100x(x+j+1)+10(8x+9j+8)+1,

  3. 100x(x+j+1)+10(7j+2)+1,

  4. 100x(x+j+1)+10(3j+2)+1,

  5. 100x(x+j)+10(4x+3j)+3,

  6. 100x(x+j)+10(4x+j)+3,

  7. 100x(x+j+1)+10(6x+9j+6)+3,

  8. 100x(x+j+1)+10(6x+7j+6)+3,

  9. 100x(x+j)+10(8x+7j)+7,

  10. 100x(x+j)+10(8x+j)+7,

  11. 100x(x+j+1)+10(2x+9j+2)+7,

  12. 100x(x+j+1)+10(2x+3j+2)+7,

  13. 100x(x+j)+10(6x+3j)+9,

  14. 100x(x+j+1)+10(4x+7j+4)+9,

  15. 100x(x+j+1)+10(9j)+9,

  16. 100x(x+j+1)+10j+9,

and r=p+q will have, respectively, one of the following forms:

  1. 10(2x+j)+2,

  2. 10(2x+j+1)+8,

  3. 10(2x+j+1),

  4. 10(2x+j+1),

  5. 10(2x+j)+4,

  6. 10(2x+j)+4,

  7. 10(2x+j+1)+6,

  8. 10(2x+j+1)+6,

  9. 10(2x+j)+8,

  10. 10(2x+j)+8,

  11. 10(2x+j+1)+2,

  12. 10(2x+j+1)+2,

  13. 10(2x+j)+6,

  14. 10(2x+j+1)+4,

  15. 10(2x+j+1),

  16. 10(2x+j+1).

From the above two lists, if $n \equiv 1 \pmod{10}$, then $r \in \{x : x \equiv 0, 2, 8 \pmod{10}\}$. As we can see, r now lies in a smaller set than the set of all even positive integers.
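The last-digit relationship between n and r can be tabulated directly from the four admissible last digits of p and q. The following short Python sketch does this; the function name is ours.

```python
from collections import defaultdict

def sum_digits_by_product_digit() -> dict:
    """Map each last digit of n = pq to the possible last digits of r = p + q."""
    table = defaultdict(set)
    for z1 in (1, 3, 7, 9):          # last digit of p
        for z2 in (1, 3, 7, 9):      # last digit of q
            table[(z1 * z2) % 10].add((z1 + z2) % 10)
    return {digit: sorted(sums) for digit, sums in sorted(table.items())}

print(sum_digits_by_product_digit())
# → {1: [0, 2, 8], 3: [4, 6], 7: [2, 8], 9: [0, 4, 6]}
```

The entry for 1 reproduces the set stated above; the other entries correspond to the remaining groups in the two lists.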

We want to further reduce the set that contains r. By letting $n \equiv 11 \pmod{100}$, we can use the first formula in each list, i.e.,

n=100x(x+j)+10(2x+j)+1,
r=10(2x+j)+2.

Notice that j is odd, since the tens digit of n is 1 and hence $2x+j \equiv 1 \pmod{10}$. We also notice that $4 \mid n+1$, since $n+1 \equiv 2 \pmod{10}$ and $2x+j$ is odd (e.g., $n+1$ ends in $12, 32, \dots$). Therefore, by Proposition 3.2, we conclude that $4 \mid r$.

Next, we consider divisibility by 8 and arrive at the following statement.

Proposition 3.4.

Let $n \equiv 11 \pmod{100}$. Then the hundreds digit of n is odd if and only if $8 \mid r$.

Proof.

We will analyze a specific case here, which is the first form of n, and the rest of the formulas can be proved in a similar way.

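($\Leftarrow$) We assume that $8 \mid r$. Then

$$\begin{aligned}
r = 10(2x+j)+2 \equiv 0 \pmod 8 &\;\Rightarrow\; 2(2x+j) \equiv 6 \pmod 8\\
&\;\Rightarrow\; 2x+j \equiv 3 \pmod 4\\
&\;\Rightarrow\; 2x+j = 4k+3.
\end{aligned}$$

Since the tens digit is equal to 1, we have $4k+3 \equiv 1 \pmod{10}$. Hence,

$$\begin{aligned}
4k \equiv 8 \pmod{10} &\;\Rightarrow\; 2k \equiv 4 \pmod 5\\
&\;\Rightarrow\; k \equiv 2 \pmod 5\\
&\;\Rightarrow\; k = 5m+2,
\end{aligned}$$

so 2x+j=4k+3=4(5m+2)+3=20m+11. Then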

n=100x(x+j)+10(2x+j)+1
=100x(x+j)+10(20m+11)+1
=100x(x+j)+200m+110+1
=100[x(x+j)+2m+1]+11.

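We notice in the above expression that 2m+1 is always odd and, as we previously noted, that j is odd. Hence, x(x+j) will always be even regardless of x. Consequently, the expression x(x+j)+2m+1 is always odd. Therefore, if $8 \mid r$, then the hundreds digit is odd.

($\Rightarrow$) We argue by contraposition. Suppose $8 \nmid r$. Since $4 \mid r$, we have $r \equiv 4 \pmod 8$, so $2(2x+j) \equiv 2 \pmod 8$ and therefore 2x+j=4k+1. We then have

$$\begin{aligned}
2x+j = 4k+1 \equiv 1 \pmod{10} &\;\Rightarrow\; 4k \equiv 0 \pmod{10}\\
&\;\Rightarrow\; 2k \equiv 0 \pmod 5\\
&\;\Rightarrow\; k \equiv 0 \pmod 5\\
&\;\Rightarrow\; k = 5m,
\end{aligned}$$

so 2x+j=4k+1=4(5m)+1=20m+1. Then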

n=100x(x+j)+10(2x+j)+1
=100x(x+j)+10(20m+1)+1
=100x(x+j)+200m+10+1
=100[x(x+j)+2m]+11.

We notice in this case that 2m is even and, as explained above, x(x+j) will always be even regardless of x, so the expression x(x+j)+2m is always even. Therefore, if $8 \nmid r$, then the hundreds digit is even.

This settles the first formula in the list for $n \equiv 11 \pmod{100}$. By combining the two results, we complete the proof of Proposition 3.4. ∎
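Proposition 3.4 can also be checked empirically across all forms of n at once. The sketch below scans pairs of small primes whose product ends in 11 and compares the parity of the hundreds digit with divisibility of r by 8. The bound 2000 and the function names are assumptions, and a clean run is evidence rather than a proof.

```python
from itertools import combinations

def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def check_prop_3_4(limit: int = 2000):
    """Return (pairs checked, violations) for products n = pq ending in 11."""
    checked = violations = 0
    for p, q in combinations(primes_up_to(limit), 2):
        n, r = p * q, p + q
        if n % 100 != 11:
            continue
        checked += 1
        hundreds_odd = (n // 100) % 2 == 1
        if hundreds_odd != (r % 8 == 0):
            violations += 1
    return checked, violations

checked, violations = check_prop_3_4()
print(checked, violations)
```

For instance, p=3 and q=37 give n=111 (hundreds digit 1, odd) and r=40, which is divisible by 8.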

Let $n \equiv 11 \pmod{100}$. If the hundreds digit is odd, then one of the possible formulas is

r=10(20m+11)+2=200m+112

and if the hundreds is even, then one of the possible formulas is

r=10(20m+1)+2=200m+12

By combining this result with Proposition 3.1, we get Table 1.

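Table 1

Possible values of r when $n \equiv 11 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+312 | r=600k+112 | r=600k+12 | r=600k+212 |
| | r=600k+512 | | r=600k+412 |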

Table 1 is for the first form in the list of n when n=11mod100. The rest of the forms are provided in Appendix B.
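The step from the formulas r=200m+112 and r=200m+12 to the entries of Table 1 can be written out as follows. This is a worked sketch; the substitution m=3k+i is introduced here for illustration and is not notation from the paper.

```latex
\begin{aligned}
&\text{Hundreds odd: } r = 200m + 112,\quad m = 3k + i,\ i \in \{0,1,2\}\\
&\quad i = 0:\ r = 600k + 112 \equiv 1 \pmod 3\\
&\quad i = 1:\ r = 600k + 312 \equiv 0 \pmod 3\\
&\quad i = 2:\ r = 600k + 512 \equiv 2 \pmod 3\\[4pt]
&\text{Hundreds even: } r = 200m + 12,\quad m = 3k + i,\ i \in \{0,1,2\}\\
&\quad i = 0:\ r = 600k + 12 \equiv 0 \pmod 3\\
&\quad i = 1:\ r = 600k + 212 \equiv 2 \pmod 3\\
&\quad i = 2:\ r = 600k + 412 \equiv 1 \pmod 3
\end{aligned}
```

By Proposition 3.1, the residues divisible by 3 (312 and 12) belong to the case where 3 divides n+1, and the remaining residues belong to the case where it does not.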

3.3 Example

In this section, we give an example of how we can find r=p+q for a given n, and subsequently p and q using equation (1). Notice that $p+q>2\sqrt{n}$, which we use to further improve our search.

We start by looking at the value n, and find the table of formulas corresponding to the form of n. After that, we use the formulas to search for r. We have found r when the value $\sqrt{r^2-4n}$ is an integer (see Appendix A).

Given n=9476465591, we will try to find p and q (we computed n from p=101693 and q=93187 so we are going to check that our solution matches these values).

  1. Since $n \equiv 91 \pmod{100}$, we look at Table 11 in Appendix B.

  2. Since $3 \mid n+1$ and the hundreds digit is odd, r has one of the following forms:

    1. $r_1=600k_1+192$,

    2. $r_2=600k_2+408$,

    3. $r_3=120k_3$.

  3. We have $2\sqrt{n} \approx 194694$, and since $r_1,r_2,r_3>2\sqrt{n}$, the starting points will be:

    1. $k_{1,0}=324$,

    2. $k_{2,0}=324$,

    3. $k_{3,0}=1623$.

  4. We evaluate $r_1,r_2,r_3$, starting with the above values for $k_{1,0},k_{2,0},k_{3,0}$, respectively, and continue to increment each k until $\sqrt{r^2-4n}$ is an integer. We find the right value for k to be $k_{3,1}=k_{3,0}+1=1624$. Now we compute r in the following way: $r=120k_{3,1}=120 \cdot 1624=194880$.

  5. We have $p-q=\sqrt{r^2-4n}=\sqrt{37978214400-37905862364}=\sqrt{72352036}=8506$.

  6. We have $p=[r+(p-q)]/2=203386/2=101693$ and $q=n/p=9476465591/101693=93187$.

Notice in this example that k is small; this is due to p-q being small.
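The search in this example can be sketched in Python. This is a minimal illustration under stated assumptions: the caller supplies the (step, offset) pairs read from the table that matches n (here the three formulas from Table 11 used above), and the function name `search_sum` and the round limit are ours. Each starting index is computed by ceiling division, so it may differ by one from the starting points listed in step 3.

```python
from math import isqrt

def search_sum(n: int, formulas: list, max_rounds: int = 100_000):
    """Search candidates r = step * k + offset for r = p + q.

    formulas is a list of (step, offset) pairs. Returns (r, p, q),
    or None if no candidate works within max_rounds rounds.
    """
    lower = 2 * isqrt(n)  # p + q > 2 * sqrt(n) >= lower
    # Smallest k per formula with step * k + offset >= lower.
    starts = [max(0, -(-(lower - offset) // step)) for step, offset in formulas]
    for i in range(max_rounds):
        for (step, offset), k0 in zip(formulas, starts):
            r = step * (k0 + i) + offset
            disc = r * r - 4 * n
            if disc < 0:
                continue
            d = isqrt(disc)
            if d * d == disc:            # sqrt(r^2 - 4n) is an integer
                p, q = (r + d) // 2, (r - d) // 2
                if p * q == n:
                    return r, p, q
    return None

table_11_formulas = [(600, 192), (600, 408), (120, 0)]
print(search_sum(9476465591, table_11_formulas))
# → (194880, 101693, 93187)
```

The formulas are advanced in lockstep, one value of k per round, which mirrors step 4 of the example.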

4 Conclusion

In this paper, we presented a number of equations for the sum of RSA prime factors p and q. We defined general forms for n=pq, and then explained how to find a formula for r=p+q for one of the forms. After that, we gave an example on how to use the formulas to factor n and retrieve the prime factors p and q.

We believe this work has further research potential. In particular, the equations could be improved to allow faster factorization. One possibility would be to find a way to increase the coefficients of the formulas of r. Another possibility is to reduce the number of formulas for r for a given form of n, i.e., to reduce the number of rows in a column.


Communicated by Spyros S. Magliveras


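A The relationship between p+q and $\sqrt{m^2-4n}$

Proposition A.1.

Let p and q be odd primes, with p>q, let n=pq, and let m be an even integer less than n. If $\sqrt{m^2-4n}$ is an integer, then m=p+q.

Proof.

Let k be a positive integer such that $k=\sqrt{m^2-4n}$. Then

$k^2=m^2-4n \;\Rightarrow\; 4n=m^2-k^2=(m-k)(m+k).$

Since m is even, k is even as well, so both factors are even and $n=\frac{m-k}{2}\cdot\frac{m+k}{2}$. The only factorizations of n into two positive integers are $1 \cdot n$ and $q \cdot p$. Thus, we have the following two systems of linear equations:

$\{m-k=2,\ m+k=2n\}$ or $\{m-k=2q,\ m+k=2p\}.$

By solving the two systems, we have $m_1=n+1$ or $m_2=p+q$. Since $m_1>n$, it follows that m=p+q. ∎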
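For small n, the statement can be checked by exhaustive search. The sketch below lists every even m below n for which m^2-4n is a perfect square; the function name is ours and the approach is only practical for tiny moduli.

```python
from math import isqrt

def even_m_with_square_discriminant(n: int) -> list:
    """All even m with 2 <= m < n such that m^2 - 4n is a perfect square."""
    hits = []
    for m in range(2, n, 2):
        disc = m * m - 4 * n
        if disc >= 0 and isqrt(disc) ** 2 == disc:
            hits.append(m)
    return hits

print(even_m_with_square_discriminant(77))    # 77 = 7 * 11
print(even_m_with_square_discriminant(3233))  # 3233 = 53 * 61
```

In both cases the only value returned is p+q (18 and 114, respectively); the other solution, n+1, lies outside the range m<n.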

Table 2

$n \equiv 1 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+102 | r=300k+2 |
| r=300k+198 | r=300k+202 |
| r=60k+30 | r=300k+98 |
| | r=300k+298 |
| | r=60k+10 |
| | r=60k+50 |

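Table 3

$n \equiv 11 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+312 | r=600k+112 | r=600k+12 | r=600k+212 |
| r=600k+288 | r=600k+512 | r=600k+588 | r=600k+412 |
| r=120k | r=600k+88 | r=120k+60 | r=600k+188 |
| | r=600k+488 | | r=600k+388 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |
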
Table 4

$n \equiv 21 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+222 | r=300k+22 |
| r=300k+78 | r=300k+122 |
| r=60k+30 | r=300k+278 |
| | r=300k+178 |
| | r=60k+10 |
| | r=60k+50 |

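Table 5

$n \equiv 31 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+132 | r=600k+332 | r=600k+432 | r=600k+32 |
| r=600k+468 | r=600k+532 | r=600k+168 | r=600k+232 |
| r=120k+60 | r=600k+68 | r=120k | r=600k+368 |
| | r=600k+268 | | r=600k+568 |
| | r=120k+20 | | r=120k+40 |
| | r=120k+100 | | r=120k+80 |
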
Table 6

$n \equiv 41 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+42 | r=300k+142 |
| r=300k+258 | r=300k+242 |
| r=60k+30 | r=300k+58 |
| | r=300k+158 |
| | r=60k+10 |
| | r=60k+50 |

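Table 7

$n \equiv 51 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+552 | r=600k+152 | r=600k+252 | r=600k+52 |
| r=600k+48 | r=600k+352 | r=600k+348 | r=600k+452 |
| r=120k | r=600k+248 | r=120k+60 | r=600k+148 |
| | r=600k+448 | | r=600k+548 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |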

B The formulas for p+q

For $n \equiv 1 \pmod{10}$ and $n \equiv 9 \pmod{10}$, the parities of the tens and hundreds digits both affect the value of r, which is why there are many tables. However, for $n \equiv 3 \pmod{10}$ and $n \equiv 7 \pmod{10}$, only the parity of the tens digit matters, and hence each has only one table.

Table 8

$n \equiv 61 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+162 | r=300k+62 |
| r=300k+138 | r=300k+262 |
| r=60k+30 | r=300k+38 |
| | r=300k+238 |
| | r=60k+10 |
| | r=60k+50 |

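Table 9

$n \equiv 71 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+372 | r=600k+172 | r=600k+72 | r=600k+272 |
| r=600k+228 | r=600k+572 | r=600k+528 | r=600k+472 |
| r=120k+60 | r=600k+428 | r=120k | r=600k+128 |
| | r=600k+28 | | r=600k+328 |
| | r=120k+20 | | r=120k+40 |
| | r=120k+100 | | r=120k+80 |
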
Table 10

$n \equiv 81 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+282 | r=300k+82 |
| r=300k+18 | r=300k+182 |
| r=60k+30 | r=300k+118 |
| | r=300k+218 |
| | r=60k+10 |
| | r=60k+50 |

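Table 11

$n \equiv 91 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+192 | r=600k+392 | r=600k+492 | r=600k+92 |
| r=600k+408 | r=600k+592 | r=600k+108 | r=600k+292 |
| r=120k | r=600k+8 | r=120k+60 | r=600k+308 |
| | r=600k+208 | | r=600k+508 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |
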
Table 12

$n \equiv 9 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+6 | r=300k+106 |
| r=300k+294 | r=300k+206 |
| r=60k+30 | r=300k+94 |
| | r=300k+194 |
| | r=60k+10 |
| | r=60k+50 |

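Table 13

$n \equiv 19 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+576 | r=600k+176 | r=600k+276 | r=600k+76 |
| r=600k+24 | r=600k+376 | r=600k+324 | r=600k+476 |
| r=120k | r=600k+224 | r=120k+60 | r=600k+124 |
| | r=600k+424 | | r=600k+524 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |
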
Table 14

$n \equiv 29 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+246 | r=300k+46 |
| r=300k+54 | r=300k+146 |
| r=60k+30 | r=300k+154 |
| | r=300k+254 |
| | r=60k+10 |
| | r=60k+50 |

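Table 15

$n \equiv 39 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+516 | r=600k+116 | r=600k+216 | r=600k+16 |
| r=600k+84 | r=600k+316 | r=600k+384 | r=600k+416 |
| r=120k+60 | r=600k+284 | r=120k | r=600k+184 |
| | r=600k+484 | | r=600k+584 |
| | r=120k+20 | | r=120k+40 |
| | r=120k+100 | | r=120k+80 |
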
Table 16

$n \equiv 49 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+186 | r=300k+86 |
| r=300k+114 | r=300k+286 |
| r=60k+30 | r=300k+14 |
| | r=300k+214 |
| | r=60k+10 |
| | r=60k+50 |

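Table 17

$n \equiv 59 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+456 | r=600k+56 | r=600k+156 | r=600k+356 |
| r=600k+144 | r=600k+256 | r=600k+444 | r=600k+556 |
| r=120k | r=600k+344 | r=120k+60 | r=600k+44 |
| | r=600k+544 | | r=600k+244 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |
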
Table 18

$n \equiv 69 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+126 | r=300k+26 |
| r=300k+174 | r=300k+226 |
| r=60k+30 | r=300k+74 |
| | r=300k+274 |
| | r=60k+10 |
| | r=60k+50 |

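Table 19

$n \equiv 79 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+396 | r=600k+196 | r=600k+96 | r=600k+296 |
| r=600k+204 | r=600k+596 | r=600k+504 | r=600k+496 |
| r=120k+60 | r=600k+4 | r=120k | r=600k+104 |
| | r=600k+404 | | r=600k+304 |
| | r=120k+20 | | r=120k+40 |
| | r=120k+100 | | r=120k+80 |
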
Table 20

$n \equiv 89 \pmod{100}$.

| $3 \mid n+1$ | $3 \nmid n+1$ |
|---|---|
| r=300k+66 | r=300k+166 |
| r=300k+234 | r=300k+266 |
| r=60k+30 | r=300k+34 |
| | r=300k+134 |
| | r=60k+10 |
| | r=60k+50 |

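Table 21

$n \equiv 99 \pmod{100}$.

| Hundreds odd, $3 \mid n+1$ | Hundreds odd, $3 \nmid n+1$ | Hundreds even, $3 \mid n+1$ | Hundreds even, $3 \nmid n+1$ |
|---|---|---|---|
| r=600k+336 | r=600k+136 | r=600k+36 | r=600k+236 |
| r=600k+264 | r=600k+536 | r=600k+564 | r=600k+436 |
| r=120k | r=600k+64 | r=120k+60 | r=600k+364 |
| | r=600k+464 | | r=600k+164 |
| | r=120k+40 | | r=120k+20 |
| | r=120k+80 | | r=120k+100 |
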
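Table 22

$n \equiv 3 \pmod{10}$.

| Tens odd, $3 \mid n+1$ | Tens odd, $3 \nmid n+1$ | Tens even, $3 \mid n+1$ | Tens even, $3 \nmid n+1$ |
|---|---|---|---|
| r=60k+54 | r=60k+14 | r=60k+24 | r=60k+4 |
| r=60k+6 | r=60k+34 | r=60k+36 | r=60k+44 |
| | r=60k+26 | | r=60k+16 |
| | r=60k+46 | | r=60k+56 |
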
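Table 23

$n \equiv 7 \pmod{10}$.

| Tens odd, $3 \mid n+1$ | Tens odd, $3 \nmid n+1$ | Tens even, $3 \mid n+1$ | Tens even, $3 \nmid n+1$ |
|---|---|---|---|
| r=60k+18 | r=60k+38 | r=60k+48 | r=60k+8 |
| r=60k+42 | r=60k+58 | r=60k+12 | r=60k+28 |
| | r=60k+2 | | r=60k+32 |
| | r=60k+22 | | r=60k+52 |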

For $n \equiv 1 \pmod{10}$ and $n \equiv 9 \pmod{10}$, when the tens digit is odd, the parity of the hundreds digit affects r, but when the tens digit is even, the parity of the hundreds digit does not affect r.
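A column of these tables can be cross-checked by enumerating residues. The sketch below is our own verification aid, not the authors' derivation. It relies on two observations: for n ending in an odd tens digit, the class of n modulo 200 fixes both the last two digits and the parity of the hundreds digit, and the classes of p and q modulo 600 determine r modulo 600 as well as n modulo 200 and modulo 3. The function name `sum_residues` is an assumption.

```python
from math import gcd

def sum_residues(n_mod_200: int, n_mod_3: int, modulus: int = 600) -> list:
    """Residues of r = p + q modulo `modulus` over all residue pairs
    (a, b) coprime to `modulus` whose product lies in the given classes."""
    units = [a for a in range(modulus) if gcd(a, modulus) == 1]
    found = set()
    for a in units:
        for b in units:
            product = a * b
            if product % 200 == n_mod_200 and product % 3 == n_mod_3:
                found.add((a + b) % modulus)
    return sorted(found)

# n = 91 (mod 100) with odd hundreds digit means n = 191 (mod 200);
# 3 | n + 1 means n = 2 (mod 3).
print(sum_residues(191, 2))
# → [0, 120, 192, 240, 360, 408, 480]
```

The output matches the column of Table 11 for an odd hundreds digit with 3 dividing n+1: the residues 192 and 408 modulo 600, together with the multiples of 120.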

References

[1] D. Aggarwal and U. Maurer, Breaking RSA generically is equivalent to factoring, Advances in Cryptology – EUROCRYPT 2009, Lecture Notes in Comput. Sci., Springer, Berlin (2009), 36–53. doi:10.1007/978-3-642-01001-9_2

[2] D. J. Bernstein and A. K. Lenstra, A general number field sieve implementation, The Development of the Number Field Sieve, Springer, Berlin (1993), 103–126. doi:10.1007/BFb0091541

[3] D. Boneh, Twenty years of attacks on the RSA cryptosystem, Notices Amer. Math. Soc. 46 (1999), 203–213.

[4] D. Boneh and R. Venkatesan, Breaking RSA may not be equivalent to factoring, Advances in Cryptology – EUROCRYPT’98, Lecture Notes in Comput. Sci., Springer, Berlin (1998), 59–71. doi:10.1007/BFb0054117

[5] D. R. L. Brown, Breaking RSA may be as difficult as factoring, Cryptology ePrint Archive 2008, https://eprint.iacr.org/2005/380.pdf. doi:10.1007/s00145-014-9192-y

[6] B. de Weger, Cryptanalysis of RSA with small prime difference, Appl. Algebra Engrg. Comm. Comput. 13 (2002), 17–28. doi:10.1007/s002000100088

[7] H. W. Lenstra, Factoring integers with elliptic curves, Ann. of Math. (2) 126 (1987), 649–673. doi:10.2307/1971363

[8] A. Odlyzko, Discrete logarithms: The past and the future, Des. Codes Cryptogr. 19 (2000), 129–145. doi:10.1023/A:1008350005447

[9] R. L. Rivest and B. Kaliski, RSA problem, Encyclopedia of Cryptography and Security, Springer US, Boston (2011), 1065–1069. doi:10.1007/978-1-4419-5906-5_475

[10] R. L. Rivest, A. Shamir and L. Adleman, A method for obtaining digital signatures and public-key cryptosystems, Commun. ACM 21 (1978), 120–126. doi:10.1145/359340.359342

Received: 2016-8-8
Revised: 2017-9-2
Accepted: 2017-9-13
Published Online: 2017-10-18
Published in Print: 2017-12-1

© 2017 Walter de Gruyter GmbH, Berlin/Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded on 3.12.2023 from https://www.degruyter.com/document/doi/10.1515/jmc-2016-0046/html