
MRHS solver based on linear algebra and exhaustive search

  • Håvard Raddum and Pavol Zajac

Abstract

We show how to build a binary matrix from the MRHS representation of a symmetric-key cipher. The matrix contains the cipher represented as an equation system and can be used to assess a cipher’s resistance against algebraic attacks. We give an algorithm for solving the system and compute its complexity. The complexity is normally close to exhaustive search on the variables representing the user-selected key. Finally, we show that for some variants of LowMC, the joined MRHS matrix representation can be used to speed up regular encryption in addition to exhaustive key search.

MSC 2010: 94A60; 68W40; 11D72

1 Introduction

Encryption technology is used in a large number of applications today, and many different encryption algorithms have been proposed for use in various environments. Security is always the most important consideration for a cryptographic primitive, and confidence in a cipher's security is established through cryptanalysis, where we try to find ways to break the security guarantees given by the designers. For symmetric ciphers this usually amounts to finding the secret key faster than exhaustive search over the key space.

Several cryptanalytic techniques can be tried when assessing the strength of a cipher. In this paper we will focus on algebraic cryptanalysis, characterised by formulating the attacker’s problem as solving a system of equations. Representing a cipher as an equation system can be done in several ways, and over different fields. In this paper we will only be concerned with the binary field GF(2), and we will use the Multiple Right Hand Side (MRHS) [4] representation for constructing the equation systems.

Our contribution.

It is easy to join all the individual MRHS equations into one matrix/vector multiplication with a (large) right-hand side set of potential solutions. We call this the joined system matrix. We show which linear operations we can do on the joined system matrix without changing its solution space, and without increasing the space complexity of the right-hand side set. This allows us to bring the joined system matrix into a special form that enables fast execution of a simple brute-force solving algorithm. We describe a recursive algorithm using a guess/verify approach for solving a joined MRHS system and explain its complexity.

In the last part of the paper we test the algorithm on some well-known block ciphers and show that the analytic complexity estimates match the observed complexities very closely. One interesting observation we make is that for some of the proposed variants of LowMC [1] using very few S-boxes, we can actually do encryption via the MRHS solver faster than with the standard reference implementation. This can be explained by the fact that these variants contain a very high number of linear operations compared to non-linear ones. The joined MRHS representation merges all the dense linear layers in LowMC into one big matrix, and in total fewer XORs are needed when working on this matrix than when executing the round-by-round linear layers of the standard specification.

2 Preliminaries

We denote the finite field with two elements as 𝔽. All vectors over 𝔽 are row vectors and are denoted by lower case letters. Sets of vectors are denoted by capital letters, and all matrices are denoted by boldface capital letters. The p×p identity matrix is denoted as 𝐈p, but if p is clear from the context, we may just write 𝐈.

Definition 1 ([4]).

A Multiple-Right-Hand-Sides (MRHS) equation over 𝔽 is an expression of the form

(2.1)  x𝐌 ∈ S,

where 𝐌 is an (n×l) matrix, and S ⊆ 𝔽^l is a set of l-bit vectors. We say that x ∈ 𝔽^n is a solution of the MRHS equation (2.1) if and only if x𝐌 ∈ S.

A system of MRHS equations is a set of m MRHS equations with the same dimension n, i.e.

ℳ = {x𝐌i ∈ Si : 1 ≤ i ≤ m},

where each 𝐌i is an (n×li)-matrix and Si ⊆ 𝔽^li. The vector x ∈ 𝔽^n is a solution of the MRHS system if it is a solution to all MRHS equations in ℳ, i.e. x𝐌i ∈ Si for each i = 1, 2, …, m. We denote the set of all solutions of an MRHS system ℳ by Sol(ℳ).

Throughout the paper, n will always denote the number of variables in an MRHS equation system, and m will always be the number of equations in the system. The length of the vectors in Si is li, but if all li in the system are equal, we may just write l. By writing S𝐌 we mean the set {v𝐌 : v ∈ S}, and by w+S we mean the set {w+v : v ∈ S}.

Joined system matrix.

Given an MRHS equation system ℳ = {x𝐌i ∈ Si}, we may concatenate all the matrices 𝐌i column-wise, since they all have the same number of rows. We denote the joined matrix by 𝐌 and call it the joined system matrix:

𝐌 = [𝐌1 | 𝐌2 | ⋯ | 𝐌m].

Similarly, we denote S1 × S2 × ⋯ × Sm by S. The problem of solving an MRHS equation system can now be stated as finding some x ∈ 𝔽^n such that x𝐌 ∈ S. A similar representation was introduced in [6], but we use a different approach to solve the system. Finally, the columns of 𝐌 corresponding to 𝐌i are called a block, and we sometimes speak of block i in 𝐌.
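
To make the construction concrete, the following Python sketch (an illustration with an assumed data layout, not the authors' implementation) joins a list of MRHS equations into one system matrix and checks a candidate x against it block by block; the helper names join_system and is_solution are hypothetical.

    # Sketch only: a GF(2) matrix is a list of rows, each row a list of 0/1 ints;
    # an MRHS equation is a pair (M_i, S_i) with S_i a list of right-hand-side tuples.

    def join_system(equations):
        """Concatenate the matrices M_i column-wise into the joined system matrix M.
        The sets S_i are kept separately; their Cartesian product S is never built."""
        n = len(equations[0][0])              # all M_i share the same number of rows
        M = [[] for _ in range(n)]
        for M_i, _ in equations:
            for r in range(n):
                M[r].extend(M_i[r])
        return M, [S_i for _, S_i in equations]

    def is_solution(x, M, S_list, l):
        """Check x*M block by block: block i of x*M must be a member of S_i."""
        xM = [sum(x[r] & M[r][c] for r in range(len(x))) % 2 for c in range(len(M[0]))]
        pos = 0
        for S_i, l_i in zip(S_list, l):
            if tuple(xM[pos:pos + l_i]) not in {tuple(s) for s in S_i}:
                return False
            pos += l_i
        return True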

3 Solving algorithm

In this section we describe an algorithm for solving an MRHS system, and determine its complexity. We start with bringing the system into a special form.

3.1 Transforming MRHS system to full echelon form

We may perform linear transformations on the rows of 𝐌 without changing the set S. Doing linear operations on the rows of 𝐌 is essentially changing the variable basis, and can be captured in the following lemma.

Lemma 1.

Let x0 be a solution to x𝐌 ∈ S, let 𝐔 be an invertible n×n matrix and let 𝐀 = 𝐔𝐌. Then y0 = x0𝐔^(−1) is a solution to y𝐀 ∈ S.

Proof.

If x0𝐌 = s ∈ S, then (x0𝐔^(−1))(𝐔𝐌) = y0𝐀 = s ∈ S. ∎

We may also perform column operations on 𝐌, but then we need to transform the set S in order to preserve the solution space.

Lemma 2.

Let x0 be a solution to x𝐌 ∈ S, let r = Σ_{i=1}^{m} li and let 𝐔 be an r×r invertible matrix. Then x0 is a solution to x𝐌𝐔 ∈ S𝐔.

Proof.

If x0𝐌 = s ∈ S, then x0𝐌𝐔 = s𝐔 ∈ S𝐔. ∎

The problem with applying Lemma 2 in practice is that S can be very big. Each individual Si is normally small, but since |S| = Π_{i=1}^{m} |Si|, explicitly computing the set S𝐔 often has too high a complexity to be done in practice.

However, by restricting 𝐔 to a block diagonal matrix we can compute the set S𝐔 while keeping both time and memory complexity low.

Let 𝐔 be a block diagonal matrix

(3.1)  𝐔 = [ 𝐔1  𝟎   ⋯  𝟎  ]
           [ 𝟎   𝐔2  ⋯  𝟎  ]
           [ ⋮    ⋮    ⋱   ⋮  ]
           [ 𝟎   𝟎   ⋯  𝐔m ],

where each 𝐔i is an (li×li) invertible matrix. Then S𝐔 = S1𝐔1 × S2𝐔2 × ⋯ × Sm𝐔m. The memory required to store S𝐔 is that of storing each individual Si𝐔i, which is the same as storing the original sets Si. The time complexity for computing S𝐔 is that of Σ_{i=1}^{m} |Si| vector/matrix multiplications.
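
The restriction to a block diagonal 𝐔 keeps the transformation of the right-hand sides cheap, since each Si is multiplied only by its own 𝐔i. A small Python sketch of this step (illustrative only, using the same list-of-rows representation as the sketch in Section 2; the function name is hypothetical):

    def transform_rhs_blockwise(S_list, U_list):
        """Apply U = diag(U_1, ..., U_m) to the right-hand sides: returns the sets
        S_i * U_i, i.e. |S_i| vector-matrix products over GF(2) per block."""
        def vecmat(v, U):
            return tuple(sum(v[r] & U[r][c] for r in range(len(v))) % 2
                         for c in range(len(U[0])))
        return [[vecmat(s, U_i) for s in S_i] for S_i, U_i in zip(S_list, U_list)]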

In the following we assume this restriction on 𝐔 when doing column operations on 𝐌. That is, we can do column operations within block i of 𝐌 and transform the corresponding set Si without altering the solution space.

Definition 3.

Given an MRHS system ℳ, we say that its joined system matrix 𝐌 is in full echelon form if each block in 𝐌 has the following form:

  [ 𝟎    𝐓i ]
  [ 𝐈pi  𝟎  ]
  [ 𝟎    𝟎  ],

with 0 ≤ pi ≤ li.

We can change any joined MRHS system x𝐌 ∈ S1 × S2 × ⋯ × Sm to a full echelon form as follows:

  1. Compute the matrix 𝐄 that brings 𝐌 into standard reduced row echelon form, where we also create 0's above every leading 1: 𝐌′ = 𝐄𝐌.

  2. Let the number of leading 1's in block i be pi and let 𝐔 initially be the r×r identity matrix. Permute the columns in block i such that all leading 1's are moved to the left, and apply the same permutation to the corresponding columns of 𝐔. This creates an upper triangular sub-matrix at the place where 𝐈pi sits in Definition 3. Next, add columns of the triangular sub-matrix, starting from the left-most one, to create zeroes to the right of the leading 1's on each row of block i. Make the same additions on the columns of 𝐔. Block i then has the form given in Definition 3, and we can write 𝐔 as in (3.1), where each 𝐔i corresponds to block i. Finally, 𝐌″ = 𝐌′𝐔 will then be in full echelon form.

  3. Compute the new sets Si = Si𝐔i.

The system x𝐌″ ∈ S1 × S2 × ⋯ × Sm = S is now in full echelon form. For every solution of this system we can compute a solution of the original system using Lemma 1. The transformation of the system to full echelon form is accomplished in polynomial time (in n) by linear algebra operations. It is done once for the whole system as a pre-processing step, so in the following we will always assume that the system in question is in full echelon form.

3.2 Algorithm searching for solution to MRHS system

Here we present an algorithm for searching exhaustively for possible solutions to an MRHS system. Since x ∈ 𝔽^n, we could expect that an exhaustive search algorithm would have complexity 2^n, but as we will see, the actual complexity for systems representing ciphers is a lot lower. This is because in these systems, once a (small) part of x has been guessed, the rest of x becomes uniquely determined and can be verified as correct or not. Hence the algorithm guesses on small parts of x, and keeps track of possible ways to extend a current guess.

Informal description of algorithm.

The algorithm is accurately described in Algorithm 1. The process can be briefly explained as follows.

Algorithm 1 (Solve MRHS system in full echelon form.).

We separate x into parts x = (x1, x2, …, xm), where each xi has length pi (from Definition 3). We first fix x1 to some value and compute w1 = (x1, 0, …, 0)𝐌. We choose x1 such that block 1 of w1 is in S1. Next, we choose x2 and compute w2 = (x1, x2, 0, …, 0)𝐌 such that block 2 of w2 is in S2. Note that x2 does not affect block 1 of w2, due to 𝐌 being in echelon form, so block 1 of w2 is determined by x1 alone and remains unchanged. We continue this way: assuming x1, …, xi−1 have been fixed, we guess a value for xi such that block i of wi = (x1, …, xi, 0, …, 0)𝐌 is in Si (block j of wi remains in Sj for j < i).

At some point we run into cases where no choice of xi is possible. If pi is too small, it may be that no choice of xi produces a wi whose block i is in Si. Then the algorithm backtracks to the most recent point where there is an untried guess for some xj, and continues from there. Note also that pi may even be 0, in which case xi is empty and block i of wi−1 must already be in Si in order for the algorithm to proceed. When the algorithm is able to complete x by selecting xm such that block m of x𝐌 is in Sm, we have found a solution to the system.

The reason for having 𝐌 in full echelon form is the easy identification of possible values for xi. The last li−pi bits of block i of wi are independent of xi, so these bits are equal in wi−1 and wi, regardless of the choice of xi. If we sort the vectors in Si on the value of the last li−pi bits, we can do a fast look-up to see which vectors in Si are equal to wi−1 in the last li−pi bits. The first pi bits of any such vector immediately give a possible value for xi, since xi is multiplied by 𝐈pi in block i.

When pi is small, we can precompute the vectors zi = (0, …, 0, xi, 0, …, 0)𝐌 for all choices of xi and store them in a table Ti. When the algorithm fixes a value for xi, we just look up the corresponding zi in Ti and add it to wi−1 to produce wi.
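
A compact Python sketch of this search is given below. It assumes the joined matrix is already in full echelon form, uses the same list-of-rows representation as the earlier sketches, indexes every Si by its last li−pi bits and backtracks exactly as described above. It is meant to illustrate the idea and is not the authors' optimised C solver; the returned solutions are expressed in the transformed (echelonized) variable basis, and Lemma 1 maps them back to the original variables.

    def solve_full_echelon(M, S_list, l, p):
        """Backtracking search over a joined MRHS system in full echelon form.
           M      : n x r matrix over GF(2) (list of rows), r = sum(l)
           S_list : the sets S_i as lists of l_i-bit tuples
           l, p   : block widths l_i and numbers of leading ones p_i
        Returns all solutions, each as the concatenation (x_1, ..., x_m)."""
        m, r = len(S_list), sum(l)
        starts = [sum(l[:i]) for i in range(m)]      # column offset of block i
        row0 = [sum(p[:i]) for i in range(m)]        # first pivot row of block i

        # Index each S_i by the last l_i - p_i bits (the part x_i cannot change).
        tails = []
        for i, S_i in enumerate(S_list):
            d = {}
            for s in S_i:
                d.setdefault(tuple(s[p[i]:]), []).append(tuple(s[:p[i]]))
            tails.append(d)

        def add_rows(w, xi, i):
            """Return w + (0, ..., 0, x_i, 0, ..., 0) * M using block i's pivot rows."""
            w = list(w)
            for bit, row in zip(xi, range(row0[i], row0[i] + p[i])):
                if bit:
                    w = [(a + b) % 2 for a, b in zip(w, M[row])]
            return w

        solutions = []

        def search(i, w, x_parts):
            if i == m:                               # every block verified: a solution
                solutions.append(sum(x_parts, ()))
                return
            lo, hi = starts[i], starts[i] + l[i]
            tail = tuple(w[lo + p[i]:hi])            # bits of block i not affected by x_i
            for xi in tails[i].get(tail, []):        # possible values for x_i
                search(i + 1, add_rows(w, xi, i), x_parts + [xi])

        search(0, [0] * r, [])                       # w_0 = 0 (known data already folded in)
        return solutions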

Parallelism.

We can precompute a list L of partial solutions (x1, …, xd, 0, …, 0), along with the corresponding vectors wd, up to some depth d. Then we can distribute the search to |L| parallel instances of the algorithm. This requires separate memory for the vectors wi in each of the parallel tasks, but the parallel tasks can share the same set of precomputed tables Ti (the tables are read-only). Furthermore, we can use internal parallelism to compute the vector sums wi = wi−1 + zi,j efficiently.

In our implementation we store the precomputed vectors zi,j and the vectors wi as sequences of 64-bit words. Thus, a single 64-bit XOR operation computes 64/l block sums. Even better speedups can be obtained on specialised hardware or with vector instructions (e.g., SSE).

3.3 Algorithm complexity

In each level of the recursion we compute one projection and do a table lookup. Each lookup returns a list Ti[ti] of vectors. The expected size of Ti[ti] is |Si|·2^{pi−li}. If the expected size is below 1, the table Ti must be empty for some values of ti, and we can expect the algorithm to backtrack at block i with probability at least 1 − |Si|/2^{li−pi}. Thus, if we start the recursion N times on level i, we expect to continue N·|Si|·2^{pi−li} times to level i+1. Furthermore, before recursing we need to add wi−1 and zi,j, which requires m−i+1 block-XORs (we know that the first part of zi,j is always 0 due to the full echelon form of 𝐌). These additions typically dominate the running time of the algorithm.

If the vectors in the sets Si are chosen uniformly at random, the expected total number of recursion calls can be estimated as

(3.2)  Ntotal = Σ_{i=2}^{m} Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.

The total number of recursion calls corresponds to the number of accesses to the lookup tables. The expected number of solutions can be estimated as

E(|Sol(ℳ)|) = Π_{j=1}^{m} |Sj| 2^{pj−lj}.

The total number of block XORs can be computed as

NXOR = Σ_{i=2}^{m} (m−i+1) Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.

We can slightly reduce the number of XORs by storing the vectors zi,j that are equal to zero in a special format. This is especially useful in blocks where pi = 0, where the algorithm degenerates to a simple look-up of whether Ti[ti,j] is an empty set or not.

We can estimate the chance of a zero zi,j as 2^{−pi} (one out of 2^{pi} possible choices). Thus, the reduced number of XORs can be computed as

NXORed = Σ_{i=2}^{m} (1 − 2^{−p_{i−1}}) (m−i+1) Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.

On the bit level or the instruction level, the number of XORs must take into account the number of bits in each block, and the internal parallelism of XOR instructions. The number of single bit XORs (without taking into account zero zi,j) can be estimated as

NXOR1 = Σ_{i=2}^{m} (Σ_{b=i+1}^{m} lb) Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.

When using w-bit internal parallelism, we cannot directly divide NXOR1 by w, because on some levels we cannot use the whole w bits in the word. Instead, the expected number of w-bit XOR instructions can be expressed as

(3.3)  NXORw = Σ_{i=2}^{m} ⌈(Σ_{b=i+1}^{m} lb)/w⌉ Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.

Similar estimates can be made for NXORed1 and NXORedw by multiplying the inner products by the coefficient (1 − 2^{−p_{i−1}}) that corresponds to the probability of a non-zero z-vector:

(3.4)  NXORedw = Σ_{i=2}^{m} ⌈(Σ_{b=i+1}^{m} lb)/w⌉ (1 − 2^{−p_{i−1}}) Π_{j=1}^{i−1} |Sj| 2^{pj−lj}.
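
Since these estimates depend only on the block parameters li, pi and |Si|, they can be evaluated without touching the sets themselves. The following Python sketch is a straightforward transcription of (3.2), of the expected number of solutions, and of the w-bit XOR count (3.3); the function name is hypothetical.

    import math

    def complexity_estimates(l, p, sizes, w=64):
        """l, p, sizes: per-block parameters l_i, p_i and |S_i| (0-based lists).
        Returns (N_total, expected number of solutions, N_XOR^w)."""
        m = len(l)

        def prefix(i):                    # prod_{j=1}^{i-1} |S_j| * 2^(p_j - l_j)
            out = 1.0
            for j in range(i - 1):
                out *= sizes[j] * 2.0 ** (p[j] - l[j])
            return out

        n_total = sum(prefix(i) for i in range(2, m + 1))
        e_sol = prefix(m + 1)
        n_xor_w = sum(math.ceil(sum(l[i:]) / w) * prefix(i) for i in range(2, m + 1))
        return n_total, e_sol, n_xor_w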

Example.

Suppose that li = 3 and |Si| = 4 for each i (in this setting we can model AND gates with random linear combinations of variables as inputs). Suppose further that we have m variables and m MRHS equations (blocks). Now assume a simple case where m = 3k and we obtain a full echelon form with pi = 3 for i = 1, …, k and pi = 0 for k+1 ≤ i ≤ 3k. The expected number of solutions is

E(|Sol(ℳ)|) = (Π_{j=1}^{k} 2^2 · 1) (Π_{j=k+1}^{3k} 2^2 · 2^{−3}) = 2^{2k} · 2^{−2k} = 1.

The number of recursion calls is

Ntotal = Σ_{i=2}^{k+1} 2^{2(i−1)} + Σ_{i=k+2}^{3k} 2^{2k−(i−k−1)},

so 2^{2k+1} < Ntotal < 2^{2k+2}. The total number of XORs will be between k·2^{2k} and 2k·2^{2k}. This is a factor 2^k/k lower than multiplying all 2^{3k} possible x-vectors by the matrix 𝐌.
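
As a quick numerical check of this example, the estimator sketched after (3.4) reproduces these bounds; shown here for the hypothetical choice k = 4:

    k = 4
    blocks = 3 * k
    n_total, e_sol, _ = complexity_estimates([3] * blocks,
                                             [3] * k + [0] * (2 * k),
                                             [4] * blocks)
    # e_sol == 1.0, and n_total (594 for k = 4) lies between
    # 2**(2*k + 1) = 512 and 2**(2*k + 2) = 1024, as claimed.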

Another example.

Suppose again that li = 3 and |Si| = 4 for each i, with m variables and m MRHS equations (blocks). Now assume a case where there is a single leading 1 in each block, i.e. pi = 1 for each i. The expected number of solutions is, as before,

E(|Sol(ℳ)|) = Π_{j=1}^{m} 2^2 · 2^{−2} = 1.

The number of recursion calls is

Ntotal = Σ_{i=2}^{m} 1 = m − 1,

and the number of XORs is (m^2 + m)/2. This means that we can solve the system in quadratic time (the problem complexity is no longer exponential in the system size).

Randomly generated systems will typically have almost all blocks without internal linear dependencies. Thus, they are most likely of type 1 (hard to solve), and systems of type 2 must be constructed artificially. On the other hand, systems produced from cryptanalytic problems have a lot of internal structure that comes from the algorithm implementation. We observe the effect of this structure for selected ciphers in Section 4.

4 Representing SL ciphers as MRHS systems

In the following we study several ciphers. We adopt the term SL cipher for a cipher that can be represented as a sequence of linear (or, more precisely, affine) transformations and substitution layers realised by S-boxes.

4.1 Modelling an SL cipher as an MRHS system

Let s, la, lo, nR, nB, nK denote the number of S-boxes per round, the number of S-box input and output bits, the number of rounds, the input/output block size, and the key size, respectively. We can construct an initial MRHS system representing an SL cipher with parameters li = la + lo, |Si| = 2^{la}, m = s·nR, and n = 2nB + nK + m·lo. The unknown x ∈ GF(2)^n consists of the nB plaintext bits, the nK key bits, all S-box outputs and the nB ciphertext bits. For concreteness, let x1, …, x_{nB} be the plaintext bits, let x_{nB+1}, …, x_{nB+nK} be the bits of the user-selected key, let x_{nB+nK+1}, …, x_{n−nB} be the output bits of all S-boxes used in one encryption, and let x_{n−nB+1}, …, xn be the ciphertext bits. If S-boxes are used in the key schedule, they are counted among those used in the encryption.

As SL ciphers only have linear (affine if additions of constants occur) operations apart from S-boxes, all input bits to all S-boxes can be described as linear (affine) combinations of the variables we have defined. We create one MRHS equation for each S-box i as follows:

x𝐌i ∈ {(0 ‖ S(0⊕c)), (1 ‖ S(1⊕c)), …, ((2^{la}−1) ‖ S((2^{la}−1)⊕c))} = Si,

where c is the constant part of the input affine combinations and we use the natural mapping between integers and vectors over GF(2).

The first la columns of 𝐌i contain the coefficients of the linear combinations forming the inputs to the S-box. The last lo columns of 𝐌i contain a single 1 each: mj,t = 1 if xj is the variable for output bit t, and mj,t = 0 otherwise. At the end of the encryption, the ciphertext bits can likewise be described as linear combinations of the variables we have defined.
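
As an illustration of this construction, the sketch below builds the MRHS equation for a single S-box from the linear coefficient vectors of its input combinations, the input constant c, the indices of its output variables, and the S-box table. The right-hand-side set lists, for every value a of the linear part, the vector (a ‖ S(a⊕c)), matching the set Si above; the helper names are hypothetical, and this is an illustration, not the authors' generator code.

    def int_to_bits(v, width):
        """Natural mapping between integers and GF(2) vectors (most significant bit first)."""
        return tuple((v >> (width - 1 - t)) & 1 for t in range(width))

    def sbox_equation(n, input_rows, c, output_vars, sbox, la, lo):
        """Build one MRHS equation x*M_i in S_i for an S-box with affine inputs.
           input_rows : la coefficient vectors of length n (linear parts of the inputs)
           c          : integer, constant part of the affine input combinations
           output_vars: indices of the lo variables holding the S-box outputs
           sbox       : list of 2^la integers (the S-box table)."""
        # First la columns: linear coefficients; last lo columns: a single 1 per output.
        M_i = [[input_rows[col][row] for col in range(la)] +
               [1 if row == output_vars[t] else 0 for t in range(lo)]
               for row in range(n)]
        # For every value a of the linear part the output must equal S(a xor c),
        # so the admissible right-hand sides are the vectors (a || S(a xor c)).
        S_i = [int_to_bits(a, la) + int_to_bits(sbox[a ^ c], lo) for a in range(2 ** la)]
        return M_i, S_i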

The joined system matrix of this MRHS system models the complete cipher. The model is flexible enough to cover all currently used ciphers that employ S-boxes and a linear or affine layer between the layers of S-boxes.

The model is based on a single plaintext/ciphertext (P/C) pair. To model multiple encryption instances, new variables must be defined for plaintext, ciphertext and S-box output bits, except for S-boxes used in the key schedule. Variables for the user-selected key and key schedule S-boxes are re-used across different P/C pairs. If multiple P/C pairs are used, all MRHS equations may still be merged into one joined system matrix.

4.2 Fixing known bits

If the purpose of constructing the joined system matrix is to do algebraic cryptanalysis, we assume we have a known plaintext/ciphertext pair. By fixing the first and last nB variables in x to their correct values we get a reduced system. If the original joined system is x𝐌 ∈ S, we can write it as

          [ 𝐌p ]
(p, x, c) [ 𝐌  ] ∈ S1 × S2 × ⋯ × Sm,
          [ 𝐌c ]

where p and c are the known values of the P/C pair and x are the remaining variables. Setting

p𝐌p + c𝐌c = w = (w1, w2, …, wm),

we get a reduced MRHS system

x𝐌 ∈ (w1 + S1) × (w2 + S2) × ⋯ × (wm + Sm).
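
This reduction can be carried out directly on the joined representation: the rows of 𝐌p and 𝐌c are multiplied by the known bits and summed into w, and each wi is then added into the corresponding Si. A Python sketch (illustrative, same data layout as the earlier sketches; the function name is hypothetical):

    def reduce_with_known_bits(M, S_list, l, known):
        """known maps a variable (row) index to its known 0/1 value. The contribution
        of the known rows is folded into the right-hand sides (w_i + S_i) and the
        known rows are removed from the joined matrix."""
        r = sum(l)
        w = [0] * r
        for row, val in known.items():
            if val:
                w = [(a + b) % 2 for a, b in zip(w, M[row])]
        M_reduced = [vec for idx, vec in enumerate(M) if idx not in known]
        S_reduced, pos = [], 0
        for S_i, l_i in zip(S_list, l):
            w_i = w[pos:pos + l_i]
            S_reduced.append([tuple((a + b) % 2 for a, b in zip(w_i, s)) for s in S_i])
            pos += l_i
        return M_reduced, S_reduced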

4.3 Encryption as MRHS solving

In an algebraic attack, we fix the variables for the plaintext and ciphertext bits. The task for the cryptanalyst is then to solve the remaining system, using Algorithm 1 or by other means, to find the values of the variables for the key bits.

However, it is also possible to do regular encryptions using the MRHS representation. In this case we fix the plaintext and the key variables. The task is then to “solve” the reduced system to find the values of the ciphertext variables. In a joined MRHS system where the initial equations were joined in the natural order (round by round) and both plaintext and key variables are fixed, Algorithm 1 will not do any guessing, but simply performs look-ups for the values of the intermediate variables until the ciphertext bits are found at the end.

For many ciphers there is nothing to gain from doing encryption this way, but if the cipher contains many linear operations, encryption can be faster using the MRHS representation. LowMC is a cipher with a dense affine transformation in each round, and the MRHS representation packs all of this linearity together more efficiently, so that fewer XORs are needed when encrypting via Algorithm 1.

We show this for the LowMC version with one S-box per round, la = lo = 3, a 64-bit block and 164 rounds. We count the number of single-bit XORs needed in a straightforward reference implementation and in solving one MRHS instance. The number of look-ups is the same in both cases, since each S-box in the cipher specification gives rise to one block in the joined system matrix of the MRHS representation. Since we are in encryption mode, we assume that all key material is fixed and precomputed in both cases.

The number of XORs done in one encryption according to the LowMC specification can be computed as follows. The linear layer in each round needs approximately nB/2 XORs of nB-bit words. The addition of the round constant and the round key accounts for 2 more nB-bit XORs. Over nR rounds the number of single-bit XORs is nR(nB^2/2 + 2nB). Setting nB = 64 and nR = 164 gives 356,864 XORs.

The number of columns in the joined MRHS matrix is l·nR, so each XOR of one row of this matrix counts as l·nR single-bit XORs. When the plaintext bits are fixed, we need to add approximately nB/2 rows together to produce the initial vector w0. Then Algorithm 1 does nR look-ups, each adding exactly one vector of length l·nR to the current wi. In total we get l·nR(nB/2 + nR) single-bit XORs before the ciphertext is found. With nR = 164, nB = 64 and l = 6 this number comes to 192,864. We see that the number of XORs needed when encrypting via the MRHS representation is approximately 1.85 times lower than in the reference specification.
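
Both counts are plain arithmetic and easy to reproduce; a short Python check (illustrative):

    nB, nR, l = 64, 164, 6
    ref_xors = nR * (nB * nB // 2 + 2 * nB)    # round-by-round reference: 356864
    mrhs_xors = l * nR * (nB // 2 + nR)        # fix plaintext, then nR row additions: 192864
    print(ref_xors, mrhs_xors, ref_xors / mrhs_xors)   # ratio is about 1.85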

5 Experiments with concrete ciphers

In this section we report on experiments with Algorithm 1 on some ciphers. In the experiments we reduce the key space by always setting a certain number of key bits to zero, obtaining a reduced system. This is done to get practical running times, so that we can measure the observed time complexity of Algorithm 1.

We have focused on four (families of) ciphers: DES [8], AES [3], Present [2], and version 2 of LowMC [1]. We have run various experiments with key sizes between 18 and 24 bits. The results are very similar across these key sizes, so we only present results for 22 unknown key bits. This choice makes individual experiments reasonably fast, but not so fast that time measurement errors become significant.

These families all fit into our SL cipher framework, while providing enough variety in design choices (Feistel/SPN), S-box sizes, linear layer types (permutation, MDS, random) and key schedules. We only provide short notes on the SL model of each family, as we assume the reader is familiar with the designs of these ciphers.

5.1 Ciphers tested

In our model we use standard DES parameters nB=64 and nK=56. The first part of the key is fixed to zero. We use the full set of 8 DES S-boxes in each round, with la=6 and lo=4. There are 32 new variables introduced in each round, and the nK user-selected key bits are used directly in the encryption. The total number of variables in the MRHS representation of DES before fixing any known data will then be

n=2×64+56+32nR.

Our model of AES has parameters nB = nK = 128, again with part of the key bits fixed to zero. The AES key schedule uses four S-boxes per round, so we use 20 S-boxes in each round, with la = lo = 8. There are 160 new variables introduced in each round, 32 in the key schedule and 128 in the cipher block. The total number of variables in the system before fixing known data is

n=2×128+128+160×nR.

We have selected the version of Present with nB = 64 and nK = 80 (again with part of the key fixed to zero). The key schedule uses one S-box, so the total number of S-boxes per round is 17, with la = lo = 4. There are then 68 new variables introduced in each round, so the total number of variables before inserting known data will be

n=2×64+80+68nR.

Finally, we focus on the LowMC cipher, which allows a variable number of S-boxes and variable block and key sizes. Only 3s bits of the cipher block pass through S-boxes in each round, so we get 3s new variables per round. All other parts of the cipher are linear, both in the encryption and in the key schedule, and the total number of variables in the MRHS representation is

n=2nB+nK+3snR.
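
For reference, plugging the standard full-round parameters into these formulas gives the following variable counts before any known data or key bits are fixed (a small Python check; the values in Table 2 are smaller because plaintext, ciphertext and part of the key are fixed there):

    def n_vars(nB, nK, new_vars_per_round, nR):
        return 2 * nB + nK + new_vars_per_round * nR

    print(n_vars(64, 56, 32, 16))       # DES, 16 rounds:              696
    print(n_vars(128, 128, 160, 10))    # AES-128, 10 rounds:          1984
    print(n_vars(64, 80, 68, 31))       # Present-80, 31 rounds:       2316
    print(n_vars(64, 80, 3 * 1, 164))   # LowMC, 1 S-box, 164 rounds:  700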

We use custom software implemented in SAGE [9] to generate instances of MRHS systems based on the SL model described above. The generator produces an instance with nk unknown key bits (nk ≤ nK), so the expected complexity of the (whole) exhaustive search is 2^{nk}.

The generated instance is given to a fast solver, which is a C implementation of Algorithm 1. The algorithm searches the full space (it does not stop after producing a solution). It reports the total number of recursive calls (same as table lookups) c, the number of XORs x, and the running time t of the search.

The solver can also generate random instances of MRHS systems with specified parameters, and estimate the complexity Ntotal using equation (3.2), without actually solving the system. This is useful for larger instances. Similarly, we use equations (3.3) and (3.4) to estimate the number of w=64-bit XOR instructions NXORw and NXORedw (without, and with taking zero blocks into account, respectively).

5.2 Brute force attacks with MRHS solver

Table 1

Running times for DES, plus estimate for exhaustive search.

Rounds    22-bit key [s]    Full [CPU years]    Ratio
4         0.06              31.73               0.12
5         0.32              176.10              0.68
6         0.82              447.08              1.73
7         1.41              770.16              2.98
8         2.32              1262.89             4.88
9         3.37              1834.37             7.09
10        4.58              2494.90             9.64
11        5.94              3235.56             12.50
12        7.38              4018.46             15.53
13        8.95              4870.76             18.82
14        10.60             5772.60             22.31
15        12.06             6563.20             25.36
16        13.68             7448.42             28.78
OpenSSL   0.48              258.76              1.0

We started the experiments by estimating the CPU-year cost of exhaustive key search on DES with the MRHS solver, depending on the number of rounds. The results are summarised in Table 1. We compare the estimated exhaustive search time of our solver with the results obtained from the OpenSSL speed command on the same PC. The results indicate that the running time of the MRHS solver is comparable to standard exhaustive search, with a slight advantage for the four- and five-round versions.

In Table 2 we compare the results of a 22-bit exhaustive key search for different cipher instances. We include the size of the system, the total running time of the solver, and the time of an exhaustive search using a software implementation of the cipher. We used the DES and AES implementations from OpenSSL (1.0.2g) via the speed command. The Present implementation was taken from [7], and the LowMC implementation from [5].

We see that the MRHS solver is typically slower than the optimised cipher implementations. On the other hand, the complexity does not grow exponentially with the size of the system (as would be expected for a random non-linear equation system). Depending on optimisations and the implementation platform, we can expect a brute-force attack with the MRHS solver to be competitive with some implementations of ciphers. This is confirmed when comparing our implementation with the reference implementation of LowMC with a low number of S-boxes per round.

Table 2

Running times of the solver (MRHS) and reference implementations (ref. SW) for different ciphers.

Cipher       s     n      m     l     |Si|    MRHS [s]    ref. SW [s]
DES          8     470    128   10    64      13.6        0.48
Present80    17    2066   527   8     16      114.4       0.12
AES128       20    1494   200   16    256     57.9        0.58
LowMC64-1    1     450    164   6     8       12.9        45.82
LowMC64-2    2     450    164   6     8       12.4        23.71
LowMC128     31    1010   372   6     8       55.5        9.24
LowMC256     49    1530   588   6     8       108.8       25.71

5.3 Expanded results for various versions of LowMC cipher

In Figures 1 and 2 we show results from exhaustive key search experiments with the LowMC cipher with a single S-box per round and a variable number of rounds, from 22 up to the recommended 164. The key size was set to 22 bits, so the running time of the experiments is on the order of seconds.

We have measured the number of lookups and XORs in the implementation of Algorithm 1. The measured results are compared with the estimates obtained from equations (3.2), (3.3) and (3.4), respectively. The estimate Ntotal matches the real number of lookups very accurately. The number of XORs, on the other hand, typically lies between the estimates NXORedw and NXORw. The staircase character corresponds to the internal parallelism of the implementation: each step corresponds to approximately 21 rounds, which adds 63 bits to the width of the system, filling another 64-bit word on the architecture used.

Figure 1: Comparison of the number of lookups in experiments with the LowMC cipher to the estimated number of lookups Ntotal.

Figure 2: Comparison of the number of XORs in experiments with the LowMC cipher to the estimated number of XORs given by formulas (3.3) (upper bound) and (3.4) (lower bound).

In Figure 3 we compare the running time of the same experiment with the running time of a brute-force attack that uses the LowMC implementation from [5]. The brute-force attack only uses the functions cipher.setkey and cipher.encrypt in a loop over all 22-bit keys. Both programs were compiled with the same compiler and optimisation level and run on the same computer. Note that both solvers go through the whole range of keys/potential solutions and do not stop if the key is found earlier. For a low number of rounds, the algebraic representation gives a huge speedup in checking a key. This converges to about 3.5 times faster encryption via the MRHS solver for the full version of the cipher.

This speed-up can be explained by the different numbers of XORs needed in the reference implementation and in Algorithm 1. In an exhaustive key search the plaintext and ciphertext are fixed, and only the key changes for every encryption. The number of XORs needed to check one key in the reference implementation can be estimated as follows. To make one round key we must perform approximately nB·nK/2 single-bit XORs. Repeated over nR rounds, the complete key schedule costs nR·nB·nK/2 XORs. As we saw in Section 4.3, one encryption needs nR(nB^2/2 + 2nB) XORs. In total it takes

nR·nB·nK/2 + nR(nB^2/2 + 2nB)

single-bit XORs to check one key using the standard implementation. With nR=164,nB=64 and nK=80 this comes to 776,704 single-bit XORs.

Fixing the key in the MRHS representation takes approximately nK/2 XORs of rows from the joined system matrix. Performing the encryption is the same as in Section 4.3: nR XORs of vectors of length l·nR. The total number of single-bit XORs for checking one key in the MRHS model is then

l·nR(nK/2 + nR).

Setting l=6,nR=164 and nK=80 gives a total of 200,736 single-bit XORs. This is a factor 3.87 lower than in the reference specification, and explains the observed speed-up in exhaustive search using Algorithm 1.
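
Again, both counts follow from simple arithmetic; a short Python check (illustrative):

    nB, nK, nR, l = 64, 80, 164, 6
    ref = nR * nB * nK // 2 + nR * (nB * nB // 2 + 2 * nB)   # key schedule + encryption: 776704
    mrhs = l * nR * (nK // 2 + nR)                           # fix the key, then nR row additions: 200736
    print(ref, mrhs, ref / mrhs)                             # ratio is about 3.87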

Figure 3: Comparison of the real time required to brute-force a 22-bit key with the MRHS solver and with the SW implementation of LowMC running in a loop.

5.3.1 Increasing the number of S-boxes

In Figures 4–6 we compare exhaustive search run-times across different versions of LowMC with 8, 4, 2 and 1 S-boxes per round (denoted by s). In each version we set the maximum number of rounds to 164/s, and plot the results for cipher versions with a smaller number of S-boxes only in the range of rounds above the previously attained number of rounds. The behaviour of the solver is consistent across the different versions of the cipher.

Note that the largest size of the MRHS system is the same in each case, as it is derived from the total number of S-boxes used in the encryption. This is most pronounced when comparing the running times with the reference software implementation of LowMC (Figure 6): while the running time of the software implementation grows linearly in nR, the running time of the solver depends on both the number of rounds and the number of S-boxes per round, and grows linearly in s·nR. Hence the MRHS solver's speed advantage quickly disappears as s increases.

Figure 4: Comparison of the number of lookups in experiments with the LowMC cipher to the estimated number of lookups Ntotal. Number of S-boxes from left to right: 8, 4, 2, 1.

Figure 5: Comparison of the number of XORs in experiments with the LowMC cipher to the estimated number of XORs given by formulas (3.3) (upper bound) and (3.4) (lower bound). Number of S-boxes from left to right: 8, 4, 2, 1.

Figure 6: Comparison of the real time required to brute-force a 22-bit key with the MRHS solver and with the SW implementation of LowMC running in a loop. Number of S-boxes from left to right: 8, 4, 2, 1.

6 Conclusions and discussion

All symmetric ciphers can be modelled as a system of Boolean equations, represented as a fully joined MRHS matrix. We have devised an algorithm that solves the system in the MRHS matrix model and estimated its complexity. Actual implementations show that the estimates are very accurate. With other solvers, such as F4, ordinary gluing/agreeing on MRHS systems, or SAT solvers, it is difficult to predict the actual running time from the initial system. The fully joined MRHS system helps in this situation, as the solving time can be determined by examining only the structure of the matrix.

We also observe that for full-scale ciphers the complexity is only exponential in the number of variables representing the user-selected key, and not in the total number of variables. For ciphers with reduced rounds, the joined MRHS matrix can reveal at which point the solving complexity drops below brute force. For instance, eight rounds of Simon32 has complexity 2^62 while the key has 64 bits.

It is possible to guess the values of t linear combinations of variables such that the number of leading 1's in some blocks decreases by t. This decreases some of the numbers pi in (3.2) by a total of t, and hence the running time decreases by a factor 2^{−t}. An open problem is to study whether it is possible to gain more than the factor 2^{−t} when guessing some particular linear combinations. With the accurate complexity estimate given by the joined MRHS matrix, it is possible to determine this beforehand, without running any actual solving experiments.

The final interesting finding in this work is that versions of LowMC with very few S-boxes can be implemented more efficiently in the MRHS model. This comes from the fact that with few S-boxes the steps of the cipher are close to being successive linear operations. The MRHS implementation merges a lot of the linear-upon-linear parts of these operations, so that fewer XORs are needed in the MRHS representation. It is then possible to implement encryption more efficiently, with a speed-up factor of 1.85 for the linear operations of the cipher.

In the exhaustive key search scenario the gain from using the MRHS solver is even bigger, since the LowMC key schedule must be executed for every key tested. With one S-box per round, the MRHS representation is close to four times faster for exhaustive search than the standard reference implementation. This may be interpreted as a valid attack, since we can do a full search of the 80-bit key space in approximately the same time as it takes to do 2^78 standard LowMC encryptions. On the other hand, assuming that LowMC encryptions are also done in the MRHS model, this speed advantage disappears.

We hope that the community analysing symmetric encryption algorithms finds the work in this paper useful, and that modelling a cipher via its joined MRHS matrix may serve as a tool for assessing ciphers' strength against algebraic cryptanalysis.


Communicated by Spyros Magliveras


Funding source: EEA Grants

Award Identifier / Grant number: SK06-IV-01-001

Funding statement: This research was supported by project Cryptography brings security and freedom SK06-IV-01-001 funded by EEA Scholarship Programme Slovakia.

References

[1] M. Albrecht, C. Rechberger, T. Schneider, T. Tiessen and M. Zohner, Ciphers for MPC and FHE, Advances in Cryptology – EUROCRYPT 2015, Lecture Notes in Comput. Sci. 9056, Springer, Berlin (2015), 430–454. doi:10.1007/978-3-662-46800-5_17.

[2] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J. B. Robshaw, Y. Seurin and C. Vikkelsoe, PRESENT: An ultra-lightweight block cipher, Cryptographic Hardware and Embedded Systems – CHES 2007, Lecture Notes in Comput. Sci. 4727, Springer, Berlin (2007), 450–466. doi:10.1007/978-3-540-74735-2_31.

[3] J. Daemen and V. Rijmen, The Design of Rijndael: AES – The Advanced Encryption Standard, Information Security and Cryptography, Springer, Berlin, 2002. doi:10.1007/978-3-662-04722-4.

[4] H. Raddum and I. Semaev, Solving multiple right hand sides linear equations, Des. Codes Cryptogr. 49 (2008), no. 1–3, 147–160. doi:10.1007/s10623-008-9180-z.

[5] T. Tiessen, An implementation of the LowMC block cipher family, https://github.com/tyti/lowmc, 2016.

[6] P. Zajac, A new method to solve MRHS equation systems and its connection to group factorization, J. Math. Cryptol. 7 (2013), no. 4, 367–381. doi:10.1515/jmc-2013-5012.

[7] B. Zhu, An efficient software implementation of the block cipher PRESENT for 8-bit platforms, https://github.com/bozhu/PRESENT-C/, 2013.

[8] Data Encryption Standard, Federal Information Processing Standards Publication (FIPS PUB) 46-3, National Bureau of Standards, 1999.

[9] The Sage Developers, SageMath, the Sage Mathematics Software System (Version 7.2), 2016, http://www.sagemath.org.

Received: 2017-02-10
Accepted: 2018-06-26
Published Online: 2018-07-11
Published in Print: 2018-09-01

© 2018 Walter de Gruyter GmbH, Berlin/Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
