Code-based cryptography is one of the main areas of interest for NIST’s Post-Quantum Cryptography Standardization call. In this paper, we introduce DAGS, a Key Encapsulation Mechanism (KEM) based on quasi-dyadic generalized Srivastava codes. The scheme is proved to be IND-CCA secure in both the random oracle model and the quantum random oracle model. We believe that DAGS will offer competitive performance, especially when compared with other existing code-based schemes, and represent a valid candidate for post-quantum standardization.
The availability of large-scale quantum computers is getting ever closer to reality, and with it, all of the public-key cryptosystems currently in use, which rely on number-theoretic problems such as factorization and the discrete logarithm, will become obsolete. Therefore, it is of extreme importance to be able to offer a credible alternative that can resist attackers equipped with quantum technology. In this regard, NIST’s call for proposals for post-quantum standardization further confirms the need for solid post-quantum proposals. Furthermore, considering the desired lifetime of the encrypted data, and the lengthy timeframe for such a complex standardization process, it is clear that convincing research work in post-quantum cryptography is not only necessary, but also urgent.
Code-based cryptography is one of the main candidates for this task. The area is generally based on the syndrome decoding problem, which has shown no vulnerabilities to quantum attacks over the years. Since McEliece’s seminal work in 1978, many variants and modifications have been proposed, trying to balance security and efficiency and in particular dealing with inherent flaws such as the large size of the public keys. In fact, while McEliece’s original cryptosystem (based on binary Goppa codes) is still formally unbroken, it features keys of several tens of kilobytes, which has effectively prevented its use in many applications.
There are currently two main trends to deal with this issue, and they both involve structured matrices: the first is based on “traditional” algebraic codes, and in particular alternant codes such as Goppa or generalized Srivastava codes; the second suggests using sparse matrices as in LDPC/MDPC codes [3, 32]. This work builds on the former approach, initiated in 2009 by Berger et al., who proposed Quasi-Cyclic (QC) codes, and Misoczki and Barreto, who suggested Quasi-Dyadic (QD) codes instead (later generalized to Quasi-Monoidic (QM) codes). Both proposals feature very compact public keys due to the introduction of the extra algebraic structure, but unfortunately this also leads to a vulnerability. Indeed, Faugère et al. devised a clever attack (known simply as FOPT) which exploits the algebraic structure to build a system of equations, which can then be solved using Gröbner basis techniques. As a result, the QC proposal is heavily compromised, while the QD/QM approach needs to be treated with caution. In fact, for a proper choice of parameters, it is still possible to design secure schemes, using for instance binary Goppa codes, or Generalized Srivastava (GS) codes as suggested by Persichetti in .
In this paper, we present DAGS, a key encapsulation mechanism that follows the QD approach using GS codes. KEMs are the primitive favored by NIST for key exchange schemes, and can be used to build encryption schemes, for example using the hybrid encryption paradigm introduced by Cramer and Shoup . To the best of our knowledge, this is the first code-based KEM that uses quasi-dyadic codes. Another NIST submission, named BIG QUAKE , proposes a scheme based on quasi-cyclic codes.
Our KEM achieves IND-CCA security, following the recent framework by Kiltz et al., and features compact public keys and efficient encapsulation and decapsulation algorithms. We modulate our parameters to achieve an efficient scheme, while at the same time keeping out of range of the FOPT attack. We provide an initial performance analysis of our scheme as well as access to our reference code; the team is currently working on several additional, optimized implementations, using C++, assembly language, and hardware (FPGA).
We show that our proposal compares well with other post-quantum KEMs. These include the classic McEliece approach , as well as more recent proposals such as BIKE  and the aforementioned BIG QUAKE.
The “Classic McEliece” project is an evolution of the well-known McBits (based on the work of Persichetti), and benefits from a well-understood security assessment, but suffers from the usual public-key size issue. BIKE, a protocol based on QC-MDPC codes, is the result of a merge between two independently published works with a similar background, namely CAKE and Ouroboros. The scheme possesses some very nice features, like compact keys and an easy implementation approach, but currently has some potential drawbacks. In fact, the QC-MDPC encryption scheme on which it is based is susceptible to a reaction attack by Guo, Johansson and Stankovski (GJS), and thus the protocol is forced to employ ephemeral keys. Moreover, due to its non-trivial Decoding Failure Rate (DFR), achieving IND-CCA security becomes very hard, so the BIKE protocol only claims IND-CPA security.
Finally, BIG QUAKE continues the line of work of  and proposes to use quasi-cyclic Goppa codes. Due to the particular nature of the FOPT attack and its successors, it seems harder to provide security with this approach, and the protocol chooses very large parameters in order to do so. We will discuss the attack and the parameter choices in Section 5.
More distantly related are lattice-based schemes like NewHope and Frodo, based respectively on the ring variant of LWE and on plain LWE. While these schemes are not necessarily a direct comparison term, it is nice to observe that DAGS offers comparable performance.
Organization of the paper.
The paper is organized as follows. We start by giving some preliminary notions in Section 2. We describe the DAGS protocol in Section 3, and we discuss its provable security in Section 4, showing that DAGS is IND-CCA secure in the random oracle model as well as the quantum random oracle model. Section 5 features a discussion about practical security and known attacks, which include general decoding attacks (information set decoding and the like) as well as algebraic attacks; we then present parameters for the scheme. Performance details are given in Section 6. Finally, we conclude in Section 7.
We will use the following conventions throughout the rest of the paper:
an algorithm or (hash) function
the concatenation of vectors and
the diagonal matrix formed by the vector
the identity matrix
choosing a random element from a set or distribution
the length of a shared symmetric key
2.2 Linear codes
We briefly recall some fundamental notions from coding theory. The Hamming weight of a vector is given by the number of its non-zero components. We define a linear code using the metric induced by the Hamming weight.
An $[n,k]$-linear code $\mathcal{C}$ of length $n$ and dimension $k$ over $\mathbb{F}_q$ is a $k$-dimensional vector subspace of $\mathbb{F}_q^n$.
A linear code can be represented by means of a matrix $G \in \mathbb{F}_q^{k \times n}$, called generator matrix, whose rows form a basis for the vector space defining the code. Alternatively, a linear code can also be represented as the kernel of a matrix $H \in \mathbb{F}_q^{(n-k) \times n}$, known as parity-check matrix, i.e. $\mathcal{C} = \{c \in \mathbb{F}_q^n : Hc^T = 0\}$. Thanks to the generator matrix, we can easily define the codeword corresponding to a vector $\mu \in \mathbb{F}_q^k$ as $\mu G$. Finally, we call syndrome of a vector $x \in \mathbb{F}_q^n$ the vector $s = Hx^T \in \mathbb{F}_q^{n-k}$.
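As a concrete toy illustration of these notions (an arbitrary $[6,3]$ binary code, not one used by DAGS), the generator/parity-check duality and the syndrome computation can be sketched as follows:

```python
# Toy illustration of generator/parity-check duality over GF(2):
# codewords are mu*G, and the syndrome H*c^T is zero exactly for codewords.
# The specific [6,3] code below is an arbitrary example.

def mat_vec(M, v):
    """Multiply matrix M by column vector v over GF(2)."""
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def vec_mat(v, M):
    """Multiply row vector v by matrix M over GF(2)."""
    cols = len(M[0])
    return [sum(v[i] * M[i][j] for i in range(len(M))) % 2 for j in range(cols)]

# Systematic generator matrix G = [I_3 | A] for a [6,3] binary code.
A = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
G = [[1, 0, 0] + A[0],
     [0, 1, 0] + A[1],
     [0, 0, 1] + A[2]]
# Corresponding parity-check matrix H = [A^T | I_3], so that G H^T = 0.
H = [[A[0][j], A[1][j], A[2][j]] + [1 if i == j else 0 for i in range(3)]
     for j in range(3)]

mu = [1, 0, 1]                 # message
c = vec_mat(mu, G)             # codeword mu * G
assert mat_vec(H, c) == [0, 0, 0]   # syndrome of a codeword is zero

e = [0, 0, 0, 0, 1, 0]         # single-bit error
y = [(ci + ei) % 2 for ci, ei in zip(c, e)]
print(mat_vec(H, y))           # nonzero syndrome: equals H * e^T → [0, 1, 0]
```

Note that the syndrome of the noisy word $y = c + e$ depends only on the error $e$, which is the starting point of syndrome decoding.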
2.3 Structured matrices and GS codes
Given a ring $R$ (in our case the finite field $\mathbb{F}_{q^m}$) and a vector $h = (h_0, h_1, \ldots, h_{n-1}) \in R^n$, the dyadic matrix $\Delta(h)$ is the symmetric $n \times n$ matrix with components $\Delta_{ij} = h_{i \oplus j}$, where $\oplus$ stands for bitwise exclusive-or on the binary representations of the indices. The sequence $h$ is called its signature. Moreover, $\Delta(r, h)$ denotes the matrix $\Delta(h)$ truncated to its first $r$ rows. Finally, we call a matrix quasi-dyadic if it is a block matrix whose component blocks are dyadic submatrices.
If $n$ is a power of 2, then every $n \times n$ dyadic matrix can be described recursively as
$$\Delta(h) = \begin{pmatrix} A & B \\ B & A \end{pmatrix},$$
where each block is an $n/2 \times n/2$ dyadic matrix. Note that by definition any $1 \times 1$ matrix is trivially dyadic.
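The definition can be illustrated in a few lines (plain integers stand in for field elements; the signature below is arbitrary):

```python
# Sketch of the dyadic-matrix definition: Delta(h)[i][j] = h[i XOR j].
# Entries here are plain integers; in DAGS they live in the field F_{q^m}.

def dyadic(h):
    n = len(h)
    assert n & (n - 1) == 0, "length must be a power of 2"
    return [[h[i ^ j] for j in range(n)] for i in range(n)]

h = [5, 3, 8, 1]          # an arbitrary signature of length 4
M = dyadic(h)

# Symmetric, and the first row is the signature itself.
assert all(M[i][j] == M[j][i] for i in range(4) for j in range(4))
assert M[0] == h

# Recursive 2x2 block structure [[A, B], [B, A]]:
A = [row[:2] for row in M[:2]]
B = [row[2:] for row in M[:2]]
assert [row[:2] for row in M[2:]] == B
assert [row[2:] for row in M[2:]] == A
```

The compactness exploited by DAGS is visible here: the whole $n \times n$ matrix is determined by its first row.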
For $m, n, s, t \in \mathbb{N}$ and a prime power $q$, let $\alpha_1, \ldots, \alpha_n, w_1, \ldots, w_s$ be $n + s$ distinct elements of $\mathbb{F}_{q^m}$, and let $z_1, \ldots, z_n$ be non-zero elements of $\mathbb{F}_{q^m}$. The generalized Srivastava code of order $st$ and length $n$ is defined by a parity-check matrix of the form
$$H = \begin{pmatrix} H_1 \\ H_2 \\ \vdots \\ H_s \end{pmatrix},$$
where each block is given by
$$H_i = \begin{pmatrix} \dfrac{z_1}{\alpha_1 - w_i} & \cdots & \dfrac{z_n}{\alpha_n - w_i} \\ \dfrac{z_1}{(\alpha_1 - w_i)^2} & \cdots & \dfrac{z_n}{(\alpha_n - w_i)^2} \\ \vdots & & \vdots \\ \dfrac{z_1}{(\alpha_1 - w_i)^t} & \cdots & \dfrac{z_n}{(\alpha_n - w_i)^t} \end{pmatrix}.$$
The parameters for such a code are the length $n \le q^m - s$, dimension $k \ge n - mst$ and minimum distance $d \ge st + 1$. GS codes are part of the family of alternant codes, and therefore benefit from an efficient decoding algorithm; according to Sarwate [40, Corollary 2] the complexity of decoding is $O(n \log^2 n)$, which is the same as for Goppa codes. Moreover, it can be easily proved that every GS code with $t = 1$ is a Goppa code. More information about this class of codes can be found in [29, Chapter 12, Section 6].
The core idea of DAGS is to use GS codes which are defined by matrices in quasi-dyadic form. In particular, the public key of the scheme is the generator matrix of such a code, which, being quasi-dyadic, can be described using just the signature of each block. This allows us to obtain a very compact public key. Now, it can be easily proved that every GS code with $t = 1$ is a Goppa code, and we know [29, Chapter 12, Proposition 5] that Goppa codes admit a parity-check matrix in Cauchy form under certain conditions (the generator polynomial has to be monic and without multiple zeros). By Cauchy we mean a matrix $C(u, v)$ with components $C_{ij} = \frac{1}{u_i - v_j}$.
Misoczki and Barreto showed in [31, Theorem 2] that the intersection of the set of Cauchy matrices with the set of dyadic matrices is not empty if the code is defined over a field of characteristic 2, and the dyadic signature satisfies the following “fundamental” equation:
$$\frac{1}{h_{i \oplus j}} = \frac{1}{h_i} + \frac{1}{h_j} + \frac{1}{h_0}.$$
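A valid signature can be built by choosing the entries at power-of-two positions freely and deriving the rest from the fundamental equation, as in Misoczki and Barreto's construction. The sketch below does this over the toy field $\mathbb{F}_{2^4}$ (not a field DAGS actually uses), retrying when a derived value would be zero:

```python
# Sketch (toy field GF(2^4), not DAGS's actual field) of building a dyadic
# signature satisfying the fundamental equation
#   1/h_{i XOR j} = 1/h_i + 1/h_j + 1/h_0,
# by picking h_0 and the h_{2^k} at random and deriving the rest.
import random

MOD = 0b10011  # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r

def gf_inv(a):
    return next(x for x in range(1, 16) if gf_mul(a, x) == 1)

def signature(n, rng):
    """Try to build a valid dyadic signature of length n (a power of 2)."""
    while True:
        inv_h = [0] * n                        # work with the inverses 1/h_i
        inv_h[0] = gf_inv(rng.randrange(1, 16))
        ok, k = True, 1
        while k < n and ok:
            inv_h[k] = gf_inv(rng.randrange(1, 16))
            for i in range(1, k):
                inv_h[k + i] = inv_h[k] ^ inv_h[i] ^ inv_h[0]  # field add = XOR
                if inv_h[k + i] == 0:
                    ok = False                 # 1/h would be zero: retry
                    break
            k <<= 1
        if ok:
            return [gf_inv(v) for v in inv_h]

rng = random.Random(7)
h = signature(8, rng)
# The fundamental equation holds for every pair of indices:
for i in range(8):
    for j in range(8):
        assert gf_inv(h[i ^ j]) == gf_inv(h[i]) ^ gf_inv(h[j]) ^ gf_inv(h[0])
print(h)
```

The check succeeds for all pairs because, writing $b_i = 1/h_i + 1/h_0$, the construction makes $b$ additive over the binary decomposition of the index, and common bits cancel under XOR in characteristic 2.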
On the other hand, it is evident from Definition 2.3 that if we permute the rows of $H$ to constitute blocks of the form
$$\hat{H}_l = \begin{pmatrix} \dfrac{z_1}{(\alpha_1 - w_1)^l} & \cdots & \dfrac{z_n}{(\alpha_n - w_1)^l} \\ \vdots & & \vdots \\ \dfrac{z_1}{(\alpha_1 - w_s)^l} & \cdots & \dfrac{z_n}{(\alpha_n - w_s)^l} \end{pmatrix}, \qquad l = 1, \ldots, t,$$
we obtain an equivalent parity-check matrix for a GS code, given by
$$\hat{H} = \begin{pmatrix} \hat{H}_1 \\ \hat{H}_2 \\ \vdots \\ \hat{H}_t \end{pmatrix}.$$
The key generation process exploits first of all the fundamental equation to build a Cauchy matrix. The matrix is then successively powered (element by element) forming several blocks which are superimposed and then multiplied by a random diagonal matrix. Thanks to the observation above, we have now formed the matrix , where for ease of notation we use and to denote the vectors of elements and , respectively. Finally, the resulting matrix is projected onto the base field (as usual for alternant codes) and row-reduced to systematic form to form the public key. The process will be described in detail in the next section: note that this is essentially the same as in , to which we refer the readers looking for additional details about dyadic GS codes and the key generation process.
We are now ready to introduce the three algorithms that form DAGS. System parameters are the code length n and dimension k, the values s and t which define a GS code, the cardinality of the base field q and the degree of the field extension m. In addition, we have , where is arbitrary and is set to be “small”. In practice, the value of depends on the base field and is such that a vector of length provides at least 256 bits of entropy. This also makes the hash functions (see below) easy to compute, and ensures that the overhead due to the IND-CCA2 security in the QROM is minimal.
DAGS is a key encapsulation mechanism and as such it is composed of three algorithms – Key Generation, Encapsulation and Decapsulation – which we present below in that order.
Algorithm 1 (Key Generation).
Build the Cauchy support:
Choose a random offset .
Compute for .
Compute for .
Set and .
Form the Cauchy matrix .
Build , , by raising each element of to the power of i.
Superimpose the blocks in ascending order to form matrix .
Generate the vector by sampling uniformly at random elements in with the restriction for , .
Project H onto using the co-trace function; call this .
Write in systematic form .
The public key is the generator matrix .
The private key is the pair .
The encapsulation and decapsulation algorithms follow the paradigm of  to obtain an IND-CCA secure KEM from a PKE, and as such, they make use of two functions and , respectively an expansion and a compression function, the former with the task of generating randomness for the scheme, the latter to provide “plaintext confirmation”. The shared symmetric key is obtained via a third function . For more details about randomness generation and how the functions are implemented in practice, see Section 6.2.
Algorithm 2 (Encapsulation).
Compute and .
Parse as then set .
Generate the error vector of length n and weight w from .
Output the ciphertext ; the encapsulated key is .
The decapsulation algorithm consists mainly of decoding the noisy codeword received as part of the ciphertext. This is done using the alternant decoding algorithm described in [29, Chapter 12, Section 9] and requires the parity-check matrix to be in alternant form.
Algorithm 3 (Decapsulation).
Get the parity-check matrix in alternant form from the private key.
Use to decode and obtain the codeword and the error .
Output if decoding fails or .
Recover and parse it as .
Compute and .
Parse as .
Generate the error vector of length n and weight w from .
If , output .
Else, compute .
The decapsulated key is .
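The overall flow of Algorithms 2 and 3 can be sketched on a toy instance. Everything below is illustrative rather than DAGS itself: a $[7,4]$ Hamming code with $w = 1$ stands in for the GS code, SHAKE-256 with domain-separation tags stands in for the functions used for expansion, confirmation and key derivation, and all sizes and encodings are arbitrary choices. The sketch does, however, reproduce the structure of the scheme: expansion of $m$ into randomness, a confirmation value $d$, and a decapsulation that re-generates the randomness instead of re-encrypting.

```python
# Toy sketch of the encapsulation/decapsulation flow on a [7,4] Hamming
# code with w = 1. SHAKE-256 stands in for the scheme's hash functions.
import hashlib

def shake(tag, data, nbytes):
    return hashlib.shake_256(tag + data).digest(nbytes)

# Systematic [7,4] Hamming code: G = [I_4 | A], Hm = [A^T | I_3].
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[1 if i == j else 0 for j in range(4)] + A[i] for i in range(4)]
Hm = [[A[i][j] for i in range(4)] + [1 if idx == j else 0 for idx in range(3)]
      for j in range(3)]

def encode(mu):
    return [sum(mu[i] * G[i][c] for i in range(4)) % 2 for c in range(7)]

def syndrome(y):
    return [sum(h * x for h, x in zip(row, y)) % 2 for row in Hm]

def decode(y):
    """Exhaustive decoding for w = 1: try every single-bit error."""
    for pos in range(7):
        e = [1 if i == pos else 0 for i in range(7)]
        c = [(a + b) % 2 for a, b in zip(y, e)]
        if syndrome(c) == [0, 0, 0]:
            return c[:4], e            # systematic: mu is the first k bits
    return None

def error_from_seed(sigma):
    pos = shake(b"E", sigma, 1)[0] % 7  # weight-1 error position
    return [1 if i == pos else 0 for i in range(7)]

def encaps(m):                          # m: 2-"bit" message, one bit per byte
    r = shake(b"G", m, 4)               # expansion, parsed as (rho || sigma)
    rho, sigma = r[:2], r[2:]
    d = shake(b"H", m, 2)               # plaintext confirmation
    mu = [b & 1 for b in rho] + [b & 1 for b in m]   # mu = (rho || m)
    e = error_from_seed(sigma)
    y = [(a + b) % 2 for a, b in zip(encode(mu), e)]
    return (y, d), shake(b"K", m, 16)   # ciphertext and shared key

def decaps(ct):
    y, d = ct
    dec = decode(y)
    if dec is None:
        return None
    mu, e = dec
    m = bytes(mu[2:])                   # parse mu as (rho || m)
    r = shake(b"G", m, 4)
    rho, sigma = r[:2], r[2:]
    # re-generate the randomness and compare, instead of re-encrypting:
    if ([b & 1 for b in rho] != mu[:2] or error_from_seed(sigma) != e
            or shake(b"H", m, 2) != d):
        return None
    return shake(b"K", m, 16)

ct, key = encaps(bytes([1, 0]))
assert decaps(ct) == key
```

Tampering with either component of the ciphertext makes one of the decapsulation checks fail, so the algorithm returns a failure symbol instead of a key.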
DAGS is built upon the McEliece cryptosystem, with one notable difference: we incorporate the “randomized” version of McEliece by Nojima et al. into our scheme. This is extremely beneficial in two distinct respects: first of all, it allows us to use a much shorter vector to generate the remaining components of the scheme, greatly improving efficiency. Secondly, it allows us to get tighter security bounds. Note that our protocol differs slightly from the paradigm presented in , in that we do not perform a full re-encryption in the decapsulation algorithm. Instead, we simply re-generate the randomness and compare it with the one obtained after decoding. This is possible since, unlike a generic PKE, McEliece decryption reveals the randomness used, in our case (and ). It is clear that if the re-generated randomness is equal to the retrieved one, the resulting encryption will also be equal. This allows us to further decrease computation time.
The selection of the parameters for the scheme will be discussed in Section 5.4.
4 KEM security
In this section, we discuss some aspects of provable security, and in particular we show that DAGS satisfies the notion of IND-CCA security for KEMs, as defined below.
The adaptive chosen-ciphertext attack game for a KEM proceeds as follows:
Query a key generation oracle to obtain a public key pk.
Make a sequence of calls to a decryption oracle, submitting any string of the proper length. The oracle will respond with .
Query an encryption oracle. The oracle runs the encapsulation algorithm to generate a pair $(c^*, K)$, then chooses a random bit $b$ and replies with the “challenge” ciphertext $(c^*, K^*)$, where $K^* = K$ if $b = 1$, or $K^*$ is a random string of the same length otherwise.
Keep performing decryption queries. If the submitted ciphertext is the challenge ciphertext $c^*$, the oracle will return $\perp$.
The adversary succeeds if it outputs a bit $b'$ with $b' = b$. More precisely, we define the advantage of $\mathcal{A}$ against the KEM as
$$\mathrm{Adv}^{\mathsf{IND\text{-}CCA}}_{\mathsf{KEM}}(\mathcal{A}) = \left| \Pr[b' = b] - \frac{1}{2} \right|.$$
We say that a KEM is IND-CCA secure if the advantage of any polynomial-time adversary in the above CCA attack model is negligible.
Before discussing the IND-CCA security of DAGS, we show that the underlying PKE (i.e. randomized McEliece, see ) satisfies a simple property. This will allow us to get better security bounds in our reduction.
Consider a probabilistic PKE with randomness set $\mathcal{R}$. We say that PKE is $\gamma$-spread if, for a given key pair $(pk, sk)$, a plaintext $m$ and an element $y$ in the ciphertext domain, we have
$$\Pr_{r \xleftarrow{\$} \mathcal{R}}\left[\mathsf{Enc}(pk, m; r) = y\right] \le 2^{-\gamma}$$
for a certain $\gamma > 0$.
The definition above is presented as in , but note that in fact this corresponds to the notion of γ-uniformity given by Fujisaki and Okamoto in , except for a change of constants. In other words, a scheme is $\gamma$-spread if it is $2^{-\gamma}$-uniform.
It was proved in  that a simple variant of the (classic) McEliece PKE is γ-uniform for $\gamma = 2^{-k}$, where $k$ is the code dimension as usual (more generally, $\gamma = q^{-k}$ for a cryptosystem defined over $\mathbb{F}_q$). We can extend this result to our scheme as follows.
Randomized McEliece is γ-uniform for .
Let be a generic vector of . Then either is a word at distance w from the code, or it is not. If it is not, the probability of being a valid ciphertext is clearly exactly 0. On the other hand, suppose is at distance w from the code; then there is only one choice of and one choice of that satisfy the equation (since w is below the GV bound), i.e. the probability of being a valid ciphertext is exactly , which concludes the proof. ∎
We are now ready to present the security results.
Let be an IND-CCA adversary against DAGS that makes at most total random oracle queries and decryption queries. Then there exists an IND-CPA adversary against PKE, running in approximately the same time as , such that
The thesis is a consequence of the results presented in [27, Section 3.3]. In fact, our scheme follows the framework that consists of applying two generic transformations to a public-key encryption scheme. The first step consists of transforming the IND-CPA encryption scheme into an OW-PCVA (i.e. plaintext and validity checking) scheme. Then, the resulting scheme is transformed into a KEM in a “standard” way. Both proofs are obtained via a sequence of games, and their combination shows that breaking the IND-CCA security of the KEM would allow breaking the IND-CPA security of the underlying encryption scheme. Note that randomized McEliece, instantiated with quasi-dyadic GS codes, presents no correctness error (the value δ in ), which greatly simplifies the resulting bound. ∎
The value d included in the KEM ciphertext does not contribute to the security result above, but it is a crucial factor to provide security in the Quantum Random Oracle Model (QROM). We present this in the next theorem.
Let be a quantum IND-CCA adversary against DAGS that makes at most total quantum random oracle queries and (classical) decryption queries. Then there exists an OW-CPA adversary against PKE, running in approximately the same time as , such that
The thesis is a consequence of the results presented in Section 4.4 of . In fact, our scheme follows the framework that consists of applying two generic transformations to a public-key encryption scheme. The first step, transforming the IND-CPA encryption scheme into an OW-PCVA (i.e. plaintext and validity checking) scheme, is the same as in the previous case. Now, the resulting scheme is transformed into a KEM with techniques suitable for the QROM. The combination of the two proofs shows that breaking the IND-CCA security of the KEM would allow breaking the OW-CPA security of the underlying encryption scheme. Note, therefore, that the IND-CPA security of the underlying PKE has no further effect on the final result in this case, and can instead be considered just a guarantee that the scheme is indeed OW-CPA secure. The bound obtained is a “simplified” and “concrete” version (as presented by the authors) and, in particular, it is easy to notice that it does not depend on the number of queries presented to the random oracle . The bound is further simplified since, as above, the underlying PKE presents no correctness error. ∎
5 Practical security and parameters
Having proved that DAGS satisfies the notion of IND-CCA security for KEMs, we now move onto a treatment of practical security issues. In particular, we will briefly present the hard problem on which DAGS is based, and then discuss the main attacks on the scheme and related security concerns.
5.1 Hard problems from coding theory
Most of the code-based cryptographic constructions are based on the hardness of the following problem, known as the (q-ary) Syndrome Decoding Problem (SDP).
Given an $r \times n$ full-rank matrix $H$ over $\mathbb{F}_q$, a vector $s \in \mathbb{F}_q^r$, and a non-negative integer $w$, find a vector $e \in \mathbb{F}_q^n$ of weight $w$ such that $He^T = s$.
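On toy parameters the problem can be stated operationally as a brute-force search; the random instance below merely illustrates that the naive cost grows with the binomial coefficient $\binom{n}{w}$:

```python
# Brute-force illustration of the syndrome decoding problem over GF(2):
# search all weight-w vectors e for one with H e^T = s. The instance is a
# small random one; for cryptographic sizes the search space C(n, w) is
# what makes the problem hard.
from itertools import combinations
import random

n, r, w = 10, 5, 2
rng = random.Random(0)
H = [[rng.randrange(2) for _ in range(n)] for _ in range(r)]

def syndrome(e):
    return tuple(sum(H[i][j] * e[j] for j in range(n)) % 2 for i in range(r))

secret = [1 if j in (2, 7) else 0 for j in range(n)]   # planted weight-2 error
s = syndrome(secret)

found = None
for support in combinations(range(n), w):   # C(n, w) candidates
    e = [1 if j in support else 0 for j in range(n)]
    if syndrome(e) == s:
        found = e
        break

assert found is not None and sum(found) == w and syndrome(found) == s
```

Note that the search may return any weight-$w$ solution, not necessarily the planted one; for well-chosen parameters (error weight below the GV bound) the solution is unique.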
The corresponding decision problem was proved to be NP-complete in 1978 , but only for binary codes. In 1994, Barg proved that this result holds for codes over all finite fields (, in Russian, and [6, Theorem 4.1]).
In addition, many schemes (including the original McEliece proposal) require the following computational assumption.
The public matrix output by the key generation algorithm is computationally indistinguishable from a uniformly chosen matrix of the same size.
The assumption above has historically been believed to be true, except in very particular cases. For instance, there exists a distinguisher (Faugère et al. ) for cryptographic protocols that make use of high-rate Goppa codes (like the CFS signature scheme ). Moreover, it is worth mentioning that the “classical” methods for obtaining an indistinguishable public matrix, such as the use of scrambling matrices S and P, are rather outdated and impractical, and can introduce vulnerabilities into the scheme, as shown by Strenzke et al. [42, 43]. Thus, traditionally, the safest method (Biswas and Sendrier, ) to obtain the public matrix is simply to compute the systematic form of the private matrix.
5.2 Decoding attacks
The main approach for solving SDP is the technique known as Information Set Decoding (ISD), first introduced by Prange , which directly targets the error vector and aims at decoding without knowledge of the underlying structure of the code (i.e. treating the code as truly random). Among several variants and generalizations, Peters showed  that it is possible to apply Prange’s approach to generic q-ary codes. Other approaches such as statistical decoding [1, 33] are usually considered less efficient. Thus, when choosing parameters, we will focus mainly on defeating attacks of the ISD family.
Hamdaoui and Sendrier in  provide non-asymptotic complexity estimates for ISD in the binary case. For codes over , instead, a bound is given in , which extends the work of Peters. For a practical evaluation of the ISD running times and corresponding security level, we used Peters’s ISDFQ script .
Bernstein in  shows that Grover’s algorithm applies to ISD-like algorithms, effectively halving the asymptotic exponent in the complexity estimates. Later, it was proved in  that several variants of ISD have the potential to achieve a better exponent, however the improvement was disappointingly far from the expected factor of 2. For this reason, we simply treat the best quantum attack on our scheme to be “traditional” ISD (Prange) combined with Grover search.
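For illustration, Prange's original algorithm (the simplest member of the ISD family) can be sketched over $\mathbb{F}_2$ as follows; the sizes are toy values chosen so that the loop succeeds almost immediately, whereas for cryptographic parameters the expected number of iterations is astronomical:

```python
# Minimal sketch of Prange's information-set decoding over GF(2): permute
# the columns of H, row-reduce so the last r columns form an identity, and
# hope the error is confined to those r positions, in which case the
# transformed syndrome *is* the (permuted) error.
import random

def prange(H, s, w, tries=10000, seed=1):
    rng = random.Random(seed)
    r, n = len(H), len(H[0])
    for _ in range(tries):
        perm = list(range(n))
        rng.shuffle(perm)
        # augmented matrix [H_perm | s]; row operations preserve solutions
        M = [[H[i][perm[j]] for j in range(n)] + [s[i]] for i in range(r)]
        ok = True
        for i in range(r):                 # Gaussian elimination aiming at
            col = n - r + i                # identity in the last r columns
            piv = next((k for k in range(i, r) if M[k][col]), None)
            if piv is None:
                ok = False                 # singular guess: pick a new one
                break
            M[i], M[piv] = M[piv], M[i]
            for k in range(r):
                if k != i and M[k][col]:
                    M[k] = [(a + b) % 2 for a, b in zip(M[k], M[i])]
        if not ok:
            continue
        s2 = [row[n] for row in M]         # transformed syndrome
        if sum(s2) == w:                   # success: error fits the guess
            e = [0] * n
            for i in range(r):
                e[perm[n - r + i]] = s2[i]
            return e
    return None

rng = random.Random(5)
n, r, w = 12, 8, 2
H = [[rng.randrange(2) for _ in range(n)] for _ in range(r)]
secret = [1 if j in (3, 9) else 0 for j in range(n)]
s = [sum(H[i][j] * secret[j] for j in range(n)) % 2 for i in range(r)]

e = prange(H, s, w)
assert e is not None and sum(e) == w
assert [sum(H[i][j] * e[j] for j in range(n)) % 2 for i in range(r)] == s
```

More advanced variants (Stern, BJMM, and the q-ary generalizations mentioned above) improve on the success probability per iteration, but the overall structure of the loop is the same.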
5.3 Algebraic attacks
While, as we discussed above, recovering a private matrix from a public one is in general a very difficult problem, the presence of special algebraic properties and additional structure in the code can have a considerable effect in lowering this difficulty. It turns out that, in the case of alternant codes for instance, there are indeed efficient methods that exploit this issue.
Solving systems of equations.
A very effective structural attack was introduced by Faugère et al. in . The attack (for convenience referred to as FOPT) relies on a simple algebraic property to build a system of equations, which is then solved using Gröbner basis techniques. Note that this applies in principle to every linear code, but the system of equations is in general far too large to be solved in practice. It is then the special properties of alternant codes, as we mentioned above, that make the attack possible, by considerably reducing the number of unknowns of the system.
The attack was originally aimed at two variants of McEliece, introduced respectively in  and . The first variant, using quasi-cyclic codes, was easily broken in all proposed parameters and falls out of the scope of this paper. The second variant, instead, only considered quasi-dyadic Goppa codes. In this case too, most of the proposed parameters have been broken, except for the binary case (i.e. base field $\mathbb{F}_2$). This was, in truth, not connected to the base field per se, but rather depended on the fact that, with a smaller base field, the authors used a much higher extension degree m. This is because, probably for comparison reasons, all the proposed parameters were chosen so that the value $q^m$ was kept constant. As it turns out, the extension degree m plays a key role in evaluating the complexity of the attack.
Following up on their own work, the authors in  produced a paper which analyzes the attack in detail, with the aim of evaluating its complexity at least somewhat rigorously. At the core of the attack, there is an affine bilinear system, which is derived from the initial system of equations by applying various algebraic relations due to the quasi-dyadic structure. This bilinear system has variables, where these are, respectively, the number of X and Y “free” variables (after applying the relations) of an alternant parity-check matrix H with . Moreover, the degree of regularity (i.e. the maximal degree of the polynomials appearing during the computation) is bounded above by . It is shown that this number dominates computation time, and so it is crucial to correctly evaluate it in our case. In fact, for the original proposal based on Goppa codes , we have , where is the dyadic order and is the number of dyadic blocks, and . We report an excerpt of some numbers from the paper in Table 1.
Table 1 reports, for each parameter set: $q$, $m$, $n$, $k$, the running time in seconds, and the number of operations.
It is possible to observe several facts. In every set of parameters, for instance, , and so is the most important number here. In other words, the degree of the extension field is crucial in evaluating the complexity of the attack, as we mentioned above. As a confirmation, it is easy to notice that all parameters were broken very easily when this is extremely small (1 in most cases), while the running time scales accordingly when m grows. In fact, the attack could not be performed in practice on the first set of parameters (hence the N/A).
The first three groups of parameters are taken from the preliminary (unpublished) version of [31, Tables 2, 3 and 5, respectively], while the last group consists of some ad hoc parameters generated by the FOPT authors. The absence of parameters from [31, Table 4] stands out. In fact, all of these parameters used  as base field and thus could not be broken (at least not without very long computations), just like the first set. As a result, an updated version of  was produced for publication, in which the insecure parameters are removed and only the binary sets (those of [31, Table 4]) appear.
Towards the end of , the authors present a bound on the theoretical complexity of computing a Gröbner base of the affine bilinear system which is at the core of the attack. They then evaluate this bound and compare it with the number of operations required in practice (last column of Table 1). The bound is given by
where D is the degree of regularity of the system, i.e.
and indicates the coefficient of the term in the Hilbert bi-series , as defined in [23, Appendix A].
As it turns out, this bound is quite loose, being sometimes above and sometimes below the experimental results, depending on which set of parameters is considered. As such, it is to be read as a grossly approximate indication of the expected complexity of a parameter set, and it only gives a rough idea of the security provided by each set. Nevertheless, since we are able to compute the bound for all proposed DAGS parameters, we will keep this number in mind when proposing parameters (Section 5.4), to make sure our choices are at least not obviously insecure.
As a bottom line, it is clear that the complexity of the attack scales somewhat proportionally to the value which defines the dimension of the solution space. The FOPT authors point out that any scheme for which this dimension is less than or equal to 20 should be within the scope of the attack.
Since GS codes are also alternant codes, the attack can be applied to our proposal as well. There is, however, one very important difference to keep in mind. In fact, it is shown in  that, thanks to the particular structure of GS codes, the dimension of the solution space is defined by , rather than . This provides greater flexibility when designing parameters for the code and, in particular, allows us to “rest the weight” of the attack on two shoulders rather than just one. Thus we are able to modulate the parameters and keep the extension degree m small while still achieving a large dimension for the solution space. We will discuss parameter selection in detail in Section 5.4, as already mentioned.
Recently, an extension of the FOPT attack appeared in . In this work, the authors introduce a new technique called “folding”, and show that it is possible to reduce the complexity of the FOPT attack to the complexity of attacking a smaller code (the “folded” code). This is a consequence of the strong properties of the automorphism group that is present in the alternant codes used. The attack turns out to be very efficient against Goppa codes, as it is possible to recover a folded code which is also a Goppa code. As a consequence, it is possible to tweak the attack to solve a different, augmented system of equations (named ), rather than the “basic” one which is aimed at generic alternant codes (called ). Moreover, this can be further refined in the case of binary Goppa codes, leading to a third system of equations referred to as . In parallel, the authors present a new method called “structural elimination” that manages to eliminate a considerable number of variables, at the price of an increased degree in the equations considered. Solving the “eliminated” systems (called respectively , and ) often proves a more efficient choice, but the authors do occasionally use the non-eliminated systems when it is more convenient to do so.
The paper concentrates on attacking several parameters that were proposed for signature schemes and encryption schemes in various follow-up works that build and expand on  and . The latter includes, among others, some of the parameters presented in Table 1. It turns out that codes designed to work for signature schemes are very easy to attack (due to their particular nature); however, the situation for encryption is more complex. The authors are able to obtain a speedup in the attack times for previously investigated parameters, but some of the parameters could still not be solved in practice. We report the results in Table 2, where we indicate the type of system chosen to be solved, and we keep some of the previously-shown parameters for ease of comparison.
| 4 | 2048 | 1024 | 32 | 0.01 s | 0.5 s |
| 4 | 4096 | 3584 | 32 | 0.01 s | 7.1 s |
| 8 | 3584 | 1536 | 56 | 0.04 s | 1776.3 s |
The authors do not report timings for codes that were already broken with FOPT in negligible time (which is the case for all the parameter sets where ). Also, we have decided to exclude from our table parameters that are not relevant to this submission. These include for example the quasi-monoidic codes introduced in  (i.e. codes defined over a field for q not a power of 2).
This table confirms our intuition that high values of m result in a high number of operations, and that complexity increases somewhat proportionally to this value. Note that the last five sets of parameters were not broken in practice and the estimated complexity is always quite high. It is not clear what the authors mean by , but it is reasonable to assume that the actual complexity would not be dramatically smaller than the indicated value, and thus at least  in all cases. Consequently, the claim that parameters with  are “within the scope of the attack” now looks perhaps a bit optimistic.
The fourth set of parameters seems to contradict our intuition, since it was broken in practice with relative ease even though . However, it is possible to see that this is a code with a ridiculously high rate ( is very close to 1) and, in particular, the very large number of blocks clearly stands out. We remark that this set of parameters was chosen ad hoc by the attack authors, and in practice such a poor choice of parameters would never be considered. Nevertheless, it gives us confirmation (if needed) that high-rate codes are a bad choice not only with respect to ISD-like attacks, but also with respect to structural attacks.
The authors did not present any explicit result against GS codes and, in particular, it is not known whether a folded GS code is still a GS code. Thus, the attack in this case is limited to solving the generic system (or ) and does not benefit from the speedups which are specific to (binary) Goppa codes. For these reasons, and until an accurate complexity analysis is available, we choose to adhere to the most prudent guideline available and choose our parameters such that the dimension of the solution space for the algebraic system is strictly greater than 20. We then compute the bound given by equation (5.1) and report it as an additional indication of the expected complexity of the attack. We hope that this work will encourage further study into FOPT and folding attacks in relation to GS codes.
An attack based on norm-trace codes has been recently introduced by Barelli and Couvreur . As the name suggests, these codes are the result of the application of both the trace and the norm operation to a certain support vector, and they are alternant codes. In particular, they are subfield subcodes of Reed–Solomon codes. The construction of these codes is given explicitly only for the specific case (as will be the case in all DAGS parameters), i.e. the support vector has components in , in which case the norm-trace code is defined as
where α is an element of trace 1.
The main idea of the attack is that there exists a specific norm-trace code that is the conductor of the secret subcode into the public code. By “conductor” the authors refer to the largest code for which the Schur product (i.e. the component-wise product of all codewords, denoted by ) is entirely contained in the target, i.e.
The authors present two strategies to determine the secret subcode. The first strategy is essentially an exhaustive search over all the codes of the proper co-dimension. This co-dimension is given by , since s is the size of the permutation group of the code, which is non-trivial in our case because the code is quasi-dyadic. While such a brute-force search would in principle be too expensive, the authors present a few refinements that make it feasible, including an observation on the rate of the codes in use and the use of shortened codes.
The second strategy, instead, consists of solving a bilinear system, which is obtained using the parity-check matrix of the public code and treating as unknowns the elements of a generator matrix for the secret code (as well as the support vector ). This system is solved using Gröbner basis techniques, and benefits from a reduction in the number of variables similar to the one performed in FOPT, as well as the refinements mentioned above (shortened codes).
In any case, it is easy to deduce that the two parameters q and s are crucial in determining the cost of running this step of the attack, which dominates the overall cost. In fact, the authors are able to provide an accurate complexity analysis for the first strategy which confirms this intuition. The average number of iterations of the brute force search is given by , where c is exactly the co-dimension described above, i.e. . In addition, it is shown that the cost of computing Schur products is operations in the base field. Thus, the overall cost of the recovery step is operations in . The authors then argue that wrapping up the attack has negligible cost, and that q-ary operations can be done in constant time (using tables) when q is not too big. All this leads to a complexity which is below the desired security level for all of the DAGS parameters that had been proposed at the time of submission. We report these numbers in Table 3.
As one can observe, the attack complexity is especially low for the last set of parameters, since the dyadic order s was chosen to be , which is probably too large to provide security against this attack. Still, we point out that, at the time these parameters were proposed, there was no indication that this was the case, since the attack uses an entirely new technique, unrelated to the FOPT and folding attacks that we just described.
Unfortunately, the attack authors were not able to provide a security analysis for the second strategy (bilinear system). This is because the attack relies on Gröbner basis techniques, for which the cost is very hard to evaluate (similarly to what happened for FOPT). The only evidence the authors provide in this case is therefore experimental, based on running the attack in practice on all the parameters. The authors report running times of around 15 minutes for the first set and less than a minute for the last, while they admit they were not able to complete the execution in the middle case. This matches the evidence from the complexity results obtained for the first strategy, and suggests a speedup proportional to those. Further test runs are currently planned, but the fact that the attack already fails to run in practice for the middle set gives us confidence that updated parameters will make the attack infeasible.
5.4 Parameter selection
To choose our parameters, we keep in mind all the remarks from the previous sections about decoding attacks and structural attacks. As we have just seen, we need to respect the condition to guarantee security against FOPT. At the same time, to prevent the BC attack, q has to be chosen large enough and s cannot be too big. Finally, for ISD to be computationally expensive, we require a sufficiently large number w of errors to decode. This is given by according to the minimum distance of GS codes.
In addition, we tune our parameters to optimize performance. In this regard, the best results are obtained when the extension degree m is as small as possible. This, however, requires the base field to be large enough to accommodate sufficiently big codes (against ISD attacks), since the maximum size for the code length n is capped by . Realistically, this means we want to be at least , and the optimal choice in this sense seems to be (see Section 6). Finally, note that s is constrained to be a power of 2, and that odd values of t seem to offer the best performance.
Putting all the pieces together, we are able to present three sets of parameters, in Table 4. These correspond to three of the security levels indicated by NIST (first column), which are related to the hardness of performing a key search attack on three different variants of a block cipher, such as AES (with key lengths 128, 192 and 256, respectively). As far as quantum attacks are concerned, we claim that ISD with Grover (see above) will usually require more resources than a Grover search attack on AES for the circuit depths suggested by NIST (parameter MAXDEPTH). Thus, classical security bits are the bottleneck in our case, and as such we choose our parameters to provide 128, 192 and 256 bits of classical security for security levels 1, 3 and 5, respectively. For practical reasons, during the rest of the paper we will refer to these parameters as DAGS_1, DAGS_3 and DAGS_5, respectively.
For the above parameters, it is easy to observe that the value is always greater than or equal to 21 (it is in fact 25 for DAGS_1), which keeps us clear of FOPT. With respect to the BC attack, the complexity analysis provided by the authors results in a value of for DAGS_1 and more than for the other two sets. Finally, the running cost of ISD (using Peters’ script) is estimated at and respectively, as desired.
6 Performance analysis
DAGS operates on vectors of elements of the finite field , where q is a power of 2 as specified by the choice of parameters. Finite field elements are represented as bit strings using standard log/antilog tables (see for instance [29, Chapter 4, Section 5]), which are stored in memory.
For DAGS_1, the finite field is built using the polynomial and then extended to using the quadratic irreducible polynomial , where α is a primitive element of . In particular, using a well-known result on finite fields, we choose where γ is a primitive element of . This particular choice allows for more efficient arithmetic using a conversion matrix to switch between different field representations. Similarly, for DAGS_3 and DAGS_5, we build the base field using and the extension field is obtained via , where β is a primitive element of .
6.2 Randomness generation
The randomness used in our implementation is provided by the NIST API, which uses AES as a PRNG; NIST chooses the seed in order to have a controlled environment for tests. We use this random generator to obtain our input message , after which we calculate and , where is an expansion function and is a compression function. In practice, we compute both using the KangarooTwelve function  from the Keccak family. To generate a low-weight error vector, we take part of as a seed . We again use KangarooTwelve to expand the seed into a string of length n, and then transform the latter into a fixed-weight string using a deterministic function.
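The seed-to-fixed-weight step can be sketched as follows. This is our own illustration, not the DAGS specification: the expander is a plain xorshift64 stream standing in for KangarooTwelve, and the sampling routine (including its unhandled modulo bias) is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in seed expander (NOT KangarooTwelve): a simple xorshift64
 * stream, used only to make the sketch self-contained. */
static uint64_t prng_state;
static uint8_t next_byte(void) {
    prng_state ^= prng_state << 13;
    prng_state ^= prng_state >> 7;
    prng_state ^= prng_state << 17;
    return (uint8_t)prng_state;
}

/* Deterministically turn a seed into a weight-w error vector of
 * length n: draw candidate positions from the stream and skip
 * repeats until exactly w distinct positions are set.
 * (Modulo bias is ignored in this illustration.) */
static void fixed_weight(uint64_t seed, uint8_t *e, int n, int w) {
    prng_state = seed ? seed : 1;      /* xorshift must not start at 0 */
    memset(e, 0, (size_t)n);
    int set = 0;
    while (set < w) {
        int pos = (((int)next_byte() << 8) | next_byte()) % n;
        if (!e[pos]) { e[pos] = 1; set++; }
    }
}
```

Because the routine is deterministic, the same seed always yields the same error vector, which is exactly the property the KEM needs for re-encryption during decapsulation.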
6.3 Efficient private key reconstruction and decoding
As mentioned in Section 3, our scheme uses a standard alternant decoder (step (2) of Algorithm 3), which requires the input to be a matrix in alternant form, i.e. for and . The first step consists of computing the syndrome of the received word, . Now, defining the whole alternant matrix as the private key would require storing stn elements of , leading to huge key sizes. It would be possible to store as the private key just the defining vectors and , and then compute the alternant form during decapsulation. Doing so would drastically reduce the private key size, but would also significantly slow down the decapsulation algorithm. We therefore implemented a hybrid approach: we use and to compute the vector during key generation and store as the private key, which still results in a compact size; we then complete the computation of the alternant form in the decapsulation algorithm. To avoid unnecessary overhead, we merge this computation with the syndrome computation. The process is detailed as follows.
Algorithm 4 (Alternant-Syndrome Computation).
Input: received word to be decoded.
1. Compute the vector .
2. Form the intermediate matrix . To do this:
   (a) Set the first row equal to .
   (b) Obtain row i, for , by multiplying the j-th element of row by , for .
3. Sum the elements in each row and output the resulting vector.
The implementation is in ANSI C, as requested for a generic reference implementation. For the measurements, we used an x64 Intel Core i5-5300U processor at 2.30 GHz, 16 GiB of RAM and GCC version 6.3.0 20170516 without any optimization, running on Debian 9.2.
We start by considering space requirements. We recall the flow between two parties and in a generic key exchange protocol derived from a KEM:
When instantiated with DAGS, the public key is given by the generator matrix G. The non-identity block is and is dyadic of order s, thus requires only elements of the base field for storage. The private key is simply the pair , consisting of elements of . Finally, the ciphertext is the pair , that is, a q-ary vector of length n plus 256 bits. This leads to the measurements (in bytes) given in Table 5 and Table 6.
[Table 5: Parameter set | Public key | Private key | Ciphertext]
[Table 6: Message flow | Transmitted message | Size]
We now move on to time measurements. On our x64 architecture, we read the time-stamp counter by calling the “rdtsc” instruction before and after each function, which gives the number of cycles that function used. Table 7 gives the results of our measurements, reported as the mean over 50 runs of the code.
We thought it useful to provide a comparison with other recently proposed code-based KEMs (and in particular, NIST submissions). In Table 8, we present data for Classic McEliece, BIKE and BIG QUAKE with regard to memory requirements, for the highest security level (256 bits classical). On the other hand, we did not deem it necessary to provide a comparison of implementation timings, as reference implementations are designed for clarity rather than performance.
It is easy to see that the public key is much smaller than that of Classic McEliece and BIG QUAKE, and similar to that of BIKE. With regard to the latter, note that, for the same security level, the total communication bandwidth is of the same order of magnitude. This is because DAGS uses much shorter codes, and as a consequence the ciphertext is considerably smaller than a BIKE ciphertext. Moreover, for the purposes of a fair comparison, we remark that BIKE uses ephemeral keys, has a non-negligible decoding failure rate and only claims IND-CPA security – all factors that can restrict its use in various applications.
[Table 8: Parameter set | Public key | Private key | Ciphertext]
In this paper, we presented DAGS, a key encapsulation mechanism based on quasi-dyadic generalized Srivastava codes. We proved that DAGS is IND-CCA secure in the random oracle model, and in the quantum random oracle model. Thanks to this feature, it is possible to employ DAGS not only as a key exchange protocol (for which IND-CPA would be a sufficient requirement), but also in other contexts such as hybrid encryption, where IND-CCA is of paramount importance.
In terms of performance, DAGS compares well with other code-based protocols, as shown by Table 8 and the related discussion (above). Another advantage of our proposal is that it does not involve any decoding error. This is particularly favorable in a comparison with some lattice-based schemes like ,  and , as well as BIKE. The absence of decoding errors allows for a simpler formulation and better security bounds in the IND-CCA security proof.
Unlike traditional code-based protocols, DAGS features small sizes for all components, that is, ciphertexts, private keys (thanks to our improved computation idea) and public keys. All the objects involved in the computations are vectors of finite field elements, which in turn are represented as binary strings; thus computations are fast. The cost of computing the hash functions is minimized thanks to the parameter choice that ensures the input is only 256 bits. As a result, we expect our scheme to be implemented efficiently on multiple platforms.
The current reference code for the scheme is available at the repository https://git.dags-project.org/dags/dags. Our team is currently working to complete various implementations that can better showcase the potential of DAGS in terms of performance. These include code prepared with x86 assembly instructions (AVX) as well as a hardware implementation (FPGA). A hint at the effectiveness of DAGS can be obtained by looking at the performance of the scheme presented in , which also features an implementation for embedded devices. In particular, we expect DAGS to perform especially well in hardware, due to the nature of the computations of the McEliece framework.
Finally, we would like to highlight that a DAGS-based key exchange features an “asymmetric” structure, where the bandwidth cost and computational effort of the two parties are considerably different. In particular, in the flow described in (6.1), the party benefits from a much smaller message and faster computation (the encapsulation operation), whereas has to perform a key generation and a decapsulation (which includes a run of the decoding algorithm), and to transmit a larger message (the public matrix). This is suitable for traditional client-server applications, where the server side is usually expected to respond to a large number of requests and thus benefits from a lighter computational load. On the other hand, it is easy to imagine an instantiation with reversed roles, which could be suitable for example in Internet-of-Things (IoT) applications, where it would be beneficial to lessen the burden on the client side, due to its typical processing, memory and energy constraints. All in all, our scheme offers great flexibility in key exchange applications, which is not the case for traditional key exchange protocols like Diffie–Hellman.
In light of all these aspects, we believe that DAGS is a promising candidate for post-quantum cryptography standardization as a key encapsulation mechanism.
A Note on the choice of ω
As discussed in Section 6.3, our scheme uses a standard alternant decoder. After computing the syndrome of the word to be decoded, the next step is to recover the error locator polynomial , by means of the Euclidean algorithm for polynomial division; the algorithm then proceeds by finding the roots of σ. There is a one-to-one correspondence between these roots and the error positions: in fact, there is an error in position i if and only if .
Of course, if one of the ’s is equal to 0, it is not possible to find the corresponding root, and hence to detect the error.
Now, since the error vector is generated at random, we can assume the probability of having an error in position i to be around ; since the codes give the best performance when mst is close to , we can estimate this probability as , which is reasonably low for any non-trivial choice of m. However, the code is still not fully decodable, and we now explain how to adapt the key generation algorithm to ensure that all the ’s are non-zero.
As part of the key generation algorithm we assign to each the value ; hence it is enough to restrict the possible choices for ω to the set . This considerably restricts the possible choices for ω, but ensures that the decoding algorithm works properly.
A. Al Jabri, A statistical decoding algorithm for general linear block codes, Cryptography and Coding, Lecture Notes in Comput. Sci. 2260, Springer, Berlin (2001), 1–8. doi: 10.1007/3-540-45325-3_1.
E. Alkim, L. Ducas, T. Pöppelmann and P. Schwabe, Post-quantum key exchange - a new hope, Cryptology ePrint Archive Report 2015/1092 (2015), http://eprint.iacr.org/2015/1092.
M. Baldi, F. Chiaraluce, R. Garello and F. Mininni, Quasi-cyclic low-density parity-check codes in the McEliece cryptosystem, IEEE International Conference on Communications—ICC’07, IEEE Press, Piscataway (2007), 951–956. doi: 10.1109/ICC.2007.161.
S. Barg, Some new NP-complete coding problems (in Russian), Problemy Peredachi Informatsii 30 (1994), no. 3, 23–28.
A. Barg, Complexity issues in coding theory, Handbook of Coding Theory. Vol. 1. Part 1: Algebraic Coding, Elsevier, Amsterdam (1998), 649–754.
P. S. L. M. Barreto, S. Gueron, T. Gueneysu, R. Misoczki, E. Persichetti, N. Sendrier and J.-P. Tillich, CAKE: Code-based algorithm for key encapsulation, Cryptography and Coding—IMACC 2017, Springer, Cham (2017), 207–226. doi: 10.1007/978-3-319-71045-7_11.
P. S. L. M. Barreto, R. Lindner and R. Misoczki, Monoidic codes in cryptography, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 7071, Springer, Heidelberg (2011), 179–199. doi: 10.1007/978-3-642-25405-5_12.
T. P. Berger, P.-L. Cayrel, P. Gaborit and A. Otmani, Reducing key length of the McEliece cryptosystem, Progress in Cryptology—AFRICACRYPT 2009, Lecture Notes in Comput. Sci. 5580, Springer, Berlin (2009), 77–97. doi: 10.1007/978-3-642-02384-2_6.
E. R. Berlekamp, R. J. McEliece and H. C. A. van Tilborg, On the inherent intractability of certain coding problems, IEEE Trans. Inform. Theory IT-24 (1978), no. 3, 384–386. doi: 10.1109/TIT.1978.1055873.
D. J. Bernstein, T. Chou and P. Schwabe, McBits: Fast constant-time code-based cryptography, Cryptographic Hardware and Embedded Systems—CHES 2013, Lecture Notes in Comput. Sci. 8086, Springer, Berlin (2013), 250–272. doi: 10.1007/978-3-642-40349-1_15.
B. Biswas and N. Sendrier, McEliece cryptosystem implementation: Theory and practice, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 5299, Springer, Berlin (2008), 47–62. doi: 10.1007/978-3-540-88403-3_4.
J. Bos, C. Costello, L. Ducas, I. Mironov, M. Naehrig, V. Nikolaenko, A. Raghunathan and D. Stebila, Frodo: Take off the ring! Practical, quantum-secure key exchange from LWE, Cryptology ePrint Archive Report 2016/659 (2016), http://eprint.iacr.org/2016/659. doi: 10.1145/2976749.2978425.
J. W. Bos, C. Costello, M. Naehrig and D. Stebila, Post-quantum key exchange for the TLS protocol from the ring learning with errors problem, IEEE Symposium on Security and Privacy, IEEE Press, Piscataway (2015), 553–570. doi: 10.1109/SP.2015.40.
P.-L. Cayrel, G. Hoffmann and E. Persichetti, Efficient implementation of a CCA2-secure variant of McEliece using generalized Srivastava codes, Public Key Cryptography—PKC 2012, Lecture Notes in Comput. Sci. 7293, Springer, Heidelberg (2012), 138–155. doi: 10.1007/978-3-642-30057-8_9.
N. T. Courtois, M. Finiasz and N. Sendrier, How to achieve a McEliece-based digital signature scheme, Advances in Cryptology—ASIACRYPT 2001, Lecture Notes in Comput. Sci. 2248, Springer, Berlin (2001), 157–174. doi: 10.1007/3-540-45682-1_10.
R. Cramer and V. Shoup, Design and analysis of practical public-key encryption schemes secure against adaptive chosen ciphertext attack, SIAM J. Comput. 33 (2003), no. 1, 167–226. doi: 10.1137/S0097539702403773.
J.-C. Deneuville, P. Gaborit and G. Zémor, Ouroboros: A simple, secure and efficient key exchange protocol based on coding theory, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 10346, Springer, Cham (2017), 18–34. doi: 10.1007/978-3-319-59879-6_2.
J.-C. Faugère, V. Gauthier-Umaña, A. Otmani, L. Perret and J.-P. Tillich, A distinguisher for high-rate McEliece cryptosystems, IEEE Trans. Inform. Theory 59 (2013), no. 10, 6830–6844. doi: 10.1109/TIT.2013.2272036.
J.-C. Faugère, A. Otmani, L. Perret, F. de Portzamparc and J.-P. Tillich, Structural cryptanalysis of McEliece schemes with compact keys, Des. Codes Cryptogr. 79 (2016), no. 1, 87–112. doi: 10.1007/s10623-015-0036-z.
J.-C. Faugère, A. Otmani, L. Perret and J.-P. Tillich, Algebraic cryptanalysis of McEliece variants with compact keys, Advances in Cryptology—EUROCRYPT 2010, Lecture Notes in Comput. Sci. 6110, Springer, Berlin (2010), 279–298. doi: 10.1007/978-3-642-13190-5_14.
J.-C. Faugère, A. Otmani, L. Perret and J.-P. Tillich, Algebraic cryptanalysis of McEliece variants with compact keys - towards a complexity analysis, Proceedings of the 2nd International Conference on Symbolic Computation and Cryptography—SCC’10, Laboratoire d’Informatique de Paris 6, Paris (2010), 45–55.
Q. Guo, T. Johansson and P. Stankovski, A key recovery attack on MDPC with CCA security using decoding errors, Advances in Cryptology—ASIACRYPT 2016. Part I, Lecture Notes in Comput. Sci. 10031, Springer, Berlin (2016), 789–815. doi: 10.1007/978-3-662-53887-6_29.
D. Hofheinz, K. Hövelmanns and E. Kiltz, A modular analysis of the Fujisaki–Okamoto transformation, Theory of Cryptography. Part I, Lecture Notes in Comput. Sci. 10677, Springer, Cham (2017), 341–371. doi: 10.1007/978-3-319-70500-2_12.
G. Kachigar and J.-P. Tillich, Quantum information set decoding algorithms, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 10346, Springer, Cham (2017), 69–89. doi: 10.1007/978-3-319-59879-6_5.
F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. I, North-Holland Math. Libr. 16, North-Holland, Amsterdam, 1977.
R. J. McEliece, A public-key cryptosystem based on algebraic coding theory, Deep Space Netw. Prog. Rep. 44 (1978), 114–116.
R. Misoczki and P. S. L. M. Barreto, Compact McEliece keys from Goppa codes, Selected Areas in Cryptography, Springer, Berlin (2009), 376–392. doi: 10.1007/978-3-642-05445-7_24.
R. Misoczki, J.-P. Tillich, N. Sendrier and P. S. L. M. Barreto, MDPC-McEliece: New McEliece variants from moderate density parity-check codes, International Symposium on Information Theory—ISIT 2013, IEEE Press, Piscataway (2013), 2069–2073. doi: 10.1109/ISIT.2013.6620590.
R. Niebuhr, Statistical decoding of codes over , Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 7071, Springer, Heidelberg (2011), 217–227. doi: 10.1007/978-3-642-25405-5_14.
R. Niebuhr, E. Persichetti, P.-L. Cayrel, S. Bulygin and J. Buchmann, On lower bounds for information set decoding over and on the effect of partial knowledge, Int. J. Inf. Coding Theory 4 (2017), no. 1, 47–78.
R. Nojima, H. Imai, K. Kobara and K. Morozov, Semantic security for the McEliece cryptosystem without random oracles, Des. Codes Cryptogr. 49 (2008), no. 1–3, 289–305. doi: 10.1007/s10623-008-9175-9.
E. Persichetti, Secure and anonymous hybrid encryption from coding theory, Post-Quantum Cryptography—PQCrypto 2013, Springer, Berlin (2013), 174–187. doi: 10.1007/978-3-642-38616-9_12.
C. Peters, Information-set decoding for linear codes over , Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 6061, Springer, Berlin (2010), 81–94. doi: 10.1007/978-3-642-12929-2_7.
P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput. 26 (1997), no. 5, 1484–1509. doi: 10.1137/S0097539795293172.
F. Strenzke, A timing attack against the secret permutation in the McEliece PKC, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 6061, Springer, Berlin (2010), 95–107. doi: 10.1007/978-3-642-12929-2_8.
F. Strenzke, E. Tews, H. G. Molter, R. Overbeck and A. Shoufan, Side channels in the McEliece PKC, Post-Quantum Cryptography, Lecture Notes in Comput. Sci. 5299, Springer, Berlin (2008), 216–229. doi: 10.1007/978-3-540-88403-3_15.
© 2018 Walter de Gruyter GmbH, Berlin/Boston
This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.