Reproducible families of codes and cryptographic applications

Structured linear block codes such as cyclic, quasi-cyclic and quasi-dyadic codes have gained an increasing role in recent years both in the context of error control and in that of code-based cryptography. Some well known families of structured linear block codes have been separately and intensively studied, without searching for possible bridges between them. In this article, we start from well known examples of this type and generalize them into a wider class of codes that we call $\mathcal{F}$-reproducible codes. Some families of $\mathcal{F}$-reproducible codes have the property that they can be entirely generated from a small number of signature vectors, and consequently admit matrices that can be described in a very compact way. We denote these codes as compactly reproducible codes and show that they encompass known families of compactly describable codes such as quasi-cyclic and quasi-dyadic codes. We then consider some cryptographic applications of codes of this type and show that their use can be advantageous for hindering some current attacks against cryptosystems relying on structured codes. This suggests that the general framework we introduce may enable future developments of code-based cryptography.


Introduction
Defining linear block codes that possess a certain inner structure and verify some regularity properties is a natural process in coding theory. Arguably, the most relevant example is represented by the class of cyclic codes, which includes several families of codes that proved to be important throughout the history of communications, such as BCH and Hamming codes, as well as the binary Golay codes, Reed-Solomon codes, and many others. This class is defined by the property of having codewords that are invariant under the action of a specific permutation, namely the cyclic (circular) shift, which consists of cyclically rotating a vector by one position to the right (equivalently, to the left). Other examples which are well known in the literature include constacyclic codes, negacyclic codes, quasi-cyclic codes, and many others.
In code-based cryptography, the public key usually consists precisely of a generator or parity-check matrix of a linear block code. With the code sizes used in this setting (typical code lengths are in the order of $10^3$ to $10^4$), describing a whole matrix results in a public key of several kilobytes, and this size increases quadratically in the code length. This has historically prevented the use of the original McEliece cryptosystem, which exploits random-looking public codes, in many applications. On the other hand, structured codes admit generator and parity-check matrices which can be entirely described by one or a few rows; this allows for a very important reduction in public key size, and it is arguably a fundamental step toward making code-based cryptography truly practical. Previous efforts to reduce key size were centered on quasi-cyclic algebraic codes [3] and have since been extended to codes of a different nature, namely the Low-Density Parity-Check (LDPC) codes [4] and their recent generalization known as Moderate-Density Parity-Check (MDPC) codes [5]. These codes are characterized by sparse parity-check matrices and admit matrices in quasi-cyclic form, formed by circulant square blocks. Due to their efficient decoding algorithms and the lack of additional algebraic structure that could lead to structural attacks, schemes based on Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) codes [6] and Quasi-Cyclic Moderate-Density Parity-Check (QC-MDPC) codes [5] are among the most promising solutions in this area.
The importance of code-based cryptography has risen dramatically in modern times due to the work of Shor [7], who showed how it will be possible to effectively break cryptography based on "classical" number theory problems by introducing polynomial-time algorithms for factoring large integers and computing discrete logarithms on a quantum computer. This calls for cryptographic primitives that rely on different hard problems, which will not be affected once quantum computers of an appropriate size become available. Code-based cryptography is one of the most important areas in this scenario, and ever since McEliece's seminal work in 1978 [2], it has shown no vulnerabilities against quantum attackers. Moreover, generic decoding attacks, which have exponential complexity, have improved only marginally over nearly 40 years of cryptanalysis. Together with lattice-based schemes, code-based cryptography is at the basis of many candidates for the post-quantum standardization call recently launched by NIST [8].
In this article, we provide a general framework for the definition of structured codes, which are of increasing interest in several McEliece and Niederreiter cryptosystem variants. First, we introduce the notion of $\mathcal{F}$-reproducible codes as a general framework for describing both structured and unstructured codes. Then, we introduce some special families of $\mathcal{F}$-reproducible codes, which we denote as compactly reproducible (CR) codes, which require a smaller-than-maximum number of degrees of freedom for the representation of each code belonging to the same family. This generalizes existing families of structured codes used in code-based cryptosystems. We also propose a framework for constructing $\mathcal{F}$-reproducible codes of any kind and present concrete families of non-trivial CR codes which have not appeared in the literature before. Our goal is to provide a generic framework to serve as a basis for future constructions, as indeed was the case in ref. [9], which references a preprint version of this work.
To highlight the importance of these codes in cryptography, we mention that among the 26 candidates that were admitted to the second round of NIST's standardization effort [10], 5 are based on structured random and pseudo-random codes, which are the focus of this article. In particular, BIKE and LEDAcrypt are two public-key encryption schemes based on, respectively, QC-MDPC and QC-LDPC codes, which naturally fit into the general framework we describe in this article. The same occurs for the system named HQC, in which part of the public key consists of a random QC code. Although we focus on the Hamming metric case, the framework we describe could also be applied to the generation of structured codes in the rank metric (with the proper modifications). ROLLO and RQC are two other candidates that could be encompassed by such a framework in the rank metric domain.
The article is organized as follows. In Section 2, we recall some basic concepts and introduce the notation we use throughout the article. In Section 3, we introduce $\mathcal{F}$-reproducible matrices, and we use them to define the new class of codes in Section 4. Section 5 is devoted to the study of their possible use in code-based cryptosystems and provides some practical constructions for this purpose. In Section 6, we draw some conclusions.

Preliminaries and notation
We denote with $\mathbb{F}_q$ the finite field with $q$ elements, where $q$ is a prime power. For two sets $X$ and $Y$, $X^Y$ denotes the set of all maps from $Y$ to $X$. For a set $S$ we then denote by $2^S$ its power set, i.e., the set containing all possible subsets of $S$, exploiting the well known bijection with the set of functions from $S$ to $\{0, 1\}$. We use bold letters to denote vectors and matrices. Given a vector $\mathbf{a}$, we refer to its element in position $i$ as $a_i$. The size-$k$ identity matrix is denoted as $\mathbf{I}_k$, while the $k \times n$ null matrix is denoted as $\mathbf{0}_{k \times n}$. Finally, we use the term pseudo-ring to denote a structure that satisfies all the ring axioms, apart from the existence of the multiplicative identity. Such a structure is also typically known as rng.

Coding theory background
A linear code $\mathcal{C}$ is a $k$-dimensional subspace of the $n$-dimensional vector space over the finite field $\mathbb{F}_q$. The parameters $n$ (length) and $k$ (dimension) are positive integers with $k \le n$. The value $r = n - k$ is known as the codimension of the code. The minimum distance $d$ of a code $\mathcal{C}$ is defined as the minimum Hamming distance between any two different codewords of $\mathcal{C}$, or equivalently as the minimum Hamming weight over all non-zero codewords.
A linear code of length $n$, dimension $k$, and minimum distance $d$ is called an $[n, k, d]$-code. The error-correcting capability of a linear code is connected to its minimum distance, and in particular it corresponds to $\lfloor (d-1)/2 \rfloor$ under bounded distance decoding. When soft-decision decoding is used, a linear block code with distance $d$ may correct up to $d - 1$ symbol errors.
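To make these parameters concrete, the following minimal Python sketch computes the minimum distance of a small code by enumerating all non-zero codewords and derives the bounded-distance correction capability from it. The function name `minimum_distance` and the example matrix (a systematic generator matrix of the binary [7,4] Hamming code) are illustrative choices that do not come from the text, and the arithmetic modulo `q` is only valid for prime `q`.

```python
from itertools import product


def minimum_distance(G, q=2):
    """Brute-force minimum distance of the code generated by G over GF(q).

    Assumes q is prime, so that arithmetic modulo q is field arithmetic.
    Only practical for very small dimensions, since it visits q**k messages.
    """
    k = len(G)
    best = None
    for m in product(range(q), repeat=k):
        if not any(m):
            continue  # skip the all-zero message
        # codeword c = m * G, computed column by column
        c = [sum(mi * g for mi, g in zip(m, col)) % q for col in zip(*G)]
        weight = sum(1 for x in c if x)
        best = weight if best is None else min(best, weight)
    return best


# Systematic generator matrix of the binary [7,4] Hamming code (illustrative)
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
d = minimum_distance(G)
t = (d - 1) // 2  # errors correctable under bounded-distance decoding
```

For codes of cryptographic size this enumeration is infeasible, which is consistent with the hardness assumptions discussed later in the article.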
Definition 2.2. (Generator and parity-check matrices) Let $\mathcal{C}$ be a linear code over $\mathbb{F}_q$. We call generator matrix of $\mathcal{C}$ a $k \times n$ matrix $\mathbf{G}$ whose rows form a basis for the vector space defined by $\mathcal{C}$, i.e.:

$$\mathcal{C} = \{\mathbf{x}\mathbf{G} : \mathbf{x} \in \mathbb{F}_q^k\}.$$

For any matrix $\mathbf{H}$ and any vector $\mathbf{x}$, the vector $\mathbf{H}\mathbf{x}^T$ is called syndrome of $\mathbf{x}$. We then call parity-check matrix of $\mathcal{C}$ a full rank $r \times n$ matrix $\mathbf{H}$ such that every codeword belonging to $\mathcal{C}$ has syndrome $\mathbf{0}$ with respect to $\mathbf{H}$, i.e.,

$$\mathbf{H}\mathbf{x}^T = \mathbf{0} \quad \forall\, \mathbf{x} \in \mathcal{C}.$$

Note that the parity-check matrix of a code is also a generator matrix of the dual code $\mathcal{C}^\perp$, i.e., the linear code formed by all the words of $\mathbb{F}_q^n$ that are orthogonal to $\mathcal{C}$. It follows that for any generator matrix $\mathbf{G}$ and parity-check matrix $\mathbf{H}$ of a code, we have $\mathbf{H}\mathbf{G}^T = \mathbf{0}_{r \times k}$. Both matrices are required to have full rank. Moreover, neither matrix is unique: for instance, given a generator matrix $\mathbf{G}$ it is always possible to obtain another generator matrix for the same code by a linear transformation, that is, the left multiplication by an invertible $k \times k$ matrix $\mathbf{S}$, so that $\mathbf{G}' = \mathbf{S}\mathbf{G}$. This corresponds to a change of basis for the vector space. A similar property holds for the parity-check matrix. Finally, two generator matrices generate equivalent codes if one is obtained from the other by a permutation of columns. These two facts are at the basis of the McEliece cryptosystem.
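Joining the two properties above, we can write any generator matrix $\mathbf{G}$ in systematic form as $\mathbf{G} = (\mathbf{I}_k \mid \mathbf{M})$, where $\mathbf{M}$ is a $k \times r$ matrix.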
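As an illustration of the systematic form, the following Python sketch row-reduces a binary generator matrix until its first $k$ columns form the identity, and then builds a matching parity-check matrix of the form $(\mathbf{M}^T \mid \mathbf{I}_r)$ (over the binary field the usual minus sign disappears). The function names and the example matrix are hypothetical, and the sketch assumes that the first $k$ columns are linearly independent; otherwise a column permutation would be needed first.

```python
def systematic_form(G):
    """Row-reduce a binary k x n matrix so that its first k columns form I_k.

    Assumes the first k columns of G are linearly independent.
    """
    G = [row[:] for row in G]
    k = len(G)
    for col in range(k):
        # find a pivot row (at or below the diagonal) with a 1 in this column
        pivot = next(r for r in range(col, k) if G[r][col] == 1)
        G[col], G[pivot] = G[pivot], G[col]
        # clear this column in every other row
        for r in range(k):
            if r != col and G[r][col] == 1:
                G[r] = [x ^ y for x, y in zip(G[r], G[col])]
    return G


def parity_check_from_systematic(Gs):
    """Given Gs = (I_k | M) over GF(2), return H = (M^T | I_r)."""
    k, n = len(Gs), len(Gs[0])
    r = n - k
    M = [row[k:] for row in Gs]
    return [[M[i][j] for i in range(k)] + [int(j == c) for c in range(r)]
            for j in range(r)]


def mul_transpose(H, G):
    """Compute H * G^T over GF(2)."""
    return [[sum(h * g for h, g in zip(hr, gr)) % 2 for gr in G] for hr in H]


# A scrambled generator matrix of the binary [7,4] Hamming code (illustrative)
G = [
    [1, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 1],
]
Gs = systematic_form(G)
H = parity_check_from_systematic(Gs)
check = mul_transpose(H, G)  # all-zero r x k matrix if H is a parity-check matrix
```

Since `G` and `Gs` generate the same code, the product of `H` with the transpose of either matrix is the null matrix.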

The McEliece cryptosystem
The McEliece public-key encryption scheme [2] was introduced by R.J. McEliece in 1978. The original scheme uses binary Goppa codes, with which it remains unbroken (with a proper choice of parameters), but the scheme can be used with any class of codes for which an efficient decoding algorithm is known.

Key generation
Let $\mathbf{G}$ be a generator matrix of a linear $[n, k, d]$-code over $\mathbb{F}_q$ with an efficient decoding algorithm which can correct up to $t = \lfloor (d-1)/2 \rfloor$ errors under bounded-distance decoding. Let $\mathbf{S}$ be an invertible $k \times k$ matrix and $\mathbf{P}$ be a random $n \times n$ permutation matrix over $\mathbb{F}_q$. The private key is $(\mathbf{S}, \mathbf{G}, \mathbf{P})$ and the public key is $\mathbf{G}' \coloneqq \mathbf{S}\mathbf{G}\mathbf{P}$.

Encryption
To be able to encrypt a plaintext, it has to be represented as a vector $\mathbf{m}$ of length $k$ over $\mathbb{F}_q$. The encryption algorithm chooses a random error vector $\mathbf{e}$ of weight $t$ in $\mathbb{F}_q^n$ and computes the ciphertext $\mathbf{c} = \mathbf{m}\mathbf{G}' + \mathbf{e}$.

Decryption
The decryption algorithm first computes $\hat{\mathbf{c}} = \mathbf{c}\mathbf{P}^{-1} = \mathbf{m}\mathbf{S}\mathbf{G} + \mathbf{e}\mathbf{P}^{-1}$. As $\mathbf{P}$ is a permutation matrix, $\mathbf{e}\mathbf{P}^{-1}$ has the same weight as $\mathbf{e}$. Therefore, the efficient decoding algorithm of the private code can be used to correct the errors in $\hat{\mathbf{c}}$ and obtain $\hat{\mathbf{m}} = \mathbf{m}\mathbf{S}$. Finally, the plaintext is retrieved as $\mathbf{m} = \hat{\mathbf{m}}\mathbf{S}^{-1}$. In successive papers, the original McEliece cryptosystem was refined and tweaked many times; for example, it is now common practice to replace the scrambling method given by $\mathbf{S}$ and $\mathbf{P}$ with the computation of the systematic form, i.e., $\mathbf{G}'$ is the systematic form of $\mathbf{G}$. This is possible when the McEliece cryptosystem is embedded into a larger framework to convert it into an IND-CCA2¹ secure Public Key Encryption (PKE) scheme or Key Encapsulation Mechanism (KEM), and has the additional advantage (beyond the obvious simpler formulation) of a smaller public key (since only the non-identity submatrix needs to be stored).
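The three algorithms above can be sketched end to end on a toy instance. The following Python code is a minimal, insecure illustration that uses the binary [7,4] Hamming code ($t = 1$) with syndrome decoding in place of Goppa codes; the function names, the fixed scrambling matrix `S`, and all parameters are assumptions made for the example and are not taken from the original scheme.

```python
import random


def mat_mul(A, B):
    """Matrix product over GF(2)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


def gf2_inverse(A):
    """Invert a square binary matrix by Gauss-Jordan elimination."""
    n = len(A)
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col])
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [x ^ y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


# Private code: [7,4,3] Hamming code in systematic form, correcting t = 1 error
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]


def keygen(rng):
    """Return the private pair (S, P) and the public matrix G' = S G P."""
    S = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1]]  # fixed, invertible
    perm = list(range(7))
    rng.shuffle(perm)
    P = [[int(perm[i] == j) for j in range(7)] for i in range(7)]
    return (S, P), mat_mul(mat_mul(S, G), P)


def encrypt(m, G_pub, rng):
    """c = m G' + e, with e a random vector of weight t = 1."""
    e = [0] * 7
    e[rng.randrange(7)] = 1
    c = mat_mul([m], G_pub)[0]
    return [x ^ y for x, y in zip(c, e)]


def decrypt(c, private):
    S, P = private
    P_inv = [list(col) for col in zip(*P)]  # inverse of a permutation matrix
    c_hat = mat_mul([c], P_inv)[0]          # equals m S G + e P^{-1}
    syndrome = [sum(h * x for h, x in zip(row, c_hat)) % 2 for row in H]
    if any(syndrome):
        # the syndrome of a single error equals the column of H at its position
        pos = [list(col) for col in zip(*H)].index(syndrome)
        c_hat[pos] ^= 1
    m_hat = c_hat[:4]  # G is systematic, so the first k symbols equal m S
    return mat_mul([m_hat], gf2_inverse(S))[0]
```

A call such as `decrypt(encrypt(m, G_pub, rng), private)` with `rng = random.Random(0)` and `private, G_pub = keygen(rng)` returns the original 4-bit message `m`.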
The (one-way) security of McEliece is based on the following hard problem.

Problem 2.3
(Syndrome decoding problem) Given an $r \times n$ full-rank matrix $\mathbf{H}$ and a vector $\mathbf{s}$, both with entries in $\mathbb{F}_q$, and a non-negative integer $t$; find a vector $\mathbf{e} \in \mathbb{F}_q^n$ of weight $t$ such that $\mathbf{H}\mathbf{e}^T = \mathbf{s}^T$.
The Syndrome Decoding Problem (SDP) is a well known problem in complexity theory, and it has been shown to be NP-complete [11]. Note that, since the McEliece cryptosystem uses an $[n, k, d]$ code, the number of error vectors of weight $t$ is $\binom{n}{t}(q-1)^t$, while the number of possible syndromes is $q^r$. Therefore, $\binom{n}{t}(q-1)^t \le q^r$ is a necessary condition for the existence of at most one solution to the problem, i.e., for the decoding process to have a unique solution.

¹ The term IND-CCA2 stands for Indistinguishability under Adaptively Chosen Ciphertext Attack, which is the highest security notion for a PKE and KEM since it considers the strongest adversarial model.
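The counting argument can be checked numerically. The sketch below, with a hypothetical helper name, compares the number of weight-$t$ error vectors, taken here as $\binom{n}{t}(q-1)^t$, with the number $q^r$ of syndromes; it only tests the necessary condition and says nothing about whether a given code actually achieves unique decoding.

```python
from math import comb


def unique_decoding_possible(n, k, t, q=2):
    """Necessary counting condition: weight-t errors must not outnumber syndromes."""
    r = n - k
    return comb(n, t) * (q - 1) ** t <= q ** r


hamming_ok = unique_decoding_possible(7, 4, 1)        # 7 errors vs 8 syndromes
hamming_too_many = unique_decoding_possible(7, 4, 2)  # 21 errors vs 8 syndromes
golay_ok = unique_decoding_possible(23, 12, 3)        # 1771 errors vs 2048 syndromes
```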

Sparse-matrix codes
One of the most delicate points about the McEliece cryptosystem is that, in order for the security to reduce to the SDP, it is assumed that the matrix used as the public key is indistinguishable from a uniformly random matrix of the same size. This is a plausible assumption, which however has been shown to be false in several cases. For many variants of McEliece (e.g., ref. [12]), in fact, this opened up avenues of attack which simply ruled out the variant altogether. Even the long-standing binary Goppa codes have been shown to be distinguishable from random codes [13] when the code rate is chosen carelessly (too high). This is arguably one of the main reasons that pushed researchers away from algebraic codes and toward codes of a different nature.
LDPC codes are defined by parity-check matrices whose main requirement is to be sparse, with a very low row and column weight. These codes are easy to generate and moreover admit a variety of choices for the decoding algorithm, inspired by the Bit Flipping (BF) decoder of Gallager [14], which is very efficient in practice. For these reasons, this class of codes is a natural candidate for the McEliece cryptosystem. A first instantiation was studied in ref. [4], where a private LDPC matrix was considered, along with a linearly transformed version of the same matrix used as the public key. As highlighted in ref. [4], security of the private LDPC code is not preserved unless the public matrix is dense. Thus, in such a framework, the private LDPC code $\mathcal{C}$ is represented through its sparse parity-check matrix $\mathbf{H}$, while the public key corresponds to a dense generator matrix $\mathbf{G}$ for $\mathcal{C}$. It is important to note that, from the knowledge of $\mathbf{G}$, the opponent can compute several parity-check matrices $\mathbf{H}'$ for $\mathcal{C}$, but they will not lead to an efficient decoding, unless they are sparse. As explained in Section 2.2, typically having $\mathbf{G}$ in systematic form is enough to guarantee such a property. Note also that, due to their probabilistic nature, decoding algorithms for LDPC codes are characterized by a non-trivial Decoding Failure Rate (DFR). This means that, in the case of a decoding failure, Bob must ask Alice for a retransmission of the plaintext, encrypted with a different error vector. In order to avoid frequent retransmissions, which would obviously increase the latency of the system, the DFR must be kept sufficiently low; typically, values are in the range of $10^{-6}$ to $10^{-9}$. As we will discuss later, this fact represents a crucial difference with respect to the case of algebraic codes, since it leads to a new family of attacks aimed at recovering the secret key by observing Bob's reactions.
This also has implications on the security model against a Chosen Ciphertext Attack (CCA) for these systems [15]. Therefore, finding reliable models for their DFR is necessary to ensure that its value is negligible for those instances designed to achieve indistinguishability under chosen ciphertext attack (IND-CCA) [16].

Main attacks
We briefly recall the two main types of attacks that can be mounted against the McEliece cryptosystem and its variants when using sparse-matrix codes.

Decoding attacks
Decoding attacks are aimed at recovering the plaintext from the ciphertext by performing decoding through the public code. In fact, being unable to retrieve the private code representation that enables efficient decoding, an attacker can still try to perform decoding through the public code, which looks like a general random code.
At the current state of the art, the best procedure for this task is the Information-Set Decoding (ISD) algorithm, which was first introduced by Prange in 1962 [17] and has received many improvements over the years [18][19][20][21]. However, ISD and all its variants are characterized by an exponential complexity: the search for a weight-$w$ codeword has asymptotic complexity equal to $2^{\alpha w}$, where the value of the constant $\alpha$ depends on the code parameters and on the particular algorithm we are analyzing. Even in a quantum setting, ISD algorithms are still characterized by exponential complexity: indeed, the only known application of a quantum algorithm to an ISD algorithm, which consists in using Grover's algorithm [22] to speed up the search, leads to a reduction in the complexity, with respect to the classical case, which cannot be larger than half the exponent $\alpha$ [23].
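To give an idea of how ISD operates, the following Python sketch implements the basic idea of Prange's algorithm in its syndrome form over the binary field: it repeatedly picks a random column permutation, row-reduces the permuted parity-check matrix, and succeeds when the transformed syndrome has exactly the target weight. This is a simplified illustration under our own naming and parameter choices (including the iteration cap `max_iters`), not an optimized attack, and it is run here on a toy [7,4] Hamming code.

```python
import random


def prange_isd(H, s, t, rng, max_iters=10000):
    """Search for a binary vector e of weight t with H e^T = s^T (Prange-style)."""
    r, n = len(H), len(H[0])
    for _ in range(max_iters):
        cols = list(range(n))
        rng.shuffle(cols)
        # permuted parity-check matrix, augmented with the syndrome as last column
        M = [[H[i][c] for c in cols] + [s[i]] for i in range(r)]
        invertible = True
        for col in range(r):
            pivot = next((i for i in range(col, r) if M[i][col]), None)
            if pivot is None:
                invertible = False  # first r permuted columns are singular: retry
                break
            M[col], M[pivot] = M[pivot], M[col]
            for i in range(r):
                if i != col and M[i][col]:
                    M[i] = [x ^ y for x, y in zip(M[i], M[col])]
        if not invertible:
            continue
        s_red = [row[n] for row in M]
        if sum(s_red) == t:
            # the error is supported on the first r permuted positions
            e = [0] * n
            for i in range(r):
                if s_red[i]:
                    e[cols[i]] = 1
            return e
    return None


H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
e_true = [0, 0, 0, 0, 0, 1, 0]
s = [sum(h * x for h, x in zip(row, e_true)) % 2 for row in H]
e_found = prange_isd(H, s, 1, random.Random(1))
```

Each iteration succeeds only if the error positions avoid the chosen information set, which is what makes the expected number of iterations grow exponentially for codes of cryptographic size.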

Key-recovery attacks
When LDPC codes are used, key recovery attacks boil down to recovering low-weight codewords from the dual of the public code, which is again a decoding problem. Let us denote by $\mathcal{C}^\perp$ the dual code of $\mathcal{C}$, having generator matrix $\mathbf{H}$. Since the rows of $\mathbf{H}$ are sparse, and of maximum weight $w \ll n$, they are minimum-weight codewords in $\mathcal{C}^\perp$ with overwhelming probability, and so can be searched with a generic algorithm for finding low-weight words, for which ISD algorithms can be used as well.
Since the difficulty of such a task increases with the weight of the searched codewords, it makes sense to relax the notion of "low-density": the authors in ref. [5] introduce the notion of "moderate-density" by increasing the allowed row weight in the parity-check matrix from $O(\log(n))$ to $O(\sqrt{n})$, thus defining moderate-density parity-check (MDPC) codes. It is still possible to decode MDPC codes with the previously mentioned algorithms; the error-correction capacity obviously gets worse, but the gain in security makes this tradeoff worthwhile. In the end, the adoption of LDPC and MDPC codes in modern variants of the McEliece cryptosystem does not reduce the security against key recovery attacks, since attacks deriving from the structure of the secret code can be easily avoided by fixing the minimum weight of the rows of $\mathbf{H}$.

Structured sparse-matrix codes
Using generic LDPC and MDPC codes without any structure in the McEliece cryptosystem is not a practical choice, as pointed out in ref. [4]. This is because the need to avoid sparse public matrices makes the resulting public key sizes significantly larger than the ones we can obtain with other families of codes, like Goppa codes. In fact, even if the private sparse parity-check matrix can be compactly represented through the positions of its non-null entries (and so, a row with Hamming weight equal to $w$ can be stored just with $w \log_2 n \log_2 q$ bits), applying this technique to the public key is not possible, since a sparse $\mathbf{G}$ might compromise the security of the system. One way to avoid this issue is to add some structure to the code family. This idea was first introduced by considering Quasi-Cyclic (QC) codes [3] and was then extended to LDPC codes [24] and algebraic codes [25]. In all cases, the authors propose to use QC codes to reduce the public key size. A QC code can be simply seen as a code which admits parity-check and generator matrices made of circulant blocks. A circulant matrix is a matrix in which every row is obtained as the cyclic shift of the previous one; an example of a circulant matrix is

$$\mathbf{A} = \begin{bmatrix} a_0 & a_1 & \cdots & a_{p-1} \\ a_{p-1} & a_0 & \cdots & a_{p-2} \\ \vdots & \vdots & \ddots & \vdots \\ a_1 & a_2 & \cdots & a_0 \end{bmatrix}.$$

Any circulant matrix is fully described by one of its rows, conventionally the first one. This means that, in the McEliece cryptosystem, we can describe the public key completely using just the first row of each one of its circulant blocks; it is clear that this results in a significant reduction in the public key size with respect to instances using non-structured public matrices. However, this additional structure presents some drawbacks, since it exposes the system to structural weaknesses. In particular, the QC structure combined with the algebraic structure of the underlying codes provides a lot of information to the attacker and opens up the possibility of structural attacks aimed at recovering the private code.
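The compactness of circulant blocks is easy to see in code. The following Python sketch expands a first row into the full circulant matrix and compares the number of bits needed to store a binary matrix made of circulant blocks with and without exploiting the structure. The function names and the block layout parameters `k0` and `n0` (number of block rows and block columns) are illustrative assumptions.

```python
def circulant(first_row):
    """Expand a first row into the p x p circulant matrix.

    Row i is the first row cyclically shifted to the right by i positions.
    """
    p = len(first_row)
    return [[first_row[(j - i) % p] for j in range(p)] for i in range(p)]


def key_size_bits(p, k0, n0):
    """Bits needed to store a binary matrix made of k0 x n0 circulant blocks.

    Returns (unstructured, quasi_cyclic): the full matrix versus only the
    first row of each block.
    """
    unstructured = (k0 * p) * (n0 * p)
    quasi_cyclic = k0 * n0 * p
    return unstructured, quasi_cyclic


A = circulant([1, 0, 1, 0, 0])
sizes = key_size_bits(p=5, k0=1, n0=2)
```

The ratio between the two sizes is exactly the circulant size `p`, which is why the saving becomes substantial for the block sizes used in practice.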
The most famous structural attack of this type is known as FOPT [26] and works by solving a multivariate algebraic system with Gröbner bases techniques together with the QC property, which greatly reduces the number of unknowns of the system. As a result, it seems very hard to provide secure schemes which involve QC algebraic codes (Goppa, GRS etc.), while still obtaining an effective key reduction: the recent NIST proposal BIG QUAKE [27] shows a reduction of about 1/4 in the key size compared to what would be obtained in a "classical" McEliece using unstructured binary Goppa codes.
Therefore, once again, it seems safer to deploy code-based schemes using sparse-matrix codes, since in this case there is no additional algebraic structure, and the QC property alone is not enough to provide a structural attack. However, some care is still necessary when using sparse-matrix codes. In particular, two main aspects have to be considered:
• ISD algorithms might obtain a speedup from the QC structure. This results in a complexity reduction for the relevant attacks. Such a speedup is achieved for both key recovery attacks and decoding attacks (following from the Decoding One Out of Many [DOOM] approach [28]). The attack complexity remains exponential in the key length, but compensating for the speedup requires an increase in the row weight of $\mathbf{H}$ and in the number of errors to be used during encryption, which in turn results in an increase in the key length.
• It has been recently shown that the probability of a decoding failure depends on the number of overlapping ones between the error vector and rows of $\mathbf{H}$ [29]. In addition, in a circulant matrix, all the rows are characterized by the same set of cyclic distances between set symbols (given two ones at positions $i$ and $j$, the corresponding cyclic distance is computed as $\min\{(i - j) \bmod p,\ (j - i) \bmod p\}$, with $p$ being the circulant size). Based on these considerations, it has been shown in ref. [29] that an adversary can mount a key recovery attack by impersonating Alice, producing many ciphertexts and requesting Bob to decrypt them. The adversary can then exploit Bob's reactions concerning decoding failures, which are of public knowledge, in order to gather information about the secret key structure. The set of all distances of the rows of $\mathbf{H}$ is called distance spectrum and can be used to reconstruct $\mathbf{H}$. This problem can be related to a graph problem, in which a row of $\mathbf{H}$ corresponds to a clique with maximum size. For a sparse QC matrix, such a graph is sparse as well, which gives a small number of cliques. This means that, once the distance spectrum is known, recovering the corresponding parity-check matrix is not a hard task in most cases.
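The notion of distance spectrum can be made concrete with a short Python sketch. It computes, for a sparse binary row, the multiset of cyclic distances between its set positions, taking the distance between positions $i$ and $j$ as the smaller of $(i-j) \bmod p$ and $(j-i) \bmod p$ (our reading of the definition), and shows that cyclic shifts of the row share the same spectrum. The function name and the example row are illustrative.

```python
def distance_spectrum(row):
    """Map each cyclic distance to the number of pairs of ones at that distance."""
    p = len(row)
    support = [i for i, x in enumerate(row) if x]
    spectrum = {}
    for a in range(len(support)):
        for b in range(a + 1, len(support)):
            diff = (support[b] - support[a]) % p
            d = min(diff, p - diff)  # cyclic distance between the two positions
            spectrum[d] = spectrum.get(d, 0) + 1
    return spectrum


# A length-11 row with ones at positions 0, 2 and 5 (illustrative)
row = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
shifted = row[-8:] + row[:-8]  # cyclic shift to the right by 8 positions
spec = distance_spectrum(row)
spec_shifted = distance_spectrum(shifted)
```

Because every row of a circulant block is a shift of the first one, all rows leak the same spectrum, which is the information exploited by the reaction attack described above.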
Currently, the countermeasures that have been devised against the aforementioned reaction attacks exploit the use of ephemeral keys [30,31], of special iterative decoders that allow theoretical modeling of their failure rate [32,33], or of particular families of codes that make the reconstruction of the secret key unfeasible [34]. However, all these solutions come with some price to pay: a new key pair must be generated for each encryption (in the first case) or the size of the public key must be increased (in the second and third cases).
As we will see in the rest of this article, the idea of using some structure to reduce the public key size can be strongly generalized. In particular, we will show that existing solutions are just very special cases of a wider framework, characterized by a large variety of options. This generalization comes with no increase in public key size, while on the other hand potentially allows us to avoid DOOM and/or reaction attacks, or at least to reduce their efficiency.

Reproducibility
We now introduce the main notions we use to provide a generalized approach to the design of structured codes.
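Definition 3.1. Let $\mathcal{F} = \{\sigma_i\}_{i=0}^{\ell-1}$ be a family of $\ell$ linear maps, with $\sigma_i : \mathbb{F}_q^n \to \mathbb{F}_q^n$ (thus, we can think of each $\sigma_i$ as a square matrix of size $n$ and values in $\mathbb{F}_q$). We say that a $k \times n$ matrix $\mathbf{A}$ is an $\mathcal{F}$-reproducible matrix if there exists an $m \times n$ matrix $\mathbf{a}$ such that

$$\mathbf{A} = \begin{bmatrix} \mathbf{a}\sigma_0 \\ \mathbf{a}\sigma_1 \\ \vdots \\ \mathbf{a}\sigma_{\ell-1} \end{bmatrix}, \qquad k = \ell m.$$

We call $m$ the reproducible order and $\mathbf{a}$ the signature set and write $\mathbf{A} = \mathcal{F}(\mathbf{a})$. We say that a code $\mathcal{C} \subseteq \mathbb{F}_q^n$ is an $\mathcal{F}$-reproducible code if it admits a generator matrix and/or a parity-check matrix which are $\mathcal{F}$-reproducible.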
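Let us consider an $\mathcal{F}$-reproducible code described by an $\mathcal{F}$-reproducible generator matrix $\mathbf{G} \in \mathbb{F}_q^{k \times n}$ such that $\mathbf{G} = \mathcal{F}(\mathbf{g})$, where $\mathbf{g}$ is the $m \times n$ signature set of $\mathbf{G}$. Then, for the fixed family $\mathcal{F}$ of linear maps, the code is completely represented through $\mathbf{g}$. The same reasoning applies to an $\mathcal{F}$-reproducible code described by an $\mathcal{F}$-reproducible parity-check matrix $\mathbf{H} \in \mathbb{F}_q^{r \times n}$ with signature set $\mathbf{h}$.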
Proposition 3.2. Any $[n, k, d]$-code over $\mathbb{F}_q$ is an $\mathcal{F}$-reproducible code for at least one choice of $\mathcal{F}$ and the corresponding signature set. Such a choice corresponds to $\ell = 1$, $m = k$, $\mathbf{g} = \mathbf{G}$, and $\mathcal{F} = \{\mathbf{I}_n\}$, where $\mathbf{I}_n$ is the $n \times n$ identity matrix. Equivalently, the code can be described through the parity-check matrix $\mathbf{H}$ considering $\ell = 1$, $m = r$, $\mathbf{h} = \mathbf{H}$, and $\mathcal{F} = \{\mathbf{I}_n\}$.

Once the family $\mathcal{F}$ is defined, an $\mathcal{F}$-reproducible matrix can be described just by its signature set. Consequently, when the family of maps is fixed and universally known, having an $\mathcal{F}$-reproducible generator matrix (or equivalently parity-check matrix) with $\ell > 1$ leads to a more compact representation of the code with respect to storing its full generator or parity-check matrix. This happens because $\mathcal{F}$ is universally known, and it does not need to be included in the code representation, thus the signature set alone is sufficient for representing the code.
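The construction of a reproducible matrix from its signature can be sketched in a few lines of Python. The sketch assumes that each map acts on the signature by right multiplication and that the resulting blocks are stacked vertically, which matches the trivial case above (a single identity map returns the signature itself). The names `reproduce` and `shift_power`, and the restriction to prime `q`, are assumptions made for the example.

```python
def mat_mul(A, B, q):
    """Matrix product modulo a prime q."""
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)]
            for row in A]


def reproduce(signature, maps, q):
    """Stack the blocks signature * sigma for every map sigma in the family."""
    rows = []
    for sigma in maps:
        rows.extend(mat_mul(signature, sigma, q))
    return rows


def shift_power(p, i):
    """Permutation matrix of the cyclic shift to the right by i positions."""
    return [[int((v + i) % p == z) for z in range(p)] for v in range(p)]


p, q = 5, 2
cyclic_family = [shift_power(p, i) for i in range(p)]

# One-row signature and the family of cyclic shifts give a circulant matrix
A = reproduce([[1, 0, 1, 0, 0]], cyclic_family, q)

# Trivial family {I_n}: the signature is the whole matrix (no compression)
G = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]
G_trivial = reproduce(G, [shift_power(p, 0)], q)
```

With the cyclic family, five bits describe a 5 x 5 matrix, whereas with the trivial family the signature is as large as the matrix itself.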
If we consider a single code, then it is always possible to find some family $\mathcal{F}$ according to which such a code has an $\mathcal{F}$-reproducible generator matrix (or equivalently parity-check matrix) with $\ell > 1$. This is detailed in the following two propositions.

Proof. Let $\mathbf{G} \in \mathbb{F}_q^{k \times n}$ be a valid generator matrix for the code $\mathcal{C}$. Let us consider the $i$th row $\mathbf{g}_i$ of $\mathbf{G}$ and define $\sigma_i$, for $i = 0, \ldots, k-1$, as the $n \times n$ matrix $\sigma_i \in \mathbb{F}_q^{n \times n}$ having its first row equal to $\mathbf{g}_i$, and all the other rows filled with arbitrary entries. Then, $\mathbf{G}$ is easily obtained as $\mathbf{G} = \mathcal{F}(\mathbf{a})$, with $\mathbf{a} = [1, 0, 0, \ldots, 0]$. The fact that $\mathcal{C}$ admits an $\mathcal{F}$-reproducible parity-check matrix with reproducible order $m = 1$ can be proved with a similar reasoning. □

From Proposition 3.4, we know that any single code is $\mathcal{F}$-reproducible for some family $\mathcal{F}$ yielding $\ell > 1$ and $m < k$ (considering the generator matrix) or $m < r$ (considering the parity-check matrix). However, if instead of a single code we consider a group of codes and aim at representing all of them as $\mathcal{F}$-reproducible codes for the same, universally known family of maps $\mathcal{F}$, then it is not always possible to find a solution with $\ell > 1$ and $m < k$ (considering the generator matrix) or $m < r$ (considering the parity-check matrix). The only trivial solutions that always exist are those of the type considered in Proposition 3.2, yielding $\ell = 1$ and $m = k$ (considering the generator matrix) or $m = r$ (considering the parity-check matrix), and thus not enabling more compact code representations than those corresponding to storing the full generator or parity-check matrix. We are instead interested in groups of codes that, besides these trivial solutions, also admit $\mathcal{F}$-reproducible generator and parity-check matrices for a fixed $\mathcal{F}$ with $\ell > 1$ and $m < k$ or $m < r$, as detailed in the next definition.
Definition 3.5. We say that a group of $[n, k, d]$-codes over $\mathbb{F}_q$ are Compactly Reproducible (CR) codes if, for a fixed $\mathcal{F}$ with $\ell > 1$, each of them admits at least one $\mathcal{F}$-reproducible generator matrix with $m < k$, or at least one $\mathcal{F}$-reproducible parity-check matrix with $m < r$, thus enabling a more compact code representation with respect to storing the full generator or parity-check matrix.
The condition for a code to be CR can be generalized, in order to take into account other structures that enable a compact representation.
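Definition 3.6. Let $\mathbf{A}_{i,j}$ be $\mathcal{F}_{i,j}$-reproducible matrices, each with its own dimensions, signature set $\mathbf{a}_{i,j}$, and family of linear functions $\mathcal{F}_{i,j}$. Let $\mathbf{A}$ be a matrix obtained using as building blocks the matrices $\mathbf{A}_{i,j}$; then, we say that $\mathbf{A}$ is $\mathcal{F}$-quasi-reproducible.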
Definition 3.7. Let us consider a group of linear codes over $\mathbb{F}_q$. If, for a fixed $\mathcal{F}$ with $\ell > 1$, any code $\mathcal{C}$ in such a group can be described by an $\mathcal{F}$-quasi-reproducible generator matrix $\mathbf{G} \in \mathbb{F}_q^{k \times n}$ such that $m < k$, and/or an $\mathcal{F}$-quasi-reproducible parity-check matrix $\mathbf{H} \in \mathbb{F}_q^{r \times n}$ such that $m < r$, then we say that $\mathcal{C}$ is a quasi-compactly reproducible (QCR) code.
It is clear that, in order to describe an $\mathcal{F}$-quasi-reproducible matrix, we just need the ensemble of the signature sets of its building blocks, together with the corresponding families of linear functions. Quasi-reproducibility generalizes the concept of reproducibility, since each reproducible code can be seen as a particular quasi-reproducible code, with a generator matrix described just by one signature set. A particular type of quasi-reproducible code is the one in which the blocks $\mathbf{A}_{i,j}$ are square matrices, defined by the same family $\mathcal{F}$.
We are now ready to introduce a very important notion regarding the set of $\mathcal{F}$-reproducible matrices obtained via a given family of transformations. Specifically, consider a family of linear functions $\mathcal{F}$, and let $\mathcal{M}_q^{\mathcal{F},m}$ be the set of all $\mathcal{F}$-reproducible matrices over $\mathbb{F}_q$ obtained via signatures of size $m \times p$, equipped with the usual operations of matrix sum and multiplication. Then the following results² hold.

Proof. Showing that $\mathcal{M}_q^{\mathcal{F},m}$ is an additive abelian group is quite straightforward. In fact, the signature of the sum of two matrices corresponds to the sum of the original signatures. Commutativity and associativity follow from the element-wise sum between two matrices. The identity is given by the null signature (i.e., the signature made of all zeros), while the inverse of a matrix with signature $\mathbf{a}$ is the matrix with signature $-\mathbf{a}$. □

² For simplicity we assume $\sigma_0 = \mathbf{I}_p$, but this is not necessary and the results hold even if $\mathcal{F}$ does not contain the identity function.
On the other hand, it is possible to show that the set, with respect to the multiplication, is a semigroup; in this case, the only requirements are closure and associativity. While associativity easily follows from the properties of the multiplication between two matrices, in order to guarantee closure, we must make an additional assumption.
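We first show that the commutativity condition, i.e., $\sigma_i\mathbf{B} = \mathbf{B}\sigma_i$ for every $\sigma_i \in \mathcal{F}$ and every $\mathbf{B} \in \mathcal{M}_q^{\mathcal{F},m}$, is sufficient. For what we discussed above, we only need to prove closure. Let $\mathbf{A}$ and $\mathbf{B}$ be two matrices of $\mathcal{M}_q^{\mathcal{F},m}$, with respective signatures $\mathbf{a}_0$, $\mathbf{b}_0$, that is,

$$\mathbf{A} = \begin{bmatrix} \mathbf{a}_0 \\ \mathbf{a}_0\sigma_1 \\ \vdots \\ \mathbf{a}_0\sigma_{\ell-1} \end{bmatrix}, \qquad \mathbf{B} = \begin{bmatrix} \mathbf{b}_0 \\ \mathbf{b}_0\sigma_1 \\ \vdots \\ \mathbf{b}_0\sigma_{\ell-1} \end{bmatrix}.$$

Multiplying these two matrices we get

$$\mathbf{C} = \mathbf{A}\mathbf{B} = \begin{bmatrix} \mathbf{c}_0 \\ \mathbf{c}_1 \\ \vdots \\ \mathbf{c}_{\ell-1} \end{bmatrix} = \begin{bmatrix} \mathbf{a}_0\mathbf{B} \\ \mathbf{a}_0\sigma_1\mathbf{B} \\ \vdots \\ \mathbf{a}_0\sigma_{\ell-1}\mathbf{B} \end{bmatrix}. \qquad (3.3)$$

Now by hypothesis $\sigma_i\mathbf{B} = \mathbf{B}\sigma_i$, so that

$$\mathbf{c}_i = \mathbf{a}_0\sigma_i\mathbf{B} = \mathbf{a}_0\mathbf{B}\sigma_i = \mathbf{c}_0\sigma_i.$$

It follows that $\mathbf{C}$ is $\mathcal{F}$-reproducible and defined by the signature $\mathbf{c}_0 = \mathbf{a}_0\mathbf{B}$. Conversely, suppose $\mathcal{M}_q^{\mathcal{F},m}$ is a semigroup, and in particular that it is closed with respect to multiplication. Consider again two matrices $\mathbf{A}$ and $\mathbf{B}$ and their product, defined as in equation (3.3). Since by hypothesis $\mathbf{C} \in \mathcal{M}_q^{\mathcal{F},m}$, and therefore is $\mathcal{F}$-reproducible, we have that

$$\mathbf{c}_i = \mathbf{c}_0\sigma_i, \quad \text{i.e.,} \quad \mathbf{a}_0\sigma_i\mathbf{B} = \mathbf{a}_0\mathbf{B}\sigma_i. \qquad (3.5)$$

Now, since equation (3.5) holds in general for every signature $\mathbf{a}_0$, it must be that $\sigma_i\mathbf{B} = \mathbf{B}\sigma_i$.

Finally, note that multiplication distributes over addition, as usual. This means that, if Theorem 3.9 holds, $\mathcal{M}_q^{\mathcal{F},m}$ verifies all the requisites of a mathematical pseudo-ring, i.e., a ring without multiplicative identity, as defined in Section 2. We call this the $\mathcal{F}$-reproducible pseudo-ring induced by $\mathcal{F}$ over $\mathbb{F}_q$.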
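The role of commutativity in the closure argument can be checked numerically. The Python sketch below uses two small families of permutations chosen by us for illustration: the "XOR" permutations that generate dyadic matrices, for which every map commutes with every reproducible matrix, and an arbitrary family of transpositions, for which closure fails. Signatures have a single row, the first map is the identity, and arithmetic is modulo a prime `q`; all names are hypothetical.

```python
def mat_mul(A, B, q):
    """Matrix product modulo a prime q."""
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)]
            for row in A]


def reproduce(signature_row, maps, q):
    """Matrix whose i-th row is signature_row * sigma_i."""
    return [mat_mul([signature_row], sigma, q)[0] for sigma in maps]


def xor_permutation(p, i):
    """Permutation matrix sending position v to position v XOR i."""
    return [[int((v ^ i) == z) for z in range(p)] for v in range(p)]


# Dyadic family over GF(3): commutativity holds, so the product stays reproducible
q, p = 3, 4
dyadic = [xor_permutation(p, i) for i in range(p)]
a0, b0 = [1, 2, 0, 1], [0, 1, 1, 2]
A, B = reproduce(a0, dyadic, q), reproduce(b0, dyadic, q)
commutes = all(mat_mul(s, B, q) == mat_mul(B, s, q) for s in dyadic)
c0 = mat_mul([a0], B, q)[0]                       # candidate signature a0 * B
closed = mat_mul(A, B, q) == reproduce(c0, dyadic, q)

# Family {identity, swap(0,1), swap(1,2)} over GF(2): closure fails
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swap01 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
swap12 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
bad = [identity, swap01, swap12]
A2, B2 = reproduce([1, 0, 0], bad, 2), reproduce([0, 1, 0], bad, 2)
c0_bad = mat_mul([[1, 0, 0]], B2, 2)[0]
closed_bad = mat_mul(A2, B2, 2) == reproduce(c0_bad, bad, 2)
```

In the second case the product of two reproducible matrices is no longer described by a single signature, so the compact representation is lost under multiplication.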

Pseudo-rings induced by families of permutations
In the particular case of signatures made of just one row (i.e., reproducible order $m = 1$) and the functions $\sigma_i$ being permutations, we have a further result, which is described in Theorem 3.10. We point out that all the results we present in this section can be generalized, in order to consider the case $m > 1$, but we will not go into further details here. Since a $p \times p$ permutation corresponds to a matrix in which every row and column has weight equal to 1, it can equivalently be described as a bijection over $[0, p-1] \subset \mathbb{N}$. Given a permutation matrix $\sigma_i$, we denote the corresponding bijection as $f_{\sigma_i}$. If the element of $\sigma_i$ in position $(v, z)$ is equal to 1, then $f_{\sigma_i}(v) = z$. Let $\mathbf{a}$ and $\mathbf{a}'$ be two vectors such that $\mathbf{a}' = \mathbf{a}\sigma_i$. Then, $a'_{f_{\sigma_i}(v)} = a_v$. We use $f_{\sigma_i} \circ f_{\sigma_j}$ to denote the bijection defined by the application of $f_{\sigma_i}$ after $f_{\sigma_j}$. In other words, $(f_{\sigma_i} \circ f_{\sigma_j})(v) = f_{\sigma_i}(f_{\sigma_j}(v))$. The identity $\mathbf{I}_p$ can be seen as the particular permutation that does not change the order of the elements; the corresponding bijection, which will be denoted as $f_{\mathbf{I}_p}$, is such that each element is mapped into itself (in other words, $f_{\mathbf{I}_p}(v) = v$ for every $v$).
1 be a family of linear transformations, with each σ i being a permutation, and suppose that induces the -reproducible pseudo-ring q ,1 over q . Then, the following relation must be satisfied Proof. Since q ,1 is a pseudo-ring, we know from Theorem 3.9 that, for every matrix ∈ B q ,1 and every In particular, the left-hand term multiplication of σ i by B corresponds to a row permutation, such that where b i denotes the ith row of B. The product σ B i instead defines a column permutation of B, and can be expressed as Putting together equations (3.6) and (3.7), we obtain which must be satisfied for every pair of indexes ( ) i j , . □ Starting from the result of Theorem 3.10, we can easily derive some other properties that must satisfy.
Corollary 3.11. Let be a family of permutations such that the induced q m , is a pseudo-ring. Then, has the following properties: Proof. Since satisfies the hypothesis of Theorem 3.10, we have , is a pseudo-ring. Then, q ,1 is a ring, which we call, by analogy, -reproducible ring induced by .
Proof. Let us show that q ,1 contains the multiplicative identity, i.e., the × p p identity matrix. Because of Corollary 3. Proof. Based on Corollary 3.12, q ,1 is an -reproducible ring provided with multiplicative identity. Now, we need to prove that any non-singular matrix in  Proof. Let ∈ A q ,1 , with signature a, and denote as = B A T its transpose. The ith row of B corresponds to the ith column of A. In particular, the ith column of A is defined as Because B is the transpose of A, the ith row of B corresponds to the ith column of A. Let us denote as b 0 the first row of B, that is,  a  a  a  b  b  , , , , , , .
Now suppose that ( ) = f v j σi ; then, the jth entry of b i corresponds to the vth entry of b 0 , that is, In order to satisfy eq. (3.12), a z must be equal to the jth entry of the ith column of A, that is, which concludes the proof. □ Depending on the properties stated in the previous theorems, the family might induce different algebraic structures over × q p p . In particular, let us consider the case of corresponding to q ,1 satisfying both Theorems 3.13 and 3.14. Let A be a square matrix whose elements are picked from q ,1 . By definition, involves only sums and multiplications: this means that involves sums, multiplications and transpositions: because of Theorem 3.14, we have that the entries of ( ) A adj are again elements of q ,1 . This means that − A 1 is a matrix whose elements belong to q ,1 , and so has the same -quasi-reproducible structure of A.

Known examples of $\mathcal{F}$-reproducible pseudo-rings
In Section 3.1, we have described some properties that a family of permutations $\mathcal{F}$ must have to guarantee that it induces algebraic structures on $\mathbb{F}_q^{p \times p}$. Well-known cases of such objects, with common use in cryptography, are circulant matrices and dyadic matrices.

Circulant matrices
As we have seen before, a circulant matrix is a $p \times p$ matrix for which each row is obtained as the cyclic shift of the previous one. In particular, a circulant matrix can be seen as a square $\mathcal{F}$-reproducible matrix, whose signature corresponds to the first row and for which the functions $\sigma_i$ defining $\mathcal{F}$ correspond to $\pi^i$, where $\pi$ is the unitary circulant permutation matrix with entries

$$\pi_{v,z} = \begin{cases} 1 & \text{if } z \equiv v + 1 \bmod p, \\ 0 & \text{otherwise.} \end{cases} \qquad (3.15)$$

Basically, the bijection representing $\pi$ is defined as

$$f_{\pi}(v) = v + 1 \bmod p. \qquad (3.16)$$

It can be easily shown that $f_{\pi^i}(v) = v + i \bmod p$. With some simple computations, it can be easily shown that circulant matrices satisfy Theorem 3.14 and that the multiplication between two circulant matrices is commutative.
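The properties of circulant matrices recalled above can be checked numerically. The following is a minimal sketch in plain Python, assuming a right cyclic shift between consecutive rows; the field size `q`, the two signatures and the helper names are hypothetical choices made only for illustration.

```python
def circulant(signature):
    """Return the p x p circulant matrix whose first row is `signature`.

    Row i is the signature cyclically shifted i positions to the right.
    """
    p = len(signature)
    return [[signature[(j - i) % p] for j in range(p)] for i in range(p)]


def matmul(a, b, q):
    """Multiply two matrices (lists of rows), reducing entries modulo q."""
    return [[sum(x * y for x, y in zip(row, col)) % q for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


q = 5  # hypothetical prime field size
A = circulant([1, 0, 2, 0, 3, 0, 0])
B = circulant([0, 4, 0, 0, 1, 1, 0])

AB = matmul(A, B, q)
closed = AB == circulant(AB[0])        # the product is again circulant
commutes = AB == matmul(B, A, q)       # multiplication is commutative
At = transpose(A)
transpose_ok = At == circulant(At[0])  # the transpose is again circulant
```

Each check compares a matrix with the circulant matrix rebuilt from its own first row, which is exactly the statement that the matrix is determined by its signature.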

Dyadic matrices
A dyadic matrix is a $p \times p$ matrix, with $p$ being a power of 2, whose signature is again its first row. The rows of a dyadic matrix are obtained by permuting the elements of the signature, such that the element at position $(i, j)$ is the one in the signature at position $i \oplus j$, where $\oplus$ denotes the bitwise XOR between $i$ and $j$. Then, a dyadic matrix can be written as an $\mathcal{F}$-reproducible matrix, for which each function $\sigma_i$ is the dyadic matrix whose signature has all-zero entries, except that at position $i$. This means that $\sigma_i$ can be described by the following bijection:

$$f_{\sigma_i}(v) = v \oplus i. \qquad (3.19)$$

If we combine two transformations, we obtain $f_{\sigma_i} \circ f_{\sigma_j}(v) = v \oplus j \oplus i = f_{\sigma_{i \oplus j}}(v)$; this proves that the family of dyadic matrices is compliant with Theorem 3.10. It can be straightforwardly proven that dyadic matrices are symmetric (and so satisfy Theorem 3.14), and that the multiplication between two dyadic matrices is commutative.
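The description of dyadic matrices as reproducible matrices can be illustrated with a small numerical check. The sketch below is plain Python; the field size `q`, the signatures and the helper names are hypothetical, and each $\sigma_i$ is built as the dyadic matrix whose signature is the $i$th unit vector, as stated above.

```python
def dyadic(signature):
    """Return the p x p dyadic matrix with first row `signature`.

    The entry at position (i, j) is signature[i XOR j]; p must be a power of 2.
    """
    p = len(signature)
    if p == 0 or p & (p - 1):
        raise ValueError("signature length must be a power of 2")
    return [[signature[i ^ j] for j in range(p)] for i in range(p)]


def matmul(a, b, q):
    """Multiply two matrices (lists of rows), reducing entries modulo q."""
    return [[sum(x * y for x, y in zip(row, col)) % q for col in zip(*b)]
            for row in a]


q, p = 3, 8  # hypothetical field size and block size
A = dyadic([1, 0, 2, 0, 0, 1, 0, 0])
B = dyadic([0, 2, 0, 0, 1, 0, 0, 1])

# sigma[i] is the dyadic matrix whose signature is the i-th unit vector.
sigma = [dyadic([int(v == i) for v in range(p)]) for i in range(p)]

# Row i of A is the signature (row 0) multiplied by sigma[i].
rows_ok = all(matmul([A[0]], sigma[i], q)[0] == A[i] for i in range(p))
symmetric = all(A[i][j] == A[j][i] for i in range(p) for j in range(p))
AB = matmul(A, B, q)
closed = AB == dyadic(AB[0])      # the product is again dyadic
commutes = AB == matmul(B, A, q)  # multiplication is commutative
```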
Circulant and dyadic matrices are just two particular cases of $\mathcal{F}$-reproducible pseudo-rings and can be further generalized by considering signatures that are composed of more than one row. In addition, several more constructions can be obtained. For instance, for every permutation matrix $\psi$ and every $\mathcal{F}$-reproducible pseudo-ring

Compactly reproducible codes
In the previous section, we have described the properties that a family of functions $\mathcal{F}$ must have in order to generate $\mathcal{F}$-reproducible matrices. This opens a wide range of possibilities for obtaining codes with compact representations, that is, CR codes according to Definition 3.5. In fact, $\mathcal{F}$-reproducible pseudo-rings allow us to design codes that can be described in a very compact manner. Codes of this type are of interest in code-based cryptography, where small public keys are important.
In this section, we describe how to design CR codes, and the properties that characterize them. In particular, we study how to achieve an $\mathcal{F}$-reproducible representation for the parity-check matrix $H$ starting from an $\mathcal{F}$-reproducible generator matrix $G$. In addition, we provide intuitive methods to obtain random-looking CR codes, starting from their parity-check matrix.
Let be a CR code over q , with length n, dimension k, and codimension = − r n k, with an -reproducible generator matrix ∈ × G where each h i is a matrix with dimensions × s n. Since by definition = × GH 0 T k r , it must be Let us assume that = × g H 0 T m n 0 : as we explain later, in the practical case of a cryptographic scheme, this condition can be easily satisfied. The following theorem considers a particular construction for a CR code and states some properties that its parity-check matrix must satisfy.  In order for G to be a valid generator matrix, it must be = × GH 0 T k r , that is, Consider now the product with the aforementioned property described by (4.4), then for all couples of indexes i j , we have and (4.6) is surely satisfied, since if the functions σ i have all full rank (for instance, they are permutations), then H cannot have maximum rank r. Hence, when r is a prime, the only case with practical interest is that of = s 1 (i.e., the one in which each h j is actually a row vector).
For G and H to be, respectively, the generator and parity-check matrix of a code , some conditions have to be verified, given in Corollary 4.3.
Proof. We want the -reproducible × k n matrix G to be the generator matrix of a code with dimension k: then, G must have rank equal to k. If contains two transformations = σ σ i j , with ≠ i j, then the rows of G obtained as σ g i 0 are identical to the ones obtained as σ g j 0 . If G has some identical rows, then its rank cannot be maximum, and this proves condition ( ) a . It is straightforward to show that this condition can also be expressed as follows: there cannot exist three integers , . Indeed, if we can determine such integers, then which results in If H is the parity-check matrix of a code with redundancy r, then it must have rank equal to r. . Then, g 0 is a valid signature for an -reproducible generator matrix, defined by the family . On condition that both H and G have full rank, and < ⇒ > m k l 1, then they can be used to represent the CR code with length n, dimension k, and redundancy r.

If we suppose that there exists three integers
We point out that the properties defined by Theorem 4.1 can be described in a graphical way, considering the fact that the linear functions σ i define a mapping acting on the ensemble of matrices h j . We can We can now consider two different paths having the same starting and final nodes, with the corresponding sets of edges labeled as I a and I b . Then, it must be The definitions we have introduced in the previous section describe codes whose generator matrices can be efficiently described just by a subset of their entries; for this reason, they are natural candidates for being used in a McEliece cryptosystem. Actually, some variants of this type have already been proposed during the years, with the aim of reducing the public-key size by exploiting such a property. We show that these already existing variants are encompassed by our general framework and that the possibilities for obtaining such features are actually many more than those already exploited. In some cases, a QCR code can be seen as a particular case of a CR code (and viceversa). Let us consider a code with length = n n p 0 , dimension = k p, and codimension ( ) = − r n p 1 0 , for some integer ∈ n 0 . Let us suppose that G is obtained as a row of n 0 blocks with size × p p, that is, This form of the generator matrix is commonly used in sparse-matrix code-based cryptosystems [5,35]. Suppose that G in (4.12) is an -quasi-reproducible matrix, i.e., each G i is an element of the pseudo-ring  . In order to fulfill the conditions of Theorem 4.1, these matrices must form a commutative group, that is, Let us consider two sets containing all the 2 v distinct binary v-tuples, i.e., where ( ) a l i is the lth entry of ( ) a i . Since we are considering Householder matrices with the property (4.15), it is easy to verify that = σ I i n 2 , and it follows that each function is an involution. 
The family can be used to define an -reproducible generator matrix G for a code ; a parity-check matrix for can then be the -reproducible matrix H, with signature ∈ × h q s n 0 , whose rows are obtained as If H has full rank, the corresponding code has redundancy = r s2 v , and It is straightforward to show that such a function satisfies the properties required by Theorem 4.1 and Corollary 4.3. The corresponding code has length n, dimension = k m2 v , and redundancy = r s2 v , thus the code rate corresponds to + m m s . In addition, we point out that it might be possible to tune the code parameters, by selecting only proper subsets of all the binary v-tuples, in order to form the rows of both G and H.

CR codes from powers of a single function
In this section, we present another construction of reproducible codes satisfying Theorem 4.1. Let us consider an × n n matrix π such that . Then, given an × m n signature g 0 , we can use the family to obtain a generator matrix G for a code as An -reproducible parity-check matrix for can be obtained by taking an × s n matrix h 0 , and using it to generate the parity-check matrix H as If H is full rank, then has redundancy = r s b v ; the code dimension and redundancy must be linked to the code length according to . It is quite easy to show that such a parity-check matrix is compliant with Theorem 4.1. In fact, we have In the case of < z j i , we can write Thus, we have proven that such that the function ( ) f x x , 0 1 required by Theorem 4.1 is defined as For instance, a simple construction can be obtained by choosing = = m s 1 and = = / k r n 2: the matrices G and H are two -reproducible matrices, with signatures that are row vectors of length n and are characterized by the same number of rows (thus, has rate 1/2).
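A toy instance of the simple case $m = s = 1$, $k = r = n/2$ can be sketched in plain Python. All concrete choices here are assumptions made for illustration only: the field is binary, $\pi$ is taken as the permutation that cyclically shifts each half of a length-$2p$ vector, and the signature of $H$ is derived from that of $G$ by swapping and reversing the two halves, a choice that yields $GH^T = 0$ over the binary field for this specific $\pi$.

```python
def apply_pi(vec, p):
    """Apply pi: cyclically shift each length-p half of `vec` one step right."""
    left, right = vec[:p], vec[p:]
    return left[-1:] + left[:-1] + right[-1:] + right[:-1]


def reproduce(signature, p, rows):
    """Stack signature, signature*pi, signature*pi^2, ... as matrix rows."""
    out, row = [], list(signature)
    for _ in range(rows):
        out.append(row)
        row = apply_pi(row, p)
    return out


def reverse_signature(v):
    """First row of the transpose of the circulant with first row `v`."""
    return [v[-j % len(v)] for j in range(len(v))]


p = 7                          # hypothetical block size, n = 2p
a = [1, 1, 0, 1, 0, 0, 0]
b = [0, 1, 0, 0, 1, 0, 1]
g0 = a + b                                       # signature of G
h0 = reverse_signature(b) + reverse_signature(a)  # signature of H

G = reproduce(g0, p, p)        # k = p rows, each equal to g0 * pi^i
H = reproduce(h0, p, p)        # r = p rows, each equal to h0 * pi^j

# Over the binary field, every row of G must be orthogonal to every row of H.
orthogonal = all(sum(x * y for x, y in zip(g, h)) % 2 == 0
                 for g in G for h in H)
```

The check only verifies orthogonality; whether $G$ and $H$ have full rank depends on the chosen signatures and has to be verified separately.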
As regards property (b), we can consider the following equivalence:

(4.27)
Then, it is clear that it must be ′ ″ < x x , r s : however, this condition is quite straightforward, since j denotes the row index of the matrix blocks in H. In the same way, when considering the index of the transformation σ i , we have which turns into Again, in order to guarantee that the previous equivalence has no solution, it must be ′ ″ < x x , . This basically means that we must have ≤ k m r s .
Remark. There is a clear analogy between the concept of reproducibility and that of the automorphism group of a code. Recall that, by automorphism group, we refer to the set of functions that map a code into itself. For instance, consider codes obtained from generator matrices as in (4.20) and assume that $\pi$ is a permutation. Let us further assume, for simplicity, that $v = 1$ and choose $k = b$, i.e., suppose the code has dimension equal to the order of the considered permutation $\pi$. We then have $\mathcal{F} = \{I_n, \pi, \pi^2, \ldots, \pi^{k-1}\}$, and for each $g_0 \in \mathbb{F}_q^n$ we obtain an $\mathcal{F}$-reproducible generator matrix $G$ whose $i$th row is $g_0\pi^i$, for $0 \leq i \leq k-1$. It is trivial to show that $\pi$ is in the automorphism group of the code having $G$ as a generator matrix. Indeed, each codeword is obtained as $c = \sum_{i=0}^{k-1} u_i g_0 \pi^i$ for some $u \in \mathbb{F}_q^k$, so that $c\pi = \sum_{i=0}^{k-1} u_i g_0 \pi^{i+1}$ is again a codeword, since $\pi^k = I_n$. With similar arguments, one can prove that analogous results hold for other families of transformations that we consider in this article.

Code-based schemes from QCR codes
The algebraic structures we have introduced in the previous sections can be used to generate key pairs in code-based cryptosystems. For instance, let us consider a parity-check matrix $H$ made of $r_0 \times n_0$ blocks belonging to an $\mathcal{F}$-reproducible pseudo-ring. In order to use $H$ as the private key of a sparse-matrix code-based instance of the Niederreiter cryptosystem, we must guarantee that $H$ is sufficiently sparse: this property can be easily achieved by choosing a family $\mathcal{F}$ of sparse matrices $\sigma_i$, which guarantees that an $\mathcal{F}$-reproducible matrix defined by a sparse signature will be sparse as well. In such a case, we can obtain the public key as $H' = SH$, where $S$ is a random dense matrix [37]. When both Theorems 3.13 and 3.14 are satisfied, we can obtain a generator matrix in systematic form, which is still an $\mathcal{F}$-reproducible matrix. In fact, starting from an $r \times n$ parity-check matrix $H$, whose elements are picked randomly from the $\mathcal{F}$-reproducible ring, we can use the corresponding parity-check matrix in systematic form as the public key for a Niederreiter cryptosystem instance. In the same way, we can compute the systematic generator matrix, and use it as the public key in a McEliece cryptosystem instance. The idea of using codes that are completely reproducible, and not formed by reproducible pseudo-rings, opens up the possibility of a new way of generating key pairs in the McEliece cryptosystem. Indeed, once we have generated a sparse parity-check matrix $H$, we can use it as the secret key. Then, a possible public key can be obtained by taking a set of linearly independent codewords, and using them as the signature of the public generator matrix. If such codewords correspond to rows of the generator matrix in systematic form, then we obtain another significant reduction in the public key size, since there is no need to publish the first $k$ bits of each one of the selected codewords.
It is clear that having a CR public code may lead to a significant reduction in the public-key size. Indeed, once the structure of the matrix is fixed by the protocol (i.e., dimensions, family $\mathcal{F}$), the whole public key can be efficiently represented using just the signatures of each building block.
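The size reduction can be quantified with simple arithmetic. The sketch below is plain Python and compares the storage of the non-systematic part of a generator matrix made of $p \times p$ blocks with the storage of one signature row per block; the parameters `p`, `n0` and `q` are hypothetical and do not refer to any specific scheme.

```python
from math import ceil, log2


def matrix_bits(rows, cols, q=2):
    """Bits needed to store a rows x cols matrix over F_q entry by entry."""
    return rows * cols * ceil(log2(q))


# Hypothetical parameters: k = p, n = n0 * p, blocks of size p x p.
p, n0, q = 4099, 2, 2

# Systematic generator matrix [I | Q]: only Q needs to be published.
full = matrix_bits(p, (n0 - 1) * p, q)

# With a reproducible structure, one signature row per block is enough.
compact = matrix_bits(1, (n0 - 1) * p, q)

reduction_factor = full // compact
```

With signatures made of a single row, the reduction factor equals the block size $p$; signatures with $m$ rows would give a factor of $p/m$ under the same assumptions.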

Cryptographic properties and attacks
In the previous sections, we have introduced the notion of reproducibility and have described some properties of reproducible codes. Our analysis has shown that there is a wide variety of methods that allow us to obtain reproducible codes. As we have seen in Section 4.3, these codes can be used to generate key pairs in code-based cryptosystems. The main advantage is the possibility of reducing the information needed to represent the matrix used as the public key. In particular, following the considerations in Section 2.3, this framework is well suited for sparse-matrix code-based cryptosystems. Let $\mathcal{C}$ be a secret code with parity-check matrix $H$, and suppose that the public key is constituted by a general generator matrix (for the McEliece case) or parity-check matrix (for the Niederreiter case) of $\mathcal{C}$. Then, the following properties must be satisfied: (a) $H$ is sufficiently sparse to perform efficient decoding; (b) the knowledge of the public key does not admit efficient techniques for obtaining $H$ or another valid sparse parity-check matrix $H'$.
When property (a) is satisfied, $\mathcal{C}$ is an LDPC code and so admits an efficient decoding algorithm. We point out that this property can be easily satisfied if we choose $\mathcal{F}$ as a family of sparse matrices: this way, choosing a sparse signature for $H$ guarantees that $H$ will be sparse as well. Satisfying property (b) might be the most delicate part, since it depends on the particular reproducible structure we consider. [...], and so we can make analogous considerations.
Regardless of the particular choice of $\mathcal{F}$, it is important to note that this additional structure does not expose the secret key to the risk of enumeration. For instance, let us consider the construction described in Section 4.2, in which the matrix $H$ is defined by a signature of size $m \times n$, with all the rows having weight $w$. If we assume that the rows are picked in such a way as to be linearly independent, the cardinality of the secret key space is then approximately equal to $\binom{n}{w}^m$. It is easy to see that, for practical choices of the parameters, this number is sufficiently large to make attacks based on the enumeration of the secret key unfeasible. In the next sections, we provide some considerations on attacks that work for QC codes and that may be hindered by proper families of reproducible codes. We only provide some qualitative arguments and leave detailed and thorough considerations about these attacks for future works.
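The size of the key space can be estimated directly from the approximation above. The following plain-Python sketch evaluates the base-2 logarithm of $\binom{n}{w}^m$; the parameter values are hypothetical and serve only to show the order of magnitude involved.

```python
from math import comb, log2


def log2_signature_count(n, w, m):
    """log2 of binom(n, w)^m, the approximate number of m x n signatures
    whose rows all have weight w."""
    return m * log2(comb(n, w))


# Hypothetical parameters: length n, row weight w, signature with m rows.
n, w, m = 9602, 45, 1
enumeration_bits = log2_signature_count(n, w, m)
```

The estimate ignores the linear-independence requirement on the rows, in line with the approximation used in the text.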

Reaction attacks
Reaction attacks [29,38-40] are a recent class of attacks aimed at recovering the private key by exploiting events of decoding failure. In this section, we briefly describe the attack proposed in ref. [29], and then we make some considerations about reproducible codes. In particular, we consider a binary QC code with parity-check matrix $H = [H_0 \mid H_1]$, where each $H_i$ is a sparse $p \times p$ circulant with row and column weight equal to $w$. Then, the resulting code has length $n = 2p$, and dimension and redundancy equal to $p$. In a reaction attack, the opponent impersonates Alice, producing ciphertexts and sending them to Bob. Events of decoding failure can be detected since, in the case of a decoding failure, Bob must ask for a retransmission. A crucial role in a reaction attack is played by the distance spectrum, that is, the set of all distances produced by the elements of value 1 in a vector [29]. If a distance $d$ appears $\mu$ times in the spectrum, we say that it has multiplicity equal to $\mu$; if a distance is not in the spectrum, we say that it has zero multiplicity. In the case of QC codes, these distances are computed cyclically: given two ones at positions $x_0$ and $x_1$, the corresponding distance is obtained as $d = \min\{\pm(x_0 - x_1) \bmod p\}$. In a circulant matrix, all the rows are characterized by the same distance spectrum; in particular, an opponent performing a reaction attack aims to obtain the distance spectrum of the rows of $H_0$. For this purpose, he collects the produced ciphertexts into subsets $\Sigma_d$, such that each error vector used for the encryption of a ciphertext in $\Sigma_d$ has $d$ in the distance spectrum of its first circulant block. Then he observes a sufficiently large number of Bob's reactions and assigns a decoding failure probability to each set. As observed in ref. [29], the decoding failure probability of $\Sigma_d$ depends on the presence of couples of ones in the rows of $H_0$, at the same distance $d$.
Indeed, suppose that the first length-$p$ block of $e$ has a couple of ones forming the distance $d$; then, the following properties hold:
• if the distance spectrum of $H_0$ contains $d$ with multiplicity $\mu$, then the couple of ones overlaps with $\mu$ rows of $H$;
• if the distance spectrum of $H_0$ does not contain $d$, then the couple of ones does not overlap with any row of $H$.
These justify the fact that the average syndrome weight of the ciphertexts belonging to the same set Σ d depends on the multiplicity of d in the spectrum of H 0 , as observed in ref. [40]. In particular, the syndrome weight slightly decreases as μ increases, and this causes the difference in the corresponding decoding failure probabilities [40]. This allows an opponent to obtain the distance spectrum of H 0 , since he can guess the multiplicity of each distance d by looking at the decoding failure probability of the corresponding set Σ d .
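The notion of cyclic distance spectrum used in this discussion can be made concrete with a few lines of plain Python. The sketch below computes the spectrum of a vector from the positions of its ones and checks that it is invariant under cyclic shifts, which is the property exploited in the QC case; the support and block size are hypothetical.

```python
from collections import Counter
from itertools import combinations


def distance_spectrum(support, p):
    """Cyclic distance spectrum of a length-p vector with ones in `support`.

    Maps each distance d to its multiplicity; for two ones at x0 and x1
    the distance is min((x1 - x0) mod p, (x0 - x1) mod p).
    """
    spectrum = Counter()
    for x0, x1 in combinations(sorted(support), 2):
        d = (x1 - x0) % p
        spectrum[min(d, p - d)] += 1
    return spectrum


p = 7                    # hypothetical circulant size
first_row = {0, 1, 3}    # positions of the ones in the first row
spectrum = distance_spectrum(first_row, p)

# Every cyclic shift of the row (i.e., every row of the circulant)
# has the same spectrum.
shift_invariant = all(
    distance_spectrum({(x + s) % p for x in first_row}, p) == spectrum
    for s in range(p)
)
```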
Since $H_0$ is sparse, its distance spectrum is not dense, which means that it contains a small number of distances, with multiplicities that are generally rather low. It is then possible to recover $H_0$ from the knowledge of its distance spectrum, with a procedure that can be related to that of finding cliques of fixed size in a given graph. In principle, clique-finding algorithms run with a time complexity that grows exponentially with the clique size; however, for sparse graphs (i.e., graphs that contain a small number of edges), the problem becomes significantly easier [29,38].
In summary, reaction attacks against QC codes are possible because of two factors: (i) A sufficiently high DFR; (ii) The invariance of the set of distances between pairs of ones in a row of the secret key with respect to the row index. This guarantees feasibility of the key reconstruction phase, since the resulting graph (in which rows of the secret key are represented by cliques of fixed size) is sparse.
In particular, one can try to counter reaction attacks by choosing codes for which condition (ii) is not met. For instance, in ref. [34] the authors propose to use a specific family of QC monomial codes with the property that the distances between pairs of ones in the secret key fill the distance spectrum. In this way, the density of the obtained graph becomes maximal and, as a consequence, reconstructing the secret key becomes unfeasible. We argue that families of reproducible codes may, in general, be characterized by analogous properties. For simplicity, consider the example of a reproducible code with $k = r = p$ and $n = 2p$, with a signature made of just one row, and a family $\mathcal{F}$ of functions $\sigma_i$ that are obtained as consecutive powers of a permutation $\psi$. In addition, suppose that $\psi$ is obtained as the product of two disjoint $p$-cycles. In other words, $\psi$ is such that we can find two disjoint sets of $p$ positions each, on which $\psi$ acts as a $p$-cycle. It is clear that $\psi^p = I_n$. Suppose now that the signature of $H$ has two ones; the corresponding ones in the $i$th row are found at the images of their positions under $\psi^i$. For a suitable choice of $\psi$, the distances between these ones are all different and, furthermore, are not an invariant of the row index. Thus, differently from the case of QC codes, the distances that are produced between ones in the first row of the secret key are not maintained in the other rows.
With this simple example we have shown that, differently from the QC case, the distance spectrum of generic reproducible codes becomes richer and, as a consequence, the graph which is used to discover the secret key becomes denser. Thus, the secret key reconstruction phase, which is the final step of a reaction attack, may be hindered, and this may be enough to remove the basis upon which reaction attacks are built. Assessing the resistance of general families of transformations requires a deeper investigation, although some conclusions can already be drawn.
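The effect described in this example can be observed on a toy instance. The plain-Python sketch below builds a permutation $\psi$ as the product of two disjoint $p$-cycles and lists the distances between ones in the rows generated by its powers. The specific cycles, the support of the signature and the use of plain absolute differences as distances are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def permutation_from_cycles(cycles, n):
    """Return psi as a list: position x is mapped to psi[x]."""
    psi = list(range(n))
    for cycle in cycles:
        for t, x in enumerate(cycle):
            psi[x] = cycle[(t + 1) % len(cycle)]
    return psi


def distances(support):
    """Multiset of absolute differences between positions of ones."""
    return Counter(abs(x1 - x0)
                   for x0, x1 in combinations(sorted(support), 2))


p = 7
# Two hypothetical disjoint p-cycles covering the positions 0, ..., 2p - 1.
cycles = [[0, 9, 4, 13, 2, 11, 6], [1, 3, 5, 7, 8, 10, 12]]
psi = permutation_from_cycles(cycles, 2 * p)

support = {0, 1, 2}          # ones in the signature (first row)
rows = []
for _ in range(p):
    rows.append(support)
    support = {psi[x] for x in support}   # next row: apply psi once more

spectra = [distances(row) for row in rows]
row_dependent = any(s != spectra[0] for s in spectra[1:])
```

In this toy instance the first row has ones at positions 0, 1, 2, while the second row has ones at positions 3, 9, 11, so the sets of distances already differ between the first two rows.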

DOOM
In ref. [28], Sendrier introduced a technique, called DOOM, which is able to speed up the execution of ISD algorithms for certain families of codes, including QC codes. In general, this technique can be applied whenever there are multiple instances of SDP with just one solution. When ISD is used to perform a decoding attack, the gain obtained from DOOM can be explained as follows. Suppose that the opponent wants to recover an error vector $e^{(0)}$ and can build $N_i$ syndromes $s^{(i)}$, each associated with an error vector $e^{(i)}$ from which $e^{(0)}$ can be easily obtained. According to DOOM, we consider $N_i$ independent calls to an ISD algorithm. As soon as one of these runs successfully comes to an end, the whole algorithm ends as well, since $e^{(0)}$ has been found. The corresponding gain is equal to $\sqrt{N_i}$. Obviously, exploiting DOOM is beneficial when the $N_i$ independent decoding instances have comparable complexity. This only occurs on the condition that each $e^{(i)}$ has the same Hamming weight as $e^{(0)}$, or almost the same.
The rationale of exploiting DOOM for a decoding attack is to intercept one ciphertext and then try to obtain other valid ciphertexts from it, corresponding to transformed versions of the same error vector. Let us consider the case in which the opponent intercepts a ciphertext corresponding to an initial syndrome $s^{(0)}$ and wants to recover the vector $e^{(0)}$ used during encryption. Then, in order to apply DOOM, the opponent must produce other syndromes corresponding to as many error vectors that are deterministic functions of $e^{(0)}$. In other words, suppose that ISD returns the solution $e^{(i)}$ for $s^{(i)}$; then it must be $e^{(i)} = e^{(0)}A$, with $A$ being a full-rank matrix. For instance, in the QC case, the opponent can obtain a set of $p$ syndromes just by cyclically shifting the initial syndrome $s^{(0)}$, which corresponds to cyclically shifting the error vector $e^{(0)}$. In general terms, the applicability of DOOM can be modeled as follows. Starting from a syndrome $s^{(0)} = He^{(0)T}$, we want to determine a transformation $\Phi$ of the syndrome that corresponds to a transformation $\Psi$ of the error vector, that is, $\Phi He^{(0)T} = H(e^{(0)}\Psi)^T = H\Psi^T e^{(0)T}$, where $\Phi$ and $\Psi$ are two matrices over $\mathbb{F}_q$, with size $r \times r$ and $n \times n$, respectively. The previous equation must be satisfied for every vector $e^{(0)}$; this can happen only if $\Phi H = H\Psi^T$. For the general class of reproducible codes, the applicability of DOOM must be carefully analyzed. For instance, consider a code obtained with the procedure described in Section 4.2, using a family of functions $\mathcal{F}$ consisting of powers of a single function. If this is a permutation, due to Theorem 4.1, we have that $H\sigma_i$ with $\sigma_i \in \mathcal{F}$ always results in a permutation of the rows of $H$. So, the opponent can build the set of syndromes $\{s^{(i)}\}$, which is used as input for the DOOM algorithm, by multiplying the initial syndrome by the matrices $\sigma_i$.
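The QC case mentioned above can be verified on a toy example. The plain-Python sketch below assumes a binary parity-check matrix made of two circulant blocks and checks that shifting each block of the error vector cyclically produces a cyclic shift of the syndrome, so that $p$ instances with error vectors of the same weight are obtained for free. The signatures, the error vector and the helper names are hypothetical.

```python
def shift(v, t):
    """Cyclically shift the list `v` by t positions to the right."""
    t %= len(v)
    return v[-t:] + v[:-t] if t else list(v)


def syndrome(h0, h1, e):
    """Syndrome of e for H = [H0 | H1] over the binary field, where H0 and
    H1 are the p x p circulants with first rows h0 and h1."""
    p = len(h0)
    e0, e1 = e[:p], e[p:]
    return [sum(h0[(j - i) % p] * e0[j] + h1[(j - i) % p] * e1[j]
                for j in range(p)) % 2
            for i in range(p)]


p = 7
h0 = [1, 1, 0, 1, 0, 0, 0]   # hypothetical sparse signatures
h1 = [1, 0, 0, 0, 1, 0, 1]
e = [0, 1, 0, 0, 0, 0, 0,  0, 0, 0, 1, 0, 0, 0]   # error vector, weight 2

s0 = syndrome(h0, h1, e)

# Shifting both halves of e by t positions shifts the syndrome by t.
doom_instances_ok = all(
    syndrome(h0, h1, shift(e[:p], t) + shift(e[p:], t)) == shift(s0, t)
    for t in range(p)
)
```

Here the transformation of the syndrome and that of the error vector are both permutations, which is why all the derived instances have the same difficulty as the original one.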
However, as we have described in the previous sections, reproducible families of codes can be obtained in many different ways. For instance, we can use functions $\sigma_i$ that are powers of a matrix $\theta$ that is not a permutation. In this case, the opponent can still produce a set of syndromes $s^{(i)}$, but the corresponding error vectors are no longer permutations of $e^{(0)}$. If $t$ is the weight of $e^{(0)}$, then the ISD algorithm taking $s^{(0)}$ as input is expected to run in time $2^{ct}$. Since all the other syndromes $s^{(i)}$, with $i \geq 1$, are associated with error vectors with weights significantly larger than $t$, applying ISD on them requires a time complexity that is significantly larger than $2^{ct}$. Then, there is no gain in considering this set of multiple instances, since the additional instances (which are produced by the opponent) are associated with an ISD complexity that is significantly larger than that of the original one.
We note that codes of this type may be employed in cryptosystems where codes in compact form are not required to admit efficient decoding. This is the case, for instance, of the HQC KEM [42] and the AGS identification scheme [43]. In both schemes, a code in compact form is needed to obtain a syndrome decoding instance: while in HQC decoding is done with a public and fixed code, in AGS decoding is not involved at all.
Hence, in this type of application, the adoption of reproducible families of codes may be convenient: defeating DOOM would result in the possibility of choosing better parameters for a scheme.

Construction examples
We provide some explicit constructions of reproducible codes that can be advantageous for use in code-based cryptographic schemes, with the aim of illustrating the potential of the introduced theoretical framework.

Quasi-dyadic MDPC codes
Dyadic matrices, which we have already mentioned in Section 3.2, have been used with some measure of success in cryptography, but always in the context of algebraic codes. The first proposal using quasi-dyadic (QD) Goppa codes [1] was cryptanalyzed [26] almost in its entirety. A later proposal based on generalized Srivastava (GS) codes [44] was designed to be more robust against the previous attack and led to one of the NIST submissions for the key exchange functionality, DAGS [45,46]. Nevertheless, the threat of structural attacks is always present, as shown by the recent results of Barelli and Couvreur [47]. On the other hand, using dyadic matrices has undeniable advantages, not only in terms of key reduction but also because it leads to fast and efficient arithmetic (as shown in ref. [48]) while at the same time featuring a reproducible structure which is less "obvious" than that provided by circulant matrices.
The reasons mentioned above are why we believe that designing MDPC codes with a QD structure, i.e., QD-MDPC codes, has potential in cryptography. Dyadic matrices have many good properties (e.g., they are symmetric and orthogonal) and satisfy Theorems 3.9-3.13, which means that the ensemble of dyadic matrices forms a fully-fledged ring (which is also commutative). A formal definition of reproducible codes having such a structure is given below. Constructing a code-based cryptosystem from QD-MDPC codes is actually rather intuitive, since we can follow the guidelines detailed in Section 4.3. However, due to the very same properties we just mentioned, building QD-MDPC codes for cryptographic purposes requires some caution. For example, in the simplest instantiation, one could form a parity-check matrix by selecting just two blocks; however, the resulting public key may still reveal the private key, due to the sparsity of the inverse of a dyadic matrix.
As a consequence, to construct code-based schemes using this particular family of reproducible codes, it is recommended to choose $r_0 \geq 2$ and employ "true" block matrices, with blocks in the ring of dyadic matrices.
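In the binary case, the sparsity of the inverse can be observed directly: in characteristic 2, the square of a dyadic matrix is the identity when its signature has odd weight and the all-zero matrix when the weight is even, so an invertible binary dyadic matrix is its own inverse. The plain-Python sketch below checks this on two hypothetical signatures; it only illustrates the binary case and is not meant as a description of a specific attack.

```python
def dyadic(signature):
    """Return the p x p dyadic matrix with first row `signature`
    (p must be a power of 2); entry (i, j) is signature[i XOR j]."""
    p = len(signature)
    if p == 0 or p & (p - 1):
        raise ValueError("signature length must be a power of 2")
    return [[signature[i ^ j] for j in range(p)] for i in range(p)]


def matmul_gf2(a, b):
    """Multiply two binary matrices (lists of rows) over the binary field."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*b)]
            for row in a]


p = 8
identity = [[int(i == j) for j in range(p)] for i in range(p)]
zero = [[0] * p for _ in range(p)]

odd = dyadic([1, 1, 0, 0, 1, 0, 0, 0])    # signature of weight 3
even = dyadic([1, 1, 0, 0, 0, 0, 0, 0])   # signature of weight 2

odd_is_involution = matmul_gf2(odd, odd) == identity   # inverse = itself
even_is_singular = matmul_gf2(even, even) == zero      # not invertible
```

A sparse invertible binary dyadic block therefore has an equally sparse inverse, which is consistent with the caution expressed above about two-block instantiations.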

Block-wise circulant matrices
As shown in Section 3.2, circulant matrices are a classic special case of reproducible matrices and have already been used in cryptography for quite some time. For a traditional circulant matrix, the signature corresponds to its first row and the set of transformations is [...]

[...] reproducible and quasi-reproducible codes, which are codes described by a generator or a parity-check matrix yielding a compact representation. We have shown that existing and well known families of structured codes are encompassed within this framework, and have provided some concrete constructions of other families of reproducible codes. A direct application of this work is in code-based cryptography, where the representation of a code is commonly used as the public key. As the recent NIST call for the standardization of post-quantum cryptosystems clearly emphasizes, random and pseudo-random codes are of interest for many code-based cryptosystems. In particular, at the current state of the art, many systems rely on the quasi-cyclic structure of codes in order to reduce the public key size. Essentially, all the schemes employing such structured codes can be generalized to the use of reproducible codes, via some of the constructions we have shown in this article. While the compactness of the public key is preserved, advantages come from the fact that attacks targeting the specific quasi-cyclic structure can be avoided when more general code constructions are considered. Although a complete cryptanalysis of these new families of codes requires a deeper investigation, and is out of the scope of this article, these potential benefits motivate the study of reproducible codes as a generalization of quasi-cyclic and other known structured codes.