BY-NC-ND 3.0 license Open Access Published by De Gruyter February 19, 2019

Generic constructions of PoRs from codes and instantiations

Julien Lavauzelle and Françoise Levy-dit-Vehel


In this paper, we show how to construct, from any linear code, a Proof of Retrievability (PoR) which features very low computation complexity on both the client (Verifier) and the server (Prover) sides, as well as small client storage (typically 512 bits). We adapt the security model initiated by Juels and Kaliski [PoRs: Proofs of retrievability for large files, Proceedings of the 2007 ACM Conference on Computer and Communications Security (CCS 2007), ACM, New York 2007, 584–597] to fit into the framework of Paterson, Stinson and Upadhyay [A coding theory foundation for the analysis of general unconditionally secure proof-of-retrievability schemes for cloud storage, J. Math. Cryptol. 7 (2013), no. 3, 183–216], from which our construction evolves. We thus provide a rigorous treatment of the security of our generic design; more precisely, we sharply bound the extraction failure of our protocol according to this security model. Next we instantiate our formal construction with codes built from tensor-products as well as with Reed–Muller codes and lifted codes, yielding PoRs with moderate communication complexity and (server) storage overhead, in addition to the aforementioned features.

MSC 2010: 11T71

1 Introduction

1.1 Motivation

Cloud computing and storage have evolved quite spectacularly over the past decade. In particular, data outsourcing allows users and companies to lighten their storage burden and maintenance cost. However, it raises several issues: for example, how can someone efficiently check that he can retrieve, without any loss, a massive file that he has uploaded to a distant server and erased from his personal system?

Proofs of retrievability (PoRs) address this issue. They are cryptographic protocols involving two parties: a client (or verifier) and a server (or prover). PoRs usually consist of the following phases. First, a key generation process creates secret material related to the file, meant to be kept by the client only. Then the file is initialised, that is, it is encoded and/or encrypted according to the secret data held by the client. This processed file is uploaded to the server. In order to check retrievability, the client can run a verification procedure, which is the core of the PoR. Finally, if the client is convinced that the server still holds his file, the client can proceed at any time to the extraction of the file.

Several parameters must be taken into account. Plainly, the verification process has to feature a low communication complexity, as the main goal is to avoid downloading a large part of the file to only check its extractability. Second, the storage overhead induced by the protocol must be low, as large server overhead would imply high fees for the customer. Third, the computation cost of the verification procedure must be low, both for the client (which is likely to own a lightweight device) and the server (whose computation work could also be expensive for the client).

Notice that proofs of data possession (PDPs) are protocols close to what is needed in PoRs. However, in PDPs, one does not require the client to be able to extract the file from the server. Instances of PDPs are given by Ateniese et al. [2]. Besides, protocols of Lillibridge et al. [8] and Naor and Rothblum [10] are very often seen as precursors of PoRs. For instance, the work of Naor and Rothblum [10] considers a setting in which the client directly accesses the file stored by the prover/server (while the actual PoR definition uses "an arbitrary program as opposed to a simple memory layout and this program may answer these questions in an arbitrary manner" [14]).

1.2 Previous work

Juels and Kaliski [6] gave the first formal definition of PoRs. They also proposed a first construction based on so-called sentinels (namely, random parts of the file to be checked during the verification step) that the client keeps secretly on his device. Additionally, an erasure code ensures the integrity of the file to be extracted. This seminal work also raised several interesting points. On the one hand, it revealed that (i) the client must store secret data to be used in the verification step and (ii) coding is needed in order to retrieve the file without erasures or errors. On the other hand, in Juels and Kaliski's construction, the verification step can only be performed a finite number of times since sentinels cannot be reused endlessly.

As a consequence, Shacham and Waters proposed to consider unbounded-use PoRs in [14], where they built two kinds of PoRs. The first one is based on linear combinations of authenticators produced via pseudo-random functions; its security was proved using cryptographic tools such as unforgeable MAC schemes, semantically secure symmetric encryption and secure PRFs. The second one is a publicly verifiable scheme based on the Diffie–Hellman problem in bilinear groups.

Bowers, Juels and Oprea [3] adopted a coding-theoretic approach (inner code, outer code) to compare variants of the Shacham–Waters and Juels–Kaliski schemes. They focused on the efficiency of the schemes, and proved that, despite bounded use, new variants of the Juels–Kaliski construction are highly competitive compared to other existing schemes.

In [11], Paterson, Stinson and Upadhyay provide a general framework for PoRs in the unconditional security model. They show that retrievability of the file can be expressed as error correction of a so-called response code. That allows them to precisely quantify the extraction success as a function of the success probability of a proving algorithm: indeed, in this setting, extraction can be naturally seen as nearest-neighbour decoding in the response code. They notably apply their framework to prove the security of a modified version of the Shacham–Waters scheme. Also, notice that, prior to [11], Dodis, Vadhan and Wichs [4] proposed another coding-theoretic model for PoRs that allowed them to build efficient bounded-use and unbounded-use PoR schemes.

With practicality in mind, other features have been deployed on PoRs. For instance, Wang et al. [15] presented a PoR construction based on Merkle hash trees, which allows efficient file updates on the server. Their scheme is provably secure under cryptographic assumptions (hardness of Diffie–Hellman in bilinear groups, unforgeable signatures, etc.) and has been improved by Mo, Zhou and Chen [9] in order to prevent unbalanced trees. More recently, other features have been proposed for PoRs, such as multi-prover PoRs (see [12]) or public verifiability (for instance in [13]).

1.3 Our approach

As we remarked before, most PoR schemes rely on two techniques: (i) the client locally stores secret data in order to check the integrity of the file, and (ii) the client encodes the file in order to repair a small number of erasures and errors that could have been missed during the verification step.

In this work, we propose to build PoR schemes using codes that fulfil the two previous goals, when equipped with a suitable family of efficiently computable random permutations. More precisely, our idea is the following. Given a file F, a code 𝒞 and a family of random permutations σ_K, the client sends to the server an encoded and scrambled version σ_K(𝒞(F)) of his file. Then the verification step consists in checking "short" relations among descrambled symbols of w = 𝒞(F), which come, for instance, from low-weight parity-check equations for 𝒞. Moreover, during the extraction step, the code 𝒞 provides the redundancy necessary to repair erasures and potential unnoticed errors.

In the present work, we develop a seminal idea that appeared in [7], where the authors proposed a construction of PoRs based on lifted codes. We here provide a more generic construction and give a deeper analysis of its security.

While our scheme features neither updatability nor public verifiability, we emphasise the genericity of our construction, which is based on well-studied algebraic and combinatorial structures, namely, codes and their parity-check equations. Moreover, since the code 𝒞 is public, the client only needs to store the secret material associated with the random permutations σ_K, which consists of a few bytes. Besides, an honest server simply needs to read pieces of w during the verification step, and therefore has a very low computational burden compared to many other PoR schemes.

1.4 Organisation

Section 2 is devoted to the definition and security model of proofs of retrievability. Despite the great disparity of models in the PoR literature, we try to keep close to the definitions given in [6, 11] for the sake of uniformity.

Section 3 presents our construction of PoR. Precisely, in Section 3.1, we introduce objects called verification structures for a code 𝒞 that will be used in the definition of our PoR scheme (Section 3.2). A rigorous analysis of our scheme is the purpose of the remainder of that section.

The performance of our generic construction is given in Section 4. We then provide several instances in Section 5, proving the practicality of our PoR schemes for some classes of codes.

2 Proofs of retrievability

2.1 Definition of underlying protocols

We recall that, in proofs of retrievability, a user wants to estimate whether a message m can be retrieved from an encoded version w of the message stored on a server. In what follows, the user will be known as the Verifier (who wants to verify the retrievability of the message) while the server is the Prover (who aims at proving the retrievability). The message space is denoted by ℳ while 𝒲, the (server) file space, is the set of encoded versions of the messages. We also denote by 𝒦 the set of secret values (or keys) kept by the Verifier, and by ℛ the space of responses to challenges.

Throughout the paper, the symbols ←_R and ← respectively denote the output of randomised and deterministic algorithms.

Definition 2.1.

A keyed proof of retrievability (PoR) is a tuple of algorithms (KeyGen, Init, Verify, Extract) running as follows:

  1. The key generation algorithm KeyGen generates uniformly at random a key κ ←_R 𝒦. The key κ is secretly kept by the Verifier.

  2. The initialisation algorithm Init is a deterministic algorithm which takes, as input, a message m ∈ ℳ and a key κ ∈ 𝒦, and outputs a file w ∈ 𝒲. Init is run by the Verifier which initially holds the message m. After the process, the file w is sent to the Prover, and the message m is erased on the Verifier's side. Upon receipt of w, the Prover sets a deterministic algorithm P^(w) that will be run during the verification procedure.

  3. The verification algorithm Verify is a randomised algorithm initiated by the Verifier which needs a secret key κ ∈ 𝒦 and interacts with the Prover. Verify is depicted in Figure 1 and works as follows:

    1. the Verifier runs a random query generator that outputs a challenge u ←_R 𝒬 (the set 𝒬 being the so-called query set);

    2. the challenge u is sent to the Prover;

    3. the Prover outputs a response r_u ← P^(w)(u) ∈ ℛ;

    4. the Verifier checks the validity of r_u according to u and κ; the algorithm Verify finally outputs the Boolean value Check(u, r_u, κ).

  4. The extraction algorithm Extract is run by the Verifier. It takes, as input, κ and r = (r_u : u ∈ 𝒬) ∈ ℛ^𝒬 and outputs either a message m′ ∈ ℳ or a failure symbol ⊥. We say that extraction succeeds if Extract(r, κ) = m.

The vector r = (r_u ← P^(w)(u))_{u∈𝒬} ∈ ℛ^𝒬 is called the response word associated to P^(w).

Figure 1. Definition of the algorithm Verify.

Note that, in assuming that the response algorithm P^(w) is deterministic and non-adaptive[1], we follow the work of Paterson, Stinson and Upadhyay [11]. The authors justify determinism of response algorithms by the fact that any probabilistic prover can be replaced by a deterministic prover whose success probability is at least as good as the probabilistic one.

In Definition 2.1, we can see that a deterministic algorithm P^(w) can be represented by the vector of its outputs r = (P^(w)(u))_{u∈𝒬}, called the response word of P^(w). Therefore, we can assume that, before the verification step, the Prover produces a word r^(w) ∈ ℛ^𝒬 related to the file w he holds. In other words, we model provers as algorithms P which, given as input w, return a word r ∈ ℛ^𝒬.

Following [11], we also assume in this paper that the extraction algorithm Extract is deterministic, though, in general, it can be randomised. Finally, notice that proofs of retrievability aim at proving the extractability of a file. The extraction algorithm is therefore a tool to retrieve the whole file. Hence its computational efficiency is not a crucial feature.

Table 1 summarises the information held by each entity after the initialisation step. Table 2 reports the inputs and outputs of the algorithms involved in a PoR.

Table 1. Information held by each entity after the initialisation step.

Table 2. Inputs and outputs of the algorithms involved in a PoR.

          KeyGen   Init    Verify          Check           Extract
Input     1^λ      m, κ    r, κ            u, r_u, κ       r, κ
Output    κ        w       True or False   True or False   m′ or ⊥

2.2 Security models

One should first notice that, despite many efforts, proofs of retrievability lack a general agreement on the definition of their security model. Nevertheless, our definitions remain very close to the ones given in the original work of Juels and Kaliski [6].

For a response word r ∈ ℛ^𝒬 given by the Prover and a key κ ∈ 𝒦 kept by the Verifier, we first define the success of r according to κ as

\[ \mathrm{succ}(r,\kappa) := \Pr\bigl(\mathsf{Verify}(r,\kappa) = \text{True}\bigr), \]

where the probability is taken over the internal randomness of Verify. A first security model can be defined as follows.
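As a minimal sketch of this quantity, assume that the challenge u is drawn uniformly from the query set, so that the success of a response word is simply the fraction of challenges whose response passes Check. The function `succ` and the toy `toy_check` rule below are hypothetical illustrations and are not taken from the paper.

```python
from fractions import Fraction

def succ(queries, response_word, key, check):
    """Fraction of challenges u such that check(u, r_u, key) is True.

    Assumption: the challenge is uniform over `queries`, so this
    fraction equals the acceptance probability of one Verify run.
    """
    accepted = sum(1 for u in queries if check(u, response_word[u], key))
    return Fraction(accepted, len(queries))

# Hypothetical toy rule: the authentic answer to challenge u is (u + key) mod 2.
def toy_check(u, r_u, key):
    return r_u == (u + key) % 2

queries = list(range(10))
key = 1
honest = {u: (u + key) % 2 for u in queries}

# A cheating response word that is wrong on three challenges.
cheating = dict(honest)
for u in (0, 1, 2):
    cheating[u] ^= 1

honest_rate = succ(queries, honest, key, toy_check)
cheating_rate = succ(queries, cheating, key, toy_check)
print(honest_rate, cheating_rate)
```

In the security definitions, the condition succ(r, κ) ≥ 1 − ε then says that at most a fraction ε of the challenges are answered in a way the Verifier rejects.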

Definition 2.2 (Security model, strong version).

Let ε, τ ∈ [0,1]. A proof of retrievability (KeyGen, Init, Verify, Extract) is strongly (ε,τ)-sound if, for every initial file m ∈ ℳ, every uploaded file w ∈ 𝒲 and every prover P: 𝒲 → ℛ^𝒬, we have

\[ \Pr\left( \begin{array}{c} \mathsf{Extract}(r,\kappa) \neq m \\ \mathrm{succ}(r,\kappa) \geq 1-\varepsilon \end{array} \;\middle|\; \begin{array}{l} \kappa \leftarrow_R \mathsf{KeyGen}(1^\lambda) \\ w \leftarrow \mathsf{Init}(m,\kappa) \\ r \leftarrow \mathsf{P}(w) \end{array} \right) \leq \tau, \tag{2.1} \]

the probability being taken over the internal randomness of KeyGen under the constraint that w = Init(m, κ).

A remark concerning parameters ε and τ

In proofs of retrievability, we aim at making the extraction of the desired file m as sure as possible when the audit succeeds. Hence it is desirable to have τ small. On the other hand, the parameter ε measures the rate of unsuccessful audits that leads the Verifier to believe the extraction will fail. Therefore, one does not necessarily need to look for large values of ε, though, in practice, a large ε affords more flexibility, for instance, if communication errors occur between the Prover and the Verifier during the verification procedure.

Definition 2.2 provides a strong security model, in the sense that (i) it does not require any bound on the response algorithms given by the Prover and (ii) the probability in (2.1) is taken over fixed messages m (informally, it means the Prover knows m).

However, keyed proofs of retrievability are usually insecure according to the security model given in Definition 2.2. For instance, in [11], Paterson, Stinson and Upadhyay noticed that in the Shacham–Waters scheme [14], given the knowledge of m and w, an unbounded Prover may be able to

  1. compute (or at least randomly guess) a key κ such that Init(m, κ) = w,

  2. build m′ ≠ m such that Init(m′, κ) = w′,

  3. set P(w′) = r′ which (a) successfully passes every audit and (b) leads to the extraction of m′ ≠ m.

Hence we choose to use a weaker but still realistic security model, where, informally, the Prover only knows what he stores (that is, w) and has no information on the initial message m. The following security model thus conforms to the one given by Paterson, Stinson and Upadhyay [11].

Definition 2.3 (Security model, weak version).

Let ε, τ ∈ [0,1]. A proof of retrievability (KeyGen, Init, Verify, Extract) is weakly (ε,τ)-sound (or simply (ε,τ)-sound) if, for every polynomial-time prover P: 𝒲 → ℛ^𝒬 and every uploaded file w ∈ 𝒲, we have

\[ \Pr\left( \begin{array}{c} \mathsf{Extract}(r,\kappa) \neq m \\ \mathrm{succ}(r,\kappa) \geq 1-\varepsilon \end{array} \;\middle|\; \begin{array}{l} m \leftarrow_R \mathcal{M} \\ \kappa \leftarrow_R \mathsf{KeyGen}(1^\lambda) \\ w \leftarrow \mathsf{Init}(m,\kappa) \\ r \leftarrow \mathsf{P}(w) \end{array} \right) \leq \tau. \tag{2.2} \]

In equation (2.2), the randomness comes from pairs (m, κ) ∈ ℳ × 𝒦 picked uniformly at random among those satisfying w = Init(m, κ).

Since we deal with values of τ very close to 0, we also say that a strongly (ε,τ)-sound PoR admits λ = −log_2(τ) bits of security against ε-adversaries.

Informally, saying that a PoR is not weakly sound amounts to finding a polynomial-time deterministic algorithm P which

  1. takes, as input, a file w ∈ 𝒲 and outputs a response word r ∈ ℛ^𝒬,

  2. makes the extraction fail with non-negligible probability (over messages m and keys κ such that the corresponding response words are successfully audited).

3 Our generic construction

Schematically, in the initialisation phase of our construction, the Verifier

  1. encodes his file according to a code 𝒞,

  2. scrambles the resulting codeword using a tuple of permutations over the base field,

  3. uploads the result to the Prover.

As we explained in the introduction, the verification step then consists in checking that the server is still able to give answers that, once descrambled, satisfy low-weight parity-check equations for 𝒞.

For this purpose, we next introduce objects called verification structures for codes, which will be used in the definition of our generic PoR scheme.

3.1 Verification structures: A tool for our PoR scheme

We here consider 𝔽_q, the finite field with q elements. From well-known coding theory terminology, the support of a word w ∈ 𝔽_q^n is supp(w) := {i ∈ [1,n], w_i ≠ 0}, and its weight is wt(w) := |supp(w)|.

In this work, we need to consider codes whose alphabets are finite-dimensional spaces ℛ over 𝔽_q, typically ℛ = 𝔽_q^s. Precisely, a code 𝒞 of length n over ℛ is a subset of ℛ^n. A code 𝒞 ⊆ ℛ^n is 𝔽_q-linear if 𝒞 is a vector space over 𝔽_q. When ℛ = 𝔽_q, we get the usual definition of linear codes over finite fields. Unless stated otherwise, we only consider 𝔽_q-linear codes, which we simply refer to as codes.

We usually denote by k the dimension over 𝔽_q of a code 𝒞. Its minimum distance d_min(𝒞) is the smallest Hamming distance between two distinct codewords. If n is the length of 𝒞, then d_min(𝒞)/n ∈ [0,1] is the relative minimum distance of the code 𝒞, while k/n represents its rate. If 𝒞 ⊆ 𝔽_q^n, its dual code 𝒞^⊥ is defined as {h ∈ 𝔽_q^n, ∑_{i=1}^n h_i c_i = 0 for all c ∈ 𝒞}. Codewords in 𝒞^⊥ are also called parity-check equations for 𝒞.

Definition 3.1 (Verification structure).

Let 1 ≤ ℓ ≤ n and 𝒞 ⊆ 𝔽_q^n be a code. Let also 𝒬 be a non-empty set of ℓ-subsets of [1,n]. Set ℛ = 𝔽_q^ℓ. We define the restriction map R associated to 𝒬 as

\[ R : \mathcal{Q} \times \mathbb{F}_q^n \to \mathcal{R}, \qquad (u, a) \mapsto a_{|u} = (a_{u_1}, \dots, a_{u_\ell}). \]

Given an integer s ≥ 1 and a map V: 𝒬 × ℛ → 𝔽_q^s, we say that (𝒬,V) is a verification structure for 𝒞 if the following holds:

  1. For all i ∈ [1,n], there exists u ∈ 𝒬 such that i ∈ u.

  2. For all u ∈ 𝒬, the map 𝔽_q^n → 𝔽_q^s given by a ↦ V(u, R(u,a)) is surjective and vanishes on the code 𝒞. Explicitly,

    \[ V(u, R(u,c)) = 0 \quad \text{for all } c \in \mathcal{C}. \]

The map V is then called a verification map for 𝒞, and the set 𝒬 a query set for 𝒞. By convention, for w ∈ 𝔽_q^n and r ∈ ℛ^𝒬, we define

\[ R(w) := \bigl(R(u,w)\bigr)_{u\in\mathcal{Q}} \in \mathcal{R}^{\mathcal{Q}}, \qquad V(r) := \bigl(V(u,r_u)\bigr)_{u\in\mathcal{Q}} \in (\mathbb{F}_q^s)^{\mathcal{Q}}. \]

Finally, the code R(𝒞) := {R(c), c ∈ 𝒞} is called the response code of 𝒞.

Example 3.2 (Fundamental example).

Let 𝒞 be a code, and let ℋ be a set of parity-check equations for 𝒞 of Hamming weight ℓ, whose supports are pairwise distinct. Define the query set 𝒬 = {supp(h), h ∈ ℋ} and, for any u ∈ 𝒬, h(u) to be the unique parity-check equation in ℋ whose support is u. Finally, we define a map V by

\[ V(u, b) := \sum_{j=1}^{\ell} h(u)_{u_j}\, b_j, \qquad u = \{u_1, \dots, u_\ell\} \in \mathcal{Q},\ b \in \mathbb{F}_q^\ell. \]

Notice that we set s = 1 here. By construction, it is clear that (𝒬,V) is a verification structure for 𝒞.

Example 3.3 (Toy example).

Let 𝒞 ⊆ 𝔽_2^7 be a binary Hadamard code of length n = 7 and dimension k = 3. In other words, 𝒞 is defined by a parity-check matrix


According to Example 3.2, we define 𝒬 to be the set of supports of rows of H. In other words,


Then the verification map V: 𝒬 × 𝔽_2^3 → 𝔽_2 can be defined as follows. If u = {u_1, u_2, u_3} ∈ 𝒬 and b ∈ 𝔽_2^u is indexed according to u, then we define

\[ V(u, b) := b_{u_1} + b_{u_2} + b_{u_3}. \]

Now let m = (m_1, m_2, m_3) ∈ 𝔽_2^3. The message m can be encoded into


Hence the word r = R(c) ∈ (𝔽_2^3)^7 is


For each vector-coordinate b ∈ 𝔽_2^3 of r = R(c), one can now check that ∑_j b_j = 0. Hence we get V(R(c)) = 0, as expected.
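The toy example can be replayed numerically. The sketch below fixes one possible coordinate ordering for the length-7 binary Hadamard code, namely c = (m1, m2, m1+m2, m3, m1+m3, m2+m3, m1+m2+m3); this ordering, the generator matrix `GEN` and all function names are assumptions made for illustration and need not match the matrix H used in the paper. The query set is computed as the set of supports of the weight-3 parity checks.

```python
from itertools import combinations, product

N_COORDS = 7
# Assumed generator matrix (rows give the coefficients of m1, m2, m3).
GEN = [
    (1, 0, 1, 0, 1, 0, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

def encode(m):
    """Codeword c = m * GEN over F_2."""
    return tuple(sum(mi * g[j] for mi, g in zip(m, GEN)) % 2
                 for j in range(N_COORDS))

# Column j of GEN, seen as a vector of F_2^3.
COLUMNS = [tuple(g[j] for g in GEN) for j in range(N_COORDS)]

# A 3-subset u is the support of a parity check iff its columns sum to zero.
QUERIES = [u for u in combinations(range(N_COORDS), 3)
           if all(sum(COLUMNS[i][t] for i in u) % 2 == 0 for t in range(3))]

def restrict(u, a):
    """Restriction map R(u, a)."""
    return tuple(a[i] for i in u)

def V(u, b):
    """Verification map: sum of the three queried symbols over F_2."""
    return sum(b) % 2

# Every codeword passes every check.
all_pass = all(V(u, restrict(u, encode(m))) == 0
               for m in product(range(2), repeat=3) for u in QUERIES)

# Flipping one symbol is caught by every query containing that coordinate.
corrupted = list(encode((1, 0, 1)))
corrupted[4] ^= 1
failed_checks = sum(V(u, restrict(u, corrupted)) != 0 for u in QUERIES)
print(len(QUERIES), all_pass, failed_checks)
```

With this ordering, the seven queries are the lines of the Fano plane, and each coordinate belongs to exactly three of them.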

From now on, we denote by N = |𝒬| the length of the response code R(𝒞) of a code 𝒞 equipped with a verification structure (𝒬,V).

3.2 Definition of our PoR scheme

Let (𝒬,V) be a verification structure for 𝒞 ⊆ 𝔽_q^n, and let σ ∈ 𝔖(𝔽_q)^n, where 𝔖(𝔽_q) denotes the set of permutations over 𝔽_q. Any n-tuple of permutations σ = (σ_1, …, σ_n) ∈ 𝔖(𝔽_q)^n naturally acts on c ∈ 𝔽_q^n by

\[ \sigma(c) := (\sigma_1(c_1), \dots, \sigma_n(c_n)), \]

and we define σ(𝒞) = {σ(c), c ∈ 𝒞}. Let finally

\[ V_\sigma : \mathcal{Q}\times\mathcal{R} \to \mathbb{F}_q^s, \qquad (u,y) \mapsto V\bigl(u, \sigma_{|u}^{-1}(y)\bigr), \]

where σ_{|u}^{-1}(y) = (σ_{u_1}^{-1}(y_1), …, σ_{u_ℓ}^{-1}(y_ℓ)). The map V_σ has been defined in order to satisfy

\[ V_\sigma\bigl(u, R(u,\sigma(c))\bigr) = V\bigl(u, R(u,c)\bigr) = 0 \]

for every (c,u) ∈ 𝒞 × 𝒬.

Based on this, our PoR construction is given in Figure 2.
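The action of σ and the descrambled check V_σ can be sketched on a small example. Everything below is a hypothetical toy: the field size `Q = 5`, the two-block parity code in `encode`, and the storage of each σ_i as an explicit lookup table. In the scheme itself the permutations are meant to be efficiently computable from a short key, which this sketch does not model.

```python
import random

Q = 5                                  # toy prime field size, arithmetic mod Q
QUERIES = [(0, 1, 2), (3, 4, 5)]       # supports of two weight-3 parity checks

def encode(m):
    """Assumed toy code: two blocks of three symbols, each summing to 0 mod Q."""
    m0, m1, m2, m3 = m
    return (m0, m1, (-m0 - m1) % Q, m2, m3, (-m2 - m3) % Q)

def keygen(n, rng):
    """One random permutation of F_Q per coordinate, as a lookup table."""
    return [rng.sample(range(Q), Q) for _ in range(n)]

def scramble(sigma, c):
    """sigma(c) = (sigma_1(c_1), ..., sigma_n(c_n))."""
    return tuple(sigma[i][x] for i, x in enumerate(c))

def restrict(u, a):
    return tuple(a[i] for i in u)

def V(u, b):
    return sum(b) % Q

def V_sigma(sigma, u, y):
    """Undo the permutations on the queried coordinates, then apply V."""
    unscrambled = tuple(sigma[i].index(v) for i, v in zip(u, y))
    return V(u, unscrambled)

rng = random.Random(2019)
sigma = keygen(6, rng)
w = scramble(sigma, encode((1, 4, 2, 3)))          # file stored by the Prover

honest_ok = all(V_sigma(sigma, u, restrict(u, w)) == 0 for u in QUERIES)

# Changing a single stored symbol breaks the check on the query containing it.
tampered = list(w)
tampered[0] = (tampered[0] + 1) % Q
tampered_first = V_sigma(sigma, QUERIES[0], restrict(QUERIES[0], tampered))
tampered_second = V_sigma(sigma, QUERIES[1], restrict(QUERIES[1], tampered))
print(honest_ok, tampered_first != 0, tampered_second == 0)
```

A single-symbol change is always detected here because each σ_i is a bijection; the interesting case for the security analysis is a Prover changing several symbols of one response without knowing σ.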

Figure 2. Definition of our PoR scheme.

Figure 3. Our extraction procedure Extract(r, σ).

3.3 Analysis

3.3.1 Preliminary results

We first give results concerning verification structures and response codes. The following two lemmata are straightforward to prove.

Lemma 3.4.

Let (𝒬,V) be a verification structure for a code 𝒞 ⊆ 𝔽_q^n. Then (𝒬,V_σ) is a verification structure for σ(𝒞).

Lemma 3.5.

Let 𝒬 be any query set for a code 𝒞 ⊆ 𝔽_q^n whose elements have cardinality ℓ ≥ 1. Then its response code R(𝒞) is an 𝔽_q-linear code over the alphabet ℛ ≃ 𝔽_q^ℓ.

Remark 3.6.

By considering σ(𝒞) instead of 𝒞, we lose the 𝔽_q-linearity, but one can check that verification structures still make sense and provide the result claimed in Lemma 3.4.

The next result states that the map 𝒞 ↦ σ(𝒞) does not modify the distance between codewords.

Lemma 3.7.

Let 𝒞 ⊆ 𝔽_q^n be a linear code, (𝒬,V) a verification structure for 𝒞, and σ ∈ 𝔖(𝔽_q)^n. Then it holds that

  1. the distributions of distances in 𝒞 and σ(𝒞) are the same,

  2. the distributions of distances in R(𝒞) and R(σ(𝒞)) are the same.

Proof.

Since every σ_i is one-to-one, for any c, c′ ∈ 𝒞, we get

\[ d(c,c') = |\{i\in[1,n],\ c_i \neq c'_i\}| = |\{i\in[1,n],\ \sigma_i(c_i) \neq \sigma_i(c'_i)\}| = d(\sigma(c),\sigma(c')). \]

The proof for response codes relies on the same argument. ∎

Note that these results imply that, if 𝒞 is linear, then the minimum distance of R(σ(𝒞)) is the minimum weight of R(𝒞).
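The argument can be checked empirically with a short sketch. The parameters `q = 7`, `n = 10`, the seed and the use of lookup tables for the σ_i are arbitrary assumptions; the words are random vectors rather than codewords, since the argument only uses the fact that each σ_i is a bijection.

```python
import random

def hamming(a, b):
    """Number of coordinates where a and b differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(0)
q, n = 7, 10

# One random permutation of {0, ..., q-1} per coordinate.
sigma = [rng.sample(range(q), q) for _ in range(n)]

def scramble(c):
    return tuple(sigma[i][x] for i, x in enumerate(c))

# Compare distances before and after scrambling on random pairs of words.
preserved = True
for _ in range(200):
    c1 = tuple(rng.randrange(q) for _ in range(n))
    c2 = tuple(rng.randrange(q) for _ in range(n))
    if hamming(c1, c2) != hamming(scramble(c1), scramble(c2)):
        preserved = False
print(preserved)
```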

Definition 3.8.

Let ε ∈ [0,1] and (𝒬,V) be a verification structure for a code 𝒞 ⊆ 𝔽_q^n. We say r ∈ ℛ^𝒬 is ε-close to (𝒬,V) if

\[ \mathrm{wt}(V(r)) := |\{u\in\mathcal{Q},\ V(u,r_u) \neq 0\}| \leq \varepsilon N. \]

Let now c ∈ 𝒞 and β ∈ [0,1]. We say that r ∈ ℛ^𝒬 is a β-liar for (𝒬,V,c) if

\[ |\{u\in\mathcal{Q},\ V(u,r_u) = 0 \text{ and } r_u \neq R(u,c)\}| \leq \beta N. \]

Bounded-distance error-and-erasure decoder

Let 𝒜 ⊆ 𝔽_q^n be any code of minimum distance d, and let a ∈ 𝒜 be corrupted with b errors and e erasures, resulting in a word r′ ∈ (𝔽_q ∪ {⊥})^n. Then it is well known that, as long as 2b + e < d, it is possible to retrieve a from r′ thanks to a so-called bounded-distance error-and-erasure decoding algorithm. This is precisely the decoding algorithm that we employ in Figure 3 on the code 𝒜 = R(𝒞).
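To make the condition 2b + e < d concrete, here is a brute-force sketch of such a decoder on a small binary code of length 7 and minimum distance 4. The generator matrix `GEN` is an assumed toy instance, the exhaustive search is only viable for tiny codes, and the scheme itself applies the decoder to the response code R(𝒞) rather than to a code over 𝔽_2.

```python
from itertools import product

ERASED = None   # stands for the erasure symbol

# Assumed toy code: binary [7, 3] code with minimum distance 4.
GEN = [
    (1, 0, 1, 0, 1, 0, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]
CODEWORDS = [tuple(sum(mi * g[j] for mi, g in zip(m, GEN)) % 2 for j in range(7))
             for m in product(range(2), repeat=3)]
D_MIN = min(sum(c) for c in CODEWORDS if any(c))   # linear code: min weight

def decode(received, codewords, d):
    """Return the codeword with 2*errors + erasures < d, or None.

    If such a codeword exists it is unique: two candidates would be at
    distance at most b + b' + e < d from each other.
    """
    e = sum(1 for x in received if x is ERASED)
    for c in codewords:
        b = sum(1 for x, y in zip(received, c) if x is not ERASED and x != y)
        if 2 * b + e < d:
            return c
    return None

sent = (1, 0, 1, 1, 0, 1, 0)                       # encoding of m = (1, 0, 1)
one_error_one_erasure = (0, 0, 1, ERASED, 0, 1, 0)  # 2*1 + 1 = 3 < 4
two_errors = (0, 1, 1, 1, 0, 1, 0)                  # 2*2 + 0 = 4, not < 4

decoded = decode(one_error_one_erasure, CODEWORDS, D_MIN)
undecodable = decode(two_errors, CODEWORDS, D_MIN)
print(D_MIN, decoded == sent, undecodable)
```

The second word is outside the decoding radius of every codeword, so the decoder reports a failure instead of guessing.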

Our framework allows us to reformulate the extraction success in terms of a probability to decode corrupted codewords. More precisely:

Proposition 3.9.

Let σ ∈ 𝔖(𝔽_q)^n, m ∈ 𝔽_q^k, and denote by d the minimum distance of R(𝒞) of length N. Let also r ∈ ℛ^𝒬 be the response word, output of a proving algorithm P taking w = σ(𝒞(m)) as input. Finally, assume that r is ε-close to (𝒬,V_σ) and a β-liar for (𝒬,V_σ,w), with (ε + 2β)N < d. Then Extract(r, σ) = m, where Extract(r, σ) is defined in Figure 3.

Proof.

Recall that r′ ∈ (ℛ ∪ {⊥})^𝒬 represents the word we get from r after step (ii) of the algorithm given in Figure 3. Let us now translate our assumptions on r in coding-theoretic terminology:

  1. r is ε-close to (𝒬,V_σ) means that there are at most εN challenges u ∈ 𝒬 for which we know that the coordinate r′_u is not authentic. This justifies that we assign erasure symbols to these coordinates.

  2. r is a β-liar for (𝒬,V_σ,w) means that there are at most βN other corrupted values r′_u, but we cannot identify them. Therefore, we can treat these coordinates as errors.

To sum up, we see r′ as a corruption of R(𝒞(m)) with at most εN erasures and at most βN errors, where N = |𝒬|. Since we assume that (ε + 2β)N < d, we know from the previous discussion that the decoding succeeds in retrieving m. ∎

3.3.2 Bounding the extraction failure

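According to Definition 2.3, our PoR scheme is weakly (ε,τ)-sound if, for every polynomial-time algorithm P outputting a response word r^(w) from a file w, we have

\[ \Pr\left( \begin{array}{c} \mathsf{Extract}(r^{(w)},\sigma) \neq m \\ \mathrm{succ}(r^{(w)},\sigma) \geq 1-\varepsilon \end{array} \;\middle|\; \begin{array}{l} m \leftarrow_R \mathbb{F}_q^k \\ \sigma \leftarrow_R \mathfrak{S}(\mathbb{F}_q)^n \\ w \leftarrow \sigma(\mathcal{C}(m)) \\ r^{(w)} \leftarrow \mathsf{P}(w) \end{array} \right) \leq \tau. \]

Using Proposition 3.9, the security analysis of our PoR scheme reduces to measuring the ability of the Prover to produce a response word r which is ε-close to (𝒬,V_σ) and a β-liar for (𝒬,V_σ,w), with (ε + 2β)N ≥ d.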

For fixed r ∈ ℛ^𝒬, σ ∈ 𝔖(𝔽_q)^n and w = σ(𝒞(m)) the authentic file given to the prover, we define three subsets of 𝒬:

  1. 𝒟(r,w) := {u ∈ 𝒬, r_u ≠ R(w)_u} and D(r,w) := |𝒟(r,w)| = wt(r − R(w)). This represents challenges u on which the response word r differs from the authentic one R(w).

  2. ℰ(r,σ) := {u ∈ 𝒬, V_σ(u,r_u) ≠ 0} and E(r,σ) := |ℰ(r,σ)| = wt(V_σ(r)). These are challenges u on which the associated coordinate r_u is not accepted by the verification map (it corresponds to erasures in the decoding process).

  3. ℬ(r,σ,w) := {u ∈ 𝒬, r_u ≠ R(w)_u and V_σ(u,r_u) = 0} and B(r,σ,w) := |ℬ(r,σ,w)|. These are the challenges u on which the associated coordinate r_u is accepted by the verification map, but differs from the authentic response R(w)_u (it corresponds to errors in the decoding process).

One can easily check that, for every σ, the sets ℰ(r,σ) and ℬ(r,σ,w) define a partition of 𝒟(r,w). The probability of extraction failure can thus be written as
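The partition can be illustrated with a few lines of code. The sketch assumes a binary toy code whose seven queries are hard-coded in `QUERIES`, and it takes σ to be the identity so that V_σ reduces to a plain parity check; both choices are simplifications made for illustration only.

```python
# Assumed toy query set: supports of the weight-3 parity checks of a
# binary [7, 3] code (the lines of the Fano plane, 0-indexed).
QUERIES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
           (1, 4, 6), (2, 3, 6), (2, 4, 5)]

def accepts(u, r_u):
    """Verification map with identity scrambling: parity must be zero."""
    return sum(r_u) % 2 == 0

def split_disagreements(r, authentic):
    """Return the sets D (differs), E (rejected) and B (accepted but wrong)."""
    D = {u for u in QUERIES if r[u] != authentic[u]}
    E = {u for u in QUERIES if not accepts(u, r[u])}
    B = {u for u in D if accepts(u, r[u])}
    return D, E, B

w = (1, 0, 1, 1, 0, 1, 0)                      # an authentic codeword
authentic = {u: tuple(w[i] for i in u) for u in QUERIES}

r = dict(authentic)
r[(0, 1, 2)] = (0, 0, 1)   # one symbol changed: parity is odd, rejected
r[(0, 3, 4)] = (0, 0, 0)   # two symbols changed: parity is even, accepted

D, E, B = split_disagreements(r, authentic)
print(sorted(D), sorted(E), sorted(B))
```

Rejected coordinates become erasures and accepted-but-wrong coordinates become errors, which is why the decoding condition involves E + 2B.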


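For w ∈ 𝔽_q^n, let us define the set of admissible permutations and messages

\[ \Phi_w := \{(\sigma, m) \in \mathfrak{S}(\mathbb{F}_q)^n \times \mathbb{F}_q^k,\ w = \sigma(\mathcal{C}(m))\}, \]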
so that equation (3.1) rewrites


Later on, we will use the notation Pr_{Φ_w} to refer to the fact that (σ,m) is uniformly drawn from Φ_w. Similarly, we will use the notation 𝔼_{Φ_w} for the expectation and Var_{Φ_w} for the variance.
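As a hedged sketch of how Proposition 3.9 and the partition of 𝒟(r,w) combine (our reading of the reduction, not a quotation of equation (3.1)), assume that the challenge is uniform over 𝒬, so that succ(r,σ) = 1 − E(r,σ)/N, and that the decoder of Figure 3 succeeds whenever E + 2B < d:

```latex
\begin{align*}
\{\mathsf{Extract}(r,\sigma) \neq m\}
  &\subseteq \{E(r,\sigma) + 2B(r,\sigma,w) \geq d\}, \\
\{\mathrm{succ}(r,\sigma) \geq 1-\varepsilon\}
  &= \{E(r,\sigma) \leq \varepsilon N\}, \\
B(r,\sigma,w) &= D(r,w) - E(r,\sigma),
\end{align*}
so that
\[
\Pr_{\Phi_w}\bigl(\mathsf{Extract}(r,\sigma) \neq m
   \ \text{and}\ \mathrm{succ}(r,\sigma) \geq 1-\varepsilon\bigr)
\;\leq\;
\Pr_{\Phi_w}\bigl(E(r,\sigma) \leq \min\{\varepsilon N,\ 2D(r,w) - d\}\bigr).
\]
```

The last event only involves E(r,σ) and the deterministic quantity D(r,w), which is what makes a concentration argument on E(r,σ) possible.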

Given r ∈ ℛ^𝒬, we also define

\[ \alpha(r,w) := \max_{u \in \mathcal{D}(r,w)} \Pr_{\Phi_w}\bigl(V_\sigma(u, r_u) = 0\bigr), \]

and α := max_{(r,w)} α(r,w), where (r,w) are such that D(r,w) ≠ 0. The parameter α ∈ (0,1) is called the bias of the verification structure (𝒬,V) for 𝒞. It corresponds to the maximum probability that a response is accepted but not authentic.
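For very small parameters, the acceptance probability of a non-authentic response can be computed by exhaustive enumeration of the admissible permutations. The sketch below uses a hypothetical toy setting: the single parity-check code {c ∈ 𝔽_3^3 : c_0 + c_1 + c_2 = 0} with one query covering all three coordinates. It is only meant to illustrate the definition, not the codes used later in the paper.

```python
from fractions import Fraction
from itertools import permutations, product

Q = 3
PERMS = list(permutations(range(Q)))     # the 6 permutations of F_3

def in_code(c):
    """Toy code C = {c in F_3^3 : c_0 + c_1 + c_2 = 0}; also the check V."""
    return sum(c) % Q == 0

def unscramble(sigma, y):
    """Apply sigma_i^{-1} coordinatewise; sigma_i[x] is the image of x."""
    return tuple(s.index(v) for s, v in zip(sigma, y))

def acceptance_probability(r, w):
    """Probability over admissible sigma that the response r passes the check.

    sigma is admissible for w when the unscrambled file is a codeword.
    """
    admissible = [s for s in product(PERMS, repeat=3)
                  if in_code(unscramble(s, w))]
    accepted = sum(1 for s in admissible if in_code(unscramble(s, r)))
    return Fraction(accepted, len(admissible))

w = (0, 0, 0)
p_one = acceptance_probability((1, 0, 0), w)     # one symbol modified
p_two = acceptance_probability((1, 1, 0), w)     # two symbols modified
p_three = acceptance_probability((1, 1, 1), w)   # three symbols modified

# Largest acceptance probability over all responses different from w.
bias = max(acceptance_probability(r, w)
           for r in product(range(Q), repeat=3) if r != w)
print(p_one, p_two, p_three, bias)
```

In this toy setting a single modified symbol is always rejected, while modifying two symbols is accepted half of the time, so the bias is large; larger fields and heavier checks are what drive the bias down.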

Lemma 3.10.

For all r ∈ ℛ^𝒬 and w ∈ 𝔽_q^n, we have

\[ \mathbb{E}_{\Phi_w}\bigl(E(r,\sigma)\bigr) \geq (1-\alpha)\, D(r,w). \]

Proof.

A simple computation shows

\[ \mathbb{E}_{\Phi_w}\bigl(E(r,\sigma)\bigr) = \mathbb{E}_{\Phi_w}\Bigl(\sum_{u\in\mathcal{D}(r,w)} \mathbb{1}_{V_\sigma(u,r_u)\neq 0}\Bigr) = \sum_{u\in\mathcal{D}(r,w)} \Pr_{\Phi_w}\bigl(V_\sigma(u,r_u)\neq 0\bigr) \geq \sum_{u\in\mathcal{D}(r,w)} (1-\alpha) = (1-\alpha)\, D(r,w). \qquad ∎ \]

Lemma 3.10 essentially means that, if an adversary to our PoR scheme wants its response word to be (on average) ε-close to the verification structure, then he should modify at most D(r,w) ≤ εN/(1−α) responses. Below, we take advantage of this result, and we measure the probability of an extraction failure.

First, for δ, ε ∈ (0,1), let

\[ p(r,w;\varepsilon,\delta) := \Pr_{\Phi_w}\bigl(E(r,\sigma) \leq \min\{\varepsilon N,\ 2D(r,w) - \delta N\}\bigr). \]

The probability p(r,w;ε,δ) represents the probability that the extraction fails for a response code of relative distance δ and an adversarial response word r associated to w, which is ε-close to the verification structure. Let us bound p(r,w;ε,δ).

Proposition 3.11.

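Let δ, ε ∈ (0,1) be such that δ(1−α)/(1+α) > ε. Let also r ∈ ℛ^𝒬 and w ∈ 𝔽_q^n. Then we have

\[ p(r,w;\varepsilon,\delta) \leq \frac{4\,\mathrm{Var}_{\Phi_w}\bigl(E(r,\sigma)\bigr)}{\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)^2 N^2}. \]

Proof.

We distinguish three cases.

(i) 2D(r,w) − δN < 0. The event E(r,σ) ≤ min{εN, 2D(r,w) − δN} never occurs since E(r,σ) ≥ 0. Hence p(r,w;ε,δ) = 0.

(ii) εN ≤ 2D(r,w) − δN. Then D(r,w) ≥ (ε+δ)N/2 and, by Lemma 3.10, the inequality E(r,σ) ≤ εN implies

\[ E(r,\sigma) - \mathbb{E}_{\Phi_w}\bigl(E(r,\sigma)\bigr) \leq \varepsilon N - (1-\alpha) D(r,w) \leq -\frac{N}{2}\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr). \]

Hence, using Chebyshev's inequality,

\[ p(r,w;\varepsilon,\delta) \leq \Pr_{\Phi_w}\Bigl(\bigl|E(r,\sigma) - \mathbb{E}_{\Phi_w}(E(r,\sigma))\bigr| \geq \frac{N}{2}\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)\Bigr) \leq \frac{4\,\mathrm{Var}_{\Phi_w}\bigl(E(r,\sigma)\bigr)}{\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)^2 N^2}. \]

(iii) 0 ≤ 2D(r,w) − δN < εN. In this case, D(r,w) < (ε+δ)N/2, and E(r,σ) ≤ 2D(r,w) − δN implies

\[ E(r,\sigma) - \mathbb{E}_{\Phi_w}\bigl(E(r,\sigma)\bigr) \leq (1+\alpha) D(r,w) - \delta N < -\frac{N}{2}\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr). \]

Therefore, similarly to the previous case, we obtain the claimed result. ∎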

For any u ∈ 𝒟(r,w), denote by X_u the {0,1}-random variable 𝟙_{V_σ(u,r_u)=0} when σ is uniformly drawn from Φ_w. It holds that E(r,σ) = ∑_{u∈𝒟(r,w)} (1 − X_u).

Recall that two real random variables Y, Z are uncorrelated if 𝔼(YZ) = 𝔼(Y)𝔼(Z). For instance, two independent random variables are uncorrelated.

Lemma 3.12.

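Let r ∈ ℛ^𝒬 and w ∈ 𝔽_q^n. If the random variables {X_u}_{u∈𝒟(r,w)} are pairwise uncorrelated, then

\[ \mathrm{Var}_{\Phi_w}\bigl(E(r,\sigma)\bigr) \leq D(r,w). \]

Proof.

By assumption, {X_u}_{u∈𝒟(r,w)} are pairwise uncorrelated; hence

\[ \mathrm{Var}_{\Phi_w}\bigl(E(r,\sigma)\bigr) = \mathrm{Var}_{\Phi_w}\Bigl(\sum_{u\in\mathcal{D}(r,w)} (1 - X_u)\Bigr) = \sum_{u\in\mathcal{D}(r,w)} \mathrm{Var}_{\Phi_w}(1 - X_u). \]

The trivial bound Var_{Φ_w}(1 − X_u) ≤ 1 gives the result. ∎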

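As a corollary of Proposition 3.11 and Lemma 3.12, under the same hypotheses and assuming δ(1−α)/(1+α) > ε, we get

\[ p(r,w;\varepsilon,\delta) \leq \frac{4\,D(r,w)}{\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)^2 N^2} \leq \frac{4}{\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)^2 N}, \]

since D(r,w) ≤ N. Moreover, if lim_{N→∞} δ > 0 and lim_{N→∞} α = 0, then p(r,w;ε,δ) = 𝒪(1/N).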

Therefore, we end up with the following theorem.

Theorem 3.13.

Let (𝒬,V) be a verification structure for 𝒞 with bias α. Let N = |𝒬|, and let δ = d_min(R(𝒞))/N be the relative distance of the associated response code. Finally, assume that, for any r ∈ ℛ^𝒬 and any w ∈ 𝔽_q^n, the variables {X_u}_{u∈𝒟(r,w)} are pairwise uncorrelated. Then, for any ε < δ(1−α)/(1+α), the PoR scheme associated to 𝒞 and (𝒬,V) is (ε,τ)-sound, where

\[ \tau = \frac{4}{\bigl((1-\alpha)\delta - (1+\alpha)\varepsilon\bigr)^2 N}. \]

For asymptotically small α, a code 𝒞 equipped with a verification structure satisfying the conditions of Theorem 3.13 thus gives an (ε,τ)-sound PoR scheme for every ε < (1+o(1))δ and τ = 𝒪(1/N).
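To get a feeling for the orders of magnitude, the following sketch evaluates a soundness bound numerically. It assumes that the bound takes the Chebyshev form τ = 4 / (((1−α)δ − (1+α)ε)² N), which is what the case analysis of Proposition 3.11 combined with Var ≤ D(r,w) ≤ N yields; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def soundness_tau(N, delta, alpha, eps):
    """Assumed Chebyshev-type bound on the extraction failure probability."""
    margin = (1 - alpha) * delta - (1 + alpha) * eps
    if margin <= 0:
        raise ValueError("need eps < delta * (1 - alpha) / (1 + alpha)")
    return 4 / (margin ** 2 * N)

def security_bits(tau):
    """Bits of security, lambda = -log2(tau)."""
    return -math.log2(tau)

# Hypothetical parameters: one million queries, relative distance 1/2,
# bias 1%, and 10% of failed audits tolerated.
N, delta, alpha, eps = 10 ** 6, 0.5, 0.01, 0.1
tau = soundness_tau(N, delta, alpha, eps)
tau_double = soundness_tau(2 * N, delta, alpha, eps)
print(tau, security_bits(tau), tau_double)
```

Because the bound decays only like 1/N, doubling the number of queries gains a single bit of security under this analysis.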

According to Theorem 3.13, we thus need to look for (sequences of) codes ๐’ž and associated verification structures (๐’ฌ,V) such that

  1. the response code Rโข(๐’ž) admits a good relative distance ฮด=dminโข(Rโข(๐’ž))/N,

  2. the bias ฮฑ is small,

  3. random variables {Xu}uโˆˆ๐’Ÿโข(r,w) are pairwise uncorrelated.

Sections 3.4 and 3.5 characterise conditions under which the last two points are fulfilled. Then, in Section 5, we discuss which response codes can achieve good relative distance.

3.4 Estimating $\alpha$

In this section, we prove that, assuming $\Phi_w$ approximates the uniform distribution over $\mathfrak{S}(\mathbb{F}_q)^n$ in a sense that we make precise later, the bias $\alpha$ can be bounded according to parameters of the verification structure.

Let us fix $r\in\mathcal{R}^{\mathcal{Q}}$, $w\in\mathbb{F}_q^n$ and $u\in\mathcal{Q}$. We recall that $\alpha$ is defined by


where randomness comes from $\sigma\leftarrow_R\Phi_w=\{(\sigma,m)\in\mathfrak{S}(\mathbb{F}_q)^n\times\mathbb{F}_q^k,\ w=\sigma(\mathcal{C}(m))\}$. We notice that this is equivalent to writing $\sigma\leftarrow_R\{\sigma\in\mathfrak{S}(\mathbb{F}_q)^n,\ \sigma^{-1}(w)\in\mathcal{C}\}$.

For convenience, we will view $r_u\in\mathcal{R}=\mathbb{F}_q^{\ell}$ as a vector indexed by $u=(u_1,\dots,u_\ell)$, so that we can easily denote by $r_u[u_j]\in\mathbb{F}_q$ its $j$-th coordinate, $1\le j\le\ell$. We define the code $K_u:=\ker V(u,\cdot)\subseteq\mathbb{F}_q^{\ell}$, and up to re-indexing coordinates, $\mathcal{C}|_u\subseteq K_u$. This allows us to write that, for every $\sigma$, we have $V_\sigma(u,r_u)=0$ if and only if $\sigma_u^{-1}(r_u)\in K_u$. Finally, we denote by $Z_u:=\{i\in u,\ r_u[i]\ne R(w)_u[i]\}$ the set of coordinates of $r_u$ that are not authentic.

Let $Y_u(\sigma)$ represent the event “$\sigma_u^{-1}(r_u)\in K_u \mid \mathrm{supp}(\sigma_u^{-1}(r_u))=Z_u$”. Informally, the reason why we consider an event $Y_u(\sigma)$ conditioned by $\mathrm{supp}(\sigma_u^{-1}(r_u))=Z_u$ is that the $\mathsf{Prover}$ is free to choose any support $Z_u$ on which he can modify the original file. More formally, this constraint will help us to bound the probability $\Pr_{\Phi_w}(V_\sigma(u,r_u)=0)$ in Lemma 3.14. We say that $\Phi_w$ is sufficiently uniform if, for every $u\in\mathcal{Q}$, we have


when the file size $n\log q\to\infty$. In other words, $\Phi_w$ is sufficiently uniform if it is a good approximation of the whole set of $n$-tuples of permutations, when considering the probability that $Y_u(\sigma)$ happens.

Lemma 3.14.

Let $r$, $w$, $u$ and $Z_u$ be defined as above. Let also $A_u=|\{x\in K_u,\ \mathrm{supp}(x)=Z_u\}|$. Then



For every $\sigma$ such that $(\sigma,m)\in\Phi_w$, we know that $\sigma_u^{-1}(R(w)_u)\in K_u$, and we recall that $V_\sigma(u,r_u)=0$ if and only if $\sigma_u^{-1}(r_u)\in K_u$. Since $K_u$ is linear, and up to considering $\sigma_u^{-1}(R(w)_u-r_u)$ instead, we can assume without loss of generality that $\sigma_u^{-1}(r_u)[i]=0$ for every $i\in u\setminus Z_u$. In other words, we assume that $\mathrm{supp}(\sigma_u^{-1}(r_u))=Z_u$.

Remark that


since $A_u$ counts the number of codewords in $K_u$ whose support is $Z_u$.

Therefore, we get


Lemma 3.15.

Let $S_u$ be the $\mathbb{F}_q$-vector space $\langle\{x\in K_u,\ \mathrm{supp}(x)=Z_u\}\rangle$, and assume that $S_u\ne\{0\}$. We have



We prove that, if $A_u>q^e$ for some integer $e\ge 0$, then $d_{\min}(S_u)\le|Z_u|-e$, which clearly implies our result. If $A_u>q^e$, then $\dim S_u>e$ since $|S_u|\ge A_u$. The Singleton bound then provides


Finally, we get the following upper bound on ฮฑ.

Proposition 3.16.

Let $\Delta=\min\{d_{\min}(K_u),\ u\in\mathcal{Q}\}$. Then


where $\gamma=\max_u\gamma_u$.


Remark that $S_u$, defined in the previous lemma, is a subcode of $K_u$ shortened on $u\setminus Z_u$. Hence


and we can apply the previous results and obtain the desired bound


where $\gamma=\max_u\gamma_u$. ∎

If every $\Phi_w$ is sufficiently uniform, then, by definition, we have $\gamma=o(1)$ when the file size $n\log q\to\infty$. This assumption is significant since we want a small bias $\alpha$, which is closely linked to the soundness of $\mathsf{PoR}$s (see Theorem 3.13). In Appendix A, we present experimental estimates of $\alpha$ which support the assumption that $\Phi_w$ is sufficiently uniform.

3.5 Pairwise uncorrelation of $\{X_u\}_{u\in\mathcal{D}}$

This section is devoted to proving that the variables $\{X_u\}_{u\in\mathcal{D}(r,w)}$ are pairwise uncorrelated if the supports of the challenges $u\in\mathcal{D}(r,w)$ have small pairwise intersection. For this purpose, let us recall that, for fixed $r\in\mathcal{R}^{\mathcal{Q}}$, $w$ and $u\in\mathcal{D}(r,w)$, the random variable $X_u$ represents $\mathbb{1}_{V_\sigma(u,r_u)=0}$ when $\sigma$ is uniformly picked in $\Phi_w$.

We first state a technical lemma that will be useful to prove Proposition 3.18 below. For clarity, we denote by $d^{\perp}(\mathcal{C})$ the minimum distance of the dual code $\mathcal{C}^{\perp}$ of a linear code $\mathcal{C}$.

Lemma 3.17.

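Let $\mathcal{C}\subseteq\mathbb{F}_q^n$ be a linear code and $T\subset[1,n]$, $|T|=t$, where $t<d^{\perp}(\mathcal{C})$. For $a\in\mathbb{F}_q^T$, we define

$$\mathcal{V}_a:=\{v\in\mathcal{C},\ v|_T=a\}\quad\text{and}\quad N_a:=|\mathcal{V}_a|.$$

  1. $\mathcal{V}_0=\{v\in\mathcal{C},\ v|_T=0\}$ is a linear subcode of $\mathcal{C}$;

  2. for every non-zero $a\in\mathbb{F}_q^T$, there exists a non-zero $c^{(a)}\in\mathcal{C}$ such that $\mathcal{V}_a=\mathcal{V}_0+\{c^{(a)}\}$;

  3. for every $a\in\mathbb{F}_q^T$, $N_a=q^{k-t}$, where $k=\dim\mathcal{C}$.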


(i) The set $\mathcal{V}_0=\{v\in\mathcal{C},\ v|_T=0\}$ matches the well-known definition of the shortening of a code. It is easy to prove that it defines a linear code.

(ii) Let $a\in\mathbb{F}_q^T$ be non-zero, and let us first prove that there exists $c^{(a)}\in\mathcal{C}$ such that $c^{(a)}|_T=a$. If it were not the case, then, by definition, we would have $\mathcal{C}|_T\ne\mathbb{F}_q^t$. But this is impossible since $\mathcal{C}^{\perp}$ contains no non-zero codeword of weight at most $t$. It is then easy to check that $\mathcal{V}_a=\mathcal{V}_0+\{c^{(a)}\}$.

(iii) First notice that $\mathcal{V}_a\cap\mathcal{V}_b=\emptyset$ if $a\ne b$. Since


we get the expected result. ∎
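The counting statement of Lemma 3.17 can be checked by brute force on a small code. The sketch below is only an illustration and rests on two assumptions that are not part of the text: it reads $\mathcal{V}_a$ as the set of codewords whose restriction to $T$ equals $a$ (consistent with item 1 of the lemma), and it uses the binary $[7,4]$ Hamming code, whose dual has minimum distance 4, as a toy example. The helper names `span` and `fiber_sizes` are hypothetical.

```python
from itertools import product

def span(gen_rows, q=2):
    """All codewords generated by gen_rows over the prime field F_q."""
    n = len(gen_rows[0])
    return {
        tuple(sum(a * row[j] for a, row in zip(coeffs, gen_rows)) % q
              for j in range(n))
        for coeffs in product(range(q), repeat=len(gen_rows))
    }

def fiber_sizes(code, T, q=2):
    """Map each pattern a in F_q^T to N_a = #{v in code : v restricted to T is a}."""
    sizes = {a: 0 for a in product(range(q), repeat=len(T))}
    for v in code:
        sizes[tuple(v[i] for i in T)] += 1
    return sizes

# Toy example (assumption): binary [7,4] Hamming code in systematic form.
# Its dual is the [7,3,4] simplex code, so d_perp = 4 and k = 4.
HAMMING_7_4 = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

code = span(HAMMING_7_4)
# |T| = 3 < d_perp: every fiber should have q^(k-t) = 2 elements.
print(fiber_sizes(code, (0, 3, 5)))
# |T| = 4 on the support of a dual codeword: some fibers are empty.
print(fiber_sizes(code, (0, 1, 3, 4)))
```

The second call shows why the hypothesis $t<d^{\perp}(\mathcal{C})$ matters: once $T$ contains the support of a dual codeword, the restriction $\mathcal{C}|_T$ is no longer the whole space.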

Proposition 3.18.

If $\max\{|u\cap v|,\ u\ne v\in\mathcal{Q}\}<\min\{d^{\perp}(\mathcal{C}|_u),\ u\in\mathcal{Q}\}$, then the random variables $\{X_u\}_{u\in\mathcal{Q}}$ are pairwise uncorrelated.


Recall that $K_u:=\ker V(u,\cdot)$ and that, by definition of a verification structure, we have $\mathcal{C}|_u\subseteq K_u$. For $u\ne v\in\mathcal{Q}$, let us prove that $\mathbb{E}(X_uX_v)=\mathbb{E}(X_u)\mathbb{E}(X_v)$. First,


Denote $t=|u\cap v|$, and let $(\mathbf{a},\mathbf{b})\in(\mathbb{F}_q^t)^2$. We denote by $Z(\sigma,\mathbf{a},\mathbf{b})$ the event


We first notice that $\{\sigma^{-1}|_{u\cap v},\ \sigma\in\Phi_w\}=\mathfrak{S}(\mathbb{F}_q)^t$. Indeed, we can here use an argument similar to the proof of Lemma 3.17: the constraint $\sigma^{-1}(w)\in\mathcal{C}$ is ineffective on $\sigma^{-1}|_{u\cap v}$ since $|u\cap v|\le t<d^{\perp}(\mathcal{C}|_z)$ for every $z\in\mathcal{Q}$. Therefore, for every $(\mathbf{a},\mathbf{b})\in(\mathbb{F}_q^t)^2$, we have


and it follows that


Recall now that $t<\min\{d^{\perp}(\mathcal{C}|_u),\ u\in\mathcal{Q}\}\le\min\{d^{\perp}(K_u),\ u\in\mathcal{Q}\}$. Hence, for fixed $\mathbf{a}$ and $\mathbf{b}$, the variables “$\sigma^{-1}(r_u)|_u\in K_u \mid Z(\sigma,\mathbf{a},\mathbf{b})$” and “$\sigma^{-1}(r_v)|_v\in K_v \mid Z(\sigma,\mathbf{a},\mathbf{b})$” are independent (once again, it is a consequence of the structure results of Lemma 3.17). Therefore,




and we conclude since


4 Performance

4.1 Efficient scrambling of the encoded file

In the $\mathsf{PoR}$ scheme we propose, the storage cost of an $n$-tuple of permutations in $\mathfrak{S}(\mathbb{F}_q)^n$ is excessive since it is superlinear in the original file size. In this subsection, we propose a storage-efficient way to scramble the codeword $c\in\mathcal{C}$ produced by the $\mathsf{Verifier}$.

Precisely, we want to define a family of maps $(\sigma^{(\kappa)})_{\kappa}$, where $\sigma^{(\kappa)}:\mathcal{C}\to\mathbb{F}_q^n$, $c\mapsto w\in\mathbb{F}_q^n$, with the following requirements:

  1. For every $\kappa$, the map $\sigma^{(\kappa)}$ is efficiently computable and requires little storage.

  2. For every $\kappa$ and every $c\in\mathcal{C}$, if $w=\sigma^{(\kappa)}(c)$, then, for every $i\in[1,n]$, the local inverse map $w_i\mapsto c_i$ is efficiently computable.

  3. If $\kappa$ is randomly generated but unknown, then, given the knowledge of $w=\sigma^{(\kappa)}(c)$ and $\mathcal{C}$, it is hard to produce a response word $r\in\mathcal{R}^{\mathcal{Q}}$ such that, for many $u\in\mathcal{Q}$, both $V_{\sigma^{(\kappa)}}(u,r_u)=0$ and $r_u\ne w|_u$ hold. To be more specific and in light of the security analysis of Section 3.3, we require that it is hard to distinguish $\sigma^{(\kappa)}(c)$ from a random $(z_1,\dots,z_n)\in\mathbb{F}_q^n$, where the symbols $z_i$ are picked independently and uniformly at random.

We here propose to derive $\sigma^{(\kappa)}$ from a suitable block cipher, yielding the explicit construction given below. Of course, other proposals can be envisioned.

The construction

Let IV denote a random initialisation vector for AES in CTR mode (IV could be a nonce concatenated with a random value). The vector IV is kept secret by the $\mathsf{Verifier}$, as well as a randomly chosen key $\kappa$ for the cipher. Let also $f$ be a permutation polynomial over $\mathbb{F}_q$ of degree $d>1$. For instance, one could choose $f(x)=x^d$ with $\gcd(d,q-1)=1$. Notice that the polynomial $f$ can be made public.
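As a small illustration of the monomial example, the sketch below checks exhaustively, over a prime field $\mathbb{F}_p$, that $x\mapsto x^d$ is a bijection exactly when $\gcd(d,p-1)=1$, and inverts it with the exponent $d^{-1} \bmod (p-1)$. Restricting to a prime field and the function names are assumptions made for brevity; a prime-power field would need proper field arithmetic.

```python
from math import gcd

def is_permutation_monomial(d: int, p: int) -> bool:
    """Exhaustively test whether x -> x^d is a bijection of F_p (p prime)."""
    return len({pow(x, d, p) for x in range(p)}) == p

def inverse_exponent(d: int, p: int) -> int:
    """Exponent e with (x^d)^e = x on F_p; requires gcd(d, p - 1) == 1."""
    return pow(d, -1, p - 1)  # modular inverse, available since Python 3.8

p = 11
for d in range(2, p - 1):
    # The exhaustive test agrees with the gcd criterion quoted in the text.
    assert is_permutation_monomial(d, p) == (gcd(d, p - 1) == 1)

d = 3
e = inverse_exponent(d, p)
assert all(pow(pow(x, d, p), e, p) == x for x in range(p))
```

The inverse exponent is what makes the local inverse map $w_i\mapsto c_i$ cheap once the cipher layer has been removed.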

Let $s=\lfloor 256/\lceil\log_2 q\rceil\rfloor$ be the number of $\mathbb{F}_q$-symbols one can store in a 256-bit word. Up to appending a few random bits to $c$, we assume that $s\mid n$, and we define $t=n/s$. Let us fix a partition of $[1,n]$ into $s$-tuples $\mathbf{i}=(i_1,\dots,i_s)$; it can be, for instance, $(1,\dots,s)$, $(s+1,\dots,2s)$, $\dots$, $((t-1)s+1,\dots,n)$. Notice that this partition does not need to be chosen at random. Given $c=(c_1,\dots,c_n)\in\mathcal{C}$ and $\mathbf{i}$ an element of the above partition, we now define


If $\log_2 q\nmid 256$, trailing zeroes can be added to evaluations of $f$. Finally, the pseudo-random permutation $\sigma$ is defined by


Design rationale

AES is a natural choice when one needs a (secret-)keyed pseudo-random permutation. Also notice that, with this construction, one only needs to store the key $\kappa$ and the vector IV since the other objects (the polynomial $f$, the partition) are made public. Hence our objectives in terms of storage are met.

We now point out the necessity to use $\mathbf{i}$ as a part of the input of the AES cipher. Assume that we do not. Then the local permutation $\sigma_j$, $1\le j\le n$, would not depend on $j$. As a consequence, for a certain class of codes, the local verification map $r_u\mapsto V_\sigma(u,r_u)$ would not depend on $u$, and a malicious $\mathsf{Prover}$ would then be able to produce accepted answers while storing only a small piece of the file $w$ (e.g., $w|_u$ for only one $u\in\mathcal{Q}$).

Another mandatory feature is the non-linearity of the permutation polynomial $f$. Indeed, assume, for instance, that $f=\mathrm{id}$. Then, given the knowledge of $w=\sigma(c)$, it would be very easy for a malicious $\mathsf{Prover}$ to produce a word $w'\ne w$ such that $r'=R(w')$ is always accepted by the $\mathsf{Verifier}$. Simply, the $\mathsf{Prover}$ defines $w'=w+c'$, where $c'$ is any non-zero codeword of $\mathcal{C}$. Hence the polynomial $f$ must be non-linear in order to prevent this kind of attack.

4.2 Parameters

We here consider a $\mathsf{PoR}$ built upon a code $\mathcal{C}\subseteq\mathbb{F}_q^n$ with verification structure $(\mathcal{Q},V)$ satisfying $\mathcal{R}=\mathbb{F}_q^{\ell}$ and $V(\mathcal{R})=\mathbb{F}_q^{s}$. We also assume that we use an $n$-tuple of pseudo-random permutations as described in the previous subsection.

Communication complexity

At each verification step, the client sends an $\ell$-tuple of coordinates $(u_1,\dots,u_\ell)$, $u_i\in[1,n]$. The server then answers with the corresponding symbols $w_{u_i}\in\mathbb{F}_q$. Therefore, the upload communication cost is $\ell\log_2 n$ bits, while the download communication cost is $\ell\log_2 q$ bits, thus a total of $\ell(\log_2 n+\log_2 q)$ bits.

Computation complexity

In the initialisation phase, following the encryption described in Section 4.1, the client essentially has

  1. to compute the codeword $c\in\mathcal{C}$ associated to its message,

  2. to make $n$ evaluations of the permutation polynomial $f$ over $\mathbb{F}_q$,

  3. to compute $t=\frac{n\log_2 q}{256}$ AES ciphertexts to produce the word $w$ to be sent to the server.

Given a generator matrix of $\mathcal{C}$, the codeword $c$ can be computed in $\mathcal{O}(kn)$ operations over $\mathbb{F}_q$ with a matrix-vector product. Notice that quasi-linear-time encoding algorithms exist for some classes of codes. Besides, if a monomial or a sparse permutation polynomial is used, then the cost of each evaluation is $\mathcal{O}((\log_2 q)^3)$. If we denote by $c_{\mathrm{AES}}$ the bit cost of an AES encryption, we get a total bit cost of $\mathcal{O}(nk(\log_2 q)^2+n(\log_2 q)^3+c_{\mathrm{AES}}\,n\log_2 q)$ for the initialisation phase. Recall that this is a worst-case scenario in which the encoding process is inefficient.

At each verification step, an honest server only needs to read $\ell$ symbols from the file it stores. Hence its computation complexity is $\mathcal{O}(\ell)$. The client has to compute a matrix-vector product over $\mathbb{F}_q$, where the matrix has size $s\times\ell$ and the vector has size $\ell$, thus a computation cost of $\mathcal{O}(\ell s)$ operations over $\mathbb{F}_q$.

Storage needs

The client stores $2\times 256$ bits for the secret material $\kappa$ and IV to use in AES. The server storage overhead exactly corresponds to the redundancy of the linear code $\mathcal{C}$, that is, $(n-\dim\mathcal{C})\log_2 q$ bits.
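The communication and storage formulas of this subsection are easy to tabulate. The following sketch evaluates them for given code parameters; the function name `por_parameters` and the sample values (a length-256, dimension-175 code over $\mathbb{F}_{16}$ with queries of size $\ell=16$) are illustrative assumptions, not figures reported in the paper.

```python
from math import log2

def por_parameters(n, k, ell, q, key_bits=256, iv_bits=256):
    """Evaluate the cost formulas of Section 4.2 (all results in bits)."""
    bits_per_symbol = log2(q)
    return {
        "upload_bits": ell * log2(n),              # ell coordinates in [1, n]
        "download_bits": ell * bits_per_symbol,    # ell symbols of F_q
        "client_storage_bits": key_bits + iv_bits, # AES key and IV
        "server_overhead_bits": (n - k) * bits_per_symbol,
        "file_bits": k * bits_per_symbol,
    }

# Hypothetical sample: n = 256, k = 175, ell = 16, q = 16.
params = por_parameters(n=256, k=175, ell=16, q=16)
for name, value in params.items():
    print(f"{name}: {value:g}")
```

Dividing `download_bits` and `server_overhead_bits` by `file_bits` gives the relative quantities that are compared across code families in Section 5.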

Other features

Our $\mathsf{PoR}$ scheme is unbounded-use since every challenge reveals nothing about the secret data held by the client. It does not feature dynamic updates of files. However, we must emphasise that the file $w$ the client produces can be split among several servers, and the verification step remains possible even if the servers do not communicate with each other. Indeed, computing a response to a challenge does not require mixing distinct symbols $w_i$ of the uploaded file. Therefore, our scheme is well suited for the storage of large static distributed databases. Parameters of the $\mathsf{PoR}$ schemes we propose are reported in Figure 4.

Figure 4

Summary of parameters of our $\mathsf{PoR}$ construction for an original file of size $k\log_2 q$ bits and a code $\mathcal{C}$ of dimension $k$ over $\mathbb{F}_q$ equipped with a verification structure $(\mathcal{Q},V)$ such that $|u|=\ell$ and $\mathrm{rank}\,V(u,\cdot)\le s$ for all $u\in\mathcal{Q}$.

5 Instantiations

In this section, we present several instantiations of our $\mathsf{PoR}$ construction. We first recall basics and notation from coding theory.

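The code $\mathrm{Rep}(\ell)\subseteq\mathbb{F}_q^{\ell}$ denotes the repetition code $\langle(1,\dots,1)\rangle$. We recall that $\mathrm{Rep}(\ell)^{\perp}$ is the parity code $\mathrm{Par}(\ell):=\{c\in\mathbb{F}_q^{\ell},\ \sum_{i=1}^{\ell}c_i=0\}$. Let $\mathcal{C},\mathcal{C}'$ be two linear codes over $\mathbb{F}_q$ of respective parameters $[n,k,d]$ and $[n,k',d']$. Their tensor product $\mathcal{C}\otimes\mathcal{C}'$ is the $\mathbb{F}_q$-linear code generated by words

$$c\otimes c':=(c_ic'_j)_{1\le i,j\le n},\qquad c\in\mathcal{C},\ c'\in\mathcal{C}'.$$

It has dimension $kk'$ and minimum distance $dd'$. We also denote by

$$\mathcal{C}^{\otimes s}:=\underbrace{\mathcal{C}\otimes\dots\otimes\mathcal{C}}_{s\text{ times}}\subseteq\mathbb{F}_q^{n^s}$$

the $s$-fold tensor product of $\mathcal{C}$ with itself.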
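The dimension and minimum distance of a tensor product can be verified by brute force on toy codes. In the sketch below, a generator matrix of $\mathcal{C}\otimes\mathcal{C}'$ is obtained as the Kronecker product of generator matrices of the factors; the choice of $\mathrm{Par}(3)$ and $\mathrm{Rep}(3)$ over $\mathbb{F}_3$ and all function names are assumptions made for illustration, and arithmetic is restricted to prime fields.

```python
from itertools import product

def kron(G1, G2, q):
    """Kronecker product of two generator matrices over the prime field F_q."""
    return [
        tuple((a * b) % q for a in row1 for b in row2)
        for row1 in G1
        for row2 in G2
    ]

def codewords(G, q):
    """All codewords spanned by the rows of G over F_q."""
    n = len(G[0])
    return {
        tuple(sum(m * row[j] for m, row in zip(msg, G)) % q for j in range(n))
        for msg in product(range(q), repeat=len(G))
    }

def min_distance(G, q):
    """Minimum Hamming weight of a non-zero codeword (exhaustive search)."""
    return min(sum(1 for x in c if x) for c in codewords(G, q) if any(c))

q = 3
PAR3 = [(1, 0, 2), (0, 1, 2)]   # Par(3) over F_3: a [3, 2, 2] code
REP3 = [(1, 1, 1)]              # Rep(3): a [3, 1, 3] code

for name, G in [("Par x Par", kron(PAR3, PAR3, q)), ("Par x Rep", kron(PAR3, REP3, q))]:
    print(name, "size:", len(codewords(G, q)), "d_min:", min_distance(G, q))
```

The number of codewords equals $q^{kk'}$, so the printed sizes give the dimension directly.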

5.1 Tensor-product codes

The upcoming subsection illustrates our construction with a simple but non-practical instance. The next ones lead to practical $\mathsf{PoR}$ instances.

5.1.1 A simple but non-practical instance

Let $n=N\ell$ and $\mathcal{Q}=\{u_i=\{i\ell+1,i\ell+2,\dots,(i+1)\ell\},\ i\in[0,N-1]\}$. The set $\mathcal{Q}$ defines a partition of $[1,n]$. We define the code

$$\mathcal{C}=\Big\{c\in\mathbb{F}_q^n,\ \sum_{j\in u}c_j=0\ \text{for all}\ u\in\mathcal{Q}\Big\}\subseteq\mathbb{F}_q^n.$$

In other words, $\mathcal{C}=\mathrm{Par}(\ell)\otimes\mathbb{F}_q^N$, and a parity-check matrix $H$ for $\mathcal{C}$ is given by


The verification map $V:\mathcal{Q}\times\mathbb{F}_q^{\ell}\to\mathbb{F}_q$ is defined by $V(u,b):=\sum_{j=1}^{\ell}b_{u_j}$ for all $(u,b)\in\mathcal{Q}\times\mathbb{F}_q^{\ell}$. By construction (see the fundamental Example 3.2), the pair $(\mathcal{Q},V)$ defines a verification structure for $\mathcal{C}$.

Lemma 5.1.

Let $\mathcal{C}=\mathrm{Par}(\ell)\otimes\mathbb{F}_q^N$ as above. Then the response code $R(\mathcal{C})$ has minimum distance 1.


We see that the restriction map $R$ sends the codeword $(1,-1,0,0,\dots,0)\in\mathcal{C}$ to a word of weight 1. Besides, $R$ is injective, so $d_{\min}(R(\mathcal{C}))>0$. ∎

Since $\delta=d_{\min}(R(\mathcal{C}))/N=1/N\to 0$ when $N$ goes to infinity, an attempt to build a $\mathsf{PoR}$ scheme from $\mathcal{C}$ cannot be practical.

5.1.2 Higher order tensor-product codes

Let $\mathcal{A}\subseteq\mathbb{F}_q^{\ell}$ be a non-degenerate $[\ell,k_{\mathcal{A}},d_{\mathcal{A}}]_q$-linear code, and define $\mathcal{C}=\mathcal{A}^{\otimes s}\subseteq\mathbb{F}_q^n$, where $n=\ell^s$. Notice that it will be more convenient to see coordinates of words $w\in\mathbb{F}_q^n$ as elements of $[1,\ell]^s$.

For $\mathbf{a}\in[1,\ell]^s$ and $1\le i\le s$, we define $L_{i,\mathbf{a}}\subset[1,\ell]^s$, the “$i$-th axis-parallel line with basis $\mathbf{a}$”, as

$$L_{i,\mathbf{a}}:=\{\mathbf{x}\in[1,\ell]^s\ \text{such that}\ x_j=a_j\ \text{for all}\ j\ne i\}.$$

By definition of $\mathcal{C}$, a word $c$ lies in $\mathcal{C}$ if and only if, for every $L=L_{i,\mathbf{a}}$, the restriction $c|_L\in\mathcal{A}$. This means that we can define

  1. a set of queries $\mathcal{Q}=\{L_{i,\mathbf{a}},\ i\in[1,s],\ \mathbf{a}\in[1,\ell]^s\}$,

  2. a verification map

$$V(L,b):=Hb^{\top},\qquad (L,b)\in\mathcal{Q}\times\mathbb{F}_q^{\ell},$$

    where $H$ is a parity-check matrix for $\mathcal{A}$ whose columns are ordered according to the line $L$.

By the previous discussion, it is clear that $c\in\mathcal{C}$ implies that $V(L,c|_L)=0$ for every $L\in\mathcal{Q}$ (in fact, these two assertions are equivalent). Hence $(\mathcal{Q},V)$ defines a verification structure for $\mathcal{C}$, and we have $N=|\mathcal{Q}|=s\ell^{s-1}$.

Lemma 5.2.

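Let $\mathcal{C}=\mathcal{A}^{\otimes s}$ as above. Then $R(\mathcal{C})$ has minimum distance $s\cdot d_{\mathcal{A}}^{s-1}$.


Let us first prove that the minimum distance of $R(\mathcal{C})$ is at least $s\cdot d_{\mathcal{A}}^{s-1}$. Let $r=R(c)\in R(\mathcal{C})$, and assume $r\ne 0$. Then there exists $L\in\mathcal{Q}$ such that $0\ne r_L=c|_L\in\mathcal{A}$. Therefore, $c_{\mathbf{x}}\ne 0$ for some $\mathbf{x}\in L\subset[1,\ell]^s$. Consider the set

$$S_{i,\mathbf{x}}:=\{\mathbf{y}\in[1,\ell]^s,\ y_i=x_i\}.$$

Very informally, the set $S_{i,\mathbf{x}}$ corresponds to the hyperplane passing through $\mathbf{x}$ and “orthogonal” to the $i$-th axis. By definition of $\mathcal{C}=\mathcal{A}^{\otimes s}$, we know that $c|_{S_{i,\mathbf{x}}}\in\mathcal{A}^{\otimes(s-1)}\setminus\{0\}$ for every $1\le i\le s$. Let

$$U_i:=\mathrm{supp}(c|_{S_{i,\mathbf{x}}})=\{\mathbf{u}^{(i,1)},\dots,\mathbf{u}^{(i,t_i)}\},$$

with $t_i\ge d_{\min}(\mathcal{A}^{\otimes(s-1)})=(d_{\mathcal{A}})^{s-1}$. Every $\mathbf{u}^{(i,j)}\in U_i$ defines a line $L_{i,\mathbf{u}^{(i,j)}}$ on which $c|_{L_{i,\mathbf{u}^{(i,j)}}}$ is a non-zero codeword of $\mathcal{A}$. Equivalently, $r$ is non-zero on the index $L_{i,\mathbf{u}^{(i,j)}}\in\mathcal{Q}$. Therefore,

$$\mathrm{wt}(r)=|\{L\in\mathcal{Q},\ r_L\ne 0\}|\ge\Big|\bigcup_{i=1}^{s}\{L_{i,\mathbf{u}^{(i,j)}},\ 1\le j\le t_i\}\Big|\ge\sum_{i=1}^{s}t_i\ge s(d_{\mathcal{A}})^{s-1}.$$

Let us now build a word $r\in R(\mathcal{C})$ of weight $s(d_{\mathcal{A}})^{s-1}$. Let $w\in\mathcal{A}\setminus\{0\}$ be a minimum-weight codeword of $\mathcal{A}$, and define $W:=\mathrm{supp}(w)\subseteq[1,\ell]$. Define $c=w^{\otimes s}\in\mathcal{C}$; then $\mathrm{supp}(c)=W^s$. Let finally $r=R(c)$. We see that $r_{L_{i,\mathbf{x}}}\ne 0$ if and only if the line $L_{i,\mathbf{x}}$ meets $W^s$. Hence we get

$$\mathrm{wt}(r)=|\{L\in\mathcal{Q},\ r_L\ne 0\}|=\Big|\bigcup_{i=1}^{s}\{L_{i,\mathbf{x}},\ \mathbf{x}\in W^s\}\Big|=s\cdot d_{\mathcal{A}}^{s-1}$$

since each line $L_{i,\mathbf{x}}$ is counted $d_{\mathcal{A}}$ times when $\mathbf{x}$ runs over $W^s$. ∎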
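Lemma 5.2 can be confirmed numerically on small parameters. The sketch below enumerates every codeword of $\mathcal{A}^{\otimes s}$, counts the axis-parallel lines on which it is not identically zero (which is the weight of its image under $R$), and takes the minimum. The toy choice $\mathcal{A}=\mathrm{Par}(3)$ over $\mathbb{F}_2$, with $d_{\mathcal{A}}=2$, and the helper names are assumptions; the search is exhaustive and only practical for tiny codes.

```python
from itertools import product

def kron(G1, G2, q):
    """Kronecker product of two generator matrices over the prime field F_q."""
    return [
        tuple((a * b) % q for a in row1 for b in row2)
        for row1 in G1
        for row2 in G2
    ]

def codewords(G, q):
    """All codewords spanned by the rows of G over F_q."""
    n = len(G[0])
    return {
        tuple(sum(m * row[j] for m, row in zip(msg, G)) % q for j in range(n))
        for msg in product(range(q), repeat=len(G))
    }

def response_weight(c, ell, s):
    """Number of axis-parallel lines L_{i,a} on which c is not identically zero."""
    hit_lines = set()
    for x, symbol in zip(product(range(ell), repeat=s), c):
        if symbol:
            for i in range(s):
                # A line in direction i is determined by the other s-1 coordinates.
                hit_lines.add((i, x[:i] + x[i + 1:]))
    return len(hit_lines)

def response_min_distance(A_gen, s, q):
    """Brute-force minimum distance of the response code of the s-fold tensor power."""
    G = A_gen
    for _ in range(s - 1):
        G = kron(G, A_gen, q)
    ell = len(A_gen[0])
    return min(response_weight(c, ell, s) for c in codewords(G, q) if any(c))

PAR3_F2 = [(1, 1, 0), (0, 1, 1)]   # Par(3) over F_2, minimum distance 2
for s in (2, 3):
    print(s, response_min_distance(PAR3_F2, s, 2), s * 2 ** (s - 1))
```

Each printed row shows $s$, the brute-force value and the closed form $s\cdot d_{\mathcal{A}}^{s-1}$ side by side.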

Proposition 5.3.

Let $\delta>0$, and let $\mathcal{A}$ be an $[\ell,\ell(1-\delta)+1,\ell\delta]_q$ MDS code. Define $\mathcal{C}=\mathcal{A}^{\otimes s}$ and $(\mathcal{Q},V)$ as above. If every $\Phi_w$ is sufficiently uniform, then the $\mathsf{PoR}$ scheme associated to $\mathcal{C}$ and $(\mathcal{Q},V)$ is $(\varepsilon,\tau)$-sound for $\tau=\mathcal{O}\big(\frac{1}{(\delta\ell)^s s}\big)$ and every $\varepsilon<\varepsilon_0$, where $\varepsilon_0=(1+\mathcal{O}(q^{-\delta\ell+1}))\delta^s$ when $\ell\to\infty$.


First, the relative distance of $R(\mathcal{C})$ is $\delta^s$ according to Lemma 5.2. Then the random variables $\{X_u\}_{u\in\mathcal{D}}$ are pairwise uncorrelated because the inequality

$$\max_{u\ne v\in\mathcal{Q}^2}|u\cap v|=1<\ell(1-\delta)+2=\min_{u\in\mathcal{Q}}d_{\min}((\mathcal{C}|_u)^{\perp})$$

allows us to apply Proposition 3.18. Besides, if every $\Phi_w$ is sufficiently uniform, then the bias $\alpha$ satisfies $\alpha=\mathcal{O}(q^{-\delta\ell+1})$ and hence $\frac{1-\alpha}{1+\alpha}=1+\mathcal{O}(q^{-\delta\ell+1})$. Therefore, we can use Theorem 3.13, and we get the desired result. ∎


We mainly focus on the download communication complexity in the verification step and on the server storage overhead since these are the most crucial parameters which depend on the family of codes $\mathcal{C}$ we use. Besides, we consider that it is more relevant to analyse the ratio between these quantities and the file size than their absolute values.

Here, for an initial file of size $|F|=((1-\delta)q+1)^s\log_2 q$ bits, we get

  1. a redundancy rate

  2. a communication complexity rate

Example 5.4.

In Table 3, we present various parameters of $\mathsf{PoR}$ instances admitting $0.10\le\varepsilon_0\le 0.16$, for files of size approaching $10^4$, $10^6$ and $10^9$ bits. Here $\mathcal{A}$ is a $[q,(1-\delta)q+1,\delta q]_q$ MDS code (e.g., a Reed–Solomon code), and $\mathcal{C}=\mathcal{A}^{\otimes s}$.

Table 3

Parameters of $\mathsf{PoR}$ instances admitting $0.10\le\varepsilon_0\le 0.16$.

| $q$ | $\delta q$ | $s$ | File size (bits) | Comm. rate | Redundancy rate | $\varepsilon_0$ |

The previous example shows that, while the communication rate is reasonable for these $\mathsf{PoR}$ instances over large files, the storage needs remain large.

5.2 Reed–Muller and related codes

Low-degree Reed–Muller codes are known to admit many distinct low-weight parity-check equations, whose supports correspond to affine subspaces of the ambient space. Therefore, they seem naturally adapted to our construction. Let us first consider the plane (or bivariate) Reed–Muller code case.

5.2.1 The plane Reed–Muller code $\mathrm{RM}_q(2,q-2)$

Let $\mathcal{C}$ be the Reed–Muller code

$$\mathrm{RM}_q(2,q-2):=\{(f(\mathbf{P}))_{\mathbf{P}\in\mathbb{F}_q^2},\ f\in\mathbb{F}_q[X,Y],\ \deg f\le q-2\}.$$

It is well known that $\mathcal{C}$ has length $q^2$ and dimension $q(q-1)/2$, the number of monomials $X^iY^j$ with $i+j\le q-2$. Besides, for every line


and every $c\in\mathcal{C}$, we can check that $\sum_{\mathbf{x}\in L}c_{\mathbf{x}}=0$. Indeed, let $f\in\mathbb{F}_q[X,Y]$, $\deg f=a\le q-2$. The restriction of $f$ to an affine line $L$ can be interpolated as a univariate polynomial $f|_L$ of degree at most $a$. Our claim follows since $\sum_{z\in\mathbb{F}_q}z^i=0$ for every $i\le q-2$.

Therefore, we can define $\mathcal{Q}$ as the set of affine lines $L$ of $\mathbb{F}_q^2$ and $V(L,r)=\sum_{j=1}^{\ell}r_j\in\mathbb{F}_q$. From the previous discussion, we see that $(\mathcal{Q},V)$ is a verification structure for $\mathcal{C}$. Also notice there are $q(q+1)$ distinct affine lines in $\mathbb{F}_q^2$; hence $N=q(q+1)$.

Lemma 5.5.

Let $\mathcal{C}=\mathrm{RM}_q(2,q-2)$, equipped with its verification structure defined as above. Then the response code $R(\mathcal{C})$ has minimum distance $q^2+2$.


Any non-zero codeword $c\in\mathcal{C}$ consists of the evaluations of a non-zero polynomial $f(X,Y)\in\mathbb{F}_q[X,Y]$ of degree at most $q-2$. Denote by $L_1,\dots,L_a\subset\mathbb{F}_q^2$ the affine lines on which $f$ vanishes, i.e., $f(P)=0$ for every $P\in L_i$, $1\le i\le a$. We claim that $a\le q-2$. Indeed, since $f$ has total degree less than $q-1$, it also vanishes on the closed lines $\overline{L_1},\dots,\overline{L_a}$, considered as affine lines in $\overline{\mathbb{F}_q}^2$, where $\overline{\mathbb{F}_q}$ denotes the algebraic closure of $\mathbb{F}_q$. Denote by $g_i\in\mathbb{F}_q[X,Y]$ the monic polynomial of degree 1 which defines $\overline{L_i}$. From Hilbert's Nullstellensatz, there exists $r>0$ such that $(\prod_{i=1}^{a}g_i)\mid f^r$. Since the $g_i$'s have degree 1 and are distinct, we get $a\le\deg f\le q-2$. Hence the affine lines different from $L_1,\dots,L_a$ correspond to non-zero coordinates of $R(c)$. There are $q(q+1)-a\ge q^2+2$ such lines, so $d_{\min}(R(\mathcal{C}))\ge q^2+2$.

Now we claim there exists a word $r\in R(\mathcal{C})$ of weight $N-q+2=q^2+2$. Let $L^{(0)}$ and $L^{(1)}$ be two distinct parallel affine lines, respectively defined by $X=0$ and $X=1$. We build the word $c$ which is $-1$ on coordinates corresponding to points in $L^{(0)}$, $1$ on those corresponding to points in $L^{(1)}$ and $0$ elsewhere. One can check that $c\in\mathcal{C}$; indeed, $c$ corresponds to the evaluation of $\prod_{z\in\mathbb{F}_q\setminus\{0,1\}}(z-X)$. Now, if we want to compute $\mathrm{wt}(R(c))$, we only need to count the number of lines which intersect neither $L^{(0)}$ nor $L^{(1)}$. Clearly, there are only $q-2$ such lines. Hence $\mathrm{wt}(R(c))=q(q+1)-(q-2)$, and this concludes the proof. ∎
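The extremal word built in this proof is easy to reproduce for a small prime $q=p$. The sketch below evaluates $\prod_{z\ne 0,1}(z-X)$ on $\mathbb{F}_p^2$, enumerates the $p(p+1)$ affine lines, checks that every line passes the parity verification, and counts the lines on which the word is not identically zero. Working over a prime field and the function names are assumptions made to keep the arithmetic elementary.

```python
def affine_lines(p):
    """All p*(p+1) affine lines of F_p^2 (p prime), each as a frozenset of points."""
    lines = set()
    for a in range(p):                      # non-vertical lines y = a*x + b
        for b in range(p):
            lines.add(frozenset((x, (a * x + b) % p) for x in range(p)))
    for x0 in range(p):                     # vertical lines x = x0
        lines.add(frozenset((x0, y) for y in range(p)))
    return lines

def extremal_word(p):
    """Evaluations on F_p^2 of the product of (z - X) over z in F_p minus {0, 1}."""
    word = {}
    for x in range(p):
        value = 1
        for z in range(2, p):
            value = value * (z - x) % p
        for y in range(p):                  # the polynomial does not depend on Y
            word[(x, y)] = value
    return word

def response_weight(word, lines):
    """Number of lines on which the word is not identically zero."""
    return sum(1 for line in lines if any(word[P] for P in line))

p = 5
lines = affine_lines(p)
c = extremal_word(p)
# Every line sums to zero, so the honest response to any challenge is accepted.
assert all(sum(c[P] for P in line) % p == 0 for line in lines)
print(len(lines), response_weight(c, lines), p * p + 2)
```

The only lines on which the word vanishes are the $p-2$ vertical lines $X=z$ with $z\notin\{0,1\}$, which is the count used in the proof.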

Proposition 5.6.

Let $\mathcal{C}=\mathrm{RM}(2,q-2)$, and let $(\mathcal{Q},V)$ be its associated verification structure. If every $\Phi_w$ is sufficiently uniform, then the $\mathsf{PoR}$ scheme associated to $\mathcal{C}$ and $(\mathcal{Q},V)$ is $(\varepsilon,\tau)$-sound for $\varepsilon=1-o(1)$ and $\tau=\mathcal{O}\big(\frac{1}{(1-\varepsilon)q^2}\big)$, when $q\to\infty$.


One can check that the random variables $\{X_u\}_{u\in\mathcal{D}}$ are pairwise uncorrelated since

$$\max_{u\ne v\in\mathcal{Q}^2}|u\cap v|=1<\ell(1-\delta)+2=\min_{u\in\mathcal{Q}}d_{\min}((\mathcal{C}|_u)^{\perp}).$$

Besides, the relative distance of $R(\mathcal{C})$ is $\frac{q^2+2}{q(q+1)}\to 1$ according to Lemma 5.5. If every $\Phi_w$ is sufficiently uniform, the bias $\alpha$ satisfies $\alpha\in\mathcal{O}(1/q)$ and hence $\frac{1-\alpha}{1+\alpha}=1+\mathcal{O}(1/q)$. Therefore, we can use Theorem 3.13, and we get the desired result. ∎


For an initial file of size $|F|=\frac{1}{2}q(q-1)\log_2 q$ bits, we get

  1. a redundancy rate

  2. a communication complexity rate


5.2.2 Storage improvements via lifted codes

The redundancy rate of the Reed–Muller codes presented above stays stuck above 2. Affine lifted codes, introduced by Guo, Kopparty and Sudan [5], make it possible to break this barrier while keeping the same verification structure. Generically, they are defined as follows:

$$\mathrm{Lift}(m,d):=\{(f(\mathbf{P}))_{\mathbf{P}\in\mathbb{F}_q^m},\ f\in\mathbb{F}_q[X_1,\dots,X_m]\ \mid\ \text{for every affine line}\ L\subset\mathbb{F}_q^m,\ (f(\mathbf{P}))_{\mathbf{P}\in L}\in\mathrm{RS}_q(d+1)\}.$$

We refer to [5] for more details about the construction. Here we focus on $\mathrm{Lift}(2,q-2)$ since it can be compared to $\mathrm{RM}(2,q-2)$. Indeed, one sees that

$$\mathrm{RM}(2,q-2)\subseteq\mathrm{Lift}(2,q-2),\qquad(5.1)$$

and equation (5.1) turns into a proper inclusion as long as $q$ is not a prime. Besides, by definition of lifted codes, $\mathrm{Lift}(2,q-2)$ admits the same verification structure as the one presented previously for $\mathrm{RM}(2,q-2)$.

Lemma 5.7.

The response code of $\mathrm{Lift}(2,q-2)$ has minimum distance at least $q^2-q+2$.


The rationale is similar to the proof of Lemma 5.5. Let $0\ne c\in\mathcal{C}$, $c=(f(\mathbf{P}))_{\mathbf{P}\in\mathbb{F}_q^2}$, $f\in\mathbb{F}_q[X,Y]$, and denote by $L_1,\dots,L_a\subset\mathbb{F}_q^2$ the lines on which $f$ vanishes. The restriction of $f$ along $L_i$ can be interpolated as a univariate polynomial $f|_{L_i}(T)$ of degree at most $q-2$ since $(f(\mathbf{P}))_{\mathbf{P}\in L_i}$ lies in the Reed–Solomon code $\mathrm{RS}_q(q-1)$ by definition of lifted codes. Therefore, $f|_{L_i}(T)=0$, and $f$ vanishes on $\overline{L_i}$. Repeating the arguments in the proof of Lemma 5.5, we get $a\le\deg f\le 2q-2$ and $d_{\min}(\mathcal{R}(\mathrm{Lift}(2,q-2)))\ge q^2+q-2q+2=q^2-q+2$. ∎

We believe the bound given in Lemma 5.7 is not tight, but it is sufficient to have $d_{\min}(\mathcal{R}(\mathrm{Lift}(2,q-2)))/N\to 1$. Similarly to Proposition 5.6, we can then prove that practical $\mathsf{PoR}$s can be constructed with the family of lifted codes $\mathrm{Lift}(2,q-2)$.

Proposition 5.8.

Let $\mathcal{C}=\mathrm{Lift}(2,q-2)$, and let $(\mathcal{Q},V)$ be its associated verification structure. If every $\Phi_w$ is sufficiently uniform, then the $\mathsf{PoR}$ scheme associated to $\mathcal{C}$ and $(\mathcal{Q},V)$ is $(\varepsilon,\tau)$-sound for every $\varepsilon<1$ and $\tau=\mathcal{O}\big(\frac{1}{(1-\varepsilon)q^2}\big)$, when $q\to\infty$.

The crucial improvement is that lifted codes potentially have much higher dimension than Reed–Muller codes. For $q=2^e$, the dimension of $\mathrm{Lift}(2,q-2)$ can be proved to equal $4^e-3^e$ [5].
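To get a feeling for the gain, the following sketch compares the redundancy rate $n/k$ of both plane codes for $q=2^e$. It uses the dimension $4^e-3^e$ quoted above for $\mathrm{Lift}(2,q-2)$ and, as an assumption stated here explicitly, the standard count $q(q-1)/2$ of monomials $X^iY^j$ with $i+j\le q-2$ for the dimension of $\mathrm{RM}_q(2,q-2)$. Function names are illustrative.

```python
def rm_plane_dimension(q):
    """Number of monomials X^i Y^j with i + j <= q - 2."""
    return q * (q - 1) // 2

def lift_plane_dimension(e):
    """Dimension of Lift(2, q-2) for q = 2^e, as quoted from [5]."""
    return 4 ** e - 3 ** e

for e in range(2, 8):
    q = 2 ** e
    n = q * q
    rm_rate = n / rm_plane_dimension(q)
    lift_rate = n / lift_plane_dimension(e)
    print(f"q={q:4d}  RM redundancy={rm_rate:.3f}  Lift redundancy={lift_rate:.3f}")
```

The Reed–Muller column decreases towards 2 from above while the lifted column decreases towards 1, in line with the approximate rate $1+q^{\log_2(3)-2}$.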

Example 5.9.

In Table 4, we present parameters of $\mathsf{PoR}$s based on Reed–Muller codes and lifted codes, using files of size approaching $10^4$, $10^6$ and $10^9$ bits.

Table 4

Parameters of $\mathsf{PoR}$s based on Reed–Muller codes and lifted codes.

| Code | $q$ | File size | Comm. rate | Redundancy rate |

Note that this family of codes has been used in the $\mathsf{PoR}$ proposal of [7].

5.2.3 On more generic families of codes

We have presented two rather small families of codes producing practical instances of $\mathsf{PoR}$. Table 5 gives a short summary of the approximate lower bounds on crucial $\mathsf{PoR}$ parameters that have been shown in the previous sections.

Table 5

Approximate lower bounds on crucial $\mathsf{PoR}$ parameters.

| Family of codes over $\mathbb{F}_q$ | Redundancy rate | Communication complexity rate |
|---|---|---|
| $s$-fold tensor product (Section 5.1.2) | $(1-\delta)^{-s}$ | $q^{-(s-1)}(1-\delta)^{-s}$ |
| Plane RM (Section 5.2.1) | $2$ | $2q^{-1}$ |
| Plane lifted code (Section 5.2.2) | $1+q^{\log_2(3)-2}$ | $q^{-1}+q^{\log_2(3)-3}$ |

Now we quickly mention other families of codes that could be interesting to consider.

Multi-variate generalisation

We have only presented Reed–Muller and lifted codes embedded into the affine plane $\mathbb{F}_q^2$. One could of course consider a broader ambient space $\mathbb{F}_q^m$, $m>2$. Lines would have smaller relative weight compared to the ambient space, and thus we would decrease the communication complexity of our $\mathsf{PoR}$ schemes. We must however take care of the storage overhead, which can drastically increase if $m$ gets large: for instance, any Reed–Muller code $\mathrm{RM}_q(m,q-2)$ has rate $\le 1/m!$.

Lower degree generalisation

In order to increase the soundness of our $\mathsf{PoR}$ schemes, one could consider Reed–Muller codes $\mathrm{RM}_q(2,d)$ (as well as related lifted codes) with a lower degree $d<q-2$. The communication complexity remains unchanged; however, we could observe overwhelming storage overhead if $d$ is too small.

Combinatorial generalisation

Codes $\mathrm{Lift}(2,q-2)$ can be viewed as codes from designs (see [1] for more details), where the underlying block design is the classical affine plane. Considering designs with smaller block size would lead to $\mathsf{PoR}$s with smaller communication complexity. But once again, this could be expensive in terms of storage since only a few designs produce high-dimensional codes.

6 Conclusion

We have proposed a security model for $\mathsf{PoR}$s in line with previous work, together with a generic code-based framework. We have then sharply quantified the extraction failure of our $\mathsf{PoR}$ construction as a function of code parameters. Specialising this construction for particular families of codes, we provided instances with practical parameters. We hope our work will be an incentive for further proposals of code instances, aiming at better $\mathsf{PoR}$ parameters.

Communicated by Doug Stinson

Award Identifier / Grant number: 15-CE39-0013-01

Funding statement: This work is partially funded by French ANR-15-CE39-0013-01 “Manta”.

A Experimental estimate of the bias $\alpha$

We here confirm our heuristic on the fact that $\Phi_w$ is sufficiently uniform, by providing experimental estimates of $\alpha$.


We consider $\mathsf{PoR}$ schemes using Reed–Muller codes $\mathcal{C}=\mathrm{RM}_q(2,q-2)$, as presented in Section 5.2.1. We also fix the word $w\in\mathbb{F}_q^n$ uploaded on the server during the initialisation step. Remark that, for varying $w$, all $\Phi_w$ are equivalently distributed. Indeed, if $\psi\in\mathfrak{S}(\mathbb{F}_q)^n$ satisfies $\psi(w)=w'$, then the distribution of permutations picked from $\Phi_{w'}$ can be obtained by applying $\psi$ to permutations picked from $\Phi_w$. Hence, without loss of generality, we assume $w=0$. Proposition 3.16 claims that, in this context, $\alpha$ should be $\mathcal{O}(1/q)$ since $\Delta=2$ and $\ell\le q$. For convenience, we write $p_\Phi:=\mathbb{P}_{\Phi_w}(V_\sigma(u,r_u)=0)$, and we recall that $\alpha$ is an upper bound on $p_\Phi$ (for varying $u$ and $r$).

We run three kinds of tests in order to estimate $\alpha$:

  1. Test 1. We sample $N$ challenges $u$, and, for each sample, we fix $t\le\ell$ and $r_u$ in $\{x\in\mathbb{F}_q^{\ell},\ |Z_u|=t\}$. Then we estimate $p_\Phi$ by running $M$ trials and computing the average number of times $V_\sigma(u,r_u)=0$ occurs. We denote by $\xi_M(p_\Phi)$ this estimator. We then collect the maximum value of $\xi_M(p_\Phi)$ among the $N$ samples of $u$.

  2. Test 2. A challenge $u$ is fixed. For several values of $t$, we pick $N$ responses $r_u$ randomly in $\{x\in\mathbb{F}_q^{\ell},\ |Z_u|=t\}$. For every $r_u$, we estimate $p_\Phi$ with $M$ samples. We collect the maximum value of $\xi_M(p_\Phi)$ among the $N$ values of $r_u$ that have been picked.

  3. Test 3. A challenge $u$ is fixed, as well as a response $r_u$ to this challenge, which satisfies $|Z_u|=t$ for several values of $t\in[2,\ell]$. We then run $M$ trials and collect $\xi_M(p_\Phi)$.
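A simplified version of test 3 can be sketched as follows. This is not the authors' experimental code: it is a minimal Monte Carlo simulation under stated assumptions, namely a prime $q$, $w=0$, a single line $u$ of length $\ell=q$ with the parity check as verification map, and a codeword restricted to $u$ sampled uniformly from the parity code. Under these assumptions, $\sigma_i^{-1}(0)=c_i$ on authentic coordinates, while on a tampered coordinate $\sigma_i^{-1}(r_i)$ is uniform among the $q-1$ symbols different from $c_i$.

```python
import random

def sample_parity_word(q, rng):
    """Uniform word of F_q^q whose coordinates sum to zero (q prime)."""
    word = [rng.randrange(q) for _ in range(q - 1)]
    word.append(-sum(word) % q)
    return word

def estimate_acceptance(q, t, trials, seed=0):
    """Estimate Pr[V_sigma(u, r_u) = 0] when t symbols of the response are tampered."""
    rng = random.Random(seed)
    tampered = [True] * t + [False] * (q - t)
    hits = 0
    for _ in range(trials):
        c = sample_parity_word(q, rng)
        total = 0
        for c_i, is_tampered in zip(c, tampered):
            if is_tampered:
                # sigma_i^{-1}(r_i) is uniform over F_q minus {c_i}.
                total += c_i + rng.randrange(1, q)
            else:
                # r_i = w_i = 0 and sigma_i(c_i) = 0, so sigma_i^{-1}(r_i) = c_i.
                total += c_i
        hits += (total % q == 0)
    return hits / trials

q = 7
print(estimate_acceptance(q, t=2, trials=20000), 1 / (q - 1))
```

For $t=2$, the estimate should fluctuate around $1/(q-1)$, the reference value against which the estimators of this appendix are compared; for $t=1$, a single tampered symbol is always detected in this model.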

Figure 5

Estimators for various values of $M\in[10^3,10^6]$, of $q\in\{8,64\}$ and of test $i$, $i\in\{1,2,3\}$. Support size $t=2$ is fixed. For tests 1 and 2, the parameter $N$ is set to 10. Black horizontal lines represent the expected value of $\alpha$.

Figure 6

Estimators for various values of $M\in[10^3,10^6]$, of $q\in\{8,64\}$ and of test $i$, $i\in\{1,2,3\}$. Support size $t=3$ is fixed. For tests 1 and 2, the parameter $N$ is set to 10. Black horizontal lines represent the expected value of $\alpha$.

Figure 7

Estimators for various values of $M\in[10^3,10^6]$, of $q\in\{8,64\}$ and of test $i$, $i\in\{1,2,3\}$. Support size $t=\ell$ is fixed. For tests 1 and 2, the parameter $N$ is set to 10. Black horizontal lines represent the expected value of $\alpha$.

Influence of M and the chosen test on the estimator

Figures 5, 6 and 7 confirm that, for fixed $N$ and $q$ and for any test $i$ we use, $i\in\{1,2,3\}$, our estimator $\xi_M(p_\Phi)$ converges to a value close to $1/(q-1)$.

Influence of N on the estimator

Table 6 shows experimentally that, for $M$ large enough and fixed $q$, the number $N$ has little influence on the estimator ($N$ being respectively the number of responses $r_u$ sampled in test 2, and the number of challenges $u$ sampled in test 1). The minor increase of the values can be attributed to statistical fluctuation, since the number of samples $M=100{,}000$ is finite.

Table 6

Estimators using tests 1 and 2 with $M=100{,}000$ and $t=2$ for $q\in\{8,64\}$ and various values of $N$. The quantity $1/(q-1)$ represents an estimated upper bound on $\alpha$ that $\xi_M(p_\Phi)$ should approximate.

| Test 1 | Test 2 |

Influence of q on the estimator

In Table 7, we show that the estimator $\xi_{M}(p_{\Phi})$ converges to the expected value $1/(q-1)$ for any value of $q$.

Table 7

Estimators using test 3 with $M=1{,}000{,}000$ and $t=2$ for various values of prime powers $q$. The quantity $1/(q-1)$ represents an estimated upper bound on $\alpha$ that $\xi_{M}(p_{\Phi})$ should approximate.



The authors would like to thank Daniel Augot for fruitful discussions on the definition of proofs of retrievability, as well as Alain Couvreur for his suggestion leading to the proof of Lemma 5.7.


[1] E. F. Assmus and J. D. Key, Designs and Their Codes, Cambridge Tracts in Math., Cambridge University, Cambridge, 1992. doi:10.1017/CBO9781316529836

[2] G. Ateniese, R. C. Burns, R. Curtmola, J. Herring, O. Khan, L. Kissner, Z. N. J. Peterson and D. Song, Remote data checking using provable data possession, ACM Trans. Inf. Syst. Secur. 14 (2011), no. 1, Article ID 12. doi:10.1145/1952982.1952994

[3] K. D. Bowers, A. Juels and A. Oprea, Proofs of Retrievability: Theory and Implementation, Proceedings of the First ACM Cloud Computing Security Workshop (CCSW 2009), ACM, New York (2009), 43–54. doi:10.1145/1655008.1655015

[4] Y. Dodis, S. P. Vadhan and D. Wichs, Proofs of retrievability via hardness amplification, Theory of Cryptography (TCC 2009), Lecture Notes in Comput. Sci. 5444, Springer, Berlin (2009), 109–127. doi:10.1007/978-3-642-00457-5_8

[5] A. Guo, S. Kopparty and M. Sudan, New affine-invariant codes from lifting, Innovations in Theoretical Computer Science (ITCS '13), ACM, New York (2013), 529–540. doi:10.1145/2422436.2422494

[6] A. Juels and B. S. Kaliski, Jr., PoRs: Proofs of retrievability for large files, Proceedings of the 2007 ACM Conference on Computer and Communications Security (CCS 2007), ACM, New York (2007), 584–597. doi:10.1145/1315245.1315317

[7] J. Lavauzelle and F. Levy-dit-Vehel, New proofs of retrievability using locally decodable codes, IEEE International Symposium on Information Theory (ISIT 2016), IEEE Press, Piscataway (2016), 1809–1813. doi:10.1109/ISIT.2016.7541611

[8] M. Lillibridge, S. Elnikety, A. Birrell, M. Burrows and M. Isard, A cooperative internet backup scheme, Proceedings of the 2003 Usenix Annual Technical Conference, USENIX, Berkeley (2003), 29–41.

[9] Z. Mo, Y. Zhou and S. Chen, A dynamic proof of retrievability (por) scheme with $O(\log n)$ complexity, Proceedings of IEEE International Conference on Communications (ICC 2012), IEEE Press, Piscataway (2012), 912–916. doi:10.1109/ICC.2012.6364056

[10] M. Naor and G. N. Rothblum, The complexity of online memory checking, J. ACM 56 (2009), no. 1, Article ID 2. doi:10.1109/SFCS.2005.71

[11] M. B. Paterson, D. R. Stinson and J. Upadhyay, A coding theory foundation for the analysis of general unconditionally secure proof-of-retrievability schemes for cloud storage, J. Math. Cryptol. 7 (2013), no. 3, 183–216. doi:10.1515/jmc-2013-5002

[12] M. B. Paterson, D. R. Stinson and J. Upadhyay, Multi-prover proof of retrievability, J. Math. Cryptol. 12 (2018), no. 4, 203–220. doi:10.1515/jmc-2018-0012

[13] B. Sengupta and S. Ruj, Efficient proofs of retrievability with public verifiability for dynamic cloud storage, preprint (2016). doi:10.1109/TCC.2017.2767584

[14] H. Shacham and B. Waters, Compact proofs of retrievability, J. Cryptology 26 (2013), no. 3, 442–483. doi:10.1007/s00145-012-9129-2

[15] Q. Wang, C. Wang, K. Ren, W. Lou and J. Li, Enabling public auditability and data dynamics for storage security in cloud computing, IEEE Trans. Parallel Distrib. Syst. 22 (2011), no. 5, 847–859. doi:10.1109/TPDS.2010.183

Received: 2018-04-27
Revised: 2018-09-04
Accepted: 2019-01-25
Published Online: 2019-02-19
Published in Print: 2019-06-01

© 2019 Walter de Gruyter GmbH, Berlin/Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
