A framework for reducing the overhead of the quantum oracle for use with Grover’s algorithm with applications to cryptanalysis of SIKE

In this paper we provide a framework for applying classical search and preprocessing to quantum oracles for use with Grover's quantum search algorithm in order to lower the quantum circuit-complexity of Grover's algorithm for single-target search problems. This has the effect (for certain problems) of reducing a portion of the polynomial overhead contributed by the implementation cost of quantum oracles and can be used to provide either strict improvements or advantageous trade-offs in circuit-complexity. Our results indicate that, for certain single-target preimage search problems, it is possible to reduce the quantum circuit-size from $O(2^{n/2} \cdot mC)$ (where $C$ originates from the cost of implementing the quantum oracle) to $O(2^{n/2} \cdot m\sqrt{C})$ without the use of quantum RAM, whilst also slightly reducing the number of required qubits. This framework captures a previous optimisation of Grover's algorithm using preprocessing [21] applied to cryptanalysis, providing new asymptotic analysis. We additionally provide insights and asymptotic improvements on recent cryptanalysis [16] of SIKE [14] via Grover's algorithm, demonstrating that the speedup applies to this attack and impacts upon the quantum security estimates [16] incorporated into the SIKE specification [14].


Introduction
Whilst the quantum circuit-complexity of a quantum algorithm is linked to the cost of executing it, this link is not yet fully understood owing to the uncertainty regarding the eventual architecture of quantum computers and the need to perform quantum error-correction to protect the state from environmental noise. The logical quantum circuit-model of computation ignores the issue of noise and has been the de facto choice for assigning a cost to quantum algorithms in the cryptographic community while our understanding of the true costs involved in executing quantum algorithms has been evolving. In particular, there is the issue of quantum query-complexity versus quantum bit-complexity when assigning a cost to the best known quantum attack on a cryptosystem for the purpose of choosing quantum-resistant cryptographic parameters in relation to it. Whilst Grover's algorithm requires $O(2^{n/2})$ calls to the quantum oracle if we assume that the quantum oracle is a black-box [28] (in that we model it simply via input and output), we focus upon redefining what it means for the oracle to be called. By doing this, we note that for certain problems we can in fact increase the query-complexity but reduce the total cost of the quantum algorithm itself.

Contributions
We provide a framework for reasoning about how the quantum circuit-complexity of Grover's algorithm can be reduced via design principles that can be applied to the quantum oracle, allowing strict gains in all metrics for certain problems. This is done via combining classical search with Grover's algorithm, increasing the cost of the quantum oracle, but defining it over a smaller search-space. This approach allows for a balancing of the query-complexity and the cost of the quantum oracle and admits a number of benefits, such as preprocessing options which strictly improve the efficiency of Grover's algorithm.
We demonstrate the utility of our framework by applying it to two known quantum attacks on cryptosystems using Grover's algorithm. We show that it captures and improves upon a known quantum attack on the Multivariate Quadratic problem over $\mathbb{F}_2$ using Grover's algorithm, and that it provides new results on quantum cryptanalysis of SIKE [14], giving evidence that the cost of attacking SIKE via Grover's algorithm is asymptotically lower than previously estimated [14,16].

Outline of this paper
In Section 2, we review Grover's algorithm. In Section 3 we introduce our framework. In Section 4 we examine several applications to cryptanalysis and give our conclusions in Section 5.

Background
Definition 2.1 (Unstructured search problem) Let $\chi : \{0,1\}^n \to \{0,1\}$ and let $M_\chi = |\chi^{-1}(1)|$. The unstructured search problem defined by $\chi$ is the problem of finding an element $x \in \{0,1\}^n$ such that $\chi(x) = 1$ or proving that no such element exists, given only the ability to evaluate $\chi$.
A classical computer requires $O(2^n / M_\chi)$ calls to a classical circuit which evaluates $\chi$ before a solution to the unstructured search problem (Definition 2.1) is found [1]. In comparison, Grover's algorithm requires $O(\sqrt{2^n / M_\chi})$ calls to a quantum circuit which evaluates $\chi$ and terminates with a solution to the unstructured search problem with high probability. It will additionally prove useful to consider another formulation of the search problem.
Any algorithm that solves arbitrary instances of the preimage search problem can be used to solve the search problem and vice versa, but it is clear that there is more computational structure in the preimage search problem than in the unstructured search problem, and this structure can benefit the design of algorithms.

Quantum algorithms
Quantum states consist of qubits (quantum bits) and an $n$-qubit quantum state relative to the computational basis $\{|x\rangle : x \in \{0,1\}^n\}$ can be expressed as $\sum_{x \in \{0,1\}^n} \alpha_x |x\rangle$, where $\alpha_x \in \mathbb{C}$ and $\sum_{x \in \{0,1\}^n} |\alpha_x|^2 = 1$. The $\alpha_x$ are the amplitudes of each computational basis state $|x\rangle$, and measurement of this quantum state results in the bitstring $x \in \{0,1\}^n$ with probability $|\alpha_x|^2$. Quantum algorithms therefore consist of increasing the magnitude of those $\alpha_x$ which encode algorithmically useful information. Grover's algorithm consists of the repeated application of a quantum circuit, each application of which (up to a point) increases the magnitude of those $\alpha_x$ which encode solutions to the search problem.
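The amplitude picture above can be illustrated with a small classical simulation. The following is a minimal sketch, not the construction used in this paper: it assumes a phase-flip form of the oracle (the usual effect of a bit oracle applied to an ancilla in the state $|-\rangle$), tracks real amplitudes in a plain list, and uses the standard iteration count $\lfloor \frac{\pi}{4}\sqrt{2^n / M_\chi} \rfloor$. The function name and the marked element are hypothetical.

```python
import math

def grover_success_probability(n, marked, iterations):
    """Simulate Grover's algorithm on n qubits and return the probability
    of measuring a marked element after the given number of iterations."""
    size = 2 ** n
    # Uniform superposition: every amplitude alpha_x equals 2^(-n/2).
    amps = [1 / math.sqrt(size)] * size
    for _ in range(iterations):
        # Oracle step (assumed phase form): flip the sign of marked amplitudes.
        for x in marked:
            amps[x] = -amps[x]
        # Diffusion step: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]
    # Measurement probability is the sum of |alpha_x|^2 over marked x.
    return sum(amps[x] ** 2 for x in marked)

n = 4
marked = {0b1011}  # hypothetical single target
iterations = math.floor(math.pi / 4 * math.sqrt(2 ** n / len(marked)))
prob = grover_success_probability(n, marked, iterations)
```

With zero iterations the success probability is just $2^{-n}$; each iteration rotates weight towards the marked element until the iteration count above is reached, after which further iterations begin to reduce it again.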

Cost models and reversibility
Quantum circuits that do not include measurement are equivalent to unitary operators ($U$ such that there exists $U^\dagger$ with the property $UU^\dagger = U^\dagger U = I$). Because of this correspondence, quantum circuits which implement $\chi : \{0,1\}^n \to \{0,1\}$ can be designed by considering reversible classical circuits (which implement permutations and therefore all have inverses), with each reversible gate assigned a cost in terms of quantum gates.
Much as the universal boolean gate set $\{\neg, \oplus, \wedge\}$ can implement arbitrary classical circuits, quantum algorithms can be implemented (up to an arbitrary level of precision) by a universal quantum gate set. For reasons of space we deal only with asymptotics in this paper, but illustrate the above in terms of the Clifford+T universal quantum gate set, which consists of the Clifford gate set (the Hadamard, Phase and CNOT gates) and the single T gate. By fixing a universal quantum gate set we can reason about the quantum circuit-complexity (cost) of a quantum algorithm, which consists of the quantum circuit-size (number of quantum gates), quantum circuit-depth (timesteps taken) and quantum circuit-width (quantum bits required).
The set of quantum gates $\{X, \wedge_1(X), \wedge_2(X)\}$, and more generally $\wedge_k(X)$ for $k \geq 1$, acting upon computational basis states as $\wedge_k(X)|x_1 \ldots x_k\rangle|y\rangle = |x_1 \ldots x_k\rangle|y \oplus (x_1 \wedge \cdots \wedge x_k)\rangle$, where $\wedge_0(X) := X$, is sufficient to implement all reversible classical circuits on computational basis states if we have sufficient ancilla qubits, as this gate set corresponds to the universal boolean gate set $\{\neg, \oplus, \wedge\}$. The $\wedge_k(X)$ gate for $k > 2$ is simply a useful abstraction. The $X$ and $\wedge_1(X)$ gates each require one Clifford gate to implement, whilst the $\wedge_2(X)$ (Toffoli) gate can be implemented using 17 Clifford+T gates [2,24] and the $\wedge_k(X)$ gate requires at most $40k - 64$ Clifford+T gates for $k > 2$ [17] if we have a single ancilla qubit, which can be in any state.
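The gate counts quoted above can be collected into a small cost function. This is only an illustrative sketch: the function names are hypothetical, the figures are the upper bounds stated in the text, and a reversible circuit is modelled crudely as a list giving the number of controls of each $\wedge_k(X)$ gate it contains.

```python
def mcx_cost(k):
    """Gate-count bound for the k-controlled X gate, using the figures
    quoted in the text (1 for X and CNOT, 17 for Toffoli, 40k - 64 beyond)."""
    if k < 0:
        raise ValueError("number of controls must be non-negative")
    if k in (0, 1):
        return 1          # X gate and CNOT gate
    if k == 2:
        return 17         # Toffoli gate
    return 40 * k - 64    # k > 2, assuming one ancilla qubit is available

def reversible_circuit_cost(control_counts):
    """Circuit-size of a reversible circuit given as a list of control counts,
    one entry per multiply-controlled X gate."""
    return sum(mcx_cost(k) for k in control_counts)

# A hypothetical circuit with one X, one CNOT, one Toffoli and one 3-controlled X.
example_size = reversible_circuit_cost([0, 1, 2, 3])
```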

Definition 2.3 (Cost notation) If $A$ is any quantum algorithm or quantum gate, we denote the execution cost of $A$ by $C_A$. Costs will be provided in terms of components that are executed in serial, so that $C_A$ can be substituted for circuit-size, circuit-depth or either metric applied to a subset of quantum gates.

Quantum oracles and Grover's algorithm
Quantum oracles will be used in conjunction with Grover's algorithm, which we state and provide a cost for without proof. Our modifications will simply be alterations of the quantum bit oracle that is used with Grover's algorithm.
Here $H^{\otimes n}$ is the parallel application of $n$ Hadamard gates, each of which costs 1 Clifford gate, and the diffusion operator $D_n$ on $n$ qubits can be assigned a circuit-size of $44n - 105$ Clifford+T gates for $n \geq 7$ [17,18] and a circuit-depth of $44n - 103$. Our framework will enable the cost expressed in Theorem 2.5 to be optimised by trading off between the cost $C_\chi + C_{D_n}$ and the query-complexity term. Much as we require memory to implement classical functions efficiently, we often require ancilla qubits to implement the action of the quantum bit oracle. In this paper we use a decomposition of the quantum bit oracle that captures this fact.
Definition 2.6 (Bitwise decomposition of the oracle) A bitwise decomposition of the quantum bit oracle $O^{(b)}_\chi$ consists of the $n + 1$ unitary operators $U_{\chi_*}, U_{\chi_n}, \ldots, U_{\chi_1}$ acting upon $n + w + 1$ qubits, such that for any $x = x_1 \ldots x_n \in \{0,1\}^n$ and $y \in \{0,1\}$
$$U_{\chi_i}|x\rangle|g_{i-1}(x_1, \ldots, x_{i-1})\rangle|y\rangle = |x\rangle|g_i(x_1, \ldots, x_i)\rangle|y\rangle \quad (1 \leq i \leq n),$$
$$U_{\chi_*}|x\rangle|g_n(x_1, \ldots, x_n)\rangle|y\rangle = |x\rangle|g_n(x_1, \ldots, x_n)\rangle|y \oplus \chi(x)\rangle,$$
where each $g_i$ takes values in $\{0,1\}^w$. We therefore have that $O^{(b)}_\chi = U^\dagger_{\chi_1} \cdots U^\dagger_{\chi_n} U_{\chi_*} U_{\chi_n} \cdots U_{\chi_1}$ and that $U_{\chi_i}$ should be interpreted as producing a memory state $g_i(x_1, \ldots, x_i) \in \{0,1\}^w$ computed using only the first $i$ bits of a possible solution to the search problem. The memory state $g_0 \in \{0,1\}^w$ can be considered as an initial memory-state which does not depend upon any of the bits $x_1, \ldots, x_n$. Typically, we can take $g_0 = 0^w$.
This decomposition applies trivially to quantum bit oracles constructed using only reversible boolean primitives (we define $U_{\chi_i} = I^{\otimes n+w+1}$ and $U_{\chi_*} = O^{(b)}_\chi$), but non-trivial decompositions may require special design. The single-target preimage search problem (see Definition 2.2) can be modelled simply by setting $U_{\chi_n} \cdots U_{\chi_1}$ to compute $|h(x_1 \ldots x_n) \oplus 1^m\rangle$ and setting $U_{\chi_*} := \wedge_m(X)$.

A framework for preprocessing
In this section we present our framework for optimising applications of Grover's algorithm via modifying quantum bit oracles to take advantage of classical search and preprocessing. Computational gains will be made possible via examining the role of memory in implementing the action of the quantum bit oracle and trading off between query-complexity and the computational effort required to implement the action of the quantum bit oracle. With this in mind we can choose an integer $0 \leq k \leq n$ that defines a cut of the bitwise decomposition of the quantum bit oracle (see Definition 2.6), splitting it into three separate components so that $U_{n-k} := U_{\chi_{n-k}} \cdots U_{\chi_1}$, $U_k := U_{\chi_n} \cdots U_{\chi_{n-k+1}}$ and $U_* := U_{\chi_*}$.

Combining classical search with Grover's algorithm
[...] and whose cost is [...]
Proof. We first execute $U_{n-k}$ to compute the memory state $g_{n-k}(x_1, \ldots, x_{n-k})$, then simply follow the procedure of executing the sequence $U_k^\dagger U_* U_k$ on all possible assignments of the final $k$ values of the search-space. This can be performed efficiently by using the $k$ qubits following the register that holds $x_1 \ldots x_{n-k}$. If we use a binary reflected Gray code [11], we can start in the state $0^k$ and cycle through all $2^k$ elements of $\{0,1\}^k$, ending in the state $10^{k-1}$, by flipping only a single bit at a time, which can be accomplished by using an X gate on the relevant qubit. If we wish to return the state to $|0^k\rangle$, then we need only execute an additional X gate, for a total cost of $2^k$ X gates. After this, we simply execute the unitary $U_{n-k}^\dagger$, leaving us with the computational basis state [...]
[...]
Proof. This can easily be seen as the modified quantum oracle will mark any element [...]
Such strategies are possible with classical computation, but require state to be stored. By their nature, reversible logic circuits store state implicitly and by using this fact we avoid increasing the number of qubits. There is no guarantee that a non-trivial advantageous cut will be possible, but we can simply follow a design heuristic where as much cost as possible is shifted towards $U_{n-k}$. As we can simply compute the costs $C_{U_{n-k}}$, $C_{U_k}$ and $C_{U_*}$ as a function of $k$, we can easily find an optimal $k$ via numerical evaluation of the costs involved (often a simple formula) on all values of $0 \leq k \leq n$, which is a negligible classical computation.
Example 3.3 We consider the case where $C_{U_{\chi_1}} = \cdots = C_{U_{\chi_n}} = C_{U_{\chi_*}}$ and these costs dominate that of the diffusion step, so that $C_{U_{n-k}} = (n - k)D$, $C_{U_k} = kD$ and $C_{U_*} = D$ for some constant $D$. Choosing $k = \log_2 n$ and using Equation (8) in conjunction with Theorem 2.5 gives us a cost of [...]
[...]
Proof. This can be easily seen: if we denote by $X_i$ the application of an X gate to the $i$-th qubit of the search-space, then each subsequence of unitary operators that appears in the unitary $U_k$ can be replaced by the subsequence [...]
[...] $\chi'$ can be reduced to [...]
Proof. Again using the notation $X_i$ for the application of an X gate to the $i$-th qubit, we can adapt Theorem 3.1 by simply replacing any subsequence
$$U_* U_{\chi_n} \cdots U_{\chi_i} \cdots U_{\chi_{n-k+1}} X_{n-k+i} U^\dagger_{\chi_{n-k+1}} \cdots U^\dagger_{\chi_i} \cdots U^\dagger_{\chi_n} U_* \quad (16)$$
that appears in the unitary $U_k$ by [...], by the commuting property of each $U_{\chi_i}$ and invariance of the unitary sequence $U_{\chi_n} \cdots U_{\chi_{n-k+1}}$ upon the variable $z_i$. From there it is a simple matter to note that the inner unitaries cancel each other out and that we must first fully compute the sequence $U_{\chi_n} \cdots U_{\chi_1}$ and end with the sequence $U^\dagger_{\chi_1} \cdots U^\dagger_{\chi_n}$.
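The Gray code walk used in the proof of Theorem 3.1 can be mimicked classically. The sketch below is an illustration under stated assumptions rather than the quantum circuit itself: `chi` is a hypothetical predicate on bit tuples, the tuple index plays the role of the qubit position in the $k$-qubit register, and each single-bit flip stands for one X gate. It checks the two properties the proof relies on: every element of $\{0,1\}^k$ is visited with one flip per step, and $2^k$ X gates suffice once the register is returned to $0^k$.

```python
def gray_code_flips(k):
    """Yield the bit position to flip at each step of the binary reflected
    Gray code on k bits (the position of the lowest set bit of the step index)."""
    for i in range(1, 2 ** k):
        yield (i & -i).bit_length() - 1

def gray_walk(k):
    """Return the 2^k register states visited, starting from 0^k."""
    z = [0] * k
    states = [tuple(z)]
    for position in gray_code_flips(k):
        z[position] ^= 1          # one X gate on the relevant qubit
        states.append(tuple(z))
    return states

def secondary_search(chi, prefix, k):
    """Classical analogue of the modified oracle: mark `prefix` if some
    assignment z of the final k bits satisfies chi(prefix + z)."""
    states = gray_walk(k)
    marked = any([chi(prefix + z) for z in states])
    # 2^k - 1 flips for the walk plus one flip to return the register to 0^k.
    x_gates = (len(states) - 1) + 1
    return marked, x_gates

def chi(bits):
    """Hypothetical search predicate with a single solution."""
    return bits == (1, 0, 1, 0, 1)

marked, x_gates = secondary_search(chi, (1, 0), 3)
```

The walk ends with only the highest-indexed bit set, which is why a single additional X gate restores the all-zero register.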
Example 3.6 We again consider the case where each unitary operator has a cost of $D$ as in Example 3.3, but where we can instead apply Theorem 3.5. The choice of $k = \log_2 n$ can now be seen to be optimal if we take the derivative of the full cost equation for Grover's algorithm with the modified quantum bit oracle. This gives an asymptotic cost for Grover's algorithm with this modified quantum bit oracle of O([...]).
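The numerical search for a good cut $k$ described above can be sketched as follows. The cost formula here is an assumption read off from the proof of Theorem 3.1 (one execution each of $U_{n-k}$ and $U_{n-k}^\dagger$, then $2^k$ executions of $U_k^\dagger U_* U_k$ interleaved with $2^k$ X gates), combined with the uniform costs of Example 3.3 and roughly $\frac{\pi}{4} 2^{(n-k)/2}$ Grover iterations with the diffusion cost ignored. It is not a restatement of Equation (8), and the function names are hypothetical.

```python
import math

def modified_oracle_cost(n, k, D, x_cost=1):
    """Assumed cost of one call to the modified oracle with cut k,
    using the uniform costs of Example 3.3."""
    c_first = (n - k) * D   # cost of U_{n-k}
    c_last = k * D          # cost of U_k
    c_star = D              # cost of U_*
    # Compute and uncompute U_{n-k}; run U_k^dagger U_* U_k and one X gate 2^k times.
    return 2 * c_first + 2 ** k * (2 * c_last + c_star + x_cost)

def grover_cost(n, k, D):
    """Assumed total cost: iteration count on n - k bits times oracle cost."""
    iterations = math.pi / 4 * 2 ** ((n - k) / 2)
    return iterations * modified_oracle_cost(n, k, D)

def best_cut(n, D):
    """Scan every cut 0 <= k <= n and return the cheapest one."""
    return min(range(n + 1), key=lambda k: grover_cost(n, k, D))

n, D = 64, 1000             # hypothetical problem size and per-unitary cost
best_k = best_cut(n, D)
saving = grover_cost(n, 0, D) / grover_cost(n, best_k, D)
```

Because the scan covers only $n + 1$ values it is a negligible classical computation, and the same loop can be reused with any other per-component cost functions in place of the uniform ones assumed here.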

Preprocessing the classical secondary-search procedure
We now turn to the benefits of preprocessing any of the previously described methods of secondary classical search.
[...]
Proof. The proof of this is trivial and relies solely upon the definition of the bitwise decomposition of the quantum bit oracle.
In an ideal situation, the unitary costs will be shifted as much as possible to $U_{n-k}$.

Theorem 3.8 (Classical preprocessing allows strict gains) Let $O^{(b)}_{\chi'}$ be a modified quantum bit oracle parameterised by $0 < k < n$ as in Theorem 3.1. Then at the cost of classical storage space and/or classical preprocessing, and without affecting the correctness of this algorithm, the quantum cost of $O^{(b)}_{\chi'}$ can be reduced and is at worst unchanged, whilst we reduce the number of qubits required by $k$.
Proof. We will create $2^i$ circuits for each $U_{\chi_{n-k+i}}$, each of which is hardcoded to assume that the bits $z_1 \ldots z_i \in \{0,1\}^i$ are fixed. The first benefit is that, as we are implicitly creating a circuit which is hardcoded with a choice of $z_1 \ldots z_i$, we need not include these qubits, or any qubits which interact only with them (and not with $x_1 \ldots x_{n-k}$ by any circuit-path), in the search-space or the $w$-bit memory-state.
The second benefit is a reduction in the complexity of the individual circuits themselves. If we consider purely reversible circuits, then for any unitary $U$, if any $z_i$ appears in the control qubits for $\wedge_k(U)$, the gate can be hardcoded as a $\wedge_{k-1}(U)$ gate if $z_i = 1$ or removed completely if $z_i = 0$.
The third benefit is that further optimisations are possible in the sequence of hardcoded circuits $U_{\chi_{n-k+1}} \cdots U_{\chi_n}$ as a whole. If we consider a simple circuit constructed of multiple $\wedge_k(X)$ gates, all of which write to the same target qubit and where no cancellation is possible, then any hardcoding of these $\wedge_k(X)$ gates that results in a circuit with $r$ $\wedge_{k'}(X)$ gates (for $k' < k$) with identical controls allows them to be removed if $r$ is even or replaced with a single gate if $r$ is odd.
Thus if we allow for the preprocessing and additional storage or alternatively online computation then these hardcoded quantum circuits are no more expensive to execute and we can always reduce the number of qubits by k.
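The second and third benefits in the proof above can be sketched for the simple case it describes: a circuit made only of multiply-controlled X gates that all write to the same target qubit. In this hedged illustration each gate is represented just by its set of control labels, `fixed` gives the hardcoded values of some controls, and all names are hypothetical.

```python
from collections import Counter

def hardcode_gate(controls, fixed):
    """Specialise one multiply-controlled X gate to fixed control values.
    Returns the remaining controls, or None if the gate can never fire."""
    remaining = set()
    for qubit in controls:
        if qubit in fixed:
            if fixed[qubit] == 0:
                return None        # a control fixed to 0 removes the gate
            # a control fixed to 1 is simply dropped from the gate
        else:
            remaining.add(qubit)
    return frozenset(remaining)

def hardcode_circuit(gates, fixed):
    """Specialise a list of gates sharing one target qubit, then cancel
    gates with identical controls in pairs (even counts vanish)."""
    counts = Counter()
    for controls in gates:
        reduced = hardcode_gate(frozenset(controls), fixed)
        if reduced is not None:
            counts[reduced] += 1
    survivors = [c for c, r in counts.items() if r % 2 == 1]
    return sorted(survivors, key=sorted)

# Hypothetical circuit: controls named x* stay quantum, z* are hardcoded.
gates = [{"x1", "z1"}, {"x1", "z2"}, {"x2", "z1", "z2"}, {"x2"}]
all_ones = hardcode_circuit(gates, {"z1": 1, "z2": 1})
mixed = hardcode_circuit(gates, {"z1": 1, "z2": 0})
```

Pairwise cancellation is valid here only because the gates share a target and none of them writes to a control qubit, so they commute; a general circuit would need a more careful analysis.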
We briefly mention that we could employ parallelism (communication costs allowing), whereby we compute $U_{n-k}$, then create $2^k$ copies of the resulting state and execute the sequence of unitaries $U_k^\dagger U_* U_k$ upon each one. This strategy allows us to bypass some of the increase in circuit-size that is a hard limit if we treat the quantum oracle as a black-box [28], as this increase only applies to $C_{U_k}$ and $C_{U_*}$.

Applications to Cryptanalysis
In this section we demonstrate that our framework captures one previously proposed attack using Grover's algorithm on Multivariate Quadratic cryptosystems, provides missing asymptotic analysis on its results and improves upon it. We conclude with demonstrating our methodology can be applied to recent quantum cryptanalysis [16] of the proposed quantum resistant cryptosystem SIKE [14].

The Multivariate Quadratic problem over $\mathbb{F}_2$
Several quantum-resistant signature schemes [13,20] have been published which rely upon the hardness of solving the Multivariate Quadratic problem over $\mathbb{F}_2$. Whilst asymptotically more efficient algorithms exist [3,9], a basic attack [23] using Grover's algorithm that was later optimised via preprocessing [21] is both captured and improved upon by our framework. We leave explicit details to Appendix A for reasons of space and to avoid duplication of preexisting work [21,23].
This case-study provides important commentary upon the difficulty in choosing quantum-resistant parameters in relation to Grover's algorithm, as the initial quantum-resistant parameters were suggested [20] in relation to the query-complexity of $O(2^{n/2})$ for Grover's algorithm to solve the MQ problem over $\mathbb{F}_2$. After publication of an explicit design for a quantum bit oracle to use in conjunction with Grover's algorithm for this problem [23], which gave the quantum circuit-size $O(2^{n/2} \cdot mn^2)$ for Grover's algorithm, new parameters were suggested in a subsequent paper [19] in relation to this cost. These costs were also quoted in several specifications for quantum-resistant cryptosystems in the NIST competition [6,8]. Our framework demonstrates that one optimisation [21] using preprocessing lowers the cost to $O(2^{n/2} \cdot mn^{3/2})$ and that by using our framework this improves to $O(2^{n/2} \cdot mn)$ by using an additional $O(m \log_2 n)$ ancilla qubits. We discuss the problem of choosing quantum-resistant cryptographic parameters in relation to anything but the query-complexity of Grover's algorithm further in Section 5.

The Computational SuperSingular Isogeny (CSSI) problem
In this section we reexamine the cost of a Grover-based attack upon the quantum-resistant key encapsulation method SIKE [14], whereby Grover is used to attack the CSSI problem (see Definition 4.2) via searching for a unique collision between two functions. We demonstrate how this attack fits into, and can be improved upon by, our framework. We provide an asymptotically better attack using Grover's algorithm and new estimates for the hardness of solving the CSSI problem via Grover's algorithm under various constraints (see Appendix B). These results impact upon the estimates in [16] which are quoted in the SIKE specification [14].
This problem has previously been studied in [16], where the authors argue that whilst Tani's algorithm [25] may be the most asymptotically efficient method to solve this problem in terms of query-complexity, once the implementation of the underlying quantum data structure and memory is taken into account, Grover's algorithm may be competitive with Tani's algorithm.

On the cost of computing an isogeny-path
Isogenies are morphisms that are rational maps between groups of points of elliptic curves. Their degree is that of their rational map structure, and they are uniquely determined by their kernel. Given the $2^e$-torsion $E[2^e]$ of $E$, a degree-$2^e$ isogeny uniquely corresponds to a bitstring $x_1 \ldots x_e \in \{0,1\}^e$ via a choice of a (cyclic) kernel in $E[2^e]$. Given a kernel, the total cost of computing the corresponding $2^e$-isogenous curve is in $O(e \log_2 e)$ elliptic curve operations [10].

Definition 4.2 (The Computational SuperSingular Isogeny problem¹ [15])
Let $E_1, E_2$ be two supersingular elliptic curves defined over $\mathbb{F}_{p^2}$ such that there is a degree-$2^e$ isogeny $\phi : E_1 \to E_2$ [...], then we have solved the CSSI problem. As in [16] we work under the assumption that there is a single such isogeny $\phi : E_1 \to E_2$ (hence there is one target in our search-space), which is justified under the arguments of [26].

Fitting the attack to our framework
When $e_1 \approx e_2 \approx e/2$ as suggested in [16], we obtain a constant time saving over the simple search case $e_1 = e$, $e_2 = 0$, as $2 \cdot \frac{e}{2} \cdot \log_2(e/2) = e(\log_2 e - 1)$. This does not impact the asymptotic complexity of the search procedure.
In our framework, we define the initial unitary $U_{n-k}$ (in this scenario $n = e$ and $k = e_2$) to compute [...] (where $\hat{g}_{e_1}$ [...]). Theorem 3.1 therefore gives us that we can perform a secondary classical search procedure, and we can use preprocessing as described in Theorem 3.8 to reduce the cost of the circuit. As [...] depends solely upon $z_1 \ldots z_{e_2}$ at all times, after hardcoding is completed the qubits required to represent it can be removed, in addition to the $e_2$ qubits of the search-space Grover is defined upon. After cancellations of layers of X gates, the $2^k$ applications of $U_k^\dagger U_* U_k$ are then simply $2^k + 1$ layers of $2\lceil \log_2 p \rceil$ X gates executed in parallel, with $2^k$ $\wedge_{2\lceil \log_2 p \rceil}(X)$ gates in between the layers.
In relation to the CSSI problem, the security level of SIKE [14] is parameterised by a prime $p$ of the form $2^e 3^f - 1$ where $2^e \approx 3^f$, so that $2^e \approx p^{1/2}$. The problem of breaking an instance of SIKE-p is then equivalent to finding the unique degree-$2^e$ isogeny defined by the public parameters of SIKE-p.

Theorem 4.3 (Grover vs CSSI) Let $C_e$ be the cost (either quantum circuit-size or quantum circuit-depth) of evaluating a degree-$2^e$ isogeny as a reversible quantum circuit. Solving the CSSI problem via Grover's algorithm then has a cost of [...]
Proof. We can express the asymptotic cost of our attack parameterised by our choice of $e_2$ (where we recall $e_1 = e - e_2$) [...]
In [16], Grover's algorithm is used to derive estimates on the cost of attacking SIKE for specific security parameters and in Appendix B, we use our result with their methodology.

Conclusions
The extent to which the overhead of the quantum oracle can be reduced is clearly an important issue if the cryptographic community is choosing parameters relative to a costing of Grover's algorithm which takes into account both the query-complexity and the cost of the queries themselves. The safest route is of course to simply choose the query-complexity as a lower-bound on the circuit-size for such Grover-based attacks and this protects against our optimisation as we only increase the total number of queries.
Our gains have instead been enabled via better use of intermediate computations and exploiting classical computation to create efficient hardcoded circuits, both of which can then be used to find an optimal balance between the cost of the quantum oracle and the query-complexity. Whilst our methods are obviously not applicable to all quantum oracles, a cautionary half-way measure between using the lower bound of query-complexity and the current methodology may be to produce a conservative quantum resource estimate for the cost of the quantum oracle and use the square root of this for the overhead of the quantum oracle when choosing cryptographic parameters relative to Grover's algorithm.

Applying our framework
A previously published use of preprocessing exploits only a basic form of secondary classical search (Theorem 3.1) combined with preprocessing (Theorem 3.8), which under our framework can be interpreted by defining $U_{n-k}$ to evaluate $m$ equations of the form [...], which is possible as they are simply $m$ equations in $n - k$ variables. $U_k$ is then the addition of [...] to each equation register, whilst $U_*$ is a $\wedge_m(X)$ gate as before. It is easily seen that $C_{\wedge_m(X)}$ is $O(m)$, that $C_{U_{n-k}}$ is $O(m \cdot (n - k)^2)$ by the discussion on the previous page, and that $C_{U_k}$ is $O(m \cdot (n - k))$, as hardcoding collapses sums involving $x_i z_j$ to either 0 or $x_i$ and interactions between $z_i z_j$ or $z_k$ to a single bit. The asymptotic cost of Grover's algorithm with the modified quantum bit oracle using secondary classical search and hardcoded bits is therefore
$$O\left(2^{(n-k)/2}\left(m(n-k)^2 + 2^k \cdot m(n-k)\right)\right) \quad (A.6)$$
and by taking the derivative we find that the optimal $k \approx \log_2 n$, so that the asymptotic quantum circuit-size of Grover using approximately $n - \log_2 n + m + 2$ qubits and the method described in [21] is $O(2^{n/2} \cdot mn^{3/2})$. This asymptotic analysis was not performed in the original paper.
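The collapse of terms under hardcoding described above can be checked on a toy instance. The sketch below assumes a simple representation that is not taken from the paper: a polynomial over $\mathbb{F}_2$ is a set of monomials, each monomial a frozenset of variable indices (the empty monomial is the constant 1), and `fixed` assigns values to the hardcoded variables $z_j$. The example polynomial and all names are hypothetical.

```python
def hardcode_polynomial(monomials, fixed):
    """Substitute fixed values for some variables of a polynomial over F_2.
    Monomials containing a variable fixed to 0 vanish; variables fixed to 1
    are dropped; equal monomials cancel in pairs (addition is XOR)."""
    result = set()
    for mono in monomials:
        if any(fixed.get(v) == 0 for v in mono):
            continue
        reduced = frozenset(v for v in mono if v not in fixed)
        result ^= {reduced}      # toggle membership: addition over F_2
    return result

def evaluate(monomials, assignment):
    """Evaluate a polynomial over F_2 at a full assignment of its variables."""
    return sum(all(assignment[v] for v in mono) for mono in monomials) % 2

# Variables 0, 1 play the role of x_1, x_2; variables 2, 3 play z_1, z_2.
poly = {
    frozenset({0, 1}), frozenset({0, 2}), frozenset({1, 3}),
    frozenset({2, 3}), frozenset({1}), frozenset({2}),
}
reduced = hardcode_polynomial(poly, {2: 1, 3: 0})
```

In this instance the mixed term $x_1 z_1$ becomes the linear term $x_1$, the term $x_2 z_2$ disappears, and everything involving only the $z_j$ folds into a single constant bit, leaving one quadratic monomial in the remaining variables.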

Following a heuristic design pattern with our framework
We use our framework to improve upon this result, obtaining a quantum bit oracle for the MQ problem over $\mathbb{F}_2$ that uses $n + km + m + 2$ qubits and enables Grover's algorithm to be implemented with a quantum circuit-size of $O(2^{n/2} \cdot mn)$. This can be done by simply redefining the unitary operators to use Theorem 3.5 in conjunction with Theorem 3.7. We keep $U_{n-k}$ as before, but define each unitary $U_{\chi_{n-k+i}}$ for $1 \leq i \leq k$ by the action of adding only the component [...] It is clear that these linear sums can be computed and stored on ancilla qubits via Theorem 3.7 and that this cost can be shifted to $U_{n-k}$. These unitary operators then fulfil Theorem 3.5 and, after the shifting of costs to $U_{n-k}$, each $U_{\chi_i}$ consists simply of one $\wedge_1(X)$ gate. We can then define $U_*$ to add the component [...], which only involves the bits $z_1 \ldots z_k$ (which collapse to a hardcoded bit and at most $m$ X gates), execute a $\wedge_m(X)$ gate and uncompute the hardcoded bit again via at most one X gate. In this way the cost for Grover's algorithm (see Theorem 3.5) using this quantum bit oracle becomes [...]; hence, after optimisation via taking the derivative again with respect to $k$ and simplifying, we obtain that the optimal cut to choose is $k \approx \log_2(n^2)$. This gives us the result that if we allow $n + m(2\log_2 n + 1) + 2$ qubits then Grover's algorithm requires a quantum circuit-size of $O(2^{n/2} \cdot mn)$.
- Grover may be superior in the Depth × Width cost metric. For SIKE-434 we have a cost of $2^{126}$ for Grover's algorithm compared to $2^{132}$ for Tani's algorithm, and for SIKE-610 the cost is $2^{170}$ compared to Tani's cost of $2^{176}$.
- Grover may be competitive in the gate-based metric. For SIKE-434 this translates into a cost of $2^{126}$ for Grover's algorithm compared to $2^{124}$ for Tani's algorithm, and for SIKE-610 a cost of $2^{171}$ compared to Tani's cost of $2^{169}$.