## Abstract

Chaotic pseudo-random number generators have been seen as promising candidates for secure random number generation. Using the logistic map as state transition function, we perform number generation experiments that illustrate the challenges of a replication study. These challenges range from uncertainty about the rounding mode of the arithmetic hardware, through the number representation chosen for variables, to compiler or programmer decisions on the evaluation order of arithmetic expressions. We find that different decisions lead to different streams with different security properties, where we focus on period length. However, descriptions in articles are often not detailed enough to deduce all decisions unambiguously. To address similar problems in other replication studies for security applications, we propose recommendations for describing numerical experiments on security applications so that the above challenges are avoided. Moreover, we use the results to propose higher-radix and mixed-radix representations to trade storage size for period length, and investigate whether exploiting the symmetry of the logistic map function in the number representation is advantageous.

## 1 Introduction

Chaotic pseudo-random number generators (cPRNGs), in particular those based on the logistic map [1], have been considered as a means for secure PRNGs. Experiments [11], [13] revealed that the period lengths are not satisfactory in relation to state size. When studying the descriptions of these experiments, we found that they omit some implementation details that also influence the period length, even for fixed parameter settings. Hence, replicating those studies poses some challenges, which we report and from which we derive recommendations for describing experiments that include numeric calculations. We also show experimentally that implementation details such as the chosen number representation, the rounding mode and the order of evaluation in a formula influence the results.

Beyond the typical number representations integer, i. e. fixed-point, and (binary) floating-point, we present number representation formats in between those two that use a radix higher than 2 and possibly mixed radix. We demonstrate experimentally that those formats make it possible to trade accuracy against storage. Thus, our investigations also relate to the field of approximate computing. Moreover, we investigate whether exploiting the symmetry of the logistic map function $f$, i. e. $f(x) = f(1-x)$, in the number representation is advantageous.

The remainder of this article is structured as follows. Section 2 gives information on pseudo-random number generation for security applications, especially when using chaotic functions, summarizes basic information on number representations for rational numbers, and discusses related work. Section 3 presents the challenges that we encountered in experiments with chaotic pseudo-random number generation based on the logistic map and reports on our own experiments to demonstrate those challenges. In Section 4 we present number formats between standard floating-point and fixed-point representations, as well as experimental results indicating that those formats allow a tradeoff between storage size and achievable period lengths. Section 5 investigates how symmetry in the logistic map function can be exploited in number representations, while Section 6 concludes with recommendations on configuration details to be reported about numerical experiments, and an outlook on future work.

## 2 Background

### 2.1 Chaotic pseudo-random number generation

A pseudo-random number generator is a finite state automaton. Upon seeding, it receives input which is transformed into a value for the inner state. When the generator is called from an application, an output is generated by applying an output function on the state, and the state transitions to the follow-up state by applying the state transition function (or update function) on the current state. The follow-up state is unique for each state, as the generator receives no further input until a re-seeding takes place.
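The automaton view described above can be sketched in a few lines of Python. This is only an illustration: the class name `Generator`, the identity seeding and the toy counter instance are assumptions made for the example, not part of any generator discussed in this article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Generator:
    """PRNG as a finite state automaton (illustrative names)."""
    transition: Callable[[int], int]  # state transition (update) function
    output: Callable[[int], int]      # output function applied to the state
    state: int = 0

    def seed(self, value: int) -> None:
        # Seeding transforms the input into the inner state (here: identity).
        self.state = value

    def next(self) -> int:
        # Derive the output from the current state, then move to the
        # unique follow-up state; no further input is consumed.
        out = self.output(self.state)
        self.state = self.transition(self.state)
        return out


# Toy instance: a 4-bit counter as state, the lowest state bit as output.
gen = Generator(transition=lambda s: (s + 1) % 16, output=lambda s: s & 1)
gen.seed(5)
bits = [gen.next() for _ in range(4)]
```

With a counter as state, the output alternates between 1 and 0, which is uniformly distributed but clearly not random-looking.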

The pseudo-random number generator should produce outputs that follow the claimed distribution (e. g. if bits are output with uniform distribution, each value should appear 50 % of the time), pass statistical tests of randomness like Diehard [17] (the sequence 0,1,0,1,0,1 also has uniform distribution but does not look random), and should have a sufficiently long period before output sequences repeat. Repetition itself is unavoidable, as the inner state is finite. A good exposition of random number generators can be found in [12].

For using pseudo-random numbers in security applications, it should not be possible to infer the inner state from outputs, or to predict future outputs from current outputs. Finally, it should not be possible to compute from one state previous states or previous outputs [7].

Therefore, while in general the state could be a counter and the complexity could rest in the output function, here the state transition function must be one-way. Thus, so-called chaotic functions seemed attractive for use as state transition functions.

The *logistic map* [1] is a function

$$f(x) = a \cdot x \cdot (1 - x) \qquad (1)$$

on the interval [0;1], with a parameter *a*.
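As a minimal illustration, the sketch below iterates the logistic map in double precision, assuming its usual form $f(x) = a \cdot x \cdot (1 - x)$. The function names `logistic` and `orbit` and the example values are hypothetical choices for this sketch.

```python
def logistic(x: float, a: float) -> float:
    """One application of the logistic map a * x * (1 - x) in double precision."""
    return a * x * (1.0 - x)


def orbit(x0: float, a: float, n: int) -> list:
    """Return the start value followed by n successive iterates."""
    xs = [x0]
    for _ in range(n):
        xs.append(logistic(xs[-1], a))
    return xs


# Example values (assumptions): a close to 4, start value 0.25.
trajectory = orbit(0.25, 3.99, 5)
```

For *a* below 4, every iterate stays in [0;1), because the maximum of the map is *a*/4.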

If parameter *a* is close to 4, then the range

### 2.2 Finite-length representations for real numbers

For non-integral numbers, the commonly used representation is floating-point numbers [14], which were standardized by IEEE in 1985 and have been revised twice since then [8]. Interval [0;1) is partitioned into

Thus, a floating-point representation is comprised of a power of 2 times a mantissa, i. e.

As IEEE standard 754 also covers negative numbers and numbers larger than 1, a floating-point representation with a sign bit *s*, mantissa bits *M* of length *m* and an exponent *E* of *e* bits comprises

where *de-normalized* numbers

the most important of which is 0. Thus interval [0;1) is represented by all representations with sign bit *m*, round to infinity, i. e. always round up, and round to nearest, i. e. round to zero if bit

IEEE Standard 854 [9] generalizes floating-point representations to radix or basis

Also fixed-point representations [14] can be used to represent interval [0;1). For other real numbers, a combination of their integral part and their rational part is used for representation. A fixed-point representation *M* of

Rounding can be applied in the same way as for floating-point numbers.

Please note that for

### 2.3 Related work

The investigations by Keller and Wiese [11] and Kneusel [13] focus on the influence of parameter *a*: even for two close values of

An extension of fixed-point format is described in [4], where two different exponents

[2] point out that higher radix floating-point format is advantageous in reconfigurable hardware, but that normalization (ensuring that not all of the

In [6], the necessary mantissa length (bitwidth) for a given application is analyzed. This approach might be used as a post-processing step for our approaches in Sections 3.4 and 4.

## 3 Challenges

In this section, we investigate challenges for the replication of execution results when evaluating the formula $a \cdot x \cdot (1 - x)$ of the logistic map.

### 3.1 Order of computations

For real numbers, the order of the above multiplications does not matter, as multiplication is commutative and associative. Yet, associativity no longer holds for finite number representations, because rounding occurs. Hence, assuming commutativity, there are three possibilities for choosing the arguments of the first multiplication. Even when the source code is given, we still cannot be sure about the order, as compilers may re-order operations, e. g. to improve performance for a given processor architecture. As long as the compiler product, its version and its compilation flags are specified in detail, this ambiguity can be avoided. However, when the source code is not given, the implementer of the replication code is free to choose the order, and over time several different implementation orders have been seen by the author. While the differences between results from different orders will be tiny, the resulting state graphs will differ notably, cf. Subsection 3.4.

As an example, we assume a representation with 8 significant binary digits and take

Multiplying *x* and *a* first results in 0.84259033203125 = 0.11010111101101 (binary), which is rounded to 0.11011000 (binary) = 0.84375.
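The effect of the evaluation order can be reproduced with exact rational arithmetic and an explicit rounding step after each multiplication. The sketch below is an illustration only: the helper `round_sig` and the operand values *a* = 251/64 and *x* = 55/256 are assumptions, chosen so that the product of *x* and *a* equals the value 0.84259033203125 quoted above.

```python
import math
from fractions import Fraction


def round_sig(v: Fraction, bits: int = 8) -> Fraction:
    """Round v >= 0 to `bits` significant binary digits (round to nearest)."""
    if v == 0:
        return v
    k = 0
    # Scale v by a power of two so that it lies in [2**(bits-1), 2**bits).
    while v * Fraction(2) ** k < 2 ** (bits - 1):
        k += 1
    while v * Fraction(2) ** k >= 2 ** bits:
        k -= 1
    scale = Fraction(2) ** k
    return Fraction(math.floor(v * scale + Fraction(1, 2))) / scale


def mul(u: Fraction, v: Fraction) -> Fraction:
    """Multiply exactly, then round the product to 8 significant bits."""
    return round_sig(u * v)


a = Fraction(251, 64)   # assumed parameter, 3.921875
x = Fraction(55, 256)   # assumed state, 0.21484375
y = 1 - x               # 1 - x is exactly representable here

ax_first = mul(mul(a, x), y)   # (a * x) * (1 - x)
ay_first = mul(mul(a, y), x)   # (a * (1 - x)) * x
xy_first = mul(mul(x, y), a)   # (x * (1 - x)) * a
```

For these assumed operands, two of the three orders agree while the third differs in the last retained bit, which is enough to send the generator to a different follow-up state.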

### 3.2 Rounding modes

Even if the former challenge has been mastered, the multiplications hold a further one. So far, we have assumed that rounding goes to the nearest representable number. Yet the IEEE 754 standard defines several rounding modes [8], i. e. round towards zero (always cut off the excess digits), round towards infinity (always round up to the next representation), and round to nearest (the rounding applied so far), i. e. depending on the value of the first digit to be cut away, one rounds up or down.

All of these rounding modes are nowadays implemented in hardware by IEEE 754-compliant floating-point units in processor cores. The rounding mode is chosen by the BIOS on power-up, but might also be changed by the operating system or even by the application program, e. g. via function `fesetround()` in *fenv.h*.

As an example, when multiplying

Unfortunately, rounding modes normally are not mentioned when experimental conditions are described in articles, so everyone seems to assume that round to nearest is used.
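The three rounding modes can be emulated on integers to make their effect visible. The following sketch is an assumption-laden illustration: the function `round_bits` and its mode names are hypothetical, and it does not switch the hardware rounding mode as `fesetround()` would. It rounds the 14-bit product 0.11010111101101 from Section 3.1 to 8 fractional bits.

```python
def round_bits(n: int, drop: int, mode: str) -> int:
    """Drop the lowest `drop` (>= 1) bits of the non-negative integer n."""
    q, r = divmod(n, 1 << drop)
    if mode == "toward_zero":
        return q                       # simply cut off the excess bits
    if mode == "up":
        return q + (1 if r else 0)     # round up whenever anything is cut off
    if mode == "nearest":
        return q + (r >> (drop - 1))   # decide by the first bit cut away
    raise ValueError(f"unknown rounding mode: {mode}")


product = 0b11010111101101             # 0.11010111101101 scaled by 2**14
results = {m: round_bits(product, 6, m) / 256
           for m in ("toward_zero", "up", "nearest")}
```

Here round towards zero yields a different 8-bit value than the other two modes, so a replication that silently uses another mode follows a different trajectory from the first step on.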

### 3.3 Number representation

So far, we have used the straightforward number representation for reals, i. e. floating-point numbers. However, this has some disadvantages. First, two bits of the available format are lost: we are not using negative numbers, as we are focusing on interval [0;1), and we only use half of the possible exponent values. Second, the representable numbers in IEEE 754 are not evenly distributed over the interval [0;1), as mentioned in Section 2.

In contrast, a fixed-point number representation comprising 32 bits has `uint32` to implement this scheme. Addition (and subtraction) is done as usual, at least as long as no overflow occurs, which we can exclude here. Multiplication is done by converting the arguments to `uint64`, doing the multiplication, shifting by 32 bits to the right, and converting back to `uint32`. Please note that in addition to a round towards zero (by shifting 32 bits to the right), also other rounding modes can be realized by considering the highest bit to be rounded away. Rounding up then is realized by incrementing after the right shift.
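The multiplication scheme just described can be sketched as follows. Python integers are unbounded, so the widening to `uint64` is implicit and a mask emulates the conversion back to `uint32`; the function name `fx_mul` and the flag `nearest` are assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF


def fx_mul(x: int, y: int, nearest: bool = True) -> int:
    """Multiply two 32-bit fixed-point fractions from [0;1)."""
    p = x * y                   # full product, fits in 64 bits
    q = p >> 32                 # right shift = round towards zero
    if nearest:
        q += (p >> 31) & 1      # increment if the highest dropped bit is set
    return q & MASK32           # back to 32 bits


half = 1 << 31                  # represents 0.5
quarter = fx_mul(half, half)    # represents 0.25
```

Because both factors are below 2^32, the incremented result still fits in 32 bits, so no overflow handling is needed in this sketch.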

Multiplication of *a*, which is in interval [3;4), is realized by representing

Please note that this realization, which has been presented and used in [20], prefers a particular order of multiplications, i. e. first multiply *x* with

### 3.4 Experiments

As floating-point implementations have been investigated in the past [11], [13], we will first focus on fixed-point implementations in our experiments. For 63 equi-spaced values of *a*, ranging from *a*, i. e. their choice should be as good as random. The results are depicted in Table 1.

Please note that differences between the rounding modes have been investigated in [10], and that in all except one scenario, round to nearest led to the longest periods. Hence we will use this rounding mode in the rest of this article.

Table 1: Cycle lengths for fixed-point implementations.

| fixed-point | min | avg | std. dev | max |
|---|---|---|---|---|
| | 2493 | 14755.3 | 7997.6 | 34204 |
| | 210 | 31220.3 | 23147.2 | 98969 |
| | 6347 | 37542.7 | 21576.7 | 101665 |
| all | 210 | 27839.5 | 21151.0 | 101665 |

The order where *a* in the same range, over which the average is taken. The same situation is encountered for floating-point formats, cf. Section 5. For all variants, the spread of cycle lengths among the different values of *a* is large: from 210 to 101,665.
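How a cycle length is measured for one parameter value and one starting point can be sketched as follows. This is a small-scale illustration under stated assumptions: the 16-bit state, the truncating `step` function, which evaluates *x*·(1−*x*) first, and the example values are hypothetical and do not reproduce the 32-bit realization used for the experiments above.

```python
def step(x: int, a_fx: int, bits: int) -> int:
    """Fixed-point logistic map on a `bits`-bit fraction, rounding towards zero."""
    one = 1 << bits
    t = (x * (one - x)) >> bits        # x * (1 - x), at most one / 4
    return (a_fx * t) >> bits          # a * t, with a_fx = a * 2**bits < 4 * one


def cycle_length(f, x0: int) -> int:
    """Length of the cycle that the orbit of x0 under f eventually enters."""
    seen = {}                          # state -> index of first visit
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = f(x)
        i += 1
    return i - seen[x]


BITS = 16                              # assumed small state size for a quick run
A_FX = int(3.99 * (1 << BITS))         # assumed parameter a = 3.99
length = cycle_length(lambda s: step(s, A_FX, BITS), 1 << 14)
```

Storing all visited states is affordable for small state sizes; for 32-bit states, a constant-memory method such as Floyd's or Brent's cycle detection would be a natural alternative.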

To compare with floating-point implementations, we used an implementation based on IEEE754-compliant floats with round to nearest. For them (cf. Section 5 and Table 3), computation of

This comparison also serves to illustrate a tradeoff that is possible. We might shorten the number of bits used in the fixed-point representation. Table 2 illustrates how average cycle lengths (averaged over 63 values of *a* as before, with round to nearest an

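Table 2: Average cycle lengths for fixed-point representations with a reduced number of bits.

| reduced bits | avg. cycle length |
|---|---|
| 0 | 37542.7 |
| 1 | 24319.3 |
| 2 | 16003.9 |
| 3 | 9670.9 |
| 4 | 7107.6 |
| 5 | 6560.0 |
| 6 | 4016.9 |
| 7 | 3310.9 |
| 8 | 2446.8 |
| 9 | 1746.3 |
| 10 | 1069.5 |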

## 4 Comparing different number formats

Fixed-point and floating-point representations are just the opposite ends of a whole spectrum of possible number formats. In the following, we describe a generic scheme to specify different number formats. To do so, we partition the interval [0;1) into a sequence of *d* intervals

Thus, to store a number in this format, we need *m* bits to represent the mantissa (or

We start with fixed-point representations. They only comprise one interval

In single precision floating-point representations, *M* must be set (so-called implicit 1). Thus, only *m* bits, we really work with a

Another example of a possible format is a separation in

If all differences

Hence, for given total storage size of *m*, and for each *d* the exponents can have a number of possible distances leading to different number formats, which will in turn lead to different cycle lengths in the logistic map application. In this configuration space, we can search for the representation which is best with respect to a particular application.

Alternatively, one can fix the required performance, i. e. cycle length in our application, and search for the representation with shortest storage size that achieves that performance, taking up the view of approximate computing [19].

Moreover, from all combinations of performance and storage size, the user can draw a plot and identify the Pareto front, i. e. the curve of Pareto-optimal combinations, which is the (typically small) number of combinations that the user can choose from depending on her needs [3].
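Identifying the Pareto front among measured combinations is straightforward. In the sketch below, the function `pareto_front` and the candidate pairs of storage size and cycle length are hypothetical illustration data, not results from our experiments.

```python
def pareto_front(points):
    """Keep (storage_bits, cycle_length) pairs not dominated by another pair.

    A pair dominates another if it needs no more storage and achieves at
    least the same cycle length, and is strictly better in one of the two.
    """
    front = []
    for s, c in points:
        dominated = any(
            s2 <= s and c2 >= c and (s2 < s or c2 > c)
            for s2, c2 in points
        )
        if not dominated:
            front.append((s, c))
    return sorted(front)


# Hypothetical measurements: (storage size in bits, average cycle length).
candidates = [(32, 900), (30, 950), (28, 400), (28, 600), (24, 100)]
front = pareto_front(candidates)
```

Only the pairs on the front are sensible choices; every other combination is beaten in storage size, cycle length, or both by some pair on the front.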

In general, if

To explore the spectrum of possible number formats, we have implemented a custom library where *d*, *M* and the sequence of ^{[1]}

We investigate total lengths *t* from 32 down to 24, with mantissa lengths ranging from *t* (resulting in *e* and *m* we explore sequences of

For each number representation, we choose again 63 different values of *a* as before, and compute the minimum, maximum and average cycle lengths (plus standard deviation).

As in the previous experiments, we see that the standard deviation is quite large (ranging from 49 % to 69 % of the average) and does not improve if we take a larger number of samples. For almost all (31 of 33) combinations of *e* and *m*, we find that

In the following, we further investigate the *m* it is largely constant over the different exponent lengths *e*, and the curves are ordered by growing *m*. Please note that the cycle lengths for

Figure 2 depicts that for a fixed exponent length *e*, the average cycle length grows with total length *m*, and the curves are ordered by shrinking exponent length *e*. The only exception is a crossing of the curves for *e*, i. e. shrinking *m*, and the curves are ordered by *t*.

In summary, we find that for our application, distances

Still, our view unifies several proposals for number representations from the literature, cf. Section 2.3, and our implementation makes it possible to find the optimal representation, e. g. for a reconfigurable hardware device with a fixed application, with low effort.

## 5 Extending number formats for symmetry

We note that the logistic map as defined in Eq. (1) is symmetric with middle 0.5, i. e. for *z* is any of the previous representations, only restricted to [0;0.5] instead of [0;1], then *s*. The product *s*, and

If

Table 3: Cycle lengths for ordinary and symmetric floating-point representations.

| ordinary | min | avg | std. dev | max |
|---|---|---|---|---|
| | 91 | 3172.1 | 2053.1 | 7802 |
| | 141 | 2089.2 | 1330.4 | 5300 |
| | 49 | 2281.9 | 1630.9 | 6308 |
| all | 49 | 2514.4 | 1761.8 | 7802 |

| symmetric | min | avg | std. dev | max |
|---|---|---|---|---|
| | 240 | 2459.5 | 1610.0 | 6803 |
| | 1127 | 2596.8 | 1710.6 | 10277 |
| | 306 | 1734.8 | 915.3 | 4316 |
| | 25 | 1945.4 | 1316.6 | 5630 |
| all | 25 | 2184.1 | 1465.8 | 10277 |

To see if “symmetric” floating-point representations are advantageous, we did experiments similar to those in Section 3.4: for each *a* tested, we find the largest cycle reachable from 32 starting points, and we compute minimum, maximum, and average cycle length over 32 values of *a* starting from *a* last, or *x* last, or *a* last, i. e. computation of *z* last, i. e. computation of

The results can be seen in Table 3. There is no clear winner. With respect to average cycle length found, ordinary floating-point representation with multiplication order “*a* last” is best, but the best variant of symmetric floating-point representations is much better in minimum (and maximum) cycle length found. Moreover, the standard deviations are very large compared to the averages, so that the averages are not very meaningful, and minimum and maximum lengths might provide a better insight. Please note that increasing the number of trials over which averages are computed does not help here, due to the chaotic nature of the logistic map for different values of *a* close to 4. For 1024 equi-spaced values of *a* starting from *a* last. Yet, in some variants standard deviation even increases, e. g. from 1630 to 1688 for conventional floats and multiplication with *a* is done last reduces to 2459.7, so that the range of averages is much closer together.

## 6 Conclusions

We have presented challenges that occur with respect to replicability in security or cryptographic applications that involve numeric parts originally devised for real numbers. As an example, we have chosen the logistic map, which has been one proposal for a chaotic pseudo-random number generator. Our experiments demonstrate that the differences resulting from leaving the ambiguities behind the above challenges unresolved are relevant.

As similar problems might occur in other security applications as well, we derive the following recommendations. First, the experimenters should determine and report the rounding mode used in the original experiments. Second, the order of computations used in the original experiments should be reported as well, and the original experimenters should ideally vary it to check whether notable differences in the results follow. Finally, experimenters should consider different number representations and report their influence on the results as well as the reasons for the final choice of representation.

Moreover, we have investigated higher and mixed radix representations in between fixed-point and binary floating-point representations, plus symmetric representations, and demonstrated how their tradeoff between storage size and performance (in the sense of period length in the application) can be used by users to tailor their application to their needs, thus connecting this investigation to the field of approximate computing.

Future work will comprise a deeper analysis of our experimental results. Also, the scope of numerical challenges will be extended from chaotic random numbers to other branches of IT security. Moreover, the introduction of redundancy by higher radix representations, e. g. by skipping normalization [2], opens the path for information hiding, e. g. storage covert channels [18], in such representations, which we plan to investigate further.

## Article note

This article is an extended version of [10].

**Funding source:** Horizon 2020 Framework Programme

**Award Identifier / Grant number:** 833042

**Funding statement:** This work is supported by project SIMARGL (Secure intelligent methods for advanced recognition of malware, stegomalware & information hiding methods, https://simargl.eu), which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 833042.

## About the authors

M. Sc. Carina Heßeling studied Computer Science at FernUniversität in Hagen, Germany (Bachelor 2017, Master 2021). Since August 2021, she has been a PhD student and Research Assistant at FernUniversität in Hagen, Germany. Her research interests are approximate computing and network steganography.

Prof. Dr. Jörg Keller studied Computer Science and Mathematics at Saarland University, Saarbrücken, Germany (Master 1989, PhD 1992, Habilitation 1996). Since 1996, he has been Professor for Computer Engineering at FernUniversität in Hagen, Germany. His research interests are network steganography, energy-efficient and fault-tolerant parallel computing, and virtual laboratories.

## References

1. Marcel Ausloos and Michel Dirickx, editors. *The Logistic Map and the Route to Chaos*. Springer, Berlin Heidelberg, 2006. doi:10.1007/3-540-32023-7

2. Bryan Catanzaro and Brent E. Nelson. Higher radix floating-point representations for FPGA-based arithmetic. In *13th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2005), Proceedings*, Napa, CA, USA, 17–20 April 2005, pages 161–170. IEEE Computer Society, 2005. doi:10.1109/FCCM.2005.43

3. Matthias Ehrgott. *Multicriteria Optimization*, volume 491 of *Lecture Notes in Economic and Mathematical Systems*. Springer, 2000. doi:10.1007/978-3-662-22199-0

4. Chun Te Ewe. Dual fixed-point: An efficient alternative to floating-point computation for DSP applications. In *Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL)*, Tampere, Finland, August 24–26, 2005, pages 715–716. IEEE, 2005.

5. Philippe Flajolet and Andrew M. Odlyzko. Random mapping statistics. In *Advances in Cryptology – EUROCRYPT ’89*, volume 434 of *Lecture Notes in Computer Science*, pages 329–354. Springer, 1989. doi:10.1007/3-540-46885-4_34

6. Altaf Abdul Gaffar, Oskar Mencer, Wayne Luk, Peter Y. K. Cheung, and Nabeel Shirazi. Floating-point bitwidth analysis via automatic differentiation. In *Proceedings of the 2002 IEEE International Conference on Field-Programmable Technology, FPT 2002*, Hong Kong, China, December 16–18, 2002, pages 158–165. IEEE, 2002. doi:10.1109/FPT.2002.1188677

7. Oded Goldreich. *Foundations of Cryptography I: Basic Tools*. Cambridge University Press, Cambridge, UK, 2001. doi:10.1017/CBO9780511546891

8. IEEE standard for floating-point arithmetic, 2019. IEEE Standard 754-2019 (Revision of IEEE 754-2008).

9. IEEE standard for radix-independent floating-point arithmetic, 1987. IEEE Standard 854-1987.

10. Jörg Keller. Chaotic pseudo random number generators: A case study on replication study challenges. In *ARES 2021: The 16th International Conference on Availability, Reliability and Security*, Vienna, Austria, August 17–20, 2021, pages 94:1–94:4. ACM, 2021. doi:10.1145/3465481.3470062

11. Jörg Keller and Hanno Wiese. Period lengths of chaotic pseudo-random number generators. In *Proc. IASTED International Conference on Communication, Network, and Information Security (CNIS 2007)*, Calgary, Canada. Acta Press, 2007.

12. Wolfgang Killmann and Werner Schindler. A proposal for: Functionality classes for random number generators. Report AIS 20/31 v2.0, Federal Office for Information Security/Bundesamt für Sicherheit in der Informationstechnik (BSI), Bonn, Germany, 2011.

13. Ronald T. Kneusel. *Random Numbers and Computers*. Springer, Cham (CH), 2018. doi:10.1007/978-3-319-77697-2

14. Israel Koren. *Computer Arithmetic Algorithms*. Taylor & Francis, Milton Park, UK, 2nd edition, 2001.

15. Ping Li, Zhong Li, Wolfgang A. Halang, and Guanrong Chen. A stream cipher based on a spatiotemporal chaotic system. *Chaos, Solitons & Fractals*, 32(5):1867–1876, 2007. doi:10.1016/j.chaos.2005.12.021

16. Gerald Luber. Statistical analysis of outputs from chaotic pseudo random number generators with fixpoint representation (in German: Statistische Analyse der Ausgaben von chaotischen Pseudozufallszahlengeneratoren mit Festkommadarstellung). Master thesis, FernUniversität in Hagen, Germany, 2014.

17. George Marsaglia. The Marsaglia random number CDROM including the Diehard battery of tests of randomness. Technical report, Florida State University, Tallahassee, FL, 1995.

18. Wojciech Mazurczyk, Steffen Wendzel, Sebastian Zander, Amir Houmansadr, and Krzysztof Szczypiorski. *Information Hiding in Communication Networks. Fundamentals, Mechanisms, and Applications. IEEE Series on Information and Communication Networks Security*. Wiley-IEEE, 2016. doi:10.1002/9781119081715

19. Sparsh Mittal. A survey of techniques for approximate computing. *ACM Comput. Surv.*, 48(4):62:1–62:33, 2016. doi:10.1145/2893356

20. Lena Oden and Jörg Keller. Improving cryptanalytic applications with stochastic runtimes on GPUs. In *Proc. 11th International Workshop on Accelerators and Hybrid Emerging Systems (AsHES@IPDPS 2021)*, New York, NY, May 2021. IEEE, 2021. doi:10.1109/IPDPSW52791.2021.00077

21. Dhananjay S. Phatak and Israel Koren. Hybrid signed-digit number systems: A unified framework for redundant number representations with bounded carry propagation chains. *IEEE Trans. Computers*, 43(8):880–891, 1994. doi:10.1109/12.295850

**Received:** 2021-11-30

**Accepted:** 2022-01-21

**Published Online:** 2022-02-04

**Published in Print:** 2022-04-26

© 2022 Heßeling and Keller, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.