Randomness tests

From Simulace.info
Revision as of 23:58, 24 January 2016 by Martin.zima (talk | contribs) (created (work in progress))

When dealing with random values and their sources (commonly referred to as random number generators, RNGs), it is important to know what behavior we may expect from the generators we use. To ensure that the RNG implementations used meet a required level of quality (randomness), one can carry out various tests specifically designed to verify such properties (sometimes called 'randomness tests'). In the worst-case scenarios, use of an inappropriate randomization source may have fatal consequences for the systems involved, ranging from completely irrelevant simulation results to serious security vulnerabilities caused by cryptographic weaknesses.

Introduction

In the previous chapters of this textbook, we explained what pseudorandom number generators (PRNGs) are, what their areas of application are, and why it is both difficult and important for computers to generate truly random numbers (or at least to try to). Even when using quality seed sources (which may include various truly random values), as long as we are using deterministic algorithms to generate the output values, the best result we can get is still only pseudo-random. However, given such an algorithm or, in general, any other (possibly black-boxed or unknown) value source, we might want to analyze and measure the actual 'randomness' of its output, and consequently the quality of the source itself.

Definition

Because the requirements for such a source will differ greatly, we should first define what a 'random value' means in our context of system simulations. To consider a set of values random, we should be able to state that, independently for every single item, the probability of it having a certain value is equal to the probability of it having any other value (or at least that it appears so to a certain degree). In terms of our random number generators, this means we will usually require that the probability of occurrence of all possible values corresponds to a certain probability distribution, usually the uniform distribution (this is the case for most PRNGs). In consequence, the resulting set of values should be free of any detectable, non-random patterns.

For simplicity, we will now omit the extra intricacies that cryptographically secure pseudo-random number generators (CSPRNGs) entail and focus on the more general case of regular pseudo-random number generators. CSPRNGs are usually also required to be unpredictable: given a number of previously generated values (but not the generator's internal state), it should be impossible (using reverse engineering or other methods) to predict any future values, i.e. to pass the next-bit test. They usually also have a much more sophisticated seed-derivation procedure, and in general they serve a slightly different purpose than other generators; proper design and testing of a CSPRNG is a complicated case of its own.

The problem with testing randomness is that given any (pseudo-random-generated) data, one can never truly determine their real randomness; one can only make a good guess at best (on a side note, I remember seeing this humorously commented on in a Dilbert strip a long time ago [1]). Although we cannot determine the true randomness of data, there is a number of options available to measure and quantify how random the generated values appear to be, or, more precisely, how well the values fit the aforementioned 'randomness' specifications. Whether a generator passes such a test is then only a matter of the limits of what we accept as sufficiently random: either the test results fall into our accepted ranges and the generator succeeds, or they do not and it fails (and these criteria will differ for different PRNG use scenarios).

To carry out most of these tests, we will usually operate on a dataset of a specific fixed length; for obvious reasons, we cannot test an infinitely long (or, for that matter, simply too long) dataset, meaning the values will usually be only a part of the random sequences a generator can produce. The tested sequence should still be long enough (or the test should be repeated enough times) to 1) statistically eliminate the effect of patterns that will always occur in any shorter sequence by chance, and 2) reveal possible algorithm deficiencies that might only occur over longer runs of values (like recurring patterns after a full cycle of a PRNG).

Tests

One of the first authors to discuss PRNG randomness testing was Donald Knuth in his book The Art of Computer Programming, Volume 2: Seminumerical Algorithms (1969) [2], which presents a complete battery of statistical tests, mostly operating on empirical data produced by the generators. This battery of tests was later adapted and extended by George Marsaglia in 1996 in his Diehard Battery of Tests of Randomness, accompanied by test implementations in ANSI C [3].

Some of the most common tests included in those batteries are described in the following sections.

Equidistribution test

One of the discussed requirements that a sequence must meet in order to be considered random is the uniform distribution of its values. To verify this assumption, we can measure the frequency of every possible value in the result set and compare it with the expected probability distribution (i.e. the uniform distribution). Generally speaking, there are two common methods to achieve this, the first being the Kolmogorov–Smirnov test. An alternative approach is the (Pearson's) chi-squared test, which is usable if we consider a smaller group of discrete values.
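A minimal sketch of the chi-squared variant in Python (this is an illustrative implementation, not code from the batteries discussed here; the sample size and category count are arbitrary choices):

```python
import random
from collections import Counter

def chi_squared_statistic(values, categories):
    """Pearson's chi-squared statistic against a uniform expectation."""
    n = len(values)
    expected = n / len(categories)  # uniform: every category equally likely
    counts = Counter(values)
    return sum((counts[c] - expected) ** 2 / expected for c in categories)

random.seed(42)  # fixed seed so the run is reproducible
sample = [random.randint(0, 9) for _ in range(10000)]
stat = chi_squared_statistic(sample, range(10))

# For 10 categories there are 9 degrees of freedom; the 5% critical
# value is about 16.92, so a statistic below that is consistent with
# a uniform source at that significance level.
print(f"chi-squared statistic: {stat:.2f}")
```

A generator would fail this test if the statistic repeatedly exceeded the chosen critical value across independent samples; a single borderline result is expected to occur occasionally even for a good generator.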

Because the equidistribution of single values is itself only a part of the requirements ensuring randomness, Knuth [2] also suggests extending this method to testing the frequency of pairs or longer specific subsequences in the resulting sequence as a whole. Each of the combinations in one of those considered groups should be 'uniformly distributed in an independent manner'.
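The pair-frequency extension can be sketched by applying the same chi-squared idea to non-overlapping pairs (again an illustrative sketch, with arbitrary alphabet size and sample length):

```python
import random
from collections import Counter

def serial_pair_statistic(values, k):
    """Chi-squared statistic over non-overlapping pairs of values in range(k)."""
    pairs = list(zip(values[0::2], values[1::2]))  # (v0,v1), (v2,v3), ...
    expected = len(pairs) / (k * k)  # k*k equally likely pair combinations
    counts = Counter(pairs)
    return sum((counts[(a, b)] - expected) ** 2 / expected
               for a in range(k) for b in range(k))

random.seed(1)
sample = [random.randint(0, 3) for _ in range(20000)]
stat = serial_pair_statistic(sample, 4)

# 16 pair categories -> 15 degrees of freedom; 5% critical value ~ 25.0.
print(f"serial pair statistic: {stat:.2f}")
```

Note that a sequence can pass the single-value test and still fail this one, e.g. a generator that strictly alternates between low and high values has uniform single-value frequencies but a badly skewed pair distribution.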

Autocorrelation

Autocorrelation (also known as serial correlation) is a term in statistics describing the correlation between values of a function or dataset at different points in time. A truly random sequence should exhibit no significant correlation between a value and the values that precede it; a large autocorrelation at some lag reveals a repeating, non-random pattern in the generator's output.
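A sketch of the sample autocorrelation coefficient at a given lag (illustrative code, not from the test batteries; the lags and sample size are arbitrary):

```python
import random

def autocorrelation(values, lag):
    """Sample autocorrelation coefficient of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

random.seed(7)
sample = [random.random() for _ in range(5000)]

# For an uncorrelated sequence of length n, the coefficients should
# stay roughly within +-2/sqrt(n) (about +-0.028 here) of zero.
for lag in (1, 2, 5):
    print(f"lag {lag}: {autocorrelation(sample, lag):+.4f}")
```

A strictly alternating sequence such as 1, -1, 1, -1, ... would instead produce a lag-1 coefficient near -1, immediately exposing the pattern.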

Gap test
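The gap test, also from Knuth's battery [2], examines the lengths of 'gaps' between successive values that fall into a chosen interval. For a random sequence, a value lands in an interval of width p with probability p, so the gap lengths should follow a geometric distribution, P(gap = r) = p(1 - p)^r. A minimal sketch (illustrative code; the interval and sample size are arbitrary choices, and a full test would compare the gap-length histogram against the geometric distribution with a chi-squared statistic):

```python
import random

def gap_lengths(values, alpha, beta):
    """Lengths of runs of values outside [alpha, beta) between two hits."""
    gaps = []
    gap = 0
    for v in values:
        if alpha <= v < beta:
            gaps.append(gap)  # a hit closes the current gap
            gap = 0
        else:
            gap += 1
    return gaps

random.seed(3)
sample = [random.random() for _ in range(10000)]
gaps = gap_lengths(sample, 0.0, 0.5)  # interval of width p = 0.5

# Under the geometric model the mean gap length is (1 - p) / p = 1 here.
mean_gap = sum(gaps) / len(gaps)
print(f"observed mean gap: {mean_gap:.3f}")
```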

References

  1. Dilbert Comic Strip of 2001-10-25. Dilbert by Scott Adams [online]. 2001 [cited 2016-01-24]. Available at: http://dilbert.com/strip/2001-10-25
  2. KNUTH, Donald Ervin. The Art of Computer Programming. 3rd ed. Reading: Addison-Wesley Publishing Company, 1998, xiii, 762 pp. ISBN 0-201-89684-2.
  3. The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness. Florida State University, Department of Statistics [online]. [cited 2016-01-24]. Available at: http://stat.fsu.edu/pub/diehard/