Use and Testing of Pseudo-Random Number Generators

A. M. Amthor, J. C. Woods

North Georgia College and State University

Abstract

Several types of pseudo-random number generators (PRNGs) will be discussed along with methods for evaluating PRNGs. Some applications of random number generators will also be discussed. Special attention will be given to the limitations of particular generators and requirements for Monte Carlo simulations and cryptographic applications. Examples will be given throughout.

1Introduction

Today randomness is a necessary ingredient in many man-made systems, and, while these systems may vary widely, most share a common centerpiece, a digital computer. This fact allows a valuable simplification at the start. Because they are to be used on digital computers, the discussion of randomness in general may be limited to random integers. One might argue that computers are capable of processing non-integer values, but even floating point numbers regardless of precision are stored in memory as binary integers. Furthermore, a stream of random integers on a given range can easily be converted to a stream of floating point numbers of any desired precision over any desired range. The chief benefit of this simplification is that it will allow discrete probabilities to be considered over discrete and finite sets of values as opposed to dealing always with continuous probability distributions integrated on intervals.

Luckily as the demand for random numbers has increased with the pervasion of computer technology and probabilistic mathematics in many fields of human venture, so has our ability to generate “randomness”. Although, much of this generated randomness is more properly called pseudo-randomness since it is in actuality the result of a deterministic mathematical process, which only has the appearance of randomness in the natural sense. Why settle for second best, using pseudorandom numbers when random ones are available? Consider using an equiprobable six-sided die to generate a stream of random digits modulo six, or perhaps several such die in a box to increase the speed of the process. The output of this process could be considered truly random, since the outcome could not possibly be determined in advance. The main problem with this system is that most applications of random numbers require a much higher bit rate (number of random bits generated per unit time) of random numbers than could be produced through this method. If a computer expects a random stream as an input to some process, then random numbers must be generated at a rate comparable to the speed of the computer. The most obvious solution is to have the computer generate the random numbers. Then as computers become faster and demand higher bit rates, higher bit rates can be provided by the same increase in computational speed that required them. The downside to this solution is that computers are by nature not random but deterministic and are designed to provide predictable output for any given state and input values. Therefore, these computer-generated numbers will never be truly random, but rather they will be the apparently random output of some deterministic process operating on a finite set of integers or integer equivalents.

It should be noted that there is some variation in the field with respect to certain acronyms. The reader should be careful when reading any literature on the subject to verify the definition of an acronym before assuming equivalence.

2Generators

It seems only appropriate to begin the study of pseudo-random number generators (PRNGs) with a definition of a generator. L’Ecuyer gives such a definition in [1]. A generator is a structure made up of a finite set of states, S; an initial state, so, also known as the seed; a transition function, T, from S to S; a finite set of outputs, U; and an output function, G, from S to U.

As an example, consider one of the most fundamental generators, known as a linear congruential generator (LCG). A LCG uses a recursion relation with four user-defined values to generate a stream of integers. LCG(M,A,b,so) defines a generator with state set S ⊂ℤ mod M, T(sn) = sn+1 = (A*sn + b) mod M, U = S, and G(sn) = sn, where sn is defined as the nth state. Notice that the number of states in S, according to this definition, must be less than or equal to M. This, combined with the fact that T is a function, allows us to set an upper limit on the period, p, where p is defined to be the minimum value of k such that sn = sn+k. In fact it has been shown that the maximum period is M-1, and necessary conditions for a LCG to achieve this period are that M is prime, A is a primitive root[1] mod M, and so≠ 0.

Another quantity, the transient τ, is the least value of n for which sn = sn+k. Although it may be tempting to think that the transient is always zero, that would be to assume that T is injective, which is not necessarily true. I accidentally proved the existence of nonzero transients when I tried to write a program to find the period of several LCGs, incorrectly, by searching for the value n such that sn = so. For any generator with nonzero transient however, so is never repeated even after generating states beyond sM.

LCGs are by far the most widely used of all PRNGs primarily because they are simple in design and lack the significant memory and processor requirements of more complex generators. The choice of the four input parameters is the difficult part. It usually involves a good deal of study as these four parameters will determine the extent to which the generated numbers will appear random as well as determining the period of the generator. Some examples of well-known LCGs among students of mathematics would be the generator in the Maple function rand(), which is given by LCG(101211,427419669081,0,1) and the Mathematica PRNG, which is given by LCG(α48α8+1,α=231,0,1). On the other hand, LCGs are not suitable for all applications due to their poor performance on several empirical tests, their relatively short periods, and the fact that the state is equal to the output. LCGs do serve as the bases for several other more complex PRNGs with better characteristics.

The extended linear congruential generator (ELCG), also known as a recursive generator (RG) or multiple recursive generator (MRG), is similar to an LCG except that its output function, G, includes more than one recursive term, and consequently its state consists of a vector of previous outputs. ELCG(M,a1,a2,…,ak,b,so) defines a generator with state set S ⊂ {(un1,…,unk) such that all um∈U}, k-dimensional vectors containing the k sequential outputs preceding the nth output of the generator; T(sn) gives the new state vector by removing the rightmost element, shifting the elements right one space, and making G(sn) the new leftmost element; U ⊂ℤ mod M; and G(sn = (un1,…,unk)) = (a1*un1+…+ ak*unk+b) mod M. The most significant difference between the ELCG and the standard LCG is that for the ELCG, S ≠ U. This eliminates the basis for the theoretical limit on the period that exists for an LCG. As explained in [2], the period length is now limited by p ≤ Mk1, and necessary conditions for the maximum period are that M is prime and the characteristic polynomial, P(z) = zk – a1zk1 - … - ak, is a primitive polynomial[2] modulo M. The cost of the improved performance of the ELCG over the standard LCG is the more complicated implementation. Storing the larger state variable should not be a problem, but one multiplication operation and likely one modulus operation will be needed for each nonzero am, considering the large size of each amun-m. Although, methods do exist for dealing easily with larger numbers, such that only one modulus operation per output is needed, if the an parameters are chosen well.

A combined linear congruential generator (cLCG) is another variation of the standard LCG technique. In this case, more than one LCG is used and the individual outputs are combined in the process of computing the final output, G(sn). In this case, the period is rather obviously limited by the least common multiple of the periods of the individual LCGs. Some methods of combining the individual LCG outputs are as follows.

Consider the individual outputs of a certain number, J, of separately defined LCGs. The individual outputs of the J generators are multiplied by independent constant integer coefficients, δj, usually chosen to be relatively prime to Mj, the modulus of the jth generator, then divided by Mj. All J products of this process are then summed, and their remainder after division by one is a pseudo-random number on the interval [0,1). Alternatively, the outputs of the individual LCGs may be combined using an exclusive or binary operation. Another method similar to the first takes the sum of all terms δjuj,n (where uj,n is the nth output of the jth generator) over J generators modulo Mc, divided by Mc (where Mc is a modulus specific to the combination and not necessarily equal to any of the Mj values), to generate a uniform variate on [0,1). Since it requires substantially fewer division operations, this third method is preferable to the first computationally.

Wichman and Hill designed a cLCG using three LCGs: LCG1(30269,171,0,1), LCG2(30307,172,0,1) and LCG3(30323,170,0,1) and combined the generators using the first process described above, see [8]. Their cLCG was shown in 1986 by Zeisl in the Journal of Applied Statistics to be equivalent to the generator:

LCG(27817185604309,16555425264690,0,1)

Theoretically, all such combined generators are equivalent to single LCGs with large moduli. This is why they are so popular, because you get the effective output of the larger generator without quite so much of the computational difficulty.

Both of these two altered LCG methods may be combined to form a third variant, a multiple recursive generator (MRG). These are sometimes referred to as combined multiple recursive generators by those who refer to an ELCG or RG as a multiple recursive generator. These are also in wide use in the scientific community although the limitations of standard computer architecture make implementation of strong MRGs difficult except for specially chosen parameters.

A fundamentally different method, but one which may still be seen as a multiplicative, recursive method, is a generator based on a feedback shift-register (FSR). One of the most fundamental FSRs is a linear feedback shift-register (LFSR), see [3], [4], and [5]. These generators produce streams of random bits based on an n-bit state value. The output bit is the rightmost or nth bit, and the transition function determines the new state by shifting each bit right one position and recursively generating a new first, leftmost, bit. The recursion generates the first bit by adding the values in selected bit positions—known as taps—in the previous state modulo 2. The process is often considered in terms of matrices, with the state represented by an n-dimensional binary vector acted on by the transition function in the form of matrix multiplication by an n-by-n transition matrix. The transition matrix contains ones in the positions corresponding to taps in the first column, zeros otherwise, and the other columns form a diagonal matrix of 1's, excluding the bottom row, which serve to shift the bits right. The bottom row will be all zeros except possibly the first column.

The period of an FSR is determined by the number of states, which is 2n, and the fact that the zero vector must be excluded. Regardless of the tap sequence a zero state vector would always yield a zero state. Thus, the maximum period is 2n – 1. To achieve the maximum period it is necessary that:

A) the characteristic polynomial be monic irreducible[3],

and

b) m = 2n – 1 is the smallest for the smallest m such that the characteristic polynomial divides xm + 1. The characteristic polynomial is produced by xn + cnxn-1 + … + c1, with the first column of the transition matrix elements numbered cn, … , c1 from top to bottom.

For example, a FSR with a 4-bit state will have a full period for c1 = c4 = 1 and c2 = c3 = 0, since x4 + x3 + 1 is monic irreducible, and the smallest m for which the characteristic polynomial divides xm + 1 is m =15, which is 24 – 1. Other more complicated FSRs may achieve more random output by varying the transition matrix between outputs according to some set algorithm.

3 Tests

Given so many generators, some method must be found to determine which generators are suitable, and which are not, for a given application. There has yet to be developed a complete theoretical understanding of all methods involved in PRNGs, especially the most complex mechanisms in the field, combined hybrid generators, which combine several different generator methods, and even the conceptually simple Bays-Durham shuffle, a method used to enhance the performance of some simpler pseudo-random number sources. Thus, it is essential that methods of empirical testing be developed to evaluate the performance of PRNGs, and, more specifically, to determine which generators are suitable for which applications.

The most fundamental requirement for a random number generator is that its output, for sample sets much smaller than the full period, be approximately uniformly distributed over the set of possible outputs. More generally, it is important that n-dimensional vectors, composed of n sequential outputs, be approximately uniformly distributed over the n-dimensional output space. Depending on the application, uniformity in some particular dimension or dimensions may be considerably more important than in others. The standard method for evaluating uniformity in n-dimensional space is to generate a certain number of n-dimensional vectors, N, and divide the space into K "bins," with each bin having the same probability that an n-dimensional vector will fall into it. Then examine the distribution of N vectors into the K bins using a Chi-squared test with K-1 degrees of freedom to see if the distribution is adequately similar to a distribution that would be expected from a truly random set of vectors with approximately N/K vectors in each bin and an error near ±(N/K)(1/2).

A Monte Carlo integration test is another test that similarly compares the spatial distribution of pseudo-random vectors to that expected of a truly random sequence. A definite integral whose value is well known is used together with a larger area, which contains the entire integrated area. The ratio of the integral to the larger area is compared to the number of pseudo-random vectors within the area of integration divided by the total number of vectors in the larger area. For example, consider the unit circle in the first quadrant. The area under the curve is π/4. This means that a PRNG with a uniform distribution in two dimensions should have N*π/4 of its two dimensional vectors (x,y) such that x2 + y2 < 1.

One property of many PRNGs, including most LCG based varieties as well as FSR based generators is that, considered over a complete period, they exhibit a lattice structure in s-dimensions. The spectral test studies the s-dimensional lattice structure of a generator as a measure of the “coarseness” of the distribution. Ideally, the parallel hyperplanes formed in s-dimensional space by the generator should not have a great variation in the distances measured between hyperplanes. The spectral test considers the maximal distance between adjacent parallel hyperplanes in the s-dimensional lattice, taking the maximum of these values over all families of hyperplanes as ds. Since the vectors for the entire space are typically mapped onto the unit cube, the distance, ds, will naturally depend on the total number of possible output values, though if it performs well, ds should not be much greater than the distance between planes created by points which are perfectly evenly distributed in the space, where the distance between several hyperplanes will be equal.

The tests mentioned so far have focused primarily on the geometrical distribution of the generator and are mainly useful for determining the viability of a generator for use in Monte Carlo simulations. More rigorous tests are called for, however, when the PRNG output is expected to imitate purely random numbers in other ways. The Diehard battery of tests developed by George Marsaglia is intended to test the randomness of PRNGs to this higher standard for use in computational number theory.

The GCD (greatest common divisor) test, thoroughly described in [6], from the Diehard battery, studies the distribution of two variables, GCD(un, un+1) and k, derived from application of the Euclidian algorithm to consecutive output values, un and un+1, from a PRNG. The GCD(un, un+1), as well as k, the number of iterations of the Euclidian algorithm required to find the GCD for two independent random numbers, have distributions that are well known either theoretically or through extensive experiment. Marsaglia states that given two random integers on [1,n], k is approximately normally distributed with a mean of 0.842766 ln(n) + 0.06535, and a variance of 0.5151 ln(n) + 0.1666. The distribution of GCD(un, un+1) was found through extensive experiment for 32-bit integers. Part of the experimental results given in [6] for selected generators are shown below, with the expected counts in the first row of each table. Each generator was used to generate 7x106 random numbers to be tested.

K / <=3 / 4 / 5 / 6 / 7 / 8
Expected Counts / 5.5 / 29.5 / 144.6 / 590.7 / 2065 / 6277
KISS / 7 / 30 / 142 / 583 / 2148 / 6294
Xn=69069*Xn-1+12345 mod 232 / 3 / 20 / 101 / 491 / 1629 / 4965
Xn=69071*Xn-1mod(232+15) / 154 / 21 / 123 / 701 / 1896 / 6121
GCD / 1 / 2 / 3 / 4 / 5 / 6
Expected Counts / 6079271 / 1519817 / 675474 / 379954 / 243171 / 168869
KISS / 6078818 / 1521176 / 675496 / 379749 / 242677 / 168462
Xn=69067*Xn-1 mod(235+951) / 6077628 / 1520790 / 675940 / 379369 / 243256 / 168537
Xn=69069*Xn-1+12345 mod 232 / 8106215 / 0 / 900376 / 0 / 324000 / 0

The performance of a given generator is evaluated by comparing the number of counts in each column to the expected number of counts using a chi-squared test. The p-value gives the probability that the numbers in the sample were generated from a random sample. All LCGs fail the Diehard tests. This does not mean that LCGs are useless, since the Diehard tests are more demanding than many applications require. Passing the GCD test, the one element of Diehard mentioned here, depends heavily on the last digit of the output value, even though the last digit has relatively little effect on the uniformity of the distribution.

4Applications

The purpose of these efforts in the development and testing of PRNGs is to have a selection of easily implementable and verifiably robust methods to produce numbers that satisfy the various requirements for “randomness” of several man-made systems that require them to operate. Since these applications are the justification for much of the study which has and continues to take place in the field it is only proper to give at least a brief synopsis of the most popular applications. Those being Monte Carlo simulations, cryptography, and computational number theory.