Test Suite
The SPRNG test suite is available in the TESTS directory under sprng. The tests can be divided into two categories, (i) Physical model tests and (ii) Statistical tests. The former are described in the section on Physical model tests. The latter are described in this section.
We have implemented parallel versions of certain popular tests described by Knuth. These test can also be used for sequential random number generators. We have also provided a few other parallel tests. Before describing the test suite, we shall provide an overview of testing random number generators.
Testing Random Number Generators
Tests of random number generators typically compute some statistic from a portion of a random stream. This statistic is compared against the expected value from a truly random sample from a uniform distribution. If the results from the random number generator are consistent with those from a random sample, then the test is said to have been "passed" by the random number generator. Passing a test does not imply that the generator is producing a truly random sequence. It just means that that particular test could not differentiate between a truly random stream and the stream generated by the generator under consideration. However, if several different tests are passed, then our confidence in the random number generator increases. Since we need to compare the statistics from the generate stream with that from a truly random stream, we need to use some general statistical test procedures that can compare different distributions. We mention below the Chisquare test and the Kolmogorov-Smirnov (KS) test, which we use to determine the likelihood of our random streams coming from a uniform distribution.
Chisquare test
We can use this test for discrete distributions. Let us assume that we have k possible values or "categories" for our variable, and that the probability of a sample being of category i is pi. We make n independent observations and count the number of times each category occurred.For example, we can generate random numbers in (0.0,1.0) and divide them into four categories. If the number is less than 0.25, then it goes to category 0; if it is between 0.25 and 0.5, then it goes to category 1; if it is between 0.5 and 0.75, then it goes to category 2; if it is greater than 0.75 then it goes to category 3. Note that although the sampled distribution was continuous, we have transformed it to a discrete distribution by associating the numbers with their categories. The probability for each category is equal to 0.25. We count the number of times numbers in each of the four categories occur. For n trials, we expect that each category will have roughly 0.25 x n samples.
In general, if category i occurred Yi times, then Yi - npi is the difference between the observed number of values and the expected number of values, and can be considered an "error" between the observed and expected quantity. The chisquare statistic is a function of these "errors" for each of the categories and n. The distribution of the Chisquare statistic is known when the number of categories is given. We can therefore determine the Chisquare percentile for a given chisquare statistic, which is just the proportion of samples from the true distribution that have a chisquare statistic less than the percentile.
For example, if our expected probabilities were computed based on samples from a uniform distribution and the chisquare percentile is 10%, then this means that 10% of random samples drawn from the uniform distribution will have chisquare statistics less than what we observed. Thus if the chisquare values is very high, say 99%, then it is unlikely that our sample came from the uniform distribution, since only 1% of the samples from the uniform distribution will have such a high value. Less obvious is the fact that even very low values are unacceptable. For example, if the chisquare statistic is 1%, this means that the observed values were too close to the expected values; even a truly random sample from the uniform distribution would have a higher chisquare value 99% of the time.
Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov (KS) test is used to compare continuous distributions. In this tests, the cumulative distribution function is determined for the samples. Then the maximum difference (in absolute value) between this distribution and the exact tested distribution is computed. The KS statistic is the product of this difference and the square root of the number of samples. A high value of this statistic therefore indicates that the two distributions are far apart, and vice versa. The KS percentile can be computed, analogous to the chisquare percentile. We reject samples in either tails, as in the case for the chisquare test.SPRNG Test Suite
The SPRNG test suite implements the tests described by Knuth, and a few others. We have generalized the tests of single random number streams to detect correlations between streams in a parallel random number generator. In our test scheme we interleave several streams to form a new stream, and then test this new stream with conventional tests. We show an example in the figure below where we combine three streams to form a new stream.
We form several such new streams, and test several blocks of random numbers from each stream. Usually the result of the test for each block is a chisquare value. We take the chisquare statistics for all the blocks and use the Kolmogorov-Smirnov test to verify that they are distributed according to the chisquare distribution. If the Kolmogorov-Smirnov percentile is between 1% and 99%, then the test is passed by the random number generator.
The tests in the SPRNG test suite take several arguments. The first few arguments are common to all the tests, and are explained below. The test specific arguments will be explained later. The SPRNG tests are called as follows:
test.lib nstreams ncombine seed param nblocks skip test_arguments
where the name of the executable test.lib is formed by concatenating the name of the test and the random number library from which the random numbers are generated. For example
Example:
equidist.lcg 4 2 0 0 3 1 2 100
mpirun -np 2 equidist.lcg 4 2 0 0 3 1 2 100perform the equidistribution test with the 48 bit Linear Congruential Generator with prime addend in a serial and parallel machine respectively.
The argument ncombine ( = 2 in our example) indicates the number of streams we interleave to form a new stream. We form nstreams (=4) such new streams and test nblocks (=3) blocks of random numbers from each new stream. The argument seed (=0) is the encoded seed to the random number generator, and param (=0) is the parameter to the generator. The argument skip (=1) indicates how many random numbers we skip after testing a block before we start a test on the next block. The rest of the arguments in our example are specific to that test. Note that we can perform tests on individual streams by setting ncombine to 1.
The result of the example given above are as follows:
sprng/TESTS:sif% mpirun -np 2 equidist.lcg 4 2 0 0 3 1 2 100
equidist.lcg 4 2 0 0 3 1 2 100
Result: KS value = 0.601752
KS value prob = 17.50 %
The KS value prob line gives the KS percentile for the entire set of tests. Since it is between 1% and 99%, we consider this example to have passed. It should be noted that the real tests are much larger than this simple example.
The figure below illustrates the use of random numbers in the example mentioned above.
Parallel versions of sequential tests
While most of tests we have implemented are parallel versions of popular sequential tests, some others are inherently parallel. We first describe the former. The users should note that when we state that a particular test is parallel, we are refering to the fact that it can be used to test correlations between streams. We do not mean that it actually runs on multiple processors. All SPRNG tests run can be run either on a single processor or on multiple processors.We next briefly describe each test followed by its test specific arguments. We also give the number of random numbers tested and asymptotic memory requirements (in bytes, assuming an integer is 4 bytes and double precision is 8 bytes). This should helps users estimate the time required for their calculations from smaller sample runs.
The details concerning these tests are presented in Knuth's book, unless we give some other reference.
- 1. Collisions test: n logmd logd
- Suppose we throw n << m balls into m urns, then most of the balls will fall into empty urns. If a ball falls into an urn that is occupied, then a collision is said to have occurred. The collisions test counts the number of collisions.
We generate random integers in [0,2logd-1]. We divide the sequence into n disjoint subsequences of length 2logmd each. We count the number of distinct sub-sequences obtained empirically and subtract this from n to obtain a count of the number of ``collisions''.
Our implementation of this test requires logmd*logd < 32. The number of subsequences, n, should also be less than the number of possible subsequences, 2logmd*logd.
Random numbers tested: n*logmd
Memory: 8*nstreams*nblocks + 4*n + 2^{logmd*logd}- 2. Coupon collector's test: n t d
- We generate random integers in [0,d-1]. We then scan the sequence until we find at least one instance of each of the d integers, and note the length of the segment over which we found this complete set. For example, if d = 3 and the sequence is: 0 2 0 1 ..., then the length of the first segment over which we found a complete set of integers is 4. We continue from the next position in the sequence until we find n such complete sets. The distribution of lengths of the segments is compared against the expected distribution. In our analysis, we lump segments of length > t together.
Random numbers tested: n*d*ln(d)
Memory: 8*nstreams*nblocks + 4*d + 16*(t-d+1)- 3. Equidistribution test: d n
- We generate random integers in [0,d-1] and check whether they come from a uniform distribution, that is, each of the d numbers has equal probability.
Random numbers tested: n
Memory: 8*nstreams*nblocks + 16*d- 4. Gap test: t a b n
- We generate floating point numbers in (0,1) and note the gap in the sequence between successive appearances of numbers in the interval [a,b] in (0,1). For example, if [a,b] = [0.4,0.7] and the sequence is: 0.1, 0.5, 0.9, 0.6, ..., then the length of the first gap (between the numbers 0.5 and 0.6) is 2. We record 'n' such gaps, and lump gap lengths greater than 't' together in a category in our analysis.
Random numbers tested: n/(b-a)
Memory: 8*nstreams*nblocks + 16*t- 5. Maximum-of-t test: n t
- We generate t floating point numbers in [0,1) and note the largest number. We repeat this n times. The distribution of this largest number should be as xt.
Random numbers tested: n*m
Memory: 8*nstreams*nblocks + 16*n- 6. Permutations test: m n
- We generate m floating point numbers in (0,1). We can rank them according to their magnitude; the smallest number is ranked 1, ..., the largest is ranked m. There are m! possible ways in which the ranks can be ordered. For example, if m = 3, then the following orders are possible: (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1). We repeat this test n times and check if each possible permutations was equally probable.
Random numbers tested: n*m
Memory: 8*nstreams*nblocks + 8*m + 16*(m!)- 7. Poker test: n k d
- We generate k integers in [0,d-1] and count the number of distinct integers obtained. For example if k = 3, d = 3 and the sequence is: 0, 1, 1, ..., then the number of distinct integers obtained in the first 3-tuple is 2. We repeat this n times and compare with the expected distribution for random samples from the uniform distribution.
Random numbers tested: n*k
Memory: 8*nstreams*nblocks + 0.4*min(n,k) + 12*k + 4*d- 8. Runs up test: t n
- We count the length of a "run" in which successive random numbers are non-decreasing. For example if the sequence is: 0.1, 0.2, 0.4, 0.3, then the length of the first run is 3. We repeat this n times and compare with the expected distribution of run lengths for random samples from the uniform distribution. Runs of length greater than 't' are lumped together during our analysis. We implement a modified version of this test in which we discard the number that follows a run before starting the new run. In our example, the number 0.3 would be discarded.
Random numbers tested: 1.5*n
Memory: 8*nstreams*nblocks + 16*t- 9. Serial test: d n
- We generate n pairs of integers in [0,d-1]. Each of the d2 pairs should be equally likely to occur.
Random numbers tested: 2*n
Memory:8*nstreams*nblocks + 16*d*dInherently parallel tests
Unlike the preceding tests which modify sequential tests to test for correlations between streams, the tests mentioned below are inherently parallel and test for correlations between streams. The meaning of the arguments for these tests are slightly different from those for the preceding tests. Since these tests are inherently parallel, we need not interleave streams, and thus the second argument ncombine should be set to 1. The first argument nstreams is the total number of streams tested. All these streams are tested simulatnaneously, rather than independently as in the previous case. The rest of the arguments are indentical to the previous case.
- 1. Sum of independent distributions test: n groupsize
- We can use the fact that the sum of independent variables approaches the normal distribution to test for the independence of random number streams. We add random numbers from several streams and form a sum. We generate several such sums and check if their distribution is normal.
We add groupsize random numbers in (0,1) from each stream to form a sum. We verify if the distribution of n such sums is normal using the Kolmogorov-Smirnov test. We illustrate this test in the figure below with nstreams = 3, n = 4, and groupsize = 2.
In the current implementation, the argument nblocks must be 1, implying that only one test is performed. The reason for this is that the result of this test is not exact. The true distribution approaches the normal distribution, but is not identical to it. Thus the result of this test has an error, and we are not sure if subjecting a number of such results to another test would yield meaningful results.
We have numerically calculated the maximum difference between the exact distribution and the normal distribution. Based on this result, we also output a range for the the exact KS statistic, if we had used the true distribution instead of the normal approximation.
Test Wizard
To aid you in running the statistical tests, we have included a Test Wizard in this document. It is Java applet that will translate your choices of test parameters into the appropriate command line. In addition to relieving you from remembering the command line syntax, the wizard will also catch certain simple errors in your choice of parameters. Only some of the tests are included in the wizard.If your browser's Java feature is not enabled, and if you have java, then please move to the sprng/www/testwiz directory if you have installed SPRNG and type
java TestWizardClick here to invoke the Translator:
Testing New Random Number Generators
We can easily test new random number generators. Let us assume that the new generator is in the library libnew.a, in the same directory as the other SPRNG libraries.Exectuables are created with the name test.new for each test test. For example equidist.new is the equidistribution test executable for the new generator.
- This new generator must define functions called init_rng, get_rn_dbl, and free_rng that follow the specifications of the SPRNG functions init_sprng, sprng, and free_sprng respectively (that is, the first function must perform an initialization, the second must produce random double precision numbers, and the third must free up any memory allocated for the random stream).
If users wish to test only a single random number stream, then they can, instead of defining the above functions, just write the stream of random double precision numbers to a file in ASCII (not in binary format). Then they should define the macro READ_FROM_STDIN in the file sprng/TESTS/init_tests.c and compile. The test programs will then read input from standard input. If users redirect stdin from the random number file, then these numbers will be tested. This procedure will not, however, work for the Ising model programs (Wolff and Metropolis.)
- The user just needs to add new to the list specified by LIBLIST in the Makefile in the TESTS directory and compile.
Example: LIBLIST = lcg lfg new