delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1997/06/10/03:36:09

Message-ID: <339C7002.7B53@pobox.oleane.com>
Date: Mon, 09 Jun 1997 23:05:06 +0200
From: Francois Charton <deef AT pobox DOT oleane DOT com>
Organization: CCMSA
MIME-Version: 1.0
To: billc AT blackmagic DOT tait DOT co DOT nz
CC: djgpp AT delorie DOT com
Subject: Re: One method for small random nuimbers (getting offtopic)
References: <2 DOT 2 DOT 32 DOT 19970604162434 DOT 0069d50c AT gate> <3395B097 DOT 5077 AT cornell DOT edu>
<339830A6 DOT 672 AT blackmagic DOT tait DOT co DOT nz> <3397CBCF DOT 6536 AT pobox DOT oleane DOT com> <339C297F DOT 23A4 AT blackmagic DOT tait DOT co DOT nz>

Bill Currie wrote:
> 
> >
> > I read about such a method being a bad thing in several books (Numerical
> > Recipes warns a lot against this, and I remember reading something about
> > it in Knuth, I'll dig it out tonight).
> 
> If you can send me a quote, that would be apreciated.
> 

Numerical recipes in C, 2ndEd, p. 277: 
"Correlation in k-space is not the only weakness of linear congruential 
generators. Such generators often have their low-order (least 
significant) bits much less random than their high-order bits. If you 
want to generate a random integer beween 1 and 10, you should always do 
it using the high-order bits, as in:

j=1+(int)(10.0*rand()/(RAND_MAX+1.0));

and never by anything resembling

j=1+(rand()%10);

(which uses lower-order bits). Similarly, you should never try to take 
apart a "rand()" number into several supposedly random pieces. Instead, 
use separate calls for every piece".


Knuth (Seminumerical Algorithms, p.12) explains why this is so in linear 
congruential generators: 

let w be the modulus of our RNG (generally 2*32), if d divides w, and if 
the RNG transforms X_n in X_n+1, and Y_n in Y_n+1,

if
X_n % w == Y_n % w
then 
X_n+1 % w == Y_n+1 % w

This means that (when the modulus is a power of two) the 3 lowest order 
bits have a period of 8 at most, the 6 low order bits have a period of 64 
at most... Knowing that many C library rand() functions are LC generators 
(the ANSI spec gives an example of rand() routine in this way), using a 
modulo op (%) in a portable program randomiser means taking a big risk...

Of course, this does not mean that all RNG will exhibit such behaviour, 
but rather that even good RNG (linear congruential generators can be 
good) may become very unrandom when unproperly used. 

This might not be specific to linear congruential generators; they are 
known to have many failures, but maybe because they have been more 
researched than others. To quote Knuth (on the Mitchell Moore generator):

"the only reason it is difficult to recommend sequence (7) wholeheartedly 
is that there is still very little theory to prove that it does or does 
not have desirable randomness properties..."


> 
> This could vary well be true.  My test for improvement was just looking
> at the distribution of results (I was using the RNG to simulate N
> dice).  My original code just took the modulo of the random number and
> the results were horible (sure I got a bell type curve, but every second
> slot was about half (or maybe even 0) of what it should be (and the
> period was noticably short).  Using the code I posted, the curve I got
> was much closer to what I calculated it should be and I didn't get a
> glaringly obviously short period.  Sure the period may have been even
> shorter, but atleast it was out of phase with what I was doing.
> 

A theory on why this might happen with LC generators as well: suppose you 
have to model 16-side dice (strange animals indeed!), if you use the 
"%16" method on the generator, you will get unrandom numbers when you 
repeat calling rand(). But if you split one 32 bit number into 8 
"throws", you have one very bad number (the four lowest order bits), one 
better one (the four next ones), etc... On average, this is a better 
generating scheme than the previous awful one... (which does not make it 
a good one, by far...)

A better way to do just the same in one rand() call would be to get a 
random number between 0 and 35 (for two dice), and convert it: first 
die, rand/6; second die, rand%6... And then add the dice rolls...

> 
> If you can
> point me in the right direction for information on testing RNGs, that
> would be very kind of you.
> 

A good first reading on RNG with good code examples is Numerical Recipes 
in C, 2nd ed.

The definitive book on the subject is Knuth (the art of computer 
programming). Random numbers are in the beginning of the second book 
(Semi-numerical algorithms), but reading the first half of the first book 
is recommended if you want to understand most of the arguments and 
demonstrations he uses. This one is not for the faint of heart.

I think the right test for your die rolling application is the spectral 
tests he describes in section 3.3.4: how random are successive sequences 
of numbers?


Francois

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019