From: George Marsaglia Newsgroups: comp.lang.c,comp.os.msdos.djgpp Subject: An inline assembler RNG for C Date: Wed, 17 Feb 1999 13:35:29 -0500 Organization: Florida State University Lines: 108 Message-ID: <36CB0BF1.23FF9E2@stat.fsu.edu> NNTP-Posting-Host: dial209.acns.fsu.edu Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.08 [en] (Win95; I) To: djgpp AT delorie DOT com DJ-Gateway: from newsgroup comp.os.msdos.djgpp Reply-To: djgpp AT delorie DOT com After a number of requests for random number generators in C, I thought it worthwhile to learn enough C to suggest a few methods. A posting of the methods in sci.stat.math, sci.math and sci.math.num-analysis led to a number of responses and improvements. The various postings may be found in those newsgroups or via deja news (a search on LFIB4 may be the most effective). But one of my most promising methods cannot be implemented directly in C, and the main purpose of this posting is to seek help on how it might be done, via inline assembler. The method is Multiply-with-Carry for 32-bit integers. This is the idea. Suppose you have a current 32-bit integer x a current 32-bit carry c a fixed multiplier, say, a=2083801278 Now form a*x+c in 64 bits. Return the bottom 32 bits as the new x, and keep the top 32 bits as the new carry, c. The sequence of x's produced by repeating this process has many desirable properties. A listing of some of them is deferred to the end of this posting, in order to get directly to my appeal for help in C. The problem is to form a*x+c in 64 bits, then extract the bottom and top halves. This is easily done in assembler. For Intel chips, these instructions do the job: mov eax,x mov edx,2083801278 mul edx ; a*x in edx:eax mov ecx,c ; get carry add eax,ecx ; right half of a*x+c in eax adc edx,0 ; if overflow, increment edx mov c,edx ; store new carry mov x,eax ; store new x ret According to various posts to this group, many C compilers seem to permit inline assembler calls, but details vary. If you are familiar with how inline assembler works for C compilers you use, and can provide a version of the above for those compilers, a posting might be of interest to many potential users. George Marsaglia Advantages of the 32-bit multiply-with-carry RNG: It exploits the built-in arithmetic modulo 2^32 that almost all modern CPU's provide. It is very fast. It uses multiplication, which experience has shown does a better job of mixing bits than do other RNG methods based on +,- or xor. It has period "billions and billions" of times longer than those of ordinary congruential generators. It produces the trailing 32-bits of a congruential generator modulo a very large prime. If integers z(n)=a*z(n-1) mod a*2^32-1 are formed, the sequence z(n) mod 2^32 is the sequence of x's produced by the 32-bit multiply-with-carry method. The trailing bits of the x's do not have the regularity that is present in standard congruential generators for modulus 2^32. Indeed, if only the last bits of successive x's are used to form a large binary file, that file will pass all the tests in the Diehard Battery of tests of randomness. (http://stat.fsu.edu/pub/diehard/cdrom), as do the full set of 32-bit integers from MWC. The theory behind MWC is in the file mwc1.ps of the CDROM. Any choice of multiplier 'a' from this list: 1791398085 1929682203 1683268614 1965537969 1675393560 1967773755 1517746329 1447497129 1655692410 1606218150 2051013963 1075433238 1557985959 1781943330 1893513180 1631296680 2131995753 2083801278 1873196400 1554115554 can serve as multiplier, or any 'a' for which both a*2^32-1 and a*2^31-1 are prime. The method seems ideally suited for modern CPU's. I have predicted that it will be the system generator in future platforms. Because it is so well suited for implementation directly in the CPU, it has been adopted by several manufacturers in the gaming industry. If you play a slot machine in Las Vegas, chances are your luck--- good or bad---will be determined by the x's in a 32-bit MWC sequence. The second manufacturer to use the method was sued by the first, on the grounds that they were the first to provide a hardware implementation. Since a mathematical algorithm cannot be patented, I don't profit from its use. But I, and perhaps many followers of this group, might profit intellectually, should any experts provide inline assembler versions for C compilers we use. gm