Mail Archives: djgpp/2000/03/08/14:26:00
Eli Zaretskii) wrote:
>Did you look at the generated assembly? That could provide important
>clues.
I slightly changed my source, to make the difference even more
obvious.
This is the context diff of the gcc -O2 -S output of the two versions.
*** const.s Wed Mar 8 19:03:10 2000
--- nonconst.s Wed Mar 8 19:04:52 2000
***************
*** 99,108 ****
_zseed:
.long -2023406815
.long 305419896
- .text
.p2align 2
_mul.12:
.long 999996864
.p2align 2
.globl _mwc32
_mwc32:
--- 99,108 ----
_zseed:
.long -2023406815
.long 305419896
.p2align 2
_mul.12:
.long 999996864
+ .text
.p2align 2
.globl _mwc32
_mwc32:
The only difference is that the variable mul (_mul.12) is placed in
the text segment in the const version and in the data segment
otherwise (as you would expect). Yet the const version is much
slower (by a factor of ten!).
To rule out a (hardware) problem with my system: could somebody
please try to reproduce my results by compiling the following
program with
gcc -O2 mwc32tst.c
and running a.exe, then uncommenting the const near the end
of the listing, recompiling, and rerunning? Please post or mail your
results, ideally including your processor and your versions of gcc
and binutils (I have an AMD K6-2 and tried various versions of gcc
and binutils, including gcc 2.95.2 and binutils 2.9.5).
Regards,
Dieter
/* mwc32tst.c */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define CALLS (1UL << 27) /* Tune this as appropriate */
/* Call function pointed to by tr n times */
unsigned long speed_loop(unsigned long (*tr)(void), unsigned long n)
{
  unsigned long s;
  s = 0;
  do
    s+=tr();
  while (--n != 0);
  return s;
}
/* avoid inlining of these functions */
unsigned long dum_rand(void);
unsigned long mwc32(void);
/* test the speed of function mwc32, take function call and loop
overhead into account */
int main(void)
{
  clock_t anf, anfdum;
  unsigned long s, n = CALLS;
  anfdum = clock();
  speed_loop(dum_rand, n);
  anfdum = clock() - anfdum;
  anf = clock();
  s = speed_loop(mwc32, n);
  anf = clock() - anf;
  anf -= anfdum;
  printf("s=%lu, used %.5f usec/call (w.o call overhead)\n",
         s, 1e6/n*(double)anf/CLOCKS_PER_SEC);
  return 0;
}
unsigned long dum_rand(void)
{
  return 0UL;
}
typedef unsigned long long ul64;
static ul64 zseed = ((ul64)0x12345678UL<<32) | 0x87654321UL;
/* Multiply with carry RNG */
unsigned long mwc32(void)
{
  unsigned long l1, l2;
  ul64 res;
  /* Uncommenting the const can make this function much slower,
     depending on compiler switches and the phase of the moon :-) */
  static /* const */ unsigned long mul=999996864UL;
  l1 = (unsigned long)(zseed & 0xffffffffUL);
  l2 = zseed>>32;
  res = l2+l1*(ul64)mul;
  zseed = res;
  return (unsigned long)(res & 0xffffffffUL);
}
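
For readers who only want to follow the arithmetic of the multiply-with-carry step, here is a minimal sketch with explicit 32/64-bit types. The name mwc32_step and the idea of passing the state and the multiplier as parameters are illustrative assumptions, not part of the posted program. Because it no longer has a static (const) multiplier, it is not suitable for reproducing the timing difference described above.

```c
#include <stdint.h>

/* Hypothetical fixed-width variant of the mwc32() step above.
   The low 32 bits of *state hold the current value, the high
   32 bits hold the carry.  The name and signature are assumptions
   made for illustration only. */
uint32_t mwc32_step(uint64_t *state, uint32_t mul)
{
    uint32_t lo    = (uint32_t)(*state & 0xffffffffu); /* current value */
    uint32_t carry = (uint32_t)(*state >> 32);         /* previous carry */

    /* lo * mul + carry is at most (2^32-1)^2 + (2^32-1), which is
       below 2^64, so the 64-bit sum cannot overflow. */
    uint64_t res = (uint64_t)lo * mul + carry;

    *state = res;                          /* new carry:value pair */
    return (uint32_t)(res & 0xffffffffu);  /* new value */
}
```

Calling it with the state initialised to ((uint64_t)0x12345678 << 32) | 0x87654321 and mul set to 999996864 follows the same recurrence as the listing.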