Mail Archives: djgpp/1996/11/27/20:08:26
> ***** Actually, after several tests I ran, I found the best performance
> came from C code with algorithmic optimizations, -O3 used, and NO
> ASSEMBLY OPTIMIZATIONS. At least with my code, GCC was able to figure
> out better ways of shuffling registers than I was. I'll admit, this
> won't work with everyone though - you'll have to either profile with
> different compilations or run some other benchmarks on them to know for
> sure.
Generally speaking, DJGPP with -O3 is pretty good at optimization,
except that I found it's not so good at mixing FPU and integer code
(although the Pentium optimizations patch would probably fix this).
> >types etc. For example: if I don't really need 32 bits worth of int, will
> >things be faster if I declare my variables as short ints?
> ***** No! Benchmarks show that a 486 is slowest with 8-bit data, about
> twice as fast with 16-bit data, and even faster with 32-bit data! On the
> Pentium, the difference between 16-bit and 32-bit is even greater. And a
> Pentium Pro actually runs 16-bit SLOWER than a Pentium, with the 32-bit
> code much faster than a Pentium. AVOID 16-bit DATA!
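One way to check the 16-bit versus 32-bit claim quoted above on your own machine is a small timing harness. This is only a rough sketch: the names (fill_arrays, sum16, sum32, time_passes, compare_widths) and the sizes N and PASSES are made up for illustration, clock() can be coarse, and an optimizing compiler may rearrange or shrink the loops, so treat the numbers as a hint rather than a verdict.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define N      4096   /* elements per array (fits in a short counter) */
#define PASSES 2000   /* how many times each sum is repeated */

static short data16[N];   /* 16-bit data */
static int   data32[N];   /* 32-bit data on DJGPP/GCC x86 */

/* Fill both arrays with the same small values so the sums must match. */
void fill_arrays(void)
{
    int i;
    for (i = 0; i < N; i++) {
        data16[i] = (short)(i & 7);
        data32[i] = i & 7;
    }
}

/* Sum the 16-bit array using a 16-bit loop counter. */
long sum16(void)
{
    long total = 0;
    short i;
    for (i = 0; i < N; i++)
        total += data16[i];
    return total;
}

/* Sum the 32-bit array using a 32-bit loop counter. */
long sum32(void)
{
    long total = 0;
    int i;
    for (i = 0; i < N; i++)
        total += data32[i];
    return total;
}

/* Call fn PASSES times; store the accumulated result and return seconds. */
double time_passes(long (*fn)(void), long *result)
{
    clock_t start = clock();
    long r = 0;
    int p;
    for (p = 0; p < PASSES; p++)
        r += fn();
    *result = r;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Print both timings; the two results must agree or the test is broken. */
void compare_widths(void)
{
    long r16, r32;
    double t16, t32;

    fill_arrays();
    t16 = time_passes(sum16, &r16);
    t32 = time_passes(sum32, &r32);
    assert(r16 == r32);
    printf("16-bit: %.3f s   32-bit: %.3f s\n", t16, t32);
}
```

Call compare_widths() from main, build it once per set of compiler switches you care about (with and without -O3, for instance), and compare the printed times on the CPU you actually target.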
> ***** Warning, though, if you get a 32-bit DWord aligned wrong, you COULD
> actually end up with code that's less than half the speed it should be.
> I *believe* (I'm not sure) the way to make sure it's aligned properly is
> to use _PACKED_.
> ^^^^^^^^ Somebody check me here. I could be wrong...
I thought DJGPP automatically aligned structures on 32-bit boundaries,
and you used the __attribute__ ((packed)) when you wanted otherwise...
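If you want to see what the compiler actually does, a quick check with sizeof and offsetof settles it. This is a sketch, not gospel: the struct names (natural, squeezed) and the print_layout function are invented for the example, and the exact padding depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Default layout: the compiler may pad after 'tag' to align 'value'. */
struct natural {
    char tag;
    int  value;
};

/* GCC extension: packed removes the padding, so 'value' directly
   follows 'tag' and is no longer naturally aligned. */
struct squeezed {
    char tag;
    int  value;
} __attribute__ ((packed));

/* Print the size of each struct and the offset of 'value' in each. */
void print_layout(void)
{
    /* With packing, the struct is exactly the sum of its members. */
    assert(sizeof(struct squeezed) == sizeof(char) + sizeof(int));

    printf("natural:  size %lu, value at offset %lu\n",
           (unsigned long)sizeof(struct natural),
           (unsigned long)offsetof(struct natural, value));
    printf("squeezed: size %lu, value at offset %lu\n",
           (unsigned long)sizeof(struct squeezed),
           (unsigned long)offsetof(struct squeezed, value));
}
```

Call print_layout() from main. With GCC on 32-bit x86 the unpacked struct typically puts value at offset 4, while the packed one puts it at offset 1, which is the misaligned case the speed warning above is about.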
Leathal.