From: leathm AT solwarra DOT gbrmpa DOT gov DOT au (Leath Muller)
Message-Id: <199703052331.JAA14058@solwarra.gbrmpa.gov.au>
Subject: Re: Allegro perspective-correct .. (fpu memcopy)
To: nikki AT gameboutique DOT co (nikki)
Date: Thu, 6 Mar 1997 09:31:41 +1000 (EST)
Cc: djgpp AT delorie DOT com
In-Reply-To: <5fji73$8fo@flex.uunet.pipex.com> from "nikki" at Mar 5, 97 10:34:11 am
Content-Type: text

> ah, but then it's >32 bytes and won't fit in a cache. the resulting loss is
> probably not worth it therefore :( if you have a moment give it a try though
> and see if you can come up with any hard and fast values here, my timing
> routines suck pretty bad :(

I timed it last night and found that using all 8 registers sped it up by
about 1 cycle per 2 pixels overall... If you're moving a huge chunk of
memory around, you're going to hit the cache anyway. When you move 8 lots
of 8 bytes (64 bytes), you fill two 32-byte cache lines automatically,
because the instructions read sequentially from the same region of memory.
Did that make sense? :)

> ah there's a problem there. using 80bit values will take longer to load :(
> it's 3 cycles for an 80bit load and 1cycle for a 64bit load.
> how do i change the fpu mode in inline asm like that anyway btw? i haven't
> managed to ever get that to work :(

No, I mean manually put the machine into double (64 bit) precision to
ensure you're running at that precision. I can't remember exact numbers,
but from memory, to put the FPU in, say, single precision the code is
something of the sort:

    short OldFPUCW, FPUCW;   /* must be globals so the asm can name them */

    asm volatile ("fstcw (_OldFPUCW);"      /* fstcw stores to memory, not %ax */
                  "movw (_OldFPUCW), %ax;"  /* note: clobbers %ax */
                  "andw $value, %ax;"
                  "movw %ax, (_FPUCW);"
                  "fldcw (_FPUCW);");

then to restore precision to its previous state:

    asm volatile ("fldcw (_OldFPUCW);");

I think single precision mode can be attained by ANDing with
NOT 1100000000 (binary), i.e. a mask of 0xFCFF, which clears the
precision-control bits (8 and 9) of the control word. Try that value and
see how you go...

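The control-word juggling described above can be sketched roughly as follows. This is only an illustration, assuming a GCC-compatible compiler on an x86 target; the names x87_with_precision, x87_read_cw, x87_write_cw and the X87_PC_* constants are made up for this example and are not part of DJGPP or Allegro. The bit layout (precision control in bits 8 and 9) is the standard x87 one.

```c
#include <assert.h>
#include <stdint.h>

/* Precision-control (PC) field of the x87 control word: bits 8 and 9. */
#define X87_PC_MASK     0x0300u
#define X87_PC_SINGLE   0x0000u  /* 24-bit mantissa */
#define X87_PC_DOUBLE   0x0200u  /* 53-bit mantissa */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit mantissa */

/* Hypothetical helper: return cw with its PC field replaced by pc.
   All other bits (exception masks, rounding mode) are left untouched. */
static inline uint16_t x87_with_precision(uint16_t cw, uint16_t pc)
{
    /* pc must lie inside the PC field; 0x0100 is a reserved encoding. */
    assert((pc & ~X87_PC_MASK) == 0 && pc != 0x0100u);
    return (uint16_t)((cw & ~X87_PC_MASK) | pc);
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Read the current control word (fnstcw stores it to a 16-bit memory operand). */
static inline uint16_t x87_read_cw(void)
{
    uint16_t cw;
    __asm__ volatile ("fnstcw %0" : "=m" (cw));
    return cw;
}

/* Load a new control word from memory. */
static inline void x87_write_cw(uint16_t cw)
{
    __asm__ volatile ("fldcw %0" : : "m" (cw));
}
#endif
```

To use it, save the old word with x87_read_cw(), call x87_write_cw(x87_with_precision(old, X87_PC_DOUBLE)) before the inner loop, and call x87_write_cw(old) afterwards to restore the previous state. Using memory operands in extended asm avoids both the global symbol names and the hidden %ax clobber.
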
> i suspect the 6 cycle loading rather than 2 cycle loading now causes considerable
> slowdown though :(

I wrote a small routine last night to use normal fldl's and fstpl's and
didn't have a problem. The screen (appeared to) blit perfectly. One note
though, it was blitting to a true colour (32 bit) display...

Leathal.