Message-ID: <000a01be0b69$933be440$c5223182@marst96.m.resnet.pitt.edu>
From: "mark reed"
To:
Subject: fixed point vs floating and djgpp compiler
Date: Sun, 8 Nov 1998 17:46:07 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 4.72.3110.1
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
Reply-To: djgpp AT delorie DOT com

Has anyone done any timing of floating point vs. fixed point? I'm multiplying two 4x4 matrices using floating point, and I have it down to 100,000,000 multiplies in 72.6374 seconds. That's 1,376,701.25 matrix multiplies per second, and dividing 300,000,000 by that (300 MHz processor) gives 218 cycles for 341 lines of assembly, including 64 fmuls and 48 faddps.

I wrote some fixed-point code in inline assembly, but it doesn't come out right when djgpp compiles it, and I don't want to spend the time fixing it all up, since my floating point is looking good. The fixed-point version I tried took almost four times as long. Has anyone else done any fixed-point timing and got some numbers to compare?

About the djgpp compiler: when it compiled the fixed-point code, it kept thinking that fmul (%ecx), fld (%edx), and fstp (%eax) were somehow changing the values of ecx, edx, and eax, so it kept reloading each of them every time they were used: lots of unnecessary movl 12(%ebp), %ecx instructions. Anyone know why this would be? Also, is there a Pentium optimization flag or anything?

I found another thing it does not do: fstp requires that its target not be touched in the cycle before fstp is issued, and the compiler put a faddp with %eax right before an fstp with %eax. Putting some addls in between sped my code up by about 0.6 cycles per instance of faddp/fstp, which came to 6 cycles per matrix multiplication, which isn't much. I don't know whether it would do the same on other processors, or whether it's too much of a bother to add to the compiler.
Mark - marst96 AT pitt DOT edu