From: Mario Deschenes
Newsgroups: comp.os.msdos.djgpp
Subject: cmpl takes 14 clk cycles on a Pentium ???
Date: Thu, 12 Feb 1998 16:46:29 -0500
Organization: Bell Network Solutions
Message-ID: <34E36DB5.71744432@btg.bombardier.com>

Hi everyone,

I'm using RDTSC to profile a routine and I'm seeing something strange.
My routine looks like this:

void rts( void )
{
  static int i=0;

  __asm__("
    cli                     /* no interrupts while timing */
    movl _TSC,%eax
    cdq
    cdq
    rdtsc                   /* start timestamp in EDX:EAX */
    movl %eax,_TSC
    movl %edx,_TSC+4

    /* Some code */

    je RTS_end    /* <-- This test ALWAYS jumps to RTS_end (always true) */

    /* More code NEVER reached */

    cmpl %eax,_ROM
    .align 4,0x90
  RTS_end:
    rdtsc                   /* end timestamp */
    subl _TSC,%eax          /* 64-bit elapsed = end - start */
    sbbl _TSC+4,%edx
    addl %eax,_TOTAL        /* accumulate into 64-bit TOTAL */
    adcl %edx,_TOTAL+4
    sti");

  i++;
  printf("%10ld", TOTAL/i);
}

I boot the computer straight into DOS (not a DOS shell and not
"Restart in MS-DOS mode"), average the time this routine takes over
all calls, and print the result.

The routine as shown takes 84 clock cycles. But if I remove the line
"cmpl %eax,_ROM", it takes 70 clock cycles. What an improvement!!!

The strange thing is that this part of the code is NEVER executed. If I
replace the "cmpl" with 6 nops (to take up the same number of bytes), I
still get 84 clock cycles.

Looking at the assembly listing, my code always starts at 0x810 and the
"je RTS_end" is at 0x868. But RTS_end is at 0x8C0 without the "cmpl"
and at 0x8D0 with the "cmpl" opcode.

This is not an easy one and I would really appreciate it if someone has
a hint.

Thanks

Mario Deschenes
Software Engineer
Bombardier transport group
mdeschen AT btg DOT bombardier DOT com
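
A note on the addresses quoted above: the arithmetic can be checked with a small C sketch. It is only an aid for reading the listing, not a diagnosis. The helper names block_offset and align_up are hypothetical. It assumes that ".align 4" here means alignment to 2^4 = 16 bytes (which matches both RTS_end addresses being multiples of 16) and that the Pentium's code cache uses 32-byte lines.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of addr inside the block_size-byte block that contains it.
   block_size must be a power of two (e.g. 16 or 32). */
static uint32_t block_offset(uint32_t addr, uint32_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1u)) == 0);
    return addr & (block_size - 1u);
}

/* Round addr up to the next multiple of 2^log2_align.
   Assumption: this mirrors what ".align 4" does in this listing,
   i.e. the argument is a power-of-two exponent, not a byte count. */
static uint32_t align_up(uint32_t addr, unsigned log2_align)
{
    assert(log2_align < 32);
    uint32_t mask = (1u << log2_align) - 1u;
    return (addr + mask) & ~mask;
}
```

With these helpers, block_offset(0x8C0, 32) is 0 and block_offset(0x8D0, 32) is 16: without the "cmpl", RTS_end starts a 32-byte block, and with it, RTS_end lands in the middle of one. Likewise align_up(0x8C1, 4) gives 0x8D0, which shows how 6 extra bytes can move the label by a full 16. Whether this placement accounts for the 14 cycles is not established here.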