Mail Archives: djgpp/1999/09/25/15:26:22
Oliver Roese (oroese AT edina DOT xnc DOT com) wrote:
: Yesterday i found out by accident, that gcc knows how to divide by constants.
: This is a optimization technique, mentioned by several experts (http://www.agner.org/ or
: http://www.azillionmonkeys.com/qed/index .html). If you know about this, i am sure you are
: interested to hear, that gcc 2.95 can do that for you.
: Example:
: #include <stdio.h>
: int Divc (int u)
: { return u / 3;} // u%3 would work
: too.
:
: int main(void)
: {
: int u = (int)&u;
: printf ("%d",Div3(u));
: return 0;
: }
:
: compile: gcc -S <filename>
: I have seen that gcc sometimes inlines short functions, but it doesent do that here, even with -O6.
: Anyway here is the result:
: ...
: .globl _Divc
: _Divc:
: pushl %ebp
: movl %esp,%ebp
: movl $1431655766,%edx
: movl 8(%ebp),%ecx
: movl %ecx,%eax
: imull %edx
: sarl $31,%ecx
: movl %ebp,%esp
: popl %ebp
: subl %ecx,%edx
: movl %edx,%eax
: ret
: ...
: I dont tried to decipher it.
: If you declare u as unsigned, you get this:
:
: LC2:
: .long 0xaaaaaaab
: .p2align 2
: .globl _Divc
Things are becoming clear now. Note that 0xaaaaaaab =~ 2*1431655766.
So it seems the signed version is using the same technique as the
unsigned one, but for 31 bits instead of 32. Then it adjusts the
result with 1 if the operator is negative as the C operator /
truncates towards 0 (while the optimisation technique truncates
towards -infinity).
: _Divc:
: pushl %ebp
: movl %esp,%ebp
: movl 8(%ebp),%eax
: mull LC2
: movl %ebp,%esp
: popl %ebp
: shrl $1,%edx
: movl %edx,%eax
: ret
:
: We learn from that:
: floor(x/3) = floor((0xaaaaaaab*x)/2^33) if 0<=x<2^32
: Trickery is simpler here, but suffers from "mull", which is not as optimized as "imull".
: Which case is faster?
I wonder if the same results would be obatined if imull was used
instead of mull. If so and if imull is faster than mull, perhaps a
possible improvement?
Right,
MartinS
- Raw text -