Mail Archives: djgpp/2003/08/31/09:17:47
Carlo <cbramix AT libero DOT it> wrote:
: Hello,
: I have coded a very simple C program.
: It could be written in a different way, but I just
: want to show the point.
: #include <stdio.h>
(You need <stdlib.h> for random().)
: #define MAXBUF 16
: unsigned char funct(unsigned char *ptr,int size)
: {
: unsigned char res = 0;
: unsigned char a,b,c,d,e,f;
: size >>= 3;
: do {
: a = ptr[0];
: b = ptr[1];
: c = ptr[2];
: d = ptr[3];
: e = ptr[4];
: f = ptr[5];
: if (a>64) a=64;
: if (b>64) b=64;
: if (c>64) c=64;
: if (d>64) d=64;
: if (e>64) c=64;
: if (f>64) d=64;
: res += ((a^b) & (c^d)) | (e^f);
: ptr += 4;
: } while (--size);
: return res;
: }
...
: I have used GCC 3.2.3 and I got this assembly output for funct():
: _funct:
: pushl %ebp
: pushl %edi
: pushl %esi
: pushl %ebx
: pushl %ebx
: pushl %ebx
: movl 32(%esp), %ebp
: movl 28(%esp), %edx
: movb $0, 7(%esp)
: sarl $3, %ebp
: .p2align 4,,7
: L2:
: movb 3(%edx), %bl
: movb (%edx), %cl
: movzbl 1(%edx), %edi
: movb 2(%edx), %al
: cmpb $64, %cl
: movb %bl, 3(%esp)
: movb 5(%edx), %bl
: movzbl 4(%edx), %esi
: movb %bl, 6(%esp)
: jbe L5
: movb $64, %cl
: L5:
: movl %edi, %ebx
: cmpb $64, %bl
: jbe L6
: movl $64, %edi
...
: It has been compiled with:
: gcc demo.c -S -O2 -fomit-frame-pointer
: In my opinion this is better code:
: _funct:
: pushl %edi
: pushl %esi
: pushl %ebx
: movl ARG1, %edi
: movl ARG0, %esi
: xorl %eax, %eax
: sarl $3, %edi
: .p2align 4,,7
: L2:
: movb (%edx), %bl
: movb 1(%edx), %bh
: movb 2(%edx), %cl
: movb 3(%edx), %ch
: movb 4(%edx), %dl
: movb 5(%edx), %dh
: cmpb $64, %cl
: jbe L5
: movb $64, %cl
: L5:
: cmpb $64, %ch
: jbe L6
: movb $64, %ch
I see what you mean now.
: I know there are many things to examine, like memory access speed (the
: GCC-compiled version could be fast too).
There might be a partial register stall in your optimised version (it's
just a guess; I'm not an expert on that sort of thing). Try passing
-m386 or -m486 to gcc.
: However, I just wonder if there is a way to tell GCC: "use upper
: registers too".
Not that I know of. GCC should work that out by itself, but I suspect it
doesn't. If -m386 or -m486 doesn't change anything, it probably doesn't.
: Maybe the only way is to code the interesting parts as inline
: assembly (when that's possible) or to write the whole function
: in assembly.
: I'm waiting for your opinions on that.
Yes, with that you'd get exactly what you coded. Try it, benchmark
it, and see.
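For the benchmarking part, here is a minimal timing sketch. It assumes
clock() from <time.h> is precise enough once the routine is called many
times. The names bench and funct_fn are made up for illustration and do
not come from the original program; funct() is copied as posted.

```c
#include <assert.h>
#include <time.h>

/* funct() copied as posted in the quoted message (including the
   assignments to c and d in the e/f tests), so timings match that code. */
unsigned char funct(unsigned char *ptr, int size)
{
    unsigned char res = 0;
    unsigned char a, b, c, d, e, f;

    size >>= 3;
    do {
        a = ptr[0]; b = ptr[1]; c = ptr[2];
        d = ptr[3]; e = ptr[4]; f = ptr[5];
        if (a > 64) a = 64;
        if (b > 64) b = 64;
        if (c > 64) c = 64;
        if (d > 64) d = 64;
        if (e > 64) c = 64;
        if (f > 64) d = 64;
        res += ((a ^ b) & (c ^ d)) | (e ^ f);
        ptr += 4;
    } while (--size);
    return res;
}

/* Hypothetical helper type: any routine with funct()'s signature,
   e.g. the compiled C version or a hand-written assembly version. */
typedef unsigned char (*funct_fn)(unsigned char *ptr, int size);

/* Call fn(buf, size) reps times and return the elapsed CPU time in
   seconds. The results are summed into *sink, and the call goes through
   a volatile pointer, so the compiler cannot drop or hoist the calls. */
double bench(funct_fn fn, unsigned char *buf, int size, long reps,
             unsigned long *sink)
{
    volatile funct_fn call = fn;
    unsigned long acc = 0;
    clock_t start, stop;
    long i;

    /* funct() uses do/while, so it needs at least one 8-byte group. */
    assert(size >= 8);

    start = clock();
    for (i = 0; i < reps; i++)
        acc += call(buf, size);
    stop = clock();

    *sink = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Call bench() once with funct and once with the hand-written routine on
the same buffer, with a repeat count large enough that each run takes a
second or so. The two sink values should be equal; if they differ, the
two routines do not compute the same thing and the timings are not
comparable.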
Right,
MartinS