Message-ID: <3f51f2fe$0$169$cc7c7865@news.luth.se>
From: Martin Strömberg
Subject: Re: Optimizing 8 bit variables?
Newsgroups: comp.os.msdos.djgpp
References:
User-Agent: tin/1.4.6-20020816 ("Aerials") (UNIX) (NetBSD/1.6Q (alpha))
Date: 31 Aug 2003 13:07:10 GMT
Lines: 132
NNTP-Posting-Host: speedy.ludd.luth.se
X-Trace: 1062335230 news.luth.se 169 130.240.16.13
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Carlo wrote:
: Hello,

: I have coded a very simple C program.
: It's very simple and it can be coded into a different way, but I just
: want to show the point.

: #include

(You need for random().)

: #define MAXBUF 16

: unsigned char funct(unsigned char *ptr,int size)
: {
:     unsigned char res = 0;
:     unsigned char a,b,c,d,e,f;

:     size >>= 3;
:     do {
:         a = ptr[0];
:         b = ptr[1];
:         c = ptr[2];
:         d = ptr[3];
:         e = ptr[4];
:         f = ptr[5];
:         if (a>64) a=64;
:         if (b>64) b=64;
:         if (c>64) c=64;
:         if (d>64) d=64;
:         if (e>64) c=64;
:         if (f>64) d=64;
:         res += ((a^b) & (c^d)) | (e^f);
:         ptr += 4;
:     } while (--size);
:     return res;
: }

...

: I have used GCC 3.2.3 and I got this assembly output for funct():

: _funct:
:     pushl %ebp
:     pushl %edi
:     pushl %esi
:     pushl %ebx
:     pushl %ebx
:     pushl %ebx
:     movl 32(%esp), %ebp
:     movl 28(%esp), %edx
:     movb $0, 7(%esp)
:     sarl $3, %ebp
:     .p2align 4,,7
: L2:
:     movb 3(%edx), %bl
:     movb (%edx), %cl
:     movzbl 1(%edx), %edi
:     movb 2(%edx), %al
:     cmpb $64, %cl
:     movb %bl, 3(%esp)
:     movb 5(%edx), %bl
:     movzbl 4(%edx), %esi
:     movb %bl, 6(%esp)
:     jbe L5
:     movb $64, %cl
: L5:
:     movl %edi, %ebx
:     cmpb $64, %bl
:     jbe L6
:     movl $64, %edi

...

: It has been compiled with:
: gcc demo.c -S -O2 -fomit-frame-pointer

: In my opinion this is a better code:

: _funct:
:     pushl %edi
:     pushl %esi
:     pushl %ebx
:     movl ARG1, %edi
:     movl ARG0, %esi
:     xorl %eax, %eax
:     sarl $3, %edi
:     .p2align 4,,7
: L2:
:     movb (%edx), %bl
:     movb 1(%edx), %bh
:     movb 2(%edx), %cl
:     movb 3(%edx), %ch
:     movb 4(%edx), %dl
:     movb 5(%edx), %dh
:     cmpb $64, %cl
:     jbe L5
:     movb $64, %cl
: L5:
:     cmpb $64, %ch
:     jbe L6
:     movb $64, %ch

I see what you mean now.

: I know there are many things to examine, like memory access speed (GCC
: compiled version could be fast too).

There might be a partial register stall in your optimised version (it's
just a guess; I'm not an expert on that sort of thing). Try passing
-m386 or -m486 to gcc.

: However, I just wonder if there is a way for telling: "use upper
: registers too".

Not that I know of. It should know that by itself, but I suspect it
doesn't. If -m[34]86 didn't change anything, it probably doesn't.

: Maybe the only way is to code the interesting parts with inline
: assembly functions (when it's possible) or an entire assembly code
: with our wanted function.
: I'm waiting your opinions for that.

Yes, with that you'd get exactly what you coded. Try it, benchmark it
and see.


Right,

							MartinS
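
For the "benchmark it and see" step, here is a minimal sketch in C of
a timing helper. It is only an illustration under some assumptions:
funct() is copied as posted in the quoted message (including the c/d
assignments in the e and f tests), bench_funct() is a hypothetical
name that does not appear in the thread, and clock() from the standard
library is assumed to be precise enough once the repeat count is
large.

```c
#include <time.h>

#define MAXBUF 16

/* funct() as posted in the quoted message, kept unchanged in behaviour
   (the e and f tests assign to c and d, as in the post).
   size must be at least 8, otherwise the do/while count underflows. */
unsigned char funct(unsigned char *ptr, int size)
{
    unsigned char res = 0;
    unsigned char a, b, c, d, e, f;

    size >>= 3;
    do {
        a = ptr[0];
        b = ptr[1];
        c = ptr[2];
        d = ptr[3];
        e = ptr[4];
        f = ptr[5];
        if (a > 64) a = 64;
        if (b > 64) b = 64;
        if (c > 64) c = 64;
        if (d > 64) d = 64;
        if (e > 64) c = 64;
        if (f > 64) d = 64;
        res += ((a ^ b) & (c ^ d)) | (e ^ f);
        ptr += 4;
    } while (--size);
    return res;
}

/* Hypothetical helper (not from the thread): call funct() reps times
   and return the elapsed CPU time in seconds. */
double bench_funct(unsigned char *buf, int size, long reps)
{
    /* The volatile function pointer and volatile sink are there so
       that an optimising compiler cannot inline the call, hoist it
       out of the loop, or discard its result. */
    unsigned char (*volatile fn)(unsigned char *, int) = funct;
    volatile unsigned char sink = 0;
    clock_t start, stop;
    long i;

    start = clock();
    for (i = 0; i < reps; i++)
        sink += fn(buf, size);
    stop = clock();
    (void)sink;

    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Fill a buffer of MAXBUF bytes, call bench_funct(buf, MAXBUF, reps)
with a large reps from main(), and print the result. Build it once
per set of options (plain -O2, then with -m386 or -m486) and compare
the times. To time the hand-written assembly, link that _funct in
place of the C funct() and run the same loop.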