Mail Archives: djgpp/2003/08/31/09:17:47

Message-ID: <3f51f2fe$0$169$cc7c7865@news.luth.se>
From: Martin Strömberg <ams AT speedy DOT ludd DOT luth DOT se>
Subject: Re: Optimizing 8 bit variables?
Newsgroups: comp.os.msdos.djgpp
References: <d2ad330a DOT 0308260427 DOT 49b6ab37 AT posting DOT google DOT com> <bifne4$6c9$1 AT antares DOT lu DOT erisoft DOT se> <d2ad330a DOT 0308300441 DOT 2acd05ad AT posting DOT google DOT com>
User-Agent: tin/1.4.6-20020816 ("Aerials") (UNIX) (NetBSD/1.6Q (alpha))
Date: 31 Aug 2003 13:07:10 GMT
Lines: 132
NNTP-Posting-Host: speedy.ludd.luth.se
X-Trace: 1062335230 news.luth.se 169 130.240.16.13
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Carlo <cbramix AT libero DOT it> wrote:
: Hello,
: I have coded a very simple C program.
: It's very simple and it can be coded into a different way, but I just
: want to show the point.

: #include <stdio.h>

(You need <stdlib.h> for random().)

: #define MAXBUF  16

: unsigned char funct(unsigned char *ptr,int size)
: {
:     unsigned char res = 0;
:     unsigned char a,b,c,d,e,f;

:     size >>= 3;
:     do {
:         a = ptr[0];
:         b = ptr[1];
:         c = ptr[2];
:         d = ptr[3];
:         e = ptr[4];
:         f = ptr[5];
:         if (a>64) a=64;
:         if (b>64) b=64;
:         if (c>64) c=64;
:         if (d>64) d=64;
:         if (e>64) c=64;
:         if (f>64) d=64;
:         res += ((a^b) & (c^d)) | (e^f);
:         ptr += 4;
:     } while (--size);

:     return res;
: }

...

: I have used GCC 3.2.3 and I got this assembly output for funct():

: _funct:
: 	pushl	%ebp
: 	pushl	%edi
: 	pushl	%esi
: 	pushl	%ebx
: 	pushl	%ebx
: 	pushl	%ebx
: 	movl	32(%esp), %ebp
: 	movl	28(%esp), %edx
: 	movb	$0, 7(%esp)
: 	sarl	$3, %ebp
: 	.p2align 4,,7
: L2:
: 	movb	3(%edx), %bl
: 	movb	(%edx), %cl
: 	movzbl	1(%edx), %edi
: 	movb	2(%edx), %al
: 	cmpb	$64, %cl
: 	movb	%bl, 3(%esp)
: 	movb	5(%edx), %bl
: 	movzbl	4(%edx), %esi
: 	movb	%bl, 6(%esp)
: 	jbe	L5
: 	movb	$64, %cl
: L5:
: 	movl	%edi, %ebx
: 	cmpb	$64, %bl
: 	jbe	L6
: 	movl	$64, %edi

...

: It has been compiled with:

: gcc demo.c -S -O2 -fomit-frame-pointer

: In my opinion this is a better code:

: _funct:
: 	pushl	%edi
: 	pushl	%esi
: 	pushl	%ebx
: 	movl	ARG1, %edi
: 	movl	ARG0, %esi
: 	xorl	%eax, %eax
: 	sarl	$3, %edi
: 	.p2align 4,,7
: L2:
: 	movb	 (%edx), %bl
: 	movb	1(%edx), %bh
: 	movb	2(%edx), %cl
: 	movb	3(%edx), %ch
: 	movb	4(%edx), %dl
: 	movb	5(%edx), %dh

: 	cmpb	$64, %cl
: 	jbe	L5
: 	movb	$64, %cl
: L5:
: 	cmpb	$64, %ch
: 	jbe	L6
: 	movb	$64, %ch

I see what you mean now. 

: I know there are many things to examine, like memory access speed (GCC
: compiled version could be fast too).

There might be a partial register stall in your optimised version
(that's just a guess; I'm not an expert on that sort of thing). Try
passing -m386 or -m486 to gcc and compare the output.
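
A way to sidestep the byte-register question from the C side, offered
only as a sketch: keep the temporaries in unsigned int so the compiler
is free to use full 32-bit registers, and truncate once on return. The
name funct_wide is made up for this example. The body mirrors the
quoted funct() line for line, including the clamps on e and f (which
assign to c and d) and the ptr += 4 step, so it returns the same value.
Whether it runs faster is an open question that only a measurement can
answer.

```c
/* Hypothetical variant of the quoted funct(): same logic, but the
   temporaries are unsigned int instead of unsigned char.
   Like the original, it assumes size >= 8. */
unsigned char funct_wide(const unsigned char *ptr, int size)
{
    unsigned int res = 0;
    unsigned int a, b, c, d, e, f;

    size >>= 3;
    do {
        a = ptr[0];
        b = ptr[1];
        c = ptr[2];
        d = ptr[3];
        e = ptr[4];
        f = ptr[5];
        if (a > 64) a = 64;
        if (b > 64) b = 64;
        if (c > 64) c = 64;
        if (d > 64) d = 64;
        if (e > 64) c = 64;   /* kept exactly as quoted */
        if (f > 64) d = 64;   /* kept exactly as quoted */
        res += ((a ^ b) & (c ^ d)) | (e ^ f);
        ptr += 4;             /* kept exactly as quoted */
    } while (--size);

    /* Summing in 32 bits and truncating once gives the same result
       as truncating after every addition (arithmetic modulo 256). */
    return (unsigned char)res;
}
```

Compile it with the same -S -O2 -fomit-frame-pointer options and compare
the assembly with the original.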

: However, I just wonder if there is a way for telling: "use upper
: registers too".

Not that I know of. The compiler ought to work that out by itself, but
I suspect it doesn't. If -m386 or -m486 doesn't change the generated
code, it probably doesn't.

: Maybe the only way is to code the interesting parts with inline
: assembly functions (when it's possible) or an entire assembly code
: with our wanted function.
: I'm waiting your opinions for that.

Yes, with that you'd get exactly what you coded. Try it, benchmark it,
and see.
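
A minimal sketch of such a benchmark, assuming the standard clock()
timer is good enough for the comparison. The helper time_calls() and
its parameters are made up for this example; funct() is copied from the
quoted post unchanged. An assembly version with the same signature
could be passed to the helper in the same way.

```c
#include <time.h>

#define MAXBUF  16

/* funct() as quoted in the post, unchanged. Assumes size >= 8. */
unsigned char funct(unsigned char *ptr, int size)
{
    unsigned char res = 0;
    unsigned char a, b, c, d, e, f;

    size >>= 3;
    do {
        a = ptr[0];
        b = ptr[1];
        c = ptr[2];
        d = ptr[3];
        e = ptr[4];
        f = ptr[5];
        if (a > 64) a = 64;
        if (b > 64) b = 64;
        if (c > 64) c = 64;
        if (d > 64) d = 64;
        if (e > 64) c = 64;
        if (f > 64) d = 64;
        res += ((a ^ b) & (c ^ d)) | (e ^ f);
        ptr += 4;
    } while (--size);

    return res;
}

/* Hypothetical helper: call fn(buf, size) reps times and return the
   elapsed CPU time in seconds. The accumulated result is stored in
   *sink so the calls cannot be optimised away as dead code. */
double time_calls(unsigned char (*fn)(unsigned char *, int),
                  unsigned char *buf, int size, long reps,
                  unsigned char *sink)
{
    unsigned char acc = 0;
    clock_t start, stop;
    long i;

    start = clock();
    for (i = 0; i < reps; i++)
        acc += fn(buf, size);
    stop = clock();

    *sink = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Fill a MAXBUF-byte buffer, call time_calls() once per candidate
function with a repetition count large enough to run for a second or
more, and compare the returned times. clock() is coarse, so short runs
will mostly measure noise.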


Right,

						MartinS
