Mail Archives: pgcc/1998/07/06/17:28:38

X-pop3-spooler: POP3MAIL 2.1.0 b 4 980420 -bs-
Date: Mon, 6 Jul 1998 19:19:36 +0200 (CEST)
From: Andrea Arcangeli <arcangeli AT mbox DOT queen DOT it>
X-Sender: andrea AT penguin DOT e-mind DOT com
To: Tuukka Toivonen <tuukkat AT ees2 DOT oulu DOT fi>
cc: Linux Programming <linux-c-programming AT tower DOT itis DOT com>,
linuxprog AT geeky1 DOT ebtech DOT net, beastium-list <beastium-list AT Desk DOT nl>
Subject: Re: passing args in regs speed (was:something else)
In-Reply-To: <Pine.SOL.3.96.980706180936.11646B-100000@stekt3>
Message-ID: <Pine.LNX.3.96.980706190925.32147A-100000@penguin.e-mind.com>
X-Public-Key-URL: http://www-linux.deis.unibo.it/~mirror/aa.asc
MIME-Version: 1.0
Sender: Marc Lehmann <pcg AT goof DOT com>
Status: RO
Lines: 68

On Mon, 6 Jul 1998, Tuukka Toivonen wrote:

>Test program: bzip2 0.1pl2
>
>I added function prototypes for all functions in the program
>(and removed those already existing). I told the compiler
>to use different amount of register parameters and then
>compiled the program and measured how long it took to
>compress uncompressed LyX 0.12.0 source tar file (7997440 bytes)
>to /dev/null.

Nice!

>My test system: Pentium 120 MHz, 24 MB main memory, 32 MB
>swap, Linux 2.0.34, gcc version 2.7.2. There were no other
>active programs background eating CPU-time, but the
>hard disk rotated few times showing that not everything
>fit in the disk cache.

OK. Don't worry about the cache: in the real world not everything is in
the cache, but in an ideal world the kernel would also be compiled with
-mregparm=3 ;-).

>The tests show no significant speedup until I use all
>3 registers, in which case it's about 6% faster.

Cool!

>Question: why doesn't gcc allow more than 3 registers
>to be used? x86 would have 7, or at least 6, free registers.

I think you can use only the registers that gcc doesn't save across
calls (eax/edx and ?!?)...

>Each case first shows the used compiler flags, and then
>the test run was made 4 times. The times are in real-time
>seconds (measured using my own program using RDTSC instruction)
>The last number is length of the stripped ELF executable 
>(so case 4 gives smallest executables).
>
>Patch for bzip and some more information is in file
>http://www.ee.oulu.fi/~tuukkat/regpass-test.tar.gz

Good! Remember to put it inside #ifdef __i386__ (I have not read the
patch, though).
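
A minimal sketch of such a guard, assuming the functions are marked
individually with GCC's regparm function attribute rather than only through
the -mregparm flag. The patch itself is not reproduced here, so the macro
name REGPARM3 and the function sum3 are made up for illustration. The GCC
documentation describes regparm(3) on 32-bit x86 as passing the first three
integer arguments in EAX, EDX and ECX; on any other target the macro below
expands to nothing and the default calling convention is used.

```c
/* Hypothetical macro name (not from the patch): expands to the regparm
 * attribute only when building for 32-bit x86 with a GNU-compatible
 * compiler. */
#if defined(__i386__) && defined(__GNUC__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3 /* other targets: keep the default calling convention */
#endif

/* The attribute goes on the prototype so that every caller sees it. */
REGPARM3 int sum3(int a, int b, int c);

/* Illustrative function; on i386 its three arguments arrive in registers. */
REGPARM3 int sum3(int a, int b, int c)
{
    return a + b + c;
}
```

Callers need no change: sum3(1, 2, 3) is written the same way on every
target, and only the generated call sequence differs on i386.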

>Considerations: 
>- All libc calls used conventional stack parameter passing 
>  convention. This could be changed by breaking compatibility.
>- Why kernel doesn't use register parameters?? It would be
>  ideal since it wouldn't break compatibility!

It breaks the asm functions in the arch-specific code :-(. I just spent
some hours trying to compile with -mregparm=1. Also, the kernel doesn't
compile at all with -mregparm=3, since sometimes it needs the registers...
I could spend some more time on this, though (especially looking at the
great improvement you got with bzip!).
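
One possible way around the asm breakage, sketched here as an assumption
and not something that was tried in this thread: a regparm attribute on an
individual function takes precedence over the -mregparm command-line value,
so C functions that hand-written assembly calls with arguments on the stack
could be pinned to regparm(0) while the rest is built with -mregparm=3. The
macro name STACK_ARGS and the function handle_from_asm are hypothetical.

```c
/* Hypothetical macro name: force stack-based argument passing on i386,
 * whatever -mregparm value the file is compiled with. */
#if defined(__i386__) && defined(__GNUC__)
#define STACK_ARGS __attribute__((regparm(0)))
#else
#define STACK_ARGS /* other targets: nothing to override */
#endif

/* Imagined entry point that assembly code calls after pushing its two
 * arguments on the stack. */
STACK_ARGS long handle_from_asm(long nr, long arg);

STACK_ARGS long handle_from_asm(long nr, long arg)
{
    /* Placeholder body, just so the sketch has something to return. */
    return nr * 100 + arg;
}
```

This covers only C functions called from asm. Assembly that reads its own
arguments from the stack after being called from C would need the same
attribute on its C prototype.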

>( I'm CCing this to pgcc list since I think those people
>could be interested; maybe they could implement automatic
>register passing for static functions?)

The egcs/gcc people could also be interested, but I think they are already
aware of it. Anyway, the problem is still backwards compatibility. The
only piece of code that could use -mregparm=3 without major problems is
the kernel.

Andrea[s] Arcangeli
