From: Michiel de Bondt <michielb AT sci DOT kun DOT nl>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: how to use inline push and pop
Date: Thu, 17 May 2001 17:07:00 +0200
Organization: University of Nijmegen
Lines: 60
Message-ID: <3B03E914.28DC92D@sci.kun.nl>
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1010510171303 DOT 7067H-100000 AT is>
NNTP-Posting-Host: fanth.sci.kun.nl
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: wnnews.sci.kun.nl 990112020 464 131.174.132.54 (17 May 2001 15:07:00 GMT)
X-Complaints-To: usenet AT sci DOT kun DOT nl
NNTP-Posting-Date: Thu, 17 May 2001 15:07:00 +0000 (UTC)
X-Mailer: Mozilla 4.75 [en] (X11; U; SunOS 5.7 sun4u)
X-Accept-Language: en
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:

> On 10 May 2001, Hans-Bernhard Broeker wrote:
>
> > You really should rethink your strategy all the way since you started
> > at square one. What you're doing here, actually, is to avoid the usual
> > overhead involved in recursive function calling --- but you're doing
> > it the wrong way. To avoid having more than absolutely necessary
> > pushed to the stack in recursive calls, use the following recipe,
> > instead:
> >
> > 1) Remove as many as possible of the arguments of the recursive function
> > 2) Remove as many as possible of the recursive function's local variables
> >    in scope at the time of the resursion call(s). Block-scope variables
> >    of blocks without a recursion call in them can stay as they are.
> > 3) Replace them by a manually implemented stack of 'status structs', where
> >    the status struct holds *exactly* those variables that have to be
> >    preserved across a recursion call.
> > 4) Compile the whole stuff with -save-temps -fverbose-asm -O6 and see if
> >    you can really beat GCC in assembly hacking :-)
>
> And I would like to point out, again, that all this complicated procedure
> and the additional complexity it imposes on the code (meaning more bugs
> and more slow-down) might be _totally_ unnecessary!
>
> There's a popular belief that recursive code is terribly slow, but
> experience shows that this is mostly a myth.  Recursive code _might_ be
> slow, but in many cases it isn't.  Because recursive code is usually
> smaller, it fits better into the CPU caches.  It is also simpler, so you
> have less probability for bugs, and it lends itself better to compiler
> optimizations.
>
> In other words, I _strongly_ recommend to write a recursive implementation
> in a high-level language such as C or C++, compile it with optimizations,
> and _profile_ it, before you decide to go to such lengths.  There are good
> chances that the code will run fast enough to make any of this completely
> unnecessary.

When I used progress printfs in my procedure, I clearly won. Probably, since
the compiler only reserves registers in sections. With rarely performed
printfs,
it is a better idea to disreserve registers in the section with the call to
printf.

But in order to make some memory optimization, I made some other changes
as well. But after these changes, there is no difference in speed any more.
Maybe, the moves to and from the stack can be paired. I do not optimize to
pairability.

As you see, I read some information about the Pentium. I also read that the
memory access takes very much time. I can not change the number of memory
accesses which are all different, but I will try to do about half of the
accesses
to recent blocks of 32 bytes. I do not know how to align arrays in memory, but
I saw in the .s-output that the compiler makes the required alingments itself
(.p2align 5).


Best regards, Michiel