Mail Archives: djgpp/1998/03/09/04:31:48

Date: Mon, 9 Mar 1998 11:30:30 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
To: Randy Sorensen <randy AT NOSPAM DOT idcomm DOT com>
cc: djgpp AT delorie DOT com
Subject: Re: Optimized code, comparing with Borland C++ 4.5 w/ Power Pack
In-Reply-To: <350334da.0@superego.idcomm.com>
Message-ID: <Pine.SUN.3.91.980309113011.22925K-100000@is>
MIME-Version: 1.0

On Sun, 8 Mar 1998, Randy Sorensen wrote:

> Here's my problem.  When I do shut down out of DOS and run the exec's
> included on the CD, they run up to 60 fps, where as the code that I ported
> to DJGPP runs at 40 fps with the following optimizations:
> "-O6 -ffast-math -funroll-loops -finline -m486".

Which version of the compiler did you use?  I can confirm that for
heavy-duty C++ code (not C code), GCC 2.7.x is not as good as BC++.
It seems the optimizer in cc1plus needs some work (which isn't
surprising for a language as complex as C++).  But I haven't tried my
benchmark, which is a fast chess player, on GCC 2.8, EGCS, or PGCC,
so I don't know whether these do better.  If you used 2.7.2.1,
I suggest trying these later versions as well.

One thing you need to be aware of is that the startup code of v2.01
doesn't ensure the stack is aligned even on a word boundary, let alone
a double-word boundary (the latter is best for Pentium, so I'm told).
I think this is only important on Windows, but I'm not sure what you
meant by ``shut down out of DOS''--do you reboot into plain DOS, or do
you use the so-called ``DOS mode'' of Windows?  (If the latter, try
the former.)

Anyway, you could try replacing crt0.o with the version from the
latest v2.02 alpha, which does align the stack.
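
To see whether alignment is an issue at all in your build, you could
print where a local double ends up.  The following is only a sketch:
the function names are made up for illustration, and <stdint.h> is a
C99 header that older compilers may lack (a cast to unsigned long
would serve the same purpose there).

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t (C99; an assumption for old compilers) */

/* Hypothetical helper: how many bytes past the previous multiple of
   `boundary` does the address `p` lie?  0 means "aligned". */
unsigned misalignment(const void *p, unsigned boundary)
{
    assert(boundary != 0);
    return (unsigned)((uintptr_t)p % boundary);
}

/* Hypothetical helper: report the misalignment of a double that
   lives on this function's stack frame. */
unsigned stack_misalignment(unsigned boundary)
{
    double probe = 0.0;   /* taking its address keeps it in memory */
    return misalignment(&probe, boundary);
}
```

Calling stack_misalignment(8) early in main() and printing the result
shows whether locals of type double land on a double-word boundary in
a given build; a nonzero value suggests the alignment problem
described above may apply.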

> Is there any other optimizations that will speed it up?  Also, I've
> heard that using high "-O"'s will cause problems.. should I bring it
> down to 4 or 3?

You will have to experiment.  The FAQ has some advice in section 14.2,
but it hasn't yet been updated to reflect the changes in GCC 2.8 and
PGCC.

In general, I find that -funroll-loops sometimes slows down the code,
so make sure you really want it.
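
One way to experiment is to time the same hot loop in builds that
differ by a single option.  The sketch below is an assumption-laden
illustration, not your code: checksum() merely stands in for whatever
inner loop dominates your program, and both function names are
hypothetical.

```c
#include <assert.h>
#include <time.h>     /* clock, CLOCKS_PER_SEC */

/* Written to so the compiler cannot discard the timed work. */
static volatile unsigned long sink;

/* Hypothetical stand-in for the program's hot loop. */
unsigned long checksum(const unsigned char *buf, unsigned long n)
{
    unsigned long sum = 0, i;
    for (i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Run checksum() `reps` times and return the CPU seconds consumed. */
double time_checksum(const unsigned char *buf, unsigned long n,
                     unsigned reps)
{
    clock_t start = clock();
    unsigned long total = 0;
    unsigned r;

    assert(reps > 0);
    for (r = 0; r < reps; r++)
        total += checksum(buf, n);
    sink = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Build it once with, say, "-O2 -m486" and once with
"-O2 -m486 -funroll-loops", run each over a buffer large enough to
take a few seconds, and keep the option only if the second build is
measurably faster.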

The newer versions of GCC have some additional align-related options
which you might try.

> Also, since you can't write to video memory in DJGPP by default, I
> went about doing so using __djgpp_nearptr_enable() and adding
> __djgpp_conventional_base to the video memory address.  Is there a
> faster way of going about video memory writing?

It depends on the program code.  If the code can be switched to far
pointers easily, I suggest trying that.  My experience is that far
pointers are no slower than near pointers, and you might avoid the
overhead of the call to `__djgpp_nearptr_enable', which is quite
heavy (it calls a DPMI function).
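
A minimal sketch of the far-pointer approach follows.  It assumes VGA
mode 13h (320x200, one byte per pixel, frame buffer at linear address
0xA0000), which your message doesn't actually state.  The functions
put_pixel, peek_pixel and fill_row are hypothetical names; the
non-DJGPP branch with a plain array exists only so the sketch builds
on other compilers.

```c
#include <assert.h>

#define SCREEN_W 320u             /* assumed: VGA mode 13h */
#define SCREEN_H 200u
#define VIDEO_LINEAR 0xA0000ul    /* assumed frame buffer address */

#ifdef __DJGPP__
#include <go32.h>                 /* _dos_ds */
#include <sys/farptr.h>           /* _farpokeb, _farpeekb, _farsetsel,
                                     _farnspokeb */
#else
/* Stand-in for video memory outside DJGPP; not a real API. */
static unsigned char fake_video[SCREEN_W * SCREEN_H];
#endif

/* Write one pixel through the _dos_ds selector. */
void put_pixel(unsigned x, unsigned y, unsigned char color)
{
    unsigned long offset = (unsigned long)y * SCREEN_W + x;
    assert(x < SCREEN_W && y < SCREEN_H);
#ifdef __DJGPP__
    _farpokeb(_dos_ds, VIDEO_LINEAR + offset, color);
#else
    fake_video[offset] = color;
#endif
}

/* Read one pixel back. */
unsigned char peek_pixel(unsigned x, unsigned y)
{
    unsigned long offset = (unsigned long)y * SCREEN_W + x;
    assert(x < SCREEN_W && y < SCREEN_H);
#ifdef __DJGPP__
    return _farpeekb(_dos_ds, VIDEO_LINEAR + offset);
#else
    return fake_video[offset];
#endif
}

/* Fill a whole row, loading the selector once instead of per pixel. */
void fill_row(unsigned y, unsigned char color)
{
    unsigned long base = (unsigned long)y * SCREEN_W;
    unsigned x;
    assert(y < SCREEN_H);
#ifdef __DJGPP__
    _farsetsel(_dos_ds);
    for (x = 0; x < SCREEN_W; x++)
        _farnspokeb(VIDEO_LINEAR + base + x, color);
#else
    for (x = 0; x < SCREEN_W; x++)
        fake_video[base + x] = color;
#endif
}
```

In inner loops the _farsetsel/_farnspokeb pair is usually the one to
prefer, since the selector is loaded only once; no call to
`__djgpp_nearptr_enable' is needed in either case.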

> And lastly, I had to put some extra type-casts in there, since gcc.exe kept
> giving me warnings about assigning doubles to unsigned char's and stuff.

Huh?  Are you sure it was about doubles and not pointers to double?
If the former, then the original program was surely broken.

> Thanks for all your help :) I'd appreciate it if you emailed me rather than
> posted a response, since I don't check the groups or the listserv very
> often.

If you want direct replies, please make the effort to set your headers
correctly, without any anti-spam fakes.  Some people reply to so
many messages that they cannot afford to edit the headers.
